[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617778#comment-14617778 ]
Stefania commented on CASSANDRA-8894:
-------------------------------------
Yes, I assumed a normal distribution of the record size. However, your
suggestion of a uniform distribution of the _start position_ within a page is
more straightforward. Let's start with that: {{size}} = 95th percentile of the
record size, chance of crossing a page boundary = {{(size % 4096) / 4096}}.
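
To make the formula concrete, here is a minimal Java sketch. The class and method names ({{PageCrossing}}, {{chanceOfCrossing}}) are hypothetical and not taken from any patch; the only thing borrowed from the discussion is the expression {{(size % 4096) / 4096}} with a 4096-byte page.

```java
// Hypothetical helper, for illustration only: not part of the Cassandra code base.
final class PageCrossing {
    // Page size assumed in the comment above.
    static final int PAGE_SIZE = 4096;

    // Chance-of-crossing estimate from the comment: (size % 4096) / 4096,
    // assuming the record's start position is uniformly distributed within a page.
    static double chanceOfCrossing(long recordSize) {
        if (recordSize < 0) {
            throw new IllegalArgumentException("recordSize must be non-negative");
        }
        // Cast before dividing so the result is not truncated by integer division.
        return (double) (recordSize % PAGE_SIZE) / PAGE_SIZE;
    }

    public static void main(String[] args) {
        // 140 bytes is the stress-test record size mentioned in the ticket description.
        System.out.println(chanceOfCrossing(140));
    }
}
```

With the 140-byte records mentioned in the ticket description, the estimate is 140 / 4096, roughly 3.4%. Note that a size that is an exact multiple of 4096 yields 0 under this formula, so it is best read as a first approximation.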
Noted: I'll add the size percentile and the chance-of-crossing threshold to the
config without mentioning them in the yaml. I'll also add a *global* setting to
indicate whether the data directories are on SSD or spinning disk; this one
will be in the yaml.
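
As a rough sketch of how these three settings could be grouped, assuming a plain Java holder class: every name below ({{DiskOptimizationSettings}}, {{DiskType}}, {{exceedsCrossingThreshold}}) is hypothetical, no default values are implied, and comparing the estimate against the threshold is only one plausible use of it.

```java
// Hypothetical settings holder, for illustration only: names and usage are assumptions.
final class DiskOptimizationSettings {
    // The global setting: what kind of disk backs the data directories.
    enum DiskType { SSD, SPINNING }

    static final int PAGE_SIZE = 4096;

    final DiskType diskType;
    final double sizePercentile;    // e.g. 0.95 for the 95th percentile of record size
    final double crossingThreshold; // chance of crossing above which we would react

    DiskOptimizationSettings(DiskType diskType, double sizePercentile, double crossingThreshold) {
        if (diskType == null) {
            throw new IllegalArgumentException("diskType must not be null");
        }
        if (sizePercentile <= 0.0 || sizePercentile > 1.0) {
            throw new IllegalArgumentException("sizePercentile must be in (0, 1]");
        }
        if (crossingThreshold < 0.0 || crossingThreshold > 1.0) {
            throw new IllegalArgumentException("crossingThreshold must be in [0, 1]");
        }
        this.diskType = diskType;
        this.sizePercentile = sizePercentile;
        this.crossingThreshold = crossingThreshold;
    }

    // True when the estimate (size % 4096) / 4096 is above the configured threshold.
    boolean exceedsCrossingThreshold(long estimatedRecordSize) {
        double chance = (double) (estimatedRecordSize % PAGE_SIZE) / PAGE_SIZE;
        return chance > crossingThreshold;
    }
}
```

What happens when the threshold is exceeded (for example, reading one extra page) is left open here, since it is not decided in this comment.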
> Our default buffer size for (uncompressed) buffered reads should be smaller,
> and based on the expected record size
> ------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-8894
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8894
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Assignee: Stefania
> Labels: benedict-to-commit
> Fix For: 3.x
>
>
> A large contributor to slower buffered reads than mmapped is likely that we
> read a full 64Kb at once, when average record sizes may be as low as 140
> bytes on our stress tests. The TLB has only 128 entries on a modern core, and
> each read will touch 32 of these, meaning we are unlikely to hit the TLB
> and will incur at least 30 unnecessary misses each
> time (as well as the other costs of larger-than-necessary accesses). When
> working with an SSD there is little to no benefit reading more than 4Kb at
> once, and in either case reading more data than we need is wasteful. So, I
> propose selecting a buffer size that is the next larger power of 2 than our
> average record size (with a minimum of 4Kb), so that we expect to read in one
> operation. I also propose that we create a pool of these buffers up-front,
> and that we ensure they are all exactly aligned to a virtual page, so that
> the source and target operations each touch exactly one virtual page per 4Kb
> of expected record size.
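
The sizing rule proposed in the description can be sketched as follows. This is an illustration under stated assumptions, not the committed implementation: the names ({{BufferSizing}}, {{bufferSizeFor}}, {{pagesTouched}}) are hypothetical, a size that is already a power of two is kept as-is, and the page count assumes the source and target buffers are both page-aligned and each touch one 4096-byte page per 4096 bytes read.

```java
// Hypothetical helper, for illustration only: not part of the Cassandra code base.
final class BufferSizing {
    static final int PAGE_SIZE = 4096;
    static final int MIN_BUFFER_SIZE = 4096;  // the proposed minimum
    static final int MAX_BUFFER_SIZE = 1 << 30; // largest power of two that fits in an int

    // Smallest power of two that is >= averageRecordSize, but never below 4096.
    static int bufferSizeFor(int averageRecordSize) {
        if (averageRecordSize < 0 || averageRecordSize > MAX_BUFFER_SIZE) {
            throw new IllegalArgumentException("averageRecordSize out of range");
        }
        if (averageRecordSize <= MIN_BUFFER_SIZE) {
            return MIN_BUFFER_SIZE;
        }
        int highest = Integer.highestOneBit(averageRecordSize);
        // Already a power of two: keep it; otherwise round up to the next one.
        return highest == averageRecordSize ? highest : highest << 1;
    }

    // Virtual pages touched by one read: one per 4096 bytes in the source
    // plus one per 4096 bytes in the target (both assumed page-aligned).
    static int pagesTouched(int bufferSize) {
        return 2 * (bufferSize / PAGE_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(bufferSizeFor(140));   // small records fall back to the minimum
        System.out.println(pagesTouched(65536));  // the current 64K buffer
    }
}
```

Under these assumptions a 64K buffer touches 16 + 16 = 32 pages per read, which matches the figure in the description, while a 4096-byte buffer touches 2.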
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)