[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size

Stefania (JIRA) Tue, 07 Jul 2015 18:05:44 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617778#comment-14617778
 ]


Stefania commented on CASSANDRA-8894:
-------------------------------------

Yes, I assumed a normal distribution of the record size. However, your 
suggestion of a uniform distribution of the _start position_ within a page is 
more straightforward. Let's start with that: {{size}} = 95th percentile of the 
record size, chance of crossing = {{(size % 4096) / 4096}}.
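
To make the formula concrete, here is a minimal Java sketch of it. The class 
name, the method names and the idea of adding one extra page when the chance 
reaches a threshold are assumptions for illustration only, not the patch 
itself; only the {{(size % 4096) / 4096}} expression comes from the comment 
above.

```java
// Hypothetical helper, for illustration only: not part of the Cassandra code base.
final class PageCrossing {
    // Page size assumed by the formula in the comment.
    static final int PAGE_SIZE = 4096;

    // Chance of crossing as written in the comment: (size % 4096) / 4096,
    // where size is the chosen percentile of the record size in bytes.
    static double chanceOfCrossing(long recordSize) {
        return (recordSize % PAGE_SIZE) / (double) PAGE_SIZE;
    }

    // Assumed usage: round the record size up to whole pages (at least one),
    // then add one more page when the chance of crossing reaches the threshold.
    static long bufferSize(long recordSize, double crossingThreshold) {
        long pages = (recordSize + PAGE_SIZE - 1) / PAGE_SIZE;
        if (chanceOfCrossing(recordSize) >= crossingThreshold) {
            pages++;
        }
        return Math.max(1, pages) * PAGE_SIZE;
    }

    public static void main(String[] args) {
        long size = 140; // record size mentioned in the issue description
        System.out.println("chance of crossing: " + chanceOfCrossing(size));
        System.out.println("buffer size: " + bufferSize(size, 0.1));
    }
}
```

With a 140 byte record the chance is 140 / 4096, so a threshold of 0.1 (an 
arbitrary value picked for this example) would leave the buffer at one page.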

Noted: I'll add the size percentile and the chance-of-crossing threshold to 
the config without mentioning them in the yaml. I'll also add a *global* 
setting to indicate whether the data directories are on SSD or spinning disk; 
this one will be in the yaml.

> Our default buffer size for (uncompressed) buffered reads should be smaller, 
> and based on the expected record size
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8894
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>              Labels: benedict-to-commit
>             Fix For: 3.x
>
>
> A large contributor to buffered reads being slower than mmapped reads is 
> likely that we read a full 64Kb at once, when average record sizes may be as 
> low as 140 bytes on our stress tests. The TLB has only 128 entries on a 
> modern core, and each read will touch 32 of these, meaning we will almost 
> never hit the TLB, and will incur at least 30 unnecessary misses each 
> time (as well as the other costs of larger than necessary accesses). When 
> working with an SSD there is little to no benefit reading more than 4Kb at 
> once, and in either case reading more data than we need is wasteful. So, I 
> propose selecting a buffer size that is the next larger power of 2 than our 
> average record size (with a minimum of 4Kb), so that we expect to read in one 
> operation. I also propose that we create a pool of these buffers up-front, 
> and that we ensure they are all exactly aligned to a virtual page, so that 
> the source and target operations each touch exactly one virtual page per 4Kb 
> of expected record size.
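
The sizing rule and the page arithmetic in the description above can be 
sketched as follows. This is only an illustration under stated assumptions: 
the class and method names are hypothetical, and the count of 32 pages is read 
here as 16 source pages plus 16 target pages for a 64Kb read with 4Kb pages.

```java
// Hypothetical helper, for illustration only: not part of the Cassandra code base.
final class BufferSizing {
    // Virtual page size and minimum buffer size from the description (4Kb).
    static final int PAGE_SIZE = 4096;

    // Smallest power of two that is >= the average record size, never below 4Kb.
    static int bufferSizeFor(int averageRecordSize) {
        if (averageRecordSize > (1 << 30)) {
            throw new IllegalArgumentException("record size too large: " + averageRecordSize);
        }
        int size = Math.max(averageRecordSize, PAGE_SIZE);
        int highest = Integer.highestOneBit(size);
        return highest == size ? size : highest << 1;
    }

    // Pages touched by one read, assuming page-aligned source and target
    // buffers that each span bufferSize / 4096 pages.
    static int pagesTouched(int bufferSize) {
        return 2 * (bufferSize / PAGE_SIZE);
    }

    public static void main(String[] args) {
        int current = 64 * 1024;           // the 64Kb read size being criticised
        int proposed = bufferSizeFor(140); // 140 byte average record from the stress tests
        System.out.println(current + " bytes touch " + pagesTouched(current) + " pages");
        System.out.println(proposed + " bytes touch " + pagesTouched(proposed) + " pages");
    }
}
```

Under these assumptions a 64Kb read touches 32 pages, while the proposed 4Kb 
buffer for a 140 byte average record touches 2, which is where the "at least 
30 unnecessary misses" figure comes from.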



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
