[ 
https://issues.apache.org/jira/browse/CASSANDRA-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661492#comment-16661492
 ] 

Joseph Lynch commented on CASSANDRA-13241:
------------------------------------------

I don't mean to pile yet another perspective onto this, but I'm not sure we're 
giving enough weight to the compression ratio loss on real-world data. We've 
been talking a lot about the memory requirements, but I think the bigger issues 
are:
 * Ratio loss leading to less of the dataset being hot in OS page cache
 * OS read-ahead is usually 16 or 32kb, so if you're reading less than that 
from disk you're still going to read 16 or 32kb...
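
The read-ahead point can be made concrete with a little arithmetic (a minimal sketch; the 32kb read-ahead figure is the one from this comment, not a measurement, and the function name is mine):

```python
# Sketch of read amplification: a random read of one compressed chunk costs
# at least one OS read-ahead window from disk. Illustrative numbers only.

def read_amplification(chunk_size: int, readahead: int) -> float:
    """Bytes actually read from disk per chunk read, relative to the chunk."""
    return max(chunk_size, readahead) / chunk_size

for chunk in (4096, 8192, 16384, 32768, 65536):
    print(f"chunk {chunk:>6}: {read_amplification(chunk, 32 * 1024):.0f}x disk read")
```

With 32kb read-ahead, a 4kb chunk pays roughly an 8x disk-read penalty per random read, while 32kb and above pay none.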

I think for Cassandra, which relies heavily on the OS page cache for 
performance, 16kb is the absolute minimum I would default to. For example, from 
IRC today: I ran Ariel's ratio 
[script|https://gist.github.com/jolynch/411e62ac592bfb55cfdd5db87c77ef6f] on a 
(somewhat arbitrary) 3.0.17 production cluster dataset and saw the following 
ratios:
{noformat}
Chunk size 4096, ratio 0.541505
Chunk size 8192, ratio 0.467537
Chunk size 16384, ratio 0.425122
Chunk size 32768, ratio 0.387040
Chunk size 65536, ratio 0.352454
{noformat}
The reduction in ratio at 4-8kb would destroy the OS page cache, imo. 16kb 
isn't too bad, and 32kb is downright fine.
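
For anyone who wants to reproduce the shape of that measurement without the gist, here is a rough, self-contained analogue (an assumption-laden sketch: the real script runs against actual SSTable data with Cassandra's compressors, whereas this uses stdlib zlib and synthetic rows, so the absolute numbers will differ):

```python
import zlib

def chunk_ratio(data: bytes, chunk_size: int) -> float:
    """Compressed/uncompressed ratio when compressing in independent chunks,
    the way a chunked SSTable compressor would."""
    compressed = sum(len(zlib.compress(data[i:i + chunk_size]))
                     for i in range(0, len(data), chunk_size))
    return compressed / len(data)

# Synthetic, repetitive row-like data; ~256kb total.
sample = b"user_id,event,timestamp,payload-" * 8192
for size in (4096, 8192, 16384, 32768, 65536):
    print(f"Chunk size {size}, ratio {chunk_ratio(sample, size):.6f}")
```

The trend matches the table above: smaller chunks give the compressor less context (plus more per-chunk overhead), so the ratio worsens as the chunk size drops.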

In my experience, 32kb is often an easy win, and 16kb is often a good idea for 
less compressible datasets. Last I checked, Scylla uses direct I/O and bypasses 
the OS cache, so I don't think we should use their default unless we implement 
direct I/O as well (and the buffer cache on top of it)...

If the dataset is smaller than RAM, then yeah, 4kb all the way...
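
To show why the ratio loss matters for the page cache, here is some back-of-envelope math combining the measured ratios with the 460GB-data / 128GB-RAM figures from the issue description below (a sketch only; it assumes the page cache holds compressed chunks and ignores everything else competing for RAM):

```python
# Rough page-cache math: smaller chunks -> worse ratio -> more bytes on disk
# -> a smaller fraction of the dataset can stay hot in the page cache.
DATA_GB, RAM_GB = 460, 128
ratios = {4096: 0.541505, 16384: 0.425122, 65536: 0.352454}

for chunk, ratio in ratios.items():
    on_disk_gb = DATA_GB * ratio
    cacheable = min(1.0, RAM_GB / on_disk_gb)
    print(f"chunk {chunk:>6}: ~{on_disk_gb:.0f}GB on disk, "
          f"~{cacheable:.0%} fits in page cache")
```

Under those assumptions, dropping from 16kb to 4kb chunks shrinks the cacheable fraction of the dataset from roughly two thirds to roughly half.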

> Lower default chunk_length_in_kb from 64kb to 16kb
> --------------------------------------------------
>
>                 Key: CASSANDRA-13241
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13241
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Core
>            Reporter: Benjamin Roth
>            Assignee: Ariel Weisberg
>            Priority: Major
>         Attachments: CompactIntegerSequence.java, 
> CompactIntegerSequenceBench.java, CompactSummingIntegerSequence.java
>
>
> Having too low a chunk size may result in some wasted disk space. Too high a 
> chunk size may lead to massive overreads and can have a critical impact on 
> overall system performance.
> In my case, the default chunk size led to peak read IO of up to 1GB/s and 
> average reads of 200MB/s. After lowering the chunk size (aligned with read 
> ahead, of course), the average read IO went below 20MB/s, typically 10-15MB/s.
> The risk of (physical) overreads increases as the (page cache size) / 
> (total data size) ratio decreases.
> High chunk sizes are mostly appropriate for bigger payloads per request, but 
> if the model consists of rather small rows or small result sets, the read 
> overhead with a 64kb chunk size is insanely high. This applies, for example, 
> to (small) skinny rows.
> Please also see here:
> https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY
> To give you some insights what a difference it can make (460GB data, 128GB 
> RAM):
> - Latency of a quite large CF: https://cl.ly/1r3e0W0S393L
> - Disk throughput: https://cl.ly/2a0Z250S1M3c
> - This shows that the request distribution remained the same, so no "dynamic 
> snitch magic": https://cl.ly/3E0t1T1z2c0J



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
