Joseph Lynch created CASSANDRA-14886:
----------------------------------------

             Summary: Add a tool for estimating compression effects for different block sizes / compressors
                 Key: CASSANDRA-14886
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14886
             Project: Cassandra
          Issue Type: Improvement
          Components: Compression
            Reporter: Joseph Lynch


A common question from users of compression is "which block size should I use?"
Until we figure out how to auto-tune the block size (or use something like zstd
dictionary training), it might be useful to ship a tool similar to the one
[~aweisberg] created ([gist mirror|https://gist.github.com/jolynch/411e62ac592bfb55cfdd5db87c77ef6f])
for CASSANDRA-13241. Users could point it at an existing sstable, and it would
output the expected compression ratios for that sstable re-compressed with
either different block sizes or a different compressor altogether. For example,
maybe something like:
{noformat}
$ /cassandra/tools/bin/sstable-compression-estimate <foo>
Compressor | Chunk Size | Ratio | Read Speed | Off-Heap Memory |
----------------------------------------------------------------
LZ4        | 4096       | 0.54  | 0.2 ms     | 100kb           |
LZ4        | 8192       | 0.46  | 0.3 ms     | 50kb            |
LZ4        | 16384      | 0.42  | 0.3 ms     | 24kb            |
LZ4        | 32768      | 0.38  | 0.4 ms     | 12kb            |
LZ4        | 65536      | 0.35  | 0.8 ms     | 6kb             |
----------------------------------------------------------------
Zstd       | 4096       | 0.40  | 0.3 ms     | 100kb           |
Zstd       | 8192       | 0.34  | 0.4 ms     | 50kb            |
Zstd       | 16384      | 0.25  | 0.5 ms     | 24kb            |

...
{noformat}
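
As a rough illustration of the estimation loop such a tool might run (this is not the tool from the gist above), here is a minimal Python sketch. It makes several assumptions: the input is already-uncompressed bytes rather than a real sstable, the standard library's zlib stands in for LZ4 and Zstd because neither ships with Python, and the off-heap cost is approximated as one 8-byte chunk offset per chunk. The function name estimate_chunk_sizes and its return shape are hypothetical.

```python
import zlib

# Assumption: one 8-byte offset is kept per compressed chunk.
OFFSET_BYTES_PER_CHUNK = 8

DEFAULT_CHUNK_SIZES = (4096, 8192, 16384, 32768, 65536)


def estimate_chunk_sizes(data, chunk_sizes=DEFAULT_CHUNK_SIZES):
    """Compress `data` chunk by chunk and report per-chunk-size estimates.

    Returns {chunk_size: {"ratio": ..., "chunks": ..., "offset_bytes": ...}}
    where ratio = compressed size / uncompressed size (lower is better).
    zlib is a stand-in compressor here, not LZ4 or Zstd.
    """
    results = {}
    for size in chunk_sizes:
        compressed_total = 0
        chunks = 0
        # Each chunk is compressed independently, as a block compressor would.
        for start in range(0, len(data), size):
            compressed_total += len(zlib.compress(data[start:start + size]))
            chunks += 1
        ratio = compressed_total / len(data) if data else float("nan")
        results[size] = {
            "ratio": ratio,
            "chunks": chunks,
            "offset_bytes": chunks * OFFSET_BYTES_PER_CHUNK,
        }
    return results
```

Calling estimate_chunk_sizes(sample_bytes) on a representative sample of uncompressed data yields one row per chunk size, which could be formatted like the table above. Read speed is not measured in this sketch; a real tool would also need to decompress the sstable's existing chunks before re-compressing them.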




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
