[ 
https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062379#comment-13062379
 ] 

Chris Burroughs commented on CASSANDRA-47:
------------------------------------------

bq. Using 64kb buffer 1.7GB file could be compressed into 110MB (data added 
using ./bin/stress -n 1000000 -S 1024 -V, where -V option generates average 
size values and different cardinality from 50 (default) to 250).

This seems like an unrealistically good compression ratio (roughly 15:1).  If 
I gzip a real-world SSTable with redundant data that should be ripe for 
compression, I only see 641M-->217M (roughly 3:1).  What is the gzip 
compression ratio for the SSTables that the stress.java workload generates?
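
For anyone who wants to run the same comparison, here is a minimal sketch of 
one way to measure it.  It is not the code from the attached patch: it uses 
Python's zlib (DEFLATE, the algorithm behind gzip) rather than Snappy, and 
the function name and the per-chunk approach are assumptions made purely for 
illustration.  Each 64kb chunk is compressed independently, to loosely mimic 
a fixed-size buffer.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # 64kb, matching the buffer size quoted above


def chunked_ratio(path, chunk_size=CHUNK_SIZE):
    """Return original_size / compressed_size for the file at `path`.

    Every chunk is deflated on its own, so redundancy that spans
    chunk boundaries is not exploited (hypothetical helper, not
    part of Cassandra or the patch).
    """
    original = 0
    compressed = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            original += len(chunk)
            compressed += len(zlib.compress(chunk))
    if compressed == 0:
        raise ValueError("file is empty: " + path)
    return original / compressed
```

Calling chunked_ratio() on a Data.db file written by the stress workload and 
on a production one should show whether the generated values are simply much 
more repetitive than real data.  Absolute numbers will differ from Snappy's, 
so only the relative gap between the two files is meaningful here.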

Stu, could you post your custom YCSB workload from CASSANDRA-674 for comparison?

> SSTable compression
> -------------------
>
>                 Key: CASSANDRA-47
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>              Labels: compression
>             Fix For: 1.0
>
>         Attachments: CASSANDRA-47.patch, snappy-java-1.0.3-rc4.jar
>
>
> We should be able to do SSTable compression which would trade CPU for I/O 
> (almost always a good trade).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
