[ https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062379#comment-13062379 ]
Chris Burroughs commented on CASSANDRA-47:
------------------------------------------
bq. Using 64kb buffer 1.7GB file could be compressed into 110MB (data added
using ./bin/stress -n 1000000 -S 1024 -V, where -V option generates average
size values and different cardinality from 50 (default) to 250).
This seems like an unrealistically good compression ratio (roughly 15:1). If I
gzip a real-world SSTable with redundant data that should be ripe for
compression, I only see 641M --> 217M (about 3:1). What is the gzip compression
ratio for the SSTables that the stress.java workload generates?
Stu, could you post your custom YCSB workload from CASSANDRA-674 for comparison?
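For anyone who wants to reproduce the gzip comparison asked about above, here is a minimal sketch in Python. It is not part of the patch: the function name gzip_ratio, the 64 KB read size, and compression level 6 are all assumptions made for illustration. It also gzips the whole file as one stream, which is not the same as compressing independent 64kb buffers, so treat the number as a rough upper bound for comparison rather than as what the patch would achieve.

```python
import sys
import zlib


def gzip_ratio(path, chunk_size=64 * 1024, level=6):
    """Stream a file through gzip and return (raw_bytes, gzip_bytes, ratio).

    Nothing is written to disk; only the compressed size is counted.
    wbits=31 asks zlib for a gzip header and trailer instead of raw deflate.
    """
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    raw = 0
    packed = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            raw += len(chunk)
            packed += len(compressor.compress(chunk))
    # flush() emits any buffered data plus the gzip trailer
    packed += len(compressor.flush())
    return raw, packed, raw / packed


if __name__ == "__main__":
    # Usage (paths are placeholders): python gzip_ratio.py <sstable-data-file> ...
    for path in sys.argv[1:]:
        raw, packed, ratio = gzip_ratio(path)
        print(f"{path}: {raw} -> {packed} bytes ({ratio:.2f}:1)")
```

Running it over the Data files produced by the stress run and over a real-world SSTable would give two ratios that can be compared directly.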
> SSTable compression
> -------------------
>
> Key: CASSANDRA-47
> URL: https://issues.apache.org/jira/browse/CASSANDRA-47
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Pavel Yaskevich
> Labels: compression
> Fix For: 1.0
>
> Attachments: CASSANDRA-47.patch, snappy-java-1.0.3-rc4.jar
>
>
> We should be able to do SSTable compression which would trade CPU for I/O
> (almost always a good trade).