[
https://issues.apache.org/jira/browse/CASSANDRA-10995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117244#comment-15117244
]
Benedict commented on CASSANDRA-10995:
--------------------------------------
You probably want to use a larger dataset. I suspect that is all happily
fitting into RAM. Turning off compression may yield larger dividends for
on-disk performance for small rows, since fewer sectors need to be touched.
As far as compressible data is concerned, yes, narrowing the population size
_for each column_ in the yaml will increase compressibility. But only if the
population is very small, since the data would need to occur multiple
times on a single page. Realistically a dictionary generator should be added,
which is not very hard, and has been on my todo list for a long time. That or a
weighted random byte generator, which is more likely to produce certain bytes
(or byte sequences) than others, and so would avoid the need for a
dictionary while providing the same benefit.
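
To illustrate the weighted random byte generator idea, here is a rough sketch
in Python. It is not cassandra-stress code. The function names, the Zipf-like
weighting, the skew value and the 64 KiB buffer size are all assumptions made
for this example, and zlib stands in for whatever compressor the table
actually uses, so the ratios will differ with another compressor.

```python
import random
import zlib


def uniform_bytes(n, rng):
    """Every byte value is equally likely, so the data is close to incompressible."""
    return bytes(rng.randrange(256) for _ in range(n))


def weighted_bytes(n, rng, skew=1.5):
    """Hypothetical weighted generator: byte value b gets weight 1 / (b + 1)**skew,
    so low byte values turn up far more often than high ones."""
    weights = [1.0 / (b + 1) ** skew for b in range(256)]
    return bytes(rng.choices(range(256), weights=weights, k=n))


def compression_ratio(data):
    """Compressed size divided by original size (smaller means more compressible)."""
    return len(zlib.compress(data)) / len(data)


rng = random.Random(42)   # fixed seed so repeated runs are comparable
size = 64 * 1024          # assumed buffer size for the comparison

uniform_ratio = compression_ratio(uniform_bytes(size, rng))
weighted_ratio = compression_ratio(weighted_bytes(size, rng))

print("uniform  ratio: %.3f" % uniform_ratio)
print("weighted ratio: %.3f" % weighted_ratio)
```

The weighted output should compress noticeably better than the uniform output,
with no dictionary of values involved.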
> Consider disabling sstable compression by default in 3.x
> --------------------------------------------------------
>
> Key: CASSANDRA-10995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10995
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Aleksey Yeschenko
> Assignee: Jim Witschey
>
> With the new sstable format introduced in CASSANDRA-8099, it's very likely
> that enabled sstable compression is no longer the right default option.
> [~slebresne]'s [blog post|http://www.datastax.com/2015/12/storage-engine-30]
> on the new storage engine has some comparison numbers for 2.2/3.0, with and
> without compression that show that in many cases compression no longer has a
> significant effect on sstable sizes - all while still consuming extra
> resources for both writes (compression) and reads (decompression).
> We should run a comprehensive set of benchmarks to determine whether or not
> compression should be switched to 'off' now in 3.x.