[
https://issues.apache.org/jira/browse/CASSANDRA-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941681#comment-14941681
]
Benjamin Lerer commented on CASSANDRA-10225:
--------------------------------------------
Computing the compression ratio by summing the {{compressedFileLength}}
values and dividing by the sum of the {{dataLength}} values does not look
like a bad approach to me, but it seems that the data length might not
always be the real length (according to a comment in {{CompressionMetadata}}).
[~benedict] I am not too familiar with this part of the code. Is there a risk
that computing the compression ratio this way gives us a wrong result?
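
To make the difference concrete, here is a minimal sketch (in Python, not
the actual Cassandra code) comparing the plain average of per-sstable ratios
with the size-weighted ratio described above. The class and function names
are hypothetical, the sizes are made-up illustrative values, and the sketch
assumes that the recorded data length is the real uncompressed length, which
is exactly the point in question.

```python
from dataclasses import dataclass


@dataclass
class SSTableSizes:
    """Hypothetical stand-in for the per-sstable sizes (in bytes)."""
    compressed_length: int  # on-disk size, playing the role of compressedFileLength
    data_length: int        # uncompressed size, playing the role of dataLength


def mean_of_ratios(tables):
    """Average the per-sstable ratios, ignoring how large each sstable is."""
    ratios = [t.compressed_length / t.data_length for t in tables]
    return sum(ratios) / len(ratios)


def size_weighted_ratio(tables):
    """Divide the total compressed size by the total uncompressed size."""
    total_compressed = sum(t.compressed_length for t in tables)
    total_data = sum(t.data_length for t in tables)
    if total_data == 0:
        raise ValueError("no uncompressed data to compute a ratio from")
    return total_compressed / total_data


# Illustrative values only: one tiny sstable that compresses poorly and one
# large sstable that compresses well.
tables = [
    SSTableSizes(compressed_length=900, data_length=1_000),
    SSTableSizes(compressed_length=200_000, data_length=1_000_000),
]

print(f"mean of ratios:      {mean_of_ratios(tables):.4f}")
print(f"size-weighted ratio: {size_weighted_ratio(tables):.4f}")
```

With these numbers the tiny sstable pulls the plain average far away from
the ratio that actually describes the bytes on disk, while the size-weighted
ratio stays close to that of the large sstable. If {{dataLength}} is not
always the real length, the weighted result is only as accurate as that
field.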
> Make compression ratio much more accurate
> -----------------------------------------
>
> Key: CASSANDRA-10225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10225
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Jeremy Hanna
> Assignee: Brett Snyder
> Labels: lhf
> Fix For: 2.1.x
>
> Attachments: cassandra-2.1-10225.txt
>
>
> Currently, cfstats averages the compression ratios of all of the sstables
> without regard to their data sizes. This can lead to a very inaccurate
> value. It would be good to factor in the uncompressed and compressed sizes
> of the sstables to give an accurate number.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)