[ 
https://issues.apache.org/jira/browse/CASSANDRA-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153675#comment-16153675
 ] 

Dimitar Dimitrov edited comment on CASSANDRA-13703 at 9/5/17 1:56 PM:
----------------------------------------------------------------------

+1 - from my still largely layman perspective, the change looks good.

A couple of small nits:
* {{CompressedChunkReader.maybeCheckCrc()}} (renamed in this patch to 
{{shouldCheckCrc()}}) may be removed or at least its visibility could be 
reduced. It seems to be used solely in CompressedChunkReader.java, lines 158 
and 204 in this patch.
* The no-argument / single-argument factory methods for Snappy and LZ4 
compressions in CompressionParams.java seem to differ in the values that they 
default to for min compression ratio and max compressed length.
** For Snappy, if nothing is specified, or only chunk length is specified, a 
default min compression ratio of 1.1 is used, and therefore max compressed 
length ends up somewhere roughly around 90% of chunk length.
** For LZ4, if nothing is specified, or only chunk length is specified, a 
default max compressed length of chunk length is used, and therefore min 
compression ratio ends up at 1.0 (I'm not sure if a precision error is possible 
there).

Edit: Of course, take this review with the appropriate rock-sized grain of salt.


was (Author: dimitarndimitrov):
+1 - from my still largely layman perspective, the change looks good.

A couple of small nits:
* {{CompressedChunkReader.maybeCheckCrc()}} (renamed in this patch to 
{{shouldCheckCrc()}}) may be removed or at least its visibility could be 
reduced. It seems to be used solely in CompressedChunkReader.java, lines 158 
and 204 in this patch.
* The no-argument / single-argument factory methods for Snappy and LZ4 
compressions in CompressionParams.java seem to differ in the values that they 
default to for min compression ratio and max compressed length.
** For Snappy, if nothing is specified, or only chunk length is specified, a 
default min compression ratio of 1.1 is used, and therefore max compressed 
length ends up somewhere roughly around 90% of chunk length.
** For LZ4, if nothing is specified, or only chunk length is specified, a 
default max compressed length of chunk length is used, and therefore min 
compression ratio ends up at 1.0 (I'm not sure if a precision error is possible 
there).

> Using min_compress_ratio <= 1 causes corruption
> -----------------------------------------------
>
>                 Key: CASSANDRA-13703
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13703
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>            Priority: Blocker
>             Fix For: 4.x
>
>         Attachments: patch
>
>
> This is because chunks written uncompressed end up below the compressed size 
> threshold. Demonstrated by applying the attached patch meant to improve the 
> testing of the 10520 changes, and running 
> {{CompressedSequentialWriterTest.testLZ4Writer}}.
> The default {{min_compress_ratio: 0}} is not affected as it never writes 
> uncompressed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to