[ https://issues.apache.org/jira/browse/CASSANDRA-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13525993#comment-13525993 ]
Adrien Grand commented on CASSANDRA-5038:
-----------------------------------------

bq. Cool, yeah I'm not sure if we can use the "known size" decompressor, does it have to be exact or can it be upper bounded? We know from the block size the max compressed length.

It needs to be exact, or decompression will fail. An option to be able to use it is to write the original length as an int (or better as a variable-length int) before the compressed bytes. Upon decompression, first read the original length and then use this original length to call the "known size" decompressor.

bq. I'd suggest you add a simple way for us to pick the best compressor for our node.

This is what the LZ4Factory#defaultInstance (I should probably rename it to fastestInstance) aims at doing but it only tries unsafe then safe right now. I'll try to add support for the native impl soon.

Another feature of these compressors you might be interested in is that you can provide them with an output buffer of any length and they will succeed only if they managed to generate an output which is small enough (and they will fail as soon as they know they won't make it). So for example, you could decide to write the raw bytes instead of the compressed bytes if LZ4 didn't manage to compress your data by more than 10%:

{code}
final int maxAcceptableCompressedLength = originalLength * 90 / 100;
try {
  dest[0] = 0; // means compressed
  final int compressedLength = compressor.compress(src, 0, originalLength, dest, 1, maxAcceptableCompressedLength);
  return 1 + compressedLength;
} catch (LZ4Exception e) {
  dest[0] = 1; // means not compressed
  System.arraycopy(src, 0, dest, 1, originalLength);
  return 1 + originalLength;
}
{code}

(Only the native LZ4 HC impl doesn't support this feature.)
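To make the length-prefix suggestion above concrete, here is a minimal sketch of the framing only. It assumes a simple 7-bits-per-byte variable-length int; the class and method names (LengthPrefixedBlock, writeVInt, readVInt, frame) are hypothetical and not part of lz4-java, and the compressed payload is treated as opaque bytes so the example runs without the library.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (not part of lz4-java): frames a block as [varint originalLength][compressed bytes]. */
final class LengthPrefixedBlock {

    /** Writes value as a 7-bits-per-byte varint at dest[off..] and returns the number of bytes written (1 to 5). */
    static int writeVInt(int value, byte[] dest, int off) {
        final int start = off;
        while ((value & ~0x7F) != 0) {
            dest[off++] = (byte) ((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        dest[off++] = (byte) value; // last byte, high bit clear
        return off - start;
    }

    /** Reads a varint from buf and advances its position past the header. */
    static int readVInt(ByteBuffer buf) {
        int value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            final byte b = buf.get();
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalStateException("malformed varint");
    }

    /** Prepends the original (uncompressed) length to the first compressedLength bytes of compressed. */
    static byte[] frame(int originalLength, byte[] compressed, int compressedLength) {
        final byte[] header = new byte[5]; // an int needs at most 5 varint bytes
        final int headerLength = writeVInt(originalLength, header, 0);
        final byte[] out = new byte[headerLength + compressedLength];
        System.arraycopy(header, 0, out, 0, headerLength);
        System.arraycopy(compressed, 0, out, headerLength, compressedLength);
        return out;
    }

    public static void main(String[] args) {
        final byte[] fakeCompressed = {10, 20, 30}; // stand-in for real LZ4 output
        final byte[] block = frame(65536, fakeCompressed, fakeCompressed.length);

        final ByteBuffer in = ByteBuffer.wrap(block);
        final int originalLength = readVInt(in);
        // originalLength is the exact size to hand to the "known size" decompressor;
        // in.position() is where the compressed bytes start.
        System.out.println(originalLength + " bytes expected, payload starts at offset " + in.position());
    }
}
```

With this layout a 64 KB block costs 3 header bytes instead of the 4 a fixed int would take, and the reader knows the exact destination size before it touches the compressed bytes.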
> LZ4Compressor
> -------------
>
>                 Key: CASSANDRA-5038
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5038
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: T Jake Luciani
>            Priority: Minor
>             Fix For: 1.2.1
>
>         Attachments: LZ4Compressor.java, lz4-java.jar
>
>
> LZ4 is a new compression algo that's ~2x faster than Snappy.
> [~jpountz] has written a nice java port which includes a misc.Unsafe version
> that performs >= than our java snappy version.
> Details at http://blog.jpountz.net/post/28092106032/wow-lz4-is-fast
> The nice thing is this should work with java7 and be more portable.
> We can also fallback the pure java impl

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira