[
https://issues.apache.org/jira/browse/CASSANDRA-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sylvain Lebresne updated CASSANDRA-3001:
----------------------------------------
Attachment: 0002-Add-deflate-compressor.patch
0001-Pluggable-algorithm-and-chunk-length.patch
Attaching a patch to make the compression algorithm configurable, as well as
the chunk length. It implements the idea of having compression be "similar" to
the compaction strategies as far as thrift is concerned.
Speaking of the chunk length, its default value is 65535, which is 64k-1, not
64k. I think this is a problem because of the following line in
CRAR.decompressChunk:
{noformat}
// buffer offset is always aligned
bufferOffset = current & ~(buffer.length - 1);
{noformat}
which I believe only works if buffer.length is a power of 2 (which 64k-1 is
not). We should either change this line or enforce that the chunk length is a
power of two. The attached patch chooses the second solution, enforcing a
power of 2 length (and thus sets the default chunk length to 65536).
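To illustrate why the mask only works for power-of-two lengths, here is a
minimal standalone sketch. The class and method names (ChunkAlignment,
alignWithMask, alignWithDivision) are made up for this example and are not
taken from the patch; only the mask expression mirrors the line quoted above.

```java
public final class ChunkAlignment {
    private ChunkAlignment() {}

    /** True when n is a positive power of two (exactly one bit set). */
    public static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** Mask-based alignment, same expression as the quoted line; only valid for power-of-two lengths. */
    public static long alignWithMask(long position, int chunkLength) {
        return position & ~((long) chunkLength - 1);
    }

    /** Division-based alignment; valid for any positive chunk length. */
    public static long alignWithDivision(long position, int chunkLength) {
        return (position / chunkLength) * chunkLength;
    }

    public static void main(String[] args) {
        long position = 70000L;
        // For 65536 both methods agree; for 65535 the mask gives a wrong offset.
        for (int len : new int[] {65536, 65535}) {
            System.out.println("length=" + len
                + " powerOfTwo=" + isPowerOfTwo(len)
                + " mask=" + alignWithMask(position, len)
                + " division=" + alignWithDivision(position, len));
        }
    }
}
```

With a length of 65535 the mask keeps bit 0 and every bit from 16 upwards, so
the result is in general not a multiple of the chunk length. A check like
isPowerOfTwo at configuration time is one way to enforce the constraint.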
The second attached patch adds a compressor based on Java's default deflate
implementation. Sadly, I haven't found a way to compute in advance the maximum
size a piece of compressed data can take (that is, an equivalent to
Snappy.maxCompressedLength()), so the patch slightly modifies the ICompressor
interface to allow the compression function to resize the buffer if need be.
This is arguably not very elegant, though it works. Besides, I haven't run any
real benchmarks, but given the time it takes to compact the result of a
default stress session, this sounds sloooooow (but it does result in
non-negligibly smaller files than Snappy). I don't know if we want to commit
that part; it felt reasonable to try it at the very least.
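For reference, the "resize the buffer if need be" approach could look roughly
like the sketch below. This is not the code from the patch and does not use the
ICompressor interface; GrowingDeflate and its methods are hypothetical names,
and the only real APIs used are java.util.zip.Deflater and Inflater.

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public final class GrowingDeflate {
    private GrowingDeflate() {}

    /** Compresses input, doubling the output buffer whenever it fills up. */
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            // Initial guess only; no upper bound on the output size is assumed.
            byte[] out = new byte[Math.max(16, input.length / 2)];
            int written = 0;
            while (!deflater.finished()) {
                if (written == out.length) {
                    out = Arrays.copyOf(out, out.length * 2);
                }
                written += deflater.deflate(out, written, out.length - written);
            }
            return Arrays.copyOf(out, written);
        } finally {
            deflater.end();
        }
    }

    /** Decompresses into a buffer of the known original length. */
    public static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] out = new byte[originalLength];
            int read = 0;
            while (read < originalLength && !inflater.finished()) {
                int n = inflater.inflate(out, read, originalLength - read);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid input");
                }
                read += n;
            }
            if (read != originalLength) {
                throw new IllegalArgumentException("unexpected uncompressed length");
            }
            return out;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("invalid deflate stream", e);
        } finally {
            inflater.end();
        }
    }
}
```

Doubling keeps the number of reallocations logarithmic in the output size, at
the cost of a final copy to trim the array.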
> Make the compression algorithm and chunk length configurable
> ------------------------------------------------------------
>
> Key: CASSANDRA-3001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3001
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Priority: Minor
> Labels: compression
> Fix For: 1.0
>
> Attachments: 0001-Pluggable-algorithm-and-chunk-length.patch,
> 0002-Add-deflate-compressor.patch
>
>