[
https://issues.apache.org/jira/browse/CASSANDRA-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stu Hood updated CASSANDRA-2398:
--------------------------------
Attachment: 0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt
0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt
Modified the compression interface to operate on ByteBuffers, and added
type-specific compression for CounterColumnType. The compression ratios
(inbytes / outbytes) for the examples in the unit tests are:
{quote}
# with CounterColumnType specific compression
2.745098 for 4 values (inbytes: 140, outbytes: 51)
4.7719297 for 4 values (inbytes: 272, outbytes: 57)
5.9710145 for 8 values (inbytes: 412, outbytes: 69)
5.415465 for 10000 values (inbytes: 350034, outbytes: 64636)
# with generic LZF compression
2.5 for 4 values (inbytes: 140, outbytes: 56)
4.1846156 for 4 values (inbytes: 272, outbytes: 65)
4.2916665 for 8 values (inbytes: 412, outbytes: 96)
2.3148732 for 10000 values (inbytes: 349944, outbytes: 151172)
{quote}
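Each ratio above is inbytes divided by outbytes. As a quick check of the arithmetic, here is a minimal sketch; the class and method names are hypothetical and not taken from the attached patches.

```java
// Hypothetical helper, not part of the attached patches.
final class CompressionRatio
{
    // Ratio of uncompressed to compressed size; higher is better.
    static float ratio(long inBytes, long outBytes)
    {
        return (float) inBytes / outBytes;
    }

    public static void main(String[] args)
    {
        // Byte counts for the 10000-value test, copied from the results above.
        System.out.println("counter-specific: " + ratio(350034, 64636));
        System.out.println("generic LZF:      " + ratio(349944, 151172));
    }
}
```

For the 10000-value case the inputs are nearly the same size, and the type-specific output (64636 bytes) is roughly 43% of the LZF output (151172 bytes).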
> Type specific compression
> -------------------------
>
> Key: CASSANDRA-2398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2398
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Stu Hood
> Labels: compression
> Fix For: 1.0
>
> Attachments:
> 0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt,
> 0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt
>
>
> Cassandra has a lot of locations that are ripe for type-specific compression.
> A short list:
> Indexes:
> * Keys compressed as BytesType, which could default to LZO/LZMA
> * Offsets (delta and varint encoding)
> * Column names added by 2319
> Data:
> * Keys, columns, timestamps: see
> http://wiki.apache.org/cassandra/FileFormatDesignDoc
> A basic interface for type-specific compression could be as simple as:
> {code:java}
> public void compress(int version, final List<ByteBuffer> from, DataOutput to) throws IOException;
> public void decompress(int version, DataInput from, List<ByteBuffer> to) throws IOException;
> public void skip(int version, DataInput from) throws IOException;
> {code}
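To make the shape of that interface concrete, the following is a rough sketch of how the "Offsets (delta and varint encoding)" item from the list might fit the three proposed methods. It is an illustration only, not the code in the attached patches: the class name DeltaVarintOffsets, the varint helpers, and the assumption that each ByteBuffer holds a single 8-byte offset are all hypothetical, and the version argument is ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: a sketch of delta + varint encoding, not the patch itself.
final class DeltaVarintOffsets
{
    // Unsigned varint: 7 payload bits per byte, high bit set on all but the last byte.
    static void writeVarint(long value, DataOutput to) throws IOException
    {
        while ((value & ~0x7FL) != 0)
        {
            to.writeByte((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        to.writeByte((int) value);
    }

    static long readVarint(DataInput from) throws IOException
    {
        long result = 0;
        int shift = 0;
        while (true)
        {
            byte b = from.readByte();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
            shift += 7;
        }
    }

    // Assumption: each buffer holds one 8-byte offset at its current position.
    // Writes the count, then the difference between consecutive offsets.
    public void compress(int version, final List<ByteBuffer> from, DataOutput to) throws IOException
    {
        writeVarint(from.size(), to);
        long previous = 0;
        for (ByteBuffer buffer : from)
        {
            long offset = buffer.getLong(buffer.position()); // absolute read; position unchanged
            writeVarint(offset - previous, to);
            previous = offset;
        }
    }

    // Rebuilds each offset by summing the deltas, one 8-byte buffer per value.
    public void decompress(int version, DataInput from, List<ByteBuffer> to) throws IOException
    {
        long count = readVarint(from);
        long previous = 0;
        for (long i = 0; i < count; i++)
        {
            previous += readVarint(from);
            ByteBuffer buffer = ByteBuffer.allocate(8);
            buffer.putLong(0, previous);
            to.add(buffer);
        }
    }

    // Consumes one encoded block without materializing any buffers.
    public void skip(int version, DataInput from) throws IOException
    {
        long count = readVarint(from);
        for (long i = 0; i < count; i++)
            readVarint(from);
    }

    public static void main(String[] args) throws IOException
    {
        List<ByteBuffer> offsets = new ArrayList<ByteBuffer>();
        for (long offset : new long[] { 0L, 100L, 250L, 100000L })
        {
            ByteBuffer buffer = ByteBuffer.allocate(8);
            buffer.putLong(0, offset);
            offsets.add(buffer);
        }
        DeltaVarintOffsets codec = new DeltaVarintOffsets();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        codec.compress(0, offsets, new DataOutputStream(bytes));

        List<ByteBuffer> decoded = new ArrayList<ByteBuffer>();
        codec.decompress(0, new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())), decoded);
        System.out.println(offsets.size() * 8 + " raw bytes, " + bytes.size() + " encoded bytes");
        for (ByteBuffer buffer : decoded)
            System.out.println(buffer.getLong(0));
    }
}
```

In this sketch the four sample offsets take 8 encoded bytes instead of 32, because small deltas fit in one or two varint bytes. Negative deltas still round-trip but cost 10 bytes each, so the scheme only pays off for sorted or nearly sorted offsets.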