[ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140658#comment-14140658
 ] 

Branimir Lambov commented on CASSANDRA-6809:
--------------------------------------------

Patch is available for review at https://github.com/blambov/cassandra/pull/2:

{panel}
Provides two implementations of commit log segments, one matching the
previous memory-mapped writing method for uncompressed logs and one
that uses in-memory buffers and compresses sections between sync markers
before writing to the log. Replay is changed to decompress these sections
and keep track of the uncompressed position to correctly identify the replay
position.

The compression class and parameters are specified in cassandra.yaml and
stored in the commit log descriptor.

Tested by the test-compression target, which now enables LZ4Compression
of commit logs in addition to compression for SSTables.
{panel}
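
To illustrate the replay bookkeeping described above, here is a minimal sketch, not code from the patch. The class name, the method names and the 8-byte section header (uncompressed length, then compressed length) are assumptions made for the example, and java.util.zip.Deflater stands in for the configured compressor so that the example needs nothing beyond the JDK. Each section between sync markers is compressed on its own, and replay advances a position in the uncompressed stream rather than in the file.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: names and section layout are illustrative, not from the patch.
final class CompressedSectionSketch {

    // Compresses one section (the bytes between two sync markers) and prefixes it
    // with an assumed header: [uncompressed length][compressed length].
    static byte[] writeSection(byte[] section) {
        Deflater deflater = new Deflater();
        deflater.setInput(section);
        deflater.finish();
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            compressed.write(chunk, 0, n);
        }
        deflater.end();
        byte[] body = compressed.toByteArray();
        return ByteBuffer.allocate(8 + body.length)
                         .putInt(section.length).putInt(body.length).put(body).array();
    }

    // Concatenates compressed sections into one in-memory "log".
    static byte[] buildLog(byte[]... sections) {
        ByteArrayOutputStream log = new ByteArrayOutputStream();
        for (byte[] section : sections) {
            byte[] encoded = writeSection(section);
            log.write(encoded, 0, encoded.length);
        }
        return log.toByteArray();
    }

    // Decompresses every section into sink and returns, for each section, the
    // position in the uncompressed stream at which it starts.
    static List<Integer> replay(byte[] log, ByteArrayOutputStream sink) {
        List<Integer> positions = new ArrayList<>();
        ByteBuffer in = ByteBuffer.wrap(log);
        int uncompressedPosition = 0;
        while (in.remaining() >= 8) {
            int rawLength = in.getInt();
            byte[] body = new byte[in.getInt()];
            in.get(body);
            byte[] raw = new byte[rawLength];
            Inflater inflater = new Inflater();
            inflater.setInput(body);
            try {
                int filled = 0;
                while (filled < rawLength) {
                    int n = inflater.inflate(raw, filled, rawLength - filled);
                    if (n == 0) {
                        throw new IllegalStateException("truncated or corrupt section");
                    }
                    filled += n;
                }
            } catch (DataFormatException e) {
                throw new IllegalStateException("corrupt section", e);
            } finally {
                inflater.end();
            }
            positions.add(uncompressedPosition);
            sink.write(raw, 0, rawLength);
            uncompressedPosition += rawLength;
        }
        return positions;
    }
}
```

In this sketch the positions returned by replay() are offsets into the decompressed data, so they do not change with how well a given section compresses.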


[~jasobrown]: Using a writer interface would probably be a little cleaner from a
design point of view, but if we want to preserve all features of the current
approach, the two writing methods and the log segment class are so tightly
coupled that it doesn't really matter.

The measurements I did compared the number of fixed-size commit log writes
that can be performed in a given time period (the CommitLogStress test
introduced in CASSANDRA-3578 and slightly updated here). Memory-mapped IO does
seem to provide some benefit, at least on Windows, which to me means we should
not remove it yet.
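
For readers who want to try this kind of comparison themselves, the following is a rough sketch of the idea only. It is not the CommitLogStress test, and the class name, the 1 MiB segment size and the reuse of a single segment are all assumptions made for the example. It counts how many fixed-size records can be written in a time budget, once through a memory-mapped segment and once through a direct buffer flushed with FileChannel.write.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: not the CommitLogStress test, just the shape of the measurement.
final class WriteRateSketch {
    static final int SEGMENT_SIZE = 1 << 20; // assumed segment size (1 MiB)

    // Counts fixed-size writes into a memory-mapped segment within the time budget.
    static long countMappedWrites(FileChannel channel, byte[] record, long millis) throws IOException {
        MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, SEGMENT_SIZE);
        long deadline = System.nanoTime() + millis * 1_000_000L;
        long writes = 0;
        while (System.nanoTime() < deadline) {
            if (map.remaining() < record.length) {
                map.force();  // sync the full segment, then reuse it
                map.clear();
            }
            map.put(record);
            writes++;
        }
        map.force();
        return writes;
    }

    // Counts fixed-size writes into an in-memory buffer that is written out when full.
    static long countBufferedWrites(FileChannel channel, byte[] record, long millis) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(SEGMENT_SIZE);
        long deadline = System.nanoTime() + millis * 1_000_000L;
        long writes = 0;
        while (System.nanoTime() < deadline) {
            if (buffer.remaining() < record.length) {
                flush(channel, buffer);
            }
            buffer.put(record);
            writes++;
        }
        flush(channel, buffer);
        return writes;
    }

    // Writes the buffer contents at the start of the file and syncs them to disk.
    static void flush(FileChannel channel, ByteBuffer buffer) throws IOException {
        buffer.flip();
        channel.position(0);
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
        channel.force(false);
        buffer.clear();
    }

    // Returns {mappedWrites, bufferedWrites} for the given record size and time budget.
    static long[] run(int recordSize, long millis) {
        byte[] record = new byte[recordSize];
        try {
            Path file = Files.createTempFile("clsketch", ".log");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                long mapped = countMappedWrites(channel, record, millis);
                long buffered = countBufferedWrites(channel, record, millis);
                return new long[] { mapped, buffered };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling WriteRateSketch.run(1024, 1000) gives one pair of counts. The absolute numbers depend heavily on the OS, the disk and JVM warm-up, so only repeated runs on the same machine are comparable.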

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. 
> Doing so should improve throughput, but some care will need to be taken to 
> ensure we use as much of a segment as possible. I propose decoupling the 
> writing of the records from the segments. Basically write into a (queue of) 
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
> MB written to the CL (where X is ordinarily CLS size), and then pack as many 
> of the compressed chunks into a CLS as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
