[ https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284213#comment-14284213 ]
Benedict commented on CASSANDRA-6809:
-------------------------------------

I haven't looked _closely_ at the patch, but it seems to me that a simpler approach would be to split CL.sync() into a compression phase and a sync phase, so that there remains only one sync writer. The first phase would submit chunks to an executor service, and the second would flush the compressed chunks to disk (possibly starting immediately, but waiting on the completion of each compression chunk). With 0 compression threads, the compression executor can be an inline executor.

In a follow-up change, the compression stage could be fed by the mutators rather than the sync thread. This would permit smaller-than-segment batches to be submitted, so that there is less delay between a sync starting and flushing, and so that the flush of a single segment can be parallelised. The main advantage, though, is that one sync writer is easier to reason about than multiple sync threads. A rough sketch of the split follows the quoted issue below.

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. Doing so should improve throughput, but some care will need to be taken to ensure we use as much of a segment as possible. I propose decoupling the writing of the records from the segments: basically, write into a (queue of) DirectByteBuffer, have the sync thread compress, say, ~64K chunks every X MB written to the CL (where X is ordinarily the CLS size), and then pack as many of the compressed chunks into a CLS as possible.
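To make the two-phase split concrete, here is a minimal, self-contained sketch in plain Java. All names (TwoPhaseSync, sync(), compress()) are hypothetical, and a DEFLATE pass stands in for whatever compressor the real patch uses; the point is only the shape: one compression executor (inline when compression threads = 0) feeding a single, in-order flush on the sync thread. This is not the attached patch.

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.zip.Deflater;

public final class TwoPhaseSync
{
    // Phase-1 executor. With 0 compression threads this is an inline
    // executor, so compression runs on the sync thread itself and both
    // configurations share a single code path.
    private final Executor compressor;

    public TwoPhaseSync(int compressionThreads)
    {
        this.compressor = compressionThreads == 0
                        ? Runnable::run
                        : Executors.newFixedThreadPool(compressionThreads);
    }

    /**
     * Phase 1 submits each chunk to the compression executor; phase 2,
     * still on the single sync thread, flushes the results to disk in
     * order, waiting on the completion of each chunk as it goes.
     */
    public void sync(List<ByteBuffer> chunks, FileChannel channel) throws IOException
    {
        List<CompletableFuture<ByteBuffer>> compressed = new ArrayList<>(chunks.size());
        for (ByteBuffer chunk : chunks)
            compressed.add(CompletableFuture.supplyAsync(() -> compress(chunk), compressor));

        // Flushing can start as soon as the first chunk is ready; order
        // within the segment is preserved because we wait per chunk.
        for (CompletableFuture<ByteBuffer> future : compressed)
        {
            ByteBuffer out = future.join();
            while (out.hasRemaining())
                channel.write(out);
        }
        channel.force(false); // the actual sync to disk
    }

    // Placeholder DEFLATE pass; the real patch would use the commit
    // log's configured compressor instead.
    private static ByteBuffer compress(ByteBuffer in)
    {
        byte[] src = new byte[in.remaining()];
        in.duplicate().get(src);
        Deflater deflater = new Deflater();
        deflater.setInput(src);
        deflater.finish();
        byte[] dst = new byte[src.length + 64]; // covers deflate's worst-case expansion
        int len = 0;
        while (!deflater.finished())
            len += deflater.deflate(dst, len, dst.length - len);
        deflater.end();
        return ByteBuffer.wrap(dst, 0, len);
    }
}
{code}

Because the flush waits on the futures in submission order, the on-disk layout is identical whether compression is inline or parallel, which is what keeps the single-sync-writer reasoning intact.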