[
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285436#comment-14285436
]
Branimir Lambov commented on CASSANDRA-6809:
--------------------------------------------
The current approach boils down to multiple sync threads, each of which:
* forms a section at regular time intervals,
* compresses the section,
* waits for any previous syncs to have retired,
* writes and flushes the compressed data,
* retires the sync.
(In the uncompressed case there is a single thread, which skips the second and
third steps.)
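A minimal sketch of the steps above (class and method names are invented for illustration, not Cassandra's actual code): each sync thread compresses its own section independently, then takes a monitor to wait for all earlier syncs to retire before writing, flushing, and retiring its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.zip.Deflater;

class CurrentApproachSketch {
    private final AtomicLong nextId = new AtomicLong(); // sections numbered in formation order
    private long lastRetired = -1;                      // highest section id retired so far
    private final List<Long> retired = new ArrayList<>();

    // steps 1-5, as executed by each sync thread
    void sync(byte[] section) {
        long id = nextId.getAndIncrement();             // 1: form a section
        byte[] compressed = compress(section);          // 2: compress it (in parallel per thread)
        synchronized (this) {
            try {
                while (lastRetired != id - 1) wait();   // 3: wait for previous syncs to retire
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            writeAndFlush(compressed);                  // 4: write and flush the compressed data
            lastRetired = id;                           // 5: retire the sync
            retired.add(id);
            notifyAll();
        }
    }

    static byte[] compress(byte[] section) {
        Deflater d = new Deflater();
        d.setInput(section);
        d.finish();
        byte[] buf = new byte[section.length + 64];
        int n = d.deflate(buf);
        d.end();
        byte[] out = new byte[n];
        System.arraycopy(buf, 0, out, 0, n);
        return out;
    }

    void writeAndFlush(byte[] data) { /* append to the segment and fsync (elided) */ }

    synchronized List<Long> retiredOrder() { return new ArrayList<>(retired); }
}
```

Note that only the compression runs concurrently; the write/flush/retire tail is serialized by the monitor, which is what the wait in step 3 buys.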
Let me try to rephrase what you are saying to make sure I understand it
correctly:
* a single sync thread forms sections at regular time intervals and sends them
to a compression executor/phase (SPMC queue),
* each compression task sends its completed section to a flush executor/phase
(MPSC queue; ordering, and waiting for the first in-flight section, is
required),
* the flush task retires syncs in order.
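For concreteness, my reading of that three-stage pipeline would look roughly like this (a sketch under my own assumptions; `PipelinedSync`, `Section`, and the reorder buffer in the flusher are invented, and compression is elided to an identity function):

```java
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PipelinedSync {
    record Section(long id, byte[] data) {}

    // SPMC queue: the single sync thread publishes sections to N compressors.
    final BlockingQueue<Section> toCompress = new LinkedBlockingQueue<>();
    // MPSC queue: compressors publish completed sections to the one flush thread.
    final BlockingQueue<Section> toFlush = new LinkedBlockingQueue<>();
    final java.util.List<Long> flushed = new java.util.ArrayList<>();

    // compression worker: take a section, compress it, hand it to the flusher
    Runnable compressor() {
        return () -> {
            try {
                while (true) {
                    Section s = toCompress.take();
                    toFlush.put(new Section(s.id(), compress(s.data())));
                }
            } catch (InterruptedException e) { /* worker shut down */ }
        };
    }

    // single flush thread: a reorder buffer enforces in-order retiring,
    // since compressors may finish out of order
    Runnable flusher(int expectedCount) {
        return () -> {
            TreeMap<Long, Section> pending = new TreeMap<>();
            long next = 0;
            try {
                while (flushed.size() < expectedCount) {
                    Section s = toFlush.take();
                    pending.put(s.id(), s);
                    while (!pending.isEmpty() && pending.firstKey() == next) {
                        Section head = pending.remove(next);
                        // write + fsync head.data() here (elided), then retire:
                        flushed.add(head.id());
                        next++;
                    }
                }
            } catch (InterruptedException e) { /* flusher shut down */ }
        };
    }

    static byte[] compress(byte[] in) { return in; } // placeholder for real compression
}
```

The two `LinkedBlockingQueue` hand-offs and the `TreeMap` reorder buffer are exactly the extra moving parts my latency/resource question above is about.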
Is this what you mean? Why is this simpler, or of comparable complexity?
Wouldn't the two extra queues waste resources and increase latency?
Smaller-than-segment batches (sections) are already part of the design in both
cases, assuming that the sync period is sane (e.g. ~100ms). In both approaches
there's room to further separate write and flush at the expense of added
complexity.
> Compressed Commit Log
> ---------------------
>
> Key: CASSANDRA-6809
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Benedict
> Assignee: Branimir Lambov
> Priority: Minor
> Labels: performance
> Fix For: 3.0
>
> Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log.
> Doing so should improve throughput, but some care will need to be taken to
> ensure we use as much of a segment as possible. I propose decoupling the
> writing of the records from the segments. Basically write into a (queue of)
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X
> MB written to the CL (where X is ordinarily CLS size), and then pack as many
> of the compressed chunks into a CLS as possible.
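The chunking scheme described in the ticket could be sketched roughly as follows (hypothetical names throughout; `ChunkedSegmentWriter` is an illustration of the idea, not Cassandra's implementation): records accumulate in a direct staging buffer, every ~64K of writes is compressed as one chunk, and compressed chunks are packed into the current segment while they fit.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

class ChunkedSegmentWriter {
    static final int CHUNK_SIZE = 64 * 1024;      // ~64K chunks, per the proposal
    final ByteBuffer staging = ByteBuffer.allocateDirect(CHUNK_SIZE);
    final int segmentCapacity;
    int segmentUsed = 0;
    final List<byte[]> segmentChunks = new ArrayList<>(); // stands in for the mmapped segment

    ChunkedSegmentWriter(int segmentCapacity) { this.segmentCapacity = segmentCapacity; }

    // records are decoupled from segments: they just fill the staging buffer
    void append(byte[] record) {
        int off = 0;
        while (off < record.length) {
            int n = Math.min(staging.remaining(), record.length - off);
            staging.put(record, off, n);
            off += n;
            if (!staging.hasRemaining()) flushChunk();
        }
    }

    // compress the full staging buffer and pack it into the segment if it fits
    void flushChunk() {
        staging.flip();
        byte[] raw = new byte[staging.remaining()];
        staging.get(raw);
        staging.clear();
        byte[] compressed = deflate(raw);
        if (segmentUsed + compressed.length <= segmentCapacity) {
            segmentChunks.add(compressed);
            segmentUsed += compressed.length;
        } // else: roll over to a new segment (elided)
    }

    static byte[] deflate(byte[] in) {
        Deflater d = new Deflater();
        d.setInput(in);
        d.finish();
        byte[] buf = new byte[in.length + 64];
        int n = d.deflate(buf);
        d.end();
        byte[] out = new byte[n];
        System.arraycopy(buf, 0, out, 0, n);
        return out;
    }
}
```

Because chunks are compressed, more than one segment's worth of logical data can be packed into a segment of fixed on-disk size, which is the throughput win the ticket is after.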
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)