[ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103622#comment-14103622
 ] 

Jason Brown edited comment on CASSANDRA-6809 at 8/20/14 8:37 AM:
-----------------------------------------------------------------

bq. If we're dropping recycling, ... bottlenecking anything.

I reread this paragraph several times, and now it makes sense. I wasn't 
thinking about the write perf, necessarily, but about having the file 
contiguous on disk. However, since the commit log files are, more or less, 
one-time use (meaning we're not doing lots of random or sequential reads on 
them), I guess worrying about a large contiguous block on disk isn't necessary.

bq. Per-disk sync threads

I'm still not sure sync threads, in the manner initially described above, are 
totally necessary. If you are worried about the time for the mmap'ed buffers to 
flush in the same thread that's handling all the CL entry processing + any 
possible compression or encryption, a simple solution might be to have a sync 
thread that merely invokes the mmap buffer flush. Thus, the main CL thread(s) 
can continue processing the new entries and writing to the mmap buffer, but the 
sync thread eats the cost of the msync.
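
The sync-thread idea above could look roughly like the following Java sketch. 
It is not taken from the Cassandra code base: the class name 
`SyncThreadSketch`, the method `writeWithBackgroundSync`, the entry format, 
and the fixed sync period are all assumptions made for illustration. The 
writer thread appends to a `MappedByteBuffer` while a single background thread 
does nothing but call `force()`, which is the JDK call that flushes the mapping 
to disk.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SyncThreadSketch {
    // Hypothetical helper: appends up to `entries` small records to a mapped
    // file while a separate thread periodically flushes the mapping.
    // Returns the number of bytes written by the writer thread.
    static int writeWithBackgroundSync(Path file, int segmentSize, int entries,
                                       long syncPeriodMillis)
            throws IOException, InterruptedException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, segmentSize);

            // The sync thread only invokes force(); it never writes entries,
            // so the flush cost is paid off the writer's thread.
            ScheduledExecutorService syncThread =
                    Executors.newSingleThreadScheduledExecutor();
            syncThread.scheduleWithFixedDelay(buffer::force,
                    syncPeriodMillis, syncPeriodMillis, TimeUnit.MILLISECONDS);

            try {
                // Writer: keeps appending without waiting for any flush.
                for (int i = 0; i < entries; i++) {
                    byte[] entry = ("entry-" + i + "\n")
                            .getBytes(StandardCharsets.UTF_8);
                    if (buffer.remaining() < entry.length) {
                        break; // segment full; a real log would roll over here
                    }
                    buffer.put(entry);
                }
            } finally {
                syncThread.shutdown();
                syncThread.awaitTermination(5, TimeUnit.SECONDS);
            }

            // Final flush so everything written above reaches the file.
            buffer.force();
            return buffer.position();
        }
    }
}
```

In this sketch the writer never blocks on a flush, which is the point of the 
suggestion; what it leaves out is any tracking of which entries a given 
`force()` call has made durable, which a real commit log would need before 
acknowledging writes.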


was (Author: jasobrown):
bq. If we're dropping recycling, ... bottlenecking anything.

I reread this paragraph several times, and now it makes sense. I wasn't 
thinking about the write perf, necessarily, but about having the file 
contiguous on disk. However, since the commit log files are, more or less, 
one-time use (meaning we're not doing lots of random or sequential reads on 
them), I guess worrying about a large contiguous block on disk isn't necessary.

bq. Per-disk sync threads

I'm still not sure sync threads are totally necessary. If you are worried about 
the time for the mmap'ed buffers to flush in the same thread that's handling 
all the CL entry processing + any possible compression or encryption, a simple 
solution might be to have a sync thread that merely invokes the mmap buffer 
flush. Thus, the main CL thread(s) can continue processing the new entries and 
writing to the mmap buffer, but the sync thread eats the cost of the msync.

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>
> It seems an unnecessary oversight that we don't compress the commit log. 
> Doing so should improve throughput, but some care will need to be taken to 
> ensure we use as much of a segment as possible. I propose decoupling the 
> writing of the records from the segments. Basically write into a (queue of) 
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
> MB written to the CL (where X is ordinarily CLS size), and then pack as many 
> of the compressed chunks into a CLS as possible.
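
The chunk-and-pack proposal in the description can be sketched as follows. 
This is only an illustration under stated assumptions: it uses the JDK's 
`java.util.zip.Deflater` rather than whatever compressor the eventual patch 
chooses, and the class name `ChunkPackingSketch`, the 4-byte length prefix per 
chunk, and the in-memory `ByteBuffer` segments are all hypothetical. Records 
are cut into 64K chunks, each chunk is compressed, and compressed chunks are 
appended to the current segment until the next one no longer fits.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

public class ChunkPackingSketch {
    static final int CHUNK_SIZE = 64 * 1024; // "~64K chunks" from the description

    // Compresses one slice of the record buffer with Deflater (an assumption;
    // the ticket does not name a compressor).
    static byte[] compress(byte[] data, int offset, int length) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(data, offset, length);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] tmp = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(tmp);
            out.write(tmp, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Packs as many compressed chunks as fit into each fixed-size segment.
    // Each chunk is framed as [int compressedLength][compressed bytes].
    static List<ByteBuffer> pack(byte[] records, int segmentSize) {
        List<ByteBuffer> segments = new ArrayList<>();
        ByteBuffer current = ByteBuffer.allocate(segmentSize);
        for (int off = 0; off < records.length; off += CHUNK_SIZE) {
            int len = Math.min(CHUNK_SIZE, records.length - off);
            byte[] compressed = compress(records, off, len);
            int framed = Integer.BYTES + compressed.length;
            if (framed > segmentSize) {
                throw new IllegalArgumentException(
                        "compressed chunk does not fit in an empty segment");
            }
            if (framed > current.remaining()) {
                // Current segment is as full as it can get; start a new one.
                current.flip();
                segments.add(current);
                current = ByteBuffer.allocate(segmentSize);
            }
            current.putInt(compressed.length);
            current.put(compressed);
        }
        if (current.position() > 0) {
            current.flip();
            segments.add(current);
        }
        return segments;
    }
}
```

Because segments are filled by compressed size rather than by raw bytes 
written, the amount of uncompressed data per segment varies with how 
compressible the records are, which is the "use as much of a segment as 
possible" concern raised in the description.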



--
This message was sent by Atlassian JIRA
(v6.2#6252)
