[jira] [Commented] (CASSANDRA-6809) Compressed Commit Log

Ariel Weisberg (JIRA) Tue, 13 Jan 2015 11:13:27 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275726#comment-14275726
 ]


Ariel Weisberg commented on CASSANDRA-6809:
-------------------------------------------

I finished my review; comments are in the pull request. It looks good and could
ship as is. There is some potential scope creep I would advocate for, some
other directions in which the commit log could be enhanced, and I have
reservations about performance in some cases. I only just noticed CommitLog
stress, so I need to look at it to understand the numbers and what is being
tested.

RE CASSANDRA-7075 (multiple CL disks): I see this as a workaround for not
having a RAID-0 of the volumes used for the CL, and nothing more. It may
introduce its own load-balancing issues, as well as a mess of code for
scattering/gathering mutations that I am less comfortable with. Writing a CL
pipeline that can do the maximum supported sequential IO to a single file is
doable, and given the choice it is what I would rather write. From a user
perspective it is a nice feature not to be forced to provide a RAID volume,
and to me that should be the primary motivation.

Also, a fascinating (to me) piece of trivia: when I tested in the past, I could
call force() on a mapped byte buffer far fewer times than I could call force()
on a FileChannel. With a battery-backed disk controller, appending a page (in a
preallocated file) and calling force() in a loop did a few hundred syncs a
second with a MappedByteBuffer, but a few thousand with FileChannel.force. MBB
was slow enough to be a concern for synchronous commits.
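
A minimal sketch of how such a comparison could be rerun, for anyone who wants
to check it on their own hardware. This is not the original test: the class
name ForceComparison, the 4096-byte page size, and the temp-file setup are all
assumptions made for illustration, and the rates it prints depend entirely on
the disk, controller, and filesystem.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ForceComparison {
    static final int PAGE = 4096; // assumed page size, not from the original test

    // Create a temp file and fill it with zeros so later writes do not extend it.
    static Path preallocate(int pages) throws IOException {
        Path file = Files.createTempFile("force-test", ".bin");
        file.toFile().deleteOnExit();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ByteBuffer zeros = ByteBuffer.allocate(pages * PAGE);
            while (zeros.hasRemaining()) {
                ch.write(zeros);
            }
            ch.force(true);
        }
        return file;
    }

    // Write one page at a time through the channel, syncing after each page.
    // Returns elapsed nanoseconds.
    static long timeChannelForce(Path file, int pages) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer page = ByteBuffer.allocateDirect(PAGE);
            long start = System.nanoTime();
            for (int i = 0; i < pages; i++) {
                page.clear();
                long pos = (long) i * PAGE;
                while (page.hasRemaining()) {
                    pos += ch.write(page, pos);
                }
                ch.force(false); // data only, no metadata
            }
            return System.nanoTime() - start;
        }
    }

    // Write one page at a time into a mapping of the same file, syncing the
    // mapping after each page. Returns elapsed nanoseconds.
    static long timeMappedForce(Path file, int pages) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer map =
                ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) pages * PAGE);
            byte[] page = new byte[PAGE];
            long start = System.nanoTime();
            for (int i = 0; i < pages; i++) {
                map.position(i * PAGE);
                map.put(page);
                map.force();
            }
            return System.nanoTime() - start;
        }
    }

    public static void main(String[] args) throws IOException {
        int pages = 1000;
        Path file = preallocate(pages);
        long channelNanos = timeChannelForce(file, pages);
        long mappedNanos = timeMappedForce(file, pages);
        System.out.printf("FileChannel.force:      %.0f syncs/s%n",
                pages / (channelNanos / 1e9));
        System.out.printf("MappedByteBuffer.force: %.0f syncs/s%n",
                pages / (mappedNanos / 1e9));
    }
}
```

Both loops touch the same preallocated file, so the only intended difference
is which force() call performs the sync.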


> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. 
> Doing so should improve throughput, but some care will need to be taken to 
> ensure we use as much of a segment as possible. I propose decoupling the 
> writing of the records from the segments. Basically write into a (queue of) 
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
> MB written to the CL (where X is ordinarily CLS size), and then pack as many 
> of the compressed chunks into a CLS as possible.
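
The packing idea in the description above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the format used by
the patch: java.util.zip.Deflater stands in for whatever compressor is chosen,
and the class name ChunkPacker, the 8-byte length prefix, and the use of a
plain ByteBuffer as the "segment" are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ChunkPacker {
    static final int CHUNK = 64 * 1024; // ~64K uncompressed chunk, per the description

    // Compress one chunk. Deflater is a stand-in codec for this sketch.
    static byte[] compress(byte[] src, int off, int len) {
        Deflater deflater = new Deflater();
        deflater.setInput(src, off, len);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Pack as many compressed chunks of `data` into `segment` as will fit.
    // Each chunk is stored as [rawLength:int][compressedLength:int][bytes].
    // Returns the number of uncompressed bytes consumed.
    static int pack(byte[] data, ByteBuffer segment) {
        int consumed = 0;
        while (consumed < data.length) {
            int len = Math.min(CHUNK, data.length - consumed);
            byte[] compressed = compress(data, consumed, len);
            if (segment.remaining() < 8 + compressed.length) {
                break; // next chunk does not fit; it would go to a new segment
            }
            segment.putInt(len).putInt(compressed.length).put(compressed);
            consumed += len;
        }
        return consumed;
    }

    // Read back every chunk from a flipped segment and return the raw bytes.
    static byte[] unpack(ByteBuffer segment) throws DataFormatException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (segment.remaining() >= 8) {
            int rawLen = segment.getInt();
            byte[] compressed = new byte[segment.getInt()];
            segment.get(compressed);
            Inflater inflater = new Inflater();
            inflater.setInput(compressed);
            byte[] raw = new byte[rawLen];
            int n = 0;
            while (n < rawLen) {
                int r = inflater.inflate(raw, n, rawLen - n);
                if (r == 0 && (inflater.finished() || inflater.needsInput())) {
                    throw new DataFormatException("truncated chunk");
                }
                n += r;
            }
            inflater.end();
            out.write(raw, 0, rawLen);
        }
        return out.toByteArray();
    }
}
```

A caller would invoke pack() on the buffered writes, start a new segment for
whatever was not consumed, and flip() the segment before passing it to unpack()
on replay.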



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
