[ https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275726#comment-14275726 ]
Ariel Weisberg commented on CASSANDRA-6809:
-------------------------------------------

I finished my review. Comments are in the pull request. It looks good and could ship as is. I have some thoughts about potential scope creep that I would advocate for, some other directions in which the commit log could be enhanced, and some reservations about performance in some cases. I only just noticed CommitLog stress, so I need to check that out to understand the numbers and what is being tested.

RE CASSANDRA-7075, multiple CL disks: I see this as a workaround for not having a RAID-0 of the volumes being used for the CL, and that is it. It may also introduce its own load-balancing issues, as well as a mess of code for scattering/gathering mutations that I am less comfortable with. Writing a CL pipeline that can do the maximum supported sequential IO to a single file is doable, and if I had a choice it is what I would rather write. From a user perspective it is a nice feature not to be forced to provide a RAID volume, and to me that should be the primary motivation.

Also, a fascinating (to me) piece of trivia: when I tested in the past, I could call force() on a mapped byte buffer far fewer times per second than I could call force() on a FileChannel. With a battery-backed disk controller, appending a page (in a preallocated file) and calling force() in a loop did a few hundred syncs a second with a MappedByteBuffer, but a few thousand with FileChannel.force. MBB was slow enough to be a concern for synchronous commits.

> Compressed Commit Log
> ---------------------
>
> Key: CASSANDRA-6809
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Benedict
> Assignee: Branimir Lambov
> Priority: Minor
> Labels: performance
> Fix For: 3.0
>
> Attachments: logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log.
> Doing so should improve throughput, but some care will need to be taken to
> ensure we use as much of a segment as possible. I propose decoupling the
> writing of the records from the segments. Basically write into a (queue of)
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X
> MB written to the CL (where X is ordinarily CLS size), and then pack as many
> of the compressed chunks into a CLS as possible.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
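
The proposal quoted above (compress roughly 64K chunks, then pack as many compressed chunks as fit into a segment) could be sketched along the following lines. This is only an illustration under stated assumptions, not the patch under review: the class name ChunkedSegmentSketch, the pack/unpack methods, and the 8-byte frame header (raw length, then compressed length) are all hypothetical, and java.util.zip.Deflater stands in for whatever compressor the commit log would actually use.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch; none of these names come from Cassandra's CommitLog code.
final class ChunkedSegmentSketch {
    // ~64K chunks, as suggested in the issue description.
    static final int CHUNK_SIZE = 64 * 1024;
    // Assumed frame header: 4-byte raw length + 4-byte compressed length.
    static final int HEADER_SIZE = 8;

    /**
     * Compresses input in CHUNK_SIZE pieces and appends one frame per chunk to
     * segment. Stops at the first frame that does not fit, leaving that chunk
     * unread in input. Returns the number of input bytes packed.
     */
    static int pack(ByteBuffer input, ByteBuffer segment) {
        byte[] raw = new byte[CHUNK_SIZE];
        // Deflate can slightly expand incompressible data, so leave headroom.
        byte[] compressed = new byte[CHUNK_SIZE + 1024];
        int consumed = 0;
        while (input.hasRemaining()) {
            int rawLen = Math.min(CHUNK_SIZE, input.remaining());
            int start = input.position();
            input.get(raw, 0, rawLen);

            Deflater deflater = new Deflater();
            deflater.setInput(raw, 0, rawLen);
            deflater.finish();
            int compLen = deflater.deflate(compressed);
            boolean complete = deflater.finished();
            deflater.end();

            if (!complete || segment.remaining() < HEADER_SIZE + compLen) {
                input.position(start); // chunk goes to the next segment instead
                break;
            }
            segment.putInt(rawLen).putInt(compLen).put(compressed, 0, compLen);
            consumed += rawLen;
        }
        return consumed;
    }

    /** Reads frames from a flipped segment and returns the decompressed bytes. */
    static byte[] unpack(ByteBuffer segment) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (segment.remaining() >= HEADER_SIZE) {
            int rawLen = segment.getInt();
            int compLen = segment.getInt();
            byte[] compressed = new byte[compLen];
            segment.get(compressed);
            byte[] raw = new byte[rawLen];
            Inflater inflater = new Inflater();
            try {
                inflater.setInput(compressed);
                int n = 0;
                while (n < rawLen) {
                    int r = inflater.inflate(raw, n, rawLen - n);
                    if (r == 0) break; // no progress: input exhausted early
                    n += r;
                }
                if (n != rawLen) throw new IllegalStateException("truncated chunk");
            } catch (DataFormatException e) {
                throw new IllegalStateException("corrupt chunk", e);
            } finally {
                inflater.end();
            }
            out.write(raw, 0, rawLen);
        }
        return out.toByteArray();
    }
}
```

A caller would invoke pack repeatedly, starting a fresh segment buffer whenever it returns before the input is exhausted, and would flip() a segment before handing it to unpack. Compressing each chunk independently wastes a little ratio compared with one long stream, but it lets a partially filled segment be read back without any state from its neighbours.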