[
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275726#comment-14275726
]
Ariel Weisberg commented on CASSANDRA-6809:
-------------------------------------------
I finished my review. Comments are in the pull request. It looks good and could
ship as is. I have some thoughts about scope creep that I would advocate for,
some other directions in which the commit log could be enhanced, and some
reservations about performance in some cases. I only just noticed CommitLog
stress, so I need to check that out to understand the numbers and what is
being tested.
RE CASSANDRA-7075 multiple CL disks. I see this as a workaround for not
having RAID-0 of the volumes being used for the CL, and that is it. It may
introduce its own load balancing issues as well as a mess of code for
scattering/gathering mutations that I am less comfortable with. Writing a CL
pipeline that can do the maximum supported sequential IO to a single file is
doable, and if I had a choice it is what I would rather write. From a user
perspective it is a nice feature not to be forced to provide a RAID volume,
and to me that should be the primary motivation.
Also a fascinating (to me) piece of trivia. When I tested in the past I could
call force() on a mapped byte buffer far fewer times than I could call force()
on a FileChannel. With a battery backed disk controller, appending a page (in
a preallocated file) and calling force() in a loop, a MappedByteBuffer would
do a few hundred syncs a second, but FileChannel.force would do a few
thousand. MBB was slow enough to be a concern for synchronous commits.
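
For anyone who wants to try the comparison, here is a minimal sketch of the
kind of loop I mean. It is not the test I originally ran: the class name
ForceComparison, the 4096 byte page size, and the 256 page count are
assumptions for illustration, and the numbers will depend entirely on the
disk, controller, and filesystem.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical micro-benchmark: append one page, sync, repeat.
final class ForceComparison {
    static final int PAGE = 4096;  // assumed page size
    static final int PAGES = 256;  // assumed number of appends per run

    // Create a temp file and preallocate it by writing PAGE * PAGES zero bytes.
    static Path preallocate() throws IOException {
        Path file = Files.createTempFile("clforce", ".bin");
        Files.write(file, new byte[PAGE * PAGES]);
        file.toFile().deleteOnExit();
        return file;
    }

    // Append pages through a mapping, calling MappedByteBuffer.force() after each.
    static double mappedSyncsPerSecond(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mbb =
                ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) PAGE * PAGES);
            byte[] page = new byte[PAGE];
            long start = System.nanoTime();
            for (int i = 0; i < PAGES; i++) {
                mbb.put(page);
                mbb.force();
            }
            return PAGES / ((System.nanoTime() - start) / 1e9);
        }
    }

    // Append pages through the channel, calling FileChannel.force(false) after each.
    static double channelSyncsPerSecond(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ByteBuffer page = ByteBuffer.allocateDirect(PAGE);
            long start = System.nanoTime();
            for (int i = 0; i < PAGES; i++) {
                page.clear();
                while (page.hasRemaining()) {
                    ch.write(page);
                }
                ch.force(false); // false: file metadata need not be forced
            }
            return PAGES / ((System.nanoTime() - start) / 1e9);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = preallocate();
        System.out.printf("MappedByteBuffer.force: %.0f syncs/s%n",
                mappedSyncsPerSecond(file));
        System.out.printf("FileChannel.force:      %.0f syncs/s%n",
                channelSyncsPerSecond(file));
    }
}
```

Note that the no-argument MappedByteBuffer.force() flushes the whole mapped
region rather than just the page that was appended, which may account for
part of any gap this sketch shows.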
> Compressed Commit Log
> ---------------------
>
> Key: CASSANDRA-6809
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Benedict
> Assignee: Branimir Lambov
> Priority: Minor
> Labels: performance
> Fix For: 3.0
>
> Attachments: logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log.
> Doing so should improve throughput, but some care will need to be taken to
> ensure we use as much of a segment as possible. I propose decoupling the
> writing of the records from the segments. Basically write into a (queue of)
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X
> MB written to the CL (where X is ordinarily CLS size), and then pack as many
> of the compressed chunks into a CLS as possible.