[ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360347#comment-14360347
 ] 

Branimir Lambov commented on CASSANDRA-6809:
--------------------------------------------

Rebased and updated 
[here|https://github.com/apache/cassandra/compare/trunk...blambov:6809-compressed-logs-rebase],
 now using ByteBuffers for compression. Some fixes to the {{DeflateCompressor}} 
implementation are included.
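
To show the kind of ByteBuffer handling involved on Java 8, here is a minimal 
deflate sketch; this is not the patch's {{DeflateCompressor}} itself. Since 
{{Deflater}} only accepts byte arrays on this JDK, buffer contents are staged 
through temporary arrays:

{code:java}
import java.nio.ByteBuffer;
import java.util.zip.Deflater;

// Illustrative sketch of deflate-compressing one ByteBuffer into another.
// Direct buffers have no backing array, so data is copied through byte[].
static void deflate(ByteBuffer input, ByteBuffer output)
{
    byte[] in = new byte[input.remaining()];
    input.get(in);

    Deflater deflater = new Deflater();
    try
    {
        deflater.setInput(in);
        deflater.finish();
        byte[] chunk = new byte[4096];
        while (!deflater.finished())
        {
            int n = deflater.deflate(chunk);
            output.put(chunk, 0, n); // caller must size output generously
        }
    }
    finally
    {
        deflater.end();
    }
}
{code}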

{quote}
At 
https://github.com/apache/cassandra/compare/trunk...blambov:6809-compressed-logs#diff-d07279710c482983e537aed26df80400R340
If archiving fails it appears to delete the segment now. Is that the right 
thing to do?
{quote}
It deletes the segment only if archiving succeeded ({{deleteFile = 
archiveSuccess}}). The old code did the same thing, just more confusingly 
({{deleteFile = !archiveSuccess ? false : true}}).
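
A trivial illustration of the equivalence (names paraphrased, not the patch's 
exact code):

{code:java}
boolean archiveSuccess = true; // stand-in for the archiver's result

boolean deleteFile = archiveSuccess;                    // new code
boolean deleteFileOld = !archiveSuccess ? false : true; // old code, same result

assert deleteFile == deleteFileOld;
{code}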

bq. CLSM's understanding of segment size is skewed because compressed segments 
are smaller than the expected segment size in reality. With real compression 
ratios it's going to be off by 30-50%. It would be nice if its tracking could 
be corrected once the actual size is known.

Unfortunately that's not trivial. Measurement must happen when the file is done 
writing, which is a point CLSM doesn't currently have access to; moreover, that 
point can be reached from either {{sync()}} or {{close()}} on the segment, and 
I don't want to fold the risk of not getting all those updates right into this 
patch. I have changed the description of the parameter to reflect what it 
currently measures.

This can and should be fixed soon after, though, and I'll open an issue as soon 
as this is committed.
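
As a sketch of the shape such a follow-up could take (everything below is 
hypothetical, nothing here is in the patch): the segment reports its final 
on-disk size to a listener exactly once, from whichever of {{sync()}} or 
{{close()}} finishes the file.

{code:java}
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch only: names are illustrative, not from the patch.
class SizeCorrectingSegment
{
    interface SizeListener { void onFinalSize(long expected, long actual); }

    private final AtomicBoolean reported = new AtomicBoolean();
    private final long expectedSize;
    private final FileChannel channel;
    private final SizeListener listener;

    SizeCorrectingSegment(long expectedSize, FileChannel channel, SizeListener listener)
    {
        this.expectedSize = expectedSize;
        this.channel = channel;
        this.listener = listener;
    }

    // Invoked from whichever of sync()/close() completes the file;
    // the CAS makes the size correction idempotent.
    void maybeReportFinalSize() throws IOException
    {
        if (reported.compareAndSet(false, true))
            listener.onFinalSize(expectedSize, channel.size());
    }
}
{code}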

bq. For the buffer pooling: I would be tempted not to wait for the collector to 
get to the DBB. If the DBB is promoted due to compaction or some other 
allocation hog, it may not be reclaimed for some time. In 
{{CompressedSegment.close}}, maybe null the field and then invoke the cleaner 
on the buffer. There is a utility method for doing that so you don't have to 
access the interface directly (it generates a compiler warning).

Done.
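
A minimal sketch of the resulting {{close()}} shape, assuming the utility 
alluded to is {{FileUtils.clean}} (the field and class structure here are 
illustrative, not the patch's exact code):

{code:java}
import java.nio.ByteBuffer;
import org.apache.cassandra.io.util.FileUtils;

// Illustrative only: drop the reference first, then free the direct buffer
// eagerly instead of waiting for GC to collect a possibly-promoted DBB.
class CompressedSegmentSketch
{
    private volatile ByteBuffer buffer;

    void close()
    {
        ByteBuffer buf = buffer;
        buffer = null; // nothing can reuse the buffer after this point
        if (buf != null && buf.isDirect())
            FileUtils.clean(buf); // wraps the DirectBuffer cleaner invocation
    }
}
{code}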

bq. Also make {{MAX_BUFFERPOOL_SIZE}} configurable via a property. I have been 
prefixing internal C* properties with "cassandra.". I suspect that at several 
hundred megabytes a second we will have more than three 32-megabyte buffers in 
flight. I have a personal fear of shipping constants that aren't quite right, 
and putting them all in properties can save waiting for code changes.

Good catch: the number of segments needed can depend on the sync period, so 
this does need to be exposed. Done, as an unpublished option in 
{{cassandra.yaml}} for now.
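
For reference, the system-property pattern requested above would look like the 
following; the property name here is made up for illustration, and the patch 
exposes the setting through {{cassandra.yaml}} instead:

{code:java}
// Illustrative only: ship a default, but allow a "cassandra."-prefixed system
// property to override it without a code change.
static final int MAX_BUFFERPOOL_SIZE =
    Integer.getInteger("cassandra.commitlog_max_compression_buffers", 3);
{code}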

bq. I tested on Linux. If I drop the page cache on the new code it doesn't 
generate reads. I tested the old code and it generated a few hundred megabytes 
of reads.

Thank you.


> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: docs-impacting, performance
>             Fix For: 3.0
>
>         Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. 
> Doing so should improve throughput, but some care will need to be taken to 
> ensure we use as much of a segment as possible. I propose decoupling the 
> writing of the records from the segments. Basically write into a (queue of) 
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
> MB written to the CL (where X is ordinarily CLS size), and then pack as many 
> of the compressed chunks into a CLS as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
