[ 
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920728#action_12920728
 ] 

Peter Schuller commented on CASSANDRA-1470:
-------------------------------------------

On the commit log direct I/O: the buffering is currently limited to 128k, which 
I would expect to negate, to some extent, the benefit of periodic sync mode. 
People who use periodic sync in a write-heavy environment probably don't want 
a direct I/O write (effectively an fsync() in terms of performance 
characteristics) every 128k.

It would also be detrimental for row mutations larger than, say, 50k, since 
the probability of hitting disk more than once for a single row mutation 
commit would be high.
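
To put a rough number on that, here is a back-of-the-envelope sketch. It 
assumes (my assumption, not something measured) that a mutation starts at a 
uniformly random offset within the 128k buffer, and that "hitting disk more 
than once" means the mutation crosses a buffer boundary and so forces an extra 
direct I/O write. The function name is hypothetical.

```python
def straddle_probability(mutation_bytes, buffer_bytes):
    """Chance that a mutation crosses a buffer boundary.

    Assumes the mutation's start offset within the buffer is uniformly
    distributed, which is a simplification of real commit log traffic.
    """
    if mutation_bytes >= buffer_bytes:
        # A mutation at least as large as the buffer always spills over.
        return 1.0
    return mutation_bytes / buffer_bytes


if __name__ == "__main__":
    buffer_bytes = 128 * 1024  # the current buffer limit mentioned above
    for size_kb in (1, 10, 50, 100, 128):
        p = straddle_probability(size_kb * 1024, buffer_bytes)
        print(f"{size_kb:>4}k mutation: {p:.0%} chance of an extra write")
```

Under that model a 50k mutation spills over a 128k buffer roughly 39% of the 
time, and the fraction grows linearly with mutation size.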

If my understanding is correct, some possible suggestions:

(1) Increase the commit log buffer size significantly to mitigate the problem, 
possibly to the point where an entire commit log segment is kept in RAM (this 
also hurts latency once you *do* sync).

(2) Only enable direct I/O when in batched mode (not very useful).

(3) Actually prefer posix_fadvise() in this case (contrary to the 
sstable/compaction case).

Other than the extra effort, (3) is probably the cleanest (by my initial 
feeling) in this particular case, since the way the commit log is used maps 
directly onto what DONT_NEED (POSIX_FADV_DONTNEED) actually does, with none of 
the drawbacks discussed earlier for the compaction case.
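
As a minimal sketch of what (3) could look like, the Python snippet below 
appends to a file through the normal buffered path, syncs it, and then advises 
the kernel that the cached pages are no longer needed. It is only an 
illustration of the system call pattern, not Cassandra's code: the function 
name and file layout are hypothetical, and a JVM implementation would need a 
native binding to reach posix_fadvise().

```python
import os


def append_and_drop_cache(path, payload):
    """Append payload to a commit-log-like file, then drop its cached pages.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(payload)
        written = 0
        while written < len(view):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, view[written:])
        # Dirty pages are generally not dropped by DONTNEED, so flush first.
        os.fsync(fd)
        # posix_fadvise is not available on every platform; skip if missing.
        if hasattr(os, "posix_fadvise"):
            # offset=0, len=0 covers the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return written
    finally:
        os.close(fd)
```

Writes still go through the page cache here, so periodic sync keeps its 
batching behaviour; only the already-synced pages are released, which is the 
property that makes this approach a natural fit for the commit log.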



> use direct io for compaction
> ----------------------------
>
>                 Key: CASSANDRA-1470
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 0.6.7, 0.7.1
>
>         Attachments: 1470-v2.txt, 1470.txt, 
> use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch
>
>
> When compaction scans through a group of sstables, it forces out the data in 
> the OS buffer cache that is being used for hot reads, which can have a 
> dramatic negative effect on performance.

