[ 
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934639#action_12934639
 ] 

Peter Schuller commented on CASSANDRA-1470:
-------------------------------------------

Note though that decreasing swappiness doesn't really address the main concern 
with doing direct I/O, namely that compaction avoids evicting significant 
amounts of relevant data from buffer cache. As far as I can tell, lowering the 
pressure on the buffer cache will probably decrease the tendency for the OS to 
swap (if swap is enabled), but that seems more like a side-effect than a 
primary goal for this ticket?

Now, it's true that once compaction finishes you will have a hotness issue (see 
CASSANDRA-1658), but in the meantime the compaction process itself is 
continuously evicting data from page cache, potentially for an extended period 
of time, affecting live traffic for the duration.
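
To make the eviction effect concrete, here is a toy sketch in Python. It 
assumes a strict LRU cache, which is a simplification and not how any real 
kernel page cache is implemented; the class name, function names and sizes 
are all made up for illustration. It warms a cache with a "hot" set, runs one 
long sequential scan through it, and measures the hot-set hit rate before and 
after.

```python
from collections import OrderedDict


class LRUCache:
    """Toy strict-LRU page cache. Real kernels use more elaborate policies."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, page):
        """Touch a page; return True on a cache hit, False on a miss."""
        hit = page in self.pages
        if hit:
            self.pages.move_to_end(page)  # mark as most recently used
        else:
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
        return hit


def hot_hit_rate(cache, hot_pages):
    """Fraction of hot pages found in the cache on one pass over them."""
    hits = sum(cache.access(p) for p in hot_pages)
    return hits / len(hot_pages)


def simulate(cache_size=100, hot_size=80, scan_size=1000):
    """Return the hot-set hit rate (before, after) a sequential scan."""
    cache = LRUCache(cache_size)
    hot = [("hot", i) for i in range(hot_size)]
    for page in hot:  # warm the cache with the live read set
        cache.access(page)
    before = hot_hit_rate(cache, hot)
    for i in range(scan_size):  # the "compaction" scan: each page read once
        cache.access(("scan", i))
    after = hot_hit_rate(cache, hot)
    return before, after
```

With these defaults the scan is ten times the cache size, so every hot page 
is pushed out and the first pass over the hot set afterwards misses 
completely. A scan that fits in the spare capacity (scan_size=20 here) leaves 
the hot set untouched, which is why the effect depends so heavily on data set 
size relative to memory.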

In particular though, the intended effects aren't really something you can 
benchmark except under specific circumstances (such as a given total data set, 
a given locality of the live read traffic and a given memory size, etc). The 
potential negative effects of compaction as it happens now (without direct I/O 
or fadvise) may be negligible for certain workloads, but can certainly be a 
killer for others. In general, the bigger your data set and active set, the more 
you would expect to be affected by compaction in terms of disk iops. Unless the 
operating system implements a buffer cache eviction policy that specifically 
avoids large sequential scans evicting frequently accessed data, direct I/O, 
fadvise or some equivalent trick is definitely going to be needed for workloads 
where this matters.
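
As a rough sketch of the fadvise variant (not the approach taken in the 
attached patches, just an illustration), the following Python reads a file 
sequentially and asks the kernel to drop each chunk from the page cache once 
it has been consumed. os.posix_fadvise is the standard library wrapper around 
posix_fadvise(2) and is only present on some Unix platforms, so the sketch 
falls back to a plain read where it is missing. The function names and the 
1 MiB chunk size are arbitrary assumptions.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def _advise(fd, offset, length, advice_name):
    """Issue a posix_fadvise hint if the platform supports it, else do nothing."""
    if hasattr(os, "posix_fadvise"):
        os.posix_fadvise(fd, offset, length, getattr(os, advice_name))


def scan_without_caching(path, chunk=CHUNK):
    """Read a file front to back, hinting that consumed pages can be dropped.

    Returns the number of bytes read. The hints are advisory only; the
    kernel is free to ignore them.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # A length of 0 means "to the end of the file".
        _advise(fd, 0, 0, "POSIX_FADV_SEQUENTIAL")
        offset = 0
        while True:
            data = os.read(fd, chunk)
            if not data:
                break
            # ... a real scan would process `data` here ...
            _advise(fd, offset, len(data), "POSIX_FADV_DONTNEED")
            offset += len(data)
        return offset
    finally:
        os.close(fd)
```

One caveat with this approach: POSIX_FADV_DONTNEED applies to the file range 
regardless of who cached it, so pages of that file that live reads had 
already made hot may be dropped as well.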

I took the benchmark to be a test as to whether or not the changes actually 
worked (rather than measuring a performance improvement), and for reasons 
already talked about (writes and commitlog) the effect is actually unseen - 
but that's expected as long as you're still thrashing the buffer cache.

> use direct io for compaction
> ----------------------------
>
>                 Key: CASSANDRA-1470
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 0.7.1
>
>         Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch, 
> CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch, 
> CASSANDRA-1470-v2.patch, 
> CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch, 
> CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch, 
> CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch, 
> CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch, 
> CASSANDRA-1470.patch, 
> use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch
>
>
> When compaction scans through a group of sstables, it forces out the data in 
> the OS buffer cache being used for hot reads, which can have a dramatic 
> negative effect on performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.