[
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934639#action_12934639
]
Peter Schuller commented on CASSANDRA-1470:
-------------------------------------------
Note though that decreasing swappiness doesn't really address the main
motivation for doing direct I/O, namely that compaction should avoid evicting
significant amounts of relevant data from the buffer cache. As far as I can
tell, lowering the pressure on the buffer cache will probably decrease the
tendency for the OS to swap (if swap is enabled), but that seems more like a
side-effect than a primary goal for this ticket?
Now, it's true that once compaction finishes you will have a hotness issue (see
CASSANDRA-1658), but in the meantime the compaction process itself is
continuously evicting data from page cache, potentially for an extended period
of time, affecting live traffic for the duration.
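To make the eviction effect concrete, here is a toy model of a strict LRU
cache. It is only a sketch under simplifying assumptions: real page caches are
not pure LRU, live reads would keep re-touching hot pages during the scan, and
the function name and parameters are made up for illustration.

```python
from collections import OrderedDict


def surviving_hot_pages(cache_pages, hot_pages, scan_pages, bypass_cache):
    """Return how many hot pages are still cached after a sequential scan.

    Hypothetical toy model: a strict LRU cache holding `cache_pages` pages.
    `bypass_cache=True` stands in for direct I/O (the scan never enters the
    cache); `bypass_cache=False` stands in for ordinary buffered reads.
    """
    cache = OrderedDict()

    def touch(page):
        if page in cache:
            cache.move_to_end(page)        # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page

    hot = [("hot", i) for i in range(hot_pages)]
    for page in hot:
        touch(page)                        # warm the cache with the hot set

    if not bypass_cache:
        for i in range(scan_pages):
            touch(("scan", i))             # the compaction scan passes through the cache

    return sum(1 for page in hot if page in cache)
```

In this model, a scan larger than the cache pushes out the entire hot set
unless it bypasses the cache, which is the situation described above.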
In particular though, the intended effects aren't really something you can
benchmark except under specific circumstances (such as a given total data set,
a given locality of the live read traffic and a given memory size, etc). The
potential negative effects of compaction as it happens now (without direct I/O
or fadvise) may be negligible for certain workloads, but can certainly be a
killer for others. In general, the bigger your data set and active set, the more
you would expect to be affected by compaction in terms of disk iops. Unless the
operating system implements a buffer cache eviction policy that specifically
prevents large sequential scans from evicting frequently accessed data, direct
I/O, fadvise or some equivalent trick is definitely going to be needed for
workloads where this matters.
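As a rough illustration of the fadvise variant (not the approach taken in the
attached patches), the following sketch reads a file sequentially and tells
the kernel after each chunk that those pages are no longer needed. The
function name and chunk size are hypothetical; os.posix_fadvise is only
available on Unix platforms, so the sketch falls back to a plain read
elsewhere. Note that POSIX_FADV_DONTNEED is only advice, and it can also drop
pages of that range that were cached by other readers.

```python
import os


def scan_without_caching(path, chunk_size=1 << 20):
    """Read `path` sequentially and advise the kernel to drop each chunk.

    Hypothetical helper for illustration; returns the number of bytes read.
    """
    can_advise = hasattr(os, "posix_fadvise")
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            if can_advise:
                # Advise that the range just read will not be needed again,
                # so the scan does not accumulate in the page cache.
                os.posix_fadvise(fd, total, len(chunk), os.POSIX_FADV_DONTNEED)
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

Whether this actually protects the hot set depends on the kernel and the
workload, which is the same measurement problem described above.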
I took the benchmark to be a test as to whether or not the changes actually
worked (rather than measuring a performance improvement), and for reasons
already talked about (writes and commitlog) the effect is actually unseen -
but that's expected as long as you're still thrashing the buffer cache.
> use direct io for compaction
> ----------------------------
>
> Key: CASSANDRA-1470
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Pavel Yaskevich
> Fix For: 0.7.1
>
> Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch,
> CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch,
> CASSANDRA-1470-v2.patch,
> CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch,
> CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch,
> CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch,
> CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch,
> CASSANDRA-1470.patch,
> use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch
>
>
> When compaction scans through a group of sstables, it forces out the data in
> the os buffer cache being used for hot reads, which can have a dramatic
> negative effect on performance.