[
https://issues.apache.org/jira/browse/CASSANDRA-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609683#comment-14609683
]
Benedict commented on CASSANDRA-7404:
-------------------------------------
I'm not _totally_ convinced this is worth the time investment, especially with
the negative results. With early opening of compaction results, we already
bound the utilised page cache (which could be bounded even lower than it is
already, if we wanted), and by going through the page cache we utilise any data
that is already present (i.e. "hot" files incur no disk seeks). So if we do
this, we should at least limit it to cold files.
This would be much more useful once we have an in-process page cache, where
these kinds of decisions can be made more effectively.
> Use direct i/o for sequential operations (compaction/streaming)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-7404
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7404
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jason Brown
> Assignee: Ariel Weisberg
> Labels: performance
> Fix For: 3.x
>
>
> Investigate using Linux's direct i/o for operations where we read
> sequentially through a file (repair and bootstrap streaming, compaction
> reads, and so on). Direct i/o does not go through the kernel page cache, so it
> should leave the hot cache pages used for live reads unaffected.
> Note: by using direct i/o, we will probably take a performance hit on reading
> the file we're sequentially scanning through (that is, compactions may get
> slower), but the goal of this ticket is to limit the impact of these
> background tasks on the main read/write functionality. Of course, I'll
> measure any perf hit that is incurred, and see if there are any mechanisms to
> mitigate it.
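
For illustration only, here is a minimal sketch of what a direct i/o
sequential scan looks like on Linux. It is written in Python for brevity, not
in Cassandra's Java, and it is not taken from the ticket or any patch: the
names `scan_sequentially` and `CHUNK`, the 1 MiB read size, and the fallback
to buffered reads are all assumptions made for the example.

```python
import mmap
import os

# Hypothetical read size: 1 MiB, a multiple of the common 512 B and 4 KiB
# block sizes, since O_DIRECT needs block-aligned lengths and offsets.
CHUNK = 1 << 20


def scan_sequentially(path):
    """Read `path` front to back, bypassing the page cache where possible.

    Returns (bytes_read, used_direct_io).
    """
    direct_flag = getattr(os, "O_DIRECT", 0)  # 0 on platforms without O_DIRECT
    used_direct = bool(direct_flag)
    try:
        fd = os.open(path, os.O_RDONLY | direct_flag)
    except OSError:
        # Some filesystems reject O_DIRECT at open time; fall back to
        # ordinary buffered reads (an assumption of this sketch).
        fd = os.open(path, os.O_RDONLY)
        used_direct = False

    # An anonymous mapping is page-aligned, which satisfies the buffer
    # alignment that direct i/o requires.
    buf = mmap.mmap(-1, CHUNK)
    total = 0
    try:
        size = os.fstat(fd).st_size
        while total < size:
            n = os.readv(fd, [buf])  # fills buf; a real scan would consume buf[:n]
            if n == 0:
                break
            total += n
    finally:
        os.close(fd)
        buf.close()
    return total, used_direct
```

Every read here goes to disk even when the file's pages are already cached,
so a "hot" file gains nothing from the cache on this path.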
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)