[
https://issues.apache.org/jira/browse/CASSANDRA-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608574#comment-14608574
]
Ariel Weisberg commented on CASSANDRA-7404:
-------------------------------------------
I don't remember how the buffer sizing worked, but it's not as simple as just
auto-sizing using an existing policy. With direct IO you have to manage your
own read ahead for spinning disk because the kernel isn't going to do it for
you in the page cache. You also have to avoid running out of memory in the
worst-case scenario where too many tables are being merged together.
That is how we ended up with the hybrid where a manageable number of files are
opened with a big buffer to allow large reads, and if too many files are open
we stop doing direct IO and let the kernel manage read ahead and memory via the
page cache.
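
A minimal sketch of that hybrid policy, in Java. This is not the code from any
patch on this ticket: the class name DirectReadBudget, its methods, and the
sizes used are all hypothetical, chosen only to illustrate the idea. It
assumes a fixed global cap on memory for direct IO buffers and one large
buffer per open file; once the cap would be exceeded, new files fall back to
buffered IO through the page cache.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: caps the total memory handed out as direct IO read buffers. */
final class DirectReadBudget {
    enum Mode { DIRECT, BUFFERED }

    private final long maxBytes;    // assumed global cap for direct IO buffers
    private final int bufferBytes;  // assumed size of one large read-ahead buffer
    private final AtomicLong inUse = new AtomicLong();

    DirectReadBudget(long maxBytes, int bufferBytes) {
        this.maxBytes = maxBytes;
        this.bufferBytes = bufferBytes;
    }

    /** Reserves one large buffer if the cap allows; otherwise signals buffered IO. */
    Mode tryOpen() {
        while (true) {
            long current = inUse.get();
            if (current + bufferBytes > maxBytes)
                return Mode.BUFFERED;  // too many files open: let the kernel manage read ahead
            if (inUse.compareAndSet(current, current + bufferBytes))
                return Mode.DIRECT;    // reservation succeeded: caller may allocate a big buffer
        }
    }

    /** Releases the reservation made by a tryOpen() call that returned DIRECT. */
    void close(Mode mode) {
        if (mode == Mode.DIRECT)
            inUse.addAndGet(-bufferBytes);
    }

    long bytesInUse() {
        return inUse.get();
    }
}
```

A caller would invoke tryOpen() once per input file at the start of a merge and
pass the returned mode to close() when the file is done, so worst-case direct
IO buffer memory never exceeds the cap regardless of how many tables are merged.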
tl;dr buffer pooling with direct IO needs to be done carefully to bound
maximum memory usage and seeks on spinning disks. I haven't looked at how
off-heap buffers are managed in the issues you reference, so I don't know what
is/isn't already solved.
> Use direct i/o for sequential operations (compaction/streaming)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-7404
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7404
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jason Brown
> Assignee: Ariel Weisberg
> Labels: performance
> Fix For: 3.x
>
>
> Investigate using Linux's direct i/o for operations where we read
> sequentially through a file (repair and bootstrap streaming, compaction
> reads, and so on). Direct i/o does not go through the kernel page cache, so
> it should leave the hot cache pages used for live reads unaffected.
> Note: by using direct i/o, we will probably take a performance hit on reading
> the file we're sequentially scanning through (that is, compactions may get
> slower), but the goal of this ticket is to limit the impact of these
> background tasks on the main read/write functionality. Of course, I'll
> measure any perf hit that is incurred, and see if there are any mechanisms to
> mitigate it.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)