[ 
https://issues.apache.org/jira/browse/CASSANDRA-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14607905#comment-14607905
 ] 

Stefania commented on CASSANDRA-7404:
-------------------------------------

Well, luckily - or unluckily depending on the point of view - several things 
have changed since this patch was written. So first of all a rebase is 
required. I would wait not only for CASSANDRA-8099 but also for CASSANDRA-8894. 

Here is what to keep in mind for the rebase, which also serves as a first quick 
round of comments:

* We now have a FileChannel wrapper, the ChannelProxy, which is normally owned 
by the SegmentedFile Builder and handed out to the RAR instances that share it. 
It is reference-counted so that it can be shared safely, and its resources 
should be released in the close method. Because of performance limitations in 
the ref global state (CASSANDRA-9379) we cannot keep a reference in the RAR 
itself, so be careful here; this is clearly commented in the code. Basically, 
the owners of RARs are responsible for ensuring that the channel stays open 
while the RARs use it and that it is closed afterwards. I would integrate the 
direct IO functionality here. This class must be thread-safe, which is achieved 
by exposing only the thread-safe FileChannel operations (we cleaned up the 
other operations when we introduced it). We should try to do without locks. 
For more details see CASSANDRA-8893.

* The pooled segmented files are gone and so is the file cache service, which 
simplifies the RAR constructor hierarchy.

* CASSANDRA-8894 will change the buffer size, which will no longer be fixed at 
64k but determined dynamically with a minimum value of 4k; see that ticket for 
details. This cost us a new parameter when creating the segmented files. We 
need to handle different buffer sizes for direct IO in a consistent way, which 
is why I would wait for CASSANDRA-8894 before rebasing. CASSANDRA-8894 will 
also ensure that the RAR reads at page boundaries.

* We now have a pool of page aligned buffers, the BufferPool, see 
CASSANDRA-8897. Therefore we should try and use this pool rather than 
allocating buffers directly via MemoryUtil.asAlignedByteBuffer(), which becomes 
redundant. If required there are planned enhancements in CASSANDRA-9468. The 
RAR uses this pool already.
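
As a rough illustration of the sharing scheme described in the first bullet, 
here is a minimal sketch of a reference-counted wrapper that exposes only 
positional (thread-safe) reads. It is not the actual ChannelProxy code: the 
class name SharedChannel and all of its methods are hypothetical, and it 
ignores the ref global state concerns from CASSANDRA-9379.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a shared, reference-counted channel wrapper.
// This is NOT Cassandra's ChannelProxy; names and behaviour are assumptions.
final class SharedChannel implements AutoCloseable {
    private final FileChannel channel;
    // The owner that opens the file holds the first reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    SharedChannel(Path path) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    // Take an extra reference without locking; fails once the count hit zero.
    SharedChannel share() {
        while (true) {
            int n = refs.get();
            if (n <= 0)
                throw new IllegalStateException("channel already closed");
            if (refs.compareAndSet(n, n + 1))
                return this;
        }
    }

    // Positional read: it does not use or move the channel's own position,
    // so several readers can call it concurrently.
    int read(ByteBuffer dst, long position) throws IOException {
        return channel.read(dst, position);
    }

    long size() throws IOException {
        return channel.size();
    }

    boolean isOpen() {
        return channel.isOpen();
    }

    // Drop one reference; the file is closed when the last one is released.
    @Override
    public void close() throws IOException {
        if (refs.decrementAndGet() == 0)
            channel.close();
    }
}
```

In this sketch the owner creates the wrapper, each reader gets it via share() 
and calls close() when done, and the file is only closed once the owner has 
also called close().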

So, unless I overlooked something, supporting direct IO should involve creating 
a direct IO ChannelProxy and setting the correct buffer size, unless of course 
too many direct IO files are already open.
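
To make "setting the correct buffer size" a bit more concrete, here is a small 
sketch of the alignment arithmetic that direct IO typically needs: the read 
position is rounded down and the end of the read is rounded up to a block 
boundary. The class DirectIoAlignment and its methods are hypothetical, and 
the 4096-byte block size used in the example is an assumption (it matches the 
4k minimum mentioned above, but the real value depends on the device).

```java
// Hypothetical helper for illustration only; not part of Cassandra.
final class DirectIoAlignment {
    private DirectIoAlignment() {}

    // Round value down to a multiple of alignment (a power of two).
    static long alignDown(long value, int alignment) {
        checkPowerOfTwo(alignment);
        return value & -(long) alignment;
    }

    // Round value up to a multiple of alignment (a power of two).
    static long alignUp(long value, int alignment) {
        checkPowerOfTwo(alignment);
        return (value + alignment - 1) & -(long) alignment;
    }

    // Size of the buffer needed to read [position, position + length)
    // when both ends of the read must sit on block boundaries.
    static int alignedReadSize(long position, int length, int alignment) {
        long start = alignDown(position, alignment);
        long end = alignUp(position + length, alignment);
        return Math.toIntExact(end - start);
    }

    private static void checkPowerOfTwo(int alignment) {
        if (alignment <= 0 || Integer.bitCount(alignment) != 1)
            throw new IllegalArgumentException(
                "alignment must be a power of two: " + alignment);
    }
}
```

For example, with 4096-byte blocks a 200-byte read at offset 4000 straddles 
two blocks, so alignedReadSize(4000, 200, 4096) returns 8192 and the read 
would start at offset 0.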

> Use direct i/o for sequential operations (compaction/streaming)
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-7404
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7404
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Ariel Weisberg
>              Labels: performance
>             Fix For: 3.x
>
>
> Investigate using Linux's direct i/o for operations where we read 
> sequentially through a file (repair and bootstrap streaming, compaction 
> reads, and so on). Direct i/o does not go through the kernel page cache, so 
> it should leave the hot cache pages used for live reads unaffected.
> Note: by using direct i/o, we will probably take a performance hit on reading 
> the file we're sequentially scanning through (that is, compactions may get 
> slower), but the goal of this ticket is to limit the impact of these 
> background tasks on the main read/write functionality. Of course, I'll 
> measure any perf hit that is incurred, and see if there are any mechanisms 
> to mitigate it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
