[ 
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944388#comment-13944388
 ] 

Pavel Yaskevich edited comment on CASSANDRA-6746 at 3/23/14 9:31 AM:
---------------------------------------------------------------------

[~enigmacurry] Here is a patch (rebased on the latest cassandra-2.1 branch) 
that should improve the warm-up period (it does on my SSD machine). What it 
does is simple: it sets every RAR to FADV_RANDOM for the whole file. When 
SegmentedFile.getSegment(position) is called on a PoolingSegmentedFile (which 
is enabled by setting 'disk_access_mode: standard' in cassandra.yaml), it 
marks the first buffer, 64KB by default, as a sequential area and does 
FADV_WILLNEED on the first page starting from "position". That works as a 
"kind of" smart read-ahead (if we set aside the idea that we are already 
thrashing by pulling 64KB to read one small row), because 
getSegment(position) for buffered files points to the start of the row. Can 
you please test it on your HDD machines to see if that actually works in an 
environment with higher I/O latencies? Another useful test would be to run 
this code in mixed write/read mode to check how good the kernel's page 
replacement mechanism is :) 
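
To make the fadvise calls described above concrete, here is a minimal sketch in 
Python rather than in the patch's own code. It assumes Linux, where 
os.posix_fadvise is available. The function names (open_random_access, 
hint_first_page, read_buffer) are hypothetical and are not taken from the 
patch, and the step that marks the buffer as a sequential area is left out 
because the comment does not say how the patch does it.

```python
import os

BUFFER_SIZE = 64 * 1024                 # default buffer size quoted in the comment
PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")  # page size reported by the OS


def open_random_access(path):
    """Open a file read-only and advise the kernel that access will be random."""
    fd = os.open(path, os.O_RDONLY)
    # offset=0, len=0 covers the file from its start to its end.
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
    return fd


def hint_first_page(fd, position, page_size=PAGE_SIZE):
    """Ask the kernel to start loading the page at `position` into the page cache."""
    os.posix_fadvise(fd, position, page_size, os.POSIX_FADV_WILLNEED)
    return position, page_size


def read_buffer(fd, position, buffer_size=BUFFER_SIZE):
    """Read one buffer starting at `position` (shorter if the file ends first)."""
    return os.pread(fd, buffer_size, position)
```

A caller would open the file once with open_random_access, then call 
hint_first_page followed by read_buffer for each position it seeks to. The 
advice calls are only hints, so whether they help depends on the kernel and 
the device.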

P.S. Please set the device read-ahead (blockdev --setra ...) back to its 
default value before doing the tests.


was (Author: xedin):
[~enigmacurry] Here is a patch (rebased on the latest cassandra-2.1 branch) 
that should improve the warm-up period (it does on my SSD machine). What it 
does is simple: it sets every RAR to FADV_RANDOM for the whole file. When 
SegmentedFile.getSegment(position) is called on a PoolingSegmentedFile (which 
is enabled by setting 'disk_access_mode: standard' in cassandra.yaml), it 
marks the first buffer, 64KB by default, as a sequential area and does 
FADV_WILLNEED on the first page starting from "position". That works as a 
"kind of" smart read-ahead (if we set aside the idea that we are already 
thrashing by pulling 64KB to read one small row). Can you please test it on 
your HDD machines to see if that actually works in an environment with higher 
I/O latencies? Another useful test would be to run this code in mixed 
write/read mode to check how good the kernel's page replacement mechanism 
is :) 

P.S. Please set the device read-ahead (blockdev --setra ...) back to its 
default value before doing the tests.

> Reads have a slow ramp up in speed
> ----------------------------------
>
>                 Key: CASSANDRA-6746
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Ryan McGuire
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
>
>         Attachments: 2.1_vs_2.0_read.png, 6746-patched.png, 
> 6746.blockdev_setra.full.png, 6746.blockdev_setra.zoomed.png, 6746.txt, 
> buffered-io-tweaks.patch, cassandra-2.0-bdplab-trial-fincore.tar.bz2, 
> cassandra-2.1-bdplab-trial-fincore.tar.bz2
>
>
> On a physical four-node cluster I am doing a big write and then a big read. 
> The read takes a long time to ramp up to respectable speeds.
> !2.1_vs_2.0_read.png!
> [See data 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]



--
This message was sent by Atlassian JIRA
(v6.2#6252)
