[
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944388#comment-13944388
]
Pavel Yaskevich edited comment on CASSANDRA-6746 at 3/23/14 9:31 AM:
---------------------------------------------------------------------
[~enigmacurry] Here is a patch (rebased on the latest cassandra-2.1 branch)
which should shorten the warm-up period (it does on my SSD machine). What it
does is simple: it sets all RAR to FADV_RANDOM (whole file). When
SegmentedFile.getSegment(position) is called on a PoolingSegmentedFile (which
is enabled by setting 'disk_access_mode: standard' in cassandra.yaml), it
marks the first buffer, 64KB by default, as a sequential area and does
FADV_WILLNEED on the first page starting from "position". That works as a
"kind of" smart read-ahead (if we discard the idea that we are already
thrashing by pulling 64KB to read one small row), because getSegment(position)
for buffered files points to the start of the row. Can you please test it on
your HDD machines to see if that actually works in an environment with higher
I/O latencies? Another useful test would be to run this code in mixed
write/read mode to effectively check how good the page replacement mechanism
in the kernel is :)
P.S. Please set device read-ahead (blockdev --setra ...) back to its default
value before doing the tests.
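For anyone who wants to see the hinting pattern outside of Cassandra, here is a rough sketch of the sequence of fadvise calls described above. It is not the code from the patch: the function names (open_random, hint_segment, read_row) are made up for illustration, it assumes Linux with Python's os.posix_fadvise available, and how far the kernel honours the offset/length given with each hint is up to the kernel.

```python
import os
import tempfile

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")  # page size reported by the OS
BUFFER_SIZE = 64 * 1024                 # the 64KB default buffer from the comment


def open_random(path):
    """Open a file read-only and advise random access for the whole file."""
    fd = os.open(path, os.O_RDONLY)
    # offset=0 and length=0 means "from the start to the end of the file"
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
    return fd


def hint_segment(fd, position, buffer_size=BUFFER_SIZE, page_size=PAGE_SIZE):
    """Hypothetical stand-in for the hints issued when a segment is requested."""
    # Mark the first buffer starting at `position` as a sequential area.
    os.posix_fadvise(fd, position, buffer_size, os.POSIX_FADV_SEQUENTIAL)
    # Ask the kernel to start loading the first page at `position` right away.
    os.posix_fadvise(fd, position, page_size, os.POSIX_FADV_WILLNEED)
    return position, buffer_size, page_size


def read_row(path, position, length):
    """Read `length` bytes at `position` after issuing the hints above."""
    fd = open_random(path)
    try:
        hint_segment(fd, position)
        return os.pread(fd, length, position)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Build a small scratch file so the sketch can be run on its own.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * (256 * 1024))
        scratch = f.name
    try:
        print(len(read_row(scratch, 128 * 1024, 100)))
    finally:
        os.unlink(scratch)
```

The hints are advisory only, so the sketch returns the same bytes with or without them; any effect would show up in read latency on a cold page cache, which is what the HDD test is meant to measure.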
was (Author: xedin):
[~enigmacurry] Here is a patch (rebased on the latest cassandra-2.1 branch)
which should shorten the warm-up period (it does on my SSD machine). What it
does is simple: it sets all RAR to FADV_RANDOM (whole file). When
SegmentedFile.getSegment(position) is called on a PoolingSegmentedFile (which
is enabled by setting 'disk_access_mode: standard' in cassandra.yaml), it
marks the first buffer, 64KB by default, as a sequential area and does
FADV_WILLNEED on the first page starting from "position". That works as a
"kind of" smart read-ahead (if we discard the idea that we are already
thrashing by pulling 64KB to read one small row). Can you please test it on
your HDD machines to see if that actually works in an environment with higher
I/O latencies? Another useful test would be to run this code in mixed
write/read mode to effectively check how good the page replacement mechanism
in the kernel is :)
P.S. Please set device read-ahead (blockdev --setra ...) back to its default
value before doing the tests.
> Reads have a slow ramp up in speed
> ----------------------------------
>
> Key: CASSANDRA-6746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Ryan McGuire
> Assignee: Benedict
> Labels: performance
> Fix For: 2.1 beta2
>
> Attachments: 2.1_vs_2.0_read.png, 6746-patched.png,
> 6746.blockdev_setra.full.png, 6746.blockdev_setra.zoomed.png, 6746.txt,
> buffered-io-tweaks.patch, cassandra-2.0-bdplab-trial-fincore.tar.bz2,
> cassandra-2.1-bdplab-trial-fincore.tar.bz2
>
>
> On a physical four-node cluster I am doing a big write and then a big read.
> The read takes a long time to ramp up to respectable speeds.
> !2.1_vs_2.0_read.png!
> [See data
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]