[
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935792#comment-13935792
]
Benedict edited comment on CASSANDRA-6746 at 3/14/14 11:04 PM:
---------------------------------------------------------------
I will do some empirical testing so we have some data to work with. It seems to
me that "trickle" flushing would still be better than this, as we could still
trash the entire file's worth of cache by racing ahead of the disk (assuming
this is permitted), although we could still DONTNEED after trickle sync for
compaction. WILLNEEDing a large file _after flush_ is potentially even worse
behaviour, though: if the DONTNEED has been obeyed (or the pages have fallen out
of cache because they were not read during the flush, which is likely during a
large flush), we're just proactively inducing a period of high-intensity random
seeks for data that would be read in naturally if it were needed, and would not
be read at all otherwise.
That said, it might be easier to just pick an approach (the one you suggest is
certainly better than what we currently do), and then deliver iterative
replacement, as it solves all of the above problems.
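To make the "trickle" idea above concrete, here is a minimal sketch of one way it could look. It is not the attached patch, and it is Python rather than Cassandra's Java. The names trickle_write, sync_every and drop_cache are hypothetical, and the 1 MiB default interval is an arbitrary assumption. The writer syncs every so many bytes and then issues DONTNEED for the range it has just made durable, so it cannot race far ahead of the disk.

```python
import os


def trickle_write(path, chunks, sync_every=1 << 20, drop_cache=True):
    """Write chunks to path, fsyncing roughly every sync_every bytes.

    Hypothetical helper for illustration only. After each sync the
    just-synced byte range is optionally marked POSIX_FADV_DONTNEED.
    Returns the number of fsync calls performed.
    """
    syncs = 0
    written = 0       # total bytes written so far
    synced_upto = 0   # bytes already flushed to disk

    def sync_range(fd):
        nonlocal syncs, synced_upto
        os.fsync(fd)
        # posix_fadvise is not available on every platform, so guard it.
        if drop_cache and hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, synced_upto, written - synced_upto,
                             os.POSIX_FADV_DONTNEED)
        synced_upto = written
        syncs += 1

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while len(view) > 0:
                n = os.write(fd, view)   # may write fewer bytes than asked
                view = view[n:]
                written += n
            if written - synced_upto >= sync_every:
                sync_range(fd)
        if written > synced_upto:
            sync_range(fd)               # flush the final partial interval
    finally:
        os.close(fd)
    return syncs
```

The DONTNEED is only issued after the fsync because the kernel cannot drop dirty pages; advising before the sync would mostly be ignored.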
was (Author: benedict):
I will do some empirical testing so we have some data to work with. It seems to
me that "trickle" flushing would still be better than this, although we could
still DONTNEED after trickle sync for compaction. WILLNEEDing a large file
_after flush_ is potentially even worse behaviour, though: if the DONTNEED has
been obeyed (or the pages have fallen out of cache because they were not read
during the flush, which is likely during a large flush), we're just proactively
inducing a period of high-intensity random seeks for data that would be read in
naturally if it were needed, and would not be read at all otherwise.
That said, it might be easier to just pick an approach (the one you suggest is
certainly better than what we currently do), and then deliver iterative
replacement, as it solves all of the above problems.
> Reads have a slow ramp up in speed
> ----------------------------------
>
> Key: CASSANDRA-6746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Ryan McGuire
> Assignee: Benedict
> Labels: performance
> Fix For: 2.1 beta2
>
> Attachments: 2.1_vs_2.0_read.png, 6746-patched.png, 6746.txt,
> cassandra-2.0-bdplab-trial-fincore.tar.bz2,
> cassandra-2.1-bdplab-trial-fincore.tar.bz2
>
>
> On a physical four-node cluster I am doing a big write and then a big read.
> The read takes a long time to ramp up to respectable speeds.
> !2.1_vs_2.0_read.png!
> [See data
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]