[jira] [Comment Edited] (CASSANDRA-6746) Reads have a slow ramp up in speed

Benedict (JIRA) Fri, 14 Mar 2014 16:06:11 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935792#comment-13935792
 ]


Benedict edited comment on CASSANDRA-6746 at 3/14/14 11:04 PM:
---------------------------------------------------------------

I will do some empirical testing so we have some data to work with. It seems to 
me that "trickle" flushing would still be better than this, as we could still 
trash the entire file's worth of cache by racing ahead of the disk (assuming 
this is permitted), although we could still DONTNEED after trickle sync for 
compaction. WILLNEEDing a large file _after flush_ is potentially even worse 
behaviour, though, as if the DONTNEED has been obeyed (or they've fallen out of 
cache due to not being read during flush - which is probably likely during a 
large flush) we're just proactively inducing a period of high intensity random 
seeks for data that would naturally be read in anyway if they are needed, and 
otherwise would not.

That said, it might be easier to just pick an approach (the one you suggest is 
certainly better than what we currently do), and then deliver iterative 
replacement, as it solves all of the above problems.
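
For illustration only, a minimal sketch of the "trickle" sync plus DONTNEED pattern discussed above. It assumes Linux and Python's os.posix_fadvise; the function name trickle_write and the sync_every threshold are hypothetical and are not taken from Cassandra or from the attached patch. The idea is to sync each small range soon after it is written, then advise the kernel that the synced range is no longer needed, so the writer never races far ahead of the disk.

```python
import os


def trickle_write(path, chunks, sync_every=1 << 20, drop_cache=True):
    """Write chunks to path, syncing roughly every sync_every bytes.

    Hypothetical helper: returns the (offset, length) ranges that were synced.
    """
    synced = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)

    def sync_range(start, end):
        # Force the dirty pages to disk before advising the kernel.
        os.fsync(fd)
        if drop_cache and hasattr(os, "posix_fadvise"):
            try:
                # Advisory only: the kernel may ignore the hint.
                os.posix_fadvise(fd, start, end - start,
                                 os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # some filesystems reject the advice; safe to skip
        synced.append((start, end - start))

    try:
        written = 0    # total bytes written so far
        last_sync = 0  # offset up to which data has been synced
        for chunk in chunks:
            view = memoryview(chunk)
            while len(view) > 0:
                n = os.write(fd, view)  # may write fewer bytes than asked
                view = view[n:]
                written += n
            if written - last_sync >= sync_every:
                sync_range(last_sync, written)
                last_sync = written
        if written > last_sync:
            # Sync whatever is left after the last full interval.
            sync_range(last_sync, written)
    finally:
        os.close(fd)
    return synced
```

A WILLNEED hint (os.POSIX_FADV_WILLNEED) could be issued the same way, but as noted above, applying it to a whole large file after flush may simply trigger a burst of reads for pages that would have been loaded on demand anyway.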


was (Author: benedict):
I will do some empirical testing so we have some data to work with. It seems to 
me that "trickle" flushing would still be better than this, although we could 
still DONTNEED after trickle sync for compaction. WILLNEEDing a large file 
_after flush_ is potentially even worse behaviour, though, as if the DONTNEED 
has been obeyed (or they've fallen out of cache due to not being read during 
flush - which is probably likely during a large flush) we're just proactively 
inducing a period of high intensity random seeks for data that would naturally 
be read in anyway if they are needed, and otherwise would not.

That said, it might be easier to just pick an approach (the one you suggest is 
certainly better than what we currently do), and then deliver iterative 
replacement, as it solves all of the above problems.

> Reads have a slow ramp up in speed
> ----------------------------------
>
>                 Key: CASSANDRA-6746
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Ryan McGuire
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
>
>         Attachments: 2.1_vs_2.0_read.png, 6746-patched.png, 6746.txt, 
> cassandra-2.0-bdplab-trial-fincore.tar.bz2, 
> cassandra-2.1-bdplab-trial-fincore.tar.bz2
>
>
> On a physical four-node cluster I am doing a big write and then a big read. 
> The read takes a long time to ramp up to respectable speeds.
> !2.1_vs_2.0_read.png!
> [See data 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]



