[ 
https://issues.apache.org/jira/browse/CASSANDRA-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361046#comment-15361046
 ] 

Marcus Eriksson edited comment on CASSANDRA-12071 at 7/4/16 9:05 AM:
---------------------------------------------------------------------

{{Flush}} runs on the {{flushExecutor}}, splits the memtable up based on the 
number of data directories on the node, and executes the resulting 
{{FlushRunnables}} on the per-disk flush executors. It then blocks until all 
per-disk runnables have completed so that it can run {{PostFlush}} etc.

Since the {{flushExecutor}} size is 1, we can only ever execute a single flush 
at a time.

A patch that bumps the {{flushExecutor}} size to {{memtable_flush_writers}} is 
[here|https://github.com/krummas/cassandra/commits/marcuse/12071]. In my tests 
this removes the timeout issues.
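
To illustrate the blocking behaviour described above, here is a minimal, self-contained sketch. It is not the Cassandra code: the class {{FlushExecutorSketch}}, the methods {{maxConcurrentFlushes}} and {{writeShard}}, the per-disk pool size of 2 and the 20 ms sleep are all assumptions made up for the example. It models a coordinating executor whose tasks fan work out to per-disk executors and then block, and reports how many coordinating tasks were ever in flight at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class FlushExecutorSketch {

    // Hypothetical stand-in for writing one disk's share of a memtable.
    static void writeShard() {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs `flushes` simulated flushes and returns the highest number of
    // coordinating flush tasks that were in flight at the same time.
    static int maxConcurrentFlushes(int flushExecutorSize, int disks, int flushes) {
        ExecutorService flushExecutor = Executors.newFixedThreadPool(flushExecutorSize);
        List<ExecutorService> perDiskExecutors = new ArrayList<>();
        for (int i = 0; i < disks; i++)
            perDiskExecutors.add(Executors.newFixedThreadPool(2)); // pool size is an assumption

        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxInFlight = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int f = 0; f < flushes; f++) {
                pending.add(flushExecutor.submit(() -> {
                    maxInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                    try {
                        // Split the flush into one runnable per data directory.
                        List<Future<?>> shards = new ArrayList<>();
                        for (ExecutorService disk : perDiskExecutors)
                            shards.add(disk.submit(() -> writeShard()));
                        // Block until every per-disk runnable has completed,
                        // holding a flushExecutor thread the whole time.
                        for (Future<?> shard : shards)
                            shard.get();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                    return null;
                }));
            }
            for (Future<?> flush : pending)
                flush.get();
        } catch (Exception e) {
            throw new IllegalStateException("simulated flush failed", e);
        } finally {
            flushExecutor.shutdown();
            for (ExecutorService disk : perDiskExecutors)
                disk.shutdown();
        }
        return maxInFlight.get();
    }

    public static void main(String[] args) {
        System.out.println("flushExecutor size 1: " + maxConcurrentFlushes(1, 2, 4));
        System.out.println("flushExecutor size 4: " + maxConcurrentFlushes(4, 2, 4));
    }
}
```

With a coordinating pool of size 1 the method always returns 1, because the single thread is parked waiting on the per-disk futures. With a larger pool the result can be anything up to the pool size, depending on scheduling.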

[~aweisberg] could you try it out?

[~blambov] to review since you might have some ideas for further improvements

Tests here:
http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-12071-dtest/
http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-12071-testall/


> Regression in flushing throughput under load after CASSANDRA-6696
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-12071
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12071
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Ariel Weisberg
>            Assignee: Marcus Eriksson
>             Fix For: 3.9
>
>
> The way flushing used to work is that a ColumnFamilyStore could have multiple 
> Memtables flushing at once and multiple ColumnFamilyStores could flush at the 
> same time. The way it works now, there can be only a single flush of any 
> ColumnFamilyStore & Memtable running in the C* process, and the number of 
> threads applied to that flush is bounded by the number of disks in JBOD.
> This works OK most of the time, but occasionally flushing will be a little 
> slower, and ingest will outstrip it and then block on available memory. At 
> this point you see multi-second stalls that cause timeouts.
> This is a problem for reasonable configurations that don't use JBOD but have 
> access to a fast disk that can handle some IO queuing (RAID, SSD).
> You can reproduce on beefy hardware (12 cores, 24 threads, 64 GB of RAM, 
> SSD) if you unthrottle compaction or set it to something like 64 
> megabytes/second and run with 8 compaction threads and stress with the 
> default write workload and a reasonable number of threads. I tested with 96.
> It started happening after about 60 gigabytes of data had been loaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)