[ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000743#comment-14000743
 ] 

Benedict edited comment on CASSANDRA-4718 at 5/17/14 1:01 PM:
--------------------------------------------------------------

But the sep branch was actually faster more often than it was slower? And yes 
it routes intelligently, but to both replicas...?

I've attached three graphs visualising the output from Jason's test runs, which 
I hope express better what I was trying to get across in my previous comment: 
the sep branch is actually faster in the workload that operates over a 
smaller domain (run3), and it is also more often faster for the disk bound 
workloads, although I expect that difference is most likely random 
variation. The evidence is that run2 shows the two branches crossing each 
other at different points, that run1 is universally faster for sep (while both 
perform the same amount of IO per operation), and that, unless there is a bug, 
it should be very difficult for either patch to demonstrate a major difference 
in performance on disk bound workloads: so long as all read workers are 
scheduled, the disk alone should define our throughput. I'm doing my best to 
produce a workload on the hardware I have available to rule out any such 
issue, but my point is that it is very hard to get accurate, consistent 
numbers from which to draw strong conclusions when the difference we're 
measuring is smaller than the measurement noise.


was (Author: benedict):
But like I said, the sep branch was actually faster more often than it was 
slower? And yes it routes intelligently, but to both replicas...?

> More-efficient ExecutorService for improved throughput
> ------------------------------------------------------
>
>                 Key: CASSANDRA-4718
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1.0
>
>         Attachments: 4718-v1.patch, PerThreadQueue.java, 
> austin_diskbound_read.svg, aws.svg, aws_read.svg, 
> backpressure-stress.out.txt, baq vs trunk.png, 
> belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, 
> jason_run1.svg, jason_run2.svg, jason_run3.svg, jason_write.svg, op costs of 
> various queues.ods, stress op rate with various queues.ods, 
> stress_2014May15.txt, stress_2014May16.txt, v1-stress.out
>
>
> Currently all our execution stages dequeue tasks one at a time.  This can 
> result in contention between producers and consumers (although we do our best 
> to minimize this by using LinkedBlockingQueue).
> One approach to mitigating this would be to make consumer threads do more 
> work in "bulk" instead of just one task per dequeue.  (Producer threads tend 
> to be single-task oriented by nature, so I don't see an equivalent 
> opportunity there.)
> BlockingQueue has a drainTo(collection, int) method that would be perfect for 
> this.  However, no ExecutorService in the jdk supports using drainTo, nor 
> could I google one.
> What I would like to do here is create just such a beast and wire it into (at 
> least) the write and read stages.  (Other possible candidates for such an 
> optimization, such as the CommitLog and OutboundTCPConnection, are not 
> ExecutorService-based and will need to be one-offs.)
> AbstractExecutorService may be useful.  The implementations of 
> ICommitLogExecutorService may also be useful. (Despite the name these are not 
> actual ExecutorServices, although they share the most important properties of 
> one.)
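
For illustration only, the bulk-dequeue idea in the description above might be sketched as below. This is not code from any of the attached patches: the class name DrainToSketch, the method runBatch, and the batch size of 64 are all hypothetical. The only JDK calls it relies on are BlockingQueue.take() and BlockingQueue.drainTo(Collection, int). The consumer blocks for one task, then takes whatever else is already queued (up to a limit) in a single call, so it touches the queue once per batch rather than once per task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a consumer that dequeues in bulk via drainTo.
class DrainToSketch {
    // Assumed batch limit; a real value would need to be tuned by measurement.
    static final int MAX_BATCH = 64;

    /**
     * Blocks until at least one task is available, then drains up to
     * maxBatch - 1 further tasks without blocking, runs them all in order,
     * and returns how many tasks were executed.
     */
    static int runBatch(BlockingQueue<Runnable> queue, int maxBatch)
            throws InterruptedException {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        batch.add(queue.take());            // wait for the first task
        queue.drainTo(batch, maxBatch - 1); // grab the rest in one call
        for (Runnable task : batch) {
            task.run();
        }
        return batch.size();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        AtomicInteger executed = new AtomicInteger();
        for (int i = 0; i < 100; i++) {
            queue.add(executed::incrementAndGet);
        }
        int first = runBatch(queue, MAX_BATCH);
        int second = runBatch(queue, MAX_BATCH);
        System.out.println(first + " " + second + " " + executed.get());
    }
}
```

A worker thread in an executor built this way would call runBatch in a loop. Whether batching actually reduces producer/consumer contention for a given stage is exactly the kind of question the stress runs attached to this ticket are meant to answer.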



--
This message was sent by Atlassian JIRA
(v6.2#6252)
