[ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984933#comment-13984933
 ] 

Benedict commented on CASSANDRA-4718:
-------------------------------------

I've uploaded a new version of the patch 
[here|https://github.com/belliottsmith/cassandra/tree/4718-fjp].

I've refactored the DebuggableForkJoinPool a little to support a limited queue 
(so that our native transport queue doesn't get too long), and to support the 
metrics that users may have gotten used to.
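
To make the "limited queue" idea concrete, here is a rough sketch of one way 
to bound the work in flight on a ForkJoinPool and expose a simple count for 
metrics. It is not the DebuggableForkJoinPool from the branch: the class name 
BoundedForkJoinSubmitter, its methods, and the semaphore-based approach are 
all assumptions made purely for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper (not part of the patch): caps queued + running tasks.
final class BoundedForkJoinSubmitter {
    private final ForkJoinPool pool;
    private final Semaphore permits;
    private final int maxInFlight;

    BoundedForkJoinSubmitter(int parallelism, int maxInFlight) {
        this.pool = new ForkJoinPool(parallelism);
        this.permits = new Semaphore(maxInFlight);
        this.maxInFlight = maxInFlight;
    }

    /** Returns false instead of queueing when the limit has been reached. */
    boolean trySubmit(Runnable task) {
        if (!permits.tryAcquire())
            return false;
        try {
            pool.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free the slot once the task finishes
                }
            });
            return true;
        } catch (RejectedExecutionException e) {
            permits.release(); // pool was shut down; give the slot back
            throw e;
        }
    }

    /** Simple metric: tasks currently queued or running. */
    int inFlightTasks() {
        return maxInFlight - permits.availablePermits();
    }

    /** Stops accepting work and waits for already-submitted tasks to finish. */
    boolean shutdownAndAwait(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller such as the native transport could treat a false return from 
trySubmit as a signal to apply back-pressure rather than let the queue grow.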

I've tested the branch only minimally and do see a very modest performance 
benefit for reads on my box, but that's far from conclusive. Any benefit is 
likely to be more visible on machines with more spare cores, as the single 
queue lock of a standard executor could easily become a point of contention.

One slight concern I have with this approach is that, in order to make 
_enqueueing_ tasks less contentious, we will need to either fork ForkJoinPool, 
or see if it is possible to implement an EventLoopGroup backed by a FJP, and 
use the same FJP to manage the connections as we do the execution of our tasks 
(as enqueueing tasks from a FJ-worker is contention-free). Given how FJP is 
intended to be used, it is not optimised for enqueueing tasks from outside the 
pool, and for that it is no more efficient (probably slightly less) than a 
standard executor. That's a future problem, however.
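
For anyone less familiar with FJP, the two enqueueing paths look roughly like 
this. It is only an illustrative sketch using the standard JDK API; the 
WorkerSideFork and RangeSum names are made up and nothing here comes from the 
patch. The invoke() call is an external submission, while each fork() is 
issued by a thread that is already a worker of the pool, which is the path 
described above as contention-free.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: external submission vs. forking from inside a worker.
final class WorkerSideFork {

    /** Sums the integers in [lo, hi) by recursive splitting. */
    static final class RangeSum extends RecursiveTask<Long> {
        private final int lo;
        private final int hi;

        RangeSum(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= 1_000) {
                long sum = 0;
                for (int i = lo; i < hi; i++)
                    sum += i;
                return sum;
            }
            int mid = lo + (hi - lo) / 2;
            RangeSum left = new RangeSum(lo, mid);
            left.fork();                                  // enqueued by a worker thread
            long right = new RangeSum(mid, hi).compute(); // run the other half inline
            return left.join() + right;
        }
    }

    /** Called from a non-pool thread, so this goes through the external path. */
    static long sum(ForkJoinPool pool, int n) {
        return pool.invoke(new RangeSum(0, n));
    }
}
```

If the connection-handling threads were themselves FJ workers, request tasks 
could be handed off with fork() rather than through the external path.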



> More-efficient ExecutorService for improved throughput
> ------------------------------------------------------
>
>                 Key: CASSANDRA-4718
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1.0
>
>         Attachments: 4718-v1.patch, PerThreadQueue.java, baq vs trunk.png, op 
> costs of various queues.ods, stress op rate with various queues.ods, 
> v1-stress.out
>
>
> Currently all our execution stages dequeue tasks one at a time.  This can 
> result in contention between producers and consumers (although we do our best 
> to minimize this by using LinkedBlockingQueue).
> One approach to mitigating this would be to make consumer threads do more 
> work in "bulk" instead of just one task per dequeue.  (Producer threads tend 
> to be single-task oriented by nature, so I don't see an equivalent 
> opportunity there.)
> BlockingQueue has a drainTo(collection, int) method that would be perfect for 
> this.  However, no ExecutorService in the jdk supports using drainTo, nor 
> could I google one.
> What I would like to do here is create just such a beast and wire it into (at 
> least) the write and read stages.  (Other possible candidates for such an 
> optimization, such as the CommitLog and OutboundTCPConnection, are not 
> ExecutorService-based and will need to be one-offs.)
> AbstractExecutorService may be useful.  The implementations of 
> ICommitLogExecutorService may also be useful. (Despite the name these are not 
> actual ExecutorServices, although they share the most important properties of 
> one.)
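
As a reminder of what the drainTo idea in the description amounts to, here is 
a minimal sketch of a consumer loop that blocks for one task and then takes 
whatever else is already queued in a single call. The BatchDrainingWorker 
class and its maxBatch parameter are hypothetical names for illustration; 
this is not code from any of the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Hypothetical consumer: dequeues in bulk instead of one task per dequeue.
final class BatchDrainingWorker implements Runnable {
    private final BlockingQueue<Runnable> queue;
    private final int maxBatch;

    BatchDrainingWorker(BlockingQueue<Runnable> queue, int maxBatch) {
        if (maxBatch < 1)
            throw new IllegalArgumentException("maxBatch must be >= 1");
        this.queue = queue;
        this.maxBatch = maxBatch;
    }

    /** Non-blocking: removes up to maxBatch queued tasks with one drainTo call and runs them. */
    static int runAvailableBatch(BlockingQueue<Runnable> queue, int maxBatch) {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        queue.drainTo(batch, maxBatch);
        for (Runnable task : batch)
            task.run();
        return batch.size();
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                queue.take().run();                     // block until there is work
                runAvailableBatch(queue, maxBatch - 1); // then grab what is already waiting
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();         // exit quietly, preserving the flag
        }
    }
}
```

The point of the batch is that the consumer touches the queue once per group 
of tasks rather than once per task.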



--
This message was sent by Atlassian JIRA
(v6.2#6252)
