[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986626#comment-13986626 ]
Benedict commented on CASSANDRA-4718:
-------------------------------------

For comparison, a graph of Jason's results: https://docs.google.com/spreadsheets/d/1mLxyY9syaAlDb1ALGQ-oF7Qo0tQffbcNgFMVPktde88/edit?usp=sharing

I'd like to do a couple of things here:
# Tweak the Low Signal patch to potentially signal more intelligently rather than just always aggregating the last 5us of requests
# Try increasing the queue length
# Try these tests for a standardized load - the stress functionality we're using is great for giving a good ballpark idea of performance, but it varies the number of ops with each run, so running with a fixed 10M ops per run might be useful (stress could maybe do with an "ops per thread" option, as for the low thread counts this is a lot of work, but for high counts not very much)

The lowsignal patch looks to outperform at certain thresholds, but underperform at others, and I'm hoping 1 and 2 might help us make it better overall. At high thread counts the difference is almost 20% for writes, which is non-trivial.

> More-efficient ExecutorService for improved throughput
> ------------------------------------------------------
>
>                 Key: CASSANDRA-4718
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1.0
>
>         Attachments: 4718-v1.patch, PerThreadQueue.java,
> backpressure-stress.out.txt, baq vs trunk.png, op costs of various
> queues.ods, stress op rate with various queues.ods, v1-stress.out
>
>
> Currently all our execution stages dequeue tasks one at a time. This can
> result in contention between producers and consumers (although we do our best
> to minimize this by using LinkedBlockingQueue).
> One approach to mitigating this would be to make consumer threads do more
> work in "bulk" instead of just one task per dequeue. (Producer threads tend
> to be single-task oriented by nature, so I don't see an equivalent
> opportunity there.)
> BlockingQueue has a drainTo(collection, int) method that would be perfect for
> this. However, no ExecutorService in the jdk supports using drainTo, nor
> could I google one.
> What I would like to do here is create just such a beast and wire it into (at
> least) the write and read stages. (Other possible candidates for such an
> optimization, such as the CommitLog and OutboundTCPConnection, are not
> ExecutorService-based and will need to be one-offs.)
> AbstractExecutorService may be useful. The implementations of
> ICommitLogExecutorService may also be useful. (Despite the name these are not
> actual ExecutorServices, although they share the most important properties of
> one.)

--
This message was sent by Atlassian JIRA
(v6.2#6252)
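
For readers following the ticket, here is a minimal sketch of the idea in the quoted description: an ExecutorService built on AbstractExecutorService whose workers pull tasks in batches with BlockingQueue.drainTo(collection, int). It is not any of the attached patches and not the Low Signal variant discussed above. The class name DrainingExecutor, the maxBatch parameter, the 10ms poll timeout, and the simplified shutdown handling are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: each worker dequeues up to maxBatch tasks per queue visit. */
final class DrainingExecutor extends AbstractExecutorService {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final CountDownLatch terminated;   // counts down once per exited worker
    private final int maxBatch;                // assumed tuning knob, not from the ticket
    private volatile boolean shutdown;

    DrainingExecutor(int threads, int maxBatch) {
        if (threads < 1 || maxBatch < 1) throw new IllegalArgumentException();
        this.maxBatch = maxBatch;
        this.terminated = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(this::workerLoop, "draining-worker-" + i);
            t.setDaemon(true);
            t.start();
        }
    }

    private void workerLoop() {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        try {
            while (true) {
                // Wait for the first task; the timeout lets the worker notice shutdown.
                Runnable first = queue.poll(10, TimeUnit.MILLISECONDS);
                if (first == null) {
                    if (shutdown && queue.isEmpty()) return;
                    continue;
                }
                batch.add(first);
                // Take up to maxBatch - 1 further tasks in one queue operation.
                queue.drainTo(batch, maxBatch - 1);
                for (Runnable task : batch) {
                    try {
                        task.run();
                    } catch (RuntimeException e) {
                        e.printStackTrace(); // sketch: log and keep the worker alive
                    }
                }
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            terminated.countDown();
        }
    }

    @Override public void execute(Runnable task) {
        if (task == null) throw new NullPointerException();
        // Sketch limitation: a producer racing with shutdown() may enqueue a task
        // after the last worker has exited; a real implementation must close this gap.
        if (shutdown) throw new RejectedExecutionException("executor is shut down");
        queue.add(task);
    }

    @Override public void shutdown() { shutdown = true; }

    @Override public List<Runnable> shutdownNow() {
        // Sketch limitation: running workers are not interrupted here.
        shutdown = true;
        List<Runnable> pending = new ArrayList<>();
        queue.drainTo(pending);
        return pending;
    }

    @Override public boolean isShutdown() { return shutdown; }

    @Override public boolean isTerminated() { return terminated.getCount() == 0; }

    @Override public boolean awaitTermination(long timeout, TimeUnit unit)
            throws InterruptedException {
        return terminated.await(timeout, unit);
    }
}
```

The trade-off this makes visible is the one the comment above is tuning: a larger batch means fewer contended queue operations per task, but a worker that has drained a batch holds tasks that idle workers cannot pick up, so batch size and signalling policy both matter.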