[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

Benedict (JIRA) Tue, 20 May 2014 08:09:07 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003406#comment-14003406 ]


Benedict commented on CASSANDRA-4718:
-------------------------------------

One thing worth mentioning: the test does not necessarily give an accurate 
picture of the dataset size over which this is effective. It was run over a 
fully compacted dataset, so the 10M keys would have been randomly distributed 
across all pages (we select from a prefix of the key range, but the murmur 
hashes of those keys are evenly distributed across the entire dataset once it 
is fully compacted). On a real dataset, where the most recent data is 
compacted separately from the older data and receives most of the reads, 
there would be greater locality of data access, so any gains should be 
effective over a larger quantity of data.

> More-efficient ExecutorService for improved throughput
> ------------------------------------------------------
>
>                 Key: CASSANDRA-4718
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1.0
>
>         Attachments: 4718-v1.patch, E100M_summary_key_s.svg, 
> E10M_summary_key_s.svg, E600M_summary_key_s.svg, PerThreadQueue.java, 
> austin_diskbound_read.svg, aws.svg, aws_read.svg, 
> backpressure-stress.out.txt, baq vs trunk.png, 
> belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, 
> jason_run1.svg, jason_run2.svg, jason_run3.svg, jason_write.svg, op costs of 
> various queues.ods, stress op rate with various queues.ods, 
> stress_2014May15.txt, stress_2014May16.txt, v1-stress.out
>
>
> Currently all our execution stages dequeue tasks one at a time.  This can 
> result in contention between producers and consumers (although we do our best 
> to minimize this by using LinkedBlockingQueue).
> One approach to mitigating this would be to make consumer threads do more 
> work in "bulk" instead of just one task per dequeue.  (Producer threads tend 
> to be single-task oriented by nature, so I don't see an equivalent 
> opportunity there.)
> BlockingQueue has a drainTo(collection, int) method that would be perfect for 
> this.  However, no ExecutorService in the jdk supports using drainTo, nor 
> could I google one.
> What I would like to do here is create just such a beast and wire it into (at 
> least) the write and read stages.  (Other possible candidates for such an 
> optimization, such as the CommitLog and OutboundTCPConnection, are not 
> ExecutorService-based and will need to be one-offs.)
> AbstractExecutorService may be useful.  The implementations of 
> ICommitLogExecutorService may also be useful. (Despite the name these are not 
> actual ExecutorServices, although they share the most important properties of 
> one.)
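
The bulk-dequeue idea in the description above can be sketched as follows. This is a minimal illustration only, not the approach taken in any of the attached patches: the class name BatchDrainSketch, the method drainAndRun, and the maxBatch parameter are hypothetical. The only JDK calls assumed are BlockingQueue.take() and BlockingQueue.drainTo(Collection, int).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class BatchDrainSketch {
    /**
     * Hypothetical consumer step: block for one task, then drain up to
     * (maxBatch - 1) further tasks in a single call and run them all.
     * Returns the number of tasks executed in this batch.
     */
    static int drainAndRun(BlockingQueue<Runnable> queue, int maxBatch)
            throws InterruptedException {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        batch.add(queue.take());            // blocks until at least one task exists
        queue.drainTo(batch, maxBatch - 1); // non-blocking: takes whatever else is queued
        for (Runnable task : batch) {
            task.run();
        }
        return batch.size();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        AtomicInteger counter = new AtomicInteger();
        for (int i = 0; i < 10; i++) {
            queue.add(counter::incrementAndGet);
        }
        int first = drainAndRun(queue, 8);  // take() yields 1 task, drainTo up to 7 more
        int second = drainAndRun(queue, 8); // the remaining 2 tasks
        System.out.println(first + " " + second + " " + counter.get());
    }
}
```

The consumer touches the queue twice per batch rather than once per task, which is where any reduction in producer/consumer contention would come from. Whether that helps in practice depends on queue depth and task cost, which this sketch does not measure.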



--
This message was sent by Atlassian JIRA
(v6.2#6252)
