[ 
https://issues.apache.org/jira/browse/CASSANDRA-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614774#comment-14614774
 ] 

Olivier Michallat commented on CASSANDRA-9558:
----------------------------------------------

[~pingtimeout] found the bottleneck in the driver's code. The culprit is this 
line in the flusher code:
{code}
while (null != (flush = queued.poll())) {
{code}
In the driver, producers for this queue are application threads flushing their 
queries; the consumer is the Netty event loop, which executes the flusher code. 
What happens in stress tests is that we have many producers constantly 
enqueuing new messages, so the consumer ends up spinning a lot in this loop, 
which delays messages. This explains why it works better with more connections: 
more connections = more event loops = more queues = less pressure on each queue.

The workaround is to cap the number of messages that can be flushed in one 
go. We're experimenting with this right now; it will go into 2.1.7 and 2.0.11.
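
To make the idea concrete, here is a minimal sketch of a bounded drain loop. It is not the driver's actual patch: the class name, the {{MAX_FLUSHED_PER_RUN}} constant and its value, and the {{written}} list (a stand-in for writing to the channel) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative only: a flusher that drains a bounded number of messages per run.
class BoundedFlusher {
    // Hypothetical cap; the value actually chosen for the driver is not stated here.
    static final int MAX_FLUSHED_PER_RUN = 128;

    // Producers (application threads) enqueue here; the event loop consumes.
    final Queue<String> queued = new ConcurrentLinkedQueue<>();

    // Stand-in for the channel: records what would have been written.
    final List<String> written = new ArrayList<>();

    // Drains at most MAX_FLUSHED_PER_RUN messages.
    // Returns true if messages remain, i.e. the caller should schedule another run.
    boolean runOnce() {
        int flushed = 0;
        String flush;
        // The counter is checked first, so poll() is never called once the cap
        // is reached and no message is dequeued without being written.
        while (flushed < MAX_FLUSHED_PER_RUN && null != (flush = queued.poll())) {
            written.add(flush);
            flushed++;
        }
        return !queued.isEmpty();
    }
}
```

With a cap in place, the consumer leaves the loop after a fixed amount of work even if producers keep enqueuing, and picks up the remainder on a later run. How the remainder gets rescheduled on the event loop is left out of this sketch.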

The problem does not exist in Cassandra itself because it's a server: the 
event loop is both the producer and the consumer.

> Cassandra-stress regression in 2.2
> ----------------------------------
>
>                 Key: CASSANDRA-9558
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9558
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alan Boudreault
>            Assignee: Andy Tolbert
>             Fix For: 2.2.0 rc2
>
>         Attachments: 2.1.log, 2.2.log, CASSANDRA-9558-2.patch, 
> CASSANDRA-9558-ProtocolV2.patch, atolber-CASSANDRA-9558-stress.tgz, 
> atolber-trunk-driver-coalescing-disabled.txt, 
> stress-2.1-java-driver-2.0.9.2.log, stress-2.1-java-driver-2.2+PATCH.log, 
> stress-2.1-java-driver-2.2.log, stress-2.2-java-driver-2.2+PATCH.log, 
> stress-2.2-java-driver-2.2.log
>
>
> We are seeing some regression in performance when using cassandra-stress 2.2. 
> You can see the difference at this url:
> http://riptano.github.io/cassandra_performance/graph_v5/graph.html?stats=stress_regression.json&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=108.57&ymin=0&ymax=168147.1
> The cassandra version of the cluster doesn't seem to have any impact. 
> //cc [~tjake] [~benedict]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)