[ 
https://issues.apache.org/jira/browse/CASSANDRA-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated CASSANDRA-675:
------------------------------

    Attachment: issue675.patchv1

The attached patch fixes 2 problems.

1. TcpReader.read() has to go through a sequence of protocol steps (header, 
content, etc.) to fully read a message. If the socket doesn't have enough bytes 
to produce a full message, a ReadNotCompleteException is thrown. 
TcpReader.read() catches this exception and, once the socket has new bytes to 
read, resumes from the protocol step where it left off. This exception handling 
seems to take at least 1-2ms. The patch replaces the exception with a normal 
return of a special value. 

I still don't quite understand why exception handling in Java is so expensive, 
though. Note that just pre-allocating the ReadNotCompleteException (which I 
thought was where most of the overhead came from) doesn't help. Throwing the 
exception has to be avoided completely.
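
As an illustration of the idea in item 1 (not the code in the attached patch), 
here is a minimal sketch of a resumable reader that reports "not enough bytes 
yet" through a sentinel return value instead of an exception. The class 
IncompleteReadSketch, the NOT_COMPLETE sentinel, and the 4-byte length-prefixed 
message format are all assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; not the actual TcpReader from the patch. */
class IncompleteReadSketch {
    /** Sentinel: "message not complete yet". Compare by identity (==). */
    static final ByteBuffer NOT_COMPLETE = ByteBuffer.allocate(0);

    private enum State { HEADER, CONTENT }

    private State state = State.HEADER;
    private int contentLength;

    /**
     * Tries to extract one message (assumed format: 4-byte length, then
     * payload) from buf, which must be in read mode. Returns the payload,
     * or NOT_COMPLETE if more bytes are needed. The current protocol step
     * is remembered, so the next call resumes where this one stopped.
     */
    ByteBuffer read(ByteBuffer buf) {
        if (state == State.HEADER) {
            if (buf.remaining() < 4) return NOT_COMPLETE;
            contentLength = buf.getInt();
            state = State.CONTENT;
        }
        if (buf.remaining() < contentLength) return NOT_COMPLETE;
        byte[] payload = new byte[contentLength];
        buf.get(payload);
        state = State.HEADER; // ready for the next message
        return ByteBuffer.wrap(payload);
    }

    public static void main(String[] args) {
        byte[] wire = ByteBuffer.allocate(9).putInt(5)
                .put("hello".getBytes(StandardCharsets.US_ASCII)).array();
        IncompleteReadSketch reader = new IncompleteReadSketch();
        ByteBuffer buf = ByteBuffer.allocate(64);
        // Simulate the socket delivering the message in 3-byte chunks.
        for (int off = 0; off < wire.length; off += 3) {
            buf.put(wire, off, Math.min(3, wire.length - off));
            buf.flip();
            ByteBuffer msg = reader.read(buf);
            buf.compact(); // keep unread bytes for the next attempt
            System.out.println(msg == NOT_COMPLETE
                    ? "incomplete"
                    : new String(msg.array(), StandardCharsets.US_ASCII));
        }
    }
}
```

The caller checks the result against NOT_COMPLETE by identity and simply waits 
for the next readable event, so the common "partial message" case never throws 
or unwinds the stack.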

2. Change Selector.select(1) to Selector.select() and wake up the selector 
every time the interest bits of a SelectionKey need to be changed. With 
select(1) and no wakeup, it could take up to 1ms for the new interest bits to 
be registered with the selector.
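
The pattern in item 2 can be sketched roughly as follows. This is an 
illustration only, not the patch itself: the class SelectorWakeupSketch and 
its helper methods are hypothetical, and a java.nio Pipe stands in for the 
socket so the example is self-contained. Only standard java.nio calls are 
used (Selector.select(), Selector.wakeup(), SelectionKey.interestOps(int)).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

/** Hypothetical sketch; not the code from the patch. */
class SelectorWakeupSketch {
    /** Changes the interest set, then wakes a selector blocked in select(). */
    static void setInterest(SelectionKey key, int ops) {
        key.interestOps(ops);
        key.selector().wakeup();
    }

    /** Returns the number of ready keys once the new interest is seen. */
    static int changeInterestWhileSelecting() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.sink().configureBlocking(false);
                // Register with an empty interest set: nothing is selectable yet.
                SelectionKey key = pipe.sink().register(selector, 0);

                Thread changer = new Thread(() -> {
                    try {
                        Thread.sleep(50);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    setInterest(key, SelectionKey.OP_WRITE);
                });
                changer.start();

                int ready = 0;
                while (ready == 0) {
                    // No timeout: returns on readiness or on wakeup().
                    // After a wakeup the next select() sees the new interest.
                    ready = selector.select();
                }
                changer.join();
                return ready;
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ready keys: " + changeInterestWhileSelecting());
    }
}
```

Without the wakeup() call, a thread blocked in select() would not notice the 
changed interest set until some other event made select() return, which is 
why the blocking select() and the wakeup go together.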

Here are some performance results for reading a column with a 4k value: average 
response time in ms for local weak reads, the old quorum reads without this 
patch, and the new quorum reads with this patch. With the patch, quorum reads 
are roughly 3.5X-8X faster.

threads  local_weak_read (ms)  old_quorum_read (ms)  new_quorum_read (ms)
1        0.71974982            9.546683881           2.002089927
2        0.919307311           12.34252153           1.966206096
4        1.018249762           18.62889817           2.243343764
8        1.136501263           25.49487977           3.213168828
16       1.796865109           29.8252928            5.889078686
32       3.60204913            40.44861522           11.65799948

This patch also has a minor change to TCP streaming. Can someone verify TCP 
streaming still works with this patch?

> TcpReader is slow because of using Exception handling in the normal path
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-675
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-675
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.9
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>             Fix For: 0.9
>
>         Attachments: issue675.patchv1
>
>
> TcpReader has the overhead of 1-2ms per message reading. This makes quorum 
> reads too slow.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
