[ https://issues.apache.org/jira/browse/CASSANDRA-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812866#comment-13812866 ]
Norman Maurer commented on CASSANDRA-6235:
------------------------------------------

You are still using 3.x, right?

> Improve native protocol server latency
> --------------------------------------
>
>          Key: CASSANDRA-6235
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-6235
>      Project: Cassandra
>   Issue Type: Improvement
>     Reporter: Sylvain Lebresne
>  Attachments: NPTester.java
>
>
> The tl;dr is that the native protocol server seems to add some non-negligible
> latency to operations compared to the thrift server, and the added latency
> seems to lie within Netty's internals as far as I can tell. I'm not sure what
> to tweak to try to reduce it.
>
> The test I ran is simple: it's {{stress -t 1 -L3}}, the Cassandra stress test
> for insertions with just 1 thread and using CQL-over-thrift (to make things
> more comparable). What I'm interested in is the average latency. Also,
> because I don't care about testing the storage engine or even CQL processing,
> I've disabled the processing of statements: all queries just return an empty
> result set right away (in particular, there's no parsing of the query). The
> resulting branch is at
> https://github.com/pcmanus/cassandra/commits/latency-testing (note that
> there's a trivial patch to have stress show the latency in microseconds).
>
> With that branch (single node), I get ~62μs of average latency with thrift.
> That number is actually fairly stable across runs (not doing any real
> processing helps keep performance consistent here).
>
> For the native protocol, I wanted to eliminate the possibility that the
> DataStax Java driver was the bottleneck, so I wrote a very simple class
> (NPTester.java, attached) that emulates the stress test above but with the
> native protocol. It's not excessively pretty, but it's simple (no
> dependencies, compiles with javac NPTester.java) and it tries to minimize
> the client-side overhead.
> It's just a basic loop that writes query frames (serializing them largely
> manually) and reads the result back, and it measures the latency as close to
> the socket as possible. Unless I've done something really wrong, it should
> have less client-side overhead than what stress has.
>
> With that tester, the average latency I get is ~140μs. This is more than
> twice that of thrift.
>
> To try to understand where that additional latency is spent, I
> "instrumented" the Frame coder/decoder to record latencies (last commit of
> the latency-testing branch above): it records how long it takes to decode,
> execute and re-encode the query. The latency for that is ~35μs (as with the
> other numbers above, this is pretty consistent over runs). Given that my
> ping on localhost is <30μs, this suggests that, compared to thrift, Netty
> spends ~70μs more than the thrift server somewhere while reading and/or
> writing data on the wire. I've tried profiling it with YourKit but I didn't
> see anything obvious, so I'm not sure what the problem is, but it sure would
> be nice to get on par (or at least much closer) with thrift on such a simple
> test.
>
> I'll note that if I run the same tests without disabling actual query
> processing, the tests have a bit more variability, but for thrift I get
> ~220-230μs latency on average while the NPTester gets ~290-300μs. In other
> words, there still seems to be that ~70μs overhead for the native protocol,
> which in that case is still a >30% slowdown. I'll also note that test
> comparisons with more threads (using the Java driver this time) also show
> the native protocol being slightly slower than thrift (~5-10% slower), and
> while there might be inefficiencies in the Java driver, I'm growing more and
> more convinced that at least part of it is due to the latency "issue"
> described above.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
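[Editor's note] NPTester.java itself is not reproduced in this thread, but the measurement style the description refers to — timing each write/read round trip with System.nanoTime() as close to the socket as possible — can be sketched as follows. This is a hedged illustration, not the attached tester: the in-process echo server, the 64-byte dummy payload, and the `measureAverageMicros` helper are stand-ins of mine; the real NPTester serializes native-protocol query frames against a Cassandra node instead.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Minimal sketch of a socket round-trip latency loop in the spirit of
// NPTester.java. A trivial in-process echo server stands in for the
// Cassandra node; the 64-byte buffer stands in for a serialized query frame.
public class LatencyLoopSketch {

    static long measureAverageMicros(int iterations) throws Exception {
        ServerSocket server = new ServerSocket(0); // hypothetical echo endpoint
        Thread echo = new Thread(() -> {
            try (Socket s = server.accept()) {
                DataInputStream in = new DataInputStream(s.getInputStream());
                DataOutputStream out = new DataOutputStream(s.getOutputStream());
                byte[] buf = new byte[64];
                while (true) {          // echo every "frame" straight back
                    in.readFully(buf);
                    out.write(buf);
                    out.flush();
                }
            } catch (Exception ignored) { /* client closed; exit thread */ }
        });
        echo.setDaemon(true);
        echo.start();

        try (Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
            client.setTcpNoDelay(true); // avoid Nagle delays, as a latency tester would
            DataOutputStream out = new DataOutputStream(client.getOutputStream());
            DataInputStream in = new DataInputStream(client.getInputStream());
            byte[] frame = new byte[64]; // stand-in for a serialized query frame

            long totalNanos = 0;
            for (int i = 0; i < iterations; i++) {
                long start = System.nanoTime(); // measure as close to the socket as possible
                out.write(frame);
                out.flush();
                in.readFully(frame);            // read the "result" back
                totalNanos += System.nanoTime() - start;
            }
            return totalNanos / iterations / 1000;
        } finally {
            server.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("avg latency: " + measureAverageMicros(1000) + "μs");
    }
}
```

Because this loops over loopback with no protocol work at all, the number it prints is a floor: any gap between it and the ~140μs figure above is client/server stack overhead, which is exactly what the instrumented Frame coder/decoder numbers try to isolate.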