[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320700#comment-14320700 ]
Benedict commented on CASSANDRA-8789:
-------------------------------------

bq. is 64k a useful threshold?

It's a little arbitrary, but seems a reasonable cutoff.

bq. Will systems regularly hit that threshold as part of operations like repair, streaming

It looks possible that merkle trees might be significantly larger than this, but I'm not certain. In this case the extra connection would simply be mostly idle, which isn't really a problem.

bq. One finicky thing is that we don't actually know the serialized size of messages

Well, we have a few options here:
# We already calculate this during serialization, so we could extract the work and use it twice.
# Whilst building our message we could accumulate its size, since it would be useful elsewhere as well (such as enforcing memory utilisation constraints), and might reduce the cost of serialization.
# We could serialize our messages into a ByteBuffer that we copy wholesale into the second connection's buffer if we overflow, resetting the small message connection's buffer. (This might also obviate the need to calculate the size during serialization, as we could leave room prefixing the message serialization location and annotate its size once serialization completes.)
# We could bound the number of elements we have to visit by tracking the average size of each element, treating the whole message as oversized if, assuming the average applies for the remainder of the message, we would exceed our limit. This would let us abort potentially very early for large messages, but not small ones.
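As a rough illustration of options 3 and 4 above, here is a minimal Java sketch. It is not Cassandra's actual code: the class, the method names ({{serializeWithSizePrefix}}, {{routeBySize}}, {{likelyOversized}}), and treating 64k as the cutoff are assumptions for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Illustrative sketch, not Cassandra's implementation. */
public class SizePrefixedSerializer {
    // The proposed (somewhat arbitrary) 64k cutoff between small and large messages.
    static final int LARGE_MESSAGE_THRESHOLD = 64 * 1024;

    /**
     * Option 3: serialize with a 4-byte hole at the front, then back-fill
     * the hole with the body length once serialization completes, so the
     * size is known without a separate size-calculation pass.
     */
    static byte[] serializeWithSizePrefix(byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(0);        // placeholder for the size prefix
        out.write(payload);     // message body
        byte[] framed = buf.toByteArray();
        int bodySize = framed.length - 4;
        // back-fill the prefix now that the true size is known
        framed[0] = (byte) (bodySize >>> 24);
        framed[1] = (byte) (bodySize >>> 16);
        framed[2] = (byte) (bodySize >>> 8);
        framed[3] = (byte) bodySize;
        return framed;
    }

    /** Route a framed message to the small- or large-message connection. */
    static String routeBySize(byte[] framed) {
        int bodySize = ((framed[0] & 0xFF) << 24) | ((framed[1] & 0xFF) << 16)
                     | ((framed[2] & 0xFF) << 8)  |  (framed[3] & 0xFF);
        return bodySize > LARGE_MESSAGE_THRESHOLD ? "large" : "small";
    }

    /**
     * Option 4: abort early by assuming the running average element size
     * applies to the elements not yet visited.
     */
    static boolean likelyOversized(long bytesSoFar, int elementsRemaining,
                                   double avgElementSize) {
        return bytesSoFar + (long) (elementsRemaining * avgElementSize)
                > LARGE_MESSAGE_THRESHOLD;
    }
}
```

The back-fill trick in {{serializeWithSizePrefix}} is what makes option 3 attractive: the size never has to be computed up front, only observed after the fact.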
> Revisit how OutboundTcpConnection pools two connections for different message
> types
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8789
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 3.0
>
>
> I was looking at this trying to understand what messages flow over which
> connection.
> For reads, the request goes out over the command connection and the response
> comes back over the ack connection.
> For writes, the request goes out over the command connection and the response
> comes back over the command connection.
> Reads get a dedicated socket for responses. Mutation commands and responses
> both travel over the same socket, along with read requests.
> Sockets are used uni-directionally, so there are actually four sockets in play
> and four threads at each node (2 inbound, 2 outbound).
> CASSANDRA-488 doesn't leave a record of what the impact of this change was.
> If someone remembers what situations were made better, it would be good to
> know.
> I am not clear on when/how this is helpful. The consumer side shouldn't be
> blocking, so the only head-of-line blocking issue is the time it takes to
> transfer data over the wire.
> If message size is the cause of blocking issues, then the current design mixes
> small messages and large messages on the same connection, retaining the
> head-of-line blocking.
> Read requests share the same connection as write requests (which are large),
> and write acknowledgments (which are small) share the same connections as
> write requests. The only winner is read acknowledgements.

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)