Sylvain Lebresne created CASSANDRA-10344:
--------------------------------------------
Summary: Optimize ReadResponse
Key: CASSANDRA-10344
URL: https://issues.apache.org/jira/browse/CASSANDRA-10344
Project: Cassandra
Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 3.0.0 rc1
The handling of {{ReadResponse}} has quite a few inefficiencies. The way it
works is based on constraints from early versions of CASSANDRA-8099, but this
doesn't make sense anymore. This is particularly true for local responses, where
we fully serialize the response in memory only to deserialize it a short time later.
But:
# serialization/deserialization takes time, more than necessary in that case
# we serialize into a {{DataOutputBuffer}} with a default initial size, which for
largish responses might require a few somewhat costly resizes.
So, since we're materializing the full result in memory anyway, it should be quite
a lot more efficient to materialize it in a simple list of
{{ImmutableBTreePartition}} in that case.
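
To illustrate the difference for the local case, here is a minimal sketch. It does not use the actual Cassandra classes: {{Partition}} is a hypothetical stand-in for {{ImmutableBTreePartition}}, {{LocalResponseSketch}} and its methods are made-up names, and plain {{java.io}} streams stand in for Cassandra's buffers. The first method does the serialize-then-deserialize round trip, the second just keeps the materialized list.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ImmutableBTreePartition: a key plus one value.
final class Partition {
    final String key;
    final byte[] value;

    Partition(String key, byte[] value) {
        this.key = key;
        this.value = value;
    }
}

final class LocalResponseSketch {
    // Round trip: encode everything to bytes, then decode it again on read.
    static List<Partition> viaRoundTrip(List<Partition> partitions) {
        try {
            // Default initial capacity; grows by copying for larger responses.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(partitions.size());
                for (Partition p : partitions) {
                    out.writeUTF(p.key);
                    out.writeInt(p.value.length);
                    out.write(p.value);
                }
            }
            List<Partition> decoded = new ArrayList<>();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                int count = in.readInt();
                for (int i = 0; i < count; i++) {
                    String key = in.readUTF();
                    byte[] value = new byte[in.readInt()];
                    in.readFully(value);
                    decoded.add(new Partition(key, value));
                }
            }
            return decoded;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Alternative: keep the already materialized partitions, no bytes involved.
    static List<Partition> viaMaterializedList(List<Partition> partitions) {
        return Collections.unmodifiableList(new ArrayList<>(partitions));
    }
}
```

In this sketch the round trip allocates new copies of every key and value, while the materialized list hands back the same partition objects.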
To a lesser extent, the serialization of {{ReadResponse}} objects that go over the wire
is probably not ideal either. Due to current assumptions of
{{MessagingService}}, we need to know the full serialized size of every
response upfront, which means we do have to materialize results in memory in
this case too. Currently, we do so by serializing the full response in
memory first, and then writing that result. Here again, the serialization in
memory might require some resizing/copying, and we're fundamentally copying
things twice (this could be especially costly with largish user values). So
here too I suggest materializing the result in a list of
{{ImmutableBTreePartition}}, computing the serialized size from it and then
serializing it. This would also allow better sizing of our data structures on the
receiving side.
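
As a rough sketch of the "compute the size, then serialize once" idea, assuming a simple length-prefixed layout that is not Cassandra's actual wire format: {{SizedPartition}} is a hypothetical stand-in for {{ImmutableBTreePartition}}, and {{SizedSerializationSketch}} and its methods are made-up names. The size is computed from the materialized list, the output buffer is allocated at exactly that size so it never has to grow, and the receiving side presizes its list from the partition count.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ImmutableBTreePartition: raw key and value bytes.
final class SizedPartition {
    final byte[] key;
    final byte[] value;

    SizedPartition(byte[] key, byte[] value) {
        this.key = key;
        this.value = value;
    }
}

final class SizedSerializationSketch {
    // Size of the assumed layout: count, then (keyLen, key, valueLen, value)*.
    static int serializedSize(List<SizedPartition> partitions) {
        int size = Integer.BYTES; // partition count
        for (SizedPartition p : partitions) {
            size += Integer.BYTES + p.key.length;
            size += Integer.BYTES + p.value.length;
        }
        return size;
    }

    // Write once into a buffer allocated at the exact size, so no resizing.
    static ByteBuffer serialize(List<SizedPartition> partitions) {
        ByteBuffer out = ByteBuffer.allocate(serializedSize(partitions));
        out.putInt(partitions.size());
        for (SizedPartition p : partitions) {
            out.putInt(p.key.length).put(p.key);
            out.putInt(p.value.length).put(p.value);
        }
        out.flip(); // switch the buffer from writing to reading
        return out;
    }

    // Receiving side: the leading count lets us presize the list.
    static List<SizedPartition> deserialize(ByteBuffer in) {
        int count = in.getInt();
        List<SizedPartition> partitions = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            byte[] key = new byte[in.getInt()];
            in.get(key);
            byte[] value = new byte[in.getInt()];
            in.get(value);
            partitions.add(new SizedPartition(key, value));
        }
        return partitions;
    }
}
```

Here the values are copied once, from the partitions into the output buffer, instead of going through an intermediate in-memory serialization first.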