[
https://issues.apache.org/jira/browse/CASSANDRA-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803199#comment-14803199
]
Benedict commented on CASSANDRA-10344:
--------------------------------------
So, given the somewhat equivocal impact on cstar, I'd say at the very least we
should hold off until we have time to analyze further.
Just to pontificate briefly: the status quo has the advantage that, whilst
we're waiting for digests from another server, the heap overhead is kept to a
minimum (and could even be near zero, if we used off heap memory). This should
result in fewer promotions, and fewer old gen GCs. [This
graph|http://cstar.datastax.com/graph?stats=cba8c332-5d1f-11e5-bf84-42010af0688f&metric=gc_max_ms&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=871.97&ymin=0&ymax=605]
shows a bump in maximum GC pause time, which may be attributable to this (or
the construction of the {{ImmutableBTreePartition}}), and this is despite all
queries being CL.ONE. We should probably compare against QUORUM queries before
considering this for inclusion.
I don't have any strong feelings on the patch; I'm just noting my thoughts
for consideration.
> Optimize ReadResponse
> ---------------------
>
> Key: CASSANDRA-10344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10344
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Fix For: 3.0.0 rc1
>
>
> The handling of {{ReadResponse}} has quite a few inefficiencies. The way
> it works is based on constraints from an early version of CASSANDRA-8099, but
> this doesn't make sense anymore. This is particularly true for local responses,
> where we fully serialize the response in memory to deserialize it a short
> time later. But
> # serialization/deserialization takes time, more than necessary in that case
> # we serialize in a {{DataInputBuffer}} with a default initial size, which
> for largish responses might require a few somewhat costly resizes.
> So, since we're materializing the full result in memory anyway, it should
> be quite a lot more efficient to materialize it in a simple list of
> {{ImmutableBTreePartition}} in that case.
> To a lesser extent, the serialization of a {{ReadResponse}} that goes over the
> wire is probably not ideal either. Due to current assumptions of
> {{MessagingService}}, we need to know the full serialized size of every
> response upfront, which means we do have to materialize results in memory in
> this case too. Currently, we do so by serializing the full response in
> memory first, and then writing that result. Here again, the serialization in
> memory might require some resizing/copying, and we're fundamentally copying
> things twice (this could be especially costly with largish user values). So
> here too I suggest materializing the result in a list of
> {{ImmutableBTreePartition}}, computing the serialized size from it and then
> serializing it. This also allows better sizing of our data structures on
> the receiving side.
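
For illustration only, the "materialize, compute the serialized size, then
serialize" approach described above might look roughly like the sketch below.
It is not Cassandra code: {{Partition}}, {{serializedSize}} and the toy
length-prefixed wire format are hypothetical stand-ins for
{{ImmutableBTreePartition}} and the real serializers. The only point is that
the buffer is sized exactly once, so nothing is resized or copied a second time.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class SizedSerializationSketch {
    // Hypothetical stand-in for a materialized partition: one key, one value.
    static final class Partition {
        final byte[] key;
        final byte[] value;

        Partition(String key, String value) {
            this.key = key.getBytes(StandardCharsets.UTF_8);
            this.value = value.getBytes(StandardCharsets.UTF_8);
        }

        // Size in the toy wire format: two 4-byte length prefixes plus payloads.
        int serializedSize() {
            return 4 + key.length + 4 + value.length;
        }

        void serialize(ByteBuffer out) {
            out.putInt(key.length).put(key).putInt(value.length).put(value);
        }
    }

    // First pass computes the total size; second pass writes into a buffer
    // allocated at exactly that size, so it never needs to grow.
    static ByteBuffer serialize(List<Partition> partitions) {
        int size = 4; // 4 bytes for the partition count
        for (Partition p : partitions)
            size += p.serializedSize();

        ByteBuffer out = ByteBuffer.allocate(size);
        out.putInt(partitions.size());
        for (Partition p : partitions)
            p.serialize(out);
        out.flip(); // make the written bytes readable
        return out;
    }
}
```

In this sketch the size is known before any byte is written, which is also what
a messaging layer requiring the full size upfront would need; the trade-off
raised in the comment above (keeping materialized partitions on heap while
waiting for digests) is not modelled here.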