[ 
https://issues.apache.org/jira/browse/CASSANDRA-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803199#comment-14803199
 ] 

Benedict commented on CASSANDRA-10344:
--------------------------------------

So, given the somewhat equivocal impact on cstar, I'd say at the very least we 
should hold off until we have time to analyze further.

Just to pontificate briefly: the status quo has the advantage that, whilst 
we're waiting for digests from another server, the heap overhead is kept to a 
minimum (and could even be near zero, if we used off-heap memory). This should 
result in fewer promotions, and fewer old gen GCs. [This 
graph|http://cstar.datastax.com/graph?stats=cba8c332-5d1f-11e5-bf84-42010af0688f&metric=gc_max_ms&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=871.97&ymin=0&ymax=605]
 shows a bump in maximum GC pause time, which may be attributable to this (or 
the construction of the {{ImmutableBTreePartition}}), and this is despite all 
queries being CL.ONE. We should probably compare against QUORUM queries before 
considering this for inclusion.

I'll note that I don't have any strong feelings on the patch; these are just 
my thoughts for consideration.

> Optimize ReadResponse
> ---------------------
>
>                 Key: CASSANDRA-10344
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10344
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0.0 rc1
>
>
> The handling of {{ReadResponse}} has quite a few inefficiencies. The way 
> it works is based on constraints from early versions of CASSANDRA-8099, but 
> this doesn't make sense anymore. This is particularly true for local responses, 
> where we fully serialize the response in memory only to deserialize it a short 
> time later. But:
> # serialization/deserialization takes time, more than necessary in that case
> # we serialize into a {{DataOutputBuffer}} with a default initial size, which 
> for largish responses might require a few somewhat costly resizes.
> So, since we're materializing the full result in memory anyway, it should 
> be quite a lot more efficient to materialize it in a simple list of 
> {{ImmutableBTreePartition}} in that case.
> To a lesser extent, the serialization of a {{ReadResponse}} that goes over the 
> wire is probably not ideal either. Due to current assumptions of 
> {{MessagingService}}, we need to know the full serialized size of every 
> response upfront, which means we do have to materialize results in memory in 
> this case too. Currently, we do so by serializing the full response in 
> memory first, and then writing that result. Here again, the serialization in 
> memory might require some resizing/copying, and we're fundamentally copying 
> things twice (this could be especially costly with largish user values). So 
> here too I suggest materializing the result in a list of 
> {{ImmutableBTreePartition}}, computing the serialized size from it and then 
> serializing it. This also allows better sizing of our data structures on 
> the receiving side.
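
The "compute the serialized size, then serialize" idea in the description above can be sketched roughly as follows. This is only an illustration under assumptions: the {{Row}} type, the wire layout, and all method names are hypothetical stand-ins built on plain {{java.nio.ByteBuffer}}, not Cassandra's {{ReadResponse}}, {{ImmutableBTreePartition}} or {{MessagingService}} code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class PresizedSerialization {
    /** Hypothetical stand-in for a materialized partition: a key and an opaque value. */
    static final class Row {
        final String key;
        final byte[] value;

        Row(String key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Size of the whole result: a 4-byte row count plus, per row, two length-prefixed fields. */
    static int serializedSize(List<Row> rows) {
        int size = Integer.BYTES;
        for (Row row : rows) {
            int keyLength = row.key.getBytes(StandardCharsets.UTF_8).length;
            size += Integer.BYTES + keyLength + Integer.BYTES + row.value.length;
        }
        return size;
    }

    /** Writes once into an exactly sized buffer, so no resize or second copy is needed. */
    static ByteBuffer serialize(List<Row> rows) {
        ByteBuffer out = ByteBuffer.allocate(serializedSize(rows));
        out.putInt(rows.size());
        for (Row row : rows) {
            byte[] key = row.key.getBytes(StandardCharsets.UTF_8);
            out.putInt(key.length).put(key);
            out.putInt(row.value.length).put(row.value);
        }
        out.flip(); // switch the buffer from writing to reading
        return out;
    }

    /** The receiving side reads the row count first and can size its list up front. */
    static List<Row> deserialize(ByteBuffer in) {
        int count = in.getInt();
        List<Row> rows = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            byte[] key = new byte[in.getInt()];
            in.get(key);
            byte[] value = new byte[in.getInt()];
            in.get(value);
            rows.add(new Row(new String(key, StandardCharsets.UTF_8), value));
        }
        return rows;
    }
}
```

The trade-off raised in the comment still applies to a sketch like this: the materialized list lives on the heap until it is serialized, whereas an already-serialized buffer could be kept compact or off-heap.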



