[ 
https://issues.apache.org/jira/browse/CASSANDRA-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18079263#comment-18079263
 ] 

Dmitry Konstantinov commented on CASSANDRA-21354:
-------------------------------------------------

A new run after the fix looks ok: 
https://pre-ci.cassandra.apache.org/job/cassandra/537/#showFailuresLink

> Avoid serialization and deserialization for coordinator-local single 
> partition data read
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21354
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21354
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Other
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 6.x, 7.x
>
>         Attachments: image-2026-05-07-10-10-16-886.png, may3_max_alloc.html, 
> may6_noser_alloc.html, read_noser_proto_trunk_ci_summary.htm
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when we execute a local read, we fetch data from SSTables and
> Memtables using a merging iterator and write it to a byte buffer. Later, when
> we assemble the CQL response, we deserialize the data again to iterate over
> it as part of the coordinator logic. So when data is read locally we allocate
> rows and cells twice: once during the read from SSTables/Memtables and once
> during deserialization by the coordinator logic. This is a typical scenario,
> because drivers usually send requests to replicas.
> The idea of the optimization: if we do a single partition read of a small
> number of rows, we can keep the data in memory and avoid this double
> allocation of row objects.
> We should limit the amount of data kept in memory this way, to avoid putting
> too much pressure on GC by extending the lifetime of these objects and
> promoting them to the old generation.
> So, a system property can be used to limit the number of rows we keep in
> memory in this scenario, as well as to disable the logic in case of any
> issues.
> We cannot know the number of rows in advance, so the row limiting works as
> follows: we read the first N rows into memory and, if anything is left, we
> serialize the remaining rows to a buffer and then concatenate the iterator
> over the in-memory rows with the iterator over the deserialized ones.
>  !image-2026-05-07-10-10-16-886.png|width=800! 
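
The row-limiting scheme from the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: rows are modelled as plain strings, and the class name BoundedLocalRead, its methods, and the maxInMemoryRows parameter (standing in for the system property mentioned in the description) are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration only; none of these names exist in Cassandra. */
final class BoundedLocalRead
{
    /**
     * Keeps at most maxInMemoryRows rows as live objects. If the source has
     * more rows, the rest is serialized to a buffer and read back lazily.
     */
    static Iterator<String> read(Iterator<String> source, int maxInMemoryRows)
    {
        List<String> head = new ArrayList<>();
        while (head.size() < maxInMemoryRows && source.hasNext())
            head.add(source.next());

        // Small result: everything fits in memory, no serialization at all.
        if (!source.hasNext())
            return head.iterator();

        // Large result: spill the remainder, then chain both iterators.
        ByteBuffer tail = serialize(source);
        return concat(head.iterator(), deserialize(tail));
    }

    /** Writes each remaining row as a 4-byte length followed by UTF-8 bytes. */
    static ByteBuffer serialize(Iterator<String> rows)
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (rows.hasNext())
        {
            byte[] row = rows.next().getBytes(StandardCharsets.UTF_8);
            out.write(ByteBuffer.allocate(4).putInt(row.length).array(), 0, 4);
            out.write(row, 0, row.length);
        }
        return ByteBuffer.wrap(out.toByteArray());
    }

    /** Lazily decodes rows written by serialize(). */
    static Iterator<String> deserialize(ByteBuffer buffer)
    {
        ByteBuffer in = buffer.duplicate();
        return new Iterator<String>()
        {
            @Override
            public boolean hasNext()
            {
                return in.hasRemaining();
            }

            @Override
            public String next()
            {
                if (!in.hasRemaining())
                    throw new NoSuchElementException();
                byte[] row = new byte[in.getInt()];
                in.get(row);
                return new String(row, StandardCharsets.UTF_8);
            }
        };
    }

    /** Iterates over first until it is exhausted, then over second. */
    static <T> Iterator<T> concat(Iterator<T> first, Iterator<T> second)
    {
        return new Iterator<T>()
        {
            @Override
            public boolean hasNext()
            {
                return first.hasNext() || second.hasNext();
            }

            @Override
            public T next()
            {
                return first.hasNext() ? first.next() : second.next();
            }
        };
    }
}
```

With five rows and a limit of 2, the first two rows are served from the in-memory list and the remaining three go through the buffer. A limit of 0 sends every row through the buffer, which is how this sketch models disabling the optimization.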



--
This message was sent by Atlassian Jira
(v8.20.10#820010)