[
https://issues.apache.org/jira/browse/CASSANDRA-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Konstantinov updated CASSANDRA-21354:
--------------------------------------------
Description:
Currently, when we execute a local read, we fetch data from SSTables and
Memtables using a merging iterator and write it to a byte buffer. Later, when
we assemble the CQL response, we deserialize that data again so that the
coordinator logic can iterate over it. As a result, when the data is read
locally we allocate rows and cells twice: once during the read from
SSTables/Memtables and once during deserialization by the coordinator logic.
This is a typical scenario, because drivers usually send requests to replicas.
The idea of the optimization: for a single-partition read of a small number of
rows, we can keep the data in memory and avoid allocating the row objects
twice.
We should limit the amount of data kept in memory this way to avoid putting
too much pressure on GC: these objects live longer and may be promoted to the
old generation.
So, a system property can be used to limit the number of rows we keep in memory
in this scenario, as well as to disable the logic in case of any issues.
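A minimal sketch of how such a system property could be read. The property
name and the default value are hypothetical, chosen only for illustration, and
are not taken from the patch. A value of 0 or less is treated as "disabled":

```java
final class LocalReadRowLimit {
    // Hypothetical property name (assumption, not from the ticket or patch).
    static final String PROPERTY = "cassandra.local_read_in_memory_rows";
    // Illustrative default (assumption, not from the ticket or patch).
    static final int DEFAULT_MAX_ROWS = 16;

    private LocalReadRowLimit() {}

    /** Max rows to keep in memory; negative values are clamped to 0. */
    static int maxInMemoryRows() {
        // Integer.getInteger falls back to the default when the property
        // is missing or cannot be parsed as an integer.
        return Math.max(0, Integer.getInteger(PROPERTY, DEFAULT_MAX_ROWS));
    }

    /** A limit of 0 switches the in-memory path off entirely. */
    static boolean isEnabled() {
        return maxInMemoryRows() > 0;
    }
}
```

With this shape a single setting covers both needs: lowering the value bounds
the number of retained rows, and setting it to 0 disables the logic.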
We cannot know the number of rows in advance, so the row limiting works as
follows: we read the first N rows into memory and, if any rows remain, we
serialize the rest to a buffer and then concatenate the iterator over the
in-memory rows with the iterator over the deserialized ones.
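The row-limiting logic can be sketched as follows. This is a simplified
illustration, not the actual patch: rows are modelled as plain strings, and the
class and method names (HybridRowBuffer, capture) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class HybridRowBuffer {
    private HybridRowBuffer() {}

    /** Keeps the first maxInMemoryRows rows as objects, serializes the rest. */
    static Iterator<String> capture(Iterator<String> source, int maxInMemoryRows) {
        List<String> inMemory = new ArrayList<>();
        while (inMemory.size() < maxInMemoryRows && source.hasNext())
            inMemory.add(source.next());

        // Common case for small reads: everything fits, nothing is serialized.
        if (!source.hasNext())
            return inMemory.iterator();

        // Overflow: write the remaining rows to a byte buffer, length-prefixed.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int overflowCount = 0;
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            while (source.hasNext()) {
                byte[] row = source.next().getBytes(StandardCharsets.UTF_8);
                out.writeInt(row.length);
                out.write(row);
                overflowCount++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return concat(inMemory.iterator(),
                      deserializing(bytes.toByteArray(), overflowCount));
    }

    /** Lazily deserializes rows from the buffer, one per next() call. */
    private static Iterator<String> deserializing(byte[] data, int count) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        return new Iterator<String>() {
            private int remaining = count;

            public boolean hasNext() { return remaining > 0; }

            public String next() {
                if (remaining <= 0) throw new NoSuchElementException();
                try {
                    byte[] row = new byte[in.readInt()];
                    in.readFully(row);
                    remaining--;
                    return new String(row, StandardCharsets.UTF_8);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
    }

    /** Iterates over first, then over second. */
    private static <T> Iterator<T> concat(Iterator<T> first, Iterator<T> second) {
        return new Iterator<T>() {
            public boolean hasNext() { return first.hasNext() || second.hasNext(); }
            public T next() { return first.hasNext() ? first.next() : second.next(); }
        };
    }
}
```

For example, capture(rows, 2) over five rows keeps the first two as objects,
serializes the other three, and still yields all five in the original order.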
!image-2026-05-07-10-10-16-886.png|width=800!
> Avoid serialization and deserialization for coordinator-local single
> partition data read
> ----------------------------------------------------------------------------------------
>
> Key: CASSANDRA-21354
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21354
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Local/Other
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 6.x, 7.x
>
> Attachments: image-2026-05-07-10-10-16-886.png, may3_max_alloc.html,
> may6_noser_alloc.html
>