[jira] [Created] (CASSANDRA-21371) Account tombstones paging for select queries

Dmitry Konstantinov (Jira) Thu, 14 May 2026 02:30:51 -0700

Dmitry Konstantinov created CASSANDRA-21371:
-----------------------------------------------


             Summary: Account tombstones paging for select queries
                 Key: CASSANDRA-21371
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21371
             Project: Apache Cassandra
          Issue Type: New Feature
          Components: Consistency/Coordination, Local/Other
            Reporter: Dmitry Konstantinov


Currently Cassandra counts only live rows when deciding whether a page has 
enough rows to return. As a result, a partition with a lot of tombstones causes 
performance issues: such queries are slow, occupy read threads, and allocate a 
lot of objects, which creates a GC pressure spike.

To avoid such issues we can:
 * count tombstone unfiltered items during read paging in the same way as 
live rows
 * OR introduce a byte-size limit for a page that accounts for live rows as 
well as tombstones
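
A minimal sketch of both options, written as a standalone Python simulation and 
not as Cassandra code. The names Unfiltered, build_page, max_items and 
max_bytes are hypothetical, and the per-item size of 32 bytes is an arbitrary 
assumption used only to illustrate the byte-size variant:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Unfiltered:
    """Hypothetical stand-in for one item read from a partition."""
    key: int
    is_tombstone: bool
    size_bytes: int = 32  # assumed size, only used by the byte-size limit


def build_page(
    items: List[Unfiltered],
    start: int,
    max_items: Optional[int] = None,
    max_bytes: Optional[int] = None,
) -> Tuple[List[Unfiltered], int, bool]:
    """Return (live_rows, next_index, has_more_pages).

    Live rows and tombstones both count toward the limits, so the work done
    for one page is bounded even if the page ends up with no live rows.
    """
    live: List[Unfiltered] = []
    scanned = 0
    used_bytes = 0
    i = start
    while i < len(items):
        if max_items is not None and scanned >= max_items:
            break
        if max_bytes is not None and used_bytes >= max_bytes:
            break
        item = items[i]
        scanned += 1
        used_bytes += item.size_bytes
        if not item.is_tombstone:
            live.append(item)  # only live rows are returned to the client
        i += 1
    return live, i, i < len(items)
```

With 10 tombstones followed by 3 live rows and max_items=5, the first call 
returns an empty page with has_more_pages set, which is exactly the case a 
client has to tolerate.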

This means a page can contain fewer rows than requested, or even 0 rows, 
without meaning that no data is left in the DB; the app still needs to keep 
fetching.
So, if we have for example 60k row tombstones, we will read 5k of them and 
return a response to the client. This particular request can be slower from the 
client's point of view because several pages have to be fetched, but it keeps 
the DB healthy by not occupying read threads for an unpredictable amount of 
time.
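
The arithmetic behind this example, as a small sketch. It assumes a limit of 
5k counted items per page (live rows plus tombstones); pages_needed is a 
hypothetical helper name:

```python
import math


def pages_needed(tombstones: int, live_rows: int, page_limit: int) -> int:
    """Round trips needed when tombstones count toward the page limit."""
    return math.ceil((tombstones + live_rows) / page_limit)


# 60k row tombstones, no live rows, 5k items per page: 12 round trips,
# each one doing a bounded amount of work on a read thread.
print(pages_needed(60_000, 0, 5_000))
```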

The CQL protocol already has an explicit flag to show whether there is more 
data to return 
([https://cassandra.apache.org/doc/latest/cassandra/reference/native-protocol.html]), 
so no changes at the CQL protocol level are expected:
{code:java}
0x0002    Has_more_pages: indicates whether this is not the last
                      page of results and more should be retrieved. If set, the
                      <paging_state> will be present. The <paging_state> is a
                      [bytes] value that should be used in QUERY/EXECUTE to
                      continue paging and retrieve the remainder of the result for
                      this query (See Section 7 for more details).
{code}
{code:java}
- Clients should not rely on the actual size of the result set returned to
    decide if there are more results to fetch or not. Instead, they should always
    check the Has_more_pages flag (unless they did not enable paging for the query
    obviously). Clients should also not assert that no result will have more than
    <result_page_size> results. While the current implementation always respects
    the exact value of <result_page_size>, we reserve the right to return
    slightly smaller or bigger pages in the future for performance reasons.
{code}
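
A client-side loop that follows this guidance could look roughly like the 
sketch below. It is driver-agnostic: fetch_page is a hypothetical callable 
that takes the previous paging state (or None for the first request) and 
returns the rows of one page plus the next paging state, with None meaning 
Has_more_pages was not set:

```python
from typing import Callable, List, Optional, Tuple

# (rows of one page, next paging state or None when there are no more pages)
Page = Tuple[List[dict], Optional[bytes]]


def fetch_all(fetch_page: Callable[[Optional[bytes]], Page]) -> List[dict]:
    """Collect all rows, stopping only when no paging state is returned."""
    rows: List[dict] = []
    state: Optional[bytes] = None
    while True:
        page_rows, state = fetch_page(state)
        rows.extend(page_rows)  # a page may be empty; that is not the end
        if state is None:
            return rows
```

The loop never looks at the size of a page to decide whether to continue, so 
empty or short pages produced by tombstone accounting do not end it early.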



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
