[
https://issues.apache.org/jira/browse/CASSANDRA-21371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Konstantinov updated CASSANDRA-21371:
--------------------------------------------
Summary: Account tombstones in paging for select queries (was: Account
tombstones paging for select queries)
> Account tombstones in paging for select queries
> -----------------------------------------------
>
> Key: CASSANDRA-21371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21371
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Consistency/Coordination, Local/Other
> Reporter: Dmitry Konstantinov
> Priority: Normal
>
> Currently Cassandra counts only live rows when deciding whether a page has
> enough rows to return. As a result, if a partition has a lot of tombstones,
> we get performance issues: such queries are slow, occupy read threads, and
> allocate many objects, which creates a GC pressure spike.
> To avoid such issues we can:
> * count tombstone unfiltered items during read paging in the same way as
> live rows
> * OR introduce a byte-size limit per page that accounts for live rows as well
> as tombstones
> This means a page can contain fewer rows than requested, or even 0, but that
> does not mean there is no more data in the DB; the app still needs to
> fetch the following pages.
> So, if we have for example 60k row tombstones, we will fetch 5k and return
> an answer to the client. This particular request can be slower from the
> client's point of view because several pages have to be fetched, but it keeps
> the DB healthy by preventing read threads from being occupied for an
> unpredictable amount of time.
> The CQL protocol already has an explicit flag to show whether there is more
> data to return
> ([https://cassandra.apache.org/doc/latest/cassandra/reference/native-protocol.html]),
> so no changes at the CQL protocol level are expected:
> {code:java}
> 0x0002 Has_more_pages: indicates whether this is not the last
> page of results and more should be retrieved. If set, the
> <paging_state> will be present. The <paging_state> is a
> [bytes] value that should be used in QUERY/EXECUTE to
> continue paging and retrieve the remainder of the result for
> this query (See Section 7 for more details).
> {code}
> {code:java}
> - Clients should not rely on the actual size of the result set returned to
> decide if there are more results to fetch or not. Instead, they should always
> check the Has_more_pages flag (unless they did not enable paging for the query
> obviously). Clients should also not assert that no result will have more than
> <result_page_size> results. While the current implementation always respects
> the exact value of <result_page_size>, we reserve the right to return
> slightly smaller or bigger pages in the future for performance reasons.
> {code}
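
The first option in the description (counting tombstones towards the page limit) and the client behaviour the protocol text asks for can be sketched together as a toy model. This is only an illustration under stated assumptions, not Cassandra code: `TombstoneAwarePaging`, `Page`, `readPage` and `fetchAll` are hypothetical names, a partition is modelled as an in-memory list, and a `null` entry stands for a row tombstone.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the proposal; none of these names are Cassandra APIs. */
final class TombstoneAwarePaging {

    /** One page: the live rows returned, plus where to resume (-1 when exhausted). */
    static final class Page {
        final List<String> liveRows;
        final int nextIndex;

        Page(List<String> liveRows, int nextIndex) {
            this.liveRows = liveRows;
            this.nextIndex = nextIndex;
        }

        /** Plays the role of the Has_more_pages flag in this model. */
        boolean hasMorePages() {
            return nextIndex >= 0;
        }
    }

    /**
     * Reads from {@code start}; a null entry stands for a row tombstone.
     * Live rows and tombstones both count towards {@code limit}, so the work
     * done per page is bounded even when the partition is mostly tombstones.
     */
    static Page readPage(List<String> partition, int start, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        List<String> live = new ArrayList<>();
        int scanned = 0;
        int i = start;
        while (i < partition.size() && scanned < limit) {
            String row = partition.get(i++);
            scanned++;
            if (row != null) {
                live.add(row);
            }
        }
        return new Page(live, i < partition.size() ? i : -1);
    }

    /** Client loop: stops on the has-more-pages signal, never on page size. */
    static List<String> fetchAll(List<String> partition, int limit) {
        List<String> result = new ArrayList<>();
        Page page = readPage(partition, 0, limit);
        result.addAll(page.liveRows);
        while (page.hasMorePages()) {
            page = readPage(partition, page.nextIndex, limit);
            result.addAll(page.liveRows);
        }
        return result;
    }
}
```

In this model a partition that starts with many tombstones yields several empty pages before any live row appears, and a client that treated an empty page as the end of the data would miss rows; `fetchAll` avoids that by checking only `hasMorePages()`.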