[
https://issues.apache.org/jira/browse/CASSANDRA-21371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Konstantinov updated CASSANDRA-21371:
--------------------------------------------
Description:
Currently Cassandra counts only live rows when checking whether it has enough
rows to return a page. As a result, if a partition has a lot of tombstones we
get performance issues: such queries are slow, they occupy read threads, and
they allocate a lot of objects, which creates a GC pressure spike.
To avoid such issues we can:
* count tombstone unfiltered items during read paging in the same way as
live rows
* OR introduce a byte-size limit per page which includes live rows as well as
tombstones
This means a page can contain fewer rows than requested, or even 0, but that
does not mean there is no more data in the DB: the app still needs to fetch
the next page.
So, if we have for example 60k row tombstones, we will fetch 5k and return
an answer to the client. This particular request can be slower from the
client's point of view because several pages have to be fetched, but it keeps
the DB healthy by avoiding read threads being occupied for an unpredictable
amount of time.
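The effect of the first option can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Cassandra code: `Unfiltered`, `Page`, and `paginate` are hypothetical names, a partition is modelled as an in-memory list, and `has_more_pages` is derived simply from whether unread items remain.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Unfiltered:
    """Hypothetical stand-in for one item read from a partition."""
    key: int
    is_tombstone: bool


@dataclass
class Page:
    live_rows: List[Unfiltered]  # rows actually returned to the client
    scanned: int                 # live rows + tombstones read to build the page
    has_more_pages: bool         # simplified: True while unread items remain


def paginate(items: List[Unfiltered], page_size: int,
             count_tombstones: bool) -> Iterator[Page]:
    """Yield pages; optionally count tombstones toward the page limit."""
    pos = 0
    while True:
        live: List[Unfiltered] = []
        counted = 0
        scanned = 0
        while pos < len(items) and counted < page_size:
            item = items[pos]
            pos += 1
            scanned += 1
            if item.is_tombstone:
                # Proposed behaviour: a tombstone consumes page budget too.
                if count_tombstones:
                    counted += 1
            else:
                live.append(item)
                counted += 1
        has_more = pos < len(items)
        yield Page(live, scanned, has_more)
        if not has_more:
            return


if __name__ == "__main__":
    # 60k tombstones followed by 10 live rows, page size 5k.
    data = [Unfiltered(i, True) for i in range(60_000)]
    data += [Unfiltered(60_000 + i, False) for i in range(10)]
    for mode in (False, True):
        pages = list(paginate(data, 5_000, count_tombstones=mode))
        worst = max(p.scanned for p in pages)
        print(f"count_tombstones={mode}: pages={len(pages)}, max scanned per page={worst}")
```

In this model, counting only live rows makes a single request scan all 60,010 items, while counting tombstones caps every request at 5,000 scanned items at the cost of 13 round trips, 12 of which return no rows.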
The CQL protocol already has an explicit flag that shows whether there is more
data to return
([https://cassandra.apache.org/doc/latest/cassandra/reference/native-protocol.html]),
so no changes at the CQL protocol level are expected:
{code:java}
0x0002 Has_more_pages: indicates whether this is not the last
page of results and more should be retrieved. If set, the
<paging_state> will be present. The <paging_state> is a
[bytes] value that should be used in QUERY/EXECUTE to
continue paging and retrieve the remainder of the result for
this query (See Section 7 for more details).
{code}
{code:java}
- Clients should not rely on the actual size of the result set returned to
decide if there are more results to fetch or not. Instead, they should always
check the Has_more_pages flag (unless they did not enable paging for the query
obviously). Clients should also not assert that no result will have more than
<result_page_size> results. While the current implementation always respects
the exact value of <result_page_size>, we reserve the right to return
slightly smaller or bigger pages in the future for performance reasons.
{code}
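On the client side, the quoted rules amount to a loop that continues until Has_more_pages is unset, whatever the size of each page. The following is a minimal sketch, not a real driver API: `fetch_all`, `fetch_page`, and `make_fake_server` are hypothetical names, and the paging state is modelled as an opaque page index.

```python
from typing import Callable, List, Optional, Tuple

# (rows, has_more_pages, paging_state) as a hypothetical per-page response.
PageResult = Tuple[List[str], bool, Optional[bytes]]


def fetch_all(fetch_page: Callable[[Optional[bytes]], PageResult]) -> List[str]:
    """Collect all rows, stopping only when has_more_pages is False."""
    rows: List[str] = []
    paging_state: Optional[bytes] = None
    while True:
        page_rows, has_more_pages, paging_state = fetch_page(paging_state)
        rows.extend(page_rows)  # a page may legitimately be empty
        if not has_more_pages:
            return rows


def make_fake_server(pages: List[List[str]]) -> Callable[[Optional[bytes]], PageResult]:
    """Hypothetical in-memory server; the paging state encodes a page index."""
    def fetch_page(paging_state: Optional[bytes]) -> PageResult:
        index = 0 if paging_state is None else int.from_bytes(paging_state, "big")
        has_more = index + 1 < len(pages)
        next_state = (index + 1).to_bytes(4, "big") if has_more else None
        return pages[index], has_more, next_state
    return fetch_page


if __name__ == "__main__":
    # Two empty pages (only tombstones were read), then data.
    server = make_fake_server([[], [], ["a", "b"], [], ["c"]])
    print(fetch_all(server))
```

A client that stopped at the first empty page would return nothing for this sequence, which is why the flag, not the row count, has to drive the loop.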
> Account tombstones in paging for select queries
> -----------------------------------------------
>
> Key: CASSANDRA-21371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21371
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Consistency/Coordination, Local/Other
> Reporter: Dmitry Konstantinov
> Priority: Normal
>