[
https://issues.apache.org/jira/browse/CASSANDRA-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539033#comment-17539033
]
Venkata Harikrishna Nukala commented on CASSANDRA-17222:
--------------------------------------------------------
Sorry for the very long delay. I spent some time on this and got stuck at the
following point.
As per my understanding (please correct me if I am wrong): when the number of
results equals the page size, a response is sent to the client and the server
does not keep any query state. The client then requests the next page, and the
server executes it based on the query state received from the client. If the
server does not maintain any query state, then calculating the *total* number
of internal requests for a GROUP BY or aggregate query is not possible, right
(unless that state is maintained somewhere)? If we update the histogram with
the internal request count for each page, I feel it won't give the picture we
expect, will it?
> Add new metrics to track the number of requests performed by GROUP BY and
> Aggregation queries
> ----------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-17222
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17222
> Project: Cassandra
> Issue Type: Improvement
> Components: Observability/Metrics
> Reporter: Benjamin Lerer
> Assignee: Venkata Harikrishna Nukala
> Priority: Normal
> Labels: AdventCalendar2021, lhf
> Fix For: 4.x
>
>
> When a user performs a GROUP BY query or an aggregate query (e.g. {{SELECT
> count(*) FROM my_table}}), C* internally sends multiple internal
> requests to avoid running out of memory. The page size used for those
> internal queries is the same as the external page size.
> Having some visibility into the number of internal requests happening for a
> GROUP BY or an aggregate query is important, as it might help administrators
> debug performance issues.
> We should add some separate metrics for GROUP BY queries and aggregate queries.
> +Additional information for newcomers:+
> * A new metric class called {{AggregationMetrics}} should be created with a
> {{Histogram}} called {{internalPagesPerGroupByQuerie}} and another called
> {{internalPagesPerAggregateQuerie}} (see {{BatchMetrics}} for an example).
> * High-level query paging is managed by {{AggregationQueryPager}}. The
> number of queries performed should be incremented within {{fetchSubPage}}, and
> the metrics should be updated on close.
> * To test that the numbers are reliable, you need to create a new unit test,
> {{AggregationMetricsTest}}. For an example of how to test GROUP BY
> queries with paging, see {{SelectGroupByTest.testGroupByWithPaging()}}. To
> see how to clear the histograms between tests, look at
> {{BatchMetricsTests.clearHistogram()}}.
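The counting pattern described in the list above (increment inside {{fetchSubPage}}, update the histogram on close) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Cassandra code: {{RecordingHistogram}}, {{AggregationMetricsSketch}} and {{CountingPagerSketch}} are hypothetical stand-ins, the histogram simply stores the values it receives, and only the two histogram field names are taken verbatim from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a metrics histogram: it only stores recorded values.
final class RecordingHistogram {
    private final List<Long> values = new ArrayList<>();

    void update(long value) {
        values.add(value);
    }

    List<Long> values() {
        return List.copyOf(values);
    }
}

// Hypothetical holder for the two histograms named in the issue description.
final class AggregationMetricsSketch {
    final RecordingHistogram internalPagesPerGroupByQuerie = new RecordingHistogram();
    final RecordingHistogram internalPagesPerAggregateQuerie = new RecordingHistogram();
}

// Hypothetical pager: counts internal sub-page fetches, reports the count once on close.
final class CountingPagerSketch implements AutoCloseable {
    private final RecordingHistogram histogram;
    private long subPages;
    private boolean closed;

    CountingPagerSketch(AggregationMetricsSketch metrics, boolean groupBy) {
        // Pick the histogram that matches the kind of query being paged.
        this.histogram = groupBy
                ? metrics.internalPagesPerGroupByQuerie
                : metrics.internalPagesPerAggregateQuerie;
    }

    void fetchSubPage() {
        // A real pager would issue an internal request here; the sketch only counts it.
        subPages++;
    }

    @Override
    public void close() {
        if (closed) {
            return; // record at most once per pager instance
        }
        closed = true;
        histogram.update(subPages);
    }
}

final class AggregationMetricsSketchDemo {
    public static void main(String[] args) {
        AggregationMetricsSketch metrics = new AggregationMetricsSketch();

        // One pager that performs three internal fetches before being closed.
        try (CountingPagerSketch pager = new CountingPagerSketch(metrics, true)) {
            pager.fetchSubPage();
            pager.fetchSubPage();
            pager.fetchSubPage();
        }

        System.out.println(metrics.internalPagesPerGroupByQuerie.values());
    }
}
```

In this sketch the count lives in the pager instance, so whatever is recorded on close covers only the lifetime of that instance. If a new pager is created for every client page, as the comment above suspects, each recorded value would be a per-page count rather than a per-query total.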
--
This message was sent by Atlassian Jira
(v8.20.7#820007)