aho135 commented on code in PR #19083:
URL: https://github.com/apache/druid/pull/19083#discussion_r2893646659

##########
docs/operations/basic-cluster-tuning.md:
##########
@@ -334,6 +334,25 @@ Non-nested GroupBy queries require 1 merge buffer per query, while a nested Grou
 The number of merge buffers determines the number of GroupBy queries that can be processed concurrently.
 
+#### Using metrics to tune GroupBy buffer configuration
+
+Druid can emit metrics that help you right-size merge buffers and related GroupBy configuration. These metrics are available when the `GroupByStatsMonitor` module is enabled by adding `org.apache.druid.server.metrics.GroupByStatsMonitor` to `druid.monitoring.monitors`. See the [metrics reference](metrics.md) for full details.
+
+##### Sizing `druid.processing.buffer.sizeBytes`
+
+- `mergeBuffer/maxBytesUsed`: peak merge buffer bytes used by any single GroupBy query within the emission period. If this value consistently approaches `druid.processing.buffer.sizeBytes`, consider increasing the buffer size.
+- `groupBy/maxSpilledBytes`: peak bytes spilled to disk by any single GroupBy query. Non-zero values indicate that merge buffers are too small to hold intermediate results in memory, causing disk spill. Increasing `druid.processing.buffer.sizeBytes` reduces spilling. You can also adjust `druid.query.groupBy.maxOnDiskStorage` to control how much spilling is allowed before a query fails.
+- `groupBy/spilledQueries`: how often GroupBy queries are spilling to disk. Non-zero values may indicate that your buffer size is too small and should be increased to avoid performance issues caused by excessive spilling.

Review Comment:
```suggestion
- `groupBy/spilledQueries`: number of GroupBy queries spilled to disk within the emission period. Non-zero values may indicate that your buffer size is too small and should be increased to avoid performance issues caused by excessive spilling.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
