maytasm opened a new pull request, #19141:
URL: https://github.com/apache/druid/pull/19141

   Add spill file count limit for GroupBy query
   
   ### Description
   GroupBy queries that group on high-cardinality dimensions can create a large 
number of spill files. This is more likely when queries contain many 
aggregators and/or aggregators with large memory footprints (e.g., DataSketches), 
because GroupBy can hold only a limited number of unique groupings in 
memory before flushing to disk. The exact limit depends on the size of each 
row, which is determined by the size of the aggregators. The problem arises when 
GroupBy merges the spill files: it currently opens all of them 
simultaneously, and each open file requires memory for objects such as 
MappingIterator and SmileParser, which can cause historical nodes to run out 
of memory (OOM).
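
   To illustrate why the number of open files matters, here is a minimal Java 
sketch of a k-way merge over sorted runs. It is not Druid's merge code: 
`KWayMergeSketch` and `Cursor` are hypothetical names, and in-memory lists 
stand in for spill files. The point is only that every run needs a live reader 
before the first merged row can be produced, so per-reader memory scales with 
the number of runs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; not taken from the Druid code base.
final class KWayMergeSketch {
    // One heap entry per open run: the run's current head value plus its reader.
    private static final class Cursor {
        int head;
        final Iterator<Integer> rest;

        Cursor(int head, Iterator<Integer> rest) {
            this.head = head;
            this.rest = rest;
        }
    }

    static List<Integer> merge(List<List<Integer>> sortedRuns) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>(Comparator.comparingInt((Cursor c) -> c.head));

        // Every non-empty run gets a cursor before any output is produced,
        // so the number of simultaneously open readers equals the number of runs.
        for (List<Integer> run : sortedRuns) {
            Iterator<Integer> it = run.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor(it.next(), it));
            }
        }

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();
            out.add(smallest.head);
            // Re-insert the cursor if its run still has rows; otherwise it is released.
            if (smallest.rest.hasNext()) {
                smallest.head = smallest.rest.next();
                heap.add(smallest);
            }
        }
        return out;
    }
}
```

   In a real spill merge each cursor would wrap a file reader with its own 
parser and buffers, which is where the per-file memory cost comes from.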
   
   This PR fixes the issue by introducing a new property, 
`druid.query.groupBy.maxSpillFileCount`, which sets the maximum number of 
spill files allowed per GroupBy query. When the limit is reached, the query 
fails with a ResourceLimitExceededException. This property can be used to 
prevent historical nodes from OOMing due to an excessive number of spill files 
being opened simultaneously during the merge phase. It defaults to 
Integer.MAX_VALUE (unlimited) and can also be set per query via the query 
context key `maxSpillFileCount`.
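
   As a rough sketch of the behavior described above (not the code in this 
PR), the limit can be thought of as a counter that is checked before each 
spill file is created. The class `SpillFileLimitSketch`, its methods, and the 
exception type below are hypothetical names chosen for illustration. The 
sketch assumes that the query context value, when present, takes precedence 
over the configured default, and that the file that would exceed the limit is 
the one that fails.

```java
import java.util.Map;

// Hypothetical illustration; names do not correspond to Druid's API.
final class SpillFileLimitSketch {
    // Stand-in for the ResourceLimitExceededException mentioned in the description.
    static final class SpillLimitExceededException extends RuntimeException {
        SpillLimitExceededException(String message) {
            super(message);
        }
    }

    private final int maxSpillFileCount;
    private int created;

    SpillFileLimitSketch(int maxSpillFileCount) {
        this.maxSpillFileCount = maxSpillFileCount;
    }

    // Called before a new spill file is created; returns the running file count.
    synchronized int reserveSpillFile() {
        if (created >= maxSpillFileCount) {
            throw new SpillLimitExceededException(
                "Spill file limit of " + maxSpillFileCount + " reached");
        }
        return ++created;
    }

    // Assumption: a numeric "maxSpillFileCount" in the query context overrides the default.
    static int resolveLimit(Map<String, Object> queryContext, int configuredDefault) {
        Object value = queryContext.get("maxSpillFileCount");
        return value instanceof Number ? ((Number) value).intValue() : configuredDefault;
    }
}
```

   With the default of Integer.MAX_VALUE the check never trips in practice, 
which matches the "unlimited" behavior described for the property.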
   
   ---
   Release Notes
   - Added a new GroupBy query configuration property, 
`druid.query.groupBy.maxSpillFileCount`, to limit the number of spill files 
created per query. When the limit is exceeded, the query fails with a clear 
error message instead of causing historical nodes to OOM during spill file 
merging. The limit can also be overridden per query via the query context key 
`maxSpillFileCount`.
   
   
   ##### Key changed/added classes in this PR
    * `processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/LimitedTemporaryStorage.java`
    * `processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/SpillingGrouper.java`
   
   
   
   This PR has:
   
   - [x] been self-reviewed.
      - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [x] added documentation for new or modified features or behaviors.
   - [x] a release note entry in the PR description.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [x] added comments explaining the "why" and the intent of the code 
wherever it would not be obvious to an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [x] been tested in a test Druid cluster.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

