[GitHub] [druid] rohangarg commented on pull request #12139: Limit the subquery results by memory usage (estimated)

GitBox Tue, 11 Jan 2022 07:21:34 -0800


rohangarg commented on pull request #12139:
URL: https://github.com/apache/druid/pull/12139#issuecomment-1010068322



   Some thoughts:
   1. For performance, I'd suggest also benchmarking `estimateResultRowSize` 
(the complete function) as a function of (numRows, numCols, sizeOfCols) to 
measure its independent impact. For instance, we currently have a 100k limit on 
subquery rows, so in all successful cases we'd only be measuring the size of up 
to 100k rows by default. The benchmark may also help determine the default 
parameters we might have to set (like the 'n' for sampling, if needed). Beyond 
that, we could consider different strategies for fixed-width and variable-width 
columns, or even caching the estimated size of a subquery to help when the same 
subquery runs concurrently.
   2. Should we name the config `maxSubqueryResultMemory` to make it clearer and 
better scoped? 
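
   To make point 1 concrete, here is a rough sketch of what such a benchmark 
could look like. It is plain Java and not the PR's code: `estimateValueSize`, 
`estimateSampled`, `makeRows` and the "every n-th row" sampling are hypothetical 
stand-ins, and the real `estimateResultRowSize` may work quite differently. A 
proper version would probably use JMH instead of `System.nanoTime()`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowSizeEstimateSketch
{
  // Hypothetical per-value estimate: fixed width for numbers, UTF-8 length otherwise.
  static long estimateValueSize(Object value)
  {
    if (value == null) {
      return 0L;
    }
    if (value instanceof Long || value instanceof Double) {
      return 8L;
    }
    if (value instanceof Integer || value instanceof Float) {
      return 4L;
    }
    return value.toString().getBytes(StandardCharsets.UTF_8).length;
  }

  static long estimateRowSize(Object[] row)
  {
    long size = 0L;
    for (Object value : row) {
      size += estimateValueSize(value);
    }
    return size;
  }

  // Measures every n-th row and scales the result up to the full row count.
  static long estimateSampled(List<Object[]> rows, int n)
  {
    if (n < 1) {
      throw new IllegalArgumentException("n must be >= 1");
    }
    if (rows.isEmpty()) {
      return 0L;
    }
    long sampledBytes = 0L;
    long sampledRows = 0L;
    for (int i = 0; i < rows.size(); i += n) {
      sampledBytes += estimateRowSize(rows.get(i));
      sampledRows++;
    }
    return sampledBytes * rows.size() / sampledRows;
  }

  // Synthetic rows: even columns hold longs (fixed width),
  // odd columns hold strings of sizeOfCols characters (variable width).
  static List<Object[]> makeRows(int numRows, int numCols, int sizeOfCols)
  {
    char[] chars = new char[sizeOfCols];
    Arrays.fill(chars, 'x');
    String text = new String(chars);
    List<Object[]> rows = new ArrayList<>(numRows);
    for (int r = 0; r < numRows; r++) {
      Object[] row = new Object[numCols];
      for (int c = 0; c < numCols; c++) {
        row[c] = (c % 2 == 0) ? (Object) Long.valueOf(r) : (Object) text;
      }
      rows.add(row);
    }
    return rows;
  }

  public static void main(String[] args)
  {
    int[] rowCounts = {1_000, 10_000, 100_000};
    int[] colCounts = {4, 16};
    int[] colSizes = {8, 64};
    for (int numRows : rowCounts) {
      for (int numCols : colCounts) {
        for (int sizeOfCols : colSizes) {
          List<Object[]> rows = makeRows(numRows, numCols, sizeOfCols);
          for (int n : new int[]{1, 10, 100}) {
            long start = System.nanoTime();
            long bytes = estimateSampled(rows, n);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.printf(
                "rows=%d cols=%d colSize=%d n=%d -> ~%d bytes in %d us%n",
                numRows, numCols, sizeOfCols, n, bytes, micros
            );
          }
        }
      }
    }
  }
}
```

   Comparing the n=1 timings against n=10 and n=100 for each shape would show 
how much sampling actually saves, and whether the estimate drifts once the 
variable-width columns stop being uniform (they are uniform in this synthetic 
data, so the sampled estimate matches the exact one here).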


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


