rohangarg commented on pull request #12139: URL: https://github.com/apache/druid/pull/12139#issuecomment-1010068322
Some thoughts:

1. For performance, I'd suggest also benchmarking `estimateResultRowSize` (the complete function) as a function of (numRows, numCols, sizeOfCols) to measure its impact independently. For instance, we currently have a 100k limit on subquery rows, so in all successful cases we would only be measuring the size of up to 100k rows by default. The benchmark might also help determine the default parameters we may have to set (like the 'n' for sampling, if needed). We could also consider other things, such as different strategies for fixed-width and variable-width columns, or even caching the estimated size for a subquery to help when the same subquery runs concurrently.
2. Should we name the config `maxSubqueryResultMemory` to make it clearer and more tightly scoped?
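
To make the benchmarking idea in point 1 a bit more concrete, here is a rough sketch in Java. It is not the PR's `estimateResultRowSize`: the class `RowSizeSketch`, its methods, the per-value byte sizes, and the "every n-th row" sampling scheme are all hypothetical and chosen only for illustration. It sweeps the row count and the sampling stride 'n' over synthetic rows that mix fixed-width (`Long`) and variable-width (`String`) columns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Druid.
final class RowSizeSketch {

    // Assumed, illustrative per-value sizes in bytes (not JVM-accurate).
    static long estimateValueSize(Object value) {
        if (value == null) {
            return 0L;
        }
        if (value instanceof Long || value instanceof Double) {
            return 8L; // fixed-width column
        }
        if (value instanceof String) {
            return 2L * ((String) value).length(); // variable-width, assuming 2 bytes per char
        }
        return 16L; // arbitrary fallback for any other type
    }

    static long estimateRowSize(Object[] row) {
        long size = 0L;
        for (Object value : row) {
            size += estimateValueSize(value);
        }
        return size;
    }

    // Measure every n-th row, then scale the sampled total up to the full row count.
    static long estimateSampled(List<Object[]> rows, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        if (rows.isEmpty()) {
            return 0L;
        }
        long sampledBytes = 0L;
        long sampledRows = 0L;
        for (int i = 0; i < rows.size(); i += n) {
            sampledBytes += estimateRowSize(rows.get(i));
            sampledRows++;
        }
        return sampledBytes * rows.size() / sampledRows;
    }

    // Synthetic rows: even columns hold a Long, odd columns a String of stringLength chars.
    static List<Object[]> makeRows(int numRows, int numCols, int stringLength) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < stringLength; i++) {
            sb.append('x');
        }
        String text = sb.toString();
        List<Object[]> rows = new ArrayList<>(numRows);
        for (int r = 0; r < numRows; r++) {
            Object[] row = new Object[numCols];
            for (int c = 0; c < numCols; c++) {
                row[c] = (c % 2 == 0) ? (Object) Long.valueOf(r) : text;
            }
            rows.add(row);
        }
        return rows;
    }

    // Crude timing sweep over (numRows, n); column count and string length are fixed here.
    public static void main(String[] args) {
        for (int numRows : new int[] {1_000, 10_000, 100_000}) {
            List<Object[]> rows = makeRows(numRows, 8, 32);
            for (int n : new int[] {1, 10, 100}) {
                long start = System.nanoTime();
                long bytes = estimateSampled(rows, n);
                long micros = (System.nanoTime() - start) / 1_000;
                System.out.println("rows=" + numRows + " n=" + n
                        + " estimate=" + bytes + "B time=" + micros + "us");
            }
        }
    }
}
```

The `System.nanoTime()` loop is only meant to show the shape of the parameter sweep. Numbers we would actually rely on for choosing defaults should come from a proper harness such as JMH, with warmup and multiple iterations, and should also vary numCols and sizeOfCols.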
