gianm opened a new issue #6743:
URL: https://github.com/apache/druid/issues/6743


   By default, theta sketches have a very large maximum size relative to typical 
row sizes (about 250KB with "size" set to the default of 16384). The 
ingestion-time row size estimator (getMaxBytesPerRowForAggregators in 
OnheapIncrementalIndex) uses this maximum as its per-row estimate whenever 
theta sketches are used at ingestion time, so it overestimates memory use and 
triggers far more spills than necessary. It would be better to base the 
estimate on the sketch's actual current size. I'm not sure how to get that, though.
   
   @leerho - or anyone else - do you have any ideas or suggestions?
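
To make the scale of the overestimate concrete, here is a rough back-of-the-envelope sketch. It is not Druid's estimator code. It assumes the worst-case sketch footprint is roughly 2 * size hash slots of 8 bytes each (ignoring the small preamble), which lines up with the "about 250KB" figure above. The class name, the method names, the 100 MiB memory budget, and the 1 KiB "typical actual size" are all hypothetical values chosen for illustration.

```java
public final class RowSizeEstimateSketch {

    // Assumption: the worst case is about 2 * nominalEntries slots of 8 bytes,
    // preamble ignored. For size = 16384 this gives 262144 bytes.
    public static long worstCaseSketchBytes(int nominalEntries) {
        return 2L * nominalEntries * Long.BYTES;
    }

    // Simplified model: rows are buffered until the summed per-row estimates
    // reach the memory budget, at which point a spill is triggered.
    public static long rowsBeforeSpill(long memoryBudgetBytes, long estimatedBytesPerRow) {
        return memoryBudgetBytes / estimatedBytesPerRow;
    }

    public static void main(String[] args) {
        long budget = 100L * 1024 * 1024;             // hypothetical 100 MiB budget
        long worstCase = worstCaseSketchBytes(16384); // estimate based on max size
        long typical = 1024L;                         // hypothetical actual sketch size

        System.out.println("rows per spill, max-size estimate: "
            + rowsBeforeSpill(budget, worstCase));
        System.out.println("rows per spill, actual-size estimate: "
            + rowsBeforeSpill(budget, typical));
    }
}
```

Under these assumptions the max-size estimate allows a few hundred rows per spill, while an estimate based on actual size would allow on the order of a hundred thousand.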

