jihoonson commented on issue #11544:
URL: https://github.com/apache/druid/issues/11544#issuecomment-894013851


   Yes, I think the problem is too many items per country. Druid uses a 
fixed-size buffer per row to keep the sketch (`DoublesSketch`). Since the 
buffer size is fixed but Druid doesn't know the number of items in advance, it 
estimates the buffer size to be large enough to hold one billion items in the 
sketch. So, when you have fewer than one billion items, the sketch fits in 
the buffer and everything works well. The interesting part is when you have 
more than one billion items. In that case, Druid lets the sketch allocate extra 
heap memory to hold the items that don't fit in the buffer. However, 
`DoublesSketch` does not work as we expected and throws an NPE when it tries to 
allocate more memory. This issue is filed in 
https://github.com/apache/datasketches-java/issues/358.
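
   To give a rough feel for the sizing described above, here is a simplified 
model of the buffer estimate. It is only a sketch under assumptions, not the 
actual Druid or DataSketches code: it assumes a classic quantiles sketch that 
keeps a base buffer of `2 * k` items plus one `k`-item level for each doubling 
of `n / (2 * k)`, a small fixed header, and `k = 128`. The function names are 
hypothetical.

```python
# Simplified model (an assumption, not Druid's or DataSketches' actual code):
# a classic quantiles sketch with parameter k keeps a base buffer of 2*k items
# plus one k-item level for each doubling of n / (2*k).

def levels_needed(k: int, n: int) -> int:
    """Number of k-item levels needed to summarize n items."""
    return (n // (2 * k)).bit_length()


def estimated_buffer_bytes(k: int, n: int) -> int:
    """Rough number of bytes for a sketch sized to hold up to n items."""
    header_slots = 4  # assumed: 2 header longs plus min and max, 8 bytes each
    item_slots = 2 * k + levels_needed(k, n) * k
    return (header_slots + item_slots) * 8


K = 128                    # assumed sketch size parameter
MAX_ITEMS = 1_000_000_000  # the "one billion items" the buffer is sized for

fixed = estimated_buffer_bytes(K, MAX_ITEMS)
print(fixed)  # bytes reserved per row in this model

# Ten times more items need extra levels, so the sketch no longer fits in the
# fixed buffer and would have to grow elsewhere (the heap, per the comment).
print(estimated_buffer_bytes(K, 10 * MAX_ITEMS) > fixed)
```

   In this model the requirement grows only logarithmically with the number of 
items, which is why a buffer sized for one billion items stays small, and also 
why exceeding that count forces the sketch to grow beyond the fixed buffer.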
   
   As a workaround, you could use other functions to compute approximate 
quantiles, such as `DS_QUANTILES_SKETCH` or `APPROX_QUANTILE`. Note that 
`APPROX_QUANTILE` uses the deprecated approximate histogram aggregator and its 
accuracy might not be great.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


