AlexanderSaydakov commented on issue #11544: URL: https://github.com/apache/druid/issues/11544#issuecomment-895525040
As Lee Rhodes said, it might take quite a while to fix the root cause and go through release cycles for datasketches-memory and datasketches-java. Therefore I would suggest using the workaround that I mentioned above: increasing the MAX_STREAM_LENGTH constant, which determines the size pre-allocated for each sketch in the BufferAggregator. The assumption was that, due to data fragmentation across multiple dimensions with a power-law distribution, only a small number of sketches would reach that size and move to on-heap memory. Since this mechanism is broken now, let's set a much higher limit until it is fixed, and let's do it quickly, before the 0.22 branch is created. I can do a pull request if we agree on the value.

Here is the size of one slot in the BufferAggregator, in bytes, for the default sketch parameter K=128 and different values of MAX_STREAM_LENGTH:

| MAX_STREAM_LENGTH | slot size (bytes) |
|---|---|
| 1B (current) | 24608 |
| 10B | 28704 |
| 100B | 31776 |
| 1T | 34848 |

I suggest setting it to 1T.
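To illustrate why the slot size grows so slowly with MAX_STREAM_LENGTH: the classic quantiles sketch adds one k-item level for each doubling of the stream length beyond 2k, so the pre-allocated size grows only logarithmically. The sketch below is a hypothetical reconstruction of that sizing formula (preamble plus a base buffer of 2k items plus one k-item array per level, 8 bytes per double); the authoritative computation lives in datasketches-java (`DoublesSketch.getUpdatableStorageBytes`), so treat this as an illustration, not the library's actual code:

```python
def updatable_storage_bytes(k: int, max_stream_length: int) -> int:
    """Approximate pre-allocated size in bytes for a classic quantiles
    sketch of parameter k sized for max_stream_length items.

    Assumed layout: 32-byte preamble + (2 + numLevels) * k doubles,
    where numLevels grows by one per doubling of n beyond 2k.
    """
    # ceil(n / (2k)): how many full base buffers the stream fills
    base_buffers = -(-max_stream_length // (2 * k))
    # bit_length() of x gives floor(log2(x)) + 1, i.e. the level count
    num_levels = base_buffers.bit_length()
    return 32 + (2 + num_levels) * k * 8

# Reproduce the slot sizes quoted above for the default K=128
for label, n in [("1B", 10**9), ("10B", 10**10), ("100B", 10**11), ("1T", 10**12)]:
    print(label, updatable_storage_bytes(128, n))
```

This reproduces the four values in the table (24608, 28704, 31776, 34848 bytes), which is why going from a 1B to a 1T limit costs only about 10 KB more per slot.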
