himanshug commented on issue #6743: IncrementalIndex generally overestimates 
theta sketch size
URL: 
https://github.com/apache/incubator-druid/issues/6743#issuecomment-486450735
 
 
   @leerho It appears that postgres does have its own memory allocator, which 
provides the "palloc" and "pfree" functions. @gianm was suggesting something 
similar. In that case the DS library would offer some way of passing in those 
functions. Druid (or other users of DS) would implement the memory allocator 
in whatever way makes the most sense for them (e.g. allocating a big chunk of 
memory at startup and then handing out pieces of it in "palloc", or delegating 
each "palloc" to the underlying JVM heap or the OS ...).
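
   To make the two strategies concrete, here is a minimal sketch. Everything in 
it is an assumption for illustration: `SketchAllocator`, `ArenaAllocator` and 
`HeapAllocator` are hypothetical names, not existing DS or Druid APIs, and the 
arena never reclaims memory.

```java
import java.nio.ByteBuffer;

// Hypothetical allocator contract, named after the postgres functions above.
// This is NOT an existing DataSketches or Druid interface.
interface SketchAllocator {
    ByteBuffer palloc(int bytes);
    void pfree(ByteBuffer chunk);
}

// Strategy 1: reserve one big block at startup and hand out pieces of it.
final class ArenaAllocator implements SketchAllocator {
    private final ByteBuffer arena;
    private int next = 0; // offset of the first unallocated byte

    ArenaAllocator(int capacityBytes) {
        this.arena = ByteBuffer.allocateDirect(capacityBytes);
    }

    @Override
    public synchronized ByteBuffer palloc(int bytes) {
        if (bytes < 0 || bytes > arena.capacity() - next) {
            throw new IllegalStateException("arena exhausted");
        }
        // Carve out [next, next + bytes) as an independent view of the arena.
        ByteBuffer view = arena.duplicate();
        view.limit(next + bytes);
        view.position(next);
        next += bytes;
        return view.slice();
    }

    @Override
    public void pfree(ByteBuffer chunk) {
        // No-op in this sketch; a real arena would need a free list or a reset.
    }

    synchronized int usedBytes() {
        return next;
    }
}

// Strategy 2: delegate every request to the JVM heap.
final class HeapAllocator implements SketchAllocator {
    @Override
    public ByteBuffer palloc(int bytes) {
        return ByteBuffer.allocate(bytes);
    }

    @Override
    public void pfree(ByteBuffer chunk) {
        // Nothing to do; the garbage collector reclaims the buffer.
    }
}
```

   With the arena variant the caller knows exactly how many bytes are in use at 
any time (`usedBytes()`), which is the property IncrementalIndex would need in 
order to decide when to spill.
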
   I looked into this a long time ago, and one way of hacking it was to use 
"MemoryRegion" and "MemoryRequest" as in 
https://github.com/himanshug/druid/blob/growable_aggregator_final/extensions/datasketches/src/main/java/io/druid/query/aggregation/datasketches/theta/SketchResizableBufferAggregator.java#L120
 (as you might guess, this is based on a pretty old version of the DS library :) ).
   
   @gianm for IncrementalIndex, if the above is done, the simplest approach 
would be to use BufferAggregator, which would also be more accurate than trying 
to do sizeOf(aggregator). The current implementation, which spills based on 
`getMaxIntermediateSize()`, is puzzling to me, as the number returned there is 
totally unrelated to what the smallest/current/largest heap utilization of an 
on-heap Aggregator would be. That number is only relevant when BufferAggregator 
is used.
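
   A rough sketch of why the number is meaningful only for buffer-backed 
aggregation: when every row gets a fixed slot of the maximum intermediate size 
inside a preallocated buffer, the spill decision becomes exact arithmetic. The 
types below (`FixedSlotSum`, `BufferBackedIndex`) are simplified, hypothetical 
stand-ins, not Druid's real BufferAggregator or IncrementalIndex.

```java
import java.nio.ByteBuffer;

// Hypothetical buffer-backed aggregator: a running long sum in a fixed slot.
final class FixedSlotSum {
    // Bytes reserved per row; plays the role of getMaxIntermediateSize().
    static final int MAX_INTERMEDIATE_SIZE = Long.BYTES;

    static void init(ByteBuffer buf, int position) {
        buf.putLong(position, 0L);
    }

    static void aggregate(ByteBuffer buf, int position, long value) {
        buf.putLong(position, buf.getLong(position) + value);
    }

    static long get(ByteBuffer buf, int position) {
        return buf.getLong(position);
    }
}

// Hypothetical index that stores one aggregator slot per row in one buffer.
final class BufferBackedIndex {
    private final ByteBuffer buffer;
    private int rows = 0;

    BufferBackedIndex(int maxBytes) {
        this.buffer = ByteBuffer.allocate(maxBytes);
    }

    // False means the next row would not fit and the caller should spill.
    boolean canAddRow() {
        return (rows + 1) * FixedSlotSum.MAX_INTERMEDIATE_SIZE <= buffer.capacity();
    }

    // Reserves a slot for a new row and returns its position in the buffer.
    int addRow() {
        if (!canAddRow()) {
            throw new IllegalStateException("buffer full, spill first");
        }
        int position = rows * FixedSlotSum.MAX_INTERMEDIATE_SIZE;
        FixedSlotSum.init(buffer, position);
        rows++;
        return position;
    }

    void aggregate(int position, long value) {
        FixedSlotSum.aggregate(buffer, position, value);
    }

    long get(int position) {
        return FixedSlotSum.get(buffer, position);
    }

    // Exact by construction, unlike an estimate of on-heap object sizes.
    int bytesUsed() {
        return rows * FixedSlotSum.MAX_INTERMEDIATE_SIZE;
    }
}
```

   An on-heap Aggregator has no such fixed slot, so multiplying the row count by 
the maximum intermediate size says nothing reliable about its real footprint.
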
