[GitHub] [druid] gianm commented on issue #12022: [Proposal] Improve memory estimations in `OnHeapIncrementalIndex`

GitBox Sat, 04 Dec 2021 10:58:55 -0800


gianm commented on issue #12022:
URL: https://github.com/apache/druid/issues/12022#issuecomment-986076579



   To me this feels like a change that likely doesn't require incompatibility 
in order to perform well, for a couple specific reasons (below). So I do think 
we should do benchmarking before deciding to make things incompatible.
   
   > doing this in the proposed backwards compatible way adds a branch after 
every single call to the aggregator which, given that this is a hot method, 
could have significant inadvertent impacts to query performance
   
   I was thinking that query-time callers would stick with `void aggregate`, so 
no branching would be needed on those paths. Currently the users of Aggregator 
are timeseries and topN, which have a natural limitation on the number of 
Aggregators they create anyway: timeseries only makes one set per time bucket 
and topN has its threshold. In practice I haven't noticed any issues with these 
query types running out of memory, so I don't think there is a reason to add 
the overhead of size estimation.
   
   > My only worry with that is that the most common implementation of 
Aggregator.aggregate would likely be via that default implementation and I'm 
not sure if this would be adding yet another virtual function call to the hot 
path.
   
   This one seems like the kind of thing that the JVM should be able to 
optimize. Also, for the reason above, it's a code path that I'd only expect to 
get called during ingestion anyway, where aggregation time is not as big a 
driver of overall performance as it is during queries.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [druid] gianm commented on issue #12022: [Proposal] Improve memory estimations in `OnHeapIncrementalIndex`

Reply via email to