gianm commented on issue #8071: add aggregators for computing mean/average
URL: https://github.com/apache/incubator-druid/issues/8071#issuecomment-513528841

Would the algorithm you mention be more or less efficient than the following -- which is what people mostly do today in Druid?

```
// maintain following variables
long count;
double sum;

// update with a value v
count++;
sum += v;

// merging
count = count1 + count2;
sum = sum1 + sum2;

// finalizing
result = sum / count;
```

I think yours has less likelihood of overflowing `sum`, but it involves a division on every update, and that might slow things down. (I haven't benchmarked it.)

Btw, some comments related to Druid SQL:

- In Druid SQL it's already easy. You can write `AVG(value)` and it will generate the sum and count aggs and the necessary postagg. I'm hoping that in general, people looking for easy ways to do stuff will gravitate towards SQL.
- If we do end up adding native mean aggregators that are better than doing the sum+count+postagg thing, then we can also modify Druid SQL to use them. The code is in `AvgSqlAggregator`.
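
To make the comparison concrete, here is a minimal Java sketch of both approaches. It assumes the algorithm being asked about is the usual running-mean update, `mean += (v - mean) / count`, which this comment does not spell out. The class and method names (`MeanSketch`, `SumCount`, `RunningMean`) are hypothetical and are not Druid APIs.

```java
// Hypothetical sketch: names are illustrative only, not part of Druid.
final class MeanSketch {

  /** Sum + count state, mirroring the pseudocode in the comment. */
  static final class SumCount {
    long count;
    double sum;

    void add(double v) {
      count++;
      sum += v; // no division on the update path
    }

    void merge(SumCount other) {
      count += other.count;
      sum += other.sum;
    }

    double result() {
      return sum / count; // single division at finalization; NaN if empty
    }
  }

  /** Running mean (assumed form of the alternative): one division per update. */
  static final class RunningMean {
    long count;
    double mean;

    void add(double v) {
      count++;
      mean += (v - mean) / count; // state stays near the magnitude of the inputs
    }

    void merge(RunningMean other) {
      long total = count + other.count;
      if (total == 0) {
        return; // both sides empty, nothing to combine
      }
      // Weighted combination of the two partial means.
      mean += (other.mean - mean) * ((double) other.count / total);
      count = total;
    }

    double result() {
      return count == 0 ? Double.NaN : mean;
    }
  }

  public static void main(String[] args) {
    SumCount s1 = new SumCount();
    SumCount s2 = new SumCount();
    RunningMean r1 = new RunningMean();
    RunningMean r2 = new RunningMean();
    // Two partitions, as if aggregated on separate segments and then merged.
    for (double v : new double[] {1, 2}) { s1.add(v); r1.add(v); }
    for (double v : new double[] {3, 4}) { s2.add(v); r2.add(v); }
    s1.merge(s2);
    r1.merge(r2);
    System.out.println(s1.result() + " " + r1.result());
  }
}
```

Both variants carry a `long` and a `double` as state and merge in constant time, so the difference comes down to the per-row division against the reduced overflow risk. Which one is faster in practice would still need a benchmark.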
