gianm commented on issue #8071: add aggregators for computing mean/average
URL: https://github.com/apache/incubator-druid/issues/8071#issuecomment-513528841
 
 
   Would the algorithm you mention be more or less efficient than the following 
-- which is what people mostly do today in Druid?
   
   ```
   // maintain following variables
   long count;
   double sum;
   
   // update with a value v
   count++;
   sum += v;
   
   // merging
   count = count1 + count2;
   sum = sum1 + sum2;
   
   // finalizing
   result = sum / count;
   ```
   
   I think yours is less likely to overflow `sum`, but it involves a division 
on every update, and that might slow things down. (I haven't benchmarked it.)
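
To make the trade-off concrete, here is a minimal Java sketch of both approaches. It assumes the algorithm under discussion is the usual incremental running-mean update, `mean += (v - mean) / count`, which is a guess based on the "division on every update" remark and is not confirmed in this thread. The names `MeanSketch`, `SumCountMean`, and `RunningMean` are hypothetical and are not Druid classes.

```java
// Sketch only: all class and method names here are hypothetical, not Druid APIs.
public class MeanSketch {

    /** Sum + count approach from the comment: a single division, at finalization. */
    static final class SumCountMean {
        long count;
        double sum;

        void add(double v) {
            count++;
            sum += v;
        }

        void merge(SumCountMean other) {
            count += other.count;
            sum += other.sum;
        }

        double result() {
            return sum / count; // NaN when nothing was added (0.0 / 0)
        }
    }

    /** Assumed alternative: incremental running mean, one division per update. */
    static final class RunningMean {
        long count;
        double mean;

        void add(double v) {
            count++;
            mean += (v - mean) / count;
        }

        void merge(RunningMean other) {
            long total = count + other.count;
            if (total == 0) {
                return; // both sides empty, nothing to combine
            }
            // Weighted shift toward the other partial mean.
            mean += (other.mean - mean) * ((double) other.count / total);
            count = total;
        }

        double result() {
            return count == 0 ? Double.NaN : mean;
        }
    }

    public static void main(String[] args) {
        SumCountMean s = new SumCountMean();
        RunningMean r = new RunningMean();
        // Two huge values: the plain sum overflows to Infinity, the running mean does not.
        for (double v : new double[] {Double.MAX_VALUE, Double.MAX_VALUE}) {
            s.add(v);
            r.add(v);
        }
        System.out.println("sum/count:    " + s.result());
        System.out.println("running mean: " + r.result());
    }
}
```

The running mean is not immune to overflow either (`v - mean` can still overflow for values of opposite sign near the double range), and this sketch says nothing about speed. Whether the per-update division matters in practice would need a benchmark.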
   
   Btw, some comments related to Druid SQL:
   
   - In Druid SQL, computing an average is already easy. You can write 
`AVG(value)` and it will generate the sum and count aggs and the necessary 
postagg. I'm hoping that in general, people looking for easy ways to do stuff 
will gravitate towards SQL.
   - If we do end up adding native mean aggregators that are better than doing 
the sum+count+postagg thing, then we can also modify Druid SQL to use them. The 
code is in `AvgSqlAggregator`.

