On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> (snip)
> > +double qdist_avg(const struct qdist *dist)
> > +{
> > +    unsigned long count;
> > +    size_t i;
> > +    double ret = 0;
> > +
> > +    count = qdist_sample_count(dist);
> > +    if (!count) {
> > +        return NAN;
> > +    }
> > +    for (i = 0; i < dist->n; i++) {
> > +        struct qdist_entry *e = &dist->entries[i];
> > +
> > +        ret += e->x * e->count / count;
>
> Please use Welford's method or something like that, see
> http://stackoverflow.com/a/1346890.
Yes, the way the mean is computed right now, we might suffer from underflow
if count is huge. But I'd rather take that than the perf penalty of an
iterative method (such as Welford's).

Note that we might have huge numbers of items, e.g. one item per head bucket
in qht's occupancy qdist (and 0.5M head buckets is easy to achieve).

If we were to use an iterative method, we'd need to do something like:

double qdist_avg(const struct qdist *dist)
{
    size_t i, j;
    unsigned long count = 0;
    double ret = 0;

    if (!qdist_sample_count(dist)) {
        return NAN;
    }
    /* compute a running mean to prevent under/overflow */
    for (i = 0; i < dist->n; i++) {
        struct qdist_entry *e = &dist->entries[i];

        for (j = 0; j < e->count; j++) {
            /* fold in one sample at a time: new = old + (x - old) / n */
            ret += (e->x - ret) / ++count;
        }
    }
    return ret;
}

Note that naively skipping the inner loop (i.e. doing a single unweighted
update per bucket) would be incorrect: each of the e->count samples has to
be folded in with its own 1/count weight, otherwise buckets with large
counts end up underrepresented in the mean.

I measured the time it takes to execute qdist_avg(&hst.occupancy) at the
end of booting debian jessie for ARM. The difference is significant:

  Original:  0.000002 s
  Iterative: 0.002846 s

So really I think we should be OK with a potential underflow. If you want
I can add a comment to remind our future selves of these findings.

		Emilio
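
P.S. If we ever did want the incremental form's stability without the
per-sample loop, the correct way to collapse a bucket would be a weighted
update: adding n identical samples of value x to a mean m over count
samples gives m' = m + (x - m) * n / (count + n). A rough, unbenchmarked
sketch along those lines (qdist_avg_weighted is just an illustrative name):

double qdist_avg_weighted(const struct qdist *dist)
{
    size_t i;
    unsigned long count = 0;
    double ret = 0;

    if (!qdist_sample_count(dist)) {
        return NAN;
    }
    for (i = 0; i < dist->n; i++) {
        struct qdist_entry *e = &dist->entries[i];

        /* fold in the whole bucket at once, weighted by its count */
        ret += (e->x - ret) * ((double)e->count / (count + e->count));
        count += e->count;
    }
    return ret;
}

Like the original sum, this does one update per bucket, so it should be
comparably cheap, but I haven't measured it.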