Andrew Dunstan wrote on 2015-01-20:
> On 01/20/2015 01:26 PM, Arne Scheffer wrote:
> >Interesting patch.
> >I did a quick review looking only into the patch file.

> >The "sum of variances" variable contains
> >the "sum of squared differences" instead, I think.

> Umm, no. It's not.

Umm, yes, I think it is ;-)

>    e->counters.sum_var_time +=
>                (total_time - old_mean) * (total_time - e->counters.mean_time);
> This is not a square that's being added.

That's correct. Nevertheless, it is the difference between the newly
computed sum of squared differences and the preceding one, and that
difference is added in every step.

> old_mean is not the same as
> e->counters.mean_time.

> Since the variance is this value divided by (n - 1), AIUI, I think
> "sum of variances" isn't a bad description. I'm open to alternative
> suggestions.

> >And a very minor aspect:
> >The term "standard deviation" in your code stands for
> >(corrected) sample standard deviation, I think,
> >because you divide by n-1 instead of n to keep the
> >estimator unbiased.
> >How about mentioning the prefix "sample"
> >to indicate this being the estimator?

> I don't understand. I'm following pretty exactly the calculations
> stated at <http://www.johndcook.com/blog/standard_deviation/>

(There is nothing wrong with those calculations; Welford's algorithm
simply adds the differences mentioned above sequentially.)

Best regards,
Arne

> I'm not a statistician. Perhaps others who are more literate in
> statistics can comment on this paragraph.

> >And I'm sure I'm missing C specifics (again)
> >(or it's the reduced patch file scope),
> >but you introduce sqrtd, but sqrt is called?

> Good catch. Will fix.

> cheers

> andrew