On Tuesday, 14 July 2020 at 19:36:21 UTC, jmh530 wrote:
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:
[...]
It would be helpful to provide a link.
You should only need one accumulator for the mean and one for the centered sum
of squares. See the Python example in the Welford section of
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
This may have somehow broken the optimization.
variance and standardDeviation were recently added to
mir.math.stat. They have the option to switch between Welford's
algorithm and others. What you call the naive algorithm
is VarianceAlgo.twoPass, and Welford's algorithm can be
toggled with VarianceAlgo.online, which is the default option.
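For concreteness, the two approaches named here (a two-pass calculation and Welford's single-accumulator online update) can be sketched in plain Python, following the Wikipedia pseudocode linked above. This is only an illustration of the algorithms, not mir's actual implementation:

```python
def variance_two_pass(xs):
    # "Naive"/two-pass: first pass computes the mean,
    # second pass computes the centered sum of squares.
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n  # population variance


def variance_welford(xs):
    # Welford's online algorithm: a single pass with one running mean
    # and one centered-sum-of-squares accumulator (m2).
    mean = 0.0
    m2 = 0.0
    for n, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / n          # fast incremental mean update
        m2 += delta * (x - mean)   # uses both the old and new mean
    return m2 / len(xs)            # population variance
```

Both return the same population variance; Welford's version avoids the second pass and is numerically stabler than the naive sum-of-squares-minus-square-of-sum formula.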
It would also be interesting if you re-did the analysis with
the built-in mir functions.
There are some other small differences between your
implementation and the one in mir, beyond the issue discussed
above. You take the absolute value before the square root and
force the use of sum!"fast". Another difference is that
VarianceAlgo.online in mir uses a precise calculation of
the mean rather than the fast update that Welford uses. This
may have a modest impact on performance, but should provide
more accurate results.
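One way to read the "precise mean" remark: instead of Welford's incremental delta/n update, keep an exact running sum and recompute the mean at each step. A Python sketch of that idea (the function name and details here are illustrative assumptions, not mir's code):

```python
def variance_welford_precise_mean(xs):
    # Welford-style single pass, but the mean is recomputed from an
    # exact running sum each step instead of the incremental update.
    # Slightly more work per element, potentially more accurate.
    total = 0.0
    mean = 0.0
    m2 = 0.0
    for n, x in enumerate(xs, start=1):
        old_mean = mean
        total += x
        mean = total / n                    # "precise" mean
        m2 += (x - old_mean) * (x - mean)   # centered sum of squares
    return m2 / len(xs)                     # population variance
```

For well-conditioned data both variants agree; the differences show up when the mean update itself accumulates rounding error.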
Ok, the wiki page looks more informative; I shall look into my
Welford implementation.
I've just used standardDeviation from Mir, and it showed even
worse results than both of the examples above.
Here is the (WIP) project as of now; see line 160 in
https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d
std of [60, 60] matrix 0.0389492 (> 0.001727)
std of [300, 300] matrix 1.03592 (> 0.043452)
std of [600, 600] matrix 4.2875 (> 0.182177)
std of [800, 800] matrix 7.9415 (> 0.345367)