D Mir: standard deviation speed

tastyminerals via Digitalmars-d-learn Tue, 14 Jul 2020 12:05:55 -0700

I am trying to implement standard deviation calculation in Mirfor benchmark purposes.I have two implementations. One is the straightforward std =sqrt(mean(abs(x - x.mean())**2)) and the other follows Welford'salgorithm for computing variance (as described here:https://www.johndcook.com/blog/standard_deviation/).

However, although the first implementation should be lessefficient / slower, the benchmarking results show a startlingdifference in its favour. I'd like to understand if I am doingsomething wrong and would appreciate some explanation.


# Naive std
import std.math : abs;
import mir.ndslice;
import mir.math.common : pow, sqrt, fastmath;
import mir.math.sum : sum;
import mir.math.stat : mean;

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;
    double n = cast(double) flatMatrix.length;
    double mu = flatMatrix.mean;

return (flatMatrix.map!(a => (a - mu).abs ^^ 2).sum!"fast" /n).sqrt;

}


# std with Welford's variance
@fastmath double sdWelford(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;

    double m0 = 0.0;
    double m1 = 0.0;
    double s0 = 0.0;
    double s1 = 0.0;
    double n = 0.0;
    foreach (x; flatMatrix.field)
    {
        ++n;
        m1 = m0 + (x - m0) / n;
        s1 = s0 + (x - m0) * (x - m1);
        m0 = m1;
        s0 = s1;
    }
    // switch to n - 1 for sample variance
    return (s1 / n).sqrt;
}

Benchmarking:

Naive std (1k loops):
  std of [60, 60] matrix 0.001727
  std of [300, 300] matrix 0.043452
  std of [600, 600] matrix 0.182177
  std of [800, 800] matrix 0.345367

std with Welford's variance (1k loops):
  std of [60, 60] matrix 0.0225476
  std of [300, 300] matrix 0.534528
  std of [600, 600] matrix 2.0714
  std of [800, 800] matrix 3.60142

D Mir: standard deviation speed

Reply via email to