I am trying to implement standard deviation calculation in Mir for benchmark purposes. I have two implementations. One is the straightforward std = sqrt(mean(abs(x - x.mean())**2)) and the other follows Welford's algorithm for computing variance (as described here: https://www.johndcook.com/blog/standard_deviation/).

However, although the first implementation should be less efficient / slower, the benchmarking results show a startling difference in its favour. I'd like to understand if I am doing something wrong and would appreciate some explanation.

# Naive std
import std.math : abs;
import mir.ndslice;
import mir.math.common : pow, sqrt, fastmath;
import mir.math.sum : sum;
import mir.math.stat : mean;

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;
    double n = cast(double) flatMatrix.length;
    double mu = flatMatrix.mean;
return (flatMatrix.map!(a => (a - mu).abs ^^ 2).sum!"fast" / n).sqrt;
}


# std with Welford's variance
@fastmath double sdWelford(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;

    double m0 = 0.0;
    double m1 = 0.0;
    double s0 = 0.0;
    double s1 = 0.0;
    double n = 0.0;
    foreach (x; flatMatrix.field)
    {
        ++n;
        m1 = m0 + (x - m0) / n;
        s1 = s0 + (x - m0) * (x - m1);
        m0 = m1;
        s0 = s1;
    }
    // switch to n - 1 for sample variance
    return (s1 / n).sqrt;
}

Benchmarking:

Naive std (1k loops):
  std of [60, 60] matrix 0.001727
  std of [300, 300] matrix 0.043452
  std of [600, 600] matrix 0.182177
  std of [800, 800] matrix 0.345367

std with Welford's variance (1k loops):
  std of [60, 60] matrix 0.0225476
  std of [300, 300] matrix 0.534528
  std of [600, 600] matrix 2.0714
  std of [800, 800] matrix 3.60142

Reply via email to