Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 07:51:31 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 07:34:59 UTC, tastyminerals wrote:
On Wednesday, 15 July 2020 at 06:57:21 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals wrote:
On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
[...]
Good to know. So, it's fine to use it with sum!"fast" but better to avoid it for general purposes.
They are both more precise by default. This was a reply to your other post in the thread, sorry. Mir algorithms are more precise by default than the algorithms you have provided.
Right. Is this why standardDeviation is significantly slower?
Yes. It allows you to pick a summation option; you can try others besides the default in your benchmarks.

Indeed, I played around with VarianceAlgo and Summation options, and they impact the end result a lot.

    ans = matrix.flattened.standardDeviation!(VarianceAlgo.naive, Summation.appropriate);

    std of [300, 300] matrix 0.375903
    std of [60, 60] matrix 0.0156448
    std of [600, 600] matrix 1.54429
    std of [800, 800] matrix 3.03954

    ans = matrix.flattened.standardDeviation!(VarianceAlgo.online, Summation.appropriate);

    std of [300, 300] matrix 1.12404
    std of [60, 60] matrix 0.041968
    std of [600, 600] matrix 5.01617
    std of [800, 800] matrix 8.64363

The Summation.fast option behaves strangely though. I wonder what happened here?

    ans = matrix.flattened.standardDeviation!(VarianceAlgo.naive, Summation.fast);

    std of [300, 300] matrix 1e-06
    std of [60, 60] matrix 9e-07
    std of [600, 600] matrix 1.2e-06
    std of [800, 800] matrix 9e-07

    ans = matrix.flattened.standardDeviation!(VarianceAlgo.online, Summation.fast);

    std of [300, 300] matrix 9e-07
    std of [60, 60] matrix 9e-07
    std of [600, 600] matrix 1.1e-06
    std of [800, 800] matrix 1e-06
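A side note on the VarianceAlgo choices above, as a hedged Python sketch (not Mir's code): a "naive" one-pass variance typically computes E[x²] − mean², which is fast but can lose most of its significant digits to cancellation when the mean is large relative to the spread, while a two-pass formulation stays well conditioned:

```python
def var_sum_of_squares(xs):
    # One-pass "naive" form: mean of squares minus square of mean.
    # Fast, but the two terms can be huge and nearly equal, so the
    # final subtraction cancels away most significant digits.
    n = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    return ss / n - (s / n) ** 2

def var_two_pass(xs):
    # Two-pass form: compute the mean first, then sum small centered squares.
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n

small = [4.0, 7.0, 13.0, 16.0]        # true population variance is 22.5
shifted = [1e9 + x for x in small]    # same variance, but a huge mean
print(var_sum_of_squares(small), var_two_pass(small))      # both fine
print(var_sum_of_squares(shifted), var_two_pass(shifted))  # the first drifts badly
```

Welford's update and the two-pass form both avoid this cancellation; the remaining trade-offs are memory traffic (one pass over the data versus two) and how well the loop vectorizes.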
[OT] Re: D Mir: standard deviation speed
> I've mixed up @fastmath and @fmamath as well. No worries.

Seems like this might be a good opportunity to change one of the names to be more descriptive and more distinctive. FWIW, I read halfway through the thread before I caught onto the distinction. I can imagine making that same mistake in the future.
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 11:41:35 UTC, 9il wrote:
[snip]
Ah, no, my bad! You wrote @fmamath; I read it as @fastmath. @fmamath is OK here.
I've mixed up @fastmath and @fmamath as well. No worries.
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 11:37:23 UTC, jmh530 wrote:
On Wednesday, 15 July 2020 at 11:26:19 UTC, 9il wrote:
[snip]

    @fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)

@fastmath violates all summation algorithms except `"fast"`. The same bug is in the original author's post.
I hadn't realized that @fmamath was the problem, rather than @fastmath overall. @fmamath is used on many mir.math.stat functions, though admittedly not in the accumulators.
Ah, no, my bad! You wrote @fmamath; I read it as @fastmath. @fmamath is OK here.
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 11:26:19 UTC, 9il wrote:
[snip]

    @fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)

@fastmath violates all summation algorithms except `"fast"`. The same bug is in the original author's post.
I hadn't realized that @fmamath was the problem, rather than @fastmath overall. @fmamath is used on many mir.math.stat functions, though admittedly not in the accumulators.
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 11:23:00 UTC, jmh530 wrote:
On Wednesday, 15 July 2020 at 05:57:56 UTC, tastyminerals wrote:
[snip]
Here is a (WIP) project as of now. Line 160 in https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d

    std of [60, 60] matrix 0.0389492 (> 0.001727)
    std of [300, 300] matrix 1.03592 (> 0.043452)
    std of [600, 600] matrix 4.2875 (> 0.182177)
    std of [800, 800] matrix 7.9415 (> 0.345367)

I changed the dflags-ldc to "-mcpu=native -O" and compiled with `dub run --compiler=ldc2`. I got similar results to yours for both in the initial run. I changed sd to

    @fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)

@fastmath violates all summation algorithms except `"fast"`. The same bug is in the original author's post.
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 05:57:56 UTC, tastyminerals wrote:
[snip]
Here is a (WIP) project as of now. Line 160 in https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d

    std of [60, 60] matrix 0.0389492 (> 0.001727)
    std of [300, 300] matrix 1.03592 (> 0.043452)
    std of [600, 600] matrix 4.2875 (> 0.182177)
    std of [800, 800] matrix 7.9415 (> 0.345367)

I changed the dflags-ldc to "-mcpu=native -O" and compiled with `dub run --compiler=ldc2`. I got similar results to yours for both in the initial run. I changed sd to

    @fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)
    {
        pragma(inline, false);
        if (flatMatrix.empty)
            return 0.0;
        double n = cast(double) flatMatrix.length;
        double mu = flatMatrix.mean;
        return (flatMatrix.map!(a => (a - mu) ^^ 2).sum!"precise" / n).sqrt;
    }

and got

    std of [10, 10] matrix 0.0016321
    std of [20, 20] matrix 0.0069788
    std of [300, 300] matrix 2.42063
    std of [60, 60] matrix 0.0828711
    std of [600, 600] matrix 9.72251
    std of [800, 800] matrix 18.1356

And the biggest change by far was sum!"precise" instead of sum!"fast". When I ran your benchStd function with

    ans = matrix.flattened.standardDeviation!(double, "online", "fast");

I got

    std of [10, 10] matrix 1e-07
    std of [20, 20] matrix 0
    std of [300, 300] matrix 0
    std of [60, 60] matrix 1e-07
    std of [600, 600] matrix 0
    std of [800, 800] matrix 0

I got the same result with Summation.naive. That almost seems too low. The default is Summation.appropriate, which is resolved to Summation.pairwise in this case. It is faster than Summation.precise, but still slower than Summation.naive or Summation.fast. Your welfordSD should line up with Summation.naive.
When I change that to

    ans = matrix.flattened.standardDeviation!(double, "online", "precise");

I get

    Running .\mir_benchmarks_2.exe
    std of [10, 10] matrix 0.0031737
    std of [20, 20] matrix 0.0153603
    std of [300, 300] matrix 4.15738
    std of [60, 60] matrix 0.171211
    std of [600, 600] matrix 17.7443
    std of [800, 800] matrix 34.2592

I also tried changing your welfordSD function based on the stuff I mentioned above, but it did not make a large difference.
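For intuition on why the pairwise default sits between "fast" and "precise": pairwise summation splits the input recursively and adds the halves, so rounding error grows roughly with log2(n) rather than n, at nearly the cost of a plain loop. A generic Python sketch (illustrative only; Mir's implementation and block size will differ):

```python
import math

def pairwise_sum(xs, block=128):
    """Sum by recursive halving; small blocks fall back to a plain loop."""
    if len(xs) <= block:
        total = 0.0
        for x in xs:  # naive left-to-right summation for the leaves
            total += x
        return total
    mid = len(xs) // 2
    return pairwise_sum(xs[:mid], block) + pairwise_sum(xs[mid:], block)

data = [0.1] * 1_000_000  # 0.1 is not exactly representable in binary
naive = sum(data)              # plain running sum, error grows with n
pairwise = pairwise_sum(data)  # error grows roughly with log2(n)
precise = math.fsum(data)      # correctly rounded (Shewchuk's algorithm)

# pairwise lands much closer to the precise result than the plain loop
print(abs(naive - precise), abs(pairwise - precise))
```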
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 07:34:59 UTC, tastyminerals wrote:
On Wednesday, 15 July 2020 at 06:57:21 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals wrote:
On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
[...]
Good to know. So, it's fine to use it with sum!"fast" but better to avoid it for general purposes.
They are both more precise by default. This was a reply to your other post in the thread, sorry. Mir algorithms are more precise by default than the algorithms you have provided.
Right. Is this why standardDeviation is significantly slower?
Yes. It allows you to pick a summation option; you can try others besides the default in benchmarks.
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 06:57:21 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals wrote:
On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

    @fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)

@fastmath really shouldn't be used with summation algorithms except the `"fast"` version of them; otherwise, they may or may not behave like "fast".
Good to know. So, it's fine to use it with sum!"fast" but better to avoid it for general purposes.
They are both more precise by default. This was a reply to your other post in the thread, sorry. Mir algorithms are more precise by default than the algorithms you have provided.
Right. Is this why standardDeviation is significantly slower?
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals wrote:
On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

    @fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)

@fastmath really shouldn't be used with summation algorithms except the `"fast"` version of them; otherwise, they may or may not behave like "fast".
Good to know. So, it's fine to use it with sum!"fast" but better to avoid it for general purposes.
They are both more precise by default.
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals wrote:
On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

    @fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)

@fastmath really shouldn't be used with summation algorithms except the `"fast"` version of them; otherwise, they may or may not behave like "fast".
Good to know. So, it's fine to use it with sum!"fast" but better to avoid it for general purposes.
They are both more precise by default. This was a reply to your other post in the thread, sorry. Mir algorithms are more precise by default than the algorithms you have provided.
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

    @fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)

@fastmath really shouldn't be used with summation algorithms except the `"fast"` version of them; otherwise, they may or may not behave like "fast".
Good to know. So, it's fine to use it with sum!"fast" but better to avoid it for general purposes.
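As an illustration of the kind of technique a more-precise-by-default sum can use, here is a sketch of Kahan compensated summation in Python (illustrative only; this is not Mir's code, and Mir's default summation in this thread resolves to pairwise rather than Kahan). The idea is to carry forward the low-order bits that a plain running sum discards at each step:

```python
def kahan_sum(xs):
    """Kahan compensated summation: error stays O(eps), independent of n."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for x in xs:
        y = x - comp            # apply the correction from the previous step
        t = total + y           # big + small: low bits of y may be lost here...
        comp = (t - total) - y  # ...algebraically recover exactly what was lost
        total = t
    return total

data = [0.1] * 1_000_000
# a plain running sum drifts noticeably here; Kahan stays within an ulp or two
print(sum(data), kahan_sum(data))
```

This is why the non-"fast" modes cost more per element: each addition becomes several dependent floating-point operations, and @fastmath-style reassociation would optimize the compensation term away entirely.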
Re: D Mir: standard deviation speed
On Tuesday, 14 July 2020 at 19:36:21 UTC, jmh530 wrote:
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:
[...]
It would be helpful to provide a link. You should only need one accumulator for mean and centered sum of squares. See the Python example under Welford's algorithm at https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance This may have broken optimization somehow.
variance and standardDeviation were recently added to mir.math.stat. They have the option to switch between Welford's algorithm and the others. What you call the naive algorithm is VarianceAlgo.twoPass, and the Welford algorithm can be toggled with VarianceAlgo.online, which is the default option. It would also be interesting if you re-did the analysis with the built-in mir functions.
There are some other small differences between your implementation and the one in mir, beyond the issue discussed above. You take the absolute value before the square root and force the use of sum!"fast". Another difference is that VarianceAlgo.online in mir uses a precise calculation of the mean rather than the fast update that Welford uses. This may have a modest impact on performance, but should provide more accurate results.
Ok, the wiki page looks more informative; I shall look into my Welford implementation. I've just used standardDeviation from Mir and it showed even worse results than both of the examples above. Here is a (WIP) project as of now. Line 160 in https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d

    std of [60, 60] matrix 0.0389492 (> 0.001727)
    std of [300, 300] matrix 1.03592 (> 0.043452)
    std of [600, 600] matrix 4.2875 (> 0.182177)
    std of [800, 800] matrix 7.9415 (> 0.345367)
Re: D Mir: standard deviation speed
On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

    @fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)

@fastmath really shouldn't be used with summation algorithms except the `"fast"` version of them; otherwise, they may or may not behave like "fast".
`mean` is a summation algorithm too.
Re: D Mir: standard deviation speed
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

    @fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)

@fastmath really shouldn't be used with summation algorithms except the `"fast"` version of them; otherwise, they may or may not behave like "fast".
Re: D Mir: standard deviation speed
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:
I am trying to implement a standard deviation calculation in Mir for benchmark purposes. I have two implementations. One is the straightforward

    std = sqrt(mean(abs(x - x.mean())**2))

and the other follows Welford's algorithm for computing variance (as described here: https://www.johndcook.com/blog/standard_deviation/). However, although the first implementation should be less efficient / slower, the benchmarking results show a startling difference in its favour. I'd like to understand if I am doing something wrong and would appreciate some explanation.

# Naive std

    import std.math : abs;
    import mir.ndslice;
    import mir.math.common : pow, sqrt, fastmath;
    import mir.math.sum : sum;
    import mir.math.stat : mean;

    @fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)
    {
        pragma(inline, false);
        if (flatMatrix.empty)
            return 0.0;
        double n = cast(double) flatMatrix.length;
        double mu = flatMatrix.mean;
        return (flatMatrix.map!(a => (a - mu).abs ^^ 2).sum!"fast" / n).sqrt;
    }

# std with Welford's variance

    @fastmath double sdWelford(T)(Slice!(T*, 1) flatMatrix)
    {
        pragma(inline, false);
        if (flatMatrix.empty)
            return 0.0;
        double m0 = 0.0;
        double m1 = 0.0;
        double s0 = 0.0;
        double s1 = 0.0;
        double n = 0.0;
        foreach (x; flatMatrix.field)
        {
            ++n;
            m1 = m0 + (x - m0) / n;
            s1 = s0 + (x - m0) * (x - m1);
            m0 = m1;
            s0 = s1;
        }
        // switch to n - 1 for sample variance
        return (s1 / n).sqrt;
    }

Benchmarking:

Naive std (1k loops):

    std of [60, 60] matrix 0.001727
    std of [300, 300] matrix 0.043452
    std of [600, 600] matrix 0.182177
    std of [800, 800] matrix 0.345367

std with Welford's variance (1k loops):

    std of [60, 60] matrix 0.0225476
    std of [300, 300] matrix 0.534528
    std of [600, 600] matrix 2.0714
    std of [800, 800] matrix 3.60142

It would be helpful to provide a link. You should only need one accumulator for mean and centered sum of squares.
See the Python example under Welford's algorithm at https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance This may have broken optimization somehow.
variance and standardDeviation were recently added to mir.math.stat. They have the option to switch between Welford's algorithm and the others. What you call the naive algorithm is VarianceAlgo.twoPass, and the Welford algorithm can be toggled with VarianceAlgo.online, which is the default option. It would also be interesting if you re-did the analysis with the built-in mir functions.
There are some other small differences between your implementation and the one in mir, beyond the issue discussed above. You take the absolute value before the square root and force the use of sum!"fast". Another difference is that VarianceAlgo.online in mir uses a precise calculation of the mean rather than the fast update that Welford uses. This may have a modest impact on performance, but should provide more accurate results.
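The single-accumulator form referred to above (count, running mean, and a centered sum of squares M2, as on the linked Wikipedia page) can be sketched in Python like this (illustrative only, not Mir's implementation):

```python
import math

def welford_std(xs):
    # One pass, one set of accumulators: n, the running mean, and M2
    # (the running sum of squared deviations from the current mean).
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # note: uses the *updated* mean
    if n == 0:
        return 0.0
    return math.sqrt(m2 / n)  # population std; divide by n - 1 for sample

print(welford_std([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))  # ~2.0
```

Compared with the four-accumulator version earlier in the thread, this carries one fewer loop-dependent value per iteration, though the loop still has a serial dependency on mean and m2, which is what makes Welford-style loops hard to vectorize.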
D Mir: standard deviation speed
I am trying to implement a standard deviation calculation in Mir for benchmark purposes. I have two implementations. One is the straightforward

    std = sqrt(mean(abs(x - x.mean())**2))

and the other follows Welford's algorithm for computing variance (as described here: https://www.johndcook.com/blog/standard_deviation/). However, although the first implementation should be less efficient / slower, the benchmarking results show a startling difference in its favour. I'd like to understand if I am doing something wrong and would appreciate some explanation.

# Naive std

    import std.math : abs;
    import mir.ndslice;
    import mir.math.common : pow, sqrt, fastmath;
    import mir.math.sum : sum;
    import mir.math.stat : mean;

    @fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)
    {
        pragma(inline, false);
        if (flatMatrix.empty)
            return 0.0;
        double n = cast(double) flatMatrix.length;
        double mu = flatMatrix.mean;
        return (flatMatrix.map!(a => (a - mu).abs ^^ 2).sum!"fast" / n).sqrt;
    }

# std with Welford's variance

    @fastmath double sdWelford(T)(Slice!(T*, 1) flatMatrix)
    {
        pragma(inline, false);
        if (flatMatrix.empty)
            return 0.0;
        double m0 = 0.0;
        double m1 = 0.0;
        double s0 = 0.0;
        double s1 = 0.0;
        double n = 0.0;
        foreach (x; flatMatrix.field)
        {
            ++n;
            m1 = m0 + (x - m0) / n;
            s1 = s0 + (x - m0) * (x - m1);
            m0 = m1;
            s0 = s1;
        }
        // switch to n - 1 for sample variance
        return (s1 / n).sqrt;
    }

Benchmarking:

Naive std (1k loops):

    std of [60, 60] matrix 0.001727
    std of [300, 300] matrix 0.043452
    std of [600, 600] matrix 0.182177
    std of [800, 800] matrix 0.345367

std with Welford's variance (1k loops):

    std of [60, 60] matrix 0.0225476
    std of [300, 300] matrix 0.534528
    std of [600, 600] matrix 2.0714
    std of [800, 800] matrix 3.60142
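As a sanity check that the two formulations above compute the same quantity, here is a Python mirror of both (illustrative only; it says nothing about the D versions' relative speed, which is the actual question):

```python
import math

def sd_two_pass(xs):
    # Mirrors sd0 above: sqrt(mean((x - mean)^2)).
    n = len(xs)
    if n == 0:
        return 0.0
    mu = sum(xs) / n
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / n)

def sd_welford(xs):
    # Mirrors sdWelford above, with the same four accumulators.
    m0 = m1 = s0 = s1 = 0.0
    n = 0.0
    for x in xs:
        n += 1
        m1 = m0 + (x - m0) / n
        s1 = s0 + (x - m0) * (x - m1)
        m0 = m1
        s0 = s1
    if n == 0:
        return 0.0
    return math.sqrt(s1 / n)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
# both agree on the population standard deviation (~2.0 for this data)
print(sd_two_pass(data), sd_welford(data))
```

So the difference in the benchmarks is not a correctness issue: the two-pass loop is a simple reduction the compiler can vectorize, while Welford's update chain carries a serial dependency from one iteration to the next.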