Re: D Mir: standard deviation speed

2020-07-15 Thread tastyminerals via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 07:51:31 UTC, 9il wrote:

On Wednesday, 15 July 2020 at 07:34:59 UTC, tastyminerals wrote:

On Wednesday, 15 July 2020 at 06:57:21 UTC, 9il wrote:

On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals 
wrote:

On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:

[...]


Good to know. So, it's fine to use it with sum!"fast" but 
better avoid it for general purposes.


They both are more precise by default.


This was a reply to your other post in the thread, sorry. 
Mir algorithms are more precise by default than the 
algorithms you have provided.


Right. Is this why standardDeviation is significantly slower?


Yes. It allows you to pick a summation option; you can try 
options other than the default in your benchmarks.


Indeed, I played around with VarianceAlgo and Summation options, 
and they impact the end result a lot.


ans = matrix.flattened.standardDeviation!(VarianceAlgo.naive, 
Summation.appropriate);

std of [300, 300] matrix 0.375903
std of [60, 60] matrix 0.0156448
std of [600, 600] matrix 1.54429
std of [800, 800] matrix 3.03954

ans = matrix.flattened.standardDeviation!(VarianceAlgo.online, 
Summation.appropriate);

std of [300, 300] matrix 1.12404
std of [60, 60] matrix 0.041968
std of [600, 600] matrix 5.01617
std of [800, 800] matrix 8.64363


Summation.fast behaves strangely though. I wonder what happened 
here?


ans = matrix.flattened.standardDeviation!(VarianceAlgo.naive, 
Summation.fast);

std of [300, 300] matrix 1e-06
std of [60, 60] matrix 9e-07
std of [600, 600] matrix 1.2e-06
std of [800, 800] matrix 9e-07

ans = matrix.flattened.standardDeviation!(VarianceAlgo.online, 
Summation.fast);

std of [300, 300] matrix 9e-07
std of [60, 60] matrix 9e-07
std of [600, 600] matrix 1.1e-06
std of [800, 800] matrix 1e-06


[OT] Re: D Mir: standard deviation speed

2020-07-15 Thread Schrom, Brian T via Digitalmars-d-learn
> I've mixed up @fastmath and @fmamath as well. No worries.
Seems like this might be a good awareness opportunity to change one
of the names to be more descriptive and more distinctive.

FWIW, I read halfway through the thread before I caught on to
the distinction.  I can imagine making that same mistake in the
future.


Re: D Mir: standard deviation speed

2020-07-15 Thread jmh530 via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 11:41:35 UTC, 9il wrote:

[snip]

Ah, no, my bad! You wrote @fmamath, but I read it as 
@fastmath. @fmamath is OK here.


I've mixed up @fastmath and @fmamath as well. No worries.


Re: D Mir: standard deviation speed

2020-07-15 Thread 9il via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 11:37:23 UTC, jmh530 wrote:

On Wednesday, 15 July 2020 at 11:26:19 UTC, 9il wrote:

[snip]


@fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)


@fastmath violates all summation algorithms except `"fast"`.
The same bug is in the original author's post.


I hadn't realized that @fmamath was the problem, rather than 
@fastmath overall. @fmamath is used on many mir.math.stat 
functions, though admittedly not in the accumulators.


Ah, no, my bad! You wrote @fmamath, but I read it as @fastmath. 
@fmamath is OK here.


Re: D Mir: standard deviation speed

2020-07-15 Thread jmh530 via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 11:26:19 UTC, 9il wrote:

[snip]


@fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)


@fastmath violates all summation algorithms except `"fast"`.
The same bug is in the original author's post.


I hadn't realized that @fmamath was the problem, rather than 
@fastmath overall. @fmamath is used on many mir.math.stat 
functions, though admittedly not in the accumulators.


Re: D Mir: standard deviation speed

2020-07-15 Thread 9il via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 11:23:00 UTC, jmh530 wrote:

On Wednesday, 15 July 2020 at 05:57:56 UTC, tastyminerals wrote:

[snip]

Here is a (WIP) project as of now.
Line 160 in 
https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d


std of [60, 60] matrix 0.0389492 (> 0.001727)
std of [300, 300] matrix 1.03592 (> 0.043452)
std of [600, 600] matrix 4.2875 (> 0.182177)
std of [800, 800] matrix 7.9415 (> 0.345367)


I changed the dflags-ldc to "-mcpu=native -O" and compiled with 
`dub run --compiler=ldc2`. I got similar results to yours for 
both in the initial run.


I changed sd to

@fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)


@fastmath violates all summation algorithms except `"fast"`.
The same bug is in the original author's post.



Re: D Mir: standard deviation speed

2020-07-15 Thread jmh530 via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 05:57:56 UTC, tastyminerals wrote:

[snip]

Here is a (WIP) project as of now.
Line 160 in 
https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d


std of [60, 60] matrix 0.0389492 (> 0.001727)
std of [300, 300] matrix 1.03592 (> 0.043452)
std of [600, 600] matrix 4.2875 (> 0.182177)
std of [800, 800] matrix 7.9415 (> 0.345367)


I changed the dflags-ldc to "-mcpu=native -O" and compiled with 
`dub run --compiler=ldc2`. I got similar results to yours for 
both in the initial run.


I changed sd to

@fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;
    double n = cast(double) flatMatrix.length;
    double mu = flatMatrix.mean;
    return (flatMatrix.map!(a => (a - mu) ^^ 2).sum!"precise" / n).sqrt;
}

and got

std of [10, 10] matrix 0.0016321
std of [20, 20] matrix 0.0069788
std of [300, 300] matrix 2.42063
std of [60, 60] matrix 0.0828711
std of [600, 600] matrix 9.72251
std of [800, 800] matrix 18.1356

And the biggest change by far was the sum!"precise" instead of 
sum!"fast".


When I ran your benchStd function with
ans = matrix.flattened.standardDeviation!(double, "online", 
"fast");

I got
std of [10, 10] matrix 1e-07
std of [20, 20] matrix 0
std of [300, 300] matrix 0
std of [60, 60] matrix 1e-07
std of [600, 600] matrix 0
std of [800, 800] matrix 0

I got the same result with Summation.naive. That almost seems too 
low.


The default is Summation.appropriate, which resolves to 
Summation.pairwise in this case. It is faster than 
Summation.precise, but still slower than Summation.naive or 
Summation.fast. Your welfordSD should line up with Summation.naive.
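
For illustration, here is a rough sketch of how those summation 
options trade accuracy for speed (the values are arbitrary, picked 
so that plain left-to-right addition visibly loses precision):

import std.stdio : writeln;
import mir.math.sum : sum;
import mir.ndslice.slice : sliced;

void main()
{
    auto x = [1e16, 1.0, -1e16, 1.0].sliced;

    // Plain left-to-right accumulation: fastest, least accurate.
    auto naive = x.sum!"naive";

    // Recursive pairwise splitting: the middle ground that
    // Summation.appropriate typically resolves to for arrays of
    // doubles.
    auto pairwise = x.sum!"pairwise";

    // Compensated summation: most accurate, slowest.
    auto precise = x.sum!"precise";

    writeln(naive, " ", pairwise, " ", precise);
}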


When I change that to
ans = matrix.flattened.standardDeviation!(double, "online", 
"precise");

I get
Running .\mir_benchmarks_2.exe
std of [10, 10] matrix 0.0031737
std of [20, 20] matrix 0.0153603
std of [300, 300] matrix 4.15738
std of [60, 60] matrix 0.171211
std of [600, 600] matrix 17.7443
std of [800, 800] matrix 34.2592

I also tried changing your welfordSD function based on the stuff 
I mentioned above, but it did not make a large difference.


Re: D Mir: standard deviation speed

2020-07-15 Thread 9il via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 07:34:59 UTC, tastyminerals wrote:

On Wednesday, 15 July 2020 at 06:57:21 UTC, 9il wrote:

On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals 
wrote:

On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:

[...]


Good to know. So, it's fine to use it with sum!"fast" but 
better avoid it for general purposes.


They both are more precise by default.


This was a reply to your other post in the thread, sorry. 
Mir algorithms are more precise by default than the algorithms 
you have provided.


Right. Is this why standardDeviation is significantly slower?


Yes. It allows you to pick a summation option; you can try options 
other than the default in your benchmarks.
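
For example, something like this (exact template parameters may 
differ between mir versions):

ans = matrix.flattened.standardDeviation!(VarianceAlgo.online, 
Summation.pairwise);

or any other combination of VarianceAlgo and Summation values.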


Re: D Mir: standard deviation speed

2020-07-15 Thread tastyminerals via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 06:57:21 UTC, 9il wrote:

On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:
On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals 
wrote:

On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals 
wrote:

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)


@fastmath shouldn't really be used with summation algorithms 
except the `"fast"` version of them. Otherwise, they may or 
may not behave like `"fast"`.


Good to know. So, it's fine to use it with sum!"fast" but 
better avoid it for general purposes.


They both are more precise by default.


This was a reply to your other post in the thread, sorry. 
Mir algorithms are more precise by default than the algorithms 
you have provided.


Right. Is this why standardDeviation is significantly slower?




Re: D Mir: standard deviation speed

2020-07-15 Thread 9il via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals wrote:

On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:

On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)


@fastmath shouldn't really be used with summation algorithms 
except the `"fast"` version of them. Otherwise, they may or 
may not behave like `"fast"`.


Good to know. So, it's fine to use it with sum!"fast" but 
better avoid it for general purposes.


They both are more precise by default.


Re: D Mir: standard deviation speed

2020-07-15 Thread 9il via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:

On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals wrote:

On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:

On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)


@fastmath shouldn't really be used with summation algorithms 
except the `"fast"` version of them. Otherwise, they may or 
may not behave like `"fast"`.


Good to know. So, it's fine to use it with sum!"fast" but 
better avoid it for general purposes.


They both are more precise by default.


This was a reply to your other post in the thread, sorry. Mir 
algorithms are more precise by default than the algorithms you 
have provided.


Re: D Mir: standard deviation speed

2020-07-15 Thread tastyminerals via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:

On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)


@fastmath shouldn't really be used with summation algorithms 
except the `"fast"` version of them. Otherwise, they may or may 
not behave like `"fast"`.


Good to know. So, it's fine to use it with sum!"fast" but better 
avoid it for general purposes.




Re: D Mir: standard deviation speed

2020-07-15 Thread tastyminerals via Digitalmars-d-learn

On Tuesday, 14 July 2020 at 19:36:21 UTC, jmh530 wrote:

On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

  [...]


It would be helpful to provide a link.

You should only need one accumulator for the mean and the 
centered sum of squares. See the Python example in the Welford 
section of

https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
The extra accumulators may have broken optimization somehow.

variance and standardDeviation were recently added to 
mir.math.stat. They have the option to switch between Welford's 
algorithm and the others. What you call the naive algorithm 
is VarianceAlgo.twoPass, and the Welford algorithm can be 
toggled with VarianceAlgo.online, which is the default option. 
It would also be interesting if you re-did the analysis with 
the built-in mir functions.


There are some other small differences between your 
implementation and the one in mir, beyond the issue discussed 
above. You take the absolute value before the square root and 
force the use of sum!"fast". Another difference is that 
VarianceAlgo.online in mir uses a precise calculation of 
the mean rather than the fast update that Welford uses. This 
may have a modest impact on performance, but should provide 
more accurate results.


OK, the wiki page looks more informative; I shall look into my 
Welford implementation.


I've just used standardDeviation from Mir and it showed even 
worse results than both of the examples above.


Here is a (WIP) project as of now.
Line 160 in 
https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d


std of [60, 60] matrix 0.0389492 (> 0.001727)
std of [300, 300] matrix 1.03592 (> 0.043452)
std of [600, 600] matrix 4.2875 (> 0.182177)
std of [800, 800] matrix 7.9415 (> 0.345367)



Re: D Mir: standard deviation speed

2020-07-14 Thread 9il via Digitalmars-d-learn

On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:

On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)


@fastmath shouldn't really be used with summation algorithms 
except the `"fast"` version of them. Otherwise, they may or may 
not behave like `"fast"`.


`mean` is a summation algorithm too


Re: D Mir: standard deviation speed

2020-07-14 Thread 9il via Digitalmars-d-learn

On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)


@fastmath shouldn't really be used with summation algorithms 
except the `"fast"` version of them. Otherwise, they may or may 
not behave like `"fast"`.
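
As I understand it, @fmamath only permits fused multiply-add 
contraction, while @fastmath lets the optimizer reassociate 
floating-point operations and thereby defeat the compensated and 
pairwise summations. A small sketch of the pattern, using a 
made-up rms helper:

import mir.math.common : fastmath, fmamath, sqrt;
import mir.math.sum : sum;
import mir.ndslice;

// @fmamath keeps the guarantees of sum!"precise" intact.
@fmamath double rmsSafe(Slice!(double*, 1) x)
{
    return (x.map!(a => a * a).sum!"precise" / x.length).sqrt;
}

// Under @fastmath only sum!"fast" is a sensible choice.
@fastmath double rmsFast(Slice!(double*, 1) x)
{
    return (x.map!(a => a * a).sum!"fast" / x.length).sqrt;
}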


Re: D Mir: standard deviation speed

2020-07-14 Thread jmh530 via Digitalmars-d-learn

On Tuesday, 14 July 2020 at 19:04:45 UTC, tastyminerals wrote:
I am trying to implement standard deviation calculation in Mir 
for benchmark purposes.
I have two implementations. One is the straightforward std = 
sqrt(mean(abs(x - x.mean())**2)) and the other follows 
Welford's algorithm for computing variance (as described here: 
https://www.johndcook.com/blog/standard_deviation/).


However, although the first implementation should be less 
efficient / slower, the benchmarking results show a startling 
difference in its favour. I'd like to understand if I am doing 
something wrong and would appreciate some explanation.


# Naive std
import std.math : abs;
import mir.ndslice;
import mir.math.common : pow, sqrt, fastmath;
import mir.math.sum : sum;
import mir.math.stat : mean;

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;
    double n = cast(double) flatMatrix.length;
    double mu = flatMatrix.mean;
    return (flatMatrix.map!(a => (a - mu).abs ^^ 2).sum!"fast" / n).sqrt;
}


# std with Welford's variance
@fastmath double sdWelford(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;

    double m0 = 0.0;
    double m1 = 0.0;
    double s0 = 0.0;
    double s1 = 0.0;
    double n = 0.0;
    foreach (x; flatMatrix.field)
    {
        ++n;
        m1 = m0 + (x - m0) / n;
        s1 = s0 + (x - m0) * (x - m1);
        m0 = m1;
        s0 = s1;
    }
    // switch to n - 1 for sample variance
    return (s1 / n).sqrt;
}

Benchmarking:

Naive std (1k loops):
  std of [60, 60] matrix 0.001727
  std of [300, 300] matrix 0.043452
  std of [600, 600] matrix 0.182177
  std of [800, 800] matrix 0.345367

std with Welford's variance (1k loops):
  std of [60, 60] matrix 0.0225476
  std of [300, 300] matrix 0.534528
  std of [600, 600] matrix 2.0714
  std of [800, 800] matrix 3.60142


It would be helpful to provide a link.

You should only need one accumulator for the mean and the centered 
sum of squares. See the Python example in the Welford section of

https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
The extra accumulators may have broken optimization somehow.
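
For reference, a direct translation of that Python example to D 
would look roughly like this (a sketch; the name is made up):

import mir.math.common : fmamath, sqrt;
import mir.ndslice.slice : Slice;

@fmamath double sdWelfordSingle(T)(Slice!(T*, 1) flatMatrix)
{
    // one running mean plus one centered sum of squares (M2)
    double mean = 0.0;
    double m2 = 0.0;
    double n = 0.0;
    foreach (x; flatMatrix.field)
    {
        ++n;
        immutable delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }
    // population std; switch to n - 1 for the sample version
    return n > 0 ? sqrt(m2 / n) : 0.0;
}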

variance and standardDeviation were recently added to 
mir.math.stat. They have the option to switch between Welford's 
algorithm and the others. What you call the naive algorithm 
is VarianceAlgo.twoPass, and the Welford algorithm can be toggled 
with VarianceAlgo.online, which is the default option. It would 
also be interesting if you re-did the analysis with the built-in 
mir functions.
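
For instance, something along these lines (assuming the matrix 
variable from your benchmark; exact template parameters may 
differ by mir version):

import mir.math.stat : standardDeviation, VarianceAlgo;
import mir.math.sum : Summation;

// two-pass ("naive" formula) vs. Welford-style online updates
auto sdTwoPass = matrix.flattened
    .standardDeviation!(VarianceAlgo.twoPass, Summation.appropriate);
auto sdOnline = matrix.flattened
    .standardDeviation!(VarianceAlgo.online, Summation.appropriate);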


There are some other small differences between your 
implementation and the one in mir, beyond the issue discussed 
above. You take the absolute value before the square root and 
force the use of sum!"fast". Another difference is that 
VarianceAlgo.online in mir uses a precise calculation of the 
mean rather than the fast update that Welford uses. This may have 
a modest impact on performance, but should provide more accurate 
results.


D Mir: standard deviation speed

2020-07-14 Thread tastyminerals via Digitalmars-d-learn
I am trying to implement standard deviation calculation in Mir 
for benchmark purposes.
I have two implementations. One is the straightforward std = 
sqrt(mean(abs(x - x.mean())**2)) and the other follows Welford's 
algorithm for computing variance (as described here: 
https://www.johndcook.com/blog/standard_deviation/).


However, although the first implementation should be less 
efficient / slower, the benchmarking results show a startling 
difference in its favour. I'd like to understand if I am doing 
something wrong and would appreciate some explanation.


# Naive std
import std.math : abs;
import mir.ndslice;
import mir.math.common : pow, sqrt, fastmath;
import mir.math.sum : sum;
import mir.math.stat : mean;

@fastmath private double sd0(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;
    double n = cast(double) flatMatrix.length;
    double mu = flatMatrix.mean;
    return (flatMatrix.map!(a => (a - mu).abs ^^ 2).sum!"fast" / n).sqrt;
}


# std with Welford's variance
@fastmath double sdWelford(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;

    double m0 = 0.0;
    double m1 = 0.0;
    double s0 = 0.0;
    double s1 = 0.0;
    double n = 0.0;
    foreach (x; flatMatrix.field)
    {
        ++n;
        m1 = m0 + (x - m0) / n;
        s1 = s0 + (x - m0) * (x - m1);
        m0 = m1;
        s0 = s1;
    }
    // switch to n - 1 for sample variance
    return (s1 / n).sqrt;
}

Benchmarking:

Naive std (1k loops):
  std of [60, 60] matrix 0.001727
  std of [300, 300] matrix 0.043452
  std of [600, 600] matrix 0.182177
  std of [800, 800] matrix 0.345367

std with Welford's variance (1k loops):
  std of [60, 60] matrix 0.0225476
  std of [300, 300] matrix 0.534528
  std of [600, 600] matrix 2.0714
  std of [800, 800] matrix 3.60142