Johann Hibschman-4 wrote:
>
> I just came across Welford's method for calculating standard deviations,
> and I realized I didn't know how to implement it in J.
>
> See
>
> - http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
> -
> http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/
> - http://www.johndcook.com/standard_deviation.html
>
> for the various articles that got me wondering about this.
>
> Here's the algorithm, as stated in the above links.
>
> 1. Initialize m_1 = x_1 and s_1 = 0.
> 2. For subsequent xs,
> m_k = m_{k-1} + (x_k - m_{k-1})/k
> s_k = s_{k-1} + (x_k - m_{k-1})*(x_k - m_k)
> 3. Then the kth running standard deviation is:
> stddev_k = s_k / (k - 1)
>
> I've not tried to switch it from 1-based indexing to 0-based indexing,
> or to use J-like phrasing in the problem, for fear I'd make a stupid
> mistake.
>
> How would I do this in J?
>
> Cheers,
> Johann
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
>
Very interesting. Here is one solution based on the reduce:
We take three values in the accumulator, k, M and S,
and the iteration is (using index starting with 1 notation):
f =. dyad : 0
'k M S'=.y
Mk=.M+k%~x-M
(k+1);Mk;S+(x-M)*x-Mk
)
The work is then done according to the scheme:
array f reduce acc0,
where acc0 is initial value of k;M;S, giving the following:
Wstd =: }: f~ RL~ 2;0;~{:
RL =: 1 : 0
:
for_e.y do.x=.x u e end.
x
)
Finally, we get the std. deviation from the result of Wstd as:
Wstddev =: [: %: [: ({: % 2-~{.) >@:Wstd
The code could be shortened a bit by a better arrangement
of what comes from the left/right side (e.g using f RR instead of f~RL~),
but I don't know how to make it significantly shorter than this.
Here RL is not crucial--Wstd should be doable with Insert using
a slight modification of the above scheme, although I don't think
the code will be much shorter (assuming that we don't count
the definition of RL, of course.)
Examples of the stability of Welford's algorithm against
cancellation errors:
a=:1e6+1e6?@$0
stddev a
0.288662
Wstddev a
0.288662
a=:1e12+1e6?@$0
stddev a
0.573414
Wstddev a
0.288817
18j16":%%:12 NB. std. dev. of the uniform distribution.
0.2886751345948129
It's also interesting to take longer arrays and keep
more bits of array elements intact:
a=:1e9+2e7?@$0
stddev a
0.398581 NB. all the bits of the result gone...
Wstddev a
0.288684
--
View this message in context:
http://old.nabble.com/Welford%27s-method-for-standard-deviations-tp32458412s24193p32460913.html
Sent from the J Programming mailing list archive at Nabble.com.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm