Johann Hibschman-4 wrote:
> 
> I just came across Welford's method for calculating standard deviations,
> and I realized I didn't know how to implement it in J.
> 
> See
> 
>   - http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
>   -
> http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/
>   - http://www.johndcook.com/standard_deviation.html
> 
> for the various articles that got me wondering about this.
> 
> Here's the algorithm, as stated in the above links.
> 
>   1. Initialize m_1 = x_1 and s_1 = 0.
>   2. For subsequent xs,
>      m_k = m_{k-1} + (x_k - m_{k-1})/k
>      s_k = s_{k-1} + (x_k - m_{k-1})*(x_k - m_k)
>   3. Then the kth running standard deviation is:
>      stddev_k = s_k / (k - 1)
> 
> I've not tried to switch it from 1-based indexing to 0-based indexing,
> or to use J-like phrasing in the problem, for fear I'd make a stupid
> mistake.
> 
> How would I do this in J?
> 
> Cheers,
> Johann
> 
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
> 
> 

Very interesting. Here is one solution based on the reduce:

We take three values in the accumulator, k, M and S,
and the iteration is (using index starting with 1 notation):

   f =. dyad : 0
   'k M S'=.y
   Mk=.M+k%~x-M
   (k+1);Mk;S+(x-M)*x-Mk
)



The work is then done according to the scheme:

   array f reduce acc0,

where acc0 is initial value of k;M;S, giving the following:

    Wstd =: }: f~ RL~ 2;0;~{: 

    RL =: 1 : 0
:
  for_e.y do.x=.x u e end.
  x
)

Finally, we get the std. deviation from the result of Wstd as:

   Wstddev =: [: %: [: ({: % 2-~{.) >@:Wstd

   The code could be shortened a bit by a better arrangement 
of what comes from the left/right side (e.g using f RR instead of f~RL~), 
but I don't know how to make it significantly shorter than this.

    Here RL is not crucial--Wstd should be doable with Insert using 
a slight modification of the above scheme, although I don't think 
the code will be much shorter (assuming that we don't count 
the definition of RL, of course.)

   Examples of the stability of Welford's algorithm against 
cancellation errors:

   a=:1e6+1e6?@$0
   stddev a
0.288662
   Wstddev a
0.288662

   a=:1e12+1e6?@$0
   stddev a
0.573414
   Wstddev a
0.288817

   18j16":%%:12       NB. std. dev. of the uniform distribution.
0.2886751345948129

It's also interesting to take longer arrays and keep
more bits of array elements intact:

   a=:1e9+2e7?@$0
   stddev a
0.398581    NB. all the bits of the result gone...
   Wstddev a
0.288684

-- 
View this message in context: 
http://old.nabble.com/Welford%27s-method-for-standard-deviations-tp32458412s24193p32460913.html
Sent from the J Programming mailing list archive at Nabble.com.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to