Re: [Jprogramming] Welford's method for standard deviations

Viktor Cerovski Thu, 15 Sep 2011 09:35:11 -0700


Mike Day-3 wrote:
> 
> Here's a somewhat J-ish way to do the job.  It's ok for small data
> sets but falls down on a few million cases.
> 
> NB. given x,  append a reversed column of k, then append a row: 0 0
> NB. If you like, this reverts to origin zero, setting M0 = 0
> NB. x gets reversed on the way,  but I don't think that usually matters
> NB. If it does matter,  just do Mk on |.x
> 
> makexM =: 0 ,~ ] ,.~ [: >: [: i. [: - #
> 
> NB. carry out Mk-1 + (xk - Mk-1)/k on (xk, xk-1,. Mk, Mk-1)
> NB. and prepend xk to produce xk, Mk
> 
>     mk =: ({:@] , {:@[ - -&{: % {.@])
> 
> NB. Put these verbs together and clean up the resulting 2 column
> NB. array of (reversed) x and Mk on reversed x:
> 
>     Mk =:  }.@: |.@: (mk ~/\.) @: makexM
> 
> NB. diffs xk - Mk
> 
>     difxk_Mk =: -/"1
> 
> NB. diffs xk - Mk-1
> 
>     difxk_Mkm1 =: }:@:(-~//.)
> 
> NB. combine these for Sk:
> 
>     Sk =: +/@: (difxk_Mk * difxk_Mkm1) @: Mk
> 
> NB. eg given x5
>     x5
> 
> 0.971379 0.0466292 0.330109 0.01346 0.220004
> 
>     Mk x5
> 
>   0.220004 0.220004
>    0.01346 0.116732
>   0.330109 0.187858
> 0.0466292 0.152551
>   0.971379 0.316316
> 
>     Sk x5
> 
> 0.603026
> 
>     ssdev x5  NB. cf stats.ijs verb
> 
> 0.603026
> 
Looks interesting to me, I'll study it more when I get the chance.




> It's a bit quicker than Victor's Wstd.
> 
> However, Sk ran out of space on his
>     a=:1e9+2e7?@$0;
> It does run with
>     a=:1e9+5e6?@$0,
> but is then much slower than ssdev (about 60
> times!).  On my machine, Wstd is another 4
> times slower than that.   All 3 verbs gave
> the same sum of squares for this value of a .
> 
Wstd can be sped up by simply rewriting it
as a for. loop, getting a standard imperative-style
program.  The speedup should be considerable since
there will be no boxing/unboxing being done for
every element of x, and because of the simpler iteration.


> For practical purposes,  I should think ssdev
> and stddev  will do for now.  Instability in
> standard deviation might be associated with
> problems with the data,  and I would begin to
> worry about whether I should transform the
> data and alter my conceptual model for the
> system producing the results.  I don't suppose
> many statisticians would be too concerned by
> variations in estimates of a deviance of
> 1 part in a billion.
> 
> Mike
> 
> On 13/09/2011 8:23 PM, Johann Hibschman wrote:
>> I just came across Welford's method for calculating standard deviations,
>> and I realized I didn't know how to implement it in J.
>>
>> See
>>
>>    - http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
>>    -
>> http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/
>>    - http://www.johndcook.com/standard_deviation.html
>>
>> for the various articles that got me wondering about this.
>>
>> Here's the algorithm, as stated in the above links.
>>
>>    1. Initialize m_1 = x_1 and s_1 = 0.
>>    2. For subsequent xs,
>>       m_k = m_{k-1} + (x_k - m_{k-1})/k
>>       s_k = s_{k-1} + (x_k - m_{k-1})*(x_k - m_k)
>>    3. Then the kth running standard deviation is:
>>       stddev_k = s_k / (k - 1)
>>
>> I've not tried to switch it from 1-based indexing to 0-based indexing,
>> or to use J-like phrasing in the problem, for fear I'd make a stupid
>> mistake.
>>
>> How would I do this in J?
>>
>> Cheers,
>> Johann
>>
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
> 
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Welford%27s-method-for-standard-deviations-tp32458412s24193p32473075.html
Sent from the J Programming mailing list archive at Nabble.com.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] Welford's method for standard deviations

Reply via email to