Mike Day-3 wrote:
>
> Here's a somewhat J-ish way to do the job. It's ok for small data
> sets but falls down on a few million cases.
>
> NB. given x, append a reversed column of k, then append a row: 0 0
> NB. If you like, this reverts to origin zero, setting M0 = 0
> NB. x gets reversed on the way, but I don't think that usually matters
> NB. If it does matter, just do Mk on |.x
>
> makexM =: 0 ,~ ] ,.~ [: >: [: i. [: - #
>
> NB. carry out Mk-1 + (xk - Mk-1)/k on (xk, xk-1,. Mk, Mk-1)
> NB. and prepend xk to produce xk, Mk
>
> mk =: ({:@] , {:@[ - -&{: % {.@])
>
> NB. Put these verbs together and clean up the resulting 2 column
> NB. array of (reversed) x and Mk on reversed x:
>
> Mk =: }.@: |.@: (mk ~/\.) @: makexM
>
> NB. diffs xk - Mk
>
> difxk_Mk =: -/"1
>
> NB. diffs xk - Mk-1
>
> difxk_Mkm1 =: }:@:(-~//.)
>
> NB. combine these for Sk:
>
> Sk =: +/@: (difxk_Mk * difxk_Mkm1) @: Mk
>
> NB. eg given x5
> x5
>
> 0.971379 0.0466292 0.330109 0.01346 0.220004
>
> Mk x5
>
> 0.220004 0.220004
> 0.01346 0.116732
> 0.330109 0.187858
> 0.0466292 0.152551
> 0.971379 0.316316
>
> Sk x5
>
> 0.603026
>
> ssdev x5 NB. cf stats.ijs verb
>
> 0.603026
>
Looks interesting to me, I'll study it more when I get the chance.
> It's a bit quicker than Victor's Wstd.
>
> However, Sk ran out of space on his
> a=:1e9+2e7?@$0;
> It does run with
> a=:1e9+5e6?@$0,
> but is then much slower than ssdev (about 60
> times!). On my machine, Wstd is another 4
> times slower than that. All 3 verbs gave
> the same sum of squares for this value of a .
>
Wstd can be sped up by simply rewriting it
as a for. loop, getting a standard imperative-style
program. The speedup should be considerable since
there will be no boxing/unboxing being done for
every element of x, and because of the simpler iteration.
> For practical purposes, I should think ssdev
> and stddev will do for now. Instability in
> standard deviation might be associated with
> problems with the data, and I would begin to
> worry about whether I should transform the
> data and alter my conceptual model for the
> system producing the results. I don't suppose
> many statisticians would be too concerned by
> variations in estimates of a deviance of
> 1 part in a billion.
>
> Mike
>
> On 13/09/2011 8:23 PM, Johann Hibschman wrote:
>> I just came across Welford's method for calculating standard deviations,
>> and I realized I didn't know how to implement it in J.
>>
>> See
>>
>> - http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
>> -
>> http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/
>> - http://www.johndcook.com/standard_deviation.html
>>
>> for the various articles that got me wondering about this.
>>
>> Here's the algorithm, as stated in the above links.
>>
>> 1. Initialize m_1 = x_1 and s_1 = 0.
>> 2. For subsequent xs,
>> m_k = m_{k-1} + (x_k - m_{k-1})/k
>> s_k = s_{k-1} + (x_k - m_{k-1})*(x_k - m_k)
>> 3. Then the kth running standard deviation is:
>> stddev_k = s_k / (k - 1)
>>
>> I've not tried to switch it from 1-based indexing to 0-based indexing,
>> or to use J-like phrasing in the problem, for fear I'd make a stupid
>> mistake.
>>
>> How would I do this in J?
>>
>> Cheers,
>> Johann
>>
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
>
--
View this message in context:
http://old.nabble.com/Welford%27s-method-for-standard-deviations-tp32458412s24193p32473075.html
Sent from the J Programming mailing list archive at Nabble.com.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm