There's obviously some sort of effect with floating-point rounding going on.
Try this one out:
r=:1e12+0.5-~1e6?@$0
plot 0 128 128; '(1e12-~mean@:(r&+))"0'
r is a random distribution around 1e12, and here we plot the difference
between the mean and 1e12 against the amount we added to r. It
On Fri, Sep 16, 2011 at 3:01 PM, Johann Hibschman
wrote:
> However, I do find it interesting that Welford's algorithm can do this
> on a single pass through the data.
It's a single pass, but with three distinct values being produced as
intermediate results.
That's the moral equivalent of three p
Raul Miller writes:
> And, of course, Welford's approach does achieve that. But Welford's
> algorithm assumes a scalar computing architecture. Meanwhile, it
> seems to me that subtracting the computed mean achieves approximately
> the same thing that Welford's approach achieves, but modularized
On Fri, Sep 16, 2011 at 10:23 AM, Viktor Cerovski
wrote:
> stddev 1e12 + x - 0.5
> 0.288938
>
> Same x translated by tiny (compared to 1e12) 0.5,
> still large mean and the problem is gone.
>
> So, what you suggest is the correct remedy, but for the wrong
> reasons.
I think you are arguing that
Eric,
It would be very nice if you would add the `stats' addon to the online
jhs so that folks could compare examples more directly.
Please.
--
(B=)
Raul Miller-4 wrote:
>
> On Thu, Sep 15, 2011 at 12:18 PM, Viktor Cerovski
> wrote:
>> You(r subtracted) mean (is) this:
>>
>> stddev _1e12+1e12+1e6?@$0
>> 0.288461
>
> If I am allowed to know the mean of 1e12+1e6?@$0 -- if I am allowed to
> treat the data as non-random -- then, yes.
>
> Bu
Apologies for misspelling Viktor's name.
Yes - I've had a look at the more scalar approach.
FWIW, the best explicit form I've come up with is:
kMS =: {:@:(kMS2/)@:,&0 NB. appending zero sets k0=M0=S0=0
kMS2 =: 3 : 0
:
k =. >: km1 =. <.{.y NB. k, k-1
dx =. x - 1{y,0
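The J definition above is cut off in this listing, so the following is not a completion of it. It is only a rough Python sketch of the same idea: fold an update step over the data while carrying a state of (k, M, S), starting from k0 = M0 = S0 = 0. The names `step` and `kms` are made up for this illustration. Note that Python's `reduce` folds left to right, whereas J's `/` folds right to left.

```python
from functools import reduce
import math

def step(state, x):
    """One Welford update: state is (k, mean, s), x is the next value."""
    k, mean, s = state
    k += 1
    dx = x - mean              # deviation from the previous mean
    mean += dx / k             # updated running mean
    s += dx * (x - mean)       # updated sum of squared deviations
    return (k, mean, s)

def kms(values):
    """Fold step over values, starting from k0 = M0 = S0 = 0."""
    return reduce(step, values, (0, 0.0, 0.0))

k, mean, s = kms([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
stddev = math.sqrt(s / (k - 1))   # sample standard deviation
```

The three values carried through the fold are the "three distinct values" mentioned elsewhere in the thread.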
On Thu, Sep 15, 2011 at 12:18 PM, Viktor Cerovski
wrote:
> You(r subtracted) mean (is) this:
>
> stddev _1e12+1e12+1e6?@$0
> 0.288461
If I am allowed to know the mean of 1e12+1e6?@$0 -- if I am allowed to
treat the data as non-random -- then, yes.
But that was not really my point. The point,
Mike Day-3 wrote:
>
> Here's a somewhat J-ish way to do the job. It's ok for small data
> sets but falls down on a few million cases.
>
> NB. given x, append a reversed column of k, then append a row: 0 0
> NB. If you like, this reverts to origin zero, setting M0 = 0
> NB. x gets reversed on
Raul Miller-4 wrote:
>
> On Wed, Sep 14, 2011 at 10:48 AM, Johann Hibschman
> wrote:
>> Raul Miller writes:
>>> J's standard deviation routine seems to already be dealing with the
>>> numerical stability issue.
>>> [...]
>>> Or am I overlooking something?
>>
>> J's standard deviation routine i
On Wed, Sep 14, 2011 at 10:48 AM, Johann Hibschman
wrote:
> Raul Miller writes:
>> J's standard deviation routine seems to already be dealing with the
>> numerical stability issue.
>> [...]
>> Or am I overlooking something?
>
> J's standard deviation routine is perfectly accurate for any sane dat
Raul Miller writes:
> J's standard deviation routine seems to already be dealing with the
> numerical stability issue.
> [...]
> Or am I overlooking something?
J's standard deviation routine is perfectly accurate for any sane data.
John Cook gave an (extreme) example where Welford's method gives
On Tue, Sep 13, 2011 at 5:47 PM, Johann Hibschman
wrote:
> That's a different algorithm, one with poor numeric properties.
After reading this thread, I would like to note:
require'stats'
stddev 1e6?@$0
0.288712
stddev 12e6+1e6?@$0
0.288499
J's standard deviation routine seems to alre
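To make the contrast in this exchange concrete, here is a small Python sketch (not J, and not the `stats` addon's code) that compares the textbook sum-of-squares formula with the subtract-the-mean form on data with a large offset. The function names and the four-point data set are assumptions chosen for illustration; the deviations are picked so that the correct sample variance is exactly 30.

```python
def naive_variance(values):
    """Sum-of-squares formula: (sum(x^2) - sum(x)^2 / n) / (n - 1)."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return (total_sq - total * total / n) / (n - 1)

def two_pass_variance(values):
    """Compute the mean first, then sum the squared deviations from it."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

# Small deviations (4, 7, 13, 16) riding on a large offset.
data = [1e9 + v for v in (4.0, 7.0, 13.0, 16.0)]
naive = naive_variance(data)        # badly wrong: the squares exhaust double precision
two_pass = two_pass_variance(data)  # 30.0
```

With an offset of 1e9 the squares are near 1e18, where adjacent doubles are more than a hundred apart, so the naive difference cannot resolve a true value of 90.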
Here's a somewhat J-ish way to do the job. It's ok for small data
sets but falls down on a few million cases.
NB. given x, append a reversed column of k, then append a row: 0 0
NB. If you like, this reverts to origin zero, setting M0 = 0
NB. x gets reversed on the way, but I don't think that us
Interesting problem. Welford's method doesn't translate to J very well,
but there might be other ways of achieving the same result.
The basic problem is that +/ doesn't give the right answer for very long
arrays with large average values and some important small deviations.
As more and more el
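A deliberately extreme Python sketch of the summation effect described here: once the running total is large enough, small addends are rounded away one at a time. The data are made up for illustration, and `math.fsum` is used only as a reference for the correctly rounded total, not as something the thread proposes.

```python
import math

# One huge value followed by ten small ones.
values = [1e16] + [1.0] * 10

naive_total = 0.0
for v in values:
    naive_total += v      # 1e16 + 1.0 rounds back to 1e16 every time

exact_total = math.fsum(values)   # correctly rounded sum of all values
```

Here the naive loop loses all ten of the small contributions, while the reference sum keeps them.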
Johann Hibschman-4 wrote:
>
> I just came across Welford's method for calculating standard deviations,
> and I realized I didn't know how to implement it in J.
>
> See
>
> - http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
> -
> http://www.johndcook.com/blog/2008/09/26/comp
> >> From: programming-boun...@jsoftware.com [mailto:programming-
> >> boun...@jsoftware.com] On behalf of Johann Hibschman
> >> Sent: Tuesday, 13 September 2011 21:23
> >> To: programming@jsoftware.com
> >> Subject: [Jprogramming] Welford's met
>> Sent: Tuesday, 13 September 2011 21:23
>> To: programming@jsoftware.com
>> Subject: [Jprogramming] Welford's method for standard deviations
>>
>> I just came across Welford's method for calculating standard deviations,
>> and I realized I didn't know how
> Subject: [Jprogramming] Welford's method for standard deviations
>
> I just came across Welford's method for calculating standard deviations,
> and I realized I didn't know how to implement it in J.
>
> See
>
> - http://en.wikipedia.org/wiki/Algorithms_for_calc
I just came across Welford's method for calculating standard deviations,
and I realized I didn't know how to implement it in J.
See
- http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
-
http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-devi
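For readers who have not seen it, here is a minimal scalar sketch of Welford's recurrence, written in Python rather than J. The function name `welford_stddev` and the sample data are assumptions for illustration; the update rule is the standard one described in the references above.

```python
import math

def welford_stddev(values):
    """Sample standard deviation in one pass (Welford's recurrence)."""
    k = 0        # number of values seen so far
    mean = 0.0   # running mean M_k
    s = 0.0      # running sum of squared deviations S_k
    for x in values:
        k += 1
        delta = x - mean
        mean += delta / k
        s += delta * (x - mean)   # uses both the old and the new mean
    if k < 2:
        raise ValueError("need at least two values")
    return math.sqrt(s / (k - 1))

# Small deviations on a large offset; the sample variance is 30.
data = [1e9 + v for v in (4.0, 7.0, 13.0, 16.0)]
result = welford_stddev(data)
```

Each step depends on the previous mean, which is why the method reads naturally as a scalar loop and is awkward to express as a whole-array operation.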