There's obviously some sort of effect with floating-point rounding going on. Try this one out:
   r=:1e12+0.5-~1e6?@$0
   plot 0 128 128; '(1e12-~mean@:(r&+))"0'

r is a random distribution around 1e12, and here we plot the difference between the mean and 1e12 against the amount we added to r. It seems that the mean doesn't like to move much, but there are jumps near powers of two (and especially large jumps near large powers). After the mean is subtracted once, the numbers are well within the bounds of normal floating-point arithmetic and behave normally under stddev.

Marshall

On Fri, Sep 16, 2011 at 3:01 PM, Johann Hibschman <[email protected]> wrote:

> Raul Miller <[email protected]> writes:
>
> > And, of course, Welford's approach does achieve that. But Welford's
> > algorithm assumes a scalar computing architecture. Meanwhile, it
> > seems to me that subtracting the computed mean achieves approximately
> > the same thing that Welford's approach achieves, but modularized to
> > eliminate the "scalar architecture" aspect.
>
> What I found surprising from all this is that the iteration of the
> difference helps.
>
> e.g.,
>
>    r=:1e12+?1e6#0
>    %: (+/@:*: % <:@#) (- mean) r
> 0.573789
>    %: (+/@:*: % <:@#) (- mean)^:2 r
> 0.288728
>
> I certainly did not expect that to work, since I expected the second
> iteration of (- mean) to be a no-op, but it looks like that's where the
> "numerical weirdness" lies.
>
> > That said, if there are generic cases where Welford's approach behaves
> > better than subtracting the computed mean from the data set, I would
> > be interested in hearing about them. (But I doubt they exist, except
> > in hand crafted special cases with no general utility, because
> > Welford's algorithm is doing essentially the same thing, with a
> > running average.)
>
> I'd also like to see it, but I've not been able to find an example.
> However, I do find it interesting that Welford's algorithm can do this
> on a single pass through the data.
>
> Just a few months ago, I bought myself the Dover edition of Hamming's
> numerical methods book.
> I think I'll use this as an excuse to read more of it.
>
> Regards,
> Johann
>
> P.S. On my particular sample, adding 0.5 makes the problem worse, so it
> appears to depend on the random numbers involved.
>
>    %: (+/@:*: % <:@#) (- mean) r+0.5
> 1.02374
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
> ----------------------------------------------------------------------
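The effect under discussion is not specific to J. The following NumPy sketch (variable names and seed are illustrative, not from the thread) contrasts the naive one-pass sum-of-squares variance with the two-pass "subtract the computed mean" form, on data offset to 1e12 like r above:

```python
import numpy as np

# 1e6 uniform samples sitting near 1e12, as in the thread.
rng = np.random.default_rng(0)
r = 1e12 + rng.random(10**6)

n = r.size
m = r.mean()

# Naive one-pass formula: var = (sum x^2 - n*m^2) / (n-1).
# Here x^2 ~ 1e24, so nearly all significant digits cancel in the
# subtraction and the result is dominated by rounding noise.
naive_var = (np.sum(r * r) - n * m * m) / (n - 1)

# Two-pass formula: subtract the computed mean first, then sum squares.
# The deviations are O(1), so the squares lose no precision.
twopass_std = np.sqrt(np.sum((r - m) ** 2) / (n - 1))

print(naive_var)    # garbage at this offset (can even be negative)
print(twopass_std)  # close to 1/sqrt(12), about 0.2887
```

Note that NumPy's mean uses pairwise summation, so a single subtraction already recovers full precision here; the surprising benefit of iterating (- mean) in the thread depends on a mean computed with larger summation error.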
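For reference, Welford's single-pass algorithm mentioned above can be sketched in plain Python (function name and data size are illustrative):

```python
import random

def welford_std(xs):
    """Single-pass sample standard deviation via Welford's algorithm.

    Maintains a running mean and m2, the sum of squared deviations
    from the current mean, so it stays stable even when the data sit
    far from zero.
    """
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in xs:
        n += 1
        delta = x - mean           # deviation from the old mean
        mean += delta / n          # update the running mean
        m2 += delta * (x - mean)   # uses old and new mean together
    return (m2 / (n - 1)) ** 0.5

# Same shape of data as r in the thread, at a smaller size:
random.seed(0)
data = [1e12 + random.random() for _ in range(10**5)]
result = welford_std(data)
print(result)  # close to 1/sqrt(12), about 0.2887
```

The m2 update is the heart of the method: multiplying the deviation from the old mean by the deviation from the new mean gives the exact incremental contribution to the sum of squared deviations, without ever forming a large x^2.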
