Numpy gives the same results as above, but interestingly R gives the mathematically correct result. It appears that they use a 2-pass approach for computing the mean: https://github.com/wch/r-source/blob/6338928e70891145221c781a19d2bb6b17868fb5/src/main/summary.c#L429-L437
-s On Monday, 7 September 2015 10:47:42 UTC+1, Simon Byrne wrote: > > The problem is that the computation of the mean is inaccurate: although > the relative error is small, when you perform the subtraction for computing > the standard deviation, as the numbers are almost the same, the relative > error is massively amplified (this is typically known as *catastrophic > cancellation*). > > One thing we could do here is compute the mean using compensated summation > and Dekker-style division. But I'm not sure it's really worth the effort. > > -simon > > On Monday, 7 September 2015 09:03:18 UTC+1, Tim Holy wrote: >> >> On Sunday, September 06, 2015 11:46:05 PM Diego Javier Zea wrote: >> > Then, what is the best way to test if a number is equal to 0.0? >> >> Not to do it. Much better is to see whether it's much smaller than >> something >> finite, e.g., sta < 100*eps(ave). ave provides the scale against which >> sta >> should be measured. >> >> --Tim >> >> >> > >> > 2015-09-06 22:47 GMT-03:00 Tim Holy <[email protected]>: >> > > 0 is a special case because there is no "scale" by which to measure >> > > "closeness": >> > > >> > > julia> eps(1.0) >> > > 2.220446049250313e-16 >> > > >> > > julia> eps(1e-10) >> > > 1.2924697071141057e-26 >> > > >> > > julia> eps(1e-100) >> > > 1.2689709186578246e-116 >> > > >> > > --Tim >> > > >> > > On Sunday, September 06, 2015 06:39:06 PM Diego Javier Zea wrote: >> > > > Thanks Tim! But I don't fully understand it, because I believed >> that* >> > > > isapprox()* should be *true*, but gives me *false* in this case. My >> > > > particular function with the problem use the *std()* output in the >> next >> > > > way. Is this correct? I'm using *isapprox* and >> > > > * !isapprox* >> > > > function calculatezscore{T,N}(value::AbstractArray{T,N}, average:: >> > > > AbstractArray{T,N}, sd::AbstractArray{T,N}) >> > > > >> > > > zscore = similar(value) >> > > > if size(value) == size(average) == size(sd) >> > > > >> > > > for i in eachindex(zscore) >> > > > >> > > > val = value[i] >> > > > ave = average[i] >> > > > sta = sd[i] >> > > > if val ≈ ave >> > > > >> > > > zscore[i] = 0.0 >> > > > >> > > > elseif sta ≉ 0.0 >> > > > >> > > > zscore[i] = (val - ave)/sta >> > > > >> > > > else >> > > > >> > > > zscore[i] = NaN >> > > > >> > > > end >> > > > >> > > > end >> > > > >> > > > else >> > > > >> > > > throw(ErrorException("The elements should have the same size")) >> > > > >> > > > end >> > > > zscore >> > > > >> > > > end >> > > > >> > > > Are there other ways to manage the reality of the float arithmetic? >> > > > >> > > > El domingo, 6 de septiembre de 2015, 22:27:23 (UTC-3), Tim Holy >> escribió: >> > > > > 1e-17 is indistinguishable from mathematical zero if it's based >> on >> > > > > differences >> > > > > among numbers that are O(1). >> > > > > >> > > > > julia> a = fill(total, 100); >> > > > > >> > > > > julia> sum(a)/100 == total >> > > > > false >> > > > > >> > > > > julia> sum(a)/100 - total >> > > > > -1.3877787807814457e-17 >> > > > > >> > > > > But this is off by only one ulp: >> > > > > >> > > > > julia> nextfloat(sum(a)/100) == total >> > > > > true >> > > > > >> > > > > So my advice is that you should write your algorithms in a manner >> that >> > > > > takes >> > > > > the realities of floating-point arithmetic into account. >> > > > > >> > > > > --Tim >> > > > > >> > > > > On Sunday, September 06, 2015 05:19:36 PM Diego Javier Zea wrote: >> > > > > > The returned value from std should be zero if all the values >> are >> > > >> > > equal >> > > >> > > > > > (since the distance between the mean and the values should be >> zero). >> > > > > > However, the returned value isn't zero: >> > > > > > >> > > > > > julia> total >> > > > > > 0.11008615848501174 >> > > > > > >> > > > > > julia> mean(Float64[total for i in 1:100]) >> > > > > > 0.11008615848501173 >> > > > > > >> > > > > > julia> std(Float64[total for i in 1:100]) >> > > > > > 1.3947701538996772e-17 >> > > > > > >> > > > > > julia> std(Float64[total for i in 1:100]) ≈ zero(Float64) >> > > > > > false >> > > > > > >> > > > > > julia> mean(Float64[total for i in 1:100]) ≈ total >> > > > > > true >> > > > > > >> > > > > > >> > > > > > How can I get a more precise result from *std()*? >> > > > > > >> > > > > > Best, >> >>
