Re: [FastBit-users] VARSAMP , STDSAMP

K. John Wu Fri, 05 Apr 2013 09:40:42 -0700

Hi, Alex,

Here is a guess as to what might be happening.  The minus operator might
be receiving two operands that are very close to each other.
Theoretically, we would expect zero to be returned, but due to
floating-point rounding, the result is a small negative number.  This
can happen if you have very small variances (i.e., small standard
deviations), which is more likely if a group-by operation produces
small groups whose members all hold the same value.  For example, in a
group with three values of 0.1, the average is 0.1, but the computed
(0.1^2 + 0.1^2 + 0.1^2)/3 can differ from ((0.1 + 0.1 + 0.1)/3)^2.
Attached is a small program demonstrating the problem, and here is the
output I got:


$ gcc -O0 fperror.c
$ ./a.out
(0.1*0.1+0.1*0.1+0.1*0.1)/3 (0.0100000000000000019) -
((0.1+0.1+0.1)/3)^2 (0.0100000000000000019) = -1.73472347597680709e-18

Take a look at your use case and see if this might be the case.

If this is indeed the case, then we could take the absolute value
before computing the square root.  This should fix the problem with
the standard deviation, but the variance itself would still come back
as a small negative number.

John


On 4/5/13 12:47 AM, Alexandre Maurel wrote:
> Sorry John, forget my first mail, your formula is correct ... I missed a 
> bracket ...
> 
> I am still not sure why, but in some contexts VARSAMP returns a negative 
> number, which causes a floating-point error if I try to compute 
> STDSAMP ...
> 
> Regards,
> 
> Alex
> 
> 
> On 05/04/2013 09:23, Alexandre Maurel wrote:
>> Hello John,
>>
>> There is a problem with the computation of VARSAMP and STDSAMP .
>>
>> In selectParser.cc file you will find,
>>
>> // sample variance is computed as
>> // (sum (x^2) / count(*) - (sum (x) / count(*))^2) * (count(*) / (count(*)-1))
>>
>> which should be corrected to
>>
>> (sum (x^2) / ( count(*) -1) - (sum (x) / count(*))^2) * (count(*) / (count(*)-1))
>>
>> The same correction applies to STDSAMP:
>>
>> sqrt((sum (x^2) / ( count(*) -1) - (sum (x) / count(*))^2) * (count(*) / (count(*)-1)))
>>
>>
>> Regards,
>>
>> Alex
> 
> _______________________________________________
> FastBit-users mailing list
> [email protected]
> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>

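#include <stdio.h>

int main(void) {
    double a[1], b[1], c[1];
    *a = 0.1;
    /* mean of the squares: (0.1^2 + 0.1^2 + 0.1^2) / 3 */
    *b = (*a * *a + *a * *a + *a * *a) / 3.0;
    /* square of the mean: ((0.1 + 0.1 + 0.1) / 3)^2 */
    *c = (*a + *a + *a) / 3.0;
    *c = *c * *c;
    /* print both terms, then their difference */
    printf("(0.1*0.1+0.1*0.1+0.1*0.1)/3 (%.18g) - ((0.1+0.1+0.1)/3)^2 (%.18g) "
           "= %.18g\n",
           *b, *c, *b - *c);
    return 0;
}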
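
For comparison, here is a sketch of a different way to compute the
sample variance that does not subtract two nearly equal aggregates.
It uses Welford's one-pass update, which is a swapped-in technique for
illustration and not what FastBit does; the function name
varsamp_welford and the choice to return 0 for groups with fewer than
two values are assumptions.

```c
#include <stddef.h>

/* Sample variance via Welford's one-pass update.
 * Hypothetical helper, not part of FastBit. */
static double varsamp_welford(const double *x, size_t n) {
    double mean = 0.0;  /* running mean of the values seen so far */
    double m2 = 0.0;    /* running sum of squared deviations */
    size_t i;
    if (n < 2)
        return 0.0;     /* assumption: report 0 when undefined */
    for (i = 0; i < n; ++i) {
        double delta = x[i] - mean;      /* deviation from old mean */
        mean += delta / (double)(i + 1);
        m2 += delta * (x[i] - mean);     /* uses the updated mean */
    }
    return m2 / (double)(n - 1);
}
```

Each update adds a product of two deviations that normally share a
sign, so the accumulated value is far less likely to go negative than
the difference of sum(x^2)/n and (sum(x)/n)^2.  For a group of three
0.1 values every deviation is exactly zero, so the result is exactly
0.  The trade-off is that it needs a pass over the raw values rather
than just the sum and count aggregates.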
