Hello John,

Thank you for the answer. What you describe is exactly what is happening.
In my case, turning off compiler optimization fixes the problem, but taking the absolute value before computing the square root is probably a better solution.

Alex

On 05/04/2013 21:01, [email protected] wrote:
> Send FastBit-users mailing list submissions to
>     [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
> or, via email, send a message with subject or body 'help' to
>     [email protected]
>
> You can reach the person managing the list at
>     [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of FastBit-users digest..."
>
>
> Today's Topics:
>
>    1. VARSAMP , STDSAMP (Alexandre Maurel)
>    2. Re: VARSAMP , STDSAMP (Alexandre Maurel)
>    3. Re: VARSAMP , STDSAMP (K. John Wu)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 05 Apr 2013 09:23:44 +0200
> From: Alexandre Maurel <[email protected]>
> Subject: [FastBit-users] VARSAMP , STDSAMP
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hello John,
>
> There is a problem with the computation of VARSAMP and STDSAMP.
>
> In the selectParser.cc file you will find,
>
> // sample variance is computed as
> // (sum (x^2) / count(*) - (sum (x) / count(*))^2) * (count(*) / (count(*)-1))
>
> which should be corrected to
>
> (sum (x^2) / ( count(*) -1) - (sum (x) / count(*))^2) * (count(*) / (count(*)-1))
>
> same correction for STDSAMP
>
> sqrt((sum (x^2) / ( count(*) -1) - (sum (x) / count(*))^2) * (count(*) / (count(*)-1)))
>
>
> Regards,
>
> Alex
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 05 Apr 2013 09:47:49 +0200
> From: Alexandre Maurel <[email protected]>
> Subject: Re: [FastBit-users] VARSAMP , STDSAMP
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Sorry John, forget my first mail, your formula is correct ... I missed a
> bracket ...
>
> I am still not sure why, but in some contexts VARSAMP returns a negative
> number, which causes a floating-point error if I try to compute
> STDSAMP ...
>
> Regards,
>
> Alex
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 05 Apr 2013 09:40:14 -0700
> From: "K. John Wu" <[email protected]>
> Subject: Re: [FastBit-users] VARSAMP , STDSAMP
> To: FastBit Users <[email protected]>
> Cc: Alexandre Maurel <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi, Alex,
>
> Here is a guess as to what might be happening. The minus operator might
> be receiving two operands that are very close to each other.
> Theoretically, we might expect a zero to be returned, but due to
> floating-point rounding, the result is a small negative number. This
> could happen if you have very small variances (i.e., small standard
> deviations), which is more likely if the groups in your group-by
> operations are small and contain the same number. For example, for a
> group with three values of 0.1, the average would be 0.1, but the
> computed value of (0.1^2 + 0.1^2 + 0.1^2)/3 will differ from that of
> ((0.1 + 0.1 + 0.1)/3)^2. Attached is a small program demonstrating the
> problem, and here is the output I got:
>
> $ gcc -O0 fperror.c
> $ ./a.out
> (0.1*0.1+0.1*0.1+0.1*0.1)/3 (0.0100000000000000019) - ((0.1+0.1+0.1)/3)^2 (0.0100000000000000019) = -1.73472347597680709e-18
>
> Take a look at your use case and see if this might be the case.
>
> If this is indeed the case, then we could take the absolute value
> before computing the square root. This should fix the problem with
> the standard deviation, but variance will continue to return negative
> numbers.
>
> John
>
> -------------- next part --------------
> #include <stdio.h>
>
> int main() {
>     double a[1], b[1], c[1];
>     *a = 0.1;
>     /* mean of the squares */
>     *b = (*a * *a + *a * *a + *a * *a) / 3.0;
>     /* square of the mean */
>     *c = (*a + *a + *a) / 3.0;
>     *c = *c * *c;
>     /* note: *b is passed for both operand fields, so the two printed
>        operands look identical; the last field is the real *b - *c */
>     printf("(0.1*0.1+0.1*0.1+0.1*0.1)/3 (%0.18g) - ((0.1+0.1+0.1)/3)^2 "
>            "(%.18g) = %.18g\n",
>            *b, *b, *b - *c);
>     return 0;
> }
>
> ------------------------------
>
> _______________________________________________
> FastBit-users mailing list
> [email protected]
> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>
>
> End of FastBit-users Digest, Vol 68, Issue 4
> ********************************************
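The cancellation John describes, and two common ways around it, can be sketched in C as follows. This is only an illustration under stated assumptions, not FastBit code: the names var_samp_naive, clamp_nonneg and var_samp_welford are made up for the example, and every group is assumed to hold at least two values. The naive version follows the formula quoted from selectParser.cc. Clamping at zero is shown as an alternative to taking the absolute value, since it keeps the variance itself non-negative as well as its square root. Welford's one-pass update is a different, well-known technique that avoids subtracting two nearly equal numbers in the first place. Unlike the attached program, which passes *b for both operand fields, the optional demo prints each result separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Naive sample variance, following the formula quoted from selectParser.cc:
 *   (sum(x^2)/n - (sum(x)/n)^2) * (n/(n-1))
 * The subtraction can cancel almost completely, so rounding error may
 * leave a tiny negative result. Assumes n >= 2. */
double var_samp_naive(const double *x, size_t n) {
    assert(n >= 2);
    double sum = 0.0, sumsq = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += x[i];
        sumsq += x[i] * x[i];
    }
    double mean = sum / (double)n;
    return (sumsq / (double)n - mean * mean) * ((double)n / (double)(n - 1));
}

/* Hypothetical guard: map small negative rounding residue to zero so that
 * both the variance and a later square root stay well defined. */
double clamp_nonneg(double v) {
    return v < 0.0 ? 0.0 : v;
}

/* Welford's one-pass update (a swapped-in technique, not what the thread
 * describes): m2 accumulates squared deviations from the running mean,
 * so no two large, nearly equal sums are ever subtracted. Assumes n >= 2. */
double var_samp_welford(const double *x, size_t n) {
    assert(n >= 2);
    double mean = 0.0, m2 = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double delta = x[i] - mean;
        mean += delta / (double)(i + 1);
        m2 += delta * (x[i] - mean);
    }
    return m2 / (double)(n - 1);
}

#ifdef VARIANCE_DEMO
/* Optional demo: build with -DVARIANCE_DEMO to print the three results
 * for a group of three identical values. */
int main(void) {
    const double group[3] = {0.1, 0.1, 0.1};
    double naive = var_samp_naive(group, 3);
    printf("naive   = %.18g\n", naive);
    printf("clamped = %.18g\n", clamp_nonneg(naive));
    printf("welford = %.18g\n", var_samp_welford(group, 3));
    return 0;
}
#endif
```

The value printed for the naive version can change with the compiler, target and optimization flags (extended-precision registers or fused multiply-add, for instance), which is consistent with the problem disappearing when optimization is turned off. STDSAMP would then be the square root of either the clamped or the Welford value.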
