Hi Brian and Awhan,

Just a quick follow-up on the recurrence formula for the variance. Awhan, if you are interested in a deeper understanding of this issue and have access to the following book, I would recommend you take a look at Higham, N., *Accuracy and Stability of Numerical Algorithms*, Second Edition, SIAM, 2002. Section 1.9 is called "Computing the Sample Variance".
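For what it's worth, here is a minimal sketch of the idea in C. The function names are my own and are not the GSL API; running_variance simply mirrors the loop Awhan quoted from variance_source.c, so it divides by n rather than n - 1. The third function is the one-pass "textbook" formula, included only for comparison, and I have assumed plain double accumulators for it.

```c
#include <stddef.h>

/* Hypothetical helper (not the GSL API): mean via the recurrence
   mean_{i+1} = mean_i + (x_i - mean_i) / (i + 1).
   No large running sum is ever formed. */
double
running_mean (const double data[], size_t stride, size_t n)
{
  long double mean = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      mean += (data[i * stride] - mean) / (i + 1);
    }

  return (double) mean;
}

/* Hypothetical helper: variance about a known mean, using the same
   recurrence as the loop quoted from variance_source.c.
   Divides by n (no n - 1 correction). */
double
running_variance (const double data[], size_t stride, size_t n, double mean)
{
  long double variance = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      const long double delta = (data[i * stride] - mean);
      variance += (delta * delta - variance) / (i + 1);
    }

  return (double) variance;
}

/* For comparison only: the one-pass textbook formula
   sum(x^2)/n - (sum(x)/n)^2, accumulated in double.
   It subtracts two large, nearly equal numbers. */
double
textbook_variance (const double data[], size_t stride, size_t n)
{
  double sum = 0.0, sum_sq = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      const double x = data[i * stride];
      sum += x;
      sum_sq += x * x;
    }

  return sum_sq / n - (sum / n) * (sum / n);
}
```

As an example, take the four values 1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16 with stride 1. Their mean is 1e9 + 10 and their variance (dividing by n) is 22.5, and the two recurrence functions reproduce those values. The textbook formula will typically return something quite different in double precision, because the squares are near 1e18, where adjacent doubles are 128 apart.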
Cheers,
-- Juan

2011/8/7 Brian Hawkins <[email protected]>

> Hi Awhan,
>
> I didn't write the code, but I think it's safe to say that the
> recurrence relations and use of long double is indeed motivated by
> numerical concerns. Running sums are a common cause of overflow.
> Variables inside a loop have local scope but otherwise behave
> similarly as other variables. IMO this coding style is cleaner, and
> it also makes parallelism simpler to implement.
>
> Regards,
> Brian
>
> > Date: Sun, 7 Aug 2011 04:22:49 +0530
> > From: Awhan Patnaik <[email protected]>
> > To: [email protected]
> > Subject: [Help-gsl] use of recurrence relation while computing mean,
> >         variance etc.
> > Message-ID:
> >         <CAM+ON=
> [email protected]>
> > Content-Type: text/plain; charset=ISO-8859-1
> >
> > hello all,
> >
> > i have 3 questions to bother you with.
> >
> > 1) what is the motivation behind using the recurrence relation for the
> > computation of the mean ? seems to me that a division by the number of
> > elements *after* the for loop will result in fewer division operations
> > than the current implementation. i have noticed the use of recurrence
> > type computation of variance as well.
> >
> > 2) many functions have a return type of double but the quantity of
> > interest that is to be returned is declared as a long double inside
> > the function body. why is this done? is it that conversion from a long
> > double to double results in less loss of precision?
> >
> > 3) declaration of variables inside the for loop body e.g. delta in the
> > following snippet from variance_source.c
> >
> > /* find the sum of the squares */
> > for (i = 0; i < n; i++)
> >   {
> >     const long double delta = (data[i * stride] - mean);
> >     variance += (delta * delta - variance) / (i + 1);
> >   }
> >
> > does the compiler keep creating a local delta each time control enters
> > the loop or is it created once but treated as a local variable and
> > valid only in the scope of the for loop?
_______________________________________________
Help-gsl mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/help-gsl
