Hi,
I have been thinking about the sum issue again. As my last post showed
(the test case with the spreadsheet where 3 columns holding the *same*
data yielded different results), there is a problem in the summation of
cells.
PROBLEM
========
The sum is one of the most elementary operations in a spreadsheet, so
while the error seemed trivial, it points to a more severe underlying
problem. The sum is used almost *everywhere*, so a robust sum is highly
desirable.
I explained previously a mechanism to compute a sum in a more robust
fashion: first sort the data by absolute value, then add from the
smallest magnitude (i.e. closest to 0) upwards. The signs are kept;
only the order of the additions changes.
In pseudocode, this would look like:
sort X by abs( X[i] ), ascending; // order by magnitude, keep the signs
sum = 0;
for(i=0; i<length(X); i++) { sum += X[i]; }
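To make the idea concrete, here is a minimal C++ sketch of the sorted sum. The function name sumSortedByMagnitude is hypothetical and not taken from the Calc source; it takes its argument by value so that the caller's data is not reordered.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper, not part of the existing code base.
// Orders the values by increasing magnitude (signs are kept), then adds
// them up, so that small terms accumulate before they meet large ones.
double sumSortedByMagnitude(std::vector<double> values) // copy: caller's order is untouched
{
    std::sort(values.begin(), values.end(),
              [](double a, double b) { return std::fabs(a) < std::fabs(b); });

    double sum = 0.0;
    for (double v : values)
        sum += v;
    return sum;
}
```

For the input { 1e16, 1.0, 1.0, -1e16 }, adding left to right in IEEE double arithmetic loses both 1.0 terms and ends at 0, whereas the sorted order adds the two 1.0 terms first and should yield 2. Sorting reduces this kind of loss but does not eliminate rounding error in general.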
DISADVANTAGES
==============
The array has to be sorted first (by absolute value), so the computation
takes more time. On modern computers, this isn't a real problem for
small data sets, but it might cause a bottleneck for large sets.
However, large sets are also where most of the benefit (in terms of
higher accuracy) would be seen.
Also, the formula used for calculating the variance:
for (::std::vector<double>::size_type i = 0; i < n; i++)
vSum += (values[i] - vMean) * (values[i] - vMean);
While this is the correct algorithm, it could be further improved by
first sorting the squared residuals (and likewise the elements when
calculating the mean):
[in pseudocode]
for(i=0; i<length; i++) { X[i] = (values[i] - vMean) * (values[i] - vMean); }
sort( X ); // X holds only non-negative numbers, so no need for abs()
for(i=0; i<length; i++) { vSum += X[i]; }
// Please NOTE: we do not sort values[i] around 0 here,
// BUT values[i] around vMean;
// this is equivalent to sorting by abs(values[i] - vMean);
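The variance pseudocode can be sketched in C++ along the same lines. The function name sumOfSquaredResiduals is hypothetical; values and vMean follow the names in the snippet above, and the mean is assumed to have been computed already.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper, not part of the existing code base.
// Returns the sum of (values[i] - vMean)^2, adding the smallest squares first.
double sumOfSquaredResiduals(const std::vector<double>& values, double vMean)
{
    std::vector<double> residuals;
    residuals.reserve(values.size());
    for (double v : values)
        residuals.push_back((v - vMean) * (v - vMean));

    // All entries are non-negative, so a plain ascending sort is enough.
    std::sort(residuals.begin(), residuals.end());

    double vSum = 0.0;
    for (double r : residuals)
        vSum += r;
    return vSum;
}
```

The caller would then divide the result by n or n - 1, depending on which variance is wanted. The extra vector costs memory proportional to the number of cells, on top of the sorting time.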
[Note: more sophisticated algorithms might exist that perform even
better.]
Kind regards,
Leonard Mada