Hi,

I have been thinking about the sum issue again. If you read my last post (the test case with the spreadsheet where 3 columns containing the *same* data yielded different results), you will have seen that there is a problem in the summation of cells.

PROBLEM
========
The sum is one of the most elementary operations in a spreadsheet, so while the error seemed trivial, it points to a more severe underlying problem. The sum is used almost *everywhere*, so a robust sum is highly desirable.

I previously explained a mechanism for computing a sum more robustly: first sort the data by absolute value, then add the elements starting from the one closest to 0 and working upwards.
In pseudocode, this would look like:
sort X by abs(X[i]), ascending   // order by magnitude, but keep the signs
sum = 0
for(i=0; i<length(X); i++) { sum += X[i]; }
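
As a rough illustration, the sorted summation described above could be written in C++ along these lines. This is only a sketch: the function name sortedSum is made up for this example and is not taken from the existing code base, and it copies the input so that the caller's data stays in its original order.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper (the name is an assumption, not existing code):
// adds the values in order of increasing magnitude.
// The vector is taken by value so the caller's data is left unsorted.
double sortedSum(std::vector<double> values)
{
    // Order by absolute value only; the signs of the elements are kept.
    std::sort(values.begin(), values.end(),
              [](double a, double b) { return std::fabs(a) < std::fabs(b); });

    double sum = 0.0;
    for (double v : values)
        sum += v;   // smallest magnitudes are accumulated first
    return sum;
}
```

With IEEE 754 doubles, adding {1e16, 1.0, 1.0} from left to right gives 1e16, because each 1.0 is lost against the large first term, whereas sortedSum adds the two small values first and returns 1e16 + 2.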

DISADVANTAGES
==============
The array has to be sorted first (by absolute value), so the computation takes more time.

On modern computers this isn't a real problem for small data sets, but it might become a bottleneck for large ones. However, large data sets are also where most of the benefit (in terms of higher accuracy) would be seen.

Also, the formula used for calculating the variance:
for (::std::vector<double>::size_type i = 0; i < n; i++)
       vSum += (values[i] - vMean) * (values[i] - vMean);

While this is the correct algorithm, it could be improved further by first sorting the squared residuals (and likewise sorting the elements when calculating the mean):
[in pseudocode]
  for(i=0; i<length; i++) { X[i] = (values[i] - vMean) * (values[i] - vMean); }
  sort( X ); // X contains only non-negative numbers, so no need for abs()
  for(i=0; i<length; i++) { vSum += X[i]; }
// Please NOTE: here we do not sort values[i] by distance from 0,
// BUT by distance from vMean;
// this is equivalent to sorting by abs(values[i] - vMean).
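
A possible C++ version of the pseudocode above, reusing the names values, vMean and vSum from the existing loop. The function name sortedSquaredResiduals is an assumption made for this sketch; it returns only the sum of squared residuals, so dividing by the appropriate count is left to the caller, as in the original code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (the name is an assumption, not existing code):
// returns the sum of squared deviations from vMean, adding the
// smallest squared residuals first.
double sortedSquaredResiduals(const std::vector<double>& values, double vMean)
{
    std::vector<double> residuals;
    residuals.reserve(values.size());
    for (double v : values)
        residuals.push_back((v - vMean) * (v - vMean));

    // All entries are non-negative, so a plain ascending sort is enough.
    std::sort(residuals.begin(), residuals.end());

    double vSum = 0.0;
    for (double r : residuals)
        vSum += r;
    return vSum;
}
```

The extra vector costs memory proportional to the number of values, on top of the sorting time already mentioned under the disadvantages.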

[Note: more sophisticated algorithms might exist that perform even better.]

Kind regards,

Leonard Mada
