RE: Faults and Errors in EXCEL

David Heiser Mon, 21 Jan 2002 20:26:58 -0800


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]]On Behalf Of Humberto Barreto
Sent: Monday, January 21, 2002 6:29 AM
To: David Heiser; [EMAIL PROTECTED]
Subject: Re: Faults and Errors in EXCEL

At 05:41 PM 1/20/02 -0800, David Heiser wrote:
>These are my questions:
>         1. Given that X is singular, how can I test this in EXCEL given
> that I only
>have the MMULT, MINVERSE and MDETERM matrix functions available in EXCEL?
>One test is to calculate MDETERM on the X'X matrix. For this data set, the
>determinant of X'X is ~E-08, which is not particularly small. Another test
>is to multiply (X'X)^-1 times X"X and look for something clearly not a
>unitary matrix. In this case it very clearly is nothing like 'I'. I tend to
>favor the latter.

If you are saying that you don't trust the latter, (X'X)-1 x (X'X) should
equal Identity matrix, in other cases, then why not always do both and if
either fails, you know there's a problem?
---------------------------------------------------------------------
I don't see this as a 'trust' problem. I see it a a problem in making a
decision about the results of the regression. What is the (true/false)
decision on the validity of the values of the regession coefficients? Are
they good to 2 significant figures, 3, 4 or what?  DAH
---------------------------------------------------------------------
How about taking the inverse of the (X'X)-1 result?  How does that do?

I was playing with

2       3
2       3.000001

With 5 zeroes (like above), minverse on the matrix, then minverse on the
inverse gives you the original matrix back, but six or more zeroes and it
does not, getting worse and worse as you add zeroes.

Would this help?
---------------------------------------------------------------------
You sort of are just restating my problem. It degrads. McCullough's position
is that any results from a program that has a LRE value for the computed
coefficient that is less than the corresponding LRE value computed by STATA
is absolutely to be rejected.

If my data has only 4 significant figures, all I need is a program that will
give 5 figure coefficient accuracy. My objective is to be able to come up
with something that can be derived from the data-program that tells me I
have at least 5 figure accuracy. DAH
--------------------------------------------------------------------
If so, now you have 3 tests that must be passed.

>         3. What is the appropriate screen to use with EXCEL (without
> additional
>macros) to indicate that the results are wrong in a regression. With the
>complicated data sets now being fitted, singularity is not obvious.

Please send me the Excel workbook with the data. I'd like to try a few
ideas.  I'm thinking a chart of y and predicted y might show some obvious
problems.
--------------------------------------------------------------------
I will send you by separate message copies of the EXCEL files that I have
been working with, so you can try out your ideas. DAH
--------------------------------------------------------------------
>         4. Telling students that EXCEL does not properly compute
multivariate
>regression is obviously an over-kill.

I agree, although LINEST's limitation of 16 X variables is pretty bad,
don't you think?
--------------------------------------------------------------------
No I don't.

Any regression program that uses/depends on the basic IEEE 64 bit floating
point instruction set as implemented in the Intel (and others) chip, has
very severe problems with some matrix problems. The problem is the inherent
inaccuracy of the inversion operation on a matrix. Although the rounding
unit error is about E-16, the error propagates in subsequent computations.
In any summation series, the errors are bounded by n-1 times the adjusted
rounding unit (Stewart), where n is the number of additions. All matrix
inversions methods that I am aware of depend on the "Shur complement"
operation which is S=A22-A21*A11'*A12 (after Stewart), which involves
subtraction. In any floating point subtraction, the result has fewer
significant figures then the two original terms. The floating point
operation then in effect adds zeros on the right to fill out the mantissa to
a base length of 52 bits. This an inherent loss of significant figures.

Consequently, a matrix inversion is the most inaccurate computation in any
software package.

My position is that any commercial software package that uses IEEE 64 bit
arithmetic will have significant errors in the X'X inversion matrix, and
this is the primary source of uncorrectable errors in regression
coefficients. I would say that any attempt to do multivariate work with more
than 16 variables should be avoided.

Now if you have a Fortran package and can do 128 bit work in software then,
you can look at many more variables and the values have a higher trust
level.

The future of computations depends on having 128 and 256 bit floating point
chip sets within the next 5 years.

DAH
--------------------------------------------------------------------
Prof. Humberto Barreto
Department of Economics
Wabash College
Crawfordsville, IN 47933

[EMAIL PROTECTED]
Voice: (765) 361-6315
FAX: (765) 361-6277
http://www.wabash.edu/econexcel

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
                  http://jse.stat.ncsu.edu/
=================================================================

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
                  http://jse.stat.ncsu.edu/
=================================================================

RE: Faults and Errors in EXCEL

Reply via email to