-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Humberto Barreto Sent: Monday, January 21, 2002 6:29 AM To: David Heiser; [EMAIL PROTECTED] Subject: Re: Faults and Errors in EXCEL
At 05:41 PM 1/20/02 -0800, David Heiser wrote: >These are my questions: > 1. Given that X is singular, how can I test this in EXCEL given > that I only >have the MMULT, MINVERSE and MDETERM matrix functions available in EXCEL? >One test is to calculate MDETERM on the X'X matrix. For this data set, the >determinant of X'X is ~E-08, which is not particularly small. Another test >is to multiply (X'X)^-1 times X"X and look for something clearly not a >unitary matrix. In this case it very clearly is nothing like 'I'. I tend to >favor the latter. If you are saying that you don't trust the latter, (X'X)-1 x (X'X) should equal Identity matrix, in other cases, then why not always do both and if either fails, you know there's a problem? --------------------------------------------------------------------- I don't see this as a 'trust' problem. I see it a a problem in making a decision about the results of the regression. What is the (true/false) decision on the validity of the values of the regession coefficients? Are they good to 2 significant figures, 3, 4 or what? DAH --------------------------------------------------------------------- How about taking the inverse of the (X'X)-1 result? How does that do? I was playing with 2 3 2 3.000001 With 5 zeroes (like above), minverse on the matrix, then minverse on the inverse gives you the original matrix back, but six or more zeroes and it does not, getting worse and worse as you add zeroes. Would this help? --------------------------------------------------------------------- You sort of are just restating my problem. It degrads. McCullough's position is that any results from a program that has a LRE value for the computed coefficient that is less than the corresponding LRE value computed by STATA is absolutely to be rejected. If my data has only 4 significant figures, all I need is a program that will give 5 figure coefficient accuracy. My objective is to be able to come up with something that can be derived from the data-program that tells me I have at least 5 figure accuracy. DAH -------------------------------------------------------------------- If so, now you have 3 tests that must be passed. > 3. What is the appropriate screen to use with EXCEL (without > additional >macros) to indicate that the results are wrong in a regression. With the >complicated data sets now being fitted, singularity is not obvious. Please send me the Excel workbook with the data. I'd like to try a few ideas. I'm thinking a chart of y and predicted y might show some obvious problems. -------------------------------------------------------------------- I will send you by separate message copies of the EXCEL files that I have been working with, so you can try out your ideas. DAH -------------------------------------------------------------------- > 4. Telling students that EXCEL does not properly compute multivariate >regression is obviously an over-kill. I agree, although LINEST's limitation of 16 X variables is pretty bad, don't you think? -------------------------------------------------------------------- No I don't. Any regression program that uses/depends on the basic IEEE 64 bit floating point instruction set as implemented in the Intel (and others) chip, has very severe problems with some matrix problems. The problem is the inherent inaccuracy of the inversion operation on a matrix. Although the rounding unit error is about E-16, the error propagates in subsequent computations. In any summation series, the errors are bounded by n-1 times the adjusted rounding unit (Stewart), where n is the number of additions. All matrix inversions methods that I am aware of depend on the "Shur complement" operation which is S=A22-A21*A11'*A12 (after Stewart), which involves subtraction. In any floating point subtraction, the result has fewer significant figures then the two original terms. The floating point operation then in effect adds zeros on the right to fill out the mantissa to a base length of 52 bits. This an inherent loss of significant figures. Consequently, a matrix inversion is the most inaccurate computation in any software package. My position is that any commercial software package that uses IEEE 64 bit arithmetic will have significant errors in the X'X inversion matrix, and this is the primary source of uncorrectable errors in regression coefficients. I would say that any attempt to do multivariate work with more than 16 variables should be avoided. Now if you have a Fortran package and can do 128 bit work in software then, you can look at many more variables and the values have a higher trust level. The future of computations depends on having 128 and 256 bit floating point chip sets within the next 5 years. DAH -------------------------------------------------------------------- Prof. Humberto Barreto Department of Economics Wabash College Crawfordsville, IN 47933 [EMAIL PROTECTED] Voice: (765) 361-6315 FAX: (765) 361-6277 http://www.wabash.edu/econexcel ================================================================= Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ ================================================================= ================================================================= Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ =================================================================