I have to really thank you all for your time! @ David Heiser.
Your remarks were very helpful and escpecially that one about stepwise regression.I m working on them already!That procedure for the right plot sounds interesting.I would really appreciate it if you could sent it to me.Thanks again! @ Frank Jarrel Would you please tell me which software you mean or where i can find it?My work till now was 100% in Excel and i never thought to use an other software as i never actually needed a "powerful tool" for statistics.It seems that i need it now! @ Jerry Dizinno Dont worry you didnt insult me...You are right that i m no expert in statistics as i never had to work with so many data for regression before.My essay does not depend so much in statistics but i need some general statistical indicators for my data, to go on with it.Cheers! Frank E Harrell Jr <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>... > If you have to center the data, then for sure Excel is using antiquated matrix >arithmetic. Why rely on Excel when there is far better (and free, e.g., R) software >for doing this? -Frank Harrell > > On 29 May 2002 17:31:27 -0700 > [EMAIL PROTECTED] (David Heiser) wrote: > > > > > > > -----Original Message----- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On > > Behalf Of GreekEagle > > Sent: Wednesday, May 29, 2002 10:23 AM > > To: [EMAIL PROTECTED] > > Subject: Excel multiple regression equations...Help please! > > --------------------------------------------------------------- > > 1. Excel uses the standard matrix method here. > > Goto any textbook that gives the matrix form for linear regression with > > intercepts > > The Data Analysis package just uses the existing Excel matrix functions. > > These are: > > ************************** > > Matrix Operations, Range Inputs, Single or Range Outputs > > MDETERM – Returns the matrix determinant (as a single value) of a symmetric > > array. All cells within the range must have a number. > > {MINVERSE} – Returns the inverse matrix of a symmetric matrix. All cells > > within the range must have a number. > > {MMULT} – Returns the matrix product of two arrays. The two arrays must have > > conformal rows and columns. Both arrays must have numbers in all cells. > > {TRANSPOSE} – Returns the transpose of an array. Used to shift col-row > > arrays to row-col form > > *************************************************************************** > > MULTIVARIATE REGRESSION APPLICATIONS > > > > This includes polynomial fits of single variables, where the power terms are > > generated as separate variables. Given the scope of Excel, it is best to > > take a simple approach to multivariate regression. Excel lacks the tools to > > properly evaluate these more complex fits. Also to be recognized is that the > > more complex fits may fit the data very well within the range of the data, > > but give totally wrong results when predictions are made using variable > > values beyond the range of the generating data. This is especially true of > > polynomial fits. > > For polynomial fits it is best to center the X data, then derive the power > > terms as additional variables from the centered X values. > > > > For multivariate data, linear regression in LINEST is done by matrix > > operations. The inverse of the (X’X) matrix is accurate for non-singular X > > matrices to about 1E-14. It appears to be a straight forward LU > > decomposition. For accuracy, it is best to center the data after any > > transformations, since the X’X matrix is otherwise dominated by the squares > > of the absolute values, and will result in inaccurate results. > > *************************************************************************** > > Notes: > > Be sure to center the data and use the centered data as the range of > > inputs. Use the regression with intercepts to get the right R2 and t values. > > DAH > > ------------------------------------------------ > > ................. but i cant tell if > > Excel is using it with no differences,correlations etc.Can you help me > > please?Anything...... > > ------------------------------------------------ > > Numerical testing shows that Excel gives the right values here, except when > > regression through the origin is selected, or when true singularity occurs > > in the matrices. Excel does fine when the correlations between X variables > > are as high as 0.999, as long as the data going in has been centered (i.e. > > subtracted from the respective means) > > DAH > > ------------------------------------------------------------ > > > > -My final regression functions giving the Y values, have 5 or 6 (X) > > different variables with 23 observations for each variable.My data are > > strongly linear with determination (R Square) over 0,95.However the > > t-Student values( for 95% precision) for the variables' coefficents > > are below the crtical "Student" value in many cases.I would like most > > of the X variables to be statistically important in my Y functions.I > > know it is impossible that all the X variables are statistically > > important (with t-Student) but i would like them to be as more as > > possible...I tried to improve that so I have tried to make some > > combinations among the X variables different from those i originally > > wanted to..i tried to increase the variables' observations with some > > more data ...i transformed my data to logarithms and i tested > > regression again but it didnt help much.Any ideas or tips for how i > > can do it?Even a new data process method is acceptable! > > > > Thanks for your help!Cheers! > > --------------------------------------------------------------- > > Excel is weak here, since it does not help you identify what is the problem. > > One help is to do stepwise regression, looking at the residuals in terms of > > a normal plot. Excel however in the Data Analysis package does not give the > > right chart when the box is checked. You have to go through a procedure to > > get the right plot. I can send it as a file if you are interested. These > > residual plots may indicate that one or more points may be dominating the > > solution, and trimming the data may give what you are looking for. > > > > Here is a typical (partial) Excel output (from the centered Longley data). > > Coefficients Standard Error t Stat P-value Lower >95% > > Upper 95% > > Intercept 1.55227E-13 80.19964825 1.93551E-15 > > 1.000000000 -184.9408401 184.9408401 > > X1 17.69270551 89.65919182 0.197332868 0.848488565 > -189.0618953 > > 224.4473063 > > X2 -0.035358222 0.035266279 -1.002607104 0.345408261 >-0.116682459 > > 0.045966016 > > X3 -2.039028779 0.516631745 -3.94677408 > > .004255236 -3.230384489 -0.847673069 > > X4 -1.065810024 0.243234175 -4.381826788 > > 02343047 -1.6267094 -0.504910648 > > X5 -0.060888276 0.239469089 -0.254263614 0.805705356 >-0.613105343 > > 0.491328791 > > X6 2354.192413 1546.080224 1.522684513 0.166337112 -1211.077282 > > 5919.462107 > > X7 -522.1544056 1461.828537 -0.357192648 > > 0.730186832 -3893.139236 2848.830425 > > > > The P-value represents the probability that the coefficient is truly zero. > > The lower and upper 95% represent coefficient values that would include 95% > > of all calculated coefficient values from calculations on additional data > > sets taken as repetitions from a parent population. Most real data is like > > this, showing a large uncertainty in the true value of a coefficient. > > Although the R squared here is .9955, it clearly shows that any one of the > > coefficients could be zero, but there is a probability of 0.0023 that all > > the coefficients would be zero. > > > > Also consider what field/discipline you are working in. Psychologists tend > > to accept anything over 90% as being significant. Others it has to be over > > 99%. > > DAHeiser > > > > . > > . > > ================================================================= > > Instructions for joining and leaving this list, remarks about the > > problem of INAPPROPRIATE MESSAGES, and archives are available at: > > . http://jse.stat.ncsu.edu/ . > > ================================================================= . . ================================================================= Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at: . http://jse.stat.ncsu.edu/ . =================================================================
