The Edstat Digest-Daily (text) - Volume 1 : Issue 125 Today's Topics: Re: why not Excel for regression? Re: interpolation Re: interpolation
---------------------------------------------------------------------- Date: Sun, 14 Jul 2002 10:19:01 -0500 From: "Joe Ward" <[EMAIL PROTECTED]> To: "David Heiser" <[EMAIL PROTECTED]> Cc: "edstat" <[EMAIL PROTECTED]> Subject: *** RECOVERED EMAIL ***: Re: why not Excel for regression? Message-ID: <005701c22b49$ccd7d8e0$[EMAIL PROTECTED]> David -- this is a great message. Thanks for making it available to ALL. Please comment more on your statements re the "pass through origin" option. > In your regression menu, do not check the "constant is zero" box. If you are > doing a regression through zero, I can give you way to avoid some to the > errors here. In the beginning of "giving folks the power to use Prediction/Regression/Linear Models", I always start with the "passing through the origin" OR "no-intercept" option for all models. For instructional simplicity it seems much easier to understand what's happening with this approach. Then later students can understand what the "default" is doing. Students can better understand that the "default" is actually comparing two models within the "one model". Ii becomes easy to explain the TOTAL SUM OF SQUARES that should be called the "adjusted total sum of squares" and what happens to the "one missing degree of freedom". Thanks again for your valuable message -- Joe ********************************** Joe H. Ward, Jr. 167 East Arrowhead Dr. San Antonio, TX 78228-2402 Phone: 210-433-6575 Fax: 210-433-2828 Email: [EMAIL PROTECTED] http://www.northside.isd.tenet.edu/healthww/biostatistics/wardindex ============================== Health Careers High School 4646 Hamilton Wolfe Road San Antonio, TX 78229 ********************************** ----- Original Message ----- From: "David Heiser" <[EMAIL PROTECTED]> To: "Beliavsky" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Saturday, July 13, 2002 11:56 PM Subject: *** RECOVERED EMAIL ***: Re: why not Excel for regression? > > > -----Original Message----- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On > Behalf Of Beliavsky > Sent: Saturday, July 13, 2002 5:53 AM > To: [EMAIL PROTECTED] > Subject: why not Excel for regression? > > > In response to my question about a regression program, a few people > suggested > Excel -- with reservations. I do often use Excel for regressions, but a > command > line is my favorite GUI :). Could someone explain what is wrong with Excel's > regression function? Is this discussed on the web somewhere? > . > ------------------------------------------------------------------------- > I have done an extensive analysis on Excel's regressions, including > evaluations with the NIST Strd tests. > > One inherent problem is with all commercial stat packages, including Excel > is the limitations imposed by the IEEE 64 bit floating point number > representation. They all basically have limited performance. However, I can > get Excel to give more accurate answers than what comes out from some of the > commercial stat packages. > > Excel matrix routines are fairly robust given the fact that the most likely > error is singularity. Banded data matrices do not usually occur. For Excel > using multivariate regression, the inversion routine is fairly robust > against moderate singularity. > > Excel will only handle multivariate regressions with 16 or less independent > (X) variables. > > Both the X (matrix) and Y (vector) data sets should be centered about the > means, and the centered data selected for the regressions. If your data is > only good to 2-3 significant figures, then the effort to center the data may > not justify the improved accuracy from the use of centered data. > > Use Tools - Data Analysis - Regression to get the generally useful output. > The specific functions such as forecast, intercept, slope, steyx just show > single values. For the Data Analysis Regression routine you need to be > familiar with all the menu inputs. > > Regression: > Input Y Range: A block defining the input Y data. A list of values. > Input X Range: A block defining the input X data. For a single variable it > is a list, for multivariate data it is a contiguous block of variables. > Labels: Does the first cell in the input data have a label? Applied to > both X and Y data blocks. > Constant is Zero: Check box if the regression is forced to go through the > origin. > Confidence Level: Check box for +/- confidence level about the computed > coefficient values and put the desired level in the next box. > (Percentage). > Residuals: Boxes to be checked are: > Residuals: Check box if a column of residuals is wanted. > Standardized Residuals: Check if a column of standardized residuals is > wanted. > Residual Plots: Check box if plot is desired > Line Fit Plots: Check box if plot is desired > Normal Probability Plots: Check box if plot is desired. > Output: Extensive, occupies 9 columns by 20 or more rows. Overwrites any > cells. Include a row for each variable, 6 rows for any residual table > headers and a row of residual values for each data point > > > Attempting to try and use the Solver package to minimize the sum of squares > of the residuals will result in less accurate results. However for > non-linear regressions, it may be the only way. The Solver package uses > numerical first derivatives which does a poor job on solving regression > problems. > > If you are trying to fit some non-linear equations, transform the data > first, then center, then fit, then translate coefficients to the > non-centered data, and then translate to the original data system. This of > course assumes that the residuals after transformation are approximately > normally distributed. > > In your regression menu, do not check the "constant is zero" box. If you are > doing a regression through zero, I can give you way to avoid some to the > errors here. > > Excel will not give you a plot of residuals versus "normal distribution" z > values to see if the residuals are near normally distributed if you check > the "Normal Probability Plots" box. It takes some additional calculations > and chart changes to get this. > > Other than this, Excel will give you good results. The F dist results are > accurate for small tails to 1E-300. I haven't checked the t dist extreme > ranges yet. > > > > > > > > > > > ================================================================= > Instructions for joining and leaving this list, remarks about the > problem of INAPPROPRIATE MESSAGES, and archives are available at: > . http://jse.stat.ncsu.edu/ . > ================================================================= > > . > . > ================================================================= > Instructions for joining and leaving this list, remarks about the > problem of INAPPROPRIATE MESSAGES, and archives are available at: > . http://jse.stat.ncsu.edu/ . > ================================================================= [Attachment of type text/html removed.] ------------------------------ Date: Mon, 15 Jul 2002 19:10:37 -0700 From: "David Heiser" <[EMAIL PROTECTED]> To: "Ashraf A Ahmed" <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]> Subject: *** RECOVERED EMAIL ***: Re: interpolation Message-ID: <[EMAIL PROTECTED]> -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Ashraf A Ahmed Sent: Monday, July 15, 2002 4:38 AM To: [EMAIL PROTECTED] Subject: *** RECOVERED EMAIL ***: interpolation Dear Sir, I have a statistical problem and would appreciate your guidance. I am doing my research in Civil and Environmental Engineering. I have a field site 155 ft length and 60 ft wide, where there are 4 boreholes. I know the soil properties only at these boreholes. I need to generate these properties for the whole site. I have read that many interpolation options like Kriging, natural neighbor or inverse distance weighted can do that. But I don't know which method of those mentioned above gives me the best results. I would appreciate very much your hint in that. Also, is there any program which could be downloaded free from your website or any other website that can do that?. Thanks and hope to hear from you. ---------------------------------------------------------------- Civil Engineers like kriging in situations of ground contamination. DAHeiser ------------------------------ Date: Mon, 15 Jul 2002 17:25:13 -0500 From: Jay Warner <[EMAIL PROTECTED]> To: Ashraf A Ahmed <[EMAIL PROTECTED]> Cc: [EMAIL PROTECTED] Subject: *** RECOVERED EMAIL ***: Re: interpolation Message-ID: <[EMAIL PROTECTED]> 1) What is 'best'? This is not a rhetorical question. Different interpolation methods do different things, and some of those things are more desireable than others, for a given situation. Sorry if that makes life more confusing :) 2) Perhaps the simplest method is also the best. (2a) 'best' may mean, the deviation between what you interpolate and what you would actually detect, if you dug a 5th hole. you can estimate this by doing the interpolation on 3 of the holes, and see how well it predicts the 4th. If you dare :) (2b) Ask, what is the precision of the measurmeents at each borehole. No interpolation method should be asked to do better than that. (2c) Any interpolation _assumes_ uniform, or 'slowly changing' properties between holes. (slowly changing = small 1st & 2nd derivative) Do you have reason to believe this assumption is false? If so, then you must take this belief into account in your interpolation scheme, whatever you choose. If false, and you believe properties change slowly, then what are you waiting for? a simple method will do better, and get you close to home sooner. any help? Cheers, Jay Ashraf A Ahmed wrote: > Dear Sir, > > I have a statistical problem and would appreciate your guidance. > > I am doing my research in Civil and Environmental Engineering. > I have a field site 155 ft length and 60 ft wide, where there are 4 > boreholes. I know the soil properties only at these boreholes. I > need to generate these properties for the whole site. > I have read that many interpolation options like Kriging, natural > neighbor or inverse distance weighted can do that. > But I don't know which method of those mentioned above gives me > the best results. I would appreciate very much your hint in that. > Also, is there any program which could be downloaded free from > your website or any other website that can do that?. > > Thanks and hope to hear from you. > > ====================================== > > Ashraf Ahmed > Department of Civil Engineering > The University of Manchester > Oxford Road > Manchester M13 9PL > UK > > Phone: +44(0)161-275-4375 > Fax : +44(0)161-275-4361 > . > . > ================================================================= > Instructions for joining and leaving this list, remarks about the > problem of INAPPROPRIATE MESSAGES, and archives are available at: > . http://jse.stat.ncsu.edu/ . > ================================================================= -- Jay Warner Principal Scientist Warner Consulting, Inc. 4444 North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX: (262) 681-1133 email: [EMAIL PROTECTED] web: http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today? ------------------------------ End of The Edstat Digest-Daily V1 #125 ********** . . ================================================================= Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at: . http://jse.stat.ncsu.edu/ . =================================================================
