The Edstat Digest-Daily (text)  -   Volume 1 : Issue 125

Today's Topics:
  Re: why not Excel for regression?
  Re: interpolation
  Re: interpolation

----------------------------------------------------------------------

Date: Sun, 14 Jul 2002 10:19:01 -0500
From: "Joe Ward" <[EMAIL PROTECTED]>
To: "David Heiser" <[EMAIL PROTECTED]>
Cc: "edstat" <[EMAIL PROTECTED]>
Subject: *** RECOVERED EMAIL ***:  Re: why not Excel for regression?
Message-ID: <005701c22b49$ccd7d8e0$[EMAIL PROTECTED]>

David -- this is a great message. Thanks for making it available to ALL.

Please comment more on your statements re the "pass through origin" option.

> In your regression menu, do not check the "constant is zero" box. If you are
> doing a regression through zero, I can give you way to avoid some to the
> errors here.

In the beginning of "giving folks the power to use Prediction/Regression/Linear 
Models", I always start with
the "passing through the origin" OR "no-intercept" option for all models.  For 
instructional simplicity it seems
much easier to understand what's happening with this approach.  Then later students 
can understand
what the "default" is doing.

Students can better understand that the "default" is actually comparing two models 
within the "one model".
Ii becomes easy to explain the TOTAL SUM OF SQUARES that should be called the 
"adjusted total sum of squares"
 and what happens to the "one missing degree of freedom".

Thanks again for your valuable message

--  Joe
**********************************
Joe H. Ward,  Jr.
167 East Arrowhead Dr.
San Antonio, TX 78228-2402
Phone: 210-433-6575
Fax:     210-433-2828
Email:  [EMAIL PROTECTED]
http://www.northside.isd.tenet.edu/healthww/biostatistics/wardindex
==============================
Health Careers High School
4646 Hamilton Wolfe Road
San Antonio, TX 78229
**********************************



----- Original Message ----- 
From: "David Heiser" <[EMAIL PROTECTED]>
To: "Beliavsky" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, July 13, 2002 11:56 PM
Subject: *** RECOVERED EMAIL ***:  Re: why not Excel for regression?


> 
> 
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Beliavsky
> Sent: Saturday, July 13, 2002 5:53 AM
> To: [EMAIL PROTECTED]
> Subject: why not Excel for regression?
> 
> 
> In response to my question about a regression program, a few people
> suggested
> Excel -- with reservations. I do often use Excel for regressions, but a
> command
> line is my favorite GUI :). Could someone explain what is wrong with Excel's
> regression function? Is this discussed on the web somewhere?
> .
> -------------------------------------------------------------------------
> I have done an extensive analysis on Excel's regressions, including
> evaluations with the NIST Strd tests.
> 
> One inherent problem is with all commercial stat packages, including Excel
> is the limitations imposed by the IEEE 64 bit floating point number
> representation. They all basically have limited performance. However, I can
> get Excel to give more accurate answers than what comes out from some of the
> commercial stat packages.
> 
> Excel matrix routines are fairly robust given the fact that the most likely
> error is singularity. Banded data matrices do not usually occur. For Excel
> using multivariate regression, the inversion routine is fairly robust
> against moderate singularity.
> 
> Excel will only handle multivariate regressions with 16 or less independent
> (X) variables.
> 
> Both the X (matrix) and Y (vector) data sets should be centered about the
> means, and the centered data selected for the regressions. If your data is
> only good to 2-3 significant figures, then the effort to center the data may
> not justify the improved accuracy from the use of centered data.
> 
> Use Tools - Data Analysis - Regression to get the generally useful output.
> The specific functions such as forecast, intercept, slope, steyx just show
> single values. For the Data Analysis Regression routine you need to be
> familiar with all the menu inputs.
> 
> Regression:
> Input Y Range: A block defining the input Y data. A list of values.
> Input X Range: A block defining the input X data. For a single variable it
> is a list, for multivariate data it is a contiguous block of variables.
> Labels: Does the first cell in the input data have a label? Applied to
> both X and Y data blocks.
> Constant is Zero: Check box if the regression is forced to go through the
> origin.
> Confidence Level: Check box for +/- confidence level about the computed
> coefficient values and put the desired level in the next box.
> (Percentage).
> Residuals: Boxes to be checked are:
> Residuals: Check box if a column of residuals is wanted.
> Standardized Residuals: Check if a column of standardized residuals is
> wanted.
> Residual Plots: Check box if plot is desired
> Line Fit Plots: Check box if plot is desired
> Normal Probability Plots: Check box if plot is desired.
> Output: Extensive, occupies 9 columns by 20 or more rows. Overwrites any
> cells. Include a row for each variable, 6 rows for any residual table
> headers and a row of residual values for each data point
> 
> 
> Attempting to try and use the Solver package to minimize the sum of squares
> of the residuals will result in less accurate results. However for
> non-linear regressions, it may be the only way. The Solver package uses
> numerical first derivatives which does a poor job on solving regression
> problems.
> 
> If you are trying to fit some non-linear equations, transform the data
> first, then center, then fit, then translate coefficients to the
> non-centered data, and then translate to the original data system. This of
> course assumes that the residuals after transformation are approximately
> normally distributed.
> 
> In your regression menu, do not check the "constant is zero" box. If you are
> doing a regression through zero, I can give you way to avoid some to the
> errors here.
> 
> Excel will not give you a plot of residuals versus "normal distribution" z
> values to see if the residuals are near normally distributed if you check
> the "Normal Probability Plots" box. It takes some additional calculations
> and chart changes to get this.
> 
> Other than this, Excel will give you good results. The F dist results are
> accurate for small tails to 1E-300. I haven't checked the t dist extreme
> ranges yet.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> .                  http://jse.stat.ncsu.edu/                    .
> =================================================================
> 
> .
> .
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> .                  http://jse.stat.ncsu.edu/                    .
> =================================================================


[Attachment of type text/html removed.]

------------------------------

Date: Mon, 15 Jul 2002 19:10:37 -0700
From: "David Heiser" <[EMAIL PROTECTED]>
To: "Ashraf A Ahmed" <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
Subject: *** RECOVERED EMAIL ***:  Re: interpolation
Message-ID: <[EMAIL PROTECTED]>



-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
Behalf Of Ashraf A Ahmed
Sent: Monday, July 15, 2002 4:38 AM
To: [EMAIL PROTECTED]
Subject: *** RECOVERED EMAIL ***:  interpolation


Dear Sir,

I have a statistical problem and would appreciate your guidance.

I am doing my research in Civil and Environmental Engineering.
I have a field site 155 ft length and 60 ft wide, where there are 4 
boreholes. I know the soil properties only at these boreholes. I 
need to generate these properties for the whole site.
I have read that many interpolation options like Kriging, natural 
neighbor or inverse distance weighted can do that.
But I don't know which method of those mentioned above gives me 
the best results. I would appreciate very much your hint in that.
Also, is there any program which could be downloaded free from 
your website or any other website that can do that?.

Thanks and hope to hear from you.
----------------------------------------------------------------
Civil Engineers like kriging in situations of ground contamination.

DAHeiser

------------------------------

Date: Mon, 15 Jul 2002 17:25:13 -0500
From: Jay Warner <[EMAIL PROTECTED]>
To: Ashraf A Ahmed <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: *** RECOVERED EMAIL ***:  Re: interpolation
Message-ID: <[EMAIL PROTECTED]>

1)    What is 'best'?  This is not a rhetorical question.  Different
interpolation methods do different things, and some of those things
are more desireable than others, for a given situation.  Sorry if that
makes life more confusing :)

2)    Perhaps the simplest method is also the best.
(2a)    'best' may mean, the deviation between what you interpolate
and what you would actually detect, if you dug a 5th hole.  you can
estimate this by doing the interpolation on 3 of the holes, and see
how well it predicts the 4th.  If you dare :)
(2b)    Ask, what is the precision of the measurmeents at each
borehole.  No interpolation method should be asked to do better than
that.
(2c)    Any interpolation _assumes_ uniform, or 'slowly changing'
properties between holes.  (slowly changing = small 1st & 2nd
derivative)  Do you have reason to believe this assumption is false?
If so, then you must take this belief into account in your
interpolation scheme, whatever you choose.

If false, and you believe properties change slowly, then what are you
waiting for?  a simple method will do better, and get you close to
home sooner.

any help?

Cheers,
Jay

Ashraf A Ahmed wrote:

> Dear Sir,
>
> I have a statistical problem and would appreciate your guidance.
>
> I am doing my research in Civil and Environmental Engineering.
> I have a field site 155 ft length and 60 ft wide, where there are 4
> boreholes. I know the soil properties only at these boreholes. I
> need to generate these properties for the whole site.
> I have read that many interpolation options like Kriging, natural
> neighbor or inverse distance weighted can do that.
> But I don't know which method of those mentioned above gives me
> the best results. I would appreciate very much your hint in that.
> Also, is there any program which could be downloaded free from
> your website or any other website that can do that?.
>
> Thanks and hope to hear from you.
>
> ======================================
>
> Ashraf Ahmed
> Department of Civil Engineering
> The University of Manchester
> Oxford Road
> Manchester M13 9PL
> UK
>
> Phone: +44(0)161-275-4375
> Fax  : +44(0)161-275-4361
> .
> .
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> .                  http://jse.stat.ncsu.edu/                    .
> =================================================================

--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?





------------------------------

End of The Edstat Digest-Daily V1 #125
**********
.
.
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================

Reply via email to