Bob makes his usual valuable comments!
----- Original Message -----
From: Bob Hayden <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Sunday, November 28, 1999 9:17 PM
Subject: Re: sets of values
| > On 27 Nov 1999 18:44:33 -0800, [EMAIL PROTECTED] wrote:
| >
| > > Obviously the sets are not related in a linear fashion.
| > >
| > > I would suggest that a 4th degree polynomial equation best fits the data.
| >
| > Oh! that should have been obvious....
|
| > Rich Ulrich, [EMAIL PROTECTED]
| > http://www.pitt.edu/~wpilib/index.html
|
| I was hoping someone else would respond to the polynomial problem.
| Rich did, but I fear his point and his humor may be lost on those who
| need it most.
|
| Higher order polynomial fits are problematic in many ways. It would
| be VERY unusual for a polynomial of degree higher than two to be a
| reasonable model (outside of cases where prior theory specifically
| predicts a higher order polynomial). A software package that
| recommends fitting a slew of higher order polynomials and then
| choosing among them is of dubious statistical quality. To know what
| to do instead you would need to know more about the context and
| meaning of the data. For example, in my current regression class we
| had data on electrical consumption of condominium units of different
| sizes. A parabola gave a considerably better fit than a straight line
| -- but it also predicted that costs would peak out at a size within
| the range of the data and then drop off for larger sizes. This is not
| very sensible. My choice was to transform size into 1/size^2. This
| was not perfect but it was reasonable for the range of sizes studied
| and did not do bizarre things just beyond that range. It gave a model
| that rose more slowly for large sizes but never went down with
| increasing size.
|
| PS I learned about the dangers of fitting higher order polynomials as
| part of a final programming assignment in a Fortran course I took as
| an undergraduate at MIT in about 1970. If you have n data points with
| distinct x-values, a polynomial of degree n-1 gives a PERFECT fit in
| the sense of going right through each point. However, for n more than
| a few, it wiggles wildly between points and the matrix algebra croaked
| all the canned packages we had at the time because of multicollinearity
| problems. The point of the assignment: having a computer is no
| substitute for knowing what you're doing.
|
|
| _
| | | Robert W. Hayden
| | | Department of Mathematics
| / | Plymouth State College MSC#29
| | | Plymouth, New Hampshire 03264 USA
| | * | Rural Route 1, Box 10
| / | Ashland, NH 03217-9702
| | ) (603) 968-9914 (home)
| L_____/ [EMAIL PROTECTED]
| fax (603) 535-2943 (work)
|
----- Joe Ward comments --
Hi, Bob --
Re your first paragraph--
nth-degree polynomials CAN BE USEFUL IN FITTING A WIDE RANGE OF MODELS!
I'm assuming that you are referring to a LINEAR MODEL of the form:
Y = a0*U + a1*X + a2*X^2 + a3*X^3 + ... + an*X^n + E
(where U is a predictor of 1's -- the most neglected and misunderstood
predictor of all time)
By applying the capabilities acquired in learning to apply restrictions
to investigate hypotheses using LINEAR MODELS it is possible to use
ONLY THOSE PARTS OF A GENERAL POLYNOMIAL THAT DO A GOOD JOB OF FITTING THE DATA.
We might START with a model of the general form shown above and then impose
restrictions so that we use only THAT PART OF THE FUNCTION THAT IS monotonically
increasing or decreasing; or, if desired, use only a portion of the function that
has TWO CHANGES OF DIRECTION, etc. It isn't necessary to use ALL of the predictors
in the general form.
If students are given the capability by their statistics teachers to impose
restrictions on models, these students will have useful tools outside the
statistics world.
A student might want to create a model of the general form:
Y = a0*U + a1*X + a2*X^2
such that the slope = 0 at X = k
or
such that the slope = s at X = k
A curious student might want to spend many hours exploring the possibilities.
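To make the slope restriction concrete: the slope of the quadratic at X = k is a1 + 2*a2*k, so requiring slope = s gives a1 = s - 2*k*a2. Substituting back leaves a RESTRICTED MODEL with only two unknowns, Y - s*X = a0*U + a2*(X^2 - 2*k*X) + E. What follows is a minimal sketch in Python with NumPy, not SAS; the function name and the data values are made up purely for illustration.

```python
import numpy as np

def fit_restricted_quadratic(x, y, k, s):
    """Least-squares fit of Y = a0*U + a1*X + a2*X^2 with slope = s at X = k.

    The restriction a1 + 2*k*a2 = s is substituted into the model, giving
    Y - s*X = a0*U + a2*(X^2 - 2*k*X) + E, which has two free weights.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u = np.ones_like(x)               # U, the predictor of 1's
    z = x**2 - 2.0 * k * x            # the one remaining curved predictor
    design = np.column_stack([u, z])
    (a0, a2), *_ = np.linalg.lstsq(design, y - s * x, rcond=None)
    a1 = s - 2.0 * k * a2             # recover a1 from the restriction
    return a0, a1, a2

# Made-up illustrative data (not from any real study).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 5.2, 5.8, 6.1, 5.9])

k, s = 5.0, 0.0                       # require slope = 0 at X = 5
a0, a1, a2 = fit_restricted_quadratic(x, y, k, s)

# Verify that the restricted model has the desired property.
slope_at_k = a1 + 2.0 * a2 * k
print("a0, a1, a2 =", a0, a1, a2)
print("slope at X = k:", slope_at_k)
```

Changing s to a nonzero value gives the second case (slope = s at X = k) with no other change to the code.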
In the SAS system, it is quite easy to have a very general STARTING MODEL, then
use the RESTRICT STATEMENT to create an ASSUMED MODEL and then use the TEST
STATEMENT to test hypotheses.
When students are first learning to impose restrictions it seems best to have them
actually develop the RESTRICTED MODEL and then to VERIFY THAT THE RESTRICTED MODEL
HAS THE DESIRED PROPERTIES. Even the SAS system might create "strange" models that
are not what the user has in mind.
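One way to do that verification, whatever package produced the weights, is to look for places inside the range of the data where the fitted slope is zero. The sketch below uses Python with NumPy and is only an illustration; the function name and the example weights are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def zero_slope_points(weights, lo, hi, tol=1e-9):
    """Return the real X values strictly inside (lo, hi) where the slope is zero.

    weights are a0, a1, a2, ... for Y = a0 + a1*X + a2*X^2 + ...
    """
    slope = Polynomial(weights).deriv()        # derivative of the fitted curve
    roots = slope.roots()
    real = roots[np.abs(roots.imag) < tol].real
    return np.sort(real[(real > lo) & (real < hi)])

# Hypothetical fitted weights: Y = 1 + 4*X - 0.5*X^2, so the slope is 4 - X.
weights = [1.0, 4.0, -0.5]

# The curve turns over at X = 4, inside a data range of 1 to 6 ...
print(zero_slope_points(weights, 1.0, 6.0))
# ... but has no zero-slope point if the data only run from 1 to 3.
print(zero_slope_points(weights, 1.0, 3.0))
```

An empty result means the fitted curve does not change direction within the range checked, which is the property a monotonic model should have.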
Students can apply their "basic" algebra (IF they have "basic" algebra) to some
practical use!
----
And, referring to your MIT experience, the LINEAR DEPENDENCE (MULTICOLLINEARITY)
PROBLEM is what stimulated us (in the 1950's) to use an iterative method to solve
least-squares problems on the IBM 602A, IBM 607 and IBM 650 (a wee bit of
historical nostalgia).
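Both effects Bob describes (a polynomial of degree n-1 going right through n points yet wiggling between them, and the predictors X, X^2, ..., X^(n-1) becoming nearly linearly dependent) can be reproduced in a few lines. This is a minimal sketch in Python with NumPy, not the original Fortran assignment; the test function 1/(1 + 25*X^2), the equally spaced X values, and all names are assumptions chosen only for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def smooth_curve(x):
    """A smooth bell-shaped curve used here as the 'true' relationship."""
    return 1.0 / (1.0 + 25.0 * x**2)

def interpolate(n):
    """Fit the degree n-1 polynomial through n equally spaced points on [-1, 1]."""
    x = np.linspace(-1.0, 1.0, n)
    y = smooth_curve(x)
    design = np.vander(x, n, increasing=True)   # columns U, X, X^2, ..., X^(n-1)
    weights = np.linalg.solve(design, y)
    return x, y, design, weights

x, y, design, weights = interpolate(11)

# 1. The fit is "perfect" at the data points.
max_error_at_points = np.max(np.abs(design @ weights - y))

# 2. Between the points the polynomial strays far from the smooth curve.
x_fine = np.linspace(-1.0, 1.0, 2001)
max_error_between = np.max(np.abs(P.polyval(x_fine, weights) - smooth_curve(x_fine)))

# 3. The normal-equations matrix X'X gets closer to singular as n grows.
cond_small = np.linalg.cond(interpolate(4)[2].T @ interpolate(4)[2])
cond_large = np.linalg.cond(design.T @ design)

print("max error at the data points:", max_error_at_points)
print("max error between the points:", max_error_between)
print("condition number of X'X, n = 4: ", cond_small)
print("condition number of X'X, n = 11:", cond_large)
```

The condition number of X'X is one common measure of how close the predictors are to linear dependence; the larger it is, the more precision a direct solution loses.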
-- Joe
*************************************************************
Joe Ward                        Health Careers High School
167 East Arrowhead Dr           4646 Hamilton Wolfe
San Antonio, TX 78228-2402      San Antonio, TX 78229
Phone: 210-433-6575             Phone: 210-617-5400
Fax:   210-433-2828             Fax:   210-617-5423
[EMAIL PROTECTED]
http://www.ijoa.org/joeward/wardindex.html
*************************************************************