Tom Moore asked...
----- Forwarded message from Thomas L. Moore -----
Hello,
Does anyone know of a good example of cubic regression that you'd be
willing to share?
Thanks.
----- End of forwarded message from Thomas L. Moore -----
I don't know if this is what Tom had in mind, but it is one of my
favorite datasets. I dug it out and reran my analysis for Tom, and
thought I'd share it with others as well. (The software is Minitab 11.)
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Univariate Data Well Fit by a Cubic
MTB > notitles
MTB > gstd
* NOTE * Standard Graphics are enabled.
Professional Graphics are disabled.
Use the GPRO command to enable Professional Graphics.
MTB > note Turn back the clock to Minitab 5.1
MTB > retr 'e:\stats\minitab8\stats1a\smt15.16'
Retrieving worksheet from file: e:\stats\minitab8\stats1a\smt15.16
Worksheet was saved on 11/15/1996
MTB > print c1 c2
 Row    height   age
   1    86.500     2
   2    95.500     3
   3   103.000     4
   4   109.800     5
   5   116.400     6
   6   122.400     7
   7   128.200     8
   8   133.800     9
   9   139.600    10
  10   145.000    11
MTB > note Girls' "typical" heights in cm from 1980 World Almanac
MTB > plot c1 c2
        -
        -                                                    *
     140+                                               *
        -
height  -                                          *
        -                                     *
        -                                *
     120+
        -                           *
        -
        -                      *
        -                 *
     100+
        -            *
        -
        -       *
        -
          ------+---------+---------+---------+---------+---------+age
               2.0       4.0       6.0       8.0      10.0      12.0
MTB > correlation c1 c2
Correlation of height and age = 0.997
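For readers without Minitab, the correlation can be checked in a few lines of Python. This is only a sketch built on the ten rows printed above; the function name `pearson_r` is an illustrative choice and has nothing to do with Minitab.

```python
import math

# Ages (years) and "typical" heights (cm) as printed in the worksheet.
age = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
height = [86.5, 95.5, 103.0, 109.8, 116.4, 122.4, 128.2, 133.8, 139.6, 145.0]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(age, height)
print(f"Correlation of height and age = {r:.3f}")
```

Rounded to three places this agrees with the 0.997 reported by Minitab.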
MTB > note Relationship appears fairly linear
MTB > regress c1 1 c2;
SUBC> residuals in c3.
The regression equation is
height = 76.6 + 6.37 age

Predictor       Coef     StDev        T        P
Constant      76.641     1.188    64.52    0.000
age           6.3661    0.1672    38.08    0.000

S = 1.518    R-Sq = 99.5%    R-Sq(adj) = 99.4%

Analysis of Variance
Source        DF        SS        MS         F        P
Regression     1    3343.5    3343.5   1450.45    0.000
Error          8      18.4       2.3
Total          9    3361.9

Unusual Observations
Obs    age    height       Fit  StDev Fit  Residual  St Resid
  1    2.0    86.500    89.373      0.892    -2.873    -2.34R
R denotes an observation with a large standardized residual
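The straight-line fit can be reproduced from the textbook least-squares formulas. The sketch below assumes nothing beyond the ten data rows; `fit_line` is a hypothetical helper name, and the residuals it produces correspond to what the session stores in c3.

```python
import math

age = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
height = [86.5, 95.5, 103.0, 109.8, 116.4, 122.4, 128.2, 133.8, 139.6, 145.0]

def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

b0, b1 = fit_line(age, height)
residuals = [h - (b0 + b1 * a) for a, h in zip(age, height)]
sse = sum(e ** 2 for e in residuals)          # error sum of squares
s = math.sqrt(sse / (len(age) - 2))           # residual std. dev., n - 2 df

print(f"height = {b0:.3f} + {b1:.4f} age")
print(f"S = {s:.3f}")
print("residuals:", [round(e, 3) for e in residuals])
```

The coefficients and S match the Minitab output to the digits shown, and the first residual is the -2.873 flagged under Unusual Observations.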
MTB > note Is this a good model? Why? Let's look at the residuals.
MTB > plot c3 c2
        -
     1.5+                           *
        -                      *         *
C3      -                 *
        -                                     *
        -
     0.0+                                          *
        -            *
        -                                               *
        -
        -
    -1.5+
        -                                                    *
        -
        -
        -
    -3.0+       *
          ------+---------+---------+---------+---------+---------+age
               2.0       4.0       6.0       8.0      10.0      12.0
MTB > note Looks curved to me.
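The curvature also shows up without a plot, simply by listing the signs of the residuals in age order. The snippet below is an informal check, not a formal runs test; the values are the C3 column as printed near the end of the session.

```python
from itertools import groupby

# Residuals from the straight-line fit, in age order (column C3).
c3 = [-2.87273, -0.23879, 0.89515, 1.32909, 1.56303,
      1.19697, 0.63090, -0.13515, -0.70121, -1.66728]

signs = ["+" if e > 0 else "-" for e in c3]
# Collapse consecutive equal signs into (sign, run length) pairs.
runs = [(sign, len(list(group))) for sign, group in groupby(signs)]

print("".join(signs))   # --+++++---
print(runs)             # [('-', 2), ('+', 5), ('-', 3)]
```

Ten residuals falling into only three runs (negative, positive, negative) is the arch shape seen in the plot; residuals that were just noise would change sign far more often.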
MTB > note Data used and analyzed to this point in Siegel and Morgan,
MTB > note Statistics and Data Analysis: An Introduction, Wiley, 1996, pp. 554-556
MTB > note This is the best example I've ever seen of the power of residual plots!-)
MTB > note However, I hate to let sleeping dogs lay (or lie).
MTB > note Would a quadratic term help?
MTB > let c4=c2*c2
MTB > name c4 'age-sqr'
MTB > regress c1 2 c2 c4;
SUBC> residuals in c5.
The regression equation is
height = 70.6 + 8.67 age - 0.177 age-sqr

Predictor        Coef      StDev        T        P
Constant      70.6133     0.8601    82.10    0.000
age            8.6706     0.2962    29.28    0.000
age-sqr      -0.17727    0.02236    -7.93    0.000

S = 0.5138    R-Sq = 99.9%    R-Sq(adj) = 99.9%

Analysis of Variance
Source        DF        SS        MS         F        P
Regression     2    3360.0    1680.0   6362.89    0.000
Error          7       1.8       0.3
Total          9    3361.9

Source        DF    Seq SS
age            1    3343.5
age-sqr        1      16.6

Unusual Observations
Obs    age    height       Fit  StDev Fit  Residual  St Resid
  1    2.0    86.500    87.245      0.404    -0.745    -2.35R
R denotes an observation with a large standardized residual
MTB > plot c5 vs. c2
        -
        -
    0.50+            *    *                                  *
        -
C5      -                      *
        -
        -                           *
    0.00+                                               *
        -
        -                                *
        -
        -                                     *
   -0.50+                                          *
        -
        -       *
        -
        -
          ------+---------+---------+---------+---------+---------+age
               2.0       4.0       6.0       8.0      10.0      12.0
MTB > note Oy! It's still curved, but not a parabola.
MTB > let c6=c2*c4
MTB > name c6 'age-cube'
MTB > regress c1 3 c2 c4 c6;
SUBC> residuals in c7.
The regression equation is
height = 66.5 + 11.3 age - 0.629 age-sqr + 0.0232 age-cube

Predictor         Coef       StDev         T        P
Constant       66.4594      0.6508    102.12    0.000
age            11.2662      0.3755     30.01    0.000
age-sqr       -0.62879     0.06328     -9.94    0.000
age-cube      0.023155    0.003221      7.19    0.000

S = 0.1790    R-Sq = 100.0%    R-Sq(adj) = 100.0%

Analysis of Variance
Source        DF        SS        MS          F        P
Regression     3    3361.7    1120.6   34976.76    0.000
Error          6       0.2       0.0
Total          9    3361.9

Source        DF    Seq SS
age            1    3343.5
age-sqr        1      16.6
age-cube       1       1.7

Unusual Observations
Obs    age    height       Fit  StDev Fit  Residual  St Resid
  1    2.0    86.500    86.662      0.162    -0.162    -2.16R
R denotes an observation with a large standardized residual
MTB > plot c7 c2
    0.30+
        -            *
C7      -
        -                                               *
        -
    0.15+
        -
        -
        -                 *
        -
    0.00+                                     *    *
        -                           *
        -                                *
        -
        -                                                    *
   -0.15+       *
        -                      *
          ------+---------+---------+---------+---------+---------+age
               2.0       4.0       6.0       8.0      10.0      12.0
Looks much more random now.
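The three fits can be compared side by side outside Minitab. What follows is a minimal sketch in plain Python that solves the normal equations by Gaussian elimination; the helper names `solve` and `polyfit` are illustrative, and this is not a claim about how Minitab computes its estimates internally.

```python
import math

age = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
height = [86.5, 95.5, 103.0, 109.8, 116.4, 122.4, 128.2, 133.8, 139.6, 145.0]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def polyfit(x, y, degree):
    """Least-squares polynomial coefficients, constant term first."""
    cols = [[xi ** p for xi in x] for p in range(degree + 1)]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

results = {}
for degree in (1, 2, 3):
    coef = polyfit(age, height, degree)
    fitted = [sum(c * a ** p for p, c in enumerate(coef)) for a in age]
    sse = sum((h - f) ** 2 for h, f in zip(height, fitted))
    s = math.sqrt(sse / (len(age) - degree - 1))       # residual std. dev.
    results[degree] = (coef, sse, s)
    print(degree, [round(c, 5) for c in coef], f"SSE = {sse:.2f}", f"S = {s:.4f}")
```

Each added power cuts the error sum of squares by roughly a factor of ten (about 18.4, 1.8, 0.2), which is the same story the Seq SS lines tell. Normal equations are adequate for ten points and a cubic; for higher degrees or wider age ranges, centering the predictor or using a QR-based routine would be the safer choice.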
MTB > print c1-c7
 Row    height   age        C3  age-sqr         C5  age-cube         C7
   1    86.500     2  -2.87273        4  -0.745455         8  -0.161958
   2    95.500     3  -0.23879        9   0.470303        27   0.275804
   3   103.000     4   0.89515       16   0.540605        64   0.054358
   4   109.800     5   1.32909       25   0.265457       125  -0.165219
   5   116.400     6   1.56303       36   0.144849       216  -0.021864
   6   122.400     7   1.19697       49  -0.221212       343  -0.054499
   7   128.200     8   0.63090       64  -0.432732       512  -0.002056
   8   133.800     9  -0.13515       81  -0.489696       729  -0.003449
   9   139.600    10  -0.70121      100   0.007883      1000   0.202382
  10   145.000    11  -1.66728      121   0.459998      1331  -0.123499
MTB > let c8=abs(c7)
MTB > average c8
Mean of C8 = 0.10651
MTB > note Data is given to nearest tenth, at best, so I'll stop here.
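The stopping rule above rests on comparing the size of the cubic residuals with the 0.1 cm resolution of the data. A short check, using the C7 values exactly as printed:

```python
# Cubic-fit residuals as printed in column C7 (cm).
c7 = [-0.161958, 0.275804, 0.054358, -0.165219, -0.021864,
      -0.054499, -0.002056, -0.003449, 0.202382, -0.123499]

mean_abs = sum(abs(e) for e in c7) / len(c7)
largest = max(abs(e) for e in c7)

print(f"Mean of |C7| = {mean_abs:.5f}")
print(f"Largest |C7| = {largest:.3f}")
```

The mean absolute residual comes out at 0.10651 cm, as in the session, so the typical miss is already about the size of the last recorded digit.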
Robert W. Hayden
Work: Department of Mathematics
      Plymouth State College MSC#29
      Plymouth, New Hampshire 03264 USA
      fax (603) 535-2943
Home: 82 River Street (use this in the summer)
      Ashland, NH 03217
      (603) 968-9914 (use this year-round)
[EMAIL PROTECTED] (works year-round)
http://mathpc04.plymouth.edu (works year-round)