Interpretation requires a knowledge of the subject matter -- not just of
p-values.
One should look for plausible explanations for the coefficients:
Plausibility is a sign of a good model.
A lack of plausibility raises questions about a model.
Plausibility requires a knowledge of the subject matter.
-------------------------------------------------------------
Consider the Minitab PULSE dataset.
Pulse1 is the first measurement of pulse (beats per minute, bpm)
Pulse2 is the second measurement of pulse (bpm)
RAN, SEX and SMOKES are binary.
RAN =1: subject ran in place between pulse measurements
RAN =0: subject did not run/exercise between pulse measurements
Thus, if RAN = 0 (a simple test-retest), we might expect a = 0 and b1 = 1
in the model quoted at the end of this message, i.e. PULSE2 = PULSE1 + error.
-------------------------------------------------------------------
In doing the actual regression, I obtained:
MTB > regress c2 6 c1 c3 c4 c5 c9 c10
The regression equation is
PULSE2 = 17.3 + 0.763 PULSE1 + 36.2 RAN + 1.17 SMOKES - 1.04 SEX
- 8.03 Ran*Sex - 20.6 Ran*Smokes
Predictor       Coef     StDev        T        P
Constant      17.332     6.009     2.88    0.005
PULSE1       0.76297   0.07786     9.80    0.000
RAN           36.249     3.005    12.06    0.000
SMOKES         1.169     2.325     0.50    0.616
SEX           -1.036     2.137    -0.48    0.629
Ran*Sex       -8.030     3.567    -2.25    0.027
Ran*Smok     -20.649     3.510    -5.88    0.000
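
As a quick check on the table above: each T value is just Coef divided by StDev. The short Python sketch below recomputes them from the printed numbers. The cutoff |T| < 2 is only a rough screening rule of my own choosing, not something Minitab reports; the P column is the proper guide.

```python
# Coefficients and standard deviations copied from the Minitab output above.
fit = {
    "Constant": (17.332, 6.009),
    "PULSE1": (0.76297, 0.07786),
    "RAN": (36.249, 3.005),
    "SMOKES": (1.169, 2.325),
    "SEX": (-1.036, 2.137),
    "Ran*Sex": (-8.030, 3.567),
    "Ran*Smok": (-20.649, 3.510),
}


def t_ratios(table):
    """Return T = Coef / StDev for every predictor in the table."""
    return {name: coef / sd for name, (coef, sd) in table.items()}


T = t_ratios(fit)

# Rough screen (an assumption, not a Minitab rule): |T| < 2 is "small".
small = [name for name, t in T.items() if abs(t) < 2]

for name, t in T.items():
    print(f"{name:<9} {t:7.2f}")
print("small |T|:", small)
```

The two predictors this flags, SMOKES and SEX, are the same two whose P values are far above 0.05.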
When I eliminated the two non-significant terms (SMOKES and SEX), I got:
MTB > regress c2 4 c1 c3 c9 c10
The regression equation is
PULSE2 = 16.4 + 0.773 PULSE1 + 36.4 RAN - 6.89 Ran*Sex - 21.6 Ran*Smokes
Predictor       Coef     StDev        T        P
Constant      16.352     5.556     2.94    0.004
PULSE1       0.77275   0.07546    10.24    0.000
RAN           36.450     2.733    13.34    0.000
Ran*Sex       -6.893     2.707    -2.55    0.013
Ran*Smok     -21.582     2.869    -7.52    0.000
S = 7.566 R-Sq = 81.3% R-Sq(adj) = 80.4%
--------------------------------------------------------------------------
Using the last model, I would say that running (RAN=1)
* adds 36.45 beats per minute to one's pulse rate (if female non-smoker).
* adds 29... beats per minute to one's rest pulse rate (if male non-smoker)
* adds about 15 (36.45 - 21.58 = 14.87) beats per minute to one's pulse rate (if female smoker)
* adds 8... beats per minute to one's pulse rate (if male smoker).
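
The four running effects can be recomputed directly from the coefficients of the last model. This is a minimal Python sketch; it assumes the coding implied by the labels in this message (SEX = 1 for males, SMOKES = 1 for smokers), and the function name is just for illustration.

```python
# Coefficients from the reduced model (RAN, Ran*Sex, Ran*Smokes).
B_RAN = 36.450
B_RAN_SEX = -6.893
B_RAN_SMOKES = -21.582


def running_effect(sex, smokes):
    """Change in predicted PULSE2 when RAN goes from 0 to 1.

    Assumed coding: sex = 1 for male, 0 for female;
    smokes = 1 for smoker, 0 for non-smoker.
    """
    return B_RAN + B_RAN_SEX * sex + B_RAN_SMOKES * smokes


groups = {
    "female non-smoker": (0, 0),
    "male non-smoker": (1, 0),
    "female smoker": (0, 1),
    "male smoker": (1, 1),
}

effects = {label: running_effect(*codes) for label, codes in groups.items()}

for label, bpm in effects.items():
    print(f"{label:<18} {bpm:6.2f} bpm")
```

Because PULSE1 does not interact with RAN in this model, these differences are the same at every value of the first pulse measurement.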
=================
Interpretation requires a knowledge of the subject matter -- not just
p-values.
One should look for plausible explanations for the coefficients:
1. Why is the constant positive? (2nd measurement effect?)
2. Why is the coefficient for PULSE1 different from 1? (regression effect?)
3. Why does running in place (RAN=1) increase the 2nd pulse measure?
4. Why is the increase somewhat less for males than for females?
5. Why is the increase much less for smokers than for non-smokers?
Plausibility is a sign of a good model.
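
To see why a "regression effect" is a plausible answer to questions 1 and 2, here is a small Python simulation. Every number in it (mean 70, true-pulse SD 10, measurement-noise SD 5, the seed) is invented for illustration and is not estimated from the PULSE data. Two noisy measurements of the same underlying pulse are generated and the second is regressed on the first.

```python
import random


def simulate(n=20000, mean=70.0, sd_true=10.0, sd_noise=5.0, seed=1):
    """Two noisy measurements of one stable pulse per subject.

    All parameters are made up for illustration only.
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n):
        true_pulse = rng.gauss(mean, sd_true)
        first.append(true_pulse + rng.gauss(0.0, sd_noise))
        second.append(true_pulse + rng.gauss(0.0, sd_noise))
    return first, second


def ols(xs, ys):
    """Least-squares intercept and slope of ys regressed on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


intercept, slope = ols(*simulate())
print(f"intercept = {intercept:.2f}, slope = {slope:.3f}")
```

Under these assumptions the slope tends toward 100 / (100 + 25) = 0.8 and the intercept toward 70 * 0.2 = 14, even though nothing changed between the two measurements. So a positive constant together with a PULSE1 coefficient below 1 is what pure measurement noise would produce.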
Of course, one may wonder if all running subjects ran at the same pace.
Perhaps females ran slower than males.
Perhaps smokers ran slower than non-smokers.
This lack of control might produce the same observed results but call for
a totally different interpretation.
Milo
==================================================
Rich Ulrich wrote in message ...
>Here is a bit of Don's example, and then a closing comment about what
>I had said, about interpreting coefficients.
>
>On 10 Jan 2000 00:20:59 -0800, [EMAIL PROTECTED] (Donald F.
>Burrill) wrote:
>
> < snip, much example >
>
>> the coefficients b3 and b4 are indistinguishable from zero. Dropping
>> SEX and SMOKES from the model, we then have
>>
>> PULSE2 = a + b1*PULSE1 + b2*RAN + b5*SEX*RAN + b6*SMOKES*RAN + error
>>