Interpretation requires a knowledge of the subject matter -- not just of
p-values.
One should look for plausible explanations for the coefficients:
Plausibility is a sign of a good model.
A lack of plausibility raises questions about a model.
Plausibility requires a knowledge of the subject matter.
-------------------------------------------------------------
Consider the Minitab PULSE dataset.
  Pulse1 is first measurement of pulse (beats per minute)
  Pulse2 is second measurement of pulse (bpm)
RAN, SEX and SMOKES are binary.
RAN =1:  subject ran in place between pulse measurements
RAN =0:  subject did not run/exercise between pulse measurements
Thus, if RAN = 0 (a simple test-retest), we might expect
     the intercept a = 0 and the PULSE1 slope b1 = 1.
-------------------------------------------------------------------
In doing the actual regression, I obtained:
MTB > regress c2 6 c1 c3 c4 c5 c9 c10
The regression equation is
PULSE2 = 17.3 + 0.763 PULSE1 + 36.2 RAN + 1.17 SMOKES - 1.04 SEX
                    - 8.03 Ran*Sex - 20.6 Ran*Smokes

Predictor       Coef     StDev       T       P
Constant      17.332     6.009    2.88   0.005
PULSE1       0.76297   0.07786    9.80   0.000
RAN           36.249     3.005   12.06   0.000
SMOKES         1.169     2.325    0.50   0.616
SEX           -1.036     2.137   -0.48   0.629
Ran*Sex       -8.030     3.567   -2.25   0.027
Ran*Smok     -20.649     3.510   -5.88   0.000

After eliminating the two non-significant terms (SMOKES and SEX), I obtained:
MTB > regress c2 4 c1 c3 c9 c10
The regression equation is
PULSE2 = 16.4 + 0.773 PULSE1 + 36.4 RAN - 6.89 Ran*Sex - 21.6 Ran*Smokes

Predictor       Coef     StDev       T       P
Constant      16.352     5.556    2.94   0.004
PULSE1       0.77275   0.07546   10.24   0.000
RAN           36.450     2.733   13.34   0.000
Ran*Sex       -6.893     2.707   -2.55   0.013
Ran*Smok     -21.582     2.869   -7.52   0.000

S = 7.566       R-Sq = 81.3%     R-Sq(adj) = 80.4%
--------------------------------------------------------------------------
Using the last model, I would say that, for a given first pulse
measurement, running (RAN=1)
* adds 36.45 beats per minute to one's pulse rate (if female non-smoker).
* adds about 29.6 beats per minute to one's pulse rate (if male non-smoker:
  36.450 - 6.893).
* adds about 14.9 beats per minute to one's pulse rate (if female smoker:
  36.450 - 21.582).
* adds about 8.0 beats per minute to one's pulse rate (if male smoker:
  36.450 - 6.893 - 21.582).
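
A minimal sketch that recomputes the effect of running for each group
from the reduced-model coefficients. It assumes 0/1 coding with SEX=1
for males and SMOKES=1 for smokers; that coding is inferred from the
interpretation in this post, not shown in the Minitab output, and the
function name is only illustrative.

```python
# Coefficients of the RAN terms in the reduced model (Minitab output above).
COEF = {
    "RAN": 36.450,
    "Ran*Sex": -6.893,
    "Ran*Smokes": -21.582,
}


def running_effect(sex: int, smokes: int) -> float:
    """Predicted change in PULSE2 from RAN=1, holding PULSE1 fixed.

    Assumed coding (not stated in the output): sex=1 male, smokes=1 smoker.
    """
    return COEF["RAN"] + COEF["Ran*Sex"] * sex + COEF["Ran*Smokes"] * smokes


# Effect of running for each of the four sex-by-smoking groups.
effects = {
    (sex_label, smoke_label): running_effect(sex, smokes)
    for sex, sex_label in ((0, "female"), (1, "male"))
    for smokes, smoke_label in ((0, "non-smoker"), (1, "smoker"))
}

for (sex_label, smoke_label), effect in effects.items():
    print(f"{sex_label:6s} {smoke_label:10s} {effect:6.2f} bpm")
```

Because RAN enters only through its main effect and the two interaction
terms, the PULSE1 term and the constant cancel out of each difference.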
=================
Interpretation requires a knowledge of the subject matter -- not just
p-values.
One should look for plausible explanations for the coefficients:
1. Why is the constant positive? (2nd measurement effect?)
2. Why is the coefficient for PULSE1 different from 1? (regression effect?)
3. Why does running in place (RAN=1) increase the 2nd pulse measure?
4. Why is the increase somewhat less for males than for females?
5. Why is the increase much less for smokers than for non-smokers?
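
For questions 1 and 2, it may help to look at what the reduced model
predicts for subjects who did not run (RAN=0). The sketch below uses
only the fitted constant and PULSE1 coefficient; the PULSE1 values fed
in are illustrative, not taken from the dataset, and the names are
hypothetical.

```python
# Constant and PULSE1 coefficient from the reduced model.
INTERCEPT = 16.352
SLOPE = 0.77275


def predict_rest(pulse1: float) -> float:
    """Predicted PULSE2 for a subject with RAN=0 (no running)."""
    return INTERCEPT + SLOPE * pulse1


# PULSE1 value at which the model predicts PULSE2 == PULSE1.
fixed_point = INTERCEPT / (1.0 - SLOPE)
print(f"predicted PULSE2 equals PULSE1 at about {fixed_point:.1f} bpm")

# Illustrative first measurements (not from the data).
for pulse1 in (60, 72, 90):
    pulse2 = predict_rest(pulse1)
    print(f"PULSE1 = {pulse1:3d}  ->  predicted PULSE2 = {pulse2:5.1f}")
```

Low first readings are predicted to rise a little and high ones to fall,
which is the pattern a regression effect would produce with a positive
constant and a slope below 1.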

Plausibility is a sign of a good model.

Of course, one may wonder whether all running subjects ran at the same pace.
Perhaps females ran slower than males.
Perhaps smokers ran slower than non-smokers.
Such a lack of control could produce the same observed results but call
for a totally different interpretation.

Milo
==================================================


Rich Ulrich wrote in message ...
>Here is a bit of Don's example, and then a closing comment about what
>I had said, about interpreting coefficients.
>
>On 10 Jan 2000 00:20:59 -0800, [EMAIL PROTECTED] (Donald F.
>Burrill) wrote:
>
> < snip, much example >
>
>> the coefficients b3 and b4 are indistinguishable from zero.  Dropping
>> SEX and SMOKES from the model, we then have
>>
>>   PULSE2 = a + b1*PULSE1 + b2*RAN + b5*SEX*RAN + b6*SMOKES*RAN + error
>>

