Re: VIF for dichotomous variable [3]

Donald Burrill Thu, 29 May 2003 02:27:03 -0700

On Wed, 28 May 2003, Scheltema, Karen wrote:

> I know about the perils of stepwise, and I agree with you that it is
> a less than desirable procedure.  This researcher, however, is not
> as convinced as I am about not doing stepwise.  Sigh.  He has more
> variables than would comfortably fit a 5-1 case to variable ratio
> for a forced entry regression, which is why he was hoping stepwise
> would help him narrow his model.  Any suggestions I can give him,
> short of telling him to scrap everything?


Hi, Karen.
 What, if any, <intelligence> was he applying to the problem of selecting
variables?  For example, when a variable is dropped, it will have been
dropped because its partial correlation with the dependent residual at
that point is smaller than the largest competing partial correlation.
This can sound like a reasonable basis for discarding the variable.
However, if the two partial correlations in question are, say, 0.5345
and 0.5346, one might prefer to decide between THOSE two variables on a
basis other than size of partial r.
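
A minimal sketch of that kind of check, in Python, on made-up data: at
one step, compute each candidate's partial correlation with the
dependent variable given the variables already entered, rank them, and
flag adjacent candidates whose values are nearly tied.  The function
names, the synthetic data, and the gap of 0.05 are assumptions for
illustration only, not anything the stepwise routine in a particular
package reports.

```python
import math
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residualize(target, predictors):
    """Residuals of target after least-squares regression (with intercept)
    on predictors, computed by Gram-Schmidt on the centered columns."""
    basis = []
    for p in predictors:
        q = center(p)
        for b in basis:
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        if dot(q, q) > 1e-12:          # skip numerically redundant columns
            basis.append(q)
    r = center(target)
    for b in basis:
        coef = dot(r, b) / dot(b, b)
        r = [ri - coef * bi for ri, bi in zip(r, b)]
    return r

def partial_corr(y, x, given):
    """Correlation of y and x after removing the columns in `given` from both."""
    ry, rx = residualize(y, given), residualize(x, given)
    return dot(ry, rx) / math.sqrt(dot(ry, ry) * dot(rx, rx))

def rank_candidates(y, candidates, entered):
    """candidates: dict of name -> column; entered: columns already in the model."""
    scored = [(name, partial_corr(y, col, entered))
              for name, col in candidates.items()]
    return sorted(scored, key=lambda t: abs(t[1]), reverse=True)

def near_ties(ranked, gap=0.01):
    """Adjacent pairs in the ranking whose |partial r| differ by less than gap."""
    return [(a[0], b[0]) for a, b in zip(ranked, ranked[1:])
            if abs(abs(a[1]) - abs(b[1])) < gap]

# Synthetic example: z is already in the model; a and b compete to enter.
rng = random.Random(0)
n = 200
z = [rng.gauss(0, 1) for _ in range(n)]
a = [rng.gauss(0, 1) for _ in range(n)]
b = [rng.gauss(0, 1) for _ in range(n)]
y = [zi + ai + bi + rng.gauss(0, 1) for zi, ai, bi in zip(z, a, b)]

ranked = rank_candidates(y, {"a": a, "b": b}, entered=[z])
for name, r in ranked:
    print(name, round(r, 4))
print("near ties:", near_ties(ranked, gap=0.05))
```

Any pair that shows up as a near tie is a place to choose between the
variables on substantive grounds rather than on the fourth decimal.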

For reasons like this, if one is going to do stepwise at all, one should
have a close look at each step and the decision made therein;  and
should probably run several different stepwise regressions, with
different starting points (e.g., one with all variables in [backwards]
as described;  one with no variables in [forwards] as is more usual;
several with different subsets of the candidate predictors in the
equation initially).  This will give at least some idea of how heavily
the results in the first instance were, as they say, "capitalizing on
chance", and how stable those results may be thought to be.  And it may
be clearer where one would REALLY like to apply some intelligence (as
distinct from computational crank-turning) to the process.
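
As a rough sketch of the "different starting points" idea, the Python
below runs a simplified selection from several starting subsets and
prints the final model from each.  It is deliberately not a full
stepwise routine: it is forward-only, never removes a variable, and uses
a hypothetical cutoff on |partial r| (0.3) in place of the F-to-enter
test a statistics package would use.  The data and all names are
invented for illustration.

```python
import math
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residualize(target, predictors):
    """Residuals of target after least-squares regression (with intercept)."""
    basis = []
    for p in predictors:
        q = center(p)
        for b in basis:
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        if dot(q, q) > 1e-12:          # skip numerically redundant columns
            basis.append(q)
    r = center(target)
    for b in basis:
        coef = dot(r, b) / dot(b, b)
        r = [ri - coef * bi for ri, bi in zip(r, b)]
    return r

def partial_corr(y, x, given):
    ry, rx = residualize(y, given), residualize(x, given)
    denom = math.sqrt(dot(ry, ry) * dot(rx, rx))
    return dot(ry, rx) / denom if denom > 0 else 0.0

def forward_select(y, X, start, enter=0.3):
    """From the column indices in `start`, keep adding the candidate with the
    largest |partial r| until none reaches `enter`.  Returns (selected, log),
    where log holds the ranked candidates at every step for inspection."""
    selected, log = list(start), []
    while True:
        given = [X[j] for j in selected]
        scored = sorted(((abs(partial_corr(y, X[j], given)), j)
                         for j in range(len(X)) if j not in selected),
                        reverse=True)
        log.append(scored)
        if not scored or scored[0][0] < enter:
            return selected, log
        selected.append(scored[0][1])

# Synthetic data: six candidates; x1 is a noisy near-copy of x0.
rng = random.Random(3)
n = 100
X = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(6)]
X[1] = [a + 0.3 * rng.gauss(0, 1) for a in X[0]]
y = [2 * a + c + rng.gauss(0, 1) for a, c in zip(X[0], X[2])]

starts = [[], [1], [3, 4]]
results = {tuple(s): forward_select(y, X, s)[0] for s in starts}
for s, sel in results.items():
    print("start", list(s), "-> final", sorted(sel))
```

If the final sets differ from one start to the next, that is some
evidence the first result was not very stable, and the per-step log
shows where the close calls were made.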

As Mike Babyak pointed out, orthogonalizing at least the interaction
terms with respect to their lower-order components would be sensible.
(And if you want to see some REALLY large VIFs, look at my Minitab white
paper.  Initially Minitab would not accept all fifteen predictors and
insisted on throwing one of them out, because the VIFs were so high (or,
equivalently, the tolerances were so low).  I had to reset the tolerance
threshold to something quite ridiculously low just to get all of them
in, when they (and their products) were in their original form.)
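
To make the VIF/tolerance relation concrete: for predictor j, tolerance
is 1 - R^2 from regressing it on the other predictors, and VIF is
1/tolerance.  The Python sketch below uses two invented predictors with
non-zero means and their raw product (not the fifteen predictors of the
white paper), then replaces the product with its residual after
regression on the two lower-order terms.  Names and data are
assumptions for illustration.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residualize(target, predictors):
    """Residuals of target after least-squares regression (with intercept)."""
    basis = []
    for p in predictors:
        q = center(p)
        for b in basis:
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        if dot(q, q) > 1e-12:          # skip numerically redundant columns
            basis.append(q)
    r = center(target)
    for b in basis:
        coef = dot(r, b) / dot(b, b)
        r = [ri - coef * bi for ri, bi in zip(r, b)]
    return r

def vif(j, columns):
    """VIF of column j: total sum of squares over the residual sum of squares
    from regressing it on the other columns, i.e. 1 / (1 - R^2)."""
    others = [c for i, c in enumerate(columns) if i != j]
    res = residualize(columns[j], others)
    tot = center(columns[j])
    return dot(tot, tot) / dot(res, res)

rng = random.Random(1)
n = 200
x1 = [rng.uniform(10, 20) for _ in range(n)]
x2 = [rng.uniform(10, 20) for _ in range(n)]
prod = [a * b for a, b in zip(x1, x2)]          # raw interaction term
prod_orth = residualize(prod, [x1, x2])         # orthogonalized interaction

vif_raw = [vif(j, [x1, x2, prod]) for j in range(3)]
vif_orth = [vif(j, [x1, x2, prod_orth]) for j in range(3)]
print("raw product:           ", [round(v, 2) for v in vif_raw])
print("orthogonalized product:", [round(v, 2) for v in vif_orth])
```

The orthogonalized product is uncorrelated with both lower-order terms
by construction, so its VIF is 1 up to rounding error.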

In addition, if the investigator has a way of ranking, or rating, the
candidate predictors in importance (theoretical or other), it may be
informative to take them in order (most important first) and
orthogonalize each of them with respect to all the preceding ones.
(Whether to take them one at a time or (perhaps occasionally) several in
a bunch I cannot advise you, not knowing the theoretical and substantive
context.)
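
Taken one at a time, that procedure amounts to Gram-Schmidt
orthogonalization of the centered predictors in order of importance.  A
small Python sketch on invented data follows; the function name and the
three correlated predictors are assumptions for illustration, and the
"several in a bunch" variant is not shown.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthogonalize_in_order(columns):
    """columns: predictors listed most important first.  Each output column
    is the residual of the original on all the columns that precede it."""
    out = []
    for col in columns:
        q = center(col)
        for b in out:
            if dot(b, b) > 1e-12:      # ignore a column that vanished entirely
                coef = dot(q, b) / dot(b, b)
                q = [qi - coef * bi for qi, bi in zip(q, b)]
        out.append(q)
    return out

# Three correlated predictors, ranked x1 > x2 > x3 in importance.
rng = random.Random(2)
n = 100
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + rng.gauss(0, 0.6) for a in x1]
x3 = [0.5 * a - 0.5 * b + rng.gauss(0, 0.7) for a, b in zip(x1, x2)]

ordered = orthogonalize_in_order([x1, x2, x3])
for i in range(3):
    for j in range(i + 1, 3):
        print("cross-product of columns", i, "and", j, "=",
              round(dot(ordered[i], ordered[j]), 8))
```

The first variable keeps all the variance it shares with the later
ones, so the result depends on the ordering, which is exactly why the
ranking should come from theory rather than from the data.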

Hope this has been helpful.  Good luck!    -- Don.
 -----------------------------------------------------------------------
 Donald F. Burrill                                         [EMAIL PROTECTED]
 56 Sebbins Pond Drive, Bedford, NH 03110                 (603) 626-0816


