On Tue, 30 May 2000, Dale Glaser wrote:
> Karen..off the top of my head, the VIF is the inverse of tolerance,
> hence, if tolerance = (1 - r^2j), then VIF = 1/(1-r^2j)..
Yes, Dale is correct.
> ... r^2j would be the percentage of variation accounted for by the
> predictors in predicting the other predictor.. e.g., the linear
> combination of x1 and x2 in predicting x3;
> anyway, as with any cutoff value there can be an element of
> arbitrariness,
Indeed.
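
A minimal sketch of the definition Dale gives above: regress each predictor
on the others, take that R^2, and form the tolerance and the VIF from it. The
data and the names (vif, x1, x2, x3) are invented for illustration, and NumPy
is assumed to be available; this is not any particular package's routine.

```python
import numpy as np

def vif(X):
    """VIF of each column of X (rows = cases, columns = predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    result = []
    for j in range(p):
        y = X[:, j]                                # predictor j as the response
        others = np.delete(X, j, axis=1)           # the remaining predictors
        A = np.column_stack([np.ones(n), others])  # add an intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        tolerance = ss_res / ss_tot                # 1 - R^2_j
        result.append(1.0 / tolerance)             # VIF_j = 1 / (1 - R^2_j)
    return np.array(result)

# Hypothetical data: x3 is nearly a linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + 0.3 * rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 1))
```

Note that x1 and x2 are themselves nearly uncorrelated here, yet all three
VIFs come out large, because each predictor is well predicted by the other
two taken together.
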
> though some have registered concern if VIF > 10.0; my personal opinion
> is that the aforementioned cutoff value is way too liberal;
I agree, if one is using the idea of "cutoff"; though possibly I am
thinking of "conservative" rather than "liberal", since I have seen (and
dealt with) VIFs exceeding several hundreds. They don't frighten me
particularly, partly because by orthogonalizing they can be reduced to
manageable levels. Even partly orthogonalizing can reduce VIFs to values
like 2 or 1.5, at least in some circumstances.
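
One situation where this happens, assumed here purely for illustration (it is
not necessarily the construction in any particular data set), is a predictor
x entered together with its square. The sketch below uses made-up values
x = 1, ..., 10 and hypothetical helper names. With only two predictors, both
share the VIF 1/(1 - r^2), where r is their simple correlation.

```python
from math import fsum

def vif_two_predictors(a, b):
    """VIF shared by two predictors: 1 / (1 - r^2), r = Pearson correlation."""
    n = len(a)
    mean_a, mean_b = fsum(a) / n, fsum(b) / n
    s_ab = fsum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = fsum((u - mean_a) ** 2 for u in a)
    s_bb = fsum((v - mean_b) ** 2 for v in b)
    r2 = s_ab * s_ab / (s_aa * s_bb)
    return 1.0 / (1.0 - r2)

x = list(range(1, 11))  # made-up predictor values 1..10 (mean 5.5)

def vif_after_shift(shift):
    """Subtract `shift` from x, then pair the shifted x with its square."""
    xs = [value - shift for value in x]
    return vif_two_predictors(xs, [value * value for value in xs])

raw = vif_after_shift(0)         # x and x^2 as constructed
partial = vif_after_shift(4)     # shift by a rough guess near the mean
centered = vif_after_shift(5.5)  # shift by the exact mean

print(f"raw: {raw:.2f}  partial: {partial:.2f}  centered: {centered:.2f}")
```

Shifting by the exact mean makes x and its square uncorrelated for these
symmetric values; shifting by a value merely near the mean is the "partly
orthogonalizing" case, and already brings the VIF down from about 20 to
between 2 and 3.
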
> for VIF to equal 10.0 then 1/(1 - .9) entails a multiple R of
> .9486!!!; for me it is a stretch to conceive that collinearity only
> becomes problematic when R = .9486...I'll be interested to see what
> others think............
Strictly speaking, "multicollinearity" implies R = 1.000, I believe.
(I don't know why Dale calculates R; the effective information is that
[with VIF = 10] R^2 = 0.9, and 10% of the original variance in the
predictor remains unaccounted for. As one of our colleagues (Rich
Ulrich, I think) recently remarked in another context, with R^2 values
this large one may often usefully consider their complementary values
(1 - R^2).)
Most computer regression programs have a control based on tolerance (the
reciprocal of VIF); I believe Minitab's default tolerance threshold is
around 0.0001 or 0.0002, implying VIFs of 10,000 or 5,000 respectively.
This of course is not to be taken as an indication of "good practice",
but of where the systems analysts thought the algorithm was in danger of
breaking down: "severe multicollinearity" indeed.
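
The arithmetic connecting these quantities can be sketched as below. The
function names are hypothetical; the only relations used are
tolerance = 1 - R^2 and VIF = 1/tolerance, as stated earlier in the thread.

```python
from math import sqrt

def from_vif(vif):
    """Return (tolerance, R^2, R) implied by a given VIF."""
    tolerance = 1.0 / vif     # tolerance is the reciprocal of VIF
    r_squared = 1.0 - tolerance
    return tolerance, r_squared, sqrt(r_squared)

def vif_from_tolerance(tolerance):
    """Return the VIF implied by a tolerance threshold."""
    return 1.0 / tolerance

tol, r2, r = from_vif(10.0)
print(f"VIF = 10: tolerance = {tol:.4f}, R^2 = {r2:.4f}, R = {r:.4f}")

# The two tolerance thresholds mentioned above.
for t in (0.0001, 0.0002):
    print(f"tolerance = {t}: VIF = {vif_from_tolerance(t):.0f}")
```

Reading the first line of output in terms of the complement, as suggested
above: a VIF of 10 means one tenth of the predictor's variance is not shared
with the other predictors.
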
But a lurking question, as my earlier post may have suggested, is
whether the multicollinearity apparently present is inherent in the
nature of the variables, or an artifact of variable construction.
The latter was the case in the problem addressed in the paper on the
Minitab web site.
Karen's original question was:
> What is the usual cutoff for saying the VIF is too high?
Depends on the purpose for which you think you want a "cutoff", and
whether you propose to implement it blindly and without further thought,
or as a (very!) rough guideline regarding where the currents (and perhaps
the undertow) may be dangerous and REQUIRE further thought; just for two
examples.
-- Don.
------------------------------------------------------------------------
Donald F. Burrill
348 Hyde Hall, Plymouth State College,
MSC #29, Plymouth, NH 03264 603-535-2597
184 Nashua Road, Bedford, NH 03110 603-471-7128