On 24 Feb 2000, Victor Aina wrote:

> I was wondering if anyone might have an opinion
> about the impact of subtracting a constant e.g.
> the mean of a variable from the regressor that
> happens to be collinear with another one.

Depends on the variables and their interrelationships.  If (say) X1 is a 
variable ranging between 50 and 90, and X2 = X1**2, X2 will appear to be 
collinear with X1 because the values are so far out on the parabola that 
the curve is virtually indistinguishable from a straight line.  (With 
only finite precision in the data, you can omit the "virtually" if the 
range is sufficiently far from zero:  550 to 590, or 7050 to 7090, or 
... .)  If you subtract the mean from X1, and then construct X2 as 
 (X1 - mean)**2, the variables no longer appear collinear;  indeed, if X1 
is distributed symmetrically about its mean, the correlation between X1 
and this new X2 is exactly zero.
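
        A quick sketch of this point in Python (my illustration;  the 
range is the one above, the variable names are mine):

  import numpy as np

  x1 = np.linspace(50, 90, 41)            # X1 well away from zero
  x2 = x1 ** 2                            # X2 = X1 squared
  print(np.corrcoef(x1, x2)[0, 1])        # about 0.9996 -- looks collinear

  xc = x1 - x1.mean()                     # center X1 first ...
  print(np.corrcoef(xc, xc ** 2)[0, 1])   # ... then square:  essentially 0
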
        More generally, one can always derive from X2 the part of it that 
is orthogonal to X1;  and the part of X3 that is orthogonal to X1 and 
X2;  and so on.  For these variables the variance inflation factors are 
all unity;  equivalently, the tolerances (the reciprocals of the 
variance inflation factors) are all unity.
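
        A sketch of that residualizing step (again my own illustration, 
not any package's built-in routine):

  import numpy as np

  x1 = np.linspace(50, 90, 41)
  x2 = x1 ** 2
  # part of X2 orthogonal to X1:  residuals from regressing X2 on (1, X1)
  X = np.column_stack([np.ones_like(x1), x1])
  coef = np.linalg.lstsq(X, x2, rcond=None)[0]
  x2_orth = x2 - X @ coef
  print(np.corrcoef(x1, x2_orth)[0, 1])   # essentially 0, so its VIF is 1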

> A little algebra demonstrates that for least
> squares regression, only the intercept term
> changes when a constant is subtracted from a variable.
> The other slope coefficients remain unchanged.

The precision of the coefficients may be changed, however.  
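
        One can see both points at once with a small simulation (a 
sketch;  the data and coefficients here are made up):

  import numpy as np

  rng = np.random.default_rng(0)
  x1 = np.linspace(50, 90, 41)
  y = 3 + 0.5 * x1 + 0.01 * x1 ** 2 + rng.normal(0, 1, x1.size)

  def fit(*cols):                       # OLS with intercept:  (b, se)
      X = np.column_stack([np.ones(len(y))] + list(cols))
      b = np.linalg.lstsq(X, y, rcond=None)[0]
      resid = y - X @ b
      s2 = resid @ resid / (len(y) - X.shape[1])
      return b, np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))

  xc = x1 - x1.mean()
  b0, se0 = fit(x1, x1 ** 2)    # raw X1 and X1**2
  b1, se1 = fit(xc, x1 ** 2)    # shift X1 only:  slopes and slope SEs
                                # identical;  only the intercept moves
  b2, se2 = fit(xc, xc ** 2)    # rebuild X2 from the centered X1:  same
                                # fitted values, far smaller SE on the
                                # linear term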

> It will be nice to know how prediction/forecasting
> is affected. Surely the condition index falls.
> Is collinearity masked in some way? 

Depends on whether it's "real" or "spurious" collinearity.  (The apparent 
collinearity between X and X**2 when the range of X is far from zero is 
what I call "spurious collinearity", arising not from the relationship 
implied by squaring but from the restriction of range.)

> Are the coefficents more efficient (in terms of variance) after
> subtracting the mean? 

Again, it depends on what else is going on.  If you first constructed X2 
as the square of an X1 suffering from the kind of restricted range 
described above, and then subtracted its mean from X1 without modifying 
X2 (or even if you then subtracted its mean from X2), the (X1, X2) 
correlation will not change, the apparent collinearity will still be 
present, and the coefficients will have the same precision (or 
imprecision) as for the unmodified variables.  Only subtracting the mean 
from X1 (or performing some other linear transformation on X1) before 
constructing other predictors that are logically dependent on X1 will 
have any useful effect.
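
        In code, the order of operations is the whole story (same 
made-up X1 as in the sketches above):

  import numpy as np

  x1 = np.linspace(50, 90, 41)
  x2 = x1 ** 2                           # built from the raw X1
  xc = x1 - x1.mean()

  print(np.corrcoef(x1, x2)[0, 1])       # about 0.9996
  print(np.corrcoef(xc, x2)[0, 1])       # identical:  correlation is
                                         # unchanged by shifting either
                                         # variable by a constant
  print(np.corrcoef(xc, xc ** 2)[0, 1])  # near 0 only when X2 is rebuilt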
 
> What happens if we have a nonlinear regression model
> such as logistic regression etc.

Well, in logistic regression (so far as I understand it -- I've not been 
a practitioner of it) the nonlinearity lies in the dependent variable, 
not in the predictors;  so I should think the above comments, which have 
only to do with (apparent) collinearity among the predictors, would be
applicable in this case as well. 
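
        Since collinearity is a property of the design matrix alone, a 
diagnostic such as the condition index carries over to logistic 
regression unchanged.  A sketch (this is the raw, unscaled condition 
number;  conventions that first scale the columns give different 
magnitudes):

  import numpy as np

  x1 = np.linspace(50, 90, 41)

  def cond_number(*cols):               # ratio of extreme singular values
      X = np.column_stack([np.ones(len(cols[0]))] + list(cols))
      s = np.linalg.svd(X, compute_uv=False)
      return s[0] / s[-1]

  xc = x1 - x1.mean()
  print(cond_number(x1, x1 ** 2))       # enormous for the raw terms
  print(cond_number(xc, xc ** 2))       # orders of magnitude smaller
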
                                        -- DFB.
 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128  


