In article <002501bfb563$e8e6e8e0$[EMAIL PROTECTED]>,
  [EMAIL PROTECTED] (David A. Heiser) wrote:
>
> ----- Original Message -----
> From: Herman Rubin <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Wednesday, May 03, 2000 8:20 AM
> Subject: Re: no correlation assumption among X's in MLR
>
> > In article <[EMAIL PROTECTED]>,
> > Alan McLean <[EMAIL PROTECTED]> wrote:
> > >'No collinearity' *means* the X variables are uncorrelated!
> >
> > >The basic OLS method assumes the variables are uncorrelated (as you
> > >say). In practice there is usually some correlation, but the estimates
> > >are reasonably robust to this. If there is *substantial* collinearity
> > >you are in trouble.
> >
> > The basic OLS method assumes NOTHING about the correlation
> > of the X variables, as long as there is no linear combination
> > which is constant.  Polynomial regression almost always has
> > the X variables correlated.
> >
> > If, for example, X_1 and X_2 are the "independent" variables,
> > and X_2 is replaced by X_2 + X_1, the coefficients would be
> > different, but the regression equation would be the same.
> .....................................................................
> Herman, you are not entirely right.
>
> The problem with collinearity is the effect it has on the computations.
> If I had an infinite computer that did the OLS with numbers having an
> infinite number of digits, you would be right.
>
> On real computers using black box computer programs, each computer
> program will give a different number, depending on whether it uses
> single, double or quad precision, on the peculiarities of the algorithms
> in the programs (there are very many ways to do OLS), on the rounding
> used to present the answer, and on which chip set is being used (Intel,
> Sun, HP, ...; they all have different internal representations of
> floating-point numbers).

Of course Herman is right (as usual)! Where are people getting this
ridiculous idea that correlation and collinearity are the same thing?

Assuming you're using an intercept, a pair of variables is
collinear if and only if their correlation is 1.0 or -1.0.
Three or more variables are collinear if and only if at least
one of them has a multiple correlation of 1.0 with the others.
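
To make the distinction concrete, here is a minimal sketch in Python with
NumPy (my choice of tool for illustration; the data, the seed, and the
helper name design() are all made up). It builds a polynomial design,
where x and x^2 are strongly correlated yet the design matrix keeps full
column rank, and then an exactly collinear design, where the correlation
is 1.0 and the rank drops.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, illustration only
x = rng.uniform(1.0, 2.0, size=50)      # made-up regressor values


def design(*cols):
    """Hypothetical helper: stack an intercept column and the regressors."""
    return np.column_stack([np.ones(len(cols[0])), *cols])


# Polynomial regression: x and x**2 are highly correlated, not collinear.
X_poly = design(x, x**2)
corr_poly = np.corrcoef(x, x**2)[0, 1]
rank_poly = np.linalg.matrix_rank(X_poly)

# Exact collinearity: z is a linear function of x (and the intercept).
z = 2.0 * x + 3.0
X_col = design(x, z)
corr_col = np.corrcoef(x, z)[0, 1]
rank_col = np.linalg.matrix_rank(X_col)

print("polynomial design: corr =", corr_poly, "rank =", rank_poly)
print("collinear design:  corr =", corr_col, "rank =", rank_col)
```

In the first design the correlation is high but below 1, and OLS has a
unique solution; only in the second does the design matrix lose rank.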

If the independent variables in a multiple linear regression are
collinear, there are infinitely many sets of least-squares
regression coefficients that produce the same predictions, MSE,
R-squared, etc.  Although least squares does not produce unique
estimates, if you have prior information, you may be able to get
meaningful and useful Bayesian estimates. Regardless of whether you
have prior information, you can get useful predictions for new
cases lying in the same subspace as the original sample. Without
prior information, you cannot get useful extrapolations outside of
that subspace.  Statisticians who are not data miners sometimes
forget the distinction between estimation and prediction. :-)
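
The estimation-versus-prediction point can be sketched numerically as
follows (Python with NumPy, chosen only for illustration; the simulated
data and the particular null-space vector are assumptions of the sketch).
np.linalg.lstsq returns the minimum-norm least-squares solution for a
rank-deficient design, which is the Moore-Penrose solution. Adding any
multiple of a null-space vector gives another least-squares solution with
identical fitted values.

```python
import numpy as np

rng = np.random.default_rng(1)          # arbitrary seed, illustration only
n = 40
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + 3.0                     # exactly collinear with intercept and x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 0.5 * x1 + rng.normal(scale=0.1, size=n)

# Minimum-norm least-squares solution; lstsq also reports the numerical rank.
b_min, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# X @ null = 3 + 2*x1 - x2 = 0, so b_min + c*null fits the sample equally well.
null = np.array([3.0, 2.0, -1.0])
b_alt = b_min + 5.0 * null

fit_min = X @ b_min
fit_alt = X @ b_alt

# A new case inside the sample subspace (x2 = 2*x1 + 3) and one outside it.
new_in = np.array([1.0, 0.7, 2.0 * 0.7 + 3.0])
new_out = np.array([1.0, 0.7, 0.0])

print("rank:", rank)
print("max fitted-value difference:", np.max(np.abs(fit_min - fit_alt)))
print("in-subspace predictions: ", new_in @ b_min, new_in @ b_alt)
print("out-of-subspace predictions:", new_out @ b_min, new_out @ b_alt)
```

The two coefficient vectors are very different, but they agree on the
sample and on the new case that respects x2 = 2*x1 + 3. They disagree on
the case outside that subspace, which is where prior information would be
needed.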

Collinearity generally will NOT cause different machines or
different programs to get different answers provided the same
algorithm is used and the programs are written competently.
For example, two different programs running on different machines
will get the same estimates (except perhaps for the last few bits)
provided both use a Moore-Penrose inverse, or both use a G1 inverse,
and both do singularity checks correctly. If the numerous SAS procs
for regression started disagreeing with each other, or started
getting different answers on different machines just because of
collinearity, our QA people would totally freak out!

Where you run into trouble is with "multicollinearity." This is a
confusing term, but what it means is "almost but not quite collinear."
Multicollinearity implies numerical ill-conditioning, which means
that small changes in the data can produce large changes in the
parameter estimates. Singularity tests can be quite delicate, and
if two different programs make different decisions about the rank
of the design matrix, then they can get radically different estimates.
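
A rough sketch of the ill-conditioning (Python with NumPy, for
illustration only; the noise scales 1e-6 and 1e-4 and the seed are
arbitrary assumptions). Two regressors are made almost, but not quite,
collinear, and the response is then perturbed slightly. The fitted values
barely move while the coefficients can move a great deal.

```python
import numpy as np

rng = np.random.default_rng(2)          # arbitrary seed, illustration only
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)     # almost, but not quite, collinear
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + x1 + x2 + 0.01 * rng.normal(size=n)

# A large condition number signals numerical ill-conditioning.
cond = np.linalg.cond(X)

# Fit once, then refit after a small perturbation of the response.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_pert = y + 1e-4 * rng.normal(size=n)
b_pert, *_ = np.linalg.lstsq(X, y_pert, rcond=None)

coef_change = np.max(np.abs(b_pert - b))
fit_change = np.max(np.abs(X @ b_pert - X @ b))

print("condition number:", cond)
print("largest coefficient change: ", coef_change)
print("largest fitted-value change:", fit_change)
```

A program whose singularity test treats this design as rank 2 would drop
or constrain one of the regressors and report quite different
coefficients from one that treats it as rank 3.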



