> > People tend to look at violation of assumptions as a bad thing, but you could also look at it as an opportunity.
This is a good way to look at it.

> In regression, you fit your model and hope that the residuals look like white noise. When they do look like white noise, you know that you have extracted the maximum amount of information from your data, and that there are no systematic trends, patterns, or correlations among the part of the data that the model does not predict.

Correct. I think everyone agrees with this.

> When the residuals do not look like white noise--because there is a non-linear trend, because there is heterogeneity in the variance, or because there is autocorrelation--then there is information in the residuals that you can extract to produce an even better model.

Again, this is a good way to look at it. It is certainly a better attitude than being disappointed to find a lack of randomness in the residuals.

> So why wouldn't you want to improve your regression model? Well, maybe because you need to keep things simple, or maybe you don't have access to the right software, or maybe you are under serious deadline pressures.

In my case, I don't have any reason for not improving the regression model. At this point, I just don't think autoregression is an avenue to the best model; it is more an avenue to a mediocre model, because I believe we already have the correct causal variables to work with.

> [snip]

> Finally, you allude to the fact that you have thousands of independent variables. If this is true, then you may have other problems that are far worse than autocorrelation. The traditional regression methods that work well for a few dozen independent variables will fail miserably when you have thousands of independent variables. I'm working a lot now with microarray data, where scientists can measure the expression of thousands of genes on a single slide or chip.
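Coming back to the white-noise point for a moment: that check is easy to automate. Here is a minimal sketch of my own (not something from Steve's post; the data and the 0.6 autocorrelation parameter are made up for the example) that simulates a regression with AR(1) errors, fits ordinary least squares with numpy, and then computes the lag-1 residual autocorrelation along with the Durbin-Watson statistic.

```python
# Illustration only: simulate a regression whose errors follow an AR(1)
# process, fit ordinary least squares, and test the residuals for
# autocorrelation.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)

# AR(1) errors: e[t] = 0.6 * e[t-1] + noise (0.6 is an arbitrary choice)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = 2.0 + 3.0 * x + e

# Ordinary least squares fit
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Lag-1 autocorrelation of the residuals (near 0 for white noise)
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]

# Durbin-Watson statistic: near 2 means no autocorrelation; values
# well below 2 indicate positive autocorrelation (roughly 2 * (1 - r1))
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print(f"lag-1 autocorrelation: {r1:.2f}, Durbin-Watson: {dw:.2f}")
```

With positively autocorrelated errors like these, the Durbin-Watson statistic comes out well below 2, which is exactly the signal that there is still structure left in the residuals.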
> With these experiments, statisticians have had to develop entirely new methods because the more traditional methods (like least squares regression) fall apart.

This is absolutely the case. We have a lot of challenges; the autocorrelation is just a small one. I am using various software tools that are designed to handle thousands of independent variables, but if you care to be more specific about your statement that "statisticians have had to develop entirely new methods," I'm all ears. A general list of some of the newer methods that interest you, or that you have found useful, would be welcome. I'm trying to get up to speed on all the "newer" methods, even if they aren't directly applicable to my situation. I just finished reading an article on projection pursuit regression.

> Best of luck!
>
> Steve Simon, [EMAIL PROTECTED], Standard Disclaimer.
> The STATS web page has moved to
> http://www.childrens-mercy.org/stats.

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
http://jse.stat.ncsu.edu/
=================================================================
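P.S. For the archives, one concrete example of the "newer" methods developed for the many-predictors setting is the lasso, an L1-penalized least-squares method that can handle more predictors than observations by shrinking most coefficients exactly to zero. The sketch below is my own illustration, not anything from Steve's post: the data are simulated, the penalty value is arbitrary, and the fit uses a simple coordinate-descent loop in plain numpy.

```python
# Illustration only: lasso regression fitted by coordinate descent on
# simulated data with more predictors (p) than observations (n).
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t, setting it to exactly zero if |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso(X, y, lam, n_iter=100):
    """Minimize 0.5 * ||y - X b||^2 + n * lam * ||b||_1 by cyclic
    coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with predictor j removed from the fit
            r = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r, n * lam) / col_sq[j]
    return beta

rng = np.random.default_rng(1)
n, p = 50, 200                    # more predictors than observations
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:3] = [4.0, -3.0, 2.0]  # only three predictors actually matter
y = X @ true_beta + rng.normal(scale=0.5, size=n)

beta_hat = lasso(X, y, lam=0.5)   # lam chosen arbitrarily for the demo
print("nonzero coefficients:", np.count_nonzero(beta_hat))
```

Ordinary least squares is not even well defined here (the normal equations are singular when p > n), but the L1 penalty picks out a sparse fit that recovers the three real predictors, which is why methods of this family became popular for microarray-style data.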
