Also consider the redun function in the Hmisc package. It does not use the response variable; instead it uses flexible nonlinear additive models to predict each predictor variable from all the others, removing redundant predictors stepwise in a formal redundancy analysis.
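
For anyone wanting to try this, here is a minimal sketch on simulated data. The variable names x1..x4 and the r2 = 0.9 cutoff are only illustrative assumptions, not a recommendation; see ?redun for the full set of arguments.

```r
# Sketch only: simulated data, made-up variable names.
library(Hmisc)

set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.05)   # almost fully determined by x1 and x2
x4 <- rnorm(n)                        # unrelated to the other predictors
d  <- data.frame(x1, x2, x3, x4)

# Each variable is predicted from all the others (no response involved);
# a variable whose R^2 reaches the r2 cutoff is dropped and the process repeats.
r <- redun(~ x1 + x2 + x3 + x4, data = d, r2 = 0.9)
print(r)

r$Out   # variables judged redundant
r$In    # variables retained
```

Since x1, x2 and x3 are tied together by a single linear relation, dropping any one of them should be enough to break the redundancy, while x4 should be retained.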

Frank


Ben Bolker wrote:
Peter Flom <peterf <at> brainscope.com> writes:

Robin Williams wrote
<<<<
Is there any facility in R to perform a stepwise process on a model,
which will remove any highly-correlated explanatory variables? I am told
there is in SPSS. I have a large number of variables (some correlated),
which I would like to just chuck in to a model and perform stepwise and
see what comes out the other end, to give me an idea perhaps as to which
variables I should focus on.
Thanks for any help / suggestions.
>>>>

Stepwise is a bad method of selecting variables. Far better methods are the LASSO
and LAR (least angle regression), available in the lars package and the
lasso2 package.
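
As a rough illustration of the LASSO / LAR route, the sketch below uses the lars package on simulated data. The data, the choice of 10-fold cross-validation, and the object names are assumptions made for the example only.

```r
# Sketch only: simulated predictors, two of which truly matter.
library(lars)

set.seed(2)
n <- 100
p <- 10
x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("x", 1:p)
y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)

# Full coefficient paths for the LASSO and for least angle regression
fit_lasso <- lars(x, y, type = "lasso")
fit_lar   <- lars(x, y, type = "lar")

# Pick the amount of shrinkage by 10-fold cross-validation.
# For the LASSO, cv.lars indexes the path by the fraction of the full L1 norm.
cv     <- cv.lars(x, y, K = 10, type = "lasso", plot.it = FALSE)
s_best <- cv$index[which.min(cv$cv)]

# Coefficients at the selected point on the path; many may be exactly zero
b <- coef(fit_lasso, s = s_best, mode = "fraction")
b
```

Calling plot(fit_lasso) shows the order in which variables enter the model, which is often more informative than a single selected subset.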

However, while both these methods are good, neither is a substitute for
substantive knowledge.
Also, the key thing is not so much whether pairs of variables are correlated,
but whether the set of variables is collinear, which is different: one variable
can be close to a linear combination of several others. If you have a great
many variables, you can have a high degree of collinearity even with no high
pairwise correlations. I've not diagnosed collinearity in R myself, but
RSiteSearch("collinearity", restrict = 'functions') yields 34 hits.
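
The point about collinearity without high pairwise correlations can be checked with a small base-R simulation. This is a sketch on made-up data: ten independent predictors plus one that is nearly their sum. The variance inflation factors are taken from the diagonal of the inverse correlation matrix, a standard identity.

```r
# Sketch only: simulated data, base R, no extra packages.
set.seed(3)
n <- 500
k <- 10
z    <- matrix(rnorm(n * k), n, k)        # k independent predictors
xsum <- rowSums(z) + rnorm(n, sd = 0.1)   # nearly an exact linear combination
X    <- cbind(z, xsum)
colnames(X) <- c(paste0("x", 1:k), "xsum")

# Largest absolute pairwise correlation: modest, roughly 0.3 by construction
cors  <- cor(X)
max_r <- max(abs(cors[upper.tri(cors)]))
max_r

# Variance inflation factors: VIF_j = 1 / (1 - R_j^2), which equals the
# j-th diagonal element of the inverse of the correlation matrix
vif <- diag(solve(cors))
round(vif, 1)

# Condition number of the standardized predictor matrix
kappa(scale(X), exact = TRUE)
```

No single pair of columns looks alarming in the correlation matrix, yet every VIF is very large, because each column is almost exactly recoverable from the other ten taken together.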

HTH

Peter


  Another suggestion would be to do PCA on the predictor variables,
and to read Frank Harrell's book _Regression Modeling Strategies_.
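
A minimal sketch of the PCA suggestion, using base R's prcomp on simulated data. The two hidden factors, the six predictor names, and the choice to keep two components are assumptions for the example; in practice the number of components needs its own justification.

```r
# Sketch only: six predictors driven by two hidden factors.
set.seed(4)
n  <- 200
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- cbind(a1 = f1 + rnorm(n, sd = 0.3),
           a2 = f1 + rnorm(n, sd = 0.3),
           a3 = f1 + rnorm(n, sd = 0.3),
           b1 = f2 + rnorm(n, sd = 0.3),
           b2 = f2 + rnorm(n, sd = 0.3),
           b3 = f2 + rnorm(n, sd = 0.3))
y <- f1 - f2 + rnorm(n)

# PCA on the predictors only, standardized so that scale does not dominate
pca <- prcomp(X, scale. = TRUE)
summary(pca)

# Share of total variance captured by the first two components
var_explained <- sum(pca$sdev[1:2]^2) / sum(pca$sdev^2)
var_explained

# Use the leading component scores as uncorrelated predictors
scores <- pca$x[, 1:2]
fit <- lm(y ~ scores)
summary(fit)
```

The component scores are uncorrelated by construction, so the collinearity problem disappears, at the cost of coefficients that refer to combinations of the original variables rather than to the variables themselves.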

   cheers
     Ben Bolker

Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
