milicic.marko wrote:
Hi R helpers,
I'm preparing a dataset to fit a logistic regression model with lrm(). I
have various continuous and discrete variables and I would like to:
1. Optimally discretize continuous variables (optimally meaning maximizing
information value, IV, for example)
This will result in effects in the model that cannot be interpreted and
will ruin the statistical inference from the lrm. It will also hurt
predictive discrimination. You seem to be allergic to continuous variables.
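
One way to keep a predictor continuous rather than binning it is to give it a flexible functional form. The sketch below fits lrm() with a restricted cubic spline via rcs() from the rms package. The simulated data, the variable names (age, sex, y), and the choice of 4 knots are assumptions made for illustration and are not taken from this thread.

```r
library(rms)  # provides lrm() and rcs()

set.seed(1)
n   <- 500
age <- rnorm(n, mean = 50, sd = 10)                    # simulated continuous predictor
sex <- factor(sample(c("f", "m"), n, replace = TRUE))  # simulated binary factor

# Simulated outcome with a nonlinear (quadratic) age effect on the logit scale
lp <- -1 + 0.004 * (age - 50)^2 + 0.5 * (sex == "m")
y  <- rbinom(n, size = 1, prob = plogis(lp))
d  <- data.frame(y, age, sex)

# Restricted cubic spline with 4 knots: age stays continuous, no cutpoints
fit <- lrm(y ~ rcs(age, 4) + sex, data = d)
print(fit)
print(anova(fit))  # Wald tests, including a test of nonlinearity in age
```

The spline spends 3 degrees of freedom on age, chosen before looking at the outcome, so the usual inference from lrm() is not distorted by a data-driven search for cutpoints.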
2. Regroup discrete variables to achieve perhaps a smaller number of
levels and better information value...
If you use the Y variable to do this the same problems will result.
Shrinkage is a better approach, or using marginal frequencies to combine
levels. See the "pre-specification of complexity" strategy in my book
Regression Modeling Strategies.
Frank
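
As a minimal sketch of combining levels by marginal frequency, the base R code below pools categories whose share of the sample falls below a threshold, without ever looking at Y. The helper name pool_rare_levels, the 5% threshold, the "OTHER" label, and the example factor are all hypothetical choices for illustration.

```r
# Hypothetical helper: pool levels whose marginal share is below min_prop.
# Only the frequencies of x are used; the outcome Y is never consulted.
pool_rare_levels <- function(x, min_prop = 0.05, other = "OTHER") {
  x    <- factor(x)
  prop <- prop.table(table(x))
  rare <- names(prop)[prop < min_prop]
  if (length(rare) < 2) return(x)  # pooling a single level changes nothing
  lev <- levels(x)
  lev[lev %in% rare] <- other
  levels(x) <- lev                 # duplicated labels merge the levels
  x
}

# Made-up factor with two sparse levels ("d" and "e", 3% each)
region <- factor(rep(c("a", "b", "c", "d", "e"),
                     times = c(180, 160, 36, 12, 12)))
print(table(region))

region2 <- pool_rare_levels(region, min_prop = 0.05)
print(table(region2))  # "d" and "e" are now a single "OTHER" level
```

For the shrinkage route, lrm() accepts a penalty argument for penalized maximum likelihood estimation; how much to penalize is a separate modeling decision.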
Please suggest if there is some package providing this or similar
functionality for discretization...
If there is no package, please suggest how to achieve this.
Many thanks, helpers.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University