Re: [R] Handle lot of variables - Regression

2009-10-15 Thread anna0102

Thank you very much! Now it works!
I'm aware that, practically speaking, it may be nonsense to do
something like this. But I have to run a Monte Carlo simulation in which I
want to determine some aspects... It's a complicated story, but a logistic
regression (stepwise or not) is the basis for my analysis and I want to try
both. 
Anna


Dieter Menne wrote:
 
 
 
 anna0102 wrote:
 
 I've got a data set (e.g. named Data) which contains a lot of variables,
 for example: s1, s2, ..., s50
 
 My first question is:
 It is possible to do this: Data$s1
 But is it also possible to do something like this: Data$s1:s50 (I've
 tried a lot of versions of those without a result)
 
 
 Use the [] notation. For example
 
 Data[, c("s1", "s2", "s3")]
 
 or even better
 
 Data[, grep("^s", names(Data), value=TRUE)]
 
 
 
 anna0102 wrote:
 
 I want to do a stepwise logistic regression. For this purpose I use the
 following procedures:
 result <- glm(...)
 step(result, direction="forward")
 
 The problem is that I have to include all my 50 variables
 (s1-s50), but I don't want to write them all out like y~s1+s2+s3+s4...
 (furthermore, it has to be implemented in a loop, so I really need it).
 
 
 Construct the formula dynamically. But please, start with only 3 or 4
 variables and check that it works. Sometimes deep inside functions things can
 go wrong with this method, requiring Ripley's game-like workarounds. See
 
 http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html
 
 
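 a = data.frame(s=1:10, s2=1:10, s4=1:10)
 # collapse the predictor names with "+" first, then prepend the response
 form = as.formula(paste("z ~", paste(grep("^s", names(a), value=TRUE), collapse=" + ")))
 # a must also contain the 0/1 response z for this call to run
 glm(form, data=a, family=binomial)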
 
 And be aware of the nonsense you can (replace "can" by "will certainly") get with
 stepwise regression and so many parameters. If I were to be treated by a
 cure created by stepwise regression, I would prefer voodoo.
 
 Search for "Harrell stepwise" and read Frank's well-justified soapboxes.
 
 Dieter
 
 

-- 
View this message in context: 
http://www.nabble.com/Handle-lot-of-variables---Regression-tp25889056p25903765.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Handle lot of variables - Regression

2009-10-14 Thread anna0102

Hey,

I've got a data set (e.g. named Data) which contains a lot of variables, for
example: s1, s2, ..., s50

My first question is:
It is possible to do this: Data$s1
But is it also possible to do something like this: Data$s1:s50 (I've tried a
lot of versions of those without a result)

My second question:
I want to do a stepwise logistic regression. For this purpose I use the
following procedures:
result <- glm(...)
step(result, direction="forward")

The problem is that I have to include all my 50 variables
(s1-s50), but I don't want to write them all out like y~s1+s2+s3+s4...
(furthermore, it has to be implemented in a loop, so I really need it).
I've tried to store the 50 variables in a list (e.g. list[[1]]) and tried
this:
result <- glm(y ~ list[[1]], ...)
This works! But if I try to do it stepwise
result2 <- step(result)
I always get the same results as from glm without a stepwise approach. So
obviously R can't handle this if you put a list in.
How can I make this work?

Thanks in advance,
Anna

-- 
View this message in context: 
http://www.nabble.com/Handle-lot-of-variables---Regression-tp25889056p25889056.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Handle lot of variables - Regression

2009-10-14 Thread Dieter Menne



anna0102 wrote:
 
 I've got a data set (e.g. named Data) which contains a lot of variables,
 for example: s1, s2, ..., s50
 
 My first question is:
 It is possible to do this: Data$s1
 But is it also possible to do something like this: Data$s1:s50 (I've tried
 a lot of versions of those without a result)
 
 
Use the [] notation. For example

Data[, c("s1", "s2", "s3")]

or even better

Data[, grep("^s", names(Data), value=TRUE)]
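
A minimal, self-contained sketch of the two selections above. The data frame is made up for illustration (a response y plus predictors s1..s50), and the stricter pattern "^s[0-9]+$" is an assumption about how the columns are named, so adapt it to the real data:

```r
# Hypothetical data: a 0/1 response y plus 50 predictors s1..s50
set.seed(1)
Data <- data.frame(y = rbinom(20, 1, 0.5),
                   matrix(rnorm(20 * 50), nrow = 20,
                          dimnames = list(NULL, paste0("s", 1:50))))

# Select a few columns by explicit name
first3 <- Data[, c("s1", "s2", "s3")]

# Select every column whose name is "s" followed only by digits
s_cols <- grep("^s[0-9]+$", names(Data), value = TRUE)
preds  <- Data[, s_cols]

dim(first3)
dim(preds)
```

Anchoring the pattern with ^ and $ keeps it from picking up other columns whose names merely contain an "s".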



anna0102 wrote:
 
 I want to do a stepwise logistic regression. For this purpose I use the
 following procedures:
 result <- glm(...)
 step(result, direction="forward")
 
 The problem is that I have to include all my 50 variables
 (s1-s50), but I don't want to write them all out like y~s1+s2+s3+s4...
 (furthermore, it has to be implemented in a loop, so I really need it).
 

Construct the formula dynamically. But please, start with only 3 or 4
variables and check that it works. Sometimes deep inside functions things can go
wrong with this method, requiring Ripley's game-like workarounds. See

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html


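a = data.frame(s=1:10, s2=1:10, s4=1:10)
# collapse the predictor names with "+" first, then prepend the response
form = as.formula(paste("z ~", paste(grep("^s", names(a), value=TRUE), collapse=" + ")))
# a must also contain the 0/1 response z for this call to run
glm(form, data=a, family=binomial)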
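
As a hedged illustration of the whole workflow, here is a small runnable sketch. The data, the response name z and the four predictors are invented for the example; reformulate() is used instead of paste() as one possible way to build the formula. Note that step() with direction="forward" has nothing to add unless scope names a larger model, so the sketch starts from an intercept-only fit:

```r
# Hypothetical data: four predictors and a binary response driven by s1
set.seed(42)
n <- 200
a <- data.frame(matrix(rnorm(n * 4), nrow = n,
                       dimnames = list(NULL, paste0("s", 1:4))))
a$z <- rbinom(n, 1, plogis(1.5 * a$s1))

# Predictor names picked up by pattern
preds <- grep("^s[0-9]+$", names(a), value = TRUE)

# reformulate() builds z ~ s1 + s2 + s3 + s4 from the character vector
full_form <- reformulate(preds, response = "z")

# Forward selection: start from the intercept-only model and give the
# largest model that may be reached through the scope argument
null_fit <- glm(z ~ 1, data = a, family = binomial)
fwd <- step(null_fit, scope = full_form, direction = "forward", trace = 0)

formula(fwd)
```

Inside a simulation loop, the same three steps (build the formula, fit the null model, call step() with scope) can be repeated for each generated data set.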

And be aware of the nonsense you can (replace "can" by "will certainly") get with
stepwise regression and so many parameters. If I were to be treated by a
cure created by stepwise regression, I would prefer voodoo.

Search for "Harrell stepwise" and read Frank's well-justified soapboxes.

Dieter

-- 
View this message in context: 
http://www.nabble.com/Handle-lot-of-variables---Regression-tp25889056p25892047.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Handle lot of variables - Regression

2009-10-14 Thread Frank E Harrell Jr

anna0102 wrote:

Hey,

I've got a data set (e.g. named Data) which contains a lot of variables, for
example: s1, s2, ..., s50

My first question is:
It is possible to do this: Data$s1
But is it also possible to do something like this: Data$s1:s50 (I've tried a
lot of versions of those without a result)

My second question:
I want to do a stepwise logistic regression. For this purpose I use the
following procedures:
result <- glm(...)
step(result, direction="forward")

The problem is that I have to include all my 50 variables
(s1-s50), but I don't want to write them all out like y~s1+s2+s3+s4...
(furthermore, it has to be implemented in a loop, so I really need it).
I've tried to store the 50 variables in a list (e.g. list[[1]]) and tried
this:
result <- glm(y ~ list[[1]], ...)
This works! But if I try to do it stepwise
result2 <- step(result)
I always get the same results as from glm without a stepwise approach. So
obviously R can't handle this if you put a list in.
How can I make this work?

Thanks in advance,
Anna



Anna,

You might as well just take a random sample of your candidate 
predictors.  Stepwise regression isn't much better than that.  Note that 
if you don't have enough events (say 15 times 50) to fit a full model 
then you don't have enough events to do stepwise regression without 
appropriate penalization.
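
The arithmetic behind "15 times 50" can be checked in a few lines. This is only a sketch: the response vector y is simulated for illustration, and counting the rarer of the two outcomes as the number of "events" is an assumption made here, not something stated above:

```r
# Rule of thumb quoted above: about 15 events per candidate predictor
events_per_predictor <- 15
n_candidates <- 50
required_events <- events_per_predictor * n_candidates
required_events  # 750

# Hypothetical 0/1 response, simulated for illustration only
set.seed(1)
y <- rbinom(1000, 1, 0.3)

# Assumption: treat the rarer outcome as the number of events
n_events <- min(sum(y == 1), sum(y == 0))
enough <- n_events >= required_events
enough
```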


Frank

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University
