Re: [R] Handle lot of variables - Regression
Thank you very much! Now it works!

I'm aware that, practically speaking, it may be nonsense to do something like this. But I have to do a Monte Carlo simulation in which I want to determine some aspects... It's a complicated story, but a logistic regression (stepwise or not) is the basis for my analysis, and I want to try both.

Anna

Dieter Menne wrote:
> anna0102 wrote:
>> I've got a data set (e.g. named Data) which contains a lot of variables, for example: s1, s2, ..., s50. My first question is: It is possible to do this:
>>     Data$s1
>> But is it also possible to do something like this:
>>     Data$s1:s50
>> (I've tried a lot of versions of those without a result.)
>
> Use the [] notation. For example
>
>     Data[, c("s1", "s2", "s3")]
>
> or even better
>
>     Data[, grep("s.*", names(Data), value = TRUE)]
>
> anna0102 wrote:
>> I want to do a stepwise logistic regression. For this purpose I use the following procedures:
>>     result <- glm(...)
>>     step(result, direction = "forward")
>> The problem is that I have to include all my 50 variables (s1-s50), but I don't want to write them all down like y ~ s1 + s2 + s3 + s4 ... (furthermore, it has to be implemented in a loop, so I really need it).
>
> Construct the formula dynamically. But please, start with only 3 or 4 variables and check that it works. Sometimes, deep inside functions, things can go wrong with this method, requiring Ripley's game-like workarounds. See
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html
>
>     a = data.frame(s = 1:10, s2 = 1:10, s4 = 1:10)
>     form = paste("z ~", paste(grep("s.*", names(a), value = TRUE), collapse = " + "))
>     glm(as.formula(form), ...)
>
> And be aware of the nonsense you can (replace by "will certainly") get with stepwise regression and so many parameters. If I were to be treated by a cure created by stepwise regression, I would prefer voodoo. Search for "Harrell stepwise" and read Frank's well-justified soapboxes.
>
> Dieter

--
View this message in context: http://www.nabble.com/Handle-lot-of-variables---Regression-tp25889056p25903765.html
Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Handle lot of variables - Regression
Hey,

I've got a data set (e.g. named Data) which contains a lot of variables, for example: s1, s2, ..., s50.

My first question is: It is possible to do this:

    Data$s1

But is it also possible to do something like this:

    Data$s1:s50

(I've tried a lot of versions of those without a result.)

My second question: I want to do a stepwise logistic regression. For this purpose I use the following procedures:

    result <- glm(...)
    step(result, direction = "forward")

The problem is that I have to include all my 50 variables (s1-s50), but I don't want to write them all down like y ~ s1 + s2 + s3 + s4 ... (furthermore, it has to be implemented in a loop, so I really need it). I've tried to store the 50 variables in a list (e.g. list[[1]]) and tried this:

    result <- glm(y ~ list[[1]], ...)

This works! But if I try to do it stepwise,

    result2 <- step(result)

I always get the same results as from glm without a stepwise approach. So apparently R can't handle this if you put a list in. How can I make this work?

Thanks in advance,
Anna

--
View this message in context: http://www.nabble.com/Handle-lot-of-variables---Regression-tp25889056p25889056.html
Re: [R] Handle lot of variables - Regression
anna0102 wrote:
> I've got a data set (e.g. named Data) which contains a lot of variables, for example: s1, s2, ..., s50. My first question is: It is possible to do this:
>     Data$s1
> But is it also possible to do something like this:
>     Data$s1:s50
> (I've tried a lot of versions of those without a result.)

Use the [] notation. For example

    Data[, c("s1", "s2", "s3")]

or even better

    Data[, grep("s.*", names(Data), value = TRUE)]

anna0102 wrote:
> I want to do a stepwise logistic regression. For this purpose I use the following procedures:
>     result <- glm(...)
>     step(result, direction = "forward")
> The problem is that I have to include all my 50 variables (s1-s50), but I don't want to write them all down like y ~ s1 + s2 + s3 + s4 ... (furthermore, it has to be implemented in a loop, so I really need it).

Construct the formula dynamically. But please, start with only 3 or 4 variables and check that it works. Sometimes, deep inside functions, things can go wrong with this method, requiring Ripley's game-like workarounds. See
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html

    a = data.frame(s = 1:10, s2 = 1:10, s4 = 1:10)
    form = paste("z ~", paste(grep("s.*", names(a), value = TRUE), collapse = " + "))
    glm(as.formula(form), ...)

And be aware of the nonsense you can (replace by "will certainly") get with stepwise regression and so many parameters. If I were to be treated by a cure created by stepwise regression, I would prefer voodoo. Search for "Harrell stepwise" and read Frank's well-justified soapboxes.

Dieter

--
View this message in context: http://www.nabble.com/Handle-lot-of-variables---Regression-tp25889056p25892047.html
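For readers following along, here is a minimal sketch of the "construct the formula dynamically" advice applied to forward selection. Everything in it is an assumption for illustration: the data are simulated, only 5 predictors are used instead of 50, and the names preds, full_form, null_fit and fwd are hypothetical. It uses reformulate() rather than paste(), and the stricter pattern "^s[0-9]+$", because "s.*" matches any column name that contains an s.

```r
# Simulated stand-in for the poster's data: 5 predictors s1..s5 and a binary y.
set.seed(1)
n <- 200
Data <- as.data.frame(matrix(rnorm(n * 5), n, 5))
names(Data) <- paste0("s", 1:5)
Data$y <- rbinom(n, 1, plogis(0.8 * Data$s1 - 0.5 * Data$s2))

# Select the predictor columns by name pattern instead of typing them out.
preds <- grep("^s[0-9]+$", names(Data), value = TRUE)
X <- Data[, preds]

# Build y ~ s1 + s2 + ... from the character vector of names.
full_form <- reformulate(preds, response = "y")

# Forward selection starts from the intercept-only model; the scope tells
# step() which terms it is allowed to add.
null_fit <- glm(y ~ 1, family = binomial, data = Data)
fwd <- step(null_fit,
            scope = list(lower = ~ 1, upper = full_form),
            direction = "forward",
            trace = 0)

print(formula(fwd))
```

Without a scope, step() run forward from an already fitted model has nothing to add, which may be one reason a stepwise call returns the same fit as the plain glm() call. The statistical caveats about stepwise selection raised in this thread apply to this sketch unchanged.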
Re: [R] Handle lot of variables - Regression
anna0102 wrote:
> Hey,
>
> I've got a data set (e.g. named Data) which contains a lot of variables, for example: s1, s2, ..., s50.
>
> My first question is: It is possible to do this:
>     Data$s1
> But is it also possible to do something like this:
>     Data$s1:s50
> (I've tried a lot of versions of those without a result.)
>
> My second question: I want to do a stepwise logistic regression. For this purpose I use the following procedures:
>     result <- glm(...)
>     step(result, direction = "forward")
> The problem is that I have to include all my 50 variables (s1-s50), but I don't want to write them all down like y ~ s1 + s2 + s3 + s4 ... (furthermore, it has to be implemented in a loop, so I really need it). I've tried to store the 50 variables in a list (e.g. list[[1]]) and tried this:
>     result <- glm(y ~ list[[1]], ...)
> This works! But if I try to do it stepwise,
>     result2 <- step(result)
> I always get the same results as from glm without a stepwise approach. So apparently R can't handle this if you put a list in. How can I make this work?
>
> Thanks in advance,
> Anna

Anna,

You might as well just take a random sample of your candidate predictors. Stepwise regression isn't much better than that. Note that if you don't have enough events (say 15 times 50) to fit a full model, then you don't have enough events to do stepwise regression without appropriate penalization.

Frank

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
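The "15 times 50" figure above is simple arithmetic: roughly 15 events per candidate predictor, times 50 candidates. A small sketch of that check follows. The function names events_needed and enough_events are hypothetical, the example outcome vector is made up, and counting the limiting sample size as the smaller of the event and non-event counts is an assumption, not something stated in this thread.

```r
# Events required under the rule of thumb quoted above
# (15 per candidate predictor by default).
events_needed <- function(p, per_predictor = 15) {
  p * per_predictor
}

# Compare the available events with the requirement.
# Assumption: the limiting count is the smaller of events and non-events.
enough_events <- function(y, p, per_predictor = 15) {
  n_limiting <- min(sum(y == 1), sum(y == 0))
  n_limiting >= events_needed(p, per_predictor)
}

events_needed(50)                        # 750

# Made-up outcome vector: 300 events among 2000 observations.
y <- rep(c(1, 0), times = c(300, 1700))
enough_events(y, p = 50)                 # FALSE, since 300 < 750
```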