Re: [R] Multiple left hand side variables in a formula
On Fri, 1 Mar 2013, Frank Harrell wrote:

> Thank you Bill. A temporary re-arrangement of the formula will allow me
> to do the usual subset= na.action= processing afterwards. Nice idea. I
> don't need the dot notation very often for this application.

That's what the Formula package provides. It allows for multiple parts and multiple responses on both sides of the ~. Internally, the formula is decomposed and separate terms are produced. An auxiliary formula is then used to ensure that a single model frame (with unified NA processing) is built. All of this is hidden from the user by providing methods that are as standard as possible; see:

    vignette("Formula", package = "Formula")

hth,
Z

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
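A minimal sketch of the Formula approach described above, assuming the Formula package is installed. The iris columns and the object names (fml, mf, lhs_part) are chosen only for illustration and do not come from the thread.

```r
# Sketch only: runs the example when the Formula package is available.
if (requireNamespace("Formula", quietly = TRUE)) {
  library(Formula)

  # Two left-hand-side variables of different types (numeric and factor)
  fml <- Formula(Sepal.Length + Species ~ Petal.Length)

  # One model frame for all variables, with the usual subset/na.action handling
  mf <- model.frame(fml, data = iris, subset = Petal.Length > 2,
                    na.action = na.omit)

  # Extract the left-hand-side variables as a data frame
  lhs_part <- model.part(fml, data = mf, lhs = 1)
  str(lhs_part)
}
```

The point of the design is that the left-hand side is kept as separate variables instead of being evaluated as a sum, while subsetting and NA handling still happen once for the whole frame.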
Re: [R] Multiple left hand side variables in a formula
Achim, this is perfect. I had not seen Formula before. Thanks for writing it!

Frank

Achim Zeileis wrote:

> That's what the Formula package provides. It allows for multiple parts
> and multiple responses on both sides of the ~. [...] see:
> vignette("Formula", package = "Formula")

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
Re: [R] Multiple left hand side variables in a formula
Hi Gabor,

This is not for a regression function but for a major update I'm working on for the summary.formula function in the Hmisc package, so I need to handle several data types in the formula.

Thanks
Frank

Gabor Grothendieck wrote:

> Typically you don't see more than that as a dependent variable. Do you
> actually need more?
[R] Multiple left hand side variables in a formula
The lattice package uses special logic to allow for multiple left-hand-side variables in a formula, e.g. y1 + y2 ~ x. Is there an elegant way to do this outside of lattice? I'm trying to implement a data summarization function that logically takes multiple dependent variables. The usual invocation of model.frame() causes R to try to do arithmetic addition to create a single dependent variable.

Thanks
Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Multiple-left-hand-side-variables-in-a-formula-tp4660060.html
Sent from the R help mailing list archive at Nabble.com.
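The behaviour described in the question can be reproduced with a built-in dataset. This is a small illustration only; the choice of iris and of these columns is an assumption, not part of the original post.

```r
# y1 + y2 on the left-hand side is evaluated as arithmetic addition
mf <- model.frame(Sepal.Length + Sepal.Width ~ Species, data = iris)

# The model frame has ONE response column, named after the whole expression
names(mf)

# That column holds the row-wise sum of the two variables
all.equal(mf[[1]], iris$Sepal.Length + iris$Sepal.Width)
```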
Re: [R] Multiple left hand side variables in a formula
On Fri, Mar 1, 2013 at 7:16 PM, Frank Harrell <f.harr...@vanderbilt.edu> wrote:

> The lattice package uses special logic to allow for multiple
> left-hand-side variables in a formula, e.g. y1 + y2 ~ x. Is there an
> elegant way to do this outside of lattice? [...]

Try:

    lm(cbind(Sepal.Length, Sepal.Width) ~ ., iris)

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
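As a brief illustration of what the cbind() suggestion produces: with a matrix response, lm() fits one regression per response column and returns a multivariate fit. The object name fit is only for illustration.

```r
# A matrix response gives a multivariate linear model ("mlm")
fit <- lm(cbind(Sepal.Length, Sepal.Width) ~ ., data = iris)

class(fit)

# One column of coefficients per left-hand-side variable
colnames(coef(fit))
```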
Re: [R] Multiple left hand side variables in a formula
I don't know how much of the information that model.frame supplies you need, but you could make a data.frame containing all the variables on both sides of the formula by changing lhs~rhs into ~lhs+rhs before calling model.frame. E.g.,

    f <- function(formula) {
        if (length(formula) == 3) { # has left hand side
            envir <- environment(formula)
            formula <- formula(call("~", call("+", formula[[2]], formula[[3]])))
            environment(formula) <- envir
        }
        formula
    }

This doesn't quite take care of the wild-card dot in the formula: straight variables are omitted from dot's expansion but functions of variables are not:

    colnames(model.frame(f(log(mpg)+hp ~ .), data=mtcars))
     [1] "log(mpg)" "hp"       "mpg"      "cyl"
     [5] "disp"     "drat"     "wt"       "qsec"
     [9] "vs"       "am"       "gear"     "carb"

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Frank Harrell
Sent: Friday, March 01, 2013 4:17 PM
To: r-help@r-project.org
Subject: [R] Multiple left hand side variables in a formula

> The lattice package uses special logic to allow for multiple
> left-hand-side variables in a formula, e.g. y1 + y2 ~ x. Is there an
> elegant way to do this outside of lattice? [...]
Re: [R] Multiple left hand side variables in a formula
Thanks for your reply Gabor. That doesn't handle a mixture of factor and numeric variables on the left hand side.

Frank

Gabor Grothendieck wrote:

> Try:
>
>     lm(cbind(Sepal.Length, Sepal.Width) ~ ., iris)
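One plausible reading of this limitation, offered as an assumption and not as Frank's own explanation: cbind() builds a single matrix, so a factor placed next to a numeric variable is reduced to its integer codes and loses its levels. A short sketch with iris:

```r
# cbind() coerces everything to one type; the factor becomes integer codes
m <- cbind(Sepal.Length = iris$Sepal.Length, Species = iris$Species)

is.numeric(m)

# The level labels (setosa, versicolor, virginica) are gone, only codes remain
sort(unique(m[, "Species"]))
```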
Re: [R] Multiple left hand side variables in a formula
Thank you Bill. A temporary re-arrangement of the formula will allow me to do the usual subset= na.action= processing afterwards. Nice idea. I don't need the dot notation very often for this application.

Frank

William Dunlap wrote:

> [...] you could make a data.frame containing all the variables on both
> sides of the formula by changing lhs~rhs into ~lhs+rhs before calling
> model.frame.
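A sketch of how the re-arranged formula can be combined with subset= and na.action=, under some assumptions: the helper name lhs_to_rhs is hypothetical, it builds the one-sided formula with eval() where Bill's f() used formula(), and the airquality dataset is used only because it contains missing values.

```r
# Hypothetical helper: turn lhs ~ rhs into ~ lhs + rhs, keeping the environment
lhs_to_rhs <- function(formula) {
  if (length(formula) == 3) {            # formula has a left-hand side
    env <- environment(formula)
    formula <- eval(call("~", call("+", formula[[2]], formula[[3]])))
    environment(formula) <- env
  }
  formula
}

# All variables from both sides, with standard subset and NA processing
mf <- model.frame(lhs_to_rhs(Ozone + Solar.R ~ Wind), data = airquality,
                  subset = Month == 5, na.action = na.omit)

names(mf)
any(is.na(mf))
```

The left-hand-side variables can afterwards be picked out of mf by name, since the original formula still records which variables were on which side.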
Re: [R] Multiple left hand side variables in a formula
On Fri, Mar 1, 2013 at 8:02 PM, Frank Harrell <f.harr...@vanderbilt.edu> wrote:

> Thanks for your reply Gabor. That doesn't handle a mixture of factor
> and numeric variables on the left hand side.

It can handle 2 level factors:

    lm(cbind(Sepal.Length, setosa = Species == "setosa") ~ ., iris)

and more with some manual effort:

    lm(cbind(virginica = Species == "virginica", setosa = Species == "setosa") ~ ., iris)

Typically you don't see more than that as a dependent variable. Do you actually need more?
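To make the indicator-column idea concrete, here is a short sketch of the first call with the response matrix inspected afterwards. The object names fit and y are illustrative only.

```r
# A logical comparison inside cbind() becomes a 0/1 response column
fit <- lm(cbind(Sepal.Length, setosa = Species == "setosa") ~ ., data = iris)

# One coefficient column per response, named as in cbind()
colnames(coef(fit))

# The "setosa" response holds only 0 and 1
y <- model.response(model.frame(fit))
unique(y[, "setosa"])
```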