Re: [R] Multiple left hand side variables in a formula

2013-03-02 Thread Achim Zeileis

On Fri, 1 Mar 2013, Frank Harrell wrote:


Thank you Bill.  A temporary re-arrangement of the formula will allow me to
do the usual subset= na.action= processing afterwards.  Nice idea.  I don't
need the dot notation very often for this application.


That's what the Formula package provides. It allows for multiple parts
and multiple responses on both sides of the ~. Internally, the formula is
decomposed and separate terms are produced. An auxiliary formula then
ensures that a single model frame (with unified NA processing) is built.
All of this is hidden from the user by providing methods that are as
standard as possible, see:


vignette("Formula", package = "Formula")
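
For a rough idea of what this looks like in practice, here is a minimal sketch. It assumes the Formula package is installed; the data frame d and its columns are made up for illustration, and the requireNamespace() guard is only there so the snippet does nothing when the package is missing.

```r
# Sketch only: assumes the Formula package is installed.
# The data frame d and its column names are illustrative toy data.
if (requireNamespace("Formula", quietly = TRUE)) {
  library(Formula)

  d <- data.frame(y1 = c(1.2, NA, 3.1, 4.8),
                  y2 = factor(c("a", "b", "a", "b")),
                  x1 = c(0.5, 1.5, 2.5, 3.5),
                  x2 = c(10, 20, 30, 40))

  # Two responses (one numeric, one factor) on the left-hand side.
  fm <- Formula(y1 + y2 ~ x1 + x2)
  length(fm)  # number of left-hand-side and right-hand-side parts

  # A single model frame for all variables, so NA handling happens once.
  mf <- model.frame(fm, data = d)

  # Extract the responses and the regressors as separate data frames.
  resp <- model.part(fm, data = mf, lhs = 1)
  regs <- model.part(fm, data = mf, rhs = 1)
}
```

Because the responses come back as a data frame rather than a matrix, the factor y2 keeps its levels.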

hth,
Z


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple left hand side variables in a formula

2013-03-02 Thread Frank Harrell
Achim, this is perfect.  I had not seen Formula before.  Thanks for writing
it!
Frank


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Multiple-left-hand-side-variables-in-a-formula-tp4660060p4660080.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Multiple left hand side variables in a formula

2013-03-02 Thread Frank Harrell
Hi Gabor,

This is not for a regression function but for a major update I'm working on
for the summary.formula function in the Hmisc package.  So I need to handle
several data types in the formula.

Thanks
Frank



[R] Multiple left hand side variables in a formula

2013-03-01 Thread Frank Harrell
The lattice package uses special logic to allow for multiple left-hand-side
variables in a formula, e.g. y1 + y2 ~ x.  Is there an elegant way to do
this outside of lattice?  I'm trying to implement a data summarization
function that logically takes multiple dependent variables.  The usual
invocation of model.frame( ) causes R to try to do arithmetic addition to
create a single dependent variable.
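
To make the problem concrete, here is a small sketch of what the usual model.frame() call does with such a formula. The data frame d and its columns are made up for illustration.

```r
# Toy data; the names d, y1, y2 and x are illustrative only.
d <- data.frame(y1 = c(1, 2, 3), y2 = c(10, 20, 30), x = c(5, 6, 7))

# The left-hand side is evaluated as an arithmetic expression, so the
# model frame ends up with one response column holding y1 + y2.
mf <- model.frame(y1 + y2 ~ x, data = d)
names(mf)           # "y1 + y2" "x"
model.response(mf)  # the element-wise sums 11, 22, 33
```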

Thanks
Frank





Re: [R] Multiple left hand side variables in a formula

2013-03-01 Thread Gabor Grothendieck
On Fri, Mar 1, 2013 at 7:16 PM, Frank Harrell f.harr...@vanderbilt.edu wrote:
 The lattice package uses special logic to allow for multiple left-hand-side
 variables in a formula, e.g. y1 + y2 ~ x.  Is there an elegant way to do
 this outside of lattice?  I'm trying to implement a data summarization
 function that logically takes multiple dependent variables.  The usual
 invocation of model.frame( ) causes R to try to do arithmetic addition to
 create a single dependent variable.


Try:

lm(cbind(Sepal.Length, Sepal.Width) ~ ., iris)
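
As a brief illustration of what the cbind() approach produces, the sketch below fits two numeric responses at once. It uses explicit predictors rather than the dot, purely to keep the example simple.

```r
# cbind() on the left-hand side gives a matrix response, so lm() fits
# one regression per column and returns an object of class "mlm".
fit <- lm(cbind(Sepal.Length, Sepal.Width) ~ Petal.Length + Petal.Width,
          data = iris)
class(fit)      # "mlm" "lm"
dim(coef(fit))  # 3 rows (intercept + 2 predictors), 2 columns (responses)

# The model frame stores the whole response as a single matrix column.
mf <- model.frame(fit)
is.matrix(model.response(mf))
```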

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Multiple left hand side variables in a formula

2013-03-01 Thread William Dunlap
I don't know how much of the information that model.frame supplies you need,
but you could make a data.frame containing all the variables on both sides of
the formula by changing lhs ~ rhs into ~ lhs + rhs before calling model.frame.
E.g.,

f <- function(formula) {
    if (length(formula) == 3) { # has left hand side
        envir <- environment(formula)
        formula <- formula(call("~", call("+", formula[[2]], formula[[3]])))
        environment(formula) <- envir
    }
    formula
}

This doesn't quite take care of the wild-card dot in the formula: plain
variables that already appear in the formula are omitted from the dot's
expansion, but variables that appear only inside a function call (here mpg
in log(mpg)) are not:

> colnames(model.frame(f(log(mpg)+hp ~ .), data=mtcars))
 [1] "log(mpg)" "hp"       "mpg"      "cyl"
 [5] "disp"     "drat"     "wt"       "qsec"
 [9] "vs"       "am"       "gear"     "carb"

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com




Re: [R] Multiple left hand side variables in a formula

2013-03-01 Thread Frank Harrell
Thanks for your reply Gabor.  That doesn't handle a mixture of factor and
numeric variables on the left hand side.
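
A small sketch of why the mixture is a problem, using iris: cbind() has to produce a single numeric matrix, so a factor placed next to a numeric variable loses its labels.

```r
# cbind() builds a numeric matrix, so the factor Species is silently
# reduced to its integer level codes.
y <- with(iris, cbind(Sepal.Length, Species))
is.numeric(y)           # TRUE: the factor labels are gone
unique(y[, "Species"])  # level codes 1, 2, 3 rather than species names
```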
Frank

Gabor Grothendieck wrote
 On Fri, Mar 1, 2013 at 7:16 PM, Frank Harrell lt;

 f.harrell@

 gt; wrote:
 The lattice package uses special logic to allow for multiple
 left-hand-side
 variables in a formula, e.g. y1 + y2 ~ x.  Is there an elegant way to do
 this outside of lattice?  I'm trying to implement a data summarization
 function that logically takes multiple dependent variables.  The usual
 invocation of model.frame( ) causes R to try to do arithmetic addition to
 create a single dependent variable.

 
 Try:
 
 lm( cbind(Sepal.Length, Sepal.Width) ~., iris)
 
 --
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 __

 R-help@

  mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Multiple-left-hand-side-variables-in-a-formula-tp4660060p4660062.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple left hand side variables in a formula

2013-03-01 Thread Frank Harrell
Thank you Bill.  A temporary re-arrangement of the formula will allow me to
do the usual subset= na.action= processing afterwards.  Nice idea.  I don't
need the dot notation very often for this application.
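
Roughly, the workflow described above might look like the sketch below. The helper name lhs_to_rhs and the toy data are made up for illustration; the helper follows the same idea as Bill's f. Splitting the columns back with all.vars() only works when the left-hand side consists of plain variable names.

```r
# Hypothetical helper: move the left-hand side of a two-sided formula
# to the right-hand side, keeping the formula's environment.
lhs_to_rhs <- function(formula) {
  if (length(formula) == 3) {
    env <- environment(formula)
    formula <- eval(call("~", call("+", formula[[2]], formula[[3]])))
    environment(formula) <- env
  }
  formula
}

# Toy data with one missing response value; all names are illustrative.
d <- data.frame(y1 = c(1, 2, NA, 4, 5),
                y2 = factor(c("a", "b", "a", "b", "a")),
                x  = c(10, 20, 30, 40, 50))
form <- y1 + y2 ~ x

# The usual subset= and na.action= processing, applied once to all variables.
mf <- model.frame(lhs_to_rhs(form), data = d, subset = x < 50,
                  na.action = na.omit)

# Split the model frame back into responses and predictors.
lhs_vars  <- all.vars(form[[2]])
responses <- mf[lhs_vars]
predictors <- mf[setdiff(names(mf), lhs_vars)]
```

Since nothing is combined with cbind(), the factor y2 stays a factor in the model frame.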
Frank



Re: [R] Multiple left hand side variables in a formula

2013-03-01 Thread Gabor Grothendieck


On Fri, Mar 1, 2013 at 8:02 PM, Frank Harrell f.harr...@vanderbilt.edu wrote:
 Thanks for your reply Gabor.  That doesn't handle a mixture of factor and
 numeric variables on the left hand side.
 Frank


It can handle 2-level factors:

   lm(cbind(Sepal.Length, setosa = Species == "setosa") ~ ., iris)

and more with some manual effort:

   lm(cbind(virginica = Species == "virginica",
            setosa = Species == "setosa") ~ ., iris)
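
One possible way to reduce the manual effort is to let model.matrix() build the indicator columns. This is only a sketch: the object name ind is made up, and explicit predictors are used instead of the dot to keep the example simple.

```r
# One 0/1 indicator column per level of Species (no intercept column).
ind <- model.matrix(~ Species - 1, data = iris)
colnames(ind)   # "Speciessetosa" "Speciesversicolor" "Speciesvirginica"

# Put the indicators next to a numeric response in a single matrix response.
fit <- lm(cbind(Sepal.Length, ind) ~ Petal.Length + Petal.Width, data = iris)
dim(coef(fit))  # 3 coefficient rows, 4 response columns
```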

Typically you don't see more than that as a dependent variable.  Do
you actually need more?
