Re: [Rd] y ~ X -1 , X a matrix

2010-03-18 Thread Peter Dalgaard
Ross Boylan wrote:
 On Thu, 2010-03-18 at 00:57 +, ted.hard...@manchester.ac.uk wrote:
 On 17-Mar-10 23:32:41, Ross Boylan wrote:
 While browsing some code I discovered a call to lm that used
 a formula y ~ X - 1, where X was a matrix.

 Looking through the documentation of formula, lm, model.matrix
 and maybe some others I couldn't find this usage (R 2.10.1).
 Is it anything I can count on in future versions?  Is there
 documentation I've overlooked?

 For the curious: model.frame on the above equation returns a
 data.frame with 2 columns.  The second column is the whole X
 matrix. model.matrix on that object returns the expected matrix,
 with the transition from the odd model.frame to the regular
 matrix happening in an .Internal call.

 Thanks.
 Ross

 P.S. I would appreciate cc's, since mail problems are preventing
 me from seeing list mail.
 Hmmm ... I'm not sure what is the problem with what you describe.
 There is no problem in the "it doesn't work" sense.
 There is a problem that it seems undocumented--though the help you quote
 could rather indirectly be taken as a clue--and thus, possibly, subject
 to change in later releases.

I'm pretty sure that it is per original design that data frames can have
matrix columns, although data.frame() and as.data.frame() are quite
trigger-happy when it comes to converting them to individual columns.
You need things like d <- data.frame(X=I(X)) to prevent it.
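
The difference can be seen in a minimal sketch like the one below. The data and the object names (d_split, d_kept, fit) are made up for illustration and are not from the thread.

```r
set.seed(1)                            # arbitrary seed, for reproducibility only
X <- matrix(rnorm(10), ncol = 2)       # 5 x 2 matrix without column names

d_split <- data.frame(X = X)           # matrix is split into separate columns
names(d_split)                         # "X.1" "X.2"

d_kept <- data.frame(y = rnorm(5), X = I(X))  # I() keeps X as one matrix column
names(d_kept)                          # "y" "X"
dim(d_kept$X)                          # 5 2

# The matrix column can then be used directly on the RHS of a formula
fit <- lm(y ~ X - 1, data = d_kept)
length(coef(fit))                      # one coefficient per column of X
```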

As you have seen, matrices can be handy on the RHS of formulas, but
there are at least two cases where they are crucial on the LHS,
multivariate linear models and one version of glm(Y~..., binomial).
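
As a rough illustration of those two LHS cases, here is a sketch on simulated data. All variable names and the simulation itself are assumptions made for the example.

```r
set.seed(2)                            # arbitrary seed
n <- 20
x <- rnorm(n)

# Multivariate linear model: the response is a two-column matrix
Y <- cbind(y1 = 1 + 2 * x + rnorm(n),
           y2 = -1 + 0.5 * x + rnorm(n))
mfit <- lm(Y ~ x)
class(mfit)                            # "mlm" "lm"
dim(coef(mfit))                        # one column of coefficients per response

# Binomial glm: the response is a (successes, failures) matrix
succ <- rbinom(n, size = 10, prob = plogis(0.5 * x))
S <- cbind(succ = succ, fail = 10 - succ)
gfit <- glm(S ~ x, family = binomial)
length(coef(gfit))                     # intercept and slope
```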

Without being able to store matrices as individual components in a data
frame, I don't think you can avoid internally expanding the model formula
into (say) Y ~ X1 + X2 - 1, which could get rather unwieldy, so I don't
think the feature will be going away. (Someone with too much time on
his/her hands might want to rationalize the whole data frame concept, but
that should go in the direction of handling all matrix-like structures
consistently, including date-time objects etc.)
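
To make the comparison concrete, the sketch below fits the same model both ways on simulated data and checks that the coefficients agree. The names X1, X2, fit_matrix and fit_expanded are introduced here for illustration only.

```r
set.seed(3)                            # arbitrary seed
X <- matrix(rnorm(50), ncol = 2)
Y <- 1 * X[, 1] + 2 * X[, 2] + 0.25 * rnorm(25)

# Matrix on the RHS, no intercept
fit_matrix <- lm(Y ~ X - 1)

# The same model written out column by column
X1 <- X[, 1]
X2 <- X[, 2]
fit_expanded <- lm(Y ~ X1 + X2 - 1)

# Both fits should give the same estimates
same <- all.equal(unname(coef(fit_matrix)), unname(coef(fit_expanded)))
same
```

With only two columns the expanded form is still manageable, but it grows by one term per column of X.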

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] y ~ X -1 , X a matrix

2010-03-17 Thread Ted Harding
On 17-Mar-10 23:32:41, Ross Boylan wrote:
 While browsing some code I discovered a call to lm that used
 a formula y ~ X - 1, where X was a matrix.
 
 Looking through the documentation of formula, lm, model.matrix
 and maybe some others I couldn't find this usage (R 2.10.1).
 Is it anything I can count on in future versions?  Is there
 documentation I've overlooked?
 
 For the curious: model.frame on the above equation returns a
 data.frame with 2 columns.  The second column is the whole X
 matrix. model.matrix on that object returns the expected matrix,
 with the transition from the odd model.frame to the regular
 matrix happening in an .Internal call.
 
 Thanks.
 Ross
 
 P.S. I would appreciate cc's, since mail problems are preventing
 me from seeing list mail.

Hmmm ... I'm not sure what is the problem with what you describe.
Code:

  set.seed(54321)
  X  <- matrix(rnorm(50),ncol=2)
  Y  <- 1*X[,1] + 2*X[,2] + 0.25*rnorm(25)
  LM <- lm(Y ~ X-1)

  summary(LM)
  # Call:
  # lm(formula = Y ~ X - 1)
  # Residuals:
  #      Min       1Q   Median       3Q      Max
  # -0.39942 -0.13143 -0.02249  0.11662  0.61661
  # Coefficients:
  #    Estimate Std. Error t value Pr(>|t|)
  # X1  0.97707    0.04159   23.49   <2e-16 ***
  # X2  2.09152    0.06714   31.15   <2e-16 ***
  # ---
  # Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  # Residual standard error: 0.2658 on 23 degrees of freedom
  # Multiple R-squared: 0.9863, Adjusted R-squared: 0.9851
  # F-statistic: 826.6 on 2 and 23 DF,  p-value: < 2.2e-16

  model.frame(LM)
  #              Y          X.1          X.2
  # 1   0.04936244 -0.178900750  0.051420078
  # 2  -0.54224173 -0.928044132 -0.027963292
  # [...]
  # 24  1.54196979  0.312332806  0.602009497
  # 25 -0.16928420 -1.285559427  0.394790358

  str(model.frame(LM))
  #  $ Y: num  0.0494 -0.5422 -0.7295 -3.4422 -3.1296 ...
  #  $ X: num [1:25, 1:2] -0.179 -0.928 -0.784 -1.651 -0.408 ...
  # [...]

  model.frame(Y ~ X-1)
  #              Y          X.1          X.2
  # 1   0.04936244 -0.178900750  0.051420078
  # 2  -0.54224173 -0.928044132 -0.027963292
  # [...]
  # 24  1.54196979  0.312332806  0.602009497
  # 25 -0.16928420 -1.285559427  0.394790358
  ## (Identical to above)

  str(model.frame(Y ~ X-1))
  # $ Y: num  0.0494 -0.5422 -0.7295 -3.4422 -3.1296 ...
  # $ X: num [1:25, 1:2] -0.179 -0.928 -0.784 -1.651 -0.408 ...
  # [...]
  ## (Identical to above)

Maybe the clue (admittedly somewhat obtuse) can be found in ?lm:

  lm(formula, data, subset, weights, na.action,
     method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE,
     singular.ok = TRUE, contrasts = NULL, offset, ...)
  [...]

  data: an optional data frame, list or environment (or object
coercible by 'as.data.frame' to a data frame) containing the
variables in the model.  If not found in 'data', the variables
are taken from 'environment(formula)', typically the
environment from which 'lm' is called.

So, in the example the variables are taken from X, coercible
by 'as.data.frame' ... taken from 'environment(formula)'.

Hence (I guess) X is found in the environment and is coerced
into a dataframe with 2 columns, and X.1, X.2 are taken from there.
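
One way to probe this guess is sketched below, on data simulated in the same way as in the example above. The names mf and mm are introduced here for illustration. The check suggests that the model frame stores X whole, as a single matrix column, and that the X.1 and X.2 labels only arise when the frame is printed; the expansion into separate columns appears to happen in model.matrix.

```r
set.seed(54321)
X <- matrix(rnorm(50), ncol = 2)
Y <- 1 * X[, 1] + 2 * X[, 2] + 0.25 * rnorm(25)

mf <- model.frame(Y ~ X - 1)
names(mf)                              # "Y" "X"
is.matrix(mf$X)                        # the second component is the whole matrix
dim(mf$X)                              # 25 2

# model.matrix turns the matrix column into ordinary design-matrix columns
mm <- model.matrix(attr(mf, "terms"), mf)
colnames(mm)                           # "X1" "X2"
dim(mm)                                # 25 2
```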

R Gurus: Please comment! (I'm only guessing by plausibility).
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 18-Mar-10   Time: 00:57:20
-- XFMail --



Re: [Rd] y ~ X -1 , X a matrix

2010-03-17 Thread Ross Boylan
On Thu, 2010-03-18 at 00:57 +, ted.hard...@manchester.ac.uk wrote:
 On 17-Mar-10 23:32:41, Ross Boylan wrote:
  While browsing some code I discovered a call to lm that used
  a formula y ~ X - 1, where X was a matrix.
  
  Looking through the documentation of formula, lm, model.matrix
  and maybe some others I couldn't find this usage (R 2.10.1).
  Is it anything I can count on in future versions?  Is there
  documentation I've overlooked?
  
  For the curious: model.frame on the above equation returns a
  data.frame with 2 columns.  The second column is the whole X
  matrix. model.matrix on that object returns the expected matrix,
  with the transition from the odd model.frame to the regular
  matrix happening in an .Internal call.
  
  Thanks.
  Ross
  
  P.S. I would appreciate cc's, since mail problems are preventing
  me from seeing list mail.
 
 Hmmm ... I'm not sure what is the problem with what you describe.
There is no problem in the "it doesn't work" sense.
There is a problem that it seems undocumented--though the help you quote
could rather indirectly be taken as a clue--and thus, possibly, subject
to change in later releases.

Ross Boylan



Re: [Rd] y ~ X -1 , X a matrix

2010-03-17 Thread Dirk Eddelbuettel

On 17 March 2010 at 16:32, Ross Boylan wrote:
| While browsing some code I discovered a call to lm that used a formula y
| ~ X - 1, where X was a matrix.
| 
| Looking through the documentation of formula, lm, model.matrix and maybe
| some others I couldn't find this usage (R 2.10.1).  Is it anything I
| can count on in future versions?  Is there documentation I've
| overlooked?

From help(formula):

  Details:

   In addition to ‘+’ and ‘:’, a number of other operators are useful
   in model formulae.  [...] The
   ‘-’ operator removes the specified terms, so that ‘(a+b+c)^2 -
   a:b’ is identical to ‘a + b + c + b:c + a:c’.  It can also used to
   remove the intercept term: ‘y ~ x - 1’ is a line through the
   origin.  A model with no intercept can be also specified as ‘y ~ x
   + 0’ or ‘y ~ 0 + x’.

What exactly are you questioning? That X is a matrix? That doesn't take
away from the fact that the rest is a formula.
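
As a small sketch of the three spellings quoted from help(formula), the code below fits each on simulated data and compares the results. The data and the names f1, f2 and f3 are assumptions made for the example.

```r
set.seed(4)                            # arbitrary seed
x <- rnorm(30)
y <- 3 * x + rnorm(30)

f1 <- lm(y ~ x - 1)                    # remove the intercept with "- 1"
f2 <- lm(y ~ x + 0)                    # same model, written with "+ 0"
f3 <- lm(y ~ 0 + x)                    # same model, "0" first

# The terms object records that there is no intercept
attr(terms(y ~ x - 1), "intercept")    # 0

# All three fits should give the same single slope estimate
agree <- isTRUE(all.equal(coef(f1), coef(f2))) &&
         isTRUE(all.equal(coef(f1), coef(f3)))
agree
```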

Dirk

