Hi, Don:
Thanks for your suggestion to use do.call in my get.Index. I
discovered that your version actually produces cosmetically different
answers in R 1.6.3 and S-Plus 6.1 for Windows. Fortunately, in the
context, this difference was unimportant. Since yours is faster, it is
clearly
Thanks to Thomas Lumley, Sundar Dorai-Raj, and Don McQueen for their
suggestions. I need the INDICES as part of the output data.frame, which
McQueen's solution provided. I generalized his method as follows:
by.to.data.frame <-
function(x, INDICES, FUN){
# Split data.frame x on x[,INDICES]
#
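
For readers following the thread, here is a minimal sketch of the general idea (split on the index columns, apply FUN to each piece, and keep the index values as columns of the result). It is not the function posted above: the name by_to_df_sketch and its body are hypothetical, and the example data are an assumption modelled on the by.df used elsewhere in this thread.

```r
# Hypothetical helper for illustration; NOT the original by.to.data.frame.
by_to_df_sketch <- function(x, INDICES, FUN) {
  # Split the rows of x on the interaction of the INDICES columns,
  # dropping combinations that do not occur in the data.
  groups <- split(x, x[INDICES], drop = TRUE)
  rows <- lapply(groups, function(d) {
    # One row per subset: the index values followed by FUN's (vector) result.
    cbind(d[1, INDICES, drop = FALSE], t(FUN(d)))
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# Assumed example data, modelled on the by.df used in this thread.
by.df <- data.frame(A = rep(c("A1", "A2"), each = 3),
                    B = rep(c("B1", "B2"), each = 3),
                    x = 1:6, y = rep(0:1, length.out = 6))

fits <- by_to_df_sketch(by.df, c("A", "B"),
                        function(d) coef(lm(y ~ x, data = d)))
fits
```

The result has one row per subset, with columns A and B ahead of the coefficients, so the identifiers stay available for further modeling.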
Dear R-Help:
I want to (a) subset a data.frame by several columns, (b) fit a model
to each subset, and (c) store a vector of results from the fit in the
columns of a data.frame. In the past, I've used for loops to do this.
Is there a way to use by?
Consider the following example:
On Thu, 5 Jun 2003, Spencer Graves wrote:
Dear R-Help:
I want to (a) subset a data.frame by several columns, (b) fit a model
to each subset, and (c) store a vector of results from the fit in the
columns of a data.frame. In the past, I've used for loops to do this.
Is there a way
Hi, Thomas, et al.:
Thanks for the reply. Unfortunately, do.call strips off the subset
identifiers, which I want to use for further modeling:
do.call(rbind, byFits)
(Intercept) x
[1,] 0.333 -1.517960e-016
[2,] 0.667 3.282015e-016
The following does what I want
Since I don't have your by.df to test with I may not have it exactly
right, but something along these lines should work:
byFits <- lapply(split(by.df,paste(by.df$A,by.df$B)),
FUN=function(data.) {
tmp <- coef(lm(y~x,data.))
Spencer,
Would sapply be better here?
R> by.df <- data.frame(A=rep(c("A1", "A2"), each=3),
R+ B=rep(c("B1", "B2"), each=3),
R+ x=1:6, y=rep(0:1, length=6))
R> t(sapply(split(by.df, do.call(paste, c(by.df[, 1:2], sep = ":"))),
R+ function(x) coef(lm(y ~ x, data
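
A self-contained sketch of how the sapply approach might be carried through, for anyone who wants to try it. This is an assumption about where the suggestion was heading, not the text of the original post: the variable names key, ids, and out are made up for illustration, and the step that turns the pasted labels back into A and B columns is an addition.

```r
# Assumed example data, matching the by.df defined above.
by.df <- data.frame(A = rep(c("A1", "A2"), each = 3),
                    B = rep(c("B1", "B2"), each = 3),
                    x = 1:6, y = rep(0:1, length.out = 6))

# One label per row, built by pasting the two index columns together.
key <- do.call(paste, c(by.df[, 1:2], sep = ":"))

# sapply() returns one column of coefficients per group; t() puts groups in rows.
# The group labels survive as row names of the matrix.
fits <- t(sapply(split(by.df, key),
                 function(d) coef(lm(y ~ x, data = d))))

# Illustrative extra step: split the labels back into A and B columns.
ids <- do.call(rbind, strsplit(rownames(fits), ":", fixed = TRUE))
colnames(ids) <- c("A", "B")
out <- data.frame(ids, fits, check.names = FALSE)
out
```

Because split() names each piece by its label, the subset identifiers are carried along as row names rather than being dropped. Note that splitting the labels on ":" only works if that separator never appears in the values of A or B.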