Bug raised for this one : https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1229&group_id=240&atid=975
The "x" issue was fixed in [.data.table. It wasn't a specific fix as far
as I remember but when the internal scoping was tidied up and made more
robust. Maybe this is a new one in merge.data.table when it calls
[.data.table. I don't use merge() btw, preferring d1[d2] syntax instead
which may explain why this got missed.

Matthew

"Tom Short" <[email protected]> wrote in message
news:[email protected]...
On Tue, Dec 7, 2010 at 2:36 PM, Steve Lianoglou <[email protected]> wrote:
> Hi,
>
> On Tue, Dec 7, 2010 at 2:07 PM, Matthew Dowle <[email protected]>
> wrote:
>>
>> Does anyone have time to see if this post uses data.table correctly :
>>
>> http://stackoverflow.com/questions/4322219/whats-the-fastest-way-to-merge-join-data-frames-in-r
>>
>> The dt[, colMeans(cbind(x, y)), by="g1,g2"] bit looks wrong to me. Is
>> that why it takes 131 seconds vs 2.73 for sqldf ? Shouldn't it be
>> dt[,list(mean(x),mean(y)),by="g1,g2"] ?
>>
>> And also the y2= bit of dt1[dt2,list(x,y1,y2=dt2$y2)] looks odd.
>
> Don't know what's wrong with me today, but running this part of the
> given example in "the obvious way" is causing data.table to error and
> I'm not sure what I'm (obviously(?)) doing wrong:
>
> set.seed(123)
> N <- 1e5
> d1 <- data.frame(x=sample(N,N), y1=rnorm(N))
> d2 <- data.frame(x=sample(N,N), y2=rnorm(N))
>
> d1 <- data.table(d1, key="x")
> d2 <- data.table(d2, key="x")
> merge(d1, d2, by="x")
>
> Error in x[, key, with = FALSE] : incorrect number of dimensions
>
> What am I missing?

It's a problem with the column name "x". I thought we got rid of the
naming issues a while ago. The following seems to work:

set.seed(123)
N <- 1e5
d1 <- data.frame(xx=sample(N,N), y1=rnorm(N))
d2 <- data.frame(xx=sample(N,N), y2=rnorm(N))

d1 <- data.table(d1, key="xx")
d2 <- data.table(d2, key="xx")
merge(d1, d2)

Right now, I don't have time to dig further.

- Tom

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
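
For comparison, here is a minimal sketch of the d1[d2] join syntax mentioned in the reply, run on the same simulated data as the workaround. It assumes a reasonably current data.table release, so behaviour on the 2010 version discussed in this thread may differ. The names joined and merged are illustrative only, and the key column is called "xx" to stay clear of the "x" naming problem.

```r
library(data.table)

set.seed(123)
N <- 1e5

# Same simulated data as in the thread, with the key column named "xx"
d1 <- data.table(xx = sample(N, N), y1 = rnorm(N), key = "xx")
d2 <- data.table(xx = sample(N, N), y2 = rnorm(N), key = "xx")

# Keyed join: look up each row of d2 in d1 using the key column
joined <- d1[d2]

# merge() on the shared key, as in the workaround
merged <- merge(d1, d2)

# Both results should carry the key column plus y1 and y2
print(names(joined))
print(names(merged))
```

Both xx columns are permutations of 1..N, so every key should find a match and each result should have N rows.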
