Steve,

The "x" problem you found in merge has now been fixed in 1.5.2. There was a
subtle issue here, which new FAQs 2.12 and 2.13 cover.

Matthew

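For reference, a minimal sketch of the case quoted below, assuming data.table
1.5.2 or later is installed. It rebuilds Steve's two keyed tables, calls
merge() on the key column "x" (the call that errored before the fix), and then
runs the d1[d2] join mentioned in the thread. The names merged and joined are
made up for this illustration.

```r
library(data.table)

set.seed(123)
N <- 1e5
d1 <- data.frame(x = sample(N, N), y1 = rnorm(N))
d2 <- data.frame(x = sample(N, N), y2 = rnorm(N))

# Key both tables on "x", the column name that triggered the error.
d1 <- data.table(d1, key = "x")
d2 <- data.table(d2, key = "x")

# The call from the report; expected to run without error once the fix is in.
merged <- merge(d1, d2, by = "x")

# Keyed join: look up each row of d2 in d1 by the shared key "x".
joined <- d1[d2]

# Both tables hold a permutation of 1..N in x, so every row finds a match.
nrow(merged) == N
nrow(joined) == N
```
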
On Wed, 2010-12-08 at 09:07 +0000, Matthew Dowle wrote:
> Bug raised for this one :
> https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1229&group_id=240&atid=975
>
> The "x" issue was fixed in [.data.table. It wasn't a specific fix as far as
> I remember but when the internal scoping was tidied up and made more robust.
> Maybe this is a new one in merge.data.table when it calls [.data.table.
>
> I don't use merge() btw, preferring d1[d2] syntax instead which may explain
> why this got missed.
>
> Matthew
>
> "Tom Short" <[email protected]> wrote in message
> news:[email protected]...
> On Tue, Dec 7, 2010 at 2:36 PM, Steve Lianoglou
> <[email protected]> wrote:
> > Hi,
> >
> > On Tue, Dec 7, 2010 at 2:07 PM, Matthew Dowle <[email protected]>
> > wrote:
> >>
> >> Does anyone have time to see if this post uses data.table correctly :
> >>
> >> http://stackoverflow.com/questions/4322219/whats-the-fastest-way-to-merge-join-data-frames-in-r
> >>
> >> The dt[, colMeans(cbind(x, y)), by="g1,g2"] bit looks wrong to me. Is
> >> that why it takes 131 seconds vs 2.73 for sqldf ? Shouldn't it be
> >> dt[,list(mean(x),mean(y)),by="g1,g2"] ?
> >>
> >> And also the y2= bit of dt1[dt2,list(x,y1,y2=dt2$y2)] looks odd.
> >
> > Don't know what's wrong with me today, but running this part of the
> > given example in "the obvious way" is causing data.table to error and
> > I'm not sure what I'm (obviously(?)) doing wrong:
> >
> > set.seed(123)
> > N <- 1e5
> > d1 <- data.frame(x=sample(N,N), y1=rnorm(N))
> > d2 <- data.frame(x=sample(N,N), y2=rnorm(N))
> >
> > d1 <- data.table(d1, key="x")
> > d2 <- data.table(d2, key="x")
> > merge(d1, d2, by="x")
> >
> > Error in x[, key, with = FALSE] : incorrect number of dimensions
> >
> > What am I missing?
>
> It's a problem with the column name "x". I thought we got rid of the
> naming issues a while ago.
> The following seems to work:
>
> set.seed(123)
> N <- 1e5
> d1 <- data.frame(xx=sample(N,N), y1=rnorm(N))
> d2 <- data.frame(xx=sample(N,N), y2=rnorm(N))
>
> d1 <- data.table(d1, key="xx")
> d2 <- data.table(d2, key="xx")
> merge(d1, d2)
>
> Right now, I don't have time to dig further.
>
> - Tom
>
> _______________________________________________
> datatable-help mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
