Hi,

On Tue, Dec 7, 2010 at 2:07 PM, Matthew Dowle <[email protected]> wrote:
> Does anyone have time to see if this post uses data.table correctly:
>
> http://stackoverflow.com/questions/4322219/whats-the-fastest-way-to-merge-join-data-frames-in-r
>
> The dt[, colMeans(cbind(x, y)), by="g1,g2"] bit looks wrong to me. Is
> that why it takes 131 seconds vs 2.73 for sqldf? Shouldn't it be
> dt[,list(mean(x),mean(y)),by="g1,g2"]?
>
> And also the y2= bit of dt1[dt2,list(x,y1,y2=dt2$y2)] looks odd.

I don't know what's wrong with me today, but running this part of the
given example in "the obvious way" causes data.table to error, and I'm
not sure what I'm (obviously?) doing wrong:

  library(data.table)
  set.seed(123)
  N <- 1e5
  d1 <- data.frame(x=sample(N,N), y1=rnorm(N))
  d2 <- data.frame(x=sample(N,N), y2=rnorm(N))
  d1 <- data.table(d1, key="x")
  d2 <- data.table(d2, key="x")
  merge(d1, d2, by="x")
  Error in x[, key, with = FALSE] : incorrect number of dimensions

What am I missing?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
