As I was just writing to Kevin, I think (if @mnel could verify that his output is correct) the reason is that Kevin's using R-devel...
If you run the code below, `xx` should *not* have the same address as `dt$by` (and for me it does not). For Kevin, though, they seem to point to the same location, and I can't figure out why they would or should, given how R has worked so far.

byval <- list(by=dt$by)
address(dt$by)
# [1] "0x7fa848ad8608"
address(byval)
# [1] "0x7fa84a93fa68"
xx = byval[[1L]]
address(xx)
# [1] "0x7fa848e3fc48"
address(list(xx))
# [1] "0x7fa84aa1ba78"
data.table:::dradixorder(xx)
# [1] 2 3 1
byval
# $by
# [1] 0.7 0.4 0.4

Arun

On Thursday, December 19, 2013 at 9:36 AM, Arunkumar Srinivasan wrote:
> @mnel, I'm not sure I understand your output. Yours is different from the
> correct output, but it is also different from Kevin's. Basically, dt[,
> max(y), by=by] has no effect on yours and just returns back dt?
>
> Arun
>
>
> On Thursday, December 19, 2013 at 3:50 AM, Michael Nelson wrote:
>
> > Using
> > data.table 1.8.11 (Fresh install from r-forge today)
> > R version 3.0.2 (2013-09-25)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> >
> > I get
> >
> >     by         max
> > 1: 0.7  0.01464054
> > 2: 0.4  0.87328871
> > 3: 0.4 -1.02794620
> >
> > On both runs.
> >
> > ________________________________________
> > From: [email protected] [[email protected]] on behalf of Kevin Ushey [[email protected]]
> > Sent: Thursday, 19 December 2013 12:54 PM
> > To: [email protected]
> > Subject: [datatable-help] 'by' on a numeric column produces inconsistent output
> >
> > I'm cross-posting this from the GitHub mirror:
> > https://github.com/arunsrinivasan/datatable/issues/2
> >
> > For reference, I only see this with the latest RForge version of
> > data.table (1.8.11), not the CRAN version of data.table.
> >
> > -----
> >
> > library(data.table, lib="/Users/kevinushey/Library/R/3.1/library")
> > set.seed(32)
> > n <- 3
> > dt <- data.table(
> >   y=rnorm(n),
> >   by=round( rnorm(n), 1)
> > )
> >
> > dt[,
> >   list(max=max(y, na.rm=TRUE)),
> >   by=list(by)
> > ]
> >
> > dt[,
> >   list(max=max(y, na.rm=TRUE)),
> >   by=list(by)
> > ]
> >
> > produces the output
> >
> > > dt[,
> > +   list(max=max(y, na.rm=TRUE)),
> > +   by=list(by)
> > + ]
> >     by         max
> > 1: 0.4  0.01464054
> > 2: 0.4  0.87328871
> > 3: 0.7 -1.02794620
> > >
> > > dt[,
> > +   list(max=max(y, na.rm=TRUE)),
> > +   by=list(by)
> > + ]
> >     by        max
> > 1: 0.4  0.8732887
> > 2: 0.7 -1.0279462
> >
> > For some reason, the first return is wrong, while the second (and all
> > subsequent) output is correct. Any idea what's going on?
> >
> > > sessionInfo()
> > R Under development (unstable) (2013-12-12 r64453)
> > Platform: x86_64-apple-darwin13.0.0 (64-bit)
> >
> > locale:
> > [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
> >
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> >
> > other attached packages:
> > [1] data.table_1.8.11 knitr_1.5 devtools_1.4.1.99 BiocInstaller_1.13.3
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_3.1.0 digest_0.6.4 evaluate_0.5.1 formatR_0.10 httr_0.2 memoise_0.1
> > [7] parallel_3.1.0 plyr_1.8 RCurl_1.95-4.1 reshape2_1.2.2 stringr_0.6.2 tools_3.1.0
> > [13] whisker_0.3-2
> >
> > ---
> >
> > Kevin
> > _______________________________________________
> > datatable-help mailing list
> > [email protected]
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
