The issue is now fixed in commit 1054. Once the R-Forge build kicks in, be sure to update, especially if you're working with the R-devel version.
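
For anyone stuck on an older build in the meantime, here is a minimal sketch of the aliasing described in the quoted message below, plus a defensive copy that sidesteps it. It assumes data.table's exported `address()` and `copy()`; the names `yy_shared` and `yy_copy` are made up for illustration, and this is not necessarily what commit 1054 does.

```r
library(data.table)

set.seed(32)
n  <- 3
dt <- data.table(y = rnorm(n), by = round(rnorm(n), 1))

ll <- list(dt$by)

# Plain extraction: depending on the R version, this may share memory
# with dt$by (as reported for R-devel) or be a separate vector (R 3.0.2).
yy_shared <- ll[[1L]]

# Explicit deep copy: a fresh vector regardless of R version, so changing
# it by reference cannot touch the column stored in dt.
yy_copy <- copy(ll[[1L]])

address(dt$by) == address(yy_shared)  # version dependent, see above
address(dt$by) == address(yy_copy)    # FALSE
identical(dt$by, yy_copy)             # TRUE: same values, different memory
```

Comparing addresses like this is only a diagnostic; the point is that anything about to be reordered by reference should work on its own copy.
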
Arun

On Thursday, December 19, 2013 at 12:56 PM, Arunkumar Srinivasan wrote:

> Just tested this on the devel version (today's). And yes, this issue happens.
> But I'm not sure if this is an issue with 'data.table' per-se:
>
> On a clean session, if you do this:
>
> require(data.table)
> set.seed(32)
> n <- 3
> dt <- data.table(y=rnorm(n), by=round( rnorm(n), 1))
>
> ll <- list(dt$by)
> yy <- ll[[1L]]
> address(dt$by) # [1] "0x7fad3c524a40"
> address(ll[[1L]]) # [1] "0x7fad3c524a40"
> address(yy) # [1] "0x7fad3c524a40"
>
> You see that all three are pointing to the same address. And that's why the
> result is wrong because internally "yy" will be changed by reference during
> "fastorder". And it is *not* supposed to point to "yy" but to have made a
> copy.
>
> After doing it the first time, the pointing changes back to how it's in
> R-stable.. Not sure if this is desirable. Probably should report on R-devel.
>
> On R-3.0.2, the same commands as above on a clean session:
>
> require(data.table)
> set.seed(32)
> n <- 3
> dt <- data.table(y=rnorm(n), by=round( rnorm(n), 1))
>
> ll <- list(dt$by)
> yy <- ll[[1L]]
> address(dt$by) # [1] "0x7fc35b640408"
> address(ll[[1L]]) # [1] "0x7fc35a0ec838"
> address(yy) # [1] "0x7fc35a0ec838"
>
> Arun
>
> On Thursday, December 19, 2013 at 9:43 AM, Arunkumar Srinivasan wrote:
>
> > Simon,
> >
> > Thanks. One more towards my way :). I think we've nailed down the problem
> > to R-devel version. I'll write again once I discuss it over with Kevin.
> >
> > Arun
> >
> > On Thursday, December 19, 2013 at 9:26 AM, Simon Zehnder wrote:
> >
> > > Hi Arun,
> > >
> > > here the results on Mac OS X Mavericks with gcc 4.8.2
> > >
> > > data.table 1.8.10:
> > >
> > > > set.seed(32)
> > > > n <- 3
> > > > dt <- data.table(
> > > + y=rnorm(n),
> > > + by=round( rnorm(n), 1)
> > > + )
> > > > dt[,
> > > + list(max=max(y, na.rm=TRUE)),
> > > + by=list(by)
> > > + ]
> > >     by        max
> > > 1: 0.7 0.01464054
> > > 2: 0.4 0.87328871
> > > > dt[,
> > > + list(max=max(y, na.rm=TRUE)),
> > > + by=list(by)
> > > + ]
> > >     by        max
> > > 1: 0.7 0.01464054
> > > 2: 0.4 0.87328871
> > >
> > > data.table 1.8.11:
> > >
> > > > set.seed(32)
> > > > n <- 3
> > > > dt <- data.table(
> > > + y=rnorm(n),
> > > + by=round( rnorm(n), 1)
> > > + )
> > > > dt[,
> > > + list(max=max(y, na.rm=TRUE)),
> > > + by=list(by)
> > > + ]
> > >     by        max
> > > 1: 0.7 0.01464054
> > > 2: 0.4 0.87328871
> > > > dt[,
> > > + list(max=max(y, na.rm=TRUE)),
> > > + by=list(by)
> > > + ]
> > >     by        max
> > > 1: 0.7 0.01464054
> > > 2: 0.4 0.87328871
> > >
> > > Best
> > >
> > > Simon
> > >
> > > On 19 Dec 2013, at 09:05, Arunkumar Srinivasan <[email protected]> wrote:
> > >
> > > > Simon, sure.
> > > >
> > > > set.seed(32)
> > > > n <- 3
> > > > dt <- data.table(
> > > >   y=rnorm(n),
> > > >   by=round( rnorm(n), 1)
> > > > )
> > > >
> > > > dt[,
> > > >   list(max=max(y, na.rm=TRUE)),
> > > >   by=list(by)
> > > > ]
> > > >
> > > > dt[,
> > > >   list(max=max(y, na.rm=TRUE)),
> > > >   by=list(by)
> > > > ]
> > > >
> > > > Arun
> > > >
> > > > On Thursday, December 19, 2013 at 8:49 AM, Simon Zehnder wrote:
> > > >
> > > > > Arun,
> > > > >
> > > > > if you could send me the reproducible code in copyable form I can as
> > > > > well try it on Mac OS X Mavericks with gcc 4.8.
> > > > >
> > > > > Best
> > > > >
> > > > > Simon
> > > > >
> > > > > On 19 Dec 2013, at 08:44, Arunkumar Srinivasan <[email protected]> wrote:
> > > > >
> > > > > > Aha, the issue seems to be with 'uniqlist', not sure why it gives
> > > > > > > (f__ = data.table:::uniqlist(byval, order=o__)) # 1,3
> > > > > >
> > > > > > 1,2,3 for you and 1,3 consistently for me. I'll revert this back to
> > > > > > `duplist` for now. Not sure how to solve this though. I've tried it
> > > > > > so far on 3 machines:
> > > > > >
> > > > > > 1) OS X 10.8.5 + llvm (gcc)
> > > > > > 2) OS X Mavericks + Clang
> > > > > > 3) Debian Wheezy + gcc
> > > > > >
> > > > > > All of them give consistent output. Man this is such a drag.
> > > > > >
> > > > > > Arun
> > > > > >
> > > > > > On Thursday, December 19, 2013 at 8:37 AM, Kevin Ushey wrote:
> > > > > >
> > > > > > > Hi Arun,
> > > > > > >
> > > > > > > Here's the output on my machine -- other information missing from
> > > > > > > before; it's with OSX Mavericks, with R and data.table compiled
> > > > > > > with Apple clang.
> > > > > > >
> > > > > > > ---
> > > > > > >
> > > > > > > > library(data.table, lib="/Users/kevinushey/Library/R/3.1/library")
> > > > > > > > set.seed(32)
> > > > > > > > n <- 3
> > > > > > > > dt <- data.table(
> > > > > > > + y=rnorm(n),
> > > > > > > + by=round( rnorm(n), 1)
> > > > > > > + )
> > > > > > > ## run one
> > > > > > > > byval <- list(by=dt$by)
> > > > > > > > (o__ <- data.table:::fastorder(byval)) # 2,3,1
> > > > > > > [1] 2 3 1
> > > > > > > > (f__ = data.table:::uniqlist(byval, order=o__)) # 1,3
> > > > > > > [1] 1 2 3
> > > > > > > > (len__ = data.table:::uniqlengths(f__, nrow(dt))) # 2,1
> > > > > > > [1] 1 1 1
> > > > > > > > (firstofeachgroup = o__[f__]) # 2,1
> > > > > > > [1] 2 3 1
> > > > > > > > (origorder = data.table:::iradixorder(firstofeachgroup)) # 2,1
> > > > > > > [1] 3 1 2
> > > > > > > > (f__ = f__[origorder]) # 3,1
> > > > > > > [1] 3 1 2
> > > > > > > > (len__ = len__[origorder]) # 2,1
> > > > > > > [1] 1 1 1
> > > > > > >
> > > > > > > ## run two
> > > > > > > > (o__ <- data.table:::fastorder(byval)) # 2,3,1
> > > > > > > [1] 1 2 3
> > > > > > > > (f__ = data.table:::uniqlist(byval, order=o__)) # 1,3
> > > > > > > [1] 1 3
> > > > > > > > (len__ = data.table:::uniqlengths(f__, nrow(dt))) # 2,1
> > > > > > > [1] 2 1
> > > > > > > > (firstofeachgroup = o__[f__]) # 2,1
> > > > > > > [1] 1 3
> > > > > > > > (origorder = data.table:::iradixorder(firstofeachgroup)) # 2,1
> > > > > > > [1] 1 2
> > > > > > > > (f__ = f__[origorder]) # 3,1
> > > > > > > [1] 1 3
> > > > > > > > (len__ = len__[origorder]) # 2,1
> > > > > > > [1] 2 1
> > > > > > >
> > > > > > > On Wed, Dec 18, 2013 at 11:22 PM, Arunkumar Srinivasan
> > > > > > > <[email protected]> wrote:
> > > > > > > > Not sure how to debug without being able to reproduce. Tried on
> > > > > > > > Mac OS X 10.8.5 and Debian GNU/Linux 7 (wheezy). I don't have
> > > > > > > > access to a Windows machine. It consistently gives me this:
> > > > > > > >
> > > > > > > > > dt[,
> > > > > > > > + list(max=max(y, na.rm=TRUE)),
> > > > > > > > + by=list(by)
> > > > > > > > + ]
> > > > > > > >     by        max
> > > > > > > > 1: 0.7 0.01464054
> > > > > > > > 2: 0.4 0.87328871
> > > > > > > > > dt[,
> > > > > > > > + list(max=max(y, na.rm=TRUE)),
> > > > > > > > + by=list(by)
> > > > > > > > + ]
> > > > > > > >     by        max
> > > > > > > > 1: 0.7 0.01464054
> > > > > > > > 2: 0.4 0.87328871
> > > > > > > >
> > > > > > > > Can either of you provide me with the output of these steps in
> > > > > > > > cases where there's an error? I've commented the output I get
> > > > > > > > for each step.
> > > > > > > >
> > > > > > > > byval <- list(by=dt$by)
> > > > > > > > o__ <- data.table:::fastorder(byval) # 2,3,1
> > > > > > > > f__ = data.table:::uniqlist(byval, order=o__) # 1,3
> > > > > > > > len__ = data.table:::uniqlengths(f__, nrow(dt)) # 2,1
> > > > > > > > firstofeachgroup = o__[f__] # 2,1
> > > > > > > > origorder = data.table:::iradixorder(firstofeachgroup) # 2,1
> > > > > > > > f__ = f__[origorder] # 3,1
> > > > > > > > len__ = len__[origorder] # 2,1
> > > > > > > >
> > > > > > > > Arun
> > > > > > > >
> > > > > > > > <...snip...>

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
