But data.table is floating point aware. You _can_ join to floating point values, and you _can_ group by floating point values. data.table will do that within machine tolerance and take care of it for you.
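
To make the difference between exact and tolerance-based comparison concrete, here is a minimal base R sketch. The values are made up for illustration (they are not from David's DT), and it does not call data.table at all, so it says nothing about what any particular data.table version does internally. It only shows why exact uniqueness on doubles can report more distinct values than a comparison within tolerance would.

```r
# Base R only; the values below are hypothetical, chosen for illustration.
x <- c(0.1 + 0.2, 0.3, 0.5, 1.5)

# Exact comparison: 0.1 + 0.2 and 0.3 are stored as different doubles,
# so unique() keeps both.
exact_n <- length(unique(x))

# Rounding first collapses values that differ only by representation error.
rounded_n <- length(unique(round(x, 8)))

print(c(exact = exact_n, rounded = rounded_n))

# all.equal() compares within a tolerance rather than bit for bit.
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))
```

Running the same kind of check on a real column, for example comparing length(unique(DT$x)) with length(unique(round(DT$x, 8))), is one way to see whether representation error accounts for a row-count mismatch.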
So this may explain why your 'agg' only had 119 rows (because data.table is doing the rounding for you automatically), but length(unique(DT$x)) had 331? But there was a bug or two in this area a few versions ago, mentioned in NEWS. Which is why I asked for sessionInfo() and str(DT), suspecting you had a double column with a slightly older version of data.table. Or, there might be a new problem. If you have to round() in data.table, that doesn't sound right to me.

Matthew

On 10.04.2013 13:50, David Bellot wrote:

> actually I found the issue. That was not related to data.table but because I'm comparing float values, it breaks all the time if I do not round() my values before. Basically I have values like 0,1, 1.5, 0.5 etc...
> I know it's bad to do that but I'm not the boss in this project ;-)
>
> Just in case other users are reading my email, I can only advise to read that again and again:
> http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html [1]
>
> Best,
> David
>
> On Tue, Apr 9, 2013 at 11:39 AM, Matthew Dowle <[email protected] [2]> wrote:
>
>> That's odd. Please provide result of sessionInfo() and str(DT).
>>
>> Matthew
>>
>> On 09.04.2013 11:32, David Bellot wrote:
>>
>>> Hi,
>>>
>>> I have a data.table DT with one of the column named x and I other names, let's say, a1, a2, ... aN. The key of this data.table is made of a1...aN.
>>>
>>> Later on, I aggregate my DT with x like this:
>>> agg = DT[ , list(m=mean(y), c=length(y)), by = c("x") ]
>>>
>>> The problem is that "x" has 331 unique values as found by length(unique(DT$x)) but my result "agg" only has 119 rows. I tried by changing the key to "x" alone but the problem persists. My DT table has a few millions rows by the way.
>>>
>>> I'm sure I'm missing something totally obvious :-( !!!!
>>>
>>> Any idea ?
>>> Best,
>>> David

Links:
------
[1] http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
[2] mailto:[email protected]
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
