Agreed, this is a new bug. Thanks for reporting it. Could you please file it on the R-Forge tracker (you'll then get automatic updates)? Or I can file it, I don't mind.
I will get to the bug list eventually!

Thanks, Matthew

On 10.04.2013 15:46, Shir Levkowitz wrote:

> I have encountered a bug in the Cartesian join of two data.tables, where the resulting data.table is not sorted by its full key. This is in data.table v1.8.8. Please let me know if this issue has been brought up or if there is any insight regarding it.
> Thank you,
> Shir Levkowitz
>
> -------------------------------------------------
>
> library(data.table)
>
> ###### set up our example data tables
> test1
> b=sample(1:3, 100, replace=TRUE),
> c=sample(1:10, 100,replace=TRUE))
> setkey(test1, a,b,c)
>
> test2
> q=sample(1:3, 100, replace=TRUE),
> r=sample(1:100),
> w=sample(1:100))
> setkey(test2, p,q)
>
> ###### a cartesian join - this is where the issue arises
> test.join
>
> ### have a look at the key
> k
> k
>
> ### if we do a group by, we don't get the right aggregation
> test.gb
> test.gb[a == 1 & b == 1 & c == 1,]
> ### when really what we want is:
> test.agg
> subset(test.agg, a == 1 & b == 1 & c == 1)
>
> ### if we set the same key, we get a warning
> setkeyv(test.join, k)
>>> Warning message:
> In setkeyv(test.join, k) : Already keyed by this key but had invalid row order, key rebuilt. If you didn't go under the hood please let datatable-help know so the root cause can be fixed.
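
For anyone who wants to check whether a table is affected, here is a rough sketch of a row-order check. It is not part of data.table and is not how setkeyv() detects the problem internally. The helper name rows_in_key_order is made up for illustration, and it uses base R only, so it works on plain data.frames as well.

```r
# Hypothetical helper (not a data.table function): returns TRUE when the rows
# of `x` are already in ascending order of the columns named in `cols`.
rows_in_key_order <- function(x, cols) {
  # unname() so that column names cannot be mistaken for arguments of order()
  key_cols <- unname(as.list(as.data.frame(x)[cols]))
  # order() leaves ties in their original positions, so rows that are already
  # sorted yield exactly 1, 2, ..., nrow(x)
  ord <- do.call(order, key_cols)
  identical(ord, seq_len(nrow(x)))
}

sorted   <- data.frame(a = c(1, 1, 2), b = c(1, 2, 1))
unsorted <- data.frame(a = c(1, 2, 1), b = c(1, 1, 2))

rows_in_key_order(sorted,   c("a", "b"))  # TRUE
rows_in_key_order(unsorted, c("a", "b"))  # FALSE
```

With the objects from the report, something like rows_in_key_order(test.join, key(test.join)) should come back FALSE while the bug is present, assuming test.join still carries its key.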
_______________________________________________ datatable-help mailing list [email protected] https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
