Okay, posted. Thanks, Ed. --Frank
On Sun, Oct 13, 2013 at 1:54 PM, Eduard Antonyan <[email protected]> wrote:

> Frank,
>
> Great examples!
>
> 1) it's a bug, please file a report
>
> 2-3) those sound like good FRs to me
>
> Ed
>
>
> On Sat, Oct 12, 2013 at 10:40 PM, Frank Erickson <[email protected]> wrote:
>
>> Quick follow-up: I should use rbindlist, which unsets the key.
>>
>> yy <-
>> rbindlist(list(setnames(data.table('No','NON',0L),names(DT)),DT,list('Extra','XTR',3L)))
>>
>> but maybe an rbind.data.table could be made that behaves better (in terms
>> of key maintenance) than the rbind.data.frame that is apparently called. I
>> guess this is related to my earlier thread on using unique.data.frame, in
>> that sense.
>>
>> My takeaway is: Bad things happen when creating data.tables using
>> functions designed for data.frames.
>>
>> --Frank
>>
>>
>> On Sat, Oct 12, 2013 at 11:20 PM, Frank Erickson <[email protected]> wrote:
>>
>>> So, I recently did something like this:
>>>
>>> DT <- data.table(name=c('Guff','Aw'),id=101:102,id2=1:2,key='id')
>>> y <- rbind(list('No','NON',0L),DT,list('Extra','XTR',3L))
>>> x <- data.table(id=as.character(101:102),z=1:2,key='id')
>>>
>>> Those rows I added on do not belong in the positions I pasted them into,
>>> so when I tried...
>>>
>>> options(datatable.verbose=TRUE)
>>> x[y,newcol:=name]
>>>
>>> ...it failed, silently.
>>>
>>> I'm guessing it saw the invalid key column in y and then proceeded to
>>> merge by y's column order instead. Because "name" comes before "id" (the
>>> column I thought was my key), no matches are found and newcol is not
>>> created. This is very, very confusing to see. Even with verbose on, I see
>>> no mention of "assigned to zero rows of x" or "matched on zero groups in y".
>>>
>>> I've got several problems with how this worked:
>>>
>>> (1) y should not inherit DT's key when I rbind it, or I should get a
>>> warning when rbinding a keyed data.table suggesting a better approach (that
>>> I clearly do not know about yet...?).
>>>
>>> (2) I really don't like the silent failure to assign to or create
>>> newcol. Warnings are nice.
>>>
>>> (3) It failed because DT1 had an invalid key (i.e., a "sorted" attribute
>>> on which it is not actually sorted). When I merge DT2[DT1] and it is found
>>> that DT1's key is invalid, I'd like to see (3a) a warning and (3b) it tell
>>> me explicitly that it's merging on column order instead.
>>>
>>> Note that there's a nice warning message when I reset the key:
>>>
>>> setkey(y,id)
>>> # Warning message:
>>> # In setkeyv(x, cols, verbose = verbose) :
>>> # Already keyed by this key but had invalid row order, key rebuilt. If
>>> # you didn't go under the hood please let datatable-help know so the root
>>> # cause can be fixed.
>>>
>>> What do you all think? Also, is there a right or safe way to do rbinding?
>>>
>>> Thanks,
>>>
>>> Frank
>>>
>>
>>
>> _______________________________________________
>> datatable-help mailing list
>> [email protected]
>>
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
>
>
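For anyone reading this thread later, the rbindlist route mentioned in the follow-up might look roughly like the sketch below. It rests on a few assumptions that are not in the original messages: a data.table version recent enough to support the on= argument and the i. prefix, an id column stored as character in DT so its type lines up with x, and the made-up helper names top and bottom for the extra rows. The idea is simply to bind first, confirm the result carries no key, then set the key explicitly before joining.

```r
library(data.table)

# Keyed table, as in the thread, but with id as character (assumption)
# so that it matches the type of x$id below.
DT <- data.table(name = c("Guff", "Aw"), id = c("101", "102"),
                 id2 = 1:2, key = "id")

# Hypothetical helper tables for the rows added above and below DT.
top    <- data.table(name = "No",    id = "NON", id2 = 0L)
bottom <- data.table(name = "Extra", id = "XTR", id2 = 3L)

# rbindlist() returns an unkeyed table, so no stale "sorted" attribute
# is carried over from DT.
y <- rbindlist(list(top, DT, bottom))
stopifnot(!haskey(y))

# Set the key explicitly; this sorts y by id.
setkey(y, id)

x <- data.table(id = as.character(101:102), z = 1:2, key = "id")

# Name the join column explicitly instead of relying on key or
# column order, and take name from y via the i. prefix.
x[y, newcol := i.name, on = "id"]
print(x)
```

Naming the join column with on= sidesteps the question of which columns are matched when a key is missing or invalid, which is what made the original failure so hard to spot.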
