Gabor, I think I now understand what your earlier post was about. You mean that once the external by-without-by is in place, doing DT1[DT2, ..., ] will be faster because it no longer does a by-without-by. Yes, that's true. So basically, the statement:
    dty[dtx, abs(x - y), roll = "nearest"]

once external by-without-by is implemented, will/should first do the join and then do the "j" operation. And therefore it'll be as fast as the solution I wrote. If one wants to perform the j-operation for each group, then they'll have to do something like DT1[, j, by=DT2] (or any other solution we end up on).

Sorry for the misunderstanding.

On Thu, Feb 6, 2014 at 3:20 PM, Gabor Grothendieck <[email protected]> wrote:
> On Thu, Feb 6, 2014 at 8:53 AM, Arunkumar Srinivasan
> <[email protected]> wrote:
> > Not really. Because it still doing a "by". Meaning, for every grouping in
> > "by" - abs(x-y) will be evaluated. If there are 1e5 groups, there'll be 1e5
> > calls. And that can be expensive depending on the function + the time to
> > call eval from within C.
> >
> > However, since it's not necessary to do a by-without-by, we can perform the
> > join and then compute once the difference between columns. There's no
> > grouping, no eval from C, and no multiple calls to abs. Hope this clears it
> > up?
> >
> >
>
> In that case what is the proposed user interface?
>
> I thought that the idea was that one would have to explicitly specify
> the by= clause for by-within-by it to occur. In the code I had just
> posted there is a join = "nearest" but no by= clause is specified.
>
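To make the difference between the two evaluation modes discussed above concrete, here is a minimal sketch. The toy tables and the helper column k are made up for illustration (the thread does not show the real data), and by = .EACHI is the spelling that later data.table releases use for the explicit per-group form; it is not the DT1[, j, by=DT2] syntax floated above. It assumes a data.table version in which j without by= is evaluated once on the join result.

```r
library(data.table)

# Hypothetical toy data; not taken from the thread.
dtx <- data.table(x = c(1.2, 4.7, 9.1))
dty <- data.table(y = c(1, 5, 10))

# Join on a copy of each column (k, an illustrative name) so that
# both x and y remain available to j after the rolling join.
dtx[, k := x]
dty[, k := y]
setkey(dtx, k)
setkey(dty, k)

# Join first, then evaluate j once over all matched rows:
# a single vectorised call to abs().
once <- dty[dtx, abs(x - y), roll = "nearest"]

# Explicit per-group evaluation: j is evaluated once per row of dtx.
per_group <- dty[dtx, abs(x - y), roll = "nearest", by = .EACHI]
```

Both forms give the same differences here; the first simply avoids one evaluation of j per group, which is where the cost comes from when there are many groups.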
