Re: [datatable-help] changing data.table by-without-by syntax to require a "by"

Arunkumar Srinivasan Tue, 30 Apr 2013 06:48:15 -0700

Eduard, thanks for your reply. But some things are still unclear to me. I'll
try to explain them below.

First, I prefer .JOIN (or cross.apply) because `each.i` sounds general, as if
it applied to *every* i operation, which as of now does not seem to be the
case. .JOIN is specific to the case where `i` is a data.table.

From what I understand from your reply, if .JOIN = FALSE, then

    DT1[DT2, y, .JOIN = FALSE] <=> DT1[DT2][, y]

Is this right? It's a bit confusing because I think you're okay with
"by-without-by", whereas I got the impression from Sadao that he finds the
"by-without-by" syntax inaccessible/too advanced for basic users. So, just to
clarify: here DT1[DT2, y, .JOIN=FALSE] will still do the "by-without-by" and
then return a "vector", right?

Matthew explains in the current documentation that DT1[DT2][, y] would "join"
all columns of DT1 and DT2 and then subset. I assume the implementation
underneath is *not* literally DT1[DT2][, y], but rather an efficient
equivalent. If so, that of course seems fine to me.
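
To make the two-step form concrete, here is a minimal sketch using the tables
from my earlier message (quoted below). It assumes the data.table package is
installed; `joined` and `res` are just illustrative names. It only shows what
DT1[DT2][, y] returns, and says nothing about how `.JOIN` would be implemented:

```r
library(data.table)

DT1 <- data.table(x = c(1, 1, 2, 3, 3), y = 1:5, z = 6:10)
setkey(DT1, "x")
DT2 <- data.table(x = c(1, 2, 1), w = 11:13)

# Step 1: join. Each row of DT2 pulls in the matching rows of DT1,
# and the result carries the columns of both tables.
joined <- DT1[DT2]

# Step 2: extract y from the joined table as a plain vector.
res <- joined[, y]
print(res)
```

The join step builds every column of the combined table before y is pulled
out, which is why I would hope for a cheaper equivalent underneath.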

If what I've said so far is right, then the syntax `DT1[DT2, .JOIN=FALSE]`
makes no sense/has no purpose to me. At least I can't think of one at the
moment.

To conclude, IMHO, if the purpose of `.JOIN` is to make DT1[DT2, j] behave
like DT1[i, j] (j being a column, or an expression that evaluates to a scalar
for every group under the current by-without-by syntax), then I think this is
already covered by `drop = TRUE/FALSE`. Correct me if I am wrong, but one
could write `DT1[DT2, j, drop=TRUE]` instead of `DT1[DT2, j, .JOIN=FALSE]`,
and DT1[i, j, drop=FALSE] instead of DT1[i, list(x,y)].

If you or anyone else believes this is wrong, I'm all ears: what is the
purpose of `drop` then, and why does it *not* fit here as well as .JOIN?

Arun

On Tuesday, April 30, 2013 at 2:54 PM, Eduard Antonyan wrote:

> Arun,
> 
> If the new boolean is false, the result would be the same as without it and 
> would be equal to current behavior of d[i][, j]. If it's true, it will only 
> have an effect if i is a join (I think each.i= fits slightly better for this 
> description than .join=) - this will replicate current underlying behavior. 
> If you think the cross-apply is something that could work not just for i 
> being a data-table but other things as well, then it would make perfect sense 
> to implement that action too when the bool is true. 
> 
> On Apr 30, 2013, at 2:58 AM, Arunkumar Srinivasan wrote:
> 
> > (The earlier message was too long and was rejected.) 
> > So, from the discussion so far, I see that Matthew is nice enough to 
> > implement `.JOIN` or `cross.apply`. I've a couple of questions. Suppose,
> > 
> >     DT1 <- data.table(x=c(1,1,2,3,3), y=1:5, z=6:10) 
> >     setkey(DT1, "x")
> >     DT2 <- data.table(x=1)
> >     # I guess the syntax is something like this; I expect the same output
> >     # as the current DT1[DT2, y].
> >     DT1[DT2, y, .JOIN=TRUE]
> > 
> > The above syntax seems "okay". But my first question is what is 
> > `.JOIN=FALSE` supposed to do under these two circumstances? Suppose, 
> > 
> >     DT1 <- data.table(x=c(1,1,2,3,3), y=1:5, z=6:10) 
> >     setkey(DT1, "x")
> >     DT2 <- data.table(x=c(1,2,1), w=c(11:13))
> >     # what is the output supposed to be for each of these?
> >     DT1[DT2, y, .JOIN=FALSE]
> >     DT1[DT2, .JOIN = FALSE]
> > 
> > Depending on this I'd have to think about `drop = TRUE/FALSE`. Also, how 
> > does it work with `subset`? 
> > 
> >     DT1[x %in% c(1,2,1), y, .JOIN=TRUE] # .JOIN is ignored?
> > 
> > Is this supposed to also do a "cross-apply" on the logical subset? I guess 
> > not. So, .JOIN is an "extra" parameter that comes into play *only* when `i` 
> > is a `data.table`? 
> > 
> > I'd love to have some replies to these questions for me to take a stance on 
> > `.JOIN`. Thank you.
> > 
> > Best,
> > Arun.

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help