Hi Frank,

Many thanks for the thoughts.

It is something that has to do with the keys, that is for sure (see below). A 
double DT join does not understand that it has to do one join by one variable 
and the other by another variable: the output simply has lines where the two 
keys are the same (ctpty_head==iss_head),which is of course not optimal.

how do I tell DT to do separate matchings at each join?  Setting 
setkey(links,ctpty_head,iss_head) before the join does not work either.

> key(g_ctpty[links])
NULL
> key(g_ctpty)
[1] "ctpty_head"
> key(g_iss)
[1] "iss_head"
> key(links)
NULL

Hi Arunkumar,

An example would look like follows:

g_ctpty = data.table(ctpty_head=c("a","b","c"), ctpty_cty=c("US","DE","JP"))
g_iss = data.table(iss_head=c("a","b","c"), iss_cty=c("US","DE","JP"))
links = data.table(ctpty_head=c("a","b","c"), iss_head=c("b","b","a"))

merge(
+     merge(links, g_ctpty, all.x = TRUE, by = "ctpty_head"),
+     g_iss,
+     all.x = TRUE, by = "iss_head"
+   )
   iss_head ctpty_head ctpty_cty iss_cty
1:        a          c        JP      US
2:        b          a        US      DE
3:        b          b        DE      DE
> g_iss[g_ctpty[links]]
   iss_head iss_cty ctpty_cty iss_head.1
1:        a      US        US          b
2:        b      DE        DE          b
3:        c      JP        JP          a
> setkey(links,ctpty_head,iss_head)
> g_iss[g_ctpty[links]]
   iss_head iss_cty ctpty_cty iss_head.1
1:        a      US        US          b
2:        b      DE        DE          b
3:        c      JP        JP          a

On 17 Jun 2013, at 14:36, Frank Erickson <[email protected]> wrote:

> I think that key(g_ctpty[links]) == key(g_ctpty) == "ctpty_head", which is 
> used when you do your second merge with [, instead of "iss_head". You can 
> check this by running key(g_ctpty[links]) . --Frank
> 
> 
> On Mon, Jun 17, 2013 at 1:57 AM, Arunkumar Srinivasan <[email protected]> 
> wrote:
> Since you have the data as well, why not provide it (a small part at least 
> with which your issue is reproducible)? Isn't it much easier than to ask 
> everyone who's willing to help to create a data and test your code?
> 
> Arun
> 
> On Monday, June 17, 2013 at 8:54 AM, Ivan Alves wrote:
> 
>> Dear all,
>> 
>> I am not sure I understand the syntax for merging data.tables. I have keyed 
>> the two 'satelite' tables from which I want to match information to the main 
>> table 'links'
>> 
>> g_ctpty <- gultimate[,list(ctpty_head,ctpty_cty)]
>> setkey(g_ctpty,ctpty_head)
>> g_iss <- gultimate[,list(iss_head,iss_cty)]
>> setkey(g_iss,iss_head)
>> 
>> Why are the two below not equivalent?
>> 
>> This works:
>> 
>> data = merge(
>> merge(links, g_ctpty, all.x = TRUE, by = "ctpty_head"),
>> g_iss,
>> all.x = TRUE, by = "iss_head"
>> ),
>> 
>> And this does not:
>> 
>> data = g_iss[g_ctpty[links]],
>> 
>> Any guidance would be appreciated.
>> Kind regards,
>> Ivan
>> _______________________________________________
>> datatable-help mailing list
>> [email protected]
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> 
> 
> _______________________________________________
> datatable-help mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> 

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

Reply via email to