Hi,
I just want to follow up, after going through the documentation once again.
############################
rtJoin<-nt[mt]  # left outer join
rtJoin
identical(rtJoin,rdJoin) # FALSE
coli<-names(rdJoin)
setcolorder(rtJoin,coli)
identical(rtJoin,rdJoin) # FALSE again; the row order appears to be different
##########################
I think that the equivalent data.table command for a left outer join is:
nt[mt]
But identical() returned FALSE. This stayed so even after the column order
was set to be the same in the two cases. Now, the row order is different.
So, my next question is: how does one compare two data tables to check that
the results are the same?
I have now landed in a very different question from my original one.
Thanks for any help.
Ravi
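
One possible way to approach the comparison question is sketched below. identical() also compares class and attributes (such as the key), so it is a very strict test. The sketch instead copies both tables, aligns the column order, sorts the rows on all columns, and compares with all.equal(). The toy tables a and b and the helper same_rows() are hypothetical names introduced only for illustration; they are not part of data.table.

```r
library(data.table)

# Hypothetical toy tables: same rows, different row and column order.
a <- data.table(id = c(1, 2, 3), x = c("p", "q", "r"))
b <- data.table(x = c("r", "p", "q"), id = c(3, 1, 2))

# Hypothetical helper (not a data.table function): TRUE when both inputs
# hold the same rows, ignoring row order, column order, keys and class.
same_rows <- function(d1, d2) {
  d1 <- copy(as.data.table(d1))   # copy so the caller's tables are untouched
  d2 <- copy(as.data.table(d2))
  if (!setequal(names(d1), names(d2))) return(FALSE)
  setcolorder(d2, names(d1))      # same column order in both
  setorderv(d1, names(d1))        # sort rows on all columns
  setorderv(d2, names(d2))
  isTRUE(all.equal(d1, d2, check.attributes = FALSE))
}

identical(a, b)   # strict comparison, fails on ordering alone
same_rows(a, b)   # order-insensitive comparison
```

With the tables from this thread, the call would presumably be same_rows(rtJoin, rdJoin), assuming both hold the same set of columns.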


      From: ravi <[email protected]>
 To: "[email protected]" 
<[email protected]> 
 Sent: Thursday, 31 December 2015, 17:20
 Subject: [datatable-help] help in joining two data tables
   
Hi,
I have some trouble understanding the data.table procedure for joining two
tables. Let me start with two example data tables:
library(data.table)
############ the first data.table example
mt<-data.table(mtcars)
## some modifications to the data.table
s1<-1:32;s1[seq(2,32,by=2)]<-NA
mt[,"cntrl":=s1];mt[,"cylO":=cyl];mt[,"cyl":=cyl*2]
setkey(mt,gear,carb,cylO,cntrl)
mt
##  More modifications
mt[gear == 3 & carb ==3 & cylO == 8 & mpg == 16.4,cntrl:=14]
str(mt)
mt
############## the second data.table example
nt<-data.table(gear=c(3,3,3),carb=c(1,3,3),cylO=c(4,8,8),price=c(11,44,55),cntrl=c(21,13,14))
setkey(nt,gear,carb,cylO,cntrl)
############# merging as a data frame
rdJoin<-merge.data.frame(mt,nt,by.x=c("gear","carb","cylO","cntrl"),by.y=c("gear","carb","cylO","cntrl"),all.x=TRUE)
str(rdJoin)
rdJoin
############## questions
# What is the data.table command to get rdJoin?
# How is it possible to specify the key variables for the join -see below
# For example, c("gear","carb")      c("gear","carb","cylO")   etc.
# Also, where the variables have different names in the two tables
# For example, if the cntrl variable is "cntrl1" in the first DT and "cntrl2" in the second
Let me elaborate on the questions shown above. First, I would like to start
with some general questions:
1. In the documentation for data.table (which includes the vignettes
available so far), it is mentioned that it is sufficient if one of the two
data tables being considered has keys. This is a bit confusing. The
straightforward situation is if both the tables have keys. When would it be
of advantage to have keys for just one of them? It would be nice if this can
be explained in the to-be-released vignette on joins.
2. The merge command in base R is very clear and easy to understand. It would
be nice if the data table procedure is transparent in the same way. To start
with, I would like to know how I can do the following things with data table:
   (i) The data.table equivalent of the base R command
       merge.data.frame(mt,nt,by.x=c("gear","carb","cylO","cntrl"),by.y=c("gear","carb","cylO","cntrl"),all.x=TRUE)
   (ii) How is it possible to choose the number of key variables from a list:
       c("gear","carb")
       c("gear","carb","cylO")
       c("gear","carb","cylO","cntrl")
       It is very clear in the merge command how this is done. How to do that
       with data.table? The on argument can be used for one of the tables.
       How can it be specified for the other? That is, without having to use
       the setkey command each time a change is needed.
   (iii) How can this be done if the key variables in the two tables have
       different names? That is, if the cntrl variable is "cntrl1" in the
       first DT and "cntrl2" in the second, for example.
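
As a rough illustration of points (i) to (iii), the sketch below uses small toy tables rather than mt and nt. It assumes data.table 1.9.6 or later, where the on argument and the by.x/by.y arguments of merge() are available. The names left, right, right2 and carb2 are hypothetical and chosen only for this example.

```r
library(data.table)

# Hypothetical toy tables standing in for mt (left) and nt (right).
left  <- data.table(gear = c(3, 3, 4), carb = c(1, 3, 1),
                    mpg = c(21.5, 16.4, 22.8))
right <- data.table(gear = c(3, 3), carb = c(1, 3), price = c(11, 44))

# (i) Left outer join with merge(): all rows of `left` are kept,
# and rows without a match in `right` get NA.
j1 <- merge(left, right, by = c("gear", "carb"), all.x = TRUE)

# The same join in X[Y] form. The rows come from the table inside the
# brackets, so `left` goes inside to keep all of its rows.
j2 <- right[left, on = c("gear", "carb")]

# (ii) The join columns are whatever `on` lists, so they can be changed
# per call without setkey().

# (iii) Different column names: in a named `on` vector, the name is the
# column of the outer table and the value is the column of the inner table.
right2 <- copy(right)
setnames(right2, "carb", "carb2")
j3 <- right2[left, on = c(gear = "gear", carb2 = "carb")]

# merge() equivalent with by.x / by.y.
j4 <- merge(left, right2, by.x = c("gear", "carb"),
            by.y = c("gear", "carb2"), all.x = TRUE)
```

Neither table needs a key here, since on names the join columns directly for both sides.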
I have found the data.table package to be very useful. It would be nice if I 
can understand its use better.
Thanks for any help that I can get.
Ravi





_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

  