Hi,
I just want to follow up after going through the documentation once again.
############################
rtJoin<-nt[mt] # outer left join
rtJoin
identical(rtJoin,rdJoin) # False
coli<-names(rdJoin)
setcolorder(rtJoin,coli)
identical(rtJoin,rdJoin) # False again, the row order appears to be different
##########################
I think that the equivalent data.table command for a left outer join is:
nt[mt]
But identical() returned FALSE. This stayed so even after the column order was
set to be the same in the two cases. Now the row order is different.
So my next question is: how does one compare two data tables to check that
the results are the same? I have now landed on a very different question from
my original one.
Thanks for any help.
Ravi
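
As an illustration of the comparison question above, here is a minimal sketch that uses two small made-up tables (a and b are hypothetical, not the mtcars example). identical() is sensitive to both column and row order, so one option is to bring both tables to a common column order and sort them by all columns before comparing with all.equal(). The second option assumes a data.table version whose all.equal() method accepts the ignore.col.order and ignore.row.order arguments; older releases may not have them.

```r
library(data.table)

# Two hypothetical tables holding the same rows in a different
# row order and column order.
a <- data.table(id = c(1, 2, 3), val = c("x", "y", "z"))
b <- data.table(val = c("z", "x", "y"), id = c(3, 1, 2))

# identical() compares everything, including column and row order.
same_identical <- identical(a, b)

# Option 1: align column order, then sort both tables by all columns.
cols <- names(a)
a2 <- copy(a)              # copy() so the originals are not modified by reference
b2 <- copy(b)
setcolorder(b2, cols)
setkeyv(a2, cols)          # setkeyv() sorts the rows by the given columns
setkeyv(b2, cols)
same_sorted <- isTRUE(all.equal(a2, b2))

# Option 2 (assumes the installed version supports these arguments):
# let all.equal() ignore column and row order itself.
same_unordered <- isTRUE(all.equal(a, b,
                                   ignore.col.order = TRUE,
                                   ignore.row.order = TRUE))
```

all.equal() is used here instead of identical() because it compares the contents of the tables rather than every internal attribute.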
From: ravi <[email protected]>
To: "[email protected]"
<[email protected]>
Sent: Thursday, 31 December 2015, 17:20
Subject: [datatable-help] help in joining two data tables
Hi,
I have some trouble understanding the data.table procedure for joining
two tables. Let me start with two example data tables:
library(data.table)
############ the first data.table example
mt<-data.table(mtcars)
## some modifications to the data.table
s1<-1:32;s1[seq(2,32,by=2)]<-NA
mt[,"cntrl":=s1];mt[,"cylO":=cyl];mt[,"cyl":=cyl*2]
setkey(mt,gear,carb,cylO,cntrl)
mt
## More modifications
mt[gear == 3 & carb ==3 & cylO == 8 & mpg == 16.4,cntrl:=14]
str(mt)
mt
############## the second data.table example
nt<-data.table(gear=c(3,3,3),carb=c(1,3,3),cylO=c(4,8,8),price=c(11,44,55),cntrl=c(21,13,14))
setkey(nt,gear,carb,cylO,cntrl)
############# merging as a data frame
rdJoin<-merge.data.frame(mt,nt,by.x=c("gear","carb","cylO","cntrl"),by.y=c("gear","carb","cylO","cntrl"),all.x=TRUE)
str(rdJoin)
rdJoin
############## questions
# What is the data.table command to get rdJoin?
# How is it possible to specify the key variables for the join -see below
# For example, c("gear","carb") c("gear","carb","cylO") etc.
# Also, where the variables have different names in the two tables
# For example, if the cntrl variable is "cntrl1" in the first DT and "cntrl2"
# in the second
Let me elaborate on the questions shown above. First, I would like to start
with some general questions:

1. In the documentation for data.table (which includes the vignettes available
so far), it is mentioned that it is sufficient if one of the two data tables
being considered has keys. This is a bit confusing. The straightforward
situation is when both tables have keys. When would it be an advantage to
have keys for just one of them? It would be nice if this could be explained in
the to-be-released vignette on joins.

2. The merge command in base R is very clear and easy to understand. It would
be nice if the data.table procedure were transparent in the same way. To start
with, I would like to know how I can do the following things with data.table:

(i) The data.table equivalent of the base R command
merge.data.frame(mt,nt,by.x=c("gear","carb","cylO","cntrl"),by.y=c("gear","carb","cylO","cntrl"),all.x=TRUE)

(ii) How is it possible to choose the number of key variables from a list:
c("gear","carb")
c("gear","carb","cylO")
c("gear","carb","cylO","cntrl")
It is very clear in the merge command how this is done. How can that be done
with data.table? The on argument can be used for one of the tables. How can it
be specified for the other? That is, without having to use the setkey command
each time a change is needed.

(iii) How can this be done if the key variables in the two tables have
different names? That is, if the cntrl variable is "cntrl1" in the first DT and
"cntrl2" in the second, for example.

I have found the data.table package to be very useful. It would be nice if I
could understand its use better.
Thanks for any help that I can get.
Ravi
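
Questions (i) to (iii) can be illustrated with a minimal sketch on two small made-up tables (cars and price are hypothetical stand-ins for mt and nt). It assumes a data.table version that supports the on argument and the by.x/by.y arguments of the merge() method. In X[Y] every row of Y is kept, so a left outer join that keeps all rows of the first table puts that table inside the brackets. With on, the join columns are chosen per call and neither table needs a key.

```r
library(data.table)

# Hypothetical miniature tables standing in for mt and nt.
cars  <- data.table(gear = c(3, 3, 4), carb = c(1, 3, 1),
                    mpg = c(21.4, 16.4, 22.8))
price <- data.table(gear = c(3, 4), carb = c(1, 1), price = c(11, 44))

# (i) Left outer join keeping every row of `cars`.
left1 <- price[cars, on = c("gear", "carb")]
# The same join through the merge() method for data.table.
left2 <- merge(cars, price, by = c("gear", "carb"), all.x = TRUE)

# (ii) A different set of join columns is just a different `on` vector;
# no setkey() call is needed on either table.
by_gear <- price[cars, on = "gear"]

# (iii) Join columns with different names in the two tables.
cars1  <- copy(cars)
setnames(cars1, "carb", "carb1")
price2 <- copy(price)
setnames(price2, "carb", "carb2")
# In X[Y, on = ...] the vector names refer to columns of X (price2)
# and the values to columns of Y (cars1).
left3 <- price2[cars1, on = c(gear = "gear", carb2 = "carb1")]
left4 <- merge(cars1, price2,
               by.x = c("gear", "carb1"), by.y = c("gear", "carb2"),
               all.x = TRUE)
```

The bracket form returns rows in the order of the table inside the brackets, while merge() sorts by the join columns by default, so the two results can differ in row and column order even when they hold the same data.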
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help