This is a feature request. I'm writing a record linkage routine and would
find a Cartesian product (cross join) function useful. In base R, this can
be done with the merge function by setting by.x and by.y to NULL:
> x = data.frame(name = c("JOE","ANN","HARRY"), age = c(20,20,30))
> y = data.frame(name = c("JOE","ANN","MIKE","LARRY"), age = c(20,20,30,30))
> merge(x, y, by.x=NULL, by.y=NULL)
   name.x age.x name.y age.y
1     JOE    20    JOE    20
2     ANN    20    JOE    20
3   HARRY    30    JOE    20
4     JOE    20    ANN    20
5     ANN    20    ANN    20
6   HARRY    30    ANN    20
7     JOE    20   MIKE    30
8     ANN    20   MIKE    30
9   HARRY    30   MIKE    30
10    JOE    20  LARRY    30
11    ANN    20  LARRY    30
12  HARRY    30  LARRY    30
However, in data.table this does not work:
> x = data.table(name = c("JOE","ANN","HARRY"), age = c(20,20,30))
> y = data.table(name = c("JOE","ANN","MIKE","LARRY"), age = c(20,20,30,30))
> merge(x, y, by.x=NULL, by.y=NULL)
Error in merge.data.table(x, y, by.x = NULL, by.y = NULL) :
Can not match keys in x and y to automatically determine appropriate `by`
parameter. Please set `by` value explicitly.
For now I've worked around this with a hack that uses expand.grid to build
all pairs of row indices, followed by several left joins:
pairs = data.table(expand.grid(1:nrow(x),1:nrow(y)))
....
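
For illustration only, here is a minimal sketch of how an index-based
workaround of this kind could look. It is not the elided code above, and it
swaps the left joins for plain row subsetting. cross_join is a hypothetical
helper name (it is not part of data.table), and the ".x"/".y" suffixes are
an assumption chosen to mirror the base merge output.

```r
library(data.table)

x <- data.table(name = c("JOE", "ANN", "HARRY"), age = c(20, 20, 30))
y <- data.table(name = c("JOE", "ANN", "MIKE", "LARRY"), age = c(20, 20, 30, 30))

# Hypothetical helper: Cartesian product of two data.tables via row indices.
cross_join <- function(a, b, suffixes = c(".x", ".y")) {
  # All (row of a, row of b) index pairs; the first index varies fastest.
  idx <- expand.grid(ia = seq_len(nrow(a)), ib = seq_len(nrow(b)))
  rows_a <- idx$ia
  rows_b <- idx$ib
  # Row subsetting returns new tables, so a and b are left unmodified.
  left  <- a[rows_a]
  right <- b[rows_b]
  # Suffix every column so the two sides stay distinguishable.
  setnames(left,  paste0(names(a), suffixes[1]))
  setnames(right, paste0(names(b), suffixes[2]))
  cbind(left, right)
}

pairs <- cross_join(x, y)
print(pairs)
```

This repeats each input row in memory, so the result has nrow(x) * nrow(y)
rows, which may be too large for big record linkage inputs without some
form of blocking first.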