I have a sparse contingency table (most cells are 0):
> xtabs(~.,data[,idx:(idx+4)])
, , x3 = 1, x4 = 1, x5 = 1
x2
x1 1 2 3
1 0 0 31
2 0 0 112
3 0 0 94
, , x3 = 2, x4 = 1, x5 = 1
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 3, x4 = 1, x5 = 1
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 1, x4 = 2, x5 = 1
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 2, x4 = 2, x5 = 1
x2
x1 1 2 3
1 0 0 0
2 0 18 0
3 0 27 0
, , x3 = 3, x4 = 2, x5 = 1
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 1, x4 = 3, x5 = 1
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 2, x4 = 3, x5 = 1
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 3, x4 = 3, x5 = 1
x2
x1 1 2 3
1 0 0 0
2 1 0 0
3 2 0 0
, , x3 = 1, x4 = 1, x5 = 2
x2
x1 1 2 3
1 0 0 142
2 0 0 340
3 0 0 1
, , x3 = 2, x4 = 1, x5 = 2
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 3, x4 = 1, x5 = 2
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 1, x4 = 2, x5 = 2
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 2, x4 = 2, x5 = 2
x2
x1 1 2 3
1 0 4 0
2 0 41 0
3 0 0 0
, , x3 = 3, x4 = 2, x5 = 2
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 1, x4 = 3, x5 = 2
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 2, x4 = 3, x5 = 2
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 3, x4 = 3, x5 = 2
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 1, x4 = 1, x5 = 3
x2
x1 1 2 3
1 0 0 173
2 0 0 4
3 0 0 0
, , x3 = 2, x4 = 1, x5 = 3
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 3, x4 = 1, x5 = 3
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 1, x4 = 2, x5 = 3
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 2, x4 = 2, x5 = 3
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 3, x4 = 2, x5 = 3
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 1, x4 = 3, x5 = 3
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 2, x4 = 3, x5 = 3
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
, , x3 = 3, x4 = 3, x5 = 3
x2
x1 1 2 3
1 0 0 0
2 0 0 0
3 0 0 0
Now, I can do the following to get the sparse representation 'y' for the
table above:
> idx <- 2
> y <- as.data.frame.table(xtabs(~ ., data[, idx:(idx + 4)]))
> y <- y[y$Freq > 0, ]
> y <- y[order(y$Freq, decreasing = TRUE), ]
> y
    x1 x2 x3 x4 x5 Freq
89   2  3  1  1  2  340
169  1  3  1  1  3  173
88   1  3  1  1  2  142
8    2  3  1  1  1  112
9    3  3  1  1  1   94
122  2  2  2  2  2   41
7    1  3  1  1  1   31
42   3  2  2  2  1   27
41   2  2  2  2  1   18
121  1  2  2  2  2    4
170  2  3  1  1  3    4
75   3  1  3  3  1    2
74   2  1  3  3  1    1
90   3  3  1  1  2    1
I am wondering if there is an R function, or a simple R routine, that would
help me build the data frame 'y' without using 'xtabs'. I need to study
contingency tables of 20 (or even more) dimensions, and R is unable to store a
full 3^20 contingency table (about 3.5 billion cells). But since the tables of
interest are highly sparse, I figure the problem would be much simpler if I had
something that creates the sparse representation directly.
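
To make the goal concrete, here is a minimal sketch of the kind of routine
meant: it counts only the combinations that actually occur in the data, so the
full table is never built. The function name 'sparse_counts' and the data frame
'toy' are hypothetical, made up for this illustration. It also assumes that no
factor level contains the separator character. It is an illustration only, not
a tested solution for the 20-dimensional case.

```r
# Hypothetical helper (not from any package): count each distinct row of a
# data frame without building the full contingency table.
sparse_counts <- function(df, sep = "\r") {
  # One string key per row, e.g. "2\r3\r1\r1\r2".
  # Assumption: no factor level contains the separator character.
  key <- do.call(paste, c(unname(as.list(df)), sep = sep))

  # Mark the first occurrence of every distinct key.
  first <- !duplicated(key)

  # Map every row to the position of its key among the distinct keys,
  # then count how often each position occurs.
  pos <- match(key, key[first])
  out <- df[first, , drop = FALSE]
  out$Freq <- tabulate(pos, nbins = sum(first))

  # Largest cells first, as in the data frame 'y' above.
  out <- out[order(out$Freq, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy data (made up): two factors, three distinct combinations.
toy <- data.frame(x1 = factor(c(1, 1, 2, 2, 2, 3)),
                  x2 = factor(c(1, 1, 2, 2, 2, 1)))
res <- sparse_counts(toy)
print(res)
```

The memory used grows with the number of rows in the data rather than with the
number of cells in the full table. The row names will differ from those in 'y',
since they no longer index cells of the full table.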
Any help or suggestions would be greatly appreciated.
Thanks,
A