I have a large dataframe (1400x1400) containing a symmetric similarity matrix.
I would like to extract subsets of elements in which every element has a
specific similarity to every other element of the subset.
For example, if the data look like this:
      Spl1  Spl2  Spl3  Spl4  Spl5  [...]
Spl1  1     0.125 0.000 0.000 0.125
Spl2  0.125 1     0.000 0.000 0.125
Spl3  0.000 0.000 1     0.000 0.500
Spl4  0.000 0.000 0.000 1     0.750
Spl5  0.125 0.125 0.500 0.750 1
[...]
I am looking for a way either to extract all elements that have a mutual
similarity of 0, e.g.:
      Spl1  Spl3  Spl4  [...]
Spl1  1     0.000 0.000
Spl3  0.000 1     0.000
Spl4  0.000 0.000 1
[...]
Or all elements that have a mutual similarity of 0.125:
      Spl1  Spl2  Spl5  [...]
Spl1  1     0.125 0.125
Spl2  0.125 1     0.125
Spl5  0.125 0.125 1
[...]
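
To make the request concrete: in graph terms, each such subset is a maximal clique of the graph that links two samples whenever their similarity equals the chosen value. Below is a minimal base-R sketch using the basic Bron-Kerbosch recursion on the five samples shown above. The names `mutual_subsets` and `tol` are made up for illustration, and I have not checked how this scales to 1400 samples (the number of maximal cliques can grow very quickly).

```r
# Example data from above (symmetric similarity matrix).
spl <- paste0("Spl", 1:5)
sim <- matrix(c(1,     0.125, 0,   0,    0.125,
                0.125, 1,     0,   0,    0.125,
                0,     0,     1,   0,    0.5,
                0,     0,     0,   1,    0.75,
                0.125, 0.125, 0.5, 0.75, 1),
              nrow = 5, byrow = TRUE, dimnames = list(spl, spl))

# Hypothetical helper: all maximal subsets (size >= 2) whose members
# pairwise have similarity `value`, up to a tolerance `tol`.
mutual_subsets <- function(sim, value, tol = 1e-8) {
  adj <- abs(sim - value) < tol   # TRUE where a pair has the requested value
  diag(adj) <- FALSE              # ignore self-similarity
  found <- list()
  # Basic Bron-Kerbosch: r = current subset, p = candidates, x = excluded
  bk <- function(r, p, x) {
    if (length(p) == 0 && length(x) == 0) {
      found[[length(found) + 1]] <<- rownames(sim)[sort(r)]
      return(invisible(NULL))
    }
    for (v in p) {
      nb <- unname(which(adj[v, ]))            # neighbours of v
      bk(c(r, v), intersect(p, nb), intersect(x, nb))
      p <- setdiff(p, v)                       # v is done as a candidate
      x <- c(x, v)                             # and is excluded from now on
    }
  }
  bk(integer(0), seq_len(nrow(adj)), integer(0))
  Filter(function(s) length(s) > 1, found)     # drop isolated samples
}

zero_sets   <- mutual_subsets(sim, 0)
eighth_sets <- mutual_subsets(sim, 0.125)
print(zero_sets)
print(eighth_sets)
sim[eighth_sets[[1]], eighth_sets[[1]]]        # sub-matrix for one subset
```

On this small example the value 0 gives two overlapping subsets (Spl1/Spl3/Spl4 and Spl2/Spl3/Spl4), so the subsets are not necessarily disjoint. For the full matrix, a dedicated graph package (for example `max_cliques()` in igraph) is probably a better choice than this recursion.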
Or alternatively to sort the table so that this info can easily be obtained by
looking for blocks around the diagonal, like this:
      Spl3  Spl4  Spl1  Spl2  Spl5  [...]
Spl3  1     0.000 0.000 0.000 0.500
Spl4  0.000 1     0.000 0.000 0.750
Spl1  0.000 0.000 1     0.125 0.125
Spl2  0.000 0.000 0.125 1     0.125
Spl5  0.500 0.750 0.125 0.125 1
[...]
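
For the sorting variant, one idea I can sketch (an assumption on my part, not a tested solution) is to treat the absolute difference from the target value as a distance and reorder the matrix with complete-linkage clustering from base R. With complete linkage, any cluster formed at height 0 contains only samples whose pairwise similarity equals the target, so such groups end up as blocks along the diagonal. The variable names are illustrative, and the exact order depends on how ties are broken.

```r
# Example data from above (symmetric similarity matrix).
spl <- paste0("Spl", 1:5)
sim <- matrix(c(1,     0.125, 0,   0,    0.125,
                0.125, 1,     0,   0,    0.125,
                0,     0,     1,   0,    0.5,
                0,     0,     0,   1,    0.75,
                0.125, 0.125, 0.5, 0.75, 1),
              nrow = 5, byrow = TRUE, dimnames = list(spl, spl))

value <- 0                               # target similarity

# Distance = deviation from the target value (the diagonal is ignored by as.dist)
d  <- as.dist(abs(sim - value))
hc <- hclust(d, method = "complete")

# Reorder rows and columns so that clustered samples sit next to each other
sorted <- sim[hc$order, hc$order]
print(sorted)

# Groups in which all pairs match the target (within a small tolerance)
groups <- cutree(hc, h = 1e-8)
print(split(names(groups), groups))
```

Unlike an exhaustive search, this assigns each sample to exactly one group, so overlapping subsets are not all reported.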
Any help is much appreciated!
Helmut Bürgmann, Switzerland