If I understood you correctly, you have this matrix of indicator variables for
occurrences of terms in documents:
A <- matrix(c(1,1,0,0,1,1,1,0,1,1,1,0,0,0,1), nrow=3, byrow=TRUE,
dimnames=list(paste("doc",1:3), paste("term",1:5)))
A
and want to determine co-occurrence counts for pairs of
On Nov 11, 2010, at 4:44 AM, Stefan Evert wrote:
Pasted and realigned from original posting:
      term1 term2 term3 term4 term5
term1     0     2     0     1     3
term2     2     0     0     1     2
term3     0     0     0     0     0
term4     1     1     0     0     1
term5     3     2     0     1     1
Any ideas on how to do that?
If I understood you correctly, you have this matrix of
On 12 November 2010 02:21, David Winsemius dwinsem...@comcast.net wrote:
The fastest and easiest solution is
t(A) %*% A
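As a rough sketch of how that product behaves on the example data: the block below rebuilds the 3 x 5 indicator matrix and multiplies its transpose by itself. The name `cooc`, the use of `paste0` for compact labels, and the step that zeroes the diagonal are assumptions added for illustration; they are not part of the original suggestion.

```r
# Document-term indicator matrix from the example (rows = documents, columns = terms).
# paste0 is used here (an assumption) so labels read "term1" rather than "term 1".
A <- matrix(c(1,1,0,0,1,
              1,1,0,1,1,
              1,0,0,0,1), nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3), paste0("term", 1:5)))

# Entry [i, j] counts the documents that contain both term i and term j.
cooc <- t(A) %*% A

# Assumption: clear the diagonal, which otherwise holds each term's own document count.
diag(cooc) <- 0
cooc
```

Without the `diag(cooc) <- 0` line, the diagonal reports how many documents contain each term, which may or may not be wanted.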
That is really elegant. (Wish I could remember my linear algebra lessons as
well from forty years ago.) I checked it against the specified output and
found that with one
Hi:
Another alternative is
crossprod(A)
which is meant to produce an optimized A'A (not the vector cross-product
from intro physics :)
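A quick way to convince yourself that the two forms agree is to compare them directly. This is only a minimal check; the seed and the variable name `same` are hypothetical and serve just to make the sketch reproducible.

```r
set.seed(1)  # hypothetical seed, only for reproducibility
A <- matrix(rpois(9, 10), ncol = 3)

# crossprod(A) computes t(A) %*% A without forming the transpose explicitly
same <- all.equal(crossprod(A), t(A) %*% A)
same
```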
Example:
A <- matrix(rpois(9, 10), ncol = 3)
A
     [,1] [,2] [,3]
[1,]    6   10   14
[2,]    7    5   16
[3,]   12   16   10
t(A) %*% A
[,1] [,2]
Hi all,
I am trying to construct a pairwise co-occurrence matrix for certain terms
appearing in a number of documents. For example I have the following table
with binary values showing the presence or absence of a certain term in a
document:
     term1 term2 term3 term4 term5
doc1     1     1     0     0     1
doc2     1
Hi Tax,
Because the list does not accept HTML messages (per posting guide),
your message was converted to plain text, and your table is difficult
to read. My suggestion would be to start with:
?table
?xtabs
If you make up a minimal example of the data you have, and email it to
us we can give
Hi Tax,
I played around with several different functions. I keep thinking
that there should be an easier/faster way, but this is what I came up
with. Given the speed tests, it looks like foo4 is the best option
(they all give identical results).
The functions
foo1 <- function(object)