I forgot to say that there are no ties within a row, so any number can occur
only once in each row. Also, as I mentioned earlier, I only need the top 50
most frequent pairs. Is there a more efficient way to get them? I have 15000
numbers, so the output for all the pairs would be too long.

Thank you,

Cindy
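
One possible way to get only the most frequent pairs is to work on whole columns at once instead of row by row, then sort the counts and keep the first few. The following is a rough sketch, not a tested solution for the real data: `top_pairs` and `n_top` are made-up names, and it assumes (as stated above) that no number repeats within a row, so each pair is counted at most once per row. It also builds one key string per pair per row, which may be heavy for a very large matrix.

```r
# Hypothetical helper: count unordered pairs that share a row and
# return the n_top most frequent ones.
top_pairs <- function(m, n_top = 50) {
  idx <- combn(ncol(m), 2)               # every pair of column indices
  a <- m[, idx[1, ], drop = FALSE]       # first member of each pair, all rows
  b <- m[, idx[2, ], drop = FALSE]       # second member of each pair, all rows
  # smaller value first, so (2,6) and (6,2) give the same key
  keys <- paste(pmin(a, b), pmax(a, b), sep = ".")
  counts <- sort(table(keys), decreasing = TRUE)
  counts[seq_len(min(n_top, length(counts)))]
}

# The 4 x 4 example matrix from the original question
prmtx <- matrix(c(2, 5, 1, 6,
                  1, 7, 8, 2,
                  3, 7, 6, 2,
                  9, 8, 5, 7), nrow = 4, byrow = TRUE)
top4 <- top_pairs(prmtx, n_top = 4)
```

For the real problem the call would presumably be `top_pairs(yourmatrix, 50)`; pairs tied at the cut-off are kept or dropped arbitrarily.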

On Mon, Nov 16, 2009 at 7:02 AM, David Winsemius <dwinsem...@comcast.net> wrote:

> I stuck in another "7" in one of the lines with a 2 and reasoned that we
> could deal with the desire for non-ordered "pair counting" by pasting
> min(x,y) to max(x,y);
>
> > dput(prmtx)
> structure(c(2, 1, 3, 9, 5, 7, 7, 8, 1, 7, 6, 5, 6, 2, 2, 7), .Dim = c(4L,
> 4L))
> > prmtx
>     [,1] [,2] [,3] [,4]
> [1,]    2    5    1    6
> [2,]    1    7    7    2
> [3,]    3    7    6    2
> [4,]    9    8    5    7
>
> > pair.str <- sapply(1:nrow(prmtx), function(z)  apply(combn(prmtx[z,], 2),
> 2,function(x) paste(min(x[2],x[1]), max(x[2],x[1]), sep=".")))
>
> The logic:
> sapply(1:nrow(prmtx), ... just loops over the rows of the matrix.
> combn(prmtx[z,], 2)  ... returns a two-row matrix holding every
> combination of two elements from a single row, one combination per column.
> apply(combn(prmtx[z,], 2), 2 ... since combn( , 2) returns a matrix that
> has two _rows_, I needed to loop over the columns.
> paste(min(x[2],x[1]), max(x[2],x[1]), sep=".") ... sticks the minimum of a
> pair in front of the maximum and separates them with a period, so that
> multi-digit numbers cannot produce ambiguous keys.
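
To make the steps above concrete, here is a small sketch that applies them to a single row and then shows why the separator matters. The variable names (`row1`, `cmb`, `keys`, `no_sep_a`, `no_sep_b`) are only for illustration and do not appear in the thread.

```r
row1 <- c(2, 5, 1, 6)          # first row of the example matrix
cmb  <- combn(row1, 2)         # 2 x 6 matrix, one pair per column

# For each column, put the smaller value first and join with a period
keys <- apply(cmb, 2, function(x) paste(min(x), max(x), sep = "."))

# Without a separator, two different pairs can give the same string
no_sep_a <- paste(1, 23, sep = "")   # the pair 1 and 23
no_sep_b <- paste(12, 3, sep = "")   # the pair 12 and 3
```

Both `no_sep_a` and `no_sep_b` come out as "123", which is the collision the period is there to avoid.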
>
> Then using table() and logical tests in an index for the desired multiple
> pairs:
>
>
> > tpair <-table(pair.str)
> > tpair
> pair.str
> 1.2 1.5 1.6 1.7 2.3 2.5 2.6 2.7 3.6 3.7 5.6 5.7 5.8 5.9 6.7 7.7 7.8 7.9 8.9
>  2   1   1   2   1   1   2   3   1   1   1   1   1   1   1   1   1   1   1
>
> > tpair[tpair>1]
> pair.str
> 1.2 1.7 2.6 2.7
>  2   2   2   3
>
> --
> David.
>
>
> On Nov 16, 2009, at 7:02 AM, David Winsemius wrote:
>
> I'm not convinced it's right. In fact, I'm pretty sure the last step taking
>> only the first half of the list is wrong. I also do not know if you have
>> considered how you want to count situations like:
>>
>> 3 2 7 4 5 7 ...
>> 7 3 8 6 1 2 9 2 ......
>>
>> How many "pairs" of 2-7/7-2 would that represent?
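
One way to answer that question is to count a pair at most once per row, however many times its members repeat in that row. The sketch below does this by dropping duplicate keys within each row before tabulating. It is only an illustration under that assumption: `row_pairs` and `tpair_rows` are made-up names, and whether a self-pair such as 7.7 should be kept at all is left open (here it is kept).

```r
# Example matrix with a repeated 7 in row 2
prmtx <- matrix(c(2, 5, 1, 6,
                  1, 7, 7, 2,
                  3, 7, 6, 2,
                  9, 8, 5, 7), nrow = 4, byrow = TRUE)

# Hypothetical helper: unordered pair keys for one row, each key once
row_pairs <- function(v) {
  cmb <- combn(v, 2)
  unique(paste(pmin(cmb[1, ], cmb[2, ]), pmax(cmb[1, ], cmb[2, ]), sep = "."))
}

# Number of rows in which each pair occurs
tpair_rows <- table(unlist(lapply(seq_len(nrow(prmtx)),
                                  function(z) row_pairs(prmtx[z, ]))))
```

With this counting rule the pair 2.7 is found in two rows (rows 2 and 3), even though row 2 contains the value 7 twice.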
>>
>> --
>> David
>> On Nov 15, 2009, at 11:06 PM, cindy Guo wrote:
>>
>> Hi, David,
>>>
>>> The matrix has 20 columns.
>>> Thank you very much for your help. I think it's right, but it seems I
>>> need some time to figure it out. I am a green hand. There are so many
>>> functions here I never used before. :)
>>>
>>> Cindy
>>>
>>> On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius <dwinsem...@comcast.net>
>>> wrote:
>>> Assuming that the number of columns is 4, then consider this approach:
>>>
>>> > prs <-scan()
>>> 1: 2 5 1 6
>>> 5: 1 7 8 2
>>> 9: 3 7 6 2
>>> 13: 9 8 5 7
>>> 17:
>>> Read 16 items
>>> > prmtx <- matrix(prs, 4,4, byrow=TRUE)
>>>
>>> #Now make copies of x.y and y.x
>>>
>>> pair.str <- sapply(1:nrow(prmtx), function(z) c(apply(combn(prmtx[z,],
>>> 2), 2,function(x) paste(x[1],x[2], sep=".")) , apply(combn(prmtx[z,], 2),
>>> 2,function(x) paste(x[2],x[1], sep="."))) )
>>> tpair <-table(pair.str)
>>>
>>> # This then gives you a duplicated list
>>> > tpair[tpair>1]
>>> pair.str
>>> 1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
>>> 2   2   2   2   2   2   2   2
>>>
>>> # So only take the first half of the pairs:
>>> > head(tpair[tpair>1], sum(tpair>1)/2)
>>>
>>> pair.str
>>> 1.2 2.1 2.6 2.7
>>> 2   2   2   2
>>>
>>> --
>>> David.
>>>
>>>
>>>
>>> On Nov 15, 2009, at 8:06 PM, David Winsemius wrote:
>>>
>>> I could of course be wrong but have you yet specified the number of
>>> columns for this pairing exercise?
>>>
>>> On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:
>>>
>>> Hi, All,
>>>
>>> I have an n by m matrix with each entry between 1 and 15000. I want to
>>> know
>>> the frequency of each pair in 1:15000 that occur together in rows. So for
>>> example, if the matrix is
>>> 2 5 1 6
>>> 1 7 8 2
>>> 3 7 6 2
>>> 9 8 5 7
>>> Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return
>>> the value 2 for this pair as well as that for all pairs. Is there a fast
>>> way
>>> to do this avoiding loops? Loops take too long.
>>>
>>> and provide commented, minimal, self-contained, reproducible code.
>>>                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>
>
>
