[R] Multiset Permutations

2008-04-06 Thread Stropharia

Dear R users,

I want to perform an exact permutation of multisets. I have looked at the
coin package, but it doesn't seem to offer what I am looking for. I want to
perform permutations (exact - without duplications) on multisets of scalars
(e.g., the multiset 0,0,1,2,2). I want to output all the possible
permutations into a data frame so that each multiset permutation occupies a
row (or column). The output would look something like this:

00122
01220
01210
20201
10202
12200
etc...

There seem to be numerous algorithms in the primary statistical literature
for doing this, but they haven't been implemented in R. I've been searching
for weeks online and in books for a way of doing this, but the only thing
i've come across is a Java script posted anonymously on a forum. I don't
know Java at all, so can't make enough sense of it to try a translation into
R, but i thought i'd include it below in case someone can glean from this a
way of doing it in R. Thanks in advance for any help on this, it is greatly
appreciated.

Here are the original (anonymous) poster's comments and code:

We definitely need to use recursion. There can be up to n! permutations.
For example we'll use recursion to generate a tree of n! leaves. Each node
is a run of the recursive function. The root node has n children, the nodes
at the second level have each n-1 children, the nodes at the third level
have n-2 children, etc. Each leaf of this tree will be a permutation. The
idea is to add an element to the permutation, remove it from the set, and
call the recursive function on that set. Because it's a multiset i'm using a
Hashtable to eliminate duplicate entries. A pointer to the Hashtable gets
passed along with each recursive call.
The recursive function will know that it's on a leaf when the size of the
set is 0, at which point it will insert its permutation in the Hashtable.
Since identical permutations map to the same bucket in the hashtable, they
overwrite eachother to leave only unique permutations. Here's the algorithm
in Java:

-  JAVA CODE  
import java.util.*;

public class Test {
public static void PermutationsRecursive(char[] s, String p, Hashtable
perms){
for(int x=0; xs.length; x++){
String np = new String(p);
char[] ns = new char[s.length-1];
int y = 0;
for(int z=0; zs.length; z++){
if(z != x) ns[y++] = s[z];
}
np = np + s[x];
if(ns.length == 0) perms.put(np, new Boolean(true));
else PermutationsRecursive(ns, np, perms);
}
}
public static String[] GetPermutations(char[] in){
int fact = Factorial(in.length);
Hashtable perms = new Hashtable(fact);
PermutationsRecursive(in, , perms);
Enumeration e = perms.keys();
String[] out = new String[perms.size()];
int i = 0;
while(e.hasMoreElements()){
out[i++] = (String) e.nextElement();
}
return out;
}
private static int Factorial(int n){
int fact = 1;
for(int i=2; i=n; i++){
fact *= i;
}
return fact;
}
public static void main(String[] args){
char[] set = new char[]{'A', 'A', 'B', 'C'};
String[] perms = GetPermutations(set);
Arrays.sort(perms, String.CASE_INSENSITIVE_ORDER);
for(int i=0; iperms.length; i++){
System.out.println(perms[i]);
}
}
}
-  END JAVA CODE  -

This code prints the following:
AABC
AACB
ABAC
ABCA
ACAB
ACBA
BAAC
BACA
BCAA
CAAB
CABA
CBAA

-- 
View this message in context: 
http://www.nabble.com/Multiset-Permutations-tp16529735p16529735.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiset Permutations

2008-04-06 Thread Johannes Hüsing
You can use the function permutation from the e1071 package,
then

library(e1071)

multisetperm - function(multiset) { 
unique(apply(matrix(multiset[permutations(length(multiset))], 
ncol=length(multiset)), 1, paste, sep=, collapse=)) }
multisetperm(c(0, 0, 1, 2, 2))

 The output would look something like this:

 00122
 01220
 01210
 20201
 10202
 12200
 etc...


The Java algorithm you cited does not look any more clever or less 
wasteful
than this.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiset Permutations

2008-04-06 Thread Gavin Simpson
Hi,

If I understand you correctly, there is beta code within the development
version of package 'vegan' on R forge to do this.

install.packages(vegan, repos=http://R-Forge.R-project.org;)

Then:

 numPerms(5)
[1] 120
 perms - allPerms(5, observed = TRUE)
 perms[1:10,]
  [,1] [,2] [,3] [,4] [,5]
 [1,]12354
 [2,]12435
 [3,]12453
 [4,]12534
 [5,]12543
 [6,]13245
 [7,]13254
 [8,]13425
 [9,]13452
[10,]13524

As you can see, it returns the ordering of 5 observations when permuted
freely. perms is a matrix with some extra attributes.

To use this, loop over the rows of perms. To get the actual permuted
vector, use something like the following:

 dat - c(0,0,1,2,2)
 dat[perms[1,]]
[1] 0 0 1 2 2
 dat[perms[2,]]
[1] 0 0 2 1 2
 dat[perms[3,]]
[1] 0 0 2 2 1

This assumes that the entries in dat are freely exchangeable, which is
what I take your output example to be representing.

Be careful to read the help page for allPerms, if n is large you'll need
to increase argument 'max'. numPerms(n) where n is the number of entries
in dat will tell you how big max needs to be.

Note that you should be able to do:

perms - allPerms(dat)

but that isn't working at all (did I say this was beta code?) due to a
silly bug I introduced by trying to be too clever for my own good. So
you have to use allPerms(n) instead.

If this isn't exactly what you wanted, email me back as there may be a
way to use the more complicated restricted permutations available in
vegan to do what you want.

HTH

G

On Sun, 2008-04-06 at 13:57 -0700, Stropharia wrote:
 Dear R users,
 
 I want to perform an exact permutation of multisets. I have looked at the
 coin package, but it doesn't seem to offer what I am looking for. I want to
 perform permutations (exact - without duplications) on multisets of scalars
 (e.g., the multiset 0,0,1,2,2). I want to output all the possible
 permutations into a data frame so that each multiset permutation occupies a
 row (or column). The output would look something like this:
 
 00122
 01220
 01210
 20201
 10202
 12200
 etc...
 
 There seem to be numerous algorithms in the primary statistical literature
 for doing this, but they haven't been implemented in R. I've been searching
 for weeks online and in books for a way of doing this, but the only thing
 i've come across is a Java script posted anonymously on a forum. I don't
 know Java at all, so can't make enough sense of it to try a translation into
 R, but i thought i'd include it below in case someone can glean from this a
 way of doing it in R. Thanks in advance for any help on this, it is greatly
 appreciated.
 
 Here are the original (anonymous) poster's comments and code:
 
 We definitely need to use recursion. There can be up to n! permutations.
 For example we'll use recursion to generate a tree of n! leaves. Each node
 is a run of the recursive function. The root node has n children, the nodes
 at the second level have each n-1 children, the nodes at the third level
 have n-2 children, etc. Each leaf of this tree will be a permutation. The
 idea is to add an element to the permutation, remove it from the set, and
 call the recursive function on that set. Because it's a multiset i'm using a
 Hashtable to eliminate duplicate entries. A pointer to the Hashtable gets
 passed along with each recursive call.
 The recursive function will know that it's on a leaf when the size of the
 set is 0, at which point it will insert its permutation in the Hashtable.
 Since identical permutations map to the same bucket in the hashtable, they
 overwrite eachother to leave only unique permutations. Here's the algorithm
 in Java:
 
 -  JAVA CODE  
 import java.util.*;
 
 public class Test {
   public static void PermutationsRecursive(char[] s, String p, Hashtable
 perms){
   for(int x=0; xs.length; x++){
   String np = new String(p);
   char[] ns = new char[s.length-1];
   int y = 0;
   for(int z=0; zs.length; z++){
   if(z != x) ns[y++] = s[z];
   }
   np = np + s[x];
   if(ns.length == 0) perms.put(np, new Boolean(true));
   else PermutationsRecursive(ns, np, perms);
   }
   }
   public static String[] GetPermutations(char[] in){
   int fact = Factorial(in.length);
   Hashtable perms = new Hashtable(fact);
   PermutationsRecursive(in, , perms);
   Enumeration e = perms.keys();
   String[] out = new String[perms.size()];
   int i = 0;
   while(e.hasMoreElements()){
   out[i++] = (String) 

Re: [R] Multiset Permutations

2008-04-06 Thread G. Jay Kerns
Dear Steven,

The prob package does this, too.  (Please see the * fix below).

x - c(0, 0, 1, 2, 2)
library(prob)
A - permsn(x, 5)  # with repeated columns
B - unique( data.frame( t(A) ))   # no repeated rows

The data frame B will have 56 rows and 5 columns.  If you need the
columns collapsed, then you can use the

apply(B, 1, paste, sep = , collapse = )

command that Johannes suggested.  Details are in the prob package vignette,

vignette(prob)

I hope that this helps,
Jay



* fix:  As it happens, your particular question helped to identify a
bug in the current CRAN version of prob.  Thank you!  :-)   Below is a
fix until the updated version appears.


permsn - function (x, m)
{
require(combinat)
if (is.numeric(x)  length(x) == 1  x  0  trunc(x) == x)
x - seq(x)
temp - combn(x, m)
if ( isTRUE(all.equal(m,1)) ) {
P - temp
} else if (isTRUE(all.equal(m, length(x {
temp - matrix(x, ncol = 1)
P - array(unlist(permn(temp[, 1])), dim = c(m, factorial(m)))
} else {
k - dim(temp)[1]
n - dim(temp)[2]
P - array(unlist(permn(temp[, 1])), dim = c(k, factorial(k)))
for (i in 2:n) {
a - temp[, i]
perms - array(unlist(permn(a)), dim = c(k, factorial(k)))
P - cbind(P, perms)
}
}
return(P)
}







***
G. Jay Kerns, Ph.D.
Assistant Professor / Statistics Coordinator
Department of Mathematics  Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
E-mail: [EMAIL PROTECTED]
http://www.cc.ysu.edu/~gjkerns/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.