Thanks for your help!

> It's not the alphabet that will kill you, but the number of parameters you are
> estimating. Indeed, BioJava should be able to handle alphabets with more than
> 2^32 symbols quite happily. There's an implementation of cross-product
> alphabet designed especially for this case.
What I am trying to do is develop a phylogenetic HMM. Say there are 3 sequences in the alignment; then each site consists of 3 symbols. If it is a generalized HMM, each state spans several sites, say 7.

I wrote a test program to see whether this works (I just want to see if I can factorize a symbol in the state alphabet). When the number of sites in the state is 5, it works. But when the number of sites in the state is 7, I get java.lang.ArrayIndexOutOfBoundsException (code attached).

Is it because I am not using the alphabet efficiently?

Again, thanks very much for helping!

Wendy

    import java.util.*;
    import org.biojava.bio.seq.DNATools;
    import org.biojava.bio.symbol.*;

    public static void main(String[] args)
            throws MarshalException, ValidationException, IOException {
        // Base alphabet: DNA (4 atomic symbols).
        Alphabet sequenceAlphabet = DNATools.getDNA();
        Set alphabetSet = AlphabetManager.getAllSymbols((FiniteAlphabet) sequenceAlphabet);

        // Site alphabet: one DNA symbol per aligned sequence (DNA x DNA x DNA).
        int no_sequences = 3;
        List siteAlphabetList = Collections.nCopies(no_sequences, sequenceAlphabet);
        Alphabet siteAlphabet = AlphabetManager.getCrossProductAlphabet(siteAlphabetList);

        // State alphabet: `length` consecutive sites per state.
        int length = 7;
        List stateAlphabetList = Collections.nCopies(length, siteAlphabet);
        Alphabet stateAlphabet = AlphabetManager.getCrossProductAlphabet(stateAlphabetList);

        // Look up one state symbol by index and factorize it into its site symbols.
        AlphabetIndex alphabetIndex =
                AlphabetManager.getAlphabetIndex((FiniteAlphabet) stateAlphabet);
        AtomicSymbol sym = (AtomicSymbol) alphabetIndex.symbolForIndex(3);
        List symList = sym.getSymbols();

        log.info("sym (index=3) is " + sym);
        log.info("sym is composed of:");
        Iterator symIter = symList.iterator();
        while (symIter.hasNext()) {
            log.info(symIter.next());
        }
    }
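
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the site alphabet has 4^3 = 64 symbols, so the state alphabet has 64^5 = 2^30 symbols for 5 sites but 64^7 = 2^42 symbols for 7. AlphabetIndex.symbolForIndex takes an int, so an index over the state alphabet cannot address more than about 2^31 symbols. The standalone sketch below only does this arithmetic with java.math.BigInteger; it makes no BioJava calls, and the class and method names are hypothetical.

```java
import java.math.BigInteger;

public class CrossProductSize {

    /** Number of symbols in the cross product of `copies` copies of an alphabet with `base` symbols. */
    public static BigInteger crossProductSize(BigInteger base, int copies) {
        return base.pow(copies);
    }

    /** True if an alphabet of this size could be covered by non-negative int indices. */
    public static boolean fitsInIntIndex(BigInteger size) {
        return size.compareTo(BigInteger.valueOf(Integer.MAX_VALUE)) <= 0;
    }

    public static void main(String[] args) {
        BigInteger dna = BigInteger.valueOf(4);        // A, C, G, T
        BigInteger site = crossProductSize(dna, 3);    // 3 aligned sequences per site

        // Size of the state alphabet for 1 to 7 sites per state.
        for (int length = 1; length <= 7; length++) {
            BigInteger state = crossProductSize(site, length);
            System.out.println("sites=" + length
                    + " symbols=" + state
                    + " fitsInIntIndex=" + fitsInIntIndex(state));
        }
    }
}
```

If this is indeed the cause, anything that enumerates or int-indexes the full state alphabet would hit the same limit at 6 or more sites per state, since 64^6 = 2^36 already exceeds the int range.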