Keith, there's a method in AlphabetTools, getAllSymbols(). Feed it the matches() of each symbol in the motif and concatenate the tokens of the symbols it returns.
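A rough sketch of that idea, in plain Java with a hand-written IUPAC nucleotide table standing in for the BioJava alphabet. The class and method names here (AmbiguousMotifRegex, classFor, createRegex) are made up for illustration and are not BioJava API. It assumes that a motif symbol should match every code whose set of bases is contained in the motif symbol's own set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in only: not part of BioJava.
class AmbiguousMotifRegex {
    // IUPAC nucleotide codes mapped to the bases each one represents.
    static final Map<Character, String> IUPAC = new LinkedHashMap<>();
    static {
        IUPAC.put('a', "a");   IUPAC.put('c', "c");
        IUPAC.put('g', "g");   IUPAC.put('t', "t");
        IUPAC.put('r', "ag");  IUPAC.put('y', "ct");
        IUPAC.put('m', "ac");  IUPAC.put('k', "gt");
        IUPAC.put('s', "cg");  IUPAC.put('w', "at");
        IUPAC.put('h', "act"); IUPAC.put('b', "cgt");
        IUPAC.put('v', "acg"); IUPAC.put('d', "agt");
        IUPAC.put('n', "acgt");
    }

    // True if every base in 'inner' also appears in 'outer'.
    static boolean isSubset(String inner, String outer) {
        for (char c : inner.toCharArray()) {
            if (outer.indexOf(c) < 0) {
                return false;
            }
        }
        return true;
    }

    // Character class for one motif symbol: the tokens of all codes
    // whose base set is a subset of the motif symbol's base set.
    static String classFor(char motifSymbol) {
        String bases = IUPAC.get(motifSymbol);
        if (bases == null) {
            throw new IllegalArgumentException("Not an IUPAC code: " + motifSymbol);
        }
        StringBuilder tokens = new StringBuilder();
        for (Map.Entry<Character, String> e : IUPAC.entrySet()) {
            if (isSubset(e.getValue(), bases)) {
                tokens.append(e.getKey());
            }
        }
        // A single token needs no brackets.
        return tokens.length() == 1 ? tokens.toString() : "[" + tokens + "]";
    }

    // Concatenate the per-symbol classes into one regex for the whole motif.
    static String createRegex(String motif) {
        StringBuilder regex = new StringBuilder();
        for (char c : motif.toLowerCase().toCharArray()) {
            regex.append(classFor(c));
        }
        return regex.toString();
    }
}
```

With this table, classFor('d') gives [agtrkwd] and classFor('n') lists all fifteen codes, so a pattern built from atgnnnngta would also find atgagcngta. Repeated classes are not collapsed into {4}-style counts here; that would be a cosmetic step on top.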
Matthew
Keith James wrote:
"Sylvain" == Sylvain Foisy <[EMAIL PROTECTED]> writes:

Sylvain> Hi, I used the createRegex() method to return a regular
Sylvain> expression from a sequence of DNA inputted by the user to
Sylvain> scan a genome for that motif. I just discovered an
Sylvain> interesting thing about that method: if n is in the motif
Sylvain> to seek, the regex will not have n as a possibility.

Sylvain> Ok, I have that motif: atgnnnndgta.

Sylvain> CreateRegex would return: atg[atcg]{4}gta and it does

Sylvain> What if my sequence to scan contains n: atgagcngta, for
Sylvain> example. java.util.regex would not find the
Sylvain> pattern. Unless mistaken, the pattern should be
Sylvain> atg[atcgn]{4}gta.

Sylvain> Am I wrong? Any input would be appreciated

You are correct about the behaviour, but not about the solution. An ambiguous target sequence could contain n, but could also contain r, y, m, k, s, w, h, b, v and d. To match correctly, the regex would have to take into account that the symbols represented by n are a superset of those represented by the other ambiguity symbols.

As MotifTools is generic (it will work for any alphabet), implementing generation of regexes for searching ambiguous SymbolLists requires a more complex algorithm than the current one. I'll take a look at this as soon as I can.

Keith
-- 
BioJava Consulting LTD - Support and training for BioJava
http://www.biojava.co.uk
_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
