Hi - Is there any reason why you need to be running the restriction finder over the soft masked sequence?
Can you post some example code to replicate the bug/annoyance? If you
think this is a genuine bug then please submit a biojava bug report to
http://bugzilla.open-bio.org/ and include the example code that
demonstrates the bug.

Thanks.

- Mark

On 2/28/07, Ilhami Visne <[EMAIL PROTECTED]> wrote:
> i've changed my code and called the RestrictionSiteFinder with the new
> sequence. it threw this exception:
>
> Exception in thread "Thread-25"
> java.lang.UnsupportedOperationException: Ambiguity should be handled
> at the level of the wrapped Alphabet
>     at org.biojava.bio.symbol.SoftMaskedAlphabet.getAmbiguity(SoftMaskedAlphabet.java:183)
>     at org.biojava.bio.symbol.AlphabetManager.getAllSymbols(AlphabetManager.java:223)
>     at org.biojava.bio.seq.io.SymbolListCharSequence.<init>(SymbolListCharSequence.java:75)
>     at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:73)
>     at org.biojava.utils.SimpleThreadPool$PooledThread.run(SimpleThreadPool.java:295)
>
> i understand why it didn't work (lower case symbol 'a' and upper
> case symbol 'A'), but i can't find a solution. Any idea?
>
> On 2/28/07, ilhami visne <[EMAIL PROTECTED]> wrote:
> > Thank you. it does now. i should have been able to find it myself,
> > but i am not really a bioinformatician yet.
> >
> > my code (maybe someone else has the same problem as me):
> >
> > BufferedReader br = new BufferedReader(new FileReader("seq.fasta"));
> >
> > Alphabet dna = SoftMaskedAlphabet.getInstance(DNATools.getDNA());
> > SymbolTokenization dnaParser = dna.getTokenization("token");
> >
> > RichSequenceIterator iter =
> >     RichSequence.IOTools.readFasta(br,dnaParser,null);
> > RichSequence rs = iter.nextRichSequence();
> >
> > Mark Schreiber wrote:
> > > Hi -
> > >
> > > There are also the classes: SoftMaskedAlphabet and
> > > SoftMaskedAlphabet.CaseSensitiveTokenization and
> > > SoftMaskedAlphabet.MaskingDetector.
> > > Together these classes let you
> > > read a sequence that contains case sensitive information and (if you
> > > wish) make use of that information. You can also write out the
> > > sequence in the original case sensitive format.
> > >
> > > It was originally designed for reading data that had been 'softmasked'
> > > for low complexity regions (eg lower case regions are low complexity
> > > and would be ignored in subsequent analysis) but it could be used for
> > > quality or any other distinction.
> > >
> > > - Mark
> > >
> > > On 2/28/07, ilhami visne <[EMAIL PROTECTED]> wrote:
> > >> Thank you for the quick answer. Here is the relevant part of my code:
> > >>
> > >> BufferedReader br = new BufferedReader(new FileReader("seq.fasta"));
> > >> RichSequenceIterator iter = RichSequence.IOTools.readFastaDNA(br,null);
> > >> RichSequence rs = iter.nextRichSequence();
> > >>
> > >> Richard Holland wrote:
> > >> > -----BEGIN PGP SIGNED MESSAGE-----
> > >> > Hash: SHA1
> > >> >
> > >> > DNA is not case-sensitive. What I suspect you are parsing is the
> > >> > output of some sequencing software which is using case as a rough
> > >> > indicator of base calling quality?
> > >> >
> > >> > The case will have been lost when the file was parsed, not at the
> > >> > moment you iterate over the resulting sequences. This means that
> > >> > you have to modify your file parsing method to become
> > >> > case-sensitive.
> > >> >
> > >> > The default DNA alphabet is not case-sensitive. It makes no
> > >> > distinction between upper and lower case, and will convert
> > >> > everything to one case.
> > >> >
> > >> > If you need to preserve case, you will need to use a custom alphabet
> > >> > which treats the cases differently, and also specify a tokenizer
> > >> > which is case-sensitive. See the help pages at http://biojava.org/
> > >> > for help on creating new alphabets.
> > >> > Or, have a look at the ABITools.QUALITY alphabet
> > >> > in BioJava, which interprets the case and stores the quality scores
> > >> > separately.
> > >> >
> > >> > Note however that your custom alphabet is NOT the same as the
> > >> > original DNA alphabet, and so you may not be able to use it in all
> > >> > the standard transforms (RNA etc.). If you do want to use these
> > >> > then you will have to make a second copy of each sequence using
> > >> > the normal DNA alphabet and pass that copy to the routines.
> > >> >
> > >> > If you post to this list the code you are using to read the file,
> > >> > then I can show you where to insert the reference to this new
> > >> > alphabet.
> > >> >
> > >> > cheers,
> > >> > Richard
> > >> >
> > >> > Ilhami Visne wrote:
> > >> >
> > >> >> my sequence files contain case-sensitive symbols (TAATAACgagagg)
> > >> >> and i am now using RichSequenceIterator to iterate over the
> > >> >> sequences.
> > >> >>
> > >> >> How can i tell biojava that it should parse them case-sensitively?
> > >> >> if i call the seq.seqString() method, it should return the sequence
> > >> >> exactly as it was in the file, with upper and lower case.
> > >> >>
> > >> >> thanx.
> > >> >> _______________________________________________
> > >> >> Biojava-l mailing list  -  [email protected]
> > >> >> http://lists.open-bio.org/mailman/listinfo/biojava-l
> > >> >>
> > >> > -----BEGIN PGP SIGNATURE-----
> > >> > Version: GnuPG v1.4.2.2 (GNU/Linux)
> > >> > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> > >> >
> > >> > iD8DBQFF5Etv4C5LeMEKA/QRAnGBAJ45eeQhmb4AT0CLTQCVyn5HxFS/cQCfXXgv
> > >> > uZKlrdE8y6vMfKcOlm9yBZA=
> > >> > =2VZC
> > >> > -----END PGP SIGNATURE-----
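
Richard's suggestion in the thread (keep the case information, but pass a
second copy in the normal DNA alphabet to the standard routines) can be
illustrated without BioJava at all. What follows is only a minimal plain-Java
sketch of that idea, not BioJava code: the class SoftMaskSketch, its methods,
and the search pattern are hypothetical names invented for this illustration.
It records which positions were lower case, produces an upper-case copy for
analysis, and restores the original case afterwards. Within BioJava itself,
something like DNATools.createDNA(rs.seqString()) may be one way to build the
plain-DNA copy for RestrictionSiteFinder, but that is an untested assumption.

```java
import java.util.BitSet;
import java.util.Locale;

/** Hypothetical helper, not part of BioJava: separates soft-mask case from sequence content. */
class SoftMaskSketch {

    /** Returns a bit set in which bit i is set when position i of the input is lower case. */
    static BitSet maskOf(String softMasked) {
        BitSet mask = new BitSet(softMasked.length());
        for (int i = 0; i < softMasked.length(); i++) {
            if (Character.isLowerCase(softMasked.charAt(i))) {
                mask.set(i);
            }
        }
        return mask;
    }

    /** Case-free copy, suitable for routines that expect plain DNA. */
    static String plainCopy(String softMasked) {
        return softMasked.toUpperCase(Locale.ROOT);
    }

    /** Re-applies the recorded mask so the output matches the case of the original file. */
    static String restoreCase(String plain, BitSet mask) {
        StringBuilder sb = new StringBuilder(plain.length());
        for (int i = 0; i < plain.length(); i++) {
            char c = plain.charAt(i);
            sb.append(mask.get(i) ? Character.toLowerCase(c) : c);
        }
        return sb.toString();
    }

    /** True if any position in the half-open range [start, end) is masked; start must be >= 0. */
    static boolean overlapsMask(BitSet mask, int start, int end) {
        int next = mask.nextSetBit(start);
        return next >= 0 && next < end;
    }

    public static void main(String[] args) {
        String original = "TAATAACgagagg"; // example sequence from the thread
        BitSet mask = maskOf(original);
        String plain = plainCopy(original);

        // Stand-in for a site search run on the plain copy (hypothetical pattern).
        String pattern = "AACGAG";
        int hit = plain.indexOf(pattern);

        System.out.println(plain);
        System.out.println("hit at " + hit + ", touches masked region: "
                + overlapsMask(mask, hit, hit + pattern.length()));
        System.out.println(restoreCase(plain, mask));
    }
}
```

The analysis only ever sees the upper-case copy, so it cannot trip over the
distinction between 'a' and 'A', while the mask keeps enough information to
report whether a hit falls in a soft-masked region and to write the sequence
back out exactly as it appeared in the file.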
