On Wed, Jul 8, 2009 at 10:50 PM, Peter<[email protected]> wrote:
> Hi all,
>
> Something I mentioned to Peter Rice in passing at BOSC/ISMB 2009 was
> I'd found an oddity in transeq with certain ambiguous codons while
> testing Biopython's translations. Here is a specific example (but I
> suspect there are more). For reference, I am expecting EMBOSS transeq
> to be using the NCBI tables:
> http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
>
> First consider the following example, the codon TAN, which can be TAA,
> TAC, TAG or TAT which translate to stop or Y. Therefore the
> translation of TAN should be "* or Y", and EMBOSS transeq opts for
> "X". Which is fine:

Using raw output instead of the default FASTA works better in emails:

$ transeq asis:TAATACTAGTATTAN -stdout -auto -osformat raw
*Y*YX

> Similarly for the codon TNN, again EMBOSS transeq opts for "X" because
> this could be a stop codon, or W, or F, or L, or S, or Y or C! Again,
> this is fine:

Again, using raw output works better in emails:

$ transeq asis:TNN -stdout -auto -osformat raw
X

> However, consider the codon TRR. R means A or G, so this can mean TAA,
> TGA, TAG or TGG which translate to stop or W (both EMBOSS and the NCBI
> standard table agree here). Therefore the translation of TRR should be
> "* or W", which I would expect based on the above examples to result
> in "X". But instead EMBOSS transeq gives "*":

Again, using raw output works better in emails:

$ transeq asis:TAATGATAGTGGTRR -stdout -auto -osformat raw
***W*

> I think this is a bug.
>
> However, I am aware that the machine I tried this on is rather old,
> and I don't actually know which version of EMBOSS it is.

I can check the old machine later, but I just retested on a Mac
using EMBOSS 6.0.1 (the current release), and see the same
behaviour.

Peter C.
_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
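
For illustration only, the behaviour expected in the message above can be sketched in plain Python. This is not EMBOSS or Biopython code: the function names (possible_translations, translate_codon, translate) are hypothetical, the table is the NCBI standard table (table 1) written out by hand, and the rule "more than one possible product gives X" is an assumption taken from the TAN and TNN examples. It also ignores the protein ambiguity codes B, Z and J that some tools emit.

```python
from itertools import product

# NCBI standard genetic code (table 1), amino acids listed in TCAG order
# with the third codon position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def possible_translations(codon):
    """Return the set of products an ambiguous codon could encode."""
    expansions = product(*(IUPAC[base] for base in codon.upper()))
    return {CODON_TABLE["".join(c)] for c in expansions}


def translate_codon(codon):
    """Give the single product if unambiguous, otherwise 'X' (assumed rule)."""
    products = possible_translations(codon)
    return products.pop() if len(products) == 1 else "X"


def translate(sequence):
    """Translate codon by codon, ignoring any trailing partial codon."""
    usable = len(sequence) - len(sequence) % 3
    return "".join(
        translate_codon(sequence[i:i + 3]) for i in range(0, usable, 3)
    )


if __name__ == "__main__":
    print(possible_translations("TRR"))
    print(translate("TAATGATAGTGGTRR"))
```

Under these assumptions, possible_translations("TRR") contains both "*" and "W", so the sketch translates TAATGATAGTGGTRR as ***WX, whereas the transeq run quoted above prints ***W*.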
