Dear Isabelle,

Wells, Isabelle wrote:
> Can emboss handle inosine in nucleotide sequences? We have a
> nucleotide file in embl format where some sequences contain inosine.
> Dbiflat doesn't seem to index the database properly although no error
> message was given and those inosine containing sequences cannot be
> retrieved with seqret. Any suggestions on what we could do apart from
> replacing inosine by X or N?
I assume your dbiflat problem is an error in retrieving the entries, unless there is some other format problem in the database that prevents entries from being recognized by the dbiflat parser. If you can send me one of the inosine-containing entries (or a fake entry if these are proprietary) I can check.

We treat inosine as a modified base. These are usually found in RNA sequences. You should replace it with X or N, and if you have an EMBL format feature table you could add a modified_base feature with a /mod_base=I qualifier to mark each inosine.

EMBOSS does nothing special with these in the current release, but you can perhaps suggest applications that could use the modified base information.

Hope this helps,

Peter Rice

_______________________________________________
EMBOSS mailing list
http://lists.open-bio.org/mailman/listinfo/emboss
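
The replacement suggested above can be sketched in Python. This is only an illustration and not part of EMBOSS: `mask_inosine` is a hypothetical helper, it assumes inosine appears in the sequence as the letter I or i, and the qualifier value is copied from the message as written, so it should be checked against the feature table definition before use. The layout assumes the usual EMBL feature-table columns (feature key at column 6, location and qualifiers at column 22).

```python
def mask_inosine(seq, mod_base="I"):
    """Replace inosine (I/i) with N/n and build modified_base feature lines.

    Hypothetical helper, not an EMBOSS function.
    Returns (masked_sequence, feature_lines); positions are 1-based.
    """
    masked = []
    features = []
    for pos, base in enumerate(seq, start=1):
        if base in ("I", "i"):
            # Keep the case of the original symbol.
            masked.append("N" if base == "I" else "n")
            # "FT" plus three spaces, key padded to 16 characters,
            # so the location starts at column 22.
            features.append("FT   {:<16}{}".format("modified_base", pos))
            features.append("FT   {:<16}/mod_base={}".format("", mod_base))
        else:
            masked.append(base)
    return "".join(masked), features


if __name__ == "__main__":
    masked, feature_lines = mask_inosine("acgita")
    print(masked)
    print("\n".join(feature_lines))
```

The feature lines would still need to be merged into the entry's existing feature table by hand or by a separate script; the sketch only generates them.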
