Hi, I have sometimes thought it would be nice if getorf had an option to return only the longest ORF it finds. Here is what I use to extract the coding region from an mRNA sequence when no feature table information is available:
> getorf test.fa -find 1 -norev | infoseq -only -len -desc -filter | sort -nr | head -1 | tr -d "[]" | awk '{print "seqret test.fa -sb "$2" -send "$4}'

HTH, David.

emboss-boun...@lists.open-bio.org wrote on 15/03/2011 18:27:00:

> Starting with a large mRNA fasta repository,
> what is the route to generating a derivative with the untranslated
> leader and post-stop codon segments trimmed off the mRNAs?
>
> This is en route to a dicodon usage analysis, already written as a BASH
> script calling PERL modules.
> If anyone is interested in such applications, let me know.
> pdc (parse dicodons) works fine, taking about a second per pre-trimmed
> mRNA as it retrieves from a FASTA repository.
> But before providing it to a Novice community, some fool proofing
> best be done.
>
> marvin.stodol...@gmail.com
> _______________________________________________
> EMBOSS mailing list
> EMBOSS@lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss
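In case the tail of the one-liner (from sort onward) is hard to read at a glance, here is a small sketch that runs just that part on its own. The three input records are made up for illustration and only assume that each line carries a length followed by a bracketed "[start - end]" description; they are not real getorf or infoseq output, and the variable names are hypothetical.

```shell
#!/bin/sh
# Made-up stand-in for what the first half of the pipeline is assumed to
# emit: column 1 is the ORF length, then the description "[start - end]".
sample='57 [4 - 174]
334 [12 - 1013]
21 [200 - 262]'

# Same tail as the one-liner: sort numerically with the longest first,
# keep the top record, delete the square brackets, and print a seqret
# command line built from the start ($2) and end ($4) fields.
longest_cmd=$(printf '%s\n' "$sample" \
  | sort -nr \
  | head -1 \
  | tr -d '[]' \
  | awk '{print "seqret test.fa -sb "$2" -send "$4}')

echo "$longest_cmd"
```

Note that the awk step only prints the text of the seqret command; it does not run it. Piping the result to sh (or pasting it at the prompt) would be one way to execute it.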