Re: [EMBOSS] Counting the number of sequences in a file

Peter Rice Tue, 20 Jul 2010 10:06:17 -0700

On 20/07/10 17:27, Peter C. wrote:
> Hi all,
> 
> Is there a tool in EMBOSS to just count the number of sequences in a file?
>
> Right now I could handle this by using seqret to convert the file into FASTA
> and then pipe that through grep to count the records. But an EMBOSS tool
> would be more elegant, e.g.
> 
> $ countseq -sformat=genbank gbvrt1.seq
> 31065
> 
> For the implementation you might offer the choice between using the normal
> EMBOSS parsing (as in seqret) versus file format specific regular expression
> searches which just look for marker lines (without checking validity) which
> should be really fast.
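
[A minimal sketch of the marker-line idea described above, in Python rather
than as an EMBOSS application. The MARKERS table and the count_records
function are hypothetical names chosen for illustration and are not part of
EMBOSS. The patterns assume the usual record-start lines (">" for FASTA,
"LOCUS" for GenBank, "ID" for EMBL) and, as the original poster notes, do no
validity checking.]

```python
import re

# Assumed record-start markers per format (illustrative, not EMBOSS code).
MARKERS = {
    "fasta": re.compile(r"^>"),
    "genbank": re.compile(r"^LOCUS\s"),
    "embl": re.compile(r"^ID\s"),
}


def count_records(lines, fmt):
    """Count lines that look like the start of a record in format `fmt`.

    `lines` is any iterable of text lines, e.g. an open file handle.
    No parsing or validation is done; only the marker line is matched.
    """
    marker = MARKERS[fmt]
    return sum(1 for line in lines if marker.match(line))
```

[Usage would look something like
`with open("gbvrt1.seq") as handle: print(count_records(handle, "genbank"))`.
Because only one regular expression is tested per line, this should be much
cheaper than full parsing, at the cost of miscounting malformed files.]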


Very easy to write ... you could do it yourself for practice (we will
help, of course).

Just use seqret as the basis, don't write any sequences out, but add an
outfile for the results.

We will add countseq to the next release.

regards,

Peter Rice


_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
