Vasili I. Galchin <vigalc...@gmail.com> writes:

> Why are there three Fasta representation-dependent functions (toFasta,
> toFastaQual, toFastQ) in Sequence.hs, which presumably is meant
> to be Sequence representation-free code?

Because they are the standard formats, least common denominators, so to
speak?  Also the most common inputs to tools and algorithms.  If you
have a data structure that cannot easily be converted into these, it
should probably not be an instance of BioSeq (or BioSeqQual).

IIRC, TwoBit contains sequence information (and thus should be a BioSeq
instance), but no quality information (and thus no BioSeqQual instance).

Another example is the SFF format from Roche (also used by Ion Torrent),
which contains sequence and quality data in addition to flowgram data,
so it could plausibly be an instance of both classes.
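
To make the idea concrete, here is a rough sketch of how the output
functions can be written against the classes alone.  The class methods
(seqLabel, seqData, seqQual) and the two record types are made up for
illustration and are not the actual definitions in Sequence.hs; the
FASTQ encoding is assumed to be Phred+33, and FASTA line wrapping is
left out.

```haskell
import Data.Char (chr)

-- Hypothetical minimal interface: a label and the residues.
class BioSeq s where
  seqLabel :: s -> String
  seqData  :: s -> String

-- Hypothetical extension for types that also carry quality scores.
class BioSeq s => BioSeqQual s where
  seqQual :: s -> [Int]

-- FASTA output needs nothing beyond the BioSeq interface.
toFasta :: BioSeq s => s -> String
toFasta s = unlines ['>' : seqLabel s, seqData s]

-- FASTQ output also needs qualities (assumed Phred+33 here).
toFastQ :: BioSeqQual s => s -> String
toFastQ s = unlines
  [ '@' : seqLabel s
  , seqData s
  , "+"
  , map (chr . (+ 33)) (seqQual s)
  ]

-- A TwoBit-like record: sequence only, so BioSeq but not BioSeqQual.
data PlainSeq = PlainSeq String String

instance BioSeq PlainSeq where
  seqLabel (PlainSeq l _) = l
  seqData  (PlainSeq _ d) = d

-- An SFF-like record: sequence plus qualities (flowgrams omitted).
data QualSeq = QualSeq String String [Int]

instance BioSeq QualSeq where
  seqLabel (QualSeq l _ _) = l
  seqData  (QualSeq _ d _) = d

instance BioSeqQual QualSeq where
  seqQual (QualSeq _ _ q) = q
```

With this layout, toFastQ applied to a PlainSeq is rejected by the type
checker, which is the point: a representation only gets the output
functions that its data can actually support.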

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
