Vasili I. Galchin <vigalc...@gmail.com> writes:

> Why are there three Fasta representation-dependent functions (toFasta,
> toFastaQual, toFastQ) doing in Sequence.hs which presumably is meant
> to be Sequence representation-free code?

Because they are the standard formats, the least common denominators, so to speak? They are also the most common inputs to tools and algorithms. If you have a data structure that cannot easily be converted into these, it should probably not be an instance of BioSeq (or BioSeqQual).

IIRC, TwoBit contains sequence information (and thus should be a BioSeq instance), but no quality information (and thus no BioSeqQual instance). Another example is Roche's (and Ion Torrent's) SFF format, which contains sequence and quality data in addition to flowgram data.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
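
To illustrate the BioSeq/BioSeqQual split described above, here is a minimal sketch. It is not the actual code in Sequence.hs: the class methods (seqLabel, seqData, seqQual), the record types (PlainSeq, FlowRead), and the chunk helper are all made up for the example, and the Phred+33 quality encoding and 60-column line width are assumptions. The idea is only that the writers are defined once against the classes, so any representation that can supply the accessors gets the output formats for free.

```haskell
import Data.Char (chr)

-- Hypothetical stand-ins for the classes discussed above;
-- the real BioSeq/BioSeqQual may have different methods.
class BioSeq s where
  seqLabel :: s -> String   -- sequence name / header
  seqData  :: s -> String   -- residues, e.g. "ACGT"

class BioSeq s => BioSeqQual s where
  seqQual :: s -> [Int]     -- one Phred score per residue (assumed)

-- Split a list into pieces of at most n elements.
chunk :: Int -> [a] -> [[a]]
chunk _ [] = []
chunk n xs = let (h, t) = splitAt n xs in h : chunk n t

-- FASTA: header line, then the sequence wrapped at 60 columns.
toFasta :: BioSeq s => s -> String
toFasta s = unlines (('>' : seqLabel s) : chunk 60 (seqData s))

-- FASTA-style quality file: header, then scores as integers.
toFastaQual :: BioSeqQual s => s -> String
toFastaQual s =
  unlines (('>' : seqLabel s) : map unwords (chunk 20 (map show (seqQual s))))

-- FASTQ with Phred+33 encoded qualities (assumed encoding).
toFastQ :: BioSeqQual s => s -> String
toFastQ s =
  unlines ['@' : seqLabel s, seqData s, "+", map (chr . (+ 33)) (seqQual s)]

-- A TwoBit-like record: sequence only, so only a BioSeq instance.
data PlainSeq = PlainSeq String String

instance BioSeq PlainSeq where
  seqLabel (PlainSeq l _) = l
  seqData  (PlainSeq _ d) = d

-- An SFF-like record: sequence and quality, plus flowgram data
-- that the generic writers never need to look at.
data FlowRead = FlowRead
  { frName  :: String
  , frBases :: String
  , frQual  :: [Int]
  , frFlow  :: [Int]
  }

instance BioSeq FlowRead where
  seqLabel = frName
  seqData  = frBases

instance BioSeqQual FlowRead where
  seqQual = frQual
```

Loaded into GHCi, toFasta works on both types, while toFastQ (PlainSeq "chr1" "ACGT") is rejected by the type checker because PlainSeq has no BioSeqQual instance, which is the behaviour argued for above.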