Hi Ketil,

do you want the class to satisfy certain laws, such as header = id + (" " +) description? In general, seqid + seqheader seems the most useful combination for FASTA (http://en.wikipedia.org/wiki/FASTA_format), assuming that the full line is the header (id + description). Of course, that would be an option "3.", and I am ignoring all the other formats out there that could be BioSeq instances.
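To make the law concrete, here is a minimal sketch. It is not the Bio.Core API: the class HasHeader, the method seqdesc, the record type Fasta, and the helpers splitHeader, joinHeader and lawHolds are names made up for illustration, and plain String stands in for SeqLabel. Note that the law only holds exactly when id and description are separated by a single space; for other headers it holds only up to whitespace normalisation.

```haskell
import Data.Char (isSpace)

-- Hypothetical stand-in for a FASTA record; not the Bio.Core type.
data Fasta = Fasta { rawHeader :: String, residues :: String }

-- Split a header line into (id, description) at the first whitespace.
splitHeader :: String -> (String, String)
splitHeader h = (ident, dropWhile isSpace rest)
  where (ident, rest) = break isSpace (dropWhile isSpace h)

-- Rebuild a header from id and description, as in the proposed law.
joinHeader :: String -> String -> String
joinHeader i d
  | null d    = i
  | otherwise = i ++ " " ++ d

-- Assumed class shape: the full header is primitive, id and
-- description are derived from it by default.
class HasHeader s where
  seqheader :: s -> String
  seqid     :: s -> String
  seqid = fst . splitHeader . seqheader
  seqdesc   :: s -> String
  seqdesc = snd . splitHeader . seqheader

instance HasHeader Fasta where
  seqheader = rawHeader

-- The law: header = id + (" " +) description.
lawHolds :: HasHeader s => s -> Bool
lawHolds s = seqheader s == joinHeader (seqid s) (seqdesc s)
```

With this, seqid (Fasta "gi|123 some protein" "ACGT") gives "gi|123" and the law holds, while a header such as "id\tdesc" breaks it because the tab is not reproduced. An instance for a format without a free-text description would simply return an empty seqdesc.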
Gruss,
Christian

* Ketil Malde <ke...@malde.org> [10.12.2012 10:49]:
>
> Hi,
>
> I've discovered an...unfortunate feature.  In the old biolib, I had
> 'seqlabel' and 'seqheader', the former would...well, let me just give
> the definitions:
>
>   -- | Return sequence label (first word of header)
>   seqlabel :: Sequence a -> SeqData
>   seqlabel (Seq l _ _) = case B.words l of (x:_) -> x; [] -> B.empty
>
>   -- | Return full header.
>   seqheader :: Sequence a -> SeqData
>   seqheader (Seq l _ _) = l
>
> The current Bio.Core only defines seqlabel, and it returns the full
> header. This is unfortunate, since I often generate tables with the
> sequence name first, and any spaces or tabs in the header messes up the
> columns.
>
> I'm not quite sure how to resolve this, but options are:
>
> 1. reintroduce the old behavior by modifying seqlabel, and add seqheader
>    to the Sequence class:
>
>   -- | The 'BioSeq' class models sequence data, and any data object that
>   --   represents a biological sequence should implement it.
>   class BioSeq s where
>     seqlabel  :: s -> SeqLabel
> +   seqheader :: s -> SeqLabel
>     seqdata   :: s -> SeqData
>     seqlength :: s -> Offset
>
> 2. Keep seqlabel as it is now, and introduce a new function, say seqid:
>
>   -- | The 'BioSeq' class models sequence data, and any data object that
>   --   represents a biological sequence should implement it.
>   class BioSeq s where
> +   seqid     :: s -> SeqLabel
>     seqlabel  :: s -> SeqLabel
>     seqdata   :: s -> SeqData
>     seqlength :: s -> Offset
>
> Note that the actual changes must be implemented in the *users* of this
> class, i.e. biofasta, biofastq, biopsl, and whatnot.
>
> Thoughts most welcome.
>
> -k
> _______________________________________________
> Biohaskell mailing list
> Biohaskell@biohaskell.org
> http://malde.org/cgi-bin/mailman/listinfo/biohaskell