Hi Ketil,

Do you want certain laws to hold? Like header = id + " " + description
(with the separator and description being optional)? In general,
seqid + seqheader seems the most useful combination for FASTA
(http://en.wikipedia.org/wiki/FASTA_format), assuming that the full
line is the header (id + description). Of course, that would be a
third option ("3."), and I am ignoring all the other formats out there
that could be BioSeq instances.
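
A minimal sketch of how such a law could be written down and checked.
It assumes the header is a plain lazy ByteString rather than Bio.Core's
actual SeqLabel type, and the names seqdesc, mkHeader and lawHolds are
hypothetical helpers, not part of Bio.Core. Here seqid works on the
raw header, not on a BioSeq instance:

```haskell
import qualified Data.ByteString.Lazy.Char8 as B
import Data.Char (isSpace)

-- Assumption: a header is the FASTA header line without the leading '>'.
type Header = B.ByteString

-- | Identifier: the first whitespace-delimited word of the header.
seqid :: Header -> B.ByteString
seqid h = case B.words h of
  (x:_) -> x
  []    -> B.empty

-- | Description (hypothetical helper): everything after the identifier,
--   with the separating whitespace dropped.
seqdesc :: Header -> B.ByteString
seqdesc = B.dropWhile isSpace . B.dropWhile (not . isSpace) . B.dropWhile isSpace

-- | Rebuild a header: the id, then a single space and the description,
--   if there is a description at all.
mkHeader :: B.ByteString -> B.ByteString -> Header
mkHeader i d
  | B.null d  = i
  | otherwise = B.concat [i, B.pack " ", d]

-- | The proposed law. It only holds for headers without leading
--   whitespace whose id and description are separated by one space.
lawHolds :: Header -> Bool
lawHolds h = mkHeader (seqid h) (seqdesc h) == h
```

Note that the law fails for a header such as "id\tdesc", since the
original separator is not preserved. That is one argument for keeping
the full header available alongside the id.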

Regards,
Christian

* Ketil Malde <ke...@malde.org> [10.12.2012 10:49]:
> 
> Hi,
> 
> I've discovered an...unfortunate feature.  In the old biolib, I had 
> 'seqlabel' and 'seqheader', the former would...well, let me just give
> the definitions:
> 
>   -- | Return sequence label (first word of header)
>   seqlabel :: Sequence a -> SeqData
>   seqlabel (Seq l _ _) = case B.words l of (x:_) -> x; [] -> B.empty
> 
>   -- | Return full header.
>   seqheader :: Sequence a -> SeqData
>   seqheader (Seq l _ _) = l
> 
> The current Bio.Core only defines seqlabel, and it returns the full
> header.  This is unfortunate, since I often generate tables with the
> sequence name first, and any spaces or tabs in the header mess up the
> columns.
> 
> I'm not quite sure how to resolve this, but options are:
> 
> 1. reintroduce the old behavior by modifying seqlabel, and add seqheader
> to the BioSeq class:
> 
>   -- | The 'BioSeq' class models sequence data, and any data object that
>   --   represents a biological sequence should implement it.
>   class BioSeq s where
>     seqlabel  :: s -> SeqLabel
> +   seqheader :: s -> SeqLabel
>     seqdata   :: s -> SeqData
>     seqlength :: s -> Offset
> 
> 2. Keep seqlabel as it is now, and introduce a new function, say seqid:
> 
>   -- | The 'BioSeq' class models sequence data, and any data object that
>   --   represents a biological sequence should implement it.
>   class BioSeq s where
> +   seqid     :: s -> SeqLabel
>     seqlabel  :: s -> SeqLabel
>     seqdata   :: s -> SeqData
>     seqlength :: s -> Offset
> 
> Note that the actual changes must be implemented in the *users* of this
> class, i.e. biofasta, biofastq, biopsl, and whatnot.
> 
> Thoughts most welcome.
> 
> -k
> _______________________________________________
> Biohaskell mailing list
> Biohaskell@biohaskell.org
> http://malde.org/cgi-bin/mailman/listinfo/biohaskell
