Hi,

I've discovered an...unfortunate feature.  In the old biolib, I had 
'seqlabel' and 'seqheader', the former would...well, let me just give
the definitions:

  -- | Return sequence label (first word of header)
  seqlabel :: Sequence a -> SeqData
  seqlabel (Seq l _ _) = case B.words l of (x:_) -> x; [] -> B.empty

  -- | Return full header.
  seqheader :: Sequence a -> SeqData
  seqheader (Seq l _ _) = l

The current Bio.Core only defines seqlabel, and it returns the full
header.  This is unfortunate, since I often generate tables with the
sequence name first, and any spaces or tabs in the header mess up the
columns.
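
To make the table problem concrete, here is a minimal sketch. It assumes
headers are lazy Char8 ByteStrings, as in the old biolib; 'firstWord' and
'tableRow' are hypothetical helper names and the header is made up, none of
it is part of Bio.Core:

```haskell
import qualified Data.ByteString.Lazy.Char8 as B

-- | First whitespace-delimited word of a header, or empty if there is none.
--   Same logic as the old 'seqlabel' quoted above.
firstWord :: B.ByteString -> B.ByteString
firstWord h = case B.words h of
  (x:_) -> x
  []    -> B.empty

-- | Hypothetical table row: name, a tab, then the sequence length.
tableRow :: B.ByteString -> Int -> B.ByteString
tableRow name len = B.concat [name, B.pack "\t", B.pack (show len)]

main :: IO ()
main = do
  let header = B.pack "seq1 Homo sapiens\tchr1"  -- made-up example header
  -- Full header as the name: its embedded tab adds a stray column.
  B.putStrLn (tableRow header 120)
  -- First word only: exactly two tab-separated columns.
  B.putStrLn (tableRow (firstWord header) 120)
```

With the full header, the embedded tab splits the name across columns; with
the first word only, each row keeps the name in a single field.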

I'm not quite sure how to resolve this, but options are:

1. Reintroduce the old behavior by modifying seqlabel, and add seqheader
to the BioSeq class:

  -- | The 'BioSeq' class models sequence data, and any data object that
  --   represents a biological sequence should implement it.
  class BioSeq s where
    seqlabel  :: s -> SeqLabel
+   seqheader :: s -> SeqLabel
    seqdata   :: s -> SeqData
    seqlength :: s -> Offset

2. Keep seqlabel as it is now, and introduce a new function, say seqid:

  -- | The 'BioSeq' class models sequence data, and any data object that
  --   represents a biological sequence should implement it.
  class BioSeq s where
+   seqid     :: s -> SeqLabel
    seqlabel  :: s -> SeqLabel
    seqdata   :: s -> SeqData
    seqlength :: s -> Offset

Note that the actual changes must be implemented in the *users* of this
class, i.e. the packages that define instances for it: biofasta,
biofastq, biopsl, and whatnot.
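
As a rough sketch of what option 1 would mean for an instance: the types
below are stand-ins I made up so the example compiles on its own (the real
SeqLabel, SeqData and Offset in Bio.Core may be defined differently), and
'FastaRec' is a hypothetical record, not the actual biofasta type:

```haskell
import qualified Data.ByteString.Lazy.Char8 as B

-- Stand-ins for the Bio.Core types; the real definitions may differ.
newtype SeqLabel = SeqLabel B.ByteString deriving (Eq, Show)
newtype SeqData  = SeqData  B.ByteString deriving (Eq, Show)
type Offset = Int

-- | Option 1: 'seqlabel' is the first word, 'seqheader' the full header.
class BioSeq s where
  seqlabel  :: s -> SeqLabel
  seqheader :: s -> SeqLabel
  seqdata   :: s -> SeqData
  seqlength :: s -> Offset

-- | Hypothetical FASTA-like record: header line plus sequence data.
data FastaRec = FastaRec B.ByteString B.ByteString

instance BioSeq FastaRec where
  -- First word of the header, empty if the header has no words.
  seqlabel  (FastaRec h _) =
    SeqLabel (case B.words h of (x:_) -> x; [] -> B.empty)
  -- The header exactly as stored.
  seqheader (FastaRec h _) = SeqLabel h
  seqdata   (FastaRec _ d) = SeqData d
  seqlength (FastaRec _ d) = fromIntegral (B.length d)

main :: IO ()
main = do
  let r = FastaRec (B.pack "seq1 Homo sapiens") (B.pack "ACGT")
  print (seqlabel r)
  print (seqheader r)
```

Under option 2 the instance would look the same apart from the names: seqid
would take the first-word definition and seqlabel would keep returning the
full header.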

Thoughts most welcome.

-k
_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell
