Re: [Biohaskell] seqlabel vs seqheader

Ketil Malde Tue, 11 Dec 2012 01:29:37 -0800

Dan Fornika <dforn...@gmail.com> writes:

> I also vote in favour of seqid + seqheader.


I've pushed a preliminary change to malde.org/~ketil/biohaskell/biocore
if you want to take a look.  I sympathise with the sentiment, Felipe,
but I think seqid is just too important to justify the extra clutter of
a decomposing function.

>> Is it okay to use Monoid for appending and (m)empty?  And have separate
>> 'slice' and 'copy' (or perhaps 'defragment')? 

> This sounds interesting to me, but the stuff about Monoid is over my
> head.  

Monoid provides a general way to combine values into a value of the same
type:

  class Monoid a where
    mempty :: a
    mappend :: a -> a -> a

For lists, mempty is the empty list and mappend is concatenation. But for
a numeric type i you can also define (Sum i) where mempty is zero, and
mappend is addition, or (Product i) where mempty is one and mappend is
multiplication.  And so on.
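
As a small sketch of the instances just mentioned, using only the
standard Data.Monoid wrappers (the names joined, total and factorial5
are made up for illustration):

```haskell
import Data.Monoid (Monoid(..), Sum(..), Product(..))

-- Lists: mempty is [] and mappend is (++), so mconcat concatenates.
joined :: String
joined = mconcat ["ACGT", mempty, "TTGA"]

-- Sum wraps a number so that mempty is 0 and mappend is addition.
total :: Int
total = getSum (mconcat (map Sum [1, 2, 3, 4]))

-- Product wraps a number so that mempty is 1 and mappend is multiplication.
factorial5 :: Int
factorial5 = getProduct (mconcat (map Product [1 .. 5]))
```

Here joined is "ACGTTTGA", total is 10 and factorial5 is 120; the same
mconcat works for all three because each type has a Monoid instance.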

For SeqData etc, this would be the empty sequence and concatenation,
respectively (are there any others that make sense?).  This would make
it easier to provide this functionality for implementations of BioSeq.
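
A minimal sketch of what such an instance could look like.  The Seq
newtype below is a hypothetical stand-in, not the actual SeqData from
biocore, and it assumes a strict ByteString underneath.  With current
GHC (base 4.11 and later) Semigroup is a superclass of Monoid, so the
append operation is defined as (<>) and mappend defaults to it:

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical stand-in for a sequence data type (an assumption,
-- not the biocore definition).
newtype Seq = Seq B.ByteString
  deriving (Eq, Show)

-- Appending two sequences is concatenation of the underlying data.
instance Semigroup Seq where
  Seq a <> Seq b = Seq (B.append a b)

-- The identity element is the empty sequence.
instance Monoid Seq where
  mempty = Seq B.empty

-- Any list of fragments can then be joined with the generic mconcat.
assemble :: [Seq] -> Seq
assemble = mconcat
```

For example, assemble [Seq (B.pack "ACG"), mempty, Seq (B.pack "TT")]
equals Seq (B.pack "ACGTT").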

> What would be the purpose of the 'slice' and 'copy' functions? 

  slice     :: s -> (Offset,Offset) -> s  -- ^ Cut a slice of a sequence.
  copy      :: s -> s                     -- ^ Create a copy of a sequence.

'slice' would select a substring of a sequence, delimited by the
offsets.  (Inclusive, I guess?)

'copy' would be analogous to copy in bytestring: by default, a slice
shares the underlying data rather than copying it.  Making a copy uses
extra memory for the copied data, but allows the old data to be GC'ed.
This matters if, say, you read a 200Mb chromosome from a file but only
want to retain a small fraction of it.  (Was that really clear?)
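
To illustrate the distinction, here is a rough sketch over a strict
ByteString rather than the BioSeq class itself.  It assumes Offset is
a plain zero-based Int and that the slice bounds are inclusive, neither
of which is settled above; keepRegion is a made-up helper name.
B.take and B.drop return views that share the original buffer, while
B.copy allocates a fresh one:

```haskell
import qualified Data.ByteString.Char8 as B

-- Assumption: offsets are zero-based Ints.
type Offset = Int

-- Cut a slice delimited by inclusive offsets.  The result shares the
-- underlying buffer with the input, so no sequence data is copied.
slice :: B.ByteString -> (Offset, Offset) -> B.ByteString
slice s (from, to) = B.take (to - from + 1) (B.drop from s)

-- Keep only a small region: copying the slice gives it its own buffer,
-- so the large original can be garbage collected once nothing else
-- refers to it.
keepRegion :: B.ByteString -> (Offset, Offset) -> B.ByteString
keepRegion chrom range = B.copy (slice chrom range)
```

With these definitions, slice (B.pack "ACGTACGT") (2,4) gives "GTA".
keepRegion returns the same bytes, but without holding on to the
original string.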

-k
_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell
