Re: [Biohaskell] seqlabel vs seqheader

Christian Höner zu Siederdissen Tue, 11 Dec 2012 03:37:15 -0800

* Ketil Malde <[email protected]> [11.12.2012 10:29]:
> Dan Fornika <[email protected]> writes:
> 
> > I also vote in favour of seqid + seqheader.
> 
> I've pushed a preliminary change to malde.org/~ketil/biohaskell/biocore
> if you want to take a look.  I sympathise with the sentiment, Felipe,
> but I think seqid is just too important to justify the extra clutter of
> a decomposing function.


I hope to have time to look at it this weekend.

> 
> >> Is it okay to use Monoid for appending and (m)empty?  And have separate
> >> 'slice' and 'copy' (or perhaps 'defragment')? 
> 
> > This sounds interesting to me, but the stuff about Monoid is over my
> > head.  
> 
> Monoid provides a general way to combine values into a value of the same
> type:
> 
>   class Monoid a where
>     mempty :: a
>     mappend :: a -> a -> a
> 
> For lists, mempty is the empty list and mappend is concatenation. But for
> a numeric type i you can also define (Sum i) where mempty is zero, and
> mappend is addition, or (Product i) where mempty is one and mappend is
> multiplication.  And so on.
> 
> For SeqData etc, this would be the empty sequence and concatenation,
> respectively (are there any others that make sense?).  This would make
> it easier to provide this functionality for implementations of BioSeq.

Maybe take a look at ListLike and see whether its methods make sense
here: indexing, head, tail, and so on. Actually, since BioSeq is a class,
there isn't really a lot to do. Auto-deriving ListLike and Monoid
for SeqData and SeqLabel is probably enough ...
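
To make the Monoid idea concrete, here is a minimal sketch with hand-written
instances. It assumes SeqData is a newtype around a lazy ByteString (the real
biocore definition may differ), and joinFragments is a hypothetical helper
that exists only for this example. On current GHC (base >= 4.11) Semigroup is
a superclass of Monoid, so both instances are needed.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL

-- Assumption: SeqData wraps a lazy ByteString; the real type may differ.
newtype SeqData = SeqData BL.ByteString
  deriving (Eq, Show)

-- Appending two sequences is concatenation of the underlying data.
instance Semigroup SeqData where
  SeqData a <> SeqData b = SeqData (a <> b)

-- The identity element is the empty sequence.
instance Monoid SeqData where
  mempty = SeqData BL.empty

-- Hypothetical helper: join fragments using mconcat from Monoid.
joinFragments :: [SeqData] -> SeqData
joinFragments = mconcat
```

With this in place, generic code written against Monoid (mconcat, foldMap)
works on sequences without any BioSeq-specific append function.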

> 
> > What would be the purpose of the 'slice' and 'copy' functions? 
> 
>   slice     :: s -> (Offset,Offset) -> s  -- ^ Cut a slice of a sequence.
>   copy      :: s -> s                     -- ^ Create a copy of a sequence.
> 
> 'slice' would select a substring of a sequence, delimited by the
> offsets.  (Inclusive, I guess?)

Start + length or start + stop coordinates? Vector uses start + length.
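
The two conventions side by side, as a rough sketch on a plain lazy
ByteString. Offset as an Int64 synonym, 0-based indexing, and the names
sliceLen and sliceIncl are all assumptions for illustration, not the
proposed BioSeq API.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Int (Int64)

-- Assumption: offsets are 0-based Int64 values.
type Offset = Int64

-- start + length convention (the one Vector's slice uses).
sliceLen :: BL.ByteString -> (Offset, Offset) -> BL.ByteString
sliceLen s (start, len) = BL.take len (BL.drop start s)

-- start + stop convention, with both ends inclusive.
sliceIncl :: BL.ByteString -> (Offset, Offset) -> BL.ByteString
sliceIncl s (start, stop) = sliceLen s (start, stop - start + 1)
```

Either one can be defined in terms of the other, so the choice is mostly
about which is less error-prone for callers: start + length makes an empty
slice easy to express, while inclusive start + stop matches how annotation
coordinates are usually written.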

Regards,
Christian

> 
> 'copy' would be analogous to bytestring, as slices are not going to copy 
> the underlying data by default. Making a copy will use more memory since
> it copies the data, but will allow the old data to be GC'ed.  Say if you
> read a 200Mb chromosome from a file, but only want to retain a small
> fraction of it.  (Was that really clear?)
> 
> -k
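
For reference, the bytestring behaviour that 'copy' mirrors can be sketched
as follows. retainRegion is a hypothetical helper, and strict ByteString is
used only to keep the example small; take and drop share the original
buffer, while copy allocates a fresh one of just the slice's size.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: keep a small region of a large sequence.
-- B.drop and B.take are O(1) and share the original buffer, so the result
-- of those alone would keep the whole chromosome alive. B.copy allocates a
-- new buffer holding only the slice, so the original can be GC'ed once
-- nothing else refers to it.
retainRegion :: Int -> Int -> B.ByteString -> B.ByteString
retainRegion start len = B.copy . B.take len . B.drop start
```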


_______________________________________________
Biohaskell mailing list
[email protected]
http://malde.org/cgi-bin/mailman/listinfo/biohaskell