Some fun follow-up comments:

Indeed, lists are quite OK, especially if new cons cells come directly
out of the nursery -- I'm assuming that full fusion won't kick in, given
how complex the code is.

I've seen one of the memory plots from Sarah, and we'll Skype a bit
about it. In general, though, the code needs strictification and a move
towards stream fusion & vectors; but see:
http://ro-che.info/ccc/02
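
To make "strictification" concrete, here is a minimal sketch using only
base. It is not taken from the code under discussion; meanLazy and
meanStrict are made-up names for illustration. The usual culprit is an
accumulator whose fields stay lazy even though the fold itself is strict:

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Data.List (foldl')

-- Hypothetical example, not from the project being discussed.
-- foldl' forces the accumulator tuple to WHNF at every step, but not its
-- fields, so chains of (+) thunks may build up inside the pair.
meanLazy :: [Double] -> Double
meanLazy xs = s / fromIntegral n
  where
    (s, n) = foldl' (\(a, c) x -> (a + x, c + 1)) (0, 0 :: Int) xs

-- Same fold, but the bang patterns force both fields of the previous
-- accumulator before the next pair is built, so thunks stay shallow.
meanStrict :: [Double] -> Double
meanStrict xs = s / fromIntegral n
  where
    (s, n) = foldl' step (0, 0 :: Int) xs
    step (!a, !c) x = (a + x, c + 1)

main :: IO ()
main = do
  let xs = [1 .. 1000000] :: [Double]
  print (meanLazy xs)
  print (meanStrict xs)
```

Both versions compute the same value; any difference should only show up
in the heap profile, and how large it is depends on the GHC version and
optimisation flags.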

Of course, may we have some magic please: ;-)
https://github.com/choener/DnaProteinAlignment
[it's on hackage too, but I've just noticed a fun bug]
Anyway, we ran this algorithm on protein / mitogenome alignments quite
successfully. Of course, it'll still require an awesome amount of
memory [Again, more fun during the call ;-)].

===

And indeed, we should definitely keep the original version around; in
principle we can even QuickCheck everything (old == new).
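
A rough sketch of what such an (old == new) property could look like.
Everything here is an assumption for illustration: longest common
subsequence stands in for the real alignment, and lcsRef, lcsDP and
prop_oldEqualsNew are invented names. The block uses only base; with the
QuickCheck package installed, the property can be passed to quickCheck
unchanged.

```haskell
module Main where

import Data.List (foldl', zipWith4)

-- "Old" version (hypothetical stand-in): plain recursion over lists.
-- Exponential time, but easy to check by eye.
lcsRef :: String -> String -> Int
lcsRef [] _ = 0
lcsRef _ [] = 0
lcsRef (x:xs) (y:ys)
  | x == y    = 1 + lcsRef xs ys
  | otherwise = max (lcsRef xs (y:ys)) (lcsRef (x:xs) ys)

-- "New" version (hypothetical stand-in): row-by-row dynamic programming.
-- Each row has length ys + 1; the answer is the last cell of the last row.
lcsDP :: String -> String -> Int
lcsDP xs ys = last (foldl' step (replicate (length ys + 1) 0) xs)
  where
    step prev x = cur
      where
        -- cur refers to itself for the "left" neighbour of each cell.
        cur = 0 : zipWith4 cell ys prev (tail prev) cur
        cell y diag up left
          | x == y    = diag + 1
          | otherwise = max up left

-- The property: old and new agree. Inputs are truncated because the
-- reference version is exponential in the input length.
prop_oldEqualsNew :: String -> String -> Bool
prop_oldEqualsNew xs ys = lcsRef xs' ys' == lcsDP xs' ys'
  where
    xs' = take 10 xs
    ys' = take 10 ys

main :: IO ()
main = do
  -- Hand-picked cases; with QuickCheck available one would instead run
  --   quickCheck prop_oldEqualsNew
  let samples = ["", "ACGT", "GATTACA", "TTAGC"]
  print (and [prop_oldEqualsNew a b | a <- samples, b <- samples])
```

The point of keeping the slow version is exactly this: it is simple
enough to trust, so any disagreement points at the optimised code.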

Best regards,
Christian

PS: I actually prefer to have a lot of this stuff here on biohaskell,
that way we keep these discussions public.

* Ketil Malde <ke...@malde.org> [27.05.2014 13:58]:
> 
> Johannes Waldmann <johannes.waldm...@htwk-leipzig.de> writes:
> 
> > uh, actual lists (Prelude.[])?
> 
> *blush*
> 
> Seriously, they aren't all bad, the lists are produced and consumed
> lazily, so it's more like a loop.  Or it was intended to be, but clearly
> there are still some problems.
> 
> > there are lots of ways to represent matrices (both sparse and full).
> > what size are we talking about, for this application?
> 
> The matrix needs to be query sequence (typically nucleotide) times final
> target sequence (typically protein).  Apparently titin holds the record
> of some tens of thousands of amino acids with the transcript thrice that
> - so let's call it 100K x 35K or a total of 3.5G cells.  Ouch...maybe
> this isn't such a good idea after all.
> 
> -k
> -- 
> If I haven't seen further, it is by standing in the footprints of giants

* Johannes Waldmann <johannes.waldm...@htwk-leipzig.de> [27.05.2014 13:56]:
> Hi,
> 
> 
> > There are not that many choices of high-performance libraries: 
> 
> (this is just for my understanding)
> 
> in principle, the choice between different representations
> can be expressed by associated types?
> (or whatever it's called these days)
> 
> 
> > (i) my adpfusion library [1-core, high-level], (ii) ...
> 
> don't forget  (0) - straightforward implementation (Prelude.[])
> 
> should be used to  (a) understand the algorithmic idea
> (b) for automated tests (to compare results with (i) ..)
> 
> 
> - J.
> 
> 


