Re: [Biohaskell] GSoC 2013 is ON

Christian Höner zu Siederdissen Wed, 03 Apr 2013 14:19:08 -0700

Hi Ketil,

submit as a Haskell GSoC! I'd be very interested in hearing about
multitape applications for DP. And I think I could help with performance
issues.
Once PLoS ONE is back, I'll take a look at your paper.


Gruss,
Christian

PS: Now that I have handed in my thesis, I should have time to publish
the multitrack extensions to ADPfusion. It might be possible to draw some
pointers from that for your stuff / GSoC.

* Ketil Malde <ke...@malde.org> [03.04.2013 22:40]:
> 
> [CC everybody including the biohaskell list. Let me know if any of you
> want off. :-) ]
> 
> Pjotr Prins <pjotr2...@thebird.nl> writes:
> 
> >   http://www.open-bio.org/wiki/Google_Summer_of_Code
> 
> > For Biopython (3x), BioRuby (5x) and BioJava (4x) I found project ideas.
> 
> > The others are missing.
> 
> > There is still a (rather small) window of opportunity for adding
> > ideas.
> 
> I have one thing that might work well as a SOC project, if the right
> student could be found.
> 
> Basically, I and a colleague recently developed and published a method
> and implementation for more sensitive pairwise alignments.  The paper is
> here, I think (PLoS ONE seems to be down atm):
>   http://dx.plos.org/10.1371/journal.pone.0054422
> 
> I'm really happy about the results; if nothing else, check the SCOP
> benchmark.  Although it's difficult to construct a good test case using
> more complex methods (training sets for HMMs and whatnot), I don't know
> of anything that is as good as this.  We're using it for annotation of
> genes.
> 
> The current implementation is in Haskell, and although it works
> correctly, it is a bit slow and, more problematically, it consumes too
> much memory (so going multi-threaded, although pretty easy, won't be of
> any help).
> 
> I would like to make this into a less resource intensive (and thus more
> practical) tool, and there are two ways I can think of to go about this:
> 
> 1) Optimize the Haskell program
> 
> 2) Reimplement the algorithm (or parts of it) in a different language
> 
> Advantages of 1:
> 
> * Already have a working program, and the type system makes it easy to
> refactor without introducing errors.
> * Haskell supports lots of good multi-threading programming models (like
> STM)
> * I know Haskell pretty well, and will hopefully be able to mentor.
> 
> Disadvantages:
> 
> * Haskell has some good debugging tools, but they tend to work really
>   poorly for large memory (i.e. it takes a long time to generate
>   profiles)
> * Needs somebody with a bit (or a lot) of experience optimizing Haskell,
>   and good knowledge of high-perf libraries (like vector)
> 
> Advantages of 2:
> 
> * Easier to get a student with adequate skills.
> * More predictable performance models in other languages.
> * Easier to compile and install for many users.
> 
> Disadvantages:
> 
> * Ideally, should know enough Haskell to read and understand the code.
> * Likely needs a co-mentor with knowledge of the language in question.
> 
> Is this something I could or should submit as a task?
> 
> -k
> -- 
> If I haven't seen further, it is by standing in the footprints of giants
> _______________________________________________
> Biohaskell mailing list
> Biohaskell@biohaskell.org
> http://malde.org/cgi-bin/mailman/listinfo/biohaskell

