Re: current projects in progress and/or list of things to do

Vasili I. Galchin Thu, 04 Feb 2016 09:30:10 -0800

Thanks. I will think about your thoughts.

On Thursday, February 4, 2016, Christian Höner zu Siederdissen <choe...@bioinf.uni-leipzig.de> wrote:


> Hi Vasili,
>
> >        Are most other biohaskell members interested in your suggested
> >    functionality regarding the bowtie implementation?
>
> I think Olaf is more interested in the sequence formats that are in use
> in bowtie, than in bowtie itself. Having efficient (en|de)-coding of all
> kinds of bio-formats is quite useful for many of us.
>
> Particular algorithms (like bowtie) however, are probably of interest
> only to 1-2 people, if they happen to work on such a problem at the
> time.
>
> However, such things should *not* deter you, it makes for a good
> learning experience to build things up, especially if there is something
> like a reference implementation to compare to.
>
> Best regards,
> Christian
>
>
>
> * Vasili I. Galchin <vigalc...@gmail.com> [04.02.2016 06:12]:
> >    Olaf,
> >        Are most other biohaskell members interested in your suggested
> >    functionality regarding the bowtie implementation?
> >
> >        I looked a little at the bowtie C++ source. Mounds of code :-)
> >
> >        OK ... we need to look for "invariants" (not exactly as in pure
> >    maths, but something like it, IMO) between different software
> >    architectures. Sorry, I know my thoughts are very hard to follow. In
> >    software the notion of "invariant" is very loose. I am thinking "out
> >    loud", i.e. thinking as I am writing. OK: the Johns Hopkins author
> >    made architectural decisions that are linked to 1) his language
> >    choice, i.e. C++, and 2) his own ideas, which are language
> >    independent. We have to try to decouple 1) and 2) and derive the
> >    invariant by throwing away the pieces of 1) that are totally language
> >    dependent and hence have no bearing on the "target" language, e.g.
> >    Haskell. For example, the C++ implementation uses a "thread driver"
> >    to move the segment processing forward over time. Is this part of 1)
> >    or 2) (I am assuming that 1) and 2) are disjoint)? Answer: I don't
> >    know. I am sorry for all of the tortured thinking. Bottom line: we
> >    need to understand the C++ software architecture globally to see
> >    which part is language dependent (and can be thrown away) and which
> >    part is language independent, i.e. the invariant. Then we use the
> >    invariant part to design and implement in Haskell. Shields up: I
> >    anticipate flames in my lower parts .. :-(
> >
> >    Vasyl
> >
> >    On Tuesday, February 2, 2016, Olaf Klinke <o...@aatal-apotheke.de> wrote:
> >
> >      One is on GitHub, one is on sourceforge. Google
> >
> >      bwa site:github.com
> >
> >      or go to
> >
> >      http://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.2.6/
> >
> >      Olaf
> >
> >      > On 02.02.2016 at 01:33, Vasili I. Galchin <vigalc...@gmail.com> wrote:
> >      >
> >      > No promises. I have done a lot of reverse engineering in C/C++. So
> >      > where is the extant C/C++ source??
> >      >
> >      > On Mon, Feb 1, 2016 at 5:21 PM, Olaf Klinke <o...@aatal-apotheke.de> wrote:
> >      >> Yes, there is a TODO: Get rid of the "cheap watches" spam sent
> >      >> to this list. In fact the signal/noise ratio makes me want to
> >      >> unsubscribe.
> >      >>
> >      >> Other than that, I'd like to see someone reverse-engineer one of
> >      >> the major sequence index formats and provide a Haskell interface,
> >      >> so that we can design our own functional alignment algorithms
> >      >> instead of building shell scripts around bowtie or bwa.
> >      >> By reverse-engineer I mean look at the source code. It's all
> >      >> there, but poorly documented. I understand too little C/C++ to
> >      >> make sense of how precisely these index structures are stored.
> >      >> But if one could write a Data.Binary instance, that'd be awesome.
> >      >> Meanwhile I implemented a Lempel-Ziv compressor together with
> >      >> full-text search on the compressed data (not my idea). This is
> >      >> possible if one uses one trie for the entire text. However,
> >      >> full-text search only succeeds if the match overlaps a block
> >      >> boundary. That should be fine for sufficiently long queries.
> >      >>
> >      >> --Olaf
>
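A rough idea of what the Data.Binary instance Olaf asks for might look like, as a minimal sketch only. The IndexHeader record and its three fields are invented for illustration; they are not the actual bowtie or bwa on-disk layout, which would first have to be read off the C/C++ source. Only the calls from the binary package (put, get, putWord32le, getWord32le, putWord8, getWord8, encode, decode) are real.

```haskell
import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (getWord32le, getWord8)
import Data.Binary.Put (putWord32le, putWord8)
import Data.Word (Word32, Word8)

-- HYPOTHETICAL header layout, invented for this example.
-- It is NOT the real bowtie or bwa index format.
data IndexHeader = IndexHeader
  { version    :: Word32 -- format version, stored little-endian
  , textLength :: Word32 -- length of the indexed text, little-endian
  , flags      :: Word8  -- option bits
  } deriving (Eq, Show)

instance Binary IndexHeader where
  -- Write the fields in a fixed order with explicit endianness.
  put (IndexHeader v n f) = do
    putWord32le v
    putWord32le n
    putWord8 f
  -- Read the fields back in the same order.
  get = IndexHeader <$> getWord32le <*> getWord32le <*> getWord8

-- Encode to a lazy ByteString and decode again.
roundTrip :: IndexHeader -> IndexHeader
roundTrip = decode . encode
```

Spelling out the byte order by hand (rather than relying on the default Word32 instance, which is big-endian) seems the safer choice when matching a format written by a C/C++ program, but which byte order the real formats use is something the source would have to confirm.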

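For the Lempel-Ziv remark in Olaf's mail, here is a minimal LZ78 sketch in which a single trie, stored as a map of edges, serves as the dictionary for the whole text. It is not Olaf's implementation and it leaves out the full-text search on the compressed data; the names Token, Trie, compress and decompress are made up for this example.

```haskell
import qualified Data.Map.Strict as M

-- One LZ78 token: the index of the longest already-known phrase
-- (0 is the empty phrase) plus the character that follows it.
-- The character is Nothing only if the input ends inside a known phrase.
type Token = (Int, Maybe Char)

-- The trie as a flat map of edges:
-- (parent phrase index, next character) -> child phrase index.
type Trie = M.Map (Int, Char) Int

compress :: String -> [Token]
compress = go M.empty 1 0
  where
    -- Arguments: trie so far, next free phrase index, current trie node.
    go _ _ 0 [] = []
    go _ _ cur [] = [(cur, Nothing)]
    go trie next cur (c : cs) =
      case M.lookup (cur, c) trie of
        -- Edge exists: keep walking down the trie.
        Just node -> go trie next node cs
        -- No edge: emit a token, add the new phrase, restart at the root.
        Nothing ->
          (cur, Just c) : go (M.insert (cur, c) next trie) (next + 1) 0 cs

decompress :: [Token] -> String
decompress = go (M.singleton 0 "") 1
  where
    -- Rebuild the phrase table in the same order the compressor made it.
    go _ _ [] = []
    go phrases next ((i, mc) : ts) =
      let prefix = M.findWithDefault "" i phrases
          phrase = prefix ++ maybe "" (: []) mc
       in phrase ++ go (M.insert next phrase phrases) (next + 1) ts
```

Storing phrases as plain Strings in decompress is wasteful and only done to keep the sketch short.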