Thanks. I will think about your thoughts.

On Thursday, February 4, 2016, Christian Höner zu Siederdissen <choe...@bioinf.uni-leipzig.de> wrote:
> Hi Vasili,
>
> > Are most other biohaskell members interested in your suggested
> > functionality regarding the bowtie implementation?
>
> I think Olaf is more interested in the sequence formats that are in use
> in bowtie than in bowtie itself. Having efficient (en|de)-coding of all
> kinds of bio-formats is quite useful for many of us.
>
> Particular algorithms (like bowtie), however, are probably of interest
> only to 1-2 people, if they happen to work on such a problem at the
> time.
>
> However, such things should *not* deter you. It makes for a good
> learning experience to build things up, especially if there is something
> like a reference implementation to compare to.
>
> Best regards,
> Christian
>
> * Vasili I. Galchin <vigalc...@gmail.com> [04.02.2016 06:12]:
> > Olaf,
> > Are most other biohaskell members interested in your suggested
> > functionality regarding the bowtie implementation?
> >
> > I looked a little at the bowtie C++ source. Mounds of code :-)
> >
> > OK ... we need to look for "invariants" (not exactly as in pure
> > maths, but something like it, IMO) between different software
> > architectures. Sorry, I know my thoughts are very hard to understand.
> > In software the notion of "invariant" is very loose. I am thinking
> > "out loud", i.e. thinking as I am writing. OK ... the Johns Hopkins
> > author made architectural decisions that are linked to 1) his language
> > choice, i.e. the C++ language, and 2) his personal ideas, which are
> > language independent. I am thinking. We have to try to decouple 1) and
> > 2) to derive the invariant by throwing away the pieces of 1) that are
> > totally language dependent and hence have no bearing on the "target"
> > language, e.g. Haskell. E.g. the C++ implementation uses a "thread
> > driver" to move the segment processing forward over time. Is this part
> > of 1) or 2) (I am assuming that 1) and 2) are disjoint)? Answer: I
> > don't know.
> > I am sorry for all of the tortured thinking. Bottom line: we need to
> > understand the C++ software architecture globally, to see which part
> > is language dependent (and can be thrown away) and which part is, I
> > think, language invariant. Then use the language-invariant part to
> > design and implement in Haskell. Shields up: I anticipate flames in my
> > lower parts .. :-(
> >
> > Vasyl
> >
> > On Tuesday, February 2, 2016, Olaf Klinke <o...@aatal-apotheke.de> wrote:
> >
> > One is on GitHub, one is on sourceforge. Google
> >
> > bwa site:github.com
> >
> > or go to
> >
> > http://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.2.6/
> >
> > Olaf
> >
> > > On 02.02.2016 at 01:33, Vasili I. Galchin <vigalc...@gmail.com> wrote:
> > >
> > > No promises. I have done a lot of reverse engineering in C/C++. So
> > > where is the extant C/C++ source??
> > >
> > > On Mon, Feb 1, 2016 at 5:21 PM, Olaf Klinke <o...@aatal-apotheke.de> wrote:
> > >> Yes, there is a TODO: Get rid of the "cheap watches" spam sent to
> > >> this list. In fact the signal/noise ratio makes me want to
> > >> unsubscribe.
> > >>
> > >> Other than that, I'd like to see someone reverse-engineer one of the
> > >> major sequence index formats and provide a Haskell interface, so that
> > >> we can design our own functional alignment algorithms instead of
> > >> building shell scripts around bowtie or bwa.
> > >> By reverse-engineer I mean look at the source code. It's all there,
> > >> but poorly documented. I understand too little C/C++ to make sense of
> > >> how precisely these index structures are stored. But if one could
> > >> write a Data.Binary instance, that'd be awesome.
> > >> Meanwhile I implemented a Lempel-Ziv together with full-text search
> > >> on the compressed data (not my idea). This is possible if one uses
> > >> one trie for the entire text.
> > >> However, full-text search only succeeds if the match overlaps a
> > >> block boundary. That should be fine for sufficiently long queries.
> > >>
> > >> --Olaf
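
For readers wondering what the Data.Binary instance Olaf asks for might look like, here is a minimal sketch. It assumes a made-up three-field header: the type `IndexHeader`, its fields, the little-endian layout, and the magic value are all hypothetical and are not taken from the bowtie or bwa sources. The real layout would have to be recovered from the C/C++ code first.

```haskell
import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (getWord32le)
import Data.Binary.Put (putWord32le)
import Data.Word (Word32)

-- | Hypothetical header of a sequence index file. The fields and their
-- order are invented for illustration; this is NOT the bowtie/bwa layout.
data IndexHeader = IndexHeader
  { magic     :: Word32  -- ^ made-up format tag
  , seqLength :: Word32  -- ^ number of bases in the indexed text
  , numBlocks :: Word32  -- ^ number of index blocks that follow
  } deriving (Eq, Show)

-- Serialise the three fields as consecutive little-endian 32-bit words
-- (an assumption) and read them back in the same order.
instance Binary IndexHeader where
  put (IndexHeader m l b) = putWord32le m >> putWord32le l >> putWord32le b
  get = IndexHeader <$> getWord32le <*> getWord32le <*> getWord32le

main :: IO ()
main = do
  let hdr = IndexHeader 0x1DE5 1000 4
  -- Round trip: decoding the encoded header should give back the original.
  print (decode (encode hdr) == hdr)
```

Once the real field order and widths are known, only the `put` and `get` bodies would need to change; code using `encode` and `decode` stays the same.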