Maybe the BLAST package should be software for which users can develop third-party add-ons.

On 2/12/08, Martin Gollery <[EMAIL PROTECTED]> wrote:
>
> The first step is to implement it in C++ to see how fast it is. Once
> you have an executable, testing it will be relatively straightforward.
>
> Marty
>
>
> On Feb 11, 2008 8:21 AM, Theodore H. Smith <[EMAIL PROTECTED]> wrote:
> >
> > Hi everyone,
> >
> > So I've been working, on and off, on this algorithm for quite a while
> > now. It's basically an invention of mine. It is a "BLAST-like"
> > algorithm, in that it does "fuzzy lookup" operations across a database
> > of letters. I am designing this algorithm to be useful for
> > bioinformatics, which is the main field I am initially targeting.
> >
> > The database will be filled with protein sequences, and the query
> > searched against the database will be another protein sequence. The
> > algorithm has a "scoring matrix", which can accept different protein
> > replacement scores. The cost of inserting letters (protein letters)
> > can also be configured.
> >
> > In this sense, it's no different from Smith-Waterman. The same input,
> > the same output!
> >
> > The real difference from Smith-Waterman is its speed. My algorithm
> > will be hugely faster. This is because I use many techniques to avoid
> > processing unnecessary parts of the Smith-Waterman matrix.
> >
> > I also use many tricks to reuse computations across various proteins.
> > For example, the matrix for protein "ABCDE" is identical, at first
> > anyhow, to the matrix for "ABCDEFG". This means that if I have both
> > proteins "ABCDE" and "ABCDEFG" in my protein database, I can test
> > both of them against the search query in almost half the time. My
> > algorithm also runs in logarithmic time with respect to the size of
> > the database. Basically, bigger databases run disproportionately faster.
> >
> > I want to turn this algorithm into something useful for people. My
> > first challenge here is to answer the question "is this algorithm
> > faster, or better, than BLAST?"
> > If it is not faster, my algorithm basically has little use. But I
> > have good hopes it will be faster! I am very good with these sorts of
> > things, you see :) Speed is my strong point.
> >
> > Currently, I do not know about the speed, because I haven't
> > implemented a C++ version of my algorithm, or a good speed-testing
> > framework.
> >
> > I do, however, know that my algorithm is more accurate than BLAST,
> > because it is just as accurate as SSEARCH: mine uses the
> > Smith-Waterman algorithm, whereas BLAST uses a heuristic, intelligent
> > guesswork basically. A fine heuristic, but still a heuristic. Mine is
> > methodological, not heuristic-based.
> >
> > So here is what I am looking for!
> >
> > I am hoping that someone in the field will be able to offer me
> > guidance, interest, enthusiasm, suggestions and maybe even do some
> > testing for me.
> >
> > Perhaps a student doing a bioinformatics-related degree, who would
> > like to write a paper on an alternative way of processing protein
> > databases. My invention could be an interesting subject for a paper.
> >
> > Or perhaps a researcher who just has an interest in these sorts of
> > things! Perhaps a researcher who feels there must be a better way of
> > doing these things. Or anyone really in this field with the time and
> > interest, who feels that helping me could help him (or her) too in
> > some way.
> >
> > I'd like someone I can ask a lot of questions, show my software to,
> > and explain what I hope to achieve with it.
> >
> > Basically, my first questions to you would be "how would I set this up
> > to be useful for someone?" and "how would I test its usefulness, and
> > what would you need to know about my algorithm to decide to use it
> > over BLAST?"
> >
> > It's sort of a vague question from me, like "what do you need me to
> > do", but... well, that's where I am right now. Sort of a bit on the
> > outside, hoping someone on the inside will show me something.
> >
> > So it's an opportunity to tell me what you want, basically!! Tell me,
> > and I might just make it.
> >
> > Who knows? Maybe one day, in a few years' time, everyone will be using
> > this "ElfDataFuzzy" algorithm that I invented, instead of BLAST! You
> > might be part of something.
> >
> > Thanks to anyone who replies!
> >
> > --
> > http://elfdata.com/plugin/
> > "String processing, done right"
> >
> > _______________________________________________
> > BBB mailing list
> > [email protected]
> > http://www.bioinformatics.org/mailman/listinfo/bbb
>
> --
> Martin Gollery
> Senior Bioinformatics Scientist
> TimeLogic - a Division of Active Motif
> 775-833-9113
> 880 Northwood Blvd. Suite 7
> Incline Village, NV 89451

--
Best Regards
Sheng Wang
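
For anyone following the thread, the prefix-reuse idea in Theodore's message (the matrix for "ABCDE" matching the start of the matrix for "ABCDEFG") can be illustrated with a small sketch. This is not the ElfDataFuzzy algorithm, whose details are not given in the thread. It is plain Smith-Waterman scoring with a linear gap cost, plus a trie over the database so that sequences sharing a prefix share the corresponding matrix rows. The trie, the toy scoring function, and all names (`toy_score`, `next_row`, `sw_score`, `search_shared`) are assumptions made for illustration only; a real run would use a substitution matrix such as BLOSUM62.

```python
from typing import Callable, Dict, List, Tuple

Score = Callable[[str, str], int]


def toy_score(a: str, b: str) -> int:
    """Hypothetical substitution score: +2 for a match, -1 otherwise."""
    return 2 if a == b else -1


def next_row(prev: List[int], db_char: str, query: str,
             score: Score, gap: int) -> List[int]:
    """Compute one Smith-Waterman row (one database letter) from the row above."""
    row = [0] * (len(query) + 1)
    for j in range(1, len(query) + 1):
        row[j] = max(
            0,                                           # start a new local alignment
            prev[j - 1] + score(db_char, query[j - 1]),  # match or mismatch
            prev[j] - gap,                               # gap in the query
            row[j - 1] - gap,                            # gap in the database sequence
        )
    return row


def sw_score(db_seq: str, query: str,
             score: Score = toy_score, gap: int = 2) -> int:
    """Best local alignment score of one database sequence against the query."""
    row = [0] * (len(query) + 1)
    best = 0
    for ch in db_seq:
        row = next_row(row, ch, query, score, gap)
        best = max(best, max(row))
    return best


class Node:
    """Trie node: one node per distinct database prefix."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.ends: List[str] = []  # database sequences that end at this node


def search_shared(db: List[str], query: str, score: Score = toy_score,
                  gap: int = 2) -> Tuple[Dict[str, int], int]:
    """Score every database sequence, computing each shared prefix row once.

    Returns (scores by sequence, number of matrix rows computed).
    """
    root = Node()
    for seq in db:
        node = root
        for ch in seq:
            node = node.children.setdefault(ch, Node())
        node.ends.append(seq)

    results: Dict[str, int] = {}
    rows_computed = 0
    # Each stack entry: trie node, its matrix row, best score seen on the path.
    stack = [(root, [0] * (len(query) + 1), 0)]
    while stack:
        node, row, best = stack.pop()
        for seq in node.ends:
            results[seq] = best
        for ch, child in node.children.items():
            new_row = next_row(row, ch, query, score, gap)
            rows_computed += 1
            stack.append((child, new_row, max(best, max(new_row))))
    return results, rows_computed


if __name__ == "__main__":
    scores, rows = search_shared(["ABCDE", "ABCDEFG"], "BCDEF")
    print(scores, rows)
```

With the two example sequences, the shared version computes 7 matrix rows where scoring each sequence separately would compute 5 + 7 = 12, which lines up with the "almost half the time" estimate for this pair. It says nothing about the logarithmic-time claim or about how the approach would compare with BLAST on a real database; that would still need the C++ implementation and benchmark Marty suggests.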
