> how much faster is your library processor than iterating over blasting  
> all the sequences in the simple-minded approach?

It all depends on the size of the blast database you are comparing your
library to. In my case, I had a longish (1.2kb) sequence that I needed
to compare lots of 454 reads to. For each read, inside blastall itself,
the 1.2kb reference sequence needs to be read using fastacmd, which we
know is slow, plus there's some initialization going on, which also
takes some time. Multiply that by the number of reads (on the order of
10^5) and you can see that there's a lot of unnecessary work being
performed.
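
To make the arithmetic concrete, here is a back-of-the-envelope cost
model. It is only a sketch: the function names and the per-call timings
are hypothetical placeholders, not measurements and not pygr code.

```python
def per_read_cost(n_reads, t_setup, t_search):
    """Total time when the setup cost is paid again for every read."""
    return n_reads * (t_setup + t_search)


def batched_cost(n_reads, t_setup, t_search):
    """Total time when the setup cost is paid once for the whole library."""
    return t_setup + n_reads * t_search


# Hypothetical values, for illustration only (seconds per call).
n_reads = 10 ** 5   # order of magnitude mentioned above
t_setup = 0.05      # assumed: fastacmd read + initialization
t_search = 0.01     # assumed: the actual search for one read

# The difference is pure repeated setup: (n_reads - 1) * t_setup.
wasted = (per_read_cost(n_reads, t_setup, t_search)
          - batched_cost(n_reads, t_setup, t_search))
```

Whatever the real timings are, the avoidable part grows linearly with
the number of reads, so the saving matters most for large libraries
compared against small databases.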

> Then there is the issue of testing -- do you  
> have (or could you write) a test suite that would test this  
> thoroughly, that we could incorporate into our standard tests or  
> megatests (if the tests would take too long or require resources too  
> big to be incorporated in the standard release package)?
I have no experience writing test cases, but I suppose I could give it
a go early next week. I'm thinking of following the usage example I
described above, i.e. one long sequence as a blast database, plus a
bunch of its sub-sequences with known match coordinates as a library.
This should catch all of the problems I can think of.
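
A minimal sketch of how such test data could be generated, assuming a
random reference and fixed-length reads. All names and sizes here
(make_test_case, write_fasta, 1200 bases, 100-base reads) are
illustrative assumptions, and no pygr or BLAST calls are made.

```python
import random


def make_test_case(ref_len=1200, n_reads=20, read_len=100, seed=0):
    """Build one random reference plus sub-sequences with known coordinates.

    Returns (reference, reads), where reads maps a read name to a
    (start, stop, sequence) tuple with reference[start:stop] == sequence.
    """
    rng = random.Random(seed)  # fixed seed keeps the test reproducible
    reference = ''.join(rng.choice('ACGT') for _ in range(ref_len))
    reads = {}
    for i in range(n_reads):
        start = rng.randrange(ref_len - read_len + 1)
        stop = start + read_len
        reads['read%d' % i] = (start, stop, reference[start:stop])
    return reference, reads


def write_fasta(path, records):
    """Write (name, sequence) pairs to a FASTA file."""
    with open(path, 'w') as out:
        for name, seq in records:
            out.write('>%s\n%s\n' % (name, seq))


reference, reads = make_test_case()
```

A test would then build the blast database from the reference, run the
reads through the library processor, and check that each reported hit
covers the (start, stop) interval recorded for that read.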

Cheers,
Alex

P.S.

> FYI, I have summarized the blast refactoring in the Pygr Dev  
> discussion group and also somewhat in the issue tracker.  
Sorry, I fell about a year behind on the new goings-on with pygr; I'm
trying to catch up.
