Hi Titus,

I just posted a new branch "simpleframe" on github, which implements everything we've been discussing for simplifying blastx etc. It drastically simplifies the tblastn and blastx code. All the pipeline and transformations stuff is unnecessary now.

I simply created a new "sequence database" class called TranslationDB that wraps a nucleotide sequence database and provides an interface to six-frame translations of the nucleotide sequences, as you had suggested. That means that blastx / tblastn / tblastx are processed exactly the same as blastn / blastp (they just substitute a TranslationDB for the source and / or target sequence databases).

All blastx results are now stored in a single NLMSA, rather than being divided into a separate NLMSA / NLMSASlice for each hit as before.
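To make the six-frame idea concrete, here is a minimal standalone sketch of a wrapper that exposes the six translations of each sequence in a nucleotide "database". It is not the TranslationDB code from the branch: the class name SixFrameView, its frames() method, and the use of a plain dict as the database are all assumptions made for illustration, and the standard genetic code is assumed throughout.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


class SixFrameView:
    """Hypothetical wrapper (not pygr's API) around a nucleotide database.

    The wrapped database is assumed to be any mapping of sequence ID to
    DNA string.
    """

    def __init__(self, nucleotide_db):
        self.nucleotide_db = nucleotide_db

    def frames(self, seq_id):
        """Return {frame: protein} for frames +1, +2, +3, -1, -2, -3."""
        dna = self.nucleotide_db[seq_id].upper()
        rc = reverse_complement(dna)
        result = {}
        for offset in range(3):
            result[offset + 1] = translate(dna[offset:])    # forward strand
            result[-(offset + 1)] = translate(rc[offset:])  # reverse strand
        return result


view = SixFrameView({"seq1": "ATGGCCTAA"})
six_frames = view.frames("seq1")
print(six_frames[1])  # forward frame 1: "MA*"
```

Once the translations are presented this way, a protein search can treat them like any other protein sequences, which is the property that lets blastx / tblastn / tblastx share the blastn / blastp code path.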
BlastxMapping.__call__() now returns an NLMSA just as BlastMapping.__call__() does. The only difference is that because blastx (potentially) converts the query sequence to multiple translations, BlastxMapping.__getitem__() cannot just return a single NLMSASlice, but instead returns an iterator for one or more NLMSASlices representing each of those six-frame translations.

http://github.com/cjlee112/pygr/tree/simpleframe

For a first glance at this I suggest you just look at the end-product (BlastxMapping in blast.py, and the new translationDB.py), then take a look at the commit history...

-- Chris
