Hi,
in response to Titus' suggestion that we stop using "BlastDB" as the
name of our main sequence database classes, I've pursued several lines
of work to separate sequence database functionality from BLAST
functionality into independent modules:
- refactored sequence databases to a new base class SequenceDB and its
file-based storage subclass SequenceFileDB. No BLAST functionality is
embedded in these classes. You should use SequenceFileDB instead of
BlastDB as the default class for opening a sequence database.
- eliminated the indexing delays associated with BlastDB; for details
see issue 49
- new proposal (issue 34): treat BLAST functionality as a mapping,
just like any other Pygr mapping. That means a graph-like interface,
i.e.
    for target in blastmap[myquery]:
        pass  # do something with target
or
    for src,target,edge in blastmap[myquery].edges():
        pass  # do something with src, target, edge
Alternatively, if you wanted it to store multiple search results into
one NLMSA (e.g. for saving an all-vs-all alignment), you would use it
as a callable:
    for myquery in lotsaQueries:
        blastmap(myquery, al=myNLMSA) # save results into myNLMSA
    myNLMSA.build() # finally, index the results for querying
We and other developers would code a different mapping class for each
kind of homology search (blast, megablast, etc.); a user would
instantiate the desired mapping class by specifying the target
database to search, e.g.
    blastmap = BlastMapping(hg17)
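To make the shape of the proposed interface concrete, here is a minimal toy sketch in plain Python. It is not Pygr code: ToyBlastMapping and ToyResult are hypothetical names, the "search" is just an exact substring match over a dict of strings, and a plain list stands in for the NLMSA. It only illustrates how one object could support both the graph-like lookup and the callable form described above.

```python
class ToyResult(object):
    """Hypothetical stand-in for the graph-like result of one query."""
    def __init__(self, query, hits):
        self.query = query
        self.hits = hits  # list of (target_id, edge_info) pairs

    def __iter__(self):
        # iterating yields only the targets
        return iter([target for target, _ in self.hits])

    def edges(self):
        # (source, target, edge) triples
        return [(self.query, target, edge) for target, edge in self.hits]


class ToyBlastMapping(object):
    """Hypothetical mapping bound to one target database (a dict here)."""
    def __init__(self, targetDB):
        self.targetDB = targetDB  # id -> sequence string

    def _search(self, query):
        # assumption: exact substring match replaces a real homology search
        hits = []
        for tid in sorted(self.targetDB):
            pos = self.targetDB[tid].find(query)
            if pos >= 0:
                hits.append((tid, {'start': pos, 'stop': pos + len(query)}))
        return hits

    def __getitem__(self, query):
        # graph-like interface: blastmap[myquery]
        return ToyResult(query, self._search(query))

    def __call__(self, query, al=None):
        # callable interface: append results to a caller-supplied store
        if al is None:
            al = []
        al.extend(self[query].edges())
        return al


targets = {'seq1': 'ACGTACGT', 'seq2': 'TTTTGTAC', 'seq3': 'GGGG'}
blastmap = ToyBlastMapping(targets)

hits = list(blastmap['GTAC'])      # graph-like lookup for a single query

store = []                         # plays the role of myNLMSA
for myquery in ['GTAC', 'GGG']:
    blastmap(myquery, al=store)    # accumulate results from several queries
```

In this sketch the callable form simply reuses the lookup and appends its edges to the store, so the two entry points cannot drift apart; whether the real classes should be layered that way is an open question.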
What do people think of this proposal? Do you see any pitfalls or
confusing aspects? Is this worth including in the upcoming 0.8
release? See issue 34 for more details.
Yours,
Chris