Solution to the additional BlastDB problem that Namshin reported as issue 49:
BlastDB is deprecated. Use SequenceFileDB instead as the standard
sequence database class.
The delay that Namshin reported is due to BlastDB's support for NCBI
ID mangling (i.e. the fact that NCBI blastall reports back "fake IDs"
that do not match the original IDs in the FASTA file). BlastDB handles
this by creating a lookup table for translating the fake IDs to the
correct IDs. The first request for an ID that doesn't match a known ID
triggers construction of this table, hence the delay. Note that
building this table also takes up memory.
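
To make the mechanism concrete, here is a toy sketch of a lazily built
translation table. It is not pygr's actual code: the class name
LazyIdTranslator, the "fake|N" ID scheme, and the table_builds counter
are all made up for illustration. It only shows why a lookup for a known
ID is fast while the first lookup for an unknown ID pays the full cost
of building the table.

```python
class LazyIdTranslator(object):
    """Toy stand-in for a database that translates fake IDs lazily.

    Hypothetical illustration only; not the pygr implementation.
    """

    def __init__(self, real_ids):
        self.real_ids = list(real_ids)
        self._real_set = set(self.real_ids)
        self._fake_to_real = None  # not built until a lookup misses
        self.table_builds = 0      # counts how often the table was built

    def _build_table(self):
        # Expensive step: one pass over every ID in the database.
        # The "fake|N" naming is an assumption made for this sketch.
        self._fake_to_real = dict(
            ("fake|%d" % i, real_id)
            for i, real_id in enumerate(self.real_ids)
        )
        self.table_builds += 1

    def __contains__(self, key):
        if key in self._real_set:
            return True  # fast path: no table needed
        if self._fake_to_real is None:
            self._build_table()  # first miss pays the full cost
        return key in self._fake_to_real


db = LazyIdTranslator(["1", "2", "3"])
hit = "1" in db          # known ID, table is not built
builds_after_hit = db.table_builds
miss = "1A" in db        # unknown ID, triggers table construction
builds_after_miss = db.table_builds
```

In this sketch the table is built once and reused, so only the first
miss is slow, and the table stays in memory afterwards.
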
The solution is simple: switch to using the base class
(SequenceFileDB, or BlastDBbase which adds blast() etc. methods)
unless you really need the NCBI ID mangling support -- in which case
this delay is unavoidable.
On Nov 17, 2008, at 3:45 PM, Namshin Kim wrote:
> Hi Chris,
>
> Now, I got another problem.
>
> >>> from pygr import seqdb
> >>> R1 = seqdb.BlastDB('R1')
> >>> R1.has_key('1')
> True
> >>> R1.has_key('1A') # EXTREMELY SLOW
>
> You can use the same BlastDB as previous test. Python does not show
> any increase of memory usage.
>
> Yours,
> Namshin Kim
>