On Oct 20, 10:39 pm, Christopher Lee <[EMAIL PROTECTED]> wrote:
> OK. I now understand the problem. The bsddb module btree index is
> screwing us over: when you simply ask for an iterator, it apparently
> loads the entire index into memory.
Is this really true? The bsddb module is very heavily used by lots of
people to store dictionaries that do not fit into memory, and I have
never heard anyone mention this problem before.

For my own curiosity I wrote a small test script (see below) that
first creates 10 million entries (a 536 MB file) and then attempts to
iterate over them. I see no slowdown and no noticeable growth in memory
use while iterating over the elements, and I get millisecond-level
access to keys.
---------------
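import bsddb
import time

filename = 'test.db'  # placeholder path; any writable location will do

def create():
    # build a btree file with 10 million string keys
    db = bsddb.btopen(filename, 'n')
    # xrange avoids building a 10-million-element list in memory
    for key in xrange(10**7):
        key = str(key)
        db[key] = key
    db.close()

def read():
    db = bsddb.btopen(filename, 'r')
    start = time.time()
    # iterating on the database
    print db.first(), db.next(), db.next(), db.last()
    # with a custom iterator
    it = iter(db)
    print it.next(), it.next(), it.next()
    end = time.time()
    print 'Elapsed: %s' % (end-start)
    db.close()

#create()  # uncomment on the first run to build the file
read()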
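
The script above only times the first few keys. To also put a number on memory, something like the following sketch might help. It is not part of my original test and does not touch bsddb at all: it is written for Python 3 (it relies on tracemalloc), and probe_iteration, lazy_keys and eager_keys are hypothetical names, with the last two standing in for an index that streams keys and one that loads them all up front.

```python
import itertools
import time
import tracemalloc


def probe_iteration(make_iter, n=3):
    """Take the first n items from make_iter() and measure the cost.

    make_iter is a zero-argument callable returning a fresh iterator.
    Returns (items, elapsed_seconds, peak_bytes).
    """
    tracemalloc.start()
    start = time.perf_counter()
    items = list(itertools.islice(make_iter(), n))
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return items, elapsed, peak


def lazy_keys(count=10**5):
    # stand-in for an index that yields keys one at a time
    return (str(k) for k in range(count))


def eager_keys(count=10**5):
    # stand-in for an index that materializes every key before iterating
    return iter([str(k) for k in range(count)])


if __name__ == '__main__':
    for name, factory in [('lazy', lazy_keys), ('eager', eager_keys)]:
        items, elapsed, peak = probe_iteration(factory)
        print('%s: %r in %.4fs, peak %d bytes' % (name, items, elapsed, peak))
```

With a real database handle the call would presumably look like probe_iteration(lambda: iter(db)). Note that tracemalloc only sees allocations made through Python's own allocator, so memory taken inside a C library would have to be checked at the process level instead (top, or resource.getrusage).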