On Oct 21, 2008, at 6:30 AM, Istvan Albert wrote:
> On Oct 21, 9:05 am, "C. Titus Brown" <[EMAIL PROTECTED]> wrote:
>>
>> It's a BsdDBShelf problem, not a bsddb issue -- see attached script.
>> The 'iter' call seems to be what's loading the index.
>
> oy, indeed. I think I know what is going on,
>
> the BsdDBShelf class does not have a __iter__ method, and it inherits
> from the Shelf -> UserDict.DictMixin. And it looks like that class
> unwinds the keys into a list when creating the iterator.

Yes, the problem is not with bsddb itself, but with shelve. As you point out, it inherits DictMixin's __iter__ which in turn calls keys() and keeps the entire list in memory until the iterator is garbage-collected. My apologies for not having distinguished between whether the problem lay in bsddb or in shelve.

> I noticed how iterating on the database itself is still very fast. So
> I think there might be an easy fix here, replacing the iter(db) call
> with iterating on the database directly.

Since pygr uses its own flavor of Shelf (pygr.dbfile.BtreeShelf, which defaults to using btree instead of hash), we can work around this Shelf problem by simply implementing a BtreeShelf.__iter__ method that directly calls the btree object's iterator method. I will now do that.

Thanks for working all this out so quickly, Istvan and Titus!!!

-- Chris
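
For anyone following along, here is a rough sketch of the difference between the inherited __iter__ and the proposed workaround. It does not import bsddb or pygr: FakeBtree, KeysListMixin, SlowShelf and BtreeShelfSketch are hypothetical stand-ins that only mimic the behaviour described above (an __iter__ that goes through keys() versus one that delegates to the database's own iterator). The real BtreeShelf.__iter__ may well look different.

```python
# Sketch only: every class here is a hypothetical stand-in, not the real
# bsddb, UserDict.DictMixin, shelve.Shelf or pygr.dbfile.BtreeShelf code.


class FakeBtree:
    """Stand-in for a btree database object that can yield keys lazily."""

    def __init__(self, n):
        self.n = n
        self.keys_calls = 0  # how many times the full key list was built

    def keys(self):
        # Builds the entire key list in memory (the expensive path).
        self.keys_calls += 1
        return ["key%d" % i for i in range(self.n)]

    def __iter__(self):
        # Yields one key at a time without building a list.
        for i in range(self.n):
            yield "key%d" % i


class KeysListMixin:
    """Mimics the inherited behaviour: __iter__ is built on top of keys()."""

    def keys(self):
        return self.dict.keys()

    def __iter__(self):
        for k in self.keys():  # whole key list is loaded before the first yield
            yield k


class SlowShelf(KeysListMixin):
    """Shelf-like wrapper that relies on the mixin's __iter__."""

    def __init__(self, db):
        self.dict = db


class BtreeShelfSketch(SlowShelf):
    """Same wrapper, but iteration bypasses keys() entirely."""

    def __iter__(self):
        # Delegate straight to the underlying database's iterator.
        return iter(self.dict)
```

With these stand-ins, fetching just the first key from a SlowShelf already triggers one full keys() call on the underlying object, while iterating a BtreeShelfSketch never calls keys() at all.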
