On Tue, Nov 16, 2010 at 4:17 AM, Antonio Cuni <[email protected]> wrote:
> Hi Dan,
> first: thanks for your help :-)
>
> On 16/11/10 03:17, Dan Stromberg wrote:
>
>> Yes, the dbm module in pypy is basically like the bsddb module in cpython.
>>
>> cpython includes modules for bsddb, gdbm, and more.
>>
>> I tend to prefer gdbm over bsddb, because I've seen bsddb databases get
>> corrupt too many times - EG, when a filesystem overflows. bsddb might be a
>> little faster though; I've never compared their performance.
>
> So, if I understand correctly you are saying that we should rename our
> dbm.py to bsdb.py, and write a new dbm.py which can use either bsdb or gdbm?
> Sounds fine, do you feel like implementing it? :-)
>
> Moreover, I also agree with amaury that your code is very similar to the
> one in the current dbm.py, so we should maybe try to refactor things to
> share common parts between the twos.
>
> ciao,
> Anto

Wow, CPython's Berkeley DB interface is actually quite a bit more
comprehensive and complex than I'd realized. This isn't just a matter of
renaming dbm.py to bsddb.py and refactoring a bit. It's more of a time
commitment to something I don't use than I'd thought.

Although pypy's current dbm.py resembles cpython's Berkeley DB interface in
some respects, the two aren't all that similar. It uses a subset of the same
on-disk representations, but the API appears to be pretty different. This is
based on playing around in the unit tests and the bsddb module a bit.

I actually suggest:

1) svn rename'ing dbm.py into some unused directory for history's sake; it
   implements the ndbm _interface_ well (a little oddly called "dbm" in
   cpython - but I believe true "dbm" is "one database per program"), but
   it's not really all that similar to bsddb.
2) Adding the gdbm.py module I wrote, more or less verbatim.

I got into the project of merging these two things thinking that bsddb was
mostly just like the gdbm module, but bsddb is actually quite a bit more
involved, and is something I've pretty much stopped using due to bad
experiences with bsddb database corruption from both cpython and C.

Given #1 and #2 above, anydbm should continue working, due to the presence
of gdbm and dumbdbm.

I guess I think that if someone has a need for bsddb (and its assorted
interfaces), they probably should work on that.

Sound reasonable?
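
To illustrate why anydbm should keep working without bsddb, here is a rough sketch of the "use whichever backend imports" idea. This is not anydbm's actual code: the helper names first_available_backend and roundtrip are made up for the example, and the candidate list simply mixes the Python 2 module names (gdbm, dumbdbm) with their Python 3 spellings (dbm.gnu, dbm.dumb) so the sketch runs on either.

```python
import importlib
import os
import tempfile

# Candidate backends, in order of preference. bsddb is deliberately absent.
# "gdbm"/"dumbdbm" are the Python 2 names; "dbm.gnu"/"dbm.dumb" are the
# Python 3 spellings of the same modules.
CANDIDATES = ("gdbm", "dbm.gnu", "dumbdbm", "dbm.dumb")


def first_available_backend(candidates=CANDIDATES):
    """Return the first candidate module that can be imported.

    Hypothetical helper for illustration only; the real anydbm also
    inspects existing files to decide which backend wrote them.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("no dbm backend available: %s" % ", ".join(candidates))


def roundtrip(backend, path):
    """Write one key with the given backend, reopen read-only, read it back."""
    db = backend.open(path, "c")  # "c": create the database if needed
    try:
        db[b"greeting"] = b"ciao"
    finally:
        db.close()

    db = backend.open(path, "r")  # "r": read-only
    try:
        return db[b"greeting"]
    finally:
        db.close()


if __name__ == "__main__":
    backend = first_available_backend()
    path = os.path.join(tempfile.mkdtemp(), "example_db")
    print(backend.__name__, roundtrip(backend, path))
```

Because the pure-Python "dumb" backend is always importable, the fallback succeeds even on a build with neither gdbm nor bsddb; it is just slower.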

_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev
