> I finally cvs up'd last night, installed ZODB and tried things out.

Somewhat OT, but did you get a large number of warnings when you
installed ZODB?  I finally got around to installing ZODB 3.6 and there
were screens of warnings (ZODB 3.6.0, Python 2.4.1, OS X 10.4.6,
Apple's gcc 4.0.0).

> My tte.py script (in contrib) is still selecting BerkDB instead of
> ZODB.  Looking at things, I see that it uses storage.database_type to
> determine the database type and name.  My storage options are
>
>     [Storage]
>     persistent_storage_file: ~/hammie.db
>
> I run my tte.py script like so:
>
>     .../tte.py -d ~/hammie.db ...
>
> so storage.database_type is called like so:
>
>     storage.database_type([('-d', '/Users/skip/hammie.db')],
>                           default_type="ZODB",
>                           default_name="~hammie.db")

"zodb", not "ZODB" (which I suppose it ought to have been), but yes.

> The _storage_options dictionary still says that -d means "dbm".
> Shouldn't it say "zodb", since that's the new default?  (After making
> that change locally, it now dumps a ZODB database.)

Does "d" stand for "database" or "dbm" (or "default"!)?  I figured it
stood for "dbm", so left that alone.  If people think that it should
mean "zodb" or should be the default (i.e. ZODB if importable, dbm
otherwise), that's easy to do.

At the moment, "-d NAME" is really the same as:

    [Storage]
    persistent_use_database: dbm
    persistent_storage_file: NAME

or

    -o Storage:persistent_use_database:dbm
    -o Storage:persistent_storage_file:NAME

And "-p NAME" is really the same as:

    [Storage]
    persistent_use_database: pickle
    persistent_storage_file: NAME

or

    -o Storage:persistent_use_database:pickle
    -o Storage:persistent_storage_file:NAME

> Alternatively, should I even be using storage.database_type?

If you want to combine the command-line options and config file like
the other scripts, then IMO yes.

> I need to use the -d flag because I write the database into a
> different spot then mv it into place so as to avoid problems with
> simultaneous reads and writes during database generation.

I presume this would work:

    .../tte.py -o Storage:persistent_storage_file:~/hammie.db ...

Or changing the meaning of "-d" would.  I don't use the -d/-p
switches, so don't personally care what they mean.
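
The -d/-p equivalences described above can be sketched in a few lines of
Python.  This is only an illustration, not the code in storage.py: the
helper name expand_storage_flags and the SHORTCUTS table are made up for
the example, and the only things taken from the discussion are the two
option names and the "dbm"/"pickle" values.

```python
# Hypothetical helper, not part of spambayes: expands the -d/-p shortcuts
# into the equivalent "-o Storage:..." overrides listed in the message.
SHORTCUTS = {"-d": "dbm", "-p": "pickle"}


def expand_storage_flags(opts):
    """Translate getopt-style (flag, value) pairs into -o style overrides."""
    overrides = []
    for flag, value in opts:
        db_type = SHORTCUTS.get(flag)
        if db_type is None:
            continue  # not a storage shortcut; leave it to the caller
        overrides.append("Storage:persistent_use_database:%s" % db_type)
        overrides.append("Storage:persistent_storage_file:%s" % value)
    return overrides


for override in expand_storage_flags([("-d", "~/hammie.db")]):
    print("-o " + override)
```

Making "-d" mean "zodb" would, in this sketch, be a one-entry change to
the table, which is why the choice is mostly a question of what people
expect the letter to stand for.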

> If I'm using ZODB do I need to mv more than just one file into place?
> I see that the process generated .index, .lock and .tmp files as well.

I'll leave this one for Tim.  I *think* that .lock and .tmp should
disappear when the ZODB is closed, and that .index will just be
recreated (so would be optional).

> Finally, I don't understand how I'm supposed to get the spam and ham
> counts from a ZODB database.  My spamcounts.py script (see contrib
> dir) was making assumptions about the structure of the database,
> assuming it could directly access the keys of a dbm or dict (pickle).
> Any thoughts about how to clean that up?  I think I should be calling
> db.spamprob(word), but I still don't know how to get the raw spam/ham
> counts that script wants to print.

This part of all of the classifiers is pretty messy, IMO.  What I do is
use the _wordinfokeys, _wordinfoget, etc. methods, as you (later)
changed spamcounts.py to do.  But these have leading underscores, so I
guess we really shouldn't be doing that.

IIRC, using keys() doesn't work for dbm, because Mark put in some
clever caching code that means that hapaxes aren't in keys(), so if you
want the whole list, you have to use _wordinfokeys().  Or maybe that's
the other way around...

If this were added to ZODBClassifier:

    def keys(self):
        return self.classifier.wordinfo.keys()

    def get(self, token):
        return self.classifier.wordinfo.get(token)

    def set(self, token, value):
        # mappings have no set() method, so use item assignment
        self.classifier.wordinfo[token] = value

Would that be enough?  It seems like the proper interface to me.

=Tony.Meyer
_______________________________________________
spambayes-dev mailing list
spambayes-dev@python.org
http://mail.python.org/mailman/listinfo/spambayes-dev
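
To make the proposed keys/get/set interface concrete, here is a minimal
sketch of how a script like spamcounts.py might use it.  Everything here
is a stand-in: StubClassifier and ClassifierStore are hypothetical names,
wordinfo is a plain dict rather than a ZODB-backed mapping, and the
(spamcount, hamcount) tuples are an assumed record shape chosen only to
keep the example self-contained.

```python
class StubClassifier:
    """Stand-in for the real classifier; wordinfo is a plain dict here."""

    def __init__(self):
        self.wordinfo = {}


class ClassifierStore:
    """Hypothetical wrapper exposing the proposed keys/get/set interface."""

    def __init__(self, classifier):
        self.classifier = classifier

    def keys(self):
        return list(self.classifier.wordinfo.keys())

    def get(self, token):
        return self.classifier.wordinfo.get(token)

    def set(self, token, value):
        # Item assignment, since mappings do not provide a set() method.
        self.classifier.wordinfo[token] = value


store = ClassifierStore(StubClassifier())
store.set("viagra", (12, 0))   # assumed record shape: (spamcount, hamcount)
store.set("python", (1, 40))

# A counts-reporting script only needs the public methods.
for token in sorted(store.keys()):
    spamcount, hamcount = store.get(token)
    print("%-10s spam=%d ham=%d" % (token, spamcount, hamcount))
```

With an interface like this, callers would not need to know whether the
backing store is a pickle, dbm or ZODB, and would not have to reach for
the underscore-prefixed methods.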