> I'm looking to store a bunch of storage.Classifiers in a database
> indexed by user (in other words, so that each user gets his own
> classifier). It seems I could do this easily enough by modifying
> mySQLClassifier or similar (though I'd like to use SQLObject so I'm
> not tied to a specific database server), but before I go and mess
> with that, I was wondering if there's an easier way.

Are you after easy or efficient?

Assuming that tokens are likely to be found in more than one user's database (this would certainly be true with email; I have no idea whether it is true for whatever you are doing) then a more efficient database system might be to have a 'token' table and then references to that in a user's table (which would have 'token reference', 'ham count', 'spam count' entries). I believe someone did something like this a long time back and posted here about it - google might help find it.

> For example, I could probably create a classifier.Classifier instead,
> and just pickle it to and from a database record (one per user), but
> without looking closer at the code I'm unclear if that's completely
> batty or not.

This is basically what the PickledClassifier class does (but stores the pickle in a file, rather than a database), so I can't see anything particularly batty about it. This would certainly be easy and fast. This involves keeping the entire object in memory, so whether it's feasible depends somewhat on how many users there will be and how large the classifiers will be (without knowing what will be in them, we have no way of knowing that).

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies (reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.
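
For illustration, the two approaches discussed above could be sketched roughly as follows. This is not SpamBayes code: it uses Python's standard sqlite3 and pickle modules rather than SQLObject, and the table names (token, user_token, user_pickle), the helper functions, and the plain dict standing in for a classifier are all made up for the example.

```python
import pickle
import sqlite3


def open_db(path=":memory:"):
    """Create the hypothetical tables for both approaches."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        -- Approach 1: one shared token table, per-user counts refer to it.
        CREATE TABLE IF NOT EXISTS token (
            id   INTEGER PRIMARY KEY,
            text TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS user_token (
            user_id    TEXT NOT NULL,
            token_id   INTEGER NOT NULL REFERENCES token(id),
            ham_count  INTEGER NOT NULL DEFAULT 0,
            spam_count INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (user_id, token_id)
        );
        -- Approach 2: one pickled object per user.
        CREATE TABLE IF NOT EXISTS user_pickle (
            user_id TEXT PRIMARY KEY,
            data    BLOB NOT NULL
        );
    """)
    return conn


def learn(conn, user_id, tokens, is_spam):
    """Bump one user's ham or spam count for each distinct token."""
    column = "spam_count" if is_spam else "ham_count"
    for text in set(tokens):
        # The token text is stored once, however many users have seen it.
        conn.execute("INSERT OR IGNORE INTO token (text) VALUES (?)", (text,))
        (token_id,) = conn.execute(
            "SELECT id FROM token WHERE text = ?", (text,)).fetchone()
        conn.execute(
            "INSERT OR IGNORE INTO user_token (user_id, token_id) VALUES (?, ?)",
            (user_id, token_id))
        conn.execute(
            f"UPDATE user_token SET {column} = {column} + 1 "
            "WHERE user_id = ? AND token_id = ?", (user_id, token_id))
    conn.commit()


def counts(conn, user_id, text):
    """Return (ham_count, spam_count) for one user and one token."""
    row = conn.execute(
        "SELECT ut.ham_count, ut.spam_count FROM user_token ut "
        "JOIN token t ON t.id = ut.token_id "
        "WHERE ut.user_id = ? AND t.text = ?", (user_id, text)).fetchone()
    return row if row else (0, 0)


def save_pickle(conn, user_id, classifier):
    """Store the whole classifier object as a single record for this user."""
    data = pickle.dumps(classifier, protocol=pickle.HIGHEST_PROTOCOL)
    conn.execute(
        "INSERT OR REPLACE INTO user_pickle (user_id, data) VALUES (?, ?)",
        (user_id, data))
    conn.commit()


def load_pickle(conn, user_id):
    """Load a user's classifier, or None if nothing has been stored yet."""
    row = conn.execute(
        "SELECT data FROM user_pickle WHERE user_id = ?", (user_id,)).fetchone()
    return pickle.loads(row[0]) if row else None
```

The first layout only touches the rows for the tokens in a given message, while the second reads and writes the whole object every time, which is the memory trade-off mentioned above. Only unpickle data that your own application wrote.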
