At 12:08 AM 8/12/2004 -0700, Justin Mason wrote:
currently in 3.0.0 we don't support "sa-learn --dump" containing readable
token data anymore... there's a patch in
http://bugzilla.spamassassin.org/show_bug.cgi?id=3331 to restore this
capability.  However, it slows down bayes scanning and learning
quite a bit recording that data as well.

What do people think?  is this functionality being removed a serious
issue?

Well, I think it's fairly important to have this ability available by some means, mostly for debugging a bad bayes DB, but also for understanding how bayes works and looking for bugs in it.


However, it seems a waste to slow down general bayes operations to get it.

Reading the bug, it looks like the patch gives users the choice between the two formats, and that seems like the best option of all.

This way those who want to monitor their bayes DB and can accept the performance hit can use the extended database, and those who want the speed and size gains of the hashed database can do that too.
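
To make the trade-off concrete, here is a rough sketch in Python. It is not SpamAssassin's code or the patch from the bug; the names TokenStore, token_key and keep_raw are made up for illustration, and keying tokens by the last 5 bytes of a SHA-1 digest is my assumption about what a hashed format looks like. The point is only that the extended mode has to write and store one extra record per token, and that this is what makes a readable dump possible.

```python
import hashlib


def token_key(token: str, nbytes: int = 5) -> bytes:
    """Hypothetical hashed key: the last `nbytes` bytes of the token's SHA-1."""
    return hashlib.sha1(token.encode("utf-8")).digest()[-nbytes:]


class TokenStore:
    """Toy in-memory token DB with an optional 'extended' (readable) mode."""

    def __init__(self, keep_raw: bool = False):
        self.keep_raw = keep_raw
        self.counts = {}  # hashed key -> [spam_count, ham_count]
        self.raw = {}     # hashed key -> original token (extended mode only)

    def learn(self, token: str, is_spam: bool) -> None:
        key = token_key(token)
        entry = self.counts.setdefault(key, [0, 0])
        entry[0 if is_spam else 1] += 1
        if self.keep_raw:
            # The extra write and storage that the extended format pays for.
            self.raw[key] = token

    def dump(self):
        """Return (label, spam_count, ham_count) rows.

        The label is the readable token when it was recorded,
        otherwise the hex form of the hashed key.
        """
        rows = []
        for key, (spam, ham) in self.counts.items():
            rows.append((self.raw.get(key, key.hex()), spam, ham))
        return rows
```

With keep_raw=True a dump shows the token text next to its counts; with the default it shows only opaque hex keys, which is smaller and faster but not much help when you are trying to work out why a DB has gone bad.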

Really, that means the choice of accepting or not accepting the patch is largely a matter of how much it adds to the developers' maintenance burden. Clearly it's a win-win from the user side, since users can choose which mode they want. The big question is whether it will hinder further bayes development by increasing code complexity.

I think it's a very worthwhile patch, provided it doesn't significantly slow down the sa-dev team when it comes to making bayes fixes and enhancements.


