Yes, most large sites don't seem to sync their Bayes databases across multiple machines, since the differences should average out. To start with, just you and perhaps a few other trusted users could provide the training. Every false negative you feed back helps. As for false positives, I prefer to use a manual whitelist. Check out webuserprefs for a PHP app that allows user editing.
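As a rough sketch of the manual whitelist approach, entries along these lines could go in a site-wide local.cf or a per-user user_prefs file. The addresses are placeholders for illustration only, and the file location is an assumption that depends on your install:

```
# Hypothetical example entries; replace with senders you actually trust.
# whitelist_from matches on the sender address, and globs are allowed.
whitelist_from  alice@example.com
whitelist_from  *@example.org
```

Since sender addresses are easy to forge, it is probably best to keep the list short and specific.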
http://wiki.apache.org/spamassassin/ManualWhitelist
http://wiki.apache.org/spamassassin/WebUserInterfaces

          - dan
--
Dan Kohn <mailto:[EMAIL PROTECTED]>
<http://www.dankohn.com/>  <tel:+1-650-327-2600>

-----Original Message-----
From: Tony Finch [mailto:[EMAIL PROTECTED] On Behalf Of Tony Finch
Sent: Thursday, July 15, 2004 10:43
To: Dan Kohn
Cc: Tony Finch; [EMAIL PROTECTED]
Subject: RE: paper comparing spam classifiers

On Thu, 15 Jul 2004, Dan Kohn wrote:
> Consider doing training via redirects or IMAP.

I have :-) Part of the problem is how to provide a sensible user
interface (IMAP is probably best for that) and whether the users are
likely to poison the database through ignorance. I'm also not fond of
the idea of trying to keep 6 machines' Bayes databases in sync; if I can
just leave them to get on with it independently of each other and rely
on aggregate behaviour to smooth out the differences it's a much more
tractable setup.

Tony.
--
f.a.n.finch  <[EMAIL PROTECTED]>  http://dotat.at/
THAMES DOVER WIGHT PORTLAND PLYMOUTH: SOUTHWESTERLY 4 OR 5, OCCASIONALLY 6 AT
FIRST, BECOMING VARIABLE 3 IN SOUTH LATER. OCCASIONAL DRIZZLE. MODERATE OR
GOOD WITH FOG BANKS, MAINLY IN WEST.
