Yes, most large sites don't seem to sync their Bayes databases across
multiple machines, since the differences should average out.  To start
with, just you and perhaps a few other trusted users could provide the
training.  Every false negative you train on helps.  As for false
positives, I prefer to use a manual whitelist.  Check out webuserprefs
for a PHP app that lets users edit their own settings.

http://wiki.apache.org/spamassassin/ManualWhitelist
http://wiki.apache.org/spamassassin/WebUserInterfaces
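
For the manual whitelist, a rough sketch of what a user's user_prefs
might contain.  The addresses are placeholders I made up, and the file
location depends on how your install is set up:

```
# per-user preferences, e.g. ~/.spamassassin/user_prefs (location assumed)
# never treat mail from this one sender as spam
whitelist_from  friend@example.com
# glob patterns are accepted, so this covers a whole domain
whitelist_from  *@example.org
```

Keep in mind that whitelist_from matches on the sender address, which
is easy to forge, so the list is best kept short.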

          - dan
--
Dan Kohn <mailto:[EMAIL PROTECTED]>
<http://www.dankohn.com/>  <tel:+1-650-327-2600>

-----Original Message-----
From: Tony Finch [mailto:[EMAIL PROTECTED] On Behalf Of Tony Finch
Sent: Thursday, July 15, 2004 10:43
To: Dan Kohn
Cc: Tony Finch; [EMAIL PROTECTED]
Subject: RE: paper comparing spam classifiers

On Thu, 15 Jul 2004, Dan Kohn wrote:

> Consider doing training via redirects or IMAP.

I have :-) Part of the problem is how to provide a sensible user
interface (IMAP is probably best for that) and whether the users are
likely to poison the database through ignorance. I'm also not fond of
the idea of trying to keep 6 machines' Bayes databases in sync; if I can
just leave them to get on with it independently of each other and rely
on aggregate behaviour to smooth out the differences it's a much more
tractable setup.

Tony.
-- 
f.a.n.finch  <[EMAIL PROTECTED]>  http://dotat.at/
THAMES DOVER WIGHT PORTLAND PLYMOUTH: SOUTHWESTERLY 4 OR 5, OCCASIONALLY
6 AT FIRST, BECOMING VARIABLE 3 IN SOUTH LATER. OCCASIONAL DRIZZLE.
MODERATE OR GOOD WITH FOG BANKS, MAINLY IN WEST.
