Larry Price wrote:

> The current setup is not the final setup; we are currently moving
> towards per-user whitelists, but the Bayesian filter is probably most
> useful as a global weight. The trend I notice is that spam is
> generally more similar than different; otherwise everyone's filter
> will have to learn the same patterns again.

Sure, spam is the same for everybody, but tofu is very very different.
For example, Marc sees the words "Howard", "Dean", "campaign" a whole
lot more frequently than I do, and I see "chezgeek" and "kbob" more
often.

Recognizing each user's tofu words separately significantly improves a
Bayesian filter's performance, at the cost that every user must train
his own filter.
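
To make that concrete, here is a toy sketch in Python.  It is not
bogofilter's actual algorithm, just a naive-Bayes-style word counter
with add-one smoothing.  The TokenFilter class and the training
messages are made up for illustration; only the names and tofu words
come from the example above.

```python
import math
from collections import Counter


class TokenFilter:
    """Toy per-user word filter (hypothetical; not how bogofilter works)."""

    def __init__(self):
        self.spam = Counter()   # word -> number of spam messages containing it
        self.tofu = Counter()   # word -> number of tofu messages containing it
        self.n_spam = 0
        self.n_tofu = 0

    def train(self, text, is_spam):
        # Count each distinct word once per message.
        words = set(text.lower().split())
        if is_spam:
            self.spam.update(words)
            self.n_spam += 1
        else:
            self.tofu.update(words)
            self.n_tofu += 1

    def spam_probability(self, text):
        # Sum per-word log-likelihood ratios (add-one smoothing, equal
        # priors), then squash the log-odds into a probability.
        log_odds = 0.0
        for w in set(text.lower().split()):
            p_spam = (self.spam[w] + 1) / (self.n_spam + 2)
            p_tofu = (self.tofu[w] + 1) / (self.n_tofu + 2)
            log_odds += math.log(p_spam / p_tofu)
        return 1 / (1 + math.exp(-log_odds))


# Both users see the same spam (invented samples)...
marc, bob = TokenFilter(), TokenFilter()
for f in (marc, bob):
    f.train("cheap pills online", is_spam=True)
    f.train("win money now", is_spam=True)

# ...but each trains on his own tofu.
marc.train("howard dean campaign meeting", is_spam=False)
marc.train("dean campaign volunteers", is_spam=False)
bob.train("chezgeek game night", is_spam=False)
bob.train("kbob patch review", is_spam=False)

msg = "dean campaign update"
marc_score = marc.spam_probability(msg)
bob_score = bob.spam_probability(msg)
print(f"marc: {marc_score:.2f}  bob: {bob_score:.2f}")
```

Marc's filter has seen "dean" and "campaign" in his tofu, so it scores
the message as much less spammy than mine does; my filter has never
seen those words and can't say anything about them either way.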

Maybe I should give a EUGLUG talk on setting up bogofilter for
personal use.  I'd need some help from someone who understands POP and
IMAP -- I still use /var/spool/mail/$USER as my mail queue.

-- 
Bob Miller                              K<bob>
kbobsoft software consulting
http://kbobsoft.com                     [EMAIL PROTECTED]
_______________________________________________
EuG-LUG mailing list
[EMAIL PROTECTED]
http://mailman.efn.org/cgi-bin/listinfo/eug-lug
