The reason I talk about lost e-mails is that my account was defaulted into a "delete spam" mode when spam filtering was first introduced at EFN, and I never saw filtered spam until I contacted EFN personally to ask that my account be exempted from that default. I have no experience of receiving flagged spam as the "default" action for EFN's spam filter. I had to lose an airline reservation e-mail and at least one job-seeking e-mail before I became suspicious and started asking questions, and only then learned that my account was defaulted to "drop spam silently". That was very frustrating for me and has made me the spam-filter-unfriendly guy I am today.
If there is to be a central "corpus" of spam for all users, I'd like to see some accountability and transparency:
1. Who makes the final decision on whether an e-mail submitted to [EMAIL PROTECTED] or [EMAIL PROTECTED] is included in the "corpus" as such? What are the relevant policies? Is it automated or staffed?
2. The "corpus" should be in an open, searchable web directory. When SpamAssassin says something is spam, links to the reference e-mails in the corpus that were correlated with it should be available upon request, so a user can review whether the items in the corpus are "objective" spam or "subjective" spam. The individual must have a way of reviewing the decisions or processes that contribute to the corpus.
nuf sed,
Marc
Bob Miller wrote:
Larry Price wrote:
The current setup is not the final setup; we are currently moving towards per-user whitelists,
but the Bayesian filter is probably most useful as a global weight. The trend I notice is that spam is generally more similar than different, and without a global filter everyone's filter will have to learn the same patterns again.
Sure, spam is the same for everybody, but tofu is very very different. For example, Marc sees the words "Howard", "Dean", "campaign" a whole lot more frequently than I do, and I see "chezgeek" and "kbob" more often.
Recognizing each user's tofu words separately significantly improves a Bayesian filter's performance, at the cost that every user must train his own filter.
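To make the trade-off concrete, here is a minimal sketch of a naive Bayes score that shares one global spam model but keeps a separate tofu model per user. It is not how bogofilter or SpamAssassin actually compute their scores; the names TokenCounts and spam_score and the tiny training messages are made up for illustration, and equal priors are assumed.

```python
import math
from collections import Counter


def tokenize(text):
    # Very crude tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class TokenCounts:
    """Counts how many trained messages contain each token (hypothetical helper)."""

    def __init__(self):
        self.counts = Counter()
        self.messages = 0

    def train(self, text):
        # Count each token once per message.
        self.counts.update(set(tokenize(text)))
        self.messages += 1

    def prob(self, token):
        # Laplace-smoothed fraction of trained messages containing the token.
        return (self.counts[token] + 1) / (self.messages + 2)


def spam_score(text, spam, tofu):
    """Return P(spam | tokens) under naive Bayes, assuming equal priors."""
    log_odds = 0.0
    for tok in set(tokenize(text)):
        log_odds += math.log(spam.prob(tok)) - math.log(tofu.prob(tok))
    return 1 / (1 + math.exp(-log_odds))


# One spam model shared by everybody.
global_spam = TokenCounts()
global_spam.train("cheap pills campaign donate now")
global_spam.train("win money now")

# Separate tofu models, one per user.
marc_tofu = TokenCounts()
marc_tofu.train("Howard Dean campaign meeting tonight")
marc_tofu.train("Dean campaign volunteers needed")

bob_tofu = TokenCounts()
bob_tofu.train("kbob chezgeek server upgrade")
bob_tofu.train("chezgeek mail queue")

message = "Dean campaign rally"
marc_score = spam_score(message, global_spam, marc_tofu)
bob_score = spam_score(message, global_spam, bob_tofu)
print(f"Marc: {marc_score:.2f}  Bob: {bob_score:.2f}")
```

With this toy data the same message scores well below 0.5 against Marc's tofu and above 0.5 against Bob's, which is the effect described above: the spam side can be shared, but the tofu side only helps if each user trains it on his own mail.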
Maybe I should give a EUGLUG talk on setting up bogofilter for
personal use. I'd need some help from someone who understands POP and
IMAP -- I still use /var/spool/mail/$USER as my mail queue.
-
_______________________________________________ EuG-LUG mailing list [EMAIL PROTECTED] http://mailman.efn.org/cgi-bin/listinfo/eug-lug
