On Thu, 26 Jun 2008, Florian Lindner wrote:
On 26.06.2008 at 18:26, John Hardin wrote:
On Thu, 26 Jun 2008, Florian Lindner wrote:
> Hello,
> I use (to be honest: I plan to use) the following procedure to filter my
> spam using SA:
>
> All mail (for my family and me) is piped through spamc.
> required_score is set to a high value of 9 to avoid false positives. Mail
> that is detected as spam is deleted.
Refine that a bit. Leave the threshold at 5 so that suspicious messages get
marked, but delete at a high level (e.g. 10+)
What should be done with marked messages?
If they are spam, the user can drop them into their spam training folder -
the assumption is bayes doesn't recognize them well enough yet, but that
isn't always the case.
If you want to minimize the number of weak-scoring spams that your users
have to see, and you are less sensitive to FPs (which your original
proposal suggests) then you'd just delete at a lower score (e.g. 9+ or
8+).
Generally speaking, it's a bad idea to fiddle with the threshold as all
the base rulesets are scored by the masscheck process with the assumption
that 5 is "spammy".
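As a rough illustration of the "tag at 5, delete at a higher score" idea, here is a minimal Python sketch that reads the X-Spam-Status header SpamAssassin adds and picks a disposition. The function names and the delete cutoff of 10 are assumptions made for the example, not part of SA; in practice this decision usually lives in the delivery agent's filter rules.

```python
import re
from email import message_from_string

TAG_THRESHOLD = 5.0      # SA's default required_score, left untouched
DELETE_THRESHOLD = 10.0  # hypothetical local policy: discard at or above this

# Matches the "score=N.N" field of a header such as
# "X-Spam-Status: Yes, score=7.4 required=5.0 tests=..."
_SCORE_RE = re.compile(r"\bscore=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message):
    """Return the score found in X-Spam-Status, or None if there is none."""
    msg = message_from_string(raw_message)
    match = _SCORE_RE.search(msg.get("X-Spam-Status", ""))
    return float(match.group(1)) if match else None


def disposition(raw_message):
    """Return 'deliver', 'marked' or 'delete' for a message scanned by SA."""
    score = spam_score(raw_message)
    if score is None or score < TAG_THRESHOLD:
        return "deliver"   # not scanned, or below the spam threshold
    if score >= DELETE_THRESHOLD:
        return "delete"    # high-confidence spam, never shown to the user
    return "marked"        # suspicious: deliver tagged so the user can review
```

Messages that come back as "marked" are the ones a user would review and, if they are spam, drop into the spam training folder.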
> All SA filtering is done on the server side. On the client side
> additional filtering is done by the statistical filters of Apple Mail and
> Thunderbird.
>
> Now I want to train the server SA filter by moving the junk mails (which
> have slipped through SA) on the client into an IMAP folder. This is done
> only with the mail I receive, not the mail the rest of the family receives.
Why not let others train? Just give each user training folders.
The rest of the family is rather computer-agnostic, and I'm happy they get
along well with the Thunderbird filter.
That's reasonable. In my experience what you'll see when you review the
mailbox is a few false positives that you can copy to the user's ham
training folder for them. They will generally just delete any spams unless
you stress repeatedly that spams which leak through should go into the
spam training folder rather than the trash, and you may be able to tell
the MUA's classifier to save to the spam training folder rather than
deleting.
> Will this setup cause any problems? I ask because the bayes filter I
> train with only my email is used for all email.
It's better if you train with all users' email. Note that *you* may
actually be doing the training, but it's still their email.
Another option would be to completely disable the statistical filters for my
family and leave this completely up to Thunderbird. I would be using another
SA config with statistics. How do I implement this? Is it sufficient to use
"spamc -F nostat.cf" with "use_bayes 0" in the config file and just spamc for
me? Are these two spamc invocations separated from each other?
I'd recommend against that, personally. Bayes is very helpful even if you
can't get your users to train it themselves.
You might want to have Thunderbird move spams to the spam training folder
as I suggested; that way bayes will be led by Thunderbird and the
classification at the server (which is where it should be) will get
better.
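A minimal sketch of feeding such training folders to bayes, assuming the IMAP server stores each folder as a Maildir directory that sa-learn can read directly. The function names are made up for the example, and the paths in the usage line below are placeholders. It would have to run as whichever user owns the bayes database that spamd consults.

```python
import subprocess


def sa_learn_commands(spam_dir, ham_dir):
    """Build the sa-learn invocations for one pair of training folders."""
    return [
        ["sa-learn", "--spam", spam_dir],  # learn everything here as spam
        ["sa-learn", "--ham", ham_dir],    # learn everything here as ham
    ]


def train(spam_dir, ham_dir, dry_run=True):
    """Print the commands (dry_run=True) or run them; return the commands."""
    commands = sa_learn_commands(spam_dir, ham_dir)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if sa-learn exits non-zero
            subprocess.run(cmd, check=True)
    return commands
```

Something like train("/home/user/Maildir/.train-spam/cur", "/home/user/Maildir/.train-ham/cur", dry_run=False) could then be run from cron once per user whose folders should be learned.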
Some tools that may help you set things up are available here:
http://www.impsec.org/~jhardin/antispam/
It's very interesting but way too sophisticated for my situation and
audience.
Most of it will be visible only to you. My wife and MiL don't worry
about training and they get along well.
Then again, it also depends on how allergic to receiving _any_ spam your
users are.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
[EMAIL PROTECTED] FALaholic #11174 pgpk -a [EMAIL PROTECTED]
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Perfect Security and Absolute Safety are unattainable; beware
those who would try to sell them to you, regardless of the cost,
for they are trying to sell you your own slavery.
-----------------------------------------------------------------------
8 days until the 232nd anniversary of the Declaration of Independence