Re: Train and use bayes on different adresses

John Hardin Thu, 26 Jun 2008 10:31:40 -0700

On Thu, 26 Jun 2008, Florian Lindner wrote:

On 26.06.2008 at 18:26, John Hardin wrote:
On Thu, 26 Jun 2008, Florian Lindner wrote:

> Hello,
> I use (honestly: I plan) the following procedure to filter my spam using
> SA:
>
> All mails are piped through spamc. (emails for my family and me).
> required_score is set to a high value of 9 to avoid false positives. Mail
> which is detected as spam is being deleted.
Refine that a bit. Leave the threshold at 5 so that suspicious messages get marked, but delete at a high level (e.g. 10+).
What should be done with marked messages?

If they are spam, the user can drop them into their spam training folder - the assumption is bayes doesn't recognize them well enough yet, but that isn't always the case.

If you want to minimize the number of weakly scored spams that your users have to see, and you are less sensitive to FPs (which your original proposal suggests), then you'd just delete at a lower score (e.g. 9+ or 8+).

Generally speaking, it's a bad idea to fiddle with the threshold as all the base rulesets are scored by the masscheck process with the assumption that 5 is "spammy".
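
The two-level approach above (mark at 5, delete at 10+) can be sketched roughly as follows. This is only an illustration, not part of SpamAssassin: it assumes the message has already passed through spamc and carries an X-Spam-Status header containing "score=N", and the function names and the "deliver"/"mark"/"delete" labels are made up for the example.

```python
import re
from email import message_from_string

# Matches the numeric score inside an X-Spam-Status header value,
# e.g. "Yes, score=12.3 required=5.0 tests=..."
SCORE_RE = re.compile(r"score=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message):
    """Return the score from X-Spam-Status, or None if it is absent."""
    msg = message_from_string(raw_message)
    match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
    return float(match.group(1)) if match else None


def disposition(score, mark_at=5.0, delete_at=10.0):
    """Hypothetical policy: keep the marking threshold at 5, delete only at a high score."""
    if score is None:
        return "deliver"  # message was not scanned; do not guess
    if score >= delete_at:
        return "delete"
    if score >= mark_at:
        return "mark"  # leave it for the user to review or train on
    return "deliver"


if __name__ == "__main__":
    sample = (
        "X-Spam-Status: Yes, score=6.4 required=5.0 tests=EXAMPLE_RULE\n"
        "Subject: test\n"
        "\n"
        "body\n"
    )
    print(disposition(spam_score(sample)))
```

In practice the same decision is usually made in the delivery agent's filter rules; the point is only that required_score stays at 5 while the delete level is a separate, higher number.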

> All SA filtering is done on the server side. On the client side
> additional filtering is done by statistic filters of Apple Mail and
> Thunderbird.
>
> Now I want to train the server SA filter by moving the junk mails (which
> have slipped through SA) on the client into an IMAP folder. This is done
> only with the mail I receive, not the one the rest of the family receives.
Why not let others train? Just give each user training folders.
The rest of the family is rather computer agnostic and I'm happy they get along with the Thunderbird filter well.

That's reasonable. In my experience what you'll see when you review the mailbox is a few false positives that you can copy to the user's ham training folder for them. They will generally just delete any spams unless you stress repeatedly that spams which leak through should go into the spam training folder rather than the trash, and you may be able to tell the MUA's classifier to save to the spam training folder rather than deleting.

> Will this setup cause any problems? I ask because the bayes filter I> train with only my email is used for all email.
It's better if you train with all users' email. Note that *you* may actually be doing the training, but it's still their email.
Another option would be to completely disable the statistic filters for my family and leave this completely up to Thunderbird. I would be using another SA config with statistics. How to implement this? Is it sufficient to use "spamc -F nostat.cf" with "use_bayes 0" in the config file and just spamc for me? Are these two spamc invocations separated from each other?

I'd recommend against that, personally. Bayes is very helpful even if you can't get your users to train it themselves.

You might want to have Thunderbird move spams to the spam training folder as I suggested, that way bayes will be led by thunderbird and the classification at the server (which is where it should be) will get better.
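
As a rough sketch of what training from those folders could look like on the server: the helper below builds sa-learn command lines for each user's spam and ham training folders and, unless dry_run is set, runs them. The folder paths, the function names and the dry_run switch are assumptions made up for this example; only sa-learn's --spam and --ham options are taken as given.

```python
import subprocess


def sa_learn_command(kind, folder):
    """Build the sa-learn command line for one training folder."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, folder]


def train(folders, dry_run=True):
    """Build (and optionally run) sa-learn for each (kind, folder) pair."""
    commands = [sa_learn_command(kind, folder) for kind, folder in folders]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical Maildir locations; adjust to the real IMAP folder layout.
    training_folders = [
        ("spam", "/home/florian/Maildir/.train-spam/cur"),
        ("ham", "/home/florian/Maildir/.train-ham/cur"),
    ]
    for cmd in train(training_folders, dry_run=True):
        print(" ".join(cmd))
```

Run from cron as the user whose bayes database should be updated, something along these lines picks up whatever Thunderbird or the users have dropped into the training folders.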

Some tools that may help you set things up are available here:

 http://www.impsec.org/~jhardin/antispam/
It's very interesting but way too sophisticated for my situation and audience.

Most of it will be visible only to you. My wife and MiL don't worry about training and they get along well.

Then again, it also depends on how allergic to receiving _any_ spam your users are.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174     pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Perfect Security and Absolute Safety are unattainable; beware
  those who would try to sell them to you, regardless of the cost,
  for they are trying to sell you your own slavery.
-----------------------------------------------------------------------
 8 days until the 232nd anniversary of the Declaration of Independence
