At 05:06 AM 1/13/2006, Markus Braun wrote:
Hello all,
At the moment I use SpamAssassin for spam protection.
Autobayes is also activated.
I know that when autobayes makes mistakes with ham or spam, I can
correct them manually.
e.g.
sa-learn (the file)
But what does sa-learn actually remember?
SpamAssassin's bayes system breaks the message body and many of the headers
up into "tokens". For the most part, tokens are simply words, but it also
breaks email addresses up (username and domain parts are separate tokens),
and does a lot of weird things I don't fully understand to grab bits of
headers.
sa-learn then takes all these and dumps them into a database, and tracks
how many times each token was seen in spam, and how many times in nonspam.
From these counts, it also calculates a probability that the token will be
in a spam message (0.000 to 1.000, aka 0% to 100%).
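To make the counting step concrete, here is a toy sketch in Python. It is not SpamAssassin's real tokenizer or database format; the names tokenize, learn and token_probability are made up for this example, and the probability is a plain ratio of spam rate to total rate, which is only an assumption about the general idea.

```python
import re
from collections import Counter

# Hypothetical in-memory "database": token -> number of messages seen in each class.
spam_counts = Counter()
ham_counts = Counter()
totals = {"spam": 0, "ham": 0}

def tokenize(text):
    """Split text into lowercase tokens; addresses become user and domain tokens."""
    tokens = set()
    for word in re.findall(r"[\w.@-]+", text.lower()):
        word = word.strip(".-")
        if "@" in word:
            user, _, domain = word.partition("@")
            tokens.update(t for t in (user, domain) if t)
        elif word:
            tokens.add(word)
    return tokens

def learn(text, is_spam):
    """Record every token of one message as seen in spam or in nonspam."""
    counts = spam_counts if is_spam else ham_counts
    counts.update(tokenize(text))
    totals["spam" if is_spam else "ham"] += 1

def token_probability(token):
    """Estimate (0.0 to 1.0) of how strongly a token points to spam."""
    spam_rate = spam_counts[token] / max(totals["spam"], 1)
    ham_rate = ham_counts[token] / max(totals["ham"], 1)
    if spam_rate + ham_rate == 0:
        return 0.5  # never seen: treat as neutral
    return spam_rate / (spam_rate + ham_rate)

learn("Cheap pills from bob@example.com", is_spam=True)
learn("Meeting notes from alice@example.com", is_spam=False)
```

After these two learn calls, a token seen only in the spam message (such as "pills") scores 1.0, one seen only in the nonspam message scores 0.0, and the shared domain token sits at 0.5.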
Later, when a message is scanned, SpamAssassin breaks it up into tokens and
checks the database. From the database it pulls out the probabilities for
all the tokens that match and combines them to calculate a total
probability for the message. (It uses chi-squared combining, if you're a
statistics geek.)
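For the statistics geeks, here is a rough Python sketch of chi-squared combining in the style described by Gary Robinson. It is an illustration under my own assumptions, not a copy of SpamAssassin's code: the function names chi2q and combine, the clamping to 0.001..0.999, and the neutral result of 0.5 for no tokens are all choices made for this example.

```python
import math

def chi2q(x2, v):
    """Probability that a chi-squared variable with v (even) degrees of
    freedom is at least x2. Fine for small inputs; a very large x2 would
    underflow in exp()."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def combine(probs):
    """Combine per-token spam probabilities into one message probability."""
    n = len(probs)
    if n == 0:
        return 0.5  # no known tokens: neutral
    # Clamp so that log(0) can never occur.
    probs = [min(max(p, 0.001), 0.999) for p in probs]
    # S is close to 1 when the tokens look spammy, H when they look hammy.
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0

spammy = combine([0.99, 0.99, 0.99])
hammy = combine([0.01, 0.01, 0.01])
mixed = combine([0.9, 0.1])
```

Several strongly spammy tokens push the result close to 1, several strongly hammy tokens push it close to 0, and evenly conflicting evidence lands near 0.5.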
If you really want to see bayes running in gross detail, try adding -D to an
sa-learn or spamassassin run. The debug output shows the bayes tokens from
the messages being processed.
The next day, a spam email also comes from another email address. But the
address name and the header information are the same.
So my question is, how can I make sure that these emails are also marked as
spam?
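sa-learn will help here: the tokens the new message shares with the spam
you've already trained on will count toward it being spam. Another way to
help is by grabbing some of the rulesets from rulesemporium.com that cover
the spam which is giving you the most trouble.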