Let me give you an example. Here are two subject lines. It's easy to
guess which one is spam.
"Meet horny Russian Brides online!"
"I read an article about Russian brides in a magazine."
Bayes or SpamAssassin would look at "Russian Brides" and see that 499
out of 500 times it's spam. Therefore the nonspam version scores spam
points.
In my system "Russian brides" is neutral because it is used in both spam
and ham. But on the spam side, these phrases are used in other spam and
*not matched* in ham:
Meet horny
horny Russian
horny Russian brides
brides online!
online!
On the ham side, these phrases are used in ham and *not matched* in spam:
I read an article
read an article
an article about
brides in a magazine
in a magazine
My filter gets both right because of NOT matching. Not matching is a
comparison to an infinite set.
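
To make the idea concrete, here is a minimal sketch in Python. It is not the actual filter, only an illustration under stated assumptions: the function names (ngrams, build_phrase_sets, classify), the toy corpora, the maximum phrase length of 4 words, the lowercasing, and the simple "count the hits on each side" decision are all hypothetical. The point it shows is that a phrase seen on both sides drops out as neutral, and only phrases seen on one side and *not matched* on the other count.

```python
def ngrams(text, max_n=4):
    """Return every phrase of 1..max_n consecutive words, lowercased.
    Punctuation is left attached to words (so "online!" stays "online!")."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    }


def build_phrase_sets(spam_corpus, ham_corpus):
    """Return (spam_only, ham_only): phrases seen on one side and
    never matched on the other. Phrases seen on both sides are neutral."""
    spam_seen = set().union(*(ngrams(s) for s in spam_corpus))
    ham_seen = set().union(*(ngrams(h) for h in ham_corpus))
    return spam_seen - ham_seen, ham_seen - spam_seen


def classify(subject, spam_only, ham_only):
    """Compare how many one-sided phrases the subject hits on each side."""
    phrases = ngrams(subject)
    spam_hits = len(phrases & spam_only)
    ham_hits = len(phrases & ham_only)
    if spam_hits > ham_hits:
        return "spam"
    if ham_hits > spam_hits:
        return "ham"
    return "unsure"


# Toy training data, invented for this example only.
spam_corpus = [
    "Meet horny singles tonight",
    "horny Russian brides want you",
    "Hot brides online!",
]
ham_corpus = [
    "I read an article yesterday",
    "an article about Russian brides",
    "I saw it in a magazine.",
]

spam_only, ham_only = build_phrase_sets(spam_corpus, ham_corpus)

print(classify("Meet horny Russian Brides online!", spam_only, ham_only))
print(classify("I read an article about Russian brides in a magazine.",
               spam_only, ham_only))
```

With this toy data, "russian brides" appears in both corpora, so it lands in neither set and contributes nothing to either subject line. The first subject is decided by phrases like "meet horny" and "brides online!", the second by phrases like "i read an article" and "in a magazine.".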