While reading the article by Paul Graham, something came to mind:
what happens if the user's language is not English?

Let's say 99.99% of all spam is in English (which is my experience), and
my mother tongue is Norwegian. ;)
Let's also say that I almost never receive mail written in English.

The Bayesian approach would then put all English words in a bad-words
list (except words found in headers), and all Norwegian words in a
good-words list, wouldn't it?
1. What happens the day I join an English mailing list, or receive a
   mail written in English?
2. What happens if I receive a mail written in Norwegian but containing
   a few English words, e.g. when quoting someone?

I'd say it would discard mail #1, but let #2 through.
What do you think?
:)
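
To make the two cases concrete, here is a rough Python sketch of the
scoring as I understand it from the article. It is a simplification, not
Graham's actual code: the names train and spam_score and the tiny corpus
are made up for illustration, tokenizing is just splitting on whitespace,
and I left out his rule that a word must have been seen several times
before it counts, since the toy corpus is far too small for that.

```python
from collections import Counter


def train(good_mails, bad_mails):
    """Build a word -> spam probability table (simplified sketch)."""
    good = Counter(w for m in good_mails for w in m.lower().split())
    bad = Counter(w for m in bad_mails for w in m.lower().split())
    ngood, nbad = len(good_mails), len(bad_mails)
    probs = {}
    for word in set(good) | set(bad):
        g = 2 * good[word]  # good counts are doubled to avoid false positives
        b = bad[word]
        bad_ratio = min(1.0, b / nbad)
        good_ratio = min(1.0, g / ngood)
        p = bad_ratio / (good_ratio + bad_ratio)
        probs[word] = max(0.01, min(0.99, p))  # never fully certain
    return probs


def spam_score(mail, probs, unknown=0.4, top=15):
    """Combine the `top` most extreme word probabilities into one score."""
    words = set(mail.lower().split())
    ps = [probs.get(w, unknown) for w in words]  # unseen words lean innocent
    ps.sort(key=lambda p: abs(p - 0.5), reverse=True)
    prod = comp = 1.0
    for p in ps[:top]:
        prod *= p
        comp *= 1.0 - p
    return prod / (prod + comp)


# Hypothetical toy corpus: all real mail is Norwegian, all spam is English.
good_mails = [
    "hei mamma vi kommer til middag",
    "takk skal du ha det var hyggelig",
    "kan du sende meg rapporten",
]
bad_mails = [
    "buy cheap pills now",
    "you have won free money click here now",
    "click here and get the best offer",
]
probs = train(good_mails, bad_mails)

# Case 1: a legitimate mail written entirely in English.
english_mail = "you have the best patch here"
# Case 2: a Norwegian mail quoting a couple of English words.
mixed_mail = "hei mamma han skrev click here men det var hyggelig"

print(spam_score(english_mail, probs))
print(spam_score(mixed_mail, probs))
```

With this toy data the first mail scores well above the 0.9 cutoff the
article suggests and the second far below it, which matches my guess.
A real corpus would of course behave less cleanly, and the first case
should fix itself once a few English mails have been marked as good.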



On Sat, 2002-08-17 at 23:37, Chuq Von Rospach wrote:
> On 8/17/02 12:37 PM, "J C Lawrence" <[EMAIL PROTECTED]> wrote:
> 
> > Keep thinking about it.  In essence it is merely a finer grained
> > scoring system.  It doesn't fundamentally change the spam cold war;
> 
> Actually, I think it does fundamentally change it. You're not just making
> better guesses at what spammers say. You're effectively building a digital
> signature of what your REAL mail looks like, and comparing messages to it.
> The further it deviates from your real mail, the spammier it is.
> 
> The only two ways for spammers to avoid this are to move to graphics, which
> can still be whacked on, and to stop using open relays and other things that
> leave noticeable signatures in the headers.
> 
> It might not catch the smartest spam, but it'll sure catch everything else.
