Hello Alastair,

On Tue, 20 Nov 2001 14:29:05 -0000 GMT (20/11/2001, 22:29 +0800 GMT),
Alastair Scott wrote:

AS> There may be more clever statistical methods - the above is Turkish, and
AS> it's pretty obvious the relative frequency of various letters (eg "z" and
AS> "i") is entirely different from that of English -

This would be difficult to implement in a TB filter. But I just had an
idea:

You can actually filter for certain words that are likely to occur in
most Turkish-language spam, such as siteler (web sites), for example.
You can also use other common words from the Turkish language. No
scoring mechanism is needed: if any one of those five or ten words is
found, it's a hit. That gives you your own, very simple, language
detector in the form of a TB filter.
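
To show the any-word-is-a-hit idea outside of TB, here is a minimal
sketch in Python. It is only an illustration, not TB filter syntax.
Only "siteler" comes from this message; the other marker words and the
name looks_turkish are my own assumptions, picked as common Turkish
words, and would need tuning against real spam.

```python
import re

# Hypothetical marker list: only "siteler" is taken from the message;
# the rest are common Turkish words assumed here for illustration.
TURKISH_MARKERS = {"siteler", "için", "değil", "merhaba", "lütfen"}


def looks_turkish(text, markers=TURKISH_MARKERS):
    """Return True if any marker occurs as a whole word in text.

    There is no scoring: a single matching word counts as a hit.
    """
    words = set(re.findall(r"\w+", text.lower()))
    return not markers.isdisjoint(words)


if __name__ == "__main__":
    print(looks_turkish("En iyi siteler burada"))
    print(looks_turkish("The best web sites are here"))
```

Matching whole words rather than substrings keeps short markers from
firing inside unrelated English words, at the cost of missing inflected
forms such as "siteleri".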

-- 

Cheers,
Thomas.

Moderator of the German The Bat! Beginner list.

Analogies in writing are like feathers on a snake.

Message reply created with The Bat! 1.54/10
under Chinese Windows 98 4.10 Build 2222 A 
using an AMD Athlon K7 1.2GHz, 128MB RAM


-- 
________________________________________________________
Archives   : http://tbudl.thebat.dutaint.com
Moderators : mailto:[EMAIL PROTECTED]
TBTech List: mailto:[EMAIL PROTECTED]
Unsubscribe: mailto:[EMAIL PROTECTED]
Latest Vers: 1.53d
FAQ        : http://faq.thebat.dutaint.com 
