Re: are there any alternatives to textcat?

2010-07-20 Thread Matus UHLAR - fantomas
On 14.07.10 12:32, Jason Haar wrote:
 For some weird reason I seem to get a lot of Chinese spam - and even
 with TextCat enabled, SA is unable to recognise it as Chinese (ie I want
 to score on X-Spam-Languages:). I've Googled around and it looks like
 TextCat ceased development some time ago, so I was wondering if there is
 any known alternative that is more capable?
 
 The idea behind TextCat seems sound, but the only alternative I've found
 is Google Translator - but sending your emails to it may not be an
 option ;-)

Did you set up ok_languages?
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
He who laughs last thinks slowest. 


Re: are there any alternatives to textcat?

2010-07-20 Thread Jason Haar
 On 07/20/2010 11:36 PM, Matus UHLAR - fantomas wrote:
 did you set up ok_languages?

Yup - in general it does work - it's just that TextCat doesn't seem to
be able to identify Chinese in a five-paragraph email containing
nothing but Chinese and about five words of English. I had a similar
problem with Greek spam earlier this year too. It's not really a fault
as such - my point is that the idea is sound - it's just a dead project
(from the sound of it) and I wish it weren't.
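
For a message that is almost entirely Chinese with a handful of English
words, one rough cross-check is to measure how many of its letters are CJK
ideographs. The sketch below is not TextCat and is not something proposed
in this thread; the function name, the single code-point range and the
idea of thresholding the ratio are all assumptions made for illustration.

```python
def cjk_ratio(text):
    """Fraction of alphabetic characters that are CJK Unified Ideographs.

    Assumption: only the basic block U+4E00..U+9FFF is counted, which is
    enough for a rough signal but does not cover every CJK extension.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters)
```

A ratio well above 0.5 on the decoded body would suggest the text is mostly
Chinese (or Japanese kanji) regardless of a few English words; it says
nothing about Greek or any other script.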

-- 
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Re: are there any alternatives to textcat?

2010-07-14 Thread Henrik K
On Tue, Jul 13, 2010 at 07:35:36PM -0500, Chris Owen wrote:
 On Jul 13, 2010, at 7:32 PM, Jason Haar wrote:
 
  For some weird reason I seem to get a lot of Chinese spam - and even
  with TextCat enabled, SA is unable to recognise it as Chinese (ie I want
  to score on X-Spam-Languages:). I've Googled around and it looks like
  TextCat ceased development some time ago, so I was wondering if there is
  any known alternative that is more capable?
 
 Well according to the TextCat web site:
 
 http://www.let.rug.nl/~vannoord/TextCat/competitors.html

It's more the implementation that needs an update than the TextCat
algorithm itself.

Charset/case awareness:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6229

Better database:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4152

Etc. Feel free to chime in.
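
For readers who have not looked at the algorithm itself: TextCat is
generally described as ranking character n-grams by frequency and comparing
a document's ranking against per-language profiles with an "out-of-place"
distance. A minimal sketch of that idea follows. The two training
sentences, the function names and the profile size are made up for
illustration; this is not SpamAssassin's implementation or its language
database.

```python
from collections import Counter

def ngram_profile(text, max_n=5, top=300):
    """Rank the most frequent character n-grams (lengths 1..max_n)."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"          # mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top]]

def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; unseen n-grams get the maximum penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(abs(i - rank[g]) if g in rank else max_penalty
               for i, g in enumerate(doc_profile))

def classify(text, profiles):
    """Return the language whose profile is closest to the text."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))

# Hypothetical toy training data; real profiles need far more text.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat sat on the mat",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze",
}
PROFILES = {lang: ngram_profile(text) for lang, text in SAMPLES.items()}
```

Note that the profile here is built from Python strings, i.e. already
decoded text. How bytes in different charsets get turned into comparable
text is an implementation matter, separate from the ranking method.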



RE: are there any alternatives to textcat?

2010-07-14 Thread Giampaolo Tomassoni
 It's more the implementation that needs an update than the TextCat
 algorithm itself.
 
 Charset/case awareness:
 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6229
 
 Better database:
 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4152
 
 Etc. Feel free to chime in.

There is one more thing that I think should be fixed (or at least I don't
understand why it is the way it is right now): the charsets in the TextCat
language database.

Why are the languages in the database stored in different charsets? Wouldn't
it be better to have them all in Unicode?
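
To illustrate what "Unicode only" could mean in practice, here is a minimal
sketch that decodes the raw bytes using the declared charset and normalises
the result before any language profiling, so that a single Unicode profile
per language would suffice. The function name and the UTF-8 fallback are
assumptions for illustration, not how SpamAssassin currently behaves.

```python
import unicodedata

def to_unicode(raw, declared_charset="utf-8"):
    """Decode bytes to normalised, lower-cased Unicode text.

    Assumption: if the declared charset is unknown or wrong, fall back to
    UTF-8 and replace undecodable bytes instead of failing.
    """
    try:
        text = raw.decode(declared_charset)
    except (LookupError, UnicodeDecodeError):
        text = raw.decode("utf-8", errors="replace")
    return unicodedata.normalize("NFC", text).lower()
```

With this step in front, the same Greek or Chinese text yields identical
input to the classifier whether it arrived as ISO-8859-7, GB2312, Big5 or
UTF-8.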



Re: are there any alternatives to textcat?

2010-07-14 Thread Benny Pedersen

On Wed 14 Jul 2010 02:32:36 CEST, Jason Haar wrote:


 The idea behind TextCat seems sound, but the only alternative I've found
 is Google Translator - but sending your emails to it may not be an
 option ;-)


RelayCountry, maybe?

Or one could write an aspell/ispell plugin.


--
xpoint http://www.unicom.com/pw/reply-to-harmful.html



are there any alternatives to textcat?

2010-07-13 Thread Jason Haar
Hi there

For some weird reason I seem to get a lot of Chinese spam - and even
with TextCat enabled, SA is unable to recognise it as Chinese (i.e. I want
to score on the X-Spam-Languages: header). I've Googled around and it
looks like TextCat ceased development some time ago, so I was wondering if
there is any known alternative that is more capable?
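
To make "score on X-Spam-Languages:" concrete, here is a small sketch that
reads that header from an already-scanned message and checks it against a
set of accepted languages. The header format (language codes separated by
whitespace or commas), the function names and the default accepted set are
assumptions for illustration; inside SpamAssassin itself this would be done
with rules rather than external code.

```python
import re
from email import message_from_string

def detected_languages(raw_message):
    """Return the set of language codes listed in X-Spam-Languages."""
    msg = message_from_string(raw_message)
    value = msg.get("X-Spam-Languages", "")
    return {tok.lower() for tok in re.split(r"[\s,]+", value) if tok}

def is_unwanted(raw_message, ok=frozenset({"en"})):
    """True if languages were detected and none of them is accepted."""
    langs = detected_languages(raw_message)
    return bool(langs) and not (langs & ok)
```

A message with no header at all is treated as "unknown" rather than
unwanted, which matches the problem described here: the detection step
fails silently, so nothing downstream can act on it.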

The idea behind TextCat seems sound, but the only alternative I've found
is Google Translator - but sending your emails to it may not be an
option ;-)

-- 
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Re: are there any alternatives to textcat?

2010-07-13 Thread Chris Owen
On Jul 13, 2010, at 7:32 PM, Jason Haar wrote:

 For some weird reason I seem to get a lot of Chinese spam - and even
 with TextCat enabled, SA is unable to recognise it as Chinese (ie I want
 to score on X-Spam-Languages:). I've Googled around and it looks like
 TextCat ceased development some time ago, so I was wondering if there is
 any known alternative that is more capable?

Well according to the TextCat web site:

http://www.let.rug.nl/~vannoord/TextCat/competitors.html

Chris

-- 
Chris Owen, President, Hubris Communications Inc, www.hubris.net
Garden City (620) 275-1900, Wichita (316) 858-3000
Lottery (noun): A stupidity tax