Re: are there any alternatives to textcat?

2010-07-14 Thread Henrik K
On Tue, Jul 13, 2010 at 07:35:36PM -0500, Chris Owen wrote:
 On Jul 13, 2010, at 7:32 PM, Jason Haar wrote:
 
  For some weird reason I seem to get a lot of Chinese spam - and even
  with TextCat enabled, SA is unable to recognise it as Chinese (ie I want
  to score on X-Spam-Languages:). I've Googled around and it looks like
  TextCat ceased development some time ago, so I was wondering if there is
  any known alternative that is more capable?
 
 Well according to the TextCat web site:
 
 http://www.let.rug.nl/~vannoord/TextCat/competitors.html

It's the implementation that needs an update, more so than the TextCat
algorithm itself.

Charset/case awareness:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6229

Better database:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4152

Etc. Feel free to chime in.



spamassassin with dcc not appearing to work

2010-07-14 Thread Jimmy Stewpot
Hi There,

I am currently trying to implement DCC on a small email server to test how
effective it may be. Unfortunately I have been unable to get any results, and it
appears that it's simply not working.

I have the following lines in my SpamAssassin configuration:


use_dcc 1
dcc_path /usr/bin
dcc_dccifd_path [127.0.0.1]:38681
dcc_home /var/lib/dcc


The plugin is definitely enabled. When I run --lint I get the following:

Jul 14 02:48:04.529 [23120] dbg: plugin: loading Mail::SpamAssassin::Plugin::DCC from @INC

I know that --lint runs only local tests (no network-based ones), but I still
don't seem to have any success.

I also added the following line to the configuration, and it made no difference:

add_header all DCC _DCCB_: _DCCR_

I still don't see any DCC information in the headers.

Any advice would be really appreciated.

Regards,

Jimmy.


RE: are there any alternatives to textcat?

2010-07-14 Thread Giampaolo Tomassoni
 It's the implementation that needs an update, more so than the TextCat
 algorithm itself.
 
 Charset/case awareness:
 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6229
 
 Better database:
 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4152
 
 Etc.. feel free to chime in..

There is one more thing that I think should be fixed (or at least I don't
understand why it is the way it is right now): the charsets in the TextCat
language database.

Why are the languages in the database expressed in different charsets? Wouldn't
it be better to have them all in Unicode?



Re: are there any alternatives to textcat?

2010-07-14 Thread Benny Pedersen

On Wed 14 Jul 2010 02:32:36 CEST, Jason Haar wrote:


The idea behind TextCat seems sound, but the only alternative I've found
is Google Translator - but sending your emails to it may not be an
option ;-)


RelayCountry, maybe?

Or someone could write an aspell/ispell plugin.





uribl not working properly with .gg TLD

2010-07-14 Thread DaveAtJLA

I'm running SpamAssassin version 3.3.0 and we recently received some spam
which contained a link to a .ru.gg domain. While investigating whether it
was listed in any of the URIBLs, I discovered that if a message contains a
link to http://qwerty.ru.gg, SpamAssassin only looks up the domain ru.gg.
Here's a snippet from the log:

Jul 14 07:55:54.785 [3269] dbg: async: timing: 0.026 . DNSBL:dob.sibl.support-intelligence.net:ru.gg
Jul 14 07:55:54.785 [3269] dbg: async: timing: 0.027 . DNSBL:multi.uribl.com.:ru.gg

However, if I edit the message, change the link to http://qwerty.ru.com, and
run it through SpamAssassin again, then the URIBL lookups are done for the
full domain name:

Jul 14 08:52:49.412 [16122] dbg: async: timing: 0.287 . DNSBL:dob.sibl.support-intelligence.net:qwerty.ru.com
Jul 14 08:52:49.412 [16122] dbg: async: timing: 0.290 . DNSBL:multi.uribl.com.:qwerty.ru.com

This can't be right, can it? It looks like the gg top-level domain isn't
being handled properly. Any ideas?

Dave




RE: uribl not working properly with .gg TLD

2010-07-14 Thread Giampaolo Tomassoni
 I'm running SpamAssassin version 3.3.0 and we received some spam recently
 which contained a link to a .ru.gg domain. While investigating whether it
 was listed in any of the URIBLs I discovered that if a message contains a
 link to http://qwerty.ru.gg, spamassassin only looks up the domain ru.gg
 - here's a snippet from the log:
 
 Jul 14 07:55:54.785 [3269] dbg: async: timing: 0.026 . DNSBL:dob.sibl.support-intelligence.net:ru.gg
 Jul 14 07:55:54.785 [3269] dbg: async: timing: 0.027 . DNSBL:multi.uribl.com.:ru.gg
 
 However if I edit the message, change the link to http://qwerty.ru.com and
 run it through spamassassin again, then the URIBL lookups are done for the
 full domain name:
 
 Jul 14 08:52:49.412 [16122] dbg: async: timing: 0.287 . DNSBL:dob.sibl.support-intelligence.net:qwerty.ru.com
 Jul 14 08:52:49.412 [16122] dbg: async: timing: 0.290 . DNSBL:multi.uribl.com.:qwerty.ru.com
 
 This can't be right, can it? It looks like the gg top-level domain isn't
 being handled properly. Any ideas?

I don't see why you believe qwerty.ru.gg == qwerty.ru.com.

.gg is a ccTLD (for the Bailiwick of Guernsey, according to
http://en.wikipedia.org/wiki/.gg).


 Dave

Giampaolo



RE: uribl not working properly with .gg TLD

2010-07-14 Thread DaveAtJLA

What I am asking is why a reference to http://qwerty.ru.gg generates a URI
lookup for ru.gg (i.e. leaving out the first component), whereas a reference to
http://qwerty.ru.com generates a URI lookup for qwerty.ru.com.

Dave





RE: uribl not working properly with .gg TLD

2010-07-14 Thread Giampaolo Tomassoni
 What I am asking is why a reference to http://qwerty.ru.gg generates a URI
 lookup for ru.gg (ie missing out the first component) whereas a reference to
 http://qwerty.ru.com generates a URI lookup for qwerty.ru.com.
 
 Dave

Because the second-level domain ru.gg is not in the TWO_LEVEL_DOMAINS
variable defined in Mail::SpamAssassin::Util::RegistrarBoundaries, while
ru.com is.
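
To make the effect of that list concrete, here is a minimal Python sketch of registrar-boundary trimming. It is a simplified illustration, not SpamAssassin's actual code: the names TWO_LEVEL_SUFFIXES and lookup_domain are hypothetical, and the set holds only the one entry mentioned in this thread.

```python
# Simplified illustration of registrar-boundary trimming; NOT SpamAssassin's code.
# TWO_LEVEL_SUFFIXES is a hypothetical stand-in for TWO_LEVEL_DOMAINS and
# holds only the single entry mentioned in this thread.
TWO_LEVEL_SUFFIXES = {"ru.com"}


def lookup_domain(host):
    """Return the part of `host` that would be sent to a URIBL."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) <= 2:
        return ".".join(labels)
    # A listed two-level suffix behaves like a TLD, so keep one more label.
    if ".".join(labels[-2:]) in TWO_LEVEL_SUFFIXES:
        return ".".join(labels[-3:])
    # Otherwise the registered domain is assumed to be the last two labels.
    return ".".join(labels[-2:])


if __name__ == "__main__":
    for host in ("qwerty.ru.gg", "qwerty.ru.com"):
        print(host, "->", lookup_domain(host))
```

Under these assumptions qwerty.ru.gg is trimmed to ru.gg, while qwerty.ru.com is kept whole, which matches the lookups seen in the debug log.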

If you mean that ru.gg should be there too, please note that qwerty.ru.gg is
a third-level domain under ru.gg, which is assigned to webme.com. So I don't
see any need to distinguish qwerty.ru.gg from ru.gg.

Further, I would personally blacklist the whole .gg ccTLD, since their whois
service is ridiculous.

Giampaolo
 





Re: spamassassin with dcc not appearing to work

2010-07-14 Thread Karsten Bräckelmann
On Wed, 2010-07-14 at 06:19 +, Jimmy Stewpot wrote:
 I am currently trying to implement DCC on a small email server to test
 how effective it may be. Unfortunately I have been unable to get any
 results and it appears that it's simply not working.

 I also added the following lines to the configuration and it made no
 difference.
 
 add_header  all DCC _DCCB_: _DCCR_ 
 
 I still don't see any header information reporting DCC..

There is no X-Spam-DCC header? If you're using spamd, you forgot to
restart the daemon. That option will add the header regardless.

If you are using glue other than spamd, like Amavis, did you restart
that? FWIW, Amavis adds its own headers; the above SA configuration is
ignored.

A --lint run (without -D debugging) does return cleanly, with no
warnings, right?





First run score: 25.7 Second: 2.6

2010-07-14 Thread Emin Akbulut
I run the SA Win32 port 3.3.1 by JAM Software on Windows Server 2008 64 bit.
Spamassassin.exe always calculates the same score, because its User_Prefs file
is in my user directory (C:\Users\ea\.spamassassin).

However, spamd.exe, which runs as a service, calculates the right score the
first time, then the score goes very low on subsequent checks. spamd runs under
the system account, and its User_Prefs file is located under
C:\Windows\SysWOW64\config\systemprofile\.spamassassin

I have no idea why spamd calculates different scores. I hope someone here knows
the reason and has a solution.

Thanks.



First run:
---
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on WebServer
X-Spam-Flag: YES
X-Spam-Level: *
X-Spam-Status: Yes, score=25.7 required=6.3 tests=HTML_IMAGE_ONLY_32,
HTML_IMAGE_RATIO_02,HTML_MESSAGE,LOCALPART_IN_SUBJECT,MIME_HTML_ONLY,
MISSING_DATE,MISSING_MID,RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_BRBL_LASTEXT,
RCVD_IN_PBL,RCVD_IN_XBL,RDNS_NONE,TO_NO_BRKTS_NORDNS_HTML,T_SURBL_MULTI1,
T_SURBL_MULTI2,T_SURBL_MULTI3,T_SURBL_MULTI4,T_URIBL_BLACK_OVERLAP,
URIBL_AB_SURBL,URIBL_BLACK,URIBL_DBL_SPAM,URIBL_JP_SURBL,URIBL_OB_SURBL,
URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL autolearn=unavailable version=3.3.1


Next runs:
---
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on WebServer
X-Spam-Level: **
X-Spam-Status: No, score=2.6 required=6.3 tests=HTML_IMAGE_ONLY_32,
HTML_IMAGE_RATIO_02,HTML_MESSAGE,LOCALPART_IN_SUBJECT,MIME_HTML_ONLY
autolearn=unavailable version=3.3.1


Re: First run score: 25.7 Second: 2.6

2010-07-14 Thread Bowie Bailey
 On 7/14/2010 8:42 AM, Emin Akbulut wrote:
 I run SA Win32 port 3.3.1 by JAM Software on Windows Server 2008 64 bit.
 Spamassassin.exe always calculates the same score, coz User_Prefs file is
 under my docs (C:\Users\ea\.spamassassin)

 However spamd.exe -which runs as service- calculates the right score at
 first time then score goes very low at subsequent checks. spamd runs under
 system account and it's User_Prefs file is located under
 C:\Windows\SysWOW64\config\systemprofile\.spamassassin

 I have no idea why spamd calculates different scores. I hope someone here
 knows the reason and has a solution.

 Thanks.



 First run:
 ---
 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on WebServer
 X-Spam-Flag: YES
 X-Spam-Level: *
 X-Spam-Status: Yes, score=25.7 required=6.3 tests=HTML_IMAGE_ONLY_32,
 HTML_IMAGE_RATIO_02,HTML_MESSAGE,LOCALPART_IN_SUBJECT,MIME_HTML_ONLY,
 MISSING_DATE,MISSING_MID,RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_BRBL_LASTEXT,
 RCVD_IN_PBL,RCVD_IN_XBL,RDNS_NONE,TO_NO_BRKTS_NORDNS_HTML,T_SURBL_MULTI1,
 T_SURBL_MULTI2,T_SURBL_MULTI3,T_SURBL_MULTI4,T_URIBL_BLACK_OVERLAP,
 URIBL_AB_SURBL,URIBL_BLACK,URIBL_DBL_SPAM,URIBL_JP_SURBL,URIBL_OB_SURBL,
 URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL autolearn=unavailable
 version=3.3.1


 Next runs:
 ---
 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on WebServer
 X-Spam-Level: **
 X-Spam-Status: No, score=2.6 required=6.3 tests=HTML_IMAGE_ONLY_32,
 HTML_IMAGE_RATIO_02,HTML_MESSAGE,LOCALPART_IN_SUBJECT,MIME_HTML_ONLY
 autolearn=unavailable version=3.3.1

What sticks out to me is that most of the score hits missing from the
second run are from blacklists.  It seems like the second run is doing
local tests only.  Are you sure they are running the same way?  Can you
tell us how each of these tests is run?  I assume the first one is the
automatic test when the email is received and the second is a manual
call to spamc?
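
The comparison above can be reproduced mechanically. The following Python sketch parses the two X-Spam-Status values posted in this thread and lists the rules that hit only on the first run. The prefix list used to label a rule as DNS-based is a naming heuristic assumed for illustration, and parse_tests is a hypothetical helper, not part of SpamAssassin.

```python
import re

# X-Spam-Status values copied from the two runs posted in this thread.
FIRST_RUN = (
    "Yes, score=25.7 required=6.3 tests=HTML_IMAGE_ONLY_32, "
    "HTML_IMAGE_RATIO_02,HTML_MESSAGE,LOCALPART_IN_SUBJECT,MIME_HTML_ONLY, "
    "MISSING_DATE,MISSING_MID,RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_BRBL_LASTEXT, "
    "RCVD_IN_PBL,RCVD_IN_XBL,RDNS_NONE,TO_NO_BRKTS_NORDNS_HTML,T_SURBL_MULTI1, "
    "T_SURBL_MULTI2,T_SURBL_MULTI3,T_SURBL_MULTI4,T_URIBL_BLACK_OVERLAP, "
    "URIBL_AB_SURBL,URIBL_BLACK,URIBL_DBL_SPAM,URIBL_JP_SURBL,URIBL_OB_SURBL, "
    "URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL autolearn=unavailable version=3.3.1"
)
SECOND_RUN = (
    "No, score=2.6 required=6.3 tests=HTML_IMAGE_ONLY_32, "
    "HTML_IMAGE_RATIO_02,HTML_MESSAGE,LOCALPART_IN_SUBJECT,MIME_HTML_ONLY "
    "autolearn=unavailable version=3.3.1"
)

# Assumption: these name prefixes mark DNS blacklist rules. This is a naming
# heuristic for illustration, not something SpamAssassin exposes.
DNS_PREFIXES = ("RCVD_IN_", "URIBL_", "T_SURBL_", "T_URIBL_")


def parse_tests(status):
    """Return the set of rule names listed in an X-Spam-Status value."""
    # Collapse header folding so the tests= list is on one logical line.
    unfolded = " ".join(status.split())
    # The list ends at the next " key=" token (e.g. autolearn=) or at the end.
    match = re.search(r"tests=(.*?)(?:\s+\w+=|$)", unfolded)
    if not match:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


missing = parse_tests(FIRST_RUN) - parse_tests(SECOND_RUN)
dns_rules = sorted(r for r in missing if r.startswith(DNS_PREFIXES))
other_rules = sorted(r for r in missing if not r.startswith(DNS_PREFIXES))

if __name__ == "__main__":
    print("missing DNS-based rules:", ", ".join(dns_rules))
    print("other missing rules:", ", ".join(other_rules))
```

With this heuristic, most of the rules missing from the second run fall into the DNS-based group, and a few header-related ones (such as MISSING_DATE and MISSING_MID) are missing as well.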

-- 
Bowie


Re: spamassassin with dcc not appearing to work

2010-07-14 Thread Michael Scheidell

On 7/14/10 2:19 AM, Jimmy Stewpot wrote:

Hi There,

I am currently trying to implement DCC on a small email server to test how 
effective it may be. Unfortunately I have been unable to get any results and it 
appears that its just simply not working.

I have the following lines in my configuration for spamassassin


use_dcc 1
dcc_path /usr/bin
dcc_dccifd_path [127.0.0.1]:38681
dcc_home /var/lib/dcc


   
You only need dcc_dccifd_path if you run the dccifd daemon, and it's a Unix
socket by default, so try to use that. It is usually something like:

/var/run/dccifd.

If you don't have dccifd running, SA will call (I forget) some other
program on each email.
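
One quick way to find out whether such a socket exists is to test the candidate paths directly. The Python sketch below does only that; is_unix_socket is a hypothetical helper, and both candidate paths are assumptions (one built from the dcc_home value quoted above, one from the example path in this message), so adjust them to the local install.

```python
import os
import stat


def is_unix_socket(path):
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission: treat as "no usable socket".
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    # Hypothetical candidate locations; adjust to the local DCC install.
    for candidate in ("/var/lib/dcc/dccifd", "/var/run/dccifd"):
        state = "socket found" if is_unix_socket(candidate) else "no socket"
        print(candidate, "->", state)
```

If neither path reports a socket, dccifd is probably not running or is listening somewhere else.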


Does dcc itself connect?  Are you running a recent version of dcc? Old
versions won't connect to the public servers. Have you checked the firewall?


To see if dcc works, type:

cdcc info

and see if you are connecting to a dcc server.

And as a reminder, dcc doesn't test for spam or not spam, just bulk vs
non-bulk, and the OPTIONAL reputation filter service also gives you the
percentage of bulk from the connecting IP.





Re: First run score: 25.7 Second: 2.6

2010-07-14 Thread Charles Gregory

On Wed, 14 Jul 2010, Bowie Bailey wrote:

First run:
---
X-Spam-Status: Yes, score=25.7 required=6.3 tests=HTML_IMAGE_ONLY_32,
HTML_IMAGE_RATIO_02,HTML_MESSAGE,LOCALPART_IN_SUBJECT


What sticks out to me is that most of the missing score hits on the
second run are from blacklists.


Quite true. What also sticks out to me is that the LOCALPART_IN_SUBJECT test
disappears, which means that the headers on the second run are
substantially different from the headers on the first run.


SOMETHING is severely mangling the mail between the two runs, and quite 
obviously this degrades spamassassin's capability to detect spam.


I suppose I should ask (of the OP) WHY there are two runs at all?

- C


Re: First run score: 25.7 Second: 2.6

2010-07-14 Thread Daniel Lemke


Emin Akbulut wrote:
 
 However spamd.exe -which runs as service- calculates the right score at
 first time then score goes very low at subsequent checks. spamd runs under
 system account and it's User_Prefs file is located under
 C:\Windows\SysWOW64\config\systemprofile\.spamassassin
 
 I have no idea why spamd calculates different scores. I hope someone here
 knows the reason and has a solution.
 

What parameters did you start spamd with?

Daniel



Re: First run score: 25.7 Second: 2.6

2010-07-14 Thread Emin Akbulut
I noticed it by chance while I was testing SA. All I did is below:

WinSpamC < realspam.txt > result1.txt
NET STOP Spamassassin
NET START Spamassassin
WinSpamC < realspam.txt > result2.txt
WinSpamC < realspam.txt > result3.txt

result1: under 6.3
result2: very high
result3: under 6.3






Re: First run score: 25.7 Second: 2.6

2010-07-14 Thread Matt Kettler
On 7/14/2010 11:27 AM, Emin Akbulut wrote:
 I noticed randomly while I was testing SA. All I did is below:

 WinSpamC < realspam.txt > result1.txt
 NET STOP Spamassassin
 NET START Spamassassin
 WinSpamC < realspam.txt > result2.txt
 WinSpamC < realspam.txt > result3.txt

 result1: under 6.3
 result2: very high
 result3: under 6.3

That is quite strange. It sounds like you've got DNS timeout problems.

You might want to check the DNS settings on your machine and make sure all
of the listed DNS servers are working and are capable of properly
resolving internet hosts.

SpamAssassin will *NOT* query every DNS server in your setup. It will
pick one and query it. If it gets no response, SA goes with that and
does NOT ask the other DNS servers. So if there's a dead DNS server in
your config, that's not so good for SA.
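
One way to check each listed DNS server individually is to send the same query to every server and see which ones answer. The Python sketch below does this with the standard library only. It is a rough probe under stated assumptions: build_query and server_answers are hypothetical helper names, the test name example.com is arbitrary, it handles IPv4 servers over UDP only, and it does not parse the reply beyond matching the query id.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for the A record of `name`."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)


def server_answers(server, name="example.com", timeout=3.0):
    """Return True if `server` sends back a reply with the matching id in time."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts as well as unreachable hosts.
            return False
    return reply[:2] == query[:2]
```

Calling server_answers("192.0.2.1") for each configured server address (the address here is only a placeholder) returns False for a server that is dead or unreachable, which is the case the advice above warns about.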