Re: 20_sought_fraud.cf

2014-03-10 Thread Justin Mason
On Sat, Mar 8, 2014 at 9:31 PM, John Hardin jhar...@impsec.org wrote:

 Justin, can you provide any enlightenment? If the base Sought dynamic
 ruleset is indeed dead, can the wiki page be updated?


hi folks --

I've been in contact with some of the dev team regarding handing over the
sought ruleset backends… it's not so much dead as in the process of
resurrection.  perhaps they can comment ;)

--j.


Re: 20_sought_fraud.cf

2013-04-12 Thread Justin Mason
it could be that whoever was uploading the fraud corpora which the
ruleset builds from is no longer doing so.  I'll take a look later
on...

--j.

On Fri, Apr 12, 2013 at 12:25 PM, Benny Pedersen m...@junc.eu wrote:
 Axb wrote on 2013-04-12 10:17:


 all I'm seeing in that file is

 meta JM_SOUGHT_1   (0)
 score JM_SOUGHT_1  0
 describe JM_SOUGHT_1  Body contains frequently-spammed text patterns


 yep seen here, would like to know why as well

 --
 senders that put my email into body content will deliver it to my own
 trashcan, so if you like to get reply, dont do it


Re: Sought rules

2011-06-12 Thread Justin Mason
On Sunday, June 12, 2011, Warren Togami Jr. wtog...@gmail.com wrote:
 On 6/12/2011 12:32 AM, Warren Togami Jr. wrote:

 On 6/11/2011 10:03 AM, Justin Mason wrote:

 guys -- I'm going to make the whole question moot (in trunk at least)
 -- the only reason SOUGHT and SOUGHT_FRAUD were being checked in there
 was to make their accuracy visible in ruleqa. It's been months since
 I've looked at that, so it's needless. I'll remove them from svn
 asap.

 --j.


 WAIT!!! Wouldn't this remove our ability to check for false positives of
 your patterns against the much larger ham collection of nightly masscheck?


 The alternative is to filter SOUGHT from the sa-update rule updates with a 
 script, but still allow it in the nightly masschecks.  Testing sought in 
 nightly masschecks has been useful to occasionally find obvious SOUGHT 
 problems, or sometimes to locate spam that was misplaced in the ham folder.

Sure -- if that's preferable we can do that. We just need to be
diligent about tflags Nopublish.
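
A rough sketch of the "filter with a script" idea, in Python. It assumes rules are marked with a `tflags <RULE> nopublish` line and that every rule line has the form `<keyword> <RULE_NAME> ...`; the function name `strip_nopublish` and the sample rule text are made up for illustration, and this is not the script the project actually uses.

```python
import re

def strip_nopublish(cf_text: str) -> str:
    """Return cf_text without any line belonging to a rule flagged nopublish."""
    lines = cf_text.splitlines()

    # First pass: collect the names of rules carrying the nopublish flag.
    hidden = set()
    for line in lines:
        m = re.match(r"\s*tflags\s+(\S+)\s+(.*)", line)
        if m and "nopublish" in m.group(2).split():
            hidden.add(m.group(1))

    # Second pass: drop every line whose second word names a hidden rule.
    kept = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 2 and parts[1] in hidden:
            continue
        kept.append(line)
    return "\n".join(kept)
```

Note that this toy version does not rewrite meta rules that merely reference a hidden rule; a real publishing step would have to handle those too.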



 Warren



Re: Sought rules

2011-06-11 Thread Justin Mason
guys -- I'm going to make the whole question moot (in trunk at least)
-- the only reason SOUGHT and SOUGHT_FRAUD were being checked in there
was to make their accuracy visible in ruleqa.  It's been months since
I've looked at that, so it's needless.  I'll remove them from svn
asap.

--j.

2011/6/11 Karsten Bräckelmann guent...@rudersport.de:
 On Fri, 2011-06-10 at 23:13 -1000, Warren Togami Jr. wrote:
 Wait a sec, I'm confused about this.  JM_SOUGHT_2 hitting on every
 legit Facebook message on dev@ list February 17th 2011.  If the SOUGHT
 channel was being overridden by the sa-update rules, how would this
 problem appear from the SOUGHT channel?  Doesn't this suggest that
 spamassassin was successfully using the SOUGHT channel?

 Yes, and no. grep for SOUGHT in the stock rules...

 The stock rule-set has a snapshot of the SOUGHT_FRAUD patterns, they do
 NOT have the SOUGHT patterns.


 And, well... READ THIS instead. ;)

 --
 char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
 main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
 (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}




Re: Spamassasin - SQLITE as storage database

2011-05-18 Thread Justin Mason
On Wed, May 18, 2011 at 11:26, Mark Martinec mark.martinec...@ijs.si wrote:
 On Wednesday May 18 2011 09:42:55 monolit wrote:
  do you have any experience with usage of SQLITE database as storage for
  Spamassassin? Spamassassin uses Berkeley DB, but I need to replace it.
  I could not find any manual, guide or even a forum discussion about
  using SpamAssassin together with SQLITE. I appreciate any advice.
 
 Thanks for your post. Unfortunately my boss wants to use just SQLITE.

 Lawrence @ Rogers wrote:
  I have no experience with this, but I do have experience with using
  MySQL with InnoDB tables. The performance is actually much better than
  Berkeley DB's.

 Bear in mind that MySQL with InnoDB (as well as PostgreSQL) offer
 fine-grained locking at a record level, which SQLite does not
 (the last time I checked). It is very likely that SpamAssassin
 could use SQLite as user preferences database (i.e. read-only)
 with no problems, it is also very likely that the usage of
 SQLite for a r/w database such as Bayes and AWL will cause
 lock contention on a busy server.

iirc, Matt Sergeant looked into using SQLite early on, but abandoned
it due to the locking-related issues.
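
To make the locking concern concrete, here is a small Python sketch using the standard `sqlite3` module. The table name `tokens` and the function name are invented for the demo and have nothing to do with SpamAssassin's real Bayes schema; the point is only that a second writer is refused while any write transaction is open on the same database file.

```python
import os
import sqlite3
import tempfile

def second_writer_blocked() -> bool:
    """Return True if a second connection cannot start a write transaction
    while the first connection holds one on the same database file."""
    path = os.path.join(tempfile.mkdtemp(), "bayes_demo.db")
    # timeout=0: fail at once instead of waiting for the lock.
    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    a = sqlite3.connect(path, timeout=0, isolation_level=None)
    b = sqlite3.connect(path, timeout=0, isolation_level=None)
    a.execute("CREATE TABLE tokens (token TEXT PRIMARY KEY, spam_count INTEGER)")

    a.execute("BEGIN IMMEDIATE")                      # writer A takes the write lock
    a.execute("INSERT INTO tokens VALUES ('example', 1)")
    try:
        b.execute("BEGIN IMMEDIATE")                  # writer B wants it too
        b.execute("ROLLBACK")
        blocked = False
    except sqlite3.OperationalError:                  # "database is locked"
        blocked = True
    a.execute("COMMIT")
    a.close()
    b.close()
    return blocked
```

On a default SQLite build this should return True even though B never touches the row A inserted, which is the database-wide (rather than per-record) locking described above.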

--j.


Re: Mailspike Performance

2011-04-15 Thread Justin Mason
On Thu, Apr 14, 2011 at 22:51, Adam Katz antis...@khopis.com wrote:
 RCVD_IN_MSPIKE_BL has 99% overlap with the SA3.3 set and 98% with the
 SA3.2 set.  That leaves 0.6758% of spam uniquely hitting this DNSBL (1%
 of its 67.5822%).  RCVD_IN_SEMBLACK has the same story, resulting in
 0.5138% unique spam from its 1% non-overlap (though note its lower s/o).

Good point. But what about the ham?  if it hits the same spam, but
less ham, it's a better rule.
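
The arithmetic in this exchange can be sketched in a few lines of Python. The helper names are made up, and the spam/(spam+ham) ratio below is only one simple way to fold the ham hit rate into the comparison; any ham percentage fed to it would be an assumption, since the quoted message gives none.

```python
def unique_hit_rate(spam_hit_pct: float, overlap_fraction: float) -> float:
    """Share of all spam hit by this rule and by nothing in the comparison set."""
    return spam_hit_pct * (1.0 - overlap_fraction)

def spam_ratio(spam_hit_pct: float, ham_hit_pct: float) -> float:
    """Fraction of the rule's hits that are spam: spam / (spam + ham)."""
    total = spam_hit_pct + ham_hit_pct
    return spam_hit_pct / total if total else 0.0

# RCVD_IN_MSPIKE_BL figures from the message above: 67.5822% of spam hit,
# 99% overlap with the SA 3.3 set, so roughly 0.6758% unique.
mspike_unique = unique_hit_rate(67.5822, 0.99)
```

Two rules with the same `unique_hit_rate` can still differ in `spam_ratio`, which is the point being made about ham.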

--j.


Re: Need Volunteers for Ham Trap

2011-01-20 Thread Justin Mason
On Tue, Jan 18, 2011 at 12:59, Warren Togami Jr. wtog...@gmail.com wrote:
 On 1/17/2011 11:46 PM, Jeff Chan wrote:

 So a couple points:

 1.  Subscribing to lists opens up lots of grey areas including
 the above.

 2.  Some of the areas are very difficult to resolve into spam or
 ham.  Some more aggressive anti-spammers may say all of the above
 is spam, but others may disagree, and the mail may be legal.

 Before anyone accuses me of being in favor of spammers, please be
 aware that I am personally highly against any of these unethical
 practices, but when essentially making decisions for others, one
 needs to be very careful and consider whether there may be legitimate,
 ethical, legal or even wanted uses of such things.  One person's
 ham may be another persons spam, and vice versa.  However, most
 people don't want the stuff bots send.

 The issue is complex, and there are many deliverability, security
 and anti-spam companies and organizations that struggle with these
 issues every day.  Maintaining accurate ham and spam corpora and
 making policies for what belongs in which category is trivial in
 some easy cases like bot pill spam, but non-trivial in other
 cases.

 Cheers,

 Jeff C.

 I appreciate the nuanced feedback but I have thought of similar
 considerations.  I believe the following will help to avoid ambiguity and
 legal issues.

 * Yes, we cannot be 100% sure our opt-in was only for that particular site
 and not their partners.  But in any case automatic ham trapped mail will
 be only the mail branded by the subscribed provider, because that is the
 only mail we know for sure was opted-in.  Anything else is kept separate for
 later analysis.

 * If clearly spammy other mail arrives at a particular address, the original
 subscription can be unsubscribed and the continued flow monitored.  That
 address could then be discarded.

+1 to those. tagged addressing makes this easy to implement (and track).
I use this approach on a very small scale for a small number of ham newsletters
in my own corpus...
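
For illustration, a minimal Python sketch of tagged addressing, assuming the receiving mail server supports "plus" sub-addressing. The mailbox, domain and function names are hypothetical.

```python
from typing import Optional

def tagged_address(mailbox: str, domain: str, source: str) -> str:
    """Build one address per subscription, e.g. hamtrap+news-example-com@..."""
    tag = "".join(ch if ch.isalnum() else "-" for ch in source.lower())
    return f"{mailbox}+{tag}@{domain}"

def source_of(address: str) -> Optional[str]:
    """Recover the subscription tag from a recipient address, if it has one."""
    local = address.split("@", 1)[0]
    if "+" not in local:
        return None
    return local.split("+", 1)[1]
```

Mail arriving at a tagged address whose branding does not match the tag can then be set aside for later analysis, as proposed above.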

--j.


Re: Sought False Positives

2010-11-09 Thread Justin Mason
guys, feel free to mail me samples (offlist) of sought FPs -- ideally,
as mboxes.  it's easy enough to add them to the training process.

--j.

On Mon, Nov 8, 2010 at 22:54, mouss mo...@ml.netoyen.net wrote:
 On 20/08/2010 17:12, Jan P. Kessler wrote:

  Hi,

 we have used spamassassin with the sought ruleset for several years at our
 company. After the upgrade from 3.2.5 to 3.3.1 we notice tons of
 false-positives hitting on the rules JM_SOUGHT_1 and JM_SOUGHT_2.
 Unfortunately I cannot give examples as these messages contain
 confidential customer data (assurance company). We had more than 100
 false-positives with these rules in the last 2 days.

 I have drastically lowered the score from 4.0 to 1.0 for both rules and
 wanted to ask if anybody else noticed that?

 Cheers, Jan


 below is an FP which is a public mail. I'm going to zero the corresponding
 rules (I prefer false negatives, which help improve local rules, over false
 positives, especially when I can't explain why).

 = FP sample
 Return-Path: websecurity-return-7218-mouss=ml.netoyen@webappsec.org
 Delivered-To: mouss+s...@ml.netoyen.net
 Received: from imlil.netoyen.net (localhost [127.0.0.1])
        by imlil.netoyen.net (Postfix) with ESMTP id A2E97E54898
        for mouss+s...@ml.netoyen.net; Mon,  8 Nov 2010 18:42:45 +0100
 (CET)
 X-Relay-Countries: US
 X-Virus-Scanned: amavisd-new at netoyen.net
 X-Spam-Flag: YES
 X-Spam-Score: 5.284
 X-Spam-Level: *
 X-Spam-Status: Yes, score=5.284 required=5 tests=[COUNTRY_US=0.01,
        JM_SOUGHT_3=4, RDNS_NONE=1.274] autolearn=no
 Received: from cgisecurity.net (unknown [199.125.85.46])
        by mx.netoyen.net (Postfix) with SMTP id A8EA4E54829
        for mo...@ml.netoyen.net; Mon,  8 Nov 2010 18:42:43 +0100 (CET)
 Received: (qmail 18910 invoked by uid 1017); 8 Nov 2010 18:36:41 -
 Mailing-List: contact websecurity-h...@webappsec.org; run by ezmlm
 Precedence: bulk
 List-Post: mailto:websecur...@webappsec.org
 List-Help: mailto:websecurity-h...@webappsec.org
 List-Unsubscribe: mailto:websecurity-unsubscr...@webappsec.org
 List-Subscribe: mailto:websecurity-subscr...@webappsec.org
 Delivered-To: mailing list websecur...@webappsec.org
 Delivered-To: moderator for websecur...@webappsec.org
 Received: (qmail 37779 invoked from network); 7 Nov 2010 18:51:51 -
 MIME-Version: 1.0
 In-Reply-To: 005301cb7ad5$b2875f30$c103f...@ml
 References: 002301cb7944$a7619b80$c103f...@ml
 aanlktimabfxcsrqdul=qvawxoqursqnt7nzefj2p7...@mail.gmail.com
  005301cb7ad5$b2875f30$c103f...@ml
 From: YGN Ethical Hacker Group li...@yehg.net
 Date: Mon, 8 Nov 2010 01:57:16 +0800
 Message-ID: aanlktimtbamufvwexpwqbcdl4bb55ai31hxwpcd6r...@mail.gmail.com
 To: MustLive mustl...@websecurity.com.ua
 Cc: websecur...@webappsec.org
 Content-Type: text/plain; charset=UTF-8
 Subject: Re: [WEB SECURITY] [New Tool Announcement] inspath - Path
 Disclosure Finder

 Hi MustLive

 Thanks for your suggestion.

 Searching for Google Cache might be a good feature to add in inpathx
 but I'm afraid this realm should/can be done with other tools like
 SiteDigger (http://www.foundstone.com/us/resources/proddesc/sitedigger.htm).



 -
 Best regards,
 YGN Ethical Hacker Group
 Yangon, Myanmar
 http://yehg.net
 Our Lab | http://yehg.net/lab
 Our Directory | http://yehg.net/hwd

 
 Join us on IRC: irc.freenode.net #webappsec

 Have a question? Search The Web Security Mailing List Archives:
 http://www.webappsec.org/lists/websecurity/archive/

 Subscribe via RSS:
 http://www.webappsec.org/rss/websecurity.rss [RSS Feed]

 To unsubscribe email websecurity-unsubscr...@webappsec.org and reply to
 the confirmation email

 Join WASC on LinkedIn
 http://www.linkedin.com/e/gis/83336/4B20E4374DBA

 WASC on Twitter
 http://twitter.com/wascupdates





Re: Sought False Positives

2010-11-09 Thread Justin Mason
On Tue, Nov 9, 2010 at 14:24, Bowie Bailey bowie_bai...@buc.com wrote:
 On 11/8/2010 6:04 PM, Lawrence @ Rogers wrote:
 On 08/11/2010 12:06 PM, Ned Slider wrote:

 Fair enough - fortunately I've not seen any of those here so assumed
 a genuine facebook mail had maybe slipped through into the corpus by
 mistake.

 Either way, it was fixed by the time I'd spotted it.
 I've seen it as well, and disabled the Sought rules. They were causing
 too many FPs and not hitting enough spam to be worthwhile.

 I haven't seen a lot of false positives, but you're right that they are
 not hitting much spam.

 I just checked my logs for the past two weeks and the Sought rules have
 hit on just over 1% of my spam.  They used to be the top rules in my
 list.  What happened?

Sorry about that -- basically, I haven't had any time recently to
curate my mail corpora.  I suspect sought is now building from old
data and I hadn't noticed

--j.


Re: Only running network tests when necessary - feature request

2010-10-30 Thread Justin Mason
btw, I think this is already possible using the shortcircuit plugin.
Just use rule priorities to run the non-net rules first, and
shortcircuit if they are sufficient.
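
As a toy model of that idea, the Python sketch below evaluates rules in priority order and stops once the running score already reaches the threshold, so later (network) rules never run. It is not how the plugin is implemented: the real shortcircuit plugin is driven by `shortcircuit` lines on individual rules, and the rule names, scores and priorities here are invented.

```python
from typing import Iterable, List, Set, Tuple

# (priority, name, score, needs_network); lower priority values run first.
Rule = Tuple[int, str, float, bool]

def scan(rules: Iterable[Rule], hits: Set[str],
         threshold: float = 5.0) -> Tuple[float, List[str]]:
    """Return the final score and the names of the rules actually evaluated."""
    total = 0.0
    evaluated: List[str] = []
    for priority, name, score, needs_network in sorted(rules):
        if total >= threshold:
            break                      # verdict already reached, skip the rest
        evaluated.append(name)
        if name in hits:
            total += score
    return total, evaluated
```

One caveat this model makes visible: a skipped network rule with a negative score (a whitelist, say) could have pulled the total back under the threshold, which is part of why early exit is a trade-off rather than a free win.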

On Sat, Oct 30, 2010 at 08:05, Henrik K h...@hege.li wrote:
 On Sat, Oct 30, 2010 at 02:23:00AM -0400, dar...@chaosreigns.com wrote:
 On 10/30, Michael Parker wrote:
   I'd like to see spamassassin only run network tests when they might
   affect the outcome.
 
  Why?

 To reduce the network load on my server which is one of the hosts of the
 DNSWL.org list?

  Assuming a reasonably fast connection network checks are basically free.

 That seems weird.

 But the total amount of bandwidth and processing time saved on the
 internet from not running unnecessary tests on every instance of
 spamassassin seems worth doing.

 Why not?

 It's nice that you are requesting a feature without any single proof that it
 would be generally useful. It would only make things in all ways much more
 complex, create additional delays, hinder log analyzing etc. What if there
 are lots of metas involving DNS rules or shortcircuits etc? It would become
 a big mess.

 This is just a different variation of the age old question that pops up here
 now and then. No one has yet shown it would have enough merits to
 implement.




Re: Public Corpus

2010-08-30 Thread Justin Mason
On Mon, Aug 30, 2010 at 02:03, RW rwmailli...@googlemail.com wrote:
 On Sun, 29 Aug 2010 17:36:36 -0700 (PDT)
 joker_ft gugafontan...@gmail.com top-posted:

 Benny Pedersen wrote:
 
  On Sun 29 Aug 2010 17:28:52 CEST, joker_ft wrote
 
  Does anyone know of any public corpus updated in 2010? Or why the
  SpamAssassin public corpus stopped being updated in 2006?
 
  if spammers know what is being scanned for, will it be effective at
  stopping spam?
 


 Yes, this makes sense.


 No it doesn't. If spammers want to know what SA is looking for they
 can just download it. If they want to optimize their spam to resist
 Bayes there's nothing special about corpora used to develop it.


Exactly -- nothing about SA's ruleset requires that the coding or
corpora be kept secret.  The only reason we keep most of our corpora
private is due to the contents being mostly private mail.

We don't have time to update the public corpus, unfortunately; it's
quite labour-intensive.

--j.


Re: Yerp connection issues

2010-05-27 Thread Justin Mason
On Thu, May 27, 2010 at 04:30, Adam Katz antis...@khopis.com wrote:
 On 05/26/2010 07:32 PM, John Hardin wrote:
 On Wed, 26 May 2010, Karsten Bräckelmann wrote:

 The correct answer to both these statements is -- because it is in the
 mirrors list. ;)

 $ lynx -dump http://yerp.org/rules/MIRRORED.BY
 http://yerp.org:8080/rules/stage/ weight=10
 http://yerp.org/rules/stage/

 ...a botched attempt to set up Coral caching? It seems to me that should
 probably be:

  http://yerp.org.nyud.net:8080/rules/stage/ weight=10
  http://yerp.org/rules/stage/

 I do not suggest that.  Coral Cache does not play nicely with sa-update
 from my experiences (I seem to recall Justin saying the same a while ago).

 Since yerp is Justin's, I presume it's a different sort of experiment.

Yep; a separate HTTP server for sa-update files.  I'll investigate as
soon as I can.

--j.


Re: SOUGHT FP on Twitter notices

2010-05-06 Thread Justin Mason
2010/5/6 Karsten Bräckelmann guent...@rudersport.de:
 On Wed, 2010-05-05 at 15:39 -0700, Kelson Vibber wrote:
 We're seeing FPs on Twitter's "So-and-so is now following you on Twitter"
 notices, pushed over by JM_SOUGHT_3's 4 points.  It appears to be
 matching on __SEEK_O1OO80, which contains a large chunk of Twitter's
 email footer.

 If I were to guess, it's probably due to the phishing campaign that's
 been targeting Twitter users over the last few weeks, faking a message
 from Twitter support. I've seen several of those phish land in our own
 spamtraps and abuse mailbox.

 I can send a ham sample if that would help.

 It does indeed. The sought rule-set's seek sub-rules are cross checked
 against a ham corpus. No twitter ham in the corpus results in forged
 twitter messages to be picked up in a seek, if the volume in the traps
 is high enough.

 Please send us a ham sample. Obfuscating identifying data is ok, but
 please keep it to a minimum needed, and make it obvious. Raw message
 attached preferred. Feel free to send it directly to me and/or Justin,
 rather than the list. Thanks!

+1.  I've added a quick fix (a copy of that rule's text) but some real
ham text would be better.
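
The cross-check Karsten describes can be pictured with a toy Python sketch: collect phrases that recur across spam-trap messages, then discard any phrase that also occurs in the ham corpus. The real sought generator is not shown in this thread, so the word n-gram approach, the function name and the parameters below are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Set

def candidate_phrases(spam: Iterable[str], ham: Iterable[str],
                      width: int = 4, min_spam_hits: int = 2) -> List[str]:
    """Phrases of `width` words seen in at least `min_spam_hits` spam
    messages and in no ham message."""
    def ngrams(text: str) -> Set[str]:
        words = text.lower().split()
        return {" ".join(words[i:i + width])
                for i in range(len(words) - width + 1)}

    ham_seen: Set[str] = set()
    for message in ham:
        ham_seen |= ngrams(message)

    counts: Counter = Counter()
    for message in spam:
        counts.update(ngrams(message))   # each message counts a phrase once

    return sorted(p for p, n in counts.items()
                  if n >= min_spam_hits and p not in ham_seen)
```

With no Twitter ham in the corpus, a phrase from the genuine Twitter footer survives the filter as soon as enough phish copy it; adding a single ham sample containing that footer removes it, which matches the request for a ham sample above.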

--j.


Re: SOUGHT ruleset FP

2010-04-16 Thread Justin Mason
yep -- feel free to send me over copies of FP messages (or strings
that match them)

2010/4/16 Karsten Bräckelmann guent...@rudersport.de:
 On Fri, 2010-04-16 at 12:20 +0100, Matthew Newton wrote:
 We had a legitimate e-mail hit the JM_SOUGHT_3 yesterday. It also
 hit a few other rules that pushed it over our reject threshold of
 10, and easily over the 'junk mail folder' level of 5.

 I managed to get them to send me the message, and it hits rule
 __SEEK_5ID3LI Conti  nuum Intern ational Publishing (spaces
 added!) which is the name of their company.

 Makes one wonder how that string ends up quite massively in spam traps.

 I know SOUGHT is an auto-generated ruleset; just wondering if
 there is any way to remove false positives before the set is

 Yes. The Seek bits are cross-checked against a ham corpus, so the
 easiest way is to inject an artificial ham message with the string in
 question to get it off of the next run.

 generated? Otherwise I'll add local rules to compensate against
 this one.

 meta __SEEK_5ID3LI  (0)

 The Seek ID is constant, and will be the same even with later Sought
 runs, for a given string.

  guenther


 --
 char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
 main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
 (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}




Re: Blacklists Compared 17 October 2009

2010-04-07 Thread Justin Mason
he doesn't take FPs into account.  this is a very serious problem with
the methodology.

--j.

On Wed, Apr 7, 2010 at 03:41, Alex mysqlstud...@gmail.com wrote:
 Hi,

 Last October Marc posted the following URL that compared the various RBLs:

 http://www.sdsc.edu/~jeff/spam/cbc.html

 It seems barracuda is still leading, but is that also everyone's
 experience? Can anyone provide details on how Jeff computed this
 information and is it as cut-and-dried as this makes it seem? IOW,
 barracuda, the free service, is better than all the rest...

 Thanks,
 Alex




Re: has SA 3.3.1 been recalled?

2010-03-28 Thread Justin Mason
On Sun, Mar 28, 2010 at 16:38, Klaus Heinz k.he...@maerzehn.kh-22.de wrote:
 Justin Mason wrote:
 all active mirrors should be updated by now.  if anyone runs into this, just
 save yourself the bother and immediately switch to
 http://www.apache.org/dist/spamassassin/source/ and download directly from
 the source.

 The original rule archive Mail-SpamAssassin-rules-3.3.1.r923114.tgz is
 not available from the URL above.
 What is available is Mail-SpamAssassin-rules-3.3.1.r923257.tgz, as far
 as I remember this was the rule archive originally _proposed_ for release
 but then replaced with r923114.
 Could we have the officially released rules back?

Well spotted :(   My mistake.

I also noticed the downloads page on spamassassin.a.o similarly linked
to the 923257 tarballs for rule packages, instead of 923114!  fixed.

-- 
--j.


Re: has SA 3.3.1 been recalled?

2010-03-26 Thread Justin Mason
all active mirrors should be updated by now.  if anyone runs into this, just
save yourself the bother and immediately switch to
http://www.apache.org/dist/spamassassin/source/ and download directly from
the source.

Then, please reply with the URL of the broken mirror and we'll ask infra to
remove them from the mirrors list until they're up to date.

There seems to be some brokenness on our mirrors :(

--j.

On Fri, Mar 26, 2010 at 09:12, Mathias Homann ad...@eregion.de wrote:
 I'm trying to get the 3.3.1 source from the website, but so far all mirrors
 replied file not found...


 what's up with that?


 bye,
 MH





-- 
--j.


Re: has SA 3.3.1 been recalled?

2010-03-26 Thread Justin Mason
oh, and no -- SA 3.3.1 has not been recalled!

On Fri, Mar 26, 2010 at 10:20, Justin Mason j...@jmason.org wrote:
 all active mirrors should be updated by now.  if anyone runs into this, just
 save yourself the bother and immediately switch to
 http://www.apache.org/dist/spamassassin/source/ and download directly from
 the source.

 Then, please reply with the URL of the broken mirror and we'll ask infra to
 remove them from the mirrors list until they're up to date.

 There seems to be some brokenness on our mirrors :(

 --j.

 On Fri, Mar 26, 2010 at 09:12, Mathias Homann ad...@eregion.de wrote:
 I'm trying to get the 3.3.1 source from the website, but so far all mirrors
 replied file not found...


 what's up with that?


 bye,
 MH





 --
 --j.




-- 
--j.


Re: ANNOUNCE: Apache SpamAssassin 3.3.1 available

2010-03-20 Thread Justin Mason
On Sat, Mar 20, 2010 at 11:25, Martin ma...@ntlworld.com wrote:
 -Original Message-
 From: Jim Knuth [mailto:j...@jkart.de]
 Sent: Friday, March 19, 2010 6:38 PM
 To: users@spamassassin.apache.org
 Subject: Re: ANNOUNCE: Apache SpamAssassin 3.3.1 available

 Michael Scheidell wrote:
  On 3/19/10 12:31 PM, Justin Mason wrote:
  Release Notes -- Apache SpamAssassin -- Version 3.3.1
 
 
 http://www.apache.org/dist/spamassassin/source/Mail-SpamAssassin-3.3.1
  .tar.gz.md5
 
  error 404
  the requested file is not found on this server.
 
 

 use CPAN, everything is ok. :)


 After a reload index I found the new version but the .bz2 seems to be
 causing my cpan problems, getting; Failed during this command:
 JMASON/Mail-SpamAssassin-3.3.1.tar.bz2       : unwrapped NO -- untar failed

 Can we not just go back to the .gz? Never had a problem before.


ok, I've uploaded the tar.gz.  We should avoid the .bz2 in future if
it causes problems.

-- 
--j.


ANNOUNCE: Apache SpamAssassin 3.3.1 available

2010-03-19 Thread Justin Mason
Release Notes -- Apache SpamAssassin -- Version 3.3.1


Introduction


This is a minor release, adding a new URIBL network rule (URIBL_DBL_SPAM, for
the Spamhaus DBL).


Downloading and availability


Downloads are available from:

http://spamassassin.apache.org/downloads.cgi

md5sum of archive files:

  bb977900c3b2627db13e9f44f9b5bfc8  Mail-SpamAssassin-3.3.1.tar.bz2
  5a93f81fda315411560ff5da099382d2  Mail-SpamAssassin-3.3.1.tar.gz
  4cfeb3449cee173085deef06e3090543  Mail-SpamAssassin-3.3.1.zip
  3e6ae5a39b9dd2de7ec05a2b315c396b  Mail-SpamAssassin-rules-3.3.1.r923114.tgz

sha1sum of archive files:

  f5748043eb286b1acb456093039a55db00c6f25e  Mail-SpamAssassin-3.3.1.tar.bz2
  8b32a857cc89c8d057442400bc00f33fd703ce06  Mail-SpamAssassin-3.3.1.tar.gz
  9fc7c8bfd153d49d60fbeba99d0a4272609e3a26  Mail-SpamAssassin-3.3.1.zip
  7aeeb7abb2d727bb35d3a0927a1390ad3cddad59  Mail-SpamAssassin-rules-3.3.1.r923114.tgz
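
As a convenience, here is a short Python sketch for checking a downloaded archive against the sums listed above. It uses only the standard `hashlib` module; the function names are illustrative, and the file path is whatever you saved the download as.

```python
import hashlib

def file_digest(path: str, algorithm: str = "sha1",
                chunk_size: int = 65536) -> str:
    """Hex digest of a file, read in chunks so large archives are fine."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches(path: str, expected_hex: str, algorithm: str = "sha1") -> bool:
    """Compare a file's digest with a published value (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `matches("Mail-SpamAssassin-3.3.1.tar.gz", "8b32a857cc89c8d057442400bc00f33fd703ce06")` checks the tar.gz against its published sha1sum. A checksum only detects a damaged or mismatched download; the GPG signature described below is the stronger check.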


Note that the *-rules-*.tgz files are only necessary if you cannot, or do not
wish to, run sa-update after install to download the latest fresh rules.

The release files also have a .asc accompanying them.  The file serves
as an external GPG signature for the given release file.  The signing
key is available via the wwwkeys.pgp.net key server, as well as
http://www.apache.org/dist/spamassassin/KEYS

The key information is:

pub   4096R/F7D39814 2009-12-02
  Key fingerprint = D809 9BC7 9E17 D7E4 9BC2  1E31 FDE5 2F40 F7D3 9814
uid  SpamAssassin Project Management Committee priv...@spamassassin.apache.org
uid  SpamAssassin Signing Key (Code Signing Key, replacement for 1024D/265FA05B) d...@spamassassin.apache.org
sub   4096R/7B3265A5 2009-12-02

See the INSTALL and UPGRADE files in the distribution for important
installation notes.


Summary of major changes since 3.3.0


bug 6335: add Spamhaus DBL as URIBL_DBL_SPAM rule

Bug 6370: update ImageInfo plugin to latest release

bug 6215, bug 6294: RCVD_IN_CSS rule was broken.  the check_rbl_sub() syntax
was incorrect, resulting in missing hits

bug 6361: list 2tld and 3tld sub-domain hosters for URIBL/SURBL/DBL queries;
NOTE for SARE users: This file replaces the SARE file
http://www.rulesemporium.com/rules/90_2tld.cf, which will be deprecated as from
2010-05-01.

Bug 6369, 6356, 6373: WIN32 support for spamd improved

Bug 6267: Solaris 10 requires --syslog-socket=native

bug 6304 spamd is spawning and killing processes too often - Added spamd
adjustments to info level and more information for administrators + small fix
to Makefile.PL

Bug 6310: sa-learn --import gives Insecure dependency in open

Bug 6313: -Q or -q AND -x should not result in creation of a ~/.spamassassin
dir; plus: taint issues fixed

Bug 6342: make test failure on if_can under perl 5.6

Bug 6340: Impossible to find user home directory of VPOPMAIL alias

Bug 6072, 6343: POD warnings, documentation fixes

Bug 6304 (trivial), reduce sysadmin's stress level by lowercasing
the 'INTERRUPTED' in a logged message:
 spamd: handled cleanup of child pid [...] due to SIGCHLD: INTERRUPTED

Bug 6329: POSIX::strftime in call under Win32 ActivePerl causes Perl to hang up;
formatting option %e is not in a POSIX standard, use %d instead and edit

Bug 6322: In DKIM ADSP eval test check_dkim_adsp() the '*' is handled incorrectly

Bug 6327: Fix calling argument in utility used to determine DCC's homedir

Bug 6316: DCC.pm, wrong options for dcc_proc, (plus: avoid a warning on undef
in logger when dccifd socket is not provided)

Bug 6287: improved DKIM plugin debugging

Bug 6321 - _TOKENSUMMARY_ not working in 3.3.0 (Plugin/Bayes.pm looks-up a tag
from wrong location)

Bug 6312 - uninitialized value $start_time in spamd

bug 5761: trivial doc fix: document SPAMD_LOCALHOST test-control env variable
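
The workaround named in the bug 6329 entry above (use %d and edit the result, since %e is not supported by every platform's strftime) can be sketched as follows. This is a Python illustration of the idea, not the Perl change that went into the release, and the function name is made up.

```python
from datetime import date

def day_space_padded(d: date) -> str:
    """Day of month padded with a space, like %e, built from the portable %d."""
    day = d.strftime("%d")                          # always two digits, e.g. "05"
    return " " + day[1] if day[0] == "0" else day   # "05" becomes " 5"
```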


About Apache SpamAssassin

Apache SpamAssassin is a mature, widely-deployed open source project
that serves as a mail filter to identify spam. SpamAssassin uses a variety
of mechanisms including mail header and text analysis, Bayesian filtering,
DNS blocklists, and collaborative filtering databases. In addition, Apache
SpamAssassin has a modular architecture that allows other technologies to be
quickly incorporated as an addition or as a replacement for existing methods.
Apache SpamAssassin typically runs on a server, classifies and labels spam
before it reaches your mailbox, while allowing other components of a mail
system to act on its results.

Most of Apache SpamAssassin is written in Perl, with heavily traversed
code paths carefully optimized. Benefits are portability, robustness and
facilitated maintenance. It can run on a wide variety of POSIX platforms.
The server and the Perl library feel at home on Unix and Linux platforms,
and reportedly also work on MS Windows systems under ActivePerl.

For more information, visit http://spamassassin.apache.org/


About The 

Re: Pathological messages causing long scan times

2010-03-18 Thread Justin Mason
On Thu, Mar 18, 2010 at 21:56, Matt Garretson
ma...@assembly.state.ny.us wrote:
 On 3/18/2010 5:15 PM, Kris Deugau wrote:
 Here's one pretty much guaranteed to peg a CPU core for ~130 seconds (or
 more):

 http://pastebin.com/2ssy2YEk


 Interesting. I see the same thing as you on that message. There's a
 two-minute gap between these two debug lines:

  rules: ran body rule __F_LARGE_MONEY_2 == got hit: 00 million
  rules: ran body rule __SEEK_FRAUD_JFMEJI == got hit: Insurance premium and Clearance Certificate Fee

 One CPU is mostly pegged during that period. Thinking it had something
 to do with JM_SOUGHT_FRAUD_3, I removed that and __SEEK_FRAUD_JFMEJI,
 with no improvement. I don't know where else to look, aside from
 trial-and-error disabling of rules.

binary search of the ruleset is the easiest way to find this.  It's
most often a rawbody rule.  Please let us know which one it is,
particularly if it's in the core ruleset rather than a third-party
one...
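
A sketch of that binary search in Python. It assumes exactly one rule is responsible and that rules can be enabled in arbitrary subsets (meta rules with dependencies complicate this); `is_slow` is a hypothetical callback that scans the problem message with only the given rules enabled and reports whether the scan was slow.

```python
from typing import Callable, List, Sequence

def find_slow_rule(rules: Sequence[str],
                   is_slow: Callable[[List[str]], bool]) -> str:
    """Narrow a rule list down to the single rule that makes the scan slow."""
    candidates = list(rules)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first = candidates[:half]
        # Keep whichever half still reproduces the slowness.
        candidates = first if is_slow(first) else candidates[half:]
    return candidates[0]
```

With a few thousand rules this needs only a dozen or so test scans, against one scan per rule for trial-and-error disabling.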


-- 
--j.


Re: Pathological messages causing long scan times

2010-03-18 Thread Justin Mason
that's CPU-bound, no system calls = regexp matching.  body, rawbody
or full rules.

On Thu, Mar 18, 2010 at 22:16, Matt Garretson
ma...@assembly.state.ny.us wrote:
 On 3/18/2010 6:06 PM, Matt Garretson wrote:
 It looks like a dns call (or two?) for URI-A took 120 seconds to return.
 Is that a mere coincidence, or could that be causing a spin of some sort?


 FWIW, strace shows spamassassin doing this about twice a second
 (with varying arguments) during the two-minute delay:

  brk(0x69df000)                          = 0x69df000
  mremap(0x7fc9756db000, 1298432, 1302528, MREMAP_MAYMOVE) = 0x7fc9756db000
  mremap(0x7fc9756db000, 1302528, 1306624, MREMAP_MAYMOVE) = 0x7fc9756db000
  []






-- 
--j.


Re: [SpamAssassin] 0day spamass-milter again....

2010-03-17 Thread Justin Mason
On Wed, Mar 17, 2010 at 15:07, Daniel Lemke le...@jam-software.com wrote:
 Hmm, any comments on this?

 Heise.de just published an article regarding this issue:
 http://www.heise.de/newsticker/meldung/Sicherheitsluecke-in-SpamAssasin-Filtermodul-956991.html

 Kind of interesting to me since I have to run sa as root under windows ;)

It'll have no effect whatsoever.

To clarify: spamass-milter is not a part of SpamAssassin.  it's a
third-party product which allows sendmail/postfix users to integrate
spamassassin into their message flows as a milter.  We don't maintain
it.

--j.


Re: SA team lambasted in RISKS Digest

2010-03-05 Thread Justin Mason
Agreed, he's clearly unaware of
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6271

Anyone care to craft a response?  I think we should.  bonus points for
including the obligatory comp.risks tagline:

The RISK?  Jumping to an invalid conclusion based on incomplete research.

;)

--j.

2010/3/5 Karsten Bräckelmann guent...@rudersport.de:
 On Fri, 2010-03-05 at 09:45 +0800, jida...@jidanni.org wrote:
 http://catless.ncl.ac.uk/Risks/25.94.html#subj11
 I suggest someone send RISKS a clarification if indeed the issue is resolved.

 I suggest the author checks the facts. The following quote, the
 beginning of that, err... text, is utter bullshit.

  In RISKS-25.89 (Y2K+10 problem 4: SpamAssassin tags '2010' e-mail as
  spammish) M. Burstein wrote that the problem was that It seems the 'year
  date' was hard/hand coded, as opposed to making a comparison to 'today's'
  date. and observed that The SpamAssassin folk have a new version which
  corrects this problem.  In fact, they do not.  The replacement rule
  incorporates the same problem as before, scheduled to occur simply ten years
  further into the future, in January 2020.  This mistake has not been learned
  from, let alone corrected.

 There are bugs open about this. There are rules currently under
 evaluation, which will make this a fluid target, rather than a hardcoded
 year.

 We've got ten years, to close that bug and finish the evaluation. And
 even to come up with a more narrow definition of grossly in the
 future.
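
 To illustrate the difference between a hardcoded year and a "fluid target",
 here is a minimal Python sketch that compares the message date with the
 current date plus a margin. The one-year margin and the function name are
 assumptions for the example; as noted above, what counts as "grossly in
 the future" had not been settled.

```python
from datetime import datetime, timedelta, timezone

def grossly_in_future(message_date: datetime, now: datetime,
                      margin: timedelta = timedelta(days=365)) -> bool:
    """True if the Date header lies more than `margin` past the current time.

    Because the comparison is relative to `now`, nothing needs editing
    when the decade rolls over."""
    return message_date > now + margin
```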

 Yes, that quote is what you get if you base your judgement *and* future
 predictions solely on the incident -- but forget to check current
 development and what's being done to prevent it.

 That quote hardly was worth my reply. *sigh*


 --
 char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
 main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
 (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}





-- 
--j.


Re: Rule QA: Completeness / Preflight?

2010-03-01 Thread Justin Mason
On Mon, Mar 1, 2010 at 15:01,  dar...@chaosreigns.com wrote:
 On 03/01, Justin Mason wrote:
 it's based on who's reported their logs -- give it time to complete.

 Thanks.

 nope -- preflights have been stopped, as they're quite CPU-intensive and
 we don't have the hardware.

 How about hit-frequencies output from the corpora used for sa-update
 updates?

that's the ruleqa.spamassassin.org UI.


Re: Rule QA: Completeness / Preflight?

2010-03-01 Thread Justin Mason
On Mon, Mar 1, 2010 at 17:09,  dar...@chaosreigns.com wrote:
 On 03/01, Justin Mason wrote:
 that's the ruleqa.spamassassin.org UI.

 Which data is used for the sa-updates?  Just the latest random weekly
 network mass-check?

Yep, exactly.  (with additional checks to ensure the data is good enough
beforehand.)


-- 
--j.


Re: sa-update channel problem

2010-02-15 Thread Justin Mason
On Mon, Feb 15, 2010 at 07:46, mbeis mb...@xs4all.nl wrote:


 John Hardin wrote:

 On Sun, 14 Feb 2010, mbeis wrote:

 Feb 14 22:12:46.522 [11706] dbg: dns: query failed:
 0.3.3.updates.spamassassin.org = NOERROR
 Feb 14 22:12:46.525 [11706] dbg: dns: query failed:
 mirrors.updates.spamassassin.org = NOERROR
 channel: no 'mirrors.updates.spamassassin.org' record found, channel
 failed
 Feb 14 22:12:46.525 [11706] dbg: diag: updates complete, exiting with
 code 4

 I've no idea where to look to solve this. Does anyone here have an idea
 what causes this?

 Silly, basic question: does DNS work from that host?

 What does dig +short -t TXT 0.3.3.updates.spamassassin.org return?


 I have this computer running like this for 6 years now, and I've never had a
 problem like this before. When I enter the command it returns nothing,
 doesn't seem ok to me?

The most likely scenario is that your /etc/resolv.conf file specifies
an incorrect value for the first nameserver.  Ensure the first IP
listed in that file is a working recursive NS.

if you don't have working DNS at the site, maybe download the rules
tarball from the download site and use sa-update --install.
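
A rough sketch of both suggestions, assuming a POSIX shell. first_nameserver is a hypothetical helper and RULES_TARBALL is a placeholder for whatever file you downloaded; neither is part of SpamAssassin.

```shell
# Print the first nameserver listed in a resolv.conf-style file
# (defaults to /etc/resolv.conf when no argument is given).
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Usage (not run here; needs dig and network access):
#   ns=$(first_nameserver)
#   dig +short -t TXT 0.3.3.updates.spamassassin.org "@$ns"
# An empty answer suggests that nameserver is not resolving for you.
#
# Fallback without working DNS (RULES_TARBALL is a placeholder for the
# rules tarball fetched from the download site):
#   sa-update --install "$RULES_TARBALL"
```

Querying the first listed nameserver directly is the point of the helper: the resolver library tries that entry first, so a dead first entry is enough to produce the failure shown in the debug output.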

-- 
--j.


Re: MTX plugin created (Re: Spam filtering similar to SPF, less breakage)

2010-02-15 Thread Justin Mason
On Sat, Feb 13, 2010 at 11:01, Per Jessen p...@computer.org wrote:
 Justin Mason wrote:

 On Thu, Feb 11, 2010 at 03:00,  dar...@chaosreigns.com wrote:
 http://www.chaosreigns.com/mtx/


 It might be useful to compare with MTA MARK and see what the status of
 that proposal currently is:

 http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/
http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/draft-stumpf-dns-mtamark-04.txt

 Amazing.  Justin, you must have known about that one - you can't
 possibly have just googled it?

I could vaguely recall it, then someone else reminded me of the exact
name.  There have been a lot of MARID proposals in the past...

--j.



-- 
--j.


Re: MTX plugin created (Re: Spam filtering similar to SPF, less breakage)

2010-02-12 Thread Justin Mason
On Thu, Feb 11, 2010 at 03:00,  dar...@chaosreigns.com wrote:
 http://www.chaosreigns.com/mtx/


It might be useful to compare with MTA MARK and see what the status of
that proposal currently is:

http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/
http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/draft-stumpf-dns-mtamark-04.txt

-- 
--j.


Re: My Outgoing Email is Flagged as ***SPAM***?

2010-02-08 Thread Justin Mason
On Mon, Feb 8, 2010 at 15:07, Carlos Williams carlosw...@gmail.com wrote:
 On Mon, Feb 8, 2010 at 9:52 AM, Mike Cardwell
 spamassassin-us...@lists.grepular.com wrote:
 The error message, gpg required but not found, means that gpg was
 required, but was not found. I'm going to go out on a limb here and suggest
 that maybe gpg wasn't found, but is required, and that you should install
 it?

 As you guessed I figured this much with the help of Google. I search
 over and over how do I install the GPG key on my system and the Wiki
 really does not give clueless people like myself any reference as to
 how I can import the following key:

 http://www.apache.org/dist/spamassassin/KEYS

 I saved that key into a file and read the man page for sa-update
 trying to import it and all attempts failed:

 sa-update --gpgkey /sa_gpg.key
 error: gpg required but not found!

Take another look at the man page:

 --import file   Import GPG key(s) from file into sa-update's
 keyring. Use multiple times for multiple files

[...]

   --import
   Use to import GPG key(s) from a file into the sa-update keyring
   which is located in the directory specified by --gpghomedir.
   Before using channels from third party sources, you should use
   this option to import the GPG key(s) used by those channels.  You
   must still use the --gpgkey or --gpgkeyfile options above to get
   sa-update to trust imported keys.

   To import multiple keys, use the option multiple times.  i.e.:

   sa-update --import channel1-GPG.KEY --import channel2-GPG.KEY

   Note: use of this option automatically enables GPG verification.
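
A minimal sketch of that sequence, assuming a POSIX shell. It first checks that the gpg binary is on PATH, since that is what the "gpg required but not found" message points at. import_sa_key is a hypothetical wrapper and KEY_ID is a placeholder, not a real key ID.

```shell
# Import a channel's GPG key into sa-update's keyring, but only after
# confirming that the gpg binary can actually be found on PATH.
import_sa_key() {
    keyfile=$1
    if ! command -v gpg >/dev/null 2>&1; then
        echo "gpg not found in PATH; install GnuPG before using --import" >&2
        return 1
    fi
    sa-update --import "$keyfile"
}

# Usage (KEYS is the file saved from the Apache dist URL quoted in this
# thread; KEY_ID is a placeholder for the channel's key ID):
#   import_sa_key ./KEYS && sa-update --gpgkey "$KEY_ID"
```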

--j.



 [r...@mail spamassassin]# sa-update --gpghomedir /sa_gpg.key
 error: gpg required but not found!

 I am guessing I am again missing something here which is placing your
 palm to face but I really would appreciate any assistance in how to
 correct this error? If it's explained on the SA Wiki, then I am
 missing it, because I read through it pretty carefully.





-- 
--j.


Re: Sought rules not doing so good

2010-02-03 Thread Justin Mason
On Tue, Feb 2, 2010 at 18:21, Warren Togami wtog...@redhat.com wrote:
 On 02/02/2010 12:07 PM, Adam Katz wrote:

 That is quite different from our masscheck stats.  Today's results at
 http://ruleqa.spamassassin.org/20100201/%2FJM_SOUGHT look like this:

    SPAM%     HAM%     S/O    RANK   SCORE  NAME
   9.8564   0.0042   1.000    0.94    0.01  T_JM_SOUGHT_3
   8.1587   0.0068   0.999    0.93    0.01  T_JM_SOUGHT_2
  11.6464   0.0289   0.998    0.89    0.01  T_JM_SOUGHT_1
        0        0   0.500    0.48    0.00  JM_SOUGHT_FRAUD_1
        0        0   0.500    0.48    0.00  JM_SOUGHT_FRAUD_2
        0        0   0.500    0.48    0.00  JM_SOUGHT_FRAUD_3


 FWIW the nightly masscheck is often very unbalanced especially on the spam
 side.  Sometimes we have only 50k spam, sometimes over 500k spam. Some spam
 corpora contain a disproportionate amount of high scoring spam trap mail.  I
 personally randomly filter out a large percentage of high scoring mail in an
 attempt to balance my spam corpus.  But ultimately we need more masscheck
 participants to have better results.

The corpus-quality for that masscheck doesn't look too bad though:

http://ruleqa.spamassassin.org/20100201-r905213-n/T_JM_SOUGHT_1/detail?s_corpus=1#corpus
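
For what it's worth, the S/O column in the table quoted above is consistent with S/O being the spam share of a rule's hits, SPAM% / (SPAM% + HAM%), with 0.500 reported when a rule hits nothing. A minimal sketch under that assumption; so_ratio is a made-up name and the real ruleqa calculation may differ in detail.

```shell
# Compute an S/O-style ratio from SPAM% and HAM% hit rates.
# Assumption: S/O = spam / (spam + ham), and 0.500 when there are no hits.
so_ratio() {
    awk -v s="$1" -v h="$2" 'BEGIN {
        if (s + h == 0) {
            printf "%.3f\n", 0.5
        } else {
            printf "%.3f\n", s / (s + h)
        }
    }'
}

# Usage, with the T_JM_SOUGHT_1 row from the quoted table:
#   so_ratio 11.6464 0.0289
```

A rule that hits a lot of spam can therefore still show a near-perfect S/O while hitting a small but real fraction of ham, so the HAM% column deserves a look of its own.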

-- 
--j.


Re: PORTERS QUESTION: SA 3.3.0 and rules

2010-02-01 Thread Justin Mason
In this case, I would use the sa-update --install option.

On Sun, Jan 31, 2010 at 19:56, Michael Scheidell scheid...@secnap.net wrote:
 Working on official SA 3.3.0 port for Freebsd, have a Question:
 if user who installs SA 3.3.0 does NOT install or use sa-update, then I have
 to install the default ruleset.
 where should I put it? into the updates directory?
 ../3.003000/updates_spamassassin_org/
 or where it was for 3.2.5? ../share/mail/spamassassin?

 assuming they will either NEVER update it, or they will (someday) get smart
 and run sa-update?
 where is the best place to put it?
 and, will checksum/location of default ruleset ever change?

 --
 Michael Scheidell, CTO
 Phone: 561-999-5000, x 1259
 *| *SECNAP Network Security Corporation

   * Certified SNORT Integrator
   * 2008-9 Hot Company Award Winner, World Executive Alliance
   * Five-Star Partner Program 2009, VARBusiness
   * Best Anti-Spam Product 2008, Network Products Guide
   * King of Spam Filters, SC Magazine 2008

 __
 This email has been scanned and certified safe by SpammerTrap(r). For
 Information please see http://www.secnap.com/products/spammertrap/
 __




-- 
--j.


Re: PORTERS QUESTION: SA 3.3.0 and rules

2010-02-01 Thread Justin Mason
it's a release version -- each release's version of that file and
its sigs will never change.

On Mon, Feb 1, 2010 at 10:55, Michael Scheidell scheid...@secnap.net wrote:


 On 2/1/10 5:52 AM, Justin Mason wrote:

 In this case, I would use the sa-update --install option.



 thanks, yes, I think during the freebsd fetch, I will fetch both tarballs,
 install the default rule set so that if they start spamd or run SA, it won't
 fail.
 (so that it is consistent with existing installations)

 Q: will that 'default' tarball of rules ALWAYS be available? and ALWAYS have
 the same md5 sig and size? or will it change?
 if it 'moves' or changes, then ports and rpm maintainers will need a
 'static', (release version) that doesn't change.





-- 
--j.


Re: sa-compile and 3.3.0

2010-01-29 Thread Justin Mason
On Fri, Jan 29, 2010 at 14:59, Mark Martinec mark.martinec...@ijs.si wrote:
 On Friday 29 January 2010 04:20:15 René Berber wrote:
 Jason Bertoch wrote:
  What version of re2c are you using?  Can you post the output of
  'spamassassin -D --lint' to pastebin?

 Now using re2c 1.3.5 same problem, to be precise it doesn't hang, it
 loops (the CPU usage goes up and down, RSS the same, up and down) at the
 same point.

 Here's the output http://pastebin.com/m438000e0

 Indeed, it looks like spinning in
  zoom: run_body_fast_scan for body_0

 Could well be due to some problematic rule (SARE?),
 combined with some unusual mail.

 The re2c and the 'zoom:' plugins are beyond my expertise.
 I suggest you open a problem report, and attach
 your above log there, so that it does not get lost.

+1.

Typically the best approach is to binary-search the list of body
rules until you identify the subset that cause the problem, then use
sa-compile --keep-tmps to capture the re2c output and attach it.
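
A sketch of that binary search, assuming a POSIX shell and a single culprit rule. bisect_rules and the check command are hypothetical; how you turn a list of rule names into a test run (for example, a trimmed .cf file fed to sa-compile) is left to the check command.

```shell
# bisect_rules FILE CHECK
#   FILE  - one rule name per line
#   CHECK - command or function that takes a file of rule names and
#           returns 0 if that subset still reproduces the problem
# Prints the single remaining rule. Assumes exactly one rule is at fault.
bisect_rules() {
    rules=$1
    check=$2
    work=$(mktemp)
    half=$(mktemp)
    cp "$rules" "$work"
    while [ "$(awk 'END { print NR }' "$work")" -gt 1 ]; do
        n=$(awk 'END { print NR }' "$work")
        mid=$((n / 2))
        head -n "$mid" "$work" > "$half"
        if "$check" "$half"; then
            # The first half still fails: keep it.
            cp "$half" "$work"
        else
            # Otherwise the culprit is in the second half.
            tail -n +"$((mid + 1))" "$work" > "$half"
            cp "$half" "$work"
        fi
    done
    cat "$work"
    rm -f "$work" "$half"
}

# Once the subset is small, capture the generated re2c input:
#   sa-compile --keep-tmps
```

If more than one rule is involved, the loop will still converge on one of them, but you would need to rerun it after disabling that rule.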

-- 
--j.


Re: Fuzzyocr and rule errors after upgrade to 3.3.0

2010-01-27 Thread Justin Mason
On Wed, Jan 27, 2010 at 16:43, John Wilcock j...@tradoc.fr wrote:
 To state the problem again, 463 of the scores in the 50_scores.cf from 3.3.0
 sa-update refer to rules that used to be in 72_active.cf or 80_additional.cf
 in 3.2.5, but that neither of these two files are anywhere to be found in
 the 3.3.0 sa-update.

 Either someone forgot to delete all these rules, or (more likely IMO)
 someone forgot to include 72_active.cf and 80_additional.cf in the sa-update
 files.

I think you're dead right.  It appears one of the build scripts does
the wrong thing with the 3.3.0 release, and cut an update missing
those files. :(

I've just cut a new update which should fix it, and opened a bug to
fix the build procedure...

-- 
--j.


Re: REMINDER: 3.3.0 final cut January 15th, 2010

2010-01-18 Thread Justin Mason
On Sun, Jan 17, 2010 at 21:31, Kai Schaetzl mailli...@conactive.com wrote:
 Warren Togami wrote on Mon, 11 Jan 2010 10:32:15 -0500:

 This is a reminder that the 3.3.0 final cut is scheduled for Friday,
 January 15th.

 It seems you forgot to announce RC3, or did I overlook it here?

rc3 was never released -- there were some test failures and other
issues, so that cut was canned.


Re: Error code 98

2010-01-13 Thread Justin Mason
yes, good point.  I've updated the POD docs now for 3.3.0.

--j.

On Wed, Jan 13, 2010 at 09:01, Cecil Westerhof ce...@decebal.nl wrote:
 In the thread:
    http://osdir.com/ml/debian-bugs-closed/2009-08/msg01318.html

 Error code 98 is described as the message being fed being too big and the
 problem resolved. But it is not.

 I have a big message:
    -rw-r--r--  1 imaps users 1,4M 2010-01-11 18:05 
 1263235863.M361818P11014V0303I00D10102_0.Asterisk,S=1406379:2,S

 When feeding it to spamc with:
     spamc -L spam < toProcess/1263235863.M361818P11014V0303I00D10102_0.Asterisk,S=1406379:2,S

 I get a return code of 98 and this breaks my crontab job. I now solved
 this by:
                 set +e
                 message=$(spamc -L "${typeStr}" < "${toProcessSpamDir}${i}")
                 errorCode=${?}
                 set -e
                 case "${message}" in
                     'Message successfully un/learned')
                         let ++learned
                         ;;
                     'Message was already un/learned')
                         let ++notLearned
                         ;;
                     *)
                         let ++error
                         case ${errorCode} in
                             98)
                                 echo "${i} was too big to be processed"
                                 ;;
                             *)
                                 echo "unknown error (${errorCode})"
                                 ;;
                         esac
                         ;;
                 esac

 But I think this should be documented. (With other undocumented errors
 if there are.)

 --
 Cecil Westerhof
 Senior Software Engineer
 LinkedIn: http://www.linkedin.com/in/cecilwesterhof





-- 
--j.


Re: Spamd startup locale question

2010-01-13 Thread Justin Mason
I'm not sure -- I've seen that (generally with UTF-8 locales).   I
think it may be missing locale data in the OS install.

On Wed, Jan 13, 2010 at 14:53, Rosenbaum, Larry M. rosenbau...@ornl.gov wrote:
 SpamAssassin Server version 3.2.5
  running on Perl 5.8.8
  with zlib support (Compress::Zlib 2.011)
 SunOS ornl72 5.9 Generic_122300-07 sun4u sparc SUNW,Sun-Fire-V240

 What causes the following error message when restarting spamd?

 perl: warning: Setting locale failed.
 perl: warning: Please check that your locale settings:
        LC_ALL = (unset),
        LANG = en_US
    are supported and installed on your system.
 perl: warning: Falling back to the standard locale (C).

 This happens on some, but not all, of our systems running spamd.  All the 
 startup files contain

 LANG=en_US
 export LANG

 Thanks,
 Larry





-- 
--j.


Re: Compiling error on Snapshots from svn.spamassassin.org

2010-01-12 Thread Justin Mason
hi -- is this still occurring with latest snapshots?  If so, could you
open a ticket at our bugzilla?

On Tue, Jan 5, 2010 at 21:00, David Bayle david.ba...@zerospam.ca wrote:
 Hy,

 Our setup is:
 - Ubuntu 8.04 ( 2.6.26 )
 - Trying to setup snapshots from
 http://svn.apache.org/snapshots/spamassassin/
 - Tested:

  spamassassin_20100104211200.tar.gz
  spamassassin_20100105031200.tar.gz
  spamassassin_20100105091200.tar.gz
  spamassassin_20100105151200.tar.gz

 We are trying to install SA snapshot from the svn.apache.org, but we got
 this error with all snapshots:

 config.status: creating config.h
 /usr/bin/make -f spamc/Makefile spamc/spamc
 make[2]: Entering directory `/home/dbayle/spamassassin'
 gcc  -g -O2 spamc/spamc.c spamc/getopt.c spamc/libspamc.c spamc/utils.c \
         -o spamc/spamc -L/usr/local/lib  -ldl -lz
 make[2]: Leaving directory `/home/dbayle/spamassassin'
 cp spamc/spamc blib/script/spamc
 /usr/bin/perl -MExtUtils::MY -e MY->fixin(shift) blib/script/spamc
 /usr/bin/perl build/preprocessor  -Mvars -DVERSION=3.003000
 -DPREFIX=/usr -DDEF_RULES_DIR=/usr/share/spamassassin
 -DLOCAL_RULES_DIR=/etc/mail/spamassassin
 -DLOCAL_STATE_DIR=/var/lib/spamassassin
 -DINSTALLSITELIB=/usr/share/perl5 -DCONTACT_ADDRESS=the administrator of
 that system -Msharpbang -Mconditional -DPERL_BIN=/usr/bin/perl
 -DPERL_WARN= -DPERL_TAINT= -m755 -isa-learn.raw -osa-learn
 cp sa-learn blib/script/sa-learn
 /usr/bin/perl -MExtUtils::MY -e MY->fixin(shift) blib/script/sa-learn
 make[1]: *** No rule to make target `sa-awl.raw', needed by `sa-awl'.  Stop.
 make[1]: Leaving directory `/home/dbayle/spamassassin'
 make: *** [build-stamp] Error 2
 dpkg-buildpackage: failure: debian/rules build gave error exit status 2


 May you help us  ?
 Best regards,
 ZEROSPAM technical support

 --
 ZEROSPAM Sécurité Inc. - http://www.zerospam.ca
 Tél : (514) 527 3232  #210
 Fax : (514) 527 1201

 Ce courriel a été filtré par ZEROSPAM pour votre sécurité. This email has
 been scanned by ZEROSPAM for your security. zerospam.ca/




-- 
--j.


Re: sa-update failing

2010-01-07 Thread Justin Mason
Very unusual.  What do you get if you use wget or curl from the
command line to download those URLs:


[98652] dbg: http: GET request,
http://daryl.dostech.ca/sa-update/asf/895075.tar.gz
[98652] dbg: http: GET request,
http://daryl.dostech.ca/sa-update/asf/895075.tar.gz.sha1
[98652] dbg: http: GET request,
http://daryl.dostech.ca/sa-update/asf/895075.tar.gz.asc

At a first guess, I suspect a transparent proxy at your site or ISP,
messing with the HTTP traffic, but it could be something more
ominous...
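
A minimal sketch of that manual check, assuming a POSIX shell and that one of sha1sum, shasum, sha1 or openssl is installed. sha1_of and check_sha1 are hypothetical helper names.

```shell
# Print the SHA1 digest of a file using whichever tool is available.
sha1_of() {
    if command -v sha1sum >/dev/null 2>&1; then
        sha1sum "$1" | awk '{ print $1 }'
    elif command -v shasum >/dev/null 2>&1; then
        shasum -a 1 "$1" | awk '{ print $1 }'
    elif command -v sha1 >/dev/null 2>&1; then
        sha1 -q "$1"
    else
        openssl dgst -sha1 "$1" | awk '{ print $NF }'
    fi
}

# check_sha1 FILE SHA1FILE
# Succeeds if FILE's digest equals the first field of SHA1FILE.
check_sha1() {
    actual=$(sha1_of "$1")
    expected=$(awk '{ print $1; exit }' "$2")
    [ "$actual" = "$expected" ]
}

# Usage (not run here; needs network access):
#   curl -O http://daryl.dostech.ca/sa-update/asf/895075.tar.gz
#   curl -O http://daryl.dostech.ca/sa-update/asf/895075.tar.gz.sha1
#   check_sha1 895075.tar.gz 895075.tar.gz.sha1 && echo match || echo MISMATCH
```

If the digest matches when fetched by hand but sa-update still reports a mismatch, that points at something between sa-update and the mirror, such as the proxy suspected here, rather than at the mirror's files.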

--j.

On Wed, Jan 6, 2010 at 19:42, David Chaplin-Loebell dir...@klatha.com wrote:
 Hi,

 I'm trying to update my rules with sa-update, and it is failing:

 deliver3# sa-update -D
 [98652] dbg: logger: adding facilities: all
 [98652] dbg: logger: logging level is DBG
 [98652] dbg: generic: SpamAssassin version 3.2.5
 [98652] dbg: config: score set 0 chosen.
 [98652] dbg: dns: is Net::DNS::Resolver available? yes
 [98652] dbg: dns: Net::DNS version: 0.65
 [98652] dbg: generic: sa-update version svn607589
 [98652] dbg: generic: using update directory: /var/db/spamassassin/3.002005
 [98652] dbg: diag: perl platform: 5.008009 freebsd
 snip - let me know if something in here is relevant, just trying to make a
 shorter message
 [98652] dbg: channel: no MIRRORED.BY file available
 [98652] dbg: http: GET request,
 http://spamassassin.apache.org/updates/MIRRORED.BY
 [98652] dbg: channel: MIRRORED.BY file retrieved
 [98652] dbg: channel: reading MIRRORED.BY file
 [98652] dbg: channel: found mirror http://daryl.dostech.ca/sa-update/asf/
 weight=5
 [98652] dbg: channel: found mirror http://www.sa-update.pccc.com/ weight=5
 [98652] dbg: channel: selected mirror http://daryl.dostech.ca/sa-update/asf
 [98652] dbg: http: GET request,
 http://daryl.dostech.ca/sa-update/asf/895075.tar.gz
 [98652] dbg: http: GET request,
 http://daryl.dostech.ca/sa-update/asf/895075.tar.gz.sha1
 [98652] dbg: http: GET request,
 http://daryl.dostech.ca/sa-update/asf/895075.tar.gz.asc
 [98652] dbg: sha1: verification wanted:
 ade9426b8f85bed554604033c71e925e5e597502
 [98652] dbg: sha1: verification result:
 b3fa2495f0091f85a8421b39c72b868afef6808f
 channel: SHA1 verification failed, channel failed
 [98652] dbg: generic: cleaning up temporary directory/files
 [98652] dbg: diag: updates complete, exiting with code 4

 I searched the list archives for the SHA1 verification error, and found
 nothing recent-- is this problem somehow unique to my system?  Any
 suggestions?

 Thanks,
 David Chaplin-Loebell





-- 
--j.


Re: lint check of update failed, channel failed

2010-01-06 Thread Justin Mason
Hmm.  We can use if can() to work around it...

On Wednesday, January 6, 2010, Mark Martinec mark.martinec...@ijs.si wrote:
 jidanni wrote:

 $ sa-update
 config: failed to parse line, skipping,
  in /tmp/.spamassassin5560GP7SGbtmp/10_default_prefs.cf:
  clear_originating_ip_headers
 config: failed to parse line, skipping,
  in /tmp/.spamassassin5560GP7SGbtmp/10_default_prefs.cf:
  originating_ip_headers X-Yahoo-Post-IP X-Originating-IP X-Apparently-From
 config: failed to parse line, skipping,
  in /tmp/.spamassassin5560GP7SGbtmp/10_default_prefs.cf:
  originating_ip_headers X-SenderIP
 channel: lint check of update failed, channel failed

 It's an unfortunate consequence of pushing the Bug 5895 rule change
 into 10_default_prefs.cf, while you are still running 3.3.0-rc1 or
 older 3.3.0 code.

 The quickest fix would be to install the current SA code from SVN:
   http://wiki.apache.org/spamassassin/DevelopmentStuff
   $ svn checkout https://svn.apache.org/repos/asf/spamassassin/trunk
 with the usual: perl Makefile.PL; make; make install

 ...or just not to worry about the failing sa-update,
 until we come up with a better temporary solution.

   Mark



-- 
--j.


Re: ANNOUNCE: Apache SpamAssassin 3.3.0-rc1 available

2010-01-05 Thread Justin Mason
On Tue, Jan 5, 2010 at 13:40, Jason Bertoch ja...@i6ix.com wrote:
 Warren Togami wrote:

 Apache SpamAssassin 3.3.0-rc1 is now available for testing.


 I've been running 3.3.0-rc1 for a little over a week with no noticeable
 issues, but I've made an observation that I'd like to note.  I have several
 role accounts (hostmaster, abuse, etc) set with the all_spam_to parameter.
  With SA 3.2.5, it appeared that no other tests were run for accounts set
 like this.  However, I'm seeing that although the score is dropped by 100,
 the rest of the regular ruleset is still being applied to the message.  This
 hasn't been problematic thus far, but it was my understanding that remaining
 tests should be skipped to save CPU cycles.  Is this change intentional, a
 bug, or have I lost my mind and it has always worked like this?


Did you have the Shortcircuit plugin active in 3.2.5?

-- 
--j.


Re: Apache SpamAssassin Y2K10 Rule Bug - Update Your Rules Now!

2010-01-05 Thread Justin Mason
On Tue, Jan 5, 2010 at 13:54, Matus UHLAR - fantomas uh...@fantomas.sk wrote:
  After a bit of digging I found that sa-update had, in fact, updated my 
  system
  before I read this.

 On 04.01.10 15:41, Alex wrote:
 sa-update had also updated my system, and amavisd was restarted.
 However, the 72_active.cf in /usr/share/spamassassin somehow overrode
 the updated one from /var/lib/spamassassin/.

 Any idea how this could happen? It was necessary for me to manually
 update the 72_active.cf in /usr/share/spamassassin for this to work.

 there's something broken in your setup (amavis?). Rules from
 /var/lib/spamassassin/ should override /usr/share/spamassassin

one way this could happen: the /var/lib/... rules are from an older
version of SA, but the /usr/share/... rules are from the one
corresponding to what is used when amavisd is run.
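
The update directories in these threads carry the SpamAssassin version in their name (3.2.5 shows up as 3.002005, 3.3.0 as 3.003000), so rules fetched by an older release sit in a different directory from the one the running code looks in. A small sketch of that mapping, assuming the major.minor.patch to x.yyyzzz pattern seen in those paths holds generally; sa_version_dir is a hypothetical helper, not part of SpamAssassin.

```shell
# Turn a dotted version such as 3.2.5 into the directory-style form
# seen in the update paths (3.002005).
sa_version_dir() {
    echo "$1" | awk -F. '{ printf "%d.%03d%03d\n", $1, $2, $3 }'
}

# Usage: see whether updates exist for the version amavisd really runs.
#   ls "/var/lib/spamassassin/$(sa_version_dir 3.2.5)"
```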

--j.


Re: Solaris 10 requires --syslog-socket=native

2009-12-30 Thread Justin Mason
+1 on opening a bug, the sooner that happens the sooner we can get the
fix into the next release ;)

Thanks Henrik, that looks promising (if a little crazy).

On Wed, Dec 30, 2009 at 15:20, Henrik K h...@hege.li wrote:
 On Wed, Dec 30, 2009 at 09:52:50AM -0500, Rosenbaum, Larry M. wrote:
 I have just recently installed SA v3.3.0-rc1 on Solaris 10.  I have 
 discovered that in order for syslog logging to work, I have to start spamd 
 with the switch --syslog-socket=native.  It won't work if I set it to 
 unix or inet or if I omit the switch entirely.  As this is my first time 
 running SpamAssassin on Solaris 10, I don't know if this discovery also 
 applies to older SpamAssassin versions, but I suspect it does.  I suggest 
 the documentation be changed to reflect this, since currently it does not 
 even mention native as a legitimate option.

 In the long term, perhaps native can be made the default, or have the code 
 just not call setlogsock() if the --syslog-socket switch is absent.  (I 
 don't know if this is feasible with older versions of Sys::Syslog.)

 This is what the current docs say:

     --syslog-socket=*type*
         Specify how spamd should send messages to syslogd. The options are
         unix, inet or none. The default is to try unix first,
         falling back to inet if perl detects errors in its unix support.

         Some platforms, or versions of perl, are shipped with dysfunctional
         versions of the Sys::Syslog package which do not support some socket
         types, so you may need to set this. If you get error messages
         regarding __PATH_LOG or similar from spamd, try changing this
         setting.

 FYI the code from postgrey, which works pretty well (sorry no time to check
 what's in current SA):

    if(defined $Sys::Syslog::VERSION and $Sys::Syslog::VERSION ge '0.15'
    and defined $Net::Server::VERSION and $Net::Server::VERSION ge '0.97') {
        # use 'native' when Sys::Syslog >= 0.15
        $syslog_logsock = 'native';
    }
    elsif($^O eq 'solaris') {
        # 'stream' is broken and 'unix' doesn't work on Solaris: only 'inet'
        # seems to be useable with Sys::Syslog < 0.15
        $syslog_logsock = 'inet';
    }
    else {
        $syslog_logsock = 'unix';
    }

 Larry, you should open up a bug..





-- 
--j.


Re: 3.3.0-beta1: What is wrong with my usage of SYSCONFDIR

2009-12-22 Thread Justin Mason
hi -- I think this is harmless.  but it is definitely ugly.  can you
open a bug in the bugzilla for this?

On Wed, Dec 16, 2009 at 18:38, Jens Schleusener
jens.schleuse...@t-systems-sfr.com wrote:
 Hi,

 to install Mail-SpamAssassin SVN versions (and now Mail-SpamAssassin
 3.3.0-beta1) I use the command

  perl Makefile.PL PREFIX=~/sausr_svn SYSCONFDIR=~/saetc_svn

 but I detect always the following output lines after that command is
 claiming about some missing only optional modules:

  warning: some functionality may not be available,
  please read the above report before continuing!

  Checking if your kit is complete...
  Looks good
  'SYSCONFDIR' is not a known MakeMaker parameter name.
  Writing Makefile for Mail::SpamAssassin
  Makefile written by ExtUtils::MakeMaker 6.42

 But after the successive make and make install all seems to work fine
 (using openSUSE 11.2 with perl v5.10.0).

 In the generated Makefile I found according lines like

  #   MakeMaker ARGV: (q[PREFIX=~/sausr_svn], q[SYSCONFDIR=~/saetc_svn])

  PREFIX = /home/my_account/sausr_svn

  cd $(DISTVNAME) && $(ABSPERLRUN) Makefile.PL PREFIX=~/sausr_svn
 SYSCONFDIR=~/saetc_svn

  $(PERLRUN) Makefile.PL PREFIX=~/sausr_svn SYSCONFDIR=~/saetc_svn

  $(NOECHO) $(PERLRUNINST) \
                Makefile.PL DIR= \
                MAKEFILE=$(MAKE_APERL_FILE) LINKTYPE=static \
                MAKEAPERL=1 NORECURS=1 CCCDLFLAGS= \
                PREFIX=~/sausr_svn \
                SYSCONFDIR=~/saetc_svn

 respectively

  #   MakeMaker ARGV: (q[PREFIX=~/sausr_svn], q[SYSCONFDIR=~/saetc_svn])

  SYSCONFDIR = /home/my_account/saetc_svn

  cd $(DISTVNAME) && $(ABSPERLRUN) Makefile.PL PREFIX=~/sausr_svn
 SYSCONFDIR=~/saetc_svn

  $(NOECHO) $(PERLRUNINST) \
                Makefile.PL DIR= \
                MAKEFILE=$(MAKE_APERL_FILE) LINKTYPE=static \
                MAKEAPERL=1 NORECURS=1 CCCDLFLAGS= \
                PREFIX=~/sausr_svn \
                SYSCONFDIR=~/saetc_svn

 So the above error line

  'SYSCONFDIR' is not a known MakeMaker parameter name.

 is only a minor flaw that can be safely ignored?

 Greetings

 Jens





-- 
--j.


Re: habeas - tainted white list

2009-12-18 Thread Justin Mason
On Fri, Dec 18, 2009 at 19:04, Jason Bertoch ja...@i6ix.com wrote:

 John Hardin wrote:

 On Fri, 18 Dec 2009, Jason Bertoch wrote:

  Charles Gregory wrote:


  If a spammer gets an IP blacklisted, at the least DNSWL and HABEAS
  should make note of this and remove the IP


 Or we could have the whitelist rules in a meta such that they only hit
 when a blacklist rule doesn't, if this is a common enough problem.  It might
 also allow people to get past the high negative score for the whitelists.


 That sounds like a good idea to me...


 Is there a way to pull stats on this concept from mass check results or
 would a new rule need to be checked in by a dev?

it can be measured by finding the WL rule's page on
ruleqa.spamassassin.org, then examining the OVERLAP section for overlaps
with BL rules.

-- 
--j.


Re: Dear Santa

2009-12-18 Thread Justin Mason
On Wed, Dec 16, 2009 at 15:31, R-Elists list...@abbacomm.net wrote:



 
  Axb
  PS: If JM posts a link to his Amazon wishlist, maybe we can
  all help him decorate the new place :-)
 
 
 

 +1


hey, if you all insist ;)

http://www.amazon.com/registry/wishlist/1M0UDEXT6A3I7

https://www.amazon.co.uk/registry/wishlist/1G7S5QV025EOX

thanks!  it might help persuade my wife that I need to get that server
reinstalled ;)

-- 
--j.


Re: Reminder:: 3.3.0 pre-release cut: December 17th

2009-12-16 Thread Justin Mason
On Wed, Dec 16, 2009 at 13:59, Kevin A. McGrail kmcgr...@pccc.com wrote:

 It was a good catch that RNBL replaced SSBL but I don't see this as a 3.3.0
 P1 issue.  Same thing with HABEAS/BSP.  Can't these be handled in normal
 course with GA and then sa-update?  I worry we are hurrying these too much.


  This is a reminder of the goal to cut another pre-release December 17th
 either named beta2 or rc1 depending on the above conditions.  It is not a
 big deal if we need to delay the cut to the proposed deadline of December
 23rd.


 I think a beta2 would be more appropriate.  I think rc1 would be after we
 have all the known P1 bugs closed off.


+1.  sa-learn --backup being broken is a big deal.


-- 
--j.


Re: ANNOUNCE: Apache SpamAssassin 3.3.0-beta1 available

2009-12-09 Thread Justin Mason
btw might be worth getting this into a bug.

On Wed, Dec 9, 2009 at 02:30, Mark Martinec mark.martinec...@ijs.si wrote:

  Thanks for testing! Which version of a perl module Time::HiRes
  do you have installed? See what is reported by:
$ perl -MTime::HiRes -le 'print Time::HiRes->VERSION'
  Could you please try upgrading this module if yours is rather old,
  and see if that helps.

 P.S., does the following change to t/timeout.t on your system
 make any difference in test results?

 --- timeout.t   2009-12-09 03:29:12.0 +0100
 +++ timeout.t   2009-12-09 03:29:19.0 +0100
 @@ -23,3 +23,3 @@
  use strict;
 -use Time::HiRes qw(time sleep);
 +use Time::HiRes qw(time sleep alarm);




  Mark




-- 
--j.


Re: J.D. Falk spineless insults (Re: HABEAS_ACCREDITED SPAMMER)

2009-12-04 Thread Justin Mason
On Fri, Dec 4, 2009 at 14:04, rich...@buzzhost.co.uk rich...@buzzhost.co.uk
 wrote:

 On Fri, 2009-12-04 at 06:55 -0700, LuKreme wrote:
  On 3-Dec-2009, at 23:06, R-Elists wrote:
   certainly we understand your point here, yet what about accountability
 for
   Return Path Inc (and other RPI companies) related rules in the default
   Spamassassin configs?
 
 
  My position on HABEAS is well-know by anyone who cares (I score it +0.5
 and +2.0); that's not what I'm talking about: it's the constant whinging by
 richard and falk at each other. Obviously they WANT to be communicating
 since otherwise they could easily ignore/killfile each other. I'm just tired
 of them doing it on this mailinglist.
 
 Your idea of 'constant' amuses me and is stretching the truth
 exponentially.

 I'm curious why a commercial whitelist from a bulk mailing company has
 such a positive inroad in Spamassassin. It's a fair question. I'm not
 interested in your personal views of me, my question or my posting. You
 have a killfile? You able to ignore on subject? Skills you may find
 useful to learn yes?


Richard, quit it.

It's unreasonable to assume that all of the subscribers to this list should
have to listen to, or need to set up a killfile just to avoid, your ranting.


-- 
--j.


Re: sought rules

2009-12-02 Thread Justin Mason
Hi all -

I'm afraid the sought rules, and generally most of my time to work on
SA, is still on a bit of a hiatus due to circumstances out of my
control :(

unfortunately my house renovation is taking longer than planned, and
my net access outside work, at the moment, consists of an iPhone!

Working on anything this way is not particularly practical...




On Wednesday, December 2, 2009, Bob Proulx b...@proulx.com wrote:
 Hi Justin,

 Right now I'm not likely to be able to perform more investigation for a week
 or two. :(

 Any word on your sought rules?  They were working so well for me that
 I miss them now that they aren't getting updated!  Good stuff.

 Sorry about this -- the perils of volunteer infrastructure!

 I don't want to be annoying or be a trouble maker.  I just became
 impatient and decided I would ask about them!  :-)

 Thanks!
 Bob



-- 
--j.


Re: masscheck Dumptext.pm line 26.

2009-11-24 Thread Justin Mason
that's normal.  can be ignored

On Tue, Nov 24, 2009 at 21:04, Yet Another Ninja sa-l...@alexb.ch wrote:

 When running masscheck calling:

 /home/mc/masscheck/spamassassin/trunk/masses  nice ./mass-check \
  --cf='loadplugin Dumptext plugins/Dumptext.pm' \
  --cf='loadplugin Mail::SpamAssassin::Plugin::Check' \
  -j=2 -n -o --rules='^(?!JM_SOUGHT)(?!T_JM_SOUGHT)' \
  spam:dir:/home/mc/Maildir/.SPAM/cur \
   /home/mc/masscheck/seekrules/w.s )

 I get this output and am totally stumped:

 Wide character in print at
 /home/mc/masscheck/spamassassin/trunk/masses/plugins/Dumptext.pm line 26.

 anybody any ideas?

 thx
 Axb
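
As an illustration of the --rules pattern in the command above: the two negative lookaheads accept every rule whose name does not start with JM_SOUGHT or T_JM_SOUGHT. The sketch below applies the same pattern in Python. The rule names in the list are made up for the example, and how mass-check itself applies the pattern is assumed here, not confirmed.

```python
import re

# Same pattern as the --rules argument quoted above.
RULES_PATTERN = re.compile(r'^(?!JM_SOUGHT)(?!T_JM_SOUGHT)')

def select_rules(rule_names, pattern=RULES_PATTERN):
    """Return the rule names the pattern accepts, in their original order."""
    return [name for name in rule_names if pattern.search(name)]

# Hypothetical rule names, for illustration only.
example = ["JM_SOUGHT_1", "T_JM_SOUGHT_2", "RCVD_IN_PBL", "BAYES_99"]
print(select_rules(example))
```

Each lookahead is anchored at the start of the name, so a rule is rejected as soon as either prefix matches.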




-- 
--j.


Re: rspamd?

2009-11-19 Thread Justin Mason
If they are from sa, that's not good; in our case rules == source.

On Thursday, November 19, 2009, Adam Katz antis...@khopis.com wrote:
 http://cebka.pp.ru/trac/wiki/RspamdFeatures contains:
 Rspamd is anti-spam system that is designed to work faster than
 spamassassin by using event model and regular expressions
 optimization. ... With similar rules rspamd is about ten times
 faster than spamassassin.

 A quick browse of the source tarball shows the conf directory has a
 bunch of .inc files that have some regular expressions and even meta
 rules.

 Two thoughts:
 1. Any speed tips to be gleaned?  (Unlikely given C vs Perl)
 2. Any useful rules to borrow?

 ... actually, a closer look at those rules reveals some or all of them
 likely came /from/ SpamAssassin.  Sadly, this raises issues about
 license violation; no credit is given (a violation of copyright), and
 I'm not sure about whether software licensed under the Apache License
 v2 can legally be relicensed under the Original BSD License as this
 software is.  While http://www.apache.org/legal/resolved.html talks
 about incorporating sources of various licenses in Apache-Licensed
 software, it does not address the other direction.



-- 
--j.


Re: Back on DNSBL overlap

2009-11-18 Thread Justin Mason
can't type much as i've broken my elbow (oh noes!) -- but we talked in
the past about using an LR engine for rescoring.  not sure if that got
anywhere though.

btw be aware also that there was a perceptron rescorer, but it
produced more fragile scores than the ga; see 3.2.0 rescoring ticket
for history

--j

On Tue, Nov 17, 2009 at 03:22, Warren Togami wtog...@redhat.com wrote:
 On 11/16/2009 07:26 PM, Adam Katz wrote:

 My hypothesis, which I've anecdotally proven on my own deployment, is
 that the flaws are repeated as well.  Spammers that trigger spamtraps
 on multiple DNSBLs (and URIBLs) may be sending from (or linking to)
 servers that also deal with legitimate traffic.  This means that
 thanks to these similar indexing techniques, DNSBL overlap from
 spammers' abuse of a non-spam-exclusive server can single-handedly
 mark a ham as spam.

 My solution is to counter-intuitively *remove* points from messages
 that hit too many DNSBLs.  They still net quite a positive score, but
 that score is effectively capped at something not quite high enough to
 kill a ham with DNSBLs alone.

 A more elegant version of this, which Karsten and I theorize might
 even happen automatically (as scored by the GA) if I were to check my
 adjustor into SVN, would be to reduce most of the points on the DNSBLs
 and add them back with a meta rule containing a union of the DNSBL
 rules (without a multiple tflag).
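
The capping idea described above can be sketched roughly as follows. This is only an illustration of the arithmetic, not the adjustor rule itself: the cap of 4.5 and the per-list scores are hypothetical values chosen so that DNSBL hits alone stay under the usual 5.0 threshold.

```python
def capped_dnsbl_score(hit_scores, cap=4.5):
    """Sum the scores of the DNSBL rules that hit, but never exceed `cap`.

    `cap` is a hypothetical value, picked to sit below a 5.0 threshold.
    """
    return min(sum(hit_scores), cap)

def adjustment(hit_scores, cap=4.5):
    """Negative score a counter-balancing rule would need to contribute."""
    return capped_dnsbl_score(hit_scores, cap) - sum(hit_scores)

# Hypothetical scores for four DNSBL rules hitting the same message.
hits = [1.5, 1.5, 2.0, 1.0]
print(capped_dnsbl_score(hits), adjustment(hits))
```

A message hitting one list keeps its full score; only the pile-up from several overlapping lists is trimmed.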

 I think there is a lot of merit to this approach, and it might even be a
 great idea.  But I spoke with a machine learning expert and heard some
 interesting things on this topic.

 We held a small workshop yesterday in which she explained Logistic
 Regression and how it might be applied to automated rescoring of
 spamassassin's rules.  The most intriguing aspect of her explanation was the
 suggestion of using a logarithmic function in weight scoring.  I asked
 specifically about this issue of overlap (like BRBL_LASTEXT with every other
 list) and she suggested this particular method of rescoring wouldn't have an
 issue with overlap.

 I believe you mentioned logarithmic scoring in an earlier discussion?

 It appears that we have a few very smart people interested in implementing
 an alternative rescorer using Logistic Regression.  We plan on using an
 existing library for the bulk of this implementation.

 I think we should proceed with our current generated scores for 3.3.0. After
 that we can compare the effectiveness of different approaches including your
 proposal.

 Specifically on the issue of overlapping DNSBL's, there might be a few
 possibilities:

 * Overlapping DNSBL's really is no problem with any method of scoring.
 * Overlapping DNSBL's is only a slight problem with any method of scoring,
 but if a host is blacklisted with more than one major DNSBL they have
 serious issues they need to fix and we shouldn't try to workaround for their
 benefit.
 * Overlapping DNSBL's is a real problem, but logarithmic scoring avoids it
 as an issue.

 rulesrc/sandbox/jm/20_bug_5984.cf:# score RCVD_IN_BRBL_LASTEXT 2.0

 This apparently was set manually.  It appears that spamassassin-3.2.x was
 not scored when BRBL existed as a rule.  Meanwhile our new GA scores
 resulted in:

 score RCVD_IN_BRBL_LASTEXT 0 1.644 0 1.449 # n=0 n=2

 This is relatively modest.  This combined with one other DNSBL alone will
 not push it clearly above 5 points.  I might suggest manually adjusting down
 BRBL or PBL so it requires one additional tiny score to push it over the
 edge.  I'm personally comfortable enough to outright reject mail from a
 Spamhaus listed host.  Given this bias, it is sufficiently cautious in my
 book to accept PBL + BRBL as insufficient.
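
For readers unfamiliar with the four-value score line quoted above: the score directive takes either one value or four, one per score set (0: Bayes off, net off; 1: Bayes off, net on; 2: Bayes on, net off; 3: Bayes on, net on). A minimal sketch of picking the applicable value, with the function name invented for illustration:

```python
def pick_score(score_line, use_bayes, use_net):
    """Return the score that applies for the given Bayes/network settings."""
    # Drop any trailing comment, then split into: 'score', NAME, values...
    parts = score_line.split('#', 1)[0].split()
    values = [float(v) for v in parts[2:]]
    if len(values) == 1:
        # A single value applies to all four score sets.
        values = values * 4
    index = (2 if use_bayes else 0) + (1 if use_net else 0)
    return values[index]

line = "score RCVD_IN_BRBL_LASTEXT 0 1.644 0 1.449 # n=0 n=2"
print(pick_score(line, use_bayes=True, use_net=True))
```

The zeros in positions 0 and 2 reflect that a DNSBL rule cannot fire when network tests are disabled.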

 Warren Togami
 wtog...@redhat.com





-- 
--j.


Re: DNSBL Comparison 20091114

2009-11-16 Thread Justin Mason
First -- my name is not Jim.  Secondly -- I don't care what Spamhaus
does, I'm asking what you suggest SpamAssassin do to measure FPs.

--j.

On Mon, Nov 16, 2009 at 06:00, rich...@buzzhost.co.uk
rich...@buzzhost.co.uk wrote:
 On Sun, 2009-11-15 at 20:34 +, Justin Mason wrote:
 On Sun, Nov 15, 2009 at 08:53, rich...@buzzhost.co.uk
 rich...@buzzhost.co.uk wrote:
  On Sun, 2009-11-15 at 03:14 -0500, Warren Togami wrote:
  http://mail-archives.apache.org/mod_mbox/spamassassin-users/200910.mbox/%3c4ad11c44.9030...@redhat.com%3e
  Compare this report to a similar report last month.
 
  http://wiki.apache.org/spamassassin/NightlyMassCheck
  The results below are only as good as the data submitted by nightly
  masscheck volunteers.  Please join us in nightly masschecks to increase
    the sample size of the corpora so we can have greater confidence in
  the nightly statistics.
 
  http://ruleqa.spamassassin.org/20091114-r836144-n
  Spam 131399 messages from 18 users
  Ham  189948 messages from 18 users
 
  
  DNSBL lastexternal by Safety
  
  SPAM%    HAM%    RANK RULE
  12.8342% 0.0021% 0.94 RCVD_IN_PSBL *
  12.3053% 0.0026% 0.94 RCVD_IN_XBL
  31.2499% 0.0827% 0.87 RCVD_IN_ANBREP_BL *2
  80.2578% 0.1485% 0.86 RCVD_IN_PBL
  27.1836% 0.1985% 0.79 RCVD_IN_SORBS_DUL
  19.8213% 0.1785% 0.79 RCVD_IN_SEMBLACK *
  90.9360% 0.3854% 0.77 RCVD_IN_BRBL_LASTEXT
  13.0564% 0.4838% 0.67 RCVD_IN_HOSTKARMA_BL *
 
  Commentary:
  * PSBL and XBL lead in apparent safety.
  * ANBREP was added after the October report and has made a surprisingly
  strong showing in this past month.  ANBREP is currently unavailable to
  the general public.  The list owner is thinking about going public with
  the list, which I would encourage because they are clearly doing
  something right.  It seems he would need a global network of automated
  mirrors to be able to scale.  He would also need listing/delisting
  policy clearly stated on a web page somewhere.
  * SEMBLACK consistently has been performing adequately in safety while
  catching a respectable amount of spam.  I personally use this
  non-default blacklist.
  * It is clear that the two main blacklists are Spamhaus and BRBL.  The
  Zen combination of Spamhaus zones is extremely effective and generally
  safe.  BRBL has a high hit rate as well, with a moderate safety rating.
  * HOSTKARMA_BL ranks dead last in safety for the past several weeks in a
  row, while not being more effective against spam than PSBL, XBL or 
  SEMBLACK.
 
  ===
  HOSTKARMA_BL much better as URIBL
  ===
  SPAM%    HAM%    RANK RULE
  68.3651% 0.2806% 0.79 URIBL_HOSTKARMA_BL *
 
  Commentary:
  While HOSTKARMA_BL is pretty unsafe as a plain DNSBL, it is surprisingly
  effective as a URIBL.  This is curious as it seems it was not designed
  to be used as a URIBL.  In any case as long as our masschecks show good
  statistics like this, I will personally use this on my own spamassassin
  server.
 
  =
  SPAMCOP Dangerous?
  =
  SPAM%    HAM%    RANK RULE
  17.4225% 2.6076% 0.56 RCVD_IN_BL_SPAMCOP_NET *
 
  Commentary:
  Is Spamcop seriously this bad?  It consistently has shown a high false
  positive rate in these past weeks.  Was it safer than this in the past
  to warrant the current high score in spamassassin-3.2.5?
 
  Warren Togami
  wtog...@redhat.com
 
  Is it not a bit flawed to do the metrics on volunteer submissions, given
  that Spamhaus is said to have a small army of them? It means the data
  cannot be relied upon as any kind of sensible comparison.

 please explain.  How would you suggest measuring false positives?

 Do you think that volunteer submissions are an accurate way to do them,
 or do you think that is open to abuse?

 For example, say I am Steve Linford with a small army of volunteers. I
 get a few false positives come in from Spamhaus, and a few from SORBS.
 What is my inclination when I submit the data?

 It takes only a small amount of research and a trawl through the NANAE
 archives to get a handle on the problem, and the general abuse and
 nefarious goings on with DNSBL volunteers. It is fair to say that there
 is not much love lost.

 I'm not pretending I have the answers, so it's probably better to take
 these lists with a large bucket of salt and find how any given DNSBL
 list works for a given organisation.

 In a world where presidents and world leaders in America, Zimbabwe and
 Afghanistan get 'elected' on tainted data, some random RBL 'comparison'
 list is trivial by comparison. It must, however, be duly remembered
 that there are many competing 'sides' in the world of the DNSBL's, each
 looking to do the other discredit.

 Perhaps Jim, as you posed the question - you have some strong feelings
 on the matter that you would like to share?





-- 
--j.


Re: DNSBL Comparison 20091114

2009-11-15 Thread Justin Mason
On Sun, Nov 15, 2009 at 08:53, rich...@buzzhost.co.uk
rich...@buzzhost.co.uk wrote:
 On Sun, 2009-11-15 at 03:14 -0500, Warren Togami wrote:
 http://mail-archives.apache.org/mod_mbox/spamassassin-users/200910.mbox/%3c4ad11c44.9030...@redhat.com%3e
 Compare this report to a similar report last month.

 http://wiki.apache.org/spamassassin/NightlyMassCheck
 The results below are only as good as the data submitted by nightly
 masscheck volunteers.  Please join us in nightly masschecks to increase
   the sample size of the corpora so we can have greater confidence in
 the nightly statistics.

 http://ruleqa.spamassassin.org/20091114-r836144-n
 Spam 131399 messages from 18 users
 Ham  189948 messages from 18 users

 
 DNSBL lastexternal by Safety
 
 SPAM%    HAM%    RANK RULE
 12.8342% 0.0021% 0.94 RCVD_IN_PSBL *
 12.3053% 0.0026% 0.94 RCVD_IN_XBL
 31.2499% 0.0827% 0.87 RCVD_IN_ANBREP_BL *2
 80.2578% 0.1485% 0.86 RCVD_IN_PBL
 27.1836% 0.1985% 0.79 RCVD_IN_SORBS_DUL
 19.8213% 0.1785% 0.79 RCVD_IN_SEMBLACK *
 90.9360% 0.3854% 0.77 RCVD_IN_BRBL_LASTEXT
 13.0564% 0.4838% 0.67 RCVD_IN_HOSTKARMA_BL *

 Commentary:
 * PSBL and XBL lead in apparent safety.
 * ANBREP was added after the October report and has made a surprisingly
 strong showing in this past month.  ANBREP is currently unavailable to
 the general public.  The list owner is thinking about going public with
 the list, which I would encourage because they are clearly doing
 something right.  It seems he would need a global network of automated
 mirrors to be able to scale.  He would also need listing/delisting
 policy clearly stated on a web page somewhere.
 * SEMBLACK consistently has been performing adequately in safety while
 catching a respectable amount of spam.  I personally use this
 non-default blacklist.
 * It is clear that the two main blacklists are Spamhaus and BRBL.  The
 Zen combination of Spamhaus zones is extremely effective and generally
 safe.  BRBL has a high hit rate as well, with a moderate safety rating.
 * HOSTKARMA_BL ranks dead last in safety for the past several weeks in a
 row, while not being more effective against spam than PSBL, XBL or SEMBLACK.

 ===
 HOSTKARMA_BL much better as URIBL
 ===
 SPAM%    HAM%    RANK RULE
 68.3651% 0.2806% 0.79 URIBL_HOSTKARMA_BL *

 Commentary:
 While HOSTKARMA_BL is pretty unsafe as a plain DNSBL, it is surprisingly
 effective as a URIBL.  This is curious as it seems it was not designed
 to be used as a URIBL.  In any case as long as our masschecks show good
 statistics like this, I will personally use this on my own spamassassin
 server.

 =
 SPAMCOP Dangerous?
 =
 SPAM%    HAM%    RANK RULE
 17.4225% 2.6076% 0.56 RCVD_IN_BL_SPAMCOP_NET *

 Commentary:
 Is Spamcop seriously this bad?  It consistently has shown a high false
 positive rate in these past weeks.  Was it safer than this in the past
 to warrant the current high score in spamassassin-3.2.5?

 Warren Togami
 wtog...@redhat.com

 Is it not a bit flawed to do the metrics on volunteer submissions, given
 that Spamhaus is said to have a small army of them? It means the data
 cannot be relied upon as any kind of sensible comparison.

please explain.  How would you suggest measuring false positives?

-- 
--j.


Re: DNSBL Comparison 20091114

2009-11-15 Thread Justin Mason
 SPAM%    HAM%    RANK RULE
 12.8342% 0.0021% 0.94 RCVD_IN_PSBL *
 12.3053% 0.0026% 0.94 RCVD_IN_XBL
 31.2499% 0.0827% 0.87 RCVD_IN_ANBREP_BL *2
 80.2578% 0.1485% 0.86 RCVD_IN_PBL
 27.1836% 0.1985% 0.79 RCVD_IN_SORBS_DUL
 19.8213% 0.1785% 0.79 RCVD_IN_SEMBLACK *
 90.9360% 0.3854% 0.77 RCVD_IN_BRBL_LASTEXT
 13.0564% 0.4838% 0.67 RCVD_IN_HOSTKARMA_BL *

hi Warren --

any chance you could post the S/O ratios?  RANK is a bit unportable,
as it depends on other rules in the ruleset at the time the
measurement takes place.
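
For reference, S/O can be estimated directly from the two percentage columns in the table above. The sketch assumes the usual definition, spam hit rate divided by the sum of the spam and ham hit rates, which normalizes for differing corpus sizes; treat that formula as an assumption and check it against the ruleqa output.

```python
def so_ratio(spam_pct, ham_pct):
    """Fraction of a rule's (normalized) hits that are spam, or None if no hits."""
    total = spam_pct + ham_pct
    if total == 0:
        return None
    return spam_pct / total

# (SPAM%, HAM%) pairs copied from the 20091114 report.
rows = {
    "RCVD_IN_PSBL": (12.8342, 0.0021),
    "RCVD_IN_BRBL_LASTEXT": (90.9360, 0.3854),
    "RCVD_IN_BL_SPAMCOP_NET": (17.4225, 2.6076),
}
for rule, (spam, ham) in rows.items():
    print(f"{rule}: {so_ratio(spam, ham):.4f}")
```

Unlike RANK, this figure depends only on the rule's own hit rates, so it can be compared across runs.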

--j.


Re: sought rules

2009-11-11 Thread Justin Mason
On Wed, Nov 11, 2009 at 14:04, Bowie Bailey bowie_bai...@buc.com wrote:
 john ffitch wrote:
 Have I missed something?  I used to pull the sought rules daily, but
 nothing seems to have changed since 2 Nov.  Is that expected behaviour?
 ==John ffitch


 No, that's not expected behavior...

 On Thu, 5 Nov 2009, Justin Mason wrote:
 Right now, SOUGHT appears to be broken.  I need to get to where the
 server is currently and fix it -- I don't have remote login to it at the
 mo :(

 And that's about all we know at the moment.

Yep -- sorry -- I got to reboot the server, but it appears to have not
fixed the problem.
Right now I'm not likely to be able to perform more investigation for a week
or two. :(

Sorry about this -- the perils of volunteer infrastructure!

-- 
--j.


Re: sought rules

2009-11-11 Thread Justin Mason
Hi guys --

the problem is that SOUGHT uses gigabytes of private mail, so running
that on a shared host is not viable. Currently we don't have anything
like that I can use :(

On Wednesday, November 11, 2009, George R. Kasica geor...@netwrx1.com wrote:
On Wed, 11 Nov 2009 12:09:09 -0500, you wrote:

Hi,

 Yep -- sorry -- I got to reboot the server, but it appears to have not
 fixed the problem.
 Right now I'm not likely to be able to perform more investigation for a week
 or two. :(

 Sorry about this -- the perils of volunteer infrastructure!

Where is it physically located? Isn't there someone in the area that
you trust, or could trust, to go and fix it? I guess if there was, you
would have done that, but I'm sure you could find some volunteers to
put it up in a more centrally-located or managed location for the
future, if you'd like.

Off-site backup? At the least, I'm sure someone could contribute
there. I've got a few servers, and would be happy to provide remote
ssh/rsync access to someone, should you like.

 True... what do you need to host this thing... if I can help out with
 space/bandwidth I'd be willing. I've got a couple linux boxes here
 that I could give you some space on.

 George
 --
 ===[George R. Kasica]===        +1 262 677 0766
 President                       +1 206 374 6482 FAX
 Netwrx Consulting Inc.          Jackson, WI USA
 http://www.netwrx1.com
 geor...@netwrx1.com
 ICQ #12862186



-- 
--j.


Re: sought rules

2009-11-05 Thread Justin Mason
I need the full mails to do that -- but with the uploaded mail, yes,
I should do that!
good point.

Right now, SOUGHT appears to be broken.  I need to get to where the server is
currently and fix it -- I don't have remote login to it at the mo :(

On Thu, Nov 5, 2009 at 18:02, John Hardin jhar...@impsec.org wrote:
 On Thu, 5 Nov 2009, Dave Pooser wrote:

 I think I remember hearing some discussion about that at one point.  I
 don't think that type of thing is as big of a concern here since these are
 all body rules.  I agree that you need a good corpus of ham to prevent FP's,
 but I'm sure Justin is doing that.

 I'm sure he's working hard on it, but his ability is naturally going to be
 limited by his ham corpus. I just saw a whole bunch of legit AmEx corporate
 card updates get thrown into the quarantine bin due to hitting SOUGHT. It
 happens sometimes; I've found that when I send him a sample of the mistagged
 email he gets it fixed pretty quickly.

 I wonder if he's considered running the ham exclusion against the complete
 nightly masscheck ham corpora...?

 --
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
 ---
  Men by their constitutions are naturally divided in to two parties:
  1. Those who fear and distrust the people and wish to draw all
  powers from them into the hands of the higher classes. 2. Those who
  identify themselves with the people, have confidence in them,
  cherish and consider them as the most honest and safe, although not
  the most wise, depository of the public interests.
                                                  -- Thomas Jefferson
 ---
  6 days until Veterans Day





-- 
--j.


Re: sought rules

2009-11-05 Thread Justin Mason
On Fri, Nov 6, 2009 at 00:00, John Hardin jhar...@impsec.org wrote:
 On Thu, 5 Nov 2009, Justin Mason wrote:

 I need the full mails to do that -- but with the uploaded mail, yes, I
 should do that! good point.

 Glad to help.

 Right now, SOUGHT appears to be broken.  I need to get to where the server
 is currently and fix it -- I don't have remote login to it at the mo :(

 ruleqa/nightly masscheck seems to have fallen apart, too. :(

I think I may have just fixed that, fingers crossed.

-- 
--j.


Re: Other DNSBL's

2009-10-19 Thread Justin Mason
(back from vacation ;)

BTW, could you add

  tflags nopublish

to any rules?  or use a T_ prefix on the rule names.  that will ensure
the testing rules won't get into any published ruleset
accidentally.  this is very important to avoid accidentally causing a
production-level DOS on the BL's servers
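
To make the two conventions concrete, here is a small sketch of the check they imply. It is not the actual publishing script; the function name is hypothetical and the rule names in the usage lines are made up.

```python
def is_publishable(rule_name, tflags=""):
    """Return False for rules marked as testing-only by either convention.

    `tflags` is the whitespace-separated flag string from a 'tflags' line.
    """
    if rule_name.startswith("T_"):
        return False
    if "nopublish" in tflags.split():
        return False
    return True

# Hypothetical rule names, for illustration only.
print(is_publishable("T_RCVD_IN_EXAMPLE_BL", "net"))
print(is_publishable("RCVD_IN_EXAMPLE_BL", "net nopublish"))
print(is_publishable("RCVD_IN_EXAMPLE_BL", "net"))
```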


--j.

On Fri, Oct 16, 2009 at 14:41, Warren Togami wtog...@redhat.com wrote:
 I'm looking to add other DNSBL's to tomorrow's weekly mass check.  I realize
 most of them probably are too broken to bother, but it would be nice to get
 some real numbers to confirm it so since the Internet lacks any real DNSBL
 comparisons that include Ham FP safety.

 http://antispam.imp.ch/06-dnsbl.html
 This one seems to have 3% of the hits compared to PSBL, so I am not
 bothering to test it in masscheck.

 http://bl.csma.biz/
 It seems that this blacklist is simply dead.  Zero hits on their SBL list
 within the last day.

 Any other DNSBL's out there that you folks use that are worth comparing?

 Warren Togami
 wtog...@redhat.com





-- 
--j.


Re: SA 3.3.0 and sa-compile

2009-10-01 Thread Justin Mason
On Thu, Oct 1, 2009 at 16:15, John Hardin jhar...@impsec.org wrote:
 On Thu, 1 Oct 2009, Zdenek Herman wrote:

 I have same problem.
 Any solution ?

 to...@starbridge.org napsal(a):

 i'm running SA 3.3.0 (3.3.0-alpha3-r808953) and i've some problem with
 compiled rules.

 sa-compile runs without errors, and SA seems to works fine when
 restarted. But some body rules are now not detected.

 A suggestion to both of you, based on sa-compile support requests seen
 earlier on the list: run sa-compile with the debug option turned on, publish
 the debugging output and intermediate files on a webserver somewhere, and
 post the URIs for that info here so they can be examined.

even better: open a Bugzilla entry and do the same.  That's how we
track (possible) bugs and prioritize them.

-- 
--j.


Fwd: Episode 17 of the Who and Why Show: SpamAssassin

2009-09-22 Thread Justin Mason
Hi all --

FYI, I had a chat with Steve Santorelli a while back for Team Cymru's
'Who and Why Show' podcast -- an occasional series designed to directly
assist network administrators, looking at various Open Source Tools that
can be used to check and secure systems.  It's now up:


Today we talk with Justin Mason, the original author of SpamAssassin,
a world class anti-SPAM tool that's fundamentally changed the way we
deal with the SPAM problem.

These short overviews will give you a good grounding in the basics of
the tool, the alternatives and also perhaps enlighten long time users
to some of the upcoming features for future versions.

See it at www.youtube.com/teamcymru


It probably won't be news for this audience, as it's more newbie-oriented...
but anyway  ;)

-- 
--j.


Re: NOTICE: SpamAssassin 3.3.0 mass-checks now starting

2009-09-17 Thread Justin Mason
On Thu, Sep 17, 2009 at 04:01, Warren Togami wtog...@redhat.com wrote:
 On 09/16/2009 11:25 PM, Justin Mason wrote:

 excellent.  That's 2 people who could do with an extension, then!

 Could we state with clarity the new deadline?  I might have other people
 with data depending on the extended deadline.

Let's push it out until Monday.

regarding corpus cleaning, RTFM:
http://wiki.apache.org/spamassassin/CorpusCleaning (linked from the
RescoreDetails page)

-- 
--j.


Re: NOTICE: SpamAssassin 3.3.0 mass-checks now starting

2009-09-16 Thread Justin Mason
On Wed, Sep 16, 2009 at 15:47, Warren Togami wtog...@redhat.com wrote:
 On 09/04/2009 10:51 AM, Justin Mason wrote:

 OK, if you're planning to send us mass-check logs for the
 3.3.0 rescoring, now's the time!

 http://wiki.apache.org/spamassassin/RescoreDetails has all the details.

 cheers!

 --j.

 -rw-r--r--   174911850 2009/09/16 01:03:40 ham-bayes-net-hege.log
 -rw-r--r--    36909774 2009/09/11 20:39:47 ham-bayes-net-mmartinec.log
 -rw-r--r--     3179193 2009/09/14 23:16:15 ham-bayes-net-wt-en1.log
 -rw-r--r--     1591286 2009/09/14 23:24:19 ham-bayes-net-wt-en2.log
 -rw-r--r--     5687443 2009/09/14 23:53:41 ham-bayes-net-wt-en3.log
 -rw-r--r--         354 2009/09/14 23:56:00 ham-bayes-net-wt-en4.log
 -rw-r--r--      575780 2009/09/14 22:13:01 ham-bayes-net-wt-jp1.log
 -rw-r--r--     2139873 2009/09/14 22:23:07 ham-bayes-net-wt-jp2.log
 -rw-r--r--    40760753 2009/09/16 01:04:24 spam-bayes-net-hege.log
 -rw-r--r--    35666309 2009/09/11 20:52:01 spam-bayes-net-mmartinec.log
 -rw-r--r--     4341537 2009/09/14 23:16:16 spam-bayes-net-wt-en1.log
 -rw-r--r--        1576 2009/09/14 23:24:20 spam-bayes-net-wt-en2.log
 -rw-r--r--         310 2009/09/14 23:53:42 spam-bayes-net-wt-en3.log
 -rw-r--r--      494742 2009/09/14 23:56:00 spam-bayes-net-wt-en4.log
 -rw-r--r--       79101 2009/09/14 22:13:02 spam-bayes-net-wt-jp1.log
 -rw-r--r--         311 2009/09/14 22:23:08 spam-bayes-net-wt-jp2.log

 One day from the deadline for spamassassin-3.3.0 scoring and we currently
 have only three people reporting.

Who is running a mass-check that's still in progress?  (fwiw, I am ;)

It'll be at least 5 users (with myself and John), but that's not a
great population of training data.

-- 
--j.


Re: NOTICE: SpamAssassin 3.3.0 mass-checks now starting

2009-09-16 Thread Justin Mason
excellent.  That's 2 people who could do with an extension, then!

On Wed, Sep 16, 2009 at 20:16, Daryl C. W. O'Shea
spamassas...@dostech.ca wrote:
 On 16/09/2009 4:03 PM, Justin Mason wrote:
 Who is running a mass-check that's still in progress?  (fwiw, I am ;)

 I had a NAS failure over the weekend that consumed the time I was
 planning on getting my systems right up-to-date for the mass-check.  I
 now hope to do this Thursday/Friday.  I should be able to scan my
 million or so messages in a day on my cluster.

 Daryl





-- 
--j.


Re: antispam comparison by virus bulletin

2009-09-07 Thread Justin Mason
On Sun, Sep 6, 2009 at 22:59, mouss mo...@ml.netoyen.net wrote:
 Justin Mason a écrit :
 In fairness, they got in touch to ask for help in setting up a more
 recent SA, but none of us (ie the PMC) had the spare cycles to help
 out.  Comparative third-party tests like this always take a lot of
 hand-holding.  We don't have the same kind of marketing budget as the
 commercial companies, needless to say.

 OTOH, I think that McAfee's Email & Web Security Appliance runs on
 SpamAssassin, or at least it did when I worked there ;)

 they acquired Secure Computing. so I'd say the test involved what was
 called Ironmail. Did Ironmail use SA?

from TFA (http://www.theregister.co.uk/2009/09/03/anti_spam_run_off/ ):

'McAfee's Email & Web Security Appliance was alone in achieving the
level needed for VBSpam Platinum certification' [...]

'Five products performed well enough to be awarded VBSpam Gold awards
[...] including [..] McAfee's Email Gateway (formerly IronMail)
software.'

so the ex-IronMail product is still a different product.  they tend to
not do such a great job in product line consolidation, in my
experience... ;)

-- 
--j.


Re: antispam comparison by virus bulletin

2009-09-04 Thread Justin Mason
In fairness, they got in touch to ask for help in setting up a more
recent SA, but none of us (ie the PMC) had the spare cycles to help
out.  Comparative third-party tests like this always take a lot of
hand-holding.  We don't have the same kind of marketing budget as the
commercial companies, needless to say.

OTOH, I think that McAfee's Email & Web Security Appliance runs on
SpamAssassin, or at least it did when I worked there ;)

--j.

On Fri, Sep 4, 2009 at 01:22, Jason Haar jason.h...@trimble.co.nz wrote:
 The Register reports that Virus Bulletin has announced its latest results
 comparing a range of antispam products. McAfee won - and by the looks of it
 SpamAssassin and ClamAV came last.

 *deep breath* the methodology was flawed of course (oh no, I've become One
 of Those...). They chose SuSE10 which came with SA 3.1.8(!!) and didn't even
 think it unfair to compare an old product against current releases of
 commercial products - but there you go... Poor old ClamAV was treated
 similarly: ClamAV is an antivirus product - they actually tested
 Sanesecurity's add-on spam rules. I don't know of anyone using those rules
 who doesn't use them *in addition* to SA... Really didn't know much about
 what they were doing...

 http://www.theregister.co.uk/2009/09/03/anti_spam_run_off/
 http://www.virusbtn.com/vbspam/may2009 (free registration required that gets
 you access to some icons with ticks and crosses in them :-/)

 Hopefully they will do a better job next time - I'd like to see the results
 myself

 --
 Cheers

 Jason Haar
 Information Security Manager, Trimble Navigation Ltd.
 Phone: +64 3 9635 377 Fax: +64 3 9635 417
 PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1




-- 
--j.


NOTICE: SpamAssassin 3.3.0 mass-checks now starting

2009-09-04 Thread Justin Mason
OK, if you're planning to send us mass-check logs for the
3.3.0 rescoring, now's the time!

http://wiki.apache.org/spamassassin/RescoreDetails has all the details.

cheers!

--j.


Re: Rule PTR != localhost

2009-09-03 Thread Justin Mason
On Thu, Sep 3, 2009 at 12:18, Benny Pedersen m...@junc.org wrote:
 On Thu 03 Sep 2009 07:19:35 AM CEST, Clunk Werclick wrote

 Forgive the stupidity of the question, but I'm not sure how to, or even
 if it can be implemented?

 forgive me, why do you want all that crap into your spamassassin when
 postfix can solve it for you without a hick ?

Obvious answer: not everyone who uses SA uses postfix.

-- 
--j.


ANNOUNCE: Apache SpamAssassin 3.3.0-alpha2 available

2009-08-11 Thread Justin Mason
Apache SpamAssassin 3.3.0-alpha2 is now available for testing.

Downloads are available from:
 http://people.apache.org/~jm/devel/

md5sum of archive files:

 1b396a9df1faa22185263c7526fe6042 Mail-SpamAssassin-3.3.0-alpha2.tar.bz2
 fbd0c4016d5d9c5adc3a958105b0b414 Mail-SpamAssassin-3.3.0-alpha2.tar.gz
 ed3ef5bef7c40e690ff80fce762a8302 Mail-SpamAssassin-3.3.0-alpha2.zip
 daaca5fba5787774eb918e1a5e92be6a Mail-SpamAssassin-rules-3.3.0-alpha2.r802600.tgz

sha1sum of archive files:

 ab41278cb0c84c0fe6b38e57662487ea75c499a5 Mail-SpamAssassin-3.3.0-alpha2.tar.bz2
 87bc1e6777065af13a6f8c179636aa22a0644237 Mail-SpamAssassin-3.3.0-alpha2.tar.gz
 e4f08e636cd1f2cd6896e358c380fc952db51ad7 Mail-SpamAssassin-3.3.0-alpha2.zip
 64ff7fb327f0d699c4a600cd1f0f1ba9a64a0ba0 Mail-SpamAssassin-rules-3.3.0-alpha2.r802600.tgz


Note that the *-rules-*.tgz files are only necessary if you cannot, or do not
wish to, run sa-update after install to download the latest fresh rules.
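
If md5sum or sha1sum is not to hand, the checksums listed above can be checked with a few lines of Python. This is a minimal sketch using the standard hashlib module; the helper names are invented for the example, and it assumes the archive has been downloaded into the current directory.

```python
import hashlib

def hex_digest(chunks, algorithm):
    """Hash an iterable of byte chunks with the named algorithm ('md5', 'sha1')."""
    digest = hashlib.new(algorithm)
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()

def file_chunks(path, size=1 << 20):
    """Yield the file's contents in blocks so large archives are not loaded whole."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(size)
            if not block:
                break
            yield block

def matches(path, algorithm, expected_hex):
    """True if the file's digest equals the published hex string."""
    return hex_digest(file_chunks(path), algorithm) == expected_hex.strip().lower()
```

For example, matches("Mail-SpamAssassin-3.3.0-alpha2.tar.gz", "md5", "fbd0c4016d5d9c5adc3a958105b0b414") should return True for an intact download. A matching checksum only shows that the file was not corrupted in transit.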

The release files also have a .asc accompanying them.  The file serves
as an external GPG signature for the given release file.  The signing
key is available via the wwwkeys.pgp.net key server, as well as
http://spamassassin.apache.org/released/GPG-SIGNING-KEY

The key information is:

pub 1024D/265FA05B 2003-06-09 SpamAssassin Signing Key rele...@spamassassin.org
   Key fingerprint = 26C9 00A4 6DD4 0CD5 AD24  F6D7 DEE0 1987 265F A05B

See the INSTALL and UPGRADE files in the distribution for important
installation notes.

Summary of major changes since 3.2.5


Changes to the core code:

[TODO: write changes list]


Re: Parallelizing Spam Assassin

2009-08-01 Thread Justin Mason
On Sat, Aug 1, 2009 at 10:04, Henrik K h...@hege.li wrote:

 On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote:
 Well -- it's not just the cores -- what was the usage of the cores that
 were being used?  were 3 out the 8 'pegged'?  Are these 'real' cores, or
 HT cores?  In the Core2 and P4 archs, HT's actually slowed down a good
 many workloads unless they were tightly constructed to work on the same
 data in cache.  Else, those HT's did just enough extra work to block cache
 contents more than anything else.

 I really doubt there's HT involved in a recent looking 8 core 16GB machine..

 What's the disk I/O look like?  I mean don't just focus on idle cores --
 if the wait is on disk, maybe the cores can't get the data fast enough.

 As we already guessed, AWL (BerkeleyDB) caused disk I/O and slowness. For
 heavy loads you need to use SQL (or maybe the better BDB plugin in 3.3 if we
 get it working).

 If the network is involved, well, that's a drag on any message checking.
 I'm seeing times of .3msgs/sec, but I think that's with networking turned
 on.  Pretty Ugly.

 It affects single messages, but not total throughput. With network checks
 you just dedicate a lot more children. Waiting for network responses takes no
 CPU time, thus you can process more messages simultaneously.

although you will also need to allocate more memory, to ensure that
no swapping takes place.

-- 
--j.


Re: Parallelizing Spam Assassin

2009-07-31 Thread Justin Mason
hi -- turn off Bayes and AWL.

On Fri, Jul 31, 2009 at 07:55, poifgh abhinav.pat...@gmail.com wrote:

 Hi

 I was measuring how quickly could SA [spam assassin] process spams when
 several SA processes are run in parallel over separate mbox files. I used a
 8 core machine. Below are the numbers when I forked different number of
 processes.

 Fork = 8;
 Rate = 57 msgs/sec

 Fork = 4;
 Rate = 44 msgs/sec

 Fork = 1;
 Rate = 22 msgs/sec


 I ran a freshly built SA with Bayes and DNSBL turned off. Why am I not seeing
 a linear increase in the throughput? Is file locking creating the
 bottleneck? If yes, which particular file is being locked? If no, what could
 be the reason for this?
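
To put numbers on "not linear", here is a small sketch that turns the reported rates into speedup and per-process efficiency. Only the three rates come from the message; the function name is made up for illustration.

```python
# Rates reported in the message above: forks -> msgs/sec.
rates = {1: 22, 4: 44, 8: 57}

def scaling(rates):
    base = rates[1]
    rows = []
    for forks, rate in sorted(rates.items()):
        speedup = rate / base            # relative to a single process
        efficiency = speedup / forks     # 1.0 would be perfectly linear
        rows.append((forks, round(speedup, 2), round(efficiency, 2)))
    return rows

for forks, speedup, efficiency in scaling(rates):
    print(f"forks={forks} speedup={speedup} efficiency={efficiency}")
```

Efficiency falling well below 1.0 as forks increase suggests a shared resource is being contended for, which fits the disk I/O explanation offered elsewhere in the thread.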

 thnx
 --
 View this message in context: 
 http://www.nabble.com/Parallelizing-Spam-Assassin-tp24751958p24751958.html
 Sent from the SpamAssassin - Users mailing list archive at Nabble.com.





-- 
--j.


Re: Parallelizing Spam Assassin

2009-07-31 Thread Justin Mason
On Fri, Jul 31, 2009 at 09:32,
rich...@buzzhost.co.uk wrote:
 Imagine what Barracuda Networks could do with that if they did not fill
 their gay little boxes with hardware rubbish from the floors of MSI and
 supermicro. Jesus, try and process that many messages with a $30,000
 Barracuda and watch support bitch 'You are fully scanning to much mail
 and making our rubbish hardware wet the bed.' LOL.

Richard -- please watch your language.   This is a public mailing
list, and offensive language here is inappropriate.

-- 
--j.


Re: Malware list Q

2009-07-24 Thread Justin Mason
looks interesting!  I've asked the developer if he's interested in us
testing it out

On Fri, Jul 24, 2009 at 10:34, Brent Clark brentgclarkl...@gmail.com wrote:
 Hiya

 Do any of you guys use the following list.

 http://malware.hiperlinks.com.br/cgi/submit?action=list_sa

 If so, may I ask how do you find the results, and is it worth adding to
 spamassassin.

 Kind Regards
 Brent Clark





-- 
--j.


Re: boosting PBL score suggestions

2009-07-22 Thread Justin Mason
On Wed, Jul 22, 2009 at 14:41, Aaron Bennett abenn...@clarku.edu wrote:
 Hi,

 We're noticing that much of the spam which makes it through our filter hits
 the spamhaus pbl rule.  However, that rule by itself scores only 0.9.  Since
 we quarantine spam through a web interface (maia), we're pretty tolerant of
 false positives.

 Do any of you folks have a suggestion about raising the RCVD_IN_PBL score?
  I was thinking of raising it as high as 2 or 3.  Another thing I'm
 considering is a META rule that scores for PBL + BAYES_60, etc.

 I am generally reluctant to mess much with the default scoring -- but I'm
 always looking for a better setup.

when that was set a couple of years back, PBL had a few FPs -- the FP
rate has dropped greatly since then, going by recent ruleqa results.
go ahead and bump it up.

-- 
--j.


Re: Return Path Safe whitelist UPDATE [was: Opt In Spam]

2009-07-18 Thread Justin Mason
 If my system detects any HABEAS stuff, I score it with 10.00 and the spam
 is gone.  For a very long time (around 2 years) I moved the messages
 to a separate folder and did not have a single false positive.



obviously you weren't reading this mailing list several years ago,
when several of the SA committers, myself included, used the Habeas
headers in their personal mail...

--j.


Re: The www[variations]continue....

2009-07-17 Thread Justin Mason
On Fri, Jul 17, 2009 at 09:45, Michelle
Konzack linux4miche...@tamay-dogan.net wrote:
 Am 2009-07-17 09:46:28, schrieb Ben:
 Dan,

 Thanks for the rules.

 I am using AE_MED42 from a previous thread, is this AE_MED44 meant
 to replace this or work in addition to it?

 Also just curious, why the low score?  With the default required hits of
 5.0 and this in my setup being the only rule to hit it would not be
 tagged as spam.  Am i missing something or have you lowered your
 required hits?

 I have scored it with 10.00 because the stupid AWL scores -5.00 in 30%
 of all cases.

sounds like you should turn off the AWL?

--j.


Re: How to attach spam messages as HTML instead of TXT

2009-07-15 Thread Justin Mason
On Tue, Jul 14, 2009 at 22:04, Spiro Harvey sp...@knossos.net.nz wrote:
 Did you know...?

 Emails like yours are what we're trying to block on a daily basis.

This is distinctly unhelpful.  Please be courteous when dealing with
public mailing list inquiries, especially when you have no relation to
the Apache SpamAssassin project and your response could misleadingly
be taken to indicate that you do.

Needless to say, there's no indication that Jason's mails _are_ 'what
we're trying to block on a daily basis'.  In fact, we've spent a lot
of effort over the years in allowing non-spam HTML mail through,
instead of making them into false positives.

--j.


Re: sa-compile: resize not found

2009-07-15 Thread Justin Mason
the progress indicators use Term::ReadKey, which (looking at its
source) appears to call resize under certain circumstances.  you
should probably file a bug with them

On Wed, Jul 15, 2009 at 14:28, Michael Scheidell scheid...@secnap.net wrote:
 wondering..

 am I missing something?  'resize: not found'

 google didn't find anything.

 this is on all freebsd systems.

 happens on 32bit (i386) version 6.4, 64bit (amd64), 6.4 and 7.1

 re2c version 0.13.5
 sa-compile  /dev/null
 [5449] info: generic: base extraction starting. this can take a while...
 [5449] info: generic: extracting from rules of type body_0
 resize: not found
 100% [===] 6726.95 rules/sec 00m00s
 DONE
 resize: not found
 100% [== ]  83.16 bases/sec 00m56s
 DONE
 [5449] info: body_0: 3534 base strings extracted in 58 seconds
 [5449] info: generic: extracting from rules of type body_500
 resize: not found
 100% [===]  71.71 rules/sec 00m00s
 DONE
 resize: not found

 --
 Michael Scheidell, CTO
 Phone: 561-999-5000, x 1259
 *| *SECNAP Network Security Corporation

   * Certified SNORT Integrator
   * 2008-9 Hot Company Award Winner, World Executive Alliance
   * Five-Star Partner Program 2009, VARBusiness
   * Best Anti-Spam Product 2008, Network Products Guide
   * King of Spam Filters, SC Magazine 2008

 _
 This email has been scanned and certified safe by SpammerTrap(r). For
 Information please see http://www.secnap.com/products/spammertrap/
 _





-- 
--j.


Re: sa-compile: resize not found

2009-07-15 Thread Justin Mason
On Wed, Jul 15, 2009 at 14:38, Michael Scheidell scheid...@secnap.net wrote:
 'them'?

 men in black?

 freebsd? or CPAN maintainer of Term::ReadKey?

Up to you. ;)  I'd recommend the latter.

--j.

 ps, out of office messages, read receipt and FPS on vabounce.  OOO messages,
 and RR messages cannot be whitelisted by the sending mta since they never
 include the original message.  I think I started some patches on it, but OOO
 and RR messages should be treated differently, even if you save time by not
 wasting time with meta rules comparing them with vbounce_whitelist.


 --
 Michael Scheidell, CTO
 Phone: 561-999-5000, x 1259
 | SECNAP Network Security Corporation

 Certified SNORT Integrator
 2008-9 Hot Company Award Winner, World Executive Alliance
 Five-Star Partner Program 2009, VARBusiness
 Best Anti-Spam Product 2008, Network Products Guide
 King of Spam Filters, SC Magazine 2008

 

 This email has been scanned and certified safe by SpammerTrap®.

 For Information please see http://www.secnap.com/products/spammertrap/
 




-- 
--j.


Re: Spam Filter Law Suit

2009-07-15 Thread Justin Mason
Hi Damian --

Our first impression: somebody other than us is suing somebody other
than us about a matter that may be entirely unrelated to anything we
produce.  Unless we have a specific reason to believe that a specific
patent is likely to be enforced against either us or a downstream user
(and, no, one generally can't glean that from the title) there is
nothing we should do at this time.

Sorry about that

--j.

On Tue, Jul 14, 2009 at 19:59, Damian
Mendoza dam...@exceleratesoftware.com wrote:
 Anyone else being sued by Southwest Technology Innovations regarding spam
 filtering? It’s odd that they would name my old company (Workgroup
 Solutions) since they have very few installations (2 person reseller)
 compared to the others named. Any opinions or feedback?



 http://www.faqs.org/patents/app/20090100138



 Southwest Technology Innovations LLC v. St. Bernard Software, Inc. et al

 Plaintiff:

 Southwest Technology Innovations LLC

 Defendant:

 St. Bernard Software, Inc., Espion International, Inc., Workgroup Solutions,
 Inc., Sonicwall, Inc., Mirapoint Software, Inc. and Proofpoint, Inc.



 Case Number:

 3:2009cv01487

 Filed:

 July 9, 2009



 Court:

 California Southern District Court

 Office:

 San Diego Office

 County:

 San Diego

 Presiding Judge:

 Judge John A. Houston

 Referring Judge:

 Magistrate Judge Jan M. Adler



 Nature of Suit:

 Intellectual Property - Patent

 Cause:

 35:271 Patent Infringement

 Jurisdiction:

 Federal Question

 Jury Demanded By:

 Plaintiff





-- 
--j.


Re: deactivate all checks except specific tests

2009-07-14 Thread Justin Mason
sorry about the double-post -- original message was stuck in moderation queue.

On Fri, Jul 10, 2009 at 18:20,
sebast...@debianfan.de wrote:
 Hello,

 i have set up a virtual server for experiments.

 I want to disable all the spamassassin tests - except one specific rbl - in
 this topic-  the manitu rbl.

 Is there a parameter for disabling all the tests?

 Thx

 Sebastian




Re: Extending XBL to all untrusted

2009-07-13 Thread Justin Mason
On Fri, Jul 3, 2009 at 22:43, RW rwmailli...@googlemail.com wrote:

 I think it might be worth having 2 XBL tests, a high scoring test on
 last-external and a lower-scoring test that goes back through the
 untrusted headers.

 I understand that Spamhaus doesn't recommend this, because dynamic IP
 addresses can be reassigned from a spambot to another user, but I added
 my own rule it does seem to work. In my mail it hits about 9% of my
 spam, with zero false-positives. I suspect that part of this is down to
 UK dynamic addresses being very sticky, but I ran my mailing lists
 through SA for a few weeks and got 3 FPs out of ~2400.

 I think it's probably worth a point or so, and essentially it's free
 - all of the zen lookups get done for SBL.

we used to do it this way, but the FPs are (surprisingly) high due to
dynamic-address-pool churn.

compare:
OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  5.100  10.1740   0.0200    0.998   0.65    0.01  T_RCVD_IN_XBL  (with trusted-networks)
  5.417  10.6074   0.2203    0.980   0.18    0.00  RCVD_IN_XBL  (with all nets)

I'll forward on the old mail for hysterical raisins.
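
For readers unfamiliar with the S/O column, a quick sketch of how the two rows compare. It assumes S/O is computed as SPAM% / (SPAM% + HAM%); that formula is my reading of the output rather than something stated in this thread, though it reproduces the values shown.

```python
def s_over_o(spam_pct, ham_pct):
    """Fraction of a rule's hits that are spam, using per-corpus hit rates."""
    return spam_pct / (spam_pct + ham_pct)

# (SPAM%, HAM%) for the two rules compared above.
rows = {
    "T_RCVD_IN_XBL": (10.1740, 0.0200),
    "RCVD_IN_XBL": (10.6074, 0.2203),
}
for name, (spam_pct, ham_pct) in rows.items():
    print(f"{name}: S/O = {s_over_o(spam_pct, ham_pct):.3f}")
```

Checking all untrusted relays buys about 0.4 percentage points more spam hits, but the ham hit rate goes up roughly elevenfold, which is the FP cost described above.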

--j.


Fwd: DNSBL accuracy using -firsttrusted

2009-07-13 Thread Justin Mason
that old message I was talking about.


-- Forwarded message --
From: Daniel Quinlan quin...@pathname.com
Date: Sat, May 22, 2004 at 16:25
Subject: DNSBL accuracy using -firsttrusted
To: spamassassin-...@incubator.apache.org


Someone at Spamhaus poked me to try testing only the last IP address
with XBL and I tested it and it helps reduce false positives quite
nicely.  The concept with XBL is that if it came most recently from an
okay host, then the message is probably okay too.  It's a bit spooky but
it works and I suppose it is closer in behavior to how blacklists are
generally used at connect time, so perhaps most are tuned to be used
this way.

The main caveat is that if trusted networks is not guessed or set
correctly, then *no* blacklist hits will happen and the net score set
will be used to the detriment of the site.

I tried the same idea on more or less every applicable blacklist and
check out the results:

--- start of cut text --
OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 29979    14999    14980    0.500   0.00    0.00  (all messages)
100.000  50.0317  49.9683    0.500   0.00    0.00  (all messages as %)

 12.212  24.4083   0.    1.000   1.00    0.01  T_RCVD_IN_NJABL_PROXY
 12.962  25.7951   0.1135    0.996   0.57    0.00  RCVD_IN_NJABL_PROXY

 18.186  36.3291   0.0200    0.999   0.95    1.00  __T_RCVD_IN_NJABL
 19.877  38.1225   1.6088    0.960   0.30    1.00  __RCVD_IN_NJABL

 8.613  17.2145   0.    1.000   0.91    0.01  T_RCVD_IN_SORBS_MISC
 9.136  18.2412   0.0200    0.999   0.80    0.00  RCVD_IN_SORBS_MISC

 29.124  58.1705   0.0401    0.999   0.90    0.01  T_RCVD_IN_DSBL
 30.395  60.2640   0.4873    0.992   0.43    0.00  RCVD_IN_DSBL

 7.966  15.9211   0.    1.000   0.87    0.01  T_RCVD_IN_SORBS_HTTP
 8.449  16.8011   0.0868    0.995   0.49    0.00  RCVD_IN_SORBS_HTTP

 5.337  10.6540   0.0134    0.999   0.74    0.01  T_RCVD_IN_RFCI
 7.162  12.3675   1.9493    0.864   0.00    0.00  RCVD_IN_RFCI

 9.804  19.5613   0.0334    0.998   0.73    0.01  T_RCVD_IN_SBL
 9.927  19.7747   0.0668    0.997   0.62    0.00  RCVD_IN_SBL

 14.610  29.1486   0.0534    0.998   0.73    1.00  __T_RCVD_IN_SBL_XBL
 15.044  29.7820   0.2870    0.990   0.35    1.00  __RCVD_IN_SBL_XBL

 3.116   6.2204   0.0067    0.999   0.72    0.00  RCVD_IN_NJABL_SPAM
 3.062   6.1137   0.0067    0.999   0.70    0.01  T_RCVD_IN_NJABL_SPAM

 2.055   4.1069   0.    1.000   0.66    0.01  T_RCVD_IN_BL_SPAMCOP_NET
 2.235   4.3070   0.1602    0.964   0.14    0.00  RCVD_IN_BL_SPAMCOP_NET

 5.100  10.1740   0.0200    0.998   0.65    0.01  T_RCVD_IN_XBL
 5.417  10.6074   0.2203    0.980   0.18    0.00  RCVD_IN_XBL

 21.869  43.5562   0.1535    0.996   0.64    0.01  T_RCVD_IN_SORBS_DUL
 22.146  44.0363   0.2270    0.995   0.48    0.00  RCVD_IN_SORBS_DUL

 34.071  67.9112   0.1869    0.997   0.63    1.00  __T_RCVD_IN_SORBS
 42.410  70.9047  13.8785    0.836   0.34    1.00  __RCVD_IN_SORBS

 1.868   3.7336   0.    1.000   0.64    0.00  RCVD_IN_SORBS_SMTP
 1.731   3.4602   0.    1.000   0.62    0.01  T_RCVD_IN_SORBS_SMTP

 2.935   5.8537   0.0134    0.998   0.63    0.00  RCVD_IN_NJABL_DIALUP
 2.879   5.7404   0.0134    0.998   0.61    0.01  T_RCVD_IN_NJABL_DIALUP

 0.934   1.8668   0.    1.000   0.57    0.01  T_RCVD_IN_RSL
 1.041   2.0735   0.0067    0.997   0.55    0.00  RCVD_IN_RSL

 0.607   1.2134   0.    1.000   0.53    0.01  T_RCVD_IN_SORBS_SOCKS
 0.637   1.2401   0.0334    0.974   0.33    0.00  RCVD_IN_SORBS_SOCKS

 0.430   0.8601   0.    1.000   0.49    0.01  T_RCVD_IN_SORBS_WEB
 0.447   0.8867   0.0067    0.993   0.46    0.00  RCVD_IN_SORBS_WEB

 0.254   0.5067   0.    1.000   0.44    0.01  T_RCVD_IN_SORBS_ZOMBIE
 0.307   0.5867   0.0267    0.956   0.29    0.00  RCVD_IN_SORBS_ZOMBIE

 0.117   0.2333   0.    1.000   0.42    0.00  RCVD_IN_NJABL_RELAY
 0.113   0.2267   0.    1.000   0.40    0.01  T_RCVD_IN_NJABL_RELAY
--- end 

change in RANK (relative to just the IP-based blacklists and the new
-firsttrusted ones in testing)

  0.74   RCVD_IN_RFCI
  0.52   RCVD_IN_BL_SPAMCOP_NET
  0.47   RCVD_IN_XBL
  0.47   RCVD_IN_DSBL
  0.43   RCVD_IN_NJABL_PROXY
  0.38   RCVD_IN_SORBS_HTTP
  0.20   RCVD_IN_SORBS_SOCKS
  0.16   RCVD_IN_SORBS_DUL
  0.15   RCVD_IN_SORBS_ZOMBIE
  0.11   RCVD_IN_SORBS_MISC
  0.11   RCVD_IN_SBL
  0.03   RCVD_IN_SORBS_WEB
  0.02   RCVD_IN_RSL
 -0.02   RCVD_IN_NJABL_DIALUP
 -0.02   RCVD_IN_NJABL_RELAY
 -0.02   RCVD_IN_NJABL_SPAM
 -0.02   RCVD_IN_SORBS_SMTP
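
The "change in RANK" figures look like simple differences between each rule's T_ (first-trusted) RANK and its original RANK in the table above. A minimal sketch for a handful of rows, assuming that reading:

```python
# (RANK with -firsttrusted, RANK over all untrusted relays), from the table above.
ranks = {
    "RCVD_IN_RFCI": (0.74, 0.00),
    "RCVD_IN_BL_SPAMCOP_NET": (0.66, 0.14),
    "RCVD_IN_XBL": (0.65, 0.18),
    "RCVD_IN_DSBL": (0.90, 0.43),
    "RCVD_IN_NJABL_SPAM": (0.70, 0.72),
}

# Positive delta: the first-trusted variant ranks better than the original.
deltas = {name: round(t - plain, 2) for name, (t, plain) in ranks.items()}
for name, delta in sorted(deltas.items(), key=lambda kv: -kv[1]):
    print(f"{delta:6.2f}   {name}")
```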

and not really relevant unless we change entire sets to reduce the
number of look-ups:

  0.65   __RCVD_IN_NJABL
  0.38   __RCVD_IN_SBL_XBL
  0.29   __RCVD_IN_SORBS

Results for some fresh mail that may still have a few misfiles:

--- start of cut text --
OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  4039     2294     1745    0.568   0.00    0.00  (all messages)
100.000  56.7962  43.2038    0.568   0.00    0.00  (all 

Re: compiling SA3.3

2009-07-05 Thread Justin Mason
could it be using a different perl binary?

On Sun, Jul 5, 2009 at 03:26, LuKreme krem...@kreme.com wrote:
 When trying to build SA3.3 I got the following error:

 ERROR: the required NetAddr::IP module is not installed. at
 lib/Mail/SpamAssassin/Util/DependencyInfo.pm line 285.

 Trouble is, I have p5-NetAddr-IP-4.00.7 installed via ports

 $ where netaddr
 p5-NetAddr-IP-4.00.7 is in is in net-mgmt/p5-NetAddr-IP
 net-mgmt/p5-NetAddr-IP
 $ source `which where`
 #! /bin/bash
 for i in `ls /var/db/pkg | grep -i $1`; do echo $i is in `pkgdb -o $i`;
 done

 is it because all SA3.2.5 stuff is installed from ports and I need to set
 some install directory so 3.3 can find everything?

 --
 Stomach in! Chest out! on your marks! get set! GO! Now, now that
        you're free, what are you gonna be? Who are you gonna see? And
        where, where will you go, and how will you know you didn't get
        it all wrong?




Re: constantcontact.com

2009-07-03 Thread Justin Mason
I've heard that they are diligent about terminating abusive clients.
Are you reporting these spams to them?

--j.

On Fri, Jul 3, 2009 at 09:55, Mike
Cardwell spamassassin-us...@lists.grepular.com wrote:
 rich...@buzzhost.co.uk wrote:

 I'm probably missing something here - but Constant Contact (who we block
 by IP) have been a nagging source of spam for us. I'm just wondering why
 25_uribl.cf has this line in it:

 ## DOMAINS TO SKIP (KNOWN GOOD)

 # Don't bother looking for example domains as per RFC 2606.
 uridnsbl_skip_domain example.com example.net example.org

 ..
 uridnsbl_skip_domain constantcontact.com corporate-ir.net cox.net cs.com

 Is this a uri that is really suitable for white listing ?

 A set of perl modules has been uploaded to cpan today for talking to the
 ConstantContact API:

 http://search.cpan.org/~arich/Email-ConstantContact-0.02/lib/Email/ConstantContact.pm

 I just thought it was a weird coincidence, seeing as I'd never heard of
 them before today.

 --
 Mike Cardwell - IT Consultant and LAMP developer
 Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/




Re: Can update from sought.rules.yerp.org as I get SHA1 verification failed

2009-07-03 Thread Justin Mason
yep, seeing that here too.  Investigating...

On Fri, Jul 3, 2009 at 08:42, Brent Clark brentgclarkl...@gmail.com wrote:
 Hiya

 Im having a little problem with updating.

 [13860] dbg: plugin: Mail::SpamAssassin::Plugin::MIMEHeader=HASH(0x9ccb9c0)
 implements 'finish_tests', priority 0
 [13860] dbg: plugin: Mail::SpamAssassin::Plugin::Check=HASH(0x9e46fe8)
 implements 'finish_tests', priority 0
 [13860] dbg: generic: lint check of site pre files succeeded, continuing
 with channel updates
 [13860] dbg: channel: reading MIRRORED.BY file
 [13860] dbg: channel: found mirror http://yerp.org/rules/stage/
 [13860] dbg: channel: selected mirror http://yerp.org/rules/stage
 [13860] dbg: http: GET request, http://yerp.org/rules/stage/320790737.tar.gz
 [13860] dbg: http: GET request,
 http://yerp.org/rules/stage/320790737.tar.gz.sha1
 [13860] dbg: http: GET request,
 http://yerp.org/rules/stage/320790737.tar.gz.asc
 [13860] dbg: http: IMS GET request, http://yerp.org/rules/stage/MIRRORED.BY,
 Mon, 01 Dec 2008 04:20:22 GMT
 [13860] dbg: sha1: verification wanted: 320790737
 [13860] dbg: sha1: verification result:
 a9dbb531b21b74b2cb5b51bca7cd0352493e6a59
 channel: SHA1 verification failed, channel failed
 [13860] dbg: generic: cleaning up temporary directory/files
 [13860] dbg: diag: updates complete, exiting with code 4

 Would you know how I could fix this?

 Kind Regards
 Brent Clark




Re: Can update from sought.rules.yerp.org as I get SHA1 verification failed

2009-07-03 Thread Justin Mason
it seems to have resolved itself.

: 26...; wget http://yerp.org/rules/stage/320790737.tar.gz.sha1; wget
http://yerp.org/rules/stage/320790737.tar.gz
--2009-07-03 09:59:56--  http://yerp.org/rules/stage/320790737.tar.gz.sha1
Resolving yerp.org... 216.180.243.10
Connecting to yerp.org|216.180.243.10|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 98 [application/x-gzip]
Saving to: `320790737.tar.gz.sha1'

100%[=] 98
  --.-K/s   in 0s

2009-07-03 09:59:56 (10.4 MB/s) - `320790737.tar.gz.sha1' saved [98/98]

--2009-07-03 09:59:56--  http://yerp.org/rules/stage/320790737.tar.gz
Resolving yerp.org... 216.180.243.10
Connecting to yerp.org|216.180.243.10|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 57294 (56K) [application/x-gzip]
Saving to: `320790737.tar.gz'

100%[=]
57,294   163K/s   in 0.3s

2009-07-03 09:59:56 (163 KB/s) - `320790737.tar.gz' saved [57294/57294]


: 27...; sha1sum 320790737.tar.gz
e789d5fdcdcac78da7d4f3a13eb5b8432a5c3270  320790737.tar.gz

: 28...; cat 320790737.tar.gz.sha1
e789d5fdcdcac78da7d4f3a13eb5b8432a5c3270
/home/jm/ftp/sandboxupdates/tmp/sought.3.2.x/update.tgz

: 34...; curl http://yerp.org/rules/GPG.KEY  | gpg --import
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2437  100  2437    0     0  10301      0 --:--:-- --:--:-- --:--:-- 1291k
gpg: key 6C6191E3: public key Justin Mason Signing Key (Code Signing
Only) signing...@jmason.org imported
gpg: Total number processed: 1
gpg:   imported: 1

: 35...; gpg --verify 320790737.tar.gz.asc 320790737.tar.gz
gpg: Signature made Thu Jul  2 21:29:30 2009 UTC using DSA key ID 6C6191E3
gpg: Good signature from Justin Mason Signing Key (Code Signing Only)
signing...@jmason.org
gpg: WARNING: This key is not certified with a trusted signature!
gpg:  There is no indication that the signature belongs to the owner.
Primary key fingerprint: 8D25 B5E9 1DAF 0F71 5F60  B588 DC85 341F 6C61 91E3


I'm not sure why, but I suspect the data served by the httpd for
320790737.tar.gz was corrupted in some way; the
signatures caught it, and it's now back to normal again.
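
For anyone who wants to repeat the digest comparison from the transcript above without sha1sum, here is a minimal sketch. The function names are made up for illustration, and it assumes the .sha1 file starts with the hex digest, optionally followed by a path, as in the `cat` output above. It only checks integrity; the .asc signature still needs gpg --verify.

```python
import hashlib

def sha1_of(path, chunk_size=65536):
    """Hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_sha1_file(archive_path, sha1_path):
    """Compare an archive against the first token of its .sha1 file."""
    with open(sha1_path, "r", encoding="ascii", errors="replace") as fh:
        expected = fh.read().split()[0].lower()
    return sha1_of(archive_path) == expected
```

Usage would be along the lines of `matches_sha1_file("320790737.tar.gz", "320790737.tar.gz.sha1")`, which should return False for a corrupted download like the one in the original report.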

--j.

On Fri, Jul 3, 2009 at 10:43, Justin Mason j...@jmason.org wrote:
 yep, seeing that here too.  Investigating...

 On Fri, Jul 3, 2009 at 08:42, Brent Clark brentgclarkl...@gmail.com wrote:
 Hiya

 Im having a little problem with updating.

 [13860] dbg: plugin: Mail::SpamAssassin::Plugin::MIMEHeader=HASH(0x9ccb9c0)
 implements 'finish_tests', priority 0
 [13860] dbg: plugin: Mail::SpamAssassin::Plugin::Check=HASH(0x9e46fe8)
 implements 'finish_tests', priority 0
 [13860] dbg: generic: lint check of site pre files succeeded, continuing
 with channel updates
 [13860] dbg: channel: reading MIRRORED.BY file
 [13860] dbg: channel: found mirror http://yerp.org/rules/stage/
 [13860] dbg: channel: selected mirror http://yerp.org/rules/stage
 [13860] dbg: http: GET request, http://yerp.org/rules/stage/320790737.tar.gz
 [13860] dbg: http: GET request,
 http://yerp.org/rules/stage/320790737.tar.gz.sha1
 [13860] dbg: http: GET request,
 http://yerp.org/rules/stage/320790737.tar.gz.asc
 [13860] dbg: http: IMS GET request, http://yerp.org/rules/stage/MIRRORED.BY,
 Mon, 01 Dec 2008 04:20:22 GMT
 [13860] dbg: sha1: verification wanted: 320790737
 [13860] dbg: sha1: verification result:
 a9dbb531b21b74b2cb5b51bca7cd0352493e6a59
 channel: SHA1 verification failed, channel failed
 [13860] dbg: generic: cleaning up temporary directory/files
 [13860] dbg: diag: updates complete, exiting with code 4

 Would you know how I could fix this?

 Kind Regards
 Brent Clark





Re: constantcontact.com

2009-07-03 Thread Justin Mason
On Fri, Jul 3, 2009 at 10:14,
rich...@buzzhost.co.uk wrote:
 On Fri, 2009-07-03 at 10:06 +0100, Justin Mason wrote:
 I've heard that they are diligent about terminating abusive clients.
 Are you reporting these spams to them?

 Yes - but you would think a log full of 550's may be a clue.

 What concerns me is SpamAssassin effectively white listing spammers.
 White listing should be a user option - not something added in a
 nefarious manner. At least it is clear to see with Spamassassin which is
 a plus - but I cannot pretend that I am not disappointed to find a
 whitelisted 'spammer net' in the core rules.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5905 has some
information on the background; we asked SURBL for their top queried
domains that they considered nonspam, and it was in that list.  SURBL
have always been scrupulous in their operations and listing criteria
fwiw.

Going by bug 5905 though, and this report, we should probably remove
it from the whitelist.

  I'm wondering why (other
 than MONEY) it would have ended up in there?

Hope that answers your question.  note that it didn't involve MONEY.
btw silly unfounded accusations mean that it's less likely you'll get
anyone to answer your mail, so please don't do that.

--j.


Re: Weird Problem w/ Rule2XSBody + Sought Rule

2009-07-02 Thread Justin Mason
On Thu, Jul 2, 2009 at 15:28, Sean Cardus scar...@zebrahosts.net wrote:
  An re2c bug, presumably? Is anyone having problems without using sa-
  compile?

 If I removed the compiled rule sets, everything works fine again...

 I've noticed that sa-update pulled in a new set of Sought rules this morning 
 (version 320790507).  I've run sa-compile over them again, re-tried the mail 
 that previously failed and I'm glad to say I'm no longer seeing the 
 memory/loop problem.

I stopped it publishing rules containing that pattern.

We could still do with reproducing the bug though ;)

--j.


ANNOUNCE: Apache SpamAssassin 3.3.0-alpha1 available

2009-07-02 Thread Justin Mason
Apache SpamAssassin 3.3.0-alpha1 is now available for testing.

Downloads are available from:
 http://people.apache.org/~jm/devel/

md5sum of archive files:

 04141392e1f20ea4a91bb63937351c65  Mail-SpamAssassin-3.3.0-alpha1.tar.bz2
 1532b02384c37b4fb40ff1244bca3ec5  Mail-SpamAssassin-3.3.0-alpha1.tar.gz
 c2d80477b20b571b591eae11993dcf03  Mail-SpamAssassin-3.3.0-alpha1.zip
 3cb55fc1bd84bc93013638f9f8fa88c0  Mail-SpamAssassin-rules-3.3.0-alpha1.tgz

sha1sum of archive files:

 f8b5236c0ef1de82699b4bf8e29cd3e0f0b18bfd  Mail-SpamAssassin-3.3.0-alpha1.tar.bz2
 de65e1d5f29b954b2c60d58f10a6dc710a8a3629  Mail-SpamAssassin-3.3.0-alpha1.tar.gz
 cc0c7531f666e48e1593054cfa2bb70336ca2f13  Mail-SpamAssassin-3.3.0-alpha1.zip
 83b0559399a1849063cfcf209be9d23eab1f8267  Mail-SpamAssassin-rules-3.3.0-alpha1.tgz


The release files also have a .asc accompanying them.  The file serves
as an external GPG signature for the given release file.  The signing
key is available via the wwwkeys.pgp.net key server, as well as
http://spamassassin.apache.org/released/GPG-SIGNING-KEY

The key information is:

pub 1024D/265FA05B 2003-06-09 SpamAssassin Signing Key
rele...@spamassassin.org
   Key fingerprint = 26C9 00A4 6DD4 0CD5 AD24  F6D7 DEE0 1987 265F A05B

See the INSTALL and UPGRADE files in the distribution for important
installation notes.

Summary of major changes since 3.2.6


Changes to the core code:

[TODO: write changes list before 3.3.0 release ;)]


Re: Weird Problem w/ Rule2XSBody + Sought Rule

2009-07-01 Thread Justin Mason
hey Matt -- what version of re2c is installed?

On Tue, Jun 30, 2009 at 18:43, Matt Elson mel...@fastmail.net wrote:
 Hey all,

 I stumbled upon an odd issue the other day that I'm having trouble
 tracking down.  Namely, a certain rule in the sought rule set, when
 compiled for use with Rule2XSBody is causing the processing of *some*
 emails to, well, never really end.  Piping the mail through spamassassin
 or into spamd just results in the process hanging and the memory usage
 going higher and higher (2+ gigs, easily) and seemingly ignoring any
 sort of timeouts.  The process finally gets killed only when the OS
 notices it's out of memory and starts killing processes or when I'm able
 to sneak in and kill -9 it.  There's nothing in the debug of SA whatsoever.

 I was wondering if anyone else has seen this or if it's some quirk of my
 environment. I admit that I'm no expert in this sort of thing, but
 (hopefully) some useful information is below the dotted line.

 -
 This happened on four of my machines which have the following configuration:


 RHEL5.2 / SA 3.2.5  / Perl 5.8.8 / gcc 4.1.2
 RHEL5.2 / SA 3.2.4  / Perl 5.8.8 / gcc 4.1.2
 RHELAS 4 (Update 6) / SA 3.2.4 / Perl 5.8.5 / gcc 3.4.6
 RHELAS 4 (Update 6) / SA 3.2.4 / Perl 5.8.5 / gcc 3.4.6


 The SA is built from source off the main website, and the perl is just
 stock redhat.

 If I copy down all my rules/configuration to my Debian desktop using its
 packaging, the problem doesn't emerge (sa 3.2.5/perl 5.10.0/gcc 4.3.3 there)

 Removing the compiled rulesets works around the issue fairly handily.
 I'm stubborn though, so after I did so, I dug around a bit and it seems
 one specific body rule was causing the issue, namely:

 body __SEEK_1R0JFS  /\x{ff}\x{fe} \x{00} \x{00} \x{00}
 \x{00}\x{00}m\x{00}e\x{00}t\x{00}a\x{00}
 \x{00}h\x{00}t\x{00}t\x{00}p\x{00}-\x{00}e\x{00}q\x{00}u\x{00}i\x{00}v\x{00}=\x{00}\'\x{00}R\x{00}e\x{00}f\x{00}r\x{00}e\x{00}s\x{00}h\x{00}\'\x{00}
 \x{00}c\x{00}o\x{00}n\x{00}t\x{00}e\x{00}n\x{00}t\x{00}=\x{00}\'\x{00}0\x{00};\x{00}
 \x{00}u\x{00}r\x{00}l\x{00}=\x{00}h\x{00}t\x{00}t\x{00}p\x{00}:\x{00}\/\x{00}\/\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}.\x{00}/

 Once I comment out the rule, compiled rulesets work fine again.  I don't
 know enough to know what the heck that regex even is, or why it would be
 causing problems (I basically found which rule was causing a problem by
 commenting out anything that looked scary to me, running sa-compile, and
 testing to see if the hanging behavior went away)

 I'm not sure the best way to post up a sample of the mail that was
 choking the system without it getting mangled (though I'll gladly post
 it if someone can show me where), but fooling around, it seemed to come
 down to the message containing this as one of its parts:


 -
 Content-Type: text/html;
 Content-Transfer-Encoding: quoted-printable

 (Any content could go here)
 =00
 -

 Removing =00 OR Content-Transfer-Encoding: quoted-printable causes the
 mail to pass through without a problem.  It seems to only be both
 combined that resulted in the behavior I saw.
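
One way to see why the two ingredients only matter together: in a quoted-printable part, "=00" decodes to a literal NUL byte, whereas without that encoding it is just three ordinary characters. The sketch below uses Python's standard quopri module to show the decoding step; it does not reproduce the hang, and the assumption that the decoded NUL is what the compiled rule then trips over is a guess, not something confirmed in this thread.

```python
import quopri

def decode_qp(raw: bytes) -> bytes:
    """Decode a quoted-printable body with the standard library."""
    return quopri.decodestring(raw)

part = b"(Any content could go here)\n=00\n"
decoded = decode_qp(part)
print(repr(decoded))
print("contains NUL byte:", b"\x00" in decoded)
```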

 Anyhoo, any thoughts?  This a legitimate bug or something wrong with my
 setup?

 Matt




vpopmail / qmail testers needed

2009-06-29 Thread Justin Mason
hi folks.  could someone using vpopmail/qmail please test this patch:

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=2536
(patch id 4432)

A fix to vpopmail/qmail support is unlikely to make it into 3.3.0
without testers.

--j.


Re: vpopmail / qmail testers needed

2009-06-29 Thread Justin Mason
On Mon, Jun 29, 2009 at 17:28, RobertH robe...@abbacomm.net wrote:

 Sent: Monday, June 29, 2009 4:24 AM
 To: SpamAssassin Users List
 Subject: vpopmail / qmail testers needed

 hi folks.  could someone using vpopmail/qmail please test this patch:

 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=2536
 (patch id 4432)

 A fix to vpopmail/qmail support is unlikely to make it into
 3.3.0 without testers.

 --j.


 Justin,

 would you want this info forwarded to two specific Qmail lists with a
 reference back to you for those that can help?

 i looked at the URL above and well, it appears that we wouldnt be of any
 help that i can tell unless there is more docs out there about whatever
 needs to be done.

 i take it this is some type of per user filtering using Qmail, Spamassassin,
 vpopmail etc?

 most of what we do is site-wide

hi Robert -- forward away.  hopefully someone there may be interested
in getting our vpopmail support up to scratch.
the status is that we've had support for vpopmail for several years,
but there have been frequent bug reports of rough edges, warning
messages, and failure to handle error conditions etc.

--j.


note for those tracking SVN trunk

2009-06-29 Thread Justin Mason
[forwarded from bugzilla]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6139

--- Comment #6 from Justin Mason j...@jmason.org  2009-06-29 15:00:53 PST ---
actually, it's worse than that.  Every SVN checkout needs the following, it
seems:

svn propdel svn:externals . ; rm -rf rulesrc ; svn up


sorry folks.  won't happen again!


Re: gpg signed spam email ???

2009-06-28 Thread Justin Mason
there's a very good chance the GPG signature in this case was fake --
ie. a cut-and-paste job.

--j.

On Sat, Jun 27, 2009 at 19:05, Matt Kettler mkettler...@verizon.net wrote:
 RobertH wrote:
 i was reading at

 http://www.karan.org/blog/

 specifically

 http://www.karan.org/blog/index.php/2009/06/15/gpg-signed-spam

 that he recv'd a gpg signed spam email

 ive never heard of that before yet i havent thought much about it or studied
 it...

 Q: is this unheard of, or common?

 near as i can quickly investigate, it doesnt appear to be common as per
 papa google [sic].

 comments? feedback?

 just trying to get up on the curve now.

 Well, let's put it this way:

 A long, long time ago, SA had a rule in the default set, giving negative
 score to PGP and GPG signed messages. Quickly, spammers started adding
 enough fragments of a signature to match the rule. This was very
 obvious, as the rule only matched the begin clause, and the spams had a
 begin clause dropped at the bottom of the message, with no end clause.

 The rule could have been modified to validate the signature, but of
 course, anyone can GPG sign a message and have it be valid, and the
 spammers probably would have done so if the rule changed. Therefore, the
 rule was dropped from the set entirely.

 GPG signatures only validate that the sender has the private key that
 matches the public key used to verify the email. Like SPF, and many other
 authentication-only technologies, this doesn't tell you anything about
 the sender. Even perfect authentication at best only provides
 confirmation of who the sender is, and most of these technologies only
 prove a sender is the proper owner or holder of some abstract identity
 like a key or domain.

 Authentication needs to be paired with recognition to be meaningful.  If
 a sender proves who they are, will you immediately accept the email
 without further question? What if they just proved they were Alan Ralsky?

 http://www.spamhaus.org/rokso/listing.lasso?-op=cnspammer=Alan%20Ralsky


 Moral of the story: don't assign negative scores to systems that only
 provide authentication, unless you're somehow pairing it with proof the
 sender is someone you actually trust (or at least is trusted by a
 service you trust, etc).

 Ever notice that the negative score of SPF_PASS is insignificantly
 small? There's a reason for that. Spammers can pass SPF too, so by
 itself, it's meaningless. But paired with your explicit trust of a
 domain or sender, it provides forgery-resistant whitelisting
 (whitelist_from_spf).
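
The point above about the old rule matching only the begin clause can be illustrated with a minimal sketch. The patterns and function names here are hypothetical and are not the actual SpamAssassin rule; they only show why a begin-only match is trivially satisfied by a pasted fragment, and that even a stricter shape check verifies nothing.

```python
import re

# Hypothetical patterns for illustration only; NOT the real SpamAssassin rule.
BEGIN_SIG = re.compile(r"^-----BEGIN PGP SIGNATURE-----\s*$", re.MULTILINE)
END_SIG = re.compile(r"^-----END PGP SIGNATURE-----\s*$", re.MULTILINE)


def naive_looks_signed(body: str) -> bool:
    """Match only the begin clause, as the dropped rule is described as doing."""
    return BEGIN_SIG.search(body) is not None


def structural_looks_signed(body: str) -> bool:
    """Require a begin clause followed later by an end clause.

    This is still only a shape check: it does not verify the signature,
    and a spammer can paste both markers just as easily.
    """
    begin = BEGIN_SIG.search(body)
    if begin is None:
        return False
    return END_SIG.search(body, begin.end()) is not None


# A spam body with a begin clause dropped at the bottom and no end clause.
spam = "Buy now!\n-----BEGIN PGP SIGNATURE-----\n"

# A body that at least has the shape of a signature block (made-up payload).
signed = (
    "hello\n"
    "-----BEGIN PGP SIGNATURE-----\n"
    "\n"
    "iEYEARECAAYFAkpH\n"
    "-----END PGP SIGNATURE-----\n"
)
```

Here `naive_looks_signed(spam)` is true while `structural_looks_signed(spam)` is false. Neither check says anything about who sent the message, which is the reason given above for dropping the rule rather than tightening it.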



How many people are still using perl 5.6.x?

2009-06-25 Thread Justin Mason
For the upcoming release, we're considering dropping support for that
interpreter version.  If you're still using 5.6.x, or know of a
(relatively recent) distro that does, please reply to highlight
this.

--j.


Re: How many people are still using perl 5.6.x?

2009-06-25 Thread Justin Mason
On Thu, Jun 25, 2009 at 11:15, Jan P. Kesslersal...@jpkessler.info wrote:
 Justin Mason schrieb:
 For the upcoming release, we're considering dropping support for that
 interpreter version.  If you're still using 5.6.x, or know of a
 (relatively recent) distro that does, please reply to highlight
 this.

 --j.


 Don't know if it's still relevant: Solaris 8

 # uname -a
  SunOS mailhub 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-250

 # perl -v
  This is perl, version 5.005_03 built for sun4-solaris

http://www.sun.com/software/solaris/support/sol8.xml :

'The Solaris 8 Operating System (OS) was originally released in
February 2000, and since then has been superseded by two later
releases: the Solaris 9 OS which was initially released in May 2002,
and the Solaris 10 OS which was initially released in January 2005.
The current update of this release is Solaris 10 5/09.

On August 16, 2006 Sun announced the transition of the Solaris 8 OS.
Per this transition:

* November 16, 2006 was the last date Solaris 8 media kits could be ordered
* Sun shipped Solaris 8 media up until February 16, 2007; Solaris
8 media kits are no longer available
* Solaris 8 entered retirement support mode Phase I on March 31, 2007;
* Solaris 8 will enter retirement support mode Phase II on March
31, 2009; and,
* Solaris 8 will reach the end of its service life on March 31, 2012.

The total service life of Solaris 8 will thus be slightly more than 12 years.'


So the OS itself is still supported.  however, that perl version (in
my experience) is quite broken; whenever I've used Solaris recently
I've been sure to install third-party precompiled perls from
sunfreeware/blastwave, or built my own, and used those instead.  it's
a moot point anyway, as SA 3.1.x/3.2.x doesn't support 5.005.

--j.

