Re: anyone know anything about lashback?

2011-08-09 Thread Warren Togami Jr.

On 8/9/2011 3:39 AM, Michael Scheidell wrote:

does anyone know about this rbl?

http://www.lashback.com/blacklist/

We have a persistent sender who is sending phishing emails through a
large corporate server (not ours .. ;-)

the only two reputation filters that list them are the commercial DCC,
and lashback.

(oh, where do I submit the phishing url...) it's not listed either.




http://www.spamtips.org/2011/05/dnsbl-safety-report-5142011.html
UBL has been tested multiple times in the past two years with dismal 
results.  When using spamassassin's corpus, it has demonstrated poor 
spam detection rates, high false positives, and high rates of 
unreliability making it dangerous to rely upon for your spam filtering 
deployment.  Our own tests were not alone in making this assessment.


Warren


Re: sa-update failing

2011-07-17 Thread Warren Togami Jr.

On 7/16/2011 4:54 AM, dar...@chaosreigns.com wrote:

On 07/15, ssapp80 wrote:

Running spamassassin-3.3.2 on CentOS 5.5
perl-Net-DNS ver 0.59 installed

When I run sa-update i receive the following failures on the Net::DNS module

name2labels is not exported by the Net::DNS module


My guess is Net::DNS version 0.59 is too old.  In which case it would be
nice if spamassassin specified that in its use statement so it would give a
more useful error.  If upgrading that perl module fixes that problem, it
might be worth opening a bug to improve that error.  I'm using Net::DNS
v0.65.



EL-5 has perl-Net-DNS 0.59.  I've been using it in production for years 
without issue, even with 3.3.2.  This is not the droid we're looking 
for.  Move along.


Warren


Re: sa-update failing

2011-07-17 Thread Warren Togami Jr.

On 7/17/2011 7:55 AM, Axb wrote:

On 2011-07-17 18:32, Warren Togami Jr. wrote:

On 7/16/2011 4:54 AM, dar...@chaosreigns.com wrote:

On 07/15, ssapp80 wrote:

Running spamassassin-3.3.2 on CentOS 5.5
perl-Net-DNS ver 0.59 installed

When I run sa-update i receive the following failures on the Net::DNS
module

name2labels is not exported by the Net::DNS module


My guess is Net::DNS version 0.59 is too old. In which case it would be
nice if spamassassin specified that in its use statement so it would
give a
more useful error. If upgrading that perl module fixes that problem, it
might be worth opening a bug to improve that error. I'm using Net::DNS
v0.65.



EL-5 has perl-Net-DNS 0.59. I've been using it in production for years
without issue, even with 3.3.2. This is not the droid we're looking for.
Move along.

Warren


unless RH has patched perl-Net-DNS 0.59, iirc, the original had some
issues in RR.pm (going back to 2007) including some security thingie.

I'd recommend using 0.63 or higher.

http://pkgs.repoforge.org/perl-Net-DNS/ offers a good choice.
(I'm using 0.66 and happy)


Looking at the changelog, it appears they did patch several bugs and a 
security issue.  All I know is I have had no issues using EL-5's 0.59 
for years now.


Warren


Re: sa-update failing

2011-07-15 Thread Warren Togami Jr.

On 7/15/2011 10:35 AM, ssapp80 wrote:


Running spamassassin-3.3.2 on CentOS 5.5
perl-Net-DNS ver 0.59 installed

When I run sa-update i receive the following failures on the Net::DNS module

name2labels is not exported by the Net::DNS module
Can't continue after import errors at
/usr/lib/perl5/vendor_perl/5.8.8/Net/DNS/RR/NSEC3.pm line 24
BEGIN failed--compilation aborted at


Why CentOS 5.5?

http://www.spamtips.org/p/rpm-packages.html
I am running CentOS 5.6 here with standard OS + EPEL packages plus the 
SpamTips.org spamassassin-3.3.2 RPM.  Are you doing anything different 
from this?  I have never seen an error like that before.


Why are you running sa-update manually?  The upstream RPM and 
SpamTips.org RPM (both designed by me) automatically run sa-update once 
per day if spamd is running.


Warren Togami
war...@togami.com


Re: spamassassin 3.3.2 rpms for el4 / centos4 etc ???

2011-07-12 Thread Warren Togami Jr.

On 7/11/2011 7:53 PM, R - elists wrote:




Its removal was based at least in part on a belief that it
was not actually usable for anybody.  You could take it up
with the dev list, particularly if you're up for maintaining
it in a way that's useful for the major rpm platforms.
Either way you probably want to talk to Warren Togami, the
resident RedHat guy.

I'd like to see it included, but nobody was willing to maintain it.

You should be able to easily copy the relevant files from the
3.3.1 tarball, if they worked for you.



Darxus,

thanks for the info.

i checked the bug link you gave, and frankly, pulling the .spec file because
of

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6314

doesn't make any sense to me, yet what do i know...

;-)

anyways, if i knew what the relevant files were between the two, i'd take a
shot at it

looks like it might be time to find a different solution

bums us out cause we have actually been supportive (in small personal way)
of the SA people / project.

  - rh



I am sorry that you have been inconvenienced by this change.  It was
done because the bundled .spec far more often caused support confusion
and breakage for RPM distributions, as the rpmbuild -ta packages are
incompatible with the way spamassassin is packaged by all distributions.
The rpmbuild -ta method has never been supportable: people often
experience problems after installing it that way, report them, cause
confusion for distributors, and in the end are always ignored.


The official .spec file is identical for EL5, EL6 and Fedora.  It could
technically build and work on EL4 with minor changes, but I dropped 
support for EL4 LONG AGO because the old version of perl there has 
problems with the proper operation of spamassassin.


http://en.wikipedia.org/wiki/Red_hat_enterprise_linux
In any case, it is time that you upgrade from EL4 because its supported 
lifetime ends February 2012.  Good time to upgrade to EL6 which is 
supported until the year 2017.  RHEL6 is great.  CentOS 6 was just 
released.  Scientific Linux 6 has been out for a while now, and 6.1 is coming
real soon.


Warren Togami
war...@togami.com


SpamTips.org: Why run your own DNS server?

2011-07-04 Thread Warren Togami Jr.

Hey folks,

http://www.spamtips.org/2011/07/spamassassin-why-run-your-own-dns.html
I wrote this article about why it can be important to run your own DNS 
server if you have a busy Spamassassin deployment.


Anyone have any better tips of an alternate DNS resolver, or 
configuration options to improve this suggested configuration?


http://www.spamtips.org/p/ultimate-setup-guide.html
Please see my Ultimate Setup Guide for all the latest tweaks to maximize 
your Spamassassin effectiveness and safety.  Do you have any tips or 
tricks that are not mentioned here?


https://admin.fedoraproject.org/mailman/listinfo/spamassassin-news
Subscribe here for my Spamassassin for Sysadmins Newsletter

Thanks,
Warren Togami
war...@togami.com


Re: SpamTips.org: Why run your own DNS server?

2011-07-04 Thread Warren Togami Jr.

On 7/4/2011 12:58 AM, Toni Mueller wrote:


Hi Warren,

On Mon, 04.07.2011 at 00:46:15 -1000, Warren Togami Jr. <wtog...@gmail.com> wrote:

http://www.spamtips.org/2011/07/spamassassin-why-run-your-own-dns.html

Anyone have any better tips of an alternate DNS resolver, or
configuration options to improve this suggested configuration?


while I do agree that it is generally a very good idea to run your own
DNS resolver, even if you have less than one mail per day, I am
thoroughly unconvinced about the qualities of PowerDNS. I do have a
suggested alternative, though.

http://unbound.net/

This server doesn't go to proprietary changes to the DNS protocol (like
inventing new record types that no one else understands), but
concentrates on delivering DNS according to the latest specs instead.


I heard others recommend unbound, but I haven't tried it yet.  Is it 
more RAM efficient than other alternatives, and fast?


I don't believe pdns-recursor is guilty of this particular complaint as 
it is ONLY a recursor?


Warren


Re: SpamTips.org: Why run your own DNS server?

2011-07-04 Thread Warren Togami Jr.

On 7/4/2011 1:52 AM, Axb wrote:

On 2011-07-04 12:46, Warren Togami Jr. wrote:

Hey folks,

http://www.spamtips.org/2011/07/spamassassin-why-run-your-own-dns.html
I wrote this article about why it can be important to run your own DNS
server if you have a busy Spamassassin deployment.

Anyone have any better tips of an alternate DNS resolver, or
configuration options to improve this suggested configuration?


Warren

Sadly, your post has unleashed a sequel of pretty useless hints & rants.

There is a drawback to running pdns-recursor. The above pdns-recursor
instance is using ~400MB of memory. If you cannot afford this kind of
memory use, you can reduce the limits in options max-cache-entries and
max-packetcache-entries in /etc/pdns-recursor/recursor.conf as
documented upstream. You will need to find a balance between memory use
and effective cache hit performance.

A small site will never use 400MB of DNS caching... don't scare ppl
unnecessarily :)
Larger sites already do local recursion and have the iron to do it.
(other recursors will also use a lot of memory under high-ish load)


I am not 100% certain about this, but it appears that pdns-recursor is 
tuned to normal patterns of DNS lookups (like web browsing or maybe a 
squid proxy server).  It is caching a large amount of useless data, 
evidenced by the piss terrible cache hit ratio.  My in-brain logic 
without testing suggested that timing out most of that nearly-useless 
cache may shrink memory usage considerably without making that poor 
cache hit ratio much worse, since more recent data is often more 
relevant.  That is my theory anyway.  I'm testing that now.




Be careful when endorsing:

For example, DNS results of DNSBL and URIBL's are very transient in
nature with tiny TTL's, so perhaps we could substantially reduce memory
usage by forcing max-cache-ttl and max-negative-ttl to a much smaller
duration. It also appears that the packetcache is far more effective
than the cache at achieving hits, so we may be better off favoring the
packetcache rather than the memory hogging and less effective cache.

Reducing negative TTL time should ONLY be done if the user runs *local*
copies of most of the queried BLs, otherwise he may hit the BL abuse
threshold way earlier.

BLs generally adjust their negative TTL to get a practical balance
between query load and positive hits.
Gaming these settings can become a costly process.
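
The trade-off described above can be illustrated with a toy cache model.
This is only a sketch, not pdns-recursor code: count_upstream_queries is a
hypothetical helper, and it assumes a single name looked up at fixed
intervals with every answer cached for exactly ttl seconds. The 300 and 60
second values are made up for illustration.

```python
def count_upstream_queries(query_times, ttl):
    """Count cache misses (lookups forwarded upstream) for one name.

    query_times: ascending timestamps in seconds at which the name is looked up.
    ttl: seconds an answer (positive or negative) stays in the local cache.
    """
    misses = 0
    expires_at = None  # nothing cached yet
    for t in query_times:
        if expires_at is None or t >= expires_at:
            misses += 1           # cache empty or expired: ask the blacklist
            expires_at = t + ttl  # cache the fresh answer
    return misses


# Hypothetical workload: the same name is checked every 10 seconds for an hour.
lookups = list(range(0, 3600, 10))

with_list_ttl = count_upstream_queries(lookups, ttl=300)   # TTL as set by the BL
with_forced_ttl = count_upstream_queries(lookups, ttl=60)  # TTL forced lower locally
```

In this toy workload the forced 60 second TTL sends five times as many
queries upstream as the 300 second TTL, which is the kind of increase that
could bring a site closer to a blacklist's abuse threshold.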

Axb


Good point, I'll remove that paragraph for now and actually test that 
theory myself to see how it affects the actual hit/miss ratio.


Warren


Re: SpamTips.org: Why run your own DNS server?

2011-07-04 Thread Warren Togami Jr.

On 7/4/2011 1:52 AM, Axb wrote:

A small site will never use 400MB of DNS caching... don't scare ppl
unnecessarily :)
Larger sites already do local recursion and have the iron to do it.
(other recursors will also use a lot of memory under high-ish load)


It is also possible that pdns-recursor just sucks and I should be trying 
other daemons.  I will try unbound next.


Warren


Re: Rule updates

2011-06-28 Thread Warren Togami Jr.

On 6/27/2011 7:03 AM, dar...@chaosreigns.com wrote:

On 06/27, Lars Jørgensen wrote:

I noticed the rules for 3.3.1 were updated during the weekend (don't worry
about my workaholism, I noticed this monday morning ^-^). I was preparing
to upgrade to 3.3.2, but seeing the updated rules makes me doubt whether
the upgrade is necessary.


I expect rule updates to remain compatible throughout the 3.3.x series, so
as long as updates are happening for any 3.3.x version, you should get
them, and they should work, with 3.3.1 (and 3.3.0, etc.).

That *could* change, I suppose, but I don't expect it.  There has been talk
of adding a rule to hit all emails for versions no longer being maintained,
something like SPAMASSASSIN_OUT_OF_DATE:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6614


3.3.x is the first version that supports rule conditionals, so it is 
possible that 3.4.x rule updates could refer to plugins that do
not exist in 3.3.x, and those sections are safely ignored by 3.3.x.


It seems the intent is to release 3.4 late this year.  I heard that the 
only compat change from 3.3.x to 3.4.x is in the spamc/spamd protocol, 
so it should theoretically be an easy upgrade.  It remains to be seen 
exactly what is decided for 3.3.x rule updates after 3.4.x is released.


Warren


ANNOUNCE: Apache SpamAssassin 3.3.2 available

2011-06-23 Thread Warren Togami Jr.

Release Notes -- Apache SpamAssassin -- Version 3.3.2

Introduction


This is a minor release, primarily to support perl-5.12 and later.
Additionally several other minor bugs are fixed.


Downloading and availability


Downloads are available from:

http://spamassassin.apache.org/downloads.cgi

md5sum of archive files:

  253f8fcbeb6c8bfcab9d139865c1a404  Mail-SpamAssassin-3.3.2.tar.bz2
  d1d62cc5c6eac57e88c4006d9633b81e  Mail-SpamAssassin-3.3.2.tar.gz
  06d84d34834d9aecdcdffcc4de08b2a7  Mail-SpamAssassin-3.3.2.zip
  72f8075499c618518c68c7399f02b458  Mail-SpamAssassin-rules-3.3.2-r1104058.tar.gz


sha1sum of archive files:

  f38480352935fe3bb849a27a52615e400dee7d66  Mail-SpamAssassin-3.3.2.tar.bz2
  de954f69e190496eff4a796a9bab61747f03072b  Mail-SpamAssassin-3.3.2.tar.gz
  edc6297dc651eeb7a4872f596ec5a54aeea85349  Mail-SpamAssassin-3.3.2.zip
  a199d5f0f8c2381e3dfe421e7a774356b3ffda4b  Mail-SpamAssassin-rules-3.3.2-r1104058.tar.gz
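
For anyone who prefers to script the checksum comparison, a minimal sketch
using Python's standard hashlib follows. The helper names sha1_of_file and
matches_published_sha1 are made up for this example; the digests are the
sha1sum values listed above. A matching checksum only shows the download is
intact, so the GPG signature check is still needed to confirm who produced
the file.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA-1 values copied from the release announcement.
PUBLISHED_SHA1 = {
    "Mail-SpamAssassin-3.3.2.tar.bz2": "f38480352935fe3bb849a27a52615e400dee7d66",
    "Mail-SpamAssassin-3.3.2.tar.gz": "de954f69e190496eff4a796a9bab61747f03072b",
    "Mail-SpamAssassin-3.3.2.zip": "edc6297dc651eeb7a4872f596ec5a54aeea85349",
}


def matches_published_sha1(path, filename):
    """True if the file at `path` hashes to the published value for `filename`."""
    return sha1_of_file(path) == PUBLISHED_SHA1[filename]
```

For example, matches_published_sha1("Mail-SpamAssassin-3.3.2.tar.gz",
"Mail-SpamAssassin-3.3.2.tar.gz") should return True for an intact download
in the current directory.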


Note that the *-rules-*.tar.gz files are only necessary if you cannot, 
or do not wish to, run sa-update after install to download the latest 
fresh rules.


See the INSTALL and UPGRADE files in the distribution for important 
installation notes.



GPG Verification Procedure
--
The release files also have a .asc accompanying them.  The file serves
as an external GPG signature for the given release file.  The signing
key is available via the wwwkeys.pgp.net key server, as well as
http://www.apache.org/dist/spamassassin/KEYS

The key information is:

pub   4096R/F7D39814 2009-12-02
  Key fingerprint = D809 9BC7 9E17 D7E4 9BC2  1E31 FDE5 2F40 F7D3 9814
uid  SpamAssassin Project Management Committee <priv...@spamassassin.apache.org>
uid  SpamAssassin Signing Key (Code Signing Key, replacement for 1024D/265FA05B) <d...@spamassassin.apache.org>

sub   4096R/7B3265A5 2009-12-02

To verify a release file, download the file with the accompanying .asc 
file and run the following commands:


 gpg -v --keyserver wwwkeys.pgp.net --recv-key F7D39814
 gpg --verify Mail-SpamAssassin-3.3.2.tar.bz2.asc
 gpg --fingerprint F7D39814

Then verify that the key matches the signature.

Note that older versions of gnupg may not be able to complete the steps 
above. Specifically, GnuPG v1.0.6, 1.0.7 & 1.2.6 failed while v1.4.11
worked flawlessly.


See http://www.apache.org/info/verification.html for more information on 
verifying Apache releases.




Summary of major changes since 3.3.1


NOTE: Complete changes are available at 
http://svn.apache.org/repos/asf/spamassassin/branches/3.3/Changes


Bug #6353: Fix FH_FROMEML_NOTLD, add MISSING_FROM

Bug #6427: Spamc windows header library missing two defines.

Bug #6476: patch to fix missing sa-awl man page bug

Bug #6470: Small change in windows to exit stating that the exit status 
is unknown.  Thanks to Daniel Lemke for many of these small win32 patches.


Bug #6314: Complete removal of spamassassin.spec

Bug #6589: Errors in man pages

Bug #6588: Small bug in the regexp caught by Jose Borges Ferreira in

Bug #6515: spamd timeout_child option overrides time_limit configuration 
option with nastier behaviour


Bug #6490: Mail::SpamAssassin::Plugin::SPF - Two enhancement issues

Bug #6562: NULL reference bug in libspamc. Quick workaround to avoid a 
crash.


Bug #6454: wrong status test on $sth->rows in BayesStore::PgSQL

Bug #6418: Cannot Log to stderr without timestamps

Bug #6403: GMail should use ESMTPSA to indicate that it is in fact 
authenticated, but doesn't


Bug #6229: TextCat is too case sensitive

Bug #6241: mkrules does not understand newer options and else

Bug #6382: add missing unwhitelist_from_dkim, remove facebook and 
linkedin from dkim whitelisting


Bug #5744: some documentation fixes

Bug #6447: new feature to bayes autolearning: learn-on-error

Bug #6566: X-Ham-Report default wording (has identified this incoming 
email as possible spam) is confusing and inaccurate


Bug #6468: splice() offset past end of array in HTML.pm

Bug #6377: win32: spamd signal handling

Bug #6376: win32: consider negative pids under windows in spamds waitpid

Bug #6375: win32: posix macro not implemented - spamd

Bug #6336: Illegal octal digit 9 received during rules compile

Bug #6526: Disable rfc-ignorant.org

Bug #6531: clear_uridnsbl_skip_domain feature to allow admin override of 
default configuration


Bug #5491: MIME_QP_LONG_LINE triggering on valid email

Bug #6558: body rules having tflags multiple may cause infinite loop 
when compiled - a workaround


Bug #6557: Use same age limits in ruleqa as in sa-updates

Bug #6548: spamd protocol examples are wrong

Bug #6500: clear_originating_ip_headers seems to be broken

Bug #6565: check_rbl_sub rules - all dots need to be escaped - commit 
felicity/70_dnswl.cf and felicity/70_iadb.cf too


Bug #6565: check_rbl_sub rules - all dots need to be escaped

Bug #6578: Move TLD 

Spamassassin 3.3.2 RPM Packages for Fedora and RHEL

2011-06-23 Thread Warren Togami Jr.

http://www.spamtips.org/p/rpm-packages.html

These packages for EL5 and EL6 are identical to the Fedora versions, and 
I personally use them in production.


Warren Togami
war...@togami.com


Re: Sought rules

2011-06-12 Thread Warren Togami Jr.

On 6/11/2011 10:03 AM, Justin Mason wrote:

guys -- I'm going to make the whole question moot (in trunk at least)
-- the only reason SOUGHT and SOUGHT_FRAUD were being checked in there
was to make their accuracy visible in ruleqa.  It's been months since
I've looked at that, so it's needless.  I'll remove them from svn
asap.

--j.


WAIT!!! Wouldn't this remove our ability to check for false positives of 
your patterns against the much larger ham collection of nightly masscheck?


I wouldn't be concerned about this if there were a way to collaborate on 
feeding more ham into SOUGHT's safety check corpus, but when I asked 
about this earlier you seemed hesitant.


Warren


Re: Sought rules

2011-06-12 Thread Warren Togami Jr.

On 6/12/2011 12:32 AM, Warren Togami Jr. wrote:

On 6/11/2011 10:03 AM, Justin Mason wrote:

guys -- I'm going to make the whole question moot (in trunk at least)
-- the only reason SOUGHT and SOUGHT_FRAUD were being checked in there
was to make their accuracy visible in ruleqa. It's been months since
I've looked at that, so it's needless. I'll remove them from svn
asap.

--j.


WAIT!!! Wouldn't this remove our ability to check for false positives of
your patterns against the much larger ham collection of nightly masscheck?


The alternative is to filter SOUGHT from the sa-update rule updates with 
a script, but still allow it in the nightly masschecks.  Testing sought 
in nightly masschecks has been useful to occasionally find obvious 
SOUGHT problems, or sometimes to locate spam that was misplaced in the 
ham folder.


Warren


Re: Sought rules

2011-06-11 Thread Warren Togami Jr.
Wait a sec, I'm confused about this.  JM_SOUGHT_2 hitting on every 
legit Facebook message on dev@ list February 17th 2011.  If the SOUGHT 
channel was being overridden by the sa-update rules, how would this 
problem appear from the SOUGHT channel?  Doesn't this suggest that 
spamassassin was successfully using the SOUGHT channel?


(I still agree we should remove the static SOUGHT from the sa-update rules.)

Warren


READ THIS Re: Sought rules

2011-06-11 Thread Warren Togami Jr.

On 6/10/2011 11:13 PM, Warren Togami Jr. wrote:

Wait a sec, I'm confused about this. JM_SOUGHT_2 hitting on every legit
Facebook message on dev@ list February 17th 2011. If the SOUGHT channel
was being overridden by the sa-update rules, how would this problem
appear from the SOUGHT channel? Doesn't this suggest that spamassassin
was successfully using the SOUGHT channel?

(I still agree we should remove the static SOUGHT from the sa-update
rules.)

Warren


NOTE: I'm skeptical that this bug is affecting us on Linux.  I am not
using a re-order hack and yet SOUGHT seems to be changing its behavior 
on a daily basis.


Warren


Re: Sought rules

2011-06-10 Thread Warren Togami Jr.

On 6/10/2011 7:14 AM, Karsten Bräckelmann wrote:


You are generally correct about the numerical (actually lexical) order,
though it doesn't apply to the files you are talking about. The
mentioned 72_active and 20_sought are in different sa-update channels.

Now, the bad thing about this is that updates_spamassassin_org.cf is
lexically *after* sought_rules_yerp_org.cf in your rule update dir.
Which means the more recent rules in the dedicated Sought channel are
overwritten by the stock rules...

This merely requires a re-ordering hack, though. A symlink zzz_sought.cf
in your rule updates dir, pointing at the channel generated cf should
do. These channel cf files only hold include statements, to pull in the
actual cf files in the per-channel dir.
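
The ordering problem described above can be sketched in a few lines. This
assumes, as the message says, that the channel .cf files are read in plain
lexical order and that later files override earlier ones; Python's sorted()
is only a stand-in for whatever ordering SpamAssassin itself applies, and
load_order is a made-up helper name.

```python
# Channel .cf files as named in the message above.
channel_files = [
    "updates_spamassassin_org.cf",
    "sought_rules_yerp_org.cf",
]


def load_order(filenames):
    """Return filenames in plain lexical (code point) order."""
    return sorted(filenames)


# Without the hack, the stock channel is read last and wins.
before = load_order(channel_files)

# With the zzz_sought.cf symlink, the Sought rules are read again at the end.
after = load_order(channel_files + ["zzz_sought.cf"])
```

Here before ends with updates_spamassassin_org.cf, while after ends with
zzz_sought.cf, which is the effect the symlink is meant to achieve.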




Without a re-ordering hack, does this mean that essentially
EVERYONE is using SOUGHT wrong?  This is a bit worrisome.


Warren


Re: Sought rules

2011-06-10 Thread Warren Togami Jr.

On 6/10/2011 2:01 PM, Karsten Bräckelmann wrote:


IFF you use the sought channel with SA 3.3.x, you will need the reorder
hack to bend the alphabet.



It is not entirely clear to me, what exactly are you supposed to rename 
for the reorder hack?  You have to do it every time you sa-update?


Warren


Re: Sought rules

2011-06-10 Thread Warren Togami Jr.

On 6/10/2011 3:34 PM, John Hardin wrote:

On Fri, 10 Jun 2011, Lawrence @ Rogers wrote:


On 10/06/2011 10:24 PM, Warren Togami Jr. wrote:

On 6/10/2011 2:01 PM, Karsten Bräckelmann wrote:
  IFF you use the sought channel with SA 3.3.x, you will need the
reorder
 hack to bend the alphabet.

It is not entirely clear to me, what exactly are you supposed to rename
for the reorder hack? You have to do it every time you sa-update?


Would renaming 20_sought_fraud.cf to 99_sought_fraud.cf, putting
20_sought_fraud.cf (from the yerp.org channel) after 72_active.cf (the
default and assumed older SA rules) solve this problem?


Or symlinks from your local configs directory to the SOUGHT channel
directory files. That would probably be easier to not forget about when
things get fixed.



Is Lawrence's suggestion something we can do upstream to fix this problem?

Alternatively, I think it is a mistake for us to ship SOUGHT rules at 
all in the standard sa-update channel.  That is, unless we plan on 
updating the patterns and scores of SOUGHT on a daily basis.  I highly 
doubt we will do that.


Warren Togami
war...@togami.com


3.3.2 Ready for Testing

2011-06-06 Thread Warren Togami Jr.
We need +3 votes from PMC (or the release manager) to declare 3.3.2 an 
official ASF release.  This 3.3.2 release has no changes since 
3.3.2-rc2.  Please do some testing before voting.


If you are not a PMC member, please let us know if you see any 
regressions since 3.3.1 along with details of your platform.


Proposed Official Release of 3.3.2
===
http://people.apache.org/~wtogami/devel/3.3.2/
f38480352935fe3bb849a27a52615e400dee7d66  Mail-SpamAssassin-3.3.2.tar.bz2
de954f69e190496eff4a796a9bab61747f03072b  Mail-SpamAssassin-3.3.2.tar.gz
edc6297dc651eeb7a4872f596ec5a54aeea85349  Mail-SpamAssassin-3.3.2.zip
a199d5f0f8c2381e3dfe421e7a774356b3ffda4b  Mail-SpamAssassin-rules-3.3.2-r1104058.tar.gz

GPG Verify Procedure

wget http://people.apache.org/~wtogami/devel/3.3.2/Mail-SpamAssassin-3.3.2.tar.bz2
wget http://people.apache.org/~wtogami/devel/3.3.2/Mail-SpamAssassin-3.3.2.tar.bz2.asc

gpg --recv-key F7D39814
gpg --verify Mail-SpamAssassin-3.3.2.tar.bz2.asc

spamassassin-3.3.2-rc2 RPM Test packages for EL5 and EL6
http://www.spamtips.org/p/rpm-packages.html
http://people.apache.org/~wtogami/rpm/3.3.2/

Warren Togami
war...@togami.com


3.3.2-rc2 Call for Testing

2011-05-31 Thread Warren Togami Jr.
3.3.2-rc2 is meant to be the true release candidate for 3.3.2.  If we 
find no problems with rc2, then I will recut it as 3.3.2 final with no 
code changes.


http://people.apache.org/~wtogami/devel/3.3.2-rc2/
3.3.2-rc2 tarballs plus rules from sa-update channel

sha1sum of archive files:

  445b3d0a9e93284af82180c03f8c3b0fa4c5d2fc  Mail-SpamAssassin-3.3.2-rc2.tar.bz2
  4eb6a3c23714e33c0413fa25ed45c2796129ac9a  Mail-SpamAssassin-3.3.2-rc2.tar.gz
  876314a64730604df9468243f68a1fd15b18c214  Mail-SpamAssassin-3.3.2-rc2.zip
  a199d5f0f8c2381e3dfe421e7a774356b3ffda4b  Mail-SpamAssassin-rules-3.3.2-rc2.r1104058.tar.gz


http://people.apache.org/~wtogami/rpm/3.3.2-rc2/
RPM packages for EL5 and EL6

Warren Togami
war...@togami.com


Re: Trouble starting Spamassassin

2011-05-18 Thread Warren Togami Jr.

On 5/18/2011 1:20 AM, john ffitch wrote:

Thank you.  Removing the defined cleared one error but I still get

May 18 12:17:36.306 [5489] warn: Use of uninitialized value $opt{syslog-socket} in lc at /usr/bin/spamd line 444.
child process [5491] exited or timed out without signaling production of a PID file: exit 255 at /usr/bin/spamd line 2588.

so does not work.  I am reluctant to install a rc1 in a live system
==John ffitch


3.3.2-rc1 actually works while 3.3.1 does not.  By my download counts, 
it appears at least 200 people are running my 3.3.2-rc1 RPMS and I have 
heard no complaints.


Warren


EL5 and EL6 Packages of spamassassin-3.3.2-rc1

2011-05-16 Thread Warren Togami Jr.

http://people.apache.org/~wtogami/rpm/3.3.2-rc1/
I made test packages for EL5 and EL6.  I began using both in production 
just now with no apparent ill effects.  We need more people to test this 
and provide feedback.


Warren

On 05/14/2011 10:34 PM, Warren Togami Jr. wrote:

Hey folks,

This is an UNRELEASED CANDIDATE of spamassassin-3.3.2-rc1. It would be
helpful for folks to test it and provide feedback. Don't worry about the
rules tarball, because the real rules you get from running sa-update the
first time.

http://people.apache.org/~wtogami/devel/3.3.2-rc1/

sha1sum of archive files:

191fc4548c7619e11127ef04714be19741122ea9  Mail-SpamAssassin-3.3.2-rc1.tar.bz2
813b2adb7ab15f6ddc34c9de7fc10e0f9b7b28cd  Mail-SpamAssassin-3.3.2-rc1.tar.gz
23bee590d0e4ec5f11936bc931fb73211970966a  Mail-SpamAssassin-3.3.2-rc1.zip
9e20dd49fbbb1bf1ff4d171ac3531b53ba7c9dfd  Mail-SpamAssassin-rules-3.3.2-rc1.r1083704.tgz

GPG signatures available at the above URL.

WARNING: I did not test this in production.

Warren Togami
war...@togami.com




Testing Needed: spamassassin-3.3.2-rc1

2011-05-15 Thread Warren Togami Jr.

Hey folks,

This is an UNRELEASED CANDIDATE of spamassassin-3.3.2-rc1.  It would be 
helpful for folks to test it and provide feedback.  Don't worry about 
the rules tarball, because the real rules you get from running sa-update 
the first time.


http://people.apache.org/~wtogami/devel/3.3.2-rc1/

sha1sum of archive files:

  191fc4548c7619e11127ef04714be19741122ea9  Mail-SpamAssassin-3.3.2-rc1.tar.bz2
  813b2adb7ab15f6ddc34c9de7fc10e0f9b7b28cd  Mail-SpamAssassin-3.3.2-rc1.tar.gz
  23bee590d0e4ec5f11936bc931fb73211970966a  Mail-SpamAssassin-3.3.2-rc1.zip
  9e20dd49fbbb1bf1ff4d171ac3531b53ba7c9dfd  Mail-SpamAssassin-rules-3.3.2-rc1.r1083704.tgz

GPG signatures available at the above URL.

WARNING: I did not test this in production.

Warren Togami
war...@togami.com


DNSBL Safety Report 5/14/2011

2011-05-15 Thread Warren Togami Jr.

http://www.spamtips.org/2011/05/dnsbl-safety-report-5142011.html
Several of the well known add-on DNSBL's have changed in safety or 
overlap since the previous January 2011 report, so sysadmins of 
Spamassassin servers may want to look carefully at this new report.


https://admin.fedoraproject.org/mailman/listinfo/spamassassin-news
Subscribe to my Spamassassin for Sysadmins Announce-Only Newsletter. 
The next issue is coming sometime after the release of spamassassin-3.3.2.


Warren


Re: Testing Needed: spamassassin-3.3.2-rc1

2011-05-15 Thread Warren Togami Jr.
Please file bugs.  Nothing can be committed to spamassassin-3.3.x 
without bugs and votes.


Warren


Re: Dumb questions

2011-05-06 Thread Warren Togami Jr.

On 5/6/2011 9:19 AM, Greg Lentz wrote:

Well, since it looks like SA 3.2 hasn't been getting rules for a
couple of years, that probably isn't as critical at the moment. --
Greg Lentz



Of course it is critical. How effective would your virus scanner be
after several years without updates?

Warren


Re: Any active rules repositories left?

2011-04-22 Thread Warren Togami Jr.

On 4/22/2011 6:32 AM, Morten wrote:


Hi folks,

I'm looking at upgrading a SA 3.2.5 installation. I see that there's a 3.3.1 
release, but that's more than a year old. Is there some shared rules repository 
out there that's more recent?

Thanks,

Morten



http://www.spamtips.org/p/ultimate-setup-guide.html
Please follow spamtips.org, which documents every add-on and configuration
to maximize the effectiveness and safety of your spamassassin deployment.


Warren


Mailspike Performance

2011-04-12 Thread Warren Togami Jr.
We haven't had working statistics viewing for a few weeks, but now it is 
fixed and I'm amazed by the performance of RCVD_IN_MSPIKE_BL.


http://ruleqa.spamassassin.org/20110409-r1090548-n/T_RCVD_IN_MSPIKE_BL/detail

RCVD_IN_MSPIKE_BL has nearly the highest spam detection ratio of all the 
DNSBL's, second only to RCVD_IN_XBL.  But our measurements also indicate 
it is detecting this huge amount of spam with a very good ham safety rating.


* 84% overlap with RCVD_IN_XBL - redundancy isn't a huge problem here 
because XBL has a tiny score.  But 84% is a surprisingly low overlap ratio
for such a high spam-detecting rule.  This confirms that Mailspike is
doing an excellent job of building their IP reputation database in a 
truly independent fashion.
* 67% overlap with RCVD_IN_PBL - overlap with PBL is concerning because 
PBL has a high score.  But 67% isn't too bad compared to other production
DNSBL's.

* 58% overlap with RCVD_IN_PSBL - pretty good
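
For readers unfamiliar with the overlap figures above, here is a minimal
sketch of how such a percentage could be computed. It assumes that "X%
overlap with B" means the share of rule A's hits that rule B also hit; that
is one reading of the ruleqa numbers, not something this message confirms.
The message IDs are made up for illustration and are not masscheck data.

```python
def overlap_percent(hits_a, hits_b):
    """Percentage of messages hit by rule A that were also hit by rule B."""
    hits_a, hits_b = set(hits_a), set(hits_b)
    if not hits_a:
        return 0.0
    return 100.0 * len(hits_a & hits_b) / len(hits_a)


# Made-up message IDs, for illustration only.
mspike_hits = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
pbl_hits = {1, 2, 3, 4, 5, 6, 20, 21}

mspike_vs_pbl = overlap_percent(mspike_hits, pbl_hits)  # share of MSPIKE hits also in PBL
pbl_vs_mspike = overlap_percent(pbl_hits, mspike_hits)  # share of PBL hits also in MSPIKE
```

Note that the measure is not symmetric: in this toy data 60% of the MSPIKE
hits are also PBL hits, while 75% of the PBL hits are also MSPIKE hits.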

Given Mailspike's sustained decent performance since late 2009, it seems 
clear that it is a great candidate for addition to spamassassin-3.4 by 
default.  It would be very interesting to see what it does to the scores 
during an automatic rescoring of the network rules.


Thoughts about Future Rescoring
===
Before that rescoring, we may want to have a serious discussion about 
reducing score pile-up in the case where multiple production DNSBL's all 
hit at the same time.  Adam Katz' approach is one possibility, albeit 
confusing to users because users see subtractions in the score reports. 
 There may be other better approaches to this.



In related news...
==
http://www.spamtips.org/2011/01/dnsbl-safety-report-1232011.html
The January DNSBL Safety report found RCVD_IN_SEMBLACK to be reasonably 
safe, but at the time it overlapped with RCVD_IN_PBL 91% of the time 
making it dangerously redundant due to PBL's high production score.


http://ruleqa.spamassassin.org/20110409-r1090548-n/T_RCVD_IN_SEMBLACK/detail
Our most recent measurements indicate that SEMBLACK is back to its previous
behavior of an extremely poor safety rating, with false positives on ~7% of
ham from recent weeks.


It was a bad idea to use SEMBLACK earlier this year due to the high 
overlap with RCVD_IN_PBL, but this significant decline in safety rating 
is a clear indication that you should not be using RCVD_IN_SEMBLACK.


http://ruleqa.spamassassin.org/20110409-r1090548-n/T_RCVD_IN_HOSTKARMA_BL/detail
HOSTKARMA_BL overlaps with MSPIKE_BL 88% of the time, but detects far 
less spam and with slightly more FP's.  Compared to last year, 
HOSTKARMA_BL's safety rating has improved considerably on a sustained 
basis, and if we were evaluating it alone it wouldn't be too bad.  But 
now that we see the overlaps, HOSTKARMA_BL at this very moment is pretty 
close to a redundant and slightly less safe subset of RCVD_IN_MSPIKE_BL. 
 Given these measurements, it probably isn't helpful to use HOSTKARMA_BL.


Warren Togami
war...@togami.com


Re: Suddenly tons of spam

2011-03-29 Thread Warren Togami Jr.

On 3/29/2011 8:30 AM, RW wrote:

On Tue, 29 Mar 2011 12:55:51 -0500
Maxmdun...@breakawaysystems.com  wrote:


Here's the output of spamassassin -D --lint:

[29434] dbg: logger: adding facilities: all
[29434] dbg: logger: logging level is DBG
[29434] dbg: generic: SpamAssassin version


Update to the current version. It's not worth giving it any more thought
until you've done that. The rules for 3.2.5 haven't been worked on for
some time.


http://www.spamtips.org/p/ultimate-setup-guide.html
Indeed.  Upgrade to spamassassin-3.3.1, make sure sa-update is set to 
run at least once daily, then follow everything on this page to maximize 
its performance.


Warren


Re: Spam Eating Monkey causing 100% false positives for large institutions

2011-03-23 Thread Warren Togami Jr.

On 3/23/2011 7:38 AM, Blaine Fleming wrote:

On 3/23/2011 9:56 AM, dar...@chaosreigns.com wrote:

In the recent sa-updates, the Spam Eating Monkey rules were
inappropriately enabled.  If you hit them too much, they start returning
100% false positives.  Their listed limits are more than 100,000 queries
per day or more than 5 queries per second for more than a few minutes.


As soon as the bug was reported on the dev list I disabled the
127.0.0.255 response code to avoid any additional issues.  I will be
turning this functionality back on as soon as the SA rules are updated
which I assume will be soon.


I would recommend blackholing those IP addresses at the firewall of the 
DNS server, especially those 300 million+ sites that are impossible to 
contact.  They might finally notice they have a serious configuration 
issue and stop querying if their mail delivery backs up.
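
For anyone writing their own DNSBL client, the 127.0.0.255 response code mentioned above shows why a blanket "any answer means listed" check is dangerous.  Below is a minimal Python sketch of separating that code from real listings.  It assumes 127.0.0.255 is the "you are over the query limit" answer, which is my reading of this thread, and the function name is hypothetical.

```python
import ipaddress

# Assumption from this thread: this answer means "query limit exceeded".
OVER_LIMIT = ipaddress.ip_address("127.0.0.255")
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def classify_answer(answers):
    """Classify the A records returned for a single DNSBL query."""
    if not answers:
        return "not_listed"
    addrs = [ipaddress.ip_address(a) for a in answers]
    if OVER_LIMIT in addrs:
        return "query_refused"   # must not be scored as a spam hit
    if any(a in LOOPBACK for a in addrs):
        return "listed"
    return "unexpected"          # answer outside 127/8, e.g. a hijacked zone


print(classify_answer(["127.0.0.255"]))
print(classify_answer(["127.0.0.2"]))
```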


Warren



Re: Spam Eating Monkey causing 100% false positives for large institutions

2011-03-23 Thread Warren Togami Jr.

On 3/23/2011 10:58 AM, Karsten Bräckelmann wrote:

On Wed, 2011-03-23 at 10:18 -1000, Warren Togami Jr. wrote:

On 3/23/2011 7:38 AM, Blaine Fleming wrote:

In the recent sa-updates, the Spam Eating Monkey rules were
inappropriately enabled.  [...]



As soon as the bug was reported on the dev list I disabled the
127.0.0.255 response code to avoid any additional issues.  I will be
turning this functionality back on as soon as the SA rules are updated
which I assume will be soon.


I would recommend blackholing those IP addresses at the firewall of the
DNS server, especially those 300 million+ sites that are impossible to
contact.  They might finally notice they have a serious configuration
issue and stop querying if their mail delivery backs up.


Ugh, nasty boy. ;)  You do realize they wouldn't be hammering the SEM
DNS servers, if testrules wouldn't have slipped out accidentally -- by
sa-update.

Personally, I'd much rather prefer to have this resolved by another
manual rule update, so the queries should die down within another 24-48
hours. Obviously, these sites do use sa-update...

Thanks and props to Blaine, for effectively disabling the limit
temporarily, and sustain the load for a while! :)




Agreed that would be the ideal solution.  Who knows the procedure?  Is 
that procedure documented?


Warren


Re: Performance on Spear Phishing?

2011-03-16 Thread Warren Togami Jr.

On 3/16/2011 4:08 PM, Hamad Ali wrote:

Hi folks -- wondering if anyone has monitored SA's performance against
phishing mails. SA is able to detect 86% of phishing emails my clients
get, with 0.5% false positives on all the ham. It seems non-phish-SPAM
is easier to be detected than phish (~99% for non-phish spam). Probably
I need to participate on nightly checks to improve phish and lower false
positives.

But all the above stuff is about bulk-phish, excluding spear phish. I
haven't received any spear phishing complaints from my clients, and yet
none of the detected phish mails are spear phish -- which is alarming as
it's too good to be true that no one did spear phishing yet (especially
given that it works far better than bulk-phish)!

What's the scenario in your mail systems folks? Do you detect spear
phishing mail by SA? Users report it?

-- H




Are you using spamassassin-3.3.1?

http://www.spamtips.org/p/ultimate-setup-guide.html
Have you tweaked it with the best tested add-ons?  Please read this page.

In particular the fuzzy hash based plugins like Pyzor, Razor and DCC 
are sometimes effective against phishing.


Warren


Re: Performance on Spear Phishing?

2011-03-16 Thread Warren Togami Jr.

On 3/16/2011 5:45 PM, Karsten Bräckelmann wrote:

On Wed, 2011-03-16 at 20:30 -0700, John Hardin wrote:

On Thu, 17 Mar 2011, Hamad Ali wrote:



Probably I need to participate on nightly checks to improve phish and
lower false positives.


More masscheck participants are always welcome!


No.

There is this thing called trust. Credibility. And track-record. Which
pretty much is the opposite of a freemail address, venting two questions
on this list -- without ever getting back even to specific requests for
better data, offer for precise help, or a dialog.




Karsten, thanks for pointing out that this is the same guy.  I had 
missed that.


Warren


Re: how to disable network tests?

2011-03-11 Thread Warren Togami Jr.

On 3/11/2011 10:05 AM, Hamad Ali wrote:

hi folks --- everything seems working like chicken. I'm loving SA so far.

However, I would like to disable all network tests (each mail takes ~10
seconds!). Except that I dunno how to do it the neat way.

Will the tests be disabled if their score is 0? I know that would lead
into disabling the effect of a rule on the decision making of SA (i.e.
Spam/Ham marking), but would SA exclude them from running too?

I need to disable all BLs, DNS queries, and anything that uses the
internet. Kindly advise.

Thank you guys -- May OOP Raise and Shine!
H


Please consider that spamassassin is CRIPPLED without the network tests. 
 If it is taking 10 seconds per message then you likely have some kind 
of serious misconfiguration.  The most likely culprit is a DNS server 
that is not good enough.  Several times in past years I've had to stop using 
my ISP's (or data center's!) official DNS servers because they were 
simply not capable of handling the load of spamassassin.  In such cases 
I run pdns-recursor on each Spamassassin server directly, and set 
/etc/resolv.conf to use 127.0.0.1 as the DNS resolver.
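
As a quick sanity check of the resolver setup described above, here is a small Python sketch that reads resolv.conf-style text and reports whether the first nameserver is the local machine.  The function names are hypothetical, and it only inspects the file; it does not test whether the local resolver actually answers.

```python
def nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def uses_local_resolver(resolv_conf_text):
    """True if the first configured nameserver is the loopback address."""
    servers = nameservers(resolv_conf_text)
    return bool(servers) and servers[0] in ("127.0.0.1", "::1")


example = "# local recursor\nnameserver 127.0.0.1\n"
print(uses_local_resolver(example))
```

On a real host you would pass in the contents of /etc/resolv.conf, for example open("/etc/resolv.conf").read().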


After you have switched to a known good DNS server, do the following to 
diagnose the network tests.


1) Save a single spam message as a flat file, with headers and body 
intact.  If your folders are Maildir format then a single file in your 
directory tree is suitable for this purpose.


2) cat FILE | spamassassin -D

3) Copy the entire output and paste into a text editor.

4) Look at the lines near the bottom for async: timing:  Those are 
followed by a number of seconds that an individual DNS request took to 
respond.  All of these numbers are typically between 0 and 3 seconds on 
my server.   If you have much larger numbers or some queries are timing 
out entirely, then you may have further issues with your DNS server, or 
you may have been blocked from queries because you have exceeded free 
usage limits.
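
Steps 3 and 4 can be done by script instead of by eye.  The Python sketch below pulls the number of seconds that follows "async: timing:" out of saved debug output and lists the slow lookups.  It relies only on what is described above (a number of seconds after that marker); the sample lines are invented, and the rest of each real line may look different on your version.

```python
import re

# The number of seconds directly after "async: timing:"; the remainder of
# the line is kept as an opaque label.
TIMING_RE = re.compile(r"async: timing: (\d+(?:\.\d+)?)\s*(.*)")


def slow_queries(debug_output, threshold=3.0):
    """Return (seconds, label) pairs slower than `threshold`, slowest first."""
    slow = []
    for line in debug_output.splitlines():
        match = TIMING_RE.search(line)
        if match and float(match.group(1)) > threshold:
            slow.append((float(match.group(1)), match.group(2).strip()))
    return sorted(slow, reverse=True)


# Invented sample lines, for illustration only.
sample = (
    "[1] dbg: async: timing: 0.120 . dns:A:fast.example\n"
    "[1] dbg: async: timing: 9.800 X dns:A:slow.example\n"
    "[1] dbg: check: done\n"
)
for seconds, label in slow_queries(sample):
    print(seconds, label)
```

The default threshold of 3.0 matches the "between 0 and 3 seconds" range given in step 4; adjust it to taste.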


http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html
Please see my article here about the free usage limits of the various 
spamassassin network tests.


http://www.spamtips.org/p/ultimate-setup-guide.html
Please read this page for all known safe and effective configuration 
tweaks to spamassassin.


Warren Togami
war...@togami.com


Re: sa-updates

2011-03-10 Thread Warren Togami Jr.

On 3/10/2011 1:41 AM, Nigel Frankcom wrote:

Hi All,

Apologies if this has been covered, an admittedly fairly cursory
Google showed nothing new. My local sa-update hasn't updated in the
better part of a month. Is it that there have been no updates or do I
need to dig into my systems to see what I broke, how and when?

Regards to all

Nigel


http://ruleqa.spamassassin.org/
The auto-promotion mechanism that promotes/demotes and rescores new 
rules has been broken lately because we are lacking sufficient 
quantities of ham and spam in the nightly masscheck.  You can see the 
results of each nightly masscheck at the above link.


https://fedorahosted.org/auto-mass-check/
We are seriously in need of additional volunteers in the nightly 
masscheck.  Please read this page to learn how to join.


Warren Togami
war...@togami.com


Re: The one year anniversary of the Spamhaus DBL brings a new zone

2011-03-08 Thread Warren Togami Jr.

On 3/8/2011 9:58 AM, Bill Landry wrote:

FYI: Spamhaus created a new URL shortener/redirector zone in the
DBL. See:

http://www.spamhaus.org/news.lasso?article=667

Will Spamassassin be adding support for this new DBL
shortener/redirector response code?:

127.0.1.3 spammed redirector domain

For details, see:

http://www.spamhaus.org/faq/answers.lasso?section=Spamhaus%20DBL#291

Regards,

Bill


OK, so this is meant to be used as a URIBL.  I don't see this as 
anything special because there is no way to query the pathname portion 
of a URI which would allow more fine-grained detection of spammy URI's 
even on a non-evil shortening service.


Is this new DBL return code meant to be a lower score than ordinary 
URIBL's that often choose to list evil shortener domains?  My point is 
this is no different than an ordinary URIBL listing.
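
To illustrate the point about the pathname: a domain-based list is queried with the hostname only, so every link on the same shortener produces the same lookup.  The Python sketch below assumes the usual dbl.spamhaus.org zone name; the shortener hostname and the helper names are hypothetical, and no DNS query is actually sent.

```python
from urllib.parse import urlsplit

DBL_ZONE = "dbl.spamhaus.org"
REDIRECTOR_CODE = "127.0.1.3"  # "spammed redirector domain", as quoted above


def dbl_query_name(uri):
    """Build the DNS name to look up for a URI; the path is discarded."""
    host = urlsplit(uri).hostname
    if not host:
        raise ValueError("URI has no hostname: %r" % uri)
    return "%s.%s" % (host.rstrip(".").lower(), DBL_ZONE)


def is_spammed_redirector(answers):
    """True if the returned A records include the redirector code."""
    return REDIRECTOR_CODE in answers


# Two different short links, one identical query name.
print(dbl_query_name("http://short.example/abc123"))
print(dbl_query_name("http://short.example/zzz999"))
```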


Warren


Re: Open letter to Yahoo and Hotmail concerning junkmail

2011-03-07 Thread Warren Togami Jr.

On 3/6/2011 3:15 AM, Ned Slider wrote:

On 06/03/11 11:46, Warren Togami Jr. wrote:

I have no comment on your proposed solution. I can however point out the
statistics that I see on my own spam traps.

It seems that 90%+ of the spam coming from DNSWL listed hosts is Yahoo
and Hotmail which are listed as DNSWL_NONE. Meanwhile very little spam
comes from gmail.com. Apparently DNSWL agrees because they give
gmail.com's outgoing MTA's a LOW ranking which is pretty good for a
freemail provider. Google is doing something right in outgoing spam
prevention.

Warren



Exactly.

If Google can manage to do a pretty good job then it just tells me
Microsoft and Yahoo don't care. I've long since stopped caring too and
have scored them in SpamAssassin - the only way their mail gets through
now is if the sender address is whitelisted or they score some negative
points (e.g, Bayes) to get them back below my threshold. These providers
are NOT too big to block and the sooner we all start realising that the
sooner they might start to care about their reputations and stop
emitting huge volumes of spam.

Personally I think it's about time FROM_HOTMAIL and FROM_YAHOO became
high scoring stock rules in SpamAssassin. A score of 3 points might be a
reasonable starting point.


I'd agree, but users won't rebel against Yahoo unless they begin to see 
actual bounces to their sent mail.


I do agree that we should have FROM_HOTMAIL and FROM_YAHOO so we can 
independently decide how to treat their mail separate from typical FREEMAIL.


Warren


Re: Open letter to Yahoo and Hotmail concerning junkmail

2011-03-07 Thread Warren Togami Jr.

On 3/7/2011 2:10 AM, Mynabbler wrote:



Warren Togami Jr. wrote:


I'd agree, but users won't rebel against Yahoo unless they begin to see
actual bounces to their sent mail.


I don't know about your end users, but ours would typically get flummoxed if
mail from these well-known and trusted free mail providers did not reach
them... There are just too many users actually using their services, mixed
with too many spammers abusing it.


My point here is getting an explicit reject is better than silently 
disappearing.  I wasn't commenting on the wisdom of being prejudiced 
against Yahoo or Hotmail though.


Warren


Re: Open letter to Yahoo and Hotmail concerning junkmail

2011-03-06 Thread Warren Togami Jr.
I have no comment on your proposed solution.  I can however point out 
the statistics that I see on my own spam traps.


It seems that 90%+ of the spam coming from DNSWL listed hosts is Yahoo 
and Hotmail which are listed as DNSWL_NONE.  Meanwhile very little spam 
comes from gmail.com.  Apparently DNSWL agrees because they give 
gmail.com's outgoing MTA's a LOW ranking which is pretty good for a 
freemail provider.  Google is doing something right in outgoing spam 
prevention.


Warren


Re: low score for ($1.5Million)

2011-03-03 Thread Warren Togami Jr.

On 3/3/2011 3:06 PM, Karsten Bräckelmann wrote:

On Fri, 2011-03-04 at 01:53 +0100, Mikael Syska wrote:

I get the following hits:
Content analysis details:   (19.1 points, 5.0 required)


Note though, that your score is on SA 3.3.x, while the OP uses SA 3.2.x.
Yes, I can tell this from the scores. :)

Major changes between these versions are clearly reflected in your score
and rules hit. Namely a lot of work by John Hardin to catch exactly such
fraud, and the FreeMail plugin now upstream -- with 3.2 it is available
as a third-party plugin.



Could we please make an official project statement that 3.2.x is 
unsupported and people should really update to 3.3.x?


Warren


Re: DNSWL rules downscoring spam

2011-02-20 Thread Warren Togami Jr.

On 2/20/2011 6:21 AM, Matthias Leisi wrote:

On Sun, Feb 20, 2011 at 4:22 PM, Pasi Hirvonenp...@iki.fi  wrote:

Hello,

I just recently moved our mail setup to new hardware and I've been
paying close attention to what gets marked as spam and what
doesn't.

Looking at my spam folder, I have received roughly 550 spam emails
to my email account since last tuesday (15th). Out of those 550,
*345* have been downscored by RCVD_IN_DNSWL_MED. Annoyingly, a
significant number of those spam mails have dropped just below the
spam threshold because of it.


That should not happen. Can you share some headers?

Thanks,
-- Matthias, for dnswl.org


Matthias, we really need a method to auto-report violations of DNSWL. 
My spam traps receive dozens or more every week.  But I don't have time 
to file a web form every time it happens.


Warren


Re: DNSWL rules downscoring spam

2011-02-20 Thread Warren Togami Jr.

On 2/20/2011 6:31 AM, dar...@chaosreigns.com wrote:


I know of no reason it would be a temporary hiccup, but it is certainly
unusual.  According to spamassassin's mass checks, 0.89% of spam hits
RCVD_IN_DNSWL_MED:  http://www.chaosreigns.com/dnswl/


The masscheck results are a bit misleading, overwhelmed in quantity with 
lower quality trap spam at the moment because the higher quality 
real-address sorted spam is in such low quantity.  A few weeks ago when 
DOS revealed that we need a minimum of 150k spam in a 2 month window I 
even adjusted my servers to include a larger percentage of trap spam.  I 
know this is problematic, but I intend this to be temporary.  I will 
adjust this down as we have more volunteers join the nightly masscheck 
and our overall quantities are boosted.


My point is, DNSWL violations seem to be occurring at a higher rate to 
real e-mail addresses than to fake addresses.


Warren


Re: DNSWL rules downscoring spam

2011-02-20 Thread Warren Togami Jr.

On 2/20/2011 9:11 AM, Michelle Konzack wrote:

Hello Pasi Hirvonen,

On 2011-02-20 17:22:23, you hacked down the following:

Hello,

I just recently moved our mail setup to new hardware and I've been
paying close attention to what gets marked as spam and what
doesn't.

Looking at my spam folder, I have received roughly 550 spam emails
to my email account since last tuesday (15th). Out of those 550,
*345* have been downscored by RCVD_IN_DNSWL_MED. Annoyingly, a


I have EXACTLY the same problem here...  I get around 280.000 per day
on my 8 servers, with 86.000 users.


You couldn't consider me to be anti-DNSWL given that I've been 
strongly promoting DNSWL, urging many to list themselves in recent 
months.  But if automated enforcement doesn't become a reality I am 
going to push harder for further default score reductions for not only 
DNSWL but all whitelists.


I've seen problems with DNSWL and IADB whitelists in the past year.  RP 
whitelists were bothering me during 2009 but I haven't seen so many 
problems during late 2010.


Warren


Re: Sa-update and proxy servers

2011-02-18 Thread Warren Togami Jr.

On 2/17/2011 11:44 PM, Daniel Lemke wrote:



Michael Scheidell wrote:


[...]
I now need to set a proxy server to do sa-updates through, but could not
find any information on settings for a proxy server.

[...]
Added cmd options:
  -x --proxy
  -U --proxy-user
  -P --proxy-password
  -t --connect-timeout.

  [...]



Hi,

just found this old thread regarding the proxy capabilities of sa-update. I
wonder why Michael's patch hasn't been included in the official source.

We've got a customer that wants to use sa-update through a proxy but using a
custom patch to provide such a feature is kind of weird. Would it be
possible to make the patch official? At least it'd be great if one could
specify username and password in addition to the proxy url by using
environment variables for LWP::Agent.

Any comments on this?

Daniel


Was this ever filed as a bug with the suggested patch attached?  Nothing 
gets in the code without a bug filed.


Warren


Re: using spamhaus droplist with sa ?

2011-02-17 Thread Warren Togami Jr.

On 2/17/2011 5:40 AM, RW wrote:


The suggestion is that it be scored higher for that reason.


Or just outright block all MTA connections from anything listed in 
zen.spamhaus.org, which seems to be safe.  Large sites I know have been 
doing that for years without any complaints.
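
For reference, the lookup an MTA performs against zen.spamhaus.org is just a DNS query for the client IP with its octets reversed.  The Python sketch below only builds that query name for IPv4; the function name is hypothetical, and the actual rejection would be configured in the MTA itself, not in code like this.

```python
import ipaddress


def dnsbl_query_name(client_ip, zone="zen.spamhaus.org"):
    """Return the DNS name to query for an IPv4 client address."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return "%s.%s" % (reversed_octets, zone)


print(dnsbl_query_name("192.0.2.25"))
```

Any answer in 127.0.0.0/8 means the address is listed; no answer (NXDOMAIN) means it is not.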


Warren


Re: alert: New event: ET EXPLOIT Possible SpamAssassin Milter Plugin Remote Arbitrary Command Injection Attempt

2011-02-10 Thread Warren Togami Jr.

On 2/10/2011 1:29 PM, John Hardin wrote:

On Thu, 10 Feb 2011, David B Funk wrote:


On Fri, 11 Feb 2011, Jason Haar wrote:


On 02/11/2011 09:37 AM, Mark Martinec wrote:

Yes, the security hole is entirely within the milter,
independent of the MTA.


That exploit is dated Mar 2010? Has this really not been fixed in about
a year???




a year??, try half-a-decade. I've got a copy of that code from March
2006 and the vulnerability is there. Rather stale project. ;)


heh.

I suppose we ought to compose a boilerplate response for the inevitable
visitors who will show up asking about this exploit in SpamAssassin...



Perhaps more than boilerplate, but rather an official advisory to clear 
up the confusion?  Given that upstream of that milter is dead, nobody 
else will make an official advisory?


Warren


Re: mx1.res.cisco.com a dynamic ip?

2011-02-10 Thread Warren Togami Jr.

On 2/10/2011 2:30 PM, Michael Scheidell wrote:

host mx1.res.cisco.com
mx1.res.cisco.com has address 208.90.57.13
$ host 208.90.57.13
13.57.90.208.in-addr.arpa domain name pointer mx1.res.cisco.com.


looks fine to me, why does this look to SA like a dynamic ip?

(TRIGGERED RDNS_DYNAMIC.)

what, because of 'res' in it? yes, they SHOUTED AT THE RECIPIENT, AND I
EXPLAINED DON'T DO THAT IN SUBJECT LINE, its rude.



The RDNS_DYNAMIC rule might be better replaced by the more precise 
S25R-based patterns in KHOP_DYNAMIC.  Care enough?  Please file a bug 
and look into the relative results of the masschecks to start an analysis.
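
To show the general idea of pattern-based detection, here is one illustrative regex in Python in the spirit of S25R: it looks for two digit groups separated by non-digits within the leftmost label of the reverse DNS name.  This is a simplified pattern of my own for the example.  It is not the actual RDNS_DYNAMIC or KHOP_DYNAMIC rule, and the dynamic-looking hostname is made up.

```python
import re

# Illustrative only: digits, then non-digit characters, then digits again,
# all inside the first label (no dot may be crossed).
DYNAMIC_LOOKING = re.compile(r"^[^.]*[0-9][^0-9.]+[0-9]")


def looks_dynamic(rdns_name):
    """Rough guess whether a reverse DNS name looks like an end-user pool."""
    return bool(DYNAMIC_LOOKING.search(rdns_name.lower()))


print(looks_dynamic("mx1.res.cisco.com"))
print(looks_dynamic("dhcp-12-34-56-78.example.net"))
```

A pattern of this shape does not trigger on mx1.res.cisco.com, since the only digit in the first label is followed directly by a dot.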


Warren


Re: Need Volunteers for Ham Trap

2011-02-08 Thread Warren Togami Jr.

On 02/07/2011 05:37 PM, Mahmoud Khonji wrote:

On 01/21/2011 01:06 AM, Warren Togami Jr. wrote:

On 1/20/2011 7:23 AM, R - elists wrote:


initially this came across as a really suspect idea...

i.e., one man's junk is another man's treasure


Ham is a lot easier to define than Spam.  Ham is simply anything that
you subscribed for.



I am currently subscribed to a number of mailing lists to collect ham
emails (in addition to other sources). While it might be true that
mailing lists can be good sources of ham, their emails do not contain
realistic diversity of features/characteristics.


I explicitly excluded discussion mailing lists from the ham trap.



In my view, the issue is not just ensuring an email is ham, but also
ensuring that it contains a realistic set of features. If the features are
not realistic, and if we optimize tests scores based on that, then we
might end up worsening test scores for realistic end-users.


Not if it is subscribed to hundreds of opt-in subscriptions for 
legitimate mail that ordinary users receive, most of which is otherwise 
not represented in the corpora.  Many of these subscriptions send mail 
only once a week or month.


It is true that the hamtrap corpus is synthetic and thus not fully 
representative in frequencies of real ham.  But its volume is only a 
tiny fraction of a percent of our total ham.  It helps us to detect and 
fix problems in individual rules by injecting some variety without 
causing a measurable impact on the entire corpus.




For example, most list emails are non-HTML. While most end-user ham and
spam emails are HTML. Evaluating sets of features (or tests) based on
this unrealistic corpus is likely to fool us into thinking that a
feature/test is more effective than what it is in reality (i.e. we might
end up giving MIME-based tests higher scores).


The spec and implementation of this ham trap already took this and many 
other issues into consideration.  We've already had a few experts here 
conclude the plan is sound.


I'm somewhat annoyed by the armchair quarterback negative comments on 
this topic.  The commenters (not just you) didn't read the rest of this 
thread to realize this particular concern is moot.  None of the people 
complaining about how this is such a bad idea are being helpful by 
actually participating in the nightly masscheck.


Talk is cheap.  I'm actually doing something.

Warren


Re: RFC-Ignorant (was Re: Irony)

2011-02-03 Thread Warren Togami Jr.

On 2/2/2011 7:45 AM, John Levine wrote:

RFC Ignorant is deep into kook territory, as should be apparent if you
look at which RFCs they expect people to follow, and what their
definition of follow is.

abuse.net has been listed for years, since there is an autoresponder
on ab...@abuse.net, and I've never noticed any delivery problems.

One time I asked if they'd delist me if I got rid of the autoresponder
and just threw all the abuse mail away.  Yes.  QED.

Regards,
John Levine, jo...@iecc.com, Primary Perpetrator of The Internet for Dummies,
Please consider the environment before reading this e-mail. http://jl.ly


https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6526
We finally agreed that rfc-ignorant.org is useless, or slightly more 
harmful than good.  Spamassassin will be disabling these rules by 
default sometime soon.


http://www.spamtips.org/2011/01/disable-rfc-ignorantorg-rules.html
You can disable these rules with this config and avoid a useless DNS 
query on every mail scan.


Warren


Spamassassin News Issue #2

2011-01-24 Thread Warren Togami Jr.

Hey folks,

http://lists.fedoraproject.org/pipermail/spamassassin-news/2011-January/01.html
Here is Issue #2 of my Spamassassin for Sysadmins Newsletter.

https://admin.fedoraproject.org/mailman/listinfo/spamassassin-news
Subscribe here.

It is intended to be like a Foo Weekly News publication, except it 
will likely happen monthly as there isn't enough interesting news in a week.


Warren


Re: DCC plugin for SA

2011-01-20 Thread Warren Togami Jr.

On 1/20/2011 12:49 AM, J4 wrote:


Good morning to all of you,

This popped up in the spamd.log after a reboot (done to test everything
worked after a reboot).

  warn: dcc: dccifd -  check skipped: dcc: failed to connect to a socket
/var/dcc/dccifd: Connection refused

The socket is there:
  srw-rw-rw- 1 dcc spamd 0 Jan 10 09:40 /var/dcc/dccifd

local.cf has :-
  use_dcc 1
  dcc_path /usr/local/bin/dccproc

v310.pre has :-
  loadplugin Mail::SpamAssassin::Plugin::DCC

Is there anywhere else I could look?
The last log entry for DCC in /var/dcc/log was yesterday at 16:36, which
makes sense.
-rw--- 1 dcc spamd  2771 Jan 19 16:36 msg.0cTgcW

Is this an SA related problem or specific to DCC.  If the latter, then I
shall seek help elsewhere as it might be considered off-topic.

Best wishes, s



What distribution are you using?



Re: DCC plugin for SA

2011-01-20 Thread Warren Togami Jr.

On 1/20/2011 1:06 AM, J4 wrote:


I had not realised it was in the repos - I just checked and it is. Damn.


I'm surprised it would be in the repos.  DCC is not Free Software.

Warren


Re: Need Volunteers for Ham Trap

2011-01-20 Thread Warren Togami Jr.

On 1/20/2011 7:23 AM, R - elists wrote:


initially this came across as a really suspect idea...

i.e., one man's junk is another man's treasure


Ham is a lot easier to define than Spam.  Ham is simply anything that 
you subscribed for.




for a moment, it appeared we were gonna need to review the good and the bad
of spam-l to avoid serious SA list issues.

statistically speaking, this shouldnt sway the scoring substantially anyways
would it?


You are correct.  This is more of a tool to have *some* variety in the 
ham corpus, to make it possible to flag rules in need of scrutiny.  For 
example, prior to 3.3.x many of our rules were utterly broken with 
Japanese mail.  We had no idea of this fact until I added a few thousand 
Japanese mails to the ham corpus.  JM understood the problem and fixed 
those rules.




what should be known so that bad data is not allowed into the HAM corpus ?



The previous discussion described a sort of tagged sender ham trap. 
This simple process automatically excludes extraneous mail in cases 
where the address was shared with affiliates or spammer lists.  We 
also will be careful in sticking to reputable companies and orgs for the 
ham trap.


Warren


Re: What is Ham? (was Re: Need Volunteers for Ham Trap)

2011-01-20 Thread Warren Togami Jr.

On 01/20/2011 11:31 AM, Bowie Bailey wrote:


Public discussion lists are a bit different.  In that case, it is the
individual post that is being considered spam rather than considering
the list spammy.  Since there is no overall control over the content of
the posts, public lists are vulnerable to being filled with spam if the
list owners are not paying attention.


For this reason, the ham trap will not be subscribed to any discussion 
lists.




When you sign up for a company's email list, you get whatever they
decide to send you.  If they decide to start sending marketing to the
list, I would not consider that spam because they own the list and they
can decide what to use it for.  The recipients signed up to get that
company's emails and if they no longer want to receive them, they can
unsubscribe.  And as I said before, if the unsubscribe function doesn't
work, then the emails become spam (regardless of the actual content).



Your understanding is exactly correct.

Warren


Re: Need Volunteers for Ham Trap

2011-01-19 Thread Warren Togami Jr.

On 01/18/2011 11:49 PM, Jeff Chan wrote:

On Tuesday, January 18, 2011, 4:59:05 AM, Warren Jr. wrote:


* Yes, we cannot be 100% sure our opt-in was only for that particular
site and not their partners.  But in any case automatic ham trapped
mail will be only the mail branded by the subscribed provider, because
that is the only mail we know for sure was opted-in.  Anything else is
kept separate for later analysis.



* If clearly spammy other mail arrives at a particular address, the
original subscription can be unsubscribed and the continued flow
monitored.  That address could then be discarded.


Both seem reasonable approaches.

Those degenerate cases of both are indeed interesting.

Cheers,

Jeff C.


Yes, I think this is a reasonably simple and effective plan.  I only 
need volunteers to help me find appropriate sites and to help subscribe. 
 It is very boring to do all this myself.


Warren


Re: Need Volunteers for Ham Trap

2011-01-18 Thread Warren Togami Jr.

On 1/17/2011 11:46 PM, Jeff Chan wrote:


So a couple points:

1.  Subscribing to lists opens up lots of grey areas including
the above.

2.  Some of the areas are very difficult to resolve into spam or
ham.  Some more aggressive anti-spammers may say all of the above
is spam, but others may disagree, and the mail may be legal.

Before anyone accuses me of being in favor of spammers, please be
aware that I am personally highly against any of these unethical
practices, but when essentially making decisions for others, one
needs to be very careful and consider whether there may be legitimate,
ethical, legal or even wanted uses of such things.  One person's
ham may be another persons spam, and vice versa.  However, most
people don't want the stuff bots send.

The issue is complex, and there are many deliverability, security
and anti-spam companies and organizations that struggle with these
issues every day.  Maintaining accurate ham and spam corpora and
making policies for what belongs in which category is trivial in
some easy cases like bot pill spam, but non-trivial in other
cases.

Cheers,

Jeff C.


I appreciate the nuanced feedback but I have thought of similar 
considerations.  I believe the following will help to avoid ambiguity 
and legal issues.


* Yes, we cannot be 100% sure our opt-in was only for that particular 
site and not their partners.  But in any case automatic ham trapped 
mail will be only the mail branded by the subscribed provider, because 
that is the only mail we know for sure was opted-in.  Anything else is 
kept separate for later analysis.


* If clearly spammy other mail arrives at a particular address, the 
original subscription can be unsubscribed and the continued flow 
monitored.  That address could then be discarded.
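
The two rules above can be sketched in a few lines of Python.  Each trap address is tied to the one sender it was subscribed to; mail from that sender is ham, and anything else arriving at the address is set aside.  The addresses, domains and names are hypothetical, and matching on the From header is a simplification of "branded by the subscribed provider"; a real trap would want something harder to forge.

```python
from email.utils import parseaddr

# Hypothetical trap addresses mapped to the domain each was subscribed to.
SUBSCRIPTIONS = {
    "trap-0001@hamtrap.example": "airline.example",
    "trap-0002@hamtrap.example": "retailer.example",
}


def classify(recipient, from_header):
    """Return 'ham', 'held_for_analysis' or 'unknown_address'."""
    expected = SUBSCRIPTIONS.get(recipient.lower())
    if expected is None:
        return "unknown_address"
    _, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower()
    if domain == expected or domain.endswith("." + expected):
        return "ham"
    # Mail from anyone else means the address leaked or was shared.
    return "held_for_analysis"


print(classify("trap-0001@hamtrap.example", "News <news@mail.airline.example>"))
print(classify("trap-0001@hamtrap.example", "Deals <x@partner.example>"))
```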


Warren


Re: Need Volunteers for Ham Trap

2011-01-18 Thread Warren Togami Jr.

On 1/18/2011 1:15 AM, Martin Gregorie wrote:

On Tue, 2011-01-18 at 01:46 -0800, Jeff Chan wrote:


While I certainly would encourage improving ham and spam corpora,
this proposal may open up a lot of grey areas that may be
non-trivial to resolve.


Agreed, and some companies will get you to sign up for accounting and
service problem notifications and then pump advertising down the channel
in such volume that the purpose for which you signed up seems utterly
forgotten.

British Telecom sets a bad example here: they even behave like a spammer
inasmuch as they regularly vary their promotions text to dodge spam
filters. I'd be worried that if word gets around that SA is developing
rules that give signed-up bulk mail a free ride then a lot more
companies will do the same.


This is a misunderstanding.  I am largely against whitelisting or 
negative score rules.  I merely intend to increase the variety of 
legitimate mail in the nightly ham corpus so our spam-hostile rules can 
be better tested for safety.  This will be interesting especially with 
non-English ham.


Warren


Re: Greylisting delay (was Re: Q about short-circuit over ruling blacklisting rule)

2011-01-18 Thread Warren Togami Jr.

On 01/18/2011 12:31 PM, David F. Skoll wrote:

On Tue, 18 Jan 2011 22:18:20 +
Gary Forrestga...@netnorth.co.uk  wrote:


Interesting 2 of our 3 scanning heads use a grey list system that
uses /32 addresses as part of the process, these two servers have
100's of emails delayed for well over a day. Our 3rd scanning head
uses a grey list system that is less granular /24 ,  this does not.


Ah, I should mention that we use a /24 for greylisting for IPv4 and a
/64 for IPv6.  On the other hand, we also add a hash of the subject
into the greylisting tuple so it becomes:


I recently gave up entirely on greylisting after:

* Last week I discovered /24 was not good enough for redelivery attempts 
at one major ISP.  All mail from that ISP was failing for the past month 
except in rare cases where randomly the same /24 attempted delivery 
within the time window.


* Years of complaints of mail delivery delays or failures from my users.
They had begun creating gmail accounts in order to bypass it.  They kept
running into too many cases of broken individual mail servers (major
companies!) that failed to redeliver.


Users don't care that so-and-so is violating RFC-XXX.  They are
trying to get business done and it was simply causing too many problems.
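
To make the /32 versus /24 difference above concrete, here is a minimal Python sketch of deriving the network part of a greylisting key. The function name, the prefix-length parameters, and the sample addresses (taken from documentation ranges) are assumptions for illustration only, not how any particular greylisting daemon is implemented.

```python
import ipaddress


def greylist_network_key(ip, v4_prefix=24, v6_prefix=64):
    """Return the network an address is grouped under for greylisting."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False masks off the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))


# Hypothetical retries from one sender's outbound pool.
first_try = greylist_network_key("192.0.2.10")
retry_same_24 = greylist_network_key("192.0.2.200")
retry_other_24 = greylist_network_key("198.51.100.7")

print(first_try == retry_same_24)   # same /24, so the retry is recognised
print(first_try == retry_other_24)  # different /24, so the retry looks new
```

With v4_prefix=32 even the first comparison fails, which matches the delays reported on the /32 scanning heads. The second comparison is the case described above, where an ISP retries from a different /24.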


Warren


Re: Need Volunteers for Ham Trap

2011-01-18 Thread Warren Togami Jr.

On 01/18/2011 03:25 PM, Dave Pooser wrote:

On 1/18/11 12:52 AM, Warren Togami Jr. wtog...@gmail.com wrote:


I am seeking volunteers to help me build and administrate a ham trap.
The idea is to subscribe a list of unique e-mail addresses to various
retailers, airlines, government and other legitimate bulk mail senders.


The possible fly in the ointment I see is that you wouldn't necessarily have
access to some sorts of transactional emails-- airline flight reminders and
things of that nature. Would that be something where you'd be interested in
getting mail cc:ed to a hamtrap address? For example, I use tagged email
addresses for different airlines, and it would be trivial for me to have my
server relay those messages to a hamtrap address as well as delivering to my
personal email if that sort of thing would be useful.


You are correct that this isn't transactional mail.  It is however 
low-effort automatic collection of a subset of ham that real users 
receive, much of which we are entirely missing from the nightly corpus.


https://fedorahosted.org/auto-mass-check/
As for the ham you suggest, I highly suggest running your own nightly 
masscheck and uploading logs.  This avoids privacy problems and allows 
you to check/correct quality issues in your own corpus.


Warren


Re: SARE and RulesDuJour still relevant

2011-01-15 Thread Warren Togami Jr.

On 01/15/2011 01:36 AM, Ned Slider wrote:


In a year of running them locally I've never seen them hit on a ham
message. They appear to hit quite well for me because I pre-filter 95%+
of my spam at the smtp level (greylisting, HELO checks, spamhaus etc) so
SA only gets to see the difficult to catch stuff which might inflate the
percentage hits. As I said, they typically hit against bank phish sent
from compromised accounts on legit servers hence why they make it
through greylisting and many DNSBLs.

In my corpus of 3402 spam I see NSL_RCVD_FROM_USER hit 604 (17.8%) and
NSL_RCVD_HELO_USER hit 181 (5.3%). As there is (virtually?) no overlap,
that's a combined hit rate of ~23%, the vast majority of which I would
bet is bank phish. That is why I say these rules perform well for me -
once you take out the spam that's trivial to filter (spambot spam), the
hit rate against the remaining spam goes up.


It seems that NSL_RCVD_FROM_USER is indeed safe (no FP's except for 
trec_enron), but the spam hit rate may vary wildly on different targets. 
 My servers without any pre-spamassassin filters are seeing ~0.5-1.5% 
hit rates.


72_scores.cf
score NSL_RCVD_FROM_USER 1.180 1.226 1.180 1.226

spamassassin-3.3.x already has NSL_RCVD_FROM_USER with a production 
score.  I am confused as to how NSL_RCVD_FROM_USER got this score, 
because AFAICT NSL_RCVD_FROM_USER was not in the 3.3 masscheck.


In any case, OR with NSL_RCVD_HELO_USER isn't going to be helpful as
you're only piling up more score.  Assigning a score to the HELO rule
might be a good idea if we are certain it is safe.  OTOH, the masschecks
indicate very few hits at all on that rule.


Warren


Re: SARE and RulesDuJour still relevant

2011-01-14 Thread Warren Togami Jr.

On 1/14/2011 2:28 AM, James Lay wrote:

Hey All!

Been a while since I did a full blown install of SpamAssassin, and as
I'm looking at my old setup, I see a fair amount of changes. I have the
SARE rules as well as RulesDuJour running, but noticed that on a fresh
install of SA, after doing an sa-update, there are very few rules files
(the bulk of which are in /var/lib/spamassassin/3.003001/). Have rules
been optimized or something? Should I copy over all the SARE rules and
setup RulesDuJour to update, or leave as is? Thanks for the input.

James


http://www.spamtips.org/
See my blog for current recommendations of rules that are tested to be 
safe.  I use nightly masscheck results at 
http://ruleqa.spamassassin.org/ in addition to local masschecks to 
verify that rules are safe before making recommendations.


https://admin.fedoraproject.org/mailman/listinfo/spamassassin-news
Spamassassin for Sysadmins Newsletter

You have installed all the optional plugins, right (pyzor, razor, dcc)?

http://www.spamtips.org/2010/12/cacheredir-rule-prevent-google-cache.html
CACHEREDIR here has proven to be completely safe, while effective 
against 1-4% of low scoring spam.


http://wiki.apache.org/spamassassin/SoughtRules
Use SOUGHT.  It is good.

Anyone else have effective local rules?  Please let me know and I'll put 
them into the nightly masscheck for testing.


Warren


Re: SARE and RulesDuJour still relevant

2011-01-14 Thread Warren Togami Jr.

On 01/14/2011 01:09 PM, Ned Slider wrote:

On 14/01/11 21:04, Warren Togami Jr. wrote:


Anyone else have effective local rules? Please let me know and I'll put
them into the nightly masscheck for testing.

Warren




header NSL_RCVD_HELO_USER Received =~ /helo[= ]user\)/i
describe NSL_RCVD_HELO_USER Received from HELO User

Might want to combine into a meta rule with existing NSL_RCVD_FROM_USER
rule:

header NSL_RCVD_FROM_USER Received =~ /from User [\[\(]/
describe NSL_RCVD_FROM_USER Received from User

The above are particularly effective (here) against 419 / bank phish
type emails sent from compromised webmail accounts. Hit rate is not
great, but the FP count is near zero.
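
To illustrate what the two patterns above match, here is a small Python sketch that applies equivalent regular expressions to a few sample Received lines. The sample headers and the helper name rules_hit are made up for illustration. SpamAssassin itself evaluates these as Perl regexes against the Received headers, so treat this only as an approximation of the matching logic.

```python
import re

# Python equivalents of the two header patterns quoted above.
HELO_USER = re.compile(r"helo[= ]user\)", re.IGNORECASE)  # /helo[= ]user\)/i
FROM_USER = re.compile(r"from User [\[\(]")               # /from User [\[\(]/


def rules_hit(received):
    """Return the names of the rules whose pattern matches this header text."""
    hits = []
    if FROM_USER.search(received):
        hits.append("NSL_RCVD_FROM_USER")
    if HELO_USER.search(received):
        hits.append("NSL_RCVD_HELO_USER")
    return hits


# Hypothetical Received headers (documentation IP ranges and domains).
samples = [
    "from User ([203.0.113.5]) by mail.example.com with ESMTP",
    "from unknown (HELO User) (203.0.113.5) by mail.example.com",
    "from mx.example.net (mx.example.net [198.51.100.9]) by mail.example.com",
]
for line in samples:
    print(rules_hit(line), line)
```

Both patterns key on a client that announces itself literally as "User", either as the apparent sending host or in the HELO string.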

Regards,

Ned


Thanks Ned,

Both of the above rules are already in 
trunk/rulesrc/sandbox/jhardin/20_misc_testing.cf.


http://ruleqa.spamassassin.org/20110114-r1058896-n/NSL_RCVD_FROM_USER/detail
0.5% spam hit rate, and some ham hits, however they are all in the 
ancient enron corpus that we will soon be removing.


http://ruleqa.spamassassin.org/20110114-r1058896-n/T_NSL_RCVD_HELO_USER/detail
Very few spam hits, and a number of ham hits but all in DOS's corpus. 
Perhaps we should ask him if they really are ham?


Could you please describe how these rules work, and why the combination 
of them would be useful?


NSL_RCVD_FROM_USER already has a score.

It appears that the combination of the two rules will be zero masscheck 
FP's, but a maximum of 0.1% spam hits.  I suppose this is worthwhile for 
a night of testing, but I suspect it will be too small?


Warren


What's up with AHBL?

2011-01-08 Thread Warren Togami Jr.
http://ruleqa.spamassassin.org/20110107-r1056221-n/DNS_FROM_AHBL_RHSBL/detail

I just noticed this network rule with very poor performance.  0.02% spam
detected in recent masschecks.

My local logs show 16 hits out of 300K mail scanned in the last several
months, 2 of which were false positives.

http://ruleqa.spamassassin.org/20090930-r808953-n/DNS_FROM_AHBL_RHSBL/detail
Apparently it was performing poorly even in the 3.3.0 rescore masscheck late
2009, with 0.072% spam detected in the much larger sample of the rescore
masscheck.

NJABL and rfc-ignorant.org were controversial at 1% spam, but certainly
*this* is an obvious candidate for removal?

Where should we draw the line?

Warren Togami
war...@togami.com


Re: New plugin: DecodeShortURLs

2011-01-06 Thread Warren Togami Jr.
On Wed, Jan 5, 2011 at 2:41 AM, Warren Togami Jr. wtog...@gmail.com wrote:

 The only trouble here is HTTP's TCP handshake and teardown is significantly
 slower than DNSBL and URIBL lookups already used in spamassassin.  My
 average scan time is less than one second.  A plugin that catches the 1% of
 URL shortening spam is only worthwhile if it doesn't slow down your mail
 scanning considerably.  Doing the HTTP query asynchronously would help, but
 I fear that this could easily add several seconds per mail.

 Warren


Another problem... spammers could intentionally max out the number of
shortener URL's per spam.  The URL's don't even have to be real.  Any random
garbage after the domain name will trigger a HTTP get, and render the local
cache useless.  HTTP get could happen dozens or hundreds of times a minute
until the shortening service decides to block the spamassassin IP.


Re: New plugin: DecodeShortURLs

2011-01-06 Thread Warren Togami Jr.
On Thu, Jan 6, 2011 at 7:23 AM, Henrik K h...@hege.li wrote:


 There are lots of plugins out there that aren't part of the core for one
 reason or another. If you ask me, this is one of them. It is just asking for
 trouble if widely used. It's not the only way to solve the problem anyway. And the
 problem itself is somewhat temporary in nature, just like image spam was
 etc.


I don't disagree, but I am wondering how is this temporary?

Warren


Re: New plugin: DecodeShortURLs

2011-01-05 Thread Warren Togami Jr.
On Sat, Jan 1, 2011 at 7:19 AM, Steve Freegard st...@stevefreegard.com wrote:

  On 01/01/11 11:51, Warren Togami Jr. wrote:

  I'll help you start the process with a Bugzilla ticket.  I also hope you
 could get it into some sort of public source control mechanism soon so we
 can see the changes that go into it before inclusion in upstream.  I feel
 uncomfortable using something that is only available from a URL without
 being able to see its change history.

 Know how to use git?  github.com is pretty good for something small like
 this.



 Sure. No problem.


Set up a git repository?  I'd like to collaborate on development on this
plugin.


 2) How widespread is URL shortening abuse now?  I can figure this out very
 easily by adding a non-network URI rule to the nightly masscheck.  Could you
 please send me privately your updated list of shorteners so that I may write
 such a rule?


 Based on the reports I get - quite prevalent at times and when these are
 used it's effectively a free-pass through the URIBL plug-in which often
 results in a false-negative.

 As soon as I've sorted out the list - I'll send it to you.


According to yesterday's masschecks, it appears that roughly 1% of spam and
1% of ham contains a URL shortener.  Of the spam in the corpus that contains
a URL shortener, ~49% scores 5 points or fewer.  A score this low probably
means they are successful in avoiding positive URIBL hits.  If you look at
the borderline scores all the way up to 7, then you're looking at 64% of URL
shortening spam.  Higher scores are almost always a sign that the URL
shortener domain itself is listed in a URIBL, probably because they didn't
police themselves and they were abused too much.  But the spam bias of URL
shorteners is definitely weighted heavily toward the lower end of
spamassassin scoring, meaning this is a worthwhile approach to develop.

The only trouble here is HTTP's TCP handshake and teardown is significantly
slower than DNSBL and URIBL lookups already used in spamassassin.  My
average scan time is less than one second.  A plugin that catches the 1% of
URL shortening spam is only worthwhile if it doesn't slow down your mail
scanning considerably.  Doing the HTTP query asynchronously would help, but
I fear that this could easily add several seconds per mail.

Warren


What NOT to use?

2011-01-05 Thread Warren Togami Jr.
Can anyone think of custom rules or old sites that continue to be online,
misleading people into believing that they should be using some custom rule
or plugin that is no longer effective or safe?   The former SARE repo was
the only one that I know about, but there are apparently others.

http://www.rulesemporium.com/
http://saupdates.openprotect.com/
I vaguely recall people saying for years that portions are safe, why not
include only the safe portions?  Otherwise these instructions should be
taken offline as they are doing more harm than good.

Warren


Re: IPv6 DNSBL/WL design, was Fwd: [Asrg] draft-levine-iprangepub-01

2011-01-04 Thread Warren Togami Jr.
On Mon, Jan 3, 2011 at 9:27 PM, Jason Haar jason.h...@trimble.co.nz wrote:

 On 01/04/2011 04:50 PM, Dave Pooser wrote:
  Frankly, I'd think that besides costing the spammers money (a good thing
 in
  and of itself)
 ...spammers steal other people's resources - so they'll pay nothing...
 The best case scenario we can ever hope for is that they will be stuck
 sending all their spam using the From: address and SMTP server of the
 infected host - nothing better is possible, unless you can figure out
 how to stop 100% of humanity clicking on %*# executables.



Some ISP's appear to be doing a much better job at preventing
spam-through-official-SMTP-servers than they used to.  I just now noticed
that rr.com appears to be using Cloudmark on customer mail leaving their
official MTA's.  Looking through my logs, it appears very little of my spam
is coming from official rr.com MTA's these days.  This is a good sign.  Now
why can't Yahoo do this!? =)

Warren


DNSBL Safety Report 1/2/2011

2011-01-03 Thread Warren Togami Jr.
http://www.spamtips.org/2011/01/dnsbl-safety-report-122011.html
Further on the topic of RBL's, I wrote this article yesterday for add-on
DNSBL's for spamassassin.

(BTW, I do agree that zen.spamhaus.org is an excellent choice for outright
blocking of spam.)

Warren


Re: lots of freemail spam

2011-01-02 Thread Warren Togami Jr.
I've been thinking, perhaps we should consider making a Freemail Realtime
BL that lists not IP addresses, but rather ID's at the Freemail provider.

1) I am assuming that ID's you see in headers of mail from Yahoo are always
from an authenticated user?
2) Traps and user reports can quickly list a new Freemail user ID.
3) Subsequent spam from that user ID is more easily blocked because the RBL
has the ID listed.
4) The RBL feed can be automated to be sent to the provider (like Yahoo) so
they can more quickly enforce locking down compromised accounts or enforce
their ToS.

Warren


Re: lots of freemail spam

2011-01-02 Thread Warren Togami Jr.
If I understand that thread correctly, that is for e-mail addresses in body
text?

I'm suggesting looking only at authenticated UID's in headers from specific
providers like Yahoo who are notorious for spam, but their MTA's also send a
significant amount of ham so we cannot DNSBL block them.  Given that we know
the UID's cannot be spoofed (if we verify the delivery with DKIM), such a BL
can be safely populated in an automated fashion using spam traps.
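
As a rough sketch of the idea above: the list would be keyed on (provider, user ID) rather than on IP address, and entries would only be added or consulted when the message's DKIM signature verified. The class name, method names, and sample IDs below are hypothetical, and the DKIM check is assumed to be done elsewhere and passed in as a boolean.

```python
class AuthUserBlocklist:
    """Toy in-memory list of (provider, user ID) pairs seen sending spam."""

    def __init__(self):
        self.listed = set()

    def report_spam(self, provider, user_id, dkim_verified):
        # Without a verified signature the ID could be spoofed, so ignore it.
        if not dkim_verified:
            return False
        self.listed.add((provider.lower(), user_id.lower()))
        return True

    def is_listed(self, provider, user_id, dkim_verified):
        # Only trust a match when this message's signature also verified.
        return dkim_verified and (provider.lower(), user_id.lower()) in self.listed


bl = AuthUserBlocklist()
bl.report_spam("yahoo.com", "spammer123", dkim_verified=True)   # trap hit, listed
bl.report_spam("yahoo.com", "innocent42", dkim_verified=False)  # unverified, skipped
print(bl.is_listed("yahoo.com", "spammer123", dkim_verified=True))
print(bl.is_listed("yahoo.com", "innocent42", dkim_verified=True))
```

A real service would presumably publish this over DNS like other RBLs. The in-memory set here only shows the keying and the verification gate.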

So this might be more of an Authenticated User RBL.

Warren


Re: New plugin: DecodeShortURLs

2011-01-02 Thread Warren Togami Jr.
http://ruleqa.spamassassin.org/20110102-r1054364-n/T_URL_SHORTENER/detail
I inserted a giant uri regex into the nightly masscheck in order to get a
rough measure of the true extent of the URL shortener problem.

It appears that under 1% of spam is abusing shortening redirectors.  ~40% of
the shortening redirector spam has local-only spamassassin scores below the
5 point threshold.  We'll see next Saturday how it scores with all network
rules.

Warren


Re: New plugin: DecodeShortURLs

2011-01-01 Thread Warren Togami Jr.
What is the status of this plugin?

I notice that there is no Bugzilla ticket for this plugin.  Do you intend on
submitting it for inclusion in future spamassassin upstream?

Would a DoS happen if the scanned e-mail contains 10,000 short URL's, and
your mail server is hit by many such mails?  (Either spamassassin becomes very
slow, or you piss off the short URL provider by hitting them too quickly and
often.)

Could the plugin detect when there are intentionally too many short URL's?
If so, what should it do in such cases?  Are there ever legit reasons for an
e-mail to have a large number of short URL's?
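
One possible answer to the questions above, sketched in Python: collect the distinct short URLs in a message, look up only the first few, and flag the message when the count exceeds the cap. The shortener list, the cap of 10, and the function names are assumptions for illustration. This is not the plugin's actual code.

```python
import re

# Illustrative list only; a real deployment would maintain its own.
SHORTENERS = {"bit.ly", "tinyurl.com"}
URL_RE = re.compile(r"https?://([^/\s]+)(/\S*)?", re.IGNORECASE)


def short_urls(text):
    """Return distinct shortener URLs in order of first appearance."""
    seen = []
    for match in URL_RE.finditer(text):
        if match.group(1).lower() in SHORTENERS and match.group(0) not in seen:
            seen.append(match.group(0))
    return seen


def plan_lookups(text, max_lookups=10):
    """Cap the number of lookups and flag messages that exceed the cap."""
    urls = short_urls(text)
    return {"lookups": urls[:max_lookups], "too_many": len(urls) > max_lookups}


body = " ".join(f"http://bit.ly/x{i}" for i in range(12))
plan = plan_lookups(body)
print(len(plan["lookups"]), plan["too_many"])
```

Deduplication alone does not help against random garbage after the domain name, since every such URL is distinct. The hard cap is what bounds the number of HTTP requests per message.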

Warren Togami
war...@togami.com


Re: New plugin: DecodeShortURLs

2011-01-01 Thread Warren Togami Jr.
.

Warren Togami
war...@togami.com


Re: New plugin: DecodeShortURLs

2011-01-01 Thread Warren Togami Jr.
http://www.surbl.org/faqs#redirect
BTW, this page mentions SpamCopURI and urirhdbl as existing tools that
handle redirection to some degree.  Have you confirmed that you are not
needlessly reinventing the wheel?  It is entirely possible that your design
with suggestions here could be better than the existing tools, but it might
be worthwhile to look at the existing tools to see if they have useful ideas
to borrow.

Warren


Re: New plugin: DecodeShortURLs

2011-01-01 Thread Warren Togami Jr.
On Sat, Jan 1, 2011 at 7:19 AM, Steve Freegard st...@stevefreegard.com wrote:

 7) How fast are typical URL shortening responses?  What is the timeout?  We
 want to avoid degrading the scan time and delivery performance of
 spamassassin, but in a way that cannot be abused by the spammer to evade
 detection.


 This could be a problem with your huge list of shortening services.  If you
 blindly include all possible shortening services, spammers could
 purposefully use only the slowest in order to time out spamassassin.  Web
 browsers are more forgiving in timeouts, so a slow redirector is the ideal
 way to evade your plugin.

 It is possible that you may want to include only the most reputable
 shortening services by default, because you don't know what will happen
 during the multiple years of your plugin being deployed on arbitrary
 servers.  Other less reputable shortening services might be hijacked, domain
 ownership changed, or simply neglected and become slow.  Such services may
 need to be blacklisted entirely.  For the non-default shortening services,
 it may be safe only if it can be updated via sa-update.


 The timeout is set to 5 seconds and with a default of 10 short URIs scanned
 it would take 50 seconds before it timed out the lookups.  Thinking about it
 I could possibly mitigate this by tracking timeouts by shortener domain; so
 if the 1st lookup to that shortener service timed-out then it wouldn't
 attempt the rest.


Everything else about this sounds very good, but this part is a bit
worrisome.  Looking through my logs, my average scantime is under 1 second.
During debugging a timeout of 5 seconds would be fine in order to help
determine how fast the shorteners typically respond.  But changes are needed
to avoid severely impacting delivery times.

* Consecutive timeouts won't work.  The combined timeout of all short lookups
when this plugin goes into production must be under maybe 3-5 seconds.
* I know this would be difficult, but would it be possible to make
asynchronous and concurrent queries to the shorteners instead of
one-after-another?  Kind of like how the URIDNSBL plugin currently works.
There might be some complications here, like most HTTP servers will only
respond to the first two concurrent connections from an IP address while
further connections are serviced only after the first two have disconnected.
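
To show the difference between consecutive per-URL timeouts (10 lookups x 5 seconds = 50 seconds worst case, as quoted above) and one combined budget, here is a minimal Python sketch that runs the lookups concurrently and stops waiting once the overall budget is spent. The function names and the fake resolver are hypothetical stand-ins. No real HTTP is done here, and this is not how the plugin or URIDNSBL is implemented.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def resolve_with_budget(urls, resolve, total_budget=3.0):
    """Run resolve(url) for all URLs at once; wait at most total_budget seconds."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(urls)))
    futures = {pool.submit(resolve, url): url for url in urls}
    done, not_done = wait(list(futures), timeout=total_budget)
    # Lookups that raised an error are simply left out of both results.
    resolved = {futures[f]: f.result() for f in done if f.exception() is None}
    timed_out = [url for f, url in futures.items() if f in not_done]
    # Stragglers are abandoned, not killed; we just stop waiting for them.
    pool.shutdown(wait=False)
    return resolved, timed_out


def fake_resolve(url):
    """Stand-in for an HTTP lookup: 'slow' hosts take one second to answer."""
    if "slow" in url:
        time.sleep(1.0)
    return "http://landing.example/"
```

Calling resolve_with_budget(["http://short.example/a", "http://slow.example/b"], fake_resolve, total_budget=0.2) returns the first URL as resolved and the second as timed out. The timed-out list is what a rule like SHORT_URL_TIMEOUT could report.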

Rule Ideas

SHORT_URL_MULTI10
SHORT_URL_TOOMANY
Rules triggering on suspicious behavior even if your plugin didn't have time
to query it all.

SHORT_URL_TIMEOUT
The plugin could print out which URL timed out.  Something like:

X-Spam-Report:
*  0.5 SHORT_URL_TIMEOUT Shortened URL Timedout
*  [3 second timeout for http://example.com/298fauu]

Warren


Re: IPv6 DNSBL/WL design, was Fwd: [Asrg] draft-levine-iprangepub-01

2010-12-30 Thread Warren Togami Jr.
On Thu, Dec 30, 2010 at 5:21 PM, Ted Mittelstaedt t...@ipinc.net wrote:

 On 12/30/2010 5:43 PM, John Levine wrote:

 Ah, I see the problem.  You're assuming that spammers will follow the
 rules.  That's a poor assumption.


 No, I am assuming the spammers will do as they have always done in the
 past - attempt to use other people's computers for free.  Other computers
 that are NOT cycling through lots of IP numbers in the normal case.


I didn't want to get into this debate, but I think this point is naively
optimistic.  If a system is capable of cycling through IP addresses, the
spammer will take advantage of this.  It is trivial to do this on a Linux
machine without disrupting operation of the owner's software by
adding/removing IP aliases.  I would assume there is a way to do it on
Windows as well, although it is better hidden.

Warren


Re: NJABL is dead?

2010-12-28 Thread Warren Togami Jr.
Folks here are missing the point, that NJABL is catching not much of
anything, like less than 1% of spam, and with a relatively high FP ratio.  I
don't understand this desire to keep such a poor performing rule, especially
when it costs a network query.

Warren


Re: NJABL is dead?

2010-12-28 Thread Warren Togami Jr.
Whoa.  Ted please calm down.  I think you read too much into this and are
seriously overreacting.  I didn't propose immediately replacing NJABL with
something else like mailspike.  I was only pointing out that NJABL was
performing very poorly, to such an extent that you're better off removing it
because it is needlessly using your resources.  In effect my proposal makes
nearly zero difference to SpamAssassin's current performance because these
rules are nearly useless.

The process of adding new DNSBL's to the official spamassassin rules is very
lengthy.  Among the things we need to improve/verify for eligibility: As you
have correctly noted, the website of Mailspike needs improvements.  Then we
need to ask about the robustness of the mirror network.  Then ask for
clarification about future plans for taking it private and demanding money
from users.  I also know about other measures to further improve Mailspike's
performance.

Masschecks have confirmed for over a year now that Mailspike's performance
is awesome.  Even after the above things are done, it still might be months
or even a year before SpamAssassin uses it as a default rule, because
current policies seem to allow for big changes like this only at major
releases like 3.4.0.

It seems we need a general discussion about rule update policies and
procedures, soon to happen on dev@ list.

Warren

On Tue, Dec 28, 2010 at 6:23 PM, Sahil Tandon sa...@freebsd.org wrote:

 On Tue, 2010-12-28 at 22:44:09 +, João Gouveia wrote:

  Again, a bit harsh, but I see your point.  We shall improve the web
  site whenever possible.  As everything free (and we would like to keep
  it that way), it's kind of subject to time+effort constraints, and
  typically we prefer to make use of that improving the efficiency of
  the list, and not so much working on the web site..

 João, please do not be discouraged by the ranting.  We use mailspike at
 multiple sites and it is a valuable, low-FP addition to the DNSBL
 arsenal.  Thanks for your efforts.

 --
 Sahil Tandon sa...@freebsd.org



Re: NJABL is dead?

2010-12-28 Thread Warren Togami Jr.
On Tue, Dec 28, 2010 at 8:11 PM, Ted Mittelstaedt t...@ipinc.net wrote:

 All very good points.  I guess I'm a bit frustrated because njabl is
 clearly not performing anymore, I noticed that a few years back, and
 yet it's still in SA but better BL's are not.  As you (and I) both
 illustrated, certain things need to be in place before a BL is added
 to SA.  It's frustrating that mailspike hasn't done the last little bit
 needed to polish it up (although it is good that they care enough about it
 to pay attention) and it's also frustrating that the njabl owner has
 (apparently) gotten complacent with its non-performance.


It is a bit unfair to blame Mailspike for not having everything 100% ready.
As I understand it, ANBREP began as their cleverly designed in-house spam
solution.  A while back they wanted it to be tested in nightly masscheck,
but they didn't even have a public webpage at all.  At my encouragement they
slowly over the last year built the public infrastructure (Mailspike.net)
and began preparing for public release.  It isn't a top priority for them,
and it seems one guy there is doing bit-by-bit in his spare time.  We did
not even formally propose inclusion to upstream yet.  If these things aren't
ready at the time when it is asked to be included then you can rightfully
complain.  Meanwhile please understand the situation.  There might even be
opportunities for you to help.

However because the BL's are so important to the usefulness of SA I
 would like to see SA change the blacklist configuration to something
 a bit different.  What I would like to see is a BL rules subdirectory
 that contains rules for every known blacklist that is functioning,
 no matter how poor they are, and then the main SA rules contain a
 check into that subdirectory, looking for a config file in that
 subdir.  That config file is nothing more than a series of lines, one
 for each BL.  Each line is a name.  If a BL name is present in the
 config file (or uncommented) then the BL rule for that name is sucked into
 SA, if the BL name isn't there, (or commented out) the rule or rules for
 that BL are ignored.


It is not that simple.  The scores work together and are carefully balanced
to maximize spam classification while minimizing the number of ham False
Positives.  That means the score assigned to one rule depends on the scores
assigned to other rules in order to work.  Adding or removing significant
rules like BL's has a major impact that can substantially tip the balance
in either direction.  They cannot be changed very often for this reason.
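
A toy Python illustration of why the scores are interdependent: a message is classified by summing the scores of every rule it hits and comparing the total with the threshold (5 points by default). The rule names and score values below are invented for the example and are not real SpamAssassin scores.

```python
def classify(hits, scores, threshold=5.0):
    """Sum the scores of the rules that hit and compare with the threshold."""
    total = sum(scores.get(rule, 0.0) for rule in hits)
    return total, total >= threshold


# Hypothetical scores, balanced on the assumption that all three rules exist.
scores = {"RULE_A": 2.5, "RULE_B": 1.5, "SOME_BL": 1.2}
hits = ["RULE_A", "RULE_B", "SOME_BL"]

print(classify(hits, scores))  # crosses the threshold with the BL enabled

# Switching the BL off without rescoring the other rules changes the verdict.
without_bl = dict(scores, SOME_BL=0.0)
print(classify(hits, without_bl))
```

The same spam now slips under the threshold, which is why enabling or disabling a significant rule normally goes together with a rescoring run.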

Warren


Re: NJABL is dead?

2010-12-26 Thread Warren Togami Jr.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6525
Discussion about disabling NJABL.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6526
Discussion about disabling rfc-ignorant.org.

score __RCVD_IN_NJABL         0
score RCVD_IN_NJABL_CGI       0
score RCVD_IN_NJABL_MULTI     0
score RCVD_IN_NJABL_PROXY     0
score RCVD_IN_NJABL_RELAY     0
score RCVD_IN_NJABL_SPAM      0
score __RFC_IGNORANT_ENVFROM  0
score DNS_FROM_RFC_DSN        0
score DNS_FROM_RFC_BOGUSMX    0
score __DNS_FROM_RFC_POST     0
score __DNS_FROM_RFC_ABUSE    0
score __DNS_FROM_RFC_WHOIS    0

If you add these to local.cf, it makes almost zero difference to
spamassassin's scoring, but you do two fewer network queries per mail scan.

Warren Togami
war...@togami.com


Re: NJABL is dead?

2010-12-26 Thread Warren Togami Jr.
I found that if I don't set the non-scoring subrule to zero, it does the DNS
lookup anyway.  I will try that meta.  Thx.

Warren


Re: mass-check submissions Re: My attempt at re-calculating test scores

2010-12-25 Thread Warren Togami Jr.
I thought a bit more about the --reuse problem.  While there are pros and
cons to reuse, I guess there is more benefit to --reuse than without.  So I
now recommend it in all cases of masscheck.

On Fri, Dec 24, 2010 at 1:58 PM, Warren Togami Jr. wtog...@gmail.com wrote:

 This does remind me however that there is a serious and confusing question
 of whether people should be using --reuse or not.  As it is now, it is misleading
 and broken for most people due to the chicken and egg problem of missing
 tags for newer DNSBL's.  We should probably tell people to turn off --reuse
 unless they are sure they know what they are doing.

 Warren
 On Dec 24, 2010 1:05 PM, John Hardin jhar...@impsec.org wrote:



Re: mass-check submissions Re: My attempt at re-calculating test scores

2010-12-25 Thread Warren Togami Jr.
In general, please stop worrying about your corpus being ideal.  Our sample
size right now is so small that even non-ideal corpora would be helpful.
Get started with cron nightly masschecks then work on improving your corpus
later.

I personally include:
* The last 4 weeks of spam.  I use logrotate to automatically rotate one
week at a time so I don't have to worry about it.  I receive LOTS of spam so
this is a good quantity.  IMHO, spam older than a month is far less useful
to test spamassassin's rules.
* Last 2 years of ham.  If we had 10x as many contributors to nightly
masscheck then I might reduce this to last 1 year of ham.

Warren


NJABL is dead?

2010-12-25 Thread Warren Togami Jr.
Hey folks,

Does anyone know the story of what is going on with NJABL?

http://ruleqa.spamassassin.org/20101225-r1052760-n/RCVD_IN_NJABL_PROXY/detail
http://ruleqa.spamassassin.org/20101225-r1052760-n/RCVD_IN_NJABL_RELAY/detail
http://ruleqa.spamassassin.org/20101225-r1052760-n/RCVD_IN_NJABL_SPAM/detail
After stopping early this year, I only began looking again at ruleqa results
in recent weeks.  It now appears that NJABL is almost useless or dead.

50_scores.cf:score RCVD_IN_NJABL_PROXY 0 0.208 0 2.224 # n=0 n=2
50_scores.cf:score RCVD_IN_NJABL_RELAY 0 1.881 0 2.499 # n=0 n=2
50_scores.cf:score RCVD_IN_NJABL_SPAM 0 1.466 0 1.249 # n=0 n=2
These scores were assigned by the previous rescoring masscheck before the
release of 3.3.0.

It appears that NJABL is not worthwhile to remain in spamassassin any
longer.  We are only creating extra network queries and for no good reason.
And NJABL just happens to be among the slowest of all my network queries in
spamassassin.  Perhaps it is time to remove NJABL?

RFC Ignorant appears to be the next most useless network query.  We may want
to consider investigating whether it is worthwhile to retain it.

If we eliminate network queries to useless or less effective blacklists, we
could consider later adding more effective lists.  Here are a few examples:

http://ruleqa.spamassassin.org/20101225-r1052760-n/T_RCVD_IN_MSPIKE_BL/detail
Excellent performance, I use this on my server.
http://ruleqa.spamassassin.org/20101225-r1052760-n/T_RCVD_IN_SEMBLACK/detail
Much improved performance since last year.  I am considering using it on my
server.  Only tagging for now.
http://ruleqa.spamassassin.org/20101225-r1052760-n/T_RCVD_IN_HOSTKARMA_BL/detail
Dangerously high false positive rate.  It would need to become safer.  I
personally use this for tagging but not scoring.

For now I'm proposing only disabling NJABL in sa-update, since it is
currently useless and not worth the extra network query.

Any thoughts?

Warren Togami
war...@togami.com


Re: My attempt at re-calculating test scores

2010-12-24 Thread Warren Togami Jr.
You have the option of uploading your corpus to the central server to
process every night.  But most people have privacy concerns about that if it
is their own personal ham.  For this reason you have the option of running
the masscheck script yourself every night on your own server and to rsync
upload the logs only to the spamassassin central server.

https://fedorahosted.org/auto-mass-check/
I run this script every night from cron on my corpora.  I wrote this as a
friendlier wrapper script around spamassassin's confusing and difficult to
configure scripts.

And yes, a ham only corpus is extremely useful.  You must confirm that it is
100% human verified.  Start small, make sure the script is working properly,
and sort more ham into that folder.

Warren


Re: mass-check submissions Re: My attempt at re-calculating test scores

2010-12-24 Thread Warren Togami Jr.
http://www.mail-archive.com/users@spamassassin.apache.org/msg69546.html
Whitelists have almost zero impact on spamassassin's determination of ham vs
spam.  Believe me.  This is not harmful.

If you have any ham corpus it would be extremely useful to spamassassin.  We
have a severe lack of variety of data sources, so even a flawed data source
would be incredibly useful.  In this case the flaw is not harmful like the
skew that a blacklist would cause.  Why recuse yourself from providing
statistical data on the thousand other tests?

http://ruleqa.spamassassin.org/
Look at how few contributors there are.  The WORLD of spamassassin users is
relying on the ham of a tiny group.  spamassassin defaults are working great
on MY spam, but I worry about others, especially non-US, non-English, or
non-geek mail.  We need greater variety and a larger sample size.

Warren


Re: mass-check submissions Re: My attempt at re-calculating test scores

2010-12-24 Thread Warren Togami Jr.
I think what he is failing to understand is that the scores are irrelevant,
as the masscheck only determines yes or no for each rule across a corpus.
Also, "current" refers to the nightly masscheck snapshot of svn trunk,
including the latest rules.
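
To make the "yes or no for each rule" point concrete, here is a minimal
Python sketch of how per-rule hit rates could be tallied across a corpus.
The input format, the function name hit_frequencies, and the S/O formula
(spam% divided by spam% plus ham%) are assumptions for illustration, not
the actual masscheck or ruleqa code.  Note that rule scores never enter
the calculation.

```python
from collections import Counter

def hit_frequencies(results, all_rules=()):
    """Tally per-rule hit rates from (is_spam, rules_hit) pairs.

    `results` is a hypothetical in-memory stand-in for masscheck logs:
    each item is (True for spam / False for ham, set of rule names hit).
    `all_rules` optionally lists rules to report even if they never hit.
    """
    spam_total = sum(1 for is_spam, _ in results if is_spam)
    ham_total = len(results) - spam_total
    spam_hits, ham_hits = Counter(), Counter()
    for is_spam, rules in results:
        # Each message contributes at most one hit per rule: hit or no hit.
        (spam_hits if is_spam else ham_hits).update(set(rules))

    table = {}
    for rule in set(spam_hits) | set(ham_hits) | set(all_rules):
        spam_pct = 100.0 * spam_hits[rule] / spam_total if spam_total else 0.0
        ham_pct = 100.0 * ham_hits[rule] / ham_total if ham_total else 0.0
        # Assumed S/O definition: the spam share of the rule's hit rates.
        # A rule that never hit anything is reported as neutral (0.5).
        total = spam_pct + ham_pct
        s_o = spam_pct / total if total else 0.5
        table[rule] = (spam_pct, ham_pct, s_o)
    return table
```

A rule that hits only spam comes out with S/O 1.0, and a rule that never
hits comes out at 0.5, regardless of what score either rule carries.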

This does remind me, however, that there is serious confusion over whether
people should be using --reuse or not.  As it is now, it is misleading and
broken for most people due to the chicken and egg problem of missing tags
for newer DNSBL's.  We should probably tell people to turn off --reuse
unless they are sure they know what they are doing.

Warren
On Dec 24, 2010 1:05 PM, John Hardin jhar...@impsec.org wrote:


Re: My attempt at re-calculating test scores

2010-12-23 Thread Warren Togami Jr.
BTW, if you have your own corpora, why not participate in the nightly
masscheck?  We are in serious need of additional participants in order to
enable promotion of new rules to the sa-update channel, and to make it
possible to release new versions of spamassassin.

Warren


spamassassin-3.3.1 RPM packages for Fedora and RHEL5

2010-03-21 Thread Warren Togami
http://wtogami.livejournal.com/34108.html
Please see my blog post here for official, tested RPM packages for Fedora
and RHEL5.

I highly recommend NOT building the RPM package from the spec file contained
within the spamassassin tarball.  It has never been tested to work on Fedora
or Red Hat Enterprise Linux.

Warren Togami
wtog...@fedoraproject.org


Re: Sought rules not doing so good

2010-02-03 Thread Warren Togami

On 02/03/2010 09:18 AM, Justin Mason wrote:

The corpus-quality for that masscheck doesn't look too bad though:

http://ruleqa.spamassassin.org/20100201-r905213-n/T_JM_SOUGHT_1/detail?s_corpus=1#corpus



That day was fine.  The weekly masscheck however had only 50k spam.

Warren


Re: Sought rules not doing so good

2010-02-02 Thread Warren Togami

On 02/02/2010 12:07 PM, Adam Katz wrote:

That is quite different from our masscheck stats.  Today's results at
http://ruleqa.spamassassin.org/20100201/%2FJM_SOUGHT look like this:

   SPAM%     HAM%    S/O   RANK  SCORE  NAME
  9.8564   0.0042  1.000   0.94   0.01  T_JM_SOUGHT_3
  8.1587   0.0068  0.999   0.93   0.01  T_JM_SOUGHT_2
 11.6464   0.0289  0.998   0.89   0.01  T_JM_SOUGHT_1
       0        0  0.500   0.48   0.00  JM_SOUGHT_FRAUD_1
       0        0  0.500   0.48   0.00  JM_SOUGHT_FRAUD_2
       0        0  0.500   0.48   0.00  JM_SOUGHT_FRAUD_3



FWIW the nightly masscheck is often very unbalanced especially on the 
spam side.  Sometimes we have only 50k spam, sometimes over 500k spam. 
Some spam corpora contain a disproportionate amount of high scoring spam 
trap mail.  I personally randomly filter out a large percentage of high 
scoring mail in an attempt to balance my spam corpus.  But ultimately we 
need more masscheck participants to have better results.
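
For anyone wanting to try the same kind of balancing, here is a rough
Python sketch of randomly thinning out high-scoring spam.  The function
name, the (message_id, score) input format, the threshold of 20.0 and the
keep fraction of 0.25 are all made up for illustration; they are not the
values or the script I actually use.

```python
import random

def balance_spam(messages, score_threshold=20.0, keep_fraction=0.25, seed=None):
    """Randomly drop most high-scoring spam and keep everything else.

    `messages` is a list of (message_id, score) pairs.  The default
    threshold and keep fraction are illustrative assumptions only.
    """
    rng = random.Random(seed)  # a seed makes the selection repeatable
    kept = []
    for msg_id, score in messages:
        # Low-scoring mail is always kept; high-scoring mail survives
        # only with probability keep_fraction.
        if score < score_threshold or rng.random() < keep_fraction:
            kept.append((msg_id, score))
    return kept
```

With keep_fraction=0.0 every message at or above the threshold is dropped,
and with keep_fraction=1.0 the corpus is returned unchanged.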


Warren


Re: blog article on 3.3.0

2010-01-28 Thread Warren Togami

On 01/28/2010 11:33 AM, J.D. Falk wrote:

http://www.returnpath.net/blog/2010/01/spamassasin-rarely-misses.php

Yeah, it's partly self-serving, but that's what corporate blogs are for.  The 
people who read this blog are mostly marketers with very little exposure to the 
open source community, so this should help them understand a bit more of how 
the real email ecosystem operates.

--
J.D. Falk jdf...@returnpath.net
Return Path Inc


I wasn't planning on responding to this thread, but other positive 
responses have annoyed me.


This article is borderline misleading.

We didn't pay the Apache Foundation (which hosts & sponsors the 
SpamAssassin project) for these scores, or try to sell the developers 
on using it. We did talk about the products with them for quite a while: 
what the listing criteria is, our plans for the future, et cetera. Some 
of the developers & community members were friendly, others...not so 
much. In the end, it was SpamAssassin's own testing process which 
convinced them to include these tests with these scores. The data spoke 
for itself, and they saw the value in it.


The data spoke for itself?

http://www.gossamer-threads.com/lists/spamassassin/users/145597?do=post_view_threaded

The data showed that whitelists made almost ZERO difference, actually 
slightly negative impact on spam filtering.


Warren


Re: painting everybody in Taiwan with the same brush

2010-01-26 Thread Warren Togami

On 01/26/2010 05:31 AM, Kai Schaetzl wrote:

This is an SARE rule, I suggest you ask there.

Kai



Huh?  Aren't we supposed to be telling people to stop using SARE?

Warren


ANNOUNCE: Apache SpamAssassin 3.3.0 available

2010-01-26 Thread Warren Togami
Release Notes -- Apache SpamAssassin -- Version 3.3.0


Introduction


This is a major release, incorporating enhancements and bug fixes that have
accumulated in a year and a half of development since the 3.2.5 release.
Apart from some new or changed dependencies on perl modules, this version
is compatible to a large extent with existing installations, so the upgrade
is not expected to be problematic (neither is downgrading, if the need arises).
Please consult the list of known incompatibilities below before upgrading.


Downloading and availability


Downloads are available from:

http://spamassassin.apache.org/downloads.cgi

md5sum of archive files:

  15af629a95108bf245ab600d78ae754b  Mail-SpamAssassin-3.3.0.tar.bz2
  38078b07396c0ab92b46386bc70ef086  Mail-SpamAssassin-3.3.0.tar.gz
  e66856085ca14947146d57a40a51beaa  Mail-SpamAssassin-3.3.0.zip
  5be313a60c27ae522700e20b557ade33  Mail-SpamAssassin-rules-3.3.0.r901671.tgz

sha1sum of archive files:

  209a97102e2c0568f6ae8151e5a55cd949317b69  Mail-SpamAssassin-3.3.0.tar.bz2
  35ff5ab33dd83bf8e3a63bd1540d819ab35117d5  Mail-SpamAssassin-3.3.0.tar.gz
  d1c61c67c806054c4404a854fc113a1a3c3e71c7  Mail-SpamAssassin-3.3.0.zip
  04ac1d5d02a69f382909b01a4426a048a1e69278  Mail-SpamAssassin-rules-3.3.0.r901671.tgz
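
As an illustration of checking a download against the sums listed above,
here is a minimal Python sketch using the standard hashlib module.  The
helper names file_digest and matches are hypothetical, and the sketch
assumes the archive has already been downloaded to a local path.

```python
import hashlib

def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches(path, expected_hex, algorithm="sha1"):
    """Compare a file's digest with a published checksum string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, matches("Mail-SpamAssassin-3.3.0.tar.gz",
"35ff5ab33dd83bf8e3a63bd1540d819ab35117d5") should be true for an intact
copy, and passing algorithm="md5" checks against the md5sum list instead.
A matching checksum only shows the file was not corrupted in transit; it
is not a substitute for verifying the GPG signature.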

Note that the *-rules-*.tgz files are only necessary if you cannot, or do not
wish to, run sa-update after install to download the latest fresh rules.

The release files also have a .asc accompanying them.  The file serves
as an external GPG signature for the given release file.  The signing
key is available via the wwwkeys.pgp.net key server, as well as
http://www.apache.org/dist/spamassassin/KEYS

The key information is:

pub   4096R/F7D39814 2009-12-02
  Key fingerprint = D809 9BC7 9E17 D7E4 9BC2  1E31 FDE5 2F40 F7D3 9814
uid  SpamAssassin Project Management Committee priv...@spamassassin.apache.org
uid  SpamAssassin Signing Key (Code Signing Key, replacement for 1024D/265FA05B) d...@spamassassin.apache.org
sub   4096R/7B3265A5 2009-12-02

See the INSTALL and UPGRADE files in the distribution for important
installation notes.


Summary of major changes since 3.2.5


COMPATIBILITY WITH 3.2.5

- rules are no longer distributed with the package, but installed by
  sa-update - either automatically fetched from the network (preferably)
  or from a tar archive, which is available for downloading separately
  (see below, section INSTALLING RULES);

- CPAN module requirements:
  - minimum required version of ExtUtils::MakeMaker is 6.17;
  - modules now required: Time::HiRes, NetAddr::IP (4.000 or later),
Archive::Tar (1.23 or later), IO::Zlib;
  - minimal version of Mail::DKIM is 0.31 (preferred: 0.37 or later);
expect some tests in t/dkim2.t to fail with versions older than 0.36_5;
  - no longer used: Mail::DomainKeys, Mail::SPF::Query;
  - either Digest::SHA or the older Digest::SHA1 is required, though
note that the DKIM plugin requires Digest::SHA for sha256 hashes
and Razor agents still need Digest::SHA1;
  - some IPv6 functionality requires IO::Socket::INET6;

- if keeping the AWL database in SQL, the field awl.ip must be extended to
  40 characters. The change is necessary to allow AWL to keep track of IPv6
  addresses which may appear in a mail header even on a non-IPv6-enabled host.
  While at it, consider also adding a field 'signedby' to the SQL table 'awl'
  (and adding 'auto_whitelist_distinguish_signed 1' to local.cf);
  see sql/README.awl for details. The change need not be undone even if
  downgrading back to 3.2.* for some reason;

- fixing a protocol implementation error regarding a PING command required
  bumping up the SPAMC protocol version to 1.5.  Spamd retains compatibility
  with older spamc clients. Combining new spamc clients with pre-3.3 versions
  of a spamd daemon is not supported (but happens to work, except for the
  PING and SKIP commands);

- if using one of the plugins (FreeMail, PhishTag, Reuse) which were
  previously not part of the official package, please retire your local copy
  to avoid it conflicting with a new native plugin;

- as the plugin AWL is no longer loaded by default, to continue using it
  the following line is needed in one of the .pre files (e.g. local.pre):
loadplugin Mail::SpamAssassin::Plugin::AWL

- it may be worth mentioning that a rule DKIM_VERIFIED has been renamed
  to DKIM_VALID to match its semantics;

- the DKIM plugin is now enabled by default for new installs, if the perl
  module Mail::DKIM is installed.  However, installation of SpamAssassin
  will not overwrite existing .pre configuration files, so to use DKIM when
  upgrading from a previous release that did not use DKIM, a directive:

loadplugin Mail::SpamAssassin::Plugin::DKIM

  will need to be uncommented in file v312.pre, or added to some
  other .pre file, such as local.pre;

- due to changes in some 

spamassassin-3.3.0 for Fedora/RHEL

2010-01-26 Thread Warren Togami

http://wtogami.livejournal.com/33674.html

If you use spamassassin on Fedora or RHEL5, please see my blog post for 
RPM packages and distro-specific notes.


Warren Togami
wtog...@redhat.com


Re: spamassassin-3.3.0 for Fedora/RHEL

2010-01-26 Thread Warren Togami

On 01/26/2010 03:31 PM, Kai Schaetzl wrote:

Charles Gregory wrote on Tue, 26 Jan 2010 14:10:51 -0500 (EST):


Anyone know where to find a RHEL(CentOS) 4 rpm?
Or will it appear in the CentOS 4 official update channels in due time?


Just do it yourself. Follow the instructions on the download page, it's a
*one liner* !

Kai



FWIW, RHEL4 is older than anything I expect that .src.rpm to work with.
You may also need to build your own perl modules that might be missing.


Warren


Re: insecure dependency in sa-learn --import

2010-01-26 Thread Warren Togami

On 01/26/2010 06:16 PM, David Morton wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Trying to import a bayes db, I get:

#sa-learn --import
bayes: perform_upgrade: Insecure dependency in open while running with
- -T switch at /usr/share/perl/5.8/File/Copy.pm line 133.

perl 5.8.8


What distribution?

Warren


Re: That Future Bug

2010-01-19 Thread Warren Togami

Did you enable sa-update?  That will get rid of the broken rule as well.

Warren

