Re: How to get removed from spamcop?

2013-10-28 Thread Bart Schaefer
On Mon, Oct 28, 2013 at 3:08 PM, John Levine jo...@taugh.com wrote:
They have several of our IP addresses listed and delisting
doesn't seem to work. We're a spam filtering company (Junk Email Filter)
and if we fail to block a spam it can appear we are the source.

 Uh, Marc, if the spam comes out of your servers, you ARE the source.
 Nobody but you cares about your business model.

More to the point, if you're a spam filtering company, you shouldn't
be delivering something you failed to block to anybody but your own
customers.

Doesn't that make this a customer education issue?  Why are your
customers reporting you to spamcop?


Re: Really getting discouraged... when does the learning happen?

2013-09-28 Thread Bart Schaefer
On Mon, Sep 16, 2013 at 1:38 PM, Harry Putnam rea...@newsguy.com wrote:

 Yes, here is an example of a message rated as spam:

 X-Spam-Report: *  3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
 *  [score: 0.]

OK, so you've got a BAYES_99 on that message, which is a pretty good
indication that the training has worked.  However, SA's confidence in
the Bayes algorithm is only worth about one point out of a necessary
five, so the rest of the rules have to contribute the other (just a
bit more than) four points, and they do not:

 *  0.4 STOX_REPLY_TYPE STOX_REPLY_TYPE
 *  1.2 RCVD_NUMERIC_HELO Received: contains an IP address used for
 HELO
 *  1.8 STOX_REPLY_TYPE_WITHOUT_QUOTES STOX_REPLY_TYPE_WITHOUT_QUOTES

This could be because the scores are tuned to include network tests
which aren't able to be applied to your archive, or some such.  In any
case it's not the training that is failing you here.

You have a couple of choices.  You can assign your own higher score to
the BAYES_99 rule in your local spamassassin config, or you can modify
your procmail recipe to look for BAYES_99 in the filtered message and
treat messages that have it as spam even if they do not score above
the five point threshold.  Anything that's falsely BAYES_99 is
probably something you want to re-learn as ham anyway.
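
As a rough illustration of the second choice (written in Python rather than as a procmail recipe), the sketch below treats a message as spam when SpamAssassin already said so, or when BAYES_99 appears among the tests. It assumes the filtered message carries an X-Spam-Status header listing the rules that hit; the sample header and its scores are invented, and is_spam is a hypothetical helper name.

```python
import email
import re


def is_spam(raw_message):
    """Return True if SA flagged the message or the BAYES_99 rule fired."""
    msg = email.message_from_string(raw_message)
    # Assumption: the filtered message has an X-Spam-Status header.
    status = msg.get("X-Spam-Status", "")
    if status.lstrip().lower().startswith("yes"):
        return True
    # The header may be folded over several lines; a word-boundary
    # search finds the rule name wherever it ended up.
    return re.search(r"\bBAYES_99\b", status) is not None


# Invented sample: scores below the threshold, but BAYES_99 is present.
sample = (
    "X-Spam-Status: No, score=3.3 required=5.0 tests=BAYES_99,\n"
    "\tRCVD_NUMERIC_HELO autolearn=no\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(is_spam(sample))
```

The same test in procmail would be a condition matching BAYES_99 in that header; the Python version is only meant to show the logic.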


Re: procmail/spassassin training session

2013-09-15 Thread Bart Schaefer
On Sat, Sep 14, 2013 at 1:07 PM, Harry Putnam rea...@newsguy.com wrote:

 1) Does it matter that I have autolearn turned off in spamassassin
 conf file 'local.cf' while doing my sandbox work

No, it doesn't.  In fact it's probably better that way because SA
won't waste time updating the bayes database with the mis-classified
stuff that will have to be backed out later.

 2) I've derived the mbox files of pure ham and pure spam by running
 mixed mail so SA has already seen this mail.

That definitely doesn't make any difference *IF* you disabled
auto-learning in the previous step.  It shouldn't make any difference
even if autolearning was on, because sa-learn will discard the tokens
from the first pass on each message before re-learning, but it'll be
somewhat faster if that's not necessary.


Re: Really getting discouraged... when does the learning happen?

2013-09-15 Thread Bart Schaefer
On Sun, Sep 15, 2013 at 7:53 PM, Harry Putnam rea...@newsguy.com wrote:
 I've been trying to `teach' SA to tell spam from ham in my mail system.

 I've made it thru two main learning sessions where I ran around 450
 msgs (each time) thru sa-learn spam/ham and yet SA is still incapable
 of getting it right more than about 40 % or maybe less.

You say you've run 1100 messages through -- have at least 200 of those
been ham?  Bayes won't kick in until 200 *each* of spam and ham are
trained.  You can run sa-learn --dump magic to see how many of each
it believes it has seen.

If you've sa-learned enough of both types, is it possible you haven't
enabled bayes scoring?  Are the BAYES_* rules showing up at all in the
score details for newly arrived messages fed through spamc?
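
A small sketch of that first check, in Python: it pulls the nspam and nham counters out of text shaped like the output of sa-learn --dump magic and compares each with the 200-message minimum. The sample text, its numbers, and the exact column layout are assumptions for illustration; adjust the pattern if your output differs.

```python
import re

# Invented sample in the general shape of `sa-learn --dump magic` output.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        612          0  non-token data: nspam
0.000          0        143          0  non-token data: nham
0.000          0      98765          0  non-token data: ntokens
"""

MIN_EACH = 200  # Bayes needs this many of *each* class before it scores

LINE = re.compile(
    r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (nspam|nham)\s*$"
)


def learned_counts(dump_text):
    """Map 'nspam'/'nham' to the counts found in the dump text."""
    counts = {}
    for line in dump_text.splitlines():
        m = LINE.match(line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


def bayes_ready(counts):
    """True only when both spam and ham have reached the minimum."""
    return (counts.get("nspam", 0) >= MIN_EACH
            and counts.get("nham", 0) >= MIN_EACH)


counts = learned_counts(SAMPLE)
print(counts, bayes_ready(counts))
```

In the invented sample plenty of spam has been learned but not enough ham, so Bayes would stay silent.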


Re: Comment - GFI/SORBS

2010-12-14 Thread Bart Schaefer
http://blog.wordtothewise.com/2010/12/gfi-sorbs-considered-harmful-part-5/


RAZOR2 and SpamAssassin version or configuration

2010-10-14 Thread Bart Schaefer
We have a couple of mail servers running SpamAssassin.  One is stock
CentOS5 and therefore running SA 3.2.4.  The other is a test platform
running SA 3.3.1 (installed from rpmforge in case that matters).  Both
have the latest sa-update configurations for their respective
versions.

On both hosts, when I put the 3.3.1 sample-spam.txt message through
spamassassin, it reports RAZOR2_CHECK as expected.  If I run with -D,
SA3.2.4 reports that it is using razor2 version 2.82, and SA3.3.1
reports razor2 version 2.84.  Again this is as I expect.

However, I have another message received from outside, which when put
through spamassassin 3.2.4 reports a hit on RAZOR2_CHECK, but when put
through 3.3.1 it does not.  Run with -D, it does appear that the razor
server is being contacted in both cases, but I confess I haven't yet
resorted to sniffing traffic to be sure.

Where should I be looking for a configuration difference that would cause this?


OT: SPF: Some statistics

2010-02-24 Thread Bart Schaefer
Coincidental to the recent thread on SPF comes this from Terry Zink:

http://blogs.msdn.com/tzink/archive/2010/02/23/some-stats-and-figures-on-dkim-and-spf.aspx


Re: outlook 2007 Test email scores 30+

2009-10-31 Thread Bart Schaefer
On Sat, Oct 31, 2009 at 9:31 AM, John Hardin jhar...@impsec.org wrote:
 Here is a site that gives you your IP address and lets you check it against
 DNSBLs:

   http://cqcounter.com/rbl_check/

Just as a word of warning, that site is still checking
blacklist.spambag.org, which has been offline since 2007 and now lists
the entire Internet.


Re: Problem starting/stoping spamassasin

2007-12-21 Thread Bart Schaefer
On RedHat systems, at least, the init.d script that runs spamd is
named spamassassin.  So possibly what was meant here was

service spamassassin start
service spamassassin stop


Re: googlepages.com abuse

2007-11-13 Thread Bart Schaefer
On Nov 13, 2007 3:32 AM, Michael Scheidell [EMAIL PROTECTED] wrote:

  How do you folks trap these mails , And how do we report
  abuse to google ( if they really bother )
 You can't.  Google ignores complaints, and email to @googlepages.com
 will bounce in 5 days due to their refusal to even follow the RFC's and
 have a server to receive email.

Er, which RFC are you claiming requires them to have a server to receive mail?

In any case I inquired of an acquaintance who works at Google and got
this response to the question of how do we report abuse:

Web link: http://www.google.com/support/pages/bin/request.py
with links there for spam, phishing, scumware, etc.

Email links get too heavily spammed so the one contact address which has
leaked gets a push-back mail to use the web form.  All complaints are
acted upon quickly but no individual replies are sent.  "Quickly" here is
in practice within one business day (but I don't believe that there's a
formal commitment to this).

Any evidence that a complaint has been ignored would be interesting to
see; ultimately, "is the page still up?" is the check.


Re: Myspace's mail headers

2007-10-30 Thread Bart Schaefer
On Oct 30, 2007 12:27 PM, Joseph Brennan [EMAIL PROTECTED] wrote:

 Notice the all-lower-case field names, which do not conform to the
 RFC 2822 field names that they almost match.  That is, a from:
 header is not the same as a From: header!

It's *supposed* to be the same.  RFC *822 does not specify case for
field names.  See RFC 2234 where the ABNF format is defined, section
2.3.

If SA is not finding the from-address as a result of lower-casing
from:, then SA has a bug.  If SA is *intentionally* ignoring from:
because lower-casing is a spam-sign, then that's a somewhat different
matter ... but probably still wrong.
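
For comparison only, here is how a parser that follows the case-insensitive reading behaves. This uses Python's standard email module, not SpamAssassin's own parser, and the address is a placeholder.

```python
import email

# A message whose field names are all lower case, as in the report.
raw = "from: someone@example.com\nsubject: hello\n\nbody text\n"
msg = email.message_from_string(raw)

# Header lookups ignore the case of the field name.
print(msg["From"])
print(msg["FROM"] == msg["from"])
```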


Re: Bit OT but it's about SPAM

2007-10-18 Thread Bart Schaefer
On 10/17/07, Randal, Phil [EMAIL PROTECTED] wrote:
 Hyperbole?

 Well, let's take a look at the figures on my mail relay boxes

Not to single out Phil, but so far everyone is quoting (among other
things) the percentage of mail that they reject out of hand.  You're
all 100% confident that none of those were false positives?

My point was that "rejected/filtered by anti-spam techniques" and "is
spam" are not synonymous, but nearly everyone who publishes spam
figures behaves as if they were, and most of them have a vested
interest in making the number sound as big and scary as possible.


Re: Bit OT but it's about SPAM

2007-10-17 Thread Bart Schaefer
On 10/17/07, Tom Ray [EMAIL PROTECTED] wrote:
 I just thought if anyone hasn't read it yet, this article might be
 interesting to many of you. According to this report SPAM has now
 reached being 95% of all email.

This is hyperbole.

What it really means is that 95% of the mail processed by someone's
commercial spam filter has been classified, possibly incorrectly, as
spam.  The rates are much lower (though still too high for comfort) if
false positives are accounted for.

See, for example:  http://www.bcs.org/server.php?show=conWebDoc.14617


Re: Suggestion to developers

2007-09-13 Thread Bart Schaefer
On 9/13/07, Justin Mason [EMAIL PROTECTED] wrote:

 if anyone feels like trying it out to see if they can make an
 auto-shortcircuiting plugin which outperforms base SpamAssassin over a
 mixed corpus of 50:50 nonspam and spam, go for it ;)

I dunno about your mail, but if it outperformed base SA on a corpus of
20:80 ham:spam that'd be worth it for what we end up filtering.

Of course "outperform" means it also has to maintain the same (or a
smaller) FP ratio, not just that it does the wrong thing faster.



Re: another bouncing list person

2007-07-26 Thread Bart Schaefer
On 7/25/07, Jerry Durand [EMAIL PROTECTED] wrote:
 It also is  trying to claim to come from me,  I don't have a POST or
 OFFICE address here.

 Begin forwarded message:

  From: [EMAIL PROTECTED], [EMAIL PROTECTED]

It's almost certainly the case that your own mail server did that.
The originating mail server very likely sent you

 From: Post Office

(that is, an invalid address which your server attempted to clean up
by attaching the local domain in places it thought appropriate).


TQMcube apparently gone dormant

2007-07-03 Thread Bart Schaefer

If you read JM's Planet Antispam, you know this already, but:

http://www.dnsbl.com/2007/06/status-of-dnsbltqmcubecom-abandoned.html


Re: Blacklist a mailing list

2007-07-01 Thread Bart Schaefer

On 7/1/07, dougp23 [EMAIL PROTECTED] wrote:

How do I go about blocking the mailing list?  here are some headers from a
recent message:  (It seems everyone on [EMAIL PROTECTED] is getting this
junk).


Prompted by Doug but directed to no one in particular:

Please don't use things like mailinglist.org and especially
mydomain.com as either generic examples or as placeholders for
whatever your domain really is.  There actually *is* a mydomain.com
and unless that really is your domain it just causes needless
confusion.

If for some reason you think it's essential to purge references to your
domain name, then simply replace them with obvious mark-out like
--.com or the like.

Thanks.


Re: A different approach to scoring spamassassin hits

2007-06-30 Thread Bart Schaefer

On 6/29/07, Tom Allison [EMAIL PROTECTED] wrote:


The thought I had, and have been working on for a while, is changing
how the scoring is done.  Rather than making Bayes a part of the
scoring process, make the scoring process a part of the Bayes
statistical Engine.  As an example you would simply feed into the
Bayesian process, as tokens, the indications of scoring hits (binary
yes/no) would be examined next to the other tokens in the message.


There are a few problems with this.

(1) It assumes that Bayesian (or similar) classification is more
accurate than SA's scoring system.  Either that, or you're willing to
give up accuracy in the name of removing all those confusing knobs you
don't want to touch, but it would seem to me to be better to have the
knobs and just not touch them.

(2) For many SA rules you would be, in effect, double-counting some
tokens.  An SA scoring rule that matches a phrase, for example, is
effectively matching a collection of tokens that are also being fed
individually to the Bayes engine.  In theory, you should not
second-guess the system by passing such compound tokens to Bayes;
instead it should be allowed to learn what combinations of tokens are
meaningful when they appear together.

(It might be worthwhile, though, to e.g. add tokens that are not
otherwise present in the message, such as for the results of network
tests.)

(3) It introduces a bootstrapping problem, as has already been noted.
Everyone has to train the engine and re-train it when new rules are
developed.

I've thought of a few more, but they all have to do with the benefits
of having all those knobs and if you've already adopted the basic
premise that they should be removed there doesn't seem to be any
reason to argue that part.

To summarize my opinion:  If what you want is to have a Bayesian-type
engine make all the decisions, then you should install a Bayesian
engine and work on ways to feed it the right tokens; you should not
install SpamAssassin and then work on ways to remove the scoring.


Re: My Newly Expanded DNS Blacklist - Who wants to try it?

2007-06-16 Thread Bart Schaefer

On 6/16/07, Marc Perkel [EMAIL PROTECTED] wrote:

Using my new ideas here's my raw blacklist file. It has about 80k IP
addresses and is updated every 10 minutes.

http://iplist.junkemailfilter.com/black.txt


Just glancing through the list and reversing an IP address whose first
two quads I recognize, I see you've blacklisted Red Condor
(redcondor.com), a network security and anti-phishing service provider
(64.84.16.173).

So either they've got a problem they ought to be made aware of, or you do ...


Re: R: Inappropriate use of E-Mail addresses

2007-05-13 Thread Bart Schaefer

On 5/13/07, Gregory P. Ennis [EMAIL PROTECTED] wrote:

SPF seems very interesting.  Does spamAssassin automatically use an SPF
record if it exists?


There's a plugin.


Do I set up an SPF record with whoever manages my MX DNS record?


Yes.  It's a TXT record.  Some DNS hosting companies will set it up
for you, some will give you the ability to create a TXT record through
their management interface (but you have to figure out what to put in
the record yourself), and some don't support TXT records at all.

Note that SPF is not a magic bullet.  It's not yet that widely
adopted, and any MTA that's doing accept-and-bounce for unknown
addresses is probably not checking SPF either.

You probably also want to look at SenderID.  Wikipedia has a reasonable summary.


Re: Weirdsvill

2007-04-14 Thread Bart Schaefer

On 4/13/07, Gene Heskett [EMAIL PROTECTED] wrote:

Now, I *think* I have that X-Originating-Ip: 193.93.97.195 in my .procmailrc,
but it didn't fire.  Odd...


Is that rule before or after the point at which you run the message
through spamassassin?

If after, it probably didn't fire because spamassassin moved it out of
the top-level message header.  You'd have to be looking for
X-Originating-IP in the body, then.


Re: Weirdsvill

2007-04-13 Thread Bart Schaefer

On 4/13/07, Gene Heskett [EMAIL PROTECTED] wrote:

The trail starts at localhost!  HTF did they do that?


You're looking at the header of the wrapper message created by
spamassassin, not at the header of the actual spam (which will be
inside a message/rfc822 body part of the message created by
spamassassin).
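
To make the wrapper-versus-original distinction concrete, here is a sketch using Python's email module. build_wrapper only imitates the general shape of such a report (a text part plus a message/rfc822 part); the real layout SpamAssassin produces has more in it, and all addresses and header values below are made up.

```python
import email
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_wrapper(original_raw):
    """Imitate a report message that carries the spam as an attachment."""
    wrapper = MIMEMultipart("mixed")
    wrapper["Subject"] = "[SPAM] wrapped"
    wrapper["Received"] = "from localhost by localhost"  # made-up value
    wrapper.attach(MIMEText("This message was flagged as spam.\n"))
    wrapper.attach(MIMEMessage(email.message_from_string(original_raw)))
    return wrapper.as_string()


def original_headers(wrapped_raw, name):
    """Return the named headers of the attached original, not the wrapper."""
    msg = email.message_from_string(wrapped_raw)
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            inner = part.get_payload(0)  # the embedded original message
            return inner.get_all(name, [])
    return []


original = (
    "Received: from mail.example.net (192.0.2.7) by mx.example.org\n"
    "X-Originating-IP: 192.0.2.7\n"
    "Subject: buy now\n"
    "\n"
    "spam body\n"
)
wrapped = build_wrapper(original)

# The wrapper's own header does not have the field; the attachment does.
print(email.message_from_string(wrapped)["X-Originating-IP"])
print(original_headers(wrapped, "X-Originating-IP"))
```

A header test applied to the wrapped message only sees the wrapper's header, which is why the trail appears to start at localhost.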


Re: newbie question on spamassassin trainer

2007-04-04 Thread Bart Schaefer

On 4/3/07, JOYDEEP [EMAIL PROTECTED] wrote:

how can I configure spamassassin to look after the spam and ham folder
of all the cyrus mail boxes,
so that all the users has their own spamassasin trainer ? it is
something like white box and black box per user

could any one kindly suggest me how to implement this ?


I don't know if sa-learn can read cyrus mailboxes; I suspect it can't.
So it's up to you to get the mail out of the mailboxes and into a
format sa-learn can read.

As for how you do this, you'll have to set up a periodic cron job
(overnight may be often enough) to pull the mail out of the mailboxes
and feed it through sa-learn.  You may be able to do this with
logrotate and some pre/postrotate scripts in the config file, but
having cyrus in the equation may interfere with that approach.


Re: dkim: lookup failed: DNS query timeout for _policy._domainkey.joysticktowers.com

2007-03-11 Thread Bart Schaefer

On 3/10/07, Chris [EMAIL PROTECTED] wrote:


For some reason when this happens fetchmail will not delete the message after
downloading it therefore it just sits there and get downloaded over and over
again and prevents other mail after it from being downloaded. Could this be

a) a fetchmail issue


Yes, it is.  fetchmail won't download messages out of order, it
won't mark the message on the server as one it has already seen until
it has successfully delivered it locally, and any timeout either in
the connection to the server or in the local delivery causes the local
delivery to be treated as a failure (so the message isn't marked) and
the entire fetchmail process to exit (so no later messages get fetched
in that session).

You're effectively deadlocked until you either fix the timeout problem
or delete the offending message from the server by some other access.


Re: Sorting SA Discussion List Messages

2007-03-03 Thread Bart Schaefer

On 3/3/07, Don Ireland [EMAIL PROTECTED] wrote:

Every email list I've ever subscribed to has had something in the
subject line (usually in square brackets) to identify 1) that it is a
mailing list and 2) what list it is.

Why doesn't this list have something similar?


Because it's a really annoying thing to do and interacts badly with
both threading algorithms and with other automated header rewriting
that's done by mail readers?

I'm on a couple of lists where this kind of tagging is done and there
are always threads where the subject has become

Re: [List Name] Re: [List Name] Re: [List Name] Silly subject rewrites
ad infinitum

Just Say No to unnecessary administrative mangling of messages that
pass through list exploders.


False positive on LONGWORDS

2007-02-27 Thread Bart Schaefer

A technical newsletter about transistors contains the introductory paragraph

Use of gallium nitride (GaN) power transistors in microwave
applications is expected to increase significantly with recent
technology improvements, but lateral double diffuse metal oxide
semiconductor (LDMOS) transistors are expected to stay in the lead,
according to a new report by RF Design editorial director Ashok
Bindra, and technical editor Mark Valentine. The duo's report also
addresses the general GaN market and exposes the latest advancements
in complementary metal oxide semiconductor (CMOS) radio-frequency
integrated circuits (RFICs) and integrated passive components. This
integrated passives technology could potentially displace conventional
manufacturing techniques presently used to produce passive RF
components.

Several substrings, including "integrated passives technology could
potentially displace conventional manufacturing techniques presently",
match the LONGWORDS regex, for 3.0 points.  That seems a bit excessive
... is this worth filing a bugzilla?


Re: Google Summer of Code 2007 ...

2007-02-16 Thread Bart Schaefer

On 2/16/07, Justin Mason [EMAIL PROTECTED] wrote:


Also, any suggestions from outside the dev team?  Anyone got good ideas
for new SpamAssassin features that would be good to pay someone to work on
for 3 months?


http://issues.apache.org/SpamAssassin/show_bug.cgi?id=3785


Re: Yum and Spamassassin

2007-02-07 Thread Bart Schaefer

On 2/7/07, Theo Van Dinter [EMAIL PROTECTED] wrote:

On Wed, Feb 07, 2007 at 03:04:43PM +, Michael Bartlett wrote:
 I can't believe the yum package is so out of date, am I missing something?

You're running Fedora Core 4 (hey, me too,) which is generally out of date at
this point.  I'd suggest ditching their package, grabbing the SA tarball, and
building your own 3.1.x.


The RPMforge project at rpmforge.net has up-to-date versions of
spamassassin in a yum repository for most RedHat-derived operating
systems including Fedora Core.  I don't use FC4 so I don't know
specifically about that one, but it's pretty likely.


Re: spamdoptions ???

2007-01-24 Thread Bart Schaefer

On 1/23/07, R Lists06 [EMAIL PROTECTED] wrote:


On Redhat or CentOS machines would that be under SPAMDOPTIONS ?


Using the RPM install of spamassassin from either the CentOS project
or rpmforge, you make changes to the spamd command line in
/etc/sysconfig/spamassassin, and yes, you place those switches in the
assignment to the SPAMDOPTIONS variable.

As far as what switches you use, man spamd.  For how many spamd
children you can run, you probably just have to experiment with
gradually increasing the number.


Re: procmailrc question

2007-01-10 Thread Bart Schaefer

On 1/10/07, D Ivago [EMAIL PROTECTED] wrote:


:0:
* ^Subject:.*\[SPAM]\
/dev/null


Square brackets have special meaning: [SPAM] is a character class
matching any one of the characters S, P, A, or M.  What you need
is:

:0
* ^Subject:.*\[SPAM\]
/dev/null

However, I'd not recommend that.  Instead, continue to file the spam
in a folder, but set up something to discard the contents of the
folder on a regular basis.  For example, I configure the logrotate
package to move the spam folder to a backup name once a day, and
discard the oldest backup every few days; so the spam doesn't pile up
forever, but I can recover anything that gets mis-filed (and with a
threshold of 3.0 you *will* get something mis-filed eventually, even
if you have not yet).
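
The difference between a bracket expression and escaped literal brackets can be seen with any regex engine. The sketch below uses Python's re module purely as a stand-in; procmail's regex dialect differs in other respects (it is case-insensitive by default, for one), and the subject lines are invented.

```python
import re

tagged = "Subject: [SPAM] cheap watches"
innocent = "Subject: Meeting at 5 PM"

# Unescaped: [SPAM] means "one character out of S, P, A, M".
unescaped = re.compile(r"^Subject:.*[SPAM]")
# Escaped: a literal "[SPAM]" tag.
escaped = re.compile(r"^Subject:.*\[SPAM\]")

for name, line in (("tagged", tagged), ("innocent", innocent)):
    print(name,
          "unescaped:", bool(unescaped.search(line)),
          "escaped:", bool(escaped.search(line)))
```

The unescaped pattern matches the innocent subject as well, because it contains a capital M; only the escaped one is limited to the tag.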


Re: Salesforce web bug

2006-12-20 Thread Bart Schaefer

On 12/19/06, Michael Scheidell [EMAIL PROTECTED] wrote:

I noticed an email from salesforce has a 'user tracking' web bug in it
but it isn't currently detected by SA or SARES


Why do you want to consider this a spam sign?  I'm just curious.


Re: Salesforce web bug

2006-12-20 Thread Bart Schaefer

On 12/20/06, Loren Wilton [EMAIL PROTECTED] wrote:

 Why do you want to consider this a spam sign?  I'm just curious.

Bugs in mail messages are generally a suspicious circumstance, and probably
good for a fractional point all by themselves.  In general any tracking that
will auto-identify without the user at least clicking on something is
suspicious.


In general I'd agree with you, but here we're talking very
specifically about SalesForce.  Is there evidence, for example, of
someone using SalesForce to send spam?


Re: sa-update is broken

2006-12-18 Thread Bart Schaefer

On 12/18/06, Christian Eichert [EMAIL PROTECTED] wrote:


server:~# perl -MCPAN -e 'install LWP::UserAgent'
Can't locate object method "install" via package "LWP::UserAgent" at -e
line 1.


# perl -MCPAN -e shell
cpan> install LWP::UserAgent


Ongoing trusted_networks confusion

2006-12-18 Thread Bart Schaefer

Maybe the name of that config option should be changed to truthful_networks.


Re: Easyjet e-mail scoring very high

2006-12-01 Thread Bart Schaefer

On 12/1/06, Chris Lear [EMAIL PROTECTED] wrote:

In fact, every full stop in the html is
represented as &#46; for some reason.


In SMTP, a dot all by itself on a line is interpreted as the end of
the message.  The SMTP client is supposed to double any such dot that
is truly present in the message body, and the SMTP server then removes
the extra dot for final delivery.  My guess would be that (a) they
have a crappy SMTP client, probably something written in Java by a
junior programmer who doesn't know a protocol from a parsnip, to send
mail directly from a web server platform; and (b) they once had a
message truncated because there was a dot in the wrong place; so (c)
because they don't know how to fix the crappy SMTP client, they encode
all the dots instead.
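
For reference, the dot handling a client and server are supposed to do is tiny. This is a minimal Python sketch, not anyone's actual mailer code; note that the transparency rule in the SMTP spec (RFC 5321 section 4.5.2) applies to every line that begins with a dot, not only to a line that is a dot by itself.

```python
def dot_stuff(body):
    """Client side: add one extra dot to any line that starts with a dot."""
    lines = body.split("\r\n")
    return "\r\n".join("." + ln if ln.startswith(".") else ln for ln in lines)


def dot_unstuff(data):
    """Server side: strip the one extra leading dot again."""
    lines = data.split("\r\n")
    return "\r\n".join(ln[1:] if ln.startswith(".") else ln for ln in lines)


body = "Hello\r\n.\r\n.hidden\r\nbye"
stuffed = dot_stuff(body)
print(repr(stuffed))
print(dot_unstuff(stuffed) == body)
```

With that in place a lone dot in the body can no longer be mistaken for the end-of-data marker, and no HTML-entity workaround is needed.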


Still wondering though... how do you solve a problem like EasyJet?


By doing what you don't want to do:  whitelisting.


Re: Bayes failure on hi, it's Somebody spam

2006-11-17 Thread Bart Schaefer

On 11/16/06, Jon Trulson [EMAIL PROTECTED] wrote:

 Hmm, that has not been my experience at all... Bayes (99) is
 still catching every one for me.


In this instance, SpamAssassin is running after POP download from
gmail, so I'm only seeing the samples that have already made it
through google's filters.  That may have something to do with it.


Bayes failure on hi, it's Somebody spam

2006-11-16 Thread Bart Schaefer

It looks to me as if the recent spate of pump'n'dump spams are
deliberately crafted to avoid being Bayes-learned by spamassassin.  In
spite of all having different subject lines and senders and other
minor differences, once you've learned one of them sa-learn ignores
all the rest -- and they all still get a BAYES_00 score for me.

I thought I had  a pretty good understanding of how SA's Bayes
training worked, but this is pretty clearly confusing it somehow.


Re: RPM -vs- CPAN install

2006-09-06 Thread Bart Schaefer

On 9/6/06, jdow [EMAIL PROTECTED] wrote:


The RPM installs do not seem to include the tools that you get with
the CPAN install.


The rpmforge project packages the tools as a separate RPM, named,
surprisingly enough, spamassassin-tools.


Re: RPM -vs- CPAN install

2006-09-06 Thread Bart Schaefer

On 9/6/06, jdow [EMAIL PROTECTED] wrote:

From: Bart Schaefer [EMAIL PROTECTED]

 The rpmforge project packages the tools as a separate RPM, named,
 surprisingly enough, spamassassin-tools.

And then one distro spamassassin-tools was no longer present.


I'm not sure what you mean.  yum list spamassassin shows me:

Installed Packages
spamassassin.i386        3.1.5-1.el4.rf installed
Available Packages
spamassassin-tools.i386  3.1.5-1.el4.rf rpmforge


Maybe it is in extras now.


If you're talking about RedHat, no, it's not in extras.  They don't
provide it at all, unless as part of the source RPM.  However, as they
also don't provide anything newer than 3.0.6, I've already gone
looking elsewhere, in this case rpmforge.net.


This lack has left me distrustful of distros of late. I've noticed that
they all leave little things out of what they package.


Mostly I suspect they leave out things about which they're concerned
there may be even the slightest possibility of licensing or copyright
hassles.


Re: Calling Regex Experts

2006-08-24 Thread Bart Schaefer

On 8/24/06, D. J. [EMAIL PROTECTED] wrote:


I'm expecting these type of strings for sure:

cat
dog
cat dog
dog cat

But I may get something like this too:

cat cat dog
dog dog

Essentially I want it to match if anything other than cat or dog is in the
string.


That constraint means you have to construct a regex that can be
anchored at both beginning and end of string, e.g.
/\A(\s*(cat|dog)\s*)+\Z/.  I'm not sure that ever makes sense in the
context of a spamassassin rule, except maybe one matching against a
specific header.
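
A quick way to try that pattern outside SpamAssassin, here in Python: the regex matches strings made up only of "cat" and "dog", so "anything else is present" is the case where it fails to match. has_other_words is a hypothetical helper name. One caveat: Perl's \Z also matches just before a trailing newline, while Python's \Z matches only at the very end of the string.

```python
import re

# Anchored at both ends: one or more "cat"/"dog" words and nothing else.
ONLY_CAT_DOG = re.compile(r"\A(\s*(cat|dog)\s*)+\Z")


def has_other_words(s):
    """True when the string contains anything besides cat/dog/whitespace."""
    return ONLY_CAT_DOG.search(s) is None


for s in ("cat", "dog cat", "cat cat dog", "dog dog", "cat bird", "cow"):
    print(repr(s), has_other_words(s))
```

Note that the pattern does not require whitespace between words, so "catdog" also counts as containing only cat and dog.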


Re: What changes would you make to stop spam? - United Nations Paper

2006-08-02 Thread Bart Schaefer

On 8/2/06, Marc Perkel [EMAIL PROTECTED] wrote:

Here's what I've written so far. Deadline is today. Still working on it.

http://wiki.ctyme.com/index.php/UN_Spam_Paper


Rather than extend POP/IMAP to send mail, which quite frankly will
never happen (contact the author of the IMAP protocol, Mark Crispin,
if you want the full rant -- you shouldn't have any trouble finding
his email address if you search), please suggest that the SUBMIT
protocol be used.  RFC 2476 and 4409.  See also RFC 4405.


Re: What changes would you make to stop spam? - United Nations Paper

2006-08-02 Thread Bart Schaefer

On 8/2/06, Marc Perkel [EMAIL PROTECTED] wrote:


doesn't require a separate connection on a separate port. Why use 2
protocols when you can use one?


Indeed, why don't we just close all ports except 80 and layer
everything atop HTTP?

For heaven's sake, Marc.  This debate about using IMAP/POP for outbound
mail already happened more than a decade ago.  If you can't be
bothered to look through the archives of the IETF lists that discussed
creation of these protocols, at least take the word of those of us who
were present at the time:  It was a poor idea then, it's still a poor
idea, and you'd be much better off spending your time pushing
something else.

And NONE of this is relevant to SpamAssassin any more.  Take it somewhere else.


Re: Why is there so much hype behind Image spam

2006-07-16 Thread Bart Schaefer

On 7/16/06, John Andersen [EMAIL PROTECTED] wrote:

The comment was off-hand and not researched.  One of my earliest
ISPs recommended Spamassassin when it was just a bunch of scripts
written by some woman whose name escapes me.


I suspect you're thinking of SpamBouncer.  Catherine A. Hampton.
Other than possibly being a source of inspiration, SpamBouncer has
nothing to do with SpamAssassin.


Re: The best way to use Spamassassin is to not use Spamassassin

2006-07-12 Thread Bart Schaefer

On 7/12/06, Marc Perkel [EMAIL PROTECTED] wrote:

Catchy subject line eh?


What you really mean is the best way to use SpamAssassin is as an
analysis tool.

Which of course is what the best way to use it always was.  You're
just abstracting the analysis rather than applying it directly.


The reaso [sic] of spam is rejected before I get to SA through
a fairly large number of tricks that allow me to determine with near
100% accuracy things that are spam.


There's been a fellow over on the procmail list claiming for well over
a year now that he can get better accuracy than SA through message
header analysis alone, based on rules he's compiled by analyzing what
gets through the rules he already has.  Just like you've done so far
in this thread, though, all he'll do is claim that without providing
any details -- which he says is because he doesn't want to give away
all the hours of his work that went into it.


It is none mostly through behavior
and karma related lists. Being host blacklisted or URI blacklisted.

Similarly, I have created a whitelisting system that tracks hosts and
other aspects of the message


The trick, of course, is to be able to automatically feed back into
these lists based on the output of the analysis tool.  If someone has
to do it by hand, it's a losing proposition.


Re: The best way to use Spamassassin is to not use Spamassassin

2006-07-12 Thread Bart Schaefer

On 7/12/06, Marc Perkel [EMAIL PROTECTED] wrote:


Bart Schaefer wrote:
 There's been a fellow over on the procmail list claiming for well over
 a year now that he can get better accuracy than SA through message
 header analysis alone

His claim might well be true.


Oh, I have no doubt that he's speaking truthfully.  Problem is that if
no one else can look at what he's done, there's no way to confirm or
deny my own suspicion, which is that most of his rules are only that
accurate in his specific environment.  That is, I tend to expect that
if you picked up his rules and dropped them on another machine halfway
around the world with a different ISP and mail routing chain, their
accuracy would plummet.


Re: sa-learn script

2006-07-11 Thread Bart Schaefer

On 7/11/06, Nicholas Payne-Roberts [EMAIL PROTECTED] wrote:

Does anybody know a good way to script sa-learn to daily check on junk
e-mail folders?


I use logrotate because it handles automatically removing or renaming
the files after learning, but I don't use maildir-format folders so I
can't provide a tested configuration.

Something like this:

notifempty
missingok
"/home/vpopmail/domains/*/*/.Junk E-mail/cur/*" {
 rotate 0
 daily
 nomail
 prerotate
spamc -t 20 -l -L spam < "$1"
 endscript
}

Be careful of that "rotate 0", which means to delete the file.  If
there's any chance that a false-positive might need to be recovered
later, you probably want to increase that and add an olddir
directive to tell logrotate where to archive the spam.

If you have logrotate running regularly as a system process, that
config would go in (for example, may vary by OS distribution)
/etc/logrotate.d/sa-learn.  If not, or if you have to run logrotate as
a user other than root, put that in a file somewhere in the correct
user's home directory (I like to use a subdirectory named ".logrotate"
and name the file "conf") and install a crontab entry for that user,
similar to

1 3 * * * logrotate -f --state $HOME/.logrotate/state $HOME/.logrotate/conf
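
If logrotate isn't an option, the same learn-then-archive idea can be
sketched as a plain shell function.  This is untested against a real
mail store; the function name, the LEARN_CMD variable and both
directory arguments are my own inventions for illustration.  As far as
I recall, some spamc versions document exit codes 5 (learned) and 6
(already learned) for -L, so those are treated like success here.

```shell
#!/bin/sh
# Sketch only: feed each message in a junk folder to a learner, then archive it.
# LEARN_CMD is a hypothetical knob; it defaults to the spamc call used above.
LEARN_CMD=${LEARN_CMD:-"spamc -L spam"}

learn_and_archive() {
    junk_dir=$1      # e.g. ".../.Junk E-mail/cur" (may contain spaces)
    archive_dir=$2   # where learned messages are kept for later recovery
    mkdir -p "$archive_dir" || return 1
    for msg in "$junk_dir"/*; do
        [ -f "$msg" ] || continue      # skip when the glob matches nothing
        status=0
        $LEARN_CMD < "$msg" >/dev/null || status=$?
        case $status in
            # 0 = plain success; 5/6 = assumed "learned"/"already learned"
            0|5|6) mv "$msg" "$archive_dir"/ ;;
        esac
    done
}

# Usage (not run here):
#   learn_and_archive "/home/vpopmail/domains/example.com/user/.Junk E-mail/cur" \
#                     "$HOME/learned-spam"
```

Messages that the learner rejects stay where they are, so a spamd
outage doesn't silently throw mail away.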


Re: Warnings in procmail log

2006-07-10 Thread Bart Schaefer

On 7/9/06, Geoff Soper [EMAIL PROTECTED] wrote:

Apologies, I've little idea of what is traditional and didn't realise my
situation was unusual!


I didn't say it was unusual ... it's just not the assumed default
state of affairs.


I mean all users should have the same rules and spam threshold, subject
rewriting setting etc.


Including sharing a bayes database (if you're using bayes)?

In that case all you have to do is make sure that the user who is
running spamassassin is not root and has a writable home directory.
If your conjecture that procmail is running as the user popuser is
correct, make sure that popuser does not have its home directory set
to /.


Re: Warnings in procmail log

2006-07-08 Thread Bart Schaefer

On 7/8/06, Geoff Soper [EMAIL PROTECTED] wrote:


.qmail contains the lines:
| true
./Maildir/


Caveat:  I don't use qmail, and don't even particularly like qmail, so
what I'm about to say are really educated guesses rather than
definitive answers.


which I've altered to:
| true
| /usr/bin/procmail -m ./.procmailrc


No, don't use the -m option.  Just use

| /usr/bin/procmail

and let procmail figure out where the $HOME/.procmailrc file is on its
own.  If you want any options to procmail there at all, you want the
-d recipient option (where you'll have to get the value for
recipient from qmail somehow, I don't know how).

Incidentally, I have no idea what the purpose of that pipe to true is,
and I suspect you should just remove it.


and in that .procmailrc :
DIR=./Maildir/


What exactly do you think that's accomplishing?  If you never refer to
$DIR again anywhere, this is meaningless.  If you want to change
directories, assign to MAILDIR.  If you are trying to force procmail
to deliver in maildir format, I think what you want is

DEFAULT=$HOME/Maildir/

I'm not sure about the $HOME part, but DEFAULT should never be a
relative path (never one starting with ./ or with no directory
reference at all).


I've no desire to run different configurations for different users or
addresses, the single configuration is fine, I just want to solve these
errors I'm seeing in the procmail_log file.


Where is this ./.procmailrc file that you are trying to read with
the -m option?  That is, what do you expect the current directory (./)
to be at the time procmail runs?

If you really want exactly this same config for all users, then you
should move that ./.procmailrc file (wherever it is) to
/etc/procmailrc (with no dot) and insert DROPPRIVS=yes somewhere
before the recipe that runs spamassassin, probably at the very top of
the file (unless you want all users to write to the same log file as
well).  If you later add things to /etc/procmailrc, you'll need to
research whether they belong above or below the DROPPRIVS (below will
usually be safe, but not always correct).


Re: Warnings in procmail log

2006-07-08 Thread Bart Schaefer

On 7/8/06, Geoff Soper [EMAIL PROTECTED] wrote:

Bart Schaefer wrote:
I think I need to specify the .procmailrc as the .procmailrc file is per
e-mail address, not per user or even system-wide


I think we've just uncovered a crucial bit of missing information.

You're apparently running procmail in some kind of virtual-user
environment, where there is no user login name corresponding to the
email address being processed.  You need to explain these sorts of
things up front.  All the answers so far have assumed you have a
traditional unix/linux type environment where mail is delivered to
individual user accounts that have /etc/passwd file entries, separate
home directories, etc.

So, forget everything that's been said, and let's start over.


 Where is this ./.procmailrc file that you are trying to read with
 the -m option?  That is, what do you expect the current directory
 (./) to be at the time procmail runs?

the .procmailrc file is in /var/qmail/mailnames/domain.tld/test
alongside the .qmail file and the Maildir directory


In that case you need to tell spamassassin to look for its
configuration files in that location.

There may be a way to finagle the options to spamassassin itself to
make this work, but the easiest approach is to run spamd:

 spamd --create-prefs --virtual-config-dir=/var/qmail/mailnames/%d/%u

(see man spamd for other options you might want to pass, such as -m to
limit the number of simultaneous processes, etc.).  This is a daemon
that needs to run as a system service; you may already have an
/etc/rc.d/init.d/spamassassin or similar script for managing this
service.  It depends on your OS and whether you built SA yourself or
installed it with some kind of package management tool (other than
CPAN).

Then in each appropriate .procmailrc file,

:0fw
* < 256000
| /usr/bin/spamc -u [EMAIL PROTECTED]

where you'll need to get the equivalent of [EMAIL PROTECTED] for each
virtual address from somewhere; I don't know enough about qmail to
tell you how, but if it's not in an environment variable, perhaps you
can add it to the procmail command line after the ./.procmailrc and
then refer to it here as $1.


Just to confirm, the .procmailrc file isn't common to all users but the
SA setup is.


I'm no longer sure I understand what you consider to be the SA setup.


Re: Looking for Turn-key SA solution

2006-07-05 Thread Bart Schaefer

On 7/5/06, Burton Windle [EMAIL PROTECTED] wrote:

Does anybody know of a vendor that sells boxes with SpamAssassin
pre-installed, with a pretty GUI with quarantine ability? (My company
won't allow home-brewed solutions, as they want a vendor to call if I get
hit by a spam bus).


It's not exactly a vendor solution, but:

http://www.vmware.com/vmtn/appliances/directory/255


Re: Warnings in procmail log

2006-07-05 Thread Bart Schaefer

On 7/5/06, jdow [EMAIL PROTECTED] wrote:

You need DROPPRIVS=yes somewhere near the front of your .procmailrc.


No, you don't.  By the time the .procmailrc is read, privileges have
already been dropped.  The only place you need DROPPRIVS=yes is in
/etc/procmailrc in the event that you want to give up privileges
before the end of that file has been reached.

You should not have an /etc/procmailrc file at all unless you have
carefully studied what belongs there.


Re: spamassassin-3.0.4-1.el4

2006-07-03 Thread Bart Schaefer

On 7/3/06, Kaushal Shriyan [EMAIL PROTECTED] wrote:


I have spamassassin-3.0.4-1.el4 installed by default in RHEL4 Linux


There have been updates since then.  Current is
spamassassin-3.0.6-1.el4 -- but note that I recently reported that
spamd in that package has a problem with whitelist_from_rcvd
directives leaking from one user to another.  You might want to
install the 3.1.3 RPM from rpmforge.net


box, How do i configure spamassassin and integrate it with Sendmail


First you need to run (as root)

chkconfig spamassassin on
service spamassassin start

The RedHat (and rpmforge) spamassassin packages supply some files

/etc/mail/spamassassin/spamassassin-default.rc
/etc/mail/spamassassin/spamassassin-spamc.rc

There's nothing especially magic about these, but the intention is
that users who want to pass their mail through SA can insert into
$HOME/.procmailrc a line such as

INCLUDERC=/etc/mail/spamassassin/spamassassin-spamc.rc

and not have to worry about the details.

If you as system administrator want to run spamc for all users, you'd
place that line in the /etc/procmailrc file.  Just *before* that line,
you should also have the line

DROPPRIVS=yes

otherwise spamassassin will run as root rather than as the individual
user whose mail is being scanned.


Re: spamassassin-3.0.4-1.el4

2006-07-03 Thread Bart Schaefer

On 7/3/06, Kaushal Shriyan [EMAIL PROTECTED] wrote:

Thanks for the quick turnaround. I installed the latest spamassassin
version for RHEL4

[EMAIL PROTECTED] kaushal]# rpm -qa | grep spam
spamass-milter-0.3.0-1.2.el4.rf
spamassassin-3.1.3-1


Where did you get that spamassassin RPM?

schaefer[502] rpm -qf /etc/mail/spamassassin/spamassassin-spamc.rc
spamassassin-3.1.3-1.el4.rf

The contents of spamassassin-spamc.rc are very simple and have not
changed from the 3.0.6-1 RPM:

# send mail through spamassassin
:0fw
| /usr/bin/spamc



I also dont have procmailrc under /etc.


That's normal, there is no default global procmail configuration.  You
can just create that file.

However, if you are using the milter, then you should NOT run spamc
from procmail, so you don't need to make any changes to procmail
configuration in that case.


How do i proceed here


I've not used spamass-milter, so I don't know what is needed to
configure that.  You will at least need to make sure the spamd process
is running if you are using the milter (chkconfig and service start as
I mentioned previously), and you probably need to restart sendmail.


Re: spamassassin-3.0.4-1.el4

2006-07-03 Thread Bart Schaefer

On 7/3/06, Kaushal Shriyan [EMAIL PROTECTED] wrote:


I did what you said exactly and its up and running,

How do i test all this configurations for SPAM 


sendmail YourLocalEmailAddress < /usr/share/doc/spamassassin-3.1.3/sample-spam.txt


Re: RFC: spamd disables virtual-config when no @ in user name

2006-07-02 Thread Bart Schaefer

On 7/2/06, martin f krafft
[EMAIL PROTECTED] wrote:

On mail systems with virtual and local users, it's not easily
possible to run per-user spamc with user configuration.


I run two copies of spamd with different -p port options, and point
the virtual users' spamc at the port corresponding to the spamd with
the --virtual-config-dir option.
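
A rough sketch of how a delivery script might pick the port, assuming
(my assumption, not something spamd requires) that virtual users are
exactly the ones whose names contain an "@".  Both port numbers are
made up; use whatever -p values your two spamd instances were started
with.

```shell
#!/bin/sh
# Hypothetical ports for the two spamd instances.
LOCAL_SPAMD_PORT=783      # spamd serving ordinary local accounts
VIRTUAL_SPAMD_PORT=1783   # spamd started with --virtual-config-dir

# Print the port to use for the given user name.
spamd_port_for() {
    case $1 in
        *@*) echo "$VIRTUAL_SPAMD_PORT" ;;   # user@domain: virtual user
        *)   echo "$LOCAL_SPAMD_PORT" ;;     # bare login name: local user
    esac
}

# Usage (not run here):
#   spamc -p "$(spamd_port_for "$user")" -u "$user" < message
```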


Re: From header being added..???

2006-07-02 Thread Bart Schaefer

On 7/2/06,  [EMAIL PROTECTED] wrote:


I'm not 100% positive this is even a SA issue but it is driving me up
the wall.  Some mailings are having this added right after the SA
report.  It usually isn't an issue except that some users fetch their
mail from Exchange and these mailings are showing up munged.


Did you recently change from SA 3.0 or earlier to 3.1?  In 3.1 SA
began inserting its headers at the TOP of the filtered message header,
rather than at the end.  This is more closely conformant to the IETF
mail standards and helps reduce breakage of message signature schemes
such as DKIM.

The point is that you may have had a problem with mail processing for
a while, and the behavior of earlier versions of SA simply masked it.


X-Spam-Report:
*  1.5 DATE_IN_FUTURE_06_12 Date: is 6 to 12 hours after Received: date
*  1.8 FUZZY_AFFORDABLE BODY: Attempt to obfuscate words in spam
*ALL rules removed for simplicity*
From userid  Sun Jul  2 23:47:35 2006
Return-Path: [EMAIL PROTECTED]
Received: from k5u5r1

See the From header?


In all likelihood this was a "From " line (with no leading ">") at the
time SA was handed the message, and was rewritten as a ">From " line
by the local delivery processing after SA was already finished with
the message.  The latter part is correct if the mail is going to be
placed in a unix-style flat-file mailbox, but the former is wrong:
The "From " line, if any, should not be added until final delivery of
the message to the mailbox file.

Look at the processing that's upstream of spamassassin or spamc.  In
particular check whether the message is ever written to a file and
then read back in to the processing stream. The culprit is likely to
be whatever does that write.


Re: White List and Yellow List DNS Servers - Proposal

2006-06-30 Thread Bart Schaefer

On 6/30/06, Marc Perkel [EMAIL PROTECTED] wrote:

Who likes this idea?


Evidently habeas.com does, as that's now their business model.  Also
Bonded Sender (I think they changed the name recently, but I forget to
what).  And I believe the ISIPP maintains several such lists.  Do a
Google on "reputation service".


Re: internal/trusted again, MSA tested for SPF ?

2006-06-30 Thread Bart Schaefer

On 6/30/06, Daryl C. W. O'Shea [EMAIL PROTECTED] wrote:


OK, I see now that you want to unconditionally trust the MSA *and* all
hosts after it.  Which is reasonable if the MSA is just an MSA.  For
whatever reason you don't want to rely on auth tokens, etc.  Seems
reasonable to me.


That would mean that SA must be able to verify the Received: chain as
far back as the MSA, wouldn't it?  Otherwise forging a Received: for
the MSA would bypass all the network checks.


spamd not properly resetting whitelist?

2006-06-30 Thread Bart Schaefer

We recently installed a new CentOS4 server, which comes with SA 3.0.6
prepackaged, to serve as our local mail store (runs sendmail,
clamassassin, spamd, and an imap server).  The perl version is 5.8.5,
and it's an x86_64 platform.

Since migrating our users to this machine we frequently have spam
mis-classified as ham with USER_IN_WHITELIST as the culprit.  There
are no whitelist_from or whitelist_from_rcvd directives at the
/etc/mail/spamassassin/* level, and only one user who has any of these
directives in his user_prefs file, but *every* user has at random
times had spam mismarked as whitelisted.

In every case the spam so mis-marked was forwarded from a role address
on another server that matches one of the whitelist_from_rcvd lines in
the single aforementioned user_prefs.  If the same misclassified
message is put through "spamassassin" rather than spamd, or even if
it is run through spamc a second time, USER_IN_WHITELIST disappears
(the rest of the rules hit remain unchanged).

Looking at the mail logs, there does seem to be a correlation:  If the
first message scanned by a new spamd child is scanned on behalf of the
user who has whitelist lines in his user_prefs, every other message
scanned by that child is misclassified.  As long as the very first
scan is not for this user, the child behaves properly with respect to
the whitelist settings.

I don't see any bugzilla for this using a search on USER_IN_WHITELIST.
Has anyone else  encountered this issue?  Can anyone verify that it's
fixed in 3.1?


Re: White List and Yellow List DNS Servers - Proposal

2006-06-30 Thread Bart Schaefer

On 6/30/06, Marc Perkel [EMAIL PROTECTED] wrote:


Yeah - but what I'm thinking of is something that is automatic and
reputation based rather that paying someone to certify you. In other
words your server get whitelisted because you never send spam.


Paid or otherwise, how do you get on the list in the first place?  You
obviously used some criteria based on your own server logs to
determine which IPs never send spam -- but "never" is a long time,
and in some cases "spam" is subjective (people report all kinds of
stuff as spam for all kinds of reasons).


Re: trusted_networks confusion

2006-06-29 Thread Bart Schaefer

On 6/29/06, Daryl C. W. O'Shea [EMAIL PROTECTED] wrote:


EVERYTHING after an MX MUST be listed as BOTH trusted and internal
networks.


Under what circumstances would one list something as internal but not trusted?


Re: trusted_networks confusion

2006-06-29 Thread Bart Schaefer

On 6/29/06, Daryl C. W. O'Shea [EMAIL PROTECTED] wrote:

Bart Schaefer wrote:

 Under what circumstances would one list something as internal but not
 trusted?

NEVER.  Newer versions of SA won't even allow you to make that
misconfiguration.


Ah, good.  That's as I expected.  (So why doesn't SA simply always
merge internal_networks into trusted_networks?  It seems a waste of
effort to have to manually list the internal_networks in both places.)


Re: Not just use_bayes_rules 0

2006-06-26 Thread Bart Schaefer

No one has any comments at all?

-- Forwarded message --
From: Bart Schaefer [EMAIL PROTECTED]
Date: Jun 23, 2006 10:49 PM
Subject: Not just use_bayes_rules 0
To: Spamassassin Users List users@spamassassin.apache.org

I want to make sure I'm not misinterpreting something else before I
report this as a bug.

I just tried

use_bayes 1
use_bayes_rules 0

The effect of this seems to be that NONE of the rules are applied,
except whitelist_from and blacklist_from.  I had assumed it would just
turn off the BAYES_* rules, as if they had all been given zero scores.

  use_bayes_rules ( 0 | 1 )  (default: 1)
  Whether to use rules using the naive-Bayesian-style classifier
  built into SpamAssassin.  This allows you to disable the rules
  while leaving auto and manual learning enabled.

Under what circumstances would one want to disable ALL the rules while
still leaving auto-learning enabled?  What good could possibly come
of it?


Trouble with UNwhitelist_from_rcvd

2006-06-23 Thread Bart Schaefer

The short of it is that I can't get unwhitelist_from_rcvd to
unwhitelist anything.

Here's the situation:  We have a brand-new machine that's going to be
swapped in as our mail server.  We're trying to test everything
thoroughly before we switch over to it.  To avoid any loss of mail, I
have a test user on the old machine (call it X) with procmail recipes
to forward all mail to the same user on the new machine (Y).

All mail sent on the LAN is masqueraded so the headers say it is from
the brasslantern.com domain.  The test user is named
untrusted-relay.

On both X and Y, the user_prefs file has

whitelist_from_rcvd [EMAIL PROTECTED] brasslantern.com

Now, the trouble is, that to test SA I forward from X to Y *before* SA
processing.  Trimmed excerpts from spamassassin -D output on a spam
message (removed stuff like SPF query failing to load, etc., and
changed machine name to Y):

debug: SpamAssassin version 3.0.6
debug: Score set 0 chosen.
[...]
debug: is Net::DNS::Resolver available? yes
debug: Net::DNS version: 0.48
debug: all '*From' addrs: [EMAIL PROTECTED] [EMAIL PROTECTED]
debug: Running tests for priority: 0
debug: running header regexp tests; score so far=0
[...]
debug: forged-HELO: from= helo=0630.com by=brasslantern.com
debug: registering glue method for check_hashcash_value
(Mail::SpamAssassin::Plugin::Hashcash=HASH(0x1b9c270))
debug: all '*To' addrs: [EMAIL PROTECTED]
[EMAIL PROTECTED]
[...]
debug: running body-text per-line regexp tests; score so far=-99.702
debug: running uri tests; score so far=-99.702

The reason for the -99.702 is because the first of the two '*From'
addresses matches the whitelist_from_rcvd rule.  So I tried adding

unwhitelist_from_rcvd [EMAIL PROTECTED] brasslantern.com

but this does not change anything.  In fact I've tried every variant
of unwhitelist_from_rcvd that I can think of, to no effect.  The only
thing that changes the score is removing the whitelist_from_rcvd
directive.

I've searched bugzilla without finding anything about unwhitelist for
anything more recent than 2.60.  Anybody have any clues what's going
on, or what I'm doing wrong?


Re: Trouble with UNwhitelist_from_rcvd

2006-06-23 Thread Bart Schaefer

On 6/23/06, Daryl C. W. O'Shea [EMAIL PROTECTED] wrote:

Did you read the Mail::SpamAssassin::Conf perldoc?


Yes ... so what you're saying is, "previously used in" means "written
in the config file entry", not "used in spamassassin when matching".
The phrase "the address" is what threw me; the strings in
whitelist_from_rcvd are patterns that match addresses, they aren't
addresses.

OK, so that leaves the question:  How do I whitelist *all but one*
address from a given domain?  I don't think I want blacklist_from
because I don't actually want to blacklist this address; I just want
to NOT whitelist it.


Re: Trouble with UNwhitelist_from_rcvd

2006-06-23 Thread Bart Schaefer

On 6/23/06, Daryl C. W. O'Shea [EMAIL PROTECTED] wrote:


Well you could s/address/address pattern/.


I could, but plainly what I did was s/used in/used in processing/,
because it seemed a whole lot more intuitive for it to function that
way.  Ah, well.


This will probably change in a future version to do what you want
instead.


Does anyone have an alternate suggestion in the meantime?  I don't
think I can wait for SA3.2 before I roll out this new server.


Re: Black Copy filtering problem

2006-05-28 Thread Bart Schaefer

On 5/28/06, Phil (Sphinx) [EMAIL PROTECTED] wrote:


I really don't understand.


I haven't attempted to figure out what the SARE rule is doing, I'm afraid.


Do you think I should ask the exim-users list ?


If the goal is to limit the volume of mail that any particular user
can cause to be delivered, and exim is your MTA, then yes, the exim
list would be the place to ask.


Virtual user config and auto-whitelist (again)

2006-05-27 Thread Bart Schaefer

A while ago, I asked about updating the AWL when using spamd
--virtual-config-dir.  The discussion got sidetracked onto the topic
of the obsolete -a option and the AWL plugin, and consequently my
original question never got a satisfactory answer.  Here it is again:

On 4/26/06, Bart Schaefer [EMAIL PROTECTED] wrote:

I've recently switched from running spamd on our mail server machine,
where all users have direct access to their SA config in their home
directory, to running spamd on a second machine and using
--virtual-config-dir for user configuration.  (SA 3.1.1)

The only problem this has posed is that there's no convenient way for
users to modify entries in the auto-whitelist file.  Some spam (mostly
mortgage offers with obfuscated text) that came in before bayes was
retrained got scored low, and consequently the AWL scores are pulling
the total score for new spam from the same source back down below the
5.0 threshold in spite of it hitting BAYES_90 and above.  I've
resorted to deleting the auto-whitelist files from the virtual config
dir when someone notices this effect, but that's hardly a scalable
solution.

Is there another approach I don't know about?


Re: SA and Bayes in a multi-user environment

2006-05-15 Thread Bart Schaefer

Bayes is only as accurate as its trainer(s).

http://www.jgc.org/blog/2006/05/theres-one-born-every-minute-spam-and.html


sa-update vs. RDJ -- Default rules directory changed somehow?

2006-05-13 Thread Bart Schaefer

I think there's some kind of conflict between sa-update and
RulesDuJour that has borked my spamassassin installation, but I can't
figure out how.

This morning after RDJ restarted spamd, spamc started returning
messages with ONLY the spamassassin version header added, not the
score report.  Running "spamc -R" by hand, I found that spamd was
returning "no report template found".

So I ran spamassassin -D and discovered that the SARE rules are
being loaded out of /etc/mail/spamassassin, but none of the default
rules are being loaded.

[24302] dbg: config: using /var/lib/spamassassin/3.001001 for sys rules pre files
[24302] dbg: config: using /var/lib/spamassassin/3.001001 for default rules dir

What?  The default rules dir is supposed to be
/usr/share/spamassassin/.  That's where it was before the restart this
morning.  I've moved the SARE rulesets out of the way, grepped for
/var/lib in all the config files that the dbg log says are being
loaded, run SA as two different users, etc., and I can't find anything
that's changing the default rules directory.  There's nothing in
/var/lib/spamassassin/3.001001 except the sa-update subdirectories
with a single MIRRORED.BY file in one of those, so no wonder I'm not
getting any rules loaded.

What the heck is going on here?  Where am I failing to look?


Re: So, when do we start handling [dot] in a URI

2006-05-13 Thread Bart Schaefer

On 5/12/06, jdow [EMAIL PROTECTED] wrote:


 jdow  And you propose we do what instead?


Look for other characteristics of the messages that could be filtered.
I haven't seen any of these spams, so I don't know what those might
be, but this can hardly be the *only* thing the spammer is doing.
It's just the one that jumped out as obvious.  But it's also the one
that's likely to be easiest to mutate rapidly, so it's probably the
worst one to attack.


Re: sa-update vs. RDJ -- Default rules directory changed somehow?

2006-05-13 Thread Bart Schaefer

On 5/13/06, Bart Schaefer [EMAIL PROTECTED] wrote:

I think there's some kind of conflict between sa-update and
RulesDuJour that has borked my spamassassin installation, but I can't
figure out how.


Apparently the conflict is only that RDJ restarts spamd automatically,
but sa-update does not.


What?  The default rules dir is supposed to be
/usr/share/spamassassin/.


I finally grepped spamassassin itself and found:

  Default configuration data is loaded from the first existing directory
  in:

  /var/lib/spamassassin/3.001001
  /usr/share/spamassassin
  /usr/share/spamassassin
  /usr/local/share/spamassassin
  /usr/share/spamassassin

(Why is /usr/share/spamassassin in that list three times?)

Well, guess what.  sa-update creates the
/var/lib/spamassassin/3.001001 directory if it does not exist, rather
than finding the directory that does exist and using that.  I didn't
notice this at first because spamd didn't restart after sa-update, and
RDJ didn't do anything new until yesterday.  (This is a fairly recent
SA reinstall, and an even more recent RDJ install.)

Yet the CPAN install of spamassassin uses /usr/share/spamassassin for
the installation.  Surely the install ought to use the same directory
that sa-update is going to create, or vice-versa?
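
To make the failure mode concrete, here is a sketch of the "first
existing directory wins" lookup described in that excerpt.  It is not
SpamAssassin's actual code, just the same logic in shell; the function
name is mine.

```shell
#!/bin/sh
# Print the first argument that is an existing directory, without
# looking at whether it contains anything.
first_existing_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1    # none of the candidates exist
}

# Usage (not run here):
#   first_existing_dir /var/lib/spamassassin/3.001001 \
#                      /usr/share/spamassassin \
#                      /usr/local/share/spamassassin
```

Because the test is only "does the directory exist", an empty
/var/lib/spamassassin/3.001001 shadows a fully populated
/usr/share/spamassassin.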


Re: sa-update vs. RDJ -- Default rules directory changed somehow?

2006-05-13 Thread Bart Schaefer

On 5/13/06, Theo Van Dinter [EMAIL PROTECTED] wrote:

On Sat, May 13, 2006 at 10:57:11AM -0700, Bart Schaefer wrote:
 Well, guess what.  sa-update creates the
 /var/lib/spamassassin/3.001001 directory if it does not exist, rather
 than finding the directory that does exist and using that.  I didn't

Of course.  sa-update *only* uses the /var/lib area.  It doesn't care
about what other rules you already have installed or where they are.


But surely there's some kind of disconnect here.  sa-update creates an
empty directory that spamassassin (and spamd) then uses preferentially
to the one that really has the rules in it.


 Yet the CPAN install of spamassassin uses /usr/share/spamassassin for
 the installation.  Surely the install ought to use the same directory
 that sa-update is going to create, or vice-versa?

No.  sa-update is optional and writes stuff to its own area separate
from the installation of SA.


In that case I would argue that either (a) running sa-update should
not create a directory when there are no updates to populate it, or
(b) running sa-update should copy the existing set of rules into the
update directory.


You may want to check out http://wiki.apache.org/spamassassin/RuleUpdates
which talks about sa-update, how it works, etc.


I did, before the first time I ran it.  That page explains about
setting up channels, gives an example of running sa-update (I didn't
bother to restart spamassassin after I ran it, because it reported
there were no updates available) and goes on to say:

   * Currently, for 3.1.1 and 3.2.0, to use any channel for updates
requires that updates.spamassassin.org also be used. This is because
once the update directory exists, the SpamAssassin modules expect to
find all rules in that directory.

Nowhere does it say that it creates this directory and leaves it empty
when there are no updated rules.  Nowhere in man sa-update does it
say that either.  How was I supposed to realize that running sa-update
would leave me with a crippled installation?


Re: sa-update vs. RDJ -- Default rules directory changed somehow?

2006-05-13 Thread Bart Schaefer

On 5/13/06, Theo Van Dinter [EMAIL PROTECTED] wrote:

It's not empty if the download is successful.  I believe there's a ticket
about changing the behavior so an empty directory isn't left behind if the
first attempt to do an update fails.


Sounds good.


 In that case I would argue that either (a) running sa-update should
 not create a directory when there are no updates to populate it, or

I'd have to double check, but for (a), I believe that happens already.
Having no updates available doesn't create the directory.  However, what's
more likely is that there's an upgrade available but the download failed.


Was there an update available on May 8?  That's when I ran sa-update
last.  It just happens to have been most of a week before anything
else caused spamd to restart.  I'm pretty sure that I got the exit
code 1 from sa-update; I'm quite sure that I *didn't* get an exit code
of 4 or more.

I ended up with:
/var/lib/spamassassin/3.001001/updates_spamassassin_org/ (empty directory)
/var/lib/spamassassin/3.001001/updates_spamassassin_org.tmp/MIRRORED.BY

Having removed the entire 3.001001 tree, I just re-ran sa-update and
now I have what appears to be the correct update:
/var/lib/spamassassin/3.001001/updates_spamassassin_org/ (lots of .cf files)
/var/lib/spamassassin/3.001001/updates_spamassassin_org.cf


So I'm confused.  If you're running 3.1.0, sa-update acts completely
differently and there are no updates available for it anyway.  If you're
running 3.1.1, there are updates available.  If you're running 3.2.0,
there are updates available.  So the only thing that makes sense here
is that the download failed, which is documented in the wiki page.


I'm running 3.1.1.
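
Since sa-update doesn't restart spamd itself, the exit codes discussed
above could drive that step.  A sketch, based on my reading of the
3.1.x sa-update documentation (0 = update installed, 1 = no fresh
updates, 4 or more = error); codes 2 and 3 aren't described there, so
they are reported as unknown rather than guessed at.  The function
name is hypothetical.

```shell
#!/bin/sh
# Map an sa-update exit status to what a cron wrapper should do next.
action_for_sa_update_status() {
    case $1 in
        0)   echo restart ;;   # new rules installed: spamd must reload them
        1)   echo nothing ;;   # no fresh updates
        2|3) echo unknown ;;   # not documented for 3.1.x, as far as I know
        *)   echo error ;;     # 4 or higher: something went wrong
    esac
}

# Usage (not run here):
#   status=0; sa-update || status=$?
#   [ "$(action_for_sa_update_status "$status")" = restart ] &&
#       service spamassassin restart
```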


Re: So, when do we start handling [dot] in a URI

2006-05-12 Thread Bart Schaefer

On 5/12/06, Bret Miller [EMAIL PROTECTED] wrote:

Seems spammers have taken up to doing what many of us have in posting
e-mail addresses, putting [dot] instead of the . in the URL and telling
people to replace it


Gosh, exactly what regular people have been doing on web sites and
in news/list postings for years, to prevent spammers from harvesting
their addresses.

So now that the spammers are using our own defenses against us, you
suggest that we should invent the technology to defeat those defenses?
And *then* what happens?


Re: So, when do we start handling [dot] in a URI

2006-05-12 Thread Bart Schaefer

On 5/12/06, Kai Schaetzl [EMAIL PROTECTED] wrote:

Bart Schaefer wrote on Fri, 12 May 2006 07:34:05 -0700:

 So now that the spammers are using our own defenses against us, you
 suggest that we should invent the technology to defeat those defenses?

What's there to invent? The point is that these need to be identified as
URI. So, convert to URI and then lookup in SURBL.


It just seems like a useless rathole to go down.

(1) Website maintainer uses technique X to obscure addresses on his site.
(2) Spammer notices that his harvester failed to decrypt X.
(3) Spammer copies technique X and uses it to obscure his spam.
(4) SA programmer devises a way to decrypt X to block the spam.
(5) Spammer copies algorithm from SA into his address harvester.
(6) Website maintainer starts getting spam, so he devises a new X.
(7) Repeat at (1).
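
To show how little there is to step (4), and therefore how cheaply it
gets copied in step (5), here is a rough sketch for the "[dot]" case.
The pattern covers only the one spelling mentioned in this thread; the
function name is mine, and this is not how SA itself parses URIs.

```shell
#!/bin/sh
# Rewrite "[dot]" (any case, optional surrounding whitespace) to "."
# in the text on stdin.
deobfuscate_dots() {
    # \[ is a literal "[", and the lone "]" after [Tt] is a literal "]".
    sed -e 's/[[:space:]]*\[[Dd][Oo][Tt]][[:space:]]*/./g'
}

# Usage (not run here):
#   echo 'www [dot] example [dot] com' | deobfuscate_dots
```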


Re: Those Re: good obfupills spams (bayes scores)

2006-05-02 Thread Bart Schaefer

Incidentally, the FAQ answer for HowScoresAreAssigned on the SA wiki
is out of date.


Re: unpacking spam attachments for sa-learn

2006-05-01 Thread Bart Schaefer

On 5/1/06, Jeff Portwine [EMAIL PROTECTED] wrote:

I tried ripmime, and it does extract the attachments but it throws away all
of the header information and gives me only the attachment by itself.


I wrote an extractor in procmail for simple (as in, it doesn't handle
nested structure well) MIME body parts.

http://www.well.com/user/barts/email/mimepart.txt

You'd do something like

CONTENT_TYPE=message/rfc822
INCLUDERC=mimepart.txt
RESULT=`echo "$BODY_PART" | spamc -L spam`

You probably want to avoid doing this on very large messages, as it
does slurp the entire message into a variable.


Re: Those Re: good obfupills spams

2006-04-29 Thread Bart Schaefer

On 4/29/06, List Mail User [EMAIL PROTECTED] wrote:


While SA is quite robust largely because of the design feature that
no single reason/cause/rule should by itself mark a message as spam, I have
to guess that the FP rate that the majority of users see for BAYES_99 is far
below 1%.



Anyway, to better address the OP's questions:  The system is more
robust if instead of changing the weighting of existing rules (assuming that
they were correctly established to begin with), you add more possible inputs


Exactly.  For example, I find that anything in the subset consisting
of messages that don't mention my email address anywhere in the To/Cc
headers and that also score above BAYES_70 has close to 100% likelihood
of being spam.  However, since I also get quite a lot of mail that
doesn't fall into that subset, I can't simply increase the scores for
the BAYES rules.

In this case I use procmail to examine the headers after SA has scored
the message, but I've been considering creating a meta-rule of some
kind.  Trouble is, SA doesn't know what my email address means (it'd
need to be a list of addresses), and I'm reluctant to turn on
allow_user_rules.


Re: Those Re: good obfupills spams

2006-04-29 Thread Bart Schaefer

On 4/29/06, Matt Kettler [EMAIL PROTECTED] wrote:

Besides.. If you want to make a mathematics based argument against me,
start by explaining how the perceptron mathematically is flawed. It
assigned the original score based on real-world data.


Did it?  I thought the BAYES_* scores have been fixed values for a
while now, to force the perceptron to adapt the other scores to fit.


Re: Those Re: good obfupills spams (bayes scores)

2006-04-29 Thread Bart Schaefer

On 4/29/06, Matt Kettler [EMAIL PROTECTED] wrote:

 In SA 3.1.0 they did force-fix the scores of the bayes rules,
particularly the high-end. The perceptron assigned BAYES_99 a score of
1.89 in the 3.1.0 mass-check run. The devs jacked it up to 3.50.

That does make me wonder if:
1) When BAYES_9x FPs, it FPs in conjunction with lots of other rules
due to the ham corpus being polluted with spam.


My recollection is that there was speculation that the BAYES_9x rules
were scored too low not because they FP'd in conjunction with other
rules, but because against the corpus they TRUE P'd in conjunction
with lots of other rules, and that it therefore wasn't necessary for
the perceptron to assign a high score to BAYES_9x in order to push the
total over the 5.0 threshold.

The trouble with that is that users expect training on their personal
spam flow to have a more significant effect on the scoring.  I want to
train bayes to compensate for the LACK of other rules matching, not
just to give a final nudge when a bunch of others already hit.
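
To put numbers on that, here is a small Python sketch of the
arithmetic, using the two BAYES_99 scores quoted above and the 5.0
threshold.  The function name is only for illustration.

```python
SPAM_THRESHOLD = 5.0  # the threshold discussed above


def points_still_needed(bayes_score, threshold=SPAM_THRESHOLD):
    """Points the other rules must add before the total reaches the threshold."""
    return round(max(threshold - bayes_score, 0.0), 2)


# BAYES_99 scores quoted above for the SA 3.1.0 mass-check run.
for label, score in [("perceptron", 1.89), ("as released", 3.50)]:
    needed = points_still_needed(score)
    print(f"{label}: BAYES_99={score:.2f}, other rules must add {needed:.2f}")
```

Either way, a message that hits BAYES_99 and nothing else stays under
the threshold.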

I filed a bugzilla some while ago suggesting that the bayes percentage
ought to be used to select a rule set, not to adjust the score as a
component of a rule set.


Those Re: good obfupills spams

2006-04-28 Thread Bart Schaefer

The largest number of spam messages currently getting through SA at my
site are short text-only spams with subject "Re: good " followed by an
obfuscated drug name (so badly mangled as to be unrecognizable in many
cases).  The body contains a gappy-text list of several other kinds of
equally unreadable pharmaceuticals, a single URL which changes daily
if not more often, and then several random words and a short excerpt
from a novel.

They usually hit RCVD_IN_BL_SPAMCOP_NET,URIBL_SBL but those alone
aren't scored high enough to classify as spam, and I'm reluctant to
crank them up just for this.  However, the number of spams getting
through SA has tripled in the last four days or so, from around 14 for
every thousand trapped, to around 40.

I'm testing out RdJ on the SARE_OBFU and SARE_URI rulesets but so far
they aren't having any useful effect.  Other suggestions?


Re: Those Re: good obfupills spams

2006-04-28 Thread Bart Schaefer

On 4/28/06,  [EMAIL PROTECTED] wrote:


I would make a subject "Re: good" rule that scores just high enough to push
it to the spam level.


They're only scoring about 3.3, and I'm reluctant to make "Re: good"
worth 2 points all by itself.  That'd be worse than increasing the
spamcop score.

A meta rule, though ...


Re: Virtual user config and auto-whitelist

2006-04-26 Thread Bart Schaefer
On 4/26/06, Rosenbaum, Larry M. [EMAIL PROTECTED] wrote:

  From: Bart Schaefer [mailto:[EMAIL PROTECTED]
  ...
  (Someone remind me why the spamd option to disable the auto-whitelist
  was dropped?)

 It hasn't been dropped; they just moved the documentation into
 Plugin/AWL.pm.

Ah, right, duh.  So the answer to my question is that the -a option
was dropped because it doesn't make sense to have an option to
en/disable a plugin.


Re: (OT, but relevant) Playing with AOL?

2006-02-23 Thread Bart Schaefer
On 2/23/06, Peter P. Benac [EMAIL PROTECTED] wrote:
 Get enough of those TOS messages in one day and they will still block your
 IP address and any IP address that you have assigned to you.

FUD.  They don't block multiple IPs at once as far as I can tell.

 Furthermore, they have already announced that mail providers who will not
 send mail to them through goodmail.com will find tightened filters. When
 the goodmail.com system goes in place it will REPLACE the current
 whitelist program.

More FUD.  Please don't spread rumors.  The initial report was that
Goodmail would replace the *extended* whitelist, which is not the same
as the basic whitelist.  Further, they've backed off on that (and said
that in fact they never intended that to be the case in the first
place).  The existing whitelists will continue operating as they have.


Re: Bayes rocks

2005-09-18 Thread Bart Schaefer
On 9/16/05, jdow [EMAIL PROTECTED] wrote:
 You are better off to use a normal SpamAssassin meta rule.

How so?  SA doesn't know how to interpret "not to me" (unless I write
a plugin) -- it has no built-in knowledge of, for example, all
possible sendmail aliases for my personal account -- and individual
users can't add their own rules, so the only way I can code a custom
expression to match all my personal addresses is to do it outside of
SA.

I suppose it would be possible to write a blacklist_not_to rule as a
plugin, but procmail is doing it just fine, thanks.


Bayes rocks

2005-09-16 Thread Bart Schaefer
On 9/16/05, jdow [EMAIL PROTECTED] wrote:
 Yes indeedy. And I've been looking at Bayes scores here just a wee bit.
 BAYES_99 just does not hit on ham and hits on high percentages of spam.
 Even BAYES_95 does not hit ham. I go down to BAYES_80 before I hit 0.05
 percent of ham.

During a two-week period recently I captured a copy of all mail that
(1) did not reach an SA score of 5+ points and (2) did not have my
personal email address in the To:/Cc: headers.  I then examined the
set of SA rules that were triggered by those messages (as recorded in
the X-Spam-Status).  100% of such mail that hit BAYES_80 or more was
in fact spam; about 90% of BAYES_70 was spam.  However, there were a
few BAYES_80 during the same period that *did* have my address in the
headers and that were *not* spam (and, also correctly, not tagged by
SA), so it wasn't just a matter of cranking up the score for BAYES_80.

Instead I added procmail recipes to treat as spam the combination of
"not to me" plus BAYES_[89][059] regardless of the SA point score.
That was a month ago, and I haven't had a false positive yet.
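
The recipes themselves are procmail, but the test they perform can be
sketched in a few lines of Python.  This is only an illustration of
the logic: the address list is a placeholder, and the function name
is made up.

```python
import re
from email import message_from_string
from email.utils import getaddresses

# Placeholder addresses; substitute every alias that counts as "me".
MY_ADDRESSES = {"me@example.com", "me-alias@example.org"}

# Same pattern as described above: BAYES_80 and higher.
HIGH_BAYES = re.compile(r"BAYES_[89][059]")


def is_probable_spam(raw_message, my_addresses=MY_ADDRESSES):
    """True when the mail is "not to me" and SA recorded a high Bayes hit."""
    msg = message_from_string(raw_message)
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    to_me = any(addr.lower() in my_addresses for _name, addr in recipients)
    status = msg.get("X-Spam-Status", "")
    return (not to_me) and bool(HIGH_BAYES.search(status))
```

Note that the point score in X-Spam-Status is never consulted; only
the rule name and the recipient headers matter.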

If there are any developers listening ... has anyone given any
consideration to Bugzilla #3785?


More unintentional spam humor/irony

2005-09-11 Thread Bart Schaefer
The choice of anti-bayes-filler below is unfortunate on so many levels
... and on top of that, they spammed our abuse address.

(Links to spammer site deleted.)

-- Forwarded message --
Date: Sun, 11 Sep 2005 09:45:40 +0500
From: Nadia Joyner [EMAIL PROTECTED]
To: abuse
Subject: Re: Nadia
 
The Environmental Protection Agency said initial samples of the
floodwaters indicated high levels of lead and E. coli and other coliform
bacteria.
 
Don't you think it's about time to drop a few pounds?
Now you can, without sacrifice or exercise
 
A representative of the Army Corps of Engineers said 23 of the 148
permanent pumps in New Orleans were working, their efforts augmented by
three portable pumps.


Re: ANNOUNCE: SpamAssassin 3.1.0-rc2 release candidate available!

2005-09-08 Thread Bart Schaefer
On 9/7/05, Loren Wilton [EMAIL PROTECTED] wrote:
  has this been opened as a bug in BZ yet?
 
 I haven't seen a sign of it.  I hope the OP does this, I'd hate to have to
 try to track back through 3 weeks of deleted mail to find the original
 posting.  Especially since I don't remember who posted it!

John Rudd, on August 27.  Took me about 15 seconds to find it in gmail
(well, really, to find the message John sent on August 29 in which he
quoted the August 27 mail).  The subject was "Problems with
SpamAssassin 3.1 RC1 and MIMEDefang".


Re: Rather too refreshing: bayes.lock

2005-05-19 Thread Bart Schaefer
On 5/19/05, Ben Wylie [EMAIL PROTECTED] wrote:
 
 debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock
[...]
 debug: expired old Bayes database entries in 1533 seconds: 485101 entries

No real help here, but another data point:

I've been seeing occasional problems on my Linux workstation with 3.0
leaving bayes.lock files behind, throughout the 3.0 series.  I suspect
it does have something to do with syncing the journals and/or token
expiry, because although it always happens after spamd has run, it
does not happen in any observably predictable way.


Re: Bombarded by German political spam

2005-05-15 Thread Bart Schaefer
On 5/15/05, Raymond Dijkxhoorn [EMAIL PROTECTED] wrote:
 
 http://mailscanner.prolocation.net/german.cf

You've got a bit of duplication in there (rules 02 and 22 are the
same, as are 04 and 26).


Re: Confession and rage

2005-05-07 Thread Bart Schaefer
On 5/6/05, List Mail User [EMAIL PROTECTED] wrote:
 Again, there is an unfortunate exception provided by Sec 3.17
 which allows Transactional or relationship message- and in particular
 clause A.iii.I specifically allows notification concerning a change in
 the terms or features of.  This has specifically been taken by courts to
 allow the sending of email notifying you of a change in price(s)

Eh?  When?  Where?  Case reference, please.

 (iii) a valid physical postal address of the sender.
 
 [...] It is unfortunate that both P.O. Boxes
 and Agents for service of a holding company would clearly qualify under
 this clause (i.e. a physical postal address != physical address).

I've actually spoken to an attorney specifically about this point, and
it was his contention that this was not at all clear.


Re: AWL whaaat

2005-05-05 Thread Bart Schaefer
On 5/4/05, Matt Kettler [EMAIL PROTECTED] wrote:
 
 That's great.. I was trying to think up a good scenario for that
 acronym, but couldn't.

Obviously it's the

Automated Spam Sorting Average Scoring System Involving Ninjas


Re: sa-learn issues

2005-04-01 Thread Bart Schaefer
On Mar 31, 2005 5:46 PM, AltGrendel [EMAIL PROTECTED] wrote:
 Matt Kettler wrote:
 
 The problem is, it seems that sa-learn is ignoring the -u / --user= flag.
 
 sa-learn uses the userid of the user that calls it. Period.
 
 Then why does man sa-learn show a -u flag:
 
 Is this obsolete?

It's not obsolete, but it's incomplete.  Changing the userid does not
change the location that sa-learn uses for the bayes databases, and
although you can specify an alternate config file path, the user-level
config file is not allowed to change the bayes database location.

So for virtual users you must actually set the HOME environment
variable to point to the correct location before running sa-learn, and
*also* use the -u option to set the userid; and of course the real
user running sa-learn must have write permission for the bayes files
and for the $HOME/.spamassassin directory (to create lock files).
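
As a rough illustration of the HOME-plus-"-u" combination, here is a
Python sketch that builds the command line and environment for one
virtual user.  The function name, the paths and the single-message
argument are assumptions for the example; it only builds the
invocation and does not run anything.

```python
import os


def sa_learn_invocation(virtual_home, username, message_path, is_spam=True):
    """Build argv and environment for training one virtual user's Bayes data.

    Hypothetical helper; paths and user names are supplied by the caller.
    """
    env = dict(os.environ)
    # Point HOME at the virtual user's directory so that sa-learn looks
    # for $HOME/.spamassassin there.
    env["HOME"] = virtual_home
    argv = [
        "sa-learn",
        "--spam" if is_spam else "--ham",
        "-u", username,
        message_path,
    ]
    return argv, env
```

The pair can be handed to subprocess.run(argv, env=env, check=True) by
whatever real user has write permission on the Bayes files.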

I ended up making a setuid (to the spamassassin user) copy of
sa-learn and a wrapper script for running it.


AWL interaction with Bayes, and sa-learn

2005-02-16 Thread Bart Schaefer
First, tell me if there's anything wrong with this summary:

1. A message arrives and is passed to spamassassin and/or spamc+spamd.
2. The score for that message is computed.
3. The AWL score for that sender is updated.
4. The message was mis-classified, so after delivery the user feeds
the message to sa-learn.
5. The Bayes score for (the tokens in) that message is updated, *but
the AWL score for the sender remains unchanged.*
6. A similar message from the same sender arrives.  The net score is
moved away from the Bayes-influenced value by the (obsolete, or at
least incorrectly recorded) AWL value.

Assuming I've got that right, tell me whether there's anything wrong
with this conclusion:

The AWL will wrongly influence the score for both spam and non-spam as
long as the AWL remains unaffected at step 5, in any case where the
initial classification was incorrect.

Finally the question:

Shouldn't sa-learn retrain the AWL as well?  At the least, throw out
the entry for that sender and begin recomputing it with the next
message?
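
To make steps 5 and 6 concrete, here is a toy Python model of a
per-sender average.  It is not SpamAssassin's AWL code; the class,
the method names and the regression factor of 0.5 are assumptions
chosen for illustration.  It shows a stale average dragging a
retrained score back under 5.0, and what "throw out the entry" would
do instead.

```python
class SimpleAWL:
    """Toy model of a per-sender score average (not SpamAssassin's code)."""

    def __init__(self, factor=0.5):
        # Assumed fraction of the distance to the sender's mean that the
        # score is moved.
        self.factor = factor
        self.history = {}  # sender -> (total_score, message_count)

    def adjust(self, sender, score):
        """Return the adjusted score, then record the unadjusted one."""
        total, count = self.history.get(sender, (0.0, 0))
        if count:
            mean = total / count
            adjusted = score + (mean - score) * self.factor
        else:
            adjusted = score
        self.history[sender] = (total + score, count + 1)
        return adjusted

    def forget(self, sender):
        """Drop the sender's entry, as proposed for sa-learn above."""
        self.history.pop(sender, None)


awl = SimpleAWL()
awl.adjust("spammer@example.com", 1.0)           # missed spam, scored low
stale = awl.adjust("spammer@example.com", 7.0)   # after Bayes training
awl.forget("spammer@example.com")
fresh = awl.adjust("spammer@example.com", 7.0)
print(stale, fresh)
```

With these made-up numbers the retrained message lands at 4.0 while
the old average is in play, and at 7.0 once the entry is discarded.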


Re: PERL update broke spamassassin?

2005-02-05 Thread Bart Schaefer
It looks as if /usr/bin/spamassassin is being executed by the shell,
attempting to interpret perl as shell commands.  This probably means
that your upgrade changed the path to the perl executable, so the #!
line at the top of /usr/bin/spamassassin is no longer pointing to the
right thing, but it must have something to do with the way procmail is
invoking spamassassin because normally you'd get a no such file or
directory error instead in that case.
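
One quick way to check the #! theory is to read the first line of the
script and see whether the interpreter it names exists and is
executable.  A minimal Python sketch, with made-up function names:

```python
import os


def shebang_interpreter(script_path):
    """Return the interpreter path from a script's #! line, or None."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


def interpreter_is_usable(script_path):
    """True when the #! line names an existing, executable file."""
    interpreter = shebang_interpreter(script_path)
    return (
        interpreter is not None
        and os.path.isfile(interpreter)
        and os.access(interpreter, os.X_OK)
    )
```

Running interpreter_is_usable("/usr/bin/spamassassin") as the same
user that procmail runs as would show whether the perl path survived
the upgrade.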


Re: Sudden spam volume decrease?

2005-01-14 Thread Bart Schaefer
On Fri, 14 Jan 2005 10:36:25 -0800, Bart Schaefer
[EMAIL PROTECTED] wrote:
  Menno van Bennekom wrote:

Sorry, that was misattributed.  I meant to trim that line.


Re: SA 3 - I'm Totally Stuck!

2005-01-07 Thread Bart Schaefer
On Fri, 7 Jan 2005 10:27:38 -, bubba [EMAIL PROTECTED] wrote:
 
 I'm trying to install Spamassassin 3 on a Linux box w/Ensim control panel
 installed

Meaning you're trying to install it through the control panel rather
than using a real login shell?  Or only meaning that you're using
Ensim to set up the .procmailrc files?

 but I'm experiencing a variety of errors. I've modified each
 users' .procmailrc file, but the logs are showing that spamc cannot be found

No, they're showing that spamc cannot be *executed*, which is an
entirely different thing.

This implies to me that procmail is executing on a different machine,
with a different binary architecture, from that where spamc was
compiled.

 (regardless of how I address it, and I know it's there - I can run it from
 the command line).

And you're sure there's only one machine involved, and no NFS mounts
or the like?

 Copying spamc to each users' home directory allows it to be run

That pretty strongly implies that the mail delivery machine is not the
same one where the users have their home directories.

 Previously, I had version 2.6 working quite happily, so this is confusing
 the hell out of me! Any help most gratefully received!

And did you install 2.6x yourself?


Re: Spamassassin classifying normal mail as SPAM due to RBL tests

2004-11-20 Thread Bart Schaefer
The URIBL_* tests are not concerned with where the mail is from;
they're examining the message *body* to see if it contains links to
websites that are commonly advertised in spam.  If you remove the
entire message body as "private content" when posting the sample to
the list, you prevent anyone from helping you determine which URLs
might be the problem.
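
If the body really can't be posted, you can at least list the
hostnames it links to and post those.  Here is a rough Python sketch;
the regular expression is a deliberately simple assumption and is not
how SpamAssassin itself finds URIs.

```python
import re
from urllib.parse import urlsplit

# Simplistic pattern for illustration; real mail needs more care.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def linked_hostnames(body_text):
    """Return the distinct hostnames of http(s) links found in a body."""
    hosts = []
    for url in URL_PATTERN.findall(body_text):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            # Malformed URL (e.g. unbalanced brackets); skip it.
            continue
        if host and host not in hosts:
            hosts.append(host)
    return hosts
```

The resulting list is the sort of thing the URIBL_* tests would be
looking up, without exposing the rest of the message.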


Re: ver 3.0 opinions

2004-10-29 Thread Bart Schaefer
On Thu, 28 Oct 2004 15:21:59 -0700, Jeff Ramsey [EMAIL PROTECTED] wrote:
 Is version 3 really any better at stopping spam than 2.63?

Version 3 stops different spam than 2.63, in my experience so far.
E.g. it's better at catching the drug spam but not as good at the
"earn cash for making phone calls" spam.

If you use full network tests, I suspect 3.001 (did I get enough
zeroes? too many?) is actually better than 2.63/2.64.

Using it in local-only mode, though, I've found it not very different.
The spams that get through 3.x that do not get through 2.6x are
generally (a) those that match BAYES_99, which by itself in the
default configuration is no longer a large enough score to make me
happy, or (b) would have been tagged as spam except that the AWL
smoothed them down to just below the threshold.

I confess with some embarrassment that I haven't yet looked into how
to turn off the AWL in spamd.  Statement (b) above comes from running
the same messages through spamassassin -t and having them marked as
spam with the only difference in the latter case being the absence of
an AWL hit.

