A false positive...

2006-11-22 Thread Steve [Spamassasin]
An ebay watched item email has been wrongly tagged as spam... with the
following rules:

--
 2.2 INVALID_DATE        Invalid Date: header (not RFC 2822)
 0.8 DATE_IN_PAST_06_12  Date: is 6 to 12 hours before Received: date
 0.1 TW_SJ               BODY: Odd Letter Triples with SJ
 0.0 HTML_MESSAGE        BODY: HTML included in message
 3.0 BAYES_95            BODY: Bayesian spam probability is 95 to 99%
                         [score: 0.9887]
 0.2 HTML_TITLE_EMPTY    BODY: HTML title contains no text
-0.0 SARE_LEGIT_EBAY     Has signs it's from ebay, from, headers, uri
-1.1 AWL                 AWL: From: address is in the auto white-list
--


The (sanitised) headers read:


--
Subject: ...
From: eBay [EMAIL PROTECTED]
Date: Wed, 22 Nov 2006 09:03:16 GMT-07:00
To: ...
Return-Path: [EMAIL PROTECTED]
X-Original-To: ...
Delivered-To: ...
Received: from mx9.smf.ebay.com (mxsmfpool05.ebay.com [66.135.209.202]) by ...
    (Postfix) with ESMTP id 06C81D8C24 for ...; Wed, 22 Nov 2006 16:04:00 +0000
    (GMT)
Received: from sjc2bat08.sjc.ebay.com (sjc2bat08.sjc.ebay.com [10.11.37.18]) by
    mx9.smf.ebay.com (8.13.5/8.13.5) with ESMTP id kAMG3oaM009188 for ...; Wed,
    22 Nov 2006 09:03:58 -0700
X-eBay-MailTracker: 10089.487.3.0
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary=7101714.1164211396002.JavaMail.ebba.sjc2bat08
Message-ID: [EMAIL PROTECTED]
--

While I understand why this email may have triggered the Bayesian rule (where 
spammers have copied ebay's email style...) I am bemused by INVALID_DATE and 
DATE_IN_PAST_06_12.

The dates I see in the header look valid to me - and (if we allow for 
international time differences) the message was sent two seconds before it was 
received.

Am I overlooking something here?  Why doesn't SpamAssassin like these dates?

Steve
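
One possible reading of the two date rules, sketched in Python. RFC 2822 allows a zone to be a numeric offset such as -0700 or one of a few obsolete names; "GMT-07:00" is neither. The sketch assumes (this is not confirmed anywhere in the thread) that a parser which rejects the zone falls back to reading the clock time as UTC. Under that assumption the Date: header appears about seven hours older than the first Received: line, which falls in the 6 to 12 hour band.

```python
import re
from datetime import datetime, timedelta, timezone

# Zone forms RFC 2822 accepts: a numeric offset such as -0700, or one of the
# obsolete names (UT, GMT, North American zones, single military letters).
RFC2822_ZONE = re.compile(r"[+-]\d{4}|UT|GMT|[ECMP][SD]T|[A-IK-Za-ik-z]")


def zone_is_rfc2822(date_header: str) -> bool:
    """Return True if the last token of a Date: value is a valid zone."""
    zone = date_header.split()[-1]
    return RFC2822_ZONE.fullmatch(zone) is not None


date_value = "Wed, 22 Nov 2006 09:03:16 GMT-07:00"

# Time stamp of the topmost Received: header, 16:04:00 +0000.
received_utc = datetime(2006, 11, 22, 16, 4, 0, tzinfo=timezone.utc)

# ASSUMPTION: the unrecognised zone is discarded and 09:03:16 is read as UTC.
as_utc = datetime(2006, 11, 22, 9, 3, 16, tzinfo=timezone.utc)

# What the sender presumably meant: 09:03:16 at UTC-7.
as_minus7 = datetime(2006, 11, 22, 9, 3, 16,
                     tzinfo=timezone(timedelta(hours=-7)))

print(zone_is_rfc2822(date_value))   # False
print(received_utc - as_utc)         # 7:00:44
print(received_utc - as_minus7)      # 0:00:44
```

If the zone had been written as -0700, the gap would be 44 seconds and neither rule would be expected to fire on this basis.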




Funny spamd failure... (Maybe SARE/rules-du-jour related?)

2006-11-17 Thread Steve [Spamassasin]
The other night my default gentoo RulesDuJour for Spamassassin acquired new
Adult and General rule-sets from SARE.  Thereafter spamd refused all
connections and subsequently received mail was not spam filtered. 
Issuing '/etc/init.d/spamd restart' as root resolved the situation...
but I don't want to have to do this every time a rule-set is
automatically updated overnight.

This is a (sanitised) extract from /var/log/messages :

--
Nov 15 03:20:00 svr fcron[5328]: process already running: root's
/usr/bin/test -x /usr/sbin/run-crons && /usr/sbin/run-crons
Nov 15 03:20:14 svr postfix/pickup[11065]: ...: uid=0 from=root
Nov 15 03:20:14 svr postfix/cleanup[11232]: ...: message-id=...
Nov 15 03:20:15 svr spamd[7808]: spamd: connection from localhost
[127.0.0.1] at port 1125
Nov 15 03:20:15 svr spamd[7808]: spamd: setuid to foouser succeeded
Nov 15 03:20:15 svr spamd[7808]: spamd: processing message .. for
foouser:1000
Nov 15 03:20:18 svr spamd[7808]: spamd: clean message (-2.9/5.0) for
foouser:1000 in 3.1 seconds, 647 bytes.
Nov 15 03:20:18 svr spamd[7808]: spamd: result: . -2 - AWL,BAYES_00
scantime=3.1,size=647,user=foouser,...
Nov 15 03:20:18 svr postfix/local[11237]: ...
Nov 15 03:20:18 svr postfix/qmgr[5607]: ...: removed
Nov 15 03:20:19 svr spamd[5462]: prefork: child states: II
Nov 15 03:20:26 svr postfix/pickup[11065]: ...: uid=0 from=root
Nov 15 03:20:26 svr postfix/cleanup[11232]: ...
Nov 15 03:20:27 svr spamd[7808]: spamd: setuid to foouser succeeded
Nov 15 03:20:27 svr spamd[7808]: spamd: processing message ... for
foouser:1000
Nov 15 03:20:29 svr spamd[7808]: spamd: clean message (-2.2/5.0) for
foouser:1000 in 2.7 seconds, 612 bytes.
Nov 15 03:20:29 svr spamd[7808]: spamd: result: . -2 - AWL,BAYES_05
scantime=2.7,size=612,user=foouser,uid=1000,...
Nov 15 03:20:29 svr postfix/local[11237]: EEA5F3B945:
to=[EMAIL PROTECTED], orig_to=root, relay=local, delay=3, status=sent
(delivered to command: /usr/bin/proc
Nov 15 03:20:29 svr postfix/qmgr[5607]: EEA5F3B945: removed
Nov 15 03:20:30 svr spamd[5462]: prefork: child states: II
Nov 15 03:21:05 svr spamd[5462]: spamd: server killed by SIGTERM,
shutting down
Nov 15 03:21:11 svr rc-scripts: Failed to stop spamd
Nov 15 03:30:00 svr fcron[5328]: process already running: root's
/usr/bin/test -x /usr/sbin/run-crons && /usr/sbin/run-crons
Nov 15 03:40:00 svr fcron[11746]: Job /usr/bin/test -x
/usr/sbin/run-crons && /usr/sbin/run-crons started for user root (pid 11747)
Nov 15 03:50:00 svr fcron[11759]: Job /usr/bin/test -x
/usr/sbin/run-crons && /usr/sbin/run-crons started for user root (pid 11760)
Nov 15 03:50:24 svr postfix/smtpd[11772]: connect from localhost[127.0.0.1]
Nov 15 03:50:24 svr postfix/smtpd[11772]: ...: client=localhost[127.0.0.1]
Nov 15 03:50:24 svr postfix/cleanup[11775]: ...: message-id=...
Nov 15 03:50:24 svr postfix/qmgr[5607]: 73FAA3B4FB: from=...
Nov 15 03:50:24 svr postfix/smtpd[11772]: disconnect from
localhost[127.0.0.1]
Nov 15 03:50:24 svr spamc[11779]: connect(AF_INET) to spamd at 127.0.0.1
failed, retrying (#1 of 3): Connection refused
Nov 15 03:50:25 svr spamc[11779]: connect(AF_INET) to spamd at 127.0.0.1
failed, retrying (#2 of 3): Connection refused
--

Does anyone else have this problem?  Can it be attributed to fcron or
RulesDuJour or something peculiar to my setup?
I don't understand the "process already running" messages from fcron -
but my cron jobs all seem to be executed normally.

The script which was run immediately before spamd stopped accepting 
connections is the standard one supplied for Gentoo - a copy of the version I'm 
using is here: http://temporary.shic.dynalias.net/rules_du_jour
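
As a stop-gap until the cause is found, a watchdog run from cron could check whether spamd still accepts connections and restart it only when it does not. The following is a minimal sketch, not part of RulesDuJour: the function names are made up for illustration, and port 783 is spamd's usual default, which is an assumption about this particular setup. The restart command is the one quoted above.

```python
import socket
import subprocess


def spamd_accepting(host: str = "127.0.0.1", port: int = 783,
                    timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to spamd can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" as well as timeouts.
        return False


def ensure_spamd(restart_cmd=("/etc/init.d/spamd", "restart"),
                 check=spamd_accepting, run=subprocess.run) -> bool:
    """Restart spamd if it refuses connections.

    Returns True if a restart was attempted, False if spamd was healthy.
    """
    if check():
        return False
    run(list(restart_cmd), check=False)
    return True
```

Calling ensure_spamd() from a root cron job shortly after the nightly rule update would cover the failure shown in the log; it does not explain why the init script "Failed to stop spamd" in the first place.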

--




bayes_toks, expiry and spamd...

2006-10-26 Thread Steve [Spamassasin]
This feels like a series of FAQs, but previous frequent answers don't
seem to answer my questions directly...

With SpamAssassin 3.1.4 I'm running spamd, and my global procmail uses
spamc to process mail.  Individual users train/report with spamc too.
In an end-user account there's a .spamassassin directory and this contains:

auto-whitelist 
bayes_toks
user_prefs
bayes_journal
bayes_seen

All of which makes sense... Over time, however, there is a build-up of
bayes_toks.expire$ files (where $ stands for decimal digits) and I'm unclear
about these.  Anecdotally, when there are lots of these
bayes_toks.expire files, from time to time, emails stop being
processed by spamassassin and mail and spam are delivered to my inbox
without any spamassassin headers.  This happened most recently
overnight and, subsequently, no messages were processed for spam.  I
re-started spamassassin and things seemed to work again... I ran
sa-learn --force-expire and it reported keeping ~17,000 tokens and
expiring ~6,000.  My bayes_toks.expire files remained.  This left me
with lots of unanswered questions...

What causes the creation of a bayes_toks.expire file?
Do bayes_toks.expire files affect performance, or just consume disk
space?
What effect would deleting these files have on spamassassin Bayesian
processing?
Is it likely that the 'failure' of spamassassin arose as a consequence
of a growing number of entries in bayes_toks, or is it more likely a
fault triggered by processing a malicious mail?
I've seen vague references to time-out settings - is this likely a
configuration issue (if so, which configuration options should be my focus)?
The fact that my forced expiry kept  75% of the tokens suggests to me
that expiry was not happening automatically... should it be?  How can I
tell if it is working?
Should I be regularly forcing expiry from a cron-job?
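
To see whether the build-up correlates with the failures, the leftover files could at least be counted from a cron job. This is a small sketch that only lists them: the function name is invented for illustration, and it deliberately deletes nothing, since whether deletion is safe is exactly the open question above.

```python
from pathlib import Path


def stale_expire_files(bayes_dir: Path, prefix: str = "bayes_toks.expire"):
    """List leftover expiry work files in bayes_dir, oldest first."""
    files = [p for p in bayes_dir.glob(prefix + "*") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime)
```

For example, len(stale_expire_files(Path.home() / ".spamassassin")) logged once a day would show whether the count jumps on the nights when mail stops being scanned.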




Infuriating gif spam...

2006-09-26 Thread Steve [Spamassasin]
I've been getting a _lot_ of spam recently which has been defeating my
spamassassin configuration - all of it has the same general form... A
message with auto-generated prose and an image.  I installed FuzzyOCR
and this helped, but one particular variant still slips through.

The problematic spams all embed a GIF image which confuses gocr (in
spite of being easily human-readable) - though I'm not sure why.  Three
images which defeat FuzzyOCR for me are:

http://temporary.shic.dynalias.net/Evil_Spam_Samples.zip

I would like to know if there is a straightforward way either (a) to
configure FuzzyOCR to decode the text, or (b), assuming that is hard, a
way to identify this kind of 'strange' GIF and apply a static score to
them (at least as a temporary measure?)

Thanks in advance for any pointers...




Re: Infuriating gif spam...

2006-09-26 Thread Steve [Spamassasin]
Jorge Valdes wrote: 
 There are multiple images in these gifs, and because the first image
 is 'junk', sending this image through gocr will yield no results. The
 problem is that you have to scan all images to find the text.  Try
 this with each image:

 convert -append News.gif pnm:- | gocr -

That works a treat...

 I have an updated version of the FuzzyOcr plugin that has this and
 other improvements available here:

 http://www.joval.info/proj/FuzzyOcr.html

Version 2.3j works much better...  I'd previously been using version
2.3b for which I had an ebuild for gentoo.

One thing I have noticed, however, is a number of errors/warnings which
spamd sticks into /var/log/messages when it is started:

--
Sep 26 17:20:48 server spamd[25563]: Subroutine new redefined at
/etc/mail/spamassassin/FuzzyOcr.pm line 122.
Sep 26 17:20:48 server spamd[25563]: Subroutine parse_config redefined
at /etc/mail/spamassassin/FuzzyOcr.pm line 132.
Sep 26 17:20:49 server spamd[25563]: Subroutine finish_parsing_end
redefined at /etc/mail/spamassassin/FuzzyOcr.pm line 184.
Sep 26 17:20:49 server spamd[25563]: Subroutine dummy_check redefined at
/etc/mail/spamassassin/FuzzyOcr.pm line 288.
Sep 26 17:20:49 server spamd[25563]: Subroutine load_global_words
redefined at /etc/mail/spamassassin/FuzzyOcr.pm line 292.
Sep 26 17:20:49 server spamd[25563]: Subroutine load_personal_words
redefined at /etc/mail/spamassassin/FuzzyOcr.pm line 315.
Sep 26 17:20:49 server spamd[25563]: Subroutine max redefined at
/etc/mail/spamassassin/FuzzyOcr.pm line 343.
Sep 26 17:20:49 server spamd[25563]: Subroutine within_threshold
redefined at /etc/mail/spamassassin/FuzzyOcr.pm line 351.
Sep 26 17:20:49 server spamd[25563]: Subroutine fmt_time redefined at
/etc/mail/spamassassin/FuzzyOcr.pm line 388.
Sep 26 17:20:49 server spamd[25563]: Subroutine check_image_hash_db
redefined at /etc/mail/spamassassin/FuzzyOcr.pm line 414.
Sep 26 17:20:49 server spamd[25563]: Subroutine add_image_hash_db
redefined at /etc/mail/spamassassin/FuzzyOcr.pm line 492.
Sep 26 17:20:49 server spamd[25563]: Subroutine calc_image_hash
redefined at /etc/mail/spamassassin/FuzzyOcr.pm line 539.
Sep 26 17:20:49 server spamd[25563]: Subroutine debuglog redefined at
/etc/mail/spamassassin/FuzzyOcr.pm line 580.
Sep 26 17:20:49 server spamd[25563]: Subroutine wrong_ctype redefined at
/etc/mail/spamassassin/FuzzyOcr.pm line 590.
Sep 26 17:20:49 server spamd[25563]: Subroutine corrupt_img redefined at
/etc/mail/spamassassin/FuzzyOcr.pm line 608.
Sep 26 17:20:49 server spamd[25563]: Subroutine known_img_hash redefined
at /etc/mail/spamassassin/FuzzyOcr.pm line 626.
Sep 26 17:20:49 server spamd[25563]: Subroutine removedir redefined at
/etc/mail/spamassassin/FuzzyOcr.pm line 637.
Sep 26 17:20:49 server spamd[25563]: Subroutine fuzzyocr_check redefined
at /etc/mail/spamassassin/FuzzyOcr.pm line 657.
--

Have I somehow loaded this module twice? I didn't get these messages
until I upgraded to version 2.3j from 2.3b
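
One way to test the "loaded twice" theory is to look for the same plugin named on more than one loadplugin line across the files in /etc/mail/spamassassin. The sketch below assumes that a leftover configuration file from the 2.3b install could be the culprit; the function name and the sample file names in the usage note are hypothetical.

```python
from collections import defaultdict


def duplicate_loadplugins(configs):
    """Map plugin name -> files that load it, for plugins loaded more than once.

    configs maps a file name to that file's text.
    """
    seen = defaultdict(list)
    for filename, text in configs.items():
        for line in text.splitlines():
            # Drop trailing comments, then split into directive and arguments.
            parts = line.split("#", 1)[0].split()
            if len(parts) >= 2 and parts[0] == "loadplugin":
                seen[parts[1]].append(filename)
    return {plugin: files for plugin, files in seen.items() if len(files) > 1}
```

Feeding it the contents of every .pre and .cf file in the directory (for instance a current FuzzyOcr.cf next to a forgotten copy from the older version) would return the plugin name together with both files if each carries its own loadplugin line.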







Re: SA v3.1.5 and Rules Problems

2006-09-25 Thread Steve [Spamassasin]
jdow wrote:
 Those are info comments. 3.1.5 has more rigorous lint checking than
 3.1.3. And you're seeing the results.
I've also recently upgraded, but only to version 3.1.4 (in order to be
able to use fuzzyocr as I'd like.)

I have a very similar set of warnings (though none about DCC/Razor etc.
as I have those options installed):

--
spamd[7083]: rules: meta test DRUGS_ERECTILE has undefined dependency
'__DRUGS_ERECTILE7'
spamd[7083]: rules: meta test SARE_SUB_ACCEPT_CCARDS has undefined
dependency '__SARE_SUB_FROM_PAYPAL'
spamd[7083]: rules: meta test SARE_SPEC_PROLEO_M2a has dependency
'MIME_QP_LONG_LINE' with a zero score
spamd[7083]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined
dependency 'SARE_XMAIL_SUSP2'
spamd[7083]: rules: meta test SARE_HEAD_SUBJ_RAND has undefined
dependency 'SARE_HEAD_XAUTH_WARN'
spamd[7083]: rules: meta test SARE_HEAD_SUBJ_RAND has dependency
'X_AUTH_WARN_FAKED' with a zero score
spamd[7083]: rules: meta test SARE_RD_SAFE has undefined dependency
'SARE_RD_SAFE_MKSHRT'
spamd[7083]: rules: meta test SARE_RD_SAFE has undefined dependency
'SARE_RD_SAFE_GT'
spamd[7083]: rules: meta test SARE_RD_SAFE has undefined dependency
'SARE_RD_SAFE_TINY'
spamd[7083]: rules: meta test VIRUS_WARNING_DOOM_BNC has undefined
dependency 'VIRUS_WARNING_MYDOOM4'
spamd[7083]: rules: meta test SARE_OBFU_CIALIS has undefined dependency
'SARE_OBFU_CIALIS2'
spamd[7083]: rules: meta test FP_MIXED_PORN3 has undefined dependency
'FP_PENETRATION'
--

Curiously, running spamassassin --lint reports no errors, but the
messages above do appear in /var/log/messages when I start the spamd
service.  I investigated DRUGS_ERECTILE - which does reference
__DRUGS_ERECTILE7 - which does seem to be undefined.  As far as I am
aware I'm using the latest rules-emporium rules (I use
spamassassin-ruledujour on gentoo) - so, are these (minor) bugs with the
rulesemporium rule-sets?  Is there something I should be doing to
resolve this, or is it best simply to wait for the rule-sets to be updated?
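
The check I did by hand for DRUGS_ERECTILE can be roughed out in a few lines of Python. This is a simplified sketch, not how SpamAssassin's own lint works: it ignores ifplugin blocks, tflags and zero scores, and the function name and the sample rules in the usage note are made up for illustration.

```python
import re

# Directives that define a named rule in a .cf file (simplified list).
RULE_TYPES = {"header", "body", "rawbody", "uri", "full", "meta"}
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def undefined_meta_dependencies(rules_text):
    """Map each meta rule to the sorted rule names it uses but nobody defines."""
    defined = set()
    metas = {}
    for line in rules_text.splitlines():
        # '#' may also occur inside a regex, but the rule type and name always
        # come before it, and meta expressions contain no '#'.
        parts = line.split("#", 1)[0].split(None, 2)
        if len(parts) == 3 and parts[0] in RULE_TYPES:
            defined.add(parts[1])
            if parts[0] == "meta":
                metas[parts[1]] = parts[2]
    missing = {}
    for name, expr in metas.items():
        unknown = sorted(set(NAME.findall(expr)) - defined)
        if unknown:
            missing[name] = unknown
    return missing
```

Given a rule set that defines __DRUGS_ERECTILE1 as a body rule and a meta DRUGS_ERECTILE written as (__DRUGS_ERECTILE1 || __DRUGS_ERECTILE7), it returns {'DRUGS_ERECTILE': ['__DRUGS_ERECTILE7']}, matching the first warning above.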





Re: Very simple user query...

2005-09-14 Thread Steve [Spamassasin]

jdow wrote:

I absolutely do not want to report automatically - in the sense that 
I am adamant that I want human intervention before reporting.  
Conversely - given the task of establishing a remote shell; finding 
the correct email in maildir - and verifying it is indeed the mail I 
determined was a spam in my email client - followed by manually 
reporting it individually to each service... I'm inclined not to 
bother.  If, for example, I had an IMAP folder into which I drop spam 
that my mail server should report on my behalf - then reporting would 
become far less of a chore.




Simple matter of coding. That is how I handle ham and spam training. I
simply dunk it into ham and spam folders and let a cron job run sa-learn
over the two folders. In this case you'd probably have to code up
something that takes the folder apart properly, forwards the mail
appropriately, then tosses it. I haven't done such a thing. But there are
perl tools for reading messages via IMAP that could be used as the core
of a new tool.



Hmmm - given that this seems such an obvious thing to want, and because 
I'm quite laz^H^H^Hbusy these days, I'd hoped that such a thing 
already existed.  It strikes me that the best way to do this would be with a 
daemon which monitors the IMAP folders for user-identified spam, runs sa-learn 
on it and reports it - then moves it to the same folder as the automatically 
identified spam.  I realise that it wouldn't be a herculean effort to 
implement this but I'm very reluctant to re-invent the wheel.
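
For what it's worth, the core of such a tool might look something like the sketch below, using Python's standard imaplib rather than the perl tools mentioned. It is only an outline under several assumptions: the folder name in the usage note is hypothetical, "spamassassin -r" is used as the single reporting command, and moving the message afterwards and the polling loop are left out. The human step stays intact, because only mail a person has dragged into the folder is touched.

```python
import imaplib
import subprocess


def fetch_messages(conn, folder):
    """Yield (message number, raw RFC 822 bytes) for every message in folder."""
    conn.select(folder)
    status, data = conn.search(None, "ALL")
    if status != "OK":
        return
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        # A successful fetch returns [(envelope info, message bytes), b")"].
        if status == "OK" and parts and isinstance(parts[0], tuple):
            yield num, parts[0][1]


def report_folder(conn, folder, run=subprocess.run):
    """Pipe each message to 'spamassassin -r'; return how many were reported."""
    count = 0
    for _num, raw in fetch_messages(conn, folder):
        run(["spamassassin", "-r"], input=raw, check=True)
        count += 1
    return count
```

Usage would be along the lines of conn = imaplib.IMAP4_SSL(host), conn.login(user, password), then report_folder(conn, "ReportAsSpam") from a cron job.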







Re: Very simple user query...

2005-09-13 Thread Steve [Spamassasin]

jdow wrote:


You do not say which version of spamassassin you are using. If it is not
3.0.4 an upgrade might help.


It's 3.0.4 - the latest stable build that's made it into Gentoo Portage.


   * Is there somewhere where I can report spams which aren't caught by
 the default configuration in order to feed-back into future
 improvements?


There are places to report them manually.


I'm familiar with razor-report, for example - but it is a real pain to 
mess about with this command line tool when all my mail is managed 
remotely over IMAP



I have a strong personal bias against automating anything related to
spam REPORTING. Please examine the downsides of automatic reporting
before proceeding.


I absolutely do not want to report automatically - in the sense that I 
am adamant that I want human intervention before reporting.  Conversely 
- given the task of establishing a remote shell; finding the correct 
email in maildir - and verifying it is indeed the mail I determined was 
a spam in my email client - followed by manually reporting it 
individually to each service... I'm inclined not to bother.  If, for 
example, I had an IMAP folder into which I drop spam that my mail server 
should report on my behalf - then reporting would become far less of a chore.


Steve




Re: Very simple user query...

2005-09-13 Thread Steve [Spamassasin]

Michael Monnerie wrote:

Maybe it would be good to report those e-mails to razor, etc. too. I'll 
give it a try. Do you have a script to report from IMAP to SA?
 

I don't... It can't be that hard to do using a polling approach... It 
would be neater if this was triggered by the IMAP server... but I'm not 
aware of such a facility.  I'd love to be proved wrong...


BTW: is pyzor good / worth the effort? Its latest release is from 
September 7, 2002, so I thought it wouldn't be used much 
anymore. Do you get enough hits?
 

That's a good question... I wasn't aware that the latest release of 
pyzor was so old... but it wouldn't have concerned me if I had... I'd be 
inclined to suspect that what's important about pyzor is server-side.  
Anyway - I've compiled some statistics over the past few months... and 
it seems my installation of pyzor was really useful until sometime 
during July... thereafter no more matches were made... which is curious...


        #PYZOR_CHECK   #spams

May         2794        3121
June        4402        4809
July         271        3017
August         0        3669
Sept           0        1546

This, I guess, might indicate part of the reason why less spam is caught 
today...  On further investigation I found pyzor crashed when run... 
un-merging then re-merging it solved the problem which was probably some 
strange python dependency.
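
The drop is easier to see as a hit rate per month. The short calculation below uses the figures from the table, reading the July row as 271 PYZOR_CHECK hits out of 3017 spams (an assumption, since the two columns run together in the archived copy).

```python
# (month, PYZOR_CHECK hits, total spams) taken from the table above.
monthly = [
    ("May", 2794, 3121),
    ("June", 4402, 4809),
    ("July", 271, 3017),   # assumed split of the run-together figure
    ("August", 0, 3669),
    ("Sept", 0, 1546),
]


def hit_rate(hits: int, spams: int) -> float:
    """Fraction of spams that matched PYZOR_CHECK."""
    return hits / spams if spams else 0.0


for month, hits, spams in monthly:
    print(f"{month:<7}{hit_rate(hits, spams):6.1%}")
```

That works out to roughly 90% in May and June, about 9% in July and nothing afterwards, which fits pyzor breaking early in July.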


Steve





Re: Very simple user query...

2005-09-13 Thread Steve [Spamassasin]

Pedro Sam wrote:


 I'm familiar with razor-report, for example - but it is a real pain to
 mess about with this command line tool when all my mail is managed
 remotely over IMAP

I haven't been using spamassassin for a while, but last I checked,
spamassassin -r will report spam to DCC/pyzor/razor all in one go.


(Having just checked the manual) - yes - that does seem much better than 
reporting to all the services I use individually.  Unfortunately, on its 
own, it doesn't address the user interface issue from the perspective of 
a client remotely accessing mail over IMAP/SMTP...






Very simple user query...

2005-09-12 Thread Steve [Spamassasin]
I'm using spamassassin (Razor, Pyzor, DCC) and procmail to filter all my 
mail on my (Gentoo) linux-server, to which I connect from a number of 
Windows (XP/2000) machines using Mozilla Thunderbird to access my 
(dovecot) IMAP folders on the linux server.  I configured spamassassin 
to use Rulesdujour and to regularly update those rules - and I was 
very happy... at least 99.99% of spam was correctly marked, with only one 
incident of false positives (for which spamassassin wasn't entirely to 
blame) in several months.


Lately I've been less lucky - only ~99% of my spam is marked as such... 
which sounds good, but the remaining 1% gives me up to a dozen bogus 
messages each day... which is frustrating.  To the naked eye the missed 
spam is obviously spam - but typically the only significant rule it 
triggers is the Bayesian rule...  As I've stuck to the default settings 
this alone is insufficient to identify a mail as spam.


I'm left with several questions...

   * Is there somewhere where I can report spams which aren't caught by
 the default configuration in order to feed-back into future
 improvements?
   * Is there an easy way to report spam explicitly to the checksum
 services (Razor/Pyzor/DCC)?

Any other suggestions are welcome...

Steve



Re: Very simple user query...

2005-09-12 Thread Steve [Spamassasin]

Martin Hepworth wrote:


Steve

OK - what do you get for spamassassin -D --lint ??
 


Output attached: sdlint.txt...


This will give you the list of tests etc. it's triggering, along with things
that might be causing problems. The URI-RBLs are enabled by default in most 
configs, but Gentoo might have removed this from the init.pre (as it is in the 
RH rpms), which is a right PITA.

In /etc/mail/spamassassin there should be an init.pre file and the following
line should be enabled to make the URI-RBLs work..

loadplugin Mail::SpamAssassin::Plugin::URIDNSBL

if it doesn't exist or has a # in the front then that will not help at all.

I've got that line... and I can confirm that some RBLs do work - for 
example - a spam was classified today with these matches:



0.5 SARE_MSGID_ADDED       Message ID added by later system
1.7 MSGID_FROM_MTA_ID      Message-Id for external message added locally
0.1 RAZOR2_CF_RANGE_51_100 BODY: Razor2 gives confidence level above 50%
                           [cf: 100]
3.5 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                           [score: 1.0000]
1.5 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
2.2 DCC_CHECK              Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
2.0 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP address
                           [213.106.39.160 listed in dnsbl.sorbs.net]
1.2 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
                           [Blocked - see http://www.spamcop.net/bl.shtml?213.106.39.160]
3.1 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL
                           [213.106.39.160 listed in sbl-xbl.spamhaus.org]
1.6 DNS_FROM_RFC_POST      RBL: Envelope sender in postmaster.rfc-ignorant.org
0.1 RCVD_IN_NJABL_DUL      RBL: NJABL: dialup sender did non-local SMTP
                           [213.106.39.160 listed in combined.njabl.org]
1.0 URIBL_SBL              Contains an URL listed in the SBL blocklist
                           [URIs: e4v.net]
0.4 URIBL_AB_SURBL         Contains an URL listed in the AB SURBL blocklist
                           [URIs: e4v.net]
2.5 URIBL_JP_SURBL         Contains an URL listed in the JP SURBL blocklist
                           [URIs: e4v.net]
1.5 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
                           [URIs: e4v.net]
3.2 URIBL_OB_SURBL         Contains an URL listed in the OB SURBL blocklist
                           [URIs: e4v.net]
4.3 URIBL_SC_SURBL         Contains an URL listed in the SC SURBL blocklist
                           [URIs: e4v.net]
0.1 DIGEST_MULTIPLE        Message hits more than one network digest check
1.7 SARE_SPEC_ROLEX        Rolex watch spam
2.3 SARE_SPEC_ROLEX_REP    Rolex Replic




debug: SpamAssassin version 3.0.4
debug: Score set 0 chosen.
debug: running in taint mode? no
debug: diag: module not installed: DBI ('require' failed)
debug: diag: module installed: DB_File, version 1.811
debug: diag: module installed: Digest::SHA1, version 2.10
debug: diag: module installed: IO::Socket::UNIX, version 1.21
debug: diag: module installed: MIME::Base64, version 3.05
debug: diag: module installed: Net::DNS, version 0.49
debug: diag: module installed: Net::LDAP, version 0.33
debug: diag: module installed: Razor2::Client::Agent, version 2.77
debug: diag: module installed: Storable, version 2.13
debug: diag: module installed: URI, version 1.35
debug: ignore: using a test message to lint rules
debug: using /etc/mail/spamassassin/init.pre for site rules init.pre
debug: config: read file /etc/mail/spamassassin/init.pre
debug: using /usr/share/spamassassin for default rules dir
debug: config: read file /usr/share/spamassassin/10_misc.cf
debug: config: read file /usr/share/spamassassin/11_gentoo.cf
debug: config: read file /usr/share/spamassassin/20_anti_ratware.cf
debug: config: read file /usr/share/spamassassin/20_body_tests.cf
debug: config: read file /usr/share/spamassassin/20_compensate.cf
debug: config: read file /usr/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /usr/share/spamassassin/20_drugs.cf
debug: config: read file /usr/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /usr/share/spamassassin/20_head_tests.cf
debug: config: read file /usr/share/spamassassin/20_html_tests.cf
debug: config: read file /usr/share/spamassassin/20_meta_tests.cf
debug: config: read file /usr/share/spamassassin/20_phrases.cf
debug: config: read file /usr/share/spamassassin/20_porn.cf
debug: config: read file /usr/share/spamassassin/20_ratware.cf
debug: config: read file /usr/share/spamassassin/20_uri_tests.cf
debug: config: read file /usr/share/spamassassin/23_bayes.cf
debug: config: read file /usr/share/spamassassin/25_body_tests_es.cf
debug: config: read file /usr/share/spamassassin/25_hashcash.cf
debug: config: read file /usr/share/spamassassin/25_spf.cf
debug: config: read file /usr/share/spamassassin/25_uribl.cf
debug: 

Re: Very simple user query...

2005-09-12 Thread Steve [Spamassasin]

Martin Hepworth wrote:


Steve

Ok looks good. If you can drop an example of a spam that 'gets through' to a
web page somewhere, I can run it over my system and see what happens.

I've got loads of extra rules (most of rulesemporium.com etc. etc.) so we'll
see what hits...



I should have read your suggestion more carefully - I tried mailing a 
zip file as an attachment - which seems to have been eaten.


   http://www.shic.dynalias.net/spam.zip

Contains two spams...  The eaten message would have said:

--
I suspect that the rulesemporium rules are what I refer to as Gentoo's 
rulesdujour - though I can't be sure that my automated script picks 
the same rules as you have.


I've attached a zip file containing two spams (sensitive details removed 
with '#' characters... this shouldn't confuse spamassassin) These two 
spams are typical of what's annoying me... Both these examples have 
DATE_IN_PAST_12_24, but this is not the case for all of what is slipping 
past.

--





Re: Very simple user query...

2005-09-12 Thread Steve [Spamassasin]

Martin Hepworth wrote:


Steve

OK looks like these are both uk.geocities.com abuse spam.

If you look at the archive you'll find some extra rulesets for these little
blighters (and their variants).
 

Genius answer! For some reason it had completely escaped my notice that 
all of the spams missed by SA over the past month had a uk.geocities.com 
address!  I've opted for a score of 4 for any mail mentioning a 
uk.geocities.com URL - which is hopefully good enough to avoid this kind 
of problem without too great a risk of losing a mail that happens to 
reference a homepage on uk.geocities.com in an innocent way.
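
For anyone wanting to do the same, the idea amounts to a uri rule with a fixed score in the local configuration. The sketch below shows one possible shape of such a rule as a string, plus the same match expressed in Python so it can be tried against sample URLs. The rule name LOCAL_UK_GEOCITIES is made up for illustration, and the exact rule I used may differ.

```python
import re

# Hypothetical local rule illustrating the idea; the name is invented.
LOCAL_CF = r"""
uri      LOCAL_UK_GEOCITIES  /^https?:\/\/uk\.geocities\.com\b/i
describe LOCAL_UK_GEOCITIES  URL hosted at uk.geocities.com
score    LOCAL_UK_GEOCITIES  4.0
"""

# The same pattern as a Python regular expression.
UK_GEOCITIES = re.compile(r"^https?://uk\.geocities\.com\b", re.IGNORECASE)


def static_score(uris, score: float = 4.0) -> float:
    """Return the static score if any URI points at uk.geocities.com."""
    return score if any(UK_GEOCITIES.search(u) for u in uris) else 0.0
```

With the default threshold of 5.0, a score of 4 alone does not mark a message as spam; it needs a little help from the Bayesian rule or another hit, which is what keeps the innocent mentions reasonably safe.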


What still surprises me is that DCC/Razor/Pyzor don't pick these up... 
I'd still like to know what would be the easiest way to report these 
spams in order that in future they might be caught without falling back 
on a vicious static check for any mail referencing a URL at a free provider.


Thanks,
Steve



Re: Very simple user query...

2005-09-12 Thread Steve [Spamassasin]

Martin Hepworth wrote:


Well, if this worked, we could make sure we hit the spammers really hard 
;-)
 

While I see eliminating spammers as being one of the better 
justifications for environmental warfare, it isn't sufficiently reliable 
to get my vote.



Of course those unfortunates who also live in Boca Raton (or wherever Ralsky 
and his cohorts are hiding this week) would be in trouble for harboring these 
people as well ;-(
 

To a large extent (I'm sad to say) I believe that spam is the fault of 
the IT industry who have utterly failed to provide a usable PKI for the 
masses.  If ISPs were required to register a certificate for every user's 
email address (at minimal cost - just like is now the case for domain 
names) then spam could become a thing of the past pretty quickly; and 
all email could be sent securely into the bargain.  Well - I can dream 
too - can't I?