Help understanding TxRep errors.

2016-03-15 Thread Philip

After turning on TxRep I get these lines in my /var/log/spamd.log file.

Wed Mar 16 08:21:55 2016 [16629] warn: Use of uninitialized value 
$msgscore in addition (+) at /etc/spamassassin/TxRep.pm line 1414.
Wed Mar 16 08:21:55 2016 [16629] warn: Use of uninitialized value 
$msgscore in subtraction (-) at /etc/spamassassin/TxRep.pm line 1414.


/etc/spamassassin/60_txreputation.cf has...

use_txrep 1

header TXREP   eval:check_senders_reputation()
describe   TXREP   Score normalizing based on sender's reputation
tflags TXREP   userconf noautolearn
priority   TXREP   1000

txrep_whitelist_out 1

Ideas, suggestions?

Regards,

Phil


How to know if TxRep is whitelisting outgoing email.

2016-03-29 Thread Philip
I've enabled outgoing whitelisting using the TxRep plugin. Is there a 
way to find out if outbound emails are actually being whitelisted? A 
log somewhere... a file being updated?


--
Phil


How to test that TxRep is working?

2018-05-22 Thread Philip
I've added TxRep to SpamAssassin and enabled it in my local.cf, following 
the instructions:


http://truxoft.com/resources/txrep.htm

# TXTREP
use_txrep 1

Is there a way to test that it's actually working?

Phil




Tons of emails with subject: 'hey'

2018-02-05 Thread Philip
So lately I'm getting LOTS of emails coming directly through the filters, 
so it's most likely time to investigate how to create one.


The subject is always 'hey'

Subject: hey

Date: Mon, 29 Jan 2018 09:07:40 +0300
From: Darya
Message-ID: <8f35b00fb4e07d18ce82448ec9747...@112it4u.ro>
X-Mailer: PHPMailer 5.2.22 (https://github.com/PHPMailer/PHPMailer)
MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hi josh, my name is Darya and i'm from Russia, but living in the USA. A 
week ago, maybe more, I came across your profile on Facebook and now I 
wan to know you more. I know it sounds a bit strange, but I believe you 
had something like this in your life too :-) If its mutual, email me, 
this is my email danielamar...@rambler.ru and I will send some of my 
photos also answer any of your questions. Waiting for you, XXX Darya


As far as I can see from the different emails:

X-PHP-Originating-Script: 852:class-phpmailer.php

The number is sequential.

112it4u.ro from the message ID has valid NS entries but the reverse PTR 
is invalid.


The email always starts 'hi {mailbox name}', and the text is mostly the 
same, but the name changes now and then and so does the email address.


Any suggestions on where to start? nOOb here!

Phil




Loading custom rules.

2018-02-25 Thread Philip
How do you load custom rules... is it as simple as dropping the .cf file 
in the spamassassin directory and restarting?


I'm looking at these: https://wiki.apache.org/spamassassin/CustomRulesets

Phil


Re: Spammers, IPv6 addresses, and dnsbls

2018-03-07 Thread Philip

Hi there,

Providers like Linode assign a single IPv6 address from a shared /64. I 
had to request my own /64 block for my server because my IP neighbors 
kept getting the /64 blocked... since I've had my own I've been all 
good.  Before that, my IPv6 address was getting blocked daily because of 
someone else on that /64.  It was quite annoying.


Phil

ps your server blocks .nz domains :P

On 03/03/2018 00:54, Daniele Duca wrote:

Hello list,

apologies if this is not directly SA related. "Lately" I've started to 
notice that some (not naming names) VPS providers, when offering v6 
connectivity, sometimes tend not to follow the best practice of 
giving a /64 to each customer, routing much smaller v6 subnets to 
them while still giving them the usual /30 or /29 v4 subnets.


What's happening is that whenever a spammer buys a VPS from those 
providers and gets blacklisted, most of the time the dnsbls list the 
whole v6 /64, while still listing only the single IPv4 address. This 
makes some sense, as it would be enormously resource intensive to 
track each of the 18,446,744,073,709,551,616 addresses in the /64, but 
unfortunately not respecting basic v6 subnetting rules also causes 
reputation problems for the other customers that have the bad 
luck of living in the same /64 and are using their VPS as an outgoing 
mail server.
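
To make the /64 arithmetic concrete, here is a minimal Python sketch using the standard ipaddress module. The addresses come from the 2001:db8::/32 documentation range, and same_slash64 is a hypothetical helper written for this example; it is not something any particular DNSBL is known to run.

```python
import ipaddress

# A /64 leaves 64 host bits, so it contains 2**64 addresses.
ADDRESSES_PER_SLASH64 = 2 ** 64


def same_slash64(addr_a: str, addr_b: str) -> bool:
    """Return True if both IPv6 addresses fall inside the same /64."""
    # strict=False lets us pass a host address and get its enclosing network.
    net_a = ipaddress.ip_network(f"{addr_a}/64", strict=False)
    return ipaddress.ip_address(addr_b) in net_a


# Two customers carved out of one /64 share a listing made at /64 granularity.
print(same_slash64("2001:db8:1:2::10", "2001:db8:1:2:ffff::1"))
# A customer with their own /64 does not.
print(same_slash64("2001:db8:1:2::10", "2001:db8:1:3::10"))
```

A list that keys its entries on the enclosing /64 would treat the first pair as one sender, which is the neighbor problem described above.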


While I'm not judging the reasons why VPS providers are doing this 
type of useless v6 subnetting (micronetting?), I've started to deploy 
some countermeasures to avoid FPs. Specifically I wrote a rule that 
identifies whether the last untrusted relay is a v6 address, which is 
then used in other meta rules that subtract some points from 
dnsbl tests that check the -lastexternal IP address on v6-aware lists.


I know that probably is not the best solution, but I've started to see 
real FPs that worried me. I've even pondered whether it would make sense 
to go back to v4-only connectivity for my inbound MTAs.


If you are in a similar situation I would very much like to discuss 
what would be the best approach to balance spam detection while 
avoiding FPs.


Regards

Daniele Duca






Rule for detecting two email addresses in From: field.

2019-10-03 Thread Philip

Morning List,

Lately I'm getting a bunch of emails that are showing up with two email 
addresses in the From: field.


From: "Persons Name " 

When you look in your mail client (Outlook, Thunderbird) it's showing 
only "Persons Name "


Is there a way I can mark a From: that has two email addresses in it as 
spam? Pros? Cons?
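
As a rough illustration of the idea, here is a minimal Python sketch that counts distinct address-like tokens in a From: value. The sample addresses are hypothetical (the real ones did not survive in the archived message), and ADDR_RE is a deliberately loose pattern, not a full RFC 5322 parser.

```python
import re

# Loose "local@domain.tld" pattern; an assumption made for illustration only.
ADDR_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def count_addresses(from_header: str) -> int:
    """Count distinct address-like tokens in a From: header value."""
    return len({match.lower() for match in ADDR_RE.findall(from_header)})


# Display name carries one address, the real mailbox is another: suspicious.
print(count_addresses('"Persons Name <boss@example.com>" <other@example.net>'))
# Ordinary header with a single mailbox.
print(count_addresses('"Persons Name" <boss@example.com>'))
```

On the cons side, some legitimate senders repeat their own address in the display name. The sketch counts distinct addresses, so that case still counts as one, but a display name that mentions a different address for harmless reasons would be flagged.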


Phil


Whitelisting this mailing list.

2019-12-18 Thread Philip
How do I whitelist this mailing list? For some reason all the messages 
are now going to spam.





Plugin/TVD.pm

2009-05-31 Thread Philip Prindeville
I upgraded from FC8 to FC9 recently, and spamassassin could no longer
find TVD.pm after I deprecated the old Perl install.

Where does TVD.pm currently live?

Thanks,

-Philip



Re: Plugin/TVD.pm

2009-06-01 Thread Philip Prindeville

Yup, that's the beast.

Missed the news that it had become part of 3.2.  Excellent.

Thanks.


Theo Van Dinter wrote:

That depends, what's TVD.pm?  ;)

Doing a quick search shows
http://mail-archives.apache.org/mod_mbox/spamassassin-users/200603.mbox/%3c20060316233124.gv22...@kluge.net%3e
which was a conversation we had way back in 2006 about SA 3.1 and bug
4255.  There was a TVD.pm in discussion, so I assume that's the plugin
in question.

It appears to have become HTTPSMismatch.pm, already included as a
standard plugin in SA 3.2 and beyond. :)


On Sun, May 31, 2009 at 2:03 PM, Philip Prindeville
philipp_s...@redfish-solutions.com wrote:
  

I upgraded from FC8 to FC9 recently, and spamassassin could no longer
find TVD.pm after I deprecated the old Perl install.

Where does TVD.pm currently live?





More of a philosophical question

2009-11-11 Thread Philip A. Prindeville
This isn't so much of a technical question as a policy one.

I get a lot of spam which looks like:

Return-Path: evan_law...@davidark.net
Received: from web.biz.mail.sk1.yahoo.com (web.biz.mail.sk1.yahoo.com 
[74.6.114.43])
by mail.redfish-solutions.com (8.14.3/8.14.3) with SMTP id nA8KXHbF007914
for philipp_s...@redfish-solutions.com; Sun, 8 Nov 2009 13:33:23 -0700
Received: (qmail 77790 invoked by uid 60001); 8 Nov 2009 20:33:17 -
Message-ID: 223519.76757...@web.biz.mail.sk1.yahoo.com
X-YMail-OSG: 
ITTxzA0VM1nOPGrQYX7tAeYtgFhkzLHYo.qDHS6MrLwhvvaHzfjqTAnctUdZXTeTR0y.mWitx7Ou0luQLKnF_GvxGk_gsyrhQiecygtXxr.GNWFkWrkP57qwERbf1Af794h0lXoiyXseb3DTTSqteQCJJ4R8cnSOGFAQavXbUa1QwMHI24mWQEyMF4VkVtpK30oRxlaHVfyGuTXo9pDtTd3mfZScylE6lSYlZjaU8EFS8b8xILkwduj7dx_FW.i4q._BpZayBZY5A5rQb2y03bhl6aTzM9nfbFpY..dlKU7NJVZhLnPeDNRv8z3ZUCBQfsJCq2M5y9Os913jTPXpB1loucgEzfYocoVj6I081B.QNiRFwnUtANDRTHDyGogYeSccqeiSzPxhABGFEtTWY2D08epaNJbwPjU66HDWEjzzNUbzBXyRny0UzKp4HLBUX5tbKNJ8kbHotjEE7xtmcpzoqm.YpfEDl_9omvGsW1e7rThr60pemte_xsNIcarBts2PAXSgzJrZ8zveH287WUmL29olqa3kkksEeVIi4cFsYWNQgSuPqQXV6TLpim1VNZ8c_bzZ5J35fEiL1iJeDWndc.SFtUMwf2leifGkzwDYSrWxOmhux7a_.AC30.BaJQypPZx6YlCXVWlJ3PIIeP0O_.NLtkltfStJB_lS69d6vSh437.X25YQtDTOo3MxMqjNgPznHdmQZ4SFJtF9lfmcksrvoSlXDkiCwGl2qfo.Iuxuh0c.KyVqFlzdy8GgUQJpw9yPwB_aTG.kIs.8gIuUQ3AY3wkI0QEfDOWbqDN2Gr3uLzwvrJLo9UJ4HTDAni7dvTSnM2INbXq7YdCgpfBZ7_AhpLTvvXhY_Yu.aoLjLh1Ill2BwfLJGCZr3bNct0pTw2_o5FXrupA.1Pk3t04NhCaQ0Y0St36th.K7a7smbRBcZusdDeQewQ7l.kEf0i.2YTbqFLUyI4QJwhXs18Kj1g_SQf3shYJxhlHF6FvRqX88D6kLJjPspPvh4eC_XiYxBtaarV0ZXoBBVKUjSj04DP8RSrFZ1DBGT5s2Uz.ZUY78.ilZcXnhFt1Dz4JwjnG0a35n8xWOx6JbWTD5d25EDahowx340TjnAGyjlfxfzgdFPlaQC54EEbDZpvjU8fbah53jJkST2JdvVUEKivsflAEEU7Y5_l8LQzENtjAAYop8dpHadyQn1lAYzRwrpHF7ViBGMwd3gihfVZs_3onzYsoYsvwkNolkWORQcvbGWxFKfuQMJDL9Iaw4QKX0iIGErAWHIkWHnF6B48RFDMrGVyVrwjEhT7X50IKYbwK.EZid2Eme9x2ElFgATPBSmjhom14Ay9DuY77cJuY_MohirOKsbTgl3_nwv704SGy6.Vg.oAaEP29c8cOcMwXpzZDUeO0ZHXcIn9f7ujQlssq9EF4Yn79sQcgkBNeRMFAkLx_cx5Ez5a9rslAITdPSuHfK.X0YH3GAmV.ONy7VE9Uta5Tk4Z3JmjtHJ0AIrCIGy7ZonllVcF1nWkv4BA083jOSbsQqFBXtU5uOnhE-
Received: from [41.207.162.4] by web.biz.mail.sk1.yahoo.com via HTTP; Sun, 
08 Nov 2009 12:33:16 PST
X-Mailer: YahooMailClassic/8.1.6 YahooMailWebService/0.7.347.3
Date: Sun, 8 Nov 2009 12:33:16 -0800 (PST)
From: Evan Lawson evan_law...@davidark.net
Subject: Hello Dear Friend
To: undisclosed recipients: ;
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii


And I report this to Yahoo!.  They then answer:



We understand your frustration in receiving unsolicited email. While we
investigate all reported violations against the Yahoo! Terms of Service
(TOS), in this particular case the message you received was not sent by
a Yahoo! Mail user.

Yahoo! has no control over activities outside its service, and therefore
we cannot take action. You may try contacting the sender's email
provider, by identifying the sender's domain and contacting the
administrator of that domain. The sender's provider should be in a
better position to take appropriate action against the sender's account.

which sounds to me like they are effectively admitting that they run an
Open Relay, which is against US law, as I remember.

It's also factually incorrect.  The message didn't originate outside of
their service, since the line "Received: ... via HTTP" is basically
meaningless.  HTTP isn't a mail protocol.  This tells me that the
message originated via a Webmail submission on their website, which
means that someone had to log in with credentials... which means that
(a) they do in fact have control over whether that user's credentials
get yanked or not, and (b) the message didn't originate outside of their
service.
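
The reading of the Received: chain above can be sketched in code. This is a minimal Python sketch, assuming the webmail hop is formatted like the one in the headers quoted earlier ("from [IP] by HOST via HTTP"). The function name webmail_origin and the regex are illustrative only, and other providers may format this hop differently.

```python
import re

# Matches a hop such as: from [41.207.162.4] by web.example.com via HTTP
VIA_HTTP_RE = re.compile(
    r"from \[(?P<client_ip>[0-9A-Fa-f.:]+)\] by (?P<host>\S+) via HTTP"
)


def webmail_origin(received_value: str):
    """Return (client_ip, receiving_host) for a 'via HTTP' hop, else None."""
    # Unfold continuation lines so the pattern sees a single line.
    unfolded = " ".join(received_value.split())
    match = VIA_HTTP_RE.search(unfolded)
    if match is None:
        return None
    return match.group("client_ip"), match.group("host")


hop = ("from [41.207.162.4] by web.biz.mail.sk1.yahoo.com via HTTP; Sun,\n"
       " 08 Nov 2009 12:33:16 PST")
print(webmail_origin(hop))
```

If the receiving host in that hop belongs to the provider, the message entered the mail system through the provider's own web front end rather than being relayed in over SMTP.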

This has been going on for 4 years, and I'm tired of their shirking
their responsibility.

We don't have a lot of users, so I'd be happy to blacklist Yahoo! until
they clean up their act... unfortunately a couple of correspondents to
this domain are Yahoo! users.

So what is the best course of action to take against Yahoo!?

I filed an IC3 complaint against them for passing phishing and operating
an Open Relay, but nothing came of it.

How has everyone else made their peace with this?

Thanks,

-Philip






Undisclosed recipients :; -- again

2009-11-23 Thread Philip Prindeville
Hi.

I want to block all messages that I'm getting that have:

To: undisclosed recipients: ;

with no Cc: line.

Unfortunately, the rule that I have:

header   L_UNDISCLOSED  To:raw =~ /undisclosed-recipients: ?;/
describe L_UNDISCLOSED  To: list is meaningless and no Cc:
score    L_UNDISCLOSED  10.0



also seems to match when there's no To: line at all, only a Cc: line
(which isn't what I want).

Why is Spamassassin thinking that there's a header 'To:' line, and it
says 'undisclosed recipients' when it doesn't exist?

This is on Fedora Core 11, updated (so SA 3.2.5, Perl 5.10.0, and
Sendmail 8.14.3)

Thanks,

-Philip





Re: Undisclosed recipients :; -- again

2009-11-23 Thread Philip Prindeville
On 11/23/2009 12:10 PM, Michael Scheidell wrote:
 Philip Prindeville wrote:
   
 Hi.

 I want to block all messages that I'm getting that have:

 To: undisclosed recipients: ;

 with no Cc: line.

   
 
 I went round and round with this a while back.

 SA 3.2.5 has a problem with perl null vs 0 vs ''.

 so a To header (or Cc header) with no content looks like a missing To line.

 but I don't see anything below in this rule that even looks for the Cc 
 line, so you would need to create a meta rule (that doesn't work in 
 3.2.5) to check each.
 rule #1 checks for the undisclosed recipients
 rule #2 checks for Cc (or blank Cc, which SA 3.2.5 sees as the same)

 best to block this in the MTA, if you really just want to block it.
   



Well, I could use:

header   __L_UNDISCLOSED1  To:raw =~ /undisclosed-recipients: ?;/
header   __L_UNDISCLOSED2  Cc !~ /^$/
meta     L_UNDISCLOSED     (__L_UNDISCLOSED1 && __L_UNDISCLOSED2)
describe L_UNDISCLOSED     To: list is meaningless and no Cc:
score    L_UNDISCLOSED     10.0



but as you say, if it can't tell the difference between "" and undef,
then that's an issue.




 Unfortunately, the rule that I have:

 header   L_UNDISCLOSED  To:raw =~ /undisclosed-recipients: ?;/
 describe L_UNDISCLOSED  To: list is meaningless and no Cc:
 score    L_UNDISCLOSED  10.0



 also seems to match when there's no To: line at all, only a Cc: line
 (which isn't what I want).

 Why is Spamassassin thinking that there's a header 'To:' line, and it
 says 'undisclosed recipients' when it doesn't exist?

 This is on Fedora Core 11, updated (so SA 3.2.5, Perl 5.10.0, and
 Sendmail 8.14.3)

 Thanks,

 -Philip



   
 




Re: Undisclosed recipients :; -- again

2009-11-23 Thread Philip Prindeville
On 11/23/2009 12:18 PM, Michael Scheidell wrote:
 Philip Prindeville wrote:
   

 but as you say, if it can't tell the difference between "" and undef,
 then that's an issue.

   
 
 use header ALL to check for a \nCC
 (which could be blank)

 or just use your MTA to reject it at SMTPtime.
   

BTW:

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5947

warns against using ALL.

Too bad there's no way to get the raw contents of the To: line
(comments included).




Re: Undisclosed recipients :; -- again

2009-11-23 Thread Philip Prindeville
On 11/23/2009 05:11 PM, LuKreme wrote:
 On Nov 23, 2009, at 12:05, Philip Prindeville 
 philipp_s...@redfish-solutions.com 
   wrote:

   
 I want to block all messages that I'm getting that have:

 To: undisclosed recipients: ;

 with no Cc: line.
 
 What's Cc: have to do with it?  undisclosed recipients is used for  
 Bcc: mail

 I used it all the time. And you WILL 'block' legitimate mail.

   

For the mailboxes that this will be coming to, no one ever gets mail
with no recipient list.

We occasionally get mail from 'RT' (the bug tracker used by Openssl-dev
and others) that generates a Cc: with no To: line... but that's about
the extent of it.




Re: Undisclosed recipients :; -- again

2009-11-27 Thread Philip A. Prindeville

John Hardin wrote:

On Mon, 23 Nov 2009, LuKreme wrote:

On Nov 23, 2009, at 12:05, Philip Prindeville 
philipp_s...@redfish-solutions.com wrote:



I want to block all messages that I'm getting that have:

To: undisclosed recipients: ;


undisclosed recipients is used for Bcc: mail

I used it all the time. And you WILL 'block' legitimate mail.


Granted, but in metas such a test can be useful:

http://ruleqa.spamassassin.org/?rule=%2FTO_NO&srcpath=jhardin



Speaking of tests, I saved out some messages that should have matched my 
rule but didn't into files, and ran them against spamassassin as:


spamassassin -D < /tmp/emails/XXX.eml

and I saw:

[28655] dbg: rules: ran header rule __L_UNDISCLOSED2 == got hit: negative 
match


for the ruleset:


header   __L_UNDISCLOSED1  To:raw =~ /undisclosed-recipients: ;/
header   __L_UNDISCLOSED2  Cc =~ /^$/
meta     L_UNDISCLOSED     (__L_UNDISCLOSED1 && __L_UNDISCLOSED2)
describe L_UNDISCLOSED     To: list is meaningless and no Cc:
score    L_UNDISCLOSED     10.0



but didn't see __L_UNDISCLOSED1 match. Also, what does negative match 
mean? That it didn't match?


Lots of other rules (like __L_UNDISCLOSED1) didn't match, but I didn't 
see debug for those...


Just how do I go about figuring out what the To:raw value is (for 
example)?


Thanks,

-Philip





Re: Undisclosed recipients :; -- again

2009-11-27 Thread Philip A. Prindeville

John Hardin wrote:

On Fri, 27 Nov 2009, Philip A. Prindeville wrote:


header __L_UNDISCLOSED1 To:raw =~ /undisclosed-recipients: ;/

Just how do I go about figuring out what the To:raw value is (for 
example)?


  header  __TO_RAW  To:raw =~ /.+/

If you're analyzing something that may have multiple occurrences, 
you'll need a tflags multiple:


  body    __ALL_BODY  /.+/
  tflags  __ALL_BODY  multiple



Interesting, thanks:

[31209] dbg: rules: ran header rule __TO_RAW == got hit:  undisclosed 
recipients: ;_


wondering why it contains the leading space, and what the trailing 
underscore is for...
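
One thing the debug hit makes visible: the header value has a space in "undisclosed recipients", while the rule as posted looks for a hyphen. Below is a minimal Python sketch of that mismatch, assuming Python's re module behaves like Perl's regex engine for patterns this simple (SpamAssassin itself uses Perl). Whether this fully explains why __L_UNDISCLOSED1 never hit is not confirmed anywhere in the thread. The names to_raw, rule_as_posted and rule_either are made up for the example.

```python
import re

# Value reported by the __TO_RAW debug hit, without the trailing underscore.
to_raw = " undisclosed recipients: ;"

# Pattern from the posted rule: requires a hyphen between the two words.
rule_as_posted = re.compile(r"undisclosed-recipients: ;")

# Hypothetical variant accepting a hyphen or a space, and an optional
# space before the semicolon.
rule_either = re.compile(r"undisclosed[- ]recipients: ?;", re.IGNORECASE)

print(rule_as_posted.search(to_raw) is not None)
print(rule_either.search(to_raw) is not None)
```

Using a character class keeps a single rule working for both spellings instead of maintaining two header tests.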


On a side note, I never figured out why I see:

[31209] warn: plugin: failed to parse plugin (from @INC): syntax error at (eval 43) line 
1, near "require Mail::SpamAssassin:"

This seems to be a known issue.  What's the fix?





Re: Undisclosed recipients :; -- again

2009-12-02 Thread Philip A. Prindeville
On 11/30/2009 03:15 AM, Matus UHLAR - fantomas wrote:
 On 27.11.09 14:04, Philip A. Prindeville wrote:
   
 for the ruleset:
 
   
 header __L_UNDISCLOSED1 To:raw =~ /undisclosed-recipients: ;/
 
 just FYI, sendmail can be configured to do different things when To: is
 missing - there's sendmail option NoRecipientAction, configured by setting
 confNO_RCPT_ACTION m4 directive. The default value is none but e.g. Debian
 was setting it to add-to-undisclosed which causes MISSING_HEADERS not
 hitting (only from milter, which appears to be called before the headers are
 fixed).

 Maybe you should check whether your MTA's configuration options change
 which rules hit, e.g. sendmail adds Date: and Message-Id headers, which
 affects MISSING_DATE and MISSING_MID. I was not able to find how to
 disable this behaviour in sendmail.

   

Odd.  This is on FC11:

[r...@mail mail]# grep confNO_RCPT_ACTION /usr/share/sendmail-cf/*/*
/usr/share/sendmail-cf/m4/proto.m4:_OPTION(NoRecipientAction, 
`confNO_RCPT_ACTION', `none')
[r...@mail mail]# grep NoRecipientAction *.cf
sendmail.cf:#O NoRecipientAction=none
submit.cf:#O NoRecipientAction=none
[r...@mail mail]# 

Added:

define(`confNO_RCPT_ACTION', `none')dnl


to the sendmail.mc, made sendmail.cf, did a service restart... will see what 
happens.

I can't remember: if an option is commented, is that showing us the default 
value typically?

-Philip




Holding yahoo!'s feet to the fire

2009-12-07 Thread Philip A. Prindeville
Some good news... possibly.

I finally complained to ARIN (for the 4th time) that the contact information 
for the Inktomi address blocks was incorrect, as Inktomi hasn't existed as a 
corporate (and legal) entity for some time... it was acquired by Yahoo! 3 years 
ago, and their address blocks have all been repurposed for Yahoo! server 
infrastructure and as such should reflect that reality.

So ARIN flagged the address block registry information as INVALID, and will ask 
the Yahoo contact person (Joan Luster) to address this issue next time they are 
in communication with her.

Why is this useful?

Well, every time I personally report an issue with them, I get the very 
annoying auto-reply that says:


 Once you have identified the IP address, you can conduct an IP lookup to
 determine which ISP provides this person with Internet access. One such 
 lookup tool you may want to try is:

http://www.arin.net/whois/

 You can then attempt to contact that ISP to report any abuse activities 
 occurring within their service.
   

which raises my blood pressure, because ARIN whois indicates they *are* the 
owner of that netblock (as Inktomi).

The upshot is that the OrgID: and NetName: records will no longer be INKT and 
INKTOMI-NET-* respectively... so that gives them one less rock to hide under.

(And it's entirely possible their autoresponders are mistakenly keying off 
these particular record fields when doing prefiltering... giving them the 
benefit of the doubt.)

We'll see how it goes, and I'll try to keep the list current.

Keep your fingers crossed.

-Philip



Magical mystery colon

2010-01-30 Thread Philip A. Prindeville
I ran yum update on my FC11 machine a couple of days ago, and now I'm
getting nightly cron errors:

plugin: failed to parse plugin (from @INC): syntax error at (eval 84) line 1, 
near "require Mail::SpamAssassin:"

plugin: failed to parse plugin (from @INC): syntax error at (eval 148) line 1, 
near "require Mail::SpamAssassin:"

I've seen this message periodically, but never figured out what generated it.

Can someone set me straight?  It of course doesn't mention a file, so it's hard 
to know where it's coming from.

Also, how come the eval block:


    foreach $thing (qw(Anomy::HTMLCleaner Archive::Zip Digest::SHA1
            HTML::Parser HTML::TokeParser IO::Socket IO::Stringy MIME::Base64
            MIME::Tools MIME::Words Mail::Mailer Mail::SpamAssassin Net::DNS
            Unix::Syslog )) {
        unless (eval "require $thing") {
            printf("%-30s: missing\n", $thing);
            next;
        }

doesn't contain a terminating ';', i.e.:

eval "require $thing;" instead?

Thanks,

-Philip




Re: Magical mystery colon

2010-01-31 Thread Philip A. Prindeville
On 01/30/2010 12:24 PM, Karsten Bräckelmann wrote:
 On Sat, 2010-01-30 at 12:16 -0800, Philip A. Prindeville wrote:
   
 I ran yum update on my FC11 machine a couple of days ago, and now I'm
 getting nightly cron errors:
 
 Would be nice and maybe even helpful to know, what command(s) that cron
 job executes, don't you think? :)
   

Well, this is unmodified Fedora, so the same as every other Fedora box:

10 4 * * * root /usr/share/spamassassin/sa-update.cron 2>&1 | tee -a /var/log/sa-update.log


And that script contains:


#!/bin/bash
# *** DO NOT MODIFY THIS FILE ***
#
# /etc/mail/spamassassin/channel.d/*.conf
# Place files here to add custom channels.
#

# list files in a directory consisting only of alphanumerics, hyphens and
# underscores
# $1 - directory to list
# $2 - optional suffix to limit which files are selected
run_parts_list() {
    if [ $# -lt 1 ]; then
        echo "ERROR: Usage: run_parts_list <dir>" > /dev/stderr
        exit 1
    fi
    if [ ! -d "$1" ]; then
        echo "ERROR: Not a directory: $1" > /dev/stderr
        exit 1
    fi

    if [ -d "$1" ]; then
        if [ -n "$2" ]; then
            find_opts='-name *'$2
        fi
        find -L $1 -mindepth 1 -maxdepth 1 -type f $find_opts | sort -n
    fi
}

# Proceed with sa-update if spam daemon is running or forced in
# /etc/sysconfig/sa-update
unset SAUPDATE
[ -f /etc/sysconfig/sa-update ] && . /etc/sysconfig/sa-update
for daemon in spamd amavisd; do
    /sbin/pidof $daemon >& /dev/null
    [ $? -eq 0 ] && SAUPDATE=yes
done
[ -f /var/run/mimedefang.pid ] && SAUPDATE=yes

# Skip sa-update if daemon not detected
[ -z "$SAUPDATE" ] && exit 0

# sa-update must create keyring
if [ ! -d /etc/mail/spamassassin/sa-update-keys ]; then
    sa-update
fi

# Initialize Channels and Keys
CHANNELLIST=""
KEYLIST=""
# Process each channel defined in /etc/mail/spamassassin/channel.d/
for file in $(run_parts_list /etc/mail/spamassassin/channel.d/ ".conf"); do
    # Validate config file
    PREFIXES="CHANNELURL KEYID BEGIN"
    for prefix in $PREFIXES; do
        if ! grep -q "$prefix" $file; then
            echo "ERROR: $file missing $prefix"
            exit 255
        fi
    done
    . "$file"
    #echo "CHANNELURL=$CHANNELURL"
    #echo "KEYID=$KEYID"
    CHANNELLIST="$CHANNELLIST $CHANNELURL"
    KEYLIST="$KEYLIST $KEYID"
    sa-update --import "$file"
done

# Sleep random amount of time before proceeding to avoid overwhelming the
# servers
sleep $(expr $RANDOM % 7200)

unset arglist
# Run sa-update on each channel, restart spam daemon if success
for channel in $CHANNELLIST; do
    arglist="$arglist --channel $channel"
done
for keyid in $KEYLIST; do
    arglist="$arglist --gpgkey $keyid"
done
/usr/bin/sa-update $arglist
if [ $? -eq 0 ]; then
    /etc/init.d/spamassassin condrestart > /dev/null
    [ -f /etc/init.d/amavisd ] && /etc/init.d/amavisd condrestart > /dev/null
    [ -f /var/run/mimedefang.pid ] && /etc/init.d/mimedefang reload > /dev/null
fi



   
 plugin: failed to parse plugin (from @INC): syntax error at (eval 84) line 
 1, near "require Mail::SpamAssassin:"

 plugin: failed to parse plugin (from @INC): syntax error at (eval 148) line 
 1, near "require Mail::SpamAssassin:"

 I've seen this message periodically, but never figured out what
 generated it.

 Can someone set me straight?  It of course doesn't mention a file, so
 it's hard to know where it's coming from.
 
   




Re: Magical mystery colon

2010-02-01 Thread Philip A. Prindeville
On 02/01/2010 05:35 AM, Mark Martinec wrote:
 On Saturday January 30 2010 21:16:01 Philip A. Prindeville wrote:
   
 Also, how come the eval block:
   unless (eval "require $thing") {...}
 doesn't contain a terminating ';', i.e.:
 eval "require $thing;" instead?
 
 It is not needed. It is an 'eval EXPR', not 'eval BLOCK'.
 A semicolon in perl is a statement separator, not a statement terminator.

   Mark
   

Ok.  No one knows why I'm seeing the warnings from the cron job, however?




It's a fine line...

2007-11-05 Thread Philip Prindeville
Between the truly clueless administrator, and those that feign ignorance 
to cover up their implicit approval of spammers...


What do you do in the case where someone is filtering deliveries to 
their abuse mailbox?  (Like 99% of mail sent there isn't going to 
score positively...)


Sigh.



Return-Path: 
Received: from localhost (localhost)
by mail.redfish-solutions.com (8.14.1/8.14.1) id lA5HEMTM017203;
Mon, 5 Nov 2007 10:14:22 -0700
Date: Mon, 5 Nov 2007 10:14:22 -0700
From: Mail Delivery Subsystem [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
boundary=lA5HEMTM017203.1194282862/mail.redfish-solutions.com
Subject: Returned mail: see transcript for details
Auto-Submitted: auto-generated (failure)

This is a MIME-encapsulated message

--lA5HEMTM017203.1194282862/mail.redfish-solutions.com

The original message was received at Mon, 5 Nov 2007 10:14:14 -0700
from pool-71-112-36-94.sttlwa.dsl-w.verizon.net [71.112.36.94]

  - The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
   (reason: 550 Rejecting message scored for more than 8.0 (9.0) SPAM points.)

  - Transcript of session follows -
... while talking to arminco.com.:
>>> DATA
<<< 550 Rejecting message scored for more than 8.0 (9.0) SPAM points.
554 5.0.0 Service unavailable

--lA5HEMTM017203.1194282862/mail.redfish-solutions.com
Content-Type: message/delivery-status

Reporting-MTA: dns; mail.redfish-solutions.com
Received-From-MTA: DNS; pool-71-112-36-94.sttlwa.dsl-w.verizon.net
Arrival-Date: Mon, 5 Nov 2007 10:14:14 -0700

Final-Recipient: RFC822; [EMAIL PROTECTED]
Action: failed
Status: 5.2.0
Remote-MTA: DNS; arminco.com
Diagnostic-Code: SMTP; 550 Rejecting message scored for more than 8.0 (9.0) 
SPAM points.
Last-Attempt-Date: Mon, 5 Nov 2007 10:14:22 -0700

--lA5HEMTM017203.1194282862/mail.redfish-solutions.com
Content-Type: message/rfc822

Return-Path: [EMAIL PROTECTED]
Received: from [192.168.10.148] (pool-71-112-36-94.sttlwa.dsl-w.verizon.net 
[71.112.36.94])
(authenticated bits=0)
by mail.redfish-solutions.com (8.14.1/8.14.1) with ESMTP id 
lA5HECTN017198
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
for [EMAIL PROTECTED]; Mon, 5 Nov 2007 10:14:14 -0700
Message-ID: [EMAIL PROTECTED]
Date: Mon, 05 Nov 2007 09:14:05 -0800
From: Abuse Department [EMAIL PROTECTED]
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To:  [EMAIL PROTECTED]
Subject: Filtering abuse reports
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.63 on 192.168.1.3

Of course submitted mail to the Abuse mailbox is going to score as 
spam.  It is spam.  Why else would anyone be reporting it?


Please get a clue and turn off filtering on your abuse mailbox:

The original message was received at Mon, 5 Nov 2007 10:10:58 -0700
from pool-71-112-36-94.sttlwa.dsl-w.verizon.net [71.112.36.94]

  - The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
   (reason: 550 Rejecting message scored for more than 8.0 (20.6) SPAM points.)

  - Transcript of session follows -
... while talking to styx.aic.net.:
>>> DATA
<<< 550 Rejecting message scored for more than 8.0 (15.1) SPAM points.
554 5.0.0 Service unavailable
... while talking to arminco.com.:
>>> DATA
<<< 550 Rejecting message scored for more than 8.0 (20.6) SPAM points.
554 5.0.0 Service unavailable


--lA5HEMTM017203.1194282862/mail.redfish-solutions.com--




Re: It's a fine line...

2007-11-05 Thread Philip Prindeville

Steven Kurylo wrote:

Philip Prindeville wrote:
Between the truly clueless administrator, and those that feign 
ignorance to cover up their implicit approval of spammers...


What do you do in the case where someone is filtering deliveries to 
their abuse mailbox?  (Like 99% of mail sent there isn't going to 
score positively...) 
I filter my abuse address.  Otherwise it would get so many spam 
messages, the ham would get lost in the noise.


Only send the headers.  If the body is actually needed post it on some 
webpage.


A lot of sites won't accept just header lines.  They need both (to 
confirm that it's software piracy, or pornography, or phishing... and 
with phishing, you need the 4th party:  the link that is being used to 
spoof the legitimate organization).  And who bothers to keep track of 
who wants what?


I send everyone a complete copy of the message inline, because some 
braindead sites don't accept attachments, etc.


-Philip



Re: It's a fine line...

2007-11-05 Thread Philip Prindeville

John D. Hardin wrote:

On Mon, 5 Nov 2007, Steven Kurylo wrote:

  

Philip Prindeville wrote:

Between the truly clueless administrator, and those that feign 
ignorance to cover up their implicit approval of spammers...


What do you do in the case where someone is filtering deliveries to 
their abuse mailbox?  (Like 99% of mail sent there isn't going to 
score positively...) 
  


I have a form note that I send to the postmaster address whenever a 
report to the abuse address is bounced. It says (1) you need a working 
abuse address and (2) you shouldn't filter it.


  

I filter my abuse address.  Otherwise it would get so many spam
messages, the ham would get lost in the noise.

Only send the headers.  If the body is actually needed post it on
some webpage.



To heck with that. If I have to jump through that many hoops to report
abuse in *your* network, I'm just going to roundfile it. It's enough
work to pick out all of the relevant abuse addresses to forward the
message to, and note the type of abuse (lottery, 419, money
laundering, etc.).

I almost don't report abuse to Yahoo because they refuse to deal with
RFC-822 attachments and want the entire original message in the body,
and that makes reporting abuse containing a Yahoo.* contact address
two separate operations - forward as attachment to the relay owner,
and forward in the body to Yahoo.
  


Well, Yahoo is a waste of time for other reasons, right?  They tell you 
that it doesn't come from their site...  but to use the top-most 
Received: line's IP address, then to look that up on ARIN  which... 
surprise! ... typically points to Yahoo! (or one of their surrogates, 
like Inktomi...  do their tier-1 people not *know* that Yahoo owns 
Inktomi?  or are they just playing dumb?).


-Philip



Re: It's a fine line...

2007-11-05 Thread Philip Prindeville

Olivier Nicole wrote:

And not to point fingers, how to react to a narrow-minded sysadmin
who bans per IP?

From my legitimate mail server in Thailand, that has never been
blacklisted as far as I know:

mailon45: telnet mail.redfish-solutions.com 25
Trying 66.232.79.143...
Connected to mail.redfish-solutions.com (66.232.79.143).
Escape character is '^]'.
554 mail.redfish-solutions.com ESMTP not accepting messages

From another mailserver I administrate, but located in Germany:

sinoon72: telnet mail.redfish-solutions.com 25
Trying 66.232.79.143...
Connected to mail.redfish-solutions.com.
Escape character is '^]'.
220 mail.redfish-solutions.com ESMTP Sendmail 8.14.1/8.14.1; Mon, 5 Nov 
2007 19:10:02 -0700

No need to remind you that any person seriously looking at the spam problem
knows that spam mainly originates from the USA, even if relayed through
other, possibly Asian, countries.

Yes I am quite pissed by such attitude.

Olivier
  


It's not a matter of cultural imperialism, if that's what you're getting at.

It's an acknowledgment of the importance of the rule of law in cyberspace.

Some countries enforce anti-spam, anti-trespass laws.  Others lack them 
or don't enforce them.


When these countries put some teeth into the enforcement of their laws, 
then they will stop being blacklisted.


-Philip



Re: It's a fine line...

2007-11-06 Thread Philip Prindeville

Olivier Nicole wrote:

It's not a matter of cultural imperialism, if that's what you're getting at.

It's an acknowledgment of the importance of the rule of law in cyberspace.



Except that I don't think it is anything close to a rule of law, but
rather a sign of short view.

As I said, I doubt you ever got any spam from my organisation (either
originated from, or relayed).
  


So, what are you saying?  One well behaved citizen obviates the need for 
laws for all others?


It doesn't work that way.


Some countries enforce anti-spam, anti-trespass laws.  Others lack them 
or don't enforce them.



The attitude goes by organisation, not by country.
  


Organizations don't make laws.  Countries do.


When these countries put some teeth into the enforcement of their laws, 
then they will stop being blacklisted.



Plus if we were to ban the originating country for 50% of spam (not my
figure), USA should be banned.
  


Do the math.  50% of the spam (if that is indeed the case) is very low, 
considering that the US generates a much larger percentage of the total 
Internet traffic than just half.


In any case, you might get spammed from the US, but I don't:  it would 
be too easy for me to make a complaint against the spammer and have them 
be charged, shut down, and fined.


That's effectively what laws, properly enforced, do.


But hey, that is too big a cut from the Internet, so in some way it is
cultural imperialism.

Bests,

Olivier

  


That's a fairly specious argument.

-Philip




Re: It's a fine line...

2007-11-06 Thread Philip Prindeville

Matus UHLAR - fantomas wrote:

The advice I've seen (iirc it was in rfc-ignorant lists) was not to allow
mail to be sent to abuse and non-abuse mailboxes together, e.g. when it's sent
to an abuse mailbox, reject rcpt to: non-abuse mailboxes with a temporary error
and vice versa. The result should be that the mail is sent once to all
non-abuse mailboxes and once to abuse mailboxes, and they can be filtered with
different rules.
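
For illustration, the per-transaction policy described above might look roughly like this in Python. It is only a sketch: the helper names, the reply texts, and the assumption that "abuse" is the only special local part are mine, not taken from rfc-ignorant or any particular MTA.

```python
# Hypothetical sketch: split abuse and non-abuse recipients into separate
# SMTP transactions by temp-failing whichever class did not come first.
ABUSE_LOCAL_PARTS = {"abuse"}  # assumption: only abuse@ is treated specially


def is_abuse(rcpt: str) -> bool:
    """True if the recipient's local part is an abuse mailbox."""
    local = rcpt.strip("<>").split("@", 1)[0].lower()
    return local in ABUSE_LOCAL_PARTS


def rcpt_replies(recipients):
    """Return (recipient, smtp_reply) for each RCPT TO in one transaction."""
    replies = []
    first_class = None
    for rcpt in recipients:
        cls = is_abuse(rcpt)
        if first_class is None:
            first_class = cls  # the first recipient decides the class
        if cls == first_class:
            replies.append((rcpt, "250 2.1.5 Ok"))
        else:
            # Temporary error: the sender retries these in a later transaction.
            replies.append((rcpt, "451 4.7.1 Retry in a separate transaction"))
    return replies
```

A sending MTA that gets the 451 will normally queue those recipients and retry them, so the same message arrives twice and each copy can go through its own filter rules.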

  


If only it were that easy.

The issue is that a lot of sites are ignorant and haven't filled out all 
of their ICANN-required fields in their ARIN (or RIPE or APNIC or LACNIC 
or AfriNIC) registrations.  So there might be an OrgTech contact as 
[EMAIL PROTECTED] who you Bcc: on the message, but you guess that 
there's also an abuse mailbox, and they just forgot to register it.


However, you don't want to mail to the abuse mailbox to see if it gets 
delivered, and then if it bounced, mail to the OrgTech mailbox 
instead... because that's too much wasted time...  So you To: the abuse 
mailbox on the odd chance that it exists, and you Bcc: the noc mailbox 
(or the hostmaster or whatever) as a fallback address.
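
A rough Python sketch of that addressing scheme, using only the standard library. The function name, subject line, and note text are made up for illustration; the `inline` switch is there because some sites want the original in the body rather than as a message/rfc822 attachment.

```python
from email import message_from_string
from email.message import EmailMessage


def build_abuse_report(original_text, abuse_addr, fallback_addr, sender,
                       inline=False):
    """Build a report addressed To: the abuse box, Bcc: a fallback contact."""
    report = EmailMessage()
    report["From"] = sender
    report["To"] = abuse_addr        # may or may not exist
    report["Bcc"] = fallback_addr    # e.g. the registered OrgTech/noc contact
    report["Subject"] = "Abuse report: unsolicited mail from your network"
    note = "The message below originated from your network.\n"
    if inline:
        # Whole original message pasted into the body.
        report.set_content(note + "\n" + original_text)
    else:
        # Original message attached as message/rfc822.
        report.set_content(note)
        report.add_attachment(message_from_string(original_text))
    return report
```

If the report is handed to `smtplib.SMTP.send_message()`, the Bcc: address is used as an envelope recipient but the header itself is not transmitted.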


-Philip



Re: How to filter messages from this list?

2007-11-06 Thread Philip Prindeville

mouss wrote:

Marcin Praczko wrote:
  

It is possible add some text to Subject: For example [SPLIST] - to make easier 
set up filter for emails?
  



How about having the logo in png format on the subject line :)

List managers (and other software) should not alter email unless
absolutely necessary. This includes subject tagging, reply-to munging,
removal of trace headers, format conversion, ... etc. The people who
compose messages know better how their messages should look like. Local
policies may override this, because local users have a chance to hang
their sysadmin ;-p

  


If they're lucky they can.  If they work for Uncle Sam, and their 
sysadmin trots out security requirements as their lame excuse for 
breaking things they don't understand, then they're screwed.  As in:


Received: from gate3-sandiego.nmci.navy.mil (gate3-sandiego.nmci.navy.mil 
[138.163.0.43])
by mail.redfish-solutions.com (8.13.8/8.13.8) with ESMTP id 
l8AGAjaQ028222
for XYZZY; Mon, 10 Sep 2007 10:10:50 -0600
Received: from nawesdnims03.nmci.navy.mil by gate3-sandiego.nmci.navy.mil
 via smtpd (for mail.redfish-solutions.com [66.232.79.143]) with ESMTP; 
Mon, 10 Sep 2007 16:00:18 +
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Subject: RE: Chuckle
Date: Mon, 10 Sep 2007 09:10:39 -0700
Message-ID: [EMAIL PROTECTED]
In-Reply-To: [EMAIL PROTECTED]
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Chuckle

Thread-Index: AcfzukOHakkCi8HDRJ2nEhvQOY8RZgACopXw
References: [EMAIL PROTECTED]
From: John Doe [EMAIL PROTECTED]
To: Philip Prindeville [EMAIL PROTECTED]
X-OriginalArrivalTime: 10 Sep 2007 16:10:40.0158 (UTC) 
FILETIME=[219FDBE0:01C7F3C5]


Could they have just *deleted* the Received: lines they didn't want to show?  No, 
of course not.  That would be too easy.  Let's mangle them into something that doesn't 
conform to RFC-822 instead.

As it is, they were leaking hostnames through the References: and 
Message-Id: fields anyway...  but we won't talk about that.

They couldn't even leave the id and timestamp fields in the Received: lines 
because that would be revealing... ummm... revealing...  uhh...  how many licks it takes to get to 
the center of a tootsie pop... or some such nonsense.
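
To make the complaint concrete: RFC 822 requires a Received: field to end with a semicolon followed by a date-time. A small Python check along those lines (a sketch only; the function name is mine, and it tests just the trailing timestamp, not the full grammar) rejects the mangled lines above:

```python
from email.utils import parsedate_to_datetime


def received_has_timestamp(value: str) -> bool:
    """True if a Received: field value ends with '; <date-time>'."""
    _, sep, stamp = value.rpartition(";")
    if not sep:
        return False  # no ';' at all, so no timestamp part
    try:
        parsedate_to_datetime(stamp.strip())
    except (TypeError, ValueError):
        return False  # text after ';' is not a parseable date
    return True


# Hypothetical well-formed value versus the mangled form seen above.
good = ("from a.example.org by b.example.org with ESMTP id abc123; "
        "Mon, 10 Sep 2007 10:10:50 -0600")
bad = "(private information removed)"
print(received_has_timestamp(good), received_has_timestamp(bad))
```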





US Senate as bad internet citizens???

2007-11-13 Thread Philip Prindeville
Well, I recently called my Senator to ask him to support enhanced 
network neutrality legislation (since he worked on the 1998 
Telecommunications Bill).


I received his reply 2 days later by email.  Ok.  I found that there 
were some misconceptions he had about the topic on a purely technical 
basis, and decided to reply to him and set him straight (or more 
appropriately, set the staffer straight that had written the response on 
his behalf).


Well...

The original message was received at Tue, 13 Nov 2007 17:04:21 -0500 (EST)
from localhost [127.0.0.1]

  - The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
   (reason: 550 5.1.1 User unknown)

  - Transcript of session follows -
... while talking to bridgeheadpsq.senate.gov.:


 DATA
  

 550 5.1.1 User unknown
550 5.1.1 [EMAIL PROTECTED]... User unknown
 503 5.5.2 Need Rcpt command.



So I'm wondering... if they send emails out that can't be replied to...  
doesn't that correspond to the very definition of a spammer?  Aren't 
they concealing their identity?


Sigh.

Oh, well.  I knew it was asking too much to have meaningful legislation 
on net neutrality (or digital rights, or copyright reform, etc) come 
from Washington D.C.  Perhaps in 50 years they'll finally have a handle 
on it.


But I dared to hope...

-Philip



Re: help

2007-11-13 Thread Philip Prindeville
As a heads up, more people will read your message if you make your 
Subject line more insightful.


Ironically, I contacted Kintera last Spring pointing out that I wasn't 
getting messages from one of their customers because they were sending 
malformed messages that pegged the spam-o-meter (in particular, they 
were sending broken Date: lines).


Didn't hear back.

Apparently, they've never heard of Spam, or have no interest in 
differentiating themselves from less-legitimate content.


Perhaps it's a marketing strategy to sell you more products and services 
to complement the ineffectual ones you're now using.  ;-)


-Philip



Kim Hurlbutt wrote:
Wondering if you can point me in the right direction on how to make 
our spam scores lower.  How can I get information on how to make edits 
to our pages to lower our scores?  We currently use Kintera to send 
our email newsletters.  Please help!!   Thanks
 
An example of our spam score:
 
Your spam score is: 2.9 points


Score Details:
pts rule name              description
--- ---------------------- ------------------------------------------------
0.2 HTML_FONT_FACE_ODD     BODY: HTML font face is not a commonly used face
0.2 HTML_MESSAGE           BODY: HTML included in message
0.3 HTML_FONT_BIG          BODY: HTML has a big font
0.6 HTML_TABLE_THICK_BORD  BODY: HTML table has thick border
0.7 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
0.7 HTML_50_60             BODY: Message is 50% to 60% HTML
0.4 FORGED_YAHOO_RCVD      'From' yahoo.com does not match 'Received' headers


  Kim Hurlbutt
  Development
   Proctor Academy
  603.735.6218
www.proctoracademy.org




Clearly bogus false positives -- on abuse contact point, no less

2008-02-16 Thread Philip Prindeville
Hmmm.  I think we need a BL for reporting ISPs that are clueless enough to 
run filtering on their abuse mailbox (or the mailbox that's listed for 
their ARIN/RIPE AbuseEmail attributes).


Anyway, I have no idea why I'm seeing some of these scores.  URL matches 
when there aren't even URL's in my message?


A 2.6 score on BAYES_00?  URIBL_JP_SURBL and URIBL_OB_SURBL?  And what 
the heck is DNS_FROM_OPENWHOIS???


TVD_STOCK1?  There's no mention of stock anywhere in the message.  Why am I 
seeing all of these bogus matches?

I looked on the wiki for some of these, but couldn't find descriptions.

What should I do?  Just block their domain?  I don't want to deal with their 
misconfiguration issues.

-Philip





Received: from localhost (localhost)
by mail.redfish-solutions.com (8.14.1/8.14.1) id m1H2M5XP027602;
Sat, 16 Feb 2008 19:22:05 -0700
Date: Sat, 16 Feb 2008 19:22:05 -0700
From: Mail Delivery Subsystem [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
boundary=m1H2M5XP027602.1203214925/mail.redfish-solutions.com
Subject: Returned mail: see transcript for details
Auto-Submitted: auto-generated (failure)

This is a MIME-encapsulated message

--m1H2M5XP027602.1203214925/mail.redfish-solutions.com

The original message was received at Sat, 16 Feb 2008 19:22:01 -0700
from pool-71-112-32-245.sttlwa.dsl-w.verizon.net [71.112.32.245]

  - The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
   (reason: 550-This email has been automatically tagged as spam)
[EMAIL PROTECTED]
   (reason: 550-This email has been automatically tagged as spam)

  - Transcript of session follows -
... while talking to alpha.inbound.mercury.spaceservers.net.:

DATA

 550-This email has been automatically tagged as spam
 550-Spam detection software, operated by UKDomains limited, has
 550-identified this incoming email as possible spam.
 550-contact [EMAIL PROTECTED] for details and error reports.
 550-pts rule name              description
 550----- ---------------------- ------------------------------------------
 550-1.1 DNS_FROM_OPENWHOIS     RBL: Envelope sender listed in
 550-                           bl.open-whois.org.
 550--0.0 SPF_PASS              SPF: sender matches SPF record
 550--2.6 BAYES_00              BODY: Bayesian spam probability is 0 to 1%
 550-                           [score: 0.]
 550-1.5 URIBL_JP_SURBL         Contains an URL listed in the JP SURBL
 550-                           blocklist
 550-                           [URIs: chalturs.com]
 550-1.5 URIBL_OB_SURBL         Contains an URL listed in the OB SURBL
 550-                           blocklist
 550-                           [URIs: chalturs.com]
 550-0.5 WHOIS_DMNBYPROXY       Contains URL registered to Domains by Proxy
 550-                           [URIs: redfish-solutions.com]
 550 3.4 AWL                    AWL: From: address is in the auto white-list
554 5.0.0 Service unavailable

--m1H2M5XP027602.1203214925/mail.redfish-solutions.com
Content-Type: message/delivery-status

Reporting-MTA: dns; mail.redfish-solutions.com
Received-From-MTA: DNS; pool-71-112-32-245.sttlwa.dsl-w.verizon.net
Arrival-Date: Sat, 16 Feb 2008 19:22:01 -0700

Final-Recipient: RFC822; [EMAIL PROTECTED]
Action: failed
Status: 5.2.0
Remote-MTA: DNS; alpha.inbound.mercury.spaceservers.net
Diagnostic-Code: SMTP; 550-This email has been automatically tagged as spam
Last-Attempt-Date: Sat, 16 Feb 2008 19:22:05 -0700

Final-Recipient: RFC822; [EMAIL PROTECTED]
Action: failed
Status: 5.2.0
Remote-MTA: DNS; alpha.inbound.mercury.spaceservers.net
Diagnostic-Code: SMTP; 550-This email has been automatically tagged as spam
Last-Attempt-Date: Sat, 16 Feb 2008 19:22:05 -0700

--m1H2M5XP027602.1203214925/mail.redfish-solutions.com
Content-Type: message/rfc822

Return-Path: [EMAIL PROTECTED]
Received: from [192.168.10.120] (pool-71-112-32-245.sttlwa.dsl-w.verizon.net 
[71.112.32.245])
(authenticated bits=0)
by mail.redfish-solutions.com (8.14.1/8.14.1) with ESMTP id 
m1H2M0XQ027599
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
Sat, 16 Feb 2008 19:22:01 -0700
Message-ID: [EMAIL PROTECTED]
Date: Sat, 16 Feb 2008 18:21:27 -0800
From: Abuse Department [EMAIL PROTECTED]
User-Agent: Thunderbird 2.0.0.9 (Windows/20071031)
MIME-Version: 1.0
To: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: Of course it's spam: it's an abuse mailbox
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.63 on 192.168.1.3

Of course it's spam.  It's a copy of an offending message (that 
originated from *your* site) being reported back to you, and to your 
abuse mailbox.


If it weren't spam, there'd hardly be a point in reporting it now, would 
there?


What other brilliant deductions are to follow?  That there are a lot of 
sick people in a hospital?


Get a clue.  Better yet, if you were as good at detecting *outbound* 
spam coming from your site as you are incoming spam, we wouldn't be 
having

Re: Clearly bogus false positives -- on abuse contact point, no less

2008-02-16 Thread Philip Prindeville

Karsten Bräckelmann wrote:

Please, do not paste a gigantic blob of multipart MIME messages. Put it
up somewhere, raw, and simply provide a link.


On Sat, 2008-02-16 at 18:44 -0800, Philip Prindeville wrote:
  
Anyway, I have no idea why I'm seeing some of these scores.  URL matches 
when there aren't even URL's in my message?



There are. Self-inflicted. The ones in square brackets with the leading
550 code, which you seem to keep sending back and forth. :)
  


And just *mentioning* the domain name, without any sort of valid URL 
(ftp: or http: or anything of the sort) is going to match it as a URL?  
That's highly bogus.


A domain name alone does not a URL make.

A 2.6 score on BAYES_00?  URIBL_JP_SURBL and URIBL_OB_SURBL?  And what 
the heck is DNS_FROM_OPENWHOIS???



Well, if you don't mind having a second look, that is MINUS 2.6 for
Bayes. What's wrong with that?
  


Oh, sorry, read over the scores too quickly.  Never mind the BAYES_00.
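
The sign is easy to miss because each line of the SMTP reply starts with the continuation prefix "550-", so "550--2.6" is really "550-" followed by "-2.6". A short Python sketch that separates the two (the regex and function name are mine, and the sample lines are tidied copies of the bounce) makes the scores explicit:

```python
import re

# "550-" or "550 " prefix, then an optionally negative score, then the rule.
SCORE_LINE = re.compile(r"^550[- ](-?\d+\.\d+)\s+([A-Z0-9_]+)")


def parse_scores(reply_lines):
    """Map rule name -> score for every reply line that carries a score."""
    scores = {}
    for line in reply_lines:
        m = SCORE_LINE.match(line.strip())
        if m:
            scores[m.group(2)] = float(m.group(1))
    return scores


bounce = [
    "550-1.1 DNS_FROM_OPENWHOIS RBL: Envelope sender listed in",
    "550-bl.open-whois.org.",
    "550--0.0 SPF_PASS   SPF: sender matches SPF record",
    "550--2.6 BAYES_00   BODY: Bayesian spam probability is 0 to 1%",
    "550-1.5 URIBL_JP_SURBL Contains an URL listed in the JP SURBL",
    "550-1.5 URIBL_OB_SURBL Contains an URL listed in the OB SURBL",
    "550-0.5 WHOIS_DMNBYPROXY   Contains URL registered to Domains by Proxy",
    "550 3.4 AWL AWL: From: address is in the auto white-list",
]
scores = parse_scores(bounce)
total = round(sum(scores.values()), 1)
print(scores["BAYES_00"], total)
```

With the signs read correctly, the listed rules add up to 5.4 points, and BAYES_00 is the one pulling the total down.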



Regarding your SURBL questions... Yes.  Wait, you were hoping for more?
Without any actually asked question? OK, good then. The domain
chalturs.com is listed in these RBLs, as the results tell you. See
http://surbl.org/ for more.
  


I read the top-level page, but didn't see anything really pertinent.  I 
get the idea.  But naming the domain in a message, again, is not the 
same as embedding an entire URL containing the domain.  The two aren't 
equivalent.




Oh, and DNS_FROM_OPENWHOIS probably is http://open-whois.org/, which
gives you a hint about what it actually is. The hit itself pretty much
mentions this...
  


Yeah, I read this.  And I don't get that either.

How does having your domain be anonymous (for whatever reason... maybe 
you're a small company operating below the radar) make your email any 
more likely to be spam?



TVD_STOCK1?  There's no mention of stock anywhere in the message.



From a quick glimpse of the code, it appears to identify common words
used in stock (as in stock exchange, pump-n-dump penny stocks) spam. It
does not search for the word stock. Just as pretty much no rule in SA
ever searches for single words only...
  


Again, I didn't see anything that should legitimately be causing this 
rule to fire, and certainly not with such a high score for such an 
unreliable rule.




Why am I seeing all of these bogus matches?



From what I can tell, and what you sent us, they don't appear to be
bogus.
  


Depends on whether you equate bare domains with URL's, I suppose.


I looked on the wiki for some of these, but couldn't find descriptions.

What should I do?  Just block their domain?  I don't want to deal with
their misconfiguration issues.



Apparently you already exchanged messages? Try not sending the offensive
mail in question. Put it up somewhere as reference, if need be. Hmm,
sounds familiar... ;)

  guenther


  


No, I sent them back the offending email, initially.  Which they marked 
as spam (bloody brilliant, of course it's spam, otherwise I wouldn't be 
bothering to report it... what else do they expect to come to their 
Abuse mailbox, anyway???).


So I sent the SA scores back to them, and that's the part that I 
pasted previously.


How do you report Spam to such a site that's going to block your Spam 
reports for being... well, Spam!


(Yes, I'm shocked too to hear there's gambling going on in Casablanca...)



Re: Clearly bogus false positives -- on abuse contact point, no less

2008-02-17 Thread Philip Prindeville

Matt Kettler wrote:

Philip Prindeville wrote:

Karsten Bräckelmann wrote:

Please, do not paste a gigantic blob of multipart MIME messages. Put it
up somewhere, raw, and simply provide a link.


On Sat, 2008-02-16 at 18:44 -0800, Philip Prindeville wrote:
 
Anyway, I have no idea why I'm seeing some of these scores.  URL 
matches when there aren't even URL's in my message?



There are. Self-inflicted. The ones in square brackets with the leading
550 code, which you seem to keep sending back and forth. :)
  


And just *mentioning* the domain name, without any sort of valid URL 
(ftp: or http: or anything of the sort) is going to match it as a 
URL?  That's highly bogus.


A domain name alone does not a URL make.
You tell that to most windows-based clients, which will automatically 
make clickable URLs out of things like www.google.com in text sections.


snip



Oh, and DNS_FROM_OPENWHOIS probably is http://open-whois.org/, which
gives you a hint about what it actually is. The hit itself pretty much
mentions this...
  


Yeah, I read this.  And I don't get that either.

How does having your domain be anonymous (for whatever reason... 
maybe you're a small company operating below the radar) make your 
email any more likely to be spam?
Decidedly so. The people with the strongest reason to hide their 
contact information are the spammers, and other shady businesses.


That's not to say there aren't some legitimate folks that use this 
kind of anonymization.  However, the domains by proxy model is a 
questionable practice, as it violates the spirit of the whois 
requirements. Also, many of them violate the letter of the 
requirements, such as the phone issue noted on the open-whois main 
page. (ie:  anyone registered using securewhois is not correctly 
registered, per ICANN requirements for whois)


Well, what's ironic here is this:

I go to the open-whois web-site, and read their blurb:

What do you have against privacy?

In a word: nothing. This is not about privacy, but about 
accountability. The Internet is built upon cooperation and 
accountability, anything which undermines accountability is a bad thing. 
The usability of the WHOIS database is seriously undermined by anonymous 
domains.


Ah...  But filtering your spam reports so no one can ever report spam to 
you... that's a lot more accountable, clearly.  :-)







TVD_STOCK1?  There's no mention of stock anywhere in the message.




Not sure, you might want to try running it with debugging on.
The debug message from the code would be:

 dbg("eval: stock info hit: $1");

That should tell you what exact substring matched the stock info code.


From a quick glimpse of the code, it appears to identify common words
used in stock (as in stock exchange, pump-n-dump penny stocks) spam. It
does not search for the word stock. Just as pretty much no rule in SA
ever searches for single words only...
  


Again, I didn't see anything that should legitimately be causing this 
rule to fire, and certainly not with such a high score for such an 
unreliable rule.






Why am I seeing all of these bogus matches?



From what I can tell, and what you sent us, they don't appear to be
bogus.
  


Depends on whether you equate bare domains with URL's, I suppose.
If MUA's equate them with URLs, spammers will use this, and 
SpamAssassin will use it.


There is only so much braindeath in UA's that you can bend the rules 
for.  Clearly, this involves breaking them.







I looked on the wiki for some of these, but couldn't find 
descriptions.


What should I do?  Just block their domain?  I don't want to deal with
their misconfiguration issues.



Apparently you already exchanged messages? Try not sending the offensive
mail in question. Put it up somewhere as reference, if need be. Hmm,
sounds familiar... ;)

  guenther


  


No, I sent them back the offending email, initially.  Which they 
marked as spam (bloody brilliant, of course it's spam, otherwise I 
wouldn't be bothering to report it... what else do they expect to 
come to their Abuse mailbox, anyway???).


So I sent the SA scores back to them, and that's the part that I 
pasted previously.


How do you report Spam to such a site that's going to block your Spam 
reports for being... well, Spam!
Well, it's stupid, and probably an RFC violation to perform such 
filtering on your abuse box. So, I'm not saying the domain in question 
isn't behaving foolishly. You might want to point this out to them, 
and suggest they whitelist their abuse address. At the very least, ask 
them if they have an alternate reporting address that isn't filtered.




I'll give it another try.  If not, their CIDR range and domain name will 
go into my blacklist.  I don't want to open myself up to them if I can't 
reasonably expect them to respond to spam issues when/if they occur (again).


-Philip





Re: SVN notifications killing spamassassin

2008-02-17 Thread Philip Prindeville

Eric A. Hall wrote:

I sometimes get SVN notifications that contain lists of files and their
status. The filenames will often get picked up by the URI matching
algorithm, each of which end up being processed through numerous lookups
(URICOUNTRY, my LDAP filter, etc). Sometimes I get very large messages
with hundreds of file lists, which in turn causes spamassassin to go into
never-never land while it thinks about the hundreds of URI matches.

For example,

  Afpo/reports/perl/nagios_notifications1.pl.bak
  Afoo/reports/perl/nagios_outages1.pl
  Afoo/reports/perl/GWIR.pm

nagios_outages1.pl will be determined as a URI for .pl domain and GWIR.pm
will be determined as a URI for .pm domain, and so forth. The only way to
get these messages through is to disable spamassassin...

I've updated to 3.2.4 just now and it still has the same problem

I'm guessing the URI analyzer needs to be smarter.
  


That's strangely appropriate to the issue I had with chalturs.com.

It would be nice if this checker had an option to enforce checking only 
of well-formed URL's (i.e. not anything that might conceivably be 
munged into a URL by the most ignorant of UA's)... something requiring 
a protocol name (ftp:, http:, tftp:, etc.), a domain name, and a path 
name (even if it's just slash).


Or at the very least, to score complete URL's higher than just domain 
names alone.
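
As a sketch of what such an option might distinguish (this is not how SpamAssassin's URI code works; the names, scheme list, and bare-domain regex are assumptions for illustration), in Python:

```python
import re
from urllib.parse import urlsplit

SCHEMES = {"http", "https", "ftp", "tftp"}  # assumed list of accepted schemes
BARE_DOMAIN = re.compile(r"^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")


def classify(token: str) -> str:
    """Return 'full_url', 'bare_domain', or 'other' for one text token."""
    parts = urlsplit(token)
    # Strict form: protocol name, domain name, and a path (even just "/").
    if parts.scheme in SCHEMES and parts.netloc and parts.path:
        return "full_url"
    # Dotted name with no scheme: what a lenient UA might turn into a link.
    if BARE_DOMAIN.match(token):
        return "bare_domain"
    return "other"


for tok in ("http://chalturs.com/", "chalturs.com", "GWIR.pm",
            "foo/reports/perl/GWIR.pm"):
    print(tok, "->", classify(tok))
```

A checker could then look up only the "full_url" tokens, or look up both classes and give the bare domains a lower score.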


-Philip




Re: Clearly bogus false positives -- on abuse contact point, no less

2008-02-17 Thread Philip Prindeville

Matt Kettler wrote:

Philip Prindeville wrote:

Matt Kettler wrote:

Philip Prindeville wrote:
 


Depends on whether you equate bare domains with URL's, I suppose.
If MUA's equate them with URLs, spammers will use this, and 
SpamAssassin will use it.


There is only so much braindeath in UA's that you can bend the rules 
for.  Clearly, this involves breaking them.
Erm.. What rule does this actually break? Is there a rule in an RFC 
somewhere specifying you MUST not interpret bare domains as URIs in 
text emails?


There is an RFC that defines what a URL looks like.  A bare domain 
doesn't cut it.


You want to forbid bare domains in email?  Go ahead.  You can forbid 
anything you like.


But don't call it a test for URL's, since it's clearly not.




Besides, when this braindeath is more the norm than the exception, 
it's a de facto standard. Particularly in the absence of any rules 
against it.


Yeah, I'll talk to the Outlook folks, and file a bug against 
Thunderbird... (I think the latter only does it to be compatible with 
the former...)




*EVERY* graphical MUA I've used in the past 10 years does this. 
Thunderbird, Outlook, Groupwise, Eudora, they all do it. I'm sure 
there are MUAs that don't, but there's an awful lot that do. Most 
webmails seem to do it too. Outlook web access, Comcast and Yahoo all 
do, but I'll concede that Verizon's webmail doesn't.






Re: Clearly bogus false positives -- on abuse contact point, no less

2008-02-18 Thread Philip Prindeville

Matt Kettler wrote:

Philip Prindeville wrote:

Matt Kettler wrote:

Philip Prindeville wrote:

Matt Kettler wrote:

Philip Prindeville wrote:
 


Depends on whether you equate bare domains with URL's, I suppose.
If MUA's equate them with URLs, spammers will use this, and 
SpamAssassin will use it.


There is only so much braindeath in UA's that you can bend the 
rules for.  Clearly, this involves breaking them.
Erm.. What rule does this actually break? Is there a rule in an RFC 
somewhere specifying you MUST not interpret bare domains as URIs in 
text emails?


There is an RFC that defines what a URL looks like.  A bare domain 
doesn't cut it.
Yes, but there's nowhere that says you can't interpret any text you 
want as a URL.


RFCs in general are interpreted with "be strict about what you 
generate, and liberal in what you accept." URLizing text segments 
fits with that spirit, and it does not violate the letter of any RFC 
I'm aware of.


There are lots of caveats to this rule, and security is certainly one 
area where you'll find being liberal in what you accept to be antithetical.




If you can prove otherwise, please do so.

You want to forbid bare domains in email?  Go ahead.  You can forbid 
anything you like.


But don't call it a test for URL's, since it's clearly not.
Well, they don't.. they call it a test for URIs, which is actually 
slightly different, but not really to the point here.


However, in general, it is intended to be a test for anything most 
MUA's will interpret as a URI.


Ok, conceded.  So the fix is to stop the UA's broken behavior, so we 
don't have to copy it.






Besides, when this braindeath is more the norm than the exception, 
it's a de facto standard. Particularly in the absence of any rules 
against it.


Yeah, I'll talk to the Outlook folks, and file a bug against 
Thunderbird... (I think the latter only does it to be compatible with 
the former...)
I'd venture to guess neither started it. Eudora predates both products 
by quite an extensive period of time. It could have originated there, 
or in Netscape mail.


Sorry, but I highly doubt you can blame this on microsoftism, nor do I 
think it's any kind of wild incorrectness as you so strongly 
postulate. This has been a very standard feature in email for a very 
long time. It's not a recent development.


Long standing hardly equates to correct.  If that were the case, 
day-one bugs would never get fixed. :-)





It's also a feature that is quite important to accuracy in 
spamassassin. Spammers regularly take advantage of MUA's urlizing 
text. Regularly.. Every day. Adding the ability to detect those 
domains increases SA's hit rate for spam, and that's a good thing. 
Yes, it causes SA to trigger on spam reports, but it generally will do 
that for other parts of spam messages anyway.


Let's face it, your problem isn't with SA detecting a spam domain, 
it's with some idiot filter/rejecting their abuse box.




Not at all.

A lot of spam uses constructs that aren't well-formed according to 
standards.  Like broken Date: lines.


I'm happy to reject email that can't get something as simple as a Date: 
line correct.
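
A minimal Python sketch of that kind of test, assuming the standard library's parser is an acceptable judge of "correct" (it is fairly lenient, so this catches only dates it cannot parse at all; the function name is mine):

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def has_valid_date(raw_message: str) -> bool:
    """True if the message has a Date: header that parses as a date-time."""
    msg = message_from_string(raw_message)
    value = msg.get("Date")
    if value is None:
        return False  # no Date: header at all
    try:
        parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return False  # header present but not a parseable date
    return True
```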


If Kintata (or whatever it's called) emails get bounced, I'm fine with 
that.  Maybe it will light a fire beneath them to get it fixed.  They're 
in the minority anyway.


Same applies to interpreting URI's.

I'd rather suffer a few broken applications, or in this case, a user 
having to cut a domain name out of an email and paste it into a web 
browser and not be able to simply click through the message body, if 
it helps maintain the clear distinction between well-formed messages and 
gray area ham/spam.


-Philip



Re: Clearly bogus false positives -- on abuse contact point, no less

2008-02-18 Thread Philip Prindeville

Daryl C. W. O'Shea wrote:

Philip Prindeville wrote:
  

There is an RFC that defines what a URL looks like.  A bare domain
doesn't cut it.

You want to forbid bare domains in email?  Go ahead.  You can forbid
anything you like.



I don't, and I doubt Matt wants to either.

  

But don't call it a test for URL's, since it's clearly not.



FWIW, you're the only one who's been calling it a URL.  The SA headers
say it's a URI, which isn't accurate either, unless of course you
consider SURBL to be a Schemeless URI Realtime Blocklist.

  

Besides, when this braindeath is more the norm than the exception,
it's a de facto standard. Particularly in the absence of any rules
against it.
  

Yeah, I'll talk to the Outlook folks, and file a bug against
Thunderbird... (I think the latter only does it to be compatible with
the former...)



Yeah, good luck with that.

Do you really have an issue with SA, or is it just that you're pissed
off that somebody rejected spam sent to their abuse account and you're
taking your frustration out on how SA detected that spam?

Daryl
  


I don't like going down the slippery slope of "Well, it's not really a 
URI, but Outlook treats it like one, so we will too." (substitute URI 
and Outlook with any number of alternate permutations here).


Half of the security holes that viri, etc. exploit probably exist 
because of woolly-minded thinking and bent definitions like that in the 
first place.  So what could be a well-intentioned attempt to make things 
better just ends up making them worse.


-Philip




Another candidate for the hall of Shame: Eschelon

2008-04-18 Thread Philip Prindeville
Well, I got a bunch of spams from 66.213.228.51 about some non-existent 
stock (that's considered Wire Fraud, and it's a federal felony offense 
in the US).


It was also unsolicited.

I went to Eschelon.com, the ISP, and provided them with examples and 
asked them to shut down the spammer.


They insisted that the client in this case (meaning their checks cash, 
even if they do spam) was a legitimate opt-in operator.


I said, "Fine, then have them furnish the proof that this user ever 
opted in, because he insists he didn't."


A week later, no reply, despite my pinging them twice.

They're either complicit, or else burying their heads in the sand as to 
the legitimacy of the complaints (they did call them a major customer).


Because it doesn't take over a week to dig out proof that someone opted 
into a list or didn't.


So, what's the procedure for spanking an irresponsible ISP?

How do you name him to the various RBL's?

I suppose I could sign up for spamcop.net... Which S/X/RBL would be most 
effective in this case?


Thanks,

-Philip





Blacklist of phone numbers?

2006-06-03 Thread Philip Mak
Is there a blacklist of phone numbers?

A lot of diploma spam I get has totally different message bodies,
except they list the same phone number to call.


Bad quoting

2006-06-08 Thread Philip Prindeville
I noticed the following message (well, I'll just put a fragment):

!DOCTYPE HTML PUBLIC -//W3C//DTD HTML 4.0 Transitional//EN
HTMLHEAD
META http-equiv=3DContent-Type content=3Dtext/html; =
charset=3Dwindows-1252
META content=3DMSHTML 6.00.2900.2670 name=3DGENERATOR
STYLE/STYLE
/HEAD
BODY bgColor=3D#ff
DIVFONT face=3DArial size=3D2IMG alt=3D hspace=3D0=20
src=3Dcid:000e01c68b04$73437a90$41e45853@qop align=3Dbaseline=20
border=3D0IMG alt=3D hspace=3D0=20
src=3Dcid:000f01c68b04$73437aaa$41e45853@qop align=3Dbaseline=20
border=3D0IMG alt=3D hspace=3D0=20
src=3Dcid:001001c68b04$73437ac4$41e45853@qop align=3Dbaseline=20
border=3D0IMG alt=3D hspace=3D0=20
src=3Dcid:001101c68b04$73437ade$41e45853@qop align=3Dbaseline=20
border=3D0IMG alt=3D hspace=3D0=20
src=3Dcid:001201c68b04$73437af8$41e45853@qop align=3Dbaseline=20
border=3D0/FONT/DIV



Note that the '=' got escaped as '=3D'... they probably entered
the text and their HTML editor escaped it, not figuring it was
raw HTML being entered directly...
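
For what it's worth, `=3D` is also exactly how the quoted-printable transfer encoding writes a literal `=`, and a bare `=` at the end of a line is a soft line break. So the fragment may simply be a quoted-printable body viewed raw rather than an editor mistake. That assumes the part was sent with Content-Transfer-Encoding: quoted-printable, which the fragment itself doesn't show. A minimal Python sketch of decoding two of the lines above with the standard `quopri` module:

```python
import quopri

# Two lines copied from the fragment above, still quoted-printable encoded.
encoded = (
    b"META http-equiv=3DContent-Type content=3Dtext/html; =\n"
    b"charset=3Dwindows-1252\n"
)

# "=3D" decodes to a literal "=", and the trailing "=" is a soft line
# break that joins the two physical lines back into one.
decoded = quopri.decodestring(encoded)
print(decoded.decode("ascii"))
```

If the decoded text looks like ordinary HTML attributes, the escaping was done by the mail transport and not by whoever wrote the markup.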

-Philip




Re: how do reject email with ....

2006-06-08 Thread Philip Prindeville
Call SA from Mimedefang.  And see the sample config I put up:

http://www.mimedefang.org/kwiki/index.cgi?PhilipsWorkingFilter

See the last test in filter_relay().

Note that there are two blocks that need to be downloaded and
put into the mimedefang-filter file.  I broke them up to be able to
document them.

-Philip


Screaming Eagle wrote:

I'm getting this type of spam:

  Return-Path: [EMAIL PROTECTED]
 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on
 X-Spam-Virus: No
 X-Spam-Status: No, score=1.4 required=8.0 tests=BAYES_50,HTML_30_40,
 HTML_MESSAGE autolearn=no version=3.1.0
 X-Spam-Level: *
 Received: from 1802EC8 ([59.95.26.84]) by .
 (8.11.6/8.11.6) with SMTP id k58CtsN23285; Thu, 8 Jun 2006
08:55:55 -0400
 Received: from echoes (unknown [59.95.26.84]) by WXMVW (LBYSys) with ESMTP

The IP 59.95.26.84 is not resolvable. How can I refuse email
from sources that do not have a proper reverse lookup or name
lookup?

Thanks.
  




Re: Adding Phishing Link rule

2006-06-24 Thread Philip Prindeville
What about combining this with a whitelist?

I.e. I regularly get emails from target.bifn0.com that contain links that
point to themselves, but say they are target.com.  And in fact, this is
a 3rd party that Target has contracted to do outsourced mailings for them,
so in that respect they are legitimate.  So I could easily whitelist
them and continue to reject everyone else...

The other approach would be to push for an advisory standard (RFC)
that explains how to encode URL's so that they aren't flagged as
phishing.  (No flames from pissy people please... you know who you
are... ;-)  I.e. that at a minimum the host portions of the URL and the
label for the link would have to match...

If the sender REALLY needs to have the link reside somewhere else,
they could always have the published address send a Location: response
that redirects you to the eventual resting place.
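
A rough sketch of the host-versus-label comparison with a whitelist, in Python. The function names (`host_of`, `label_matches_target`) and the whitelist contents are made up for illustration; this is not how any existing SA rule does it, and it only looks at labels that themselves look like a hostname or URL.

```python
import re
from urllib.parse import urlsplit

# Something that looks like a DNS name: labels separated by dots, then a TLD.
HOSTLIKE = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$")


def host_of(text):
    """Return the lowercase host named by a URL or bare domain, else None."""
    candidate = text.strip().lower()
    if "://" not in candidate:
        candidate = "http://" + candidate  # let urlsplit parse a bare domain
    try:
        host = urlsplit(candidate).hostname
    except ValueError:
        return None
    if host and HOSTLIKE.match(host):
        return host
    return None


def label_matches_target(href, label, whitelist=()):
    """False only when the label names one host and the link goes to another."""
    target = host_of(href)
    shown = host_of(label)
    if target is None or shown is None:
        return True  # label is plain text such as "Click here"; nothing to compare
    if target in whitelist:
        return True  # known third-party mailer, explicitly allowed
    return target == shown


print(label_matches_target("http://target.bifn0.com/offer", "target.com"))
print(label_matches_target("http://target.bifn0.com/offer", "target.com",
                           whitelist={"target.bifn0.com"}))
```

Exact host equality is the strictest possible reading of "the host portions would have to match"; a real rule would probably want to compare registered domains instead.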

-Philip


Loren Wilton wrote:

The rule you suggest isn't particularly good.  There are far too many legit
mails (mostly mailing list type of things) that do exactly what you want to
check for.  So the FP rate is higher than most people would like.  This has
been discussed many times in the past.

That said, I believe there is at least one SARE rule that checks for exactly
what you want to look for.

Loren

  




On bichromatic GIF stock spam

2006-06-24 Thread Philip Prindeville
I get a lot of spam that looks like:

http://pastebin.com/729105

on the alsa-devel mailing list, amongst others...  And noticed the
following.

If you decompress the GIF file and decode it into a pixmap image, then
do a color histogram of the image, you notice two things immediately.

There are two colors, black (the text), and the colored background.

Further, these spams seem to always use one of 6 common colors...

It should be trivial to write a filter that does exactly this decompression
and returns a list of what colors (maybe as a #281000 RGB encoding)
occur and what percentage of the total color map they occupy.

If, after excluding black, we find that 100% of the color map is that
nasty pastel pink or pastel lime green (etc) then it's a spam and we
toss it.
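
A minimal sketch of that histogram test in Python. It assumes the GIF has already been decoded into a flat list of (r, g, b) tuples by some image library; the decoding step, the function names and the "suspect" colour value are all placeholders, not taken from real spam samples.

```python
from collections import Counter


def color_shares(pixels):
    """Map "#rrggbb" -> fraction of the pixels that have that colour."""
    counts = Counter(pixels)
    total = len(pixels)
    return {"#%02x%02x%02x" % rgb: n / total for rgb, n in counts.items()}


def is_bichromatic_spam(pixels, suspect_colors, ignore=(0, 0, 0)):
    """True if, once `ignore` (the black text) is excluded, 100% of the
    remaining pixels are one single colour from `suspect_colors`."""
    background = [p for p in pixels if p != ignore]
    if not background:
        return False
    shares = color_shares(background)
    return len(shares) == 1 and next(iter(shares)) in suspect_colors


# Hypothetical image: 20 black text pixels on an 80-pixel pastel pink fill.
PASTEL_PINK = (255, 209, 220)   # made-up value for illustration
pixels = [(0, 0, 0)] * 20 + [PASTEL_PINK] * 80
suspects = {"#ffd1dc"}          # assumed list of the "usual" colours

print(color_shares(pixels))
print(is_bichromatic_spam(pixels, suspects))
```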

Sound reasonable?

There might be other tests that cover this, but they aren't used by
SourceForge unfortunately, which hosts a lot of the lists that I read...

-Philip



Re: On bichromatic GIF stock spam

2006-06-24 Thread Philip Prindeville
Michael Scheidell wrote:

-Original Message-
From: Philip Prindeville [mailto:[EMAIL PROTECTED] 
Sent: Saturday, June 24, 2006 2:10 PM
To: users@spamassassin.apache.org
Subject: On bichromatic GIF stock spam


I get a lot of spam that looks like:

http://pastebin.com/729105

on the alsa-devel mailing list, amongst others...  And 
noticed the following.

If you decompress the GIF file and decode it into a pixmap 
image, then do a color histogram of the image, you notice two 
things immediately.



Or feed it through character recognition software and then replace the
gif attachment with a plain text attachment and reinject it back into
SA.
  


Well, yeah, and that's already been discussed...  I wanted an alternative
to that that might be less CPU intensive.

-Philip



Re: On bichromatic GIF stock spam

2006-06-24 Thread Philip Prindeville
Loren Wilton wrote:

If, after excluding black, we find that 100% of the color map is that
nasty pastel pink or pastel lime green (etc) then it's a spam and we
toss it.

Sound reasonable?



I was thinking about this the other day.  I think the concept is reasonable,
but as stated doesn't go far enough, and would be trivial to bypass.

I think that someone first needs to come up with either a formula or a list
of RGB triples that are visually indistinguishable or some such.  (I
suspect this has been done several times now and the research should exist
in the wild.)

This can then be used as a fuzz to group colors that are very close down
into a common bucket.  As it is, trivial 1-bit variations on colors would
defeat the simple scheme.
  


Shh... they might be listening... ;-)

Seriously, though, how many people send out 2-color GIFs (besides
B&W scans of Dilbert and faxes) as email?

The formula is:

sqrt((r1 - r2)^2 + (g1 - g2)^2 + (b1 - b2)^2)

to generate the RGB vector distance between two pixels.
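
As a quick sketch in Python, with the distance used as the "fuzz" for grouping near-identical colours. The tolerance of 8.0 is an arbitrary assumption, not a measured threshold for what the eye can distinguish.

```python
import math


def rgb_distance(c1, c2):
    """Euclidean distance between two (r, g, b) triples."""
    (r1, g1, b1), (r2, g2, b2) = c1, c2
    return math.sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2)


def same_bucket(c1, c2, tolerance=8.0):
    """Treat two colours as the same if they are within `tolerance` of each other."""
    return rgb_distance(c1, c2) <= tolerance


print(rgb_distance((0, 0, 0), (3, 4, 0)))
print(same_bucket((255, 209, 220), (254, 209, 221)))   # 1-bit variations
print(same_bucket((255, 209, 220), (0, 0, 0)))
```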


It might also be interesting to accumulate a) total area of each color and
b) largest rectangle (or other easily detected shape) of each color.  The
first case we would have from the pixel counts.  The second case could be
used to detect large areas of fill color.  This might help classify a text
message vs a map of the world or a picture of downtown Camaroon.
  


Why?  What does downtown Cameroon look like?  ;-)

It also might be interesting to accumulate statistics on the common color
distributions for 10K or so legit images sent through email, possibly along
with some sort of indication of purpose: picture of me, picture of my
dog, billboard I saw, kids at Christmas, Hallmark greeting card, etc.
  


But those aren't sent as multipart/alternative... because you want to
see both the text and the images.  The spammers send multipart/alternative
because they want the text/plain section to confuse the Bayes filters,
since they know it won't be rendered...

With that info the color distribution might be able to help classify the
image fairly cheaply.

I don't know how much of the above would be absolutely necessary, but I
suspect at least some of it is.  Still, this is a fairly trivial sort of
thing to have to accumulate.  Especially since all spam (at least currently)
uses gifs, which a blind man can decode with a hair comb - no fancy software
required.

Loren
  



Yup.  Exactly.

-Philip




Re: On bichromatic GIF stock spam

2006-06-25 Thread Philip Prindeville
John D. Hardin wrote:

On Sat, 24 Jun 2006, Philip Prindeville wrote:

  

the text and the images.  The spammers send multipart/alternative
because they want the text/plain section to confuse the Bayes
filters, since they know it won't be rendered...



It seems to me that right there is the spam sign you should be looking
for, then, and save all the heavy-duty mathematical analysis of the
images themselves.
  


A lot of mailers generate multipart/alternative legitimately, though if you
ask me sending both text/plain and text/html is bogus and no one should
configure their mailer to do that.

-Philip



Re: On bichromatic GIF stock spam

2006-07-01 Thread Philip Prindeville
Loren Wilton wrote:

No, I was thinking of multipart/alternative where one of the
alternative streams is nothing but images. That doesn't strike me as
legitimate. Can anyone think of a scenario where images *are* a
legitimate alternative representation of text?



Doesn't really help.  The actual mails have a tiny gibberish text part, and
a tiny to medium html part that has a few words of gibberish (usually the
same as the text part) and the rest is calls to images.  So there really is
an html part.

I did a trivial test for alternative and gif, and it didn't pan out very
well.  Will need some additional conditions to make it more usable.

Loren

  


What Perl modules are there that can process GIF files (decode, perform
certain inspections and histogram analysis, etc.)?

I'd like to throw something together...

-Philip



Does SpamAssassin support SPF?

2006-07-01 Thread Philip Mak
Does SpamAssassin support SPF record checking?

Or is this something I have to patch into my incoming SMTP server?


Whitelisting abuse and

2006-07-19 Thread Philip Prindeville
What are the steps to whitelist email sent from <> (i.e. Postmaster
when bouncing mail) or [EMAIL PROTECTED]?

Thanks,

-Philip



Re: Rejection text

2006-07-19 Thread Philip Prindeville
Will Nordmeyer wrote:

  

On Wed, 12 Jul 2006, Paul Dudley wrote:



If we decide to reject low grade spam messages rather than
quarantine them, is it possible to add text to the body of the
rejection message?
  

Rejecting (bouncing) spam is utterly pointless, as 99% of it will have
forged sender information. You will either be sending your notice to a
nonexistent address, in which case you get yet more useless traffic
back to your server in the form of a bounce of your bounce, or your
notice will go to some innocent third party, possibly contributing to
an effective DDoS against their email account.

--


I thought this was about having the MTA saying "555 we don't want that
spam" at the end of the data phase.
Whether it can be done at all, and whether the message can be changed,
depends on the MTA rather than SA.

Wolfgang Hamann



Since MOST (if not all, these days) SPAM comes from invalid/forged 
addresses, doesn't that just bog down the email system with SPAM reject 
bounces bouncing back to you reporting that the address you were 
telling we rejected your SPAM is invalid?

(I had a user who had a 3rd party program that he'd do that with - I 
asked him to stop because when he'd do it, it'd bog down my email 
with invalid recipient type emails since the person he 
was notifying was an invalid address).
  


Thankfully there are fewer open relays each day, and hence if you
reject the message as it's being sent, then the sender is the spammer,
and he will know he is failing.

With any luck, he might even remove you from the list of addresses
that he will try to spam in the future.

-Philip



Broken abuse auto-responders

2006-08-22 Thread Philip Prindeville
Well, I have the following issue.  When I report abuse to [EMAIL PROTECTED],
they send me back an auto-generated email ticket with a broken Date: on
it (honestly, people, how hard is it to correctly format the date???).

They do this as <> for the sending address.

How does one go about writing a whitelist_from_rcvd line for the empty
address?

Aug 22 07:49:28 mail mimedefang.pl[458]: helo: dns-mx.noc.verio.net 
(129.250.49.11) said helo dns-mx.noc.verio.net
Aug 22 07:49:28 mail mimedefang.pl[458]: helo: whitelist dns-mx.noc.verio.net 
(129.250.49.11)
Aug 22 07:49:33 mail sendmail[472]: k7MDnN3u000472: from=, size=2062, 
class=0, nrcpts=1, msgid=[EMAIL PROTECTED], proto=ESMTP, daemon=MTA-v4, 
relay=dns-mx.noc.verio.net [129.250.49.11]
Aug 22 07:49:34 mail mimedefang.pl[458]: k7MDnN3u000472: hits=5.164, req=5, 
names=AWL,INVALID_DATE,NO_REAL_NAME
Aug 22 07:49:34 mail mimedefang.pl[458]: 
MDLOG,k7MDnN3u000472,spam,5.164,129.250.49.11,,[EMAIL PROTECTED],Re: 
[NTT-C2755649Z] Phishing from 161.58.27.23
Aug 22 07:49:34 mail mimedefang.pl[458]: filter: k7MDnN3u000472:  bounce=1 
discard=1
Aug 22 07:49:34 mail mimedefang[4220]: k7MDnN3u000472: Bouncing because filter 
instructed us to
Aug 22 07:49:34 mail sendmail[472]: k7MDnN3u000472: Milter: data, reject=554 
5.7.1 Message rejected; scored too high on the Spam test.
Aug 22 07:49:34 mail sendmail[472]: k7MDnN3u000472: to=[EMAIL PROTECTED], 
delay=00:00:05, pri=32062, stat=Message rejected; scored too high on the Spam 
test.




How to whitelist_from ?

2006-08-23 Thread Philip Prindeville
Hmm...  Maybe if I post with a more obvious subject line.

What is the notation for writing a whitelist_from or whitelist_from_rcvd
when the sender is <>?  (As in MAIL FROM:<>)

Thanks,

-Philip


Philip Prindeville wrote:

Well, I have the following issue.  When I report abuse to [EMAIL PROTECTED],
they send me back an auto-generated email ticket with a broken Date: on
it (honestly, people, how hard is it to correctly format the date???).

They do this as <> for the sending address.

How does one go about writing a whitelist_from_rcvd line for the empty
address?

Aug 22 07:49:28 mail mimedefang.pl[458]: helo: dns-mx.noc.verio.net 
(129.250.49.11) said helo dns-mx.noc.verio.net
Aug 22 07:49:28 mail mimedefang.pl[458]: helo: whitelist dns-mx.noc.verio.net 
(129.250.49.11)
Aug 22 07:49:33 mail sendmail[472]: k7MDnN3u000472: from=, size=2062, 
class=0, nrcpts=1, msgid=[EMAIL PROTECTED], proto=ESMTP, daemon=MTA-v4, 
relay=dns-mx.noc.verio.net [129.250.49.11]
Aug 22 07:49:34 mail mimedefang.pl[458]: k7MDnN3u000472: hits=5.164, req=5, 
names=AWL,INVALID_DATE,NO_REAL_NAME
Aug 22 07:49:34 mail mimedefang.pl[458]: 
MDLOG,k7MDnN3u000472,spam,5.164,129.250.49.11,,[EMAIL PROTECTED],Re: 
[NTT-C2755649Z] Phishing from 161.58.27.23
Aug 22 07:49:34 mail mimedefang.pl[458]: filter: k7MDnN3u000472:  bounce=1 
discard=1
Aug 22 07:49:34 mail mimedefang[4220]: k7MDnN3u000472: Bouncing because filter 
instructed us to
Aug 22 07:49:34 mail sendmail[472]: k7MDnN3u000472: Milter: data, reject=554 
5.7.1 Message rejected; scored too high on the Spam test.
Aug 22 07:49:34 mail sendmail[472]: k7MDnN3u000472: to=[EMAIL PROTECTED], 
delay=00:00:05, pri=32062, stat=Message rejected; scored too high on the Spam 
test.


  




Re: How to whitelist_from ?

2006-08-23 Thread Philip Prindeville
John D. Hardin wrote:

On Wed, 23 Aug 2006, Philip Prindeville wrote:

  

Hmm...  Maybe if I post with a more obvious subject line.

What is the notation for writing a whitelist_from or
whitelist_from_rcvd when the sender is <>?  (As in MAIL FROM:<>)



Are you sure you want to use that broad a brush? There is a *lot* of
garbage that is sent as faked mailer daemon bounces.
  


Well, yes, especially since the IP address of the sender is reserved for
a machine that does ticketing and auto-replies exclusively (I was going
to use whitelist_from_rcvd and not just whitelist_from).

When dealing with a known correspondent's brokenness, it's safer to
focus your permissiveness rather tightly. Try a meta rule that matches
a Received: line on a bounce from them, add a rule that ANDs that meta
with the rule that fires on their malformed date, and score it to
cancel out the malformed date score.
  


I'm not ready to work that hard...

I'd rather catch the broken email, point it out to them, have them fix it,
and then remove the whitelisting when they've fixed their agent.

-Philip




Re: How to whitelist_from ?

2006-08-24 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  


  

Well, yes, especially since the IP address of the sender is reserved for
a machine that does ticketing and auto-replies exclusively (I was going
to use whitelist_from_rcvd and not just whitelist_from).



At that point, you should be able to use:

 whitelist_from_rcvd * rdns.host.name

Which will effectively white-list the host.
  


There's no way to whitelist just the empty address then?  Rather than
everything?

-Philip



Re: How to whitelist_from ?

2006-08-25 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  

There's no way to whitelist just the empty address then?  Rather than
everything?

-Philip

  


Not given the simple file-glob format of the whitelist commands. You'd
need a regular expression and negation.

You could do it with a rule...

header __NULL_RETURN   From !~   /./i
header __RCVD_MYHOST   Received =~ /insert Received header regex matching your servers exchanging../
meta MY_NULL_RETURN   (__NULL_RETURN && __RCVD_MYHOST)


How about modifying the source to accept some sort of notation for an
empty address in whitelist_from_rcvd?

-Philip



Re: How to whitelist_from ?

2006-10-19 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  

There's no way to whitelist just the empty address then?  Rather than
everything?

-Philip

  


Not given the simple file-glob format of the whitelist commands. You'd
need a regular expression and negation.

You could do it with a rule...

header __NULL_RETURN   From !~   /./i
header __RCVD_MYHOST   Received =~ /insert Received header regex matching your servers exchanging../
meta MY_NULL_RETURN   (__NULL_RETURN && __RCVD_MYHOST)


  


It's not the From, but rather the EnvelopeFrom.

--- Mail/SpamAssassin/Conf/Parser.pm.bak2006-08-29 09:16:46.0 
-0600
+++ Mail/SpamAssassin/Conf/Parser.pm2006-10-19 20:44:18.0 -0600
@@ -631,6 +631,10 @@
   unless (defined $value && $value !~ /^$/) {
 return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
   }
+  # email from postmaster, abuse autoresponders, etc.
+  if ($value eq '') {
+return $conf->{parser}->add_to_addrlist ($key, '');
+  }
   $conf->{parser}->add_to_addrlist ($key, split (' ', $value));
 }


I tried the above fix, but it didn't work.

Not sure why...

-Philip





Re: How to whitelist_from ?

2006-10-19 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  

Matt Kettler wrote:

  


Philip Prindeville wrote:
 


  

There's no way to whitelist just the empty address then?  Rather than
everything?

-Philip

 
   

  


Not given the simple file-glob format of the whitelist commands. You'd
need a regular expression and negation.

You could do it with a rule...

header __NULL_RETURN   From !~   /./i
header __RCVD_MYHOST   Received =~ /insert Received header regex matching your servers exchanging../
meta MY_NULL_RETURN   (__NULL_RETURN && __RCVD_MYHOST)


 


  

It's not the From, but rather the EnvelopeFrom.
  


A rule matching header From should match any from like header,
including Return-Path.
  


Not sure I follow.

The From: header will be [EMAIL PROTECTED] or [EMAIL PROTECTED]
or something similar (depending on the agent).

The Sender (EnvelopeFrom) will be empty, however.  I believe that MdF
sticks that into the Return-Path: header.

-Philip

Unless you're calling SA before the return-path header is created, in
which case you can't match it with SA at all.
  

  



  




Re: Yerp connection issues

2010-05-26 Thread Philip Prindeville

On 5/26/10 11:06 AM, Mikael Syska wrote:

Hi,

On Wed, May 26, 2010 at 6:59 PM, Philip Prindeville
philipp_s...@redfish-solutions.com  wrote:
   

Anyone else seeing the following in their cron logs:

http: GET http://yerp.org:8080/rules/stage/330948267.tar.gz  request failed:
500 Can't connect to yerp.org:8080 (connect: Connection refused): 500 Can't
connect to yerp.org:8080 (connect: Connection refused)
 

Nope, same problem here on port 8080 ... but fine access on port 80.

Any reason why you are using 8080?
   



I've not modified any of the config files, so it's using whatever the 
Fedora 12 rpm has in /etc/mail/spamassassin/channel.d/* files.


-Philip


   


I'm running spamassassin-3.3.1-2.fc12.x86_64 on Fedora 12.




 

mvh
   




Re: SA 3.3.1 and NetAddr::IP 4.034

2010-10-31 Thread Philip Prindeville

On 10/29/10 9:18 AM, Michael Scheidell wrote:

On 10/29/10 12:11 PM, Mark Martinec wrote:

Sure, go ahead, can't hurt. The patch is now in the SA trunk.
Is it worth opening a ticket and putting it into the 3.3 branch too?

   Mark

looks like Freebsd ports has an older version, so it should be ok.

 pkg_info | grep NetAddr
p5-NetAddr-IP-4.02.8 Perl module for working with IP addresses and blocks thereo




You might be able to get better results with: Net-Patricia-1.18 which I 
released earlier this week.



Re: SA 3.3.1 and NetAddr::IP 4.034

2010-11-07 Thread Philip Prindeville

On 11/2/10 7:35 PM, Mark Martinec wrote:


One suggestion: currently it is not possible to store 0 and 1
as a data item associated with each net, because a 0 is treated
the same as undef and replaced by the key.

And the AF_INET6 argument to new() needs to be documented in a POD.

Thanks for your efforts!

  Mark


Try the following patch.  If it works for you, I'll rerelease as 1.19:

--- Patricia.pm.orig2010-10-23 17:26:03.0 -0600
+++ Patricia.pm 2010-11-07 21:36:30.0 -0700
@@ -216,7 +216,7 @@

 sub add {
   my ($self, $ip, $bits, $data) = @_;
-  $data ||= $bits ? "$ip/$bits" : $ip;
+  $data ||= defined $bits ? "$ip/$bits" : $ip;
   my $packed = inet_pton(AF_INET6, $ip) || croak("invalid key");
   $self->SUPER::_add(AF_INET6,$packed,(defined $bits ? $bits : 128), $data);
 }




Re: SA 3.3.1 and NetAddr::IP 4.034

2010-11-07 Thread Philip Prindeville

On 11/7/10 9:19 PM, Philip Prindeville wrote:


Try the following patch.  If it works for you, I'll rerelease as 1.19:


Actually, I released it as Net-Patricia-1.18_01



Re: SA 3.3.1 and NetAddr::IP 4.034

2010-11-08 Thread Philip Prindeville

On 11/2/10 8:14 PM, Mark Martinec wrote:

Btw, this could be more gracefully handled:

$ perl -e 'use Socket6; use Net::Patricia'
Prototype mismatch: sub main::AF_INET6: none vs ()
at /usr/local/lib/perl5/5.12.2/Exporter.pm line 64.

   Mark


That's someone else's bug:

https://rt.cpan.org/Public/Bug/Display.html?id=32362

and represents a defect in Socket6.  The work-around is to include Socket 
before Socket6.

-Philip



Re: SA 3.3.1 and NetAddr::IP 4.034

2010-11-08 Thread Philip Prindeville

On 11/8/10 5:58 PM, Mark Martinec wrote:

Philip,

Thanks for your off-list reply. Unfortunately I cannot
reply, as your mailer is refusing connections:

$ host -t mx redfish-solutions.com
  redfish-solutions.com mail is handled by 10 mail.redfish-solutions.com.
$ telnet -s mail4.ijs.si mail.redfish-solutions.com 25
  Trying 66.232.79.143...
  Connected to mail.redfish-solutions.com.
  554 mail.redfish-solutions.com ESMTP not accepting messages

(the message is now sitting in our queue, retrying periodically)

   Mark


Oh, sorry.  Fixed.



Yahoo webmail spam from Africa

2010-11-09 Thread Philip Prindeville

Has anyone else noticed that if they get a message with:

Received: from [41.184.9.153] by web80007.mail.sp1.yahoo.com via HTTP; Sat, 06 
Nov 2010 09:52:53 PDT



i.e. from the 41.0.0.0/8 CIDR block from Africa, and the transport was HTTP, to 
anything ending with yahoo.com that 100% of the time it's SPAM?

I see that Plugin/HeaderEval.pm contains:

  if ($rcvd =~ /by web\S+\.mail\S*\.yahoo\.com via HTTP/) { return 0; }


which is part of it.  And Message/Metadata/Received.pm contains:

# Received: from [193.220.176.134] by web40310.mail.yahoo.com via HTTP;
# Wed, 12 Feb 2003 14:22:21 PST
if (/ via HTTP$/ && /^\[(${IP_ADDRESS})\] by (\S+) via HTTP$/) {
  $ip = $1; $by = $2; goto enough;
}

(I note that HTTP$ seldom matches, by the way, since all of my examples have via 
HTTP;date instead.)
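
To make the idea concrete, here is a hedged Python sketch of the check I have in mind. It is not SA code: the regex is my own, written to accept "via HTTP;" followed by a date as well as "via HTTP" at end of line, and the 41.0.0.0/8 network is just the block from my own samples.

```python
import ipaddress
import re

# Value of a Received: header as Yahoo webmail writes it (header name stripped).
RECEIVED_RE = re.compile(
    r"^from \[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\] "
    r"by (?P<by>web\S+\.mail\S*\.yahoo\.com) via HTTP(?:;|$)"
)
SUSPECT_NET = ipaddress.ip_network("41.0.0.0/8")


def yahoo_http_from_suspect_net(received):
    """True if the message was submitted to Yahoo over HTTP from SUSPECT_NET."""
    m = RECEIVED_RE.match(received)
    if not m:
        return False
    try:
        ip = ipaddress.ip_address(m.group("ip"))
    except ValueError:   # e.g. an octet above 255
        return False
    return ip in SUSPECT_NET


hdr = ("from [41.184.9.153] by web80007.mail.sp1.yahoo.com via HTTP; "
       "Sat, 06 Nov 2010 09:52:53 PDT")
old = ("from [193.220.176.134] by web40310.mail.yahoo.com via HTTP; "
       "Wed, 12 Feb 2003 14:22:21 PST")

print(yahoo_http_from_suspect_net(hdr))
print(yahoo_http_from_suspect_net(old))
```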

Is it worth having an explicit rule for this?

Thanks,

-Philip





Re: SA and SELinux

2010-11-11 Thread Philip Prindeville

On 11/10/10 11:39 AM, John Williams wrote:

No, on my server I have a hard requirement to run SELinux.  I cannot turn that
off.  I find that when I enable SA with SELinux turned on, my CPU usage
skyrockets, eventually forcing my system to stop responding.  I've seen this thread
several times through Google searches; however, I did not find a solution.



I know that there are issues affecting Mimedefang, stopping it from working in 
SELinux.

They might be related.

https://bugzilla.redhat.com/show_bug.cgi?id=647587

Collect traces and attach them as a comment.



Deciphering the geography of Yahoo domains

2010-12-12 Thread Philip Prindeville

Like a lot of corporations, Yahoo! seems to apply their AUP based on the 
requirements of the jurisdiction in which they are operating.  I.e. for Taiwan 
they seem to be incredibly lax.

If that's the case, then we're happy to block Yahoo! except for North America 
(where we can pursue legal recourse if we need to).

I figured out that:

ird.yahoo.com = Ireland
tp2.yahoo.com = Taipei
sp2.yahoo.com = Spain

Anyone know what the entirety of domains are for Yahoo?

Thanks,

-Philip



Re: blacklist.mailrelay.att.net

2010-12-14 Thread Philip Prindeville

On 12/13/10 2:14 AM, Giampaolo Tomassoni wrote:

Le 12/12/2010 19:23, Giampaolo Tomassoni a écrit :

How does it work?

I just got blocked by the ATT's blacklist (in contacting ab...@att.com,
besides...), but I'm pretty sure my MX is not an open relay or other
kind of nifty thing.

Maybe ATT blocks whole address bunches from which some hosts are
spamming?

Because this could explain why: my MX is co-located...


$ host tomassoni.biz
tomassoni.biz has address 62.149.201.242
tomassoni.biz has address 62.149.220.102
tomassoni.biz mail is handled by 10 c0.edlui.it.

$ host c0.edlui.it
c0.edlui.it has address 62.149.220.102
c0.edlui.it has address 62.149.201.242

$ host 62.149.201.242
242.201.149.62.in-addr.arpa domain name pointer
host242-201-149-62.serverdedicati.aruba.it.

$ host 62.149.220.102
102.220.149.62.in-addr.arpa domain name pointer
host102-220-149-62.serverdedicati.aruba.it.

So both IPs use generic hostnames, which are a sign of half-configured
servers.

Unfortunately the RDNS is not under my control.

Which is a fact I share with a lot of people worldwide...



think as the receiving side. when I see spam out of joe.spam.example, I
blocklist spam.example (and possibly every IP and domain related to
them). If I see spam coming from host1-2-364.serverdedicati.aruba.it,
what will I blacklist?

I personally (and many serious blocklists) would block the single spamming
address. You may easily see that Aruba.it is a co-location provider, so you
may easily understand that different hosts from the same address bunch are
probably handled by different organizations, with different means and
purposes.

To me, it is counter-productive to block the whole bunch.

Giampaolo


I would strongly encourage your ISP to clean up their act by adding an 
excursion detection system, that watches for bursty outbound traffic patterns, 
like a sudden spike in outbound SMTP or HTTP connections to a wide spread of 
addresses.

-Philip



perl-Net-Patricia-1.19 is out

2010-12-14 Thread Philip Prindeville

It's been released for F13 and F14.  And of course, it's upstream on CPAN.

It's the promotion of the development version 1.18_81 to production.



Re: blacklist.mailrelay.att.net

2010-12-14 Thread Philip Prindeville

On 12/14/10 11:31 AM, Giampaolo Tomassoni wrote:

I would strongly encourage your ISP to clean up their act by adding an
excursion detection system, that watches for bursty outbound traffic
patterns, like a sudden spike in outbound SMTP or HTTP connections to a
wide spread of addresses.

Is Aruba.it so poorly reputed?

g


I can't speak for their reputation, but when an entire ISP's CIDR blocks get 
blacklisted (like we did with iWeb.ca) it's usually because they aren't very 
responsive in dealing with issues when they occur and not proactive about 
trying to prevent them.

-Philip



Re: DNSBL for email addresses?

2010-12-14 Thread Philip Prindeville

On 12/14/10 3:35 PM, Cedric Knight wrote:

On 14/12/10 14:28, Marc Perkel wrote:

Are there any DNSBLs out there based on email addresses? Since you can't
use an @ in a DNS lookup

Actually, you can use '@' in a lookup.  You just can't use it in a hostname.

Or you could convert the '@' to a '.' as is the format still used in SOA
records.


Not just SOA records, but the MB records were supposed to use this as well.  
They just never caught on.
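
For illustration, a small Python sketch of the '@'-to-'.' conversion, following the SOA convention of backslash-escaping any dots in the local part. The function name and the `emailbl.example` zone are hypothetical; I'm not aware of this matching the query format of any particular list.

```python
def email_to_dns_name(address, zone):
    """Turn user@domain into a DNS name under `zone`, SOA RNAME style."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % address)
    # Dots inside the local part must be escaped so they are not read
    # as label separators; backslashes are escaped first.
    escaped_local = local.replace("\\", "\\\\").replace(".", "\\.")
    return "%s.%s.%s" % (escaped_local, domain.rstrip("."), zone.strip("."))


print(email_to_dns_name("postmaster@example.com", "emailbl.example"))
print(email_to_dns_name("john.doe@example.com", "emailbl.example"))
```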

-Philip



Re: preventing authenticated smtp users from triggering PBL

2010-12-19 Thread Philip Prindeville

On 12/17/10 9:57 AM, Ted Mittelstaedt wrote:

On 12/17/2010 9:23 AM, Aaron Bennett wrote:

-Original Message- From: Ted Mittelstaedt
[mailto:t...@ipinc.net] Sent: Friday, December 17, 2010 12:20 PM
To: users@spamassassin.apache.org Subject: Re: preventing
authenticated smtp users from triggering PBL

why are you using authenticated SMTP from trusted networks?

The whole point of auth smtp is to come from UN-trusted networks.




I think you are misunderstanding.  I may be on an untrusted network,
but I want to send email through a host on a trusted network.  By
authenticating, I can.  It was the trusted host which authenticated
me, and thus SA needs to take that I was authenticated by a trusted
host into consideration before applying the PBL rule to the address
the mail initiated on.



Right, but a spammer can send a message with the same authenticated
flag set in the mail header through the standard SMTP port because
they are manufacturing the headers out of thin air.

My experience with SA is that if it sees that flag anywhere in the
header, it will assume the mail is safe.  I have also had the experience
with earlier versions of SA that they ignore the flag completely but
that was fixed a while ago.

Ted


I use Mimedefang with some home-brewed patches I've been trying to get David to 
include for the last 3+ years.

I look for the local port # that the connection comes in on, and pass it in to 
SpamAssassin as a hint from the command via --cf=...  And port 587 forces a 
different rule than 25 does.

This can't be forged.

-Philip



Re: Irony

2011-02-14 Thread Philip Prindeville

On 2/7/11 1:28 AM, Matus UHLAR - fantomas wrote:

On Tue, 1 Feb 2011 09:49:36 -0500
Michael Scheidell michael.scheid...@secnap.com  wrote:


because HELO doesn't match RDNS.

On 01.02.11 09:54, David F. Skoll wrote:

Rejecting on that basis would also cause tons of false-positives.

It's also a violation of all SMTP RFCs (former and current), because they
explicitly say that the receiver MUST NOT reject an SMTP session just because
the HELO string does not match the resolved FQDN.



Does anyone else reject messages where the rDNS maps to more than one PTR 
record?




Re: Chickenpoxed subjects

2011-11-08 Thread Philip Prindeville
On 10/20/11 8:24 PM, Adam Katz wrote:
 On 10/19/2011 04:43 AM, Mynabbler wrote:
 You are kidding, right? 50% of this crap comes from FREEMAIL
 addresses, and even more specific: 44% of this crap is delivered by
 aol.com.  The aol deliveries have about 85% unique from@aol
 addresses, so they pretty much 'own' aol.
 
 We're writing spam filters, not idiot filters.  The fact that there is
 so much overlap is often useful, but the overlap is not complete.  There
 is also a decent amount of overlap between the
 mostly-computer-illiterate and freemail users.  I think this drives your
 current line of thinking.
 
 There are a lot of people that do very spammy things.  It is a testament
 to SA and other filters that such non-spam doesn't so commonly flag as spam.
 

Sorry to come to the party late on this, was traveling a bit.

It seems to me that if you have lines like:

Subject: T R +A N/N!l :ES,  P \0 R  N
Subject: S C/H ,O 0=LG)l :R$L$S ) P -0 RN

Then the solution is to use agrep.  Make deletions of punctuation very low 
cost, as well as the usual transformations like:

0 = O
1 = l
$ = S
...

also be low-cost.  (Of course, then you end up with the possibility of clash 
between deleting $ and replacing it with 'S', but agrep is good about checking 
both)... then you just grep through a dictionary of the usual offenders:

lesbian
cash
meds
porn
...

I'm not familiar with perl-String-Approx...  reading up on it, it uses the 
Levenshtein distances just like agrep does... so it would be ideal for doing 
approximate matches.

http://search.cpan.org/~jhi/String-Approx-3.26/Approx.pm
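
To make the idea concrete, here is a rough Python sketch (not agrep and not String::Approx). It takes a shortcut: instead of weighting punctuation deletions and look-alike substitutions as cheap edits, it applies them up front as a normalization step and then runs a plain Levenshtein comparison against the word list. The substitution table, the word list and the function names are all illustrative assumptions, and the clash between deleting `$` and reading it as `S` is not handled: `$` always becomes `s` here.

```python
import re

# Look-alike substitutions from the post, plus a tiny word list.
# Both tables are illustrative placeholders, not a vetted rule set.
LOOKALIKES = str.maketrans({"0": "o", "1": "l", "$": "s"})
OFFENDERS = ["lesbian", "cash", "meds", "porn"]


def normalize(subject):
    """Lowercase, map look-alike characters to letters, drop all non-letters."""
    text = subject.lower().translate(LOOKALIKES)
    return re.sub(r"[^a-z]", "", text)


def edit_distance(a, b):
    """Levenshtein distance with unit cost per insert, delete or substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def find_offenders(subject, max_errors=1):
    """Return the listed words found in the subject within max_errors edits."""
    text = normalize(subject)
    hits = []
    for word in OFFENDERS:
        found = False
        # Compare against every window whose length is within max_errors
        # of the word length.
        for size in range(len(word) - max_errors, len(word) + max_errors + 1):
            for start in range(0, len(text) - size + 1):
                if edit_distance(text[start:start + size], word) <= max_errors:
                    found = True
                    break
            if found:
                break
        if found:
            hits.append(word)
    return hits


if __name__ == "__main__":
    print(find_offenders(r"T R +A N/N!l :ES,  P \0 R  N"))
```

Because all spacing is thrown away, this will also fire on words that merely straddle a word boundary, so anything built on it would want a low score.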

-Philip


No X-Spam- headers appearing

2013-09-26 Thread Philip Colmer
I've just installed SA 3.3.2 on an Ubuntu server to be used with Mailman,
using apt-get install spamassassin.

I've mostly followed the instructions in
http://www.jamesh.id.au/articles/mailman-spamassassin/ - I say mostly
because (a) it looks like James was using a different distro so the
configuration file is in a different place and (b) it looks like he was
using an older version because the configuration options have changed.

Anyhow, SA seems to be working nicely with Mailman, in that /var/log/syslog
is showing me things like:

Sep 26 12:57:43 ip-10-141-164-156 spamd[7659]: spamd: connection from
localhost [127.0.0.1] at port 58125
Sep 26 12:57:43 ip-10-141-164-156 spamd[7659]: spamd: using default config
for linaro-mm-sig: /var/lib/spamassassin/linaro-mm-sig.prefs/user_prefs
Sep 26 12:57:43 ip-10-141-164-156 spamd[7659]: spamd: checking message 
3535d648a078fd18d0cc6f13ea347...@rkmryshu.net for linaro-mm-sig:999
Sep 26 12:57:48 ip-10-141-164-156 spamd[7659]: spamd: identified spam
(10.7/5.0) for linaro-mm-sig:999 in 4.5 seconds, 11234 bytes.
Sep 26 12:57:48 ip-10-141-164-156 spamd[7659]: spamd: result: Y 10 -
DOS_OE_TO_MX,MIME_BASE64_BLANKS,NO_DNS_FOR_FROM,RCVD_IN_BRBL_LASTEXT,RCVD_IN_PBL,RCVD_IN_XBL,RDNS_NONE,WEIRD_QUOTING
scantime=4.5,size=11234,user=linaro-mm-sig,uid=999,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=58125,mid=
3535d648a078fd18d0cc6f13ea347...@rkmryshu.net,autolearn=no

However, the emails are NOT getting spam headers inserted, whether they are
ham or spam.

According to
http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.html:

Note that X-Spam-Checker-Version is not removable because the version
information is needed by mail administrators and developers to debug
problems. Without at least one header, it might not even be possible to
determine that SpamAssassin is running.

so I would, at the very least, expect X-Spam-Checker-Version to appear in
all emails. Furthermore, the documentation says:

Here are some examples (these are the defaults, note that Checker-Version
can not be changed or removed):

  add_header spam Flag _YESNOCAPS_
  add_header all Status _YESNO_, score=_SCORE_ required=_REQD_
tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_
  add_header all Level _STARS(*)_
  add_header all Checker-Version SpamAssassin _VERSION_ (_SUBVERSION_) on
_HOSTNAME_

Even though the documentation says these are the defaults, I've added them
anyway to /etc/spamassassin/local.cf and restarted spamd, but the headers
still aren't being inserted.

One blog I found (
http://blog.dmitryleskov.com/small-hacks/forcing-spamassassin-to-add-the-x-spam-status-header-to-ham-for-debugging/)
suggests that these headers are actually added by amavisd ... which I don't
have installed. However, the blog posting does say:

In other words, in configurations where SpamAssasin is controlled by
amavisd-new, the X-Spam- headers are actually added by the latter, and it
is amavisd-new that decides whether to add them.

so, since SA is *not* being controlled by amavisd-new on my system, I don't
think this applies anyway.

The same blog posting suggests that the X-Spam headers will only appear on
messages that have some score and not pure ham messages, but I've checked a
message that got a score of 10.7 and there are no headers in it.

What am I misunderstanding or what have I overlooked?

Thanks.

Philip


Re: No X-Spam- headers appearing

2013-09-26 Thread Philip Colmer
Thanks, Karsten, for your explanation. That makes sense and I'll have to
see whether the lack of headers is going to cause problems going forwards
or if looking in syslog will suffice.

Regards

Philip



On 26 September 2013 16:33, Karsten Bräckelmann guent...@rudersport.de wrote:

 On Thu, 2013-09-26 at 14:11 +0100, Philip Colmer wrote:
  I've just installed SA 3.3.2 on an Ubuntu server to be used with
  Mailman, using apt-get install spamassassin.
 
  I've mostly followed the instructions in
  http://www.jamesh.id.au/articles/mailman-spamassassin/  [...]

  However, the emails are NOT getting spam headers inserted, whether
  they are ham or spam.

 That's due to the Mailman filter in above reference.

 The Mailman filter uses the SYMBOLS spamd method, rather than PROCESS,
 which makes spamd return the status, score and a list of rules hit only.
 Unlike with PROCESS, the message itself is not returned, thus no X-Spam
 headers either.

 A closer look at the python code suggests the filter hardly even cares
 about the rules hit -- it just cares about the score to decide whether
 to pass, moderate or discard the message. That decision is passed back
 to the Mailman filter chain.


  What am I misunderstanding or what have I overlooked?

 Generally, the client (a mailman filter in your case) passes the message
 to spamd, which after processing returns the modified message. As you
 pointed out correctly, this modification by default adds some X-Spam
 headers, at the very least an X-Spam-Checker-Version header.

 The docs you cited apply to SpamAssassin. The method the Mailman filter
 uses is spamd specific, though, a daemon implementation using SA at its
 core.


 There's also a general misunderstanding here. While the client passes
 the message (a copy of the message, rather), it is up to the client what
 it does with the returned data -- including outright ignoring the result
 with the X-Spam headers added, and proceeding with (a copy of) the
 original message.

 In order to have the added X-Spam headers show up later, the client has
 to discard the original copy, and pass along the modified version as
 received from the daemon.
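
 For illustration only, a small Python sketch of the difference on the wire. It assumes the spamc/spamd text protocol (request line, Content-length header, blank line, message) and uses protocol version 1.5 as a guess at what a current client sends; the helper names are made up and this is not the Mailman filter's code. With SYMBOLS the reply body is just the list of rule names, while with PROCESS the reply body is the rewritten message, which is where the X-Spam headers live.

```python
def build_request(command, message, user=None):
    """Build a spamd request; message is the raw message as bytes."""
    lines = [f"{command} SPAMC/1.5", f"Content-length: {len(message)}"]
    if user:
        lines.append(f"User: {user}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + message


def parse_response(raw):
    """Split a spamd reply into (is_spam, score, threshold, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    is_spam = score = threshold = None
    # Skip the status line, then look for the "Spam:" response header.
    for line in head.decode("ascii", "replace").split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "spam":
            verdict, _, numbers = value.partition(";")
            is_spam = verdict.strip().lower() == "true"
            score, threshold = (float(n) for n in numbers.split("/"))
    return is_spam, score, threshold, body


if __name__ == "__main__":
    msg = b"Subject: hi\r\n\r\nbody\r\n"
    print(build_request("SYMBOLS", msg, user="linaro-mm-sig"))
    # Simplified, hand-written reply of the kind SYMBOLS produces:
    reply = b"SPAMD/1.1 0 EX_OK\r\nSpam: True ; 10.7 / 5.0\r\n\r\nRCVD_IN_PBL,RDNS_NONE"
    print(parse_response(reply))
```

 A client that wants the headers would send PROCESS instead and then forward the returned body in place of its original copy.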


 If you want the X-Spam headers, you will need a different Mailman
 filter / documentation to follow.


 --
 char *t=\10pse\0r\0dtu\0.@ghno
 \x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4;
 main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8?
 c=1:
 (c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0;
 }}}




Testing the _REMOTEHOSTNAME_ in a rule

2013-10-18 Thread Philip Prindeville
I'm trying to write a rule that gives some spamminess score to messages 
received from any host that resolves to protection.outlook.com.

I tried to use _REMOTEHOSTNAME_ to do this, but I think I got the header syntax 
wrong.

Can someone set me straight?

Thanks,

-Philip



Re: Testing the _REMOTEHOSTNAME_ in a rule

2013-10-21 Thread Philip Prindeville

On Oct 19, 2013, at 5:28 PM, Karsten Bräckelmann guent...@rudersport.de wrote:

 On Fri, 2013-10-18 at 18:34 -0600, Philip Prindeville wrote:
 I'm trying to write a rule that gives some spamminess score to messages
 received from any host that resolves to protection.outlook.com.
 
 I tried to use _REMOTEHOSTNAME_ to do this, but I think I got the
 header syntax wrong.
 
 Template Tags cannot be used in rules. What you're looking for is the
 X-Spam-Relays-External or -Untrusted pseudo-header.
 
  http://wiki.apache.org/spamassassin/TrustedRelays
 
 Run a sample through spamassassin -D and grep the debug output for the
 X-Spam-Relays headers. You'll likely want to match your rule against the
 rdns or helo values.
 
 To ensure matching against the very last untrusted relay, no closing
 square bracket may appear before the match.
 
  RULE_NAME  X-Spam-Relays-Untrusted =~ /^[^\]]+ rdns=evil.example.net /
 
 That rdns value is added to the Received header by your SMTP, and your
 MX actually should be listed as by value in that very [...] block.
 
 

Thanks.  By the way, in your example, the dots in evil.example.net need to be 
escaped, don't they?

I ended up going with:

header L_OUTLOOK    X-Spam-Relays-Untrusted =~ /^[^\]]+ rdns=[^ ]*\.(ptr|outbound)\.protection\.outlook\.com /
describe L_OUTLOOK  Anything coming from outlook.com
score L_OUTLOOK 4.95


and this seems to work.
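
For anyone who wants to poke at the pattern outside SA, here is a quick Python check. The sample pseudo-header strings are hand-made (hypothetical hostnames and ids, laid out the way the debug output usually shows relay blocks), so treat them as assumptions and substitute real `spamassassin -D` output before trusting the result.

```python
import re

# Same pattern as the L_OUTLOOK rule, written as a Python regex.
L_OUTLOOK = re.compile(
    r"^[^\]]+ rdns=[^ ]*\.(ptr|outbound)\.protection\.outlook\.com ")

# Hypothetical relay blocks; the first block is the last untrusted relay.
OUTLOOK_RELAY = ("[ ip=192.0.2.10 rdns=mail-xyz.outbound.protection.outlook.com "
                 "helo=mail-xyz.outbound.protection.outlook.com by=mx.example.org "
                 "ident= envfrom= intl=0 id=abc123 auth= msa=0 ]")
OTHER_RELAY = ("[ ip=198.51.100.7 rdns=relay.example.net helo=relay.example.net "
               "by=mx.example.org ident= envfrom= intl=0 id=def456 auth= msa=0 ]")


def hits_last_untrusted_relay(relays):
    """True only if the first [...] block carries an outlook.com rdns."""
    return L_OUTLOOK.search(relays) is not None


# Why the dots need escaping: an unescaped '.' matches any character.
LOOSE = re.compile(r"^[^\]]+ rdns=evil.example.net ")
STRICT = re.compile(r"^[^\]]+ rdns=evil\.example\.net ")

if __name__ == "__main__":
    print(hits_last_untrusted_relay(OUTLOOK_RELAY))
    print(hits_last_untrusted_relay(OTHER_RELAY + " " + OUTLOOK_RELAY))
    print(bool(LOOSE.search("[ ip=192.0.2.99 rdns=evil-example.net helo=x ]")))
```

The `[^\]]+` prefix is what pins the match to the first block: it cannot cross a closing bracket, so an outlook.com relay further back in the chain is ignored.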

-Philip



Re: Image spams getting thru

2006-10-30 Thread Philip Prindeville
Logan Shaw wrote:

[snip]
And there's also an easy way around it.  Simply add noise to
the image.  There are a number of techniques, but an obvious
one to use with GIF is to assign two palette entries to
two nearly (but not quite) identical colors.  For example,
put 0xffffff and 0xfffeff in your palette.  Then, for every
white pixel in the original image, choose at random whether it
gets represented by a 0xffffff or 0xfffeff pixel.  There will
be virtually no discernible difference to the eye, but the
files will be completely different, especially since GIF uses
LZW compression on the pixel data.

There are similar methods for other formats:  with JPEG, you
can just change the quality settings, causing the JPEG decoder
itself to add noise to your image.  (And perfectly legit noise,
too, since the quality parameters vary on legit images.)

And of course you can just add noise to the least significant
bit in any generic format as well.

   - Logan
  


If I could revisit this issue and be less sinister in doing so, I'm
trying to look at ways to generate a fingerprint from GIF stock
spams that could be used to filter them.

I'll need to reduce a large number of spam (no, I don't need any
extra, so don't bother forwarding them ;-)... and then do a stochastic
analysis of those parameters.
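
As a toy illustration of why an exact checksum is useless here and what a noise-tolerant fingerprint might look like, consider the Python sketch below. It works on already-decoded 24-bit pixel values (getting those out of a GIF is a separate problem), and the quantization step is just one assumed approach, not a tested spam fingerprint. All names are made up.

```python
import hashlib
import random


def exact_digest(pixels):
    """MD5 over raw 24-bit RGB values; any changed bit changes the digest."""
    data = b"".join(p.to_bytes(3, "big") for p in pixels)
    return hashlib.md5(data).hexdigest()


def coarse_digest(pixels, drop_bits=4):
    """MD5 after zeroing the low drop_bits of each colour channel."""
    mask = (0xFF << drop_bits) & 0xFF
    rgb_mask = (mask << 16) | (mask << 8) | mask
    return exact_digest([p & rgb_mask for p in pixels])


def add_palette_noise(pixels, rng):
    """Replace white pixels at random with the near-white 0xfffeff."""
    return [0xFFFEFF if p == 0xFFFFFF and rng.random() < 0.5 else p
            for p in pixels]


if __name__ == "__main__":
    original = [0xFFFFFF] * 64 + [0x000000] * 16
    noisy = add_palette_noise(original, random.Random(1))
    print(exact_digest(original) == exact_digest(noisy))
    print(coarse_digest(original) == coarse_digest(noisy))
```

Quantizing like this is fragile at bucket boundaries (0x0f and 0x10 land in different buckets), so a real fingerprint would need something smarter, such as the statistical approach described above.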

In the meantime, a couple of questions and observations...

First, CPAN seems to come up short on modules to parse and
decompose (and render!) GIF or PNG file formats. Most
disappointing. I finally decided on the now stagnant and
unsupported Image::Info module (sigh), but it doesn't
decompress that data once it deconstructs the GIF data stream
into its component parts.

I tried to use Compress::LZW to decompress the stream, but
that only seems to work on 12 or 16 bit minimum codesize,
whereas GIF images are routinely 4, 6, or 8 bits long.

Does anyone have a handle on what Perl modules to use for
dissecting GIF objects?

Thanks,

-Philip



Can't upgrade w/ RPM

2006-11-02 Thread Philip Prindeville
Hi.

I'm running FC3 on an AMD64 platform for my mail server,
and I had last installed SpamAssassin 3.1.5.  Well, I grabbed the
tarball for 3.1.7, and did a rpmbuild -tb ... of the tarball.

Worked fine.

Then I tried to upgrade via RPM:

# rpm -v -U 
/home/src/redhat/RPMS/x86_64/perl-Mail-SpamAssassin-3.1.7-1.x86_64.rpm
error: Failed dependencies:
perl-Mail-SpamAssassin = 3.1.5-1 is needed by (installed) 
spamassassin-3.1.5-1.x86_64


any ideas why this is happening and what the fix is?

-Philip




Re: Can't upgrade w/ RPM

2006-11-02 Thread Philip Prindeville
Jim Maul wrote:

Philip Prindeville wrote:
  

Hi.

I'm running FC3 on an AMD64 platform for my mail server,
and I had last installed SpamAssassin 3.1.5.  Well, I grabbed the
tarball for 3.1.7, and did a rpmbuild -tb ... of the tarball.

Worked fine.

Then I tried to upgrade via RPM:

# rpm -v -U 
/home/src/redhat/RPMS/x86_64/perl-Mail-SpamAssassin-3.1.7-1.x86_64.rpm
error: Failed dependencies:
perl-Mail-SpamAssassin = 3.1.5-1 is needed by (installed) 
 spamassassin-3.1.5-1.x86_64


any ideas why this is happening and what the fix is?

-Philip
 



You can't just upgrade one of the RPMs, you need to do them all at once.

spamassassin-3.1.5-1.x86_64 is using 
perl-Mail-SpamAssassin-3.1.5-1.x86_64.rpm so you can't upgrade one 
without the other.

-Jim
  


You're right.  Sorry, I spaced.  I figured that the RPM container
actually contained several modules, like zaptel does (it also contains
zaptel-devices, zaptel-libs, etc).

Is there any reason to not have a single container contain multiple
packages?  Since they do both need to be installed simultaneously?

-Philip



Microsoft blacklisted?

2006-11-13 Thread Philip Prindeville
I recently saw an email get bounced that was legitimately coming
from Microsoft:

Nov 13 14:59:26 mail mimedefang.pl[19053]: helo: maila.microsoft.com 
(131.107.115.212) said helo smtp.microsoft.com
Nov 13 14:59:26 mail sendmail[21067]: kADLxLLR021067: from=[EMAIL PROTECTED], 
size=1207, class=0, nrcpts=1, msgid=[EMAIL PROTECTED], bodytype=7BIT, 
proto=ESMTP, daemon=MTA-v4, relay=maila.microsoft.com [131.107.115.212]
Nov 13 14:59:29 mail mimedefang.pl[20521]: kADLxLLR021067: hits=6.909, req=5, 
names=DNS_FROM_RFC_ABUSE,DNS_FROM_RFC_POST,L_WIN_CHARSET
Nov 13 14:59:29 mail mimedefang.pl[20521]: 
MDLOG,kADLxLLR021067,spam,6.909,131.107.115.212,[EMAIL PROTECTED],[EMAIL 
PROTECTED],Out of Office: Software Development with Microsoft
Nov 13 14:59:29 mail mimedefang.pl[20521]: filter: kADLxLLR021067:  bounce=1 
discard=1
Nov 13 14:59:29 mail mimedefang[5737]: kADLxLLR021067: Bouncing because filter 
instructed us to
Nov 13 14:59:29 mail sendmail[21067]: kADLxLLR021067: Milter: data, reject=554 
5.7.1 Message rejected; scored too high on the Spam test.
Nov 13 14:59:29 mail sendmail[21067]: kADLxLLR021067: to=[EMAIL PROTECTED], 
delay=00:00:03, pri=31207, stat=Message rejected; scored too high on the Spam 
test.

I've put into my spamassassin/sa-mimedefang.cf file:

whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com


What am I missing at this point?

Does the 2nd arg to the whitelist_from_rcvd need to be
maila.microsoft.com instead?

And what do DNS_FROM_RFC_ABUSE and DNS_FROM_RFC_POST correspond to?
Where do I get the descriptions of these tests, why some sites get
tagged with them, etc?

-Philip






Re: Microsoft blacklisted?

2006-11-13 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  

I recently saw an email get bounced that was legitimately coming
from Microsoft:

Nov 13 14:59:26 mail mimedefang.pl[19053]: helo: maila.microsoft.com 
(131.107.115.212) said helo smtp.microsoft.com
Nov 13 14:59:26 mail sendmail[21067]: kADLxLLR021067: from=[EMAIL 
PROTECTED], size=1207, class=0, nrcpts=1, msgid=[EMAIL PROTECTED], 
bodytype=7BIT, proto=ESMTP, daemon=MTA-v4, relay=maila.microsoft.com 
[131.107.115.212]
Nov 13 14:59:29 mail mimedefang.pl[20521]: kADLxLLR021067: hits=6.909, req=5, 
names=DNS_FROM_RFC_ABUSE,DNS_FROM_RFC_POST,L_WIN_CHARSET
Nov 13 14:59:29 mail mimedefang.pl[20521]: 
MDLOG,kADLxLLR021067,spam,6.909,131.107.115.212,[EMAIL PROTECTED],[EMAIL 
PROTECTED],Out of Office: Software Development with Microsoft
Nov 13 14:59:29 mail mimedefang.pl[20521]: filter: kADLxLLR021067:  bounce=1 
discard=1
Nov 13 14:59:29 mail mimedefang[5737]: kADLxLLR021067: Bouncing because 
filter instructed us to
Nov 13 14:59:29 mail sendmail[21067]: kADLxLLR021067: Milter: data, 
reject=554 5.7.1 Message rejected; scored too high on the Spam test.
Nov 13 14:59:29 mail sendmail[21067]: kADLxLLR021067: to=[EMAIL PROTECTED], 
delay=00:00:03, pri=31207, stat=Message rejected; scored too high on the Spam 
test.

I've put into my spamassassin/sa-mimedefang.cf file:

whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com


What am I missing at this point?

Does the 2nd arg to the whitelist_from_rcvd need to be
maila.microsoft.com instead?

And what do DNS_FROM_RFC_ABUSE and DNS_FROM_RFC_POST correspond to?
  


postmaster and abuse lists at rfc-ignorant.org. Both are wildly prone to
false positives and have been removed from the 3.2 devel branch. They
effectively list sites that violate the RFCs for mail hosts and refuse
mail sent to postmaster or abuse.

That said, neither scores very high.. Assuming set3 (bayes and network)
the combined score in SA 3.1.x is only 1.908 points..

What's L_WIN_CHARSET.. that's not a stock rule I'm aware of. Looks like
an add-on to me, and probably the real culprit here. I found some
references to it from list conversations, and looks like it's trying to
match email with a windows-specific character set (windows-1252). But
it's not in any ruleset I can find anywhere.
  

Actually, it looks like a rule you yourself were developing back in
April.. What did you set the score to?
http://www.gossamer-threads.com/lists/spamassassin/users/72328

  



Yes, it's local.

I set it to 4.85.  Or maybe 4.99.

But why doesn't the whitelisting kick in?

Could it be because:

# nslookup 131.107.115.212
Server:         205.171.3.65
Address:        205.171.3.65#53

Non-authoritative answer:
212.115.107.131.in-addr.arpa    name = maila.microsoft.com.
212.115.107.131.in-addr.arpa    name = smtp.microsoft.com.
212.115.107.131.in-addr.arpa    name = mail1.microsoft.com.

Authoritative answers can be found from:
107.131.in-addr.arpa    nameserver = ns5.msft.net.
107.131.in-addr.arpa    nameserver = ns1.msft.net.
107.131.in-addr.arpa    nameserver = ns2.msft.net.
107.131.in-addr.arpa    nameserver = ns3.msft.net.
107.131.in-addr.arpa    nameserver = ns4.msft.net.
ns1.msft.net    internet address = 207.68.160.190
ns2.msft.net    internet address = 65.54.240.126
ns3.msft.net    internet address = 213.199.144.151
ns4.msft.net    internet address = 207.46.66.126
ns5.msft.net    internet address = 65.55.238.126

# 

(how hard can it be to follow $%^* RFC directions saying
only one PTR record per address)

What's the fix here?  Set the 2nd argument to the IP
address instead?  The man page doesn't suggest you can do that.

And I don't want to wildcard it as microsoft.com -- that's
way too many potential hosts.

-Philip




  

Where do I get the descriptions of these tests, why some sites get
tagged with them, etc?



  




Re: Microsoft blacklisted?

2006-11-14 Thread Philip Prindeville
SM wrote:

At 18:56 13-11-2006, Philip Prindeville wrote:
  

I recently saw an email get bounced that was legitimately coming


from Microsoft:

[snip]


  

I've put into my spamassassin/sa-mimedefang.cf file:

whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com


What am I missing at this point?

Does the 2nd arg to the whitelist_from_rcvd need to be
maila.microsoft.com instead?



Yes.

Regards,
-sm 

  


The problem with this is that the DNS returns the response (of the multiple
PTR records) in no particular order, so looking up the rDNS can return
one of three different names...

# nslookup
> set type=any
> server ns4.msft.net.
Default server: ns4.msft.net.
Address: 207.46.66.126#53
> 212.115.107.131.in-addr.arpa
Server:         ns4.msft.net.
Address:        207.46.66.126#53

212.115.107.131.in-addr.arpa    name = mail1.microsoft.com.
212.115.107.131.in-addr.arpa    name = smtp.microsoft.com.
212.115.107.131.in-addr.arpa    name = maila.microsoft.com.
 


So, if I put:


whitelist_from_rcvd [EMAIL PROTECTED] mail1.microsoft.com
whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com
whitelist_from_rcvd [EMAIL PROTECTED] maila.microsoft.com


will that work?  Or will each command clobber the previous one?

-Philip




Re: Microsoft blacklisted?

2006-11-14 Thread Philip Prindeville
SM wrote:

At 11:49 14-11-2006, Philip Prindeville wrote:
  

The problem with this is that the DNS returns the response (of the multiple
PTR records) in no particular order, so looking up the rDNS can return
one of three different names...

# nslookup
> set type=any
> server ns4.msft.net.
Default server: ns4.msft.net.
Address: 207.46.66.126#53
> 212.115.107.131.in-addr.arpa
Server:         ns4.msft.net.
Address:        207.46.66.126#53

212.115.107.131.in-addr.arpa    name = mail1.microsoft.com.
212.115.107.131.in-addr.arpa    name = smtp.microsoft.com.
212.115.107.131.in-addr.arpa    name = maila.microsoft.com.


So, if I put:

whitelist_from_rcvd [EMAIL PROTECTED] mail1.microsoft.com



Then use:

whitelist_from_rcvd [EMAIL PROTECTED] microsoft.com

Regards,
-sm 
  


Yeah, in an earlier message, I considered that, but didn't want to
leave myself wide open to every misbehaving host at Microsoft.

So I take it the short answer is that you can't have three entries for
the same mail address, and can't have multiple hostname args (which
you really should be able to do... or maybe even take an IP address
directly!).

-Philip



Re: Microsoft blacklisted?

2006-11-14 Thread Philip Prindeville
John D. Hardin wrote:

On Tue, 14 Nov 2006, Daryl C. W. O'Shea wrote:

  

Philip Prindeville wrote:



whitelist_from_rcvd [EMAIL PROTECTED] mail1.microsoft.com
whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com
whitelist_from_rcvd [EMAIL PROTECTED] maila.microsoft.com

will that work?
  

It should.
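
A toy Python model of why the three lines do not clobber one another, under the assumption that each whitelist_from_rcvd line adds one (address pattern, rDNS) pair and that the rDNS argument matches either the full hostname or a trailing domain part. The address pattern below is a hypothetical stand-in for the redacted one, and this is a sketch of the semantics, not SpamAssassin's code.

```python
from fnmatch import fnmatchcase


def rdns_matches(rdns, wanted):
    """True if rdns equals wanted or ends with '.' + wanted."""
    rdns, wanted = rdns.lower().rstrip("."), wanted.lower()
    return rdns == wanted or rdns.endswith("." + wanted)


def is_whitelisted(sender, rdns, entries):
    """Entries accumulate: any single matching pair is enough."""
    sender = sender.lower()
    return any(fnmatchcase(sender, pattern.lower()) and rdns_matches(rdns, wanted)
               for pattern, wanted in entries)


# Hypothetical address pattern; the hostnames are the three PTR names above.
ENTRIES = [
    ("*@microsoft.com", "mail1.microsoft.com"),
    ("*@microsoft.com", "smtp.microsoft.com"),
    ("*@microsoft.com", "maila.microsoft.com"),
]

if __name__ == "__main__":
    for name in ("mail1.microsoft.com.", "smtp.microsoft.com.", "maila.microsoft.com."):
        print(name, is_whitelisted("someone@microsoft.com", name, ENTRIES))
```

Whichever of the three PTR names the resolver happens to hand back, one of the entries matches, without opening the whitelist to every host under microsoft.com.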



A microsoft whitelist does appear in 70_sare_whitelist, though it does
trust all microsoft hosts rather than just the three listed above...

You might consider adding that ruleset.
  


Can't do that. Matter of principle: I'm tired of tacitly admitting that
they're the 800lb gorilla and they get to do whatever they please.

When '95 came out, I was willing to cut them some slack since this
whole Internetworking thing was new to them. That was 10 years
ago. Why they're still struggling to comply with standards I don't
know. It's not for lack of engineers.

-Philip



Re: ????? ??? ??????

2006-11-16 Thread Philip Prindeville
I would say that this issue in general (and this file in particular) is
more than overdue for a revisiting.

I haven't seen UCS, CP125?, or IBM852 for a long time.  Likewise
for UNICODE or XUNKNOWN.

As for ISO (tout court) from Magellan... that's broken, and if it
hasn't been fixed by now, then it's their problem, not ours.  Easier to
whitelist the few users still clinging to broken mailers than to
continue to compromise spamproofness.

As for Windows...  I would change the test from:

$cs =~ /^WINDOWS/

to:

$cs eq 'WINDOWS1252'

instead (the hyphen is gone by then, since the function strips everything
but A-Z and 0-9 from $cs first).  There is no reason to use any of the other
Windows character sets:  they offer nothing that UTF doesn't
already have.
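
A small Python port of the normalization steps at the top of is_charset_ok_for_locales (quoted further down) shows what the comparison actually sees. The `windows_ok` helper and its `strict` flag are my own illustration of the proposed change, not anything in SpamAssassin.

```python
import re


def normalize_charset(cs):
    """Mirror the Perl: uppercase, keep only A-Z and 0-9, trim known debris."""
    cs = re.sub(r"[^A-Z0-9]", "", cs.upper())
    cs = re.sub(r"^3D", "", cs)                 # broken by quoted-printable
    cs = re.sub(r":.*$", "", cs, flags=re.S)    # trim off multiple charsets
    return cs


def windows_ok(cs, strict):
    """Current behaviour (any WINDOWS*) versus the stricter proposal."""
    name = normalize_charset(cs)
    if strict:
        return name == "WINDOWS1252"
    return name.startswith("WINDOWS")


if __name__ == "__main__":
    print(normalize_charset("windows-1252"))
    print(windows_ok("Windows-1256", strict=False))
    print(windows_ok("Windows-1256", strict=True))
```

Since the hyphen is stripped before any test runs, the literal to compare against is the hyphen-free form.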

Being liberal in what you accept is good if interoperability is
your goal.  If security and integrity, however, are primal, then
being paranoid in what you accept might actually be more
appropriate.

Is there anyone out there (preferably in Central/Eastern Europe)
that handles a high volume of traffic that can tell us if
any of these encodings are still in legitimate use?  Like ISO10646
or UCS or ISO-8859-8 or CP125?, etc.

The alternative is to add checks per language for each of the
Windows-125[0-8] types.  Yes, you can encode English in
Windows-1256... but a sane mailer would detect that a message
all fits into 7-bits and use USASCII instead.

If it doesn't, then it's broken and needs to be fixed.

I'm not against reinventing the wheel when a new design is
offered that's better.  But I'm not convinced that Windows-1252
is an improvement over Latin-1.  For instance, the glyphs œ
and Œ aren't a unique letter:  they are a presentation (i.e.
ligature) that renders (displays) differently from writing o and
e separately... but it is in fact just the two letters o and e
that are being represented (similarly for ij in Dutch, etc)
without kerning between them.

The bottom line is you don't need specific characters for
oe and ij, etc.  You just need a rendering engine that
understands when using a ligature is appropriate (same
as with ss in German, or ff, fl, etc. in English).

Making these distinct characters was folly.

But I digress.

Just out of curiosity, what are the charsets_for_locale{'en'}
anyway?  If it were up to me, I'd limit it to USASCII,
ISO-8859-1, and UTF-8.  Period.

Likewise, for Japanese, how many UA's use anything other
than ISO2022JP?  This is the blessed standard.  Anything else
is out-of-date and requires a fix.

-Philip


Robert Nicholson wrote:

 so what is the conclusion to this issue?

 why when I set ok_locales to it then does it allow any Charset with
 Windows in the name
 to bypass that setting?

 Why is it that is_charset_ok_for_locales is written to give exceptions

 sub is_charset_ok_for_locales {
   my ($cs, @locales) = @_;

   $cs = uc $cs; $cs =~ s/[^A-Z0-9]//g;
   $cs =~ s/^3D//gs; # broken by quoted-printable
   $cs =~ s/:.*$//gs;# trim off multiple charsets, just use 1st

   study $cs;
   #warn JMD $cs;

   # always OK (the net speaks mostly roman charsets)
   return 1 if ($cs eq 'USASCII');
   return 1 if ($cs =~ /^ISO8859/);
   return 1 if ($cs =~ /^ISO10646/);
   return 1 if ($cs =~ /^UTF/);
   return 1 if ($cs =~ /^UCS/);
   return 1 if ($cs =~ /^CP125/);
   return 1 if ($cs =~ /^WINDOWS/);  # argh, Windows
   return 1 if ($cs eq 'IBM852');
   return 1 if ($cs =~ /^UNICODE11UTF[78]/); # wtf? never heard of it
   return 1 if ($cs eq 'XUNKNOWN'); # added by sendmail when converting to 8bit
   return 1 if ($cs eq 'ISO');   # Magellan, sending as 'charset=iso 8859-15'. grr

   foreach my $locale (@locales) {
 if (!defined($locale) || $locale eq 'C') { $locale = 'en'; }
 $locale =~ s/^([a-z][a-z]).*$/$1/;  # zh_TW... = zh

 my $ok_for_loc = $charsets_for_locale{$locale};
 next if (!defined $ok_for_loc);

 if ($ok_for_loc =~ /(?:^| )\Q${cs}\E(?:$| )/) {
   return 1;
 }
   }

   return 0;
 }

 On Nov 13, 2006, at 8:30 PM, Giampaolo Tomassoni wrote:

 # don't allow windows-1252 text attachments...
 mimeheader __CTYPE_MH_WIN1252   Content-Type =~ /charset=(\"windows-125[0-8]\"|windows-125[0-8])/i
 meta WIN_CHARSET    ((__CTYPE_MH_HTML || __CTYPE_MH_TEXT_PLAIN) && __CTYPE_MH_WIN1252)
 describe WIN_CHARSET    Content-Type is Windows-specific text
 score WIN_CHARSET   0.01





Re: ????? ??? ??????

2006-11-16 Thread Philip Prindeville
You'd think, wouldn't you

-Philip


Robert Nicholson wrote:

 This is Japanese

   # Japanese: Peter Evans writes: iso-2022-jp = rfc approved, rfc 1468, created
   # by Jun Murai in 1993 back when he didnt have white hair!  rfc approved.
   # (rfc 2237) -- by M$.
   'ja' => 'EUCJP JISX020119760 JISX020819830 JISX020819900 JISX020819970 '.
     'JISX021219900 JISX021320001 JISX021320002 SHIFT_JIS SHIFTJIS '.
     'ISO2022JP SJIS JIS7 JISX0201 JISX0208 JISX0212',

 Surely the MUA only changes the charset to Windows-1255 once it sees
 there are glyphs in which case you'd expect seldom to see Windows-1255
 when there are no glyphs present?

 On Nov 16, 2006, at 4:24 PM, Philip Prindeville wrote:

 Windows-1256... but a sane mailer would detect that a message

 all fits into 7-bits and use USASCII instead.





Re: ????? ??? ??????

2006-11-16 Thread Philip Prindeville
[EMAIL PROTECTED] wrote:

The bottom line is you don't need specific characters for
oe and ij, etc.  You just need a rendering engine that
understands when using a ligature is appropriate (same
as with ss in German, or ff, fl, etc. in English).

Making these distinct characters was folly.

But I digress.

  


Hi,

typography considers it a gross error to use ligature characters (fl) if they 
occur at the
boundary between word compounds. So either a rendering system has to be pretty 
smart, or the
transmitted text needs to be able to represent the ligature as well as the 
separate character.
This slightly resembles arabic languages where different glyphs are used for 
the same character
at the beginning of a word, in the middle, or at the end.
Of course, most email writers are not concerned about these fine details, and 
the company
behind the windows- charsets does not seem to understand kerning at all.

Wolfgang Hamann
  


You're right!  The rendering system does need to be pretty smart.

Unfortunately, few of them are.

But that's still no excuse to lobotomize character encodings.

The least offensive of all solutions would have been to create a
throw-away non-rendering character, like the non-break space,
that says, glue these two together as a ligature.  It would waste
a lot less of an already limited encoding space, too.

-Philip





Accurately deprecating charsets

2006-11-17 Thread Philip Prindeville
I'll ask again...  Can someone who handles a fair mix of
email content (i.e. not just western European languages)
do a triage (individually) of the rules below for ham versus
spam?

I'd suspect that very little genuine ham contains IBM852
or Unicode or CP12[0-8] these days.

Thanks,

-Philip



Robert Nicholson wrote:

 so what is the conclusion to this issue?

 why when I set ok_locales to it then does it allow any Charset with
 Windows in the name
 to bypass that setting?

 Why is it that is_charset_ok_for_locales is written to give exceptions

 sub is_charset_ok_for_locales {
   my ($cs, @locales) = @_;

   $cs = uc $cs; $cs =~ s/[^A-Z0-9]//g;
   $cs =~ s/^3D//gs; # broken by quoted-printable
   $cs =~ s/:.*$//gs;# trim off multiple charsets, just use 1st

   study $cs;
   #warn JMD $cs;

   # always OK (the net speaks mostly roman charsets)
   return 1 if ($cs eq 'USASCII');
   return 1 if ($cs =~ /^ISO8859/);
   return 1 if ($cs =~ /^ISO10646/);
   return 1 if ($cs =~ /^UTF/);
   return 1 if ($cs =~ /^UCS/);
   return 1 if ($cs =~ /^CP125/);
   return 1 if ($cs =~ /^WINDOWS/);  # argh, Windows
   return 1 if ($cs eq 'IBM852');
   return 1 if ($cs =~ /^UNICODE11UTF[78]/); # wtf? never heard of it
   return 1 if ($cs eq 'XUNKNOWN'); # added by sendmail when converting to 8bit
   return 1 if ($cs eq 'ISO');   # Magellan, sending as 'charset=iso 8859-15'. grr

   foreach my $locale (@locales) {
 if (!defined($locale) || $locale eq 'C') { $locale = 'en'; }
 $locale =~ s/^([a-z][a-z]).*$/$1/;  # zh_TW... = zh

 my $ok_for_loc = $charsets_for_locale{$locale};
 next if (!defined $ok_for_loc);

 if ($ok_for_loc =~ /(?:^| )\Q${cs}\E(?:$| )/) {
   return 1;
 }
   }

   return 0;
 }




Re: ??

2006-11-21 Thread Philip Prindeville
John D. Hardin wrote:

On Mon, 20 Nov 2006, twofers wrote:

  

I would like to know what local rule I could invoke to tag email that the 
subject is not in english.
   
  header   NOT_IN_ENGLISH Subject !~ /English/i
  describe NOT_IN_ENGLISH Subject Contains Non English Characters
  score NOT_IN_ENGLISH 3.5
   
  What regexp could I use?



I haven't tested this, but it may work:

header   NOT_IN_ENGLISH Subject =~ /[\x80-\xFF]{3}/

That should hit on a string of at least three characters with the high
bit set.

You may need to drop it down to {2} to get good detection.

Don't score it very high.
  


Of course, that would exclude messages with ISO Latin 1 (ISO-8859-1)
characters like Yen, Pound Sterling, Trademark, etc. Plus, there are
words in English that when properly written do contain accents,
such as résumé, daïs, cliché, coöperation, etc.

Excluding words with pounds and yen in the Subject line might be
a good thing, however...
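
For anyone who wants to see which subjects a pattern like that would hit, here is a small Python sketch that applies the same character class to raw bytes. It assumes the rule sees the undecoded bytes of the Subject, which may not be how SpamAssassin presents the header to a rule; high_bit_run is a made-up helper name.

```python
import re


def high_bit_run(subject_bytes, min_run=3):
    """True if the bytes contain min_run consecutive bytes with the high bit set."""
    pattern = re.compile(rb"[\x80-\xFF]{%d}" % min_run)
    return pattern.search(subject_bytes) is not None
```

Under that assumption, a Cyrillic subject in KOI8-R is one long run of high-bit bytes and hits at {3}. A lone Latin-1 pound sign or accented letter is a single high-bit byte and does not. The same accented letter in UTF-8 is two high-bit bytes, so it is missed at {3} but hit at {2}, which is worth keeping in mind before lowering the count.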

-Philip



Redundant QP encoding of Subject/From fields...

2006-11-21 Thread Philip Prindeville
I got the following spam.  I've included the header:

Return-Path: [EMAIL PROTECTED]
Received: from mail.libertysurf.net (webmail-out.libertysurf.net [213.36.80.105])
    by mail.redfish-solutions.com (8.13.8/8.13.7) with ESMTP id kAM1ckKs008704
    for [EMAIL PROTECTED]; Tue, 21 Nov 2006 18:38:52 -0700
Received: from aliceadsl.fr (192.168.10.57) by mail.libertysurf.net (7.1.026)
    id 43F3DDC5003935BF; Wed, 22 Nov 2006 02:22:49 +0100
Date: Wed, 22 Nov 2006 02:22:49 +0100
Message-Id: [EMAIL PROTECTED]
Subject: =?iso-8859-1?Q?Representative_Needed.?=
MIME-Version: 1.0
X-Sensitivity: 3
Content-Type: multipart/alternative;
    boundary=_=__=_XaM3_.1164158569.2A.498089.42.6019.52.42.007.3770
From: [EMAIL PROTECTED] [EMAIL PROTECTED]

My question is this.  The encoding of the Subject: and From: lines
is redundant.  There are no non-USASCII characters in either field.
Hence, specifying =?iso-8859-1?Q? is not necessary.

The test SUBJECT_EXCESS_QP seems to handle this (at least the Subject:
part).  I'd like to crank it up to 3.5 or higher.

Any intuitive reasons why this wouldn't work?  Are there any
valid mailers that are braindead?
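
As an illustration of the condition described here (an encoded-word whose payload is plain US-ASCII anyway), a rough Python sketch using the standard library's email.header.decode_header might look like this. It is not how SUBJECT_EXCESS_QP is implemented, and is_redundantly_encoded is a made-up name.

```python
from email.header import decode_header


def is_redundantly_encoded(raw_header):
    """True if the header uses RFC 2047 encoded-words but every one decodes to pure ASCII."""
    saw_encoded = False
    for payload, charset in decode_header(raw_header):
        if charset is None:
            continue  # unencoded text, nothing to judge
        saw_encoded = True
        if any(byte >= 0x80 for byte in payload):
            return False  # this encoded-word really did need encoding
    return saw_encoded
```

Run on the Subject above, `=?iso-8859-1?Q?Representative_Needed.?=`, this returns true; on a subject such as `=?iso-8859-1?Q?Caf=E9?=` it returns false, because the decoded payload contains a non-ASCII byte.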

Thanks,

-Philip




Re: Greylisting

2006-11-21 Thread Philip Prindeville
John Andersen wrote:

On Monday 20 November 2006 15:08, Rick Macdougall wrote:
  

It's possible that they could send it all twice but I've never seen it.
  Remember that some unbelievable number of infected Windows clients are
the main source of spam and it would just be too much trouble for the
spammer to try every address twice after a 15 minute interval.



Oh come on!  It costs the spammer NOTHING to make that adjustment
to his bot net.  It's someone else's bandwidth, and someone else's
CPU cycles.

They are reading this list and planning the changes already.

  


If the graylist time is 15 minutes (for instance), and someone
reports them fairly soon after they start up... and their ISP is
quick to shut them down (cough, cough), then we've managed
to severely limit how many sites they hit before they get
shut down.

Of course, graylisting with a larger delay (2 hours) for totally
unknown correspondents would be more effective.
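
For readers new to the idea, the mechanism under discussion can be sketched in a few lines of Python. This is only an illustration under assumptions: the key is the (IP, sender, recipient) triplet, timestamps are plain seconds, and the Greylist class and its method names are invented here rather than taken from any real greylisting package.

```python
class Greylist:
    """Toy greylist: tempfail a triplet until min_delay_s has passed since it was first seen."""

    def __init__(self, min_delay_s=15 * 60):
        self.min_delay_s = min_delay_s
        self.first_seen = {}  # (ip, sender, rcpt) -> time of first attempt

    def check(self, ip, sender, rcpt, now):
        key = (ip, sender.lower(), rcpt.lower())
        first = self.first_seen.setdefault(key, now)  # remember the first attempt only
        if now - first >= self.min_delay_s:
            return "accept"
        return "tempfail"
```

With the default 15-minute delay, a first attempt and a retry ten minutes later both get "tempfail", and a retry at fifteen minutes or later gets "accept". Passing min_delay_s=2 * 3600 models the longer delay suggested for unknown correspondents.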

-Philip



Braindeath in the Navy

2006-11-21 Thread Philip Prindeville
Well, I tried to tell some of the people responsible for
the servers below that what they were doing was broken,
including citing chapter and verse where in RFC 2822 the
syntax of the Received: lines is spec'd out:

Received: from Gate2-sandiego.nmci.navy.mil (gate2-sandiego.nmci.navy.mil [138.163.0.42])
    by mail.redfish-solutions.com (8.13.8/8.13.7) with ESMTP id kAGNLZHp020689
    for [EMAIL PROTECTED]; Thu, 16 Nov 2006 16:21:40 -0700
Received: from nawesdnims03.nmci.navy.mil by Gate2-sandiego.nmci.navy.mil
    via smtpd (for mail.redfish-solutions.com [71.36.29.88]) with ESMTP;
    Thu, 16 Nov 2006 23:21:40 +
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)

and which fields it requires (like the semi-colon followed by the
timestamp coming after a comment field) [cf: RFC 2822, section 3.6.7:

received        =   "Received:" name-val-list ";" date-time CRLF

name-val-list   =   [CFWS] [name-val-pair *(CFWS name-val-pair)]

including the definition of CFWS in 3.2.3.]
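
A rough way to check a Received: field body against the part of that grammar at issue here (a ";" followed by a parseable date-time) is sketched below in Python. It is deliberately loose: it splits on the last semicolon and leans on the standard library's email.utils.parsedate_tz rather than implementing the full ABNF, so a semicolon inside a trailing comment would fool it. The function name is made up.

```python
from email.utils import parsedate_tz


def received_has_timestamp(field_body):
    """True if the text after the last ';' parses as an RFC 2822 style date."""
    _, sep, tail = field_body.rpartition(";")
    if not sep:
        return False  # no ';' at all, so no date-time part
    return parsedate_tz(tail.strip()) is not None
```

The body "(private information removed)" fails this check because it has no semicolon, while an ordinary line ending in "; Thu, 16 Nov 2006 16:21:40 -0700" passes.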

It just boggles my mind why anyone would go through that much trouble
to deliberately damage a header line, rather than just delete it.

Well, maybe they'll get a whiff of the error of their ways in the
Hall of Spam Shame...

-Philip




Re: Greylisting

2006-11-22 Thread Philip Prindeville
Don't they?  I thought the recommended retry time was 2 minutes,
doubling on each failure, and maxing out at 2 hours.

That's what sendmail does (unless its retry time has been explicitly
set to more than 2 hours, of course).
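
To see what that schedule would mean against a greylist, here is a short Python sketch of the backoff exactly as described in this message (2 minutes, doubling, capped at 2 hours), with the 5-day give-up time mentioned in the quoted text below as the horizon. The numbers are taken from the thread, not checked against sendmail's actual defaults, and the function names are invented.

```python
def retry_times(initial_s=120, cap_s=7200, horizon_s=5 * 24 * 3600):
    """Yield seconds since the first attempt at which each retry happens."""
    elapsed, interval = 0, initial_s
    while elapsed + interval <= horizon_s:
        elapsed += interval
        yield elapsed
        interval = min(interval * 2, cap_s)  # double, but never wait more than the cap


def first_retry_at_or_after(delay_s, **schedule):
    """Time of the first retry that would clear a greylist delay of delay_s seconds."""
    for t in retry_times(**schedule):
        if t >= delay_s:
            return t
    return None
```

On this schedule the retries land at 2, 6, 14, 30, 62 and 126 minutes after the first attempt, so a 15-minute greylist is cleared at the 30-minute retry and a 2-hour greylist at the 126-minute one.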

-Philip


Richard Frovarp wrote:

I don't think the RFCs specify any time limit. Most time out after 5 days
of trying. We run 3 equivalent scanning machines, which requires us to
run a greylisting setup that syncs between them. That could cause a large
delay if the sending machine tries to send to a different host that
isn't synced yet. Messages that aren't always sent from the same machine (SMTP
farms like at GMail) can cause trouble as well, since the IP will
change. The whitelist entry will usually time out after a period of time, so
a delay may be induced again in the future, but that
depends on setup.

If a sensitive piece of mail needs to get through, it may be possible 
for the user to send the message again after the delay period has 
elapsed. This would be a new message, but if it leaves the same IP, with 
the same from and to pair (or however your greylisting works), it would 
fire right on through the greylist no problem. Not a perfect solution, 
but should work for rare occasions.

One probably can whitelist recipients or recipient domains so that they 
are not affected by greylisting.

Last week greylisting stopped 1.3 million messages, which is after the 
blacklists and greet pause did their significant work.

Richard
  




Re: Braindeath in the Navy

2006-11-23 Thread Philip Prindeville
Jonas Eckerman wrote:

Philip Prindeville wrote:

  

Received: (private information removed)



  

It just boggles my mind why anyone would go through that much trouble
to deliberately damage a header line, rather than just delete it.



The only reason I can think of for that (in this case) is that they want to
keep those headers in order for server hop counting to work.

Of course, it would be much better (and more useful) if they kept the time 
stamp and obfuscated the headers without breaking the format. :-/

/Jonas
  


Sure.

They could replace it with either:

Received: (deleted); timestamp-here


and still be in compliance, or rewrite it:

X-Header-Rewrite: Received-obfuscated-for-no-obvious-purpose=3


Or any number of other imaginative yet-ultimately-misguided but
still-in-compliance-with-applicable-standards ways.
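
The first option can be sketched in Python as a filter that throws away everything before the last semicolon and keeps the timestamp. This is only an illustration of the idea; obfuscate_received is a made-up name, and splitting on the last ";" assumes the date part itself contains no semicolon.

```python
def obfuscate_received(field_body, note="deleted"):
    """Replace the name-val-list with a comment, keeping the ';' and the date-time."""
    _, sep, date_part = field_body.rpartition(";")
    if not sep:
        raise ValueError("no ';' date-time part to preserve")
    return "(%s); %s" % (note, date_part.strip())
```

Given "from a.example by b.example with ESMTP; Thu, 16 Nov 2006 23:21:40 +0000" it returns "(deleted); Thu, 16 Nov 2006 23:21:40 +0000", which hides the hosts but still lets hop counting and timestamp checks work.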

-Philip




List weirdness

2006-11-23 Thread Philip Prindeville
I'm seeing the following (attached).

I went back and looked at the message that seems to have
provoked it, and there was nothing odd about the message:
no attachments, nothing but text/plain 7-bit, in the body
(though it's weird that it's a 7-bit body, but charset=iso-8859-1).

Is this a lurking ratware writer?  Who on this list runs Exchange?

Why is this bouncing back to me, and not the envelope sender,
which was:

Return-Path: [EMAIL PROTECTED]


-Philip


---BeginMessage---
Subject of the message: Redundant QP encoding of Subject/From fields...
Recipient of the message: SpamAssassin Users
---End Message---


Re: Interesting text content in the new spams

2006-11-23 Thread Philip Prindeville
Charlie Clark wrote:

Looks like there are some pretty impressive self-learning systems out  
there. I'm enclosing the content of the text part of a new spam. I  
think it's quite an interesting vocabulary that they are using,  
presumably from their own trained ham database. This spam got through  
four different checks (postfix + blacklisting, spamassassin,  
spambayes and Opera's own spam system)! Give them a couple of years
and we can finally close slashdot et al. and actually start reading
this stuff! ;-)

Charlie

Raquo Areas Bugs. Open total a bug Tracking Support or Requests in  
Tech Patches.
Release archive is raquo of Areas?
Framework gd Engine Details Developers Beta Intended Audience. In  
Create Newscreate Farm Mapcreate or Projectnew am Wantedmy?  
Statistics currently Browse Most!
Of feeds available for this About by or the from. Activity Percentile  
last week View list of feeds available is.
Language a License gnu of. Patches Patch Feature a Request. Details  
Developers Beta Intended Audience Education Technology.
Education Technology or Other Topic English Unix name Registered.  
Language License gnu?
Va Software Ostg Source Group all Rights Reserved or Find.
Projectnew Wantedmy Statussite is.
Areas in Bugs open total bug Tracking Support. Va Software Ostg  
Source Group all Rights Reserved or Find.
Bug or Tracking Support Requests or Tech Patches am Patch in.
Audience or Education Technology Other Topic English Unix.
Support in Requests Tech Patches Patch Feature Request. Kolmafia sw  
Test Automation Framework gd. System of os Written an language of  
License gnu General Public.
License gnu General Public gpl. Create Newscreate is Farm of  
Mapcreate Projectnew am Wantedmy Statussite Status web!
Sprites a Release archive raquo of Areas Bugs?
Open total a bug Tracking Support or Requests in Tech Patches. Book  
Search is Advanced log in Create is. Va Software Ostg Source Group in  
all Rights.
Latest a News new or Graphics and Sprites Release archive. Va  
Software Ostg Source Group in all Rights.
Intended Audience Education.

--
  


I hear the New York Times isn't too picky about who they hire.

Someone could create an army of ghost writers and sit back and
collect the paychecks.

-Philip



Re: Interesting text content in the new spams

2006-11-23 Thread Philip Prindeville
Given that spammers read this list to figure out how to defeat us...
Why don't we just secure a copy of ratware and engineer a retro-virus
for it?

-Philip


Justin Mason wrote:

There was a very interesting project described at CEAS which did
just this -- engaged 419ers and other spammers in negotiation,
to waste their time.  It's a great idea!

--j.

[EMAIL PROTECTED] writes:
  

Hi,

anybody recall that ELIZA program from ages ago? It would be interesting to
see her response to those utterances :)

Wolfgang Hamann



Looks like there are some pretty impressive self-learning systems out
there. I'm enclosing the content of the text part of a new spam. I
think it's quite an interesting vocabulary that they are using,
presumably from their own trained ham database. This spam got through
four different checks (postfix + blacklisting, spamassassin,
spambayes and Opera's own spam system)! Give them a couple of years
and we can finally close slashdot et al. and actually start reading
this stuff! ;-)

Charlie

Raquo Areas Bugs. Open total a bug Tracking Support or Requests in
Tech Patches.
Release archive is raquo of Areas?
Framework gd Engine Details Developers Beta Intended Audience. In
Create Newscreate Farm Mapcreate or Projectnew am Wantedmy?
Statistics currently Browse Most!
Of feeds available for this About by or the from. Activity Percentile
last week View list of feeds available is.
Language a License gnu of. Patches Patch Feature a Request. Details
Developers Beta Intended Audience Education Technology.
Education Technology or Other Topic English Unix name Registered.
Language License gnu?
Va Software Ostg Source Group all Rights Reserved or Find.
Projectnew Wantedmy Statussite is.
Areas in Bugs open total bug Tracking Support. Va Software Ostg
Source Group all Rights Reserved or Find.
Bug or Tracking Support Requests or Tech Patches am Patch in.
Audience or Education Technology Other Topic English Unix.
Support in Requests Tech Patches Patch Feature Request. Kolmafia sw
Test Automation Framework gd. System of os Written an language of
License gnu General Public.
License gnu General Public gpl. Create Newscreate is Farm of
Mapcreate Projectnew am Wantedmy Statussite Status web!
Sprites a Release archive raquo of Areas Bugs?
Open total a bug Tracking Support or Requests in Tech Patches. Book
Search is Advanced log in Create is. Va Software Ostg Source Group in
all Rights.
Latest a News new or Graphics and Sprites Release archive. Va
Software Ostg Source Group in all Rights.
Intended Audience Education.

--
Charlie Clark
Helmholtzstr. 20
Düsseldorf
D- 40215
Tel: +49-211-938-5360
GSM: +49-178-782-6226








