Re: Funny spamd failure... (Maybe SARE/rules-du-jour related?)

2006-11-17 Thread Peter H. Lemieux

Giampaolo Tomassoni wrote:

# Check for amavis termination
while [[ ! -z ${PIDS} ]]; do
    sleep 1
    PIDS=$( /sbin/pidof ${AMV_NM} )
done


In cases like this I usually just put the sleep command in the init 
script like this:


...

case $1 in
...

  restart|reload)
        stop
        sleep 5     # give the daemon time to exit before starting again
        start
        RETVAL=$?
        ;;

...

I'm not a Gentoo user, though, so YMMV.  I'm using RedHat/CentOS.  Still 
I'd bet the init scripts aren't that different.
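
The difference between the two approaches above (polling until the daemon is gone versus a fixed sleep) is mostly in how long you wait. As a rough illustration only, here is a Python sketch of polling with an upper bound; wait_for_exit and is_running are hypothetical names, not part of any init system:

```python
import time


def wait_for_exit(is_running, timeout=5.0, interval=0.1):
    """Poll is_running() until it returns False or `timeout` seconds pass.

    Returns True if the process went away in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while is_running():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True


# Hypothetical stand-in for "pidof amavisd": running twice, then gone.
answers = iter([True, True, False])
stopped = wait_for_exit(lambda: next(answers), timeout=2.0, interval=0.01)
print(stopped)
```

The bound keeps a restart from hanging forever if the daemon never exits, which the plain while loop does not guard against.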


Peter


more ascii art spam

2006-11-17 Thread Peter H. Lemieux
I just got a new one with the usual drugs displayed in large ASCII art. 
It was nearly unreadable, and it didn't pass my SA checks either.


Peter



MailScanner not using /usr/share/spamassassin?

2006-11-16 Thread Peter H. Lemieux
OK, I've ransacked mailing lists for over an hour now and have yet to 
find an answer to this question.


Until a couple of months ago I was running SA 2.64 under MailScanner 
4.36.4, both installed from RPMs on a RedHat 7.3 system.  I've been 
migrating to a CentOS 4.4 box running SA 3.1.7 and MailScanner 4.56.8, 
both again installed from RPMs.


I began to get suspicious about the new installation when I ran a couple 
of spams through spamassassin -t.  Rules like DATE_IN_PAST showed up in 
those tests, but they didn't get tripped when the message was scanned by 
SA under MailScanner.  It looked as though MailScanner was simply 
ignoring the default rules in /usr/share/spamassassin.  A few scans of 
maillog for some of the default rule names didn't show any hits over a 
period of weeks.  For instance, there are no log entries in the new 
installation for commonly-hit rules like 'HTML_[0-9]+' or 'DATE_IN'.
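
The log check described above amounts to counting rule names that match a pattern. A minimal Python sketch of that idea, assuming the rule names appear verbatim in each log line; the sample lines are made up and do not reflect the exact MailScanner log format:

```python
import re
from collections import Counter

# Same patterns as in the post: HTML_<digits>... and DATE_IN...
RULE_PATTERN = re.compile(r"\b(?:HTML_[0-9]+\w*|DATE_IN\w*)")


def count_rule_hits(lines):
    """Count how often each matching rule name appears in the log lines."""
    hits = Counter()
    for line in lines:
        hits.update(RULE_PATTERN.findall(line))
    return hits


# Hypothetical log lines, for illustration only.
sample_log = [
    "Nov 16 10:00:01 mx MailScanner[123]: spam (HTML_10_20 1.40, DATE_IN_PAST_06_12 1.07)",
    "Nov 16 10:00:05 mx MailScanner[123]: spam (URIBL_SBL 1.64, MIME_HTML_ONLY 0.39)",
]
print(count_rule_hits(sample_log))
```

An empty result over weeks of logs is what suggests the default rules are not being loaded.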


Except... I do get hits for the URIBL rules in 
/usr/share/spamassassin/20_dnsbl_tests.cf.  A locate dnsbl search 
doesn't turn up any other copies of these rules in the directory tree.


So I tried an experiment based on the approach described in 
http://article.gmane.org/gmane.mail.virus.mailscanner/4499 of running 
spamassassin -D --lint -C /etc/MailScanner/spam.assassin.rules.conf and 
the output showed that only /etc/mail/spamassassin was used.  I also get 
a lot of non-existent rule errors that don't appear if I just run 
spamassassin -D --lint.  Running a normal lint without the config file 
specified shows rules being read from both /etc/mail/spamassassin and 
/usr/share/spamassassin.


I've looked all over the system to see if I can find some setting that 
differentiates between these two situations.  I've even tried it with an 
empty /etc/MailScanner/spam.assassin.rules.conf.  I've looked in places 
like /root/.spamassassin, /var/spool/MailScanner/spamassassin, and the 
like, and can't find anything that would divert SA from using 
/usr/share/spamassassin when invoked by MailScanner.


I read a bunch of postings to the MailScanner list and found nothing 
helpful.  My next step is to run MailScanner in debugging mode, I guess, 
but I'd prefer not to have to interrupt production.  If any of you have 
any clues about what my problem is, I'd appreciate it.  If not, I'm off to 
debugging land.


Peter


Spam surge tied to SpamThru Trojan botnet

2006-11-16 Thread Peter H. Lemieux

From this article at eWeek:
http://www.eweek.com/print_article2/0,1217,a=194218,00.asp

The recent surge in e-mail spam hawking penny stocks and penis 
enlargement pills is the handiwork of Russian hackers running a botnet 
powered by tens of thousands of hijacked computers.


Internet security researchers and law enforcement authorities have 
traced the operation to a well-organized hacking gang controlling a 
70,000-strong peer-to-peer botnet seeded with the SpamThru Trojan.


Peter




Re: RelayChecker 0.3

2006-11-16 Thread Peter H. Lemieux

Billy Huddleston wrote:

Reverse DNS is a must. I'm surprised at how many people still haven't
got that yet in the IT world.. (Consultants mostly..)


It's not uncommon outside the industrialized world.  Last few days I got
a few false positives for a client that was corresponding with folks in
the Caribbean.

One of the few services I believe AOL provided the rest of us was 
deciding a few years back not to accept mail from servers without 
reverse DNS.  Suddenly lots of admins had to deal with the problem of 
correct server configuration because you couldn't fail to deliver mail to 
the millions of AOL users worldwide.


Peter



Re: Where to submit SARE rule patches?

2006-11-15 Thread Peter H. Lemieux

Karl Auer wrote:

On Tue, 2006-11-14 at 09:58 -0500, Peter H. Lemieux wrote:

 body  __HAS_PENETRATION   /\bpenetration\b/i


I think a lot of rules would be better for losing the word boundaries.
Very few of the worst four letter words, are ever legitimate
substrings, either.


I generally agree, Karl.  In this particular instance I was suggesting a 
patch to the 70_sare_adult ruleset and was following the patterns the 
maintainer used for similar rules.


OTOH, I've had FP problems with simple word searches that don't include 
word boundaries.  A word like "sex" can match "sextuplets" or 
"Middlesex".  (The latter case brought this quickly to my attention some 
years ago when I first started writing my own SA rules.  Middlesex is a 
county here in Massachusetts.)  It's often hard to imagine all the 
possible false positives that might arise from a particular string, so I 
can understand why the publicly-distributed rulesets like those from SARE 
are so careful about word boundaries.
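
A quick way to see the effect of the word boundaries is to try both forms of the pattern on the examples above. This is only a sketch using Python's re module; SA itself uses Perl regexes, but \b behaves the same way for simple cases like these:

```python
import re

bounded = re.compile(r"\bsex\b", re.IGNORECASE)    # with word boundaries
unbounded = re.compile(r"sex", re.IGNORECASE)      # plain substring match

for text in ["Middlesex County", "sextuplets", "sex"]:
    print(text, bool(bounded.search(text)), bool(unbounded.search(text)))
```

Only the bare word matches the bounded pattern, while the unbounded one hits all three strings.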


Peter


Disclaimer of the month

2006-11-15 Thread Peter H. Lemieux

For your amusement.  A spam arriving here today from Taiwan reads:

Dear Sir/Madam,

We learnt your e-mail add.from internet.

FIRST OF ALL,PLEASE KINDLY NOTE THIS E-MAIL IS SENT BY
OUR ADVERTISING COMPANY AND THE E-MAIL ADDRESS IS
NOT REAL(VIRTUAL),THEREFORE,PLEASE CONTACT US
VIA FAX  OR POST.DON'T DIRECTLY RESPONSE VIA  E-MAIL
BECAUSE WE CAN'T RECEIVE YOUR E-MAIL.
IF YOU WANT TO BE REMOVED FROM THE LIST,PLEASE ADVISE
YOUR E-MAIL ADDRESS  THIS E-MAIL CONTENT OR SUBJECT VIA FAX OR POST.

Wow, I wonder how many people will want to communicate with them.  I 
guess they missed the part in marketing class about impulse buying.


The From address was [EMAIL PROTECTED], which is a legitimate domain.  A 
visit to the "Contact Us" page at www.parts.com reveals this:


Parts.com is to be used as a parts portal. We are a software development 
company that provides online software solutions. We do not carry, stock 
or supply parts. If you are looking for a part, please contact a supplier 
under that part category. If your business is looking for an online 
e-commerence [sic] solution, please let us know.


I don't think I'll be contacting them any time soon!

Peter


PS:  To top it all off, the end of the spam message has this amusing 
tidbit:  "Please directly push the button to send your fax message out,
don't pick up the phone."



---BeginMessage---
Dear Sir/Madam,

We learnt your e-mail add.from internet.
 
FIRST OF ALL,PLEASE KINDLY NOTE THIS E-MAIL IS SENT BY
OUR ADVERTISING COMPANY AND THE E-MAIL ADDRESS IS
NOT REAL(VIRTUAL),THEREFORE,PLEASE CONTACT US
VIA FAX  OR POST.DON'T DIRECTLY RESPONSE VIA  E-MAIL
BECAUSE WE CAN'T RECEIVE YOUR E-MAIL.
IF YOU WANT TO BE REMOVED FROM THE LIST,PLEASE ADVISE
YOUR E-MAIL ADDRESS  THIS E-MAIL CONTENT OR SUBJECT VIA FAX OR POST.

We are the professional product designer,mold  die maker,machinery builder
and molded parts(moldings) supplier for the following parts:

* Product design(from simply 3D model creating to whole project sub-contracting)
* Prototype making( mock up)
* Molds  Dies(sheet metal stamping die including single-staged  progressive 
dies,
   plastic injection molding molds,zinc or aluminum high-pressure die casting 
dies 
   rubber compression or injection molding molds)
* Laser cutting  CNC folding(suitable for small quantity,NO TOOLING(DIE) 
needed.
* Sheet Metal Stampings.
* Castings(sand castings)aluminium
* (High-Pressure) Die Casting for Zinc or aluminium
* Plastic Injection moldings.
* Oil Seals  other Rubber Moldings(both for industrial or general uses).
* Various Magnets.
* Machinings(Machined parts)
* Assembled unit(components assembled)
* Plastic Injection machines  Rubber Injection machines, 
   other related injection molding machines,custom-built machineries especially
   in connection with injection molding  hydraulic(oil)/pneumatic operating,
   whole plant export including know-how and hydraulic(oil)/pneumatic 
engineering consultation.

SMALL ORDER IS OK,PLEASE CONTACT US TO SAVE YOUR COST!

Thank you

Best Regards
Robert Lin
P.O.Box 1-120 Yung-Ho,Taipei Hsien,Taiwan
Fax: 886-4-8783310 (886 is the country code) 
NOTE:
Please directly push the button to send your fax message out,
don't pick up the phone. 




Developing Japan markets.doc
Description: Binary data
---End Message---


Re: different threshold for one address

2006-11-15 Thread Peter H. Lemieux

Jean-Paul Natola wrote:

My goal is to is have one email address bounces@ , which can have a different
score threshold than the system- in other words ,  anything that now comes in
and scores higher than 6.0  is considered spam and rejected- I would like to
have  bounces@ set to lets say  12.0


Create a file /etc/mail/spamassassin/whitelist.cf that contains this rule:

header   TO_BOUNCES  To =~ /bounces\@/i
describe TO_BOUNCES  Whitelist mail to bounces mailbox
score    TO_BOUNCES  -6

Now messages arriving for bounces start at -6, so they'd need 12 SA 
points to reach your threshold of +6.
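
The arithmetic behind that, as a tiny Python sketch (points_needed is just an illustrative helper, not anything in SA):

```python
def points_needed(threshold, starting_score):
    """Points other rules must add before the message reaches the threshold."""
    return threshold - starting_score


print(points_needed(6.0, -6.0))  # mail to bounces@, which starts at -6
print(points_needed(6.0, 0.0))   # everything else
```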


If you only want to whitelist [EMAIL PROTECTED], you might want instead 
to use /[EMAIL PROTECTED]/ in the rule above.


Peter



Re: Disclaimer of the month

2006-11-15 Thread Peter H. Lemieux

Peter H. Lemieux wrote:

For your amusement.  A spam arriving here today from Taiwan reads:


Sorry, I didn't intend to attach the whole message.

Peter




Re: different threshold for one address

2006-11-15 Thread Peter H. Lemieux

Jean-Paul Natola wrote:

I currently use the local.cf for whitelisitng located in
/usr/local/etc/mail/spamassassin

Is it ok to create that rule in that file?


SA reads rules from any *.cf files it finds in ../etc/mail/spamassassin. 
 Since I have dozens of custom rules, I find it easier to organize them 
into separate files by type of rule rather than putting everything into 
local.cf.  Either way works fine.


Peter



Where to submit SARE rule patches?

2006-11-14 Thread Peter H. Lemieux
Is this a good place for this?  If so, I'd like to propose the following 
fix to 70_sare_adult.cf:


329d328
< body  __HAS_PENETRATION   /\bpenetration\b/i
331c330
< meta  FP_MIXED_PORN3  ((__HAS_COLLECTION + 
__HAS_HARDCORE + __HAS_YOUNGGIRL + __HAS_PENETRATION + __HAS_ADOLESCENT + 
__HAS_CHICKS) > 2)
---
> meta  FP_MIXED_PORN3  ((__HAS_COLLECTION + 
__HAS_HARDCORE + __HAS_YOUNGGIRL + FPS_PENETRAT + __HAS_ADOLESCENT + 
__HAS_CHICKS) > 2)


There is no rule called simply FPS_PENETRAT in the file.  There is a 
header rule called __FPS_PENETRAT which might be what's intended, but the 
rest of the checks in the FP_MIXED_PORN3 meta are body rules.  So I 
decided from the logic that you wanted to tag the word penetration in 
the body as well and created the __HAS_PENETRATION rule along the same 
lines as __HAS_HARDCORE.


Peter




Re: change spamhaus.org's score

2006-11-14 Thread Peter H. Lemieux

Matt Kettler wrote:

Should be something like this in 50_scores.cf:
score RCVD_IN_BL_SPAMCOP_NET 0 1.332 0 1.558
Just add score RCVD_IN_BL_SPAMCOP_NET 1.0 in your local.cf.

That said, I would NOT advise raising the score of spamcop.. lots of FPs for me 
lately.


I've reduced the score on this rule to 0.5 just recently myself.

Peter



Re: Per Domain Whitelisting

2006-10-27 Thread Peter H. Lemieux

jasonegli wrote:

For example let's say that domain xyz.com wants to allow all messages from
yahoo.com, but domain 123.com does not. Is there a way to allow FROM
[EMAIL PROTECTED] TO [EMAIL PROTECTED]?


Obtuse SMTPD (http://sd.inodes.org/) can handle this at the SMTP level. 
I think it may be possible to add this to MailScanner 
(http://www.mailscanner.info/) through its custom rules; its default 
whitelists/blacklists, however, are global.





Re: Scoring base64 blob messages

2006-10-27 Thread Peter H. Lemieux

Theo Van Dinter wrote:

On Thu, Oct 26, 2006 at 12:19:23PM -0400, Peter H. Lemieux wrote:

No, because there are going to be a lot of mails that would hit that.
Really?  Maybe it's because I live in the US, but I can't think of a 
legitimate message I've ever received consisting only of a base64 blob. 


You look at a lot of raw messages?  ;)


Doesn't everybody?

Seriously, I do look at a lot of raw messages; for instance, I review the 
full text of nearly every spam message that doesn't get caught by my 
filters and shows up in my inbox.  Obviously I don't get much mail from 
Blackberry users or Ticketmaster!


Rather than making anyone else do the work for me, is there something I 
can read about how to determine the frequency of different message 
features appearing in the corpus?



Well, there isn't a SA corpus, so there's no answer to that question.


Ah, I hadn't read this page before:
http://wiki.apache.org/spamassassin/HandClassifiedCorpora
My recollection was that 2.x used a centrally-defined corpus rather than 
a variety of developers' corpora (see, I read the wiki).  Either things 
changed with the switch in scoring algorithms in 3.x, or my recollection 
is shoddy.  Probably the latter.



You can generate some rules and use mass-check to run against your own corpus
to gather some statistics.  I'm willing to run some rules for you against my
corpus if you want.  I just don't have time to come up with the rules right
now.


Thanks for the offer, Theo, but don't spend your valuable time on this. 
I'll give it a shot some day when I've got some spare moments.  If I do get 
some candidate rules, I'll pass them along to you for testing.



Thanks again!
Peter


Re: domainkeys unverified - solved

2006-10-27 Thread Peter H. Lemieux

Chris Purves wrote:
In the end, with the help of Mark Martinec, I was able to determine that 
the problem was with my ISP-provided DNS nameservers not allowing full 
TXT records to be returned (they were truncated).


Was this something that the ISP cooked up, or was it intrinsic to the DNS 
server software they are using?  If the latter, it would be good to know 
which server they were running.  It might be a useful addition to the 
FAQ/wiki.


Peter



Scoring base64 blob messages

2006-10-26 Thread Peter H. Lemieux

I received a spam today where the text was only a base64-encoded blob.

Content-Type: text/html;
charset=us-ascii
Content-Transfer-Encoding: base64
Subject: feel young and strong again

PGh0bWw+DQpTdG9wIG92ZXJwYXlpbmcgZm9yIHlvdXIgcHJlc2NyaXB0aW9uIG1lZGljYXRpb25z
IHRvZGF5Lg0KPGJyPg0KPGJyPg0KU2F2ZSBtb3JlIHRoYW4gc2l4dHkgcGVyY2VudCBvbiBicmFu
ZCBuYW1lIGdlbmVyaWMgbWVkcyB0aGF0IGFyZSBjaGVtaWNhbGx5IGlkZW50aWNhbC4NCjxicj4N

Does SA convert the blob into text before scanning?  It contains a number 
of drug-related words and a URI that points to pharmconnect.org.
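
For anyone who wants to see what is inside the blob, decoding the first line by hand is enough. A short Python sketch using only the standard library:

```python
import base64

# First line of the base64 body quoted above.
first_line = (
    "PGh0bWw+DQpTdG9wIG92ZXJwYXlpbmcgZm9yIHlvdXIgcHJlc2NyaXB0aW9uIG1lZGljYXRpb25z"
)
decoded = base64.b64decode(first_line)
print(decoded.decode("ascii"))
```

The output is plain HTML text, so any body rule that runs on the decoded text would see the drug-related words.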


Also is there an SA rule that scores messages that contain only a single 
base64 part (as opposed to a base64-encoded attachment)?  I doubt many 
legitimate messages arrive with only a single base64 part.


Peter


Re: Scoring base64 blob messages

2006-10-26 Thread Peter H. Lemieux

Theo Van Dinter wrote:

On Thu, Oct 26, 2006 at 09:46:28AM -0400, Peter H. Lemieux wrote:
Does SA convert the blob into text before scanning?  It contains a number 
of drug-related words and a URI that points to pharmconnect.org.


Yes.


I was pretty sure this was the case but wanted to confirm it.

Also is there an SA rule that scores messages that contain only a single 
base64 part (as opposed to a base64-encoded attachment)?  I doubt many 
legitimate messages arrive with only a single base64 part.


No, because there are going to be a lot of mails that would hit that.


Really?  Maybe it's because I live in the US, but I can't think of a 
legitimate message I've ever received consisting only of a base64 blob. 
Out of curiosity, how frequently does this appear in the SA ham corpus? 
Rather than making anyone else do the work for me, is there something I 
can read about how to determine the frequency of different message 
features appearing in the corpus?


Thanks, Theo.

Peter





Re: Scoring base64 blob messages

2006-10-26 Thread Peter H. Lemieux

[EMAIL PROTECTED] wrote:

Content-Type: text/html;
charset=us-ascii
Content-Transfer-Encoding: base64



Probably a message in base64 that does not contain any single 8bit code should
be considered as an attempt to hide the message from scanners


That's a good idea, Wolfgang.  The message I posted said it was in ascii. 
 I've not delved into writing SA rules that combine different message 
features, but this sounds like a possibility.  If base64-only with a 7bit 
charset, then score it.
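
Before attempting this as an SA rule, the test itself can be prototyped outside SA. The following Python sketch uses the standard email module to check for a single-part message whose base64 body decodes to pure 7-bit text; is_ascii_only_base64 is a hypothetical helper, not an SA feature, and says nothing about how often legitimate mail would match:

```python
import email


def is_ascii_only_base64(raw_message):
    """True for a single-part message whose base64 body is pure 7-bit text."""
    msg = email.message_from_string(raw_message)
    if msg.is_multipart():
        return False
    encoding = (msg.get("Content-Transfer-Encoding") or "").strip().lower()
    if encoding != "base64":
        return False
    body = msg.get_payload(decode=True) or b""
    # Every decoded byte below 0x80 means base64 was not needed for transport.
    return all(byte < 0x80 for byte in body)


sample = (
    "Content-Type: text/html; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "Subject: feel young and strong again\n"
    "\n"
    "PGh0bWw+\n"
)
print(is_ascii_only_base64(sample))
```

Running something like this over a ham corpus would be one way to estimate the false-positive rate before assigning a score.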


I'll put it on my todo list; maybe I'll come up with something before the 
new year!


Peter



Re: Concerned with scores for from rfc-ignorant.org

2006-10-23 Thread Peter H. Lemieux

Elizabeth Schwartz wrote:
IMHO if a rule is getting legit email tagged as SPAM it should be toned 
down. Obeying the RFC's is a good thing, but I am trying to tune our 
spam filter to filter spam, not to be a netcop. Our particular contact 
seems to have gotten onto rfc-ignorant's list because it is rejecting 
mail from <>, nothing to do with sending spam, and it's a legitimate 
site, neither a spammer nor an ISP (nor in a computer related field, nor 
English speaking...)


It seems to me you have a couple of different options, Betsy.  You can 
reduce the score attached to all mail that trips the rfc-ignorant rule, 
you can set it to zero and deactivate the rule entirely, or you can 
whitelist particular senders in a custom .cf file.  I usually choose the 
last route, most often based on the Received headers.  For instance,


header RCVD_FROM_HARVARD Received =~ /from .*\.harvard\.edu \(/i
score RCVD_FROM_HARVARD -5

matches the Received header added by sendmail.  If you're using a 
different MTA, you'll need to write a rule customized to the headers it 
adds.  (Note the escaped periods and parenthesis in the regex.)
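
To see why the escapes matter, the same pattern can be tried against a few sample headers. This is a sketch in Python rather than Perl, and the Received lines are invented for illustration (192.0.2.x is a documentation address range):

```python
import re

RCVD_FROM_HARVARD = re.compile(r"from .*\.harvard\.edu \(", re.IGNORECASE)

samples = [
    "from mail.harvard.edu (mail.harvard.edu [192.0.2.1]) by mx.example.com",
    "from mail.harvardXedu (host.example.net [192.0.2.2]) by mx.example.com",
    "from mail.harvard.edu.example.net (host [192.0.2.3]) by mx.example.com",
]
for line in samples:
    print(bool(RCVD_FROM_HARVARD.search(line)), line)
```

Only the first line matches: the escaped period rejects "harvardXedu", and requiring the space and opening parenthesis right after "edu" rejects names that merely contain the domain.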


You might drop a note to the postmaster box at that domain and tell them 
they're listed in rfc-ignorant.  I bet they haven't got a clue, and some 
of their other legitimate messages aren't being delivered.


Peter



Re: I'm thinking about suing Microsoft

2006-10-23 Thread Peter H. Lemieux

Magnus Holmgren wrote:
I thought they did? At least the message from WU/WGA on one computer with 
Windows XP I used recently was that unauthorised installations only get 
critical updates, but they do get those. Is that going to change with Vista?


Yes.  See, for instance, http://www.computerworld.com/blogs/node/3665

Vista machines that Windows Genuine Advantage believes to be pirated 
will operate with reduced functionality, including disabling the Windows 
Defender software that protects against malware.


What's especially troubling is the large number of false positives that 
WGA currently generates if the computer's hardware is significantly 
altered.  It also seems to me that this approach leaves these machines 
ripe for a denial-of-service attack where a virus somehow changes the WGA 
signature on the machine so it appears that the Windows OS is pirated. 
Then the next time WGA phones home it switches the infected computer to 
the reduced functionality state (which generates lots of calls to the 
help desk!).


All that said, those of you who think a lawsuit is a good approach should 
start by reading the Windows EULA.  Like most EULAs it exempts Microsoft 
from liability for just about anything its software does.  I also 
suspect most judges wouldn't consider spamming to be a sufficient threat 
to the public's health and welfare that it would justify taking legal 
actions against Microsoft.  But, if your attorneys think this is a good 
idea, more power to you!


Peter




Re: scoring spam

2006-10-20 Thread Peter H. Lemieux

Steve Ingraham wrote:


I am trying to figure out how I can get scores to this type of spam
bumped up so they do not get delivered to my user mailboxes.  Can
anyone give me some suggestions on what I should do to stop this type
of spam from being delivered?

[...]

X-Spam-Flag: YES
X-Spam-Status: Yes, score=8.3 required=5.0 tests=BAYES_60,HTML_10_20,
 HTML_MESSAGE,JV_Pharm1r_Drug,MIME_HTML_ONLY,RCVD_NUMERIC_HELO
 autolearn=no version=3.1.5



You don't need to bump up the score; this one received an 8.3 which 
exceeds your 5.0 ceiling.  The result is that it's tagged as spam.  SA 
itself doesn't do anything other than tag likely spams.  It's up to you 
to decide what to do with these messages.


If the scanning machine is running a *nix OS and the mailboxes reside on 
the same server, an elegant solution is to have procmail route these 
messages for you.  Just create or edit the file /etc/procmailrc and add 
the following rule:


:0
* ^X-Spam-Flag.*YES
/path/to/some/quarantine/mailbox

That will scan every message at delivery for the Spam-Flag header and 
route those with a YES to the quarantine folder.  Since procmail executes 
rules in /etc/procmailrc with root privileges it can write to the 
quarantine mailbox even if it's owned by another user.  (See man 
procmailrc and man procmailex for details.)
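
The condition line in that recipe is just a regular expression applied to the message headers, and procmail matches case-insensitively by default. A rough Python equivalent of the test, for illustration only (is_flagged is a made-up name):

```python
import re

# "^" must match at the start of any header line, hence MULTILINE.
SPAM_FLAG = re.compile(r"^X-Spam-Flag.*YES", re.IGNORECASE | re.MULTILINE)


def is_flagged(headers):
    """True if some header line starts with X-Spam-Flag and contains YES."""
    return SPAM_FLAG.search(headers) is not None


print(is_flagged("Subject: hello\nX-Spam-Flag: YES\n"))
```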


Don't delete them, just put them in a quarantine.  That way when someone 
asks why they didn't get that important message that you inadvertently 
scored as spam, you can give them the quarantined copy.


You might also want to add the quarantine mailbox to your log rotation 
program so it doesn't just grow forever.  On a RedHat-Linux-flavored box, 
you can add a file to /etc/logrotate.d like this:


/path/to/some/quarantine/mailbox {
daily
rotate 30
missingok
notifempty
}

This will keep 30 days of quarantines.

If you're doing scanning on a box in front of the eventual mailbox server 
(e.g., Exchange), you can't use this trick because the mail isn't being 
delivered on the scanning box.  You're better off using an SMTP-level 
scanner like MailScanner or amavisd that can invoke SA along with any 
virus scanners you might be using.


I use MailScanner (with clamav) to handle all these tasks.  It 
automatically prepends a string like {Spam?} to the subject of any 
message that scores above your floor value and can also be configured to 
delete, quarantine, deliver, or forward tagged messages.  If tagged 
messages are delivered to the recipient, he or she can write a 
client-side rule to handle the spams.



Peter



Re: scoring spam

2006-10-20 Thread Peter H. Lemieux

Steve Ingraham wrote:

I was trying to see if there was anything I could change in the rules in
spamassassin to raise the spam score up enough to reach the spam_hits=10
limit set up in my qmail controls so that qmail will not deliver the
message.  Once the spam score reaches 10 delivery is stopped.  I was
posting to the spamassassin list because I wanted to know if there was a
way I could bump up the score for these types of spam messages so it
would reach the setting I have of spam_hits=10 in the controls for
qmail and therefore not get delivered to the user's mailbox.


Ah, I understand now, Steve.  Your motivation wasn't so obvious from your 
original posting.


You can modify the scoring of any rule by adding a file to 
/etc/mail/spamassassin that changes the score for specific rules.  I 
named mine ZZscores.cf so it will be read after the other files in this 
directory.  For instance,


score HTML_MESSAGE 1.0

That said, I'm not sure you'd want to fiddle with the scores on the rules 
this message hits.  From your original posting, we have


1.7 RCVD_NUMERIC_HELO Received: contains an IP address used for HELO
3.9 JV_Pharm1r_Drug BODY: partial word hidden in HTML in pill ad
1.0 BAYES_60 BODY: Bayesian spam probability is 60 to 80%
0.0 HTML_MESSAGE BODY: HTML included in message
1.4 HTML_10_20 BODY: Message is 10% to 20% HTML
0.4 MIME_HTML_ONLY BODY: Message only has text/html MIME parts

Some other alternatives might be training your Bayes filter with messages 
like these so you get a higher score than BAYES_60 or giving a positive 
score to HTML_MESSAGE, though that can lead to false positives in 
today's "look at my pretty email" world.


I also noticed that this was a GIF spam, but it's not scored as such. 
You might want to look back over this list's archives and read about the 
FuzzyOCR and ImageInfo plugins.  Also the newest SARE stock rules might help.


Of course, you could also lower your thresholds.  I tag at 4 and 
quarantine at 8.  I review the decisions on messages in between these 
values to make sure we're not generating false positives; seems to work 
okay here.
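
The two-threshold scheme boils down to the following, sketched in Python. The 4 and 8 are the values mentioned above; the function name and return labels are only illustrative, and treating a score exactly at a threshold as a hit is an assumption:

```python
TAG_AT = 4.0
QUARANTINE_AT = 8.0


def disposition(score):
    """Map an SA score to what happens to the message."""
    if score >= QUARANTINE_AT:
        return "quarantine"
    if score >= TAG_AT:
        return "tag"        # delivered, but reviewed for false positives
    return "deliver"


print(disposition(8.3))  # the message from the original posting
```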


I don't use Bayes at all.

Peter


Re: scoring spam

2006-10-20 Thread Peter H. Lemieux

Steve Ingraham wrote:

Could you explain how I can train Bayes?  What specifically do I need to
do to accomplish this?


http://spamassassin.apache.org/full/3.0.x/dist/doc/sa-learn.html




Re: Scoring PTR's

2006-10-19 Thread Peter H. Lemieux

Robert Swan wrote:

Guys, if my mail server announces itself as mail.somename.com and has a
PTR that matches. I can send mail out as [EMAIL PROTECTED] or
[EMAIL PROTECTED] as long as the MX record for the domain
anothername.com reads as mail.somename.com 


The original question was how do I write a header rule similar to
below, to identify if the announce name and PTR name do not match?

header  LOCAL_INVALID_PTR2  Received =~ /from \S+ \(unknown /


Doesn't sendmail usually insert the phrase "claiming to be 
some.other.host" in these situations?  For instance,


Received: from exchange.fccj.edu(207.203.47.99), claiming to be 
fccj-sbm-03.fccj.org


Unfortunately a quick grep for 'claiming to' in my mail spool shows 
dozens of perfectly legitimate mail servers that result in a "claiming 
to be" header, like the one above.


The only one of these cases that I score is "claiming to be localhost", 
which gets 3 points here.  They're nearly always spams though they're 
usually tagged by other rules.  A quick grep of my logs shows that the 
lowest SA score received by a message that claims to be localhost is 
about 10 (including the 3 points for this rule).
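
For experimenting with headers of this shape outside SA, here is a Python sketch that pulls the resolved and claimed names out of a "claiming to be" Received line. The regex is written against the one example above and is an assumption about the format, not a general sendmail parser:

```python
import re

CLAIMING = re.compile(
    r'from (\S+?)\s*\(([\d.]+)\), claiming to be\s+"?([^"\s]+)"?'
)


def helo_mismatch(received):
    """Return True/False for mismatch, or None if the phrase is absent."""
    unfolded = " ".join(received.split())   # undo header line folding
    match = CLAIMING.search(unfolded)
    if match is None:
        return None
    resolved, _ip, claimed = match.groups()
    return resolved.lower() != claimed.lower()


header = (
    "Received: from exchange.fccj.edu(207.203.47.99), claiming to be\n"
    " fccj-sbm-03.fccj.org"
)
print(helo_mismatch(header))
```

As noted above, a mismatch on its own is common among legitimate servers, so a check like this is only useful for narrow cases such as a claimed name of localhost.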


Peter






Re: R: Scoring PTR's

2006-10-19 Thread Peter H. Lemieux

R Lists06 wrote:

Nothing personal, yet that is some messed up reverse dns delegation.


Perhaps, but RIPE, for instance, calls RFC2317, which proposed this 
method, a Best Current Practices RFC: 
http://www.ripe.net/rs/reverse/infosources.html


I also skimmed the list of complaints about this procedure at
http://homepages.tesco.net/J.deBoynePollard/FGA/avoid-rfc-2317-delegation.html,
but he mostly argues that RFC2317 delegation makes life more difficult 
for people maintaining servers running Microsoft DNS or Dan Bernstein's 
djbdns.  He'd prefer a scheme where the upstream provider aliases every 
single address, not subnet blocks.


When Microsoft or djbdns become the dominant name servers on the 
Internet, perhaps we'll all need to change.  Until then, the millions of 
us running BIND will probably stick with RFC2317 delegation.




Re: Q. about spam directed towards highest MX Record?

2006-10-18 Thread Peter H. Lemieux

Matt wrote:

Just to clarify here You are talking about doing something like:

domain.com   1200   IN   MX   10  smtp-1.domain.com
domain.com   1200   IN   MX   50  smtp-2.domain.com

You all are saying that most of the spam should be coming in MX 50 right?


No, I'm saying most of the mail coming to the secondary (MX 50) is likely 
to be spam in situations where the primary (MX 10) is accepting mail.



I have to admit I've tried this, but it seems like mail continues to
come into the MX 50 even when the primary servers are available.  Is
it not correct that the 50 should NOT be tried until the 10 is
unavailable?  Or do I have that backwards?


Legitimate mail servers follow the rule you describe; send first to the 
primary, then to the secondary if the primary is unavailable.  However, 
there's no technical or other requirement that messages first be sent to 
the primary.  Spammers often ignore the primary and send directly to the 
secondary in hopes that the back door has fewer restrictions.
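
The expected behavior of a well-behaved sender can be sketched in a few lines of Python: sort the MX records by preference and try the lowest number first. The hostnames are the ones from the example above; delivery_order is an illustrative name, and real MTAs also randomize among records of equal preference, which this ignores:

```python
def delivery_order(mx_records):
    """Hosts in the order a well-behaved sender tries them (lowest first)."""
    return [host for _preference, host in sorted(mx_records)]


mx = [(50, "smtp-2.domain.com"), (10, "smtp-1.domain.com")]
print(delivery_order(mx))
```

A spammer targeting the back door simply skips the sort and connects to the highest-numbered host directly.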


Legitimate mail can show up on the secondary even when the primary is up 
for reasons like congestion.  If the primary is busy, the sending server 
may time out and then try the secondary.  For that reason, you cannot 
assume that all mail on the secondary is spam, but a quick review of the 
logs for the secondary will show that nearly all of it is spam.  That's 
why I give messages arriving at the secondary a high SA score, but not 
one that is sufficient by itself to tag the message.


Peter


Re: How to filter these spam messages

2006-10-18 Thread Peter H. Lemieux

Chris Santerre wrote:
But if you rely on email for time sensitive info you best rethink 
what you are doing :)


Regardless of your perspective, Chris, the fact is that most people have 
come to expect email to be as reliable and instantaneous as making a 
phone call.  In one sense that's a tribute to the hard work of mail 
admins around the world, but it's also raised the expectation of most 
email users well beyond what was envisioned when RFC822 was written.


Peter



Re: for the people who write rules

2006-10-18 Thread Peter H. Lemieux

Jo Rhett wrote:
Sorry, I should write a rule but no time today or tomorrow.  This e-mail 
has gotten past SA with no score on 4 different accounts nearly half a 
dozen times today.  The only change in the e-mail is the name used in 
the From address, which is also reflected in the Subject line.  It's 
always TXHE.


I'm using this for now:

header SUBJ_SOMEONE_WROTE  Subject =~ /\bwrote:$/i

with a score of 3.  Works here.

I don't see many real messages with a subject line ending in "wrote:". 
The colon on the end gives it away!
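
The pattern is easy to check against a few subjects. This sketch uses Python's re module instead of Perl, and the sample subjects are invented:

```python
import re

SUBJ_SOMEONE_WROTE = re.compile(r"\bwrote:$", re.IGNORECASE)

for subject in ["Jane Doe wrote:", "Re: what she wrote: a review", "He wrote"]:
    print(bool(SUBJ_SOMEONE_WROTE.search(subject)), subject)
```

Only a subject that actually ends in "wrote:" matches, which is what keeps the rule from firing on ordinary replies.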


Peter



Re: DNS lookup plugin?

2006-10-18 Thread Peter H. Lemieux

Chris St. Pierre wrote:

I use Postfix and, for a while, I had reject_unknown_hostname as part
of my smtpd_helo_restrictions.
This was insanely effective; SpamAssassin started to get lonely while
I had this enabled.  I was dropping massive amounts of spam at
connection time -- but, unfortunately, I had a fair number of FPs as
well, due to misconfigurations, or, more frequently than I'd hoped,
mail outsourcing firms giving a bogus HELO.


I just use an SA rule to scan the Received header to see if the names 
resolve, then add 3.3 SA points to those without a valid DNS entry.  This 
prevents false positives based solely on unknowns, but is high enough 
that it will trigger if the rest of the message is slightly spammy.  (I 
tag at 4.)


Since I don't use Postfix, I don't know what the exact format of the 
Received header will be when the address doesn't resolve, but it should 
be pretty obvious from looking at the headers of such a message.


Peter





Re: Q. about spam directed towards highest MX Record?

2006-10-17 Thread Peter H. Lemieux

Jon Trulson wrote:

Hehe, that is an old spammer trick... Our secondary MX is
pretty much 100% spam.
I implemented greylisting on the secondary which reduced spam
through it by about 99% :)  The secondary does not do spam
scanning, it's simply store and forward.  Greylisting really
helps in these cases.


My experience is like Jon's; nearly all mail arriving at the backup MX is 
spam.


Rather than greylisting, I simply score messages higher if they come in 
through the backup MX.  On my systems, where the primary MX is almost 
never down, I add 3.3 SA points for messages that arrive via the back 
door.  This is routinely one of the most frequently hit rules, right up 
there with senders without reverse DNS, which gets an equivalent score. 
Many messages arriving at the back door trip both these rules and thus 
get marked as spam.
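
To see why a message that trips both rules ends up marked as spam, here is a small Python sketch of the arithmetic.  The hostname mx2.example.com, the rule names, and the threshold of 4.0 are placeholders that do not come from this post; only the two 3.3-point scores do.

```python
# Hypothetical name of the backup MX; substitute your own.
BACKUP_MX = "mx2.example.com"

# Both scores are 3.3 in the post; the rule names are made up for illustration.
SCORES = {"VIA_BACKUP_MX": 3.3, "NO_REVERSE_DNS": 3.3}

SPAM_THRESHOLD = 4.0  # assumed; this post does not state one


def rules_hit(received_headers, sender_has_rdns):
    """Return the names of the rules a message trips."""
    hits = []
    # A message relayed through the backup MX carries a "by <backup>" hop.
    if any(f"by {BACKUP_MX}" in header for header in received_headers):
        hits.append("VIA_BACKUP_MX")
    if not sender_has_rdns:
        hits.append("NO_REVERSE_DNS")
    return hits


def is_spam(hits):
    """True when the summed scores of the tripped rules reach the threshold."""
    return sum(SCORES[name] for name in hits) >= SPAM_THRESHOLD
```

Either rule alone contributes 3.3 and stays under the assumed threshold; together they add up to 6.6.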


This approach doesn't put a great deal of stress on my SA scanner because 
I block a lot of mail at the SMTP level based on a substantial custom 
rule list.


Peter




Re: Problem with URIBL rules : false positive and not listed while manually checking

2006-10-17 Thread Peter H. Lemieux

Fabien GARZIANO wrote:

And for dns, I'm sorry, I typed it too fast and when I meant no 'dns' I
also meant no 'named' process. 


On mail servers it's usually a good idea to run a local nameserver, even 
if you have no zone files to publish (e.g., the caching nameserver 
named configuration that comes with RedHat-flavored distributions). 
Without a local nameserver you have to query your ISP's servers for 
every message you receive.  If you run a local caching server, each 
answer is kept locally once you've looked it up, which improves 
performance on a busy mail server.


If you run a caching server, make sure that /etc/resolv.conf lists 
127.0.0.1 as its first nameserver address.  Add the ISP's nameserver 
addresses below it in case your local named falls over.
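
One way to confirm that ordering is a quick Python check like the one below.  It is only a sketch: the helper names are mine, and the addresses in SAMPLE are documentation placeholders, not real ISP servers.

```python
def nameservers(resolv_conf_text):
    """Return nameserver addresses in the order the resolver will try them."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop '#' comments
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def local_cache_first(resolv_conf_text):
    """True if 127.0.0.1 is listed first and at least one fallback follows."""
    servers = nameservers(resolv_conf_text)
    return len(servers) >= 2 and servers[0] == "127.0.0.1"


# Placeholder contents showing the layout described above.
SAMPLE = """\
nameserver 127.0.0.1
nameserver 192.0.2.53
nameserver 198.51.100.53
"""
```

On a live system you would pass it the real file, e.g. local_cache_first(open("/etc/resolv.conf").read()).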


Peter



Re: New ebay phish

2006-10-17 Thread Peter H. Lemieux

New phish looks like a LEGIT ebay message from another user


I handle all problems like this at the SMTP level using the old, but 
extremely powerful Obtuse smtpd daemon (http://sd.inodes.org/).  All 
inbound mail is collected by the smtpd daemon on my MX server, then 
passed to another machine for SA scanning and delivery.


The Obtuse daemon lets you write rules based on the sending server's 
identity (both IP and domain name) and the data contained in the MAIL 
FROM and RCPT TO fields in the SMTP exchange.


In the case of eBay, we only accept messages with an @ebay.com From 
address if they come from a server in *.ebay.com.  I've found this to be 
a very effective deterrent to phishing scams and use it with a number of 
banking and financial domains.  I also apply similar rules to messages 
from commonly-forged domains like AOL, Yahoo, Hotmail, etc.
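
The eBay rule boils down to a simple check, sketched here in Python.  This is not Obtuse smtpd rule syntax, which I am not reproducing; the table and function names are hypothetical, and I am assuming client_hostname is the name obtained by resolving the connecting IP, not whatever the client claims in HELO.

```python
# Protected sender domain -> domain the connecting server's hostname must fall under.
# Only the eBay entry comes from the post; extend the table for other domains.
PROTECTED = {"ebay.com": "ebay.com"}


def accept(mail_from, client_hostname):
    """Return False when a protected sender domain arrives from an outside server."""
    sender_domain = mail_from.rsplit("@", 1)[-1].lower()
    required = PROTECTED.get(sender_domain)
    if required is None:
        return True  # not a protected domain; other rules decide
    host = client_hostname.lower().rstrip(".")
    # Mirror the "*.ebay.com" pattern: the host must sit under the domain.
    return host.endswith("." + required)
```

The leading dot in the suffix test matters: without it, a host such as fakeebay.com would slip through.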


This approach occasionally runs afoul of people, usually on residential 
connections, who erroneously use their AOL or Yahoo address in the From, 
but mail out through another ISP's server.  When this happens I politely 
explain why there is a Reply-To header.  We process about 100K messages a 
week; these problems arise at most once a month.


The Obtuse daemon also has a function that can reject mail according to 
the domain of the sending server's DNS host.  That works well with some 
spamming operations that have dozens of bogus domains all pointing at a 
common DNS host.



Peter



Re: Is there any way to score this?

2006-10-17 Thread Peter H. Lemieux

Micke Andersson wrote:

excuse me for my ignorance, but is this really the correct approach 
right now, since it is quite a lot of badly configured DNS servers out
there. Should this not be handled by the SMTP server as is instead! 
And return an error code of 421 or something like this. Like AOL has
implemented at their servers, you will be informed as sender about the
problem, with an URL link to
http://postmaster.info.aol.com/errors/421dnsnr.html


Whatever opinions you may have about AOL, when they began rejecting mail 
without reverse-DNS entries a few years back, AOL's sheer size forced 
mail admins to make sure that their servers have both forward and reverse 
lookups enabled.  Heck, even random cable/DSL hosts usually have reverse 
lookups configured, usually something like 123-123-123-123.someisp.com. 
Most of the mail I see coming from servers without reverse resolution is 
spam, usually from hosts in places like China.


Moreover, I'd much rather give such messages a relatively high SA score 
than reject them at the SMTP level.  False positives in the SMTP exchange 
cause ill-will with clients and their correspondents.


Or if one should have this above Rule, me my self would not for the time 
being, have that high of a score,


I give these messages a score of 3.3 against an SA spam threshold of 4.0; 
I get very few false positives.



Peter


Re: What's with UCEPROTECT List?

2006-10-17 Thread Peter H. Lemieux

Marc Perkel wrote:

Sender Verification is an Exim trick. What it does is start a sequence 
where my server starts to send an email back to the sender address to 
see if it's a real email account. But I do a quit after the rcpt to: 
command. If the receiving end says the user doesn't exist then I block 
the email.
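
For readers unfamiliar with the technique, the exchange described above looks roughly like the Python sketch below.  It is not Exim's implementation.  The MX lookup is left out, so mx_host must be supplied by the caller; the HELO name is a placeholder; using an empty MAIL FROM for the probe is my assumption; and the smtp_factory parameter exists only so the function can be exercised with a stub instead of a live server.

```python
import smtplib


def sender_exists(mx_host, sender, helo_name="verifier.example.org",
                  smtp_factory=smtplib.SMTP):
    """Probe the sender's mail server; False only on an explicit 5xx rejection."""
    conn = smtp_factory(mx_host, 25, timeout=30)
    try:
        conn.helo(helo_name)
        conn.mail("")                  # assumed: null reverse-path for the probe
        code, _ = conn.rcpt(sender)    # ask whether the mailbox is accepted
    finally:
        conn.quit()                    # stop after RCPT TO; no message is sent
    return not (500 <= code < 600)
```

A temporary (4xx) answer is treated as "exists" here so that greylisting or a busy server on the far end does not cause a rejection.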


My incoming servers know literally nothing about which users have valid 
addresses and which do not.  All these servers do is accept or reject 
inbound mail based on a (long) list of SMTP-level rules and forward the 
messages that are accepted to another machine for SA and virus scanning.


If sender verification requires that the incoming server have a complete 
list of valid mailboxes, it's going to fail miserably here.  I don't see 
anything in the RFCs that makes my configuration non-compliant, do you?






Re: New ebay phish

2006-10-17 Thread Peter H. Lemieux

John D. Hardin wrote:

The Obtuse daemon also has a function that can reject mail
according to the domain of the sending server's DNS host.  That
works well with some spamming operations that have dozens of bogus
domains all pointing at a common DNS host.


Any stats for that?


I'm not sure I know what kind of stats you're looking for, John.

Uncovering situations like this requires a bit of detective work. 
Sometimes when I get messages from obviously spammy domains like 
randomword-anotherrandomword.com, I'll do some checking into their IP and 
domain whois records.  I might also use nmap to ping-scan their class-C 
subnet to see what other hostnames are nearby.  Following those domains 
back can often uncover a common DNS server.  If the DNS server doesn't 
have reverse-DNS configured (e.g., dns[12].superduperspecials.com), it's 
*really* suspicious.
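
If nmap isn't handy, part of this legwork can be approximated in Python.  Note that the sketch below swaps nmap's ping scan for plain reverse-DNS lookups, so it only finds neighbours that have PTR records, not hosts that merely answer pings.  The function names are mine, and the resolve parameter is there so a stub can stand in for live DNS.

```python
import ipaddress
import socket


def lookup_ptr(ip):
    """Reverse-resolve one address; return None when the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def neighbours(ip, resolve=lookup_ptr):
    """Map each host address in ip's /24 to its hostname, skipping unresolved ones."""
    net = ipaddress.ip_network(f"{ip}/24", strict=False)
    names = {}
    for addr in net.hosts():
        name = resolve(str(addr))
        if name:
            names[str(addr)] = name
    return names
```

Scanning the resulting names for shared patterns is still a manual job, as is following them back to their DNS servers.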


My list isn't all that long because this takes a bit of work.  I usually 
resort to such measures when I get really annoyed by a particular set of 
spams.  Most of my rules depend on the IP/hostname of the sending server, 
not this indirect approach based on DNS servers, but the latter can come 
in handy sometimes.


Peter



FuzzyOCR (and gocr) can't detect HGH spams

2006-10-16 Thread Peter H. Lemieux
I get a lot of messages with a gif ad for HGH drugs with this image: 
http://www.crystalmail.net/hgh.gif.  FuzzyOCR doesn't return anything 
because gocr doesn't show any text.  I've tried various -i settings for 
gocr from 1 to 254 and get gibberish at all settings.


For instance, 'gocr -i 180 hgh.gif' yields:

lI__c_tc)r _rc_hc_rihc_Ll _cnLl .h1c_Llic_;cll_ _u__c_c __ihc LI
 l c htc)hlc_rc)c_c_ B llr_ll l hc r_cp_


_ t4 __cc_'un ic) __'ri_c _ hH3s, t_k   _ ,r o_E,y _h K E,_
_ ,_ics r _ sncu)._r. t.ihk). lhirkrr x_))  '   gg __, r
_ Krvc)_H t)r r_irk cct .__ _
 O _' Y O ___ TE_ E
 _Lncl nLnn __ mc)R hnrtb

Results at other -i settings are about the same.

System is CentOS 4.3
gocr is at version 0.37 (from rpmforge)
netpbm is version 10.25

Any hints?

Peter


Re: SA just stopped working

2006-03-27 Thread Peter H. Lemieux

mouss wrote:

Liam-PrintingAutomation wrote:


given what you posted, you sa seems to be ok. you now need to make sure
your sendmail is actually calling procmail. try putting an error in your


You can tell procmail to log its actions by adding the following to the 
top of a procmailrc:


LOGFILE=/path/to/logfile
VERBOSE=yes

If this is a .procmailrc in a user's home directory, point the log to a 
location to which that user has write privileges, e.g., 
/home/user/procmailog.  If it's in /etc/procmailrc, it's probably better 
to write it to some place like /var/log/procmail.