from:"Eric A. Hall"

AWL observations

2010-07-22 Thread Eric A. Hall


Sometimes the AWL rule doesn't appear in the list. From looking at the
behavior it seems that the rule is only guaranteed to fire if the stored
score for the tuple is significantly different than the message score, or
if the stored tuple has a very high stored score. But if the stored score
and message score are close and the stored tuple does not have a large
score, then the rule will not fire.

I assume the above reflects the logic for when to adjust the score, rather
than reflecting when the tuple was matched. But the plugin text and code
all talk about the rule firing on match, not when corrective scoring occurred.

Is this a bug? or should the text be changed?

If the current code is intended, I'd like to request a new function call
that tells if the tuple exists and the number of times it has been seen
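
A minimal Python sketch of the adjustment as the AWL plugin documentation describes it: final score = score + (mean - score) * factor, with auto_whitelist_factor defaulting to 0.5. The function name and sample numbers are hypothetical, and this is not the plugin's code. Whether a near-zero correction is what keeps the rule out of the hit list is only an inference from the behavior described above.

```python
# Sketch of the documented AWL regression toward the sender's mean score.
# awl_delta() is a hypothetical helper, not part of SpamAssassin.

def awl_delta(message_score: float, stored_total: float, stored_count: int,
              factor: float = 0.5) -> float:
    """Return the correction AWL would add to the message score."""
    if stored_count == 0:
        return 0.0  # unknown tuple: no history to regress toward
    mean = stored_total / stored_count
    return (mean - message_score) * factor

# Stored mean far from the message score: a visible correction.
print(awl_delta(5.0, 40.0, 4))
# Stored mean almost equal to the message score: the correction is close to zero.
print(awl_delta(2.01, 8.0, 4))
```

If the rule is only reported when the correction is non-trivial, the second case would match the "tuple was seen but AWL is not listed" observation.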

-- 
Eric A. Hall  http://www.eric-a-hall.com/
Network Technology Research Group  http://www.ntrg.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: AWL observations

2010-07-22 Thread Eric A. Hall


On 7/22/2010 11:24 AM, RW wrote:
 I don't recall seeing anything like that. Are sure it's not due to the
 IP address changing or AWL being short-circuited?

My testing is with local message files. If I use sa-awl to dump the
database I can see the counter increment, but the rule doesn't fire unless
the conditions are met

-- 
Eric A. Hall  http://www.eric-a-hall.com/
Network Technology Research Group  http://www.ntrg.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: AWL observations

2010-07-22 Thread Eric A. Hall


On 7/22/2010 11:07 PM, Matt Kettler wrote:
 On 7/22/2010 10:32 AM, Eric A. Hall wrote:

 If the current code is intended, I'd like to request a new function call
 that tells if the tuple exists and the number of times it has been seen
 
 For what purpose? (Not trying to be mean, just asking, because if it's
 not of use to the general SA community, it doesn't belong in the
 mainline release. However, if it's useful.)

I want to use a previously-seen match list for a variety of purposes. I
already have my SAGrey plugin [1] that uses the AWL for limited-use
greylists (it only fires when spam threshold exceeded, so as not to
penalize everybody, but it's not as useful if the AWL rule isn't reliable).
I also have a rule I use locally that blocks mail with binary attachments
if the sender is unknown, which I would like to modify so that it only
fires on spammy messages. There are a couple of other things on the to-do
list here that would benefit from a seen-before database. I can write my
own, but it would be easier to use AWL if it's going to be present and
reliable.

It would also be nice to have a last-updated field so that the entries can
be aged. I already run the pruning tools to purge one-time senders
(spammers) at the end of each month, but I would rather do one-time over
six-months.

-- 
Eric A. Hall  http://www.eric-a-hall.com/
Network Technology Research Group  http://www.ntrg.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SVN notifications killing spamassassin

2008-02-18 Thread Eric A. Hall


On 2/18/2008 5:50 AM, Justin Mason wrote:
 Eric A. Hall writes:
 I sometimes get SVN notifications that contain lists of files and their
 status. The filenames will often get picked up by the URI matching
 algorithm, each of which end up being processed through numerous lookups
 (URICOUNTRY, my LDAP filter, etc). Sometimes I get very large messages
 with hundreds of file lists, which in turn causes spamassassin to go into
 never-never land while it thinks about the hundreds of URI matches.

 For example,

   A  fpo/reports/perl/nagios_notifications1.pl.bak
   A  foo/reports/perl/nagios_outages1.pl
   A  foo/reports/perl/GWIR.pm

 nagios_outages1.pl will be determined as a URI for .pl domain and GWIR.pm
 will be determined as a URI for .pm domain, and so forth. The only way to
 get these messages through is to disable spamassassin...

 I've updated to 3.2.4 just now and it still has the same problem

 I'm guessing the URI analyzer needs to be smarter.
 
 The URI analyzer already is smarter ;)
 
 Changing the URICountry plugin is the way to fix this.

It doesn't appear to be URICountry that's dying. Either way though, I bet
all of the plugins will perform a lot better when they are no longer being
passed filenames.


-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

SVN notifications killing spamassassin

2008-02-17 Thread Eric A. Hall


I sometimes get SVN notifications that contain lists of files and their
status. The filenames will often get picked up by the URI matching
algorithm, each of which end up being processed through numerous lookups
(URICOUNTRY, my LDAP filter, etc). Sometimes I get very large messages
with hundreds of file lists, which in turn causes spamassassin to go into
never-never land while it thinks about the hundreds of URI matches.

For example,

  A  fpo/reports/perl/nagios_notifications1.pl.bak
  A  foo/reports/perl/nagios_outages1.pl
  A  foo/reports/perl/GWIR.pm

nagios_outages1.pl will be determined as a URI for .pl domain and GWIR.pm
will be determined as a URI for .pm domain, and so forth. The only way to
get these messages through is to disable spamassassin...

I've updated to 3.2.4 just now and it still has the same problem

I'm guessing the URI analyzer needs to be smarter.
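
A rough Python sketch of the failure mode described above. The matcher is deliberately naive and hypothetical; it is not SpamAssassin's URI parser, and the TLD set is a tiny illustrative subset. It only shows how a schemeless "name.tld" heuristic ends up treating Perl file names as hostnames.

```python
import re

# Hypothetical, deliberately naive heuristic: any dotted token whose last
# label is a known two-letter TLD is treated as a hostname.
CCTLDS = {"pl", "pm", "de", "uk"}  # illustrative subset only
CANDIDATE = re.compile(
    r"\b([a-z0-9_-]+(?:\.[a-z0-9_-]+)*\.([a-z]{2}))(?![\w.])", re.I
)

def naive_domains(text):
    """Return tokens the naive heuristic would hand to URI lookups."""
    return [m.group(1) for m in CANDIDATE.finditer(text)
            if m.group(2).lower() in CCTLDS]

# Paths taken from the example above, without the status column.
listing = """\
fpo/reports/perl/nagios_notifications1.pl.bak
foo/reports/perl/nagios_outages1.pl
foo/reports/perl/GWIR.pm
"""
print(naive_domains(listing))
```

With hundreds of such paths in one notification, every false "hostname" would fan out into the per-URI lookups, which is consistent with the slowdown described.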

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Question - How many of you run ALL your email through SA?

2007-08-17 Thread Eric A. Hall


On 8/16/2007 12:39 PM, Marc Perkel wrote:
 OK - it's interesting that of all of you who responded this is the only 
 person who is doing it right. I have to say that I'm somewhat surprised 
 that so few people are preprocessing their email to reduce the SA load. 
 As we all know SA is very processor and memory expensive.
 
 Personally, I'm filtering 1600 domains and I route less than 1% of 
 incoming email through SA. SA does do a good job on the remaining 1% 
 that I can't figure out with blacklists and whitelists and Exim tricks, 
 but if I ran everything through SA I'd have to have a rack of dedicated 
 SA servers.

third-party blacklists are good indicators but they are not perfectly
accurate. the errors make them unsuitable as a sole metric, but they are
by definition very good inputs for spamassassin's probability scoring systems.

for those of us that can afford this approach it works very well. I'm
sorry you can't, but that's not our fault.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Question - How many of you run ALL your email through SA?

2007-08-15 Thread Eric A. Hall


On 8/15/2007 11:11 PM, Marc Perkel wrote:
 As opposed to preprocessing before using SA to reduce the load. (ie. 
 using blacklist and whitelist before SA)

All email sent to port 25 goes through SA for processing. Postfix has a
couple of regular expressions and some behavioral stuff (invalid commands,
invalid recipients, etc), but otherwise it just looks for the spam score
and if it's too high the transfer is rejected.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: plugin to test attachments from unknown senders

2007-08-11 Thread Eric A. Hall


On 7/14/2007 3:49 PM, Eric A. Hall wrote:
 Like other folks I've been getting hit with the PDF spam pretty hard. I
 think the way to solve this and the image spam in general is to do a
 plugin that does two things:
 
  1) looks in the message to see if there is a binary attachment
 
  2) looks in the AWL to see if the sender tuple is known
 
  3) if (1==true) && (2==false) fire a score

I was able to do this with basic rules. Note the low (0.1) scores. It
would be nice to use this as a DEFER check in the MTA, since resends will
hit the AWL rule and get cleared.

#
# This rule looks for in-line MIME Content-Type headers of various
# types, and then looks to see if the sender tuple is already known
# to the autowhitelist system. If the message contains a binary
# attachment and the sender tuple is unknown, fire a rule that tells
# us that the message is a gift from a stranger.
#

mimeheader  __L_C_TYPE_APP    Content-Type =~ /^application/i
mimeheader  __L_C_TYPE_IMAGE  Content-Type =~ /^image/i
mimeheader  __L_C_TYPE_AUDIO  Content-Type =~ /^audio/i
mimeheader  __L_C_TYPE_VIDEO  Content-Type =~ /^video/i
mimeheader  __L_C_TYPE_MODEL  Content-Type =~ /^model/i

meta        L_STRANGER_APP    (!AWL && __L_C_TYPE_APP)
score       L_STRANGER_APP    0.1
tflags      L_STRANGER_APP    noautolearn
priority    L_STRANGER_APP    1001 # defer till after AWL

meta        L_STRANGER_IMAGE  (!AWL && __L_C_TYPE_IMAGE)
score       L_STRANGER_IMAGE  0.1
tflags      L_STRANGER_IMAGE  noautolearn
priority    L_STRANGER_IMAGE  1001 # defer till after AWL

meta        L_STRANGER_AUDIO  (!AWL && __L_C_TYPE_AUDIO)
score       L_STRANGER_AUDIO  0.1
tflags      L_STRANGER_AUDIO  noautolearn
priority    L_STRANGER_AUDIO  1001 # defer till after AWL

meta        L_STRANGER_VIDEO  (!AWL && __L_C_TYPE_VIDEO)
score       L_STRANGER_VIDEO  0.1
tflags      L_STRANGER_VIDEO  noautolearn
priority    L_STRANGER_VIDEO  1001 # defer till after AWL

meta        L_STRANGER_MODEL  (!AWL && __L_C_TYPE_MODEL)
score       L_STRANGER_MODEL  0.1
tflags      L_STRANGER_MODEL  noautolearn
priority    L_STRANGER_MODEL  1001 # defer till after AWL



-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

some of you have bad meta rules...

2007-08-10 Thread Eric A. Hall


noticed this in the debug output while upgrading

[10637] dbg: rules: meta test DIGEST_MULTIPLE has undefined dependency
'DCC_CHECK'
[10637] info: rules: meta test FM__TIMES_2 has dependency
'FH_HOST_EQ_D_D_D_D' with a zero score
[10637] info: rules: meta test FM_SEX_HOST has dependency
'FH_HOST_EQ_D_D_D_D' with a zero score
[10637] dbg: rules: meta test SARE_RD_SAFE has undefined dependency
'SARE_RD_SAFE_MKSHRT'
[10637] dbg: rules: meta test SARE_RD_SAFE has undefined dependency
'SARE_RD_SAFE_GT'
[10637] dbg: rules: meta test SARE_RD_SAFE has undefined dependency
'SARE_RD_SAFE_TINY'
[10637] info: rules: meta test HS_PHARMA_1 has dependency
'HS_SUBJ_ONLINE_PHARMACEUTICAL' with a zero score
[10637] dbg: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency
'SARE_XMAIL_SUSP2'
[10637] dbg: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency
'SARE_HEAD_XAUTH_WARN'
[10637] dbg: rules: meta test SARE_HEAD_SUBJ_RAND has undefined dependency
'X_AUTH_WARN_FAKED'

don't feel bad, I had some broken ones myself :)

--lint probably ought to be extended to catch meta rules btw

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

plugin to test attachments from unknown senders

2007-07-14 Thread Eric A. Hall


Like other folks I've been getting hit with the PDF spam pretty hard. I
think the way to solve this and the image spam in general is to do a
plugin that does two things:

 1) looks in the message to see if there is a binary attachment

 2) looks in the AWL to see if the sender tuple is known

 3) if (1==true) && (2==false) fire a score

I've been meaning to adapt my SAGREY plugin [1] for this but have not had
time and may not have time for a while yet, so I thought I'd throw this
out there to see if anybody else is interested in doing it

[1] http://www.ntrg.com/misc/sagrey/

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Rule suggestion - smtp sanity

2007-07-14 Thread Eric A. Hall


On 7/13/2007 11:04 AM, arni wrote:
  From large providers i sometimes recieve messages through encrypted 
 smtp, the header looks smth like this (qmail):
 
 ...  with (AES256-SHA encrypted) SMTP; ...
 
 
 Would it be a good idea to give a minimal negative score on this -0.1 or 
 -0.2 if this happens on the last hop? - It proves that the sending smtp 
 server is very protocol sane, which spambots are usually not.

It's a good idea to look at last-hop transfer and see if it used STARTTLS,
if the certificate was valid, etc., and is something I've got on my to-do
list for future development.

The big problem is that there is no real standard and every MTA records
the details differently.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Rule based on X Greylist header

2007-03-13 Thread Eric A. Hall


On 3/13/2007 2:40 PM, Arjun Datta wrote:

 when milter-greylist detects a user that has passed SMTP AUTH - it does not
 delay it and adds a header:
 
 X Greylist: Sender succeeded SMTP Authentication, not delayed by
 milter-greylist 0.3
 
 Now, how do I add a rule to spamassassin that assigns a negative score to
 emails (like whitelisting where it adds a score of -100) that are detected
 with that header so that spamass-milter will not scan those emails.

Assuming you mean X-Greylist instead of X Greylist, something like the
following will either work or get you close:

header L_MILTER_GREY X-Greylist =~ /^Sender succeeded SMTP Authentication/
score  L_MILTER_GREY -100

put that into a cf file in one of your rules directories

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Annoying stocks scams

2007-03-06 Thread Eric A. Hall


On 3/6/2007 5:30 AM, [EMAIL PROTECTED] wrote:

 It's my first meta rule, which only gives a score if both conditions are 
 true, and I was wondering if there's a possibility to make the score more 
 intelligent :

my local rules use combinations. any message that hits AT LEAST one rule
gets the L_STOCKS_1 match. messages that hit more than one ALSO get a
separate score, in addition to L_STOCKS_1:

meta      L_STOCKS_1  (__L_STOCKS_01 || __L_STOCKS_02 ||
__L_STOCKS_03 || __L_STOCKS_04 || __L_STOCKS_05 || __L_STOCKS_06 ||
__L_STOCKS_07 || __L_STOCKS_08 || __L_STOCKS_09 || __L_STOCKS_10 ||
__L_STOCKS_11 || __L_STOCKS_12 || __L_STOCKS_13 || __L_STOCKS_14 ||
__L_STOCKS_15 || __L_STOCKS_16 || __L_STOCKS_17 || __L_STOCKS_18 ||
__L_STOCKS_19 || __L_STOCKS_20 || __L_STOCKS_21 || __L_STOCKS_22 ||
__L_STOCKS_23 || __L_STOCKS_24 || __L_STOCKS_25 || __L_STOCKS_26 ||
__L_STOCKS_27 )
describe  L_STOCKS_1  One or more stock markers
score     L_STOCKS_1  1.0

meta      L_STOCKS_2  (( __L_STOCKS_01 + __L_STOCKS_02 +
__L_STOCKS_03 + __L_STOCKS_04 + __L_STOCKS_05 + __L_STOCKS_06 +
__L_STOCKS_07 + __L_STOCKS_08 + __L_STOCKS_09 + __L_STOCKS_10 +
__L_STOCKS_11 + __L_STOCKS_12 + __L_STOCKS_13 + __L_STOCKS_14 +
__L_STOCKS_15 + __L_STOCKS_16 + __L_STOCKS_17 + __L_STOCKS_18 +
__L_STOCKS_19 + __L_STOCKS_20 + __L_STOCKS_21 + __L_STOCKS_22 +
__L_STOCKS_23 + __L_STOCKS_24 + __L_STOCKS_25 + __L_STOCKS_26 +
__L_STOCKS_27 ) == 2)
describe  L_STOCKS_2  Two stock markers
score     L_STOCKS_2  4.0

meta      L_STOCKS_3  (( __L_STOCKS_01 + __L_STOCKS_02 +
__L_STOCKS_03 + __L_STOCKS_04 + __L_STOCKS_05 + __L_STOCKS_06 +
__L_STOCKS_07 + __L_STOCKS_08 + __L_STOCKS_09 + __L_STOCKS_10 +
__L_STOCKS_11 + __L_STOCKS_12 + __L_STOCKS_13 + __L_STOCKS_14 +
__L_STOCKS_15 + __L_STOCKS_16 + __L_STOCKS_17 + __L_STOCKS_18 +
__L_STOCKS_19 + __L_STOCKS_20 + __L_STOCKS_21 + __L_STOCKS_22 +
__L_STOCKS_23 + __L_STOCKS_24 + __L_STOCKS_25 + __L_STOCKS_26 +
__L_STOCKS_27 ) == 3)
describe  L_STOCKS_3  Three stock markers
score     L_STOCKS_3  9.0

meta      L_STOCKS_4  (( __L_STOCKS_01 + __L_STOCKS_02 +
__L_STOCKS_03 + __L_STOCKS_04 + __L_STOCKS_05 + __L_STOCKS_06 +
__L_STOCKS_07 + __L_STOCKS_08 + __L_STOCKS_09 + __L_STOCKS_10 +
__L_STOCKS_11 + __L_STOCKS_12 + __L_STOCKS_13 + __L_STOCKS_14 +
__L_STOCKS_15 + __L_STOCKS_16 + __L_STOCKS_17 + __L_STOCKS_18 +
__L_STOCKS_19 + __L_STOCKS_20 + __L_STOCKS_21 + __L_STOCKS_22 +
__L_STOCKS_23 + __L_STOCKS_24 + __L_STOCKS_25 + __L_STOCKS_26 +
__L_STOCKS_27 ) > 3)
describe  L_STOCKS_4  Four or more stock markers
score     L_STOCKS_4  20.0

My scores are high because I have some mail accounts on other networks
that are lightly whitelisted and I need to hit the spams that come from
there. Do not use those scores or else you will fry mailing lists etc.
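
The tiering can be checked with a small Python sketch. The scores are copied from the rules above, the function name is hypothetical, and the last tier is taken to mean more than three hits, as its description ("Four or more stock markers") says.

```python
# Hypothetical helper mirroring the meta rules above; not SpamAssassin code.
def stock_rules_fired(marker_hits: int) -> dict:
    """Map the number of __L_STOCKS_nn sub-rules that hit to the meta rules that fire."""
    fired = {}
    if marker_hits >= 1:
        fired["L_STOCKS_1"] = 1.0   # any hit at all
    if marker_hits == 2:
        fired["L_STOCKS_2"] = 4.0   # added on top of L_STOCKS_1
    elif marker_hits == 3:
        fired["L_STOCKS_3"] = 9.0
    elif marker_hits > 3:
        fired["L_STOCKS_4"] = 20.0
    return fired

for hits in range(0, 6):
    fired = stock_rules_fired(hits)
    print(hits, sorted(fired), sum(fired.values()))
```

Because L_STOCKS_1 always fires alongside the higher tier, the combined contribution steps through 1, 5, 10 and 21 points.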

[Fwd: Re: POSIBLE SPAM Re: Annoying stocks scams]

2007-03-06 Thread Eric A. Hall


please suspend this user's mailing list account


---BeginMessage---
*** Automatic Message ***
This user is not available. For any matter, please contact
Leandro Gayango [EMAIL PROTECTED]

***

 ehall 03/06/07 19:24 

Spam detection software, running on the system
vm-antispam2.mpsistemas.es, has
identified this incoming email as possible spam.  The original message
has been attached to this so you can view it (if it isn't spam) or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  On 3/6/2007 5:30 AM, [EMAIL PROTECTED] wrote: 
It's
  my first meta rule, which only gives a score if both conditions are 
  true, and I was wondering if there's a possibility to make the score
  more  intelligent : [...] 

Content analysis details:   (5.1 points, 4.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.0 MY_DSL                 I could use a BL for this.
 0.5 NO_RDNS                Sending MTA has no reverse DNS (Postfix variant)
 0.2 MR_NOT_ATTRIBUTED_IP   Beta rule: an non-attributed IPv4 found in headers
 0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
                            [score: 0.5000]
 2.0 RATWR10_MESSID         Message-ID has ratware pattern (HEXHEX.HEXHEX@)
 0.4 UPPERCASE_50_75        message body is 50-75% uppercase
 0.0 NO_RDNS2               Sending MTA has no reverse DNS
 1.0 RCVD_IN_SORBS          RCVD_IN_SORBS

---End Message---

feature req

2007-02-15 Thread Eric A. Hall


need a --show-rule option to spamassassin cmd that will display all the
information associated with a named rule (DESC, SCORE, rule syntax, etc)

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: feature req

2007-02-15 Thread Eric A. Hall


On 2/15/2007 8:53 AM, Justin Mason wrote:
 Eric A. Hall writes:
 need a --show-rule option to spamassassin cmd that will display all the
 information associated with a named rule (DESC, SCORE, rule syntax, etc)
 
 could you open a feat req on the bugzilla?  it'll get lost otherwise.

bug 5335

 for what it's worth, we already (internally) use a tool called
 build/parse-rules-for-masses, which parses and generates a perl data
 structure representing the rules.  If you're consuming it in perl,
 that'd be a good way to do it.

I'm looking at it from operator perspective--it's very time consuming to
track down information about a rule and I'm thinking that this would make
the process simpler.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: How to deal with mailing list spam?

2007-01-24 Thread Eric A. Hall


On 1/24/2007 3:29 PM, Chris Purves wrote:
 I was wondering what is the best way to deal with spam that comes 
 through on mailing lists?  For mailing lists like spamassassin I 
 whitelist all mail because I expect to see examples of spam, but for 
 other lists, is it a good idea to run 'sa-learn --spam'?  What about 
 reporting those spam to razor/pyzor or spamcop?

1) subscribe to lists that are well run

2) whitelist the envelope-sender address, or the originating network, or
in some cases you may also want to whitelist the list address itself so
that directed replies that are TO you but CC the list also get boosted


-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: One person to filter spam

2007-01-24 Thread Eric A. Hall


On 1/24/2007 12:37 PM, IT_Architect wrote:
 I'm thinking about using SpamAssassin.  Is it possible to have the suspected
 spam to one account to have one person clear or delete possible spam.  When
 they say it's good, will it then go to the correct user?

It's possible to do that but not with spamassassin alone. You'll need a
mailer that can resubmit the messages after they've been cleared, at the
very least. That's trivial, but it's not something spamassassin does.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: INFO_TLD

2007-01-17 Thread Eric A. Hall


On 1/16/2007 1:52 AM, Eric A. Hall wrote:
 On 1/16/2007 12:06 AM, Theo Van Dinter wrote:
 On Mon, Jan 15, 2007 at 10:44:33PM -0500, Eric A. Hall wrote:
 sa-update nuked INFO_TLD which I was still finding useful
 can somebody with the rule send it to me? thanks

One of the aggressive porno spammers is all about the .info so in case
anybody else is looking for these

uri       INFO_TLD  /\.info(?::\d+)?(?:\/|$)/i
describe  INFO_TLD  Contains an URL in the INFO top-level domain
score     INFO_TLD  1.0

btw, I run with a lot higher than 1.0 here
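
To see what the pattern accepts, here is a quick Python check. The regex is the one from the rule, minus the slash escaping that only the /.../ delimiters need. The sample URLs are made up, and how SpamAssassin extracts and normalizes URIs before matching is not modelled here.

```python
import re

# Same pattern as the INFO_TLD rule, written as a Python regex.
INFO_TLD = re.compile(r"\.info(?::\d+)?(?:/|$)", re.I)

samples = [
    "http://example.info/",                 # plain .info host
    "http://example.INFO:8080/offer",       # upper case, explicit port
    "http://example.info",                  # no trailing slash
    "http://www.information.example.com/",  # "info" inside a longer label
    "http://example.com/page.info",         # path that happens to end in .info
]
for uri in samples:
    print(uri, bool(INFO_TLD.search(uri)))
```

Note the last sample: in this Python check the pattern also matches a path that ends in ".info", so the rule is not strictly limited to the hostname part.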

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

INFO_TLD

2007-01-15 Thread Eric A. Hall


sa-update nuked INFO_TLD which I was still finding useful

can somebody with the rule send it to me? thanks


-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: INFO_TLD

2007-01-15 Thread Eric A. Hall


On 1/16/2007 12:06 AM, Theo Van Dinter wrote:
 On Mon, Jan 15, 2007 at 10:44:33PM -0500, Eric A. Hall wrote:
 sa-update nuked INFO_TLD which I was still finding useful
 can somebody with the rule send it to me? thanks
 
 It's pretty straightforward to write, but the rule still exists in the
 standard 3.1 install.  Check out /usr/share/spamassassin or the tarball.

got it

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

would SA benefit from port to Java

2006-11-17 Thread Eric A. Hall


Thinking about the GPL Java announcement some, and trying to imagine the
kinds of opportunities this allows for, it occurs to me that SpamAssassin
might be a natural fit for Java.

I'm just thinking out loud here, not advocating anything...

Would it run better? Would it be faster, have smaller memory footprint,
better reclamation, better hooks for plugins etc? OTOH, would it be harder
to build, given the dependence of SA on perl modules?

Thoughts?

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Feature Request: envelope scanning

2006-10-26 Thread Eric A. Hall


On 10/25/2006 7:15 PM, Mark Martinec wrote:

 For envelope sender there is a standard header: Return-Path

Return-Path is supposed to be added when the message is placed in the
mailstore (ie, last hop, after the transfer network). Since I do scanning
at the MTA level before delivery, I don't have Return-Path yet.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Scoring PTR's

2006-10-25 Thread Eric A. Hall


On 10/24/2006 4:01 PM, John Rudd wrote:
 Eric A. Hall wrote:

 Note that this is entirely legal, and even necessary:

 [ root# ] host 207.65.71.14
 14.71.65.207.in-addr.arpa is an alias for 14.in-addr.ntrg.com.
 14.in-addr.ntrg.com is an alias for 14.in-addr.labs.ntrg.com.
 14.in-addr.labs.ntrg.com domain name pointer bulldog.labs.ntrg.com.
 
 All of that's ok.  The question is:
 
 is bulldog.labs.ntrg.com an A record, or a CNAME record?
 
 That's the thing I have been testing for (is it a CNAME).  That's the 
 thing that RFC1912 doesn't like (the PTR record itself, not merely 
 in-addr.arpa aliases that eventually get to the PTR record, but the PTR 
 record itself, may not _refer_to_ a CNAME record, it must refer to an A 
 record)

There's nothing that prohibits the target domain name entry of a PTR from
having a CNAME record. A PTR is just a pointer to some other domain
name. The target domain name can have whatever records the owner feels
they need. It's probably something that should be discouraged, since
additional processing would be needed to obtain a complete answer, but on
its face it's not illegal (again RFC1912 is informational, is not
authoritative, and has significant errors).

You'd probably need a plugin to check for this, since you'd need to
generate your own query for the RRs associated with the target domain name
in order to get a definitive answer.

I'm not really sure this would be a reliable spam-sign. I can imagine some
legitimate uses for this.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Feature Request: envelope scanning

2006-10-25 Thread Eric A. Hall


On 10/25/2006 2:35 PM, Joe Flowers wrote:

 If I pre-pend a message's Envelope to it's Body, can Spamassassin do 
 anything useful with it?

At a minimum you can use the envelope recipient(s) to do some kinds of
spam-trap filtering (eg, is the message addressed to a spamtrap and me).
You can use the envelope sender to do some kinds of whitelisting too (such
as whitelisting your aunt at yahoo even if you have the whole yahoo
domain otherwise blacklisted, or whitelisting a mailing list sender). My
LDAPfilter plugin (http://www.ehsco.com/misc/ldapfilter/) uses them for
these kinds of purposes.

Other possibilities exist too. Envelope sender can be used for some SPF
filters that aren't currently done, for example.

The first problem is that there is no standard header field, and in the
case of envelope recipient(s) where there can be multiple entries, there
is no standard for the field data. I use X-Envelope-To and X-Envelope-From
with typical RFC822 address syntaxes (no real name blob, etc), but only
because I had nothing else to use and that structure seems to be the most
obvious and least harmful.

Another consideration is that they have to be created by the MTA, and
spamassassin doesn't have possession of the envelope data so it can't
create them. In my case I have to make postfix generate them in order for
them to be usable, and the LDAPfilter plugin has .cf options that point to
the header fields in question (eg, ldapfilter_env_from_header).

But yeah, if they are provided and if there is a way to tell spamassassin
where to look, they are very useful.
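
As an illustration of the header shape only, here is a Python sketch that tags a message with the two fields. In practice the MTA has to do this, since only it holds the envelope. The function name and sample addresses are hypothetical, and joining multiple recipients with commas is an assumption, because (as noted above) no standard defines the field data.

```python
from email import message_from_string

def add_envelope_headers(raw_message, env_from, env_recipients):
    """Return a parsed message with the envelope copied into X-Envelope-* headers."""
    msg = message_from_string(raw_message)
    msg["X-Envelope-From"] = env_from
    # Assumption: multiple envelope recipients are joined with commas.
    msg["X-Envelope-To"] = ", ".join(env_recipients)
    return msg

raw = (
    "From: Aunt <aunt@example.com>\n"
    "To: list@example.org\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)
tagged = add_envelope_headers(
    raw, "aunt@example.com", ["me@example.net", "trap@example.net"]
)
print(tagged.as_string())
```

A rule or plugin can then read the envelope data as ordinary header fields, for example to notice that a message was addressed to both a real user and a spamtrap.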

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Scoring PTR's

2006-10-24 Thread Eric A. Hall


On 10/23/2006 10:50 PM, John Rudd wrote:
 Eric A. Hall wrote:
 On 10/23/2006 7:01 PM, John Rudd wrote:

 a) does the hostname in the PTR record point to a CNAME instead of an A 
 record

 That's not illegal. It's pretty common too, since subnet delegation of
 in-addr space only works on /8, /16 and /24 subnets due to the way that
 octets are mapped to domain name labels in that hierarchy.
 
 RFC 1912 says don't do that :-)

RFC1912 is informational and non-authoritative. It has some big errors (ie, it
says a label may not be all-numeric, which is wrong).

 Though, honestly, I've yet to see it actually get triggered in my 
 mimedefang filter, so I don't mind losing it.

Can you clarify what you are looking for here?

Note that this is entirely legal, and even necessary:

[ root# ] host 207.65.71.14
14.71.65.207.in-addr.arpa is an alias for 14.in-addr.ntrg.com.
14.in-addr.ntrg.com is an alias for 14.in-addr.labs.ntrg.com.
14.in-addr.labs.ntrg.com domain name pointer bulldog.labs.ntrg.com.

In that example, the entry for 14.71.65.207.in-addr.arpa. has a CNAME RR
pointing to 14.in-addr.ntrg.com. (the entry has been delegated to my zone
using a CNAME), which in turn aliases to 14.in-addr.labs.ntrg.com., which
in turn has a PTR record that resolves to bulldog.labs.ntrg.com.

A PTR record is just a pointer to some other domain name and only has
semantic meaning when lookups are keyed to a name in the in-addr.arpa.
hierarchy.
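
The lookup chain above can be traced with a short Python sketch. The record table is transcribed from the host output, but its dict layout and the function name are assumptions for illustration; no real DNS queries are made.

```python
import ipaddress

# Records transcribed from the `host` output above (trailing dots omitted).
RECORDS = {
    "14.71.65.207.in-addr.arpa": ("CNAME", "14.in-addr.ntrg.com"),
    "14.in-addr.ntrg.com": ("CNAME", "14.in-addr.labs.ntrg.com"),
    "14.in-addr.labs.ntrg.com": ("PTR", "bulldog.labs.ntrg.com"),
}

def resolve_ptr(ip, records, max_hops=8):
    """Follow CNAME aliases from the in-addr.arpa name until a PTR is found."""
    # reverse_pointer reverses the octets and appends in-addr.arpa
    name = ipaddress.ip_address(ip).reverse_pointer
    for _ in range(max_hops):
        rtype, target = records[name]
        if rtype == "PTR":
            return target
        name = target  # CNAME: restart the lookup at the alias target
    raise RuntimeError("alias chain too long")

print(ipaddress.ip_address("207.65.71.14").reverse_pointer)
print(resolve_ptr("207.65.71.14", RECORDS))
```

The aliases sit on the in-addr.arpa side of the chain, before the PTR is reached, which is a different question from whether the PTR's target name itself has a CNAME.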

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Scoring PTR's

2006-10-24 Thread Eric A. Hall


On 10/24/2006 7:55 AM, John Rudd wrote:

 Here's an example for one I got tonight (I got 3, but trashed the others 
 before thinking I should send that as an example).
 
 (i577A0BC3.versanet.de [87.122.11.195])
 
 577A0BC3 is the hex encoding of the IP address, with no separators.

That may be spam-sign, but unless there's something more than what you're
showing it's not a standards violation.
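
For anyone who wants to test for this, a small Python sketch of the check follows. The function name is hypothetical; it simply packs the relay's IPv4 address into eight hex digits and looks for them in the hostname.

```python
import ipaddress

def hex_ip_in_hostname(hostname: str, ip: str) -> bool:
    """True if the hostname embeds the IPv4 address as 8 hex digits, no separators."""
    packed_hex = ipaddress.IPv4Address(ip).packed.hex()  # four octets, two hex digits each
    return packed_hex in hostname.lower()

print(ipaddress.IPv4Address("87.122.11.195").packed.hex())
print(hex_ip_in_hostname("i577A0BC3.versanet.de", "87.122.11.195"))
print(hex_ip_in_hostname("mail.example.com", "87.122.11.195"))
```

Octet by octet, 0x57 = 87, 0x7A = 122, 0x0B = 11 and 0xC3 = 195, which matches the example in the quoted message.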

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Scoring PTR's

2006-10-23 Thread Eric A. Hall


On 10/23/2006 7:01 PM, John Rudd wrote:
 Eric A. Hall wrote:
 http://www.ehsco.com/misc/spamassassin/std_compliance.cf might help or
 work for what you're doing.

 Make sure to read the disclaimers and warnings
 
 Those helped a lot.  There's only three checks I can't do with them 
 (probably need to use a plugin for it):
 
 a) does the hostname in the PTR record point to a CNAME instead of an A 
 record

That's not illegal. It's pretty common too, since subnet delegation of
in-addr space only works on /8, /16 and /24 subnets due to the way that
octets are mapped to domain name labels in that hierarchy.

 b) does the hostname contain it's IP address in _hex_ form (instead of 
 in decimal form, which I've already got working)

I don't recall ever seeing that. If you create a rule for that you might
also want to cover octal notation, which is another valid address
encoding syntax that should never appear naturally.

 c) does the hostname in the PTR record actually going to an A record 
 which includes the relay's IP addr

that's a reasonable test

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: R: R: Scoring PTR's

2006-10-20 Thread Eric A. Hall


On 10/20/2006 10:43 AM, Giampaolo Tomassoni wrote:
 RFC 2821 Section 4.1.4 Order of Commands ...

 An SMTP server MAY verify that the domain name parameter in the EHLO
 command actually corresponds to the IP address of the client.
 However, the server MUST NOT refuse to accept a message for this
 reason if the verification fails: the information about verification
 failure is for logging and tracing only. ...
 
 It can mean whatever you like (do note MAY and MUST NOT though).
 
 It just mean you can't drop a message based solely on the parameter of 
 the EHLO command. You MAY check it, if you like to. But you MUST NOT 
 drop it.

2821 is for implementors, not operators. Software developers must not
automatically drop mail for this reason --as a matter of design-- but as
an operator you can do whatever you want with any piece of mail.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Scoring PTR's

2006-10-19 Thread Eric A. Hall


http://www.ehsco.com/misc/spamassassin/std_compliance.cf might help or
work for what you're doing.

Make sure to read the disclaimers and warnings

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: R: R: Scoring PTR's

2006-10-19 Thread Eric A. Hall


On 10/19/2006 7:11 PM, John Rudd wrote:

 It is my observation that the messages which come from an immediately 
 relay that:
 
 A) does not have a PTR record, or
 
 B) has forged DNS (PTR record doesn't lead to an A record which
 resolves back to the SMTP client's IP address), or
 
 C) has a hostname that appears to be an end-client of some other
 network than my own (contains its own IP addr in the hostname, contains
 words like dynamic, dsl, dial-up, etc.)
 
 are generating spam. 

It's a bigger list than that but yeah. My theory is that if they can't get
their network configured, no telling what else is broken, so I flag it.

 In order to exempt my own legitimate users, I skip the check if they're
 on my IP block OR if they do SMTP-AUTH.

I've got two listeners, one for SMTP 25, one for SUBMIT 587. The latter
only allows authenticated sessions. Mail sent to the former is heavily
inspected while the session is active, while mail to the latter bypasses
the filters altogether.

 The one thing I'm thinking about changing is, at home I _reject_ 
 messages that fail these checks (using filter_sender in mimedefang). 
 I'm thinking that, for the production systems at work, just to cover 
 that incredibly small percentage of people who can't or won't use their
 ISP's mail server or do SMTP-AUTH, I'll merely quarantine their
 messages, via spam assassin score, instead of rejecting them.

Yeah, I moved almost everything out of postfix and into spamassassin so
that I could work on probability instead of binary. Just make sure to
whitelist all traffic for any mailing list that you're on, possibly
including to/cc whitelists.


Re: Any comments of the SpamHaus lawsuit?

2006-10-16 Thread Eric A. Hall


On 10/11/2006 1:16 AM, Jason Haar wrote:

 If Spamhaus lose this lawsuit (which they are ignoring as they are

they lost it a while ago. summary judgement was in september. the latest
papers are because spamhaus didn't comply with the default judgement.
pretty routine stuff.

 Americans to arms I say... Start sending Internet for Dummies to the
 judge for starters ;-)

Actually the judge seems to have left an out for spamhaus in the original
default judgement. Excerpting myself:

The original default injunctive order states that Spamhaus must not
interfere with Mr. Linhardt's e-mail messages "...unless Spamhaus can
demonstrate by clear and convincing evidence that Plaintiffs have violated
relevant United States law." Well, that should be easy enough--there are
millions of people who have received his spam, and it seems to be in
violation of the CAN-SPAM act as I know it (and the judge might know it,
too). Once demonstrated, the injunction would be partially lifted
automatically. Better yet, affected parties could then pursue damages
against Mr. Linhardt of their own, thus forcing him to back down.

The problem here is that Spamhaus isn't subject to U.S. jurisdiction (as
it has argued itself) and so isn't eligible for relief under the CAN-SPAM
act, either. Instead, it needs a U.S.-based partner to pursue this angle
on its behalf. Worse, due to the way that the CAN-SPAM act is written,
only certain parties can sue for damages, which further limits the pool of
potential partners. However, many of the organizations that are eligible
for relief are also some of Spamhaus' biggest beneficiaries (namely the
ISPs that rely most on its filters), and so there should be a natural pool
of willing partners for Spamhaus to choose from.

http://www.informationweek.com/blog/main/archives/2006/10/spamhaus_needs.html


Re: double letter porn

2006-10-04 Thread Eric A. Hall


On 10/4/2006 5:57 PM, Richard Doyle wrote:
 I've been getting lots of porn site spam containing words with doubled
 letters, like this one:

 Can anybody suggest a rule or ruleset to catch these double-letter
 obfuscations? I'm using Spamassassin 3.1.4.

You'd probably need to write a plug-in that used some kind of
typo-matching logic to find porno words.

Would be a good plug-in actually. Get busy :)


Re: testing for empty text/plain

2006-08-07 Thread Eric A. Hall


On 8/7/2006 12:25 AM, Theo Van Dinter wrote:
 On Mon, Aug 07, 2006 at 12:07:58AM -0400, Eric A. Hall wrote:
 Anybody written a rule that tests for empty text/plain, preferably only
 when a non-empty text/html or some other media-type is provided?
 
 Sounds very similar to MPART_ALT_DIFF.

That might be useful as a pre-test filter, such as looking to see if
MPART_ALT_DIFF fired before doing anything else. From there I can grep to
see if text/plain has any printable characters.

What's the most efficient way to grab the text/plain part?
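
Outside of SpamAssassin, the test being asked about can be sketched with Python's standard email package. This is only an illustration of the logic (blank text/plain alongside some other part that has content); the function name is made up, and it is not a SpamAssassin rule or plugin.

```python
from email import message_from_string


def empty_plain_with_other_content(raw):
    """True if a text/plain part is blank while another part has content."""
    msg = message_from_string(raw)
    plain_blank = False
    other_has_content = False
    for part in msg.walk():
        if part.is_multipart():
            continue                      # containers carry no text themselves
        payload = part.get_payload(decode=True) or b""
        has_text = bool(payload.strip())
        if part.get_content_type() == "text/plain":
            if not has_text:
                plain_blank = True
        elif has_text:
            other_has_content = True
    return plain_blank and other_has_content
```

A message that consists of nothing but a blank text/plain part returns False here, since the second condition is not met.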


testing for empty text/plain

2006-08-06 Thread Eric A. Hall


Anybody written a rule that tests for empty text/plain, preferably only
when a non-empty text/html or some other media-type is provided?

Thanks


spec file for cpan2rpm and suse 9.3

2005-10-20 Thread Eric A. Hall


Anybody got one that works with gnome/evolution?

evolution requires spamassassin, which requires perl-spamassassin.
cpan2rpm makes perl-Mail-SpamAssassin, which doesn't satisfy either of
the packaging dependencies. Attempts on my part to tweak the spec file
generated with cpan2rpm have failed miserably.



Re: spec file for cpan2rpm and suse 9.3

2005-10-20 Thread Eric A. Hall


On 10/20/2005 1:27 PM, Eric A. Hall wrote:
 Anybody got one that works with gnome/evolution?
 
 evolution requires spamassassin, which requires perl-spamassassin.
 cpan2rpm makes perl-Mail-SpamAssassin, which doesn't satisfy either of
 the packaging dependencies. Attempts on my part to tweak the spec file
 generated with cpan2rpm have failed miserably.

fixed it by adding the following two lines to the header block:

  Provides: perl-spamassassin
  Provides: spamassassin

I was trying too hard before, thinking I needed version numbers etc.


Re: Individual scoring at SMTP time

2005-10-14 Thread Eric A. Hall


On 10/14/2005 6:40 AM, Magnus Holmgren wrote:

 If you want to reject spam at SMTP time (which I think all agree is a
 good practice as long as you weigh the risks against the benefits
 properly), but also want to apply individual settings (according to the
 "One person's ham is another person's spam." maxim), what is the best
 practice for handling multiple recipients (RCPT TO: commands)?

Best current practice for rejecting per-user at MTA level is to configure
your MTA so that it only allows one RCPT per message transfer, as you
spotted in your option #1.

 2. Run a global check at SMTP time, using a conservative ruleset
 (possibly including bayes with low scores) that only catches 100%
 certain spam, then let each user run SA a second time any way they want
 (but without the ability to reject, just accept/blackhole/file in spam
 folder as usual).

That's the better way, although it loses the ability to reject modest spam
at the gateway.

 3. Run SA with custom configs for each user at SMTP time. Reject if
    a) any user rejects,
    b) all users reject,
    c) the majority rejects,
    d) the average score is above the average limit,
    e) other criterion.
 (Potentially time-consuming with many recipients, risking that the
 sending MTA times out.)

You still need to limit recipients (actually more important, since all
processing load is now N*msg instead of 1*msg).
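
The rejection criteria a) through d) in the quoted option 3 can be written down compactly. The sketch below is hypothetical Python (the function name and argument layout are assumptions) and presumes the message has already been scanned once per recipient, which is exactly the N*msg cost noted above.

```python
def reject_message(scores, limits, policy="any"):
    """Decide whether to reject one message for several recipients.

    scores[i] is the message's score under recipient i's settings and
    limits[i] is that recipient's rejection threshold.
    """
    votes = [score >= limit for score, limit in zip(scores, limits)]
    if policy == "any":          # a) any user rejects
        return any(votes)
    if policy == "all":          # b) all users reject
        return all(votes)
    if policy == "majority":     # c) the majority rejects
        return sum(votes) * 2 > len(votes)
    if policy == "average":      # d) average score above average limit
        return sum(scores) / len(scores) > sum(limits) / len(limits)
    raise ValueError("unknown policy: " + policy)
```

With scores of 9.0, 4.0 and 6.0 against limits of 8.0, 8.0 and 5.0, the "any" and "majority" policies reject while "all" and "average" accept, which shows how much the choice of criterion matters.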

 6. Write an RFC about changing the SMTP protocol to allow DATA before
 RCPT TO: (if the mail is sufficiently short). :-)

There have been several drafts trying to attack this problem. Mine is at
http://www.ntrg.com/misc/I-Ds/draft-hall-inline-dsn-01.txt and suggests
returning per-RCPT response codes after the DATA ack (eg, user1 returns
250, while user2 returns 550).

 1 and 2 are easy to implement, but I don't know if someone has
 implemented support for 3 in current software.

It's feasible enough with some scripting work. The hard part is applying
the per-user settings into the chain (have to read the settings, apply
them to the local scanning process without clobbering the others, etc).
Really though the problem is load, since you are looking at multiples of
scanning processes.

FWIW, I wrote a primer on this kind of architecture for Network Computing
Magazine, archived at http://www.ehsco.com/reading/20040916ncf.html


Re: [SPAM] RE: GeoCities Link-only spam

2005-08-22 Thread Eric A. Hall


On 8/22/2005 3:34 PM, Derek Harding wrote:
 On Sun, 2005-08-21 at 20:05 -0400, Eric A. Hall wrote:
 
What's the benefit of using this instead of the uridnsbl plugin? The code
below will look for the IP address behind a URI and then query the
cn-kr.blackholes.us RBL to see if that addr is in China:
 
 This one doesn't require a DNS lookup which makes it faster.

IP::Country uses Whois lookups instead though, and UDP/DNS lookups are
going to be faster than chained TCP/Whois queries.

 blackholes.us only covers a limited set.

Just an example for discussion purposes (worth noting that their main web
site is down too). http://countries.nerd.dk/more.html is another one

I'll play with the plugin and see what kind of times and load I get


Re: [SPAM] RE: GeoCities Link-only spam

2005-08-22 Thread Eric A. Hall


On 8/22/2005 3:50 PM, Eric A. Hall wrote:

 IP::Country uses Whois lookups instead though, and UDP/DNS lookups are
 going to be faster than chained TCP/Whois queries.

 I'll play with the plugin and see what kind of times and load I get

After some poking around: IP::Country::Fast uses a pre-built mapping
database instead of issuing lookups (IP::Country::Slow) or caching lookups
(IP::Country::Medium). The pre-built database is stored in the .gif files
in /usr/lib/perl5/site_perl/5.8.6/IP/Country/Fast/ on my system, and
presumably this stuff gets repackaged when IP allocations change. This
means keeping the package synced, of course, but it does seem to be
somewhat faster and requires less overhead.
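
To illustrate why a pre-built table avoids per-message network lookups, here is a small Python sketch of a local range lookup. It does not reflect how IP::Country::Fast actually stores its data; the table, the country codes ("AA", "BB", "CC") and the function name are all made up, and the address ranges are documentation prefixes.

```python
import bisect
import ipaddress


def _n(addr):
    """Convert a dotted-quad string to an integer."""
    return int(ipaddress.ip_address(addr))


# Hypothetical sample data: (first address, last address, country code),
# sorted by first address. A real table would be built from registry data.
RANGES = [
    (_n("192.0.2.0"), _n("192.0.2.255"), "AA"),
    (_n("198.51.100.0"), _n("198.51.100.255"), "BB"),
    (_n("203.0.113.0"), _n("203.0.113.255"), "CC"),
]
STARTS = [first for first, _, _ in RANGES]


def country_for(ip):
    """Return the country code for ip, or None if no range covers it."""
    value = _n(ip)
    index = bisect.bisect_right(STARTS, value) - 1   # last range starting <= ip
    if index < 0:
        return None
    first, last, code = RANGES[index]
    return code if value <= last else None
```

The lookup is a binary search over data already in memory, so its cost does not depend on any remote server, but the table goes stale unless it is rebuilt when allocations change.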

BTW, lookups for dead domain names are really slow and block the rest of
the message processing.


Re: GeoCities Link-only spam

2005-08-22 Thread Eric A. Hall


On 8/22/2005 4:14 PM, Dallas L. Engelken wrote:

IP::Country uses Whois lookups instead though

 Hrmm?  Where does it say it uses Real-Time Whois lookups?

The documentation for IP::Country::Fast is empty and refers to IP::Country,
which describes the use of whois.

See my follow-up post though


Re: OT: sa-learn, interfaced with Cyrus mailboxes

2005-08-21 Thread Eric A. Hall


On 8/21/2005 1:59 AM, Forrest Aldrich wrote:
 I just switched over to Cyrus IMAP - and it didn't occur to me I'd need 
 to change several ways I report spam, due to the mailstore format.
 
 I wonder whom else is using Cyrus IMAP here, and how you may be handling 
 this.

I don't use sa-learn, but the Cyrus mailstore is basically just a hierarchy
of folders that each contain individual messages, each of which is its
own mbox file. Just read *. into sa-learn and it should work on
message-id the same as usual. Automatically moving the messages may be
more of a problem.

http://www.google.com/search?q=sa-learn+cyrus seems to return a bunch of
relevant matches


SAGrey plugin

2005-08-21 Thread Eric A. Hall


I've written a little plugin called SAGrey that provides a limited amount
of greylisting functionality using SpamAssassin's existing services.

SAGrey is two-phased, in that it first looks to see if the current score
of the current message exceeds the user-defined threshold value (as set in
one of the cf files), and then looks to see if the message sender's email
and IP address tuple are already known to the auto-whitelist (AWL)
repository. If the message is spam and the sender is unknown, SAGrey
assumes that this is one-time spam from a throwaway or zombie account, the
SAGREY rule fires, adds 1.0 to the current message score, and optionally
creates a header field in the message itself. The rulename or header field
can then be used to perform additional functions (EG, having your delivery
or transfer agent defer delivery), or the score by itself can be used to
penalize the message.
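
The two-phase decision described above can be sketched in a few lines. This is a Python illustration of the logic only, not the plugin's Perl code; the function name, the `seen_senders` container standing in for the AWL repository, and the `boost` parameter are assumptions.

```python
def sagrey_adjust(score, threshold, sender_key, seen_senders, boost=1.0):
    """Return (new_score, fired) for one message.

    seen_senders is any container of sender keys standing in for the AWL
    history; sender_key is the (email address, IP address) tuple.
    """
    is_spam = score > threshold               # phase 1: over the threshold?
    is_known = sender_key in seen_senders     # phase 2: already in the history?
    if is_spam and not is_known:
        return score + boost, True            # rule fires and adds the boost
    return score, False
```

A message scoring 6.0 against a 5.0 threshold from an unseen tuple comes back as 7.0 with the rule fired; the same message from a known tuple is left at 6.0.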

This model has two benefits over MTA-specific greylisting mechanisms:
first, it only subjects probable-spam to greylisting (instead of making
everybody be deferred, which has known problems), and it repurposes the
existing spamassassin history database (meaning no additional databases
need to be maintained). Another benefit is that it can still work at the
MTA level if your MTA can call spamassassin while the transfer is active
and then defer delivery based on the presence of header-field data
(postfix 2.x will not do this unfortunately, since the header checks don't
provide a DEFER verb), but can also be used in other models (such as
delivery routines).

The plugin and cf are posted at http://www.ntrg.com/misc/sagrey/ and I've
also updated the wiki


Re: [SPAM] RE: GeoCities Link-only spam

2005-08-21 Thread Eric A. Hall


On 8/8/2005 5:05 PM, Derek Harding wrote:

It allows rules such as:
uricountry  URICOUNTRY_CN   CN
header      URICOUNTRY_CN   eval:check_uricountry('URICOUNTRY_CN')
describe    URICOUNTRY_CN   Contains a URI hosted in China
tflags      URICOUNTRY_CN   net
score       URICOUNTRY_CN   2.0

What's the benefit of using this instead of the uridnsbl plugin? The code
below will look for the IP address behind a URI and then query the
cn-kr.blackholes.us RBL to see if that addr is in China:

  uridnsbl    URIBL_CNKR  cn-kr.blackholes.us     TXT
  body        URIBL_CNKR  eval:check_uridnsbl('URIBL_CNKR')
  tflags      URIBL_CNKR  net
  score       URIBL_CNKR  2.0

I'm sure there's a difference but I guess I'm not seeing it
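
For readers unfamiliar with how the DNSBL side works: a blocklist lookup for an IPv4 address conventionally reverses the octets and prepends them to the list's zone, then issues an ordinary DNS query for that name. A minimal Python sketch of the name construction, with a made-up function name and no network access:

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the name queried for an IPv4 address in a DNS blocklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone
```

So an address of 192.0.2.5 checked against the zone in the rule above would be looked up as 5.2.0.192.cn-kr.blackholes.us. Whether that zone still answers is a separate question.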


Re: having spamc/spamd include hostname?

2005-08-20 Thread Eric A. Hall


On 8/20/2005 4:22 PM, Dan Mahoney, System Admin wrote:

 basically I will have a different sql userpref for [EMAIL PROTECTED] or 
 [EMAIL PROTECTED], or different global defaults for hosta.com.  This seems 
 elementary to do, but I can't figure out how to make spamd tell which one 
 to use -- maybe based on the connecting ip, maybe based on a command 
 line/config file variable passed.

A simple plug-in would probably do the trick. You'd need to call the
Sys::Hostname::Long module yourself, since SA does not need or provide
the local hostname itself.


Re: messages with no body

2005-07-13 Thread Eric A. Hall


On 7/12/2005 8:59 PM, Loren Wilton wrote:

 Note that in business circles content includes the subject.  As far
 as I know, rawbody won't see a subject.  It is fairly common to send
 one line questions in the subject with an empty body, and one line
 replies likewise.

I have trained my users better than that, which is why I don't care about
these tests. Other people might tho.


Re: messages with no body

2005-07-12 Thread Eric A. Hall


On 7/10/2005 4:41 PM, Eric A. Hall wrote:
 On 7/10/2005 3:49 PM, Loren Wilton wrote:
 
However, if you want something like this, just off the top of my head:

header __HAS_TO    To =~ /\S/
body   __HAS_BODY  /\S/
meta   EMPTY_MSG   (!__HAS_TO && !__HAS_BODY)
 
 Good idea. rawbody works better but the model is right.

As was pointed out off-list, this rule will wrongly fire if there is an
attachment and no text body. The following rule is adapted from the
suggested rules that were provided (I assume anonymity was desired from
the off-list response so...). The rule checks for the presence of a nested
media-type (message/ and multipart/ are the only nested types) or the
presence of body data.

header   __L_MSG_HAS_C_TYPE_M  Content-Type =~ /^(message|multipart)/i
rawbody  __L_MSG_HAS_BODY      /\S/

describe L_MSG_NO_BODY         Raw message does not have any body data
meta     L_MSG_NO_BODY         (!__L_MSG_HAS_C_TYPE_M && !__L_MSG_HAS_BODY)
score    L_MSG_NO_BODY         0.1

There are lots of fancier things to look for but that is pretty minimal
testing which is what I'm looking for.

BTW, I am doing this so that postfix can trap the rule after the message
has undergone filtering, so that the message can simply be rejected
(there's no judgement as to spamminess here, just a check to see if the
message has any content).



Re: messages with no body

2005-07-10 Thread Eric A. Hall


On 7/10/2005 3:12 PM, Loren Wilton wrote:

 Anybody got a rule that will catch messages that don't have a body?
 
 
 There are things like that around.  I have a rather draconian personal
 rule I use.  There is a much milder form in one of the SARE rulesets.
 The problem is you can't check just missing body, as you will get way
 too many FPs in a business environment.

I guess I should have asked the obvious question:

and if so, could you post it?

thanks


Re: messages with no body

2005-07-10 Thread Eric A. Hall


On 7/10/2005 3:49 PM, Loren Wilton wrote:

 However, if you want something like this, just off the top of my head:
 
 header __HAS_TO    To =~ /\S/
 body   __HAS_BODY  /\S/
 meta   EMPTY_MSG   (!__HAS_TO && !__HAS_BODY)

Good idea. rawbody works better but the model is right.


Re: messages with no body

2005-07-10 Thread Eric A. Hall


On 7/10/2005 4:56 PM, Loren Wilton wrote:
 Rawbody will miss the subject, so you will need to add a test for that too.

I'm not looking for that


Re: OT: Insecure dependency in connect?

2005-06-17 Thread Eric A. Hall


On 6/16/2005 5:47 PM, Eric A. Hall wrote:
 I'm trying to update my ldap plugin to use SRV lookups for server
 discovery but am getting barked at during tests with the "Insecure
 dependency in connect..." error. I'm not having much luck with googling
 this error, but I remember this was a problem with razor and spamassassin
 before, and I'm wondering if anybody knows what the resolution was.

For the benefit of others, and for Google's cache:

  #
  # this stops IO::Socket from complaining about taint problems
  #
  if ($permsgstatus->{ldap_server} =~ /^(\S+)$/) {
      $permsgstatus->{ldap_server} = $1;
  }

I found that in an unofficial SA patch to razor and it seems to do the
trick here too.


OT: Insecure dependency in connect?

2005-06-16 Thread Eric A. Hall


I'm trying to update my ldap plugin to use SRV lookups for server
discovery but am getting barked at during tests with the "Insecure
dependency in connect..." error. I'm not having much luck with googling
this error, but I remember this was a problem with razor and spamassassin
before, and I'm wondering if anybody knows what the resolution was.

Thanks


Re: AWL pokes, and SAGray.pm

2005-06-16 Thread Eric A. Hall


On 6/15/2005 3:20 PM, Justin Mason wrote:

 Eric -- you may have to patch the AutoWhitelist class to throw those
 numbers into variables hanging off the PerMsgStatus object.  Then the
 plugin can access those values safely.
 
 I'd be +1 on applying a patch that simply sets a variable or two on the
 PerMsgStatus object as the AWL logic is run, that wouldn't have any
 noticeable effect during normal use (and it seems handy in general).

I don't disagree that it would be handy in general, but I'm not sure it's
a useful strategy for this plugin given some of the synchronization issues
at play here. In particular, AWL runs after all of the eval rules, and
that is too late in the cycle for my rule to update the message.

This is kind of tricky. On the one hand, the plugin needs to run after all
of the other eval tests so that it can get the current spam score. But if
it is going to assign a booster score to the message (+1.0 for being
first-time spam from unknown source, and getting the outcome recorded in
the appropriate header field), then it also needs to run before the end of
the message processing so that SA is still in a position to modify the
score (and the underlying message) appropriately. This means it has to be
pretty much the last rule to run, which is proving to be pretty
challenging in its own right.

On top of that it also has to pull data from the AWL database, but without
allowing AWL to actually run against the message (it would be too late for
my eval rule to update the message at that point). Therefore, the easiest
way for me to find out if the message has been seen by AWL is to just ask
AWL directly, using the exposed method (but that doesn't seem to be
working, for reasons unknown).

So I agree with you as to general utility, but it won't really help with
this plugin. I need to get the AWL method figured out, and I need to get
the timing factors figured out (eg, how do I make the rule be last). I'm
stuck on both of those, although I'll readily admit that I'm not really
trying very hard either, since I've got other stuff to work on.


Re: AWL pokes, and SAGray.pm

2005-06-11 Thread Eric A. Hall


On 6/10/2005 3:04 PM, Eric A. Hall wrote:

 What I specifically need from AWL is number of instances for the current
 sender tuple, with the value of one (for the current message) being the
 magic number. Any suggestions would be appreciated.

http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_AutoWhitelist.html
says that $meanscore = awl->check_address($addr, $originating_ip); is
supposed to work for this but it always seems to return undef no matter
what. Is it supposed to do what I think it's supposed to do or do I need
to do some other stuff first (like setup a factory or whatever)?

 Looking through the permsgstatus docs, getting the threshold and current
 spam score values looks pretty simple.

This doesn't seem to be easy, either. It looks like I have to put the code
for pulling current score in a sub check_end {} block but it's not
behaving... I'm trying to figure out what URIDNSBL does here but it's not
simple like I'd hoped.

So much for quick and dirty


AWL pokes, and SAGray.pm

2005-06-10 Thread Eric A. Hall


I'm looking to do a quick-n-dirty plugin that:

 1) reads the spam threshold score from config (eg, default is 5.0)

 2) reads the spam score for the current message

 3) checks if the current score is greater than the threshold score,
    AND if the auto-whitelist learner has not seen this sender tuple

 4) appends a header field that says "probable spam from unknown sender"

The purpose of this is to allow my MTA to defer accepting messages that
have this header field, providing a pseudo-greylisting feature that is
keyed to spamassassin score which reuses the AWL tracking. Using this
approach, I can do selective keying on spam instead of everybody (thus
minimizing collateral damage to the honest mail systems that don't respond
well to greylisting), and can avoid implementing yet-another tracking
system (if I can get away with reusing AWL).

[I should state the obligatory -- this module won't do much for people who
call SA from procmail. But in my setup, postfix is calling spamassassin
during the transfer process and I'm currently rejecting spam over 8.0, and
rerouting mail in the 5.0-8.0 range to a per-user Junk mail folder for
quarantine. This module would simply defer mail in the 5.0-8.0 range the
first time they try, while subsequent transfers would be quarantined as
current behavior.]
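
The handling described in the bracketed note comes down to a small decision table. Below is a Python sketch of it; the function name, the action strings and the `sender_seen` flag (standing in for "AWL already knows this tuple") are assumptions, and how the 5.0 and 8.0 boundaries are treated exactly is a guess.

```python
def disposition(score, sender_seen, junk_at=5.0, reject_at=8.0):
    """Map a score and AWL-style history flag to an SMTP-time action."""
    if score > reject_at:
        return "reject"          # over 8.0: refuse outright
    if score >= junk_at:
        # 5.0-8.0: defer the first attempt, quarantine later ones
        return "quarantine" if sender_seen else "defer"
    return "deliver"
```

Only the middle band changes behavior compared with the existing setup: first-time senders in that band get a deferral instead of going straight to the Junk folder.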

Looking through the permsgstatus docs, getting the threshold and current
spam score values looks pretty simple. But there doesn't seem to be much
support for working with the AWL system, and I'm looking for suggestions
here. I don't want to manipulate the database since it may not exist
(maybe it's using SQL storage or something).

What I specifically need from AWL is number of instances for the current
sender tuple, with the value of one (for the current message) being the
magic number. Any suggestions would be appreciated.


ldapfilter.pm updated

2005-06-08 Thread Eric A. Hall


FYI to the handful of people that use it, the ldapfilter.pm plugin on
http://www.ntrg.com/misc/ldapfilter/ has been updated to v0.02

The significant change was the use of an eval {} timer block around the
LDAP searches, so that if Net::LDAP doesn't come back on its own, the
plugin timer kicks in. This seems to have fixed the sporadic timeout
problems with LDAP searches, and it seems to operate in persistent mode
reliably now.
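
The plugin itself does this in Perl with an eval {} block and a timer. As a rough analogue only, not a port, the same "give up if the search does not come back" idea can be sketched in Python with a worker thread; `call_with_timeout` and `slow_search` are hypothetical names, and `slow_search` merely simulates a directory query.

```python
import concurrent.futures
import time


def call_with_timeout(func, seconds, *args):
    """Run func(*args); return its result, or None if it takes too long."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return future.result(timeout=seconds)
    except concurrent.futures.TimeoutError:
        return None                     # caller treats this as "no answer"
    finally:
        pool.shutdown(wait=False)       # do not block on the stuck call


def slow_search(delay):
    """Stand-in for a directory search that may hang (hypothetical)."""
    time.sleep(delay)
    return "entry"
```

One difference from a timer-based approach is that the abandoned call keeps running in its thread until it finishes on its own, so this only protects the caller, not the connection.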


Re: Is Bayes Really Necessary?

2005-05-26 Thread Eric A. Hall


On 5/26/2005 10:08 AM, Jake Colman wrote:
 Given the rather complete set of rules that ship with SA and which can
 be expanded with SARE, does bayes learning really help?  Won't the rules catch
 pretty much everything anyway?

The base SA install is insufficient, but if you tweak the scores and add
some additional tests, you can get by without bayes just fine. I use a
select set of RBLs, Razor, rulesets from rulesemporium, and my own
LDAP-based weighting plugin, and my highest spam only gets an average of
one spam per day, and even those are over the 5.0 threshold (so they are
auto-filed into the Junk Email folder).

Bayes is great for per-user stuff, but unless you are willing to manage
the per-user databases (which I'm not), it is easier to just tweak the
system scores and rules. Less management overhead, less CPU, etc.


Re: Comparison of SA and commercial solutions

2005-05-26 Thread Eric A. Hall


On 5/26/2005 10:30 AM, Chris Santerre wrote:

 Understood, and very good effort by you to educate them. Mostly all the
 reviews slam the cost benefit of SA with the "Pay an employee to
 support it." line of crap.

Every filtering system requires admin time, and if the reviews don't say
as much then they're junk.

There is a critical difference with SA, however, which is that the admins
need to be proficient at stuff like CPAN, Perl, etc., while some of the
packaged offerings provide simple click-the-button GUI, and those can have
significantly lower salary associations.


LDAPfilter plugin posted

2005-05-24 Thread Eric A. Hall


I got my plugin finished (I think) and have posted links to the plugin and
documentation at http://www.ntrg.com/misc/ldapfilter/


Is the wiki locked? I wanted to post a link there but the pages don't
appear to be editable.



Re: LDAPfilter plugin posted

2005-05-24 Thread Eric A. Hall


On 5/24/2005 2:29 PM, Justin Mason wrote:

 the plugin looks good.  did you run into any more weirdness in the core
 that we should look at?  any core APIs that aren't documented but should
 be, etc.?

I think I submitted all the bugs and wishlist items that seemed
reasonable. One cleanup point is that the permsgstatus docs still list
finish() which is still apparently dead, and now includes per_msg_finish()
which is apparently new. I'm not using either of them for portability
reasons but there ya have it.

One thing I'd like to request would be the ability to fetch explicit
data-types from permsgstatus. What I mean by that is stuff like
->myhostname() and ->mailboxaddr(0) and so forth. The rules are hard for
newbies to understand so they will get them wrong, and for bad programmers
like me they are too much trouble to write cleanly, so being able to just
ask SA for well-known data-types would be a big help.

The only bug I know of in this plugin is that Net::LDAP doesn't always
come back from a query when persistency is enabled and I can't figure out
why, but that doesn't seem to have anything to do with SA, and it might be
an artifact of my system's super-weird kernel/perl setup.

Is the wiki locked? I wanted to post a link there but the pages don't
appear to be editable.
 
 you need to create an account and log in.  (I think there's a mention
 of this somewhere on the front page and the user accounts page...)

Okay I'll check, thanks.


Re: Relaying Server and sa-learn --spam

2005-05-17 Thread Eric A. Hall


Matt Kettler wrote:

 I've never played with thunderbird's forward as an attachment feature, but
 you might be able to use that. In this situation you'd need to set up a 
 script that strips off the attachment and feeds the attachment to sa-learn.

It creates a message/rfc822 attachment, just like what SA does when it
creates a report for an (attached) message.

Stripping the embedded message out should be relatively straightforward
using some of the mime tools.
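
As one example of such tooling, Python's standard email package can pull the attached message out. A minimal sketch, with a made-up function name; it returns the embedded messages as strings and leaves feeding them to sa-learn to the caller.

```python
from email import message_from_string


def embedded_messages(raw):
    """Return each message/rfc822 attachment in raw as a string."""
    outer = message_from_string(raw)
    found = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # the payload of a message/rfc822 part is a one-item list
            # holding the attached message object
            for inner in part.get_payload():
                found.append(inner.as_string())
    return found
```

Because walk() descends into attached messages as well, a forward of a forward yields both inner messages, which may or may not be what a learning script wants.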


Re: SPAMassassin headers missplaced and follow message body

2005-05-11 Thread Eric A. Hall


On 5/11/2005 6:58 AM, Martin G. Diehl wrote:

 I saw a SPAM message with the SPAMassassin message headers
 (X-spam headers) grossly out of sequence.  The message
 was recognized as SPAM ... but because the X-spam headers
 were written in the wrong part of the message, it was able

I get this periodically too. Very annoying.

I haven't really looked into this much yet, but it appears that some
embedded CR or LF characters are getting processed by SA and then fed back
to Postfix, which then cleans up the message and splits the headers where
it sees the bare CR or LF. The result is two sets of headers, the second
of which naturally becomes part of the body.

I've dealt with this phenomenon by having postfix check the message body
for the locally-generated X-Spam-NTRG header (apart from the header
block check), and reject those messages.
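
If the bare CR/LF theory is right, the condition can be spotted before any component gets a chance to "repair" it. The Python sketch below assumes the message is available as raw bytes with CRLF line endings, as on the wire; the function name is made up, and this is a different check from the postfix body check described above.

```python
import re

# a CR not followed by LF, or an LF not preceded by CR
BARE_EOL = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def header_has_bare_eol(raw):
    """True if the header block of raw (bytes) contains a bare CR or LF."""
    header_block = raw.split(b"\r\n\r\n", 1)[0]
    return BARE_EOL.search(header_block) is not None
```

A message stored with Unix line endings would trip this on every line, so it only makes sense at a point where CRLF is still intact.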

If somebody wants to see the message I should have it in my trash still.


Re: SPAMassassin headers missplaced and follow message body

2005-05-11 Thread Eric A. Hall


On 5/11/2005 3:02 PM, Martin G. Diehl wrote:
 Eric A. Hall wrote:

I haven't really looked into this much yet, but it appears that some
embedded CR or LF characters are getting processed by SA and then fed back
to Postfix, which then cleans up the message and splits the headers where
it sees the bare CR or LF. The result is two sets of headers, the second
of which naturally becomes part of the body.

If somebody wants to see the message I should have it in my trash still.
 
 Please send the headers for that message.

Return-Path: [EMAIL PROTECTED]
Received: from goose.ehsco.com (localhost [127.0.0.1])
    by goose.ehsco.com (Cyrus v2.2.3) with LMTP; Tue, 10 May 2005 04:01:56 -0500
X-Sieve: CMU Sieve 2.2
Received: from goose.ehsco.com (localhost [127.0.0.1])
    by clean.ehsco.com (Postfix ) with ESMTP id 5AED93D877
    for [EMAIL PROTECTED]; Tue, 10 May 2005 04:01:48 -0500 (CDT)
X-Envelope-Sender: [EMAIL PROTECTED]
X-Envelope-Recipients:  [EMAIL PROTECTED]
Received: from 24.232.159.2 (OL2-159.fibertel.com.ar [24.232.159.2])
    by goose.ehsco.com (Postfix ) with SMTP
    for [EMAIL PROTECTED]; Tue, 10 May 2005 04:01:48 -0500 (CDT)
Received: from 168.213.224.150 by ; Tue, 10 May 2005 21:58:35 +0100
Message-Id: [EMAIL PROTECTED]
Date: Tue, 10 May 2005 04:01:48 -0500 (CDT)
From: [EMAIL PROTECTED]
To: undisclosed-recipients:;

sdp.com.arMSS_ID
From: Pablo [EMAIL PROTECTED]
Subject: Su sitio web en doce cuotas de 35 pesos
Date: Wed, 11 May 2005 02:59:35 +0600
MIME-Version: 1.0
Content-Type: multipart/related;
    type=multipart/alternative;
    boundary==_NextPart_000_0001_01C55496.4B31A720
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1106
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
X-Spam-Status: Yes
X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on goose.ehsco.com
X-Spam-NTRG: *** (19.0); AWL,DNS_FROM_RFC_ABUSE,
    EXTRA_MPART_TYPE,FORGED_MUA_OUTLOOK,FORGED_RCVD_HELO,HTML_10_20,
    HTML_MESSAGE,L_SMTP_MANY_PROBS,MIME_MISSING_BOUNDARY,
    RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_NERDS_AR,RCVD_IN_NJABL_DUL,
    RCVD_IN_SORBS_DUL,RCVD_NUMERIC_HELO,UNWANTED_LANGUAGE_BODY
X-Spam-Virus: No




-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SPAMassassin headers missplaced and follow message body

2005-05-11 Thread Eric A. Hall


On 5/11/2005 2:51 PM, Kevin W. Gagel wrote:
On 5/11/2005 6:58 AM, Martin G. Diehl wrote:

I haven't really looked into this much yet, but it appears
that some embedded CR or LF characters are getting
processed by SA and then fed back to Postfix, which then
cleans up the message and splits the headers where it sees
the bare CR or LF. The result is two sets of headers, the
second of which naturally becomes part of the body.
 
 SpamAssassin does not alter the message.

Like I said, I haven't really looked into it very closely and I don't know
who's doing the conversion of bare CR/LF into CRLF pairs.

How sure are you that SA doesn't do conversion?

I don't doubt that postfix cleanup could be doing this, but frankly it
seems more likely to be SA.

 All MTAs will interpret the first blank line as the
 beginning of the body.

No kidding. The problem we are seeing happens when there is an EOL marker
at the end of a header, and when that is cleaned up we have two CRLF pairs
all of a sudden, with all of the headers which follow suddenly being part
of the message body.
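
To make the failure mode concrete, here is a small Python sketch. It assumes that some component rewrites every bare CR as a CRLF pair; which program actually does this is exactly what is unknown in this thread, so `normalize_bare_cr` is a stand-in and not the behavior of SA or Postfix.

```python
import re

def normalize_bare_cr(raw: bytes) -> bytes:
    """Turn every CR that is not already followed by LF into a CRLF pair."""
    return re.sub(rb"\r(?!\n)", b"\r\n", raw)

# A header line that ends with a stray CR just before its real CRLF.
raw = b"To: undisclosed-recipients:;\r\r\nFrom: someone@example.com\r\n\r\nbody\r\n"
cleaned = normalize_bare_cr(raw)

# After the clean-up the first blank line sits right after the To: header,
# so the From: header has moved into the body.
headers, _, body = cleaned.partition(b"\r\n\r\n")
```

With this input, `headers` holds only the To: line and `body` begins with the From: line, which is the same shape as the mangled messages.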

Trying to figure out who is doing this, and where, is the exercise.


-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SPAMassassin headers missplaced and follow message body

2005-05-11 Thread Eric A. Hall


On 5/11/2005 3:28 PM, Justin Mason wrote:

 BTW I've seen some similar messages -- as far as I can see, when it
 happens in my case, it's one of postfix, procmail or my MUA which is
 interpreting the message structure wrongly due to the whitespace
 weirdness.

That's possible too, if whitespace is wrapped improperly it can be read as
a blank line.

My setup is arranged so that postfix SMTP receives the mail, hands it off
to spamassassin while the session is still open, and then examines the
message that comes back from SA for header flags. If the headers show that
the message is spam, then postfix rejects the message, but otherwise
accepts the mail and then hands it off to the cleanup agent.

Looking at the headers that come in and out of spamassassin is what leads
me to believe that it's doing the mangling. In particular, we can assume
that SA didn't see the blank line in the headers that it read (or else it
would have appended the lines at the end of the top block, not the second
block), so it seems kind of likely that writing a new headers block is
what causes the conversion to happen, and results in CR/blanks/whatever
getting turned into a blank line.

OTOH, I know that postfix does some cleanup before it performs analysis
(it adds Message-ID and does other stuff), so it is entirely possible that
it is doing a CR/null conversion as part of that.

Very annoying, whatever it is.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: spamd or amavisd-new

2005-05-06 Thread Eric A. Hall


On 5/6/2005 5:38 AM, Beast wrote:
 I would like to create a mail/antispam gateway using postfix, sqlgrey and
 spam assassin. I don't want to install AV on this gateway because it is
 already handled separately by each internal mail server.
 What is the recommendation on SA setup, and which is preferred:
 spamd or amavisd-new (traffic is around 15k-20k/day)?

I use SpamPD [http://www.worlddesign.com/index.cfm/rd/mta/spampd.htm] so
that I can call SpamAssassin from the Postfix proxy filter mechanism
[http://www.postfix.org/SMTPD_PROXY_README.html], meaning in-line
rejections instead of after-transfer rejections.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: What is better DCC or Razor2?

2005-04-18 Thread Eric A. Hall


On 4/17/2005 11:23 AM, Robert Nicholson wrote:

 I currently run DCC and since adding

 But what benefit is there in running razor2?

DCC just checks for volume, and doesn't quantify content. Mailings from
~cnet or elsewhere end up getting the same rank as spam, so you really
have to couple DCC with a whitelisting system of some kind.

Razor scores are based on tags that reflect on content, the credibility of
the reporter, etc.


-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SpamAssassin Without Bayes

2005-04-04 Thread Eric A. Hall


On 4/4/2005 12:28 PM, Gustafson, Tim wrote:

 I know that Bayes is the de facto best way to fight SPAM right now, but
 I wonder if anyone out there is running SA without Bayes turned on and
 what their experience with it is?

I have it turned off and don't miss it. Tweaking your rules works just as
well, and you don't have to maintain a bunch of user-specific databases.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Effectiveness

2005-03-28 Thread Eric A. Hall


On 3/28/2005 9:30 AM, Matt wrote:
 That worked but you're right it has no effect on the autolearn=spam.  Any idea
 how I get it to autolearn all email to a given address as spam?

can you pipe incoming mail for that account to sa-learn?

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Effectiveness

2005-03-28 Thread Eric A. Hall


On 3/28/2005 2:07 PM, Daryl C. W. O'Shea wrote:

 Better yet, is to not even bother running mail for that account through 
 SpamAssassin in the first place and instead just pipe it to sa-learn. 
 No point in filtering mail that you are positive is 100% spam.

except that he wants to blacklist for all of the other recipients too, so
running it through SA with blacklist_to is needed for that, even with
really high bayes marks

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Effectiveness

2005-03-27 Thread Eric A. Hall


On 3/26/2005 4:47 PM, Matt wrote:
 blacklist_to appears to add 10 points to spam score.  I would like to
 change it so it adds 20 points.  How would I do that?  Reason being
 that way blacklist_to messages will always be scored high enough to
 trigger them to be bayes auto_learn spam.

Add this to one of your *.cf files

score USER_IN_BLACKLIST_TO 100.0

or whatever score you want

Dunno if the bayes auto-learner works with blacklist_to rules; it doesn't
work with some whitelist rules.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: question about greylisting

2005-03-24 Thread Eric A. Hall


alan premselaar wrote:
 Rob McEwen wrote:
 
I have a question about greylisting.

Does greylisting **always** involve blocking upon receipt of the SMTP
envelope and not accepting the rest of the message?

Or, can greylisting alternatively work where it **does** accept the
**entire** message (for auditing purposes, for example) and THEN returns the
temporary rejection code?

 however, temporarily rejecting the message after fully receiving it and 
 processing it kind of defeats the purpose of greylisting. (or at least 
 one major purpose of it)

Yeah, it would still require CPU processing, which is one of the
advantages of refusing to accept the mail in the first place. OTOH, it
would still have value in terms of keeping spam away from the end-users,
which is its own reward sometimes.
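
For reference, the greylisting decision itself is small, and it is the same whether it runs at the envelope stage or after the whole message has been received. The Python sketch below is illustrative only: the class name, the 300-second delay and the 451/250 reply codes are assumptions for the example, not settings from any particular greylisting package.

```python
import time

class Greylist:
    """Minimal greylisting table keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay=300.0, clock=time.time):
        self.min_delay = min_delay   # seconds a new triplet must wait (assumed value)
        self.clock = clock           # injectable so the logic can be tested
        self.first_seen = {}         # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style reply code: 451 to defer, 250 to accept."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        seen = self.first_seen.setdefault(key, now)
        if now - seen < self.min_delay:
            return 451               # temporary failure, sender should retry later
        return 250
```

Calling `check()` after DATA instead of at RCPT TO changes nothing in this logic; the difference is only how much bandwidth and CPU has been spent by the time the 451 goes out.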

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: RES: Dictionary Attack

2005-03-23 Thread Eric A. Hall


On 3/23/2005 4:16 PM, Matt Kettler wrote:
 Daniel A. de Araujo wrote:
 
Thanks Matt. The 2nd option looks fine, but we use Postfix. Do you (or
somebody) know how to implement this option in Postfix?
 
 Try looking at smtpd_error_sleep_time and smtpd_soft_error_limit at this
 page:
 
 http://www.postfix.org/rate.html

That's the right track definitely. I use:

smtpd_error_sleep_time = 10s
smtpd_soft_error_limit = 3
smtpd_hard_error_limit = 5

That stops most malware and dictionary attacks but still tolerates
problematic clients and my fat-fingered tests.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Effectiveness

2005-03-23 Thread Eric A. Hall


On 3/23/2005 12:01 PM, Matt wrote:

 Another thing is I have several domains.  One is from our dialup ISP 10
  years old.  It has several email addresses that are dead and receive
 nothing but junk and lots of it.  About 20 pieces or more an hour.  Is
 there any way I can use these to improve the effectiveness of
 Spamassassin?

Add them to your cf with a blacklist_to [EMAIL PROTECTED] entry and
they'll make good spamtraps for other recipients of those same messages
(but will have no effect on recipients of other copies that were sent
under separate cover). You could also record the Message-ID and/or envelope
sender (among other things) and deal with secondary copies that way. One
thing I'm noticing more of lately is that some spam will come from three
or four sources all at once, which is presumably happening because
somebody has submitted the spam and mailing list to multiple trojaned PCs,
so my spamtraps are having a little bit less success lately, but they
still work very well.

You can also use the messages to feed a ~global bayes training process if
you're willing to accept the possible side-effects of one-dimensional training.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: How do I whitelist this list?

2005-03-22 Thread Eric A. Hall


Jim Maul wrote:

 While the above works great for people using procmail, does anyone have 
 a solution that works without procmail?

whitelist_from_rcvd [EMAIL PROTECTED] apache.org worked when I used static 
whitelists.

I had a bunch of similar entries for various mailing lists in a big
whitelists.cf file in /etc/mail/spamassassin


-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: plugins and parrallelization

2005-03-21 Thread Eric A. Hall


Eric A. Hall wrote:

 I'm storing the session variables (such as login status) as part of $self,
 and storing message variables with $permsgstatus. But where do I put the
 logout/disconnect code? DESTROY seems to get called after every message
 (seems to but I'm fairly blurry at this point), which causes the session
 to get killed after every message. Where am I supposed to put this stuff?

Got around to looking at this some more. DESTROY() does actually get
called when everything is being zapped, but that is way too late to do
anything useful (Net::LDAP is already dead, for example).

http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Plugin.html
says $plugin->finish() is called when the Mail::SpamAssassin object is
destroyed, but that is wrong or there is a bug, because near as I can tell
finish() never gets called, and it doesn't appear to even get probed (as
opposed to $plugin->parse_config, which shows up in debug probes, and even
gets called). Is this a bug?

Frankly I'm not sure that finish() would work, since the description
sounds like it happens the same time as DESTROY() which is no different.
What would be really useful here would be something that SA calls after it
is done hitting all of the rules that it's going to. That probably ought
to be finish(), and maybe it is, dunno.

I can post this on bugzilla so it can be ignored there too. :o

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

plug-in timeouts

2005-03-21 Thread Eric A. Hall


Every so often I get spampd complaining about a time-out while SA is
trying to interact with one of my eval functions. I've watched the logs,
and what basically happens is that the plug-in *sometimes* goes to sleep
when the (current) first eval rule in a batch is activated. It seems
to hit a couple of them more than others, which is what strikes me as the
most suspicious.

Verbatim log data below is absolutely typical:

Mar 21 05:10:59 goose spampd[28292]: debug: running raw-body-text per-line
regexp tests; score so far=-99.95
Mar 21 05:10:59 goose spampd[28292]: debug: running full-text regexp
tests; score so far=-99.95
Mar 21 05:10:59 goose spampd[28292]: debug: ClamAV: No virus detected
Mar 21 05:10:59 goose spampd[28292]: debug: DCCifd is not available: no
r/w dccifd socket found.
Mar 21 05:10:59 goose spampd[28292]: debug: Running tests for priority: 500
Mar 21 05:10:59 goose spampd[28169]: Failed to run LDAP_MSG_FROM_LIGHT
SpamAssassin test, skipping: (Timed out! )
Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO: from=apache.org
helo=apache.org by=ehsco.com
Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO:
from=smtp-vbr11.xs4all.nl helo=smtp-vbr11.xs4all.nl by=apache.org
Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO:
from=webmail7.xs4all.nl helo=webmail.xs4all.nl by=smtp-vbr11.xs4all.nl
Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO: mismatch on HELO:
'webmail.xs4all.nl' != 'webmail7.xs4all.nl'
Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO:
from=adsl.xs4all.nl helo= by=webmail.xs4all.nl

Everything is hunky-dory and then poop no-habla-API... You can tell that
the plug-in itself wasn't even activated because there's no debug output
from it. Also, once spampd recovers it goes right into the next set of
tests, and on the next message the plug-in will be working fine again...

LDAP_MSG_FROM_LIGHT is by far the most common rule to be cited, and yes it
works fine when it doesn't trigger a suicide pact. It is the 23rd eval
rule in the cf, if that means anything.

I've done some back-end debugging, and there aren't any protocol problems
like dropped connections or anything that would suggest network trouble
(I've even switched to LDAPI sessions via UNIX domain socket and it still
happens). So my first guess is that something in the plug-in has gone into
blocking mode. This doesn't seem to happen with any other plug-ins, so I'm
guessing this has something to do with one of the modules I'm using, or
there's something about my plug-in that's keeping , but does this ring any
bells for anybody? Could there be too many open eval calls (there are a
couple of dozen in my LDAP cf), excess garbage that needs to be collected
more frequently, or anything like that? The only other thing I can think
of would be that the LDAP server itself is blocked, but like I said the
protocol traces don't show any problems, and it's a pretty common cluster
of crashes, mostly failing on the same rules (but succeeding on them the
majority of the time). Quite curious.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Best way to disable a test from running?

2005-03-20 Thread Eric A. Hall


Vicki Brown wrote:
 I could give it a score of 0 but I'd like to simply say don't even test
 against it.
 
 I'm getting tired of seeing ALL_TRUSTED. We run SMTP; they connect directly
 to us; there are no interim hosts.

You just want to do this for specific hosts, or period?

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: DCC License Change

2005-03-20 Thread Eric A. Hall


Greg Allen wrote:
 I read through some of these postings at rhyolite.com. It sounds to me like
 DCC should be off in SA by default going forward, or possibly completely
 removed from SA future versions so users don't accidentally get in a
 license/legal dispute without their knowledge.

Seems to me that most of this stuff should be using the plug-in interface
anyway. So maybe just move it out of the core and into a plug-in, and then
hand the module off for Vernon to do whatever he feels like with it.

SA can still provide pointers in the distro and a link on the Wiki.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

call-back plug-in

2005-03-20 Thread Eric A. Hall


I'm thinking that SA might also benefit from a call-back plug-in that
looked at the MAIL-FROM and various 822 addresses, opened a connection to
the mail server for the domain[s], and verified the sender's address as
valid. This would actually be a fair bit of effort given all the stuff
that has to be done (MX and fall-back processing, connection management
within a time-limit, etc). I'm also aware that some people really dislike
these things. The real question it seems is the amount of spam something
like this might catch.

I've done some poking in google but can't seem to find trustworthy numbers
and experiences. Anybody got any thoughts here?

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: call-back plug-in

2005-03-20 Thread Eric A. Hall


List Mail User wrote:

 since you have mentioned that you do/did use Postfix - there is an
 option to have Postfix perform that task.

Yeah, but I'm on a mission to convert the binary pass-fail tests in
postfix into probability tests in SA, and this is on that list.

I don't even use the call-back system in postfix here, but friends and
clients have been known to.

 That said:  everyone I find doing callbacks, gets a letter asking them 
 to stop (at least to my addresses);  Until I recognize the pattern,
 they look just like SMTP port scanners and/or address
 verification/harvesting `bots. Also, the Postfix notes warn that you
 should expect people to complain if you enable the option:)

Yep. One option might be to cache addresses so that it only does it once
per sender per ~six month window, although I'm not keen on keeping a
database with this.
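
The caching idea can be sketched in a few lines of Python. This is an in-memory illustration only, with made-up names (`CallbackCache`, `probe`); a real deployment would need persistent storage, which is the part I'm not keen on. Lowercasing the whole address is also a simplification, since local parts are technically case-sensitive.

```python
import time

SIX_MONTHS = 182 * 24 * 3600  # rough window in seconds; the exact length is an assumption

class CallbackCache:
    """Remember the last call-back result per sender so each address is probed rarely."""

    def __init__(self, ttl=SIX_MONTHS, clock=time.time):
        self.ttl = ttl
        self.clock = clock           # injectable so expiry can be tested
        self.entries = {}            # sender -> (timestamp, result)

    def lookup(self, sender, probe):
        """Return the cached result, or call probe(sender) if none exists or it expired."""
        key = sender.lower()
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        result = probe(key)          # probe is the expensive SMTP call-back (hypothetical)
        self.entries[key] = (now, result)
        return result
```

The remote server then sees one verification attempt per sender per window, rather than one per message.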

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

what diff between init.pre and local.cf?

2005-03-20 Thread Eric A. Hall


I'm trying to figure out any issues regarding config data and my
ldapBlacklist plug-in, and this is a mystery to me.

What purpose does init.pre serve exactly if local.cf and user_prefs can
load the same plug-in modules?

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: plugins and parrallelization

2005-03-19 Thread Eric A. Hall


Justin Mason wrote:

 yeah -- as discussed in the Plugin pod docs, the life-cycle of the objects
 you have access to there is:

I'm currently trying to work this so the LDAP session is maintained for
the lifetime of the module. TCP sessions are pretty expensive, and having
hundreds or even thousands of dead sessions lying around in timeout mode
(not uncommon for busy sites) is going to be very undesirable.

I'm storing the session variables (such as login status) as part of $self,
and storing message variables with $permsgstatus. But where do I put the
logout/disconnect code? DESTROY seems to get called after every message
(seems to but I'm fairly blurry at this point), which causes the session
to get killed after every message. Where am I supposed to put this stuff?

Thanks

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Is this Received header correctly formatted?

2005-03-18 Thread Eric A. Hall

mouss wrote:
 Eric A. Hall wrote:

 Huh? The helo= stuff is inside the parenthesis. Perhaps I am missing
 something but your point 3 seems to conflict with your point 2.

 comments are only allowed where whitespace occurs

 can you give me the line num in the rfc?

It's actually somewhat stricter than that, and actually says that
comments can only be used where folding would occur (that's a
hyper-technical but accurate reading; see the robustness principle).

Here is what rfc2822 says:
3.2.3. Folding white space and comments

 [...]

   There are several places in this standard where comments and FWS may
   be freely inserted.  To accommodate that syntax, an additional token
   for "CFWS" is defined for places where comments and/or FWS can occur.
   However, where CFWS occurs in this standard, it MUST NOT be inserted
   in such a way that any line of a folded header field is made up
   entirely of WSP characters and nothing else.

FWS             =       ([*WSP CRLF] 1*WSP) /   ; Folding white space
                        obs-FWS

ctext           =       NO-WS-CTL /     ; Non white space controls
                        %d33-39 /       ; The rest of the US-ASCII
                        %d42-91 /       ;  characters not including "(",
                        %d93-126        ;  ")", or "\"

ccontent        =       ctext / quoted-pair / comment

comment         =       "(" *([FWS] ccontent) [FWS] ")"

CFWS            =       *([FWS] comment) (([FWS] comment) / FWS)

   Throughout this standard, where FWS (the folding white space token)
   appears, it indicates a place where header folding, as discussed in
   section 2.2.3, may take place.  Wherever header folding appears in a
   message (that is, a header field body containing a CRLF followed by
   any WSP), header unfolding (removal of the CRLF) is performed before
   any further lexical analysis is performed on that header field
   according to this standard.  That is to say, any CRLF that appears in
   FWS is semantically "invisible."

   A comment is normally used in a structured field body to provide some
   human readable informational text.  Since a comment is allowed to
   contain FWS, folding is permitted within the comment.  Also note that
   since quoted-pair is allowed in a comment, the parentheses and
   backslash characters may appear in a comment so long as they appear
   as a quoted-pair.  Semantically, the enclosing parentheses are not
   part of the comment; the comment is what is contained between the two
   parentheses.  As stated earlier, the "\" in any quoted-pair and the
   CRLF in any FWS that appears within the comment are semantically
   "invisible" and therefore not part of the comment either.

   Runs of FWS, comment or CFWS that occur between lexical tokens in a
   structured field header are semantically interpreted as a single
   space character.

RFC 2822 is slightly stricter than RFC 822 in this regard. And while
it's not a full standard like 822, it is a standards-track update to 822
and was sanctioned by the IESG as such, and was developed after years of 
debate over good and bad behavior.

 and even then, the original thing was:

 Received: from ar39.lsanca2-4.16.241.28.lsanca2.elnk.dsl.genuity.net
 ([4.16.241.28] helo=watson1)

 and here helo=watson1 is inside parens, and with whitespace (before and
 after the parens). or am I missing something?

Check the BNF again.
--
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Is this Received header correctly formatted?

2005-03-17 Thread Eric A. Hall

Christopher Weimann wrote:
On 03/16/2005-04:49AM, Eric A. Hall wrote:
Loren Wilton wrote:
Received: from ar39.lsanca2-4.16.241.28.lsanca2.elnk.dsl.genuity.net
([4.16.241.28] helo=watson1)
by pop-a065d23.pas.sa.earthlink.net with smtp (Exim 3.33 #1)
id 1DBKRe-Kp-00; Tue, 15 Mar 2005 14:23:22 -0800
[snip]
2) header data in parenthesis is comment data. comments are supposed
   to be ~allowed anywhere that whitespace is allowed (this rule is
   actually documented in RFC2822, which governs header fields). with
   that in mind, yes, it's fine there.
3) the helo= stuff isn't conformant
Huh? The helo= stuff is inside the parenthesis. Perhaps I am missing
something but your point 3 seems to conflict with your point 2.

comments are only allowed where whitespace occurs
--
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

need testers for ldapBlacklist.pm plug-in

2005-03-16 Thread Eric A. Hall

I got the ldapBlacklist plug-in pretty much finished, and it just needs some
polishing I think.

I'd like to get some help testing this for load and latency, so if 
anybody has a local LDAP server running already and is pretty 
comfortable with SA and LDAP, and is willing to poke at this, let me 
know. Be warned that this plugin can really beat the crap out of your 
LDAP server, and will add some measurable latency if the SA system is 
already burdened down. But it works pretty well, and is interesting if 
you're into LDAP.

Responses off-list pls.
Thanks
--
Eric A. Hall   http://www.ehsco.com/
Internet Core Protocols http://www.oreilly.com/catalog/coreprot/

Re: Is this Received header correctly formatted?

2005-03-16 Thread Eric A. Hall

Loren Wilton wrote:
 Received: from ar39.lsanca2-4.16.241.28.lsanca2.elnk.dsl.genuity.net
 ([4.16.241.28] helo=watson1)
  by pop-a065d23.pas.sa.earthlink.net with smtp (Exim 3.33 #1)
  id 1DBKRe-Kp-00; Tue, 15 Mar 2005 14:23:22 -0800

 1) Is smtp in lower case valid, or should it have been SMTP?
 2) Is it valid to have the (Exim etc) stuff between 'smtp' and 'id'?
 3) Anything else that may be off the mark?

The robustness principle says that you should be strict in what you send
and liberal in what you accept. From that perspective, it's not a
strictly conformant header, but it's not broken enough for somebody to
refuse to parse it.

In answer to your questions:

 1) the spec calls for uppercase
 2) header data in parentheses is comment data. comments are supposed
    to be ~allowed anywhere that whitespace is allowed (this rule is
    actually documented in RFC2822, which governs header fields). with
    that in mind, yes, it's fine there.
 3) the helo= stuff isn't conformant

Here's the BNF notation for the Received header as provided in RFC2821:
| Time-stamp-line = "Received:" FWS Stamp <CRLF>
|
| Stamp = From-domain By-domain Opt-info ";"  FWS date-time
|
|   ; where "date-time" is as defined in [32]
|   ; but the "obs-" forms, especially two-digit
|   ; years, are prohibited in SMTP and MUST NOT be used.
|
| From-domain = "FROM" FWS Extended-Domain CFWS
|
| By-domain = "BY" FWS Extended-Domain CFWS
|
| Extended-Domain = Domain /
|    ( Domain FWS "(" TCP-info ")" ) /
|    ( Address-literal FWS "(" TCP-info ")" )
|
| TCP-info = Address-literal / ( Domain FWS Address-literal )
|   ; Information derived by server from TCP connection
|   ; not client EHLO.
|
| Opt-info = [Via] [With] [ID] [For]
|
| Via = "VIA" FWS Link CFWS
|
| With = "WITH" FWS Protocol CFWS
|
| ID = "ID" FWS String / msg-id CFWS
|
| For = "FOR" FWS 1*( Path / Mailbox ) CFWS
|
| Link = "TCP" / Addtl-Link
| Addtl-Link = Atom
|   ; Additional standard names for links are registered with the
|   ; Internet Assigned Numbers Authority (IANA).  "Via" is
|   ; primarily of value with non-Internet transports.  SMTP
|   ; servers SHOULD NOT use unregistered names.
| Protocol = "ESMTP" / "SMTP" / Attdl-Protocol
| Attdl-Protocol = Atom
|   ; Additional standard names for protocols are registered with the
|   ; Internet Assigned Numbers Authority (IANA).  SMTP servers
|   ; SHOULD NOT use unregistered names.
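
As a rough illustration of treating parenthesised text as comments, the Python sketch below removes (possibly nested) comments from a Received header value and then reads the token after "with". It is a simplified reading, not a full RFC 2822 parser: it ignores quoted strings, and the helper names are made up for the example.

```python
def strip_comments(value: str) -> str:
    """Replace each (possibly nested) parenthesised comment with a single space."""
    out = []
    depth = 0          # current comment nesting level
    escaped = False    # True right after a backslash (quoted-pair)
    for ch in value:
        if escaped:
            if depth == 0:
                out.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
            if depth == 0:
                out.append(ch)
        elif ch == "(":
            if depth == 0:
                out.append(" ")
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    # Collapse runs of whitespace, as CFWS between tokens reads as one space.
    return " ".join("".join(out).split())

def with_protocol(received: str):
    """Return the token after 'with' (matched in any case), or None if absent."""
    tokens = strip_comments(received).split()
    for i, tok in enumerate(tokens[:-1]):
        if tok.lower() == "with":
            return tokens[i + 1]
    return None
```

Run against the header in question, `with_protocol()` returns the lower-case `smtp`, and the `helo=watson1` text disappears along with the rest of the comment it sits in.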
--
Eric A. Hall   http://www.ehsco.com/
Internet Core Protocols http://www.oreilly.com/catalog/coreprot/

Re: Is this Received header correctly formatted?

2005-03-16 Thread Eric A. Hall

List Mail User wrote:
 the "with" is sometimes also either a "by" or "via" (and probably
 other string values which I haven't noticed). BTW.

by, via and with are separate sub-fields with their own meaning
Eric A. Hall   http://www.ehsco.com/
Internet Core Protocols http://www.oreilly.com/catalog/coreprot/

Re: Is this Received header correctly formatted?

2005-03-16 Thread Eric A. Hall

Daryl C. W. O'Shea wrote:
 ...and if you can, avoid running messages to the list through SA
 (easy to do if you're using procmail, not so easy in other cases).

or run them through with whitelist_from_rcvd *.* apache.org to pad the
value so that it doesn't matter

I do wish that postfix would let me add dynamic headers to the message 
before the proxy filter is called, or give me an ACL for no-filter, 
either of which would work to skip well-known message origins

--
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: plugins and (more)

2005-03-15 Thread Eric A. Hall

Eric A. Hall wrote:
 Over the weekend I banged together a preliminary ldapBlacklist.pm plugin
 which lets the master process query an ldap server for whitelist or
 blacklist flags associated with the connecting SMTP client's reverse DNS,
 the HELO identifier, the mail-from address, the From address, and so
 forth... The problem is that each of these tests have to do a fair amount

I got this working more fully, including with the persistency stuff
(thanks again). Couple of other things I'm looking for help on:

 - is there an internal means to determine the local domain name? I'm
   having trouble with Sys::Hostname::Long on a couple of systems and
   would rather use something internal anyway since it's sure to work
   everywhere that SA itself works.
 - is there a way to force a plugin to load last? like, if I want SPF
   and all of the other validation stuff to get called first, but not
   to rely on it (it may not be installed), is there a way to force
   the plugin to get called last (presumably this is done by numbering
   the ldapBlacklist.cf to something like 99_ldap_blacklist.cf, but
   maybe there's a better way)?
Thanks
--
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: Header-Rule with multiple lines

2005-03-15 Thread Eric A. Hall


On 3/15/2005 2:50 AM, Jörg Schütter wrote:

 I want to write an additional rule for spamassassin (3.0.2) which
 matches the following header lines.
 
 Received: from blabla (unknown [1.2.3.4])
   by my.mailserver.com
 
 This rule should add bad scores to machines which don't talk rfc.

http://www.rulesemporium.com/forums/showthread.php?s=threadid=105 has a
set of rules that might do what you want, or might be adaptable.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

plugins and parrallelization

2005-03-14 Thread Eric A. Hall


It seems that the plugin architecture only allows a single pass/fail
result, so if you want to have multiple tests with different shades of
results, you have to call the plugin multiple times. Is that right?

Over the weekend I banged together a preliminary ldapBlacklist.pm plugin
which lets the master process query an ldap server for whitelist or
blacklist flags associated with the connecting SMTP client's reverse DNS,
the HELO identifier, the mail-from address, the From address, and so
forth... The problem is that each of these tests has to do a fair amount
of processing with some significant serialization (ie, DNS lookup for SRV
RRs, DNS lookup for ldap server, connect-bind-query the server, as well
as the rest of the background code). Using the pass/fail model as a
front-end to this system, each test basically has to be its own rule, and
each rule has to call its own eval() in order for each rule to use its
defined weighting (eg, -50 for whitelisted, +50 for blacklisted, on a
per-test basis). But in that model, the core LDAP stuff has to be run ~six
times to process ~six tests, and that's a significant serialization
penalty in sum, just to find out if one of the sending domains is listed
as blacklisted or whitelisted in a local LDAP server. It's so bad that I'm
not sure it's feasible to do this.

What are the thoughts?
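
One possible way around the repeated work is to cache the expensive part
per message, so that every rule's eval() reuses one connection and one
answer per identity. The sketch below is in Python rather than Perl and
only illustrates the idea; ListLookupCache, fake_connect and the sample
identities are hypothetical names, not part of SpamAssassin or of
ldapBlacklist.pm.

```python
from typing import Callable, Dict, Optional

# Hypothetical sketch: share one expensive setup across several rule checks.
class ListLookupCache:
    """Run the costly connect step once and remember each identity's status."""

    def __init__(self, connect: Callable[[], Callable[[str], str]]):
        self._connect = connect  # stands in for SRV lookup, server lookup, bind
        self._query: Optional[Callable[[str], str]] = None
        self._results: Dict[str, str] = {}
        self.connect_count = 0   # how many times the costly step actually ran

    def status(self, identity: str) -> str:
        """Return 'white', 'black', or 'none' for one identity."""
        if identity not in self._results:
            if self._query is None:
                self._query = self._connect()
                self.connect_count += 1
            self._results[identity] = self._query(identity)
        return self._results[identity]


def fake_connect() -> Callable[[str], str]:
    """Stand-in for the DNS and LDAP steps; returns a query function."""
    listed = {"mail.example.net": "black", "friend@example.org": "white"}
    return lambda identity: listed.get(identity, "none")


cache = ListLookupCache(fake_connect)

# Six rule-like checks: three identities, each tested for both list colours.
identities = ["mail.example.net", "friend@example.org", "helo.example.com"]
hits = {
    (identity, colour): cache.status(identity) == colour
    for identity in identities
    for colour in ("white", "black")
}
```

Each rule keeps its own pass/fail result and its own score, but the
connect step runs once and each identity is queried once, no matter how
many rules ask about it.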

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SA addr tests need to be updated

2005-03-12 Thread Eric A. Hall


After considering all the discussion, I've filed these three bugs:

  4188--RCVD_HELO_IP_MISMATCH should check address literals (this was
argued against by Justin, but I'm convinced it's spam-sign)

  4186--RCVD_NUMERIC_HELO does not test reserved addresses (they are
still 'numeric' and aren't hostnames, and should still hit)

  4187--RCVD_ILLEGAL_IP does not fire in all cases (reserved, malformed,
and literals should all be tested, but aren't)

The rest of it can stay where it is and still be useful

Thanks

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SA addr tests need to be updated

2005-03-11 Thread Eric A. Hall


On 3/9/2005 1:38 PM, Eric A. Hall wrote:

 I think the four affected rules are RCVD_HELO_IP_MISMATCH,
 RCVD_NUMERIC_HELO, RCVD_ILLEGAL_IP, RCVD_BY_IP

Extending the problem report--it seems that these rules don't fire in some
instances. I haven't really checked this out yet, but addresses with a
leading octet of 111, 123, and some others at or below ~130 seem to get
skipped entirely (so do 99 and a few other two-digit numbers). Further,
in keeping with the notion that all-numeric is illegal, high-numbered
decimals (eg, 789) don't trip the RCVD_NUMERIC_HELO rule either.

Let me know what the plan is on this, as I can add these kinds of tests
to my private set, but would rather not if they'll be in the core set.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SA addr tests need to be updated

2005-03-11 Thread Eric A. Hall


On 3/11/2005 3:42 PM, Theo Van Dinter wrote:
 On Fri, Mar 11, 2005 at 03:25:06PM -0500, Eric A. Hall wrote:
 
 Extending the problem report--it seems that these rules don't fire in
 some instances. I haven't really checked this out yet, but addresses
 with a leading octet of 111, 123, and some others at or below ~130
 seem to get skipped entirely (so does 99 and a few other two-digit
 numbers).
 
 Yeah, they're all listed as reserved.  See M::SA::Constants for more
 detail...

I suspected as much. But even then, RCVD_NUMERIC_HELO should match in all
cases because all-numeric is always illegal (regardless of the number
itself, any number is illegal period). Furthermore, they should be firing
on RCVD_ILLEGAL_IP since they are also illegal--bonus ratware sign.
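
To make the point concrete, here is a rough sketch (in Python, not the
actual SA rule code) of a check that treats any bare dotted-numeric HELO
as a hit without consulting a reserved-address list, and separately flags
octets above 255. The function names are made up for illustration.

```python
import re

# Four dot-separated runs of digits, no brackets. The octet values are
# deliberately not range-checked here, so 789.1.2.3 still counts as numeric.
BARE_NUMERIC = re.compile(r"\d+(?:\.\d+){3}")


def is_bare_numeric_helo(helo: str) -> bool:
    """True for any all-numeric dotted quad, reserved or not."""
    return BARE_NUMERIC.fullmatch(helo.strip()) is not None


def has_out_of_range_octet(helo: str) -> bool:
    """True when a bare numeric HELO contains an octet above 255."""
    if not is_bare_numeric_helo(helo):
        return False
    return any(int(part) > 255 for part in helo.strip().split("."))
```

With this shape, 111.2.3.4 and 99.1.2.3 are numeric HELOs like any other,
and 789.1.2.3 is both numeric and out of range.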

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SA addr tests need to be updated

2005-03-10 Thread Eric A. Hall



You already got a couple of responses but let me pile on.

On 3/10/2005 3:17 AM, [EMAIL PROTECTED] wrote:

 However, I still believe it is perfectly legal to refuse mail if
 - the HELO matches my own MX, or lists one of my IPs

I do this too. My local networks get an immediate exception to all other
filters, and all other connections are queried against an LDAP server that
stores PERMIT/REJECT ACLs, with REJECT entries for my own networks. So if
a remote connection gets to that point in the process and claims to be me,
it's lying. Separately, I run a submission server on another port, which
uses strict authentication, and doesn't use the LDAP ACLs. All my clients
use the submission server, which allows them to roam.
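
The ordering described above can be sketched roughly as follows. This is
an illustration in Python only: the LDAP ACL lookup is replaced by
in-memory sets, and LOCAL_NETS, OWN_NAMES and helo_decision are
hypothetical names with example values.

```python
import ipaddress

# Assumed example values; a real setup would hold these in the ACL store.
LOCAL_NETS = [ipaddress.ip_network("192.0.2.0/24")]
OWN_NAMES = {"mx.example.com"}


def helo_decision(client_ip: str, helo: str) -> str:
    """Return PERMIT, REJECT, or CONTINUE for one SMTP connection."""
    ip = ipaddress.ip_address(client_ip)
    # Step 1: local networks are exempt from every later filter.
    if any(ip in net for net in LOCAL_NETS):
        return "PERMIT"
    # Step 2: a remote client that claims a local name or address is lying.
    claimed = helo.strip().strip("[]").lower()
    if claimed in OWN_NAMES:
        return "REJECT"
    try:
        claimed_ip = ipaddress.ip_address(claimed)
    except ValueError:
        return "CONTINUE"  # not an address; leave it to the other filters
    if any(claimed_ip in net for net in LOCAL_NETS):
        return "REJECT"
    return "CONTINUE"
```

The exemption has to come first, otherwise local hosts would trip the
REJECT entries for their own names.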

 - the MAIL FROM pretends to be one of my users

I don't recommend that. There's the eBay problem, but there are also
online newspapers and magazines (send this article) that use ~your
address as the envelope sender. There are some mailing groups that use
aliases instead of lists, and some mailing lists don't re-send the
message; in both cases the envelope sender doesn't get rewritten.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

SA addr tests need to be updated

2005-03-09 Thread Eric A. Hall


SA 3.0.2 currently performs a handful of tests against HELO greetings that
contain an IP address. These tests don't currently fire when an address
literal is used in the HELO greeting, but they should.

See section 3.6 of RFC 2821:

| -  The domain name given in the EHLO command MUST BE either a primary
|    host name (a domain name that resolves to an A RR) or, if the host
|    has no name, an address literal as described in section 4.1.1.1.

and section 4.1.3:

4.1.3 Address Literals

| Sometimes a host is not known to the domain name system and
| communication (and, in particular, communication to report and repair
| the error) is blocked.  To bypass this barrier a special literal form
| of the address is allowed as an alternative to a domain name.  For
| IPv4 addresses, this form uses four small decimal integers separated
| by dots and enclosed by brackets such as [123.255.37.2], which
| indicates an (IPv4) Internet Address in sequence-of-octets form.  For
| IPv6 and other forms of addressing that might eventually be
| standardized, the form consists of a standardized tag that
| identifies the address syntax, a colon, and the address itself, in a
| format specified as part of the IPv6 standards [17].

Technically, addresses that are NOT enclosed in brackets are illegal, but
those are the only ones that SA sniffs out currently.

Extending the current rules to include literals can probably be done by
simply changing the sniff code to look for open and close brackets, but I
haven't looked so I'm just guessing. As far as that goes, the tests might
already do this, and just not be firing.
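
As a rough illustration of what "look for open and close brackets" could
mean (a Python sketch only, not the SA sniff code, and IPv4 only), the
address can be pulled out of either form and then handed to the same
downstream tests. helo_address is a made-up name.

```python
import re
from typing import Optional, Tuple

_QUAD = r"\d{1,3}(?:\.\d{1,3}){3}"
LITERAL = re.compile(r"\[(" + _QUAD + r")\]")  # [123.255.37.2]
BARE = re.compile(r"(" + _QUAD + r")")         # 123.255.37.2


def helo_address(helo: str) -> Tuple[str, Optional[str]]:
    """Classify a HELO string and extract the IPv4 address, if any."""
    helo = helo.strip()
    match = LITERAL.fullmatch(helo)
    if match:
        return ("literal", match.group(1))
    match = BARE.fullmatch(helo)
    if match:
        return ("bare", match.group(1))
    return ("name", None)
```

Keeping the form ("literal" or "bare") separate from the address lets a
rule score the illegal bare form differently while still running the
address comparisons on both. Tagged IPv6 literals are not handled here.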

I think the four affected rules are RCVD_HELO_IP_MISMATCH,
RCVD_NUMERIC_HELO, RCVD_ILLEGAL_IP, RCVD_BY_IP


-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SA addr tests need to be updated

2005-03-09 Thread Eric A. Hall


On 3/9/2005 4:01 PM, Justin Mason wrote:

SA 3.0.2 currently performs a handful of tests against HELO greetings that
contain an IP address. These tests don't currently fire when an address
literal is used in the HELO greeting, but they should.
 
 actually, that's deliberate -- compare the frequencies of an RFC-2821
 address literal, vs. a raw address, and you'll notice that the latter
 is much more prevalent in spam.

That's true, but the rules that compare for addresses should still check
the address in literals.

I think the four affected rules are RCVD_HELO_IP_MISMATCH,
RCVD_NUMERIC_HELO, RCVD_ILLEGAL_IP, RCVD_BY_IP

if the addr doesn't check out...

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SA addr tests need to be updated

2005-03-09 Thread Eric A. Hall


On 3/9/2005 3:29 PM, List Mail User wrote:

 See section 3.6 of RFC 2821:
 
 | -  The domain name given in the EHLO command MUST BE either a primary
 |    host name (a domain name that resolves to an A RR) or, if the host
 |    has no name, an address literal as described in section 4.1.1.1.

 3.6 Domains

 used.  There are two exceptions to the rule requiring FQDNs: ...
 
 Nothing in either the section you have quoted, or the one I have allows
 a hostname which is not a FQDN to be used.

see the first exception, which is the text I cited above.

 Technically, addresses that are NOT enclosed in brackets are illegal,
 but those are the only ones that SA sniffs out currently.
 
 Of course, my machines just refuse these during the SMTP conversation, 

Many do.

BTW, postfix has similar problems wrt literals. For example, if postfix
gets a regular address (non-literal) in the HELO, it will split the
address into octets and do lookups for PERMIT/REJECT ACLs on incrementally
smaller sets, which is all very nice. But if it finds a literal, it
doesn't parse for the address inside, and treats the literal like a domain
name. Another bug here is that the strict-syntax checks in postfix don't
match against non-literal addresses, which they should (RFC1123 spells out
what is a valid hostname, and all-numerics is clearly not legal).
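
For illustration, the octet-splitting lookup could be extended to
literals along these lines. This is a Python sketch of the idea, not
postfix code, and lookup_keys is a hypothetical name.

```python
from typing import List


def lookup_keys(helo: str) -> List[str]:
    """Return ACL lookup keys from most to least specific, or [] if the
    HELO is not an IPv4 address (bare or bracketed)."""
    addr = helo.strip()
    # Unwrap an address literal so it is treated like the address inside.
    if addr.startswith("[") and addr.endswith("]"):
        addr = addr[1:-1]
    octets = addr.split(".")
    if len(octets) != 4 or not all(octet.isdigit() for octet in octets):
        return []
    # 1.2.3.4 -> 1.2.3.4, 1.2.3, 1.2, 1
    return [".".join(octets[:count]) for count in range(4, 0, -1)]
```

A HELO of [1.2.3.4] then yields the same keys as 1.2.3.4, so a REJECT
entry for 1.2.3 would catch both.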

 Please be careful and check the definitions and references in each 
 document

indeed

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/

Re: SA addr tests need to be updated

2005-03-09 Thread Eric A. Hall


On 3/9/2005 5:17 PM, List Mail User wrote:

   Postfix option reject_invalid_hostname will reject bare
 IPs (when used in the smtpd_helo_restrictions section of main.cf).

Good to hear this was fixed. I filed a bug report on it in May '04 but
didn't get much of a response. I'll have to upgrade.

-- 
Eric A. Hall  http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/
