[exim] SPAM Filtering - Losing the war!

2006-10-29 Thread Vitaly A Zakharov
In my opinion, every postmaster (except the Postgres daemon :-) MUST prevent
false positives as far as he can. Many mail systems are run by
incompetent system administrators, and there is a good chance that those
hosts will be blocked by a spam filter because the mailer is
misconfigured (or the DNS is misconfigured, if we are talking about
DNS checks).

But then there are public mail systems (free mail services, or
mail servers of hosting providers). Many of them are correctly configured.
However, their hosts may still be listed in DNSBLs (very often in Spamhaus,
IMCO) because a customer of the company (or a user of the free mail service)
sends spam. We make a mistake when we reject mail from those systems,
because their administrators quickly discover and stop such spam runs,
and most of the mail from those systems is legitimate.

Instead of starting holy-war flames, we SHOULD work out how to
decrease the volume of false-positive rejects in the spam checks of our mail
systems without decreasing the volume of true-positive rejects.
A lot of spammer hosts have more sins than being listed in an SBL or
giving "HELO friend". :-)

So here is a simple example for the Exim MTA that shows how to
block spammer hosts more accurately and make Exim work more efficiently:

First, we need three markers (because this simple configuration has three
checks). Each marker holds a boolean value (1 or 0) that records the
result of one spam check.

   warn  set acl_m0 = 0
         set acl_m1 = 0
         set acl_m2 = 0

Each sender MAY go through all of the spam tests and collect some labels
along the way. By examining these labels at the end, we assess the
character of the host and reject or accept the message.

We run the DNSBL listing test and remember the result in the variable acl_m0:

   warn  dnslists = sbl.spamhaus.org: \
                    bl.spamcop.net: \
                    relays.mail-abuse.org
         set acl_m0 = 1
         set acl_c0 = $acl_c0 Listed in DNSBL $dnslist_domain;

We also append the result to the status string variable acl_c0. We will need
it later. The verb warn means that no action is taken; it is only a
passive check.

The second spam test in our example checks that the PTR (reverse) and
A (forward) DNS records agree, using Exim's built-in verify condition
reverse_host_lookup:

   warn  !verify = reverse_host_lookup
         set acl_m1 = 1
         set acl_c0 = $acl_c0 Reverse host lookup failed;

The result is remembered in acl_m1.

The last spam test checks the argument of the HELO command; we put the
result in acl_m2.

   warn  !condition = ${if or {\
                        {eq{$sender_helo_name}{$sender_host_name}}\
                        {match{$sender_helo_name}{\\[$sender_host_address\\]}}\
                      }}
         set acl_m2 = 1
         set acl_c0 = $acl_c0 HELO forged;

The HELO argument should be an FQDN (Fully Qualified Domain Name) or an
IP literal (an IP address in brackets, like [1.2.3.4]) that points to the
sending host (this is a summary; see RFC 2821 for full details).

Now we have three results and must decide what action to take for the
SMTP session: abort it, accept the message, or defer.

This section decides how to handle the message. If two (or all three) of
the variables are true, the session is rejected; otherwise the message is
accepted.

   deny  condition = ${if and {\
                       {eq{$acl_m0}{1}}\
                       {eq{$acl_m1}{1}}\
                     }{1}{0}}
         message   = $acl_c0

   deny  condition = ${if and {\
                       {eq{$acl_m0}{1}}\
                       {eq{$acl_m2}{1}}\
                     }{1}{0}}
         message   = $acl_c0

   deny  condition = ${if and {\
                       {eq{$acl_m1}{1}}\
                       {eq{$acl_m2}{1}}\
                     }{1}{0}}
         message   = $acl_c0

   accept
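
For anyone who wants to check the decision rule away from a live server, here is a minimal Python sketch of the "two out of three" vote described above. The function name decide and its argument names are made up for this illustration; they are not part of Exim.

```python
def decide(dnsbl_listed: bool, rdns_failed: bool, helo_forged: bool) -> str:
    """Return 'deny' when at least two of the three checks failed, else 'accept'.

    The three flags play the role of acl_m0, acl_m1 and acl_m2 in the ACL.
    """
    # True counts as 1 and False as 0, so the sum is the number of failed checks.
    failures = sum([dnsbl_listed, rdns_failed, helo_forged])
    return "deny" if failures >= 2 else "accept"


if __name__ == "__main__":
    # A host that is only DNSBL-listed still gets its mail through.
    print(decide(True, False, False))
    # A DNSBL-listed host with broken reverse DNS is refused.
    print(decide(True, True, False))
```

A single failed check never leads to a reject here, which is the whole point of the scheme: a well-run host that happens to be listed is not turned away on that evidence alone.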

Note that if a mailer rejects a message, it MUST include a description of
the reason for the rejection in the text sent with the 5xx status code. This
helps postmasters resolve problems in the case of a false positive.

Exim provides the means to detect many traces of spam and handles
them very well.

Here is an example of a more advanced configuration that computes a
spam weight for the session by adding up a spam score. It is based on the
assumption that a DNSBL listing is more significant than a mismatch
between the DNS records, which in turn is more significant than a forged
HELO argument:

   warn  set acl_m0 = 0

   warn  dnslists = sbl.spamhaus.org: \
                    bl.spamcop.net: \
                    relays.mail-abuse.org
         set acl_m0 = ${eval:$acl_m0+30}
         set acl_c0 = $acl_c0 Listed in DNSBL $dnslist_domain;

   warn  !verify = reverse_host_lookup
         set acl_m0 = ${eval:$acl_m0+10}
         set acl_c0 = $acl_c0 Reverse host lookup failed;

   warn  !condition =
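
The weighted scheme can also be tried out away from Exim. The following is a rough Python sketch, not the author's configuration: the weights 30 (DNSBL) and 10 (reverse DNS) are taken from the snippet above, while the HELO weight, the reject threshold and all names are assumptions chosen only for illustration.

```python
# Weights for "dnsbl" and "rdns" come from the config snippet above.
# The "helo" weight and REJECT_THRESHOLD are hypothetical values.
WEIGHTS = {"dnsbl": 30, "rdns": 10, "helo": 5}
REJECT_THRESHOLD = 40  # hypothetical cut-off


def spam_score(failed_checks):
    """Add up the weights of all checks that failed for this session."""
    return sum(WEIGHTS[name] for name in failed_checks)


def decide_weighted(failed_checks, threshold=REJECT_THRESHOLD):
    """Return 'deny' once the accumulated score reaches the threshold."""
    return "deny" if spam_score(failed_checks) >= threshold else "accept"


if __name__ == "__main__":
    print(spam_score(["dnsbl", "rdns"]), decide_weighted(["dnsbl", "rdns"]))
    print(spam_score(["helo"]), decide_weighted(["helo"]))
```

With these assumed numbers a DNSBL listing alone (30) is not enough to reject, but a listing combined with failed reverse DNS (40) is.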

Re: [exim] SPAM Filtering - Losing the war!

2006-10-29 Thread W B Hacker
Vitaly A Zakharov wrote:

*snip* (details of some well-written examples...)

We would add that it can be very beneficial to defer actually 'acting on' these 
strict tests (rDNS fail, HELO mismatch, RBL hit, etc.) until at least 
acl_smtp_rcpt phase, where 'per-recipient' filtering is practical.

The reasons are economic.

Given that in any given 'organization-specific' domain - and arrivals are 
grouped by target domain - there is, or most often *should be* - at least one 
address that is *very* forgiving, and many others that are less so.

Example: Clients to whom a missed opportunity for a unit sale to a new customer 
is worth several thousand US$ per each. New user registrations.  Helpdesks.

So - a 'sales@domain.tld', 'info@domain.tld' or similar spam-target
initial-point-of-contact address needs the Mark 1 human eyeball to sort copious
arrivals of spam in order to find the one or two potentially valuable arrivals -
then respond and whitelist them if need be.

Best if staff can share that sort of unpleasant workload!

In acl_smtp_rcpt, we can pull the per-recipient thresholds, still reject any/all
that are NOT 'tolerant' recipients, and onpass only the survivors.
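
As a rough illustration of that per-recipient step, here is a small Python sketch. The threshold table, the default value and the function name are all hypothetical; in a real setup the thresholds would come from whatever lookup the site uses in acl_smtp_rcpt.

```python
# Hypothetical per-recipient limits: 'tolerant' addresses get a very high
# limit, everyone else falls back to the default.
DEFAULT_THRESHOLD = 30
RECIPIENT_THRESHOLDS = {
    "sales@domain.tld": 1000,
    "info@domain.tld": 1000,
}


def surviving_recipients(session_score, recipients):
    """Keep only the recipients whose threshold the session score stays below."""
    survivors = []
    for rcpt in recipients:
        limit = RECIPIENT_THRESHOLDS.get(rcpt.lower(), DEFAULT_THRESHOLD)
        if session_score < limit:
            survivors.append(rcpt)
    return survivors


if __name__ == "__main__":
    # A dubious sender (score 40) still reaches sales@, but not bob@.
    print(surviving_recipients(40, ["sales@domain.tld", "bob@domain.tld"]))
```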

Also - the 'tighter' the filters, the more attention needs to be paid to
maintaining very current exception whitelists and applying code that has a
similar 'automagical' effect. e.g. - allowing traffic from any domain your
clients have intentionally *sent to* [ever | x-times in y-months], and similar
lookups.

We are, after all, not supposed to shoot the bystanders in this 'war'.

;-)

Bill





-- 
## List details at http://www.exim.org/mailman/listinfo/exim-users 
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://www.exim.org/eximwiki/


Re: [exim] SPAM Filtering - Losing the war!

2006-10-29 Thread Oliver Egginger
> Use of this intelligent behavior of MTA we can handle mail
> transactions more accuracy. This is very simple and efficient way.

We use a similar approach. I wouldn't call it simple. I wouldn't call
it "intelligent". It's simply hard work. We combine ASN, hostname,
RBL and HELO tests, DNS address verification and several other sanity
checks in the pre-data phase. Nevertheless, a lot of spam and malware
pass through. Only a (full-featured) SpamAssassin in conjunction with
ClamAV is able to clean it up.

- oliver



Re: [exim] SPAM Filtering - Losing the war!

2006-10-29 Thread Vitaly A Zakharov
Oliver Egginger wrote:
> > Use of this intelligent behavior of MTA we can handle mail
> > transactions more accuracy. This is very simple and efficient way.
>
> We use a similar approach. I wouldn't call it simple. I wouldn't call
> it "intelligent". It's simply hard work. We combine ASN, hostname,
> RBL and HELO tests, DNS address verification and several other sanity
> checks in the pre-data phase. Nevertheless, a lot of spam and malware
> pass through. Only a (full-featured) SpamAssassin in conjunction with
> ClamAV is able to clean it up.

You just did not understand the basic point of my post. Or maybe my English
is so bad that it is hard to understand what I write. :-)

I never said "Do not use Bayesian filters!"
I never said "Do not use antivirus!"

As for intelligence:
Three or four checks are not intelligent behavior, but ~20 tests + greylisting
+ challengelisting + blacklisting + whitelisting, with the MTA logic used to
combine all of this, really is intelligence.

> Nevertheless, a lot of spam and malware
> pass through.

Try a well-known construction, placed just above the virus check in the Exim
configuration:

acl_check_mime:

   warn   decode    = default
   drop   message   = Blacklisted file extension detected.
          condition = ${if match{${lc:$mime_filename}}{\N(\.cpl|\.pif|\.bat|\.scr|\.lnk|\.com|\.hta)$\N}{1}{0}}

   accept

You would be surprised: the volume of viruses will drop by about half.


Re: [exim] SPAM Filtering - Losing the war!

2006-10-29 Thread SeattleServer.com
On Sunday 29 October 2006 05:36, Vitaly A Zakharov wrote:
> Try to use a well-known construction, just above virus checking in Exim
> configuration:
>
> acl_check_mime:
>
>    warn   decode    = default
>    drop   message   = Blacklisted file extension detected.
>           condition = ${if match{${lc:$mime_filename}}{\N(\.cpl|\.pif|\.bat|\.scr|\.lnk|\.com|\.hta)$\N}{1}{0}}
>
>    accept
>
> You would be surprised, the volume of viruses will decrease about a half.

You would be surprised, the number of users who complain because these 
extensions (especially .lnk and .scr) are blocked.

In fact it was such a common problem among our (mostly non-IT) users, that we 
ended up defaulting to NOT blocking executable extensions, though it can be 
turned on per-domain.

I don't really like blocking simply on extension anyways - I ran into it 
myself when trying to E-mail an HTML file without an extension (it was named 
simply somedomain.com).
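
The false positive is easy to reproduce outside Exim. This is a small Python sketch that applies the same pattern as the quoted ACL to a lowercased filename; the function name is_blocked is made up for the illustration, and Python's re module stands in for the PCRE matching Exim does.

```python
import re

# Same extension list as in the quoted acl_check_mime rule.
BLOCKED = re.compile(r"(\.cpl|\.pif|\.bat|\.scr|\.lnk|\.com|\.hta)$")


def is_blocked(filename: str) -> bool:
    """Mimic ${lc:$mime_filename} followed by the regex match."""
    return BLOCKED.search(filename.lower()) is not None


if __name__ == "__main__":
    for name in ("Invoice.PIF", "report.pdf", "somedomain.com"):
        print(name, is_blocked(name))
```

A file called somedomain.com is caught just like a DOS .com executable, because the rule only looks at the end of the name and never at the content.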

Cheers,
-- 
SeattleServer.com Mailing Lists - Casey Allen Shobe
[EMAIL PROTECTED] - http://seattleserver.com



Re: [exim] SPAM Filtering - Losing the war!

2006-10-29 Thread W B Hacker
SeattleServer.com wrote:
> On Sunday 29 October 2006 05:36, Vitaly A Zakharov wrote:
> > Try to use a well-known construction, just above virus checking in Exim
> > configuration:
> >
> > acl_check_mime:
> >
> >    warn   decode    = default
> >    drop   message   = Blacklisted file extension detected.
> >           condition = ${if match{${lc:$mime_filename}}{\N(\.cpl|\.pif|\.bat|\.scr|\.lnk|\.com|\.hta)$\N}{1}{0}}
> >
> >    accept
> >
> > You would be surprised, the volume of viruses will decrease about a half.
>
> You would be surprised, the number of users who complain because these
> extensions (especially .lnk and .scr) are blocked.
>
> In fact it was such a common problem among our (mostly non-IT) users, that we
> ended up defaulting to NOT blocking executable extensions, though it can be
> turned on per-domain.
>
> I don't really like blocking simply on extension anyways - I ran into it
> myself when trying to E-mail an HTML file without an extension (it was named
> simply somedomain.com).
>
> Cheers,

We have two such rules - both with far more extensive lists, as we cover mostly
Mac and other 'non-MS' platforms. Both add 'points' and user prefs do
modification to 'Subject:' and quarantining.

- But the 'surprise' here is that they almost never triggered until recently.

Client branch offices that need to send photos and such are whitelisted and/or
trained to alter the file extent or encapsulate, and the villainous *were* being
stopped before they got as far as that.

That said, the recent rise in otherwise innocuous body with text-bearing graphic
attached says we need a server-global tightening up on a *combination* of
any-graphic + [stranger AND/OR rudebugger].

- Where 'stranger' is anyone we have never sent 'TO:', and 'rudebugger' is
weighted scores for failure on rDNS, HELO, dynamic-IP, RBL, header format
... etc.
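
A minimal Python sketch of that combination rule, for illustration only. The function name, the set of previously contacted domains and the cut-off of 20 points are assumptions, not values we actually run with.

```python
def should_reject(has_image, sender_domain, domains_sent_to,
                  rude_score, rude_threshold=20):
    """Reject a graphic-bearing message from a stranger and/or a rudebugger.

    domains_sent_to: set of lowercase domains our users have sent mail TO.
    rude_score:      weighted score for rDNS, HELO, dynamic-IP, RBL etc.
    rude_threshold:  hypothetical cut-off for calling a host a 'rudebugger'.
    """
    stranger = sender_domain.lower() not in domains_sent_to
    rudebugger = rude_score >= rude_threshold
    # "AND/OR" is an inclusive or: either property is enough.
    return has_image and (stranger or rudebugger)


if __name__ == "__main__":
    known = {"client.example"}
    print(should_reject(True, "client.example", known, rude_score=5))
    print(should_reject(True, "unknown.example", known, rude_score=5))
```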

If we have to get into the insanity of CPU cycles needed for OCR inspection of
graphics, I'd call that a dead loss, strip the dodgy attachments, and point the
user community back to their fax machines (color, for the most part) or FedEx.

:-(

Bill



Re: [exim] SPAM Filtering - Losing the war!

2006-10-26 Thread Odhiambo G. Washington
* On 23/10/06 21:16 -0200, Marlon Cabrera Oliveira wrote:
| Hi,
| 
| > To tell you the truth I'm losing ground lately against spammers. Two
| > reasons. The Image spam is getting through and because it poisons the
| > bayes I've lost much of the effectiveness of bayes filtering. I'm still
| > holding on but I've had people who I hosted for for over a year who
| > never had a single spam who are now getting a few. I am also having a
| > few more false positives than I used to.
| 
| 
| I'm having success here detecting image spam using the OSBF-Lua filter:
| 
| From the OSBF-Lua website:
| 
| OSBF-Lua (Orthogonal Sparse Bigrams with confidence Factor) is a Lua C module
| for text classification. It is a port of the OSBF classifier implemented in 
| the CRM114 project. This implementation attempts to put focus on the 
| classification task itself by using Lua as the scripting language, a powerful 
| yet light-weight and fast language, which makes it easier to build and test 
| more elaborated filters and training methods.
| 
| The OSBF algorithm is a typical Bayesian classifier but enhanced with two 
| techniques that I originally developed for the CRM114 project: Orthogonal 
| Sparse Bigrams - OSB, for feature extraction, and the Exponential 
| Differential Document Count - EDDC (a.k.a Confidence Factor) for automatic 
| feature selection. Combined, these two techniques produce a highly accurate 
| classifier. OSBF was developed focused on two classes, SPAM and NON-SPAM, so 
| the performance for more than two classes may not be the same.
| 
| 
| OSBF-Lua learns very fast. It only requires Lua 5.1 installed on the Exim
| server with dynamic loading enabled.
| See the install doc: http://osbf-lua.luaforge.net/#installation
| 
| 
| In exim.conf I add these statements:
| 
| On ## ON CONFIGURATION SETTINGS ##
| 
| # set OSBF_LUA_DIR to where spamfilter.lua, spamfilter_command.lua etc. were
| # installed
| OSBF_LUA_DIR=/usr/local/osbf-lua
| 
| 
| On ## TRANSPORTS CONFIGURATION ##
| 
| 
| add transport_filter to the local_delivery transport:
| 
| local_delivery:
|    driver = appendfile
|    check_string =
|    create_directory
|    delivery_date_add
|    directory = ${home}/Maildir/
|    directory_mode = 700
|    envelope_to_add
|    return_path_add
|    group = mail
|    maildir_format
|    maildir_tag = ,S=$message_size
|    message_prefix =
|    message_suffix =
|    mode = 0600
|    quota = ${lookup{$local_part}lsearch*{/etc/mail/quota_usr}{$value}{4M}}
|    quota_size_regex = S=(\d+)$
|    quota_warn_threshold = 75%
|    transport_filter = OSBF_LUA_DIR/spamfilter.lua --udir $home/osbf-lua
| 
| 
| that's it!! :)
| 
| 
| Verify your setup by sending a message to yourself with the following in the
| subject line: help <your password>
| 
| You will receive a message with help about the spam filter.
| 
| To verify that the databases were created correctly: stats <your password>
| 
| From now on, all messages that you receive will be classified and tagged
| according to the score they get:
| 
| Tag   Meaning
| 
| [--]  almost sure it's spam - score <= -20
| 
| [-]   probably it's spam (reinforcement zone) - score < 0 and > -20
| 
| [+]   probably it's not spam (reinforcement zone) - score >= 0 and < 20
| 
| [++]  almost sure it's not spam - score >= 20. This tag is here just for
| symmetry, it's not used. An empty tag is used in place of it so as not to
| pollute the messages.
| 
| 
| If the classification is wrong you must train the filter by sending the
| message back to yourself, replacing the subject with the corresponding
| training command:
| 
| learn <password> spam or learn <password> nonspam
| 
| 
| After training on a few messages, osbf-lua will become more accurate at spam
| detection.
| If you have a pre-classified message database (nonspam / spam) in an IMAP
| folder, you can use the script toer.lua to do the training.

This doesn't look like a good solution. We simply don't want to accept 
the message, if that were possible. Of course I know it's possible with 
Exim, but the fact that this still leans towards SpamAssassin-ism 
If this could be integrated within Exiscan framework, then I'd rethink
my stand.


cheers
   - wash 
+--+-+
Odhiambo Washington . WANANCHI ONLINE LTD (Nairobi, KE)  |
wash () WANANCHI ! com  . 1ere Etage, Loita Hse, Loita St.,  |
GSM: (+254) 722 743 223 . # 10286, 00100 NAIROBI |
GSM: (+254) 733 744 121 . (+254) 020 313 985 - 9 |
+-+--+
Oh My God! They killed init! You Bastards!  
 --from a /. post



Re: [exim] SPAM Filtering - Losing the war!

2006-10-26 Thread Ian FREISLICH
John W. Baxter wrote:
> On 10/23/06 11:56 PM, Johann Spies [EMAIL PROTECTED] wrote:
>
> > I have not implemented greylisting so far.  Maybe it is time to do so. I
> > am not quite convinced that it is an unmixed blessing.  Can somebody
> > convince me?
>
> It is helpful, but you MUST whitelist sensibly.  There are at least
> three classes of things in "sensibly":
>
> 1. neighbor ISPs and others that routinely send mail to your users but
> are not large and well known.
>
> 2. large and well known senders which will pass greylisting anyhow
> even though spam comes from them (eg hotmail)
>
> 3. broken sending servers (there is a nice list available somewhere on
> the www.greylisting.org site that will serve as a starting point).
>
> We have just under 700 entries in our whitelist database table.

Here, the greylist feeds a whitelist because any host that passes
the greylist test need not ever be greylisted again.  My greylist
and whitelist detect HELO morphing, so the host gets re-greylisted
if the HELO changes.
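
To make the idea concrete, here is a simplified in-memory Python sketch. It is not my actual source: the class name, the 300 second delay and the choice to key entries on (IP address, HELO name) only are assumptions made for the illustration, and a real deployment keeps this in a database.

```python
import time


class Greylist:
    """Greylist that promotes hosts to a whitelist once they retry."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay   # seconds a new host must wait (assumed)
        self.greylist = {}           # (ip, helo) -> time first seen
        self.whitelist = set()       # (ip, helo) pairs that passed

    def check(self, ip, helo, now=None):
        """Return 'accept' or 'defer' for a connection from ip using helo."""
        now = time.time() if now is None else now
        key = (ip, helo)             # a changed HELO gives a new, unknown key
        if key in self.whitelist:
            return "accept"
        first_seen = self.greylist.get(key)
        if first_seen is None:
            self.greylist[key] = now
            return "defer"
        if now - first_seen < self.min_delay:
            return "defer"
        # The host retried after the delay: move it to the whitelist.
        del self.greylist[key]
        self.whitelist.add(key)
        return "accept"
```

Because the HELO name is part of the key, a whitelisted address that suddenly announces a different HELO is treated as an unknown host and deferred again.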

mail=# select count(*) from greylist ;
 count 
-------
 48839
(1 row)

mail=# select count(*) from whitelist ;
 count  
--------
 160828
(1 row)

Pretty representative for a few hundred servers.  I am willing to
share my home (non-work) copy of the greylist source.  I find it
cuts out a large portion of my spam load.

Ian

--
Ian Freislich



Re: [exim] SPAM Filtering - Losing the war!

2006-10-26 Thread Marlon
> This doesn't look like a good solution. We simply don't want to accept
> the message, if that were possible. Of course I know it's possible with
> Exim, but the fact that this still leans towards SpamAssassin-ism
> If this could be integrated within Exiscan framework, then I'd rethink
> my stand.


Unfortunately, the world is not perfect. There are a lot
of mail servers around that are not RFC compliant.
If you are aggressive in your ACLs, good messages will be dropped.
Statistical filters like osbf-lua evolve: if spam behavior changes, the
filter can be trained with a false negative and the new spam attack will be
blocked.
End users differ in how they use mail; statistical filters can
evolve and adapt.
In my view, it is necessary to use more than one layer of spam protection
nowadays.
Exim ACLs combined with statistical filters are a very good choice.
The problem with statistical filters is that they are not an out-of-the-box
solution: you have to feed them a good number of messages (over 2,500) to
get 99% accuracy, but the final result is great.


Regards,

Marlon




Re: [exim] SPAM Filtering - Losing the war!

2006-10-26 Thread John W. Baxter
On 10/26/06 5:58 AM, Ian FREISLICH [EMAIL PROTECTED] wrote:

> Here, the greylist feeds a whitelist because any host that passes
> the greylist test need not ever be greylisted again.

We didn't want to go that way, because of accidental passing of the
greylisting (the engine trying again within expiration time).  We saw a
bunch of that just under a year ago with THAT VIRUS (whatever its names
were--the fairly convincing FBI, etc thing).

It may be time to revisit that, but perhaps using a count of the passed
messages after the initial greylist pass.  (Our database could be mined for
that pretty easily.  Hmmm.)
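
Something along these lines, as a rough Python sketch rather than what our daemon does today. The class name and the figure of five passes are placeholders.

```python
from collections import Counter


class PassCounter:
    """Whitelist a host only after it has passed greylisting several times."""

    def __init__(self, required_passes=5):
        self.required_passes = required_passes  # placeholder value
        self.passes = Counter()                 # host -> passed messages

    def record_pass(self, host):
        """Call once for every message accepted after a greylist pass."""
        self.passes[host] += 1

    def is_whitelisted(self, host):
        return self.passes[host] >= self.required_passes
```

One accidental retry inside the expiration window then counts as a single pass and is not enough on its own to exempt a host from greylisting.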

[We rolled our own greylisting when we started, since none of the available
solutions seemed robust at that time.  Our Python daemon which does the work
has an unknown mean time between crashes, since the servers are restarted too
often for security updates for us to see daemon crashes.]

  --John





Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Peter Bowyer
On 25/10/06, Marc Perkel [EMAIL PROTECTED] wrote:


> Wakko Warner wrote:
>
> > Have you tried gocr?  I have, it's not the greatest, but it is possible that
> > it can read the image and allow you to react on the content of the image.
> > It's not that great yet, but I think it could have potential (until spammers
> > start using the other characters in the images like they are in text spam)
>
> I have started to install it several times and got distracted. I think I
> will try it.

We've just implemented the FuzzyOCR plugin for SA, which uses gocr
under the covers. It's catching 100% of the current wave of image spam
with no FPs reported yet. Very impressed.

Since the OCR is fairly expensive I might look at pulling out the SA
plugin bits and implementing the guts as a perl module to call
directly from Exim (before SA), where I can be a bit more choosy about
which attachments I choose to process.

Peter


-- 
Peter Bowyer
Email: [EMAIL PROTECTED]



Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Mike Meredith
Sometime around Tue, 24 Oct 2006 18:39:34 +0100, it may be that Chris
Lightfoot wrote:
> The point here is *false* positives, and whether the

Or what I term 'unintended blocks' ... a term I've been using since
before spam was a large problem. (I recall I used it in reference to
accidentally blocking coloured book mail forwarded from a VAX onto a
Unix box because the address format was 'invalid').

> decision about whether something should be treated as spam
> should be up to the addressee, or up to some MTA
> administrator exercising a technical prejudice.

Different organisations have different levels at which unintended
blocks are acceptable. An ISP may prefer a level of zero (where the
user is paying for a service); another ISP who claims to block most
spam may have a level somewhat higher. 

Where the email address is provided primarily for work related purposes
and email is read during work time, the organisation may decide that a
higher level of unintended blocks is acceptable. Letting users deal
with spam has a high cost to an organisation (a back of the envelope
calculation several years ago indicated I was saving £500,000 a year
with anti-spam measures ... and I was probably underestimating by quite
a way).

Sometimes I get it right, and sometimes I get it wrong. But as I've
been a Postmaster for 13 years and haven't been fired yet, I'm
obviously not totally off the wall.

-- 
Mike Meredith, Senior Informatics Officer
University of Portsmouth: Hostmaster, Postmaster and Security 
  If you play the Windows CD backwards you hear a satanic message.
  But it gets worse... If you play it forwards it installs Windows.



Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Peter McEvoy
On Wed, Oct 25, 2006 at 07:04:53AM +0100, Peter Bowyer wrote:
> We've just implemented the FuzzyOCR plugin for SA, which uses gocr
> under the covers. It's catching 100% of the current wave of image spam
> with no FPs reported yet. Very impressed.
>
> Since the OCR is fairly expensive I might look at pulling out the SA
> plugin bits and implementing the guts as a perl module to call
> directly from Exim (before SA), where I can be a bit more choosy about
> which attachments I choose to process.

You mean sort of like this:

http://spam.co.nz/nsfo/


-- 
Pete



Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Peter Bowyer
On 25/10/06, Peter McEvoy [EMAIL PROTECTED] wrote:
> On Wed, Oct 25, 2006 at 07:04:53AM +0100, Peter Bowyer wrote:
> > We've just implemented the FuzzyOCR plugin for SA, which uses gocr
> > under the covers. It's catching 100% of the current wave of image spam
> > with no FPs reported yet. Very impressed.
> >
> > Since the OCR is fairly expensive I might look at pulling out the SA
> > plugin bits and implementing the guts as a perl module to call
> > directly from Exim (before SA), where I can be a bit more choosy about
> > which attachments I choose to process.
>
> You mean sort of like this:
>
> http://spam.co.nz/nsfo/

Err... yes, almost entirely like that - glad I only got as far as
looking at the FuzzyOCR code...

Thanks!

Peter

-- 
Peter Bowyer
Email: [EMAIL PROTECTED]



Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Ian Eiloart

On 24 Oct 2006, at 18:27, W B Hacker wrote:


> That is a cop-out.
>
> No 'national' snail-mail postal service, nor private courier will, or would
> allow themselves to be forced to - carry hazardous, offensive - or merely
> 'non-compliant' packages, properly 'addressed' or not.
>
> They are *required* to reject such, and the recipients generally expect them to
> do so.

Actually, that's not true. You aren't allowed to post illegal or
hazardous material, but the UK's Royal Mail doesn't make judgements
about offensive material, and *anything* non-hazardous with a
recipient name on it is required to be delivered.
http://www.royalmail.com:80/portal/rm/content1?catId=400044&mediaId=400255


There are recommendations for packaging, but no requirements.
http://www.royalmail.com:80/portal/rm/content1?catId=400044&mediaId=400251
Even without a proper address, or with an incorrect address, the Royal
Mail is required to make best efforts to deliver. For example, lots
of post is delivered to well known names (like BBC or Santa) even
when there's no proper address given. The Royal Mail will open
envelopes or consult telephone directories to find an address, if
necessary.

The delivery might be dependent on making up any shortfall in postage
paid.

None of that means that the same rules should apply to email, though.
-- 
Ian Eiloart
Postmaster,
IT Services
University of Sussex






Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Peter Bowyer
On 25/10/06, Ian Eiloart [EMAIL PROTECTED] wrote:

> None of that means that the same rules should apply to email, though.

Indeed - to continue the use of inappropriate analogies - an
organisation's postmaster is more like the office mailroom manager
than it is the Royal Mail - and the mailroom may well have its own
rules about what mail it deals with which are stricter than those of
Royal Mail

Peter

-- 
Peter Bowyer
Email: [EMAIL PROTECTED]



Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Ian Eiloart

On 24 Oct 2006, at 19:12, W B Hacker wrote:



> Nonsense!  Refusing to accept...
>
> - a parcel containing live animals. Or unrefrigerated dead ones

Both permitted by the Royal Mail if properly packaged. Live animals
by special arrangement, dead ones presumably qualify as meat or
other perishables, and merely need to be marked perishable and sent
first class. First class gets delivered next day 90% of the time.

> - hazardous chemicals, flammables, etc.

The Royal Mail doesn't accept these.

> - NO postage

Royal Mail accepts such post from mailboxes, but not over the
counter, and asks the recipient to pay postage plus an excess.

> - NO, or known-forged, sender information.

Most mail in the UK doesn't carry sender information on the envelope.

> - cross-border shipments without proper customs information
>
> Try any of these with your local FedEx or Post Office and see how
> far you get.
>
> Bill

-- 
Ian Eiloart
Postmaster, University of Sussex
()  ascii ribbon campaign - against html mail
/\                        - against microsoft attachments







Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Marc Perkel


Peter Bowyer wrote:
> On 25/10/06, Marc Perkel [EMAIL PROTECTED] wrote:
> > Wakko Warner wrote:
> > > Have you tried gocr?  I have, it's not the greatest, but it is possible that
> > > it can read the image and allow you to react on the content of the image.
> > > It's not that great yet, but I think it could have potential (until spammers
> > > start using the other characters in the images like they are in text spam)
> >
> > I have started to install it several times and got distracted. I think I
> > will try it.
>
> We've just implemented the FuzzyOCR plugin for SA, which uses gocr
> under the covers. It's catching 100% of the current wave of image spam
> with no FPs reported yet. Very impressed.
>
> Since the OCR is fairly expensive I might look at pulling out the SA
> plugin bits and implementing the guts as a perl module to call
> directly from Exim (before SA), where I can be a bit more choosy about
> which attachments I choose to process.
>
> Peter

That would be great if we can create an Exim equiv of the Fuzzy OCR 
code. I'd rather drop it in Exim so that SA never sees it and therefore 
never gets to poison my bayes filters.



Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Johann Spies
On Wed, Oct 25, 2006 at 10:49:13AM +0100, Peter McEvoy wrote:
> On Wed, Oct 25, 2006 at 07:04:53AM +0100, Peter Bowyer wrote:
> > We've just implemented the FuzzyOCR plugin for SA, which uses gocr
> > under the covers. It's catching 100% of the current wave of image spam
> > with no FPs reported yet. Very impressed.
> >
> > Since the OCR is fairly expensive I might look at pulling out the SA
> > plugin bits and implementing the guts as a perl module to call
> > directly from Exim (before SA), where I can be a bit more choosy about
> > which attachments I choose to process.
>
> You mean sort of like this:
>
> http://spam.co.nz/nsfo/

I get an error with the setup you suggest there:

error in ACL: unknown ACL condition/modifier in decode = default

This is with acl_check_mime placed just before acl_check_data.

Regards
Johann
-- 
Johann Spies  Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

 Only take heed to thyself, and keep thy soul  
  diligently, lest thou forget the things which thine 
  eyes have seen, and lest they depart from thy heart 
  all the days of thy life; but teach them to thy sons, 
  and to thy sons' sons...Deuteronomy 4:9 



Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread Peter Bowyer
On 25/10/06, Marc Perkel [EMAIL PROTECTED] wrote:


 Peter Bowyer wrote:
 On 25/10/06, Marc Perkel [EMAIL PROTECTED] wrote:

 Wakko Warner wrote:



 Have you tried gocr? I have, it's not the greatest, but it is possible that
 it can read the image and allow you to react on the content of the image.
 It's not that great yet, but I think it could have potential (until spammers
 start using the other characters in the images like they are in text spam)

 I have started to install it several times and got distracted. I think I
 will try it.

 We've just implemented the FuzzyOCR plugin for SA, which uses gocr
 under the covers. It's catching 100% of the current wave of image spam
 with no FPs reported yet. Very impressed.

 Since the OCR is fairly expensive I might look at pulling out the SA
 plugin bits and implementing the guts as a perl module to call
 directly from Exim (before SA), where I can be a bit more choosy about
 which attachments I choose to process.

Peter




 That would be great if we can create an Exim equiv of the Fuzzy OCR code.
 I'd rather drop it in Exim so that SA never sees it and therefore never gets
 to poison my bayes filters.


Agreed - the example posted earlier is by its own admission not
'fuzzy' - I'll do some testing on it and see how it performs. Maybe it
needs a bit of the fuzziness from FuzzyOCR.

Peter

-- 
Peter Bowyer
Email: [EMAIL PROTECTED]



Re: [exim] SPAM Filtering - Losing the war!

2006-10-25 Thread W B Hacker
Ian Eiloart wrote:
 
 On 24 Oct 2006, at 18:27, W B Hacker wrote:
 

 That is a cop-out.

 No 'national' snail-mail postal service, nor private courier will, or would
 allow themselves to be forced to - carry hazardous, offensive - or merely
 'non-compliant' packages, properly 'addressed' or not.

 They are *required* to reject such, and the recipients generally expect them
 to do so.
 
 
 Actually, that's not true. You aren't allowed to post illegal or  
 hazardous material, but the UK's Royal Mail doesn't make judgements  
 about offensive material,

Sorry - should have been more specific - not 'offensive' in the sense of
pictures or politics, but 'offensive' in the sense of being smeared in
cow-dung, Limburger cheese in a paper envelope - that sort of 'offensive'.
IOW - disturbing, distracting, perhaps hazardous to the health of those who
must handle it, etc. when passing thru the handling process.

 and *anything* non-hazardous with a recipient
 name on it is required to be delivered.
 http://www.royalmail.com:80/portal/rm/content1?catId=400044&mediaId=400255


Royal Mail in particular, and UK Civil Service (most of whom *are* 'civil') in
general, are known for being more helpful than, for example, their US
counterparts.

 
 There are recommendations for packaging, but no requirements.
 http://www.royalmail.com:80/portal/rm/content1?catId=400044&mediaId=400251
 Even without a proper address, or with an incorrect address the Royal
 Mail is required to make best efforts to deliver. For example, lots of
 post is delivered to well known names (like BBC or Santa) even when
 there's no proper address given. The Royal Mail will open envelopes or
 consult telephone directories to find an address, if necessary.

 The delivery might be dependent on making up any shortfall in postage
 paid.
 
 None of that means that the same rules should apply to email, though.

Well - the gist of it is that *some* rules and *some* concern for the 'health 
and safety' of the community are certainly appropriate.

Not to forget the many years that not only GPO, but essentially every business 
in London had 'special ways' of handling mail and machines to sniff for the odd 
unwanted surprise.

'Freedom to speak', BTW does not equate to 'right to force others to listen'.

Nor does 'willingness to accept' equate to a right to force others to carry 
whatever it is one is willing to accept if/as/when so doing violates other 
appropriate rules, terms, conditions or could place the server and/or other 
clients at undue risk.

Bill






Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread marlon
Hi,

 To tell you the truth I'm losing ground lately against spammers. Two
 reasons. The Image spam is getting through and because it poisons the
 bayes I've lost much of the effectiveness of bayes filtering. I'm still
 holding on but I've had people who I hosted for for over a year who
 never had a single spam who are now getting a few. I am also having a
 few more false positives than I used to.


I'm having success here detecting image spam using the OSBF-Lua filter:

From the OSBF-Lua website:

OSBF-Lua (Orthogonal Sparse Bigrams with confidence Factor) is a Lua C module 
for text classification. It is a port of the OSBF classifier implemented in 
the CRM114 project. This implementation attempts to put focus on the 
classification task itself by using Lua as the scripting language, a powerful 
yet light-weight and fast language, which makes it easier to build and test 
more elaborated filters and training methods.

The OSBF algorithm is a typical Bayesian classifier but enhanced with two 
techniques that I originally developed for the CRM114 project: Orthogonal 
Sparse Bigrams - OSB, for feature extraction, and the Exponential 
Differential Document Count - EDDC (a.k.a Confidence Factor) for automatic 
feature selection. Combined, these two techniques produce a highly accurate 
classifier. OSBF was developed focused on two classes, SPAM and NON-SPAM, so 
the performance for more than two classes may not be the same.



OSBF-Lua learns very fast. It only requires Lua 5.1 installed on the Exim
server, with dynamic loading enabled.
See the install doc: http://osbf-lua.luaforge.net/#installation


In exim.conf I added these statements:

Under ## ON CONFIGURATION SETTINGS ##

# set OSBF_LUA_DIR to where spamfilter.lua, spamfilter_command.lua etc. were
# installed
OSBF_LUA_DIR=/usr/local/osbf-lua


Under ## TRANSPORTS CONFIGURATION ##


Add transport_filter to the local_delivery transport:

local_delivery:
   driver = appendfile
   check_string = 
   create_directory
   delivery_date_add
   directory = ${home}/Maildir/
   directory_mode = 700
   envelope_to_add
   return_path_add
   group = mail
   maildir_format
   maildir_tag = ,S=$message_size
   message_prefix = 
   message_suffix = 
   mode = 0600
   quota = ${lookup{$local_part}lsearch*{/etc/mail/quota_usr}{$value}{4M}}
   quota_size_regex = S=(\d+)$
   quota_warn_threshold = 75%
   transport_filter = OSBF_LUA_DIR/spamfilter.lua --udir $home/osbf-lua


that's it!! :)


Verify your setup by sending a message to yourself with the following in the
subject line: help your password

You will receive a message with help about spamfilter.

To verify that the databases were created correctly: stats your password

From now on, all messages that you receive will be classified and tagged
according to the score they get:

Tag   Meaning

[--]  almost sure it's spam - score <= -20

[-]   probably it's spam (reinforcement zone) - score < 0 and > -20

[+]   probably it's not spam (reinforcement zone) - score >= 0 and < 20

[++]  almost sure it's not spam - score >= 20. This tag is here just for
      symmetry, it's not used. An empty tag is used in place of it so as not
      to pollute the messages.
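
The tag table above amounts to a threshold mapping on the score. A minimal
Python sketch of that mapping, assuming cut-offs at -20, 0 and 20; the function
name osbf_tag is hypothetical, and which side each boundary value falls on is
an assumption, because the comparison operators are partly lost in the archived
copy. This is not OSBF-Lua's own code.

```python
def osbf_tag(score: float, limit: float = 20.0) -> str:
    """Map a classifier score to the subject tag described in the table.

    Hypothetical helper for illustration only. The boundary handling
    (<=, <, >=) is an assumption, not taken from OSBF-Lua's source.
    """
    if score <= -limit:
        return "[--]"   # almost sure it's spam
    if score < 0:
        return "[-]"    # probably spam (reinforcement zone)
    if score < limit:
        return "[+]"    # probably not spam (reinforcement zone)
    return ""           # "[++]" zone: an empty tag is used instead


if __name__ == "__main__":
    for s in (-35.2, -4.0, 7.5, 42.0):
        print(f"{s:>6}: {osbf_tag(s)!r}")
```

Messages landing in the two reinforcement zones are the ones most worth
training on, since the filter is least certain about them.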


If the classification is wrong you must train the filter by sending the
message back to yourself, replacing the subject with the corresponding
training command:

learn password spam or learn password nonspam


After training on a few messages, osbf-lua will become more accurate at spam
detection.
If you have a database of pre-classified messages (nonspam / spam) in an IMAP
folder, you can use the script toer.lua to do the training.


Regards,

Marlon


 













Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Johann Spies
I have up to last week tried to avoid using dnslists and I could to some
extent manage using just spamassassin on our three mail servers.  On one
I have the fuzzyocr-plugin to see how it handles the image-spam.  I did
not install it on the other two because I still have issues like this in
exim's paniclog:

spam acl condition: cannot parse spamd output

This happens about 3 to 4 times per hour.

Last week on Friday I started to use dnslists: 

deny message  = rejected because $sender_host_address \
                is in a black list at $dnslist_domain\n\
                $dnslist_text
     dnslists = sbl-xbl.spamhaus.org : relays.ordb.org : dnsbl.njabl.org

in acl_check_rcpt just after accept hosts = :

This made a dramatic difference.  Messages marked as spam by SA dropped
from about 14 per day to about 46000.  The message count in the
queues are lower than before with less frozen messages in it.

I have also lowered the effect of the bayesian filter.  Because of spam
poisoning those filters I had a surge of false positives in the past 10
days.

I have not implemented greylisting so far.  Maybe it is time to do so. I
am not quite convinced that it is an unmixed blessing.  Can somebody
convince me?

Regards
Johann
-- 
Johann Spies  Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

 Do all things without murmurings and disputings; 
  That ye may be blameless and harmless, the sons of 
  God, without rebuke, in the midst of a crooked and 
  perverse nation, among whom ye shine as lights in the 
  world.  Philippians 2:14,15 



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Gareth Hastings


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On
 Behalf Of Odhiambo Washington
 Sent: 23 October 2006 17:25
 To: exim-users@exim.org
 Subject: [exim] SPAM Filtering - Losing the war!
 
 1. You are using Exim and its techniques only

Hi Wash,

Don't you use dspam anymore?



Gareth



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Peter Velan
On 2006-10-24 08:56, Johann Spies wrote:
 Last week on Friday I started to use dnslists: 
 
 deny message  = rejected because $sender_host_address \
                 is in a black list at $dnslist_domain\n\
                 $dnslist_text
      dnslists = sbl-xbl.spamhaus.org : relays.ordb.org : dnsbl.njabl.org

From a total of 404899 connections we rejected 291450. After a bunch of
HELO|hostname|IP rejects we check the sender-IP against some dnslists:

 2571 dnsbl.njabl.org
 6414 dynablock.njabl.org
19442 sbl-xbl.spamhaus.org
  720 list.dsbl.org
7 relays.ordb.org
-
29154

The check against relays.ordb.org is no longer worth the effort. The
big era of open relays seems to be a thing of the past. Maybe the low
number for relays.ordb.org results from the fact that this list is the
last one in my dnslists definition.
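
The figures above can be turned into shares, and the ordering caveat can be
modelled in a few lines. A rough Python sketch: the counts are copied from this
message, while the helper names and the example host are hypothetical. It
assumes the lists are consulted in order and that checking stops at the first
hit, which is what the remark about relays.ordb.org being last implies.

```python
# Reject counts per DNS list, copied from the message above.
dnsbl_rejects = {
    "dnsbl.njabl.org": 2571,
    "dynablock.njabl.org": 6414,
    "sbl-xbl.spamhaus.org": 19442,
    "list.dsbl.org": 720,
    "relays.ordb.org": 7,
}
TOTAL_CONNECTIONS = 404899
TOTAL_REJECTS = 291450

dnsbl_total = sum(dnsbl_rejects.values())


def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


def first_match(listed_on: set, order: list):
    """Return the first list in `order` that lists the host, else None.

    Models checking the lists in sequence and stopping at the first hit,
    so a later list is only credited for hosts that every earlier list
    missed.
    """
    for name in order:
        if name in listed_on:
            return name
    return None


if __name__ == "__main__":
    print(f"rejected overall: {share(TOTAL_REJECTS, TOTAL_CONNECTIONS):.1f}%")
    print(f"DNSBL share of rejects: {share(dnsbl_total, TOTAL_REJECTS):.1f}%")
    for name, count in dnsbl_rejects.items():
        print(f"{name:<22} {share(count, dnsbl_total):6.2f}% of DNSBL hits")
    # Hypothetical host listed on two lists: only the earlier one is credited.
    host = {"sbl-xbl.spamhaus.org", "relays.ordb.org"}
    print(first_match(host, list(dnsbl_rejects)))
```

Under that assumption the count of 7 is a lower bound on what relays.ordb.org
would catch on its own. Moving it to the front of the list for a while would
show its real hit rate.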

Greetings,
Peter



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread W B Hacker
Johann Spies wrote:

*snip*

 
 I have not implemented greylisting so far.  Maybe it is time to do so. I
 am not quite convinced that it is an unmixed blessing.  Can somebody
 convince me?
 
 Regards
 Johann

Try it and log the aitch out of the other 'factors' of what it blocks.

My bet is that after any significant period of observation you will find that
anything it blocked you could have blocked w/o delay anyway, and anything it
let in you would have let in w/o delay anyway *even if* you would have blocked
it later. As some will be. Zombie farms are often programmed to run successive
waves of attack.

So - aside from delaying 'virgin' visitors, and generating what *appear to be*
attractive stats, I don't see the point.

YMMV, YOMD

Bill




Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Ian Eiloart


--On 23 October 2006 20:53:25 +0200 Andreas Pettersson [EMAIL PROTECTED] 
wrote:

 Chris Lightfoot wrote:

 On Mon, Oct 23, 2006 at 08:37:16PM +0200, Andreas Pettersson wrote:
[...]

 what's the false-positive rate? how do you measure it?



 Measured using manual labour.


Ok, and what's the measurement? How many false positives do you find? I 
know that I'd have people screaming down the phone if I use your 
techniques. Senderbase are listing several wanadoo servers - that's a large 
UK ISP which many of our staff and students use.

-- 
Ian Eiloart
IT Services, University of Sussex



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Mike Meredith
Sometime around Tue, 24 Oct 2006 08:56:08 +0200, it may be that Johann
Spies wrote:
 I have not implemented greylisting so far.  Maybe it is time to do
 so. I am not quite convinced that it is an unmixed blessing.  Can
 somebody convince me?

I don't much like the idea of greylisting for ordinary mail servers ...
the ones that tick all the boxes in behaving properly. But it does
sound like a great way of dealing with mail 'servers' that are coming
from suspicious sources, or are doing suspicious things, but not bad
enough to block outright.
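
One way to picture this selective approach is a three-way triage: clean hosts
pass, mildly suspicious ones are greylisted, and only heavily suspicious ones
are rejected. A minimal Python sketch; the signal names, the function name
triage and the reject threshold are all hypothetical choices for illustration,
not anything proposed in this thread.

```python
def triage(signals: dict, reject_at: int = 3) -> str:
    """Decide what to do with a connecting host.

    `signals` maps a check name to True when the host FAILED that check.
    Hypothetical policy: no failures -> accept, enough failures -> reject,
    anything in between -> greylist (defer and see if the sender retries).
    """
    suspicion = sum(1 for failed in signals.values() if failed)
    if suspicion >= reject_at:
        return "reject"
    if suspicion > 0:
        return "greylist"
    return "accept"


if __name__ == "__main__":
    well_behaved = {"no_rdns": False, "bad_helo": False, "dynamic_ip": False}
    a_bit_odd = {"no_rdns": False, "bad_helo": True, "dynamic_ip": False}
    hopeless = {"no_rdns": True, "bad_helo": True, "dynamic_ip": True}
    for name, host in (("well_behaved", well_behaved),
                       ("a_bit_odd", a_bit_odd),
                       ("hopeless", hopeless)):
        print(name, "->", triage(host))
```

The point of the middle band is that well-behaved servers never see a delay,
while borderline senders only get through if they retry like a real MTA.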

-- 
Mike Meredith, Senior Informatics Officer
University of Portsmouth: Hostmaster, Postmaster and Security 
   Q: How many software engineers does it take to change a lightbulb?
   A: Just one. But the house falls down.
   -- Andrew Siwko



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Mike Meredith
Sometime around Tue, 24 Oct 2006 03:14:09 +0800, it may be that W B
Hacker wrote:
 - *WE* - the mailadmins of the world - have handed the spammers the
 environment they need to grow and prosper. The time for 'be generous
 as to what you accept..' and fawn over 'false positives' is gone.
 Ancient history.

To some extent, although I've never liked the 'false positive'
phrase. When I'm checking rDNS (unfortunately now only enabled on a
per-user basis here) and I defer mail, if that blocks mail from a
legitimate company with an incompetent mail administrator, then it
ain't no false positive.

However I'm not convinced that the mail admins are to blame for a
permissive environment allowing spammers to run riot. I remember a time
when mail admins would terminate anybody doing anything like spamming
without regard to commercial pressures. We can't have that time back
again.

Incidentally it's odd that as soon as spammers start doing stuff that
annoys network administrators (stealing IP networks, ASNs, etc) their
networks get routed to null0 pretty quickly ... even when their network
provider has a grey-tinged hat.

Sigmonster earns a dog biscuit ...

-- 
Mike Meredith, Senior Informatics Officer
University of Portsmouth: Hostmaster, Postmaster and Security 
  Spammers on the Internet are like hula hoops, pet rocks, or subway
   alligators; only incredibly fertile, incontinent, and able to fly.
And it's still illegal to shoot them, so bring an umbrella. SC, on
SPAM-L.



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Chris Lightfoot
On Tue, Oct 24, 2006 at 03:25:00PM +0100, Mike Meredith wrote:
 When I'm checking rDNS (unfortunately now only enabled on a
 per-user basis here) and I defer mail, if that blocks mail from a
 legitimate company with an incompetent mail administrator, then it
 ain't no false positive.

It must be nice to run a system which only accepts mail
for you.

-- 
``I've done made a deal with the devil. He said he's going to give me an
  air-conditioned place when I go down there, if I go there, so I won't
  put all the fires out.'' (Paul `Red' Adair, on death)



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Mike Meredith
Sometime around Tue, 24 Oct 2006 15:26:56 +0100, it may be that Chris
Lightfoot wrote:
 It must be nice to run a system which only accepts mail
 for you.

I don't. Or rather the rDNS block I was talking about is implemented on
work's email gateways. As far as I'm concerned a false positive on a
rDNS test is a mail server with a rDNS ... thus the test is
problematic.

A legitimate company (or other organisation) that fails an rDNS check
because they have an incompetent mail administrator is *not* a false
positive. It may well be decided that such a test causes too many
problems (which is what happened here) but that does not make it a
false positive ... as far as I'm concerned.

-- 
Mike Meredith, Senior Informatics Officer
University of Portsmouth: Hostmaster, Postmaster and Security 
  If you play the Windows CD backwards you hear a satanic message.
  But it gets worse... If you play it forwards it installs Windows.



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Chris Lightfoot
On Tue, Oct 24, 2006 at 03:41:45PM +0100, Mike Meredith wrote:
 because they have an incompetent mail administrator is *not* a false
 positive. It may well be decided that such a test causes too many

any case where your spam filter blocks a mail that a user
wanted to receive is a false positive. If you ask your
users, ``should I block email you want to receive because
it came from a host which didn't have a reverse-DNS
name?'', what would they say?

-- 
``They accused us of suppressing freedom of expression.
  This was a lie and we could not let them publish it.''
  (Nelba Blandon, Nicaraguan Interior Ministry Director of Censorship)



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Walt Reed
On Tue, Oct 24, 2006 at 03:45:28PM +0100, Chris Lightfoot said:
 On Tue, Oct 24, 2006 at 03:41:45PM +0100, Mike Meredith wrote:
  because they have an incompetent mail administrator is *not* a false
  positive. It may well be decided that such a test causes too many
 
 any case where your spam filter blocks a mail that a user
 wanted to receive is a false positive. If you ask your
 users, ``should I block email you want to receive because
 it came from a host which didn't have a reverse-DNS
 name?'', what would they say?

There are false positives for just about every test you can possibly
do. Fortunately whitelisting is a valid and viable solution to
resolve problems with rDNS (as is fixing the dang DNS!) My users are ecstatic
that the plethora of tests I employ (which also includes rDNS testing)
has reduced spam to near zero (with the exception of image spam where I
don't have a viable solution - yet.) 




Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Mike Meredith
Sometime around Tue, 24 Oct 2006 15:45:28 +0100, it may be that Chris
Lightfoot wrote:
 On Tue, Oct 24, 2006 at 03:41:45PM +0100, Mike Meredith wrote:
  because they have an incompetent mail administrator is *not* a false
  positive. It may well be decided that such a test causes too many
 
 any case where your spam filter blocks a mail that a user
 wanted to receive is a false positive. 

From the user's perspective yes. Not from my perspective. And I've
already pointed out that whilst it may not be a false positive, if it
causes too many problems the test is too problematic and will be
removed. 

What do I do in a situation where a user wants to receive all mail from
a problematic ISP whereas all my other 2 users don't ? Ideally I
setup a block that applies for 2 users and not for 1, if I can't
(possibly due to time restraints), the 1 user is disappointed.

On a separate point, my job (as a mail administrator ... I do other
stuff as well) is to provide an Internet mail service. By definition a
'mail server' with no rDNS isn't sending Internet mail ... it's just
random noise that happens to look like valid mail. 

 If you ask your
 users, ``should I block email you want to receive because
 it came from a host which didn't have a reverse-DNS
 name?'', what would they say?

What's DNS ?


-- 
Mike Meredith, Senior Informatics Officer
University of Portsmouth: Hostmaster, Postmaster and Security 
  If you play the Windows CD backwards you hear a satanic message.
  But it gets worse... If you play it forwards it installs Windows.



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Chris Lightfoot
On Tue, Oct 24, 2006 at 04:33:41PM +0100, Mike Meredith wrote:
 Sometime around Tue, 24 Oct 2006 15:45:28 +0100, it may be that Chris
 Lightfoot wrote:
  On Tue, Oct 24, 2006 at 03:41:45PM +0100, Mike Meredith wrote:
   because they have an incompetent mail administrator is *not* a false
   positive. It may well be decided that such a test causes too many
  
  any case where your spam filter blocks a mail that a user
  wanted to receive is a false positive. 
 
 From the user's perspective yes. Not from my perspective.

remind me, to whom is the mail addressed?

-- 
``Close observation of the Senator suggested that there might not be any
  medical obstacles to launching the entire legislative branch into space,
  possibly the most encouraging scientific result of the mission.''
  (Maciej Ceglowski on the achievements of Senator John Glenn)



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Mike Meredith
Sometime around Tue, 24 Oct 2006 16:39:40 +0100, it may be that Chris
Lightfoot wrote:
 
 remind me, to whom is the mail addressed?
 

The user's *work* email address. So what ?

-- 
Mike Meredith, Senior Informatics Officer
University of Portsmouth: Hostmaster, Postmaster and Security 
  'A foolish consistency is the hobgoblin of little minds'



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread David Saez Padros
Hi !!

 because they have an incompetent mail administrator is *not* a false
 positive. It may well be decided that such a test causes too many
 
 any case where your spam filter blocks a mail that a user
 wanted to receive is a false positive. If you ask your
 users, ``should I block email you want to receive because
 it came from a host which didn't have a reverse-DNS
 name?'', what would they say?

You can say that it will block legitimate mail, but from the
point of view of the test a false positive/negative is a
message that is incorrectly classified. If you aim to
detect spam based on rDNS then it's a false positive, but
if you just aim to reject non-RFC-compliant hosts then
there are no false positives.
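
The distinction can be made concrete by scoring the same rDNS test against two
different goals. A small Python sketch with made-up example messages; every
name and record here is hypothetical and serves only to illustrate the
definition.

```python
def false_positives(messages: list, goal: str) -> list:
    """Return ids of messages the rDNS test flagged that do not match `goal`.

    The test flags every message whose sending host has no reverse DNS.
    A flagged message counts as a false positive only relative to what
    the test is supposed to detect (`goal`).
    """
    return [m["id"] for m in messages if not m["has_rdns"] and not m[goal]]


# Made-up sample traffic for illustration.
messages = [
    {"id": "spam-from-zombie", "has_rdns": False, "is_spam": True},
    {"id": "invoice-from-misconfigured-host", "has_rdns": False, "is_spam": False},
    {"id": "newsletter", "has_rdns": True, "is_spam": False},
]
for m in messages:
    m["lacks_rdns"] = not m["has_rdns"]

if __name__ == "__main__":
    # Goal "detect spam": the legitimate invoice is a false positive.
    print(false_positives(messages, "is_spam"))
    # Goal "reject hosts without rDNS": nothing is misclassified.
    print(false_positives(messages, "lacks_rdns"))
```

Either way the invoice is not delivered. The two definitions differ only in
whether that loss is counted as an error of the test or as a cost of the
policy.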

-- 
Best regards ...


David Saez Padros                http://www.ols.es
On-Line Services 2000 S.L.       e-mail  [EMAIL PROTECTED]
Pintor Vayreda 1                 telf    +34 902 50 29 75
08184 Palau-Solita i Plegamans   movil   +34 670 35 27 53






Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Chris Lightfoot
On Tue, Oct 24, 2006 at 04:55:06PM +0100, Mike Meredith wrote:
 Sometime around Tue, 24 Oct 2006 16:39:40 +0100, it may be that Chris
 Lightfoot wrote:
  
  remind me, to whom is the mail addressed?
 
 The user's *work* email address. So what ?

In your organisation, do email administrators have
management responsibility for determining what is, and
what is not, acceptable email for their colleagues /
superiors / whoever to be sending and receiving? If so, do
you think this is typical of other organisations which
employ people to manage their email systems?


More generally I think there's a very serious and very
common error among people who design ad-hoc antispam
systems of confusing feature extraction and machine
learning, of which the discussion here has shown some
useful examples.

-- 
``There has to be a balance between telling the truth and reassurance.''
  (David Blunkett, on terrorist alerts)



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread John W. Baxter
On 10/23/06 11:56 PM, Johann Spies [EMAIL PROTECTED] wrote:

 I have not implemented greylisting so far.  Maybe it is time to do so. I
 am not quite convinced that it is an unmixed blessing.  Can somebody
 convince me?

It is helpful, but you MUST whitelist sensibly.  There are at least three
classes of things in "sensibly":
1.  neighbor ISPs and others that routinely send mail to your users but are
not large and well known.

2.  large and well known senders which will pass greylisting anyhow even
though spam comes from them (eg hotmail)

3.  broken sending servers (there is a nice list available somewhere on the
www.greylisting.org site that will serve as a starting point).

We have just under 700 entries in our whitelist database table.

All that said, the engine sending the image spam (or at least one) passes
greylisting, so greylisting won't help with it.  (This is likely the
beginning of the end of greylisting as a helpful technique.)
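
For readers new to the technique, the core of greylisting with a whitelist can
be sketched in a few lines of Python. This is an illustrative toy, not the
database-backed setup described above: the class name, the in-memory
dictionary, the 300-second delay and the example addresses are all assumptions.

```python
import time


class Greylist:
    """Toy greylister: defer unknown (ip, sender, recipient) triplets.

    A triplet is accepted once it is retried after `delay` seconds.
    Hosts in `whitelist` skip the check entirely. State is kept in
    memory only, purely for illustration.
    """

    def __init__(self, delay: float = 300.0, whitelist=()):
        self.delay = delay
        self.whitelist = set(whitelist)
        self.first_seen = {}

    def check(self, ip: str, sender: str, recipient: str, now=None) -> str:
        now = time.time() if now is None else now
        if ip in self.whitelist:
            return "accept"
        key = (ip, sender.lower(), recipient.lower())
        # Remember when this triplet was first seen; keep the old time on retry.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay:
            return "accept"
        return "defer"


if __name__ == "__main__":
    gl = Greylist(delay=300, whitelist={"192.0.2.10"})
    print(gl.check("192.0.2.10", "a@example.org", "b@example.net", now=0))
    print(gl.check("198.51.100.7", "a@example.org", "b@example.net", now=0))
    print(gl.check("198.51.100.7", "a@example.org", "b@example.net", now=400))
```

A sender that queues and retries, as the image-spam engine mentioned above
does, passes on the second attempt, which is why greylisting alone does not
stop it.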

  --John





Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread W B Hacker
Chris Lightfoot wrote:

 On Tue, Oct 24, 2006 at 04:33:41PM +0100, Mike Meredith wrote:
 
Sometime around Tue, 24 Oct 2006 15:45:28 +0100, it may be that Chris
Lightfoot wrote:

On Tue, Oct 24, 2006 at 03:41:45PM +0100, Mike Meredith wrote:

because they have an incompetent mail administrator is *not* a false
positive. It may well be decided that such a test causes too many

any case where your spam filter blocks a mail that a user
wanted to receive is a false positive. 

From the user's perspective yes. Not from my perspective.
 
 
 remind me, to whom is the mail addressed?
 

That is a cop-out.

No 'national' snail-mail postal service, nor private courier will, or would 
allow themselves to be forced to - carry hazardous, offensive - or merely 
'non-compliant' packages, properly 'addressed' or not.

They are *required* to reject such, and the recipients generally expect them to 
do so.

We need to do more educating about the value-add of insisting on compliance,
AND furthering the understanding that receiving messages is not a right - it is
a privilege - one 'paid for' by the cooperation of many entities *other than*
the recipient.

In other words, our 'price of a postage stamp' is primarily adherence to 
reasonable rules and good manners.

Bill





Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Andreas Pettersson
W B Hacker wrote:

snip

If every 'honest' MTA on Planet Earth started dropping rDNS fail, dynamic-IP
sources, and HELO vagaries on a specific date, the problem would abate
dramatically same-day

Nothing overly clever about such rules.

One needs two things:

- Top-management or client-pool buy-in to strict rules.

- Willingness to apply our own creativity to carefully crafting 'white' listing
for the 1% or fewer of desired correspondents on 'challenged' hosts, instead of
'black' listing much of the world.
  


I see the same goal far off on the horizon. It's probably the only way
out of the mess. Or perhaps one of two (*).

Even though I'm pretty pleased with how the filtering works today, it
needs continuous tuning and tweaking to keep it that way. And I have a
lot of other things to do.


(*) The other way out would be if all ISPs that don't block smtp/25
from their subscriber networks started doing so. Most unlikely..

-- 
Andreas





Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Chris Lightfoot
The point here is *false* positives, and whether the
decision about whether something should be treated as spam
should be up to the addressee, or up to some MTA
administrator exercising a technical prejudice.

Argument by analogy is usually futile, but in your terms
this would be like the postman discarding, say, letters
where the postmark wasn't completely legible or where he
didn't like the way the letterhead was formatted. Actual
postal authorities (in reasonable countries at least)
prohibit that sort of interference.

-- 
``Don't use nuclear weapons to troubleshoot faults.''
  (actual US Air Force safety instruction)



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread W B Hacker
Chris Lightfoot wrote:

> The point here is *false* positives, and whether the
> decision about whether something should be treated as spam
> should be up to the addressee, or up to some MTA
> administrator exercising a technical prejudice.

We are not discussing 'minor infractions' here.

Nothing to do with 'prejudice', technical or otherwise.

And no - for the sort of violations involved the addressee has *zero* say [1].

No more right than someone would have to say 'it's OK to permit [saltwater | 
vinegar | sewage] in the public water mains'.

> Argument by analogy is usually futile, but in your terms
> this would be like the postman discarding, say, letters
> where the postmark wasn't completely legible or where he
> didn't like the way the letterhead was formatted. Actual
> postal authorities (in reasonable countries at least)
> prohibit that sort of interference.

Nonsense!  Refusing to accept...

- a parcel containing live animals. Or unrefrigerated dead ones

- hazardous chemicals, flammables, etc.

- NO postage

- NO, or known-forged, sender information.

- cross-border shipments without proper customs information

Try any of these with your local FedEx or Post Office and see how far you get.

Bill


[1] We *have* had to remind those who pay the fees that our own ToS can get us 
disconnected from the upstream b/w if we knowingly convey malicious content.

The upstreams, in turn, are beholden (eventually) to national governments in a 
similar manner. TANSTAAFL.





Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Wakko Warner
Marc Perkel wrote:
> Odhiambo Washington wrote:
>> Junk mail is war. RFCs do not apply.
>> --Wietse Venema
>>
>> Hello!
>>
>> With the recent upsurge of spam, I am strongly compelled to ask:
>>
>> Is there anyone on this list who can afford to brag about the
>> effectiveness of their spam filtering techniques? (With the
>> exception of Marc Perkel ;))
>
> To tell you the truth I'm losing ground lately against spammers. Two
> reasons. The Image spam is getting through and because it poisons the
> bayes I've lost much of the effectiveness of bayes filtering. I'm still
> holding on but I've had people who I hosted for for over a year who
> never had a single spam who are now getting a few. I am also having a
> few more false positives than I used to.

Have you tried gocr?  I have. It's not the greatest, but it may be able to
read the image and let you react to the content of the image.
It's not that great yet, but I think it could have potential (until spammers
start using the other characters in the images like they do in text spam).

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals
 Got Gas???



Re: [exim] SPAM Filtering - Losing the war!

2006-10-24 Thread Marc Perkel


Wakko Warner wrote:
> Marc Perkel wrote:
>> Odhiambo Washington wrote:
>>> Junk mail is war. RFCs do not apply.
>>> --Wietse Venema
>>>
>>> Hello!
>>>
>>> With the recent upsurge of spam, I am strongly compelled to ask:
>>>
>>> Is there anyone on this list who can afford to brag about the
>>> effectiveness of their spam filtering techniques? (With the
>>> exception of Marc Perkel ;))
>>
>> To tell you the truth I'm losing ground lately against spammers. Two
>> reasons. The Image spam is getting through and because it poisons the
>> bayes I've lost much of the effectiveness of bayes filtering. I'm still
>> holding on but I've had people who I hosted for for over a year who
>> never had a single spam who are now getting a few. I am also having a
>> few more false positives than I used to.
>
> Have you tried gocr?  I have, it's not the greatest, but it is possible that
> it can read the image and allow you to react on the content of the image.
> It's not that great yet, but I think it could have potential (until spammers
> start using the other characters in the images like they are in text spam)

   

I have started to install it several times and got distracted. I think I 
will try it.


Re: [exim] SPAM Filtering - Losing the war!

2006-10-23 Thread Chris Lightfoot
On Mon, Oct 23, 2006 at 08:37:16PM +0200, Andreas Pettersson wrote:
[...]

what's the false-positive rate? how do you measure it?

-- 
``Eden may have invaded Egypt, but one can't be too censorious, everybody gets
  a bit silly when they're stoned.'' (Jeremy Scott, from `Fast and Louche')



Re: [exim] SPAM Filtering - Losing the war!

2006-10-23 Thread Marc Perkel


Odhiambo Washington wrote:
> Junk mail is war. RFCs do not apply.
> --Wietse Venema
>
> Hello!
>
> With the recent upsurge of spam, I am strongly compelled to ask:
>
> Is there anyone on this list who can afford to brag about the
> effectiveness of their spam filtering techniques? (With the
> exception of Marc Perkel ;))
   

To tell you the truth I'm losing ground lately against spammers. Two 
reasons. The Image spam is getting through and because it poisons the 
bayes I've lost much of the effectiveness of bayes filtering. I'm still 
holding on but I've had people who I hosted for for over a year who 
never had a single spam who are now getting a few. I am also having a 
few more false positives than I used to.

> That would mean:
>
> 1. You are using Exim and its techniques only
> 2. You show stats that your accuracy is very good...
> 3. You present stats that show negligible false positives...
>
> ...and you are NOT using any of those expensive appliances!
>
> It's becoming evident, day by day, that the war against spam is
> almost a full time job ;)
>
> Perhaps it's time we got [EMAIL PROTECTED] to share ideas on
> fighting spam and the rapidly changing techniques being used by
> spammers. So much time and resources are consumed by spammers that we
> need to declare spam as the third world war.


   

I'm thinking about suing Microsoft to get them to release their security 
patches to the public.




Re: [exim] SPAM Filtering - Losing the war!

2006-10-23 Thread Andreas Pettersson
Chris Lightfoot wrote:

> On Mon, Oct 23, 2006 at 08:37:16PM +0200, Andreas Pettersson wrote:
>> [...]
>
> what's the false-positive rate? how do you measure it?

Measured using manual labour.

-- 
Andreas





Re: [exim] SPAM Filtering - Losing the war!

2006-10-23 Thread christoph . kliemt
Moin!

Odhiambo Washington [EMAIL PROTECTED] writes:

> Junk mail is war. RFCs do not apply.
> --Wietse Venema

Wietse... I do not like him, but in this particular case he is right.

> Hello!
>
> With the recent upsurge of spam, I am strongly compelled to ask:
>
> Is there anyone on this list who can afford to brag about the
> effectiveness of their spam filtering techniques?

There is no way to win with filtering; computers are (afaik, so far) not
able to understand the content of emails. The one and only thing that
cannot be forged is the IP address the email comes from. Any questions?

> (With the exception of Marc Perkel ;))

the kindergarten guy?

[...]

> It's becoming evident, day by day, that the war against spam is almost
> a full time job ;)

Hmm... ask a simple question: where does it come from? Life is much
easier then.

[...]

> So much time and resources are consumed by spammers that we need to
> declare spam as the third world war.

wash... think about that statement, ok?

regards,

Christoph



Re: [exim] SPAM Filtering - Losing the war!

2006-10-23 Thread Andreas Pettersson
Marc Perkel wrote:

> Andreas Pettersson wrote:
>> Chris Lightfoot wrote:
>>> On Mon, Oct 23, 2006 at 08:37:16PM +0200, Andreas Pettersson wrote:
>>>> [...]
>>>
>>> what's the false-positive rate? how do you measure it?
>>
>> Measured using manual labour.
>
> I call it scream testing. Try something and see if anyone screams.

FPs flagged by SA won't get anybody screaming, since the quarantine is 
checked regularly.
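
For what it's worth, once the quarantine and a sample of delivered mail have
been reviewed by hand, turning the counts into a rate is simple. A minimal
sketch in Python, assuming each reviewed message is recorded as a pair
(flagged_as_spam, really_spam); that record format and the function name are
made up for the example, not anything SA provides:

```python
def false_positive_rate(reviewed):
    """Share of legitimate messages that the filter flagged as spam.

    reviewed: iterable of (flagged_as_spam, really_spam) boolean pairs,
    one pair per manually checked message.
    """
    reviewed = list(reviewed)
    # Legitimate messages are those a human judged not to be spam.
    legit = [flagged for flagged, really_spam in reviewed if not really_spam]
    if not legit:
        return 0.0
    # A false positive is a legitimate message that was flagged anyway.
    return sum(1 for flagged in legit if flagged) / len(legit)
```

Note that this divides by the number of legitimate messages, not by all mail
seen, so a flood of correctly caught spam does not make the figure look better
than it is.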

-- 
Andreas





Re: [exim] SPAM Filtering - Losing the war!

2006-10-23 Thread Marlon Cabrera Oliveira
Hi,

> To tell you the truth I'm losing ground lately against spammers. Two
> reasons. The Image spam is getting through and because it poisons the
> bayes I've lost much of the effectiveness of bayes filtering. I'm still
> holding on but I've had people who I hosted for for over a year who
> never had a single spam who are now getting a few. I am also having a
> few more false positives than I used to.


I'm having success here detecting image spam using the OSBF-Lua filter.

From the OSBF-Lua website:

OSBF-Lua (Orthogonal Sparse Bigrams with confidence Factor) is a Lua C module 
for text classification. It is a port of the OSBF classifier implemented in 
the CRM114 project. This implementation attempts to put focus on the 
classification task itself by using Lua as the scripting language, a powerful 
yet light-weight and fast language, which makes it easier to build and test 
more elaborated filters and training methods.

The OSBF algorithm is a typical Bayesian classifier but enhanced with two 
techniques that I originally developed for the CRM114 project: Orthogonal 
Sparse Bigrams - OSB, for feature extraction, and the Exponential 
Differential Document Count - EDDC (a.k.a Confidence Factor) for automatic 
feature selection. Combined, these two techniques produce a highly accurate 
classifier. OSBF was developed focused on two classes, SPAM and NON-SPAM, so 
the performance for more than two classes may not be the same.
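
To give an idea of what the feature extraction looks like, here is a rough
sketch of orthogonal sparse bigrams in Python. It is not OSBF-Lua's code: the
function name, the tuple layout and the window size of 5 are assumptions for
illustration. Each token is paired with each of the following tokens inside
the window, and the number of skipped tokens is kept as part of the feature.

```python
def osb_features(tokens, window=5):
    """Return (first_token, skipped_count, second_token) tuples.

    Each token is paired with every later token that lies within
    `window` positions, remembering how many tokens were skipped.
    """
    features = []
    for i, first in enumerate(tokens):
        for dist in range(1, window):
            j = i + dist
            if j >= len(tokens):
                break
            features.append((first, dist - 1, tokens[j]))
    return features


if __name__ == "__main__":
    for feature in osb_features("buy cheap pills now".split(), window=3):
        print(feature)
```

Because the skipped distance is part of each feature, "buy pills" with one
word in between is counted separately from "buy pills" as adjacent words.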



OSBF-Lua learns very fast. It only requires Lua 5.1, with dynamic loading 
enabled, installed on the Exim server. 
See the install doc: http://osbf-lua.luaforge.net/#installation


In exim.conf I added these statements.

In the main configuration settings section:

# set OSBF_LUA_DIR to where spamfilter.lua, spamfilter_command.lua etc. were 
# installed
OSBF_LUA_DIR=/usr/local/osbf-lua


In the transports configuration section, add transport_filter to the 
local_delivery transport:

local_delivery:
   driver = appendfile
   check_string = 
   create_directory
   delivery_date_add
   directory = ${home}/Maildir/
   directory_mode = 700
   envelope_to_add
   return_path_add
   group = mail
   maildir_format
   maildir_tag = ,S=$message_size
   message_prefix = 
   message_suffix = 
   mode = 0600
   quota = ${lookup{$local_part}lsearch*{/etc/mail/quota_usr}{$value}{4M}}
   quota_size_regex = S=(\d+)$
   quota_warn_threshold = 75%
   transport_filter = OSBF_LUA_DIR/spamfilter.lua --udir $home/osbf-lua


that's it!! :)


Verify your setup by sending a message to yourself with the following in the 
subject line: help <your password>

You will receive a message with help on the spam filter.

To verify that the databases were created correctly, use: stats <your password>

From now on, all messages that you receive will be classified and tagged 
according to the score they get:

Tag   Meaning

[--]  almost sure it's spam - score <= -20

[-]   probably spam (reinforcement zone) - score < 0 and > -20

[+]   probably not spam (reinforcement zone) - score >= 0 and < 20

[++]  almost sure it's not spam - score >= 20. This tag is here just for
      symmetry, it's not used. An empty tag is used in place of it so as not to
      pollute the messages.
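
The tagging rule can be written down as a small function. This is a sketch in
Python, not the actual spamfilter.lua code; it assumes the boundaries are
score <= -20, -20 < score < 0, 0 <= score < 20 and score >= 20, and the
function and parameter names are made up for the example.

```python
def tag_for_score(score, threshold=20):
    """Map a classifier score to the subject tag described above."""
    if score <= -threshold:
        return "[--]"  # almost sure it's spam
    if score < 0:
        return "[-]"   # probably spam, reinforcement zone
    if score < threshold:
        return "[+]"   # probably not spam, reinforcement zone
    return ""          # '[++]' case: an empty tag is used instead


def in_reinforcement_zone(score, threshold=20):
    """True when the score is too close to zero to be trusted."""
    return -threshold < score < threshold
```

Messages in the reinforcement zone are the ones worth training on even when
the classification happened to be right.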


If the classification is wrong, you must train the filter by sending the 
message back to yourself, replacing the subject with the corresponding 
training command:

learn <password> spam or learn <password> nonspam 


After training on a few messages, osbf-lua's spam detection accuracy will 
improve.
If you have a database of pre-classified messages (nonspam / spam) in an IMAP 
folder, you can use the script toer.lua to do the training.


Regards,

Marlon