A bounce killer feature:

Use a pen pals lookup to check an inbound DSN for a corresponding
outbound message; if a Message-ID contained within the inbound DSN doesn't
match a valid Message-ID from the apparent sender, the message receives
$bounce_killer_score spam score points (100 by default) and can be blocked
as spam.

A received delivery status notification is parsed, looking for an attached
header section of the original message in an attempt to find a Message-ID.
A standard DSN structure (RFC 3462, RFC 3464) is recognized, as well as
a few nonstandard but common formats. Other automatic reports and bounces
with unknown structure and no attached header section are ignored for this
purpose (they remain subject to other checks). Unfortunately there are still
many nonstandard mailers around (12+ years after DSN format standardization)
and ad-hoc filtering solutions which do not supply the essential information.
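
As a rough illustration of the parsing step, here is a minimal stand-alone
sketch. It is not the code from the patch: the function name
extract_message_id is hypothetical, and it assumes the attached header
section is already available as a single string.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of amavisd: scan an attached header section
# (passed as one string) and return the first Message-ID found, or undef.
sub extract_message_id {
  my ($text) = @_;
  my @fields;                       # unfolded header fields, in order
  for my $line (split /\r?\n/, $text) {
    last if $line eq '';            # an empty line ends the header section
    if ($line =~ /^[ \t]/ && @fields) {
      $fields[-1] .= $line;         # continuation line of a folded field
    } else {
      push @fields, $line;          # start of a new header field
    }
  }
  for my $field (@fields) {
    # field name is matched case-insensitively; capture the <id@domain> part
    return $1 if $field =~ /^Message-ID[ \t]*:.*?(<[^<>\s]+\@[^<>\s]+>)/i;
  }
  return undef;
}
```

The sketch stops at the first empty line, so a Message-ID mentioned only in
the body text of the attachment is not picked up.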

If the Message-ID can be found in an SQL log database, matching a previous
message sent by a local user (who is now a recipient of the DSN),
using a normal pen pals lookup (no extra SQL operations are necessary),
or if the domain part of the Message-ID is one of the local domains, then
the DSN message is considered a genuine bounce, is unaffected by this check
and passes normally (subject to other checks).

On the other hand, if the attached DSN header does supply a Message-ID
but it does not meet the above criteria, then it is assumed that the
message is backscatter to a faked address belonging to our local domains,
and $bounce_killer_score spam score points are added, so the message can be
treated as spam (subject to spam kill level and other spam settings).
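
The decision described above can be modelled roughly as follows. This is
only a sketch under stated assumptions: judge_bounce and its arguments are
hypothetical names, the pen pals result is passed in as a flag, and a plain
hash of lower-case domain names stands in for the local domains lookup. It
covers only DSNs from which a Message-ID was extracted.

```perl
use strict;
use warnings;

# Hypothetical stand-alone model of the decision; names are illustrative.
#   $msgid        - Message-ID taken from the DSN's attached header section
#   $found_in_log - true if a pen pals lookup matched a logged outbound message
#   $local        - hash ref of local domains, keys in lower case (assumption)
#   $score        - the value of $bounce_killer_score
# Returns a list: (verdict, spam score points to add).
sub judge_bounce {
  my ($msgid, $found_in_log, $local, $score) = @_;
  return ('rescued by penpals', 0) if $found_in_log;
  if (defined $msgid && $msgid =~ /\@([^\@>]+)>?\z/) {
    # domain part of the Message-ID belongs to one of our domains
    return ('rescued by domain', 0) if $local->{lc $1};
  }
  return ('killed', $score);
}
```

For example, judge_bounce('<a@elsewhere.test>', 0, { 'example.com' => 1 }, 100)
yields the verdict 'killed' together with 100 score points.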

The only user-configurable setting is $bounce_killer_score (also a member
of policy banks); its default value is 100. To turn off the bounce killer
feature, set $bounce_killer_score to undef or to 0.
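
A configuration sketch for amavisd.conf, assuming the policy bank key
carries the same name as the variable (bounce_killer_score); the bank name
MYNETS is used purely as an example:

```perl
# amavisd.conf fragment (sketch, pick one of the two alternatives)

# alternative 1: disable the bounce killer globally
$bounce_killer_score = 0;

# alternative 2: keep the global default of 100, but disable the check for
# mail handled under one policy bank (the bank name is only an example;
# note that this assignment replaces any earlier definition of that bank)
$policy_bank{'MYNETS'} = {
  bounce_killer_score => 0,
};
```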

A couple of SNMP-like counters are added to help assess the
effectiveness of the feature (e.g. as shown by the amavisd-agent utility):
  BounceNotFromUsKilled     1341   1197/h
  BounceRescuedByDomain        1      1/h
  BounceRescuedByPenPals      17     15/h
  BounceUnparsable            37     33/h

More information on operations can be obtained from the log; search for:
  inspect_dsn:
  bounce killed
  bounce rescued by penpals
  bounce rescued by domain
  bounce unparsable
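
To get a quick tally from a log file, something along these lines could be
used. It is a sketch only: tally_bounce_log is a hypothetical helper, and it
simply counts lines containing the phrases listed above.

```perl
use strict;
use warnings;

# Hypothetical helper: count log lines per bounce killer outcome.
# Takes a list of log lines, returns a hash ref: phrase => number of lines.
sub tally_bounce_log {
  my (@lines) = @_;
  my @phrases = ('bounce killed', 'bounce rescued by penpals',
                 'bounce rescued by domain', 'bounce unparsable');
  my %count = map { $_ => 0 } @phrases;
  for my $line (@lines) {
    for my $p (@phrases) {
      # plain substring search; count each line at most once
      if (index($line, $p) >= 0) { $count{$p}++; last }
    }
  }
  return \%count;
}
```

Feeding it the lines of a log file (read with open and the <$fh> operator)
gives per-outcome counts comparable to the counters shown earlier.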

The feature was suggested by Scott F. Crosby.


A patch against amavisd-new-2.6.0-rc1 follows:


--- amavisd.orig        2008-03-19 15:50:44.000000000 +0100
+++ amavisd     2008-04-07 21:12:02.000000000 +0200
@@ -289,5 +289,6 @@
       @av_scanners @av_scanners_backup $first_infected_stops_scan
       $sa_spam_report_header $sa_spam_level_char $sa_mail_body_size_limit
-      $penpals_bonus_score $penpals_halflife $reputation_factor
+      $penpals_bonus_score $penpals_halflife $bounce_killer_score
+      $reputation_factor
       $undecipherable_subject_tag $localpart_is_case_sensitive
       $recipient_delimiter $replace_existing_extension
@@ -696,4 +697,6 @@
        # for testing and statistics gathering);
 
+  $bounce_killer_score = 100;
+
   #
   # Receiving mail related
@@ -6467,4 +6470,6 @@
 sub name_declared     # string or a ref to a list of strings
   { my($self)=shift; !@_ ? $self->{nm_decl}  : ($self->{nm_decl}=shift) };
+sub report_type       # a string, e.g. 'delivery-status', rfc3462
+  { my($self)=shift; !@_ ? $self->{rep_typ}  : ($self->{rep_typ}=shift) };
 sub size
   { my($self)=shift; !@_ ? $self->{size}     : ($self->{size}=shift) };
@@ -6930,4 +6935,6 @@
     }
     $part->name_declared(@rn==1 ? $rn[0] : \@rn)  if @rn;
+    $val = $head->mime_attr('content-type.report-type');
+    $part->report_type($val)  if $val ne '';
   }
   mime_decode_pre_epi('epilogue', $entity->epilogue,
@@ -9707,4 +9714,5 @@
        ($will_do_virus_scanning || $will_do_banned_checking);
 
+    my($bounce_header_fields_ref,$bounce_msgid);
     if ($will_do_parts_decoding) {  # decoding parts can take a lot of time
       $which_section = "mime_decode-1";
@@ -9715,4 +9723,15 @@
       prolong_timer($which_section);
 
+      if (c('bounce_killer_score') > 0) {
+        $which_section = "dsn_parse";
+        # analyze a bounce after MIME decoding but before further archive
+        # decoding, which often replaces decoded files
+        $bounce_header_fields_ref = inspect_a_bounce_message($msginfo);
+        $bounce_msgid = $bounce_header_fields_ref->{'message-id'}
+          if $bounce_header_fields_ref &&
+             exists $bounce_header_fields_ref->{'message-id'};
+        prolong_timer($which_section);
+      }
+
       $which_section = "parts_decode_ext";
       snmp_count('OpsDec');
@@ -9945,4 +9964,5 @@
       if $enable_dkim_verification;
     $which_section = "penpals_check";
+    my($pp_age);
     my($spam_level) = $msginfo->spam_level;
     if (defined($sql_storage) && [EMAIL PROTECTED]) {
@@ -9954,8 +9974,8 @@
            (defined $penpals_threshold_low || defined $penpals_threshold_high);
       if ($pp_bonus <= 0 || $pp_halflife <= 0) {}
-      elsif (defined($penpals_threshold_low) &&
+      elsif (defined($penpals_threshold_low)  && !defined($bounce_msgid) &&
              $spam_level + max(@boost_list) < $penpals_threshold_low) {}
         # low score for all recipients, no need for aid
-      elsif (defined($penpals_threshold_high) &&
+      elsif (defined($penpals_threshold_high) && !defined($bounce_msgid) &&
              $spam_level + min(@boost_list) - $pp_bonus
                                             > $penpals_threshold_high) {}
@@ -9976,9 +9996,11 @@
                             $msginfo->get_header_field_body('references');
             my(@refs) = $refs_str eq '' ? () : parse_message_id($refs_str);
+            push(@refs,$bounce_msgid)  if defined $bounce_msgid;
             do_log(4,"penpals: references: %s", join(", ",@refs))  if @refs;
             # NOTE: swap $rid and $sid as args here, as we are now checking
             # for a potential reply mail - whether the current recipient has
             # recently sent any mail to the sender of the current mail:
-            my($pp_age,$pp_mail_id,$pp_subj) =
+            my($pp_mail_id,$pp_subj);
+            ($pp_age,$pp_mail_id,$pp_subj) =
               $sql_storage->penpals_find($rid,$sid,\@refs,$msginfo->rx_time);
             if (defined($pp_age)) {  # found info about previous correspondence
@@ -10010,4 +10032,39 @@
     }
 
+    $which_section = "bounce_killer";
+    if ($bounce_header_fields_ref) {  # message looks like a DSN
+      my($bounce_rescued);
+      if (defined $pp_age) {  # found by pen pals by a Message-ID in attachment
+        # is a bounce, refers to our previous outgoing message, treat it kindly
+        snmp_count('BounceRescuedByPenPals'); $bounce_rescued = 'by penpals';
+      } elsif (defined($bounce_msgid) && $bounce_msgid =~ /(\@[^\@>]+)>?\z/ &&
+               lookup(0,$1, @{ca('local_domains_maps')})) {
+        # not in pen pals, but domain in Message-ID is a local domain
+        snmp_count('BounceRescuedByDomain'); $bounce_rescued = 'by domain';
+      }
+      do_log(2, "bounce %s, %s -> %s, %s",
+                defined $bounce_rescued ?'rescued '.$bounce_rescued :'killed',
+                qquote_rfc2821_local($sender),
+                join(',', qquote_rfc2821_local(@recips)),
+                join(', ', map { $_ . ': ' . $bounce_header_fields_ref->{$_} }
+                               sort(keys %$bounce_header_fields_ref)) );
+      if (!$bounce_rescued) {
+        snmp_count('BounceNotFromUsKilled');
+        my($bounce_killer_score) = c('bounce_killer_score');
+        for my $r (@{$msginfo->per_recip_data}) {
+          my($boost) = $r->recip_score_boost || 0;
+          $r->recip_score_boost($boost + $bounce_killer_score);
+        }
+      }
+    } elsif ($sender eq '' ||
+             $sender =~ /^(?:MAILER-DAEMON|postmaster)(?:\z|\@)/i ||
+             $sender =~ /^Symantec_/i) {
+      # message could be some kind of a non-standard bounce,
+      # but lacks a header section from original mail
+      do_log(3, "bounce unparsable, %s -> %s, ", qquote_rfc2821_local($sender),
+                join(',', qquote_rfc2821_local(@recips)));
+      snmp_count('BounceUnparsable');
+    }
+
     $which_section = "decide_mail_destiny";
     $snmp_db->register_proc(2,0,'r',$am_id)  if defined $snmp_db;  # results...
@@ -10716,4 +10773,105 @@
 }
 
+# Check if a message is a bounce, and if it is, try to obtain essential
+# information from an attached header section of an attached original message,
+# mainly the Message-ID.
+#
+sub inspect_a_bounce_message($) {
+  my($msginfo) = @_;
+  my(%header_field); my($is_true_bounce) = 0; 
+  my($parts_root) = $msginfo->parts_root;
+  if (defined $parts_root) {
+    my($sender) = $msginfo->sender;
+    my($structure_type) = '?';
+    my($top) = $parts_root->children;
+    $top = $top->[0]  if defined $top;  # there should only be one top part
+    my(@parts); my($fname_ind,$fname); my($plaintext) = 0;
+    if (defined $top)
+      { my($ch) = $top->children;  @parts = ($top, !defined $ch ? () : @$ch) }
+    my(@t) =
+      map { my($t)=$_->type_declared; lc(ref $t ? $t->[0] : $t) } @parts;
+    ll(5) && do_log(5, "inspect_dsn: parts: %s", join(", ",@t));
+    if (  @parts >= 2 && @parts <= 4  &&  # a root, with 2 or 3 leaves
+          $t[0] eq 'multipart/report' &&
+        ( $t[2] eq 'message/delivery-status' ||
+          $t[2] eq 'message/feedback-report' ) &&
+          $t[2] eq 'message/'.lc($parts[0]->report_type) &&
+        ( $t[3] eq 'text/rfc822-headers' || $t[3] eq 'message/rfc822' )) {
+      # standard DSN or feedback-report
+      $fname_ind = 3; $is_true_bounce = 1; $structure_type = 'standard DSN';
+    } elsif (@parts >= 3 && @parts <= 4  &&  # a root, with 2 or 3 leaves
+          $t[0] eq 'multipart/report' &&
+          lc($parts[0]->report_type) eq 'delivery-status' &&
+        ( $t[-1] eq 'text/rfc822-headers' || $t[-1] eq 'message/rfc822' )) {
+      # not quite std. DSN (missing message/delivery-status), but recognizable
+      $fname_ind = -1; $is_true_bounce = 1;
+      $structure_type = 'DSN, missing delivery-status part';
+    } elsif (@parts >= 3 && @parts <= 5 && $sender eq '' &&
+          $t[0] eq 'multipart/mixed' &&
+        ( $t[-1] eq 'text/rfc822-headers' || $t[-1] eq 'message/rfc822' )) {
+      # sometimes qmail
+      $fname_ind = -1; $structure_type = 'multipart/mixed with attached msg';
+    } elsif (@parts == 1 && $sender eq '') {
+      # nonstructured, possibly a non-standard bounce (qmail, gmail.com, ...)
+      $fname_ind = 0; $plaintext = 1; $structure_type = 'nonstructured';
+    }
+    if (defined $fname_ind) {  # we have a header section from original mail
+      $fname_ind = $#parts  if $fname_ind < 0;
+      $fname = $parts[$fname_ind]->full_name;
+      my(%collectable_header_fields);
+      $collectable_header_fields{lc($_)} = 1
+        for qw(From Message-ID Return-Path);
+      my($fh) = IO::File->new;
+      $fh->open($fname,'<') or die "Can't open file $fname: $!";
+      binmode($fh,":bytes") or die "Can't cancel :utf8 mode: $!"
+        if $unicode_aware;
+      my($curr_head,$ln); my($nr) = 0; my($have_msgid) = 0; local($1,$2);
+      for ($! = 0; defined($ln = $fh->getline); $! = 0) {
+        $nr++;  last if $nr > 1000;  # safety measure
+        if ($ln =~ /^[ \t]/) {  # folded
+          $curr_head .= $ln  if length($curr_head) < 2000;  # safety measure
+        } else {  # a new header field, process previous if any
+          $header_field{lc($1)} = $2  if defined($curr_head) &&
+                          $curr_head =~ /^([!-9;-\176]{1,30})[ \t]*:(.*)\z/s &&
+                          $collectable_header_fields{lc($1)};
+          $curr_head = $ln;
+          $have_msgid = 1  if defined $header_field{'from'} &&
+                              defined $header_field{'message-id'};
+          last  if ($ln eq "\n" || $ln =~ /^--/) && !$plaintext;
+          last  if $have_msgid;
+        }
+      }
+      defined $ln || $!==0  or die "Error reading from $fname: $!";
+      $fh->close or die "Error closing $fname: $!";
+      $header_field{lc($1)} = $2  if defined($curr_head) &&
+                          $curr_head =~ /^([!-9;-\176]{1,30})[ \t]*:(.*)\z/s &&
+                          $collectable_header_fields{lc($1)};
+      $have_msgid = 1  if defined $header_field{'from'} &&
+                          defined $header_field{'message-id'};
+      $is_true_bounce = 1  if $have_msgid;
+      if ($is_true_bounce) {
+        for (@header_field{keys %header_field})
+          { s/\n(?=[ \t])//g; s/^[ \t]+//; s/[ \t\n]+\z// }
+        $header_field{'message-id'} =
+          (parse_message_id($header_field{'message-id'}))[0]
+          if defined $header_field{'message-id'};
+      }
+      section_time("inspect_dsn");
+    }
+    if ($is_true_bounce) {
+      do_log(3, "inspect_dsn: bounce, struct(%s): %s, <%s>, %s",
+                !defined($fname_ind) ? '-' : $fname_ind,
+                $structure_type, $sender,
+                join(", ", map { $_ . ": " . $header_field{$_} }
+                               sort(keys %header_field)) )  if ll(3);
+    } elsif ($sender eq '') {
+      do_log(3, "inspect_dsn: not a bounce, struct(%s): %s, parts: %s",
+                !defined($fname_ind) ? '-' : $fname_ind,
+                $structure_type, join(", ",@t))  if ll(3);
+    }
+  }
+  !$is_true_bounce ? 0 : \%header_field;
+}
+
 sub add_forwarding_header_edits_common($$$$$$$) {
   my($conn, $msginfo, $hdr_edits, $hold, $any_undecipherable,





Mark
