Re: Incoming whitelist

Benjamin J. Stassart Wed, 15 Oct 2003 21:14:07 -0700

> Date: Wed, 15 Oct 2003 17:13:32 -0600
> From: Jason R. Mastaler <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Re: Incoming whitelist
>
> "Benjamin J. Stassart" <[EMAIL PROTECTED]> writes:
>
> > Is there a way to have an incoming ok=append type rule (or
> > alternative below)?
>
> I don't think so, no.


Well, at least I didn't embarrass myself by missing something in the
documentation.

> > What I would love is when someone on my whitelist sends a message to
> > me and some other addresses in the To: and CC: field that those
> > other addresses get added to a whitelist-incoming.
>
> [...]
>
> > This would cut down dramatically on the number of people having to
> > respond to confirmations.
>
> This will also probably result in a lot of unused cruft in your
> whitelist.

I don't mind that.  I had an idea to ignore adding incoming addresses when
more than a certain number of addresses are listed in an e-mail.  In
addition, one would not add any addresses already listed in another
whitelist.

> You might also check out FAQ 5.5 if you are concerned with reducing
> confirmations.

I am already using tagged outgoing addresses and outgoing auto-whitelist.

The situation I am proposing to fix is:

A is using TMDA.
B is on A's whitelist.
B sends a message to A, C, and D.
C replies to that message to A, B, and D.
D replies to that message to A, B, and C.

Currently if C and D are not currently on A's whitelist, both of them will
be asked to confirm the messages.

> > If this is not possible through tmda directly, then I assume one
> > could write a script between the tmda-filter and the Mailbox that
> > adds addresses to the whitelist-incoming.
>
> Yes, this sounds reasonable.  Just make sure your script and TMDA each
> append to their own files.  You might have unpredictable results if
> you try to make them share a file.

I assume you mean using a separate auto-whitelist file (and logs).  I am
not sure what else would be a problem sharing.

I finished writing such a script.  It supports text (wildcards), CDB, and
ezmlm whitelists.  I ran it through a bunch of test scripts and on an
Athlon 1.2GHz machine it took between 0.07 and 0.15s to process e-mails
with between 5-50 addresses test messages.

I ran it on my current mail file which is 2MB and has 160 messages.  It
took around 5 seconds to process them all. I had it set to skip any
message addressed to more than 8 people, but I didn't have any in my
mailbox that that applied to. But I created some tests that are oddly
formed, some that are invalid, and ones that contain a lot of e-mail
addresses.

~/bin/tests> split $MAIL
~/bin/tests> ls mail* test* | xargs -n 1 ../tmda-incoming

I got one invalid e-mail address logged and skipped, caused by a local
user using a mailer (mutt) that allowed them to e-mail a local address
without adding the domain part.  It added 16 addresses to my
whitelist-incoming (not including test addresses).

Next, I ran it on a bunch of saved folders I had, messages from friends
and family.  Adding another 654 messages to process. That took it 48
seconds to process.  It increased the number of addresses in my
whitelist-incoming to 82 (not including test addresses).  This one
generated a bunch of 'Invalid e-mail address: "Undisclosedrecipients:;"'
and 'Too many addresses in e-mail, skipped.' warnings to the log file.

I then did some tests sending messages from off system from accounts that
were and were not previously whitelisted.  Well, now you know the extent
of it.  So if you want to beta test it, you are welcome to.

Hopefully this helps some others.  Right now it needs to be configured
with the whitelists so each user needs their own version of the script.
If enough people request it I may write the code to parse the TMDA config
and filters to grab the necessary information.

One enables the script with a .qmail file such as:

#################################################
# Start of .qmail-tmda
#################################################
|preline /usr/bin/tmda-filter
|preline /home/user/bin/tmda-incoming
./Mailbox

#################################################
# End of .qmail-tmda
#################################################

Here is the script:

#################################################
# Start of tmda-incoming
#################################################
#!/usr/bin/perl -T
#
# Make sure to point the line above at a version of Perl 5
#
# Written 2003 by Benjamin J. Stassart
#                 <[EMAIL PROTECTED]> (I hate Spam)
#
# $Id: tmda-incoming,v 1.12 2003/10/16 03:23:02 stassart Exp stassart $
#
#    This program is free software; you can redistribute it and/or modify
#    it under the terms of either:
#
#        a) the GNU General Public License as published by the Free
#        Software Foundation; either version 2, or (at your option) any
#        later version, or
#
#        b) the "Artistic License" included with Larry Wall's Perl.
#
# tmda-incoming adds incoming e-mail addresses to a whitelist
# It checks current whitelists before adding
#
# To Configure see Configuration section below
#
# $Log: tmda-incoming,v $
# Revision 1.12  2003/10/16 03:23:02  stassart
# Added code to get rid of comments in ()'s.
#
# Revision 1.11  2003/10/16 03:12:38  stassart
# Added code to remove surrounding ticks for MS-Exchange.
#
# Revision 1.10  2003/10/16 02:32:25  stassart
# Added too many addresses code.
#
# Revision 1.9  2003/10/16 02:19:53  stassart
# Fixed space bug.
#
# Revision 1.8  2003/10/16 01:42:06  stassart
# Added e-mail parsing code.
#
# Revision 1.7  2003/10/15 23:57:35  stassart
# Added DO_NOTHING.
#
# Revision 1.6  2003/10/15 23:28:37  stassart
# Added DEBUG.
#
# Revision 1.5  2003/10/15 23:19:59  stassart
# Added check for incoming whitelist being listed.
#
# Revision 1.4  2003/10/15 22:41:53  stassart
# Added ezmlm support.
#
# Revision 1.3  2003/10/15 21:20:15  stassart
# Added CDB_ENABLED.
#
# Revision 1.2  2003/10/15 20:39:28  stassart
# Fixed regular expressions for current whitelists.
#
# Revision 1.1  2003/10/15 20:29:40  stassart
# Initial revision
#

########################################################################
# Modules
########################################################################

use strict;
use English;
# Require Perl 5
require 5.005_03;

# CDB_File is included below, it is not manditory
# but it improves performance for CDB files

########################################################################
# Environment
########################################################################

# For Taint checking
$ENV{PATH} = '/bin:/usr/bin';
$ENV{IFS} = '';
delete $ENV{ENV};
delete $ENV{BASH_ENV};
delete $ENV{CD_PATH};

########################################################################
# Configuration
########################################################################

# Debug?
my $DEBUG = 0;
# Don't actually add to the file
my $DO_NOTHING = 0;

# Get home directory
# This shouldn't need to be changed
# my $HOME = "/home/user";
my $HOME = $ENV{HOME};
# Need to untaint $HOME
if ($HOME =~ m#^(/[-._/\w]+)$#) {
   $HOME = $1;
} else {
   $HOME = "";
}

# If an e-mail includes more than this number of addresses do not
# add them to the whitelist
#
# Set to 0 to disable
#
my $MAXADDR             = 8;

# Whitelist to add incoming addresses to
#
my $INCOMING            = "$HOME/.tmda/lists/whitelist-incoming";

# Text
my @TEXT_WHITELISTS     = (
                           "$HOME/.tmda/lists/whitelist-wild",
                          );

# CDB Whitelists
# Filenames should end in .cdb
my @CDB_WHITELISTS      = (
                           "$HOME/.tmda/lists/whitelist.cdb",
                           "$HOME/.tmda/lists/whitelist-confirmed.cdb",
                           "$HOME/.tmda/lists/whitelist-mailinglists.cdb",
                          );

# ezmlm-issubn command if using ezlmlm whitelists
my $ISSUBN              = "/usr/bin/ezmlm-issubn";

# EZMLM Whitelists
my @EZMLM_WHITELISTS    = (
                          );

# Log file
my $LOG                 = "$HOME/.tmda/logs/incoming-whitelist.log";

# Error exit code
#my $ERROR_CODE         = 111;  # Temporary
my $ERROR_CODE          = 100;  # Permanent

# Success code
#my $SUCCESS_CODE       = 99;   # Done
my $SUCCESS_CODE        = 0;    # Continue on

########################################################################
# Globals
########################################################################

# Subroutine declarations
sub my_die($);
sub my_diestderr($);
sub add_addrs($);
sub proc_addr($);

########################################################################
# Main
########################################################################

# Check if CDB_File module is installed
my $CDB_ENABLED;
#eval 'use CDB_Bogus; 1';
eval 'use CDB_File; 1';
if ($EVAL_ERROR) {
   $CDB_ENABLED = 0;
   if ($DEBUG) {
      print "CDB not found\n";
   }
} else {
   $CDB_ENABLED = 1;
   if ($DEBUG) {
      print "CDB found\n";
   }
}

# For testing
# You can also try adding addresses from EZMLM, text, and cdb whitelists
#my @TOADD      = (
#                  '[EMAIL PROTECTED]',         # Does not exist
#                 );
my @TOADD;

# Temporary partial address from previous line
# Also will hold last address in a field
my $prev_addr;

# List of current whitelisted addresses
my $list;
my $address;
my $i;

# Count of e-mail addresses in e-mail
my $addr_count = 0;

# Open Log file
open LOG, '>>', $LOG
   or my_diestderr qq(Unable to open log file "$LOG" for writing: $!\n);

my_die qq(Home directory not defined.\n)
   unless -d $HOME;

my_die qq[Incoming whitelist "$INCOMING" does not exist (must create).\n]
   unless -w $INCOMING;

# Parse E-mail for addresses
my $grab = 0;
while (<>) {

   if (/^To:/i) {

      # Do we still have an address?
      if ($prev_addr) {
         proc_addr($prev_addr);
         $prev_addr = "";
      }

      # Get rid of field-name
      s/^To:\s*//i;

      add_addrs($_);
      $grab = 1;

      next;
   }

   if (/^CC:/i) {

      # Do we still have an address?
      if ($prev_addr) {
         proc_addr($prev_addr);
         $prev_addr = "";
      }

      # Get rid of field-name
      s/^CC:\s*//i;

      add_addrs($_);
      $grab = 1;

      next;
   }

   # Other headers
   #
   # RFC822:
   # The field-name must be composed of printable ASCII characters
   # (i.e., characters that  have  values  between  33.  and  126.,
   # decimal, except colon)
   # field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">
   #                                                    Oct      Dec
   # CTL         =  <any ASCII control           ; (  0- 37,  0.- 31.)
   #                 character and DEL>          ; (    177,     127.)
   # SPACE       =  <ASCII SP, space>            ; (     40,      32.)
   #
   # 037 (31d) is last CTL char, 040 (32d) is the space,
   #    then 041 (33d) is where we begin
   # 072 (58d) is the colon
   # 0176 (126d) is the ~, the la
   #
   if (/^[\041-\071\073-\176]+:/) {
      $grab = 0;
      next;
   }

   # Skip From line
   if (/^From/) {
      $grab = 0;
      next;
   }

   # Blank line marks end of headers
   last
      if /^\s*$/;

   # All that should be left is continuation lines
   # RFC 822 states continuation lines must start with
   # "AT LEAST one LWSP-char"
   # LWSP-char   =  SPACE / HTAB
   unless (/^[ \t]/) {
      # Unknown header, skip it
      print LOG qq(Unknown header: $_\n);

      if ($DEBUG) {
         print qq(Unknown header: $_\n);
      }

      next;
   }

   # Should be a continued line, is it a continued address line?
   if ($grab) {
      add_addrs($_);
   }
}

# Do we still have an address?
if ($prev_addr) {
   proc_addr($prev_addr);
   $prev_addr = "";
}

# Make sure incoming whitelist is being checked too
#
my $in_found = 0;
if (-e "$INCOMING.cdb") {
   foreach $list (@CDB_WHITELISTS) {
      if ($list eq "$INCOMING.cdb") {
         $in_found = 1;
         last;
      }
   }
}
unless ($in_found) {
   foreach $list (@TEXT_WHITELISTS) {
      if ($list eq $INCOMING) {
         $in_found = 1;
         last;
      }
   }
}
# Couldn't find it, we need to add it
unless ($in_found) {
   if (-e "$INCOMING.cdb") {
      push @CDB_WHITELISTS, "$INCOMING.cdb";
   } else {
      push @TEXT_WHITELISTS, $INCOMING;
   }
}

# If we can't use CDB, then move lists to text
unless ($CDB_ENABLED) {
   while (@CDB_WHITELISTS) {
      $list = pop @CDB_WHITELISTS;

      $list =~ s/\.cdb$//;
      push @TEXT_WHITELISTS, $list;
   }
}

# Check for issubn command
$ISSUBN = ""
   unless -x $ISSUBN;

# Is it listed in CDB whitelists?

my %CDB;
if ($CDB_ENABLED) {
   foreach $list (@CDB_WHITELISTS) {
      tie %CDB, 'CDB_File', $list
         or my_die qq(Unable to open database whitelist "$list": $!\n);

      for ($i = 0; $i < scalar(@TOADD); $i++) {
         $address = $TOADD[$i];

         # Skip deleted addresses
         next
            unless defined $address;

         if ($DEBUG) {
            print "Checking address: $address\n";
         }

         next
            unless exists $CDB{$address};

         # Remove it from addition list
         delete $TOADD[$i];

         if ($DEBUG) {
            print "Found $address on $list.\n";
         }
      }
   }
}

# Is it listed in text whitelists?

foreach $list (@TEXT_WHITELISTS) {
   open LIST, '<', $list
      or my_die qq(Unable to open whitelist "$list" for reading: $!\n);

   while (<LIST>) {
      chomp;

      # Skip blank lines
      next
         if /^\s*$/;

      # Skip comments
      next
         if /^#/;

      # Get rid of anything after whitespace
      s/\s+$//;

      # Get rid of any possible Perl metachars
      $_ = quotemeta $_;

      # Add back in wildcards
      s/\\\*/.*/g;
      s/[EMAIL PROTECTED]/@.*/g;

      for ($i = 0; $i < scalar(@TOADD); $i++) {
         # print "Checking: $_\n";

         $address = $TOADD[$i];

         # Skip deleted addresses
         next
            unless defined $address;

         next
            unless $address =~ m/^$_$/;

         delete $TOADD[$i];

         if ($DEBUG) {
            print "Found $address on $list.\n";
         }
      }
   }

   close LIST;
}

# Is it listed in ezmlm whitelists?

if ($ISSUBN) {
   foreach $list (@EZMLM_WHITELISTS) {
      for ($i = 0; $i < scalar(@TOADD); $i++) {
         # Skip deleted addresses
         next
            unless defined $TOADD[$i];

         $ENV{'SENDER'} = $TOADD[$i];

         # Check if subscribed
         unless (system($ISSUBN, $list) / 256) {
            # Subscribed
            delete $TOADD[$i];

            if ($DEBUG) {
               print "Found $address on $TOADD[$i].\n";
            }
         }
      }
   }
}

# Not listed, let's add them

open OUT, '>>', $INCOMING
   or my_die qq(Unable to open "$INCOMING" for writing: $!\n);

if ($DEBUG) {
   print "Addresses to add:\n";
}

foreach (@TOADD) {
   next
      unless defined $_ and $_;

   unless ($DO_NOTHING) {
      print OUT "$_\n";
   }

   if ($DEBUG) {
      print "$_\n";
   }
}

close OUT
   or my_die qq(Unable to close "$INCOMING" after writing: $!\n);

close LOG;

exit $SUCCESS_CODE;

########################################################################
# Subroutines
########################################################################

# my_die
# Die with correct error code
sub my_die($) {
   my $error = shift;

   print LOG $error;

   exit $ERROR_CODE;
}

# my_diestderr
# Die with correct error code
sub my_diestderr($) {
   my $error = shift;

   print STDERR $error;

   exit $ERROR_CODE;
}

# add_addrs
# Add to list of addresses
#
# Caution for comments, spaces, and line continuation
#
# This does not follow RFC 822's parser, but it is much faster
#
sub add_addrs($) {
   my $line = shift;

   if ($DEBUG) {
      print qq(In add_addrs: "$line"\n);
   }

   # Get rid of all whitespace
   $line =~ s/\s+//g;

   # Get rid of any quoted text
   $line =~ s/"[^"]*"//g;

   # print qq(line is now: "$line"\n);

   # Add to the previous address if any
   if ($prev_addr) {
      $line = $prev_addr . $line;
      $prev_addr = "";
   }

   # Add back one space at the end to preserve empty last field
   # Otherwise, last address on line gets combined with next one
   $line .= " ";

   # print qq(line is now: "$line"\n);

   my @a = split ',', $line;

   # If the line ended in a comma, prev_addr will be blank
   # Otherwise we have part of an address
   $prev_addr = pop @a;

   # Get rid of the space we just added
   $prev_addr =~ s/ $//;

   foreach (@a) {
      proc_addr($_);
   }
}

# proc_addr
#
# Process an address
#
sub proc_addr ($) {
   my $addr = shift;

   if ($DEBUG) {
      print qq(In proc_addr: "$addr"\n);
   }

   # Does it contain a route-addr?
   if ($addr =~ m/<(.*)>/) {
      # Use the route-addr
      $addr = $1;
   }

   # Does it have single quotes around it?
   # Microsoft Exchange sometimes does this
   $addr =~ s/^'//;
   $addr =~ s/'$//;

   # Get rid of comments in ()'s
   $addr =~ s/\(.*\)//g;

   # Better contain local-part "@" domain
   unless (defined $addr and $addr =~ m/[EMAIL PROTECTED]/) {
      # Invalid e-mail address
      print LOG qq(Invalid e-mail address: "$addr"\n);

      if ($DEBUG) {
         print qq(Invalid e-mail address: "$addr"\n);
      }

      return;
   }

   # Finally add it

   push @TOADD, $addr;

   $addr_count++;

   # Check address count
   if ($addr_count > $MAXADDR) {
      # Too many addresses, skip e-mail
      print LOG "Too many addresses in e-mail, skipped.\n";

      close LOG;
      exit $SUCCESS_CODE;
   }
}

# End

#################################################
# End of tmda-incoming
#################################################

Here is the split script I used:

#################################################
# Start of split
#################################################
#!/usr/bin/perl

use strict;
use English;

my $PREFIX = "mail";
my $count = 0;
my $open = 0;

while (-e "$PREFIX$count") {
   $count++;
}

while (<>) {
   if (/^From /) {
      if ($open) {
         close OUT
            or die qq(Unable to close after writing "$PREFIX$count": $!\n);
      }

      open OUT, ">$PREFIX$count"
         or die qq(Unable to open for writing "$PREFIX$count": $!\n);
      $open = 1;
      $count++;
   }

   print OUT $_;
}

close OUT
   or die qq(Unable to close after writing "$PREFIX$count": $!\n);

# End

#################################################
# End of split
#################################################
_____________________________________________
tmda-users mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-users

Re: Incoming whitelist

Reply via email to