Here's my code. It's based on the four-field file that MailScanner uses (so that I wouldn't have to redo all of our extensions). Each line of that file has the form:

allow|deny(tab)regex(tab)log-text(tab)user-text

(user-text is a user-friendly explanation for why this attachment is blocked; log-text is a shorter (and thus log-friendly) version of the same)

Here's my code:

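# Start with an empty list ("= undef" would leave one undef element in it).
@myfilenames = ();

# Warn instead of failing silently if the rules file cannot be read.
open (FILENAMERULES, "</etc/mail/filename.rules.conf")
   or warn "Cannot open /etc/mail/filename.rules.conf: $!";
while (defined ($line = <FILENAMERULES>)) {
   chomp $line;
   $line =~ s/\#.*//;     # strip comments
   $line =~ s/^\s+//;     # strip leading whitespace
   $line =~ s/\s+$//;     # strip trailing whitespace
   if ($line ne "") {
      push(@myfilenames, $line);
      }
   }
close (FILENAMERULES);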

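# This procedure returns true for entities with bad filenames.
# The first rule whose regex matches decides the result.
sub filter_bad_filename ($) {
   my ($entity) = @_;
   my ($re, $perm, $regex, $logtxt, $usertxt);

   foreach $re (@myfilenames) {
      # Fields are tab-separated: perm, regex, log-text, user-text.
      ($perm, $regex, $logtxt, $usertxt) = split(/\t+/, $re, 4);
      next unless defined $regex;   # skip empty or malformed rules
      if (re_match($entity, $regex)) {
         if ($perm eq "allow") {
            return (0);
            }
         if ($perm eq "deny") {
            return (1);
            }
         } # if re_match
      } # foreach

   # No rule matched: treat the filename as acceptable.
   return 0;
   }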

This is a simple version that doesn't actually use the user-text or log-text. I'm also testing a version that, instead of just returning 1 or 0, returns (perm, logtxt, usertxt), so that usertxt can be incorporated into the response and logtxt into the logs.
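
A minimal sketch of what such a three-value variant might look like. This is not the version being tested; it is only an illustration under assumptions. The name check_filename, the sample rules, and matching against a plain filename string (rather than a MIME entity via re_match) are all hypothetical, and the case-insensitive match is an assumption too.

```perl
use strict;
use warnings;

# Hypothetical sample rules in the four-field, tab-separated format.
# In the real filter these lines would come from filename.rules.conf.
my @rules = map { join("\t", @$_) } (
    [ 'allow', '\.jpg$', 'JPEG image',         'Images are allowed.' ],
    [ 'deny',  '\.exe$', 'Windows executable', 'Executable attachments are blocked.' ],
);

# Returns (perm, logtxt, usertxt) for the first rule whose regex matches
# the filename, or an empty list when no rule matches.
sub check_filename {
    my ($filename) = @_;
    foreach my $rule (@rules) {
        my ($perm, $regex, $logtxt, $usertxt) = split /\t+/, $rule, 4;
        next unless defined $regex;    # skip malformed rules
        # Case-insensitive matching is an assumption of this sketch.
        return ($perm, $logtxt, $usertxt) if $filename =~ /$regex/i;
    }
    return;
}

my ($perm, $logtxt, $usertxt) = check_filename('invoice.exe');
print "$perm: $logtxt ($usertxt)\n" if defined $perm;
```

The caller can then act on perm and pass logtxt and usertxt on to whatever logging and notification calls the filter already uses.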

I also tried to smash them together into 2 strings, like this:

$allowregex = '(\.gif$)|(\.jpg$)(and more expressions)';
$denyregex = '(\.exe$)|(\.com$)|(\.ma[dgf]$)(and more expressions)';

But I couldn't figure out a way to get Perl to tell me _which_ part of the regular expression matched. You can get it to tell you which part of the _target_ string matched (for command.com matched against ".com", that's what $& gives you, and $& is REALLY slow), but it won't tell you that the part of the regex that matched was (\.com$), which is what I'd need as an index into a hash that contains logtxt and usertxt. So I would still have to iterate through the rules to find out which one was tripped before I could return the appropriate logtxt and usertxt, and in the end that wouldn't be any faster than what's above.
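
One possible way around this, offered only as a sketch: if every rule is wrapped in exactly one capture group, Perl's @- array (the start offsets of each group in the last successful match) shows which group took part in the match, and that group number can index a rule table. The table, the sample texts, and the name which_deny_rule are hypothetical. The sketch also assumes that the individual rule regexes contain no capture groups of their own, and a combined alternation matches at the leftmost position in the filename, which is not always the same as "first rule in file order".

```perl
use strict;
use warnings;

# Hypothetical rule table: each entry is [regex, log-text, user-text].
# The individual regexes must not contain capture groups of their own.
my @deny_rules = (
    [ '\.exe$',     'Windows executable', 'Executable files are not allowed.' ],
    [ '\.com$',     'DOS executable',     'DOS programs are not allowed.' ],
    [ '\.ma[dgf]$', 'Access shortcut',    'Access shortcuts are not allowed.' ],
);

# Wrap every rule in exactly one capture group and join with "|".
my $combined = join '|', map { "($_->[0])" } @deny_rules;
my $deny_re  = qr/$combined/i;

# Returns the matching rule (an array ref) or undef if nothing matched.
sub which_deny_rule {
    my ($filename) = @_;
    return undef unless $filename =~ $deny_re;
    # Copy the group start offsets; an entry is defined only for a group
    # that took part in the match. Group N corresponds to rule N-1.
    my @starts = @-;
    for my $i (1 .. $#starts) {
        return $deny_rules[$i - 1] if defined $starts[$i];
    }
    return undef;
}

my $rule = which_deny_rule('command.com');
print "log: $rule->[1]; user: $rule->[2]\n" if $rule;
```

Whether this is actually faster than iterating over the rules one at a time would need to be measured.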

_______________________________________________
Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list
[email protected]
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang
