Michael,

Thanks to Joseph Brennan, I use this code at the end of sub filter() to achieve what I believe you want:

   #Disable bad HTML code -- Based on work by Columbia University / Joseph Brennan
   #Modified by KAM 2004-04-16
   #Modified by KAM 2004-04-21 to add slurp of entire message and one regexp check + size check
   #Modified by KAM 2004-06-02 to add a check for defined bodyhandle and path to prevent issues.
   #Modified by KAM 2004-08-09 to add $io to defined variables thanks to Tony Nelson
   if ($type eq "text/html") {
     my($currentline, $output, $badtag, $delimiter_backup, $sizelimit, $bh, $path, $io);

     $badtag = 0;
     $output = "";
     $sizelimit = 1048576; #1MB: max size in bytes of an email you want to check
     $delimiter_backup = $/;

     $bh = $entity->bodyhandle();
     if (defined($bh)) {
       $path = $bh->path();
     }
     if (defined($path)) {
       if (-s $path <= $sizelimit) {
         if ($io = $entity->open("r")) {
           undef $/; # undef the separator to slurp it in.
           $output = $io->getline;
           $io->close;
           $badtag = $output =~ s/<(iframe|script|object)\b/<no-$1 /igs;

           if ($badtag) {
             if ($io = $entity->open("w")) {
               $io->print($output);
               $io->close;
             }
             md_graphdefang_log('modify',"$badtag Iframe/Object/Script tag(s) deactivated by MIMEDefang");
             action_change_header("X-Warning", "$badtag Iframe/Object/Script tag(s) deactivated by MIMEDefang");
             action_rebuild();
           }
           $/ = $delimiter_backup;
         }
       }
     }
   }
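
To try the substitution outside of MIMEDefang, here is a minimal standalone sketch. The helper name defang_html and the sample string are made up for illustration and are not part of the filter; it only assumes the same regexp as above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIMEDefang): applies the same
# substitution as the filter code, but to a plain string instead of
# a MIME entity body.
sub defang_html {
    my ($html) = @_;
    # With /g, s/// returns the number of substitutions made, or an
    # empty string when nothing matched, hence the "|| 0".
    my $count = ($html =~ s/<(iframe|script|object)\b/<no-$1 /ig) || 0;
    return ($html, $count);
}

# Made-up sample input: two tags that should be renamed, plus
# <objection>, which the \b keeps from matching.
my $sample = '<p>Hi</p><SCRIPT src="x.js"></script><iframe src="y"></iframe><objection>';
my ($clean, $count) = defang_html($sample);
print "$count tag(s) deactivated\n";
print "$clean\n";
```

Only opening tags are renamed; closing tags such as </script> are left alone, since the regexp requires the tag name directly after the "<".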

I also use this rule in SA:

#WE USE MIMEDEFANG TO DISABLE ANY IFRAME, OBJECT OR SCRIPT TAGS IN EMAILS
header          KAM_IFRAME      X-Warning =~ /Iframe\/Object\/Script tag\(s\) deactivated by MIMEDefang/
describe        KAM_IFRAME      Email contained Iframe, Object or Script tags
score           KAM_IFRAME      1.0

Actually, it's good you asked this because I changed it to X-IframeWarning in the code.
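
With the header renamed, the SA rule would presumably have to test the new header name as well. A sketch under that assumption (the header text itself is assumed to be unchanged):

```
# Assumes the filter now calls action_change_header("X-IframeWarning", ...)
header          KAM_IFRAME      X-IframeWarning =~ /Iframe\/Object\/Script tag\(s\) deactivated by MIMEDefang/
describe        KAM_IFRAME      Email contained Iframe, Object or Script tags
score           KAM_IFRAME      1.0
```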

Regards,
KAM

----- Original Message -----
From: "Michael D. Sofka" <[email protected]>
To: <[email protected]>
Sent: Wednesday, April 29, 2009 2:10 PM
Subject: [Mimedefang] Suggestions on an HTML sanitize program.


Greetings,

Not directly a Mimedefang issue, but users of mimedefang are likely to have looked into this problem.

I have a need to sanitize HTML, primarily to remove scripts. Since the last time I looked at Perl modules, the field of candidates has expanded. I'm looking for suggestions. The three I'm considering are:

HTML::Defang
HTML::Detoxifier
HTML::StripScripts

I am leaning towards HTML::Defang. But, HTML::Detoxifier has a simple interface, and does some HTML cleanup as well. All three appear to be good choices. My primary need is to ensure scripts are removed from the input (to the degree that is possible). The application is already busy, so low memory overhead, and processing speed are important.

I'm interested in any feedback, or suggestions for other Perl modules.

Mike
--
Michael D. Sofka               [email protected]
C&MT Sr. Systems Programmer,   Email, TeX, Epistemology
Rensselaer Polytechnic Institute, Troy, NY.  http://www.rpi.edu/~sofkam/
_______________________________________________
NOTE: If there is a disclaimer or other legal boilerplate in the above
message, it is NULL AND VOID.  You may ignore it.

Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list [email protected]
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang

