On Thu, 16 Jun 2011 17:05:21 +0100 Carlo Rodrigues <c...@net4b.pt> wrote:
> Hey Stevan!
>
Hello Carlo,

> On 06/15/2011 08:32 PM, Stevan Bajić wrote:
> > I just extracted the subject line from the mbox file and made this
> > quick and dirty Perl file for testing:
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> > #!/usr/bin/perl -w
> > #
> >
> > require "htmlize.pl";
> >
> > my $header = 'Subject: =?UTF-8?B?RWluIHNjaMO2bmVzIFdvY2hlbmVkZSEgTnVyIDIgVGFnZSBSYWJhdHQsIEhhbmR5IC0xMjfigqwsIFRhYmxldCBQQyAtMzDigqw=?=';
> > if ($header =~ /^(.*?)=\?([^?]+)\?([qb])\?([^?]*)\?=(.*)$/is) {
> >   print htmlize_chars($header) . "\n";
> > } else {
> >   print "Not encoded\n";
> > }
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> >
> > I copied the htmlize.pl from GIT into the same directory, ran the Perl
> > script, and got this as output:
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> > theia ~ # perl ./csmr.pl
> > Subject: Ein schönes Wochenede! Nur 2 Tage Rabatt, Handy -127€, Tablet PC -30€
> > theia ~ #
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> >
> > IMHO this looks okay. It's German and says "Have a nice weekend! Only 2 days
> > left for discount, mobile handset -127€, tablet PC -30€".
> >
> > So the htmlize_chars in htmlize.pl is IMHO working correctly. Can you check
> > on your install if you get the same result with the above Perl script?
> >
> Yes, the Perl script worked just fine with that Subject string.
> The problem with those errors was in the htmlize() function in dspam.cgi.
>
> After playing around with several types of strings, I found that this
> new htmlize function, in addition to the new htmlize.pl 1.02 you sent
> June 10th, works flawlessly for all cases:
>
> sub htmlize {
>   #
>   # Replace some characters
>   # to be HTML characters
>   #
>   my($text) = @_;
>
>   use Encode;
>   use HTML::Entities;
>
>   if ($text =~ /[\xC2-\xDF][\x80-\xBF]/) {
>     $text = decode("utf8", $text);
>     $text = encode_entities(decode_entities($text));
>   }
>
>   if ($text =~ /^(.*?)=\?([^?]+)\?([qb])\?([^?]*)\?=(.*)$/is) {
>     if (-r "htmlize.pl") {
>       require "htmlize.pl";
>       $text = htmlize_chars($text);
>     }
>   }
>
>   return $text;
> }
>
Okay. This is fine, but we do too much in two different modules. This is IMHO
not optimal. Let me craft another patch that avoids doing the decoding twice.

> I really appreciate your help. Quarantine UI is looking gorgeous now :)
> Thank you.
>
> Carlo
>
-- 
Kind Regards from Switzerland,

Stevan Bajić

_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel
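
For illustration only: a minimal sketch of what "decode once, escape once" could look like. This is not the patch Stevan mentions (that patch is not shown in this message). The helper name htmlize_once is hypothetical and does not exist in dspam.cgi or htmlize.pl. The sketch assumes that a header without an encoded-word is raw UTF-8, and that Encode's built-in "MIME-Header" decoder is an acceptable stand-in for htmlize_chars. HTML::Entities is the same CPAN module Carlo's function already uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical helper, NOT part of dspam.cgi or htmlize.pl:
# turn the raw header into characters exactly once, then
# escape it for HTML exactly once.
sub htmlize_once {
    my ($text) = @_;
    return '' unless defined $text;

    if ($text =~ /=\?[^?]+\?[qb]\?[^?]*\?=/i) {
        # An RFC 2047 encoded-word is present. Encode's MIME-Header
        # decoder handles both the B (base64) and Q forms.
        $text = decode('MIME-Header', $text);
    } else {
        # Assumption: anything else is raw UTF-8 octets. With the
        # default check mode, malformed bytes become U+FFFD.
        $text = decode('UTF-8', $text);
    }

    # Single escaping pass: <, >, &, quotes and non-ASCII characters
    # are replaced by HTML entities.
    return encode_entities($text);
}

my $header = 'Subject: =?UTF-8?B?RWluIHNjaMO2bmVzIFdvY2hlbmVkZSEgTnVyIDIgVGFnZSBSYWJhdHQsIEhhbmR5IC0xMjfigqwsIFRhYmxldCBQQyAtMzDigqw=?=';
print htmlize_once($header), "\n";
```

Because the text is decoded to characters before any entity handling, the decode_entities/encode_entities round trip from the function quoted above is not needed in this sketch. Whether that holds for every input the quarantine UI sees would still have to be tested.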