On Thu, 16 Jun 2011 17:05:21 +0100 Carlo Rodrigues <c...@net4b.pt> wrote:
> Hey Stevan!
>
Hello Carlo,

> On 06/15/2011 08:32 PM, Stevan Bajić wrote:
> > I just extracted the subject line from the mbox file and made this
> > quick and dirty Perl file for testing:
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> > #!/usr/bin/perl -w
> > #
> >
> > require "htmlize.pl";
> >
> > my $header = 'Subject:
> > =?UTF-8?B?RWluIHNjaMO2bmVzIFdvY2hlbmVkZSEgTnVyIDIgVGFnZSBSYWJhdHQsIEhhbmR5IC0xMjfigqwsIFRhYmxldCBQQyAtMzDigqw=?=';
> > if ($header =~ /^(.*?)=\?([^?]+)\?([qb])\?([^?]*)\?=(.*)$/is) {
> >     print htmlize_chars($header) . "\n";
> > } else {
> >     print "Not encoded\n";
> > }
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> >
> > I copied the htmlize.pl from GIT into the same directory, ran the Perl
> > script and got this as output:
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> > theia ~ # perl ./csmr.pl
> > Subject: Ein schönes Wochenede! Nur 2 Tage Rabatt, Handy -127€,
> > Tablet PC -30€
> > theia ~ #
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> >
> > IMHO this looks okay. It's German and says "Have a nice weekend! Only 2 days
> > left for discount, mobile handset -127€, tablet PC -30€".
> >
> > So the htmlize_chars in htmlize.pl is IMHO working correctly. Can you check
> > on your install if you get the same result with the above Perl script?
> >
> Yes, the perl script worked just fine with that Subject string.
> The problem with those errors was in the htmlize() function in dspam.cgi.
> After playing around with several types of strings, I found that this
> new htmlize function, in addition to the new htmlize.pl 1.02 you sent
> June 10th, works flawlessly for all cases:
>
> sub htmlize {
>     #
>     # Replace some characters
>     # to be HTML characters
>     #
>     my($text) = @_;
>
>     use Encode;
>     use HTML::Entities;
>
>     if ($text =~ /[\xC2-\xDF][\x80-\xBF]/) {
>         $text = decode("utf8", $text);
>         $text = encode_entities(decode_entities($text));
>     }
>
>     if ($text =~ /^(.*?)=\?([^?]+)\?([qb])\?([^?]*)\?=(.*)$/is) {
>         if (-r "htmlize.pl") {
>             require "htmlize.pl";
>             $text = htmlize_chars($text);
>         }
>     }
>
>     return $text;
> }
>
>
> I really appreciate your help. Quarantine UI is looking gorgeous now :)
> Thank you.
>
Try the attached patch. It's way smaller than the old one and avoids
double decoding/encoding. It should IMHO yield the same result as your
code but be slightly faster when processing. You need to apply the patch
against GIT Master. Does it work as expected? If so, tell me and I will
merge it into GIT Master.

> Carlo
>
-- 
Kind Regards from Switzerland,

Stevan Bajić
2918624.patch
Description: Binary data
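
For readers without the attachment, here is a minimal sketch of the "decode once, then escape once" idea discussed above. It is not the content of 2918624.patch and not the code in htmlize.pl. The helper name header_to_html is hypothetical, and the sketch makes these assumptions: the header contains at most one encoded-word, everything outside the encoded-word is plain ASCII, and the charset is one that Encode knows (decode() dies otherwise). It uses only core modules, so HTML::Entities is replaced by two simple substitutions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use MIME::Base64 qw(decode_base64);
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper (illustration only): decode a single encoded-word
# once, then convert the result to HTML-safe ASCII once.
sub header_to_html {
    my ($header) = @_;

    # Same pattern as in the test script: prefix, charset, encoding
    # (q or b), payload, suffix.
    if ($header =~ /^(.*?)=\?([^?]+)\?([qb])\?([^?]*)\?=(.*)$/is) {
        my ($pre, $charset, $enc, $payload, $post) = ($1, $2, lc($3), $4, $5);
        my $bytes;
        if ($enc eq 'b') {
            $bytes = decode_base64($payload);
        } else {
            # In Q encoding an underscore stands for a space.
            (my $qp = $payload) =~ tr/_/ /;
            $bytes = decode_qp($qp);
        }
        # Turn the raw bytes into Perl characters exactly once.
        # Assumption: $pre and $post are plain ASCII.
        $header = $pre . decode($charset, $bytes) . $post;
    }

    # Escape HTML metacharacters first, so the '&' of the numeric
    # entities added in the next step is not escaped again.
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $header =~ s/([&<>"])/$entity{$1}/g;

    # Replace every non-ASCII character with a numeric entity.
    $header =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;

    return $header;
}

my $subject = 'Subject: =?UTF-8?B?RWluIHNjaMO2bmVzIFdvY2hlbmVkZSEgTnVyIDIgVGFnZSBSYWJhdHQsIEhhbmR5IC0xMjfigqwsIFRhYmxldCBQQyAtMzDigqw=?=';
print header_to_html($subject), "\n";
```

Because the output contains only ASCII, it can be printed without setting an output encoding. A header with several encoded-words, or with raw 8-bit bytes outside the encoded-word, would need more care than this sketch takes.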
_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel