----- Original Message -----
From: "Greg Webster" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, July 23, 2003 11:27 PM
Subject: [SAtalk] Errors in PerMsgStatus.pm
> Hi all...when I pipe a message through 'spamassassin -r', I get the
> following messages (Nx means repeated that many times).
>
> 20 x Malformed UTF-8 character (unexpected continuation byte 0xa0, with no
> preceding start byte) in transliteration (tr///) at
> /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/PerMsgStatus.pm line
> 1186.
Yeah, the old UTF-8 stuff. :) I had the exact same problem (Perl 5.8.0), and
changing the language settings would not make it go away. I solved it with a
little trick: I simply moved the "offending" code into its own module (NoUTF8.pm):
--------------------------------------------
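package Mail::SpamAssassin::NoUTF8;
use strict;
# Collapse runs of whitespace; paragraph breaks survive as single newlines.
sub strip_whitepace {
  my ($text) = @_; # list assignment: take the first argument, not the count
  $text =~ s/\n+\s*\n+/\f/gs; # double newlines => form feed
  $text =~ tr/ \t\n\r\x0b\xa0/ /s; # whitespace => space
  $text =~ tr/\f/\n/; # form feeds => newline
  return $text;
}
1;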
--------------------------------------------
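For a quick sanity check outside SpamAssassin, here is a minimal standalone
sketch of what the helper should do to a sample string. The sample text and
the $sample/$clean variables are made up for illustration, and the sub is
copied into the script (with the argument unpacked in list context) so that
it runs on its own:

```perl
use strict;
use warnings;

# Standalone copy of the helper, so this script needs no SpamAssassin install.
sub strip_whitepace {
    my ($text) = @_;                  # first argument, list context
    $text =~ s/\n+\s*\n+/\f/gs;       # double newlines => form feed
    $text =~ tr/ \t\n\r\x0b\xa0/ /s;  # whitespace => single space
    $text =~ tr/\f/\n/;               # form feeds => newline
    return $text;
}

# Hypothetical input: doubled spaces, a tab, a 0xa0 byte, a blank-line
# paragraph break, and a CRLF inside the second paragraph.
my $sample = "first  line\twith\xa0gaps\n\n\nsecond\r\nparagraph";
my $clean  = strip_whitepace($sample);

print "$clean\n";
```

Each paragraph should come out as one line with single spaces, and the
blank-line break between paragraphs should be kept as a single newline.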
Then, in PerMsgStatus.pm I added:
use Mail::SpamAssassin::NoUTF8;
And where the tr// stuff takes place, I now call it as follows:
my @textary = $self->split_into_array_of_short_lines
(Mail::SpamAssassin::NoUTF8::strip_whitepace ($text));
(Mind the email line wrap.) And voila, with that code in its own namespace,
all the UTF-8 problems have disappeared. :)
P.S. It may not be the official solution, but it works.
- Mark