http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5440
Summary: Malformed UTF-8 character errors
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Platform: Sun
OS/Version: Solaris
Status: NEW
Severity: major
Priority: P5
Component: Libraries
AssignedTo: [email protected]
ReportedBy: [EMAIL PROTECTED]
Testing SpamAssassin v3.2.0-rc3 on Solaris 9
We are getting a large number of "Malformed UTF-8 character" warnings when
certain messages are scanned (sample message attached). We did NOT have this
problem with v3.1.8 or any earlier 3.x version. The problem was first noticed
with spamd, but it can be reproduced by running:
spamassassin < badmsg.txt
Here is the system/Perl information:
# uname -a
SunOS ornl71 5.9 Generic_118558-39 sun4u sparc SUNW,UltraAX-i2
# perl -V
Summary of my perl5 (revision 5 version 8 subversion 7) configuration:
Platform:
osname=solaris, osvers=2.9, archname=sun4-solaris
uname='sunos 5.9 generic sun4u sparc sunw,ultra-5_10 solaris '
config_args=''
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef
usemultiplicity=undef
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O',
cppflags='-fno-strict-aliasing -pipe -I/usr/local/include'
ccversion='', gccversion='3.3.2', gccosandvers='solaris2.9'
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/usr/local/lib '
libpth=/usr/local/lib /usr/lib /usr/ccs/lib
libs=-lsocket -lnsl -ldl -lm -lc
perllibs=-lsocket -lnsl -ldl -lm -lc
libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'
Characteristics of this binary (from libperl):
Compile-time options: USE_LARGE_FILES
Built under solaris
Compiled at Dec 10 2005 02:42:33
@INC:
/usr/local/lib/perl5/5.8.7/sun4-solaris
/usr/local/lib/perl5/5.8.7
/usr/local/lib/perl5/site_perl/5.8.7/sun4-solaris
/usr/local/lib/perl5/site_perl/5.8.7
/usr/local/lib/perl5/site_perl
.
I have tried it with and without "LANG=en_US" (default is undefined).
Here are some of the error messages:
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xce) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xce) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xce) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xce) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
...
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xc4) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xc4) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xc5) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xc5) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xc7) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
...
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xce) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xcf) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xd0) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xd1) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xd2) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xce) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte
0x00, immediately after start byte 0xcf) in pattern match (m//)
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
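For readers unfamiliar with the warning text, here is a minimal Python sketch of why a byte pair such as 0xce 0x00 is reported as malformed. It is not code from SpamAssassin or Perl, and the function names are made up for illustration. It only applies the UTF-8 encoding rules to the start bytes quoted in the warnings above: each is a two-byte start byte, so the next byte must be a continuation byte (0x80 to 0xbf), which 0x00 is not. It says nothing about why 3.2.0-rc3 treats the data as UTF-8 in the first place.

```python
def classify_utf8_byte(b: int) -> str:
    """Classify a single byte by its role in the UTF-8 encoding."""
    if b <= 0x7F:
        return "ascii"          # 0xxxxxxx: a complete one-byte character
    if b <= 0xBF:
        return "continuation"   # 10xxxxxx: may only follow a start byte
    if 0xC2 <= b <= 0xDF:
        return "start-2"        # 110xxxxx: starts a two-byte character
    if 0xE0 <= b <= 0xEF:
        return "start-3"        # 1110xxxx: starts a three-byte character
    if 0xF0 <= b <= 0xF4:
        return "start-4"        # 11110xxx: starts a four-byte character
    return "invalid"            # 0xC0, 0xC1, 0xF5..0xFF never occur in UTF-8


def is_wellformed_pair(start: int, following: int) -> bool:
    """True if `following` is a valid continuation of two-byte start `start`."""
    return (classify_utf8_byte(start) == "start-2"
            and classify_utf8_byte(following) == "continuation")


# Start bytes copied from the warnings above; each was followed by 0x00.
reported_start_bytes = [0xCE, 0xC4, 0xC5, 0xC7, 0xCF, 0xD0, 0xD1, 0xD2]

for start in reported_start_bytes:
    verdict = "well-formed" if is_wellformed_pair(start, 0x00) else "malformed"
    print(f"0x{start:02x} 0x00: {classify_utf8_byte(start)} followed by "
          f"{classify_utf8_byte(0x00)} -> {verdict}")

# For contrast, 0xce followed by a real continuation byte decodes cleanly.
print(bytes([0xCE, 0xB1]).decode("utf-8"))
```

Every reported pair comes out as malformed, while the contrast case 0xce 0xb1 decodes to a single character (Greek small alpha).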