http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5440

           Summary: Malformed UTF-8 character errors
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Sun
        OS/Version: Solaris
            Status: NEW
          Severity: major
          Priority: P5
         Component: Libraries
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]


Testing SpamAssassin v3.2.0-rc3 on Solaris 9.
We are getting tons of "Malformed UTF-8 character" warnings when certain 
messages are scanned (sample message attached).  We did NOT have this problem 
with v3.1.8 or any earlier 3.x version.  The problem was first noticed with 
spamd but can be reproduced by running:

spamassassin < badmsg.txt

Here is the system/Perl information:

# uname -a
SunOS ornl71 5.9 Generic_118558-39 sun4u sparc SUNW,UltraAX-i2
# perl -V
Summary of my perl5 (revision 5 version 8 subversion 7) configuration:
  Platform:
    osname=solaris, osvers=2.9, archname=sun4-solaris
    uname='sunos 5.9 generic sun4u sparc sunw,ultra-5_10 solaris '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O',
    cppflags='-fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='3.3.2', gccosandvers='solaris2.9'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib '
    libpth=/usr/local/lib /usr/lib /usr/ccs/lib
    libs=-lsocket -lnsl -ldl -lm -lc
    perllibs=-lsocket -lnsl -ldl -lm -lc
    libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'


Characteristics of this binary (from libperl):
  Compile-time options: USE_LARGE_FILES
  Built under solaris
  Compiled at Dec 10 2005 02:42:33
  @INC:
    /usr/local/lib/perl5/5.8.7/sun4-solaris
    /usr/local/lib/perl5/5.8.7
    /usr/local/lib/perl5/site_perl/5.8.7/sun4-solaris
    /usr/local/lib/perl5/site_perl/5.8.7
    /usr/local/lib/perl5/site_perl
    .

I have tried it with and without "LANG=en_US" (default is undefined).
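
The with/without LANG comparison can be scripted. What follows is only a
minimal sketch, not part of the original test run: it assumes that
"spamassassin" is on the PATH, that the warnings are written to standard
error, and that the sample message is saved as badmsg.txt.  The helper names
count_warnings and run_with_lang are hypothetical, and LC_ALL / LC_CTYPE are
left untouched.

```python
import os
import subprocess

# Text that identifies the warning in the spamassassin output.
WARNING_MARKER = "Malformed UTF-8 character"


def count_warnings(stderr_text):
    """Count how many times the warning appears in captured output."""
    return stderr_text.count(WARNING_MARKER)


def run_with_lang(message_path, lang=None):
    """Scan one message and return the number of warnings emitted.

    lang=None runs with LANG removed from the environment (the default
    state on the reporting system); otherwise LANG is set to `lang`.
    Assumption: warnings are written to standard error.
    """
    env = dict(os.environ)
    env.pop("LANG", None)
    if lang is not None:
        env["LANG"] = lang
    with open(message_path, "rb") as msg:
        result = subprocess.run(
            ["spamassassin"],
            stdin=msg,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.PIPE,
            env=env,
            check=False,
        )
    # latin-1 never fails to decode, whatever bytes the output contains.
    return count_warnings(result.stderr.decode("latin-1"))
```

Calling run_with_lang("badmsg.txt") and run_with_lang("badmsg.txt", "en_US")
gives the two counts to compare.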

Here are some of the error messages:

[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xce) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xce) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xce) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xce) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
...
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xc4) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xc4) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xc5) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xc5) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xc7) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule __SARE_OBFU_VISIT1, line 1.
...
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xce) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xcf) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xd0) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xd1) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xd2) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xce) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
[19860] warn: Malformed UTF-8 character (unexpected non-continuation byte 
0x00, immediately after start byte 0xcf) in pattern match (m//) 
at /etc/mail/spamassassin/70_sare_obfu.cf, rule SARE_OBFU_XANAX, line 1.
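
For readers unfamiliar with the wording of these warnings: every start byte
listed above (0xc4 through 0xd2) is the lead byte of a two-byte UTF-8
sequence, and such a byte must be followed by a continuation byte in the
range 0x80-0xbf.  A 0x00 byte is not in that range, hence "unexpected
non-continuation byte".  The sketch below only illustrates that rule using
Python's own UTF-8 decoder; it makes no claim about where the 0x00 bytes come
from, and the helper names are made up for the illustration.

```python
# Start bytes taken from the warnings quoted above.
START_BYTES = [0xC4, 0xC5, 0xC7, 0xCE, 0xCF, 0xD0, 0xD1, 0xD2]


def is_two_byte_start(byte):
    """True if `byte` is the lead byte of a two-byte UTF-8 sequence."""
    return 0xC2 <= byte <= 0xDF


def is_continuation(byte):
    """True if `byte` is a valid UTF-8 continuation byte (10xxxxxx)."""
    return 0x80 <= byte <= 0xBF


def decodes_as_utf8(data):
    """True if `data` is well-formed UTF-8 according to Python's decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


for start in START_BYTES:
    pair = bytes([start, 0x00])
    print(
        "0x%02x 0x00: two-byte start=%s, next is continuation=%s, valid=%s"
        % (start, is_two_byte_start(start), is_continuation(0x00),
           decodes_as_utf8(pair))
    )
```

By contrast, a pair such as 0xce 0x91 (GREEK CAPITAL LETTER ALPHA) passes the
same check, because 0x91 is a valid continuation byte.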



