I am using sb_bnfilter.py version 1.1a3 on Linux (actually the CVS version from Aug 21), and I am getting an increasing number of spam messages that contain 8-bit characters but are labelled as 7-bit. These cause sb_bnfilter.py to fail with an error such as:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xb0 in position 1: ordinal not in range(128)

and the message is not classified.
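For illustration, here is a minimal sketch of how this kind of error arises and what replacing non-ASCII bytes with ? does. It is written for current Python 3 rather than the Python 2 of the time, the sample bytes are made up, and the helpers try_ascii and replace_non_ascii are hypothetical names, not SpamBayes code.

```python
# Made-up sample: text claimed to be 7-bit that contains the raw byte 0xb0
# at position 1, matching the position in the reported error.
raw = b"2\xb0 off today"


def try_ascii(data: bytes):
    """Return (text, error_message); exactly one of the two is None."""
    try:
        return data.decode("ascii"), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


def replace_non_ascii(data: bytes) -> bytes:
    """Replace every byte >= 0x80 with '?' (illustrative only)."""
    return bytes(b if b < 0x80 else ord("?") for b in data)


text, error = try_ascii(raw)      # decoding fails, error holds the message
cleaned = replace_non_ascii(raw)  # 0xb0 becomes '?', so decoding succeeds

print(error)
print(try_ascii(cleaned)[0])
```

The replacement only helps if it is applied to every part of the message before any ASCII decoding happens; a part that is decoded earlier or separately would still raise the error.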
I set the option replace_nonascii_chars: True in .spambayesrc, and some characters no longer caused problems (they were replaced with ?). However, any 8-bit character in the Subject: header still caused the error, as did the character 0xb0 somewhere in some messages. I used hexdump on each message to look for the reported 8-bit character. I could find the reported character in the message body or headers, except in the case of 0xb0, which did not appear anywhere in the hexdump output. I have attached a message that triggers the 0xb0 error.
--- Begin Message ---
I sometimes sleep there when it's very hot. ague carbonaceous I know about for-EN-sics!
Geoffrey stole one last look at her, and for just a moment those cornflower eyes flashed his way, warming him, filling him. "I heard ye were turrible bunged up t'other night. He pushed it with the ball of his thumb. She turned to go. "Who's being smart? When he saw the mower bearing down on him he rolled over on his back and dug frantically at the driveway dirt with his heels, trying to push himself under the cruiser where she couldn't get him. And now, Paulie, you're going to be a good little Do-Bee and follow the scenario. assassin
--- End Message ---
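One possible explanation for a reported byte that hexdump cannot find, offered as an assumption and not as a diagnosis of this particular message: the byte may only come into existence after a transfer encoding is decoded. The sketch below (Python 3, standard library only, with made-up sample strings) shows that quoted-printable text, base64 text, and an RFC 2047 encoded Subject: value are all pure ASCII on disk, yet each decodes to data containing 0xb0.

```python
import base64
import quopri
from email.header import decode_header

# Made-up samples; none of these are taken from the attached message.
qp_body = b"Only 30=B0 today"                      # quoted-printable text
b64_body = base64.b64encode(b"Only 30\xb0 today")  # base64 text
subject = "=?iso-8859-1?q?30=B0_today?="           # RFC 2047 encoded word


def has_raw_b0(data: bytes) -> bool:
    """True if the literal byte 0xb0 occurs, i.e. hexdump would show it."""
    return b"\xb0" in data


decoded_qp = quopri.decodestring(qp_body)
decoded_b64 = base64.b64decode(b64_body)
decoded_subject, charset = decode_header(subject)[0]

# The encoded forms contain no 0xb0 byte; the decoded forms do.
print(has_raw_b0(qp_body), has_raw_b0(decoded_qp))
print(has_raw_b0(b64_body), has_raw_b0(decoded_b64))
print(has_raw_b0(subject.encode("ascii")), has_raw_b0(decoded_subject))
```

If this is what is happening, searching the raw file for the text =B0, or decoding each MIME part before inspecting it, would locate the character where a search for the raw byte does not.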
_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
