https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5590
--- Comment #32 from Warren Togami <[email protected]> 2010-01-18 08:26:41 UTC ---
(In reply to comment #28)
> Not surprisingly it affects Bayes, but only as slightly as the rules. Probably
> tokens containing highbits etc. It's simple to test with sa-learn and
> comparing dumps.

I would imagine that treating multi-byte characters as individual bytes might bite us in ways similar to Bug 6183. Various control characters, or characters considered non-word characters, can occur as the second byte, screwing up tokenization.
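
To make the concern concrete, here is a minimal sketch in Python rather than SpamAssassin's own Perl. It is not the Bayes tokenizer; the helper names and the simple \w+ word rule are assumptions chosen only for illustration. It tokenizes the same word once as decoded characters and once as raw UTF-8 bytes reinterpreted one byte per character (Latin-1), so the second byte of a two-byte character can land on a non-word character.

```python
import re

# Hypothetical word rule for illustration only: a token is a run of word characters.
WORD = re.compile(r"\w+")


def tokenize_as_chars(text):
    """Tokenize text that has been decoded properly into characters."""
    return WORD.findall(text)


def tokenize_as_bytes(text, encoding="utf-8"):
    """Tokenize the encoded bytes as if every byte were its own character.

    Decoding as Latin-1 maps each byte 0x00-0xFF to exactly one character,
    which mimics byte-oriented processing of multi-byte text.
    """
    raw = text.encode(encoding)
    return WORD.findall(raw.decode("latin-1"))


if __name__ == "__main__":
    word = "p\u00e2le"  # "pale" with a-circumflex; U+00E2 is bytes C3 A2 in UTF-8
    print(tokenize_as_chars(word))  # one token
    print(tokenize_as_bytes(word))  # split where byte A2 reads as a non-word character
```

Here the second byte, 0xA2, reads as a cent sign in Latin-1, which is not a word character, so the byte-wise pass breaks one word into two tokens. Other continuation bytes fall in the 0x80-0x9F range, which Latin-1 treats as control characters.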
