https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656
--- Comment #4 from Henrik Krohns <apa...@hege.li> ---
So getting back to this. I've been running my SA with normalize_charset 1 without any ill effects so far. Should we head towards activating it by default in 4.0.0?

The only thing left after that would be documenting what format .cf files are expected to be in. Probably just "bytes" without any special encoding? For anything other than personal use, pure ASCII should be used for portability (non-ASCII characters should be written in \xff format).

To be compatible with both normalize_charset 0 and 1, it should be clearly documented that any rules expected to hit Latin-1 extended characters need to be written to include both the Latin-1 and UTF-8 forms: "ä" -> (?:\xe4|\xc3\xa4). We could also detect this automatically from the rules and output a warning that it should be fixed.

One thing to consider would be removing the whole normalize_charset option and just forcing everything to be normalized, plain and simple.
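As a rough illustration of the dual-encoding idea above, here is a minimal Python sketch (not SpamAssassin code, which is written in Perl). It builds the Latin-1/UTF-8 alternation for a single character and flags rule lines that contain raw non-ASCII bytes, which is the kind of check a warning could be based on. The function names `byte_escape`, `dual_encoding_pattern` and `has_raw_non_ascii` are hypothetical and exist only for this example.

```python
import re


def byte_escape(data: bytes) -> str:
    """Render bytes as \\xNN escapes, the portable ASCII-only form."""
    return "".join(f"\\x{b:02x}" for b in data)


def dual_encoding_pattern(ch: str) -> str:
    """Return a regex fragment matching ch as either Latin-1 or UTF-8 bytes.

    Raises UnicodeEncodeError if ch is not representable in Latin-1.
    """
    latin1 = ch.encode("latin-1")
    utf8 = ch.encode("utf-8")
    if latin1 == utf8:
        # Plain ASCII: both encodings are identical, no alternation needed.
        return re.escape(ch)
    return f"(?:{byte_escape(latin1)}|{byte_escape(utf8)})"


def has_raw_non_ascii(line: bytes) -> bool:
    """True if a rule line contains any byte above 0x7F (assumed warning trigger)."""
    return any(b > 0x7F for b in line)


if __name__ == "__main__":
    # Build the fragment for U+00E4 (a with diaeresis) and print it.
    print(dual_encoding_pattern("\u00e4"))
    # A rule line with a raw UTF-8 byte sequence would be flagged.
    print(has_raw_non_ascii(b"body FOO /caf\xc3\xa9/"))
```

The generated fragment is pure ASCII, so a rule built from it reads the same regardless of how the .cf file itself is decoded, and it matches the character whether the message body arrives as Latin-1 or as UTF-8 bytes.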