https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7091
--- Comment #4 from John Hardin <[email protected]> ---
(In reply to Mark Martinec from comment #3)
> > >   body CRAZY_EURO /€uro/
> > >   header SUBJ_CREDIT_FR Subject =~ /crédit/
> >
> > So... how do we make rules aware of whether or not normalize_charset is
> > enabled?
>
> The same way as making them aware of original encoding on a text - you can't.
>
> I have been asking myself the same question - and I think the question
> is wrong. There is no difference (from rules viewpoint) between text
> that is originally encoded as UTF-8 (or plain US-ASCII) and a text that
> is transcoded into UTF-8 from some other character set by normalize_charset.

I apologize, it appears my question was unclear. Let me try again.

normalize_charset is a local configuration option - it can be disabled. A rule
written for use when normalize_charset is enabled will generally be simpler
than one that needs to deal directly with multiple encodings.

Is there a way to write rule alternatives such that one will be used when the
normalize_charset option is enabled and the other when it is not? I'm
wondering if there is something similar to rule variants that use or avoid a
perl-5.10-ism, switched by
"if can(Mail::SpamAssassin::Conf::perl_min_version_5010000)".

Is there no way to intelligently choose between different rules based on such
configuration options? That kinda leaves us with unwelcome alternatives:

- write for one mode and ignore the other (the rule will probably be broken
  under the other mode if you write to normalized text, or inefficient and
  complex if you write to non-normalized text), or
- write two rules (which doubles the scanning work - not recommended at all).

Do we need a "can(Mail::SpamAssassin::Conf::normalize_enabled)" or some such?
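
To make the question concrete, here is a rough sketch of what such a switch might look like in a rules file. Note that normalize_enabled is hypothetical: it is only the name floated above, and no such capability is known to exist in Mail::SpamAssassin::Conf. The byte-level alternative in the else branch is also just an assumption for illustration, covering only "é" as raw UTF-8 (\xc3\xa9) or ISO-8859-1 (\xe9) and ignoring every other encoding.

```
# HYPOTHETICAL: normalize_enabled is not an existing capability; it stands in
# for whatever test the project might provide for this purpose.
if can(Mail::SpamAssassin::Conf::normalize_enabled)
  # normalize_charset on: the text has been transcoded to UTF-8, so one
  # simple pattern is enough (assumes this rules file is itself UTF-8).
  header SUBJ_CREDIT_FR  Subject =~ /crédit/
else
  # normalize_charset off: match the raw bytes of each encoding we care about.
  # Assumed here: UTF-8 and ISO-8859-1 only.
  header SUBJ_CREDIT_FR  Subject =~ /cr(?:\xc3\xa9|\xe9)dit/
endif
```

Written this way, only one of the two definitions is ever compiled, so there is no double scanning cost, and the rule keeps a single name for scoring in either mode.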
