[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2022-03-06 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Henrik Krohns changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2021-05-29 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Henrik Krohns changed: What|Removed |Added Severity|blocker |normal Target

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2021-05-29 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #16 from Henrik Krohns --- Changed normalize_charset 1 as default and added some docs. Sending trunk/UPGRADE Sending trunk/lib/Mail/SpamAssassin/Conf.pm Sending

ATTN: BUG BUMP: [Bug 7656] UTF8 rules, normalize_charset etc overhaul

2021-04-15 Thread Bill Cole
This bug is part of the complex related to smoothing out all the edge and corner cases of character set encoding for v4. There is some concern that changing the default for normalize_charset (to enable it) or even removing the switch altogether to nail down documentation of how to match

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2021-04-15 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Bill Cole changed: What|Removed |Added CC||billc...@apache.org --- Comment #15

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2021-04-15 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #14 from Henrik Krohns --- Good to hear, I cast my official +1 for normalize_charset 1 too. There don't seem to be any dependencies: Encode::Detect can still remain optional, and the required HTML::Parser 3.46 is from 2005. Will

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2021-04-15 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Giovanni Bechis changed: What|Removed |Added CC||giova...@paclan.it --- Comment

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2021-04-15 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #12 from Henrik Krohns --- Bumping this bug. Comments? Monologs are getting a bit tiresome.. :-) -- You are receiving this mail because: You are the assignee for the bug.

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-09 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Henrik Krohns changed: What|Removed |Added Blocks|4745| Depends on|

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-09 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Henrik Krohns changed: What|Removed |Added Depends on||6234, 7072

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-08 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Henrik Krohns changed: What|Removed |Added Blocks||4745

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #11 from Henrik Krohns --- This bug already floods the dev@ list; if someone wants to chime in, feel free. I have no intention of spending time posting on users@ at this stage, when it's still only an idea and much is left to do.

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #10 from Kevin A. McGrail --- Well for it to be the default in 4.0.0, I'd like it to be discussed on list, please.

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #9 from Henrik Krohns --- Well yes, that pretty much sums up what was already said in this bug. You can't expect to match extended ASCII characters like before. It's nothing but a documentation issue.
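
The point about extended ASCII characters can be illustrated outside SpamAssassin. The following is a minimal Python sketch, not SpamAssassin code: the sample word and both patterns are hypothetical, chosen only to show why a byte pattern written for a single-byte charset may stop matching once the same text is represented as UTF-8.

```python
import re

# Hypothetical sample text: "é" is one byte in Latin-1 but two bytes in UTF-8.
text = "café"
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'

# A byte-oriented pattern written with the Latin-1 encoding in mind.
latin1_rule = re.compile(rb"caf\xe9")
# The same word expressed as a UTF-8 byte sequence.
utf8_rule = re.compile(rb"caf\xc3\xa9")

# The Latin-1 pattern matches Latin-1 bytes but not the UTF-8 bytes.
latin1_rule_hits_latin1 = bool(latin1_rule.search(latin1_bytes))
latin1_rule_hits_utf8 = bool(latin1_rule.search(utf8_bytes))
# The UTF-8 pattern matches once the text is UTF-8.
utf8_rule_hits_utf8 = bool(utf8_rule.search(utf8_bytes))
```

Under these assumptions, a rule written against single-byte input would need to be rewritten against the UTF-8 form once all input is normalized, which is consistent with the "documentation issue" framing in the comment.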

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #8 from Kevin A. McGrail --- Sure, here's one example: #ZWNJ #ZWNJ 200C 157 https://en.wikipedia.org/wiki/Windows-1256 # Also want to look at Unicode U+200C. # Also 'zero-width joiner' which is Windows-1256 0x9E and Unicode
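
The byte and code point values quoted in this example (157 and 200C) can be related with a short Python sketch. It relies only on Python's built-in `cp1256` codec; the variable names are illustrative, and nothing here reflects how SpamAssassin itself performs the conversion.

```python
# Windows-1256 byte 157 (0x9D), as quoted in the comment.
zwnj = bytes([157]).decode("cp1256")

# Code point of the decoded character, formatted as U+XXXX.
zwnj_codepoint = f"U+{ord(zwnj):04X}"

# The same character as a UTF-8 byte sequence: three bytes instead of one.
zwnj_utf8 = zwnj.encode("utf-8")
```

A rule that looks for the single Windows-1256 byte would therefore see a different byte sequence when the message text has been converted to UTF-8, which is one plausible reason such rules fire differently with normalize_charset enabled.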

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #7 from Henrik Krohns --- (In reply to Kevin A. McGrail from comment #6) > I know I have some rules that fire differently with normalize_charset. Could you show some examples?

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Kevin A. McGrail changed: What|Removed |Added CC||kmcgr...@apache.org --- Comment

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #5 from Henrik Krohns --- I ran performance tests with mass-check; there's absolutely no difference here for normalize_charset, total duration was always within the normal ±2% variance. Rule differences between these were

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2019-08-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #4 from Henrik Krohns --- So getting back to this. I've been running my SA with normalize_charset 1 without any ill effects so far. Should we head towards activating it by default in 4.0.0? The only thing left after that would be

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2018-11-17 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Henrik Krohns changed: What|Removed |Added Blocks||7645 Referenced Bugs:

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2018-11-17 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Henrik Krohns changed: What|Removed |Added Blocks||7022 Referenced Bugs:

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2018-11-17 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #3 from Henrik Krohns --- (In reply to Henrik Krohns from comment #0) > latin1 message, no ct RULE_LATIN1 / > latin1 message, utf8 ct RULE_LATIN1 / > latin1 message, no ct RULE_UTF8 / > latin1 message, utf8 ct

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2018-11-17 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 --- Comment #2 from Henrik Krohns --- (In reply to Henrik Krohns from comment #0) > Unless people want to use multiple rules to match non-utf8 and utf8 > messages, perhaps the only sane solution would be to "upgrade" all non-utf8 > rules to

[Bug 7656] UTF8 rules, normalize_charset etc overhaul

2018-11-17 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656 Henrik Krohns changed: What|Removed |Added CC||h...@hege.li --- Comment #1 from