https://bugs.exim.org/show_bug.cgi?id=2527
Bug ID: 2527
Summary: Incomplete unicode handling in pcre2_substitute when converting to upper/lower case
Product: PCRE
Version: 10.34 (PCRE2)
Hardware: All
OS: All
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: p...@hermes.cam.ac.uk
Reporter: kkil...@gmail.com
CC: pcre-dev@exim.org

According to Philip, it should be possible to set PCRE2_UCP without setting PCRE2_UTF. In that case the desired behaviour is that Unicode character properties are taken into account, although surrogates may not be handled correctly and invalid Unicode may be present.

This does not work in pcre2_substitute, where the upper/lower case conversion explicitly tests if (utf). I would suggest testing for PCRE2_UCP instead, or dropping the "if" altogether in the Unicode case.

We simply removed the "if". This makes pcre2_substitute behave like classical UCS-2 upper/lower case conversion.
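
To illustrate the difference between the reported and the suggested condition, here is a minimal, self-contained model in C. It is not the PCRE2 source: all function names are hypothetical, the Unicode lookup is a toy table covering only a few Latin-1 and Greek letters, and the non-Unicode fallback is simplified to plain ASCII. The two flags stand in for whether PCRE2_UTF and PCRE2_UCP were set at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified fallback used when no Unicode handling is active:
   only ASCII letters are converted. */
uint32_t toy_ascii_upper(uint32_t c)
{
    return (c >= 'a' && c <= 'z') ? c - 0x20 : c;
}

/* Hypothetical stand-in for a Unicode property lookup.
   Covers only Latin-1 (U+00E0..U+00FE, except the division sign)
   and Greek (U+03B1..U+03C9, except final sigma). */
uint32_t toy_unicode_upper(uint32_t c)
{
    if (c >= 0x00E0 && c <= 0x00FE && c != 0x00F7) return c - 0x20;
    if (c >= 0x03B1 && c <= 0x03C9 && c != 0x03C2) return c - 0x20;
    return toy_ascii_upper(c);
}

/* Model of the behaviour described in the report:
   Unicode case conversion happens only when utf is set. */
uint32_t upper_as_reported(uint32_t c, bool utf, bool ucp)
{
    (void)ucp; /* ignored, which is the reported problem */
    if (utf) return toy_unicode_upper(c);
    return toy_ascii_upper(c);
}

/* Model of the suggested change: ucp alone is enough. */
uint32_t upper_as_suggested(uint32_t c, bool utf, bool ucp)
{
    if (utf || ucp) return toy_unicode_upper(c);
    return toy_ascii_upper(c);
}

/* Usage example: U+00E9 (e with acute) with ucp set but utf unset. */
void check_model(void)
{
    assert(upper_as_reported(0x00E9, false, true) == 0x00E9);  /* left unchanged */
    assert(upper_as_suggested(0x00E9, false, true) == 0x00C9); /* converted */
    assert(upper_as_reported('a', false, true) == 'A');        /* ASCII still works */
}
```

In this model the two variants differ only for non-ASCII characters when ucp is set and utf is not, which is the configuration the report is about.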