[pcre-dev] [Bug 2527] New: Incomplete unicode handling in pcre2_substitute when converting to upper/lower case

admin Mon, 17 Feb 2020 04:26:51 -0800

https://bugs.exim.org/show_bug.cgi?id=2527


            Bug ID: 2527
           Summary: Incomplete unicode handling in pcre2_substitute when
                    converting to upper/lower case
           Product: PCRE
           Version: 10.34 (PCRE2)
          Hardware: All
                OS: All
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: p...@hermes.cam.ac.uk
          Reporter: kkil...@gmail.com
                CC: pcre-dev@exim.org

According to Philip, it should be possible to set PCRE2_UCP without setting
PCRE2_UTF. In that case the desired behaviour is that Unicode character
properties are still used, although surrogates may not be handled correctly and
invalid Unicode may be present.

This does not work in pcre2_substitute, which explicitly tests

if (utf)

before doing the upper/lower case conversion. I would suggest testing for
PCRE2_UCP there instead, or removing the "if" entirely in the Unicode case.
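
To make the difference concrete, here is a small self-contained sketch of the
two decisions. It is not the PCRE2 source: the SKETCH_* option bits, the
sketch_* helpers and the force_upper_* functions are all hypothetical names
made up for this illustration, and the "Unicode" table is a toy one that only
knows ASCII and Latin-1. It assumes one 16-bit code unit is converted at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical option bits for this sketch; NOT the real PCRE2 values. */
#define SKETCH_UTF 0x1u
#define SKETCH_UCP 0x2u

/* Toy stand-in for a Unicode case lookup: ASCII and Latin-1 letters only. */
static uint32_t sketch_unicode_upper(uint32_t ch)
{
    if (ch >= 0x61 && ch <= 0x7A) return ch - 0x20;               /* a-z */
    if (ch >= 0xE0 && ch <= 0xFE && ch != 0xF7) return ch - 0x20; /* U+00E0..U+00FE, skipping U+00F7 */
    return ch;
}

/* Toy stand-in for the non-Unicode, table-driven path: ASCII only. */
static uint32_t sketch_table_upper(uint32_t ch)
{
    if (ch >= 0x61 && ch <= 0x7A) return ch - 0x20;
    return ch;
}

/* Reported behaviour: only the UTF flag selects the Unicode path. */
uint32_t force_upper_reported(uint32_t ch, uint32_t options)
{
    assert(ch <= 0xFFFFu); /* the sketch models a single 16-bit code unit */
    bool utf = (options & SKETCH_UTF) != 0;
    return utf ? sketch_unicode_upper(ch) : sketch_table_upper(ch);
}

/* Suggested behaviour: either UTF or UCP selects the Unicode path. */
uint32_t force_upper_suggested(uint32_t ch, uint32_t options)
{
    assert(ch <= 0xFFFFu);
    bool unicode = (options & (SKETCH_UTF | SKETCH_UCP)) != 0;
    return unicode ? sketch_unicode_upper(ch) : sketch_table_upper(ch);
}
```

With only SKETCH_UCP set, force_upper_reported() leaves U+00E4 unchanged,
while force_upper_suggested() maps it to U+00C4. With SKETCH_UTF set, both
give the same result.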

We simply removed the "if" in our build. With that change, pcre2_substitute
behaves like classic UCS-2 upper/lower case conversion.

