What should happen with "\N{LATIN SMALL LIGATURE IJ}" =~ /(i)(j)/i

karl williamson Sun, 05 Sep 2010 14:35:40 -0700

"\N{LATIN SMALL LIGATURE IJ}" =~ /ij/i

matches, as the fold of the ligature is 'ij'. But if you simply add capturing parentheses, as in this post's subject line, it becomes somewhat nonsensical, as each captured group should match some part of the indivisible character LATIN SMALL LIGATURE IJ. And the problem is not restricted to ligatures: when a string and a pattern are compared using their NFD normalizations, the captured portion that matched may not be in the original string.
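
To make the NFD point concrete, here is a minimal sketch in Python rather than Perl. It assumes a hypothetical input that is not from the post above, the precomposed character U+00E9 (LATIN SMALL LETTER E WITH ACUTE), and it normalizes by hand before matching; it is not meant to show how any particular regex engine handles this internally.

```python
import re
import unicodedata

# Hypothetical example string (an assumption, not from the original post):
# one precomposed character, U+00E9 LATIN SMALL LETTER E WITH ACUTE.
original = "\u00e9"

# NFD decomposes it into 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", original)

# Match a capturing pattern against the normalized form of the string.
match = re.match(r"(e)", decomposed)
captured = match.group(1)

print(len(original), len(decomposed))  # code point counts before and after NFD
print(captured in decomposed)          # the capture is part of the NFD form
print(captured in original)            # but it is not part of the original
```

The captured 'e' exists only in the decomposed form, so there is no substring of the original string that the group could be said to correspond to.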


I didn't see any reference to this in the ICU documentation.
