"\N{LATIN SMALL LIGATURE IJ}" =~ /ij/i

matches, as the fold of the ligature is 'ij'. But if you simply add capturing parentheses, as in this post's subject line, the match becomes somewhat nonsensical: each captured group would have to match some part of the indivisible character LATIN SMALL LIGATURE IJ. Nor is the problem restricted to ligatures. When the comparison is done on the NFD normalizations of the string and the pattern, the captured portion may not occur in the original string at all.
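Both halves of the problem can be sketched in a few lines of Perl. This is only an illustration under stated assumptions, not a claim about what any engine ought to do. It assumes Perl 5.14 or later and uses LATIN SMALL LIGATURE FI (U+FB01) as the stand-in, since CaseFolding.txt lists a full fold to 'fi' for it and perlre uses the same character for this case. The variable names are made up for the example.

```perl
use strict;
use warnings;
use charnames ':full';
use Unicode::Normalize qw(NFD);

# Assumption: U+FB01 stands in for "a ligature with a multi-character fold".
my $fi = "\N{LATIN SMALL LIGATURE FI}";

# The two-character fold matches when the pattern keeps it together.
my $plain = ($fi =~ /fi/i) ? 1 : 0;

# Splitting the fold across two capture groups: perlre documents that Perl
# does not match here, because neither group can own half of one character.
my $split = ($fi =~ /(f)(i)/i) ? 1 : 0;

# The NFD variant of the problem: match against the decomposed string,
# then look for the captured text in the original (precomposed) string.
my $orig = "\N{LATIN SMALL LETTER E WITH ACUTE}";   # single code point U+00E9
my $nfd  = NFD($orig);                              # "e" followed by U+0301
my ($captured) = $nfd =~ /(e)/;
my $in_original = (index($orig, $captured) >= 0) ? 1 : 0;

printf "unsplit fold matches:          %d\n", $plain;
printf "split across groups matches:   %d\n", $split;
printf "NFD capture found in original: %d\n", $in_original;
```

The last check is the one that matters for captures: the group holds a bare 'e', which is a substring of the NFD form but not of the string the caller actually supplied.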

I didn't see any reference to this in the ICU documentation.
