http://bugs.exim.org/show_bug.cgi?id=1208

           Summary: Case folding in PCRE
           Product: PCRE
           Version: 8.30
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: wishlist
          Priority: low
         Component: Code
        AssignedTo: [email protected]
        ReportedBy: [email protected]
                CC: [email protected]

Hi,

I was wondering what the (planned?) status of case folding is in PCRE when
doing a case-insensitive match using Unicode.

For instance, "ß" (U+00DF LATIN SMALL LETTER SHARP S) should match "ss" (or
even "SS" when matching case-insensitively); "µ" (U+00B5 MICRO SIGN) should
match "μ" (U+03BC GREEK SMALL LETTER MU) or "Μ" (U+039C GREEK CAPITAL LETTER
MU).

The CaseFolding.txt file from Unicode says:

# If all characters are mapped according to the full mapping below, then
# case differences (according to UnicodeData.txt and SpecialCasing.txt)
# are eliminated.

The relevant entries for the examples above are:

0053; C; 0073; # LATIN CAPITAL LETTER S
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
00B5; C; 03BC; # MICRO SIGN
039C; C; 03BC; # GREEK CAPITAL LETTER MU

From what I can see right now, PCRE doesn't seem to do this. For starters: am
I wrong? If not, what's the overall status of such a feature? For instance,
how are the four different Turkish "i" letters handled?

Thanks,
Giuseppe D'Angelo
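
To make the expected behaviour concrete, here is a minimal sketch in Python (not PCRE code). It assumes Python's built-in str.casefold(), which applies Unicode full case folding (the C and F entries of CaseFolding.txt). The helper name caseless_equal is hypothetical, and it only compares whole strings; it says nothing about how a regex engine would handle one-to-many foldings inside a pattern.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after Unicode full case folding.

    Hypothetical helper for illustration only; it does not apply
    Unicode normalization or the Turkish-specific (T) mappings.
    """
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    # 00DF; F; 0073 0073 -> sharp s folds to "ss"
    print(caseless_equal("\u00df", "ss"))      # ß vs ss
    print(caseless_equal("\u00df", "SS"))      # ß vs SS
    # 00B5; C; 03BC and 039C; C; 03BC -> both fold to small mu
    print(caseless_equal("\u00b5", "\u03bc"))  # µ vs μ
    print(caseless_equal("\u00b5", "\u039c"))  # µ vs Μ
```

Under these assumptions, all four comparisons print True, which is the behaviour the report asks about for case-insensitive matching.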
