U+015F LATIN SMALL LETTER S WITH CEDILLA (ş) has the following annotations:
  • Turkish, Azerbaijani, Romanian, ...
  • this character is used in both Turkish and Romanian data
  • a glyph variant with comma below is preferred for Romanian
and there is a cross-reference to U+0219 LATIN SMALL LETTER S WITH COMMA BELOW (ș), which has the following annotation:
  • Romanian, when distinct comma below form is required
Those characters have the expected canonical decompositions, with combining cedilla (U+0327) and combining comma below (U+0326) respectively, so they are entirely distinct characters as far as Unicode is concerned. However, the last annotation on U+015F suggests they are the same character with two glyph variants. What is the truth?
  • Is a glyph with a comma below a correct representation of U+015F, as the annotation suggests? Of course, a font with such a glyph would not be usable for languages other than Romanian.
  • Should the annotations be interpreted (and perhaps changed) as something like: "U+015F is not used in Romanian, you are probably looking for U+0219; however, data encoded prior to Unicode 3.0 may have incorrectly used U+015F instead of U+0073 U+0326"?
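
For reference, the decompositions mentioned above can be checked with a short Python sketch. It relies only on the standard unicodedata module; the helper name describe is mine and purely illustrative, and the results reflect whatever Unicode version the Python build ships with.

```python
import unicodedata

S_CEDILLA = "\u015F"  # LATIN SMALL LETTER S WITH CEDILLA
S_COMMA = "\u0219"    # LATIN SMALL LETTER S WITH COMMA BELOW


def describe(ch):
    """Return the character name and its NFD form as a list of code points."""
    nfd = unicodedata.normalize("NFD", ch)
    return unicodedata.name(ch), [f"U+{ord(c):04X}" for c in nfd]


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


for ch in (S_CEDILLA, S_COMMA):
    name, parts = describe(ch)
    print(f"U+{ord(ch):04X} {name}: {' '.join(parts)}")

# Normalization never maps one of these characters onto the other.
print("canonically equivalent:", canonically_equivalent(S_CEDILLA, S_COMMA))
```

The two decompositions differ in the combining mark (U+0327 versus U+0326), so no normalization form unifies the characters; any unification would have to happen at the glyph level, which is what the annotation seems to imply.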

Thanks,
Eric.

