I offered a suggestion on cedilla and combining undercomma:

/It seems to me that Cedilla/undercomma folding would be a useful
addition to "Character Foldings" at http://www.unicode.org/reports/tr30./

and Philippe Verdy responded:


Excellent idea, however it has to be tailored by language:

For example, Turkish and French (which almost always and consistently
prefer a cedilla) behave differently from Romanian and Latvian (which
should preferably use a comma below).

No.


Forced tailoring by language would greatly reduce the usefulness of such foldings for search purposes.

One wants to find matches for Romanian and Latvian personal names or place names or individual forms using cedilla or undercomma regardless of the language in which they are embedded.

Similarly, Turkish forms normally spelled with cedilla would be found regardless of language even if an undercomma rather than a cedilla had been used in the spelling (perhaps by error or perhaps purposely to adapt a name to Romanian or Latvian style).

One wants cedilla and undercomma to match in a search in legacy code pages regardless of the transliteration table to Unicode that is used by a particular application.
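
As a rough illustration of the language-independent folding argued for above, here is a minimal Python sketch. It is not taken from TR30; the function names `fold_cedilla_undercomma` and `search_matches` are hypothetical, and the choice to fold the undercomma to the cedilla (rather than the reverse) is an arbitrary assumption. It relies only on the standard `unicodedata` module.

```python
import unicodedata

COMBINING_COMMA_BELOW = "\u0326"
COMBINING_CEDILLA = "\u0327"


def fold_cedilla_undercomma(text: str) -> str:
    """Fold comma below to cedilla, with no language tailoring (illustrative only)."""
    # Decompose so that precomposed letters such as U+0219 or U+015F
    # expose their combining mark as a separate code point.
    decomposed = unicodedata.normalize("NFD", text)
    # Arbitrary direction: treat every comma below as a cedilla.
    folded = decomposed.replace(COMBINING_COMMA_BELOW, COMBINING_CEDILLA)
    # Recompose so the result is again in a canonical form.
    return unicodedata.normalize("NFC", folded)


def search_matches(needle: str, haystack: str) -> bool:
    """Substring search that ignores the cedilla/undercomma distinction."""
    return fold_cedilla_undercomma(needle) in fold_cedilla_undercomma(haystack)


# A Romanian place name spelled with comma below (U+0219) ...
with_comma = "Timi\u0219oara"
# ... and the same name as it might arrive from a legacy code page (U+015F).
with_cedilla = "Timi\u015foara"
print(search_matches(with_comma, "a trip to " + with_cedilla))
```

Because the folding is applied to both the search key and the text, the match does not depend on which language the name is embedded in or on which mapping table produced the Unicode text.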

Jim Allan








