Such an example shows that ignoring umlauts makes the document counterintuitive. Nobody can tell whether "Paper" refers to a person here or whether the author actually meant a sheet of paper or an article... At the very least he should have written "Paeper", which would be more correct (if "Christoph Päper" is German, the umlaut is equivalent to a following "e"), or even "Christoph Paper".
Apply that to the Kazakh language and attempt to drop the apostrophes (because they very commonly cause technical issues in software): I'm sure you'll see problems of interpretation or too many ambiguous spellings, which the use of the acute would have avoided.

All software today is "8-bit clean" and supports at least ISO 8859-1 or Windows-1252, if it does not support multibyte UTF-8. The era of 7-bit ASCII ended long ago, except on very old systems, which were never used for Kazakh in Cyrillic anyway. So acute accents are more likely than ASCII apostrophes to survive technical software constraints, notably if the accented Latin letters come from the ISO 8859-1 subset, whose code points also fit in 8 bits in Unicode.

Even in UTF-8, these accented Latin letters (from any ISO 8859-* subset) are 2 bytes wide, exactly the same encoding size as a basic letter plus an ASCII quote, and encoding size is definitely not an issue anywhere (all existing Kazakh Cyrillic letters already use a 2-byte encoding in UTF-8, as all their assigned code point values are higher than 0x7F but lower than 0x800). Choosing the ASCII quote for this "apostrophe" will not save anything; and the regular Unicode apostrophe U+2019 would need... 3 bytes after the 1-byte basic Latin letter from ASCII (so it is worse!).

Choosing the acute accent above Latin letters from ISO 8859-* would avoid this issue, because these letters are precomposed, and the usual preferred representation in UTF-8 is NFC, which uses a single code point for each of them. JavaScript, Java, or C/C++ "wide string" types will also handle these characters with a single code unit (so the measured string "length" matches the number of letters).
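To check the sizes mentioned above, here is a minimal Python sketch. It is only my own illustration: the helper names utf8_len and utf16_units are hypothetical, and the sample letters (a with acute, Cyrillic schwa) are just convenient examples, not a complete alphabet.

```python
import unicodedata

def utf8_len(s: str) -> int:
    """Number of bytes needed to encode s in UTF-8."""
    return len(s.encode("utf-8"))

def utf16_units(s: str) -> int:
    """Number of 16-bit code units, as counted by Java or JavaScript string lengths."""
    return len(s.encode("utf-16-le")) // 2

# Sample spellings of one "letter" (illustrative choices only).
precomposed = "\u00e1"   # a with acute, present in ISO 8859-1
ascii_quote = "a'"       # basic letter + ASCII quote U+0027
typographic = "a\u2019"  # basic letter + apostrophe U+2019
cyrillic = "\u04d9"      # Cyrillic small letter schwa

for label, text in [
    ("precomposed acute", precomposed),
    ("letter + ASCII quote", ascii_quote),
    ("letter + U+2019", typographic),
    ("Cyrillic schwa", cyrillic),
]:
    print(f"{label}: {utf8_len(text)} UTF-8 bytes, {utf16_units(text)} UTF-16 code unit(s)")

# NFC folds a base letter followed by a combining acute (U+0301)
# into the single precomposed code point.
decomposed = "a\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), "code points before NFC,", len(composed), "after")
```

The precomposed letter and the letter plus ASCII quote both take 2 bytes in UTF-8, the U+2019 variant takes 4, and only the precomposed letter counts as a single unit for string length.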
You will also avoid the SQL injection problems that arise on web sites if you have to allow unfiltered ASCII quotes in data input forms to represent the proposed Kazakh orthography: with the acute, you can continue to reject all ASCII quotes in input forms, and people won't be forced to use the alternate U+2019, which is not found on their basic keyboards, nor will they substitute a hyphen or space for it or drop it completely; they'll just type letters with acute accents with a single keystroke on their Latinized keyboard.

2018-01-25 13:15 GMT+01:00 Andrew West via Unicode <unicode@unicode.org>:

> On 23 January 2018 at 00:55, James Kass via Unicode <unicode@unicode.org> wrote:
> >
> > Regular American users simply don't type umlauts, period.
>
> Not even the president of the Unicode Consortium when referring to
> Christoph Päper:
>
> http://www.unicode.org/L2/L2018/18051-emoji-ad-hoc-resp.pdf
>
> Andrew