----- Original Message -----
From: "John Hudson" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: "'Jim Allan'" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Wednesday, October 29, 2003 6:15 PM
Subject: RE: Merging combining classes, was: New contribution N2676

> At 04:04 AM 10/29/2003, Kent Karlsson wrote:
>
> >The Latvian "cedillas" are really commas below, and are best encoded so.
> >Still for lowercase g (not for uppercase) the comma below is _rendered_
> >as a turned comma above.
>
> The 'not for uppercase' rule depends on the design of the uppercase letter.
> Typically, there is no descending portion, so the 'comma' accent goes
> below; in some handwriting typefaces and with swash letters, the G may have
> a descending stroke. In this case the accent is turned and placed above,
> just as it is for the lowercase. Of course, it is encoded as the comma
> below. The attached examples are from the version of Hermann Zapf's Zapfino
> that ships with Apple's OS X.

So Latvian "cedillas" (as well as the Romanian ones) should be encoded with the comma below and not with the cedilla. But there is a huge legacy of characters encoded with the cedilla instead of the comma below, due to the good support for Turkish and the long history of poor support for the comma below. A common example is the default ANSI charset of Windows for Latvian and Romanian, which simply does not have the comma below and forces users to encode these characters with cedillas. This in turn has pushed users to create custom fonts in which the characters coded with a cedilla are rendered with a comma-below glyph.

Even today, it is quite hard to find any Romanian or Latvian web page that uses the new Unicode characters with a comma below: even governmental sites use the characters coded with the cedilla, and they tolerate an approximate rendering of the comma below, since it causes no interpretation problems for readers. In these countries, document authors choose between the Central European and Turkish ISO charsets, and they avoid the comma-below characters because these are not rendered at all (or are displayed as a missing-glyph square) on most platforms.

For example, on Windows, the comma below is most often supported only if users have installed MS Office, which includes the "Arial Unicode MS" font capable of displaying it. Once Microsoft offers this font as a free download to all Internet Explorer users, there will be far fewer problems, and we will probably see more texts encoded correctly with the comma below. Maybe we could lobby here for Microsoft to include at least the characters for Latvian and Romanian (at least the precomposed characters, even if a decomposed comma below is not rendered correctly) in an update of its "Times New Roman", "Arial", "Verdana" and "Tahoma" web fonts.
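
To make the encoding difference discussed above concrete, here is a minimal Python sketch (not taken from the thread). It uses only the standard `unicodedata` module and Python's built-in `cp1250` and `cp1257` codecs, which I am assuming stand in for the Windows Central European and Baltic ANSI code pages. The helper names `describe` and `can_encode` are purely illustrative.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, NFD decomposition as code points)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        [f"U+{ord(c):04X}" for c in decomposed],
    )


def can_encode(text, codec):
    """True if every character of `text` exists in the given legacy codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Romanian letters: legacy cedilla forms vs. the newer comma-below forms.
S_CEDILLA, S_COMMA = "\u015F", "\u0219"
T_CEDILLA, T_COMMA = "\u0163", "\u021B"

if __name__ == "__main__":
    for ch in (S_CEDILLA, S_COMMA, T_CEDILLA, T_COMMA):
        # Show what each character is and whether cp1250 can store it.
        print(describe(ch), "encodable in cp1250:", can_encode(ch, "cp1250"))
```

With Python's codec tables, the cedilla forms encode in `cp1250` while the comma-below forms raise `UnicodeEncodeError`, which is the legacy constraint described above. The decompositions also differ: the comma-below letters decompose to U+0326 (COMBINING COMMA BELOW), the cedilla letters to U+0327 (COMBINING CEDILLA).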