OK. So Chrome is using the Unicode 6.0 character properties to determine grapheme cluster boundaries, and effectively expects the prepended vowel to be encoded after the base letter (or space, or dotted circle symbol).
But as the diacritic vowels -e and -u in Buginese are expected to be spacing in Buginese fonts, mark-to-base positioning is not used in these fonts, and they correctly define a non-zero spacing width. This means that the renderer effectively needs to take the expected reordering of glyphs into account in order to prepend the vowel -e. As far as I have seen, the Windows 7 builtin text renderer does not do that for Buginese, and Chrome's builtin renderer does not do it either (is it Pango?). In other words, we need a bug fix in text renderers for the support of Buginese, even though the script was encoded long ago.

Hmmmm... This means that there is no software support for the script for now. And this may explain why Buginese texts have so far been encoded in a way that assumes the same exception to the logical order as in Thai, Lao and Tai Viet, i.e. these texts are encoded in visual order...

What would be the behavior of a font that uses GSUB entries (or ligatures) in a feature to implement the reordering that NO renderer currently implements for Buginese? What will happen later if the renderer does implement it? Shouldn't we define this feature under the "bugi" script ID, so that future renderers simply ignore it if they implement the reordering themselves?

Does the OpenType specification allow specifying a temporary override for the missing renderer reordering capabilities? I.e. can we tag the feature so that it is specifically ignored by renderers implementing the reordering themselves? Or at least say that the feature overrides the renderer's builtin reordering, so that both reorderings are never applied simultaneously (in the font feature and in the renderer itself)? Shouldn't the OpenType specification define such a thing, to allow a smooth transition and compatibility of fonts made for compliant and non-compliant renderers?
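To make the expected reordering concrete, here is a minimal sketch in Python of the logical-to-visual step that a renderer would have to perform for the vowel -e (U+1A19). The function name is hypothetical, and it works on code points only to keep the illustration short; an actual renderer would reorder glyphs after character-to-glyph mapping, and this is not a description of what any existing renderer does.

```python
import unicodedata

# U+1A19 BUGINESE VOWEL SIGN E: encoded after its base, displayed on its left.
BUGINESE_VOWEL_SIGN_E = "\u1A19"


def logical_to_visual(text: str) -> str:
    """Move each U+1A19 in front of the base character it follows.

    Hypothetical helper for illustration only. A "base" here is simply any
    character whose general category is not a mark (Mn, Mc, Me).
    """
    out = []           # characters in visual order
    base_index = None  # position in `out` of the current cluster's base
    for ch in text:
        if ch == BUGINESE_VOWEL_SIGN_E and base_index is not None:
            # Prepend the vowel sign to the current cluster.
            out.insert(base_index, ch)
            base_index += 1  # the base has shifted one position right
        else:
            if not unicodedata.category(ch).startswith("M"):
                base_index = len(out)  # a new cluster starts here
            out.append(ch)
    return "".join(out)
```

With this sketch, the logical sequence U+1A00 U+1A19 (KA followed by -e) comes out as U+1A19 U+1A00. Text that was already typed in visual order would be reordered a second time, which is exactly the double-reordering problem raised above.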
Note: The Microsoft Font Validator (found on the Microsoft Typography website, in the section for downloadable tools) still does not recognize bit 96 of the ulUnicodeRange field, officially defined for the Buginese block (U+1A00..U+1A1F), and reports an error if this bit is set. And the Fonts folder in Windows 7 Explorer does not say that the font effectively supports Buginese (a Buginese font is reported as supporting no script at all, even if all code points assigned in the Buginese block are mapped and bit 96 is set in the Unicode ranges of the header).

Apparently, this Microsoft Font Validator, as well as the Windows Explorer extension for the Fonts folder, do not match the current OpenType specification published by... Microsoft, but only the much older specification currently implemented in Windows (as if the bit were still reserved). This is the case for all ulUnicodeRange bits now defined after bit 87, i.e. the Deseret block of the UCS, meaning that the validator, the Windows 7 text renderer and the Fonts Explorer are still based only on the (now very old) Unicode 4.0 of... 2003 (with the Deseret additions), or even on the earlier Unicode 3.1 of 2001 only. Who's late?

Another bug of the font validator (I don't know where to report it, because the Microsoft page does not contain any link for posting comments): it throws exceptions when parsing floating-point numbers in strings found in the font header, when running on a French version of Windows. Apparently, it chokes on the full stop found in a version number and expects a comma, because it does not properly set the US locale and uses the current user locale instead. As a workaround, I have to start the validator from an Explorer window only after I have set the user locale to US English (with the language bar).

-- Philippe.

2011/7/24 Peter Constable <[email protected]>:
> In the OpenType model, a distinction is made between font-specific behaviours
> and font-neutral script behaviours.
> OpenType Layout tables were designed to
> deal with only font-specific details, leaving it to OTL client software to
> handle anything that is font-neutral.
>
> Re-ordering of prepended Buginese vowel /e/ is a font-neutral behaviour. More
> generally, re-ordering in Brahmi-derived scripts is considered a font-neutral
> behaviour, and OpenType Layout does not include means to describe the
> re-ordering of characters. (You could fake things out by creating ligature
> glyphs for entire syllables, but that isn't generally recommended.)
>
> So, if you're not seeing Buginese script text rendering as expected
> specifically wrt the re-ordering issue, that's an issue with the rendering
> software--a bug if the software claims to support Buginese, a limitation if
> it doesn't.
>
>
> Peter
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On
> Behalf Of verdy_p
> Sent: Saturday, July 23, 2011 9:13 AM
> To: Unicode Mailing List
> Subject: Prepending vowel exception in Lontara/Buginese script ?
>
> If I look in the Unicode 6.0 charts for the Buginese script, I see that vowel
> /e/ (U+1A19) is prepended visually on the left of the base consonant to
> which it applies. This should mean that the vowel has to be encoded
> logically in texts AFTER the base consonant to which it applies.
>
> However, I have tested all fonts available on the web for this script, and
> none of them contain the necessary OpenType substitution feature needed to
> make the logical-to-visual reordering.
>
> Is this a bug of these fonts (most of them are TrueType only, not OpenType
> with a reordering feature like those used in other Indic scripts, but built
> like basic TrueType fonts for Thai, Lao and Tai Viet scripts, that are the
> only scripts for which Unicode has defined the "Prepended Vowel" exception)?
>
> Or is it a bug/limitation of text renderers?
>
> I note for example that Chrome correctly uses Unicode 6.0 default grapheme
> cluster boundaries, when editing and selecting in Lontara text (written in
> Buginese or Makassar languages), so that the vowel will be selected/deleted
> logically along with the base character encoded before it (for example a
> space or punctuation, or even an HTML syntax character). But if I use this
> browser to display Lontara text, the vowel /e/ is still shown with the
> diacritic on the right of the base consonant (or dotted circle symbol),
> meaning that the text is garbled when I use any one of those available fonts.
>
> All texts in Makassar or Buginese I have found, encoded in Unicode, seem to
> assume the visual order (i.e. the same "prepended vowel" exception as in Thai
> and Lao).
> Given the geographical area where the Lontara script is mostly used
> (Indonesia and Thailand), it seems quite logical that text authors assumed
> this exception to the logical encoding order.
>
> What can be done? Should the fonts be corrected to include the OpenType
> feature, or should Unicode be modified to include the "prepended vowel"
> exception also for Buginese, and so the default grapheme boundaries modified
> as well, and the Unicode 6.0 chart modified too for U+1A19?
>
> -- Philippe.

