C J Fynn responded to John Hudson,

> If someone wants this, isn't it possible to put a specific lookup in the font
> so that any dependant vowel following a space character renders as a spacing
> (stand-alone) dependant vowel? Surely a specific lookup should overide it being
> displayed on a dotted circle by default.

Has anyone tried this? Would the space glyph U+0020 be expected to trigger a look-up in the Tamil GSUB table as if it were a Tamil base character?

The reason I haven't tried it is that, in the OpenType look-ups here for the "re-ordrant" vowel signs of Tamil, the vowel sign is "INPUT1" and the base letter is "INPUT2". This is because the rendering engine has already re-ordered the character string before this look-up is performed. It doesn't seem likely that a rendering engine would re-order a vowel sign before a space. It could be tested both ways, I suppose...

This seems to be off-topic for this list, but here it is, and it will probably keep popping up from time to time unless clarified.

I can only make inferences and suppositions based on observation of the behavior, and the reasoning behind the behavior, of the rendering engine used here, Microsoft's "Uniscribe". People who know all about this do follow this list, so they're free to offer corrections.

<inference and supposition>

Uniscribe inserts the dotted circle into the display for complex scripts in order to give a visual indication of an encoding or spelling error. This seems quite useful whether text is being entered or merely displayed. Allowing dependent vowels to follow the space character would break this utility. In other words, somebody could write a Tamil word in a web page starting with the E-vowel-sign (U+0BC6), and there'd be no indication that this is improper, either to the author or to the visitor. Someone searching for that word on that page wouldn't find it, and so on.

Maybe some kind of spell-checker should be used by the original author, but there seems to be no way to ensure that spell-checking was performed by the author of any web page one visits.

It is the very appearance of that dotted circle unexpectedly in our texts which alerts us to the fact that we have made a mistake. That dotted circle jumps out of the page into our vision exclaiming, "Hey, I'm wrong!
I'm so wrong, don't even bother running your spell-checker on me!"

This is the basis upon which Uniscribe renders text which includes dependent vowel signs, not just for Tamil, but for the other so-called "complex" scripts, too. The dotted circle plus the matra is the default rendering for combining marks *in isolation*. Uniscribe seems to rightly treat a vowel sign following a space as being in isolation, and how could it do otherwise?

What goes for the space character also seems to go for any other character which is not a valid character *within the Unicode range*. Again, how could it be otherwise? If the first character in a string isn't a Tamil character, there's no reason for the renderer to consult the Tamil OpenType tables in a font. If it did, my gosh, imagine all the pointless look-ups just to display a page which was, for example, mostly Chinese with a few Tamil phrases.

<end of supposition and inference>

The good folks engineering Uniscribe have been most responsive to all kinds of special requests and pointers related to complex script shaping. I think asking them to break the existing mechanism in order to support vowel signs on spaces asks too much, though.

People generating texts for educational purposes will always have special needs. So, they'll always need to make a special effort to get special effects. Workarounds concerning the original question have already been suggested.

If this is treated as a Unicode issue rather than a display issue, then (back on topic a little bit) one solution would be for someone to propose a new character, COMBINING DOTTED CIRCLE FOR COMBINING MARKS. Then, rather than inserting DOTTED CIRCLE into the display, a rendering engine could be changed to insert this new character. Then, these updated rendering engines could be distributed, and font developers could add the new character to fonts and distribute updated fonts.
This might just take a while, but it wouldn't be too hard to find examples of the character in actual text use to accompany the proposal...

"If it ain't broke, don't fix it." So, is it 'broke'?

Best regards,
James Kass
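
For readers who want to experiment with the behavior discussed in this message, here is a minimal Python sketch of the dotted-circle convention as the message supposes it to work: a Tamil dependent vowel sign that does not directly follow a Tamil consonant is treated as isolated and gets U+25CC DOTTED CIRCLE placed in front of it. This is only an illustration under stated assumptions. It is not Uniscribe's actual algorithm, the function names are hypothetical, it works on the logical (stored) character order only, and it ignores visual re-ordering, the pulli (U+0BCD), and other marks.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"

# Tamil dependent vowel signs (matras), taken from the Tamil block.
VOWEL_SIGNS = {chr(cp) for cp in (
    *range(0x0BBE, 0x0BC3),  # AA, I, II, U, UU
    *range(0x0BC6, 0x0BC9),  # E, EE, AI
    *range(0x0BCA, 0x0BCD),  # O, OO, AU
)}


def is_tamil_consonant(ch):
    """True for an assigned letter in the Tamil consonant range U+0B95..U+0BB9."""
    return 0x0B95 <= ord(ch) <= 0x0BB9 and unicodedata.category(ch) == "Lo"


def show_isolated_vowel_signs(text):
    """Insert a dotted circle before each vowel sign that lacks a consonant base.

    Hypothetical helper: a simplification for illustration, not a shaping engine.
    """
    out = []
    prev = ""
    for ch in text:
        # A vowel sign is "isolated" here if it starts the string or follows
        # anything other than a Tamil consonant (a space, a Latin letter, ...).
        if ch in VOWEL_SIGNS and not (prev and is_tamil_consonant(prev)):
            out.append(DOTTED_CIRCLE)
        out.append(ch)
        prev = ch
    return "".join(out)
```

With this sketch, the sequence KA + E-vowel-sign (U+0B95 U+0BC6) passes through unchanged, while a space followed by U+0BC6 comes back with a dotted circle between the space and the vowel sign, which is the case the original question asks about.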

