On Thu, 20 Apr 2017 15:33:37 +0530
Shriramana Sharma via Unicode <unicode@unicode.org> wrote:

> All I can say is that Tamil script has eschewed most consonant cluster
> ligatures/conjoining forms. As for Devanagari, writing श्रीमान्को (I
> used ZWNJ) i.o. श्रीमान्को is quite possible with existing technology.
> The latter would be Sanskrit orthography and former perhaps Hindi,
> although I wouldn't know why anyone would want to run in the को with
> the preceding श्रीमान् even in Hindi.

According to p23 of http://www.unicode.org/L2/L2011/11370-devanagari-vip-issues.pdf, it's Nepali. It's a compromise between श्रीमान्को and Hindi-style श्रीमान् को.

> And IMO it would be better to
> clearly define at the outset what you meant by "akshara" in your
> question to avoid confusions by people replying having a different
> idea of the meaning of that term.

I didn't want to be any more precise than "orthographic syllable". Swaran Lata is urging, in submission http://www.unicode.org/L2/L2017/17094-indic-text-seg.pdf to the UTC, that UAX#29 "Unicode Text Segmentation" adopt a rather naïve definition of an Indian orthographic syllable. The worst outcome in my opinion would be if it were adopted for the extended grapheme cluster definition - it would make editing orthographic clusters even more difficult. However, it would make sense for CLDR to carry localised definitions.

For layout, the definition would be relevant for 'drop capital effects' and for the analogue of inserting spaces between letters. There are recommendations in a maturing W3C specification for Indic layout, though to be fair the specification fairly quickly restricts its scope to Indian scripts.

Now, if the spacing were applied to the Nepali word श्रीमान्को I would expect to see something like श्री मा न् को, as the base word itself would appear as श्री मा न् when subjected to the same treatment. However, before suggesting minor improvements that might be in order, I thought I should check whether there was agreement that <VIRAMA, ZWNJ> terminated an orthographic syllable. It now seems that any general agreement would in fact be that it did *not* terminate an orthographic syllable!

I must say that stretching श्रीमान्को out as श्री मा न्को feels wrong. If my feeling is right, then the definition of orthographic syllable, if it can be done without reference to a font, belongs in CLDR, as UAX#29 implies, and not in the Unicode Character Database and Unicode standards.

Richard.
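
The two candidate spacings of the Nepali word discussed above can be reproduced with a small sketch. It is an illustration only and is not taken from UAX#29, the L2/17-094 proposal, or the W3C draft: the function name `split_syllables`, the flag `zwnj_breaks`, and the simplified consonant ranges are all assumptions made for this example. The flag just toggles whether <VIRAMA, ZWNJ> ends an orthographic syllable.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
NUKTA = "\u093C"   # DEVANAGARI SIGN NUKTA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


def is_consonant(ch):
    # Simplified assumption: KA..HA plus the precomposed nukta letters QA..YYA.
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def is_mark(ch):
    # Vowel signs, anusvara, etc.: any combining mark other than the virama.
    return ch != VIRAMA and unicodedata.category(ch) in ("Mn", "Mc")


def split_syllables(text, zwnj_breaks):
    """Hypothetical, deliberately naive orthographic-syllable splitter.

    zwnj_breaks=True  : <VIRAMA, ZWNJ> terminates the syllable.
    zwnj_breaks=False : the ZWNJ is skipped and the conjunct continues.
    """
    out = []
    i, n = 0, len(text)
    while i < n:
        start = i
        i += 1
        if is_consonant(text[start]):
            while True:
                if i < n and text[i] == NUKTA:
                    i += 1
                if i < n and text[i] == VIRAMA:
                    i += 1
                    if i < n and text[i] == ZWNJ:
                        i += 1
                        if zwnj_breaks:
                            break
                    if i < n and is_consonant(text[i]):
                        i += 1
                        continue
                break
        # Attach any trailing dependent marks to the current syllable.
        while i < n and is_mark(text[i]):
            i += 1
        out.append(text[start:i])
    return out


# The Nepali spelling, written with escapes so the ZWNJ is explicit:
# SHA VIRAMA RA II MA AA NA VIRAMA ZWNJ KA O
word = "\u0936\u094D\u0930\u0940\u092E\u093E\u0928\u094D\u200C\u0915\u094B"

print(" ".join(split_syllables(word, zwnj_breaks=True)))   # four pieces
print(" ".join(split_syllables(word, zwnj_breaks=False)))  # three pieces
```

With `zwnj_breaks=True` the dead consonant stands alone, giving the four-piece spacing expected in the post; with `zwnj_breaks=False` the last two pieces merge, giving the three-piece result that "feels wrong". A locale-specific rule of the kind suggested for CLDR would amount to choosing this flag per language.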