Right, Doug. I'll say a few more words. In terms of language support, encoding new characters in Unicode mostly benefits digital heritage languages (by representing historic languages in Unicode, which enables preservation and scholarly work), although there are some modern-use cases like Hanifi Rohingya. We do include digital heritage under the umbrella of "digitally disadvantaged languages", but we are not always consistent in our terminology.
But encoding is just a first step. A vital first step, but just one step. People tend to forget that adding new characters is just a part of what Unicode does. For script support, it is just as important to have correct Unicode algorithms and properties, such as correct values for the Indic_Positional_Category property (which, together with the related work on the Universal Shaping Engine, allows for proper rendering of many languages). Behind the scenes we have people like Ken and Laurentiu who have to dig through the encoding proposals and fill in the many, many gaps to come up with reasonable properties for such basic behavior as line-break.

As important as the work is on encoding, properties, and algorithms, when we go up a level we get CLDR and ICU. Those have more impact on language support for far more people in the world than the addition of new scripts does. After all, approaching half of the population of the globe owns a smartphone: ICU provides programmatic access to the Unicode encoding, properties, and algorithms, and CLDR + ICU together provide the core language support on essentially every one of those smartphones.

But in terms of language coverage, the chart you reference (and the corresponding graph <http://cldr.unicode.org/index/downloads/cldr-32#TOC-Growth>) show how very far CLDR still has to go. So we are gearing up for ways to extend that graph: to move at least the basic coverage (the lower plateau in that graph) to more languages, and to move basic-coverage languages up to more in-depth coverage. We are focusing on ways to improve the CLDR survey tool backend and frontend, since we know it is currently not able to handle the number of people who want to contribute, and has glitches in the UI that make it clumsier to use than it should be.

Well, this turned out to be more than just a few words... sorry for going on!
Mark

On Thu, Mar 1, 2018 at 9:10 PM, Doug Ewell via Unicode <unicode@unicode.org> wrote:

> Tim Partridge wrote:
>
> > Perhaps the CLDR work the Consortium does is being referenced. That is
> > by language on this list
> > http://www.unicode.org/cldr/charts/32/supplemental/locale_coverage.html#ee
> > By the time it gets to the 100th entry the Modern percentage has "room
> > for improvement".
>
> I think that is a measurement of locale coverage -- whether the
> collation tables and translations of "a.m." and "p.m." and "a week ago
> Thursday" are correct and verified -- not character coverage.
>
> --
> Doug Ewell | Thornton, CO, US | ewellic.org