On 2013/01/22 1:12, Denis Jacquerye wrote:
> Does anybody have any idea of how much of the Web is normalized in NFC
> or NFD? Or how much is not normalized?

I have never measured this. But at one time there was only NFD (and NFKD). The Unicode Consortium, with input from the W3C, then defined NFC (and NFKC) to be much closer to the encodings actually used on the Web.

So in some sense, Web content is (mostly) NFC *by design*.

Regards,    Martin.


> How would one find out or try to make a smart guess?
>
> I know a lot of library catalogue data is in NFD or somewhat
> decomposed. Is there any other field that heavily uses decomposition?
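
One rough way to make such a guess is to take a sample of text and check, piece by piece, whether it is left unchanged by NFC and by NFD normalization. The sketch below uses Python's standard unicodedata module; the function names classify and survey and the sample strings are made up for illustration, and fetching and decoding real pages is left out entirely.

```python
import unicodedata
from collections import Counter


def classify(text: str) -> str:
    """Return 'both', 'NFC', 'NFD' or 'neither' for one piece of text."""
    is_nfc = unicodedata.normalize("NFC", text) == text
    is_nfd = unicodedata.normalize("NFD", text) == text
    if is_nfc and is_nfd:
        # Nothing to compose or decompose, e.g. pure ASCII.
        return "both"
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"


def survey(samples):
    """Count how many samples fall into each normalization class."""
    return Counter(classify(s) for s in samples)


if __name__ == "__main__":
    # Hypothetical sample strings; a real survey would use decoded page text.
    samples = [
        "plain ASCII text",
        "caf\u00e9",             # precomposed e-acute
        "cafe\u0301",            # e followed by a combining acute accent
        "caf\u00e9 cafe\u0301",  # a mix of both forms
    ]
    print(survey(samples))
```

Note that text which is in both forms at once (such as plain ASCII) says nothing either way, so it is probably best reported separately rather than counted as NFC.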

> --
> Denis Moyogo Jacquerye
> African Network for Localisation http://www.africanlocalisation.net/
> Nkótá ya Kongó míbalé --- http://info-langues-congo.1sd.org/
> DejaVu fonts --- http://www.dejavu-fonts.org/



