On 04/07/2022 7:18 AM, Ola Fosheim Grøstad wrote:
I hardly ever use anything outside UTF-8, and if I do then I use a well-tested Unicode library, as it has to be correct and up to date to be useful. The utility of going beyond UTF-8 seems to be limited:

https://en.wikipedia.org/wiki/UTF-32#Analysis

I have just finished implementing string normalization which is based around UTF-32.

It is required for string equivalence comparisons, which is what you should be doing in a LOT more cases! Anything user-provided should be normalized before it is compared.
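
A minimal sketch of why normalization matters for comparisons. It uses Python's standard `unicodedata` module, not the UTF-32 based implementation mentioned above, and the helper name `equivalent` is made up for illustration:

```python
import unicodedata


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


precomposed = "caf\u00e9"   # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent (U+0301)

# Both render as "café", but the raw code point sequences differ.
print(precomposed == decomposed)            # False
print(equivalent(precomposed, decomposed))  # True
```

Which form to pick depends on the use case: NFC/NFD only fold canonically equivalent sequences, while NFKC/NFKD also fold compatibility characters such as ligatures.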
