2025/01/18 10:15 ... Mouse:
There is no one-size-fits-all character set, nor encoding, nor even
serialization, no matter what the priests of the UTF-8 religion would
have you believe.

If you want to argue that UTF-8 is the best default, that at least is
worth discussing.  But maintaining that there is any single "the right"
character set, encoding, or serialization is...nonsense.  There is, at
most, "right" for a particular use case, or set of use cases.

ASCII is ugly, Latin-1 is ugly (at the defining meeting, the member from France, no printer, no linguist, no typographer, repudiated the letter OE against his country's tradition. Of course, another member jumped at that and proposed the multiplication and division signs instead. There is now a hole in the added letters), Unicode is ugly.

But UTF-8 is particularly ugly. It carries roughly 5 message bits and 3 overhead bits per byte, and writers in Devanagari, ... Malayalam, ... Hangul, ... hiragana & katakana, ... above all in Chinese, find that their text files are huge, bigger than if entered in a 16-bit code (24-bit code, anyone?) with all its surrogates. In bijective binary ("Bijective numeration" in Wikipedia), using seven bits and one marker bit per byte, one needs at most three bytes, with inherent uniqueness. Anyone interested in losing UTF-8's deliberate redundancies?
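
To make the byte arithmetic concrete, here is a small Python sketch (my own illustration, not any standard: the 7-plus-1-bit packing below is a varint-style reading of the bijective idea, and UTF-16 stands in for "a 16-bit code with surrogates"):

    # Bytes spent per code point by UTF-8, by UTF-16, and by a hypothetical
    # packing with 7 payload bits plus a 1-bit continuation marker per byte.

    def utf8_len(cp):
        """Bytes UTF-8 uses for one code point."""
        if cp < 0x80:    return 1
        if cp < 0x800:   return 2
        if cp < 0x10000: return 3
        return 4

    def utf16_len(cp):
        """Bytes UTF-16 uses: 2 in the BMP, 4 with a surrogate pair."""
        return 2 if cp < 0x10000 else 4

    def seven_plus_one_len(cp):
        """Bytes for 7 payload bits + 1 marker bit per byte; a bijective
        base-128 reading would shift these thresholds slightly and make
        every encoding unique (no overlongs)."""
        if cp < 1 << 7:  return 1
        if cp < 1 << 14: return 2
        return 3         # 21 payload bits cover all of Unicode (max U+10FFFF)

    text = "\u4e2d\u6587" * 50          # 100 BMP Chinese characters
    assert sum(utf8_len(ord(c)) for c in text) == len(text.encode("utf-8"))

    for name, f in (("UTF-8", utf8_len), ("UTF-16", utf16_len),
                    ("7+1-bit", seven_plus_one_len)):
        print(name, sum(f(ord(c)) for c in text), "bytes")

    # UTF-8 300 bytes, UTF-16 200 bytes, 7+1-bit 300 bytes: UTF-8 costs 50%
    # more than a 16-bit code for BMP CJK text, and the 7+1-bit packing never
    # needs a fourth byte, unlike UTF-8 above U+FFFF.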
