Re: Encoding of old compatibility characters

Frédéric Grosshans Mon, 27 Mar 2017 15:10:43 -0700

Another example, about to be encoded, is the GROUP MARK, used on old IBM computers (proposal: ML threads: http://www.unicode.org/mail-arch/unicode-ml/y2015-m01/0040.html , and http://unicode.org/mail-arch/unicode-ml/y2007-m05/0367.html )


On 27/03/2017 at 23:46, Frédéric Grosshans wrote:

An example of a legacy character successfully encoded recently is ⏨ U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2. It came from the Soviet standard GOST 10859-64 and the German standard ALCOR, and was proposed by Leo Broukhis in this proposal: http://www.unicode.org/L2/L2008/08030r-subscript10.pdf . It follows a discussion on this mailing list here: http://www.unicode.org/mail-arch/unicode-ml/y2008-m01/0123.html , where Ken Whistler was already sceptical about the usefulness of this encoding.
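For anyone who wants to check whether a given character name is already in the UCS, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `is_encoded` is my own invention for illustration, and the answer depends on which Unicode version your Python build ships with, so treat a negative result as "not in this database" rather than "not in Unicode".

```python
import unicodedata


def is_encoded(name: str) -> bool:
    """Return True if the bundled Unicode database knows this character name.

    `is_encoded` is a hypothetical helper, not part of any library.
    """
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        # lookup() raises KeyError when no character has this name.
        return False


# U+23E8 was added in Unicode 5.2, so any current Python should know it.
print(unicodedata.name("\u23e8"))
print(is_encoded("DECIMAL EXPONENT SYMBOL"))

# Result depends on the Unicode version bundled with the interpreter.
print(is_encoded("GROUP MARK"))
print("Unicode database version:", unicodedata.unidata_version)
```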
On 27/03/2017 at 16:44, Charlotte Buff wrote:
I’ve recently developed an interest in old legacy text encodings and noticed that there are various characters in several sets that don’t have a Unicode equivalent. I had already started research into these encodings to eventually prepare a proposal until I realised I should probably ask on the mailing list first whether it is likely the UTC will be interested in those characters before I waste my time on a project that won’t achieve anything in the end.
The character sets in question are ATASCII, PETSCII, the ZX80 set, the Atari ST set, and the TI calculator sets. So far I’ve only analyzed the ZX80 set in great detail, revealing 32 characters not in the UCS. Most characters are pseudo-graphics, simple pictographs or inverted variants of other characters.
Now, one of Unicode’s declared goals is to enable round-trip compatibility with legacy encodings. We’ve accumulated a lot of weird stuff over the years in the pursuit of this goal. So it would be natural to assume that the unencoded characters from the mentioned sets would also be eligible for inclusion in the UCS. On the other hand, those encodings are for the most part older than Unicode and so far there seems to have been little interest in them from the UTC or WG2, or any of their contributors. Something tells me that if these character sets were important enough to consider for inclusion, they would have been encoded a long time ago along with all the other stuff in Block Elements, Box Drawings, Miscellaneous Symbols etc.
Obviously the character sets in question don’t receive much use nowadays (and some weren’t even that relevant in their time, either), which leads me to wonder whether putting further work into this proposal would be worth it.
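To make the round-trip goal mentioned above concrete, here is a rough sketch of how one might test it for a single-byte legacy encoding: decode every byte value to Unicode, encode it back, and see whether the original byte survives. The function name `round_trips` is hypothetical, and `cp437` is only a stand-in, since as far as I know Python ships no codec for ATASCII, PETSCII or the ZX80 set.

```python
def round_trips(codec: str) -> bool:
    """Check that every assigned byte survives decode -> encode unchanged.

    `round_trips` is a hypothetical helper for illustration only.
    """
    for value in range(256):
        raw = bytes([value])
        try:
            text = raw.decode(codec)
        except UnicodeDecodeError:
            # Byte is unassigned in this codec, so there is nothing to test.
            continue
        try:
            if text.encode(codec) != raw:
                return False
        except UnicodeEncodeError:
            # Decoded to something the codec cannot map back.
            return False
    return True


# cp437 is used as a stand-in for a legacy set that Unicode fully covers.
print(round_trips("cp437"))
```

A legacy set with characters missing from the UCS could not pass such a check without resorting to private-use code points, which is the gap a proposal would close.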
