Encoding of old compatibility characters

Charlotte Buff Mon, 27 Mar 2017 07:51:11 -0700

I’ve recently developed an interest in old legacy text encodings and
noticed that there are various characters in several sets that don’t have a
Unicode equivalent. I had already started researching these encodings, with
a view to eventually preparing a proposal, when I realised I should probably
ask on the mailing list first whether the UTC is likely to be interested in
those characters, before I waste my time on a project that won’t achieve
anything in the end.


The character sets in question are ATASCII, PETSCII, the ZX80 set, the
Atari ST set, and the TI calculator sets. So far I’ve only analysed the
ZX80 set in great detail, revealing 32 characters not in the UCS. Most
characters are pseudo-graphics, simple pictographs or inverted variants of
other characters.

Now, one of Unicode’s declared goals is to enable round-trip compatibility
with legacy encodings. We’ve accumulated a lot of weird stuff over the
years in the pursuit of this goal. So it would be natural to assume that
the unencoded characters from the mentioned sets would also be eligible for
inclusion in the UCS. On the other hand, those encodings are for the most
part older than Unicode and so far there seems to have been little interest
in them from the UTC or WG2, or any of their contributors. Something tells
me that if these character sets were important enough to consider for
inclusion, they would have been encoded a long time ago along with all the
other stuff in Block Elements, Box Drawings, Miscellaneous Symbols etc.

Obviously the character sets in question don’t receive much use nowadays
(and some weren’t even that relevant in their time, either), which leads
me to wonder whether putting further work into this proposal would be worth it.
