I’ve recently developed an interest in legacy text encodings and noticed that several of them contain characters with no Unicode equivalent. I had already started researching these encodings with a view to preparing a proposal when I realised I should probably ask on the mailing list first whether the UTC is likely to be interested in those characters, before I waste my time on a project that won’t achieve anything in the end.
The character sets in question are ATASCII, PETSCII, the ZX80 set, the Atari ST set, and the TI calculator sets. So far I’ve only analysed the ZX80 set in detail, which turned up 32 characters not in the UCS. Most of them are pseudo-graphics, simple pictographs or inverted variants of other characters. Now, one of Unicode’s declared goals is to enable round-trip compatibility with legacy encodings, and we’ve accumulated a lot of weird stuff over the years in pursuit of that goal. So it would be natural to assume that the unencoded characters from these sets would also be eligible for inclusion in the UCS.
On the other hand, those encodings are for the most part older than Unicode, and so far there seems to have been little interest in them from the UTC, WG2, or any of their contributors. Something tells me that if these character sets were important enough to consider for inclusion, they would have been encoded a long time ago along with all the other stuff in Block Elements, Box Drawing, Miscellaneous Symbols etc. Obviously the character sets in question don’t receive much use nowadays (and some weren’t even that relevant in their time), which leads me to wonder whether putting further work into this proposal would be worth it.