GROUP MARK

Best Regards,
Jonathan Rosenne

-----Original Message-----
From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Frédéric Grosshans
Sent: Tuesday, March 28, 2017 1:05 AM
To: unicode
Subject: Re: Encoding of old compatibility characters

Another example, about to be encoded, is the GOUP MARK, used on old IBM computers (proposal: ML threads: http://www.unicode.org/mail-arch/unicode-ml/y2015-m01/0040.html , and http://unicode.org/mail-arch/unicode-ml/y2007-m05/0367.html )

On 27/03/2017 at 23:46, Frédéric Grosshans wrote:
> An example of a legacy character successfully encoded recently is ⏨
> U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2.
> It came from the Soviet standard GOST 10859-64 and the German standard
> ALCOR, and was proposed by Leo Broukhis in this proposal
> http://www.unicode.org/L2/L2008/08030r-subscript10.pdf . It follows a
> discussion on this mailing list here
> http://www.unicode.org/mail-arch/unicode-ml/y2008-m01/0123.html, where
> Ken Whistler was already sceptical about the usefulness of this encoding.
>
>
> On 27/03/2017 at 16:44, Charlotte Buff wrote:
>> I’ve recently developed an interest in old legacy text encodings and
>> noticed that there are various characters in several sets that don’t
>> have a Unicode equivalent. I had already started research into these
>> encodings to eventually prepare a proposal until I realised I should
>> probably ask on the mailing list first whether it is likely the UTC
>> will be interested in those characters before I waste my time on a
>> project that won’t achieve anything in the end.
>>
>> The character sets in question are ATASCII, PETSCII, the ZX80 set,
>> the Atari ST set, and the TI calculator sets. So far I’ve only
>> analyzed the ZX80 set in great detail, revealing 32 characters not in
>> the UCS. Most characters are pseudo-graphics, simple pictographs or
>> inverted variants of other characters.
>>
>> Now, one of Unicode’s declared goals is to enable round-trip
>> compatibility with legacy encodings. We’ve accumulated a lot of weird
>> stuff over the years in the pursuit of this goal. So it would be
>> natural to assume that the unencoded characters from the mentioned
>> sets would also be eligible for inclusion in the UCS. On the other
>> hand, those encodings are for the most part older than Unicode and so
>> far there seems to have been little interest in them from the UTC or
>> WG2, or any of their contributors. Something tells me that if these
>> character sets were important enough to consider for inclusion, they
>> would have been encoded a long time ago along with all the other
>> stuff in Block Elements, Box Drawings, Miscellaneous Symbols etc.
>>
>> Obviously the character sets in question don’t receive much use
>> nowadays (and some weren’t even that relevant in their time, either),
>> which leads me to wonder whether further putting work into this
>> proposal would be worth it.