Re: Encoding of old compatibility characters
I have got MS Word 2002 and MS Excel 2000. Maybe later versions bring an amended version of Arial Unicode MS. Maybe. A./
Re: Encoding of old compatibility characters
Hello, On 31.03.2017 at 09:57, Eli Zaretskii wrote: Arial Unicode MS supports that character [U+23E8], FWIW. From: Otto Stolz, Date: Tue, 4 Apr 2017 15:21:02 +0200: Not on my good ole Windows XP SP3 system. On 4/4/2017 7:58 AM, Eli Zaretskii wrote: This here is also XP SP3. Maybe some package I have installed updated the font? On 04.04.2017 at 18:51, Asmus Freytag wrote: AFAIK, this font is / was installed by MS Office. I have got MS Word 2002 and MS Excel 2000. Maybe later versions bring an amended version of Arial Unicode MS. Cheers, otto
Re: Encoding of old compatibility characters
On 4/4/2017 7:58 AM, Eli Zaretskii wrote: From: Otto Stolz, Date: Tue, 4 Apr 2017 15:21:02 +0200: On 31.03.2017 at 09:57, Eli Zaretskii wrote: Arial Unicode MS supports that character [U+23E8], FWIW. Not on my good ole Windows XP SP3 system. This here is also XP SP3. Maybe some package I have installed updated the font? AFAIK, this font is / was installed by MS Office. A./
Re: Encoding of old compatibility characters
> From: Otto Stolz
> Date: Tue, 4 Apr 2017 15:21:02 +0200
>
> On 31.03.2017 at 09:57, Eli Zaretskii wrote:
> > Arial Unicode MS supports that character [U+23E8], FWIW.
>
> Not on my good ole Windows XP SP3 system.

This here is also XP SP3. Maybe some package I have installed updated the font?
Re: Encoding of old compatibility characters
On 31.03.2017 at 09:57, Eli Zaretskii wrote: Arial Unicode MS supports that character [U+23E8], FWIW. Not on my good ole Windows XP SP3 system. Best wishes, Otto
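For anyone checking their own system, the identity of the character (as opposed to whether any installed font has a glyph for it) can be confirmed from the Unicode Character Database; a small Python sketch:

```python
# Look up U+23E8 in the Unicode Character Database that ships with
# Python. This confirms the code point's identity; whether it renders
# as ⏨ or as a hex box still depends on the installed fonts.
import unicodedata

ch = "\u23E8"  # the character under discussion
print(f"U+{ord(ch):04X}")    # U+23E8
print(unicodedata.name(ch))  # DECIMAL EXPONENT SYMBOL
```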
Re: Encoding of old compatibility characters
Probably you've installed the Noto collection on your Windows XP, or installed some software that added fonts to the system (possibly with updates to the Uniscribe library, such as an old version of Office). Anyway, I would no longer trust XP for doing correct rendering for many scripts, even with Uniscribe, which is not needed for this simple character mapped in the BMP. Minimal support for XP now comes essentially from third-party software providers. Most have given up, except Mozilla and some security suites that attempt to fill the gaps abandoned by Microsoft (but still maintain it... because various banks still use it, for example in their ATMs: you know it when you frequently see an ATM rebooting, or sometimes unusable because it has crashed with a "BSOD" displayed).

2017-03-30 22:17 GMT+02:00 António Martins-Tuválkin:

> On 2017.03.29 05:41, Leo Broukhis asked:
> > Are you still using Windows 7 or RedHat 5, or something equally old?
> > Newer systems have ⏨ out of the box.
>
> I’m using Windows XP and "⏨" renders perfectly as "₁₀". Maybe fonts can
> be installed without “upgrading” the whole operating system? Who knew?!
>
> --
> António MARTINS-Tuválkin
> PT-1500-239 Lisboa
> PT-2695-010 Bobadela LRS
> +351 934 821 700, +351 212 463 477
> facebook.com/profile.php?id=744658416
Re: Encoding of old compatibility characters
On 2017.03.29 05:41, Leo Broukhis asked: Are you still using Windows 7 or RedHat 5, or something equally old? Newer systems have ⏨ out of the box.

I’m using Windows XP and "⏨" renders perfectly as "₁₀". Maybe fonts can be installed without “upgrading” the whole operating system? Who knew?!

--
António MARTINS-Tuválkin
PT-1500-239 Lisboa
PT-2695-010 Bobadela LRS
+351 934 821 700, +351 212 463 477
facebook.com/profile.php?id=744658416

Não me invejo de quem tem
carros, parelhas e montes
só me invejo de quem bebe
a água em todas as fontes

De sable uma fonte e bordadura escaqueada de jalde e goles por timbre bandeira por mote o 1º verso acima e por grito de guerra "Mi rajtas!"
Re: Encoding of old compatibility characters
On Tue, Mar 28, 2017 at 6:09 AM, Asmus Freytag wrote: > On 3/28/2017 4:00 AM, Ian Clifton wrote: > > I’ve used ⏨ a couple of times, without explanation, in my own > emails—without, as far as I’m aware, causing any misunderstanding. > > Works especially well, whenever it renders as a box with 23E8 inscribed! Are you still using Windows 7 or RedHat 5, or something equally old? Newer systems have ⏨ out of the box. Leo
Re: Encoding of old compatibility characters
On 03/28/2017 09:09 AM, Asmus Freytag wrote: On 3/28/2017 4:00 AM, Ian Clifton wrote: I’ve used ⏨ a couple of times, without explanation, in my own emails—without, as far as I’m aware, causing any misunderstanding. Works especially well, whenever it renders as a box with 23E8 inscribed! A./ I ⬚ Unicode. ~mark
Re: Encoding of old compatibility characters
I don't think I want my text renderer to be *that* smart. If I want ⏨, I'll put ⏨. If I want a multiplication sign or something, I'll put that. Without the multiplication sign, it's still quite understandable, more so than just "e".

It is valid for a text rendering engine to render "g" with one loop or two. I don't think it's valid for it to render "g" as "xg" or "-g" or anything else. The ⏨ character looks like it does. You don't get to add multiplication signs to it because you THINK you know what I'm saying with it. And using 20⏨ to mean "twenty base ten" sounds perfectly reasonable to me also.

~mark

On 03/28/2017 05:33 AM, Philippe Verdy wrote: Ideally a smart text renderer could as well display that glyph with a leading multiplication sign (a mathematical middle dot) and implicitly convert the following digits (and sign) into a real superscript/exponent (using contextual substitution/positioning like for Eastern Arabic/Urdu), without necessarily writing the 10 base with smaller digits. Without it, people will want to use 20⏨ to mean it is the decimal number twenty and not the hexadecimal number thirty-two.

2017-03-28 11:18 GMT+02:00 Frédéric Grosshans: On 28/03/2017 at 02:22, Mark E. Shoulson wrote: Aw, but ⏨ is awesome! It's much cooler-looking and more visually understandable than "e" for exponent notation. In some code I've been playing around with I support it as a valid alternative to "e". I agree 1⏨3 times with you on this! Frédéric
Re: Encoding of old compatibility characters
On 3/28/2017 4:00 AM, Ian Clifton wrote: I’ve used ⏨ a couple of times, without explanation, in my own emails—without, as far as I’m aware, causing any misunderstanding. Works especially well, whenever it renders as a box with 23E8 inscribed! A./
Re: Encoding of old compatibility characters
Philippe Verdy writes:
> Ideally a smart text renderer could as well display that glyph with a
> leading multiplication sign (a mathematical middle dot) and implicitly
> convert the following digits (and sign) as real superscript/exponent
> (using contextual substitution/positioning like for Eastern
> Arabic/Urdu), without necessarily writing the 10 base with smaller
> digits.

Actually, I would see this as putting unnecessary clutter back in! I would say the advantage of the ⏨ notation, introduced with Algol 60, is that it subsumes and makes implicit the multiplication and exponentiation operators, resulting in a visually compact denotation of a real number in “scientific notation”, and it does so with a single symbol that hints at its own meaning. I’ve used ⏨ a couple of times, without explanation, in my own emails—without, as far as I’m aware, causing any misunderstanding.

> Without it, people will want to use 20⏨ to mean it is the decimal
> number twenty and not hexadecimal number thirty two.

Yes, this ambiguity is a drawback. Hopefully, the use cases should be sufficiently different that real confusion would be unlikely (and of course, normally, U+23E8 should never be used to denote a decimal number base).

--
Ian Clifton ⚗ ℡: +44 1865 275677
Chemistry Research Laboratory ℻: +44 1865 285002
Oxford University : ian.clif...@chem.ox.ac.uk
Mansfield Road Oxford OX1 3TA UK
Re: Encoding of old compatibility characters
Ideally a smart text renderer could as well display that glyph with a leading multiplication sign (a mathematical middle dot) and implicitly convert the following digits (and sign) into a real superscript/exponent (using contextual substitution/positioning like for Eastern Arabic/Urdu), without necessarily writing the 10 base with smaller digits. Without it, people will want to use 20⏨ to mean it is the decimal number twenty and not the hexadecimal number thirty-two. 2017-03-28 11:18 GMT+02:00 Frédéric Grosshans: > On 28/03/2017 at 02:22, Mark E. Shoulson wrote: > >> Aw, but ⏨ is awesome! It's much cooler-looking and more visually >> understandable than "e" for exponent notation. In some code I've been >> playing around with I support it as a valid alternative to "e". >> > > I agree 1⏨3 times with you on this! > > Frédéric
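The contextual rendering Philippe describes would live in a font's substitution rules; purely to illustrate the intended visual result (the function name and the regex are my own invention, not anything from a real shaping engine), the transformation can be sketched as a string rewrite:

```python
# Toy illustration of the display form Philippe suggests: rewrite
# "1⏨3" as "1·10³", i.e. insert a multiplication dot and turn the
# exponent digits (and sign) into superscripts. A real renderer would
# do this with contextual glyph substitution, not string edits.
import re

SUPERSCRIPTS = str.maketrans("0123456789+-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻")

def display_form(text: str) -> str:
    return re.sub(
        r"\u23E8([+-]?[0-9]+)",
        lambda m: "·10" + m.group(1).translate(SUPERSCRIPTS),
        text,
    )

print(display_form("1\u23E83"))     # 1·10³
print(display_form("2.5\u23E8-2"))  # 2.5·10⁻²
```

Note that this rewrite illustrates exactly the objection raised in the replies: it changes what the text looks like based on what the renderer thinks the author meant.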
Re: Encoding of old compatibility characters
On 28/03/2017 at 02:22, Mark E. Shoulson wrote: Aw, but ⏨ is awesome! It's much cooler-looking and more visually understandable than "e" for exponent notation. In some code I've been playing around with I support it as a valid alternative to "e". I agree 1⏨3 times with you on this! Frédéric
Re: Encoding of old compatibility characters
On 03/27/2017 05:46 PM, Frédéric Grosshans wrote: An example of a legacy character successfully encoded recently is ⏨ U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2. It came from the Soviet standard GOST 10859-64 and the German standard ALCOR. And was proposed by Leo Broukhis in this proposal http://www.unicode.org/L2/L2008/08030r-subscript10.pdf . It follows a discussion on this mailing list here http://www.unicode.org/mail-arch/unicode-ml/y2008-m01/0123.html, where Ken Whistler was already sceptical about the usefulness of this encoding. Aw, but ⏨ is awesome! It's much cooler-looking and more visually understandable than "e" for exponent notation. In some code I've been playing around with I support it as a valid alternative to "e". ~mark
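Mark's code isn't shown on the list; as a minimal sketch of what accepting ⏨ alongside "e" might look like (the function name, and the shortcut of normalizing ⏨ to "e" before parsing, are my own assumptions, not his implementation):

```python
# Hypothetical sketch: treat U+23E8 as an alternative exponent marker
# in real-number literals, by normalizing it to "e" before parsing.
def parse_real(text: str) -> float:
    return float(text.replace("\u23E8", "e"))

print(parse_real("1\u23E83"))     # 1000.0
print(parse_real("2.5\u23E8-2"))  # 0.025
```

A stricter parser would validate the literal's grammar first; the point is only that supporting the character costs one extra case in the lexer.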
RE: Encoding of old compatibility characters
GROUP MARK Best Regards, Jonathan Rosenne -----Original Message----- From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Frédéric Grosshans Sent: Tuesday, March 28, 2017 1:05 AM To: unicode Subject: Re: Encoding of old compatibility characters Another example, about to be encoded, is the GOUP MARK, used on old IBM computers (proposal: ML threads: http://www.unicode.org/mail-arch/unicode-ml/y2015-m01/0040.html , and http://unicode.org/mail-arch/unicode-ml/y2007-m05/0367.html ) On 27/03/2017 at 23:46, Frédéric Grosshans wrote: > An example of a legacy character successfully encoded recently is ⏨ > U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2. > It came from the Soviet standard GOST 10859-64 and the German standard > ALCOR. And it was proposed by Leo Broukhis in this proposal > http://www.unicode.org/L2/L2008/08030r-subscript10.pdf . It follows a > discussion on this mailing list here > http://www.unicode.org/mail-arch/unicode-ml/y2008-m01/0123.html, where > Ken Whistler was already sceptical about the usefulness of this encoding. > > > On 27/03/2017 at 16:44, Charlotte Buff wrote: >> I’ve recently developed an interest in old legacy text encodings and >> noticed that there are various characters in several sets that don’t >> have a Unicode equivalent. I had already started research into these >> encodings to eventually prepare a proposal until I realised I should >> probably ask on the mailing list first whether it is likely the UTC >> will be interested in those characters before I waste my time on a >> project that won’t achieve anything in the end. >> >> The character sets in question are ATASCII, PETSCII, the ZX80 set, >> the Atari ST set, and the TI calculator sets. So far I’ve only >> analyzed the ZX80 set in great detail, revealing 32 characters not in >> the UCS. Most characters are pseudo-graphics, simple pictographs or >> inverted variants of other characters.
>> >> Now, one of Unicode’s declared goals is to enable round-trip >> compatibility with legacy encodings. We’ve accumulated a lot of weird >> stuff over the years in the pursuit of this goal. So it would be >> natural to assume that the unencoded characters from the mentioned >> sets would also be eligible for inclusion in the UCS. On the other >> hand, those encodings are for the most part older than Unicode and so >> far there seems to have been little interest in them from the UTC or >> WG2, or any of their contributors. Something tells me that if these >> character sets were important enough to consider for inclusion, they >> would have been encoded a long time ago along with all the other >> stuff in Block Elements, Box Drawings, Miscellaneous Symbols etc. >> >> Obviously the character sets in question don’t receive much use >> nowadays (and some weren’t even that relevant in their time, either), >> which leads me to wonder whether putting further work into this >> proposal would be worth it.
Re: Encoding of old compatibility characters
Another example, about to be encoded, is the GOUP MARK, used on old IBM computers (proposal: ML threads: http://www.unicode.org/mail-arch/unicode-ml/y2015-m01/0040.html , and http://unicode.org/mail-arch/unicode-ml/y2007-m05/0367.html )

On 27/03/2017 at 23:46, Frédéric Grosshans wrote: An example of a legacy character successfully encoded recently is ⏨ U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2. It came from the Soviet standard GOST 10859-64 and the German standard ALCOR. And it was proposed by Leo Broukhis in this proposal http://www.unicode.org/L2/L2008/08030r-subscript10.pdf . It follows a discussion on this mailing list here http://www.unicode.org/mail-arch/unicode-ml/y2008-m01/0123.html, where Ken Whistler was already sceptical about the usefulness of this encoding.

On 27/03/2017 at 16:44, Charlotte Buff wrote: I’ve recently developed an interest in old legacy text encodings and noticed that there are various characters in several sets that don’t have a Unicode equivalent. I had already started research into these encodings to eventually prepare a proposal until I realised I should probably ask on the mailing list first whether it is likely the UTC will be interested in those characters before I waste my time on a project that won’t achieve anything in the end.

The character sets in question are ATASCII, PETSCII, the ZX80 set, the Atari ST set, and the TI calculator sets. So far I’ve only analyzed the ZX80 set in great detail, revealing 32 characters not in the UCS. Most characters are pseudo-graphics, simple pictographs or inverted variants of other characters.

Now, one of Unicode’s declared goals is to enable round-trip compatibility with legacy encodings. We’ve accumulated a lot of weird stuff over the years in the pursuit of this goal. So it would be natural to assume that the unencoded characters from the mentioned sets would also be eligible for inclusion in the UCS. On the other hand, those encodings are for the most part older than Unicode and so far there seems to have been little interest in them from the UTC or WG2, or any of their contributors. Something tells me that if these character sets were important enough to consider for inclusion, they would have been encoded a long time ago along with all the other stuff in Block Elements, Box Drawings, Miscellaneous Symbols etc.

Obviously the character sets in question don’t receive much use nowadays (and some weren’t even that relevant in their time, either), which leads me to wonder whether putting further work into this proposal would be worth it.
Re: Encoding of old compatibility characters
An example of a legacy character successfully encoded recently is ⏨ U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2. It came from the Soviet standard GOST 10859-64 and the German standard ALCOR. And it was proposed by Leo Broukhis in this proposal http://www.unicode.org/L2/L2008/08030r-subscript10.pdf . It follows a discussion on this mailing list here http://www.unicode.org/mail-arch/unicode-ml/y2008-m01/0123.html, where Ken Whistler was already sceptical about the usefulness of this encoding.

On 27/03/2017 at 16:44, Charlotte Buff wrote: I’ve recently developed an interest in old legacy text encodings and noticed that there are various characters in several sets that don’t have a Unicode equivalent. I had already started research into these encodings to eventually prepare a proposal until I realised I should probably ask on the mailing list first whether it is likely the UTC will be interested in those characters before I waste my time on a project that won’t achieve anything in the end.

The character sets in question are ATASCII, PETSCII, the ZX80 set, the Atari ST set, and the TI calculator sets. So far I’ve only analyzed the ZX80 set in great detail, revealing 32 characters not in the UCS. Most characters are pseudo-graphics, simple pictographs or inverted variants of other characters.

Now, one of Unicode’s declared goals is to enable round-trip compatibility with legacy encodings. We’ve accumulated a lot of weird stuff over the years in the pursuit of this goal. So it would be natural to assume that the unencoded characters from the mentioned sets would also be eligible for inclusion in the UCS. On the other hand, those encodings are for the most part older than Unicode and so far there seems to have been little interest in them from the UTC or WG2, or any of their contributors. Something tells me that if these character sets were important enough to consider for inclusion, they would have been encoded a long time ago along with all the other stuff in Block Elements, Box Drawings, Miscellaneous Symbols etc.

Obviously the character sets in question don’t receive much use nowadays (and some weren’t even that relevant in their time, either), which leads me to wonder whether putting further work into this proposal would be worth it.
Re: Encoding of old compatibility characters
TI calculators are not antique tools, and when I see how most calculators for Android or Windows 10 are now, they are not as usable as the scientific calculators we had in the past. I know at least one excellent calculator that works with Android and Windows and finally has the real look and feel of a true calculator, and that displays correct labels and excellent formulas (with the conventional 2D layout): my favorite is now "HyperCalc" (it has a free version and a paid version). The Android version is a bit more advanced. The paid version has only a few additional features that are not so needed (such as themes). The interface is clear, and there are several input modes for expressions.

When you look at the default Calculator of Windows 10, it has never been worse than it is now (it was much better in Windows 7 or before, even if it had many limitations). Also, entering expressions in Excel is really antique, and many functions have stupid limitations (in addition, spreadsheets are not even portable across versions of Office, don't render the same, and sometimes unexpectedly produce different results).

But this is not at all a problem of character encoding: we don't need Unicode at all to create a convenient UI in such applications. Even with a web-based interface, you can do a lot with HTML canvas and SVG and have a scalable UI without having to use dirty text tricks or PUA fonts.

2017-03-27 19:18 GMT+02:00 Ken Whistler:

> On 3/27/2017 7:44 AM, Charlotte Buff wrote:
> > Now, one of Unicode’s declared goals is to enable round-trip
> > compatibility with legacy encodings. We’ve accumulated a lot of weird stuff
> > over the years in the pursuit of this goal. So it would be natural to
> > assume that the unencoded characters from the mentioned sets [ATASCII,
> > PETSCII, the ZX80 set, the Atari ST set, and the TI calculator sets] would
> > also be eligible for inclusion in the UCS.
>
> Actually, it wouldn't be.
> The original goal was to ensure round-trip compatibility with *important* legacy character encodings, *for which there was a need to convert legacy data, and/or an ongoing need for representation of text for interchange*.
>
> From Unicode 1.0: "The Unicode standard includes the character content of all major International Standards approved and published before December 31, 1990... [long list ensues] ... and from various industry standards in common use (such as code pages and character sets from Adobe, Apple, IBM, Lotus, Microsoft, WordPerfect, Xerox and others)."
>
> Even as long ago as 1990, artifacts such as the Atari ST set were considered obsolete antiquities, and did not rise to the level of the kind of character listings that we considered when pulling together the original repertoire.
>
> And there are several observations to be made about the "weird stuff" we have accumulated over the years in the pursuit of compatibility. A lot of stuff that was made up out of whole cloth, rather than being justified by existing, implemented character sets used in information interchange at the time, came from the 1991/1992 merger process between the Unicode Standard and the ISO/IEC 10646 drafts. That's how Unicode acquired blocks full of Arabic ligatures, for example.
>
> Other, subsequent additions of small (or even largish) sets of oddball "characters" that don't fit the prototypical sets of characters for scripts and/or well-behaved punctuation and symbols, typically have come in with argued cases for the continued need in current text interchange, for complete coverage. For example, that is how we ended up filling out Zapf dingbats with some glyph pieces that had been omitted in the initial repertoire for that block. More recently, of course, the continued importance of Wingdings and Webdings font encodings on the Windows platform led the UTC to filling out the set of graphical dingbats to cover those sets. And of course, we first started down the emoji track because of the need to interchange text originating from widely deployed Japanese carrier sets implemented as extensions to Shift-JIS.
>
> I don't think the early calculator character sets, or sets for the Atari ST and similar early consumer computer electronics fit the bill, precisely because there isn't a real text data interchange case to be made for character encoding. Many of the elements you have mentioned, for example, like the inverse/negative squared versions of letters and symbols, are simply idiosyncratic aspects of the UI for the devices, in an era when font generators were hard-coded and very primitive indeed.
>
> Documenting these early uses, and pointing out parts of the UI and character usage that aren't part of the character repertoire in the Unicode Standard seems an interesting pursuit to me. But absent a true textual data interchange issue for these long-gone, obsolete devices, I don't really see a case to be made for spending time in the UTC defining a bunch of compatibility characters to encode for them.
Re: Encoding of old compatibility characters
On 27 Mar 2017, at 17:49, Markus Scherer wrote: > > I think the interest has been low because very few documents survive in these > encodings, and even fewer documents using not-already-encoded symbols. That doesn’t mean that the few people who may need the characters now or in the centuries to come shouldn’t have them. If we’ve encoded some characters like these for compatibility, it’s only fair to be thorough. > In my opinion, this is a good use of the Private Use Area among a very small > group of people. I’d say not, since they’d be using some encoded characters and having to augment it with some PUA characters. > See also https://en.wikipedia.org/wiki/ConScript_Unicode_Registry That’s not for this sort of thing at all at all. The UCS is for this sort of thing. Michael Everson > PS: I had a ZX 81, then a Commodore 64, then an Atari ST, and at school used > a Commodore PET... Lucky man. :-)
Re: Encoding of old compatibility characters
On 3/27/2017 7:44 AM, Charlotte Buff wrote: Now, one of Unicode’s declared goals is to enable round-trip compatibility with legacy encodings. We’ve accumulated a lot of weird stuff over the years in the pursuit of this goal. So it would be natural to assume that the unencoded characters from the mentioned sets [ATASCII, PETSCII, the ZX80 set, the Atari ST set, and the TI calculator sets] would also be eligible for inclusion in the UCS.

Actually, it wouldn't be.

The original goal was to ensure round-trip compatibility with *important* legacy character encodings, *for which there was a need to convert legacy data, and/or an ongoing need for representation of text for interchange*.

From Unicode 1.0: "The Unicode standard includes the character content of all major International Standards approved and published before December 31, 1990... [long list ensues] ... and from various industry standards in common use (such as code pages and character sets from Adobe, Apple, IBM, Lotus, Microsoft, WordPerfect, Xerox and others)."

Even as long ago as 1990, artifacts such as the Atari ST set were considered obsolete antiquities, and did not rise to the level of the kind of character listings that we considered when pulling together the original repertoire.

And there are several observations to be made about the "weird stuff" we have accumulated over the years in the pursuit of compatibility. A lot of stuff that was made up out of whole cloth, rather than being justified by existing, implemented character sets used in information interchange at the time, came from the 1991/1992 merger process between the Unicode Standard and the ISO/IEC 10646 drafts. That's how Unicode acquired blocks full of Arabic ligatures, for example.

Other, subsequent additions of small (or even largish) sets of oddball "characters" that don't fit the prototypical sets of characters for scripts and/or well-behaved punctuation and symbols, typically have come in with argued cases for the continued need in current text interchange, for complete coverage. For example, that is how we ended up filling out Zapf dingbats with some glyph pieces that had been omitted in the initial repertoire for that block. More recently, of course, the continued importance of Wingdings and Webdings font encodings on the Windows platform led the UTC to filling out the set of graphical dingbats to cover those sets. And of course, we first started down the emoji track because of the need to interchange text originating from widely deployed Japanese carrier sets implemented as extensions to Shift-JIS.

I don't think the early calculator character sets, or sets for the Atari ST and similar early consumer computer electronics, fit the bill, precisely because there isn't a real text data interchange case to be made for character encoding. Many of the elements you have mentioned, for example, like the inverse/negative squared versions of letters and symbols, are simply idiosyncratic aspects of the UI for the devices, in an era when font generators were hard-coded and very primitive indeed.

Documenting these early uses, and pointing out parts of the UI and character usage that aren't part of the character repertoire in the Unicode Standard, seems an interesting pursuit to me. But absent a true textual data interchange issue for these long-gone, obsolete devices, I don't really see a case to be made for spending time in the UTC defining a bunch of compatibility characters to encode for them. --Ken
Re: Encoding of old compatibility characters
On 27 Mar 2017, at 18:08, Garth Wallace wrote: > > Apple IIs also had inverse-video letters, and some had "MouseText" > pseudographics used to simulate a Mac-like GUI in text mode. > > I know that a couple of fonts from Kreative put these in the PUA and > Nishiki-Teki follows their lead. I think it’s better to be inclusive rather than exclusive. PUA isn’t stable, and marginal as this stuff may be, we’ve encoded stuff that is far more marginal… nothing more frustrating than expecting something and finding it missing. Michael Everson
Re: Encoding of old compatibility characters
Apple IIs also had inverse-video letters, and some had "MouseText" pseudographics used to simulate a Mac-like GUI in text mode. I know that a couple of fonts from Kreative put these in the PUA and Nishiki-Teki follows their lead. On Mon, Mar 27, 2017 at 9:25 AM Charlotte Buff <irgendeinbenutzern...@gmail.com> wrote: > > It’s hard to say without knowing what the characters are. > > For the ZX80, the missing characters include five block elements (top and > bottom halves of MEDIUM SHADE, as well as their inverse counterparts), and > inverse/negative squared variants of European digits and the following > symbols: " £ $ : ? ( ) - + * / = < > ; , . > Negative squared digits may be unifiable with negative circled digits. > > ATASCII includes inverse variants of box drawing characters. I have to > check whether some other pictographs are unifiable with existing characters. > > PETSCII includes some box drawings and vertical scan lines that are > probably not unifiable. > > Atari ST includes two simple pictographs that were used as graphical UI > elements. They look like a negative, low diagonal stroke and a negative > diamond respectively. It also has six characters that together form logos > which I wasn’t going to propose. > > TI calculators include a single character for a superscript minus 1. I > don’t have a lot of information available about this set at the moment. >
Re: Encoding of old compatibility characters
I think the interest has been low because very few documents survive in these encodings, and even fewer documents using not-already-encoded symbols. In my opinion, this is a good use of the Private Use Area among a very small group of people. See also https://en.wikipedia.org/wiki/ConScript_Unicode_Registry Best regards, markus PS: I had a ZX 81, then a Commodore 64, then an Atari ST, and at school used a Commodore PET...
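Markus's PUA suggestion amounts to a privately agreed conversion table. A minimal sketch of such a round-trip mapping follows; every byte value and PUA code point below is invented for illustration, and this is not the real ZX80 character layout nor any registered assignment:

```python
# Sketch of the PUA approach: a privately agreed table mapping legacy
# bytes to Unicode, using Private Use code points (U+E000..U+F8FF)
# for characters that have no standard equivalent. All byte values
# and PUA assignments here are hypothetical.
ZX80_TO_UNICODE = {
    0x00: " ",
    0x1C: "0",        # an already-encoded character maps normally
    0x87: "\uE087",   # hypothetical PUA slot: an inverse-video digit
    0x88: "\uE088",   # hypothetical PUA slot: another inverse digit
}
UNICODE_TO_ZX80 = {v: k for k, v in ZX80_TO_UNICODE.items()}

def decode_legacy(data: bytes) -> str:
    # Raises KeyError for bytes outside this (partial) table.
    return "".join(ZX80_TO_UNICODE[b] for b in data)

def encode_legacy(text: str) -> bytes:
    return bytes(UNICODE_TO_ZX80[c] for c in text)

sample = bytes([0x1C, 0x87, 0x88])
assert encode_legacy(decode_legacy(sample)) == sample  # round-trips
```

The drawback raised in the replies applies directly: the PUA half of such a table is meaningful only to parties who share it, which is exactly why some argue for standard encoding instead.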
Re: Encoding of old compatibility characters
> It’s hard to say without knowing what the characters are. For the ZX80, the missing characters include five block elements (top and bottom halves of MEDIUM SHADE, as well as their inverse counterparts), and inverse/negative squared variants of European digits and the following symbols: " £ $ : ? ( ) - + * / = < > ; , . Negative squared digits may be unifiable with negative circled digits. ATASCII includes inverse variants of box drawing characters. I have to check whether some other pictographs are unifiable with existing characters. PETSCII includes some box drawings and vertical scan lines that are probably not unifiable. Atari ST includes two simple pictographs that were used as graphical UI elements. They look like a negative, low diagonal stroke and a negative diamond respectively. It also has six characters that together form logos which I wasn’t going to propose. TI calculators include a single character for a superscript minus 1. I don’t have a lot of information available about this set at the moment.
Re: Encoding of old compatibility characters
On 27 Mar 2017, at 15:44, Charlotte Buff wrote: > > I’ve recently developed an interest in old legacy text encodings and noticed > that there are various characters in several sets that don’t have a Unicode > equivalent. I had already started research into these encodings to eventually > prepare a proposal until I realised I should probably ask on the mailing list > first whether it is likely the UTC will be interested in those characters > before I waste my time on a project that won’t achieve anything in the end. It’s hard to say without knowing what the characters are. Michael Everson