Re: Logical Nor (¬) in ASCII-based code pages?

Phil Smith III Sun, 07 May 2023 10:42:05 -0700

Seymour J Metz wrote:
>I've seen Logical Not (¬) at AA and at AC. Are there any ASCII-based
>code pages that have it at a third position? Put another way, is there
>a third code point that ooRexx and Regina should recognize as ¬?


And later:
>UTF-8 is just a transform of Unicode, and the Unicode code point is
>AC. The string C2AC is just a way of encoding AC.

Not quite. Yes, hex C2AC is the UTF-8 encoding of the Unicode NOT sign. Unicode 
is a list of code points and, as you said, UTF-8 is an encoding. The Unicode 
code point is U+00AC. It is NOT "AC", nor hex AC. Yes, I'm being picky, but 
this matters. The point is, U+00AC (the Unicode expression of that code 
point) has a specific meaning, which then *must* be encoded somehow (UTF-8, 
UTF-16, UTF-32); "AC" is meaningless in a Unicode context.
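
To make the code point vs. encoding distinction concrete, here is a minimal Python sketch (Python is my choice for illustration only; nothing here is specific to ooRexx or Regina). It takes the single code point U+00AC and shows the different byte sequences that three Unicode encodings, plus ISO 8859-1, produce for it. The variable names are made up for the example.

```python
# One code point, several encodings. U+00AC is the Unicode NOT SIGN.
not_sign = "\u00ac"

utf8 = not_sign.encode("utf-8")        # two bytes: C2 AC
utf16 = not_sign.encode("utf-16-be")   # two bytes: 00 AC (big-endian, no BOM)
utf32 = not_sign.encode("utf-32-be")   # four bytes: 00 00 00 AC
latin1 = not_sign.encode("iso-8859-1") # one byte: AC (8859-1, not Unicode)

for name, raw in [("UTF-8", utf8), ("UTF-16BE", utf16),
                  ("UTF-32BE", utf32), ("ISO 8859-1", latin1)]:
    print(f"{name:10} {raw.hex().upper()}")
```

The code point is the same in every row; only the bytes differ, which is why "AC" by itself says nothing until you name the encoding.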

This is especially confusing since plain ol' ASCII maps directly to the first 
part of UTF-8-encoded Unicode. This is of course A Good Thing in general, but 
lets people cheat and get away with it, until they don't.

It gets even more confusing because ISO 8859-1 *looks* like Unicode in that, 
for example, a hex AC is the NOT sign in 8859-1. But that's 8859-1, not 
Unicode, not UTF-8. A lone hex AC *is not a character* in UTF-8: it's an error. 
I've seen customers take data that's UTF-8 and think it's 8859-1. This mostly 
works. Mostly is not good.
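
A small Python sketch of both failure modes, for illustration only (the sample string and variable names are invented): a lone AC byte is rejected by a strict UTF-8 decoder, and UTF-8 data read as 8859-1 decodes without complaint but gains a stray character in front of every NOT sign, while the ASCII parts survive untouched.

```python
# 1. A lone 0xAC byte is valid ISO 8859-1 but invalid UTF-8.
lone = b"\xac"
as_latin1 = lone.decode("iso-8859-1")      # the NOT sign, U+00AC
try:
    lone.decode("utf-8")                   # strict decoding is the default
    lone_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_is_valid_utf8 = False

# 2. UTF-8 data mistaken for ISO 8859-1: "mostly works".
data = "a \u00ac b".encode("utf-8")        # bytes 61 20 C2 AC 20 62
misread = data.decode("iso-8859-1")        # no error, but wrong text

print(repr(as_latin1), lone_is_valid_utf8)
print(repr(misread))                       # C2 shows up as U+00C2 before the NOT sign
```

The ASCII characters come through intact in the misread string, which is exactly why the mistake can go unnoticed for a long time.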

As for your original question, I'm more than willing to believe in some code 
page with hex AA as the NOT sign, just never seen it. Hard to search for, too, 
alas. Do you know what page that is?

I'm a bit chary* of blindly accepting multiple code points as NOT signs. Better 
to know how your input is encoded (or mandate it). Unless, of course, it can be 
demonstrated that this particular multilingualism cannot be misinterpreted.
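
As a rough illustration of why "know your encoding" is safer than matching bytes, here is a Python sketch. The helper `not_sign_offsets` is hypothetical, invented for this example; it is not how ooRexx or Regina work. It decodes strictly with a declared encoding and only then looks for U+00AC. The contrast case is the euro sign, U+20AC, whose UTF-8 encoding (E2 82 AC) happens to end in an AC byte.

```python
NOT_SIGN = "\u00ac"

def not_sign_offsets(raw, encoding):
    """Hypothetical helper: decode strictly, then return the character
    offsets of U+00AC. Raises UnicodeDecodeError on malformed input."""
    text = raw.decode(encoding)
    return [i for i, ch in enumerate(text) if ch == NOT_SIGN]

# The euro sign in UTF-8 is E2 82 AC: it contains an AC byte but no NOT sign.
euro = "\u20ac".encode("utf-8")
naive_hit = 0xAC in euro                       # byte-level match: false positive
decoded_hits = not_sign_offsets(euro, "utf-8") # character-level match: nothing

# With the encoding declared, both representations give the same answer.
from_latin1 = not_sign_offsets(b"a \xac b", "iso-8859-1")
from_utf8 = not_sign_offsets("a \u00ac b".encode("utf-8"), "utf-8")

print(naive_hit, decoded_hits, from_latin1, from_utf8)
```

This only shows one way a byte-level rule can misfire; whether accepting several byte values is safe for a given interpreter depends on what inputs it must handle.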

...phsiii

*no char pun intended


----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
