Seymour J Metz wrote:
>I've seen Logical Not (¬) at AA and at AC. Are there any ASCII-based
>code pages that have it at a third position? Put another way, is there
>a third code point that ooRexx and Regina should recognize as ¬?

And later:
>UTF-8 is just a transform of Unicode, and the Unicode code point is
>AC. The string C2AC is just a way of encoding AC.

Not quite. Yes, hex C2AC is the UTF-8 encoding of the Unicode NOT sign. Unicode is a list of code points and, as you said, UTF-8 is an encoding. The Unicode code point is U+00AC. It is NOT AC, nor hex AC.

Yes, I'm being picky, but this matters. The point is, U+00AC (the Unicode expression of that code point) has a specific meaning, which then *must* be encoded somehow (UTF-8, UTF-16, UTF-32); AC is meaningless in a Unicode context.

This is especially confusing since plain ol' ASCII maps directly to the first part of UTF-8-encoded Unicode. This is of course A Good Thing in general, but lets people cheat and get away with it, until they don't.

It gets even more confusing because ISO 8859-1 *looks* like Unicode in that, for example, a hex AC is the NOT sign in 8859-1. But that's 8859-1, not Unicode, not UTF-8. A lone hex AC *is not a character* in UTF-8: it's an error.

I've seen customers take data that's UTF-8 and think it's 8859-1. This mostly works. "Mostly" is not good.

As for your original question, I'm more than willing to believe in some code page with hex AA as the NOT sign; I've just never seen it. Hard to search for, too, alas. Do you know what page that is?

I'm a bit chary* of blindly accepting multiple code points as NOT signs. Better to know how your input is encoded (or mandate it). Unless, of course, it can be demonstrated that this particular multilingualism cannot be misinterpreted.

...phsiii

*no char pun intended

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
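
For anyone who wants to check the byte-level claims in the post above, here is a minimal sketch, assuming Python 3 and its built-in "utf-8" and "iso-8859-1" codecs. The names NOT_SIGN and lone_ac_is_valid_utf8 are made up for illustration and come from neither the post nor any Rexx interpreter.

```python
# U+00AC is the Unicode code point for the NOT sign; it is not a byte value.
NOT_SIGN = "\u00ac"

# The same code point yields different bytes depending on the encoding.
utf8_bytes = NOT_SIGN.encode("utf-8")          # two bytes: C2 AC
latin1_bytes = NOT_SIGN.encode("iso-8859-1")   # one byte: AC


def lone_ac_is_valid_utf8() -> bool:
    """Return True only if the single byte AC decodes as UTF-8 (it should not)."""
    try:
        b"\xac".decode("utf-8")
        return True
    except UnicodeDecodeError:
        # AC is a continuation byte and cannot start a UTF-8 sequence.
        return False


# Treating UTF-8 data as ISO 8859-1 never raises an error; it just
# produces an extra character in front of the NOT sign.
misread = utf8_bytes.decode("iso-8859-1")

# Pure ASCII text is byte-identical in both encodings, which is why
# the mix-up "mostly works".
ascii_same = "IF A = B".encode("utf-8") == "IF A = B".encode("iso-8859-1")

if __name__ == "__main__":
    print(utf8_bytes.hex(), latin1_bytes.hex())
    print(lone_ac_is_valid_utf8(), repr(misread), ascii_same)
```

The misread case is the one that bites: decoding succeeds silently, so nothing flags the mistake until a non-ASCII character such as the NOT sign turns up in the data.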
