I did not "miss" the word defined. It's not in IBM's definition.

I hear everyone who is saying "the term 'code point' *really* means a bit
value with a glyph assigned to it," but that's not what the definitions out
there say. Wikipedia:

In character encoding terminology, a code point or code position is any of
the numerical values that make up the code space (or code page).[1] For
example, ASCII comprises 128 code points in the range 0 to 7F hex,
Extended ASCII comprises 256 code points in the range 0 to FF hex, and
Unicode comprises 1,114,112 code points in the range 0 to 10FFFF hex. The
Unicode code space is divided into seventeen planes (the basic multilingual
plane, and 16 supplementary planes), each with 65,536 (= 2^16) code points.
Thus the total size of the Unicode code space is 17 × 65,536 = 1,114,112.

"ASCII comprises 128 code points." Not all 128 of those have glyphs.

"Unicode comprises 1,114,112 code points." Not all of those million-plus
code points have glyphs.
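
A rough way to check both statements, as a sketch only: it uses the Unicode
general category in Python's unicodedata module as a stand-in for "has a
glyph" (Cc = control, Cn = unassigned). That stand-in is my assumption, not
part of either definition.

```python
import sys
import unicodedata

# ASCII: code points 0x00..0x7F whose general category is Cc (control).
ascii_controls = [cp for cp in range(0x80)
                  if unicodedata.category(chr(cp)) == "Cc"]
print(len(ascii_controls))        # 33: 0x00-0x1F plus 0x7F

# Unicode: 17 planes of 65,536 code points each.
code_space = sys.maxunicode + 1
print(code_space == 17 * 65536)   # True

# Code points with no character assigned (category Cn). The exact count
# depends on the Unicode version bundled with this Python, so it is
# printed rather than stated here.
unassigned = sum(1 for cp in range(code_space)
                 if unicodedata.category(chr(cp)) == "Cn")
print(unassigned)
```

So 33 of the 128 ASCII code points are control codes, yet all 128 are still
counted as code points.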

"every code point in the source CCSID maps to a unique code point in the
target CCSID" (note no "defined")

That says every one of the 128 ASCII code points maps to a unique bit
combination. But it ain't so.

If one is going to define "round trip conversion" as applying only to
corresponding glyphs then the definition loses any meaning. Any rational
translation is round trip with regard to corresponding glyphs.
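
To make the distinction concrete, here is a minimal sketch. The table values
are made up for illustration and are NOT the real CCSID 1027 or 1208 tables;
TO_TARGET, HAS_GLYPH, is_round_trip and collisions are hypothetical names.

```python
# Hypothetical source code point -> target code point table.
# Made-up values for illustration; NOT the real CCSID 1027/1208 tables.
TO_TARGET = {
    0x3F: 0x1A,  # no glyph assigned -> substitute value
    0x41: 0x1A,  # no glyph assigned -> the same substitute value
    0xC1: 0x41,  # glyph "A"
    0xC2: 0x42,  # glyph "B"
}
HAS_GLYPH = {0xC1, 0xC2}  # assumed: source code points that have a glyph


def is_round_trip(table, domain):
    """True if every code point in `domain` maps to a distinct target."""
    targets = [table[cp] for cp in domain]
    return len(targets) == len(set(targets))


def collisions(table):
    """Return {target: [sources]} for targets hit by more than one source."""
    by_target = {}
    for src, tgt in table.items():
        by_target.setdefault(tgt, []).append(src)
    return {tgt: srcs for tgt, srcs in by_target.items() if len(srcs) > 1}


print(is_round_trip(TO_TARGET, TO_TARGET.keys()))  # False: every code point
print(is_round_trip(TO_TARGET, HAS_GLYPH))         # True: glyphs only
print(collisions(TO_TARGET))                       # {26: [63, 65]}
```

Checked against every code point, the table is not round trip, because two
sources collapse onto one target and the original byte cannot be recovered.
Checked against glyphs only, it passes, as nearly any sensible table would.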

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf
Of Mike Schwab
Sent: Wednesday, June 13, 2012 12:02 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Anyone a Unicode Services expert? -- roundtrip conversion

On Wed, Jun 13, 2012 at 12:59 PM, Charles Mills <charl...@mcn.org> wrote:
> I got a response to the PMR. Taking the liberty of paraphrasing a long 
> reply, the essence of it seemed to be that -- per the CCSID pair lists 
> in the manual -- they support round trip conversion from 1027 to 1208 
> but not from 1208 to 1027. Here is what I wrote back:
>
What this means is:
Roundtrip example:  Every defined character in 1027 (that is, excluding
values that do not have a character defined) exists in 1208 and is
successfully translated from 1027 to 1208 and back to 1027.  All codepoints
that do not have a character defined will be translated to (one?) non-valid
value.

Non-roundtrip example:  Some defined characters in 1208, and all codepoints
that do not have a character defined, do not exist in 1027.
If you translate text from 1208 to 1027, the characters not defined in
1027, and all undefined codepoints, will be translated to (one?) non-valid
codepoint.

In any translation, a codepoint that does not encode a character will be
translated to (one?) non-valid codepoint.

If the text, while in 1208, is changed to add a character not in 1027,
then upon translation that value will be changed to an invalid value in 1027.

> It sounds like you are saying (for the CCSIDs in question) "we support 
> round trip, but in one direction only." It would be like if I bought a 
> round trip ticket on Delta between San Francisco and Atlanta, and 
> after I got to Atlanta, they explained that it was a round trip ticket 
> only in one direction.
>
> I would kind of question also whether what you are doing conforms to 
> your definition of round trip in the Unicode manual glossary: Round 
> trip. Encoding that occurs when every code point in the source CCSID 
> maps to a unique code point in the target CCSID.

You missed the word defined.  If a codepoint does not have a defined
character, it is translated to (one?) invalid value.

> Using round trip
> tables ensures the capability of reversing the conversion, and 
> recovering the complete original source datastream.
>
> I would question "every code point in the source CCSID maps to a 
> unique code point in the target CCSID" when both 3F and 41 map to the 
> same code point, and I wonder how I would recover the original source 
> datastream.
>
Again, you missed the word defined.  If a codepoint does not have a defined
character, it is translated to (one?) invalid value.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: INFO IBM-MAIN
