The notion of bifurcating the code points in a code page into 1) those
to which a printable/displayable grapheme is assigned and 2) those to
which no such assignment has [yet] been made is a classical one.

The further notion of assigning a locally standard sub[stitute]
character, x'1a' in the example under discussion, to unassigned code
points during translation to another code page is a standard if not
classical one.
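
For concreteness, here is a minimal Python sketch of that substitution
convention.  The two tiny code pages and the name build_table are
hypothetical, invented for illustration; only the substitute value
x'1a' comes from the discussion.  Any source code point that is
unassigned, or whose character has no home in the target page, is
translated to x'1a'.

```python
SUB = 0x1A  # the substitute character x'1a' from the discussion

# Hypothetical toy code pages (code point -> character), for illustration
# only; a real SBCS code page has up to 256 entries.
SOURCE_PAGE = {0xC1: "A", 0xC2: "B", 0x4A: "\u00a2"}  # \u00a2 is the cent sign
TARGET_PAGE = {0x41: "A", 0x42: "B"}                  # no cent sign here


def build_table(source, target):
    """Build a 256-byte table suitable for bytes.translate()."""
    reverse = {ch: cp for cp, ch in target.items()}
    # Start with every code point mapped to SUB, then fill in the
    # code points whose characters exist in both pages.
    table = bytearray([SUB] * 256)
    for cp, ch in source.items():
        if ch in reverse:
            table[cp] = reverse[ch]
    return bytes(table)


TABLE = build_table(SOURCE_PAGE, TARGET_PAGE)

# 0xC1 and 0xC2 translate; 0x4A (assigned, but absent from the target)
# and 0xFF (unassigned in the source) both become SUB.
translated = bytes([0xC1, 0xC2, 0x4A, 0xFF]).translate(TABLE)
print(translated)
```

Note that two different input code points collapse onto the same
output value, so the table cannot be inverted.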

The new IBM glossary definition is nevertheless a helpful
clarification.  If one intelligent and experienced reader can
interpret the current definition in a non-standard way, another may
too; and dullards are certain to do so.

The prospects for practically useful translations of this sort remain
bleak.  If one considers two sets of SBCS code pages, a set for ASCII
and a set for EBCDIC, one is struck by the fact that the number of
assigned code points is in general different among the elements of the
EBCDIC set, the ASCII set, and of course their union.  This
being the case, no sequence of translations involving more than two
such code pages can be bijective, and for n translations, n > 1,
entropy increases with increasing n.
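
The loss is easy to demonstrate with a toy model.  The sketch below is
only that: the three repertoires and the names E1, A1 and E2 are
hypothetical, and each "code page" is reduced to the set of characters
it can represent.  Each hop replaces anything the target page lacks
with the substitute character, and nothing once replaced is ever
recovered.

```python
SUB = "\x1a"  # substitute character

# Hypothetical repertoires, for illustration only: each code page is
# modelled simply as the set of characters it has assigned.
PAGES = {
    "E1": set("AB[!"),
    "A1": set("AB[^"),
    "E2": set("AB!^"),
}


def translate(text, target):
    """Translate text into the target page, substituting what it lacks."""
    repertoire = PAGES[target]
    return "".join(ch if ch in repertoire or ch == SUB else SUB for ch in text)


def chain(text, route):
    """Apply a sequence of translations, one hop per page name in route."""
    for name in route:
        text = translate(text, name)
    return text


def survivors(text):
    """Count the characters that have not been replaced by SUB."""
    return sum(ch != SUB for ch in text)


original = "AB[!"                           # fully representable in E1
one_hop = chain(original, ["A1"])           # "!" is lost
two_hops = chain(original, ["A1", "E2"])    # "[" is lost as well
round_trip = chain(original, ["A1", "E1"])  # back in E1, "!" still gone

for label, text in [("one hop", one_hop), ("two hops", two_hops),
                    ("round trip", round_trip)]:
    print(label, repr(text), survivors(text))
```

In this model the number of surviving characters can stay the same or
fall with each additional hop, but it can never rise.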

I am reminded of Chomsky's dictum that there is no reason to suppose
that translation is in general possible.

If we ask for too much, we get nothing.  Still, just as some
particular quintics can be solved, particular felicitous translations
are sometimes possible.

John Gilmore, Ashland, MA 01721 - USA
