The notion of bifurcating the code points in a code page into 1) those to which a printable/displayable grapheme is assigned and 2) those to which no such assignment has [yet] been made is a classical one.
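A minimal Python sketch of that partition, and of translation with a substitute character, may make the point concrete. It rests on assumptions that are not from this thread: Python's built-in cp037 and cp1252 codecs stand in for an EBCDIC and an ASCII-based SBCS code page, and the helper names assigned_points and translate are hypothetical. Python's tables need not agree in every detail with IBM's own.

```python
SUB = 0x1A  # substitute character x'1a', as in the example under discussion


def assigned_points(codec):
    """Map each assigned byte value of a single-byte codec to its character."""
    table = {}
    for b in range(256):
        try:
            table[b] = bytes([b]).decode(codec)
        except UnicodeDecodeError:
            pass  # no character is assigned to this code point in Python's table
    return table


def translate(data, src, dst):
    """Translate bytes from code page src to dst, writing SUB where no mapping exists."""
    src_table = assigned_points(src)
    dst_table = {ch: b for b, ch in assigned_points(dst).items()}
    # An unassigned source byte yields None, which is never a key, so it becomes SUB.
    return bytes(dst_table.get(src_table.get(b), SUB) for b in data)


# Assumed stand-ins: cp037 (EBCDIC) and cp1252 (ASCII-based).
every_byte = bytes(range(256))
there = translate(every_byte, "cp037", "cp1252")
back = translate(there, "cp1252", "cp037")

print(len(assigned_points("cp037")), len(assigned_points("cp1252")))
print(there.count(SUB), back == every_byte)
```

Once two or more source code points have collapsed into SUB, nothing in the translated data says which one each of them was, so the return trip cannot restore them.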
The further notion of assigning a locally standard sub[stitute] character, x'1a' in the example under discussion, to unassigned code points during translation to another code page is a standard if not classical one.

The new IBM glossary definition is nevertheless a helpful clarification. If one intelligent and experienced reader can interpret the current definition in a non-standard way, another may too; and dullards are certain to do so.

The prospects for practically useful translations of this sort remain bleak. If one considers two sets of SBCS code pages, a set for ASCII and a set for EBCDIC, one is struck by the fact that the number of assigned code points in general differs among the elements of the EBCDIC set, among those of the ASCII set, and of course among those of the two sets taken together. This being the case, no sequence of translations involving more than two such code pages can be bijective; and for n translations, n > 1, entropy increases with increasing n.

I am reminded of Chomsky's dictum that there is no reason to suppose that translation is in general possible. If we ask for too much, we get nothing. Still, just as some particular quintics can be solved, particular felicitous translations are sometimes possible.

John Gilmore, Ashland, MA 01721 - USA
