Mark Leisher <[EMAIL PROTECTED]> writes:
>    Nick> Following the first page will be all the other pages, each in the
>    Nick> same format as the first: one number identifying the page followed
>    Nick> by 256 double-byte Unicode characters.  If a character in the
>    Nick> encoding maps to the Unicode character 0000, it means that the
>    Nick> character doesn't actually exist.  If all characters on a page would
>    Nick> map to 0000, that page can be omitted.
>
>There may some day be a use for the Unicode codepoint 0x0000.  It might be
>better to make this 0xFFFF, which is a guaranteed non-character in Unicode and
>probably in ISO10646.

Documentation notwithstanding, the original Tcl C code does permit the
Unicode code point 0x0000 to exist iff it is in the 0 slot of the other
encoding, e.g. ASCII NUL is mapped to it.
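
To make that exception concrete, here is a rough sketch in Python (not the
actual Tcl C code, and not the Perl module). It assumes a page has already
been parsed into a list of 256 integers. The name page_to_map and the way
the page number is combined with the slot are my own illustration.

```python
def page_to_map(page_no, values):
    """Build {encoded code: Unicode code point} for one 256-entry page.

    Assumption: 0x0000 means "no such character" everywhere except for
    encoded code 0 (page 0, slot 0), where it really is U+0000.
    """
    if len(values) != 256:
        raise ValueError("a page holds exactly 256 entries")
    mapping = {}
    for slot, cp in enumerate(values):
        code = (page_no << 8) | slot  # hypothetical page/slot layout
        if cp == 0 and code != 0:
            continue  # 0x0000 used as the "missing" marker
        mapping[code] = cp
    return mapping
```

With an ASCII-like page 0, code 0 keeps its mapping to U+0000 while the
unused upper half of the page is simply dropped.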

I made at least an attempt at this in the OO perl stuff as well.

0x0000 has a nice C/perl "falseness" which 0xFFFF lacks - but as we don't
use the .enc tables directly anyway, using 0xFFFF and converting to C<undef>
at load time (Tcl does not have an undef equivalent) would be a reasonable
approach.
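
A minimal sketch of that load-time conversion, written in Python with None
standing in for Perl's undef. MISSING and load_page are hypothetical names,
and the page is assumed to be already parsed into integers:

```python
MISSING = 0xFFFF  # proposed "no such character" marker in the table


def load_page(values):
    """Replace the 0xFFFF marker with None; leave every other value alone."""
    return [None if cp == MISSING else cp for cp in values]
```

Because the marker is turned into None when the table is loaded, 0x0000
stays available as an ordinary mapping, and callers should test for
"is None" rather than for falseness.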

-- 
Nick Ing-Simmons
