Dan Kogai <[EMAIL PROTECTED]> writes:
>>   Dan, in case EBCDIC scares you (and it should :-), a quick intro:
>>   basically, consider the whole low 256 characters being rearranged from
>>   what they are in ASCII.  For example, ord("A") is 0xC1, not 0x41. (The
>>   pod/perlebcdic.pod has the full tables.)
>
>   Sure it does scare me.  I have to confess UTF-EBCDIC was totally out
>of mind.  But here I got a hint:  like Perl used to be, CJK
>encodings are very, very ASCII-chauvinistic.  Their variable-length
>encoding relies heavily on the fact that ASCII leaves the MSB of the byte
>alone.  That way you can tell whether a given byte is a whole (half-width)
>character or half of a full-width character.

That is fine. While the data is in a CJK encoding it can stay ASCII-oid.
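
For illustration, here is a minimal Python sketch of the MSB trick described
in the quoted text, using EUC-JP as the example encoding. The function name
split_euc_jp is hypothetical, and the handling of the 0x8E and 0x8F prefixes
follows the usual EUC-JP layout; it is a sketch, not a validating decoder.

```python
def split_euc_jp(data: bytes) -> list:
    """Split EUC-JP bytes into per-character chunks by looking at lead bytes."""
    chunks = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            width = 1   # MSB clear: a whole single-byte (ASCII) character
        elif lead == 0x8F:
            width = 3   # SS3 prefix: three-byte JIS X 0212 character
        else:
            width = 2   # MSB set: first half of a two-byte character
                        # (also covers the 0x8E half-width katakana prefix)
        chunks.append(data[i:i + width])
        i += width
    return chunks


text = "Aあ"
chunks = split_euc_jp(text.encode("euc_jp"))
print([c.hex() for c in chunks])
```

The same test cannot work on EBCDIC-style bytes, where ordinary letters such
as 'A' (0xC1) already have the MSB set.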

The problem comes when we convert to perl's internal form.
An ASCII 'A' in Shift-JIS or whatever will still become 0xC1 in
an EBCDIC perl, because that is "defined" to be EBCDIC perl's
view of U+0041.

So if tests convert CJK into "internal" form and then just do ord(),
they will fail for code points in the range 0..255. There are some
XS functions to map native<->unicode numbers.
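
A rough Python sketch of the effect, assuming code page 037 stands in for the
native EBCDIC code page (real platforms may use a different one). The helpers
unicode_to_native and native_to_unicode are hypothetical names for
illustration only; they are not the XS functions mentioned above.

```python
EBCDIC = "cp037"  # assumption: stand-in for the platform's native code page


def unicode_to_native(code_point: int) -> int:
    """Map a Unicode code point in 0..255 to its EBCDIC byte value."""
    return chr(code_point).encode(EBCDIC)[0]


def native_to_unicode(native: int) -> int:
    """Map an EBCDIC byte value back to its Unicode code point."""
    return ord(bytes([native]).decode(EBCDIC))


# An ASCII 'A' inside Shift-JIS data is still the byte 0x41 ...
sjis = "A".encode("shift_jis")
# ... and decoding it yields U+0041 ...
char = sjis.decode("shift_jis")
# ... but the native (EBCDIC) number for that character is 0xC1.
print(hex(sjis[0]), hex(ord(char)), hex(unicode_to_native(ord(char))))
# → 0x41 0x41 0xc1
```

A test that wants to compare against ASCII-based expectations would map the
native number back through something like native_to_unicode first.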

>   ASCII casts its shadow even on ISO-2022, an escape-based encoding
>that is not supposed to be affected by the MSB and such (only \e was
>supposed to matter).  In ISO-2022, most 2-byte characters are
>represented on either a 96x96 or a 94x94 grid, where each axis is (7-bit
>ASCII - control characters) or (that - space (0x20) and DEL (\x7F)).
>   Obviously this will not work on EBCDIC....

Nor should it.
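
To make the grid arithmetic in the quoted text concrete, here is a small
Python sketch, assuming plain ISO-2022-JP where every escape sequence is
three bytes long (ESC plus two more). The function double_byte_pairs is a
hypothetical helper, not part of any library.

```python
GRID_96 = range(0x20, 0x80)  # 7-bit ASCII minus the control characters
GRID_94 = range(0x21, 0x7F)  # the same, minus space (0x20) and DEL (0x7F)


def double_byte_pairs(data: bytes) -> list:
    """Collect the two-byte character cells from ISO-2022-JP data."""
    pairs = []
    two_byte = False
    i = 0
    while i < len(data):
        if data[i] == 0x1B:
            # ESC $ ... selects a two-byte set; ESC ( ... a one-byte set.
            two_byte = data[i + 1] == ord("$")
            i += 3
        elif two_byte:
            pairs.append((data[i], data[i + 1]))
            i += 2
        else:
            i += 1  # single-byte (ASCII or JIS-Roman) character
    return pairs


pairs = double_byte_pairs("Aあ".encode("iso2022_jp"))
print(len(GRID_96), len(GRID_94), [(hex(a), hex(b)) for a, b in pairs])
# Every cell coordinate lies in the printable-ASCII byte range.
print(all(a in GRID_94 and b in GRID_94 for a, b in pairs))
```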

>   This one may be tougher than we think....
>   FYI I know something called 12-bit EBCDIC kanji also exists.  I only
>know that it exists, but is it on our support list?

If OS390 (or ICU given its history) has tables we can probably support
them.

>
>> The test logs are attached: I would really appreciate it if you could see
>> some pattern in the failures.
>
>   I will do the best I can, but I will be away this weekend and I
>won't be back online till Sunday at the earliest.
>
>> --
>> $jhi++; # http://www.iki.fi/jhi/
>>         # There is this special biologist word we use for 'stable'.
>>         # It is 'dead'. -- Jack Cohen
>
>Dan the Unstable according to Jack Cohen
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/


