Re: bareword test on ebcdic.

rajarshi das Thu, 28 Jul 2005 00:35:28 -0700

Nicholas Clark <[EMAIL PROTECTED]> wrote:

On Tue, Jul 26, 2005 at 08:48:10AM -0700, rajarshi das wrote:

> > For the code points being tested
> > ("\x{0442}\x{0435}\x{0441}\x{0442}")
> > does the perl source file contain the correct byte
> > sequence in UTF-EBCDIC?
> Yes it does, since I ran the test,
> if (($hash{"\x{0442}\x{0435}\x{0441}\x{0442}"}) eq
> ($hash{eval '"\x{0442}\x{0435}\x{0441}\x{0442}"'})) {
> print "ok\n";
> }
> and the test ran fine, if that is what you mean by the
> source file containing the correct byte sequence. Or
> am I mistaken ?

You are mistaken, I'm afraid. bareword means no quotes.

In ASCII & UTF-8 land, the 1 liner

$ perl -le 'use utf8; $a{ඬ}++; print map {ord} keys %a'

gives

3500

The 3 bytes in the source code between '{' and '}' are 224, 182 and 172
which are the UTF-8 encoding for the code point 3500.
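
For anyone who wants to check those byte values without typing the character, here is a minimal sketch. It assumes only the core utf8::encode function, and that the platform's native encoding is what we are after (UTF-8 on ASCII machines; as far as I know, UTF-EBCDIC on EBCDIC builds).

```perl
use strict;
use warnings;

# Start from the character itself, code point 3500 (U+0DAC).
my $str = chr(3500);

# Convert in place from characters to the platform's native UTF-8 octets.
utf8::encode($str);

# Unpack the octets as unsigned byte values and print them.
my @bytes = unpack 'C*', $str;
print "@bytes\n";    # 224 182 172 on an ASCII/UTF-8 platform
```

Running the same script on an EBCDIC machine should show the bytes that would have to appear between '{' and '}' there.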

My question is, what are the bytes in UTF-EBCDIC that encode code point 3500?

The equivalent bytes on UTF-EBCDIC are 186, 84 and 83.

If you put those 3 bytes directly between the '{' and '}' characters in
the EBCDIC version of that 1 liner, does it also print 3500?

I cannot put those three bytes into the 1-liner you mentioned above, because I cannot produce the characters corresponding to those bytes (www.kostis.net/charsets/ebc1047.htm) on the command line.
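
One possible way around the typing problem is to let perl write the test script, so the raw bytes never have to be entered by hand. This is only a sketch: the file name bareword_3500.pl is made up for illustration, and the EBCDIC byte values are the ones reported in this thread, not verified here.

```perl
use strict;
use warnings;

# Pick the byte sequence for code point 3500 by platform.
# The EBCDIC values (186, 84, 83) are taken from this thread, unverified.
my @bytes = ord('A') == 65 ? (224, 182, 172) : (186, 84, 83);

# Hypothetical file name for the generated script.
my $file = 'bareword_3500.pl';

# Write the script in raw mode so the bytes land in the file unchanged.
open my $fh, '>:raw', $file or die "Cannot write $file: $!";
print {$fh} 'use utf8; my %h; $h{', pack('C*', @bytes), '}++; ',
            'print map { ord } keys %h; print "\n";', "\n";
close $fh or die "Cannot close $file: $!";

# Run the generated script with the same perl binary and show its output.
my $perl = $^X;
my $out  = `$perl $file`;
print $out;
```

On an ASCII platform the generated script is the same bareword test as the 1-liner, so it should print 3500. Whether it does so on EBCDIC is exactly the open question.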

> > If so, *that* would explain the failures, and be the
> > thing that needs
> > correcting. The test file would need if/else with a
> > different test on EBCDIC.
> what would you suggest be put in the if/ else ?

I think that the regression tests tended to do something like

if (ord 'A' == 65) {
# Do the ASCII/UTF-8 version
} else {
# Assume EBCDIC
}
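
Filled in for the code point discussed above, that pattern might look like the sketch below. The EBCDIC byte values are the ones quoted in this thread and are an assumption until someone confirms them on a real EBCDIC build.

```perl
use strict;
use warnings;

# Native octets for code point 3500 on whatever platform this runs on.
my $octets = chr(3500);
utf8::encode($octets);
my @got = unpack 'C*', $octets;

my @expected;
if (ord('A') == 65) {
    # ASCII/UTF-8 version
    @expected = (224, 182, 172);
} else {
    # Assume EBCDIC; values as reported in this thread, unverified
    @expected = (186, 84, 83);
}

print "@got" eq "@expected" ? "ok\n" : "not ok\n";
```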

Thanks,

Rajarshi.

Nicholas Clark
