On Tue, Jul 26, 2005 at 08:12:16AM -0700, rajarshi das wrote:

> I basically want to know if there are alternate ways
> of representing barewords (as I mentioned in question
> 2) above) ? 

No. By definition there cannot be.
You're failing to grasp what is meant by "bareword".
There is only one representation.

> Also, any pointers that you have regarding where to
> look to fix this ? 

Not much better than "in toke.c or utf8.c".

However, based on a comment I've spotted at the top of utfebcdic.h, I *think*
that the internal encoding of perl on an EBCDIC system is UTF-EBCDIC rather
than UTF-8. The byte sequence in the source file for the bareword will need
to be valid UTF-EBCDIC.

For the code points being tested ("\x{0442}\x{0435}\x{0441}\x{0442}")
does the perl source file contain the correct byte sequence in UTF-EBCDIC?

Does the byte sequence in UTF-EBCDIC for those 4 code points differ from the
byte sequence in UTF-8?

Does the source file happen to have the UTF-8 byte sequence?

If so, *that* would explain the failures, and be the thing that needs
correcting. The test file would need if/else with a different test on EBCDIC.
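
One way to answer the last question is to look at the raw bytes of the test
file. The sketch below is only an illustration, written in Python rather than
Perl; the helper names are made up for this example. It computes the UTF-8
encoding of the four code points and checks whether that byte sequence
appears in a file's contents. It does not compute the UTF-EBCDIC form, which
depends on the EBCDIC code page in use.

```python
# Code points from the test: "\x{0442}\x{0435}\x{0441}\x{0442}"
CODE_POINTS = [0x0442, 0x0435, 0x0441, 0x0442]


def utf8_bytes(code_points):
    """Return the UTF-8 encoding of the given code points."""
    return "".join(chr(cp) for cp in code_points).encode("utf-8")


def contains_utf8_bareword(source_bytes, code_points=CODE_POINTS):
    """True if the raw file bytes contain the UTF-8 form of the bareword."""
    return utf8_bytes(code_points) in source_bytes


if __name__ == "__main__":
    # Show the UTF-8 byte sequence in hex for comparison with a hex dump.
    print(utf8_bytes(CODE_POINTS).hex(" "))  # d1 82 d0 b5 d1 81 d1 82
```

To use it, read the test file in binary mode (for example
open(path, "rb").read(), where path is whatever file is under suspicion) and
pass the bytes to contains_utf8_bareword(). If it returns True on the EBCDIC
machine, the file is carrying the UTF-8 sequence.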

Nicholas Clark

