On Tue, Jul 26, 2005 at 08:12:16AM -0700, rajarshi das wrote:
> I basically want to know if there are alternate ways
> of representing barewords (as I mentioned in question
> 2) above) ?
No. By definition there cannot be.
You're failing to grasp what is meant by "bareword".
There is only one representation.
> Also, any pointers that you have regarding where to
> look to fix this ?
Nothing much better than "in toke.c or utf8.c".
However, based on a comment I've spotted at the top of utfebcdic.h, I *think*
that the internal encoding of perl on an EBCDIC system is UTF-EBCDIC rather
than UTF-8. The byte sequence in the source file for the bareword will need
to be valid UTF-EBCDIC.
For the code points being tested ("\x{0442}\x{0435}\x{0441}\x{0442}"),
does the perl source file contain the correct byte sequence in UTF-EBCDIC?
Does the byte sequence in UTF-EBCDIC for those 4 code points differ from the
byte sequence in UTF-8?
Does the source file happen to have the UTF-8 byte sequence?
If so, *that* would explain the failures, and be the thing that needs
correcting. The test file would need an if/else with a different test on EBCDIC.
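
To make the byte-sequence question concrete, here is a rough sketch (in Python,
not taken from the perl source) that prints the UTF-8 bytes for those 4 code
points next to the intermediate "I8" form that, as I understand Unicode
Technical Report #16, UTF-EBCDIC is built on. The helper names utf8_hex,
i8_bytes and i8_hex are made up for illustration. The final step of
UTF-EBCDIC, mapping each I8 byte through a code-page-specific table, is
deliberately left out, so these are not the actual bytes you would see on
disk; the point is only that the two encodings differ in length and content.

```python
# Sketch only: compares UTF-8 with the I8 intermediate form of UTF-EBCDIC.
# The I8 bit layout below is an assumption based on my reading of UTR #16;
# the last UTF-EBCDIC step (I8 byte -> EBCDIC byte table) is NOT applied.

TEXT = "\u0442\u0435\u0441\u0442"  # the 4 code points from the failing test


def utf8_hex(text):
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))


def i8_bytes(cp):
    """Return the I8 byte sequence for one code point in U+0000..U+3FFF."""
    if cp < 0xA0:
        # Single byte, value unchanged
        return bytes([cp])
    if cp < 0x400:
        # 110yyyyy 101zzzzz
        return bytes([0xC0 | (cp >> 5), 0xA0 | (cp & 0x1F)])
    if cp < 0x4000:
        # 1110xxxx 101yyyyy 101zzzzz
        return bytes([
            0xE0 | (cp >> 10),
            0xA0 | ((cp >> 5) & 0x1F),
            0xA0 | (cp & 0x1F),
        ])
    raise ValueError("this sketch only covers code points below U+4000")


def i8_hex(text):
    """Return the I8 form of text as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for ch in text for b in i8_bytes(ord(ch)))


if __name__ == "__main__":
    print("UTF-8:", utf8_hex(TEXT))
    print("I8   :", i8_hex(TEXT))
```

If that reading of the report is right, each of these Cyrillic letters takes 2
bytes in UTF-8 but 3 bytes in I8 (and hence in UTF-EBCDIC), so a source file
carrying the UTF-8 bytes could not be valid UTF-EBCDIC.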
Nicholas Clark