On 4/15/05, Rajarshi Das <[EMAIL PROTECTED]> wrote:
> In a script testing barewords, the character 'tau' displays when opened
> using vi editor on linux. OTOH, the same character doesn't display on z/OS
> and shows as ^69^22 in the vi editor.
> The failing line in the script is :
> %hash = (^69^22.... => 123)
> 
> Perl (5.8.6) complains when it reads thus :
> "Unrecognized character \x69".
> 
> If I just write \x69\x22, perl doesn't understand that I am implying the
> character 'tau'.
> 
> Is there a way to display the 'tau' character as a bareword, without
> interpreting it as "\x69\x22", on an EBCDIC platform ?
> 
> Thanks for all the help,
> Rajarshi.
> 
> >From: Chris Devers <[EMAIL PROTECTED]>
> >Reply-To: Perl Beginners List <beginners@perl.org>
> >To: Rajarshi Das <[EMAIL PROTECTED]>
> >CC: Perl Beginners List <beginners@perl.org>
> >Subject: Re: what are utf8 barewords.?
> >Date: Thu, 14 Apr 2005 10:27:13 -0400 (EDT)
> >
> >On Thu, 14 Apr 2005, Rajarshi Das wrote:
> >
> > > Barewords according to perldata.pod are "words that do not have any
> > > other meaning in grammar".
> > >
> > > 1) So, does this mean that any word which is not reserved is a
> > > bareword ?
> >
> >Off the top of my head, every "token" of text in Perl is either:
> >
> >  * an operator: +, *, =~, s///, ..
> >
> >  * a built-in function: chomp(), map(), grep()
> >
> >  * an imported function or method from a module: $cgi->param()
> >
> >  * a user defined subroutine or method: do_stuff_with()
> >
> >  * a string: "including" qw{ things like }, qq[ this ], 'or', "this"
> >
> >  * a bareword: FILEHANDLE, etc
> >
> >I may have missed a class or two, but that's most of them.
> >
> > > 2) What exactly would be a utf8 bareword ? Is it any utf8 encoded
> > > character ?
> >
> >A non- operator / function / method / subroutine / string that includes
> >one or more UTF8 characters.
> >
> > > Any examples ?
> > > Would "\x69\x22" qualify as a utf8 bareword ?
> >
> >Well, if used exactly as you have it there, it's a string, because it's
> >wrapped in double quotes. If you just had
> >
> >     \x69\x22
> >
> >by itself, then yes, it would be a bareword.
> >
> >
> >
> >--
> >Chris Devers

Simply put, it's a bareword because you didn't double-quote it.
You're also not using Perl's Unicode notation; that's part of the
issue. The other part is the encoding. Right now your two editors are
using different encodings, and it looks like neither of them is
Unicode: 'GREEK SMALL LETTER TAU' is U+03C4 and 'GREEK CAPITAL LETTER
TAU' is U+03A4. This isn't Perl's fault. To make these systems talk to
each other, you're going to have to explicitly give the code points
for non-ASCII characters instead of relying on your editor, which will
give you EBCDIC code points on the one hand and, probably,
ISO-Latin-something on the other. So, taking the capital letter as an
example:

   %hash = ("\x{03a4}" => 123);
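
As a slightly fuller sketch of the same idea (the variable names and
the second key are only for illustration, and I have only thought this
through for an ASCII-based platform), you can name the character by
code point or by its Unicode name, so the source file itself stays
plain ASCII:

```perl
use strict;
use warnings;
use charnames ':full';    # enables \N{NAME}; required explicitly on 5.8

# The keys are written as escapes, so nothing here depends on the
# encoding the editor happens to use.
my %hash = (
    "\x{03c4}"                     => 123,    # GREEK SMALL LETTER TAU
    "\N{GREEK CAPITAL LETTER TAU}" => 456,    # same character as "\x{03a4}"
);

# chr() with a code point above 255 gives the same Unicode character.
my $small_tau = chr(0x03c4);

print "small tau found\n"   if $hash{$small_tau} == 123;
print "capital tau found\n" if $hash{"\x{03a4}"} == 456;
```

The quotes matter: the \x{...} and \N{...} escapes are only
interpreted inside double-quoted strings.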

When it comes to saving and printing, particularly on EBCDIC systems,
there are other issues, but take a look through

perldoc perluniintro
perldoc perlunicode
perldoc perlebcdic
perldoc Encode
perldoc PerlIO
perldoc utf8

I know that looks like a long list, but maintaining encodings across
different platforms is tricky once you get into extended character
sets, because it's not enough for Perl to get it right; you also have
to be able to print it to the screen without getting "Wide character
in ..." warnings.
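
A minimal sketch of the two usual ways to avoid that warning, assuming
the output should be UTF-8 (that choice is my assumption; on z/OS you
would more likely want an EBCDIC code page, and perlebcdic and the
Encode docs are the place to check which names are supported):

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $tau = "\x{03c4}";    # one character, code point above 255

# Option 1: encode the string to bytes yourself before writing it out.
my $bytes = encode('UTF-8', $tau);
printf "characters: %d, bytes: %d\n", length($tau), length($bytes);

# Option 2: put an encoding layer on the handle and let PerlIO do the
# encoding. Without a layer like this, printing $tau directly is what
# triggers "Wide character in print".
binmode(STDOUT, ':encoding(UTF-8)');
print "tau is: $tau\n";
```

Pick one approach per filehandle; encoding by hand and also pushing a
layer will encode the data twice.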

HTH,

--jay
