On 22 Feb 2005 at 10:46, Schupp Roderich (extern) Com MD PD SWP 2 wrote:

> > I am impressed with pp and PAR, however I have come across a 
> > problem concerning utf-8 data. The script uni2hex.pl 
> > recognizes the utf-8 characters when run on rh9 using their 
> > perl. The output is given in uni2hex.pl_output. The problem 
> > is that after running pp the resulting executable ran in the 
> > same directory in the same machine does not recognize the 
> > utf-8 characters but splits them into there ansii parts. The 
> > output from a.out is given in a.out_output.
> > ...                                                                      
> > This may be because of something different about the setup of 
> > perl on rh9, if so it would be useful to be able to include 
> > this feature in executables made by pp. 
> 
> I can confirm that behaviour on Debian Sid (Perl 5.8.4),
> so it's not a Redhat specific problem. I had to set
> PERL_UNICODE=CSL in the environment and also set LANG
> to a utf8 locale in order for uni2hex.pl to produce the
> expected output, though. The pp generated executable would
> not recognize utf8 characters even with these settings.
> 
> Cheers, Roderich 

I'm running Win32 and I am not strong on Unicode in Perl. I haven't had any
luck getting uni2hex.pl to run with utf8 behavior by default, no matter what
I try for environment variables.

However, I have gotten the same results as uni2hex.pl_output by explicitly
stating utf8 in the code:

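    my $file = 'test_utf8_data';
    # Decode the input as UTF-8 so Perl sees characters, not bytes.
    open(TEST, "<:utf8", $file) or die("cannot open $file: $!");
    # Encode output as UTF-8; :raw drops the Win32 crlf layer.
    binmode STDOUT, ':raw :utf8';
    while (<TEST>) {
        my $cur = $_;
        chomp $cur;
        my $testl = length($cur);
        print "$cur is $testl long\n";
        # Split into single characters and print each code point in hex.
        my @letters = split //, $cur;
        for my $loop (0 .. $#letters) {
            my $try1    = $letters[$loop];
            my $number1 = ord($try1);
            my $hex     = sprintf "%x", $number1;
            print "$try1 is $hex\n";
        }
    }
    close(TEST);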

It works for me on Win32 by itself or as a pp packed app, no environment
variables needed. Note that I used ':raw' in:

    binmode STDOUT, ':raw :utf8';
        
only to eliminate the Win32 default of crlf, for comparison to
uni2hex.pl_output.
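
To illustrate what the explicit ':utf8' layer changes, here is a minimal
self-contained sketch. The scratch file name layer_demo.txt and the choice of
U+20AC as sample data are assumptions made for the demonstration only; they
are not taken from uni2hex.pl or test_utf8_data. It reads the same three
bytes once without and once with the decoding layer:

```perl
use strict;
use warnings;

# Hypothetical scratch file, used only for this demonstration.
my $file = 'layer_demo.txt';

# Write the three UTF-8 bytes of U+20AC (EURO SIGN) plus a newline.
open(my $out, '>:raw', $file) or die "cannot write $file: $!";
print {$out} "\xE2\x82\xAC\n";
close($out);

# Without a decoding layer, Perl sees three separate bytes.
open(my $raw, '<:raw', $file) or die "cannot read $file: $!";
my $bytes = <$raw>;
close($raw);
chomp $bytes;

# With :utf8, the same bytes are decoded into one character.
open(my $dec, '<:utf8', $file) or die "cannot read $file: $!";
my $chars = <$dec>;
close($dec);
chomp $chars;

printf "raw:  length %d, first ord %x\n", length($bytes), ord($bytes);
printf "utf8: length %d, first ord %x\n", length($chars), ord($chars);

unlink $file;
```

The raw read reports a length of 3 starting with e2, while the ':utf8' read
reports a length of 1 with code point 20ac. The first case matches the
"split into parts" symptom described for the pp generated executable.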

If I could get this to default under Win32, perhaps I could figure out what's
missing when pp packed.
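
One way to get a utf8 default from inside the script, rather than from the
environment, may be the standard 'open' pragma. The sketch below is an
assumption about what could help, not something confirmed in this thread, and
it is untested under pp. The scratch file name open_pragma_demo.txt and the
sample character U+00E9 are hypothetical:

```perl
use strict;
use warnings;

# Make :utf8 the default layer for open() calls in this lexical scope;
# :std applies the same layer to STDIN, STDOUT and STDERR.
use open qw(:std :utf8);

# Hypothetical scratch file, used only for this demonstration.
my $file = 'open_pragma_demo.txt';

# An explicit layer still overrides the pragma default: write raw bytes.
open(my $out, '>:raw', $file) or die "cannot write $file: $!";
print {$out} "\xC3\xA9\n";    # UTF-8 bytes of U+00E9
close($out);

# No layer given here, so the pragma's :utf8 default is used.
open(my $in, '<', $file) or die "cannot read $file: $!";
my $line = <$in>;
close($in);
chomp $line;

printf "length %d, ord %x\n", length($line), ord($line);

unlink $file;
```

If the default takes effect, this prints a length of 1 and code point e9
without any ':utf8' in the reading open() call.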

Alan

