On Tue, 21 May 2002 13:18:22 +0100
Nick Ing-Simmons <[EMAIL PROTECTED]> wrote:

> Craig A. Berry <[EMAIL PROTECTED]> writes:
> >FWIW, I made a smaller reproducer:

> >$ perl vms_problem.t
> >euc-kr "\xAD" does not map to Unicode at vms_problem.t line 9, <$fh> line 230.
> 
> Good - that decouples the whole thing from previous runs etc. etc.
> Why the FileHandle and warnings bits - are they necessary to see the problem?

The following code should extract the problematic octet.
(That part is on line 229 of the file, the nearest to line 230 ;-)

#!perl
use strict;
use Encode (":all");
use Fcntl ':seek';

# Dump the 16 octets starting at offset 2**14 (16384) as a hex string.
open my $fh, '<', 'ksc5601.enc' or die "ksc5601.enc : $!";
seek $fh, 2**14, SEEK_SET or die "seek : $!";
defined read($fh, my $data, 16) or die "read : $!";
close $fh;
print unpack('H*', $data), "\n";
# says 'acf1adf1aef1aff1b0f1b1f1b2f1b3f1'.
__END__

I don't know where the magic number 2**14 comes from,
nor what it means.
Anyhow, the correct character boundaries around here are

  (f1)ac | f1ad | f1ae | f1af | f1b0 | f1b1 | ...

but, on VMS, they seem to be:

  acf1 | ad | f1ae | f1af | ...
         ^^HERE warned?
  (\xAC\xF1 is an existing euc-kr char.)
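
To make the one-octet shift easier to see, here is a minimal sketch that
groups the dumped octets into two-byte units from two different starting
points. It assumes every character in this stretch is two bytes wide and
that the octet just before offset 2**14 is \xF1, as described above.
pairs_from is a hypothetical helper written for this illustration, not part
of Encode. Note that it only shows how the pairing changes when the start
moves by one octet; it does not reproduce the lone "ad" seen on VMS.

```perl
#!perl
use strict;
use warnings;

# The 16 octets reported by the script above, as a hex string.
my $hex  = 'acf1adf1aef1aff1b0f1b1f1b2f1b3f1';
my $data = pack 'H*', $hex;

# Hypothetical helper: skip $skip octets, then group the rest into
# two-octet units (a trailing single octet is kept on its own).
sub pairs_from {
    my ($bytes, $skip) = @_;
    my $tail = substr $bytes, $skip;
    return map { unpack 'H*', $_ } $tail =~ /\G(..?)/gs;
}

# Start after the first octet: "ac" is treated as the trail byte of the
# previous character, which gives the boundaries described as correct.
my @aligned = pairs_from($data, 1);

# Start at the first octet: every pair is shifted by one octet.
my @shifted = pairs_from($data, 0);

print join(' | ', @aligned), "\n";
# f1ad | f1ae | f1af | f1b0 | f1b1 | f1b2 | f1b3 | f1
print join(' | ', @shifted), "\n";
# acf1 | adf1 | aef1 | aff1 | b0f1 | b1f1 | b2f1 | b3f1
```

The final lone "f1" in the first line is just the lead byte of a character
whose trail byte lies beyond the 16 octets that were read.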

> >I guess the next thing is to create a build for the VMS debugger and go 
> >slogging around in the vicinity of PerlIO_read, though I'm not sure exactly 
> >what I'm looking for.
> 
> We moved on a bit over weekend ...
> 
> 1. Can you grep CPP'ed perlio.c for "right" VMS hackery for ungetc() ?
> 
> 2. Can you try using :perlio layer rather than :stdio layer and 
>    see if that helps
>    on UNIX that would be 
>    PERLIO=perlio perl vms_problem.t 
> 
> 
> -- 
> Nick Ing-Simmons
> http://www.ni-s.u-net.com/
> 
> 

Regards,
SADAHIRO Tomoyuki
