On Tue, 21 May 2002 13:18:22 +0100
Nick Ing-Simmons <[EMAIL PROTECTED]> wrote:
> Craig A. Berry <[EMAIL PROTECTED]> writes:
> >FWIW, I made a smaller reproducer:
> >$ perl vms_problem.t
> >euc-kr "\xAD" does not map to Unicode at vms_problem.t line 9, <$fh> line 230.
>
> Good - that decouples the whole thing from previous runs etc. etc.
> Why the FileHandle and warnings bits - are they necessary to see the problem?
The following code should extract the problematic octet.
(It sits on line 229 of the data file, right next to the reported line 230 ;-)
#!perl
use strict;
use Encode (":all");
use Fcntl ':seek';
# Read 16 raw bytes starting at offset 2**14 (16384) of the test data.
open my $fh, '<', 'ksc5601.enc' or die "ksc5601.enc : $!";
seek $fh, 2**14, SEEK_SET;
read($fh, my $data, 16);
close $fh;
# Dump the bytes as hex.
print unpack('H*', $data), "\n";
# prints 'acf1adf1aef1aff1b0f1b1f1b2f1b3f1'.
__END__
I don't know why the offset happens to be the magic number 2**14 (16384),
or what that signifies.
In any case, the correct character boundaries around here are
  (f1)ac | f1ad | f1ae | f1af | f1b0 | f1b1 | ...
but on VMS they seem to be:
  acf1 | ad | f1ae | f1af | ...
         ^^ is this where it warned?
(\xAC\xF1 is an existing euc-kr char, so only the lone \xAD fails to map.)
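
To illustrate how the boundaries shift when decoding starts one byte off,
here is a minimal sketch. split_euc is a hypothetical helper, not part of
Encode, and it assumes a simplified rule: any two consecutive bytes in
0xA1..0xFE form one character, and anything else is a single unit. It does
not reproduce the VMS behaviour shown above, where one byte appears to be
consumed alone before decoding resynchronizes; it only shows plain
misalignment.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: split a byte string into EUC-style units.
# Assumption: two consecutive bytes in 0xA1..0xFE form one character;
# any other byte is reported as a single unit.
sub split_euc {
    my ($bytes) = @_;
    my @units;
    while (length $bytes) {
        if ($bytes =~ s/\A([\xA1-\xFE][\xA1-\xFE])//) {
            push @units, unpack('H*', $1);    # complete two-byte unit
        }
        else {
            $bytes =~ s/\A(.)//s;
            push @units, unpack('H*', $1);    # single byte (ASCII or stray)
        }
    }
    return join ' | ', @units;
}

my $aligned = pack('H*', 'f1acf1adf1aef1af');
print split_euc($aligned), "\n";               # f1ac | f1ad | f1ae | f1af
print split_euc(substr($aligned, 1)), "\n";    # acf1 | adf1 | aef1 | af
```

With a pure one-byte shift every pair is wrong, whereas the VMS output
recovers after the lone "ad", which may suggest a single octet being split
off rather than a wrong starting offset.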
> >I guess the next thing is to create a build for the VMS debugger and go
> >slogging around in the vicinity of PerlIO_read, though I'm not sure exactly
> >what I'm looking for.
>
> We moved on a bit over weekend ...
>
> 1. Can you grep CPP'ed perlio.c for "right" VMS hackery for ungetc() ?
>
> 2. Can you try using :perlio layer rather than :stdio layer and
> see if that helps
> on UNIX that would be
> PERLIO=perlio perl vms_problem.t
>
>
> --
> Nick Ing-Simmons
> http://www.ni-s.u-net.com/
>
>
Regards,
SADAHIRO Tomoyuki