At 10:54 PM 5/21/2002 +0900, SADAHIRO Tomoyuki wrote:
>This code should extract that problematic octet.
>(This part is at line 229, nearest to line 230 ;-)
>
>#!perl
>use strict;
>use Encode (":all");
>use Fcntl ':seek';
>
>open my $fh, '<', 'ksc5601.enc' or die "ksc5601.enc : $!";
>seek $fh, 2**14, SEEK_SET;
>read($fh, my $data, 16);
>close $fh;
>print unpack('H*', $data), "\n";
># says 'acf1adf1aef1aff1b0f1b1f1b2f1b3f1'.
>__END__
>
>I don't know where the magic number 2**14 comes from
>or what it means.
>Anyhow, the correct character boundaries around here are
>
>  (f1)ac | f1ad | f1ae | f1af | f1b0 | f1b1 | ...
>
>but, on VMS, they seem to be:
>
>  acf1 | ad | f1ae | f1af | ...
>         ^^HERE warned?
>  (\xAC\xF1 is an existing euc-kr char.)


Many thanks.  When I run this I actually get the correct output
(acf1adf1aef1aff1b0f1b1f1b2f1b3f1) and no warnings.  I think that confirms
the trouble is not in parsing some particular legitimate byte sequence.
More likely a buffer is getting corrupted, or we are losing our reference
point in it, or separate I/O operations are each getting one half of a
single character.
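
To illustrate the last possibility (one character split across two reads),
here is a minimal sketch, not the code under test.  It assumes Encode's
euc-kr decoder and the FB_QUIET check value, which stops at octets it
cannot decode and leaves them in the source buffer.  The helper name
decode_chunks and the split offset of 5 are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode octets that arrive in arbitrary-sized
# pieces, carrying an incomplete trailing character over to the next piece.
sub decode_chunks {
    my ($encoding, @chunks) = @_;
    my ($pending, $text) = ('', '');
    for my $chunk (@chunks) {
        $pending .= $chunk;
        # FB_QUIET: decode as far as possible; the octets that could not
        # be decoded yet remain in $pending for the next round.
        $text .= decode($encoding, $pending, Encode::FB_QUIET());
    }
    return ($text, $pending);
}

# Eight two-octet characters, aligned on the boundaries given above.
my $octets = pack 'H*', 'f1acf1adf1aef1aff1b0f1b1f1b2f1b3';

# Reference: decode everything in one call (on a copy of the octets).
my $copy  = $octets;
my $whole = decode('euc-kr', $copy);

# Split after 5 octets, i.e. in the middle of the third character.
my ($text, $left) = decode_chunks('euc-kr',
                                  substr($octets, 0, 5),
                                  substr($octets, 5));

print $text eq $whole ? "same result\n" : "DIFFERENT result\n";
print length($left), " octet(s) left undecoded\n";
```

A reader that decodes each piece independently, without carrying the
leftover octet forward, would instead see a lone \xF1 at the end of one
piece and start the next piece on a trail byte, which is the kind of
misalignment shown in the quoted boundaries.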

Also, if I create a file containing only lines 229-231 of ksc5601.enc and
read from that, I do not get the warning that I get when reading the
entire file.
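
For anyone who wants to repeat that, the excerpt can be produced along
these lines.  This is only a sketch: copy_line_range is a made-up helper,
and the demonstration runs on an in-memory stand-in rather than on
ksc5601.enc itself.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines $first..$last (1-based, inclusive)
# from $in to $out without modifying them.
sub copy_line_range {
    my ($in, $out, $first, $last) = @_;
    my $n = 0;
    while (my $line = <$in>) {
        $n++;
        next if $n < $first;
        last if $n > $last;
        print {$out} $line;
    }
    return;
}

# Stand-in input: 300 numbered lines held in a scalar.
my $source = join '', map { "line $_\n" } 1 .. 300;
open my $in, '<', \$source or die "in-memory read: $!";

my $excerpt = '';
open my $out, '>', \$excerpt or die "in-memory write: $!";

# Treat both handles as raw octets so nothing is translated on the way.
binmode $_ for $in, $out;

copy_line_range($in, $out, 229, 231);
close $out;
close $in;

print $excerpt;
```

With real files, the two in-memory handles would be replaced by ordinary
open calls on the source and the excerpt file.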
