found 466341 5.10.0-19
retitle 466341 support the Encode::decode CHECK argument with ISO-2022-JP
severity 466341 wishlist
thanks
On Mon, Feb 18, 2008 at 01:36:55AM -0500, Bryan Donlan wrote:
> Package: perl
> Version: 5.8.8-12
> Severity: normal
> Converting a certain sequence of ISO-2022-JP text to utf8 succeeds:
> $ perl -MEncode -e '$s="{\x1b\x24\x42\x2d)\x1b(B}"; print
> encode("utf8", decode("iso-2022-jp", $s, Encode::FB_CROAK)), "\n"'
> {⑨}
> However, converting it back to ISO-2022-JP fails:
> $ perl -MEncode -e '$s="{\x1b\x24\x42\x2d)\x1b(B}"; print
> encode("iso-2022-jp", decode("iso-2022-jp", $s, Encode::FB_CROAK)),
> "\n"'
> {\x{2468}}
> It should be noted that iconv rejects this entirely:
> $ perl -MEncode -e '$s="{\x1b\x24\x42\x2d)\x1b(B}"; print $s,
> "\n"'|iconv -f iso-2022-jp -t utf8
> {iconv: illegal input sequence at position 4
> However, if this is truly invalid iso-2022-jp, perl should croak on it, since
> FB_CROAK was passed.
It is indeed an invalid sequence; iconv is right about that. The original
JIS-C-6226 (aka JIS X 0208) standard can be found at e.g. [1], and it
does not contain 0x2d 0x29, which is the two-byte sequence embedded in your
iso-2022-jp coded example.
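
To locate those two bytes in the code chart, the usual arithmetic can be sketched as follows. This assumes the 7-bit form selected by ESC $ B, where each byte minus 0x20 gives the row and cell number; the helper name kuten is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map a two-byte code in 7-bit JIS form
# (each byte in 0x21..0x7e) to its row and cell numbers by
# subtracting 0x20 from each byte.
sub kuten {
    my ($b1, $b2) = @_;
    return ($b1 - 0x20, $b2 - 0x20);
}

my ($row, $cell) = kuten(0x2d, 0x29);
print "row $row, cell $cell\n";   # prints "row 13, cell 9"
```

As far as I know, row 13 is one of the rows the standard leaves unassigned, and some vendor extensions place circled digits there, which would be consistent with the ⑨ that Encode produces.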
The bug here seems to be that the corresponding Encode module ignores
the CHECK argument. The Encode documentation states:
  NOTE: Not all encoding support this feature
  Some encodings ignore CHECK argument. For example, Encode::Unicode
  ignores CHECK and it always croaks on error.
So I'm lowering the severity.
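
For anyone who wants to check whether a given encoding honours CHECK on their installation, here is a minimal sketch. The helper name decodes_cleanly is hypothetical; it wraps decode() in eval and reports whether FB_CROAK actually made it croak. LEAVE_SRC is added so the input bytes are not modified. No outcome is asserted for iso-2022-jp, since that depends on the Encode version in use.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return 1 if decode() accepts $bytes under
# FB_CROAK, 0 if it croaks.
sub decodes_cleanly {
    my ($encoding, $bytes) = @_;   # work on a copy of the caller's data
    my $ok = eval {
        decode($encoding, $bytes,
               Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}

# The byte sequence from the report: ESC $ B, 0x2d 0x29, ESC ( B.
my $sample = "{\x1b\x24\x42\x2d\x29\x1b(B}";

printf "iso-2022-jp sample: %s\n",
    decodes_cleanly('iso-2022-jp', $sample) ? 'accepted' : 'croaked';

# For comparison, a stray 0xFF byte decoded as strict UTF-8.
printf "UTF-8 stray 0xFF:   %s\n",
    decodes_cleanly('UTF-8', "\xff") ? 'accepted' : 'croaked';
```

If the first line reports "accepted", CHECK is being ignored for that encoding and the input has to be validated by other means, e.g. with iconv as in the report.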
[1] http://www.itscj.ipsj.or.jp/ISO-IR/087.pdf
Cheers,
--
Niko Tyni nt...@debian.org