Package: libc6
Version: 2.3.6-15
Severity: important

Hi,

It seems that iconv() returns -1 and sets errno to EILSEQ on valid
input that it cannot convert to the output encoding.  It shouldn't be
doing that, since the input is valid.

This can be shown simply with the iconv utility, since it behaves
the same way.  A simple latin1 file:
$ cat test.txt
tést
$ iconv -f latin1 -t ASCII test.txt > /dev/null
iconv: illegal input sequence at position 1
$ iconv -f latin1 -t UTF-8 test.txt > /dev/null
$ 
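
The same check can be made directly against the library call.  The
following is only a minimal sketch: try_iconv is a hypothetical helper
name, and "ISO-8859-1" is used as the portable spelling of latin1.  It
returns 0 when iconv() succeeds and the errno value otherwise.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: run a single iconv() call over src and report
 * 0 on success, or the errno value set by iconv_open()/iconv(). */
int try_iconv(const char *from, const char *to,
              const char *src, size_t src_len)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return errno;

    char buf[64];                 /* large enough for short test input */
    char *in = (char *)src;       /* iconv() takes char **, input is not modified */
    char *out = buf;
    size_t in_left = src_len;
    size_t out_left = sizeof buf;

    int err = 0;
    if (iconv(cd, &in, &in_left, &out, &out_left) == (size_t)-1)
        err = errno;              /* save errno before iconv_close() can change it */

    iconv_close(cd);
    return err;
}
```

With the behaviour described here, try_iconv("ISO-8859-1", "ASCII",
"t\xe9st", 4) would be expected to return EILSEQ, while the same call
with "UTF-8" as the target returns 0.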

From the manpage:
       EILSEQ An invalid multibyte sequence has been encountered in the input.

From Single Unix Specification 3:
   [EILSEQ]
          Input conversion stopped due to an input byte that does not
          belong to the input codeset.

It also says:
     If iconv() encounters a character in the input buffer that is
     valid, but for which an identical character does not exist in the
     target codeset, iconv() shall perform an implementation-defined
     conversion on this character.

Instead of doing an "implementation-defined conversion", it
returns an error and says the input is invalid, while the
input is clearly valid.  I would prefer that it actually
follow the standard and do some conversion, even if it just
turns the character into a '?' or something.
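
To illustrate the kind of fallback meant here, this is a rough sketch of
what a caller can do today as a workaround.  It is not how glibc
implements anything.  convert_with_fallback is a hypothetical helper; it
assumes a single-byte input encoding (so skipping one byte skips one
character) and an ASCII-compatible, stateless output encoding (so a
literal '?' byte is valid output).

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert src (src_len bytes in encoding `from`)
 * to encoding `to`, writing a NUL-terminated result into dst.
 * Whenever iconv() fails with EILSEQ, one input byte is skipped and
 * '?' is written in its place.
 * Returns the number of substitutions, or -1 on any other failure. */
int convert_with_fallback(const char *from, const char *to,
                          const char *src, size_t src_len,
                          char *dst, size_t dst_size)
{
    assert(dst != NULL && dst_size > 0);

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    char *in = (char *)src;          /* input bytes are not modified */
    size_t in_left = src_len;
    char *out = dst;
    size_t out_left = dst_size - 1;  /* reserve room for the NUL */
    int substitutions = 0;

    while (in_left > 0) {
        if (iconv(cd, &in, &in_left, &out, &out_left) != (size_t)-1)
            break;                   /* all remaining input converted */

        if (errno != EILSEQ || out_left == 0) {
            iconv_close(cd);         /* E2BIG, EINVAL, or no room for '?' */
            return -1;
        }

        /* `in` points at the offending byte: replace it and move on. */
        *out++ = '?';
        out_left--;
        in++;
        in_left--;
        substitutions++;
    }

    *out = '\0';
    iconv_close(cd);
    return substitutions;
}
```

Called with "ISO-8859-1", "ASCII" and the four bytes "t\xe9st", this
would be expected to produce "t?st" on a libc that reports EILSEQ as
described above.  On an implementation that already performs its own
substitution, the loop body after the iconv() call is simply never
reached.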


Kurt

