Hello Jörg, all,

Joerg Schilling wrote in
 <20201210004945.i3n8e%sch...@schily.net>:
 |Steffen Nurpmeso <stef...@sdaoden.eu> wrote:
 |> this is an iconv(3)-related error that was fixed in later version
 |> of the mailer you use.  The very error came up on the ML this
 |> year[1], basically you use LATIN1 on your box, as could be
 |> expected, but Thorsten is known to be a Unicode character
 |> "junkie", so to say.
 |
 |You are correct,

Yep -- unfortunately.

 |I have been able to save the mail as file and to run iconv(1) on the \
 |content.

Yes, for a while we did not restart after ILSEQ.  If your prompt
had included the error name, for example via
"set prompt='\${^ERRNAME}'", you would have seen that an error
happened.
But of course we are tolerant of weird base64, so we should be
tolerant of weird iconv, thus i "restored the original
behaviour", so to say.

That reminds me of iconv weirdness regarding replacement
characters, which makes testing really hard.  Wasn't there an
issue going on about that?  Being able to specify the replacement
character explicitly, and whether it stands for an entire
character or for each byte of a sequence, would be a great
improvement.
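
To make the "entire character versus by-byte" distinction concrete,
here is a minimal sketch of a caller-side fallback.  It assumes UTF-8
input and a stateless target charset; conv_replace() is a hypothetical
helper, not part of iconv(3) or of any mailer.  On implementations
that substitute on their own the fallback path simply never runs.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the UTF-8 string `in` to `tocode` and
 * store a NUL-terminated result in `out`.  Whenever iconv(3) stops with
 * EILSEQ or EINVAL, exactly one `repl` byte is emitted for the whole
 * offending character.  Returns the number of replacements made by this
 * helper, or -1 on any other failure (including a too small `out`). */
static int conv_replace(const char *tocode, const char *in,
                        char *out, size_t outsz, char repl)
{
    char *ip = (char *)in, *op = out;
    size_t il = strlen(in), ol;
    int nrepl = 0;
    iconv_t cd;

    assert(out != NULL && outsz > 0);
    ol = outsz - 1;                     /* reserve room for the NUL */

    if ((cd = iconv_open(tocode, "UTF-8")) == (iconv_t)-1)
        return -1;

    while (il > 0) {
        if (iconv(cd, &ip, &il, &op, &ol) != (size_t)-1)
            break;                      /* everything was converted */
        if ((errno != EILSEQ && errno != EINVAL) || ol == 0) {
            nrepl = -1;                 /* E2BIG or unexpected error */
            break;
        }
        /* Skip the lead byte plus any UTF-8 continuation bytes
         * (10xxxxxx), so the character is replaced as a whole. */
        do {
            ++ip;
            --il;
        } while (il > 0 && ((unsigned char)*ip & 0xC0) == 0x80);
        *op++ = repl;
        --ol;
        ++nrepl;
    }
    *op = '\0';
    iconv_close(cd);
    return nrepl;
}
```

Called as conv_replace("ASCII", "a\303\244c", buf, sizeof buf, '?')
this yields a three byte result.  A by-byte variant would drop the
inner loop and advance by a single byte per replacement instead.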

While talking about iconv, my glibc bug[1] got closed as "resolved
invalid", but wouldn't you all agree that in the following

  #include <stdio.h>
  #include <string.h>
  #include <iconv.h>
  #include <errno.h>
  int main(void){
     char inb[16], oub[16], *inbp, *oubp;
     iconv_t id;
     size_t inl, oul;

     /* "a", U+00E4 as two UTF-8 bytes, "c"; inl excludes the NUL */
     memcpy(inbp = inb, "a\303\244c", sizeof("a\303\244c"));
     inl = sizeof("a\303\244c") -1;
     oul = sizeof oub;
     oubp = oub;

     if((id = iconv_open("ascii", "utf8")) == (iconv_t)-1)
        return 1;
     fprintf(stderr, "Converting %lu <%s>\n",(unsigned long)inl, inbp);
     if(iconv(id, &inbp, &inl, &oubp, &oul) == (size_t)-1){
        fprintf(stderr, "Fail <%s>\n", strerror(errno));
        return 2;
     }
     fprintf(stderr, "GOT <%s>\n", oub);
     iconv_close(id);
     return 0;
  }

you should get replacement characters out of the box?
As i said back then, glibc gives

   $ /tmp/zt
   Converting 4 <aäc>
   Fail <Invalid or incomplete multibyte or wide character>

whereas musl gives

   $ ./zt
   Converting 4 <aäc>
   GOT <a*c>

and i still think musl is totally right (also by giving only one
replacement character).

  [1] https://sourceware.org/bugzilla/show_bug.cgi?id=22908

 |Maybe a problem is that the first missing line is a line with a character \
 |that
 |is not part of ISO-8859-1

Yes, transliteration should ideally be possible.
On the other hand, if i change the above to

   if((id = iconv_open("ascii//TRANSLIT", "utf8")) == (iconv_t)-1)

i get

  Converting 4 <aäc>
  GOT <a?cFU>

and with

   if((id = iconv_open("ascii//TRANSLIT", "utf8")) == (iconv_t)-1)

we are back at the error.
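
A side note on the trailing "FU" in the TRANSLIT output: this is most
likely not produced by iconv at all.  The test program passes only
the four input bytes without the terminating NUL, so oub is never
NUL-terminated and "%s" prints whatever follows on the stack.  A
minimal sketch that terminates the output from the number of bytes
actually written; conv_terminated() is a hypothetical helper name.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: one-shot conversion of the string `in` that
 * NUL-terminates `out` right after the last byte iconv(3) wrote.
 * Returns the number of output bytes, or (size_t)-1 on error. */
static size_t conv_terminated(const char *tocode, const char *fromcode,
                              const char *in, char *out, size_t outsz)
{
    char *ip = (char *)in, *op = out;
    size_t il = strlen(in), ol, rv;
    iconv_t cd;

    assert(out != NULL && outsz > 0);
    ol = outsz - 1;                     /* reserve room for the NUL */

    if ((cd = iconv_open(tocode, fromcode)) == (iconv_t)-1)
        return (size_t)-1;
    rv = iconv(cd, &ip, &il, &op, &ol);
    iconv_close(cd);

    *op = '\0';                         /* op points past the output */
    return rv == (size_t)-1 ? rv : (size_t)(op - out);
}
```

Even with a buffer that was filled with garbage beforehand, the
result then ends where the converted text ends.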

Ciao,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
