On Mon, 02 Feb 2004 20:46:20 -0500
Sam Varshavchik <[EMAIL PROTECTED]> wrote:

> IKEDA Soji writes:
> 
> > unicode.c:
> >   Error handling by unicode_xconvert() breaks string using multibyte
> >   (e.g. EUC-JP) and/or stateful (e.g. ISO-2022-*) encoding schemes.
> > 
> >   I modified unicode_xconvert() so that it lets converter functions
> >   handle conversion errors and then (if possible) retry to handle
> >   errors by itself.  I'm not sure this behaviour is suitable for 
> >   all implementation of charsets.
> 
> That's not going to work.  I don't recall offhand what the reason for the 
> current logic is, but there was a specific reason I did this.
> 
> If you really need to do it this way, then what you can do is take advantage 
> of struct unicode_info.flags, and define a new flag, then check this flag in 
> unicode_xconvert, and do it this way.
> 
> This way you're only touching your own stuff.

I added a new flag, UNICODE_REPLACEABLE.  If this flag is set:

  o The error pointer passed to a conversion function (c2u/u2c) may
    be NULL.

  o When the error pointer is NULL, the conversion function must
    replace each unmappable character/octet with a suitable
    placeholder and continue the conversion, instead of reporting
    the error through the pointer and returning NULL.

    - For a c2u function, the placeholder is U+FFFD REPLACEMENT
      CHARACTER.

    - For a u2c function, the placeholder is a 1-column character,
      e.g. "?" (since wcwidth() returns 1 or -1 for REPLACEMENT
      CHARACTER).
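
As a rough illustration of the rules above (this is not the code in
the patch), here is a minimal sketch of a US-ASCII converter pair that
honours a NULL error pointer.  The names sketch_uchar, sketch_c2u and
sketch_u2c are hypothetical, the signatures are simplified (no
struct unicode_info argument), and the convention of storing the
offending index in *err is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the library's Unicode character type. */
typedef unsigned long sketch_uchar;

#define SKETCH_REPLACEMENT 0xFFFDUL   /* U+FFFD REPLACEMENT CHARACTER */

/*
 * c2u sketch: US-ASCII octets to a 0-terminated Unicode array.
 * err != NULL: on an unmappable octet, store its index in *err and
 *              return NULL (assumed convention, see the lead-in).
 * err == NULL: substitute U+FFFD and keep converting.
 */
sketch_uchar *sketch_c2u(const char *src, int *err)
{
    size_t n = 0, i;
    sketch_uchar *out;

    assert(src != NULL);
    while (src[n])
        n++;
    if (err)
        *err = -1;                    /* -1: no offending position */
    out = malloc((n + 1) * sizeof(*out));
    if (!out)
        return NULL;
    for (i = 0; i < n; i++) {
        unsigned char c = (unsigned char)src[i];

        if (c < 0x80) {
            out[i] = c;
        } else if (err) {
            *err = (int)i;
            free(out);
            return NULL;
        } else {
            out[i] = SKETCH_REPLACEMENT;
        }
    }
    out[n] = 0;
    return out;
}

/*
 * u2c sketch: 0-terminated Unicode array to US-ASCII octets.
 * With err == NULL, an unmappable character becomes "?", which
 * occupies one column.
 */
char *sketch_u2c(const sketch_uchar *src, int *err)
{
    size_t n = 0, i;
    char *out;

    assert(src != NULL);
    while (src[n])
        n++;
    if (err)
        *err = -1;
    out = malloc(n + 1);
    if (!out)
        return NULL;
    for (i = 0; i < n; i++) {
        if (src[i] < 0x80) {
            out[i] = (char)src[i];
        } else if (err) {
            *err = (int)i;
            free(out);
            return NULL;
        } else {
            out[i] = '?';
        }
    }
    out[n] = '\0';
    return out;
}
```

A caller such as unicode_xconvert() would pass NULL only when the
charset's flags include UNICODE_REPLACEABLE, and would otherwise keep
passing a real error pointer and handle the failure itself.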

The modified patch is attached.


Other changes:
  o Fix an incorrect code range for ISO-2022-KR.


  --- nezumi
  

Attachment: sqwebmail-cvs20040203-unicode_ja-1.1.1.patch.gz
Description: GNU Zip compressed data
