Re: [bug-libunistring] roundtrippable encoding support

Ben Pfaff Fri, 10 Oct 2014 08:47:57 -0700

On Thu, Oct 09, 2014 at 06:04:02PM +0200, David Kastrup wrote:
> What I am actually more interested in is in having libunistring offer
> "roundtrippable" encodings as a fallback for decoding errors.
> Basically, I want an option for decoding where libunistring announces
> "what you have here is not valid utf-8 but I know how to deal with it".
> Including reencoding.  And delivering unique "character codes" and
> string length calculations.  The application would either keep track of
> having received "dirty utf-8" and would reencode when putting out utf-8
> (where reencoding "internal utf-8" to "external utf-8" means replacing
> the 2-byte sequences representing a wild byte by their original byte),
> or it would reencode into "external" utf-8 when writing anyway which
> would not change anything for originally valid utf-8.


It sounds like a reasonable philosophy to me.  I don't think I'd want
this to become the only option for libunistring, but if there's a
practical way to add alternate interfaces, etc., then I think that would
be valuable.

(I am not the libunistring maintainer and don't intend to speak for
him.)
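
For illustration, the round-trip scheme described in the quoted text can be
sketched roughly as below. This is not libunistring code or API. The function
names are hypothetical, and the choice of lead bytes 0xC0/0xC1 for the 2-byte
"wild byte" sequences is an assumption, picked only because those bytes never
occur in valid UTF-8. Python's standard "surrogateescape" error handler is
used here merely to locate the undecodable bytes.

```python
def to_internal(raw: bytes) -> bytes:
    """Convert possibly dirty UTF-8 into a round-trippable internal form.

    Valid UTF-8 passes through unchanged. Each undecodable ("wild") byte
    becomes a 2-byte sequence with lead byte 0xC0 or 0xC1 (an assumed
    convention; these lead bytes are never valid in real UTF-8).
    """
    out = bytearray()
    # surrogateescape maps each undecodable byte b to the code point 0xDC00 + b.
    for ch in raw.decode("utf-8", errors="surrogateescape"):
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            b = cp - 0xDC00  # the original wild byte, 0x80..0xFF
            out += bytes((0xC0 | ((b >> 6) & 1), 0x80 | (b & 0x3F)))
        else:
            out += ch.encode("utf-8")
    return bytes(out)


def to_external(internal: bytes) -> bytes:
    """Re-encode the internal form, restoring each wild byte."""
    out = bytearray()
    i = 0
    while i < len(internal):
        lead = internal[i]
        if lead in (0xC0, 0xC1):
            # Rebuild the original byte from the 2-byte escape sequence.
            out.append(0x80 | ((lead & 1) << 6) | (internal[i + 1] & 0x3F))
            i += 2
        else:
            out.append(lead)
            i += 1
    return bytes(out)


def char_count(internal: bytes) -> int:
    """Count characters: every byte that is not a continuation byte."""
    return sum(1 for b in internal if (b & 0xC0) != 0x80)


dirty = b"caf\xe9 \xc3\xa9"        # a stray Latin-1 byte, then a valid UTF-8 "é"
internal = to_internal(dirty)
clean = "héllo".encode("utf-8")
```

Under these assumptions, to_external(to_internal(x)) returns x for any byte
string, and originally valid UTF-8 is left untouched in both directions, which
matches the "would not change anything" property mentioned in the quote. Each
wild byte also counts as exactly one character in char_count.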
