Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

hgonzalez Sat, 08 May 2010 15:51:16 -0700

Well, I finally found some related (rather old) issues in the glibc Bugzilla:


http://sources.redhat.com/bugzilla/show_bug.cgi?id=6530
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=208308
http://sources.redhat.com/bugzilla/show_bug.cgi?id=649

The last explains why they do not consider it a bug:

ISO C99 requires for %.*s to only write complete characters that fit below the
precision number of bytes. If you are using say UTF-8 locale, but ISO-8859-1
characters as shown in the input file you provided, some of the strings are
not valid UTF-8 strings, therefore sprintf fails with -1 because of the
encoding error. That's not a bug in glibc.

It's clear, though it's also rather ugly from a specification point of view
(we must count raw bytes for the precision field, but also must decode the
UTF-8 chars to find character boundaries). I guess we must live with that.
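
To make the "count bytes, but stop on character boundaries" rule concrete, here is a minimal sketch in C. It is not glibc's code: utf8_prefix_len is a hypothetical helper, it hard-codes UTF-8 instead of consulting the locale, and it only checks lead and continuation bytes (it does not reject overlong forms or surrogates). It returns how many bytes of a string fit within a byte limit without splitting a character, or -1 when it meets a sequence that is not valid UTF-8, which mirrors the failure described in the quoted reply.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only (not part of glibc or psql).
 * Returns the length in bytes of the longest prefix of s that is at most
 * max_bytes long and ends on a UTF-8 character boundary.
 * Returns -1 if an invalid sequence is found within the limit.
 * Simplification: overlong encodings and surrogates are not rejected. */
static long utf8_prefix_len(const char *s, size_t max_bytes)
{
    assert(s != NULL);

    const unsigned char *p = (const unsigned char *)s;
    size_t i = 0;

    while (i < max_bytes && p[i] != '\0') {
        size_t need;

        /* The lead byte tells us how many bytes the character occupies. */
        if (p[i] < 0x80)
            need = 1;
        else if ((p[i] & 0xE0) == 0xC0)
            need = 2;
        else if ((p[i] & 0xF0) == 0xE0)
            need = 3;
        else if ((p[i] & 0xF8) == 0xF0)
            need = 4;
        else
            return -1;          /* stray continuation or invalid lead byte */

        /* The character would cross the byte limit: drop it entirely. */
        if (i + need > max_bytes)
            break;

        /* Every following byte must look like 10xxxxxx. The NUL terminator
         * fails this test, so we never read past the end of the string. */
        for (size_t k = 1; k < need; k++) {
            if ((p[i + k] & 0xC0) != 0x80)
                return -1;
        }

        i += need;
    }

    return (long)i;
}
```

With the UTF-8 string "caf\xc3\xa9" and a limit of 4 bytes, the helper returns 3: the two-byte character does not fit, so it is left out whole. With the ISO-8859-1 string "caf\xe9 au lait" and a limit of 10, it returns -1, because byte 0xE9 announces a three-byte character that never materializes.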

Hernán J. González
