On Thu, 2009-10-15 at 00:43 +0300, Peter Eisentraut wrote:
> On Sun, 2009-10-04 at 10:48 -0400, Tom Lane wrote:
> > Peter Eisentraut <pete...@gmx.net> writes:
> > > I understand the annoyance, but I think we do need to have an organized
> > > way to do testing of non-ASCII data and in particular UTF8 data, because
> > > there are an increasing number of special code paths for those.
> >
> > Well, if you want to keep the test, we should put in the variant with
> > \200, because it is now clear that that is in fact the right answer
> > in a nontrivial number of environments (arguably *more* cases than
> > in which "\u0080" is correct).
>
> I put in a new variant file. Let's see if it works.
[http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/pl/plpython/expected/plpython_unicode_0.out]

Actually, what I committed was really the output I got. Now, with your commit, my tests started failing again.

The difference turns out to be caused by glibc. When you print an invalid UTF-8 byte sequence using "%.*s" while LC_CTYPE is a UTF-8 locale (e.g., en_US.utf8), it prints nothing. Presumably the precision handling gets confused trying to count characters, rather than bytes, in the invalid sequence.

Test program:

#include <locale.h>
#include <stdio.h>

int
main()
{
	setlocale(LC_ALL, "");
	printf("%.*s", 1, "\200");
	return 0;
}

This prints nothing (check with od) when LC_CTYPE is en_US.utf8.

I think this can be filed under trouble caused by a mismatch between LC_CTYPE and the client encoding, and doesn't need further fixing, but it's good to keep in mind.

Let's see what the Solaris builds say now.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers