Interesting issue, mainly because the "ť" character it complains about
(UTF-8 0xc5 0xa5) is accepted by the SELECT that generates the record. If
it's valid input it should be valid output, right? We didn't change
client_encoding in the meantime. It makes sense, though:
initdb on that animal says:

    The database cluster will be initialized with locale "English_United
    The default database encoding has accordingly been set to "WIN1252".

The regress script in question sets:

    SET client_encoding = 'utf8';

but we're apparently round-tripping the data through the database encoding
at some point, then converting back to client_encoding for output.
Presumably that's when we're forming the text 'data' column in the
tuplestore produced by the get changes function, which will be in the
database encoding.
So setting client_encoding is not sufficient to make this work, and the
non-7-bit-ASCII part should be removed from the test, since it's not
allowed on all machines.
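
As a rough illustration of that round trip, here's a minimal Python sketch.
It assumes Python's "cp1252" codec is a fair stand-in for the server's
WIN1252, and the function name is made up for this example; it is not how
the server does its conversion, just a way to see why this particular
character can't make the trip.

```python
# Assumption: Python's cp1252 codec stands in for PostgreSQL's WIN1252.
CLIENT_ENCODING = "utf-8"
DATABASE_ENCODING = "cp1252"


def survives_round_trip(client_bytes: bytes) -> bool:
    """Hypothetical helper: can these client bytes pass through the
    database encoding and come back unchanged?"""
    text = client_bytes.decode(CLIENT_ENCODING)
    try:
        # Client encoding -> database encoding.
        stored = text.encode(DATABASE_ENCODING)
    except UnicodeEncodeError:
        # The character has no representation in the database encoding.
        return False
    # Database encoding -> client encoding for output.
    return stored.decode(DATABASE_ENCODING).encode(CLIENT_ENCODING) == client_bytes


t_caron = b"\xc5\xa5"          # "ť" as UTF-8
e_acute = "é".encode("utf-8")  # a non-ASCII character that cp1252 does have

print(survives_round_trip(t_caron))  # False
print(survives_round_trip(e_acute))  # True
```

Plain 7-bit ASCII always survives, which is why restricting the test to
ASCII sidesteps the problem regardless of the database encoding.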
In some ways it seems like the argument to pg_logical_emit_message(...) should
be 'bytea'. That'd be much more convenient for application use. But then
it's a pain when using it via the text-format SQL interface calls, where
we've got no sensible way to output it.
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services