On Sun, Apr 07, 2019 at 07:56:02AM -0400, Andrew Dunstan wrote:
> On Sun, Apr 7, 2019 at 2:41 AM Noah Misch <[email protected]> wrote:
> >
> > On Sun, Dec 30, 2018 at 10:32:31AM -0500, Andrew Dunstan wrote:
> > > On 12/30/18 12:53 AM, Noah Misch wrote:
> > > > 2. stats_temp_directory is incompatible with TAP suites that start more
> > > > than
> > > > one node simultaneously.
> >
> > > The obvious quick fix would be to have PostgresNode.pm set this to the
> > > default after inserting the TEMP_CONFIG file.
> >
> > I'd like to get $SUBJECT in place for variables other than
> > stats_temp_directory, using your quick fix idea. Attached. When its time
> > comes, your stats_temp_directory work can delete that section.
>
> Looks good.
Pushed. This broke 010_dump_connstr.pl on bowerbird, introducing 'invalid
byte sequence for encoding "UTF8"' errors. That's because log_connections
renders this 010_dump_connstr.pl solution insufficient:
# In a SQL_ASCII database, pgwin32_message_to_UTF16() needs to
# interpret everything as UTF8. We're going to use byte sequences
# that aren't valid UTF-8 strings, so that would fail. Use LATIN1,
# which accepts any byte and has a conversion from each byte to UTF-8.
$ENV{LC_ALL} = 'C';
$ENV{PGCLIENTENCODING} = 'LATIN1';
The log_connections message prints before CheckMyDatabase() calls
pg_perm_setlocale() to activate that LATIN1 database encoding. Since
bowerbird does a non-NLS build, GetMessageEncoding()==PG_SQL_ASCII at that
time. Some options:
1. Make this one test explicitly set log_connections = off. This workaround
restores what we had a day ago.
2. Move the log_connections message after CheckMyDatabase() calls
pg_perm_setlocale(), so it gets regular post-startup encoding treatment.
That fixes this particular test. It's still wrong when a database's name
is not valid in that database's encoding.
3. If GetMessageEncoding()==PG_SQL_ASCII, make pgwin32_message_to_UTF16()
assume the text is already UTF8, like it does when not in a transaction.
If UTF8->UTF16 conversion fails, the caller will send untranslated bytes to
write() or ReportEventA().
4. If GetMessageEncoding()==PG_SQL_ASCII, make pgwin32_message_to_UTF16()
return NULL. The caller will always send untranslated bytes to write() or
ReportEventA(). This seems consistent with the SQL_ASCII concept and with
pg_do_encoding_conversion()'s interpretation of SQL_ASCII.
5. When including a datname or rolname value in a message, hex-escape
non-ASCII bytes. They are byte sequences, not text of known encoding.
This preserves the most information, but it's overkill and ugly in the
probably-common case of one encoding across all databases of a cluster.
I'm inclined to do (1) in back branches and (4) in HEAD only. (If starting
fresh today, I would store the encoding of each rolname and dbname or just use
UTF8 for those particular fields.) Other preferences?
Thanks,
nm