[HACKERS] Re: [bug fix] multibyte messages are displayed incorrectly on the client

Heikki Linnakangas Mon, 23 Jun 2014 06:58:31 -0700

On 04/05/2014 07:56 AM, Tom Lane wrote:

"MauMau" <[email protected]> writes:

Then, as a happy medium, how about disabling message localization only if
the client encoding differs from the server one?  That is, compare the
client_encoding value in the startup packet with the result of
GetPlatformEncoding().  If they don't match, call
disable_message_localization().


AFAICT this is not what was agreed to in this thread.  It puts far too
much credence in the server-side default for client_encoding, which up to
now has never been thought to be very interesting; indeed I doubt most
people bother to set it at all.  The reason that this issue is even on
the table is that that default is too likely to be wrong, no?

Also, whatever possessed you to use pg_get_encoding_from_locale to
identify the server's encoding?  That's expensive and seems fairly
unlikely to yield the right answer.  I don't remember offhand where we
keep the postmaster's idea of what encoding messages should be in, but I'm
fairly sure it's stored explicitly somewhere.  Or if it isn't, we can for
sure do better than recalculating it during every connection attempt.

Having said all that, though, I'm unconvinced that this cure isn't worse
than the disease.  Somebody claimed upthread that no very interesting
messages would be delocalized by a change like this, but that's complete
nonsense: in particular, *every* message associated with client
authentication will be sent in English if we go down this path.  Given
the nearly complete lack of complaints in the many years that this code
has worked like this, I'm betting that most people will find a change
like this to be a net reduction in friendliness.

Given the changes here to extract client_encoding from the startup packet
ASAP, I wonder whether the right thing isn't just to set the client
encoding immediately when we do that.  Most application libraries pass
client encoding in the startup packet anyway (libpq certainly does).

Based on Tom's comments above, I'm marking this as returned with
feedback in the commitfest. I agree that setting client_encoding as
early as possible seems like the right thing to do.
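
To make "as early as possible" concrete, here is a rough sketch of
pulling client_encoding out of the startup packet's parameter list,
which is a sequence of NUL-terminated name/value pairs ended by a single
NUL byte. This is not the backend's actual code; find_startup_param() is
a hypothetical helper, and it assumes buf points just past the 4-byte
protocol version.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL source: scan the parameter section
 * of a startup packet for a given parameter name.  The section is a
 * sequence of NUL-terminated name/value string pairs, ended by a single
 * NUL byte.  Returns a pointer to the value inside buf, or NULL if the
 * parameter is absent or the packet is malformed.
 */
static const char *
find_startup_param(const char *buf, size_t len, const char *wanted)
{
    size_t      pos = 0;

    /* A lone NUL byte where a name would start marks the end of the list. */
    while (pos < len && buf[pos] != '\0')
    {
        const char *name = buf + pos;
        const char *name_end = memchr(name, '\0', len - pos);
        const char *value;
        const char *value_end;

        if (name_end == NULL)
            return NULL;        /* name is not terminated */

        value = name_end + 1;
        pos = (size_t) (value - buf);
        if (pos >= len)
            return NULL;        /* name without a value */

        value_end = memchr(value, '\0', len - pos);
        if (value_end == NULL)
            return NULL;        /* value is not terminated */

        if (strcmp(name, wanted) == 0)
            return value;

        /* Skip past the value's terminator to the next name. */
        pos = (size_t) (value_end - buf) + 1;
    }
    return NULL;
}
```

A caller would look up "client_encoding" this way before emitting any
message to the client; what it then does with the value (and how it
handles an unknown encoding name) is outside this sketch.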

Earlier in this thread, MauMau pointed out that we can't do encoding
conversions until we have connected to the database because you need to
read pg_conversion for that. That's because we support creating custom
conversions with CREATE CONVERSION. Frankly, I don't think anyone cares
about that feature. If we just dropped the CREATE/DROP CONVERSION
feature altogether and hard-coded the conversions we have, there would
be close to zero complaints. Even if you want to extend something around
encodings and conversions, the CREATE CONVERSION interface is clunky.
Firstly, conversions are per-database, and even schema-qualified, which
just seems like an extra complication. You'll most likely want to modify
the conversion across the whole system. Secondly, rather than define a
new conversion between encodings, you'll likely want to define a whole
new encoding with conversions to/from existing encodings, but you can't
do that anyway without hacking the source code.


- Heikki



--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
