Re: [HACKERS] Client Messages

Heikki Linnakangas Thu, 26 Jan 2012 10:59:46 -0800

On 26.01.2012 17:31, Tom Lane wrote:

Heikki Linnakangas <[email protected]> writes:

The thing is, there's currently no encoding conversion happening, so if
you have one database in LATIN1 encoding and another in UTF-8, for
example, whatever you put in your postgresql.conf is going to be wrong
for one database. I'm happy to just document the issue for per-database
messages: with "ALTER DATABASE ... SET welcome_message", the encoding used
there needs to match the encoding of the database, or it's displayed as
garbage. But what about per-user messages, when the user has access to
several databases, or postgresql.conf?


I've not looked at the patch, but what exactly will happen if the string
has the wrong encoding?

You get an incorrectly encoded string, i.e. garbage, in your console when you log in with psql.

You can also use current_setting() to copy the incorrectly encoded string elsewhere in the system. If you insert it into a table and run pg_dump, I think the dump might not be restorable. That's a bit of a stretch, perhaps, but it would be nice to avoid that.

BTW, you can already do that if you set e.g. default_text_search_config to something non-ASCII in postgresql.conf. Or if you do it with search_path, you get a warning at login. For example, I did "ALTER USER foouser set search_path ='kääk';" in a LATIN1 database, and then connected to a UTF-8 database and got:


$ ~/pgsql.master/bin/psql postgres foouser
WARNING:  invalid value for parameter "search_path": ""k��k""
DETAIL:  schema "k��k" does not exist
psql (9.2devel)
Type "help" for help.

(In case that didn't get across right: I set the search_path to a string containing two a-with-umlauts, and in the warning they got replaced with question marks in inverse colors, which is apparently the character that the console uses to display bytes that are not valid UTF-8.)
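
To make the byte-level effect concrete, here is a minimal sketch in Python (not PostgreSQL code) of what presumably happens to the value on the way to the terminal. It assumes the console substitutes U+FFFD for each undecodable byte, as most UTF-8 terminals do; the variable names are made up for illustration.

```python
# The value as stored from the LATIN1 session: each "ä" is the single byte 0xE4.
latin1_bytes = "kääk".encode("latin-1")

# A UTF-8 console reads the same bytes as UTF-8. 0xE4 opens a three-byte
# sequence, but no continuation bytes follow, so each 0xE4 is invalid on its
# own and is shown as U+FFFD (REPLACEMENT CHARACTER).
shown_in_utf8_console = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes)
print(shown_in_utf8_console)
```

The second line printed is the same "k", two replacement characters, "k" pattern as in the WARNING above.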

The problem with welcome_message would look just like that. No one is likely to run into that with search_path, but it's quite reasonable and expected to use your native language in a welcome message.

The idea that occurs to me is to have the code that uses the GUC do a
verify_mbstr(noerror) on it, and silently ignore it if it doesn't pass
(maybe with a LOG message).  This would have to be documented of course,
but it seems better than the potential consequences of trying to send a
wrongly-encoded string.

Hmm, fine with me. It would be nice to plug the hole that these bogus characters can leak elsewhere into the system through current_setting, though. Perhaps we could put the verify_mbstr() call somewhere in guc.c, to forbid incorrectly encoded characters from being stored in the guc variable in the first place.
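
As a rough illustration of the "verify, and silently ignore with a log message" behaviour discussed above, here is a sketch in Python rather than backend C. The function name usable_message and the logger name are hypothetical, and a Python warning stands in for a server LOG message; none of this is the actual guc.c code.

```python
import logging
from typing import Optional

log = logging.getLogger("welcome_message_sketch")  # hypothetical logger name


def usable_message(raw: bytes, database_encoding: str) -> Optional[str]:
    """Return the decoded message, or None if raw is not valid in database_encoding."""
    try:
        return raw.decode(database_encoding)
    except UnicodeDecodeError as exc:
        # Ignore the setting instead of sending wrongly encoded bytes to the client.
        log.warning("ignoring message: invalid %s byte at offset %d",
                    database_encoding, exc.start)
        return None


stored = "kääk".encode("latin-1")        # value set from a LATIN1 session
print(usable_message(stored, "utf-8"))    # rejected: not valid UTF-8
print(usable_message(stored, "latin-1"))  # accepted in a LATIN1 database
```

One limitation of a check like this: a single-byte encoding such as LATIN1 assigns a character to every byte value, so it cannot flag UTF-8 text that ends up in a LATIN1 database. It only catches the mismatch in the other direction.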


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
