Re: [HACKERS] nls and server log

2014-12-29 Thread Jim Nasby

On 12/28/14, 2:56 AM, Craig Ringer wrote:
> On 12/25/2014 02:35 AM, Euler Taveira wrote:
>> Hi,
>>
>> Currently the same message goes to server log and client app. Sometimes
>> it bothers me since I have to analyze server logs and discover that
>> lc_messages is set to pt_BR and, to make things worse, that stup^H^H^H
>> application parses some error messages in Portuguese.
>
> IMO logging is simply broken for platforms where the postmaster and all
> DBs don't share an encoding. We mix different encodings in log messages
> and provide no way to separate them out. Nor is there a way to log
> different messages to different files.
>
> It's not just an issue with translations. We mix and mangle encodings of
> user-supplied text, like RAISE strings in procs, for example.
>
> We really need to be treating encoding for logging and for the client
> much more separately than we currently do. I think any consideration of
> translations for logging should be done with the underlying encoding
> issues in mind.

Agreed.

> My personal opinion is that we should require the server log to be
> capable of representing all chars in the encodings used by any DB. Which
> in practice means that we always just log in utf-8 if the user wants to
> permit DBs with different encodings. An alternative would be one file
> per database, always in the encoding of that database.


How much of this issue is caused by trying to machine-parse log files? Is a 
better option to improve that case, possibly doing something like including a 
field in each line that tells you the encoding for that entry?
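
To make the idea concrete, here is a minimal sketch of what reading such a log could look like. The `encoding|payload` line layout and the name `decode_tagged_line` are purely hypothetical; nothing like this exists in PostgreSQL today.

```python
def decode_tagged_line(line: bytes) -> str:
    """Decode one log line of the hypothetical form b"<encoding>|<payload>"."""
    tag, sep, payload = line.partition(b"|")
    if not sep:
        raise ValueError("line carries no encoding field")
    # The tag itself is assumed to be plain ASCII, e.g. b"latin-1".
    return payload.decode(tag.decode("ascii"))


# Two entries holding the same text, written in different encodings.
lines = [
    b"latin-1|" + "relação".encode("latin-1"),
    b"utf-8|" + "relação".encode("utf-8"),
]
decoded = [decode_tagged_line(line) for line in lines]
```

The reader has to decode line by line, so a plain pager or text editor would still show mixed bytes for such a file.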
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nls and server log

2014-12-29 Thread Craig Ringer
On 12/30/2014 06:39 AM, Jim Nasby wrote:

 
> How much of this issue is caused by trying to machine-parse log files?
> Is a better option to improve that case, possibly doing something like
> including a field in each line that tells you the encoding for that entry?

That'd be absolutely ghastly. You couldn't just view the logs with
'less' or a text editor if your logs had mixed encodings; you'd need
some kind of special PostgreSQL log viewer tool.

Why would we possibly do that when we could just emit utf-8 instead?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] nls and server log

2014-12-28 Thread Craig Ringer
On 12/25/2014 02:35 AM, Euler Taveira wrote:
> Hi,
>
> Currently the same message goes to server log and client app. Sometimes
> it bothers me since I have to analyze server logs and discover that
> lc_messages is set to pt_BR and, to make things worse, that stup^H^H^H
> application parses some error messages in Portuguese.

IMO logging is simply broken for platforms where the postmaster and all
DBs don't share an encoding. We mix different encodings in log messages
and provide no way to separate them out. Nor is there a way to log
different messages to different files.

It's not just an issue with translations. We mix and mangle encodings of
user-supplied text, like RAISE strings in procs, for example.

We really need to be treating encoding for logging and for the client
much more separately than we currently do. I think any consideration of
translations for logging should be done with the underlying encoding
issues in mind.


My personal opinion is that we should require the server log to be
capable of representing all chars in the encodings used by any DB. Which
in practice means that we always just log in utf-8 if the user wants to
permit DBs with different encodings. An alternative would be one file
per database, always in the encoding of that database.
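
As a rough illustration of the "always log in utf-8" option, here is a minimal sketch. It assumes the logger knows the encoding of the database that produced each message; the function name `to_utf8_log_line` and the fallback policy are assumptions, not existing PostgreSQL behaviour.

```python
def to_utf8_log_line(raw: bytes, db_encoding: str) -> bytes:
    """Re-encode a message from its database encoding into UTF-8 for the log."""
    try:
        text = raw.decode(db_encoding)
    except (UnicodeDecodeError, LookupError):
        # Assumed fallback: never lose the entry. Bytes that cannot be
        # decoded are written as \xNN escapes instead of raising an error.
        text = raw.decode("ascii", errors="backslashreplace")
    return text.encode("utf-8")


# Messages from a LATIN1 database and a UTF8 database end up in one encoding.
from_latin1 = to_utf8_log_line("relação".encode("latin-1"), "latin-1")
from_utf8 = to_utf8_log_line("relação".encode("utf-8"), "utf-8")
```

With something along these lines the log file stays viewable with ordinary tools, at the price of one conversion per message.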

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] nls and server log

2014-12-27 Thread Robert Haas
On Wed, Dec 24, 2014 at 1:35 PM, Euler Taveira eu...@timbira.com.br wrote:
> Currently the same message goes to server log and client app. Sometimes
> it bothers me since I have to analyze server logs and discover that
> lc_messages is set to pt_BR and, to make things worse, that stup^H^H^H
> application parses some error messages in Portuguese. My solution has
> been a modified version of pgBadger (before that, pgFouine) -- that has
> its problems: (i) translations are not as stable as English messages,
> (ii) translations are not always available, which means there is a mix
> of translated and untranslated messages, and (iii) it is minor version
> dependent. I'm tired of fighting those problems and have started to
> research whether there is a good solution in the backend.
>
> I'm thinking of carrying both translated and untranslated messages if
> asked to. We store the untranslated messages if the new GUC (say
> server_lc_messages) is set. The cost will be copying to five new
> variables (message, detail, detail_log, hint, and context) in the
> ErrorData struct that will be used iff server_lc_messages is set. A
> possible optimization is not to use the new variables if lc_messages
> and server_lc_messages do not match. My use case is a server log in
> English, but I'm perfectly fine with allowing a server log in Spanish
> and client messages in French. Is it an acceptable plan? Ideas?

Seems reasonable to me, I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] nls and server log

2014-12-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
> On Wed, Dec 24, 2014 at 1:35 PM, Euler Taveira eu...@timbira.com.br wrote:
>> Currently the same message goes to server log and client app.
>> ...
>> I'm thinking of carrying both translated and untranslated messages if
>> asked to. We store the untranslated messages if the new GUC (say
>> server_lc_messages) is set. The cost will be copying to five new
>> variables (message, detail, detail_log, hint, and context) in the
>> ErrorData struct that will be used iff server_lc_messages is set. A
>> possible optimization is not to use the new variables if lc_messages
>> and server_lc_messages do not match. My use case is a server log in
>> English, but I'm perfectly fine with allowing a server log in Spanish
>> and client messages in French. Is it an acceptable plan? Ideas?
>
> Seems reasonable to me, I think.

The core problem that we've worried about in previous discussions about
this is what to do about translation failures and encoding conversion
failures.  That is, there's been worry that a poor choice of log locale
could result in failures that don't occur otherwise; failures that could
be particularly nasty if they result in the inability to log important
conditions, or perhaps even prevent reporting them to the client.
While I don't say that we cannot accept any risk of that sort, I think
we should consider what risks exist and whether they can be minimized
before we plow ahead.

It would also be useful to think about the requests we get from time to
time to ensure that log messages appear in a uniform choice of encoding.
I don't know whether trying to enforce a uniform log message locale
would make that easier or harder.

regards, tom lane




[HACKERS] nls and server log

2014-12-24 Thread Euler Taveira
Hi,

Currently the same message goes to server log and client app. Sometimes
it bothers me since I have to analyze server logs and discover that
lc_messages is set to pt_BR and, to make things worse, that stup^H^H^H
application parses some error messages in Portuguese. My solution has
been a modified version of pgBadger (before that, pgFouine) -- that has
its problems: (i) translations are not as stable as English messages,
(ii) translations are not always available, which means there is a mix
of translated and untranslated messages, and (iii) it is minor version
dependent. I'm tired of fighting those problems and have started to
research whether there is a good solution in the backend.

I'm thinking of carrying both translated and untranslated messages if
asked to. We store the untranslated messages if the new GUC (say
server_lc_messages) is set. The cost will be copying to five new
variables (message, detail, detail_log, hint, and context) in the
ErrorData struct that will be used iff server_lc_messages is set. A
possible optimization is not to use the new variables if lc_messages
and server_lc_messages do not match. My use case is a server log in
English, but I'm perfectly fine with allowing a server log in Spanish
and client messages in French. Is it an acceptable plan? Ideas?
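
To illustrate the shape of the idea (not the actual C implementation), here is a minimal Python sketch. The `CATALOG` dictionary, the `translate` and `build_error` helpers, and the `client`/`server` layout are all hypothetical stand-ins; only the five field names and the `server_lc_messages` GUC name come from the proposal above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# The five ErrorData fields named in the proposal.
FIELDS = ("message", "detail", "detail_log", "hint", "context")

# Hypothetical catalog standing in for gettext; not real PostgreSQL data.
CATALOG = {
    "pt_BR": {'relation "%s" does not exist': 'relação "%s" não existe'},
}


def translate(msgid: str, locale: str) -> str:
    """Return the translation of msgid, or msgid itself if none is available."""
    return CATALOG.get(locale, {}).get(msgid, msgid)


@dataclass
class ErrorData:
    client: Dict[str, str] = field(default_factory=dict)
    # Second copy for the server log, kept only when server_lc_messages is set.
    server: Optional[Dict[str, str]] = None


def build_error(msgids: Dict[str, str], args: Dict[str, Tuple],
                lc_messages: str,
                server_lc_messages: Optional[str] = None) -> ErrorData:
    """Format each field for the client and, if requested, for the server log."""
    def render(locale: str) -> Dict[str, str]:
        return {name: translate(msgid, locale) % args.get(name, ())
                for name, msgid in msgids.items() if name in FIELDS}

    err = ErrorData(client=render(lc_messages))
    if server_lc_messages is not None:
        err.server = render(server_lc_messages)
    return err


err = build_error({"message": 'relation "%s" does not exist'},
                  {"message": ("foo",)},
                  lc_messages="pt_BR", server_lc_messages="C")
```

Here the client copy is rendered in pt_BR while the server copy stays in the untranslated form, and no second copy is made when `server_lc_messages` is unset.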


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

