Re: [HACKERS] parallel workers and client encoding

2016-06-30 Thread Robert Haas
On Wed, Jun 29, 2016 at 10:52 PM, Peter Eisentraut
 wrote:
> I'm happy with this patch.

Great.  I have committed it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-29 Thread Peter Eisentraut

I'm happy with this patch.


On 6/29/16 12:41 PM, Robert Haas wrote:

On Tue, Jun 28, 2016 at 10:10 PM, Peter Eisentraut
 wrote:

On 6/27/16 5:37 PM, Robert Haas wrote:

Please find attached an a patch for a proposed alternative approach.
This does the following:

1. When the client_encoding GUC is changed in the worker,
SetClientEncoding() is not called.


I think this could be a problem, because then the client encoding in the
background worker process is inherited from the postmaster, which could in
theory be anything.  I think you need to set it at least once to the correct
value.


Fixed in the attached version.


Thus, GetClientEncoding() in the
worker always returns the database encoding, regardless of how the
client encoding is set.  This is better than your approach of calling
SetClientEncoding() during worker startup, I think, because the worker
could call a parallel-safe function with SET clause, and one of the
GUCs temporarily set could be client_encoding.  That would be stupid,
but somebody could do it.


I think if we're worried about this, then this should be an error, but
that's a minor concern.


We can't actually throw an error at that point, because we really do
want the GUC to have the same value in the worker as it does in the
leader.  Otherwise, anything that can observe the value of an
arbitrary GUC suddenly becomes parallel-restricted, which is certainly
unwanted.


I realize that we don't have a good mechanism in the GUC code to distinguish
these two situations.

Then again, this shouldn't be so much different in concept from the
restoring of GUC variables in the EXEC_BACKEND case.  There is special code
in set_config_option() to bypass some of the rules in that case.
RestoreGUCState() should be able to get the same sort of pass.


I think we can use the existing flag InitializingParallelWorker to
handle this case.  See attached.


Also, set_config_option() knows something about what settings are allowed in
a parallel worker, so I wonder if setting client_encoding would even work in
spite of that?


I think that with this change, a SET client_encoding on a supposedly
parallel-safe function will just fail to have any effect when the
function runs inside a worker.  The attached patch will make that kind
of thing fail outright when attempted inside a parallel worker.


2. A new function pq_getmsgrawstring() is added.  This is like
pq_getmsgstring() but it does no encoding conversion.
pq_parse_errornotice() is changed to use pq_getmsgrawstring() instead
of pq_getmsgstring().  Because of (1), when the leader receives an
ErrorResponse or NoticeResponse from the worker, it will not have been
subject to encoding conversion; because of this item, the leader will
not try to convert it either when initially parsing it.  So the extra
encoding round-trip is avoided.


I like that.


3. The changes for NotifyResponse which you proposed are included
here, but with the modification that pq_getmsgrawstring() is used.


and that.


Thanks for the review.







--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-29 Thread Robert Haas
On Tue, Jun 28, 2016 at 10:10 PM, Peter Eisentraut
 wrote:
> On 6/27/16 5:37 PM, Robert Haas wrote:
>> Please find attached an a patch for a proposed alternative approach.
>> This does the following:
>>
>> 1. When the client_encoding GUC is changed in the worker,
>> SetClientEncoding() is not called.
>
> I think this could be a problem, because then the client encoding in the
> background worker process is inherited from the postmaster, which could in
> theory be anything.  I think you need to set it at least once to the correct
> value.

Fixed in the attached version.

>> Thus, GetClientEncoding() in the
>> worker always returns the database encoding, regardless of how the
>> client encoding is set.  This is better than your approach of calling
>> SetClientEncoding() during worker startup, I think, because the worker
>> could call a parallel-safe function with SET clause, and one of the
>> GUCs temporarily set could be client_encoding.  That would be stupid,
>> but somebody could do it.
>
> I think if we're worried about this, then this should be an error, but
> that's a minor concern.

We can't actually throw an error at that point, because we really do
want the GUC to have the same value in the worker as it does in the
leader.  Otherwise, anything that can observe the value of an
arbitrary GUC suddenly becomes parallel-restricted, which is certainly
unwanted.

> I realize that we don't have a good mechanism in the GUC code to distinguish
> these two situations.
>
> Then again, this shouldn't be so much different in concept from the
> restoring of GUC variables in the EXEC_BACKEND case.  There is special code
> in set_config_option() to bypass some of the rules in that case.
> RestoreGUCState() should be able to get the same sort of pass.

I think we can use the existing flag InitializingParallelWorker to
handle this case.  See attached.

> Also, set_config_option() knows something about what settings are allowed in
> a parallel worker, so I wonder if setting client_encoding would even work in
> spite of that?

I think that with this change, a SET client_encoding on a supposedly
parallel-safe function will just fail to have any effect when the
function runs inside a worker.  The attached patch will make that kind
of thing fail outright when attempted inside a parallel worker.

>> 2. A new function pq_getmsgrawstring() is added.  This is like
>> pq_getmsgstring() but it does no encoding conversion.
>> pq_parse_errornotice() is changed to use pq_getmsgrawstring() instead
>> of pq_getmsgstring().  Because of (1), when the leader receives an
>> ErrorResponse or NoticeResponse from the worker, it will not have been
>> subject to encoding conversion; because of this item, the leader will
>> not try to convert it either when initially parsing it.  So the extra
>> encoding round-trip is avoided.
>
> I like that.
>
>> 3. The changes for NotifyResponse which you proposed are included
>> here, but with the modification that pq_getmsgrawstring() is used.
>
> and that.

Thanks for the review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


parallel-encoding-fun-v2.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-28 Thread Peter Eisentraut

On 6/27/16 5:37 PM, Robert Haas wrote:

Please find attached an a patch for a proposed alternative approach.
This does the following:

1. When the client_encoding GUC is changed in the worker,
SetClientEncoding() is not called.


I think this could be a problem, because then the client encoding in the 
background worker process is inherited from the postmaster, which could 
in theory be anything.  I think you need to set it at least once to the 
correct value.



Thus, GetClientEncoding() in the
worker always returns the database encoding, regardless of how the
client encoding is set.  This is better than your approach of calling
SetClientEncoding() during worker startup, I think, because the worker
could call a parallel-safe function with SET clause, and one of the
GUCs temporarily set could be client_encoding.  That would be stupid,
but somebody could do it.


I think if we're worried about this, then this should be an error, but 
that's a minor concern.


I realize that we don't have a good mechanism in the GUC code to 
distinguish these two situations.


Then again, this shouldn't be so much different in concept from the 
restoring of GUC variables in the EXEC_BACKEND case.  There is special 
code in set_config_option() to bypass some of the rules in that case. 
RestoreGUCState() should be able to get the same sort of pass.


Also, set_config_option() knows something about what settings are 
allowed in a parallel worker, so I wonder if setting client_encoding 
would even work in spite of that?



2. A new function pq_getmsgrawstring() is added.  This is like
pq_getmsgstring() but it does no encoding conversion.
pq_parse_errornotice() is changed to use pq_getmsgrawstring() instead
of pq_getmsgstring().  Because of (1), when the leader receives an
ErrorResponse or NoticeResponse from the worker, it will not have been
subject to encoding conversion; because of this item, the leader will
not try to convert it either when initially parsing it.  So the extra
encoding round-trip is avoided.


I like that.


3. The changes for NotifyResponse which you proposed are included
here, but with the modification that pq_getmsgrawstring() is used.


and that.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-27 Thread Robert Haas
On Mon, Jun 27, 2016 at 12:53 PM, Robert Haas  wrote:
> On Mon, Jun 13, 2016 at 10:27 PM, Peter Eisentraut
>  wrote:
>> Modulo that last point, here is a patch that shows how I think this could
>> work, in combination with the patch I posted previously that sets the
>> "client encoding" in the parallel worker to the server encoding.
>>
>> This patch disassembles the NotificationResponse message with a temporary
>> client encoding, and then sends it off to the real frontend using the real
>> client encoding.
>>
>> Doing it this way also takes care of a few special cases that
>> NotifyMyFrontEnd() handles, such as a client with protocol version 2 that
>> doesn't expect a payload in the message.
>
> How does this address the concern raised in the last sentence of
> https://www.postgresql.org/message-id/CA+TgmoaAAEXmjVMB4nM=znf_5b9jhuim6q3afro4spt23ti...@mail.gmail.com
> ?  It seems that if an error occurs between the two SetClientEncoding
> calls, the change will persist for the rest of the session, resulting
> in chaos.

Please find attached an a patch for a proposed alternative approach.
This does the following:

1. When the client_encoding GUC is changed in the worker,
SetClientEncoding() is not called.  Thus, GetClientEncoding() in the
worker always returns the database encoding, regardless of how the
client encoding is set.  This is better than your approach of calling
SetClientEncoding() during worker startup, I think, because the worker
could call a parallel-safe function with SET clause, and one of the
GUCs temporarily set could be client_encoding.  That would be stupid,
but somebody could do it.

2. A new function pq_getmsgrawstring() is added.  This is like
pq_getmsgstring() but it does no encoding conversion.
pq_parse_errornotice() is changed to use pq_getmsgrawstring() instead
of pq_getmsgstring().  Because of (1), when the leader receives an
ErrorResponse or NoticeResponse from the worker, it will not have been
subject to encoding conversion; because of this item, the leader will
not try to convert it either when initially parsing it.  So the extra
encoding round-trip is avoided.

3. The changes for NotifyResponse which you proposed are included
here, but with the modification that pq_getmsgrawstring() is used.

This seems to fix your original test case for me, and hopefully all of
the related cases also.  Review is appreciated.  The main thing about
this is that it doesn't rely on being able to make temporary changes
to global variables, thus hopefully making it robust in the face of
non-local transfer of control.

(Official status update: I'll commit this on Thursday unless anyone
reports a problem with it before then.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 088700e..3996897 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -810,7 +810,17 @@ HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg)
 		case 'A':/* NotifyResponse */
 			{
 /* Propagate NotifyResponse. */
-pq_putmessage(msg->data[0], >data[1], msg->len - 1);
+int32		pid;
+const char *channel;
+const char *payload;
+
+pid = pq_getmsgint(msg, 4);
+channel = pq_getmsgrawstring(msg);
+payload = pq_getmsgrawstring(msg);
+pq_endmessage(msg);
+
+NotifyMyFrontEnd(channel, payload, pid);
+
 break;
 			}
 
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index c39ac3a..716f1c3 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -390,9 +390,6 @@ static bool asyncQueueProcessPageEntries(volatile QueuePosition *current,
 			 char *page_buffer);
 static void asyncQueueAdvanceTail(void);
 static void ProcessIncomingNotify(void);
-static void NotifyMyFrontEnd(const char *channel,
- const char *payload,
- int32 srcPid);
 static bool AsyncExistsPendingNotify(const char *channel, const char *payload);
 static void ClearPendingActionsAndNotifies(void);
 
@@ -2076,7 +2073,7 @@ ProcessIncomingNotify(void)
 /*
  * Send NOTIFY message to my front end.
  */
-static void
+void
 NotifyMyFrontEnd(const char *channel, const char *payload, int32 srcPid)
 {
 	if (whereToSendOutput == DestRemote)
diff --git a/src/backend/commands/variable.c b/src/backend/commands/variable.c
index 962d75d..7180be6 100644
--- a/src/backend/commands/variable.c
+++ b/src/backend/commands/variable.c
@@ -755,6 +755,15 @@ assign_client_encoding(const char *newval, void *extra)
 {
 	int			encoding = *((int *) extra);
 
+	/*
+	 * Even though the value of the client_encoding GUC might change within a
+	 * parallel worker, we don't really change the client encoding in that
+	 * case.  For a parallel worker, the client is the leader process, which
+	 * always expects the database encoding from us.
+	 */
+	if 

Re: [HACKERS] parallel workers and client encoding

2016-06-27 Thread Robert Haas
On Mon, Jun 13, 2016 at 10:27 PM, Peter Eisentraut
 wrote:
> Modulo that last point, here is a patch that shows how I think this could
> work, in combination with the patch I posted previously that sets the
> "client encoding" in the parallel worker to the server encoding.
>
> This patch disassembles the NotificationResponse message with a temporary
> client encoding, and then sends it off to the real frontend using the real
> client encoding.
>
> Doing it this way also takes care of a few special cases that
> NotifyMyFrontEnd() handles, such as a client with protocol version 2 that
> doesn't expect a payload in the message.

How does this address the concern raised in the last sentence of
https://www.postgresql.org/message-id/CA+TgmoaAAEXmjVMB4nM=znf_5b9jhuim6q3afro4spt23ti...@mail.gmail.com
?  It seems that if an error occurs between the two SetClientEncoding
calls, the change will persist for the rest of the session, resulting
in chaos.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-26 Thread Peter Geoghegan
On Mon, Jun 13, 2016 at 9:39 AM, Robert Haas  wrote:
> There is no realistic way that I am going to have this fixed before
> beta2.  There are currently 10 open items listed on the open items
> page, of which 8 relate to my commits and 5 to parallel query in
> particular.  I am not going to get 8 issues fixed in the next 4 days,
> or even the next 6 days if you assume I can work through the weekend
> on this (and that it would be desirable that I be slinging fixes into
> the tree just before the wrap, which seems doubtful).  Furthermore, of
> those issues, I judge this to be least important (except for the
> documentation update, but that's pending further from Peter Geoghegan
> about exactly what he thinks needs to be changed).

I got sidetracked on that, in part due to investigating a bug in the
9.6 external sort work. I'll have more information on that, soon.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-13 Thread Peter Eisentraut

On 6/9/16 7:16 PM, Tatsuo Ishii wrote:

Something like SetClientEncoding(GetDatabaseEncoding()) is a Little
bit ugly. It would be nice if we could have a switch to turn off the
automatic encoding conversion in the future, but for 9.6, I feel I'm
fine with your proposed patch.


One way to make this nicer would be to put the encoding of a string into 
the StringInfoData structure.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-13 Thread Peter Eisentraut

On 6/10/16 2:08 PM, Peter Eisentraut wrote:

On 6/6/16 9:45 PM, Peter Eisentraut wrote:

Attached is a patch to illustrates how this could be fixed.  There might
be similar issues elsewhere.  The notification propagation in particular
could be affected.


Tracing the code, NotificationResponse messages are converted to the
client encoding during transmission from the worker to the leader and
then sent on binarily to the client, so this should appear to work at
the moment.  But it will break if we make a change like I suggested,
namely changing the client encoding in the worker process to be the
server encoding, because then nothing will transcode it before it
reaches the client anymore.  So this will need a well-considered
solution in concert with the error/notice issue.

Then again, it's not clear to me under what circumstances a parallel
worker could or should be sending a NotificationResponse.


Modulo that last point, here is a patch that shows how I think this 
could work, in combination with the patch I posted previously that sets 
the "client encoding" in the parallel worker to the server encoding.


This patch disassembles the NotificationResponse message with a 
temporary client encoding, and then sends it off to the real frontend 
using the real client encoding.


Doing it this way also takes care of a few special cases that 
NotifyMyFrontEnd() handles, such as a client with protocol version 2 
that doesn't expect a payload in the message.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/transam/parallel.c 
b/src/backend/access/transam/parallel.c
index ab5ef25..0bb93b4 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -810,7 +810,22 @@ HandleParallelMessage(ParallelContext *pcxt, int i, 
StringInfo msg)
case 'A':   /* NotifyResponse */
{
/* Propagate NotifyResponse. */
-   pq_putmessage(msg->data[0], >data[1], 
msg->len - 1);
+   int save_client_encoding;
+   int32   pid;
+   const char *channel;
+   const char *payload;
+
+   save_client_encoding = pg_get_client_encoding();
+   SetClientEncoding(GetDatabaseEncoding());
+
+   pid = pq_getmsgint(msg, 4);
+   channel = pq_getmsgstring(msg);
+   payload = pq_getmsgstring(msg);
+   pq_endmessage(msg);
+
+   SetClientEncoding(save_client_encoding);
+
+   NotifyMyFrontEnd(channel, payload, pid);
break;
}
 
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index c39ac3a..716f1c3 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -390,9 +390,6 @@ static bool asyncQueueProcessPageEntries(volatile 
QueuePosition *current,
 char *page_buffer);
 static void asyncQueueAdvanceTail(void);
 static void ProcessIncomingNotify(void);
-static void NotifyMyFrontEnd(const char *channel,
-const char *payload,
-int32 srcPid);
 static bool AsyncExistsPendingNotify(const char *channel, const char *payload);
 static void ClearPendingActionsAndNotifies(void);
 
@@ -2076,7 +2073,7 @@ ProcessIncomingNotify(void)
 /*
  * Send NOTIFY message to my front end.
  */
-static void
+void
 NotifyMyFrontEnd(const char *channel, const char *payload, int32 srcPid)
 {
if (whereToSendOutput == DestRemote)
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index b4c13fa..95559df 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -28,6 +28,10 @@ extern volatile sig_atomic_t notifyInterruptPending;
 extern Size AsyncShmemSize(void);
 extern void AsyncShmemInit(void);
 
+extern void NotifyMyFrontEnd(const char *channel,
+const char *payload,
+int32 srcPid);
+
 /* notify-related SQL statements */
 extern void Async_Notify(const char *channel, const char *payload);
 extern void Async_Listen(const char *channel);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-13 Thread Robert Haas
On Sun, Jun 12, 2016 at 3:05 AM, Noah Misch  wrote:
> On Thu, Jun 09, 2016 at 12:04:59PM -0400, Peter Eisentraut wrote:
>> On 6/6/16 9:45 PM, Peter Eisentraut wrote:
>> >There appears to be a problem with how client encoding is handled in the
>> >communication from parallel workers.
>>
>> I have added this to the open items for now.
>
> [Action required within 72 hours.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 9.6 open item.  Robert,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> 9.6 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within 72 hours of this
> message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping 9.6rc1.  Consequently, I will appreciate your
> efforts toward speedy resolution.  Thanks.

There is no realistic way that I am going to have this fixed before
beta2.  There are currently 10 open items listed on the open items
page, of which 8 relate to my commits and 5 to parallel query in
particular.  I am not going to get 8 issues fixed in the next 4 days,
or even the next 6 days if you assume I can work through the weekend
on this (and that it would be desirable that I be slinging fixes into
the tree just before the wrap, which seems doubtful).  Furthermore, of
those issues, I judge this to be least important (except for the
documentation update, but that's pending further from Peter Geoghegan
about exactly what he thinks needs to be changed).

Therefore, my plan is to revisit this in two weeks once beta2 is out
the door, unless someone else would like to volunteer to fix it
sooner.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-12 Thread Noah Misch
On Thu, Jun 09, 2016 at 12:04:59PM -0400, Peter Eisentraut wrote:
> On 6/6/16 9:45 PM, Peter Eisentraut wrote:
> >There appears to be a problem with how client encoding is handled in the
> >communication from parallel workers.
> 
> I have added this to the open items for now.

[Action required within 72 hours.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
9.6 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within 72 hours of this
message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping 9.6rc1.  Consequently, I will appreciate your
efforts toward speedy resolution.  Thanks.

[1] 
http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-10 Thread Peter Eisentraut

On 6/6/16 9:45 PM, Peter Eisentraut wrote:

Attached is a patch to illustrates how this could be fixed.  There might
be similar issues elsewhere.  The notification propagation in particular
could be affected.


Tracing the code, NotificationResponse messages are converted to the 
client encoding during transmission from the worker to the leader and 
then sent on binarily to the client, so this should appear to work at 
the moment.  But it will break if we make a change like I suggested, 
namely changing the client encoding in the worker process to be the 
server encoding, because then nothing will transcode it before it 
reaches the client anymore.  So this will need a well-considered 
solution in concert with the error/notice issue.


Then again, it's not clear to me under what circumstances a parallel 
worker could or should be sending a NotificationResponse.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-09 Thread Tatsuo Ishii
> There appears to be a problem with how client encoding is handled in
> the communication from parallel workers.

Ouch.

>  In a parallel worker, the
> client encoding setting is inherited from its creating process as part
> of the GUC setup.  So any plain-text stuff the parallel worker sends
> to its leader is actually converted to the client encoding.  Since
> most data is sent in binary format, the plain-text provision applies
> mainly to notice and error messages.  At the other end, error messages
> are parsed using pq_parse_errornotice(), which internally uses
> routines that were meant for communication from the client, and
> therefore will convert everything back from the client encoding to the
> server encoding.  So this whole thing actually happens to work as long
> as round tripping is possible between the involved encodings.
> 
> In cases where it isn't, it's still hard to notice the difference
> because depending on whether you get a parallel plan or not, the
> following happens:
> 
> not parallel: conversion error happens between server and client,
> client sees an error message about that
> 
> parallel: conversion error happens between worker and leader, worker
> generates an error message about that, sends it to leader, leader
> forwards it to client
> 
> The client sees the same error message in both cases.
> 
> To construct a case where this makes a difference, the leader has to
> be set up to catch certain errors.  Here is an example:
> 
> """
> create table test1 (a int, b text);
> truncate test1;
> insert into test1 values (1, 'a');
> 
> create or replace function test1() returns text language plpgsql
> as $$
> declare
>   res text;
> begin
>   perform from test1 where a = test2();
>   return res;
> exception when division_by_zero then
>   return 'boom';
> end;
> $$;
> 
> create or replace function test2() returns int language plpgsql
> parallel safe
> as $$
> begin
>   raise division_by_zero using message = 'Motörhead';
>   return 1;
> end
> $$;
> 
> set force_parallel_mode to on;
> 
> select test1();
> """
> 
> With client_encoding = server_encoding, this will return a single row
> 'boom'.  But with, say, database encoding UTF8 and
> PGCLIENTENCODING=KOI8R, it will error:
> 
> ERROR: 22P05: character with byte sequence 0xef 0xbe 0x83 in encoding
> "UTF8" has no equivalent in encoding "KOI8R"
> CONTEXT:  parallel worker
> 
> (Note that changing force_parallel_mode does not force replanning in
> plpgsql, so if you run test1() first before setting
> force_parallel_mode, then you won't get the error.)
> 
> Attached is a patch to illustrates how this could be fixed.  There
> might be similar issues elsewhere.  The notification propagation in
> particular could be affected.

Something like SetClientEncoding(GetDatabaseEncoding()) is a Little
bit ugly. It would be nice if we could have a switch to turn off the
automatic encoding conversion in the future, but for 9.6, I feel I'm
fine with your proposed patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-09 Thread Robert Haas
On Thu, Jun 9, 2016 at 1:47 PM, Tom Lane  wrote:
>> Second, if you can't convert an error or notice message (or possibly a
>> notify message) from the server encoding to the client coding, you are
>> definitely going to fail, with or without parallel query, because that
>> conversion has to be done at some stage anyway.
>
> Only if the message gets sent to the client, which it might never be;
> for example because the query is inside a plpgsql exception block that
> traps the error.

Hmm... yeah, OK, that's another case.

> You do have a bug here; please don't argue you don't.

I just said it was a bug in my previous post, so I think it is clear
that I am not arguing that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-09 Thread Tom Lane
Robert Haas  writes:
> On Thu, Jun 9, 2016 at 1:14 PM, Tom Lane  wrote:
>> The current code results in failures if any non-latin1 data has to be
>> transmitted from worker to leader, even though the query might not have
>> ever sent that data to the client, and therefore would work fine in
>> non-parallel mode.

> So, I don't think this is true.  First, to be clear, there's no
> encoding conversion going on when tuples are sent from worker to
> leader, so that case has no problem of this type at all.  This is
> limited to non-tuple protocol messages: errors, notices, and possibly
> notifies.

Okay ...

> Second, if you can't convert an error or notice message (or possibly a
> notify message) from the server encoding to the client coding, you are
> definitely going to fail, with or without parallel query, because that
> conversion has to be done at some stage anyway.

Only if the message gets sent to the client, which it might never be;
for example because the query is inside a plpgsql exception block that
traps the error.

You do have a bug here; please don't argue you don't.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-09 Thread Robert Haas
On Thu, Jun 9, 2016 at 1:14 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jun 6, 2016 at 9:45 PM, Peter Eisentraut
>>  wrote:
>>> ...  So this whole thing
>>> actually happens to work as long as round tripping is possible between the
>>> involved encodings.
>
>> Hmm.  I didn't realize that we had encodings where round-tripping
>> wasn't possible.  I figured that if you could convert from A to B, you
>> would also be able to convert from B to A.
>
> Uh, that's not the point: the real problem is when B is simply smaller
> than A.  For example, server encoding utf8, client encoding latin1.
> The current code results in failures if any non-latin1 data has to be
> transmitted from worker to leader, even though the query might not have
> ever sent that data to the client, and therefore would work fine in
> non-parallel mode.

So, I don't think this is true.  First, to be clear, there's no
encoding conversion going on when tuples are sent from worker to
leader, so that case has no problem of this type at all.  This is
limited to non-tuple protocol messages: errors, notices, and possibly
notifies.

Second, if you can't convert an error or notice message (or possibly a
notify message) from the server encoding to the client coding, you are
definitely going to fail, with or without parallel query, because that
conversion has to be done at some stage anyway.  The difference is
that without parallel query, we convert server->client and that's it,
whereas with parallel query we convert server->client->server->client
if the error occurs in the worker, and so the conversion in the
reverse direction (client back to server) can introduce a failure that
couldn't occur otherwise.

That's a pretty obscure corner case, but of course it should still be fixed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-09 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 6, 2016 at 9:45 PM, Peter Eisentraut
>  wrote:
>> ...  So this whole thing
>> actually happens to work as long as round tripping is possible between the
>> involved encodings.

> Hmm.  I didn't realize that we had encodings where round-tripping
> wasn't possible.  I figured that if you could convert from A to B, you
> would also be able to convert from B to A.

Uh, that's not the point: the real problem is when B is simply smaller
than A.  For example, server encoding utf8, client encoding latin1.
The current code results in failures if any non-latin1 data has to be
transmitted from worker to leader, even though the query might not have
ever sent that data to the client, and therefore would work fine in
non-parallel mode.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-09 Thread Robert Haas
On Mon, Jun 6, 2016 at 9:45 PM, Peter Eisentraut
 wrote:
> There appears to be a problem with how client encoding is handled in the
> communication from parallel workers.  In a parallel worker, the client
> encoding setting is inherited from its creating process as part of the GUC
> setup.  So any plain-text stuff the parallel worker sends to its leader is
> actually converted to the client encoding.  Since most data is sent in
> binary format, the plain-text provision applies mainly to notice and error
> messages.  At the other end, error messages are parsed using
> pq_parse_errornotice(), which internally uses routines that were meant for
> communication from the client, and therefore will convert everything back
> from the client encoding to the server encoding.  So this whole thing
> actually happens to work as long as round tripping is possible between the
> involved encodings.

Hmm.  I didn't realize that we had encodings where round-tripping
wasn't possible.  I figured that if you could convert from A to B, you
would also be able to convert from B to A.  I see that this isn't
necessarily true in theory, but I had assumed (incorrectly, it seems)
that it was true in practice.  It seems very odd to me.

> Attached is a patch to illustrates how this could be fixed.  There might be
> similar issues elsewhere.  The notification propagation in particular could
> be affected.

Making the parallel worker always use the database encoding seems like
a good approach to me, but won't the changes you made to
HandleParallelMessage() leave the expect client encoding in the wrong
state if pq_parse_errornotice throws an error?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel workers and client encoding

2016-06-09 Thread Peter Eisentraut

On 6/6/16 9:45 PM, Peter Eisentraut wrote:

There appears to be a problem with how client encoding is handled in the
communication from parallel workers.


I have added this to the open items for now.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] parallel workers and client encoding

2016-06-06 Thread Peter Eisentraut
There appears to be a problem with how client encoding is handled in the 
communication from parallel workers.  In a parallel worker, the client 
encoding setting is inherited from its creating process as part of the 
GUC setup.  So any plain-text stuff the parallel worker sends to its 
leader is actually converted to the client encoding.  Since most data is 
sent in binary format, the plain-text provision applies mainly to notice 
and error messages.  At the other end, error messages are parsed using 
pq_parse_errornotice(), which internally uses routines that were meant 
for communication from the client, and therefore will convert everything 
back from the client encoding to the server encoding.  So this whole 
thing actually happens to work as long as round tripping is possible 
between the involved encodings.


In cases where it isn't, it's still hard to notice the difference 
because depending on whether you get a parallel plan or not, the 
following happens:


not parallel: conversion error happens between server and client, client 
sees an error message about that


parallel: conversion error happens between worker and leader, worker 
generates an error message about that, sends it to leader, leader 
forwards it to client


The client sees the same error message in both cases.

To construct a case where this makes a difference, the leader has to be 
set up to catch certain errors.  Here is an example:


"""
create table test1 (a int, b text);
truncate test1;
insert into test1 values (1, 'a');

create or replace function test1() returns text language plpgsql
as $$
declare
  res text;
begin
  perform from test1 where a = test2();
  return res;
exception when division_by_zero then
  return 'boom';
end;
$$;

create or replace function test2() returns int language plpgsql
parallel safe
as $$
begin
  raise division_by_zero using message = 'Motörhead';
  return 1;
end
$$;

set force_parallel_mode to on;

select test1();
"""

With client_encoding = server_encoding, this will return a single row 
'boom'.  But with, say, database encoding UTF8 and 
PGCLIENTENCODING=KOI8R, it will error:


ERROR:  22P05: character with byte sequence 0xef 0xbe 0x83 in encoding 
"UTF8" has no equivalent in encoding "KOI8R"

CONTEXT:  parallel worker

(Note that changing force_parallel_mode does not force replanning in 
plpgsql, so if you run test1() first before setting force_parallel_mode, 
then you won't get the error.)


Attached is a patch to illustrates how this could be fixed.  There might 
be similar issues elsewhere.  The notification propagation in particular 
could be affected.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/transam/parallel.c 
b/src/backend/access/transam/parallel.c
index 934dba8..d2105ec 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -775,6 +775,7 @@ HandleParallelMessage(ParallelContext *pcxt, int i, 
StringInfo msg)
ErrorData   edata;
ErrorContextCallback errctx;
ErrorContextCallback *save_error_context_stack;
+   int save_client_encoding;
 
/*
 * Rethrow the error using the error context 
callbacks that
@@ -787,9 +788,14 @@ HandleParallelMessage(ParallelContext *pcxt, int i, 
StringInfo msg)
errctx.previous = pcxt->error_context_stack;
error_context_stack = 
 
+   save_client_encoding = pg_get_client_encoding();
+   SetClientEncoding(GetDatabaseEncoding());
+
/* Parse ErrorResponse or NoticeResponse. */
pq_parse_errornotice(msg, );
 
+   SetClientEncoding(save_client_encoding);
+
/* Death of a worker isn't enough justification 
for suicide. */
edata.elevel = Min(edata.elevel, ERROR);
 
@@ -989,6 +995,7 @@ ParallelWorkerMain(Datum main_arg)
StartTransactionCommand();
RestoreGUCState(gucspace);
CommitTransactionCommand();
+   SetClientEncoding(GetDatabaseEncoding());
 
/* Crank up a transaction state appropriate to a parallel worker. */
tstatespace = shm_toc_lookup(toc, PARALLEL_KEY_TRANSACTION_STATE);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers