Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
The subject of this thread is not directly related to this discussion and we have a new thread [1] for relevant discussion. So, let's discuss this further on that thread. [1] https://www.postgresql.org/message-id/CAF1DzPU8Kx%2BfMXEbFoP289xtm3bz3t%2BZfxhmKavr98Bh-C0TqQ%40mail.gmail.com On Tue, Apr 18, 2017 at 9:30 AM, Kyotaro HORIGUCHI wrote: > At Tue, 18 Apr 2017 09:12:07 +0530, Ashutosh Bapat > wrote in > >> On Tue, Apr 18, 2017 at 8:12 AM, Kyotaro HORIGUCHI >> wrote: >> > At Mon, 17 Apr 2017 17:50:58 +0530, Ashutosh Bapat >> > wrote in >> > >> >> On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI >> >> wrote: >> >> > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas >> >> > wrote in >> >> > >> >> >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas >> >> >> wrote: >> >> >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier >> >> >> > wrote: >> >> >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita >> >> >> >> wrote: >> >> >> >>> Attached is an updated version of the patch, which modified >> >> >> >>> Michael's >> >> >> >>> version of the patch, as I proposed in [1] (see "Other changes:"). >> >> >> >>> I >> >> >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, >> >> >> >>> mainly because >> >> >> >>> words like "non-blocking mode" there seems confusing (note that we >> >> >> >>> have >> >> >> >>> PQsetnonbloking). >> >> >> >> >> >> >> >> OK, so that is what I sent except that the comments mentioning >> >> >> >> PG_TRY >> >> >> >> are moved to their correct places. That's fine for me. Thanks for >> >> >> >> gathering everything in a single patch and correcting it. >> >> >> > >> >> >> > I have committed this patch. Thanks for working on this. Sorry for >> >> >> > the delay. >> >> >> >> >> >> This 9.6-era patch, as it turns out, has a problem, which is that we >> >> >> now respond to an interrupt by sending a cancel request and a >> >> >> NON-interruptible ABORT TRANSACTION command to the remote side. If >> >> >> the reason that the user is trying to interrupt is that the network >> >> >> connection has been cut, they interrupt the original query only to get >> >> >> stuck in a non-interruptible wait for ABORT TRANSACTION. That is >> >> >> non-optimal. >> >> > >> >> > Agreed. >> >> > >> >> >> It is not exactly clear to me how to fix this. Could we get by with >> >> >> just slamming the remote connection shut, instead of sending an >> >> >> explicit ABORT TRANSACTION? The remote side ought to treat a >> >> >> disconnect as equivalent to an ABORT anyway, I think, but maybe our >> >> >> local state would get confused. (I have not checked.) >> >> >> >> >> >> Thoughts? >> >> > >> >> > Perhaps we will get stuck at query cancellation before ABORT >> >> > TRANSACTION in the case. A connection will be shut down when >> >> > anything wrong (PQstatus(conn) != CONNECTION_OK and so) on >> >> > aborting local transactoin . So I don't think fdw gets confused >> >> > or sholdn't be confused by shutting down there. >> >> > >> >> > The most significant issue I can see is that the same thing >> >> > happens in the case of graceful ABORT TRANSACTION. It could be a >> >> > performance issue. >> >> > >> >> > We could set timeout here but maybe we can just slamming the >> >> > connection down instead of sending a query cancellation. It is >> >> > caused only by timeout or interrupts so I suppose it is not a >> >> > problem *with a few connections*. >> >> > >> >> > >> >> > Things are a bit diffent with hundreds of connections. The >> >> > penalty of reconnection would be very high in the case. >> >> > >> >> > If we are not willing to pay such high penalty, maybe we are to >> >> > manage busy-idle time of each connection and trying graceful >> >> > abort if it is short enough, maybe having a shoft timeout. >> >> > >> >> > Furthermore, if most or all of the hundreds of connections get >> >> > stuck, such timeout will accumulate up like a mountain... >> >> >> >> Even when the transaction is aborted because a user cancels a query, >> >> we do want to preserve the connection, if possible, to avoid >> > >> > Yes. >> > >> >> reconnection. If the request to cancel the query itself fails, we >> >> should certainly drop the connection. Here's the patch to do that. >> > >> > A problem I think on this would be that we still try to make >> > another connection for canceling and it would stall for several >> > minutes per connection on a packet stall, which should be done in >> > a second on ordinary circumstances. Perhaps we might want here is >> > async canceling with timeout. >> >> I am not sure what do you mean "make another connection for >> cancelling.". Do you mean to say that PQcancel would make another >> connection? > > Yes. It will take about 3 minutes on standard settings when no > ACK returned. I thought that this discussion is based on such > situation. > >> The patch proposed simply improves the condition when PQcancel has >> returned a failure. Right now we ignore that failure and
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
At Tue, 18 Apr 2017 09:12:07 +0530, Ashutosh Bapat wrote in > On Tue, Apr 18, 2017 at 8:12 AM, Kyotaro HORIGUCHI > wrote: > > At Mon, 17 Apr 2017 17:50:58 +0530, Ashutosh Bapat > > wrote in > > > >> On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI > >> wrote: > >> > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas > >> > wrote in > >> > > >> >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas > >> >> wrote: > >> >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier > >> >> > wrote: > >> >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita > >> >> >> wrote: > >> >> >>> Attached is an updated version of the patch, which modified > >> >> >>> Michael's > >> >> >>> version of the patch, as I proposed in [1] (see "Other changes:"). > >> >> >>> I > >> >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, > >> >> >>> mainly because > >> >> >>> words like "non-blocking mode" there seems confusing (note that we > >> >> >>> have > >> >> >>> PQsetnonbloking). > >> >> >> > >> >> >> OK, so that is what I sent except that the comments mentioning PG_TRY > >> >> >> are moved to their correct places. That's fine for me. Thanks for > >> >> >> gathering everything in a single patch and correcting it. > >> >> > > >> >> > I have committed this patch. Thanks for working on this. Sorry for > >> >> > the delay. > >> >> > >> >> This 9.6-era patch, as it turns out, has a problem, which is that we > >> >> now respond to an interrupt by sending a cancel request and a > >> >> NON-interruptible ABORT TRANSACTION command to the remote side. If > >> >> the reason that the user is trying to interrupt is that the network > >> >> connection has been cut, they interrupt the original query only to get > >> >> stuck in a non-interruptible wait for ABORT TRANSACTION. That is > >> >> non-optimal. > >> > > >> > Agreed. > >> > > >> >> It is not exactly clear to me how to fix this. Could we get by with > >> >> just slamming the remote connection shut, instead of sending an > >> >> explicit ABORT TRANSACTION? The remote side ought to treat a > >> >> disconnect as equivalent to an ABORT anyway, I think, but maybe our > >> >> local state would get confused. (I have not checked.) > >> >> > >> >> Thoughts? > >> > > >> > Perhaps we will get stuck at query cancellation before ABORT > >> > TRANSACTION in the case. A connection will be shut down when > >> > anything wrong (PQstatus(conn) != CONNECTION_OK and so) on > >> > aborting local transactoin . So I don't think fdw gets confused > >> > or sholdn't be confused by shutting down there. > >> > > >> > The most significant issue I can see is that the same thing > >> > happens in the case of graceful ABORT TRANSACTION. It could be a > >> > performance issue. > >> > > >> > We could set timeout here but maybe we can just slamming the > >> > connection down instead of sending a query cancellation. It is > >> > caused only by timeout or interrupts so I suppose it is not a > >> > problem *with a few connections*. > >> > > >> > > >> > Things are a bit diffent with hundreds of connections. The > >> > penalty of reconnection would be very high in the case. > >> > > >> > If we are not willing to pay such high penalty, maybe we are to > >> > manage busy-idle time of each connection and trying graceful > >> > abort if it is short enough, maybe having a shoft timeout. > >> > > >> > Furthermore, if most or all of the hundreds of connections get > >> > stuck, such timeout will accumulate up like a mountain... > >> > >> Even when the transaction is aborted because a user cancels a query, > >> we do want to preserve the connection, if possible, to avoid > > > > Yes. > > > >> reconnection. If the request to cancel the query itself fails, we > >> should certainly drop the connection. Here's the patch to do that. > > > > A problem I think on this would be that we still try to make > > another connection for canceling and it would stall for several > > minutes per connection on a packet stall, which should be done in > > a second on ordinary circumstances. Perhaps we might want here is > > async canceling with timeout. > > I am not sure what do you mean "make another connection for > cancelling.". Do you mean to say that PQcancel would make another > connection? Yes. It will take about 3 minutes on standard settings when no ACK returned. I thought that this discussion is based on such situation. > The patch proposed simply improves the condition when PQcancel has > returned a failure. Right now we ignore that failure and try to ABORT > the transaction, which is most probably going to get stalled or fail > to be dispatched. With the patch such a connection will be reset. Ah, I understand that. It is surely an improvement since it avoids useless ABORT TRANSACTION that is known to stall. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.o
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Tue, Apr 18, 2017 at 8:12 AM, Kyotaro HORIGUCHI wrote: > At Mon, 17 Apr 2017 17:50:58 +0530, Ashutosh Bapat > wrote in > >> On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI >> wrote: >> > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas >> > wrote in >> > >> >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas >> >> wrote: >> >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier >> >> > wrote: >> >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita >> >> >> wrote: >> >> >>> Attached is an updated version of the patch, which modified Michael's >> >> >>> version of the patch, as I proposed in [1] (see "Other changes:"). I >> >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly >> >> >>> because >> >> >>> words like "non-blocking mode" there seems confusing (note that we >> >> >>> have >> >> >>> PQsetnonbloking). >> >> >> >> >> >> OK, so that is what I sent except that the comments mentioning PG_TRY >> >> >> are moved to their correct places. That's fine for me. Thanks for >> >> >> gathering everything in a single patch and correcting it. >> >> > >> >> > I have committed this patch. Thanks for working on this. Sorry for >> >> > the delay. >> >> >> >> This 9.6-era patch, as it turns out, has a problem, which is that we >> >> now respond to an interrupt by sending a cancel request and a >> >> NON-interruptible ABORT TRANSACTION command to the remote side. If >> >> the reason that the user is trying to interrupt is that the network >> >> connection has been cut, they interrupt the original query only to get >> >> stuck in a non-interruptible wait for ABORT TRANSACTION. That is >> >> non-optimal. >> > >> > Agreed. >> > >> >> It is not exactly clear to me how to fix this. Could we get by with >> >> just slamming the remote connection shut, instead of sending an >> >> explicit ABORT TRANSACTION? The remote side ought to treat a >> >> disconnect as equivalent to an ABORT anyway, I think, but maybe our >> >> local state would get confused. (I have not checked.) >> >> >> >> Thoughts? >> > >> > Perhaps we will get stuck at query cancellation before ABORT >> > TRANSACTION in the case. A connection will be shut down when >> > anything wrong (PQstatus(conn) != CONNECTION_OK and so) on >> > aborting local transactoin . So I don't think fdw gets confused >> > or sholdn't be confused by shutting down there. >> > >> > The most significant issue I can see is that the same thing >> > happens in the case of graceful ABORT TRANSACTION. It could be a >> > performance issue. >> > >> > We could set timeout here but maybe we can just slamming the >> > connection down instead of sending a query cancellation. It is >> > caused only by timeout or interrupts so I suppose it is not a >> > problem *with a few connections*. >> > >> > >> > Things are a bit diffent with hundreds of connections. The >> > penalty of reconnection would be very high in the case. >> > >> > If we are not willing to pay such high penalty, maybe we are to >> > manage busy-idle time of each connection and trying graceful >> > abort if it is short enough, maybe having a shoft timeout. >> > >> > Furthermore, if most or all of the hundreds of connections get >> > stuck, such timeout will accumulate up like a mountain... >> >> Even when the transaction is aborted because a user cancels a query, >> we do want to preserve the connection, if possible, to avoid > > Yes. > >> reconnection. If the request to cancel the query itself fails, we >> should certainly drop the connection. Here's the patch to do that. > > A problem I think on this would be that we still try to make > another connection for canceling and it would stall for several > minutes per connection on a packet stall, which should be done in > a second on ordinary circumstances. Perhaps we might want here is > async canceling with timeout. I am not sure what do you mean "make another connection for cancelling.". Do you mean to say that PQcancel would make another connection? The patch proposed simply improves the condition when PQcancel has returned a failure. Right now we ignore that failure and try to ABORT the transaction, which is most probably going to get stalled or fail to be dispatched. With the patch such a connection will be reset. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
At Mon, 17 Apr 2017 17:50:58 +0530, Ashutosh Bapat wrote in > On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI > wrote: > > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas > > wrote in > > > >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas > >> wrote: > >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier > >> > wrote: > >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita > >> >> wrote: > >> >>> Attached is an updated version of the patch, which modified Michael's > >> >>> version of the patch, as I proposed in [1] (see "Other changes:"). I > >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly > >> >>> because > >> >>> words like "non-blocking mode" there seems confusing (note that we have > >> >>> PQsetnonbloking). > >> >> > >> >> OK, so that is what I sent except that the comments mentioning PG_TRY > >> >> are moved to their correct places. That's fine for me. Thanks for > >> >> gathering everything in a single patch and correcting it. > >> > > >> > I have committed this patch. Thanks for working on this. Sorry for the > >> > delay. > >> > >> This 9.6-era patch, as it turns out, has a problem, which is that we > >> now respond to an interrupt by sending a cancel request and a > >> NON-interruptible ABORT TRANSACTION command to the remote side. If > >> the reason that the user is trying to interrupt is that the network > >> connection has been cut, they interrupt the original query only to get > >> stuck in a non-interruptible wait for ABORT TRANSACTION. That is > >> non-optimal. > > > > Agreed. > > > >> It is not exactly clear to me how to fix this. Could we get by with > >> just slamming the remote connection shut, instead of sending an > >> explicit ABORT TRANSACTION? The remote side ought to treat a > >> disconnect as equivalent to an ABORT anyway, I think, but maybe our > >> local state would get confused. (I have not checked.) > >> > >> Thoughts? > > > > Perhaps we will get stuck at query cancellation before ABORT > > TRANSACTION in the case. A connection will be shut down when > > anything wrong (PQstatus(conn) != CONNECTION_OK and so) on > > aborting local transactoin . So I don't think fdw gets confused > > or sholdn't be confused by shutting down there. > > > > The most significant issue I can see is that the same thing > > happens in the case of graceful ABORT TRANSACTION. It could be a > > performance issue. > > > > We could set timeout here but maybe we can just slamming the > > connection down instead of sending a query cancellation. It is > > caused only by timeout or interrupts so I suppose it is not a > > problem *with a few connections*. > > > > > > Things are a bit diffent with hundreds of connections. The > > penalty of reconnection would be very high in the case. > > > > If we are not willing to pay such high penalty, maybe we are to > > manage busy-idle time of each connection and trying graceful > > abort if it is short enough, maybe having a shoft timeout. > > > > Furthermore, if most or all of the hundreds of connections get > > stuck, such timeout will accumulate up like a mountain... > > Even when the transaction is aborted because a user cancels a query, > we do want to preserve the connection, if possible, to avoid Yes. > reconnection. If the request to cancel the query itself fails, we > should certainly drop the connection. Here's the patch to do that. A problem I think on this would be that we still try to make another connection for canceling and it would stall for several minutes per connection on a packet stall, which should be done in a second on ordinary circumstances. Perhaps we might want here is async canceling with timeout. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI wrote: > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas wrote > in >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas wrote: >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier >> > wrote: >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita >> >> wrote: >> >>> Attached is an updated version of the patch, which modified Michael's >> >>> version of the patch, as I proposed in [1] (see "Other changes:"). I >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly >> >>> because >> >>> words like "non-blocking mode" there seems confusing (note that we have >> >>> PQsetnonbloking). >> >> >> >> OK, so that is what I sent except that the comments mentioning PG_TRY >> >> are moved to their correct places. That's fine for me. Thanks for >> >> gathering everything in a single patch and correcting it. >> > >> > I have committed this patch. Thanks for working on this. Sorry for the >> > delay. >> >> This 9.6-era patch, as it turns out, has a problem, which is that we >> now respond to an interrupt by sending a cancel request and a >> NON-interruptible ABORT TRANSACTION command to the remote side. If >> the reason that the user is trying to interrupt is that the network >> connection has been cut, they interrupt the original query only to get >> stuck in a non-interruptible wait for ABORT TRANSACTION. That is >> non-optimal. > > Agreed. > >> It is not exactly clear to me how to fix this. Could we get by with >> just slamming the remote connection shut, instead of sending an >> explicit ABORT TRANSACTION? The remote side ought to treat a >> disconnect as equivalent to an ABORT anyway, I think, but maybe our >> local state would get confused. (I have not checked.) >> >> Thoughts? > > Perhaps we will get stuck at query cancellation before ABORT > TRANSACTION in the case. A connection will be shut down when > anything wrong (PQstatus(conn) != CONNECTION_OK and so) on > aborting local transactoin . So I don't think fdw gets confused > or sholdn't be confused by shutting down there. > > The most significant issue I can see is that the same thing > happens in the case of graceful ABORT TRANSACTION. It could be a > performance issue. > > We could set timeout here but maybe we can just slamming the > connection down instead of sending a query cancellation. It is > caused only by timeout or interrupts so I suppose it is not a > problem *with a few connections*. > > > Things are a bit diffent with hundreds of connections. The > penalty of reconnection would be very high in the case. > > If we are not willing to pay such high penalty, maybe we are to > manage busy-idle time of each connection and trying graceful > abort if it is short enough, maybe having a shoft timeout. > > Furthermore, if most or all of the hundreds of connections get > stuck, such timeout will accumulate up like a mountain... Even when the transaction is aborted because a user cancels a query, we do want to preserve the connection, if possible, to avoid reconnection. If the request to cancel the query itself fails, we should certainly drop the connection. Here's the patch to do that. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company 0001-Drop-connection-when-query-can-not-be-cancelled.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas wrote in > On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas wrote: > > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier > > wrote: > >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita > >> wrote: > >>> Attached is an updated version of the patch, which modified Michael's > >>> version of the patch, as I proposed in [1] (see "Other changes:"). I > >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly > >>> because > >>> words like "non-blocking mode" there seems confusing (note that we have > >>> PQsetnonbloking). > >> > >> OK, so that is what I sent except that the comments mentioning PG_TRY > >> are moved to their correct places. That's fine for me. Thanks for > >> gathering everything in a single patch and correcting it. > > > > I have committed this patch. Thanks for working on this. Sorry for the > > delay. > > This 9.6-era patch, as it turns out, has a problem, which is that we > now respond to an interrupt by sending a cancel request and a > NON-interruptible ABORT TRANSACTION command to the remote side. If > the reason that the user is trying to interrupt is that the network > connection has been cut, they interrupt the original query only to get > stuck in a non-interruptible wait for ABORT TRANSACTION. That is > non-optimal. Agreed. > It is not exactly clear to me how to fix this. Could we get by with > just slamming the remote connection shut, instead of sending an > explicit ABORT TRANSACTION? The remote side ought to treat a > disconnect as equivalent to an ABORT anyway, I think, but maybe our > local state would get confused. (I have not checked.) > > Thoughts? Perhaps we will get stuck at query cancellation before ABORT TRANSACTION in the case. A connection will be shut down when anything wrong (PQstatus(conn) != CONNECTION_OK and so) on aborting local transactoin . So I don't think fdw gets confused or sholdn't be confused by shutting down there. The most significant issue I can see is that the same thing happens in the case of graceful ABORT TRANSACTION. It could be a performance issue. We could set timeout here but maybe we can just slamming the connection down instead of sending a query cancellation. It is caused only by timeout or interrupts so I suppose it is not a problem *with a few connections*. Things are a bit diffent with hundreds of connections. The penalty of reconnection would be very high in the case. If we are not willing to pay such high penalty, maybe we are to manage busy-idle time of each connection and trying graceful abort if it is short enough, maybe having a shoft timeout. Furthermore, if most or all of the hundreds of connections get stuck, such timeout will accumulate up like a mountain... regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas wrote: > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier > wrote: >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita >> wrote: >>> Attached is an updated version of the patch, which modified Michael's >>> version of the patch, as I proposed in [1] (see "Other changes:"). I >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because >>> words like "non-blocking mode" there seems confusing (note that we have >>> PQsetnonbloking). >> >> OK, so that is what I sent except that the comments mentioning PG_TRY >> are moved to their correct places. That's fine for me. Thanks for >> gathering everything in a single patch and correcting it. > > I have committed this patch. Thanks for working on this. Sorry for the > delay. This 9.6-era patch, as it turns out, has a problem, which is that we now respond to an interrupt by sending a cancel request and a NON-interruptible ABORT TRANSACTION command to the remote side. If the reason that the user is trying to interrupt is that the network connection has been cut, they interrupt the original query only to get stuck in a non-interruptible wait for ABORT TRANSACTION. That is non-optimal. It is not exactly clear to me how to fix this. Could we get by with just slamming the remote connection shut, instead of sending an explicit ABORT TRANSACTION? The remote side ought to treat a disconnect as equivalent to an ABORT anyway, I think, but maybe our local state would get confused. (I have not checked.) Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/05/17 0:25, Robert Haas wrote: On Wed, May 11, 2016 at 3:20 AM, Etsuro Fujita wrote: Thanks for the review! I'll add this to the next CF. I think this should be addressed in advance of the release of 9.6, though. I agree. Committed. Thanks! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, May 11, 2016 at 3:20 AM, Etsuro Fujita wrote: > Thanks for the review! > > I'll add this to the next CF. I think this should be addressed in advance > of the release of 9.6, though. I agree. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/28 13:45, Michael Paquier wrote: On Wed, Apr 27, 2016 at 12:16 PM, Etsuro Fujita wrote: On 2016/04/26 21:45, Etsuro Fujita wrote: While re-reviewing the fix, I noticed that since PQcancel we added to pgfdw_xact_callback to cancel a DML pushdown query isn't followed by a ROLLBACK, the connection to the remote server will be discarded at the end of the while loop in that function, which will cause a FATAL error of "connection to client lost". Probably, that was proposed by me in the first version of the patch, but I don't think that's a good idea. Shouldn't we execute ROLLBACK after that PQcancel? Another thing I noticed is, ISTM that we miss the case where DML pushdown queries are performed in subtransactions. I think cancellation logic would also need to be added to pgfdw_subxact_callback. Attached is a patch for that. I have spent some time looking at that... And yeah, losing the connection because of that is a little bit annoying if there are ways to make things clean, and as a START TRANSACTION is always sent for such queries it seems really better to issue a ROLLBACK in any case. Actually, by using PQcancel there is no way to be sure if the cancel will be effective or not. So it could be possible that the command is still able to complete correctly, or it could be able to cancel correctly and it would return an ERROR earlier. In any case, doing the ROLLBACK unconditionally seems adapted to me because we had better clean up the remote state in both cases. Thanks for the review! I'll add this to the next CF. I think this should be addressed in advance of the release of 9.6, though. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Apr 27, 2016 at 12:16 PM, Etsuro Fujita wrote: > On 2016/04/26 21:45, Etsuro Fujita wrote: >> While re-reviewing the fix, I noticed that since PQcancel we added to >> pgfdw_xact_callback to cancel a DML pushdown query isn't followed by a >> ROLLBACK, the connection to the remote server will be discarded at the >> end of the while loop in that function, which will cause a FATAL error >> of "connection to client lost". Probably, that was proposed by me in >> the first version of the patch, but I don't think that's a good idea. >> Shouldn't we execute ROLLBACK after that PQcancel? >> >> Another thing I noticed is, ISTM that we miss the case where DML >> pushdown queries are performed in subtransactions. I think cancellation >> logic would also need to be added to pgfdw_subxact_callback. > > > Attached is a patch for that. I have spent some time looking at that... And yeah, losing the connection because of that is a little bit annoying if there are ways to make things clean, and as a START TRANSACTION is always sent for such queries it seems really better to issue a ROLLBACK in any case. Actually, by using PQcancel there is no way to be sure if the cancel will be effective or not. So it could be possible that the command is still able to complete correctly, or it could be able to cancel correctly and it would return an ERROR earlier. In any case, doing the ROLLBACK unconditionally seems adapted to me because we had better clean up the remote state in both cases. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/26 21:45, Etsuro Fujita wrote: While re-reviewing the fix, I noticed that since PQcancel we added to pgfdw_xact_callback to cancel a DML pushdown query isn't followed by a ROLLBACK, the connection to the remote server will be discarded at the end of the while loop in that function, which will cause a FATAL error of "connection to client lost". Probably, that was proposed by me in the first version of the patch, but I don't think that's a good idea. Shouldn't we execute ROLLBACK after that PQcancel? Another thing I noticed is, ISTM that we miss the case where DML pushdown queries are performed in subtransactions. I think cancellation logic would also need to be added to pgfdw_subxact_callback. Attached is a patch for that. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/connection.c --- b/contrib/postgres_fdw/connection.c *** *** 677,684 pgfdw_xact_callback(XactEvent event, void *arg) * using an asynchronous execution function, the command * might not have yet completed. Check to see if a command * is still being processed by the remote server, and if so, ! * request cancellation of the command; if not, abort ! * gracefully. */ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) { --- 677,683 * using an asynchronous execution function, the command * might not have yet completed. Check to see if a command * is still being processed by the remote server, and if so, ! * request cancellation of the command. */ if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) { *** *** 694,700 pgfdw_xact_callback(XactEvent event, void *arg) errbuf))); PQfreeCancel(cancel); } - break; } /* If we're aborting, abort all remote transactions too */ --- 693,698 *** *** 798,803 pgfdw_subxact_callback(SubXactEvent event, SubTransactionId mySubid, --- 796,825 { /* Assume we might have lost track of prepared statements */ entry->have_error = true; + + /* + * If a command has been submitted to the remote server by using an + * asynchronous execution function, the command might not have yet + * completed. Check to see if a command is still being processed by + * the remote server, and if so, request cancellation of the + * command. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + if (!PQcancel(cancel, errbuf, sizeof(errbuf))) + ereport(WARNING, + (errcode(ERRCODE_CONNECTION_FAILURE), + errmsg("could not send cancel request: %s", + errbuf))); + PQfreeCancel(cancel); + } + } + /* Rollback all remote subtransactions during abort */ snprintf(sql, sizeof(sql), "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d", -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Hi, While re-reviewing the fix, I noticed that since PQcancel we added to pgfdw_xact_callback to cancel a DML pushdown query isn't followed by a ROLLBACK, the connection to the remote server will be discarded at the end of the while loop in that function, which will cause a FATAL error of "connection to client lost". Probably, that was proposed by me in the first version of the patch, but I don't think that's a good idea. Shouldn't we execute ROLLBACK after that PQcancel? Another thing I noticed is, ISTM that we miss the case where DML pushdown queries are performed in subtransactions. I think cancellation logic would also need to be added to pgfdw_subxact_callback. Comments are welcome! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/22 9:22, Michael Paquier wrote: On Thu, Apr 21, 2016 at 11:53 PM, Robert Haas wrote: I have committed this patch. Thanks for working on this. Sorry for the delay. Thanks for the push. open_item--; Thank you, Robert and Michael. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Apr 21, 2016 at 11:53 PM, Robert Haas wrote: > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier > wrote: >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita >> wrote: >>> Attached is an updated version of the patch, which modified Michael's >>> version of the patch, as I proposed in [1] (see "Other changes:"). I >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because >>> words like "non-blocking mode" there seems confusing (note that we have >>> PQsetnonbloking). >> >> OK, so that is what I sent except that the comments mentioning PG_TRY >> are moved to their correct places. That's fine for me. Thanks for >> gathering everything in a single patch and correcting it. > > I have committed this patch. Thanks for working on this. Sorry for the > delay. Thanks for the push. open_item--; -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier wrote: > On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita > wrote: >> Attached is an updated version of the patch, which modified Michael's >> version of the patch, as I proposed in [1] (see "Other changes:"). I >> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because >> words like "non-blocking mode" there seems confusing (note that we have >> PQsetnonbloking). > > OK, so that is what I sent except that the comments mentioning PG_TRY > are moved to their correct places. That's fine for me. Thanks for > gathering everything in a single patch and correcting it. I have committed this patch. Thanks for working on this. Sorry for the delay. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita wrote: > Attached is an updated version of the patch, which modified Michael's > version of the patch, as I proposed in [1] (see "Other changes:"). I > modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because > words like "non-blocking mode" there seems confusing (note that we have > PQsetnonbloking). OK, so that is what I sent except that the comments mentioning PG_TRY are moved to their correct places. That's fine for me. Thanks for gathering everything in a single patch and correcting it. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/19 14:25, Etsuro Fujita wrote: On 2016/04/19 13:59, Michael Paquier wrote: But huge objection to that because this fragilizes the current logic postgres_fdw is based on: PQexec returns the last result to caller, I'd rather not break that logic for 9.6 stability's sake. IIUC, I think each query submitted by PQexec in postgres_fdw.c contains just a single command. Maybe I'm missing something, though. A even better proof of that is the following, which just emulates what your version of pgfdw_get_result is doing when consuming the results. + /* Verify that there are no more results */ + res = pgfdw_get_result(fmstate->conn, fmstate->query); + if (res != NULL) + pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query); This could even lead to incorrect errors in the future if multiple queries are combined with those DMLs for a reason or another. I'd like to leave such enhancements for future work... On reflection, I'd like to agree with Michael on that point. As mentioned by him, we should not lose the future extendability. And I don't think his version of pgfdw_get_result would damage the robustness of in-postgres_fdw.c functions using that, which was my concern. Sorry for the noise. Attached is an updated version of the patch, which modified Michael's version of the patch, as I proposed in [1] (see "Other changes:"). I modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because words like "non-blocking mode" there seems confusing (note that we have PQsetnonbloking). Best regards, Etsuro Fujita [1] http://www.postgresql.org/message-id/5715b08a.2030...@lab.ntt.co.jp diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 189f290..16ef38f 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -17,6 +17,7 @@ #include "access/xact.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "storage/latch.h" #include "utils/hsearch.h" #include "utils/memutils.h" @@ -448,6 +449,78 @@ GetPrepStmtNumber(PGconn *conn) } /* + * Submit a query and wait for the result. + * + * This function is interruptible by signals. + * + * Caller is responsible for the error handling on the result. + */ +PGresult * +pgfdw_exec_query(PGconn *conn, const char *query) +{ + /* + * Submit a query. Since we don't use non-blocking mode, this also can + * block. But its risk is relatively small, so we ignore that for now. + */ + if (!PQsendQuery(conn, query)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + + /* Wait for the result. */ + return pgfdw_get_result(conn, query); +} + +/* + * Wait for the result from a prior asynchronous execution function call. + * + * This function offers quick responsiveness by checking for any interruptions. + * + * This function emulates the PQexec()'s behavior of returning the last result + * when there are many. + * + * Caller is responsible for the error handling on the result. + */ +PGresult * +pgfdw_get_result(PGconn *conn, const char *query) +{ + PGresult *last_res = NULL; + + for (;;) + { + PGresult *res; + + while (PQisBusy(conn)) + { + int wc; + + /* Sleep until there's something to do */ + wc = WaitLatchOrSocket(MyLatch, + WL_LATCH_SET | WL_SOCKET_READABLE, + PQsocket(conn), + -1L); + ResetLatch(MyLatch); + + CHECK_FOR_INTERRUPTS(); + + /* Data available in socket */ + if (wc & WL_SOCKET_READABLE) + { +if (!PQconsumeInput(conn)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + } + } + + res = PQgetResult(conn); + if (res == NULL) + break;/* query is complete */ + + PQclear(last_res); + last_res = res; + } + + return last_res; +} + +/* * Report an error we got from the remote server. * * elevel: error level to use (typically ERROR, but might be less) @@ -598,6 +671,32 @@ pgfdw_xact_callback(XactEvent event, void *arg) case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + + /* + * If a command has been submitted to the remote server by + * using an asynchronous execution function, the command + * might not have yet completed. Check to see if a command + * is still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + if (!PQcancel(cancel, errbuf, sizeof(errbuf))) +ereport(WARNING, + (errcode(ERRCODE_CONNECTION_FAILURE), + errmsg("could not send cancel request: %s", +errbuf))); + PQfreeCancel(cancel); + } + break; + } + /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn,
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/19 13:59, Michael Paquier wrote: On Tue, Apr 19, 2016 at 1:14 PM, Etsuro Fujita wrote: What do you think about that? + /* Wait for the result */ + res = pgfdw_get_result(conn, query); + if (res == NULL) + pgfdw_report_error(ERROR, NULL, conn, false, query); + last_res = res; + + /* +* Verify that there are no more results +* +* We don't use a PG_TRY block here, so be careful not to throw error +* without releasing the PGresult. +*/ + res = pgfdw_get_result(conn, query); + if (res != NULL) + { + PQclear(last_res); + pgfdw_report_error(ERROR, res, conn, true, query); + } But huge objection to that because this fragilizes the current logic postgres_fdw is based on: PQexec returns the last result to caller, I'd rather not break that logic for 9.6 stability's sake. IIUC, I think each query submitted by PQexec in postgres_fdw.c contains just a single command. Maybe I'm missing something, though. A even better proof of that is the following, which just emulates what your version of pgfdw_get_result is doing when consuming the results. + /* Verify that there are no more results */ + res = pgfdw_get_result(fmstate->conn, fmstate->query); + if (res != NULL) + pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query); This could even lead to incorrect errors in the future if multiple queries are combined with those DMLs for a reason or another. I'd like to leave such enhancements for future work... Thanks for the comment! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Tue, Apr 19, 2016 at 1:14 PM, Etsuro Fujita wrote: > The comment "We don't use a PG_TRY block here ..." seems to be wrongly > placed, so I moved that comment. Also, I think it'd be better to call > pgfdw_report_error() with the clear argument set to false, not true, since > we don't need to clear the PGresult. Same for postgresExecForeignUpdate, > postgresExecForeignUpdate, and prepare_foreign_modify. No objection to moving those comment blocks where pgfdw_get_result is called. > What do you think about that? + /* Wait for the result */ + res = pgfdw_get_result(conn, query); + if (res == NULL) + pgfdw_report_error(ERROR, NULL, conn, false, query); + last_res = res; + + /* +* Verify that there are no more results +* +* We don't use a PG_TRY block here, so be careful not to throw error +* without releasing the PGresult. +*/ + res = pgfdw_get_result(conn, query); + if (res != NULL) + { + PQclear(last_res); + pgfdw_report_error(ERROR, res, conn, true, query); + } But huge objection to that because this fragilizes the current logic postgres_fdw is based on: PQexec returns the last result to caller, I'd rather not break that logic for 9.6 stability's sake. A even better proof of that is the following, which just emulates what your version of pgfdw_get_result is doing when consuming the results. + /* Verify that there are no more results */ + res = pgfdw_get_result(fmstate->conn, fmstate->query); + if (res != NULL) + pgfdw_report_error(ERROR, res, fmstate->conn, true, fmstate->query); This could even lead to incorrect errors in the future if multiple queries are combined with those DMLs for a reason or another. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/19 12:45, Etsuro Fujita wrote: On 2016/04/19 12:26, Michael Paquier wrote: Note for Robert: pgfdw_get_result copycats PQexec by discarding all PQresult received except the last one. I think that's fine for the purposes of postgres_fdw, but perhaps you have a different opinion on the matter. That seemed reasonable to me, but sorry, on second thought, I'm not sure that's still a good idea. One reason is (1) I think it's better for the in-postgres_fdw.c functions using pgfdw_get_result to verify that there are no more results, in itself. I think that would improve the robustness of those functions. Another reason is I don't think pgfdw_report_error, which is used in pgfdw_get_result, works well for cases where the query contains multiple SQL commands. So, +1 for the idea of simply encapsulating the while (PQisBusy(...)) loop into a new function pgfdw_get_result(). Here is a proposed patch for that. Other changes: * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - res = PQexecPrepared(fmstate->conn, -fmstate->p_name, -fmstate->p_nums, -p_values, -NULL, -NULL, -0); + if (!PQsendQueryPrepared(fmstate->conn, +fmstate->p_name, +fmstate->p_nums, +p_values, +NULL, +NULL, +0)) + pgfdw_report_error(ERROR, NULL, fmstate->conn, true, fmstate->query); The comment "We don't use a PG_TRY block here ..." seems to be wrongly placed, so I moved that comment. Also, I think it'd be better to call pgfdw_report_error() with the clear argument set to false, not true, since we don't need to clear the PGresult. Same for postgresExecForeignUpdate, postgresExecForeignUpdate, and prepare_foreign_modify. What do you think about that? Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/connection.c --- b/contrib/postgres_fdw/connection.c *** *** 17,22 --- 17,23 #include "access/xact.h" #include "mb/pg_wchar.h" #include "miscadmin.h" + #include "storage/latch.h" #include "utils/hsearch.h" #include "utils/memutils.h" *** *** 448,453 GetPrepStmtNumber(PGconn *conn) --- 449,537 } /* + * Send a query using PQsendQuery, and wait for the results. + * + * This function is interruptible by signals. + * + * Note: we assume that the query string contains a single SQL command. + */ + PGresult * + pgfdw_exec_query(PGconn *conn, const char *query) + { + PGresult *res; + PGresult *last_res = NULL; + + /* + * Submit a query. Since we don't use non-blocking mode, this also can + * block. But its risk is relatively small, so we ignore that for now. + */ + if (!PQsendQuery(conn, query)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + + /* Wait for the result */ + res = pgfdw_get_result(conn, query); + if (res == NULL) + pgfdw_report_error(ERROR, NULL, conn, false, query); + last_res = res; + + /* + * Verify that there are no more results + * + * We don't use a PG_TRY block here, so be careful not to throw error + * without releasing the PGresult. + */ + res = pgfdw_get_result(conn, query); + if (res != NULL) + { + PQclear(last_res); + pgfdw_report_error(ERROR, res, conn, true, query); + } + + return last_res; + } + + /* + * Wait for the next result from a prior asynchronous execution function call, + * and return it. + * + * This function offers quick responsiveness by checking for any interruptions. + * + * Caller is responsible for the error handling on the fetched result. + * + * Note: we assume that the query string contains a single SQL command. + */ + PGresult * + pgfdw_get_result(PGconn *conn, const char *query) + { + /* + * Receive data until PQgetResult is ready to get the result without + * blocking. + */ + while (PQisBusy(conn)) + { + int wc; + + /* Sleep until there's something to do */ + wc = WaitLatchOrSocket(MyLatch, + WL_LATCH_SET | WL_SOCKET_READABLE, + PQsocket(conn), + -1L); + ResetLatch(MyLatch); + + CHECK_FOR_INTERRUPTS(); + + /* Data available in socket */ + if (wc & WL_SOCKET_READABLE) + { + if (!PQconsumeInput(conn)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + } + } + + return PQgetResult(conn); + } + + /* * Report an error we got from the remote server
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/19 12:26, Michael Paquier wrote: On Tue, Apr 19, 2016 at 12:16 PM, Noah Misch wrote: On Sat, Apr 16, 2016 at 08:59:40AM +0900, Michael Paquier wrote: Here is a new version. I just recalled that I forgot a PQclear() call to clean up results. Thanks for updating the patch! Robert, the deadline to fix this open item expired eleven days ago. The thread had been seeing regular activity, but it has now been quiet for three days. Do you have an updated plan for fixing this open item? Note for Robert: pgfdw_get_result copycats PQexec by discarding all PQresult received except the last one. I think that's fine for the purposes of postgres_fdw, but perhaps you have a different opinion on the matter. That seemed reasonable to me, but sorry, on second thought, I'm not sure that's still a good idea. One reason is (1) I think it's better for the in-postgres_fdw.c functions using pgfdw_get_result to verify that there are no more results, in itself. I think that would improve the robustness of those functions. Another reason is I don't think pgfdw_report_error, which is used in pgfdw_get_result, works well for cases where the query contains multiple SQL commands. So, +1 for the idea of simply encapsulating the while (PQisBusy(...)) loop into a new function pgfdw_get_result(). Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Tue, Apr 19, 2016 at 12:16 PM, Noah Misch wrote: > On Sat, Apr 16, 2016 at 08:59:40AM +0900, Michael Paquier wrote: >> On Fri, Apr 15, 2016 at 9:46 PM, Michael Paquier >> wrote: >> > On Fri, Apr 15, 2016 at 8:25 PM, Etsuro Fujita >> > wrote: >> >> How about doing something similar for PQprepare/PQexecPrepared in >> >> postgresExecForeignInsert, postgresExecForeignUpdate, and >> >> postgresExecForeignDelete? >> > >> > Yes, I hesitated to touch those, but they are good candidates for this >> > new interface, and actually it has proved not to be complicated to >> > plug in the new routines in those code paths. >> > >> >> Also, how about doing that for PQexec in connection.c? >> > >> > Here I disagree, this is not adapted. All the PQexec calls are part of >> > callbacks that are triggered on failures, and we rely on such a >> > callback when issuing PQcancel. do_sql_command runs commands that take >> > a short amount of time, so I think as well that it is fine as-is with >> > PQexec. >> >> Here is a new version. I just recalled that I forgot a PQclear() call >> to clean up results. > > Robert, the deadline to fix this open item expired eleven days ago. The > thread had been seeing regular activity, but it has now been quiet for three > days. Do you have an updated plan for fixing this open item? Note for Robert: pgfdw_get_result copycats PQexec by discarding all PQresult received except the last one. I think that's fine for the purposes of postgres_fdw, but perhaps you have a different opinion on the matter. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Sat, Apr 16, 2016 at 08:59:40AM +0900, Michael Paquier wrote: > On Fri, Apr 15, 2016 at 9:46 PM, Michael Paquier > wrote: > > On Fri, Apr 15, 2016 at 8:25 PM, Etsuro Fujita > > wrote: > >> How about doing something similar for PQprepare/PQexecPrepared in > >> postgresExecForeignInsert, postgresExecForeignUpdate, and > >> postgresExecForeignDelete? > > > > Yes, I hesitated to touch those, but they are good candidates for this > > new interface, and actually it has proved not to be complicated to > > plug in the new routines in those code paths. > > > >> Also, how about doing that for PQexec in connection.c? > > > > Here I disagree, this is not adapted. All the PQexec calls are part of > > callbacks that are triggered on failures, and we rely on such a > > callback when issuing PQcancel. do_sql_command runs commands that take > > a short amount of time, so I think as well that it is fine as-is with > > PQexec. > > Here is a new version. I just recalled that I forgot a PQclear() call > to clean up results. Robert, the deadline to fix this open item expired eleven days ago. The thread had been seeing regular activity, but it has now been quiet for three days. Do you have an updated plan for fixing this open item? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Apr 15, 2016 at 9:46 PM, Michael Paquier wrote: > On Fri, Apr 15, 2016 at 8:25 PM, Etsuro Fujita > wrote: >> How about doing something similar for PQprepare/PQexecPrepared in >> postgresExecForeignInsert, postgresExecForeignUpdate, and >> postgresExecForeignDelete? > > Yes, I hesitated to touch those, but they are good candidates for this > new interface, and actually it has proved not to be complicated to > plug in the new routines in those code paths. > >> Also, how about doing that for PQexec in connection.c? > > Here I disagree, this is not adapted. All the PQexec calls are part of > callbacks that are triggered on failures, and we rely on such a > callback when issuing PQcancel. do_sql_command runs commands that take > a short amount of time, so I think as well that it is fine as-is with > PQexec. Here is a new version. I just recalled that I forgot a PQclear() call to clean up results. -- Michael diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 189f290..33b7363 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -17,6 +17,7 @@ #include "access/xact.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "storage/latch.h" #include "utils/hsearch.h" #include "utils/memutils.h" @@ -448,6 +449,68 @@ GetPrepStmtNumber(PGconn *conn) } /* + * Execute query in non-blocking mode and fetch back result + */ +PGresult * +pgfdw_exec_query(PGconn *conn, const char *query) +{ + if (!PQsendQuery(conn, query)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + + return pgfdw_get_result(conn, query); +} + +/* + * Scan for results of query run in non-waiting mode + * + * This should be run after executing a query on the remote server and + * offers quick responsiveness by checking for any interruptions. The + * last result is returned back to caller, and previous results are + * consumed. Caller is responsible for the error handling on the fetched + * result. + */ +PGresult * +pgfdw_get_result(PGconn *conn, const char *query) +{ + PGresult *last_res = NULL; + + for (;;) + { + PGresult *res; + + while (PQisBusy(conn)) + { + int wc; + + /* Sleep until there's something to do */ + wc = WaitLatchOrSocket(MyLatch, + WL_LATCH_SET | WL_SOCKET_READABLE, + PQsocket(conn), + -1L); + ResetLatch(MyLatch); + + CHECK_FOR_INTERRUPTS(); + + /* Data available in socket */ + if (wc & WL_SOCKET_READABLE) + { +if (!PQconsumeInput(conn)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + } + } + + res = PQgetResult(conn); + if (res == NULL) + break; + + PQclear(last_res); + last_res = res; + } + + return last_res; +} + +/* * Report an error we got from the remote server. * * elevel: error level to use (typically ERROR, but might be less) @@ -598,6 +661,32 @@ pgfdw_xact_callback(XactEvent event, void *arg) case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + + /* + * If a command has been submitted to the remote server + * using an asynchronous execution function, the command + * might not have yet completed. Check to see if a command + * is still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + if (!PQcancel(cancel, errbuf, sizeof(errbuf))) +ereport(WARNING, + (errcode(ERRCODE_CONNECTION_FAILURE), + errmsg("could not send cancel request: %s", +errbuf))); + PQfreeCancel(cancel); + } + break; + } + /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn, "ABORT TRANSACTION"); /* Note: can't throw ERROR, it would be infinite loop */ diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index ee0220a..4566cf4 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -1421,7 +1421,7 @@ postgresReScanForeignScan(ForeignScanState *node) * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - res = PQexec(fsstate->conn, sql); + res = pgfdw_exec_query(fsstate->conn, sql); if (PQresultStatus(res) != PGRES_COMMAND_OK) pgfdw_report_error(ERROR, res, fsstate->conn, true, sql); PQclear(res); @@ -1754,13 +1754,16 @@ postgresExecForeignInsert(EState *estate, * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - res = PQexecPrepared(fmstate->conn, - fmstate->p_name, - fmstate->p_nums, - p_values, - NULL, - NULL, - 0); + if (!PQsendQueryPrepared(fmstate->con
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Apr 15, 2016 at 8:25 PM, Etsuro Fujita wrote: > How about doing something similar for PQprepare/PQexecPrepared in > postgresExecForeignInsert, postgresExecForeignUpdate, and > postgresExecForeignDelete? Yes, I hesitated to touch those, but they are good candidates for this new interface, and actually it has proved not to be complicated to plug in the new routines in those code paths. > Also, how about doing that for PQexec in connection.c? Here I disagree, this is not adapted. All the PQexec calls are part of callbacks that are triggered on failures, and we rely on such a callback when issuing PQcancel. do_sql_command runs commands that take a short amount of time, so I think as well that it is fine as-is with PQexec. -- Michael diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 189f290..64a00b4 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -17,6 +17,7 @@ #include "access/xact.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "storage/latch.h" #include "utils/hsearch.h" #include "utils/memutils.h" @@ -448,6 +449,67 @@ GetPrepStmtNumber(PGconn *conn) } /* + * Execute query in non-blocking mode and fetch back result + */ +PGresult * +pgfdw_exec_query(PGconn *conn, const char *query) +{ + if (!PQsendQuery(conn, query)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + + return pgfdw_get_result(conn, query); +} + +/* + * Scan for results of query run in non-waiting mode + * + * This should be run after executing a query on the remote server and + * offers quick responsiveness by checking for any interruptions. The + * last result is returned back to caller, and previous results are + * consumed. Caller is responsible for the error handling on the fetched + * result. + */ +PGresult * +pgfdw_get_result(PGconn *conn, const char *query) +{ + PGresult *last_res; + + for (;;) + { + PGresult *res; + + while (PQisBusy(conn)) + { + int wc; + + /* Sleep until there's something to do */ + wc = WaitLatchOrSocket(MyLatch, + WL_LATCH_SET | WL_SOCKET_READABLE, + PQsocket(conn), + -1L); + ResetLatch(MyLatch); + + CHECK_FOR_INTERRUPTS(); + + /* Data available in socket */ + if (wc & WL_SOCKET_READABLE) + { +if (!PQconsumeInput(conn)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + } + } + + res = PQgetResult(conn); + if (res == NULL) + break; + + last_res = res; + } + + return last_res; +} + +/* * Report an error we got from the remote server. * * elevel: error level to use (typically ERROR, but might be less) @@ -598,6 +660,32 @@ pgfdw_xact_callback(XactEvent event, void *arg) case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + + /* + * If a command has been submitted to the remote server + * using an asynchronous execution function, the command + * might not have yet completed. Check to see if a command + * is still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + if (!PQcancel(cancel, errbuf, sizeof(errbuf))) +ereport(WARNING, + (errcode(ERRCODE_CONNECTION_FAILURE), + errmsg("could not send cancel request: %s", +errbuf))); + PQfreeCancel(cancel); + } + break; + } + /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn, "ABORT TRANSACTION"); /* Note: can't throw ERROR, it would be infinite loop */ diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index ee0220a..4566cf4 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -1421,7 +1421,7 @@ postgresReScanForeignScan(ForeignScanState *node) * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - res = PQexec(fsstate->conn, sql); + res = pgfdw_exec_query(fsstate->conn, sql); if (PQresultStatus(res) != PGRES_COMMAND_OK) pgfdw_report_error(ERROR, res, fsstate->conn, true, sql); PQclear(res); @@ -1754,13 +1754,16 @@ postgresExecForeignInsert(EState *estate, * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - res = PQexecPrepared(fmstate->conn, - fmstate->p_name, - fmstate->p_nums, - p_values, - NULL, - NULL, - 0); + if (!PQsendQueryPrepared(fmstate->conn, + fmstate->p_name, + fmstate->p_nums, + p_values, + NULL, + NULL, + 0)) + pgfdw_report_error(ERROR, NULL, fmstate->conn, true, fmstate->query); + + res = pgfdw
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/15 14:31, Michael Paquier wrote: On Thu, Apr 14, 2016 at 10:44 AM, Etsuro Fujita wrote: On 2016/04/13 21:50, Michael Paquier wrote: On Wed, Apr 13, 2016 at 9:46 PM, Robert Haas wrote: On Tue, Apr 12, 2016 at 10:24 PM, Etsuro Fujita wrote: How about we encapsulate the while (PQisBusy(...)) loop into a new function pgfdw_get_result(), which can be called after first calling PQsendQueryParams()? So then this code will say dmstate->result = pgfdw_get_result(dmstate->conn). And we can do something similar for the other call to PQexecParams() in create_cursor(). Then let's also add something like pgfdw_exec_query() which calls PQsendQuery() and then pgfdw_get_result, and use that to replace all of the existing calls to PQexec(). Then all the SQL postgres_fdw executes would be interruptible, not just the new stuff. I would be happy if you work on that. OK, so I have finished with the attached. Thank you for working on that! One thing that I noticed in the previous patch version is that it completely ignored cases where multiple PGresult could be returned by server, say when multiple queries are sent in the same string: PQexec gets always the last one, so I think that we had better do the same here. Seems reasonable. so I think that we had better do the same here. I have switched all the PQexec calls to a custom routine that combines PQsendQuery/PQgetResult, and PQexecParams is switched to PQsendQueryParams/PQgetResult. This structure allows all queries run though postgres_fdw.c to be interruptible. How about doing something similar for PQprepare/PQexecPrepared in postgresExecForeignInsert, postgresExecForeignUpdate, and postgresExecForeignDelete? Also, how about doing that for PQexec in connection.c? Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Apr 14, 2016 at 10:44 AM, Etsuro Fujita wrote: > On 2016/04/13 21:50, Michael Paquier wrote: >> On Wed, Apr 13, 2016 at 9:46 PM, Robert Haas >> wrote: >>> On Tue, Apr 12, 2016 at 10:24 PM, Etsuro Fujita >>> wrote: > > How about we encapsulate the while (PQisBusy(...)) loop into a new > function pgfdw_get_result(), which can be called after first calling > PQsendQueryParams()? So then this code will say dmstate->result = > pgfdw_get_result(dmstate->conn). And we can do something similar for > the other call to PQexecParams() in create_cursor(). Then let's also > add something like pgfdw_exec_query() which calls PQsendQuery() and > then pgfdw_get_result, and use that to replace all of the existing > calls to PQexec(). > > Then all the SQL postgres_fdw executes would be interruptible, not > just the new stuff. > > I would be happy if you work on that. OK, so I have finished with the attached. One thing that I noticed in the previous patch version is that it completely ignored cases where multiple PGresult could be returned by server, say when multiple queries are sent in the same string: PQexec gets always the last one, so I think that we had better do the same here. I have switched all the PQexec calls to a custom routine that combines PQsendQuery/PQgetResult, and PQexecParams is switched to PQsendQueryParams/PQgetResult. This structure allows all queries run though postgres_fdw.c to be interruptible. -- Michael diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 189f290..64a00b4 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -17,6 +17,7 @@ #include "access/xact.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "storage/latch.h" #include "utils/hsearch.h" #include "utils/memutils.h" @@ -448,6 +449,67 @@ GetPrepStmtNumber(PGconn *conn) } /* + * Execute query in non-blocking mode and fetch back result + */ +PGresult * +pgfdw_exec_query(PGconn *conn, const char *query) +{ + if (!PQsendQuery(conn, query)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + + return pgfdw_get_result(conn, query); +} + +/* + * Scan for results of query run in non-waiting mode + * + * This should be run after executing a query on the remote server and + * offers quick responsiveness by checking for any interruptions. The + * last result is returned back to caller, and previous results are + * consumed. Caller is responsible for the error handling on the fetched + * result. + */ +PGresult * +pgfdw_get_result(PGconn *conn, const char *query) +{ + PGresult *last_res; + + for (;;) + { + PGresult *res; + + while (PQisBusy(conn)) + { + int wc; + + /* Sleep until there's something to do */ + wc = WaitLatchOrSocket(MyLatch, + WL_LATCH_SET | WL_SOCKET_READABLE, + PQsocket(conn), + -1L); + ResetLatch(MyLatch); + + CHECK_FOR_INTERRUPTS(); + + /* Data available in socket */ + if (wc & WL_SOCKET_READABLE) + { +if (!PQconsumeInput(conn)) + pgfdw_report_error(ERROR, NULL, conn, false, query); + } + } + + res = PQgetResult(conn); + if (res == NULL) + break; + + last_res = res; + } + + return last_res; +} + +/* * Report an error we got from the remote server. * * elevel: error level to use (typically ERROR, but might be less) @@ -598,6 +660,32 @@ pgfdw_xact_callback(XactEvent event, void *arg) case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + + /* + * If a command has been submitted to the remote server + * using an asynchronous execution function, the command + * might not have yet completed. Check to see if a command + * is still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + if (!PQcancel(cancel, errbuf, sizeof(errbuf))) +ereport(WARNING, + (errcode(ERRCODE_CONNECTION_FAILURE), + errmsg("could not send cancel request: %s", +errbuf))); + PQfreeCancel(cancel); + } + break; + } + /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn, "ABORT TRANSACTION"); /* Note: can't throw ERROR, it would be infinite loop */ diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index ee0220a..31f5bb2 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -1421,7 +1421,7 @@ postgresReScanForeignScan(ForeignScanState *node) * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - res = PQexec(fsstate->conn, s
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/13 21:50, Michael Paquier wrote: On Wed, Apr 13, 2016 at 9:46 PM, Robert Haas wrote: On Tue, Apr 12, 2016 at 10:24 PM, Etsuro Fujita wrote: How about we encapsulate the while (PQisBusy(...)) loop into a new function pgfdw_get_result(), which can be called after first calling PQsendQueryParams()? So then this code will say dmstate->result = pgfdw_get_result(dmstate->conn). And we can do something similar for the other call to PQexecParams() in create_cursor(). Then let's also add something like pgfdw_exec_query() which calls PQsendQuery() and then pgfdw_get_result, and use that to replace all of the existing calls to PQexec(). Then all the SQL postgres_fdw executes would be interruptible, not just the new stuff. Seems like a good idea. Will do. When will you do this? We are on a bit of a time budget here. Fujita-san, I can code that tomorrow or in two days if need be. That should not be an issue from here. I would be happy if you work on that. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Apr 13, 2016 at 9:46 PM, Robert Haas wrote: > On Tue, Apr 12, 2016 at 10:24 PM, Etsuro Fujita > wrote: >>> How about we encapsulate the while (PQisBusy(...)) loop into a new >>> function pgfdw_get_result(), which can be called after first calling >>> PQsendQueryParams()? So then this code will say dmstate->result = >>> pgfdw_get_result(dmstate->conn). And we can do something similar for >>> the other call to PQexecParams() in create_cursor(). Then let's also >>> add something like pgfdw_exec_query() which calls PQsendQuery() and >>> then pgfdw_get_result, and use that to replace all of the existing >>> calls to PQexec(). >>> >>> Then all the SQL postgres_fdw executes would be interruptible, not >>> just the new stuff. >> >> Seems like a good idea. Will do. > > When will you do this? We are on a bit of a time budget here. Fujita-san, I can code that tomorrow or in two days if need be. That should not be an issue from here. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Tue, Apr 12, 2016 at 10:24 PM, Etsuro Fujita wrote: >> How about we encapsulate the while (PQisBusy(...)) loop into a new >> function pgfdw_get_result(), which can be called after first calling >> PQsendQueryParams()? So then this code will say dmstate->result = >> pgfdw_get_result(dmstate->conn). And we can do something similar for >> the other call to PQexecParams() in create_cursor(). Then let's also >> add something like pgfdw_exec_query() which calls PQsendQuery() and >> then pgfdw_get_result, and use that to replace all of the existing >> calls to PQexec(). >> >> Then all the SQL postgres_fdw executes would be interruptible, not >> just the new stuff. > > Seems like a good idea. Will do. When will you do this? We are on a bit of a time budget here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Apr 13, 2016 at 11:24 AM, Etsuro Fujita wrote: > On 2016/04/13 3:14, Robert Haas wrote: >> >> I'm wondering why we are fixing this specific case and not any of the >> other calls to PQexec() or PQexecParams() in postgres_fdw.c. >> >> I mean, many of those instances are cases where the query isn't likely >> to run for very long, but certainly "FETCH %d FROM c%u" is in theory >> just as bad as the new code introduced in 9.6. In practice, it >> probably isn't, because we're probably only fetching 50 rows at a time >> rather than potentially a lot more, but if we're fixing this code up >> to be interrupt-safe, maybe we should fix it all at the same time. >> Even for the short-running queries like CLOSE and DEALLOCATE, it seems >> possible that there could be a network-related hang which you might >> want to interrupt. > > > Actually, I was wondering, too, but I didn't propose that because, as far as > I know, there are no reports from the field. But I agree with you. For something that is HEAD-only that's a great idea to put everything into the same flag like that. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/13 3:14, Robert Haas wrote: I'm wondering why we are fixing this specific case and not any of the other calls to PQexec() or PQexecParams() in postgres_fdw.c. I mean, many of those instances are cases where the query isn't likely to run for very long, but certainly "FETCH %d FROM c%u" is in theory just as bad as the new code introduced in 9.6. In practice, it probably isn't, because we're probably only fetching 50 rows at a time rather than potentially a lot more, but if we're fixing this code up to be interrupt-safe, maybe we should fix it all at the same time. Even for the short-running queries like CLOSE and DEALLOCATE, it seems possible that there could be a network-related hang which you might want to interrupt. Actually, I was wondering, too, but I didn't propose that because, as far as I know, there are no reports from the field. But I agree with you. How about we encapsulate the while (PQisBusy(...)) loop into a new function pgfdw_get_result(), which can be called after first calling PQsendQueryParams()? So then this code will say dmstate->result = pgfdw_get_result(dmstate->conn). And we can do something similar for the other call to PQexecParams() in create_cursor(). Then let's also add something like pgfdw_exec_query() which calls PQsendQuery() and then pgfdw_get_result, and use that to replace all of the existing calls to PQexec(). Then all the SQL postgres_fdw executes would be interruptible, not just the new stuff. Seems like a good idea. Will do. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Mon, Apr 11, 2016 at 9:45 PM, Michael Paquier wrote: > On Mon, Apr 11, 2016 at 5:16 PM, Etsuro Fujita > wrote: >> On 2016/04/11 12:30, Michael Paquier wrote: >>> >>> + if ((cancel = PQgetCancel(entry->conn))) >>> + { >>> + PQcancel(cancel, errbuf, sizeof(errbuf)); >>> + PQfreeCancel(cancel); >>> + } >>> Wouldn't it be better to issue a WARNING here if PQcancel does not >>> succeed? >> >> Seems like a good idea. Attached is an updated version of the patch. > > Thanks for the new version. The patch looks good to me. I'm wondering why we are fixing this specific case and not any of the other calls to PQexec() or PQexecParams() in postgres_fdw.c. I mean, many of those instances are cases where the query isn't likely to run for very long, but certainly "FETCH %d FROM c%u" is in theory just as bad as the new code introduced in 9.6. In practice, it probably isn't, because we're probably only fetching 50 rows at a time rather than potentially a lot more, but if we're fixing this code up to be interrupt-safe, maybe we should fix it all at the same time. Even for the short-running queries like CLOSE and DEALLOCATE, it seems possible that there could be a network-related hang which you might want to interrupt. How about we encapsulate the while (PQisBusy(...)) loop into a new function pgfdw_get_result(), which can be called after first calling PQsendQueryParams()? So then this code will say dmstate->result = pgfdw_get_result(dmstate->conn). And we can do something similar for the other call to PQexecParams() in create_cursor(). Then let's also add something like pgfdw_exec_query() which calls PQsendQuery() and then pgfdw_get_result, and use that to replace all of the existing calls to PQexec(). Then all the SQL postgres_fdw executes would be interruptible, not just the new stuff. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Mon, Apr 11, 2016 at 5:16 PM, Etsuro Fujita wrote: > On 2016/04/11 12:30, Michael Paquier wrote: >> >> + if ((cancel = PQgetCancel(entry->conn))) >> + { >> + PQcancel(cancel, errbuf, sizeof(errbuf)); >> + PQfreeCancel(cancel); >> + } >> Wouldn't it be better to issue a WARNING here if PQcancel does not >> succeed? > > Seems like a good idea. Attached is an updated version of the patch. Thanks for the new version. The patch looks good to me. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/11 12:30, Michael Paquier wrote: + if ((cancel = PQgetCancel(entry->conn))) + { + PQcancel(cancel, errbuf, sizeof(errbuf)); + PQfreeCancel(cancel); + } Wouldn't it be better to issue a WARNING here if PQcancel does not succeed? Seems like a good idea. Attached is an updated version of the patch. Thanks for the review! Best regards, Etsuro Fujita diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 189f290..da1758b 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -598,6 +598,34 @@ pgfdw_xact_callback(XactEvent event, void *arg) case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + /* + * If we had submitted a command to the remote server using + * an asynchronous execution function, the command might + * have not yet completed. Check to see if a command is + * still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + /* + * Note: can't throw ERROR, it would be infinite + * loop + */ + if (!PQcancel(cancel, errbuf, sizeof(errbuf))) +ereport(WARNING, + (errcode(ERRCODE_CONNECTION_FAILURE), + errmsg("could not send cancel request: %s", +errbuf))); + PQfreeCancel(cancel); + } + break; + } /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn, "ABORT TRANSACTION"); /* Note: can't throw ERROR, it would be infinite loop */ diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index ee0220a..0378f1d 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -32,6 +32,7 @@ #include "optimizer/var.h" #include "optimizer/tlist.h" #include "parser/parsetree.h" +#include "storage/latch.h" #include "utils/builtins.h" #include "utils/guc.h" #include "utils/lsyscache.h" @@ -3131,6 +3132,7 @@ execute_dml_stmt(ForeignScanState *node) ExprContext *econtext = node->ss.ps.ps_ExprContext; int numParams = dmstate->numParams; const char **values = dmstate->param_values; + int wc; /* * Construct array of query parameter values in text format. @@ -3147,12 +3149,42 @@ execute_dml_stmt(ForeignScanState *node) * parameter (see deparse.c), the "inference" is trivial and will produce * the desired result. This allows us to avoid assuming that the remote * server has the same OIDs we do for the parameters' types. + */ + if (!PQsendQueryParams(dmstate->conn, dmstate->query, numParams, + NULL, values, NULL, NULL, 0)) + pgfdw_report_error(ERROR, NULL, dmstate->conn, false, dmstate->query); + + /* + * Receive data until PQgetResult is ready to get the result without + * blocking. + */ + while (PQisBusy(dmstate->conn)) + { + /* Sleep until there's something to do */ + wc = WaitLatchOrSocket(MyLatch, + WL_LATCH_SET | WL_SOCKET_READABLE, + PQsocket(dmstate->conn), + -1L); + ResetLatch(MyLatch); + + CHECK_FOR_INTERRUPTS(); + + /* Data available in socket */ + if (wc & WL_SOCKET_READABLE) + { + if (!PQconsumeInput(dmstate->conn)) +pgfdw_report_error(ERROR, NULL, dmstate->conn, false, + dmstate->query); + } + } + + /* + * Get the result * * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - dmstate->result = PQexecParams(dmstate->conn, dmstate->query, - numParams, NULL, values, NULL, NULL, 0); + dmstate->result = PQgetResult(dmstate->conn); if (PQresultStatus(dmstate->result) != (dmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK)) pgfdw_report_error(ERROR, dmstate->result, dmstate->conn, true, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Mon, Apr 11, 2016 at 11:30 AM, Etsuro Fujita wrote: > I agree with Rushabh. Thanks for updating the patch! Yes, not using a PG_TRY/CATCH block is better. We are not doing anything that need to clean up PGresult in case of an unplanned failure. > Another thing I'd like to propose to revise the patch is to call > pgfdw_report_error in the newly added hunk, with the clear argument set to > *false*. The PGresult argument is NULL there, so no need to release the > PGresult. Sure, this saves a couple of cycles. PQclear is anyway smart enough to handle NULL results correctly, but this way is better. > Attached is an updated patch. + if (wc & WL_SOCKET_READABLE) + { + if (!PQconsumeInput(dmstate->conn)) + pgfdw_report_error(ERROR, NULL, dmstate->conn, false, + dmstate->query); + } OK, so here we would fail with ERRCODE_CONNECTION_FAILURE in the case where the socket is readable but no data can be consumed. I guess it makes sense. + if ((cancel = PQgetCancel(entry->conn))) + { + PQcancel(cancel, errbuf, sizeof(errbuf)); + PQfreeCancel(cancel); + } Wouldn't it be better to issue a WARNING here if PQcancel does not succeed? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/08 22:21, Rushabh Lathia wrote: On Fri, Apr 8, 2016 at 6:28 PM, Robert Haas mailto:robertmh...@gmail.com>> wrote: The comment just before the second hunk in the patch says: * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. But the patch adds a whole bunch of new things there that seem like they can error out, like CHECK_FOR_INTERRUPTS(), for example. Isn't that a problem? Basically we fetching the PGresult, after the newly added hunk, so there should not be any problem. But yes comment is definitely at wrong place. PFA patch with correction. I agree with Rushabh. Thanks for updating the patch! Another thing I'd like to propose to revise the patch is to call pgfdw_report_error in the newly added hunk, with the clear argument set to *false*. The PGresult argument is NULL there, so no need to release the PGresult. Attached is an updated patch. Best regards, Etsuro Fujita diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 189f290..8820597 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -598,6 +598,26 @@ pgfdw_xact_callback(XactEvent event, void *arg) case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + /* + * If we had submitted a command to the remote server using + * an asynchronous execution function, the command might + * have not yet completed. Check to see if a command is + * still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + PQcancel(cancel, errbuf, sizeof(errbuf)); + PQfreeCancel(cancel); + } + break; + } /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn, "ABORT TRANSACTION"); /* Note: can't throw ERROR, it would be infinite loop */ diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index ee0220a..0378f1d 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -32,6 +32,7 @@ #include "optimizer/var.h" #include "optimizer/tlist.h" #include "parser/parsetree.h" +#include "storage/latch.h" #include "utils/builtins.h" #include "utils/guc.h" #include "utils/lsyscache.h" @@ -3131,6 +3132,7 @@ execute_dml_stmt(ForeignScanState *node) ExprContext *econtext = node->ss.ps.ps_ExprContext; int numParams = dmstate->numParams; const char **values = dmstate->param_values; + int wc; /* * Construct array of query parameter values in text format. @@ -3147,12 +3149,42 @@ execute_dml_stmt(ForeignScanState *node) * parameter (see deparse.c), the "inference" is trivial and will produce * the desired result. This allows us to avoid assuming that the remote * server has the same OIDs we do for the parameters' types. + */ + if (!PQsendQueryParams(dmstate->conn, dmstate->query, numParams, + NULL, values, NULL, NULL, 0)) + pgfdw_report_error(ERROR, NULL, dmstate->conn, false, dmstate->query); + + /* + * Receive data until PQgetResult is ready to get the result without + * blocking. + */ + while (PQisBusy(dmstate->conn)) + { + /* Sleep until there's something to do */ + wc = WaitLatchOrSocket(MyLatch, + WL_LATCH_SET | WL_SOCKET_READABLE, + PQsocket(dmstate->conn), + -1L); + ResetLatch(MyLatch); + + CHECK_FOR_INTERRUPTS(); + + /* Data available in socket */ + if (wc & WL_SOCKET_READABLE) + { + if (!PQconsumeInput(dmstate->conn)) +pgfdw_report_error(ERROR, NULL, dmstate->conn, false, + dmstate->query); + } + } + + /* + * Get the result * * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - dmstate->result = PQexecParams(dmstate->conn, dmstate->query, - numParams, NULL, values, NULL, NULL, 0); + dmstate->result = PQgetResult(dmstate->conn); if (PQresultStatus(dmstate->result) != (dmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK)) pgfdw_report_error(ERROR, dmstate->result, dmstate->conn, true, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Apr 8, 2016 at 6:28 PM, Robert Haas wrote: > On Fri, Apr 8, 2016 at 3:05 AM, Etsuro Fujita > wrote: > >> What do you think? This open item's seven-day deadline has passed. It > >> would > >> help keep things moving to know whether you consider your latest patch > >> optimal > >> or whether you wish to change it the way Michael described. > > > > I wish to change it that way because it not only avoids the duplicate but > > fixes a bug in the previous patch that I overlooked that there is a race > > condition if a signal arrives just before entering the CheckSocket. > > > > Attached is an updated version of the patch. > > The comment just before the second hunk in the patch says: > >* We don't use a PG_TRY block here, so be careful not to throw error >* without releasing the PGresult. > > But the patch adds a whole bunch of new things there that seem like > they can error out, like CHECK_FOR_INTERRUPTS(), for example. Isn't > that a problem? > Basically we fetching the PGresult, after the newly added hunk, so there should not be any problem. But yes comment is definitely at wrong place. PFA patch with correction. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > -- Rushabh Lathia diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c index 189f290..8820597 100644 --- a/contrib/postgres_fdw/connection.c +++ b/contrib/postgres_fdw/connection.c @@ -598,6 +598,26 @@ pgfdw_xact_callback(XactEvent event, void *arg) case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + /* + * If we had submitted a command to the remote server using + * an asynchronous execution function, the command might + * have not yet completed. Check to see if a command is + * still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + PQcancel(cancel, errbuf, sizeof(errbuf)); + PQfreeCancel(cancel); + } + break; + } /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn, "ABORT TRANSACTION"); /* Note: can't throw ERROR, it would be infinite loop */ diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index ee0220a..2b6a61b 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -32,6 +32,7 @@ #include "optimizer/var.h" #include "optimizer/tlist.h" #include "parser/parsetree.h" +#include "storage/latch.h" #include "utils/builtins.h" #include "utils/guc.h" #include "utils/lsyscache.h" @@ -3131,6 +3132,7 @@ execute_dml_stmt(ForeignScanState *node) ExprContext *econtext = node->ss.ps.ps_ExprContext; int numParams = dmstate->numParams; const char **values = dmstate->param_values; + int wc; /* * Construct array of query parameter values in text format. @@ -3147,12 +3149,42 @@ execute_dml_stmt(ForeignScanState *node) * parameter (see deparse.c), the "inference" is trivial and will produce * the desired result. This allows us to avoid assuming that the remote * server has the same OIDs we do for the parameters' types. + */ + if (!PQsendQueryParams(dmstate->conn, dmstate->query, numParams, + NULL, values, NULL, NULL, 0)) + pgfdw_report_error(ERROR, NULL, dmstate->conn, true, dmstate->query); + + /* + * Receive data until PQgetResult is ready to get the result without + * blocking. + */ + while (PQisBusy(dmstate->conn)) + { + /* Sleep until there's something to do */ + wc = WaitLatchOrSocket(MyLatch, + WL_LATCH_SET | WL_SOCKET_READABLE, + PQsocket(dmstate->conn), + -1L); + ResetLatch(MyLatch); + + CHECK_FOR_INTERRUPTS(); + + /* Data available in socket */ + if (wc & WL_SOCKET_READABLE) + { + if (!PQconsumeInput(dmstate->conn)) +pgfdw_report_error(ERROR, NULL, dmstate->conn, true, + dmstate->query); + } + } + + /* + * Get the result * * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ - dmstate->result = PQexecParams(dmstate->conn, dmstate->query, - numParams, NULL, values, NULL, NULL, 0); + dmstate->result = PQgetResult(dmstate->conn); if (PQresultStatus(dmstate->result) != (dmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK)) pgfdw_report_error(ERROR, dmstate->result, dmstate->conn, true, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Apr 8, 2016 at 3:05 AM, Etsuro Fujita wrote: >> What do you think? This open item's seven-day deadline has passed. It >> would >> help keep things moving to know whether you consider your latest patch >> optimal >> or whether you wish to change it the way Michael described. > > I wish to change it that way because it not only avoids the duplicate but > fixes a bug in the previous patch that I overlooked that there is a race > condition if a signal arrives just before entering the CheckSocket. > > Attached is an updated version of the patch. The comment just before the second hunk in the patch says: * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. But the patch adds a whole bunch of new things there that seem like they can error out, like CHECK_FOR_INTERRUPTS(), for example. Isn't that a problem? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/08 13:42, Noah Misch wrote: On Tue, Apr 05, 2016 at 03:22:03PM +0900, Etsuro Fujita wrote: On 2016/04/04 20:35, Michael Paquier wrote: On Mon, Apr 4, 2016 at 7:49 PM, Etsuro Fujita wrote: Here is a patch to fix this issue. As proposed by Michael, I modified execute_dml_stmt so that it uses PQsendQueryParams, not PQexecParams. Any comments are welcome. + * This is based on pqSocketCheck. + */ + bool + CheckSocket(PGconn *conn) + { + intret; + + Assert(conn != NULL); Instead of copying again pqSocketQuery, which is as well copied in libpqwalreceiver.c, wouldn't it be better to use WaitLatchOrSocket with the socket returned by PQsocket? Will check. Thanks for the comment! What do you think? This open item's seven-day deadline has passed. It would help keep things moving to know whether you consider your latest patch optimal or whether you wish to change it the way Michael described. I wish to change it that way because it not only avoids the duplicate but fixes a bug in the previous patch that I overlooked that there is a race condition if a signal arrives just before entering the CheckSocket. Attached is an updated version of the patch. Sorry for the delay. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/connection.c --- b/contrib/postgres_fdw/connection.c *** *** 598,603 pgfdw_xact_callback(XactEvent event, void *arg) --- 598,623 case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + /* + * If we had submitted a command to the remote server using + * an asynchronous execution function, the command might + * have not yet completed. Check to see if a command is + * still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + PQcancel(cancel, errbuf, sizeof(errbuf)); + PQfreeCancel(cancel); + } + break; + } /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn, "ABORT TRANSACTION"); /* Note: can't throw ERROR, it would be infinite loop */ *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** *** 32,37 --- 32,38 #include "optimizer/var.h" #include "optimizer/tlist.h" #include "parser/parsetree.h" + #include "storage/latch.h" #include "utils/builtins.h" #include "utils/guc.h" #include "utils/lsyscache.h" *** *** 3131,3136 execute_dml_stmt(ForeignScanState *node) --- 3132,3138 ExprContext *econtext = node->ss.ps.ps_ExprContext; int numParams = dmstate->numParams; const char **values = dmstate->param_values; + int wc; /* * Construct array of query parameter values in text format. *** *** 3151,3158 execute_dml_stmt(ForeignScanState *node) * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ ! dmstate->result = PQexecParams(dmstate->conn, dmstate->query, ! numParams, NULL, values, NULL, NULL, 0); if (PQresultStatus(dmstate->result) != (dmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK)) pgfdw_report_error(ERROR, dmstate->result, dmstate->conn, true, --- 3153,3188 * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ ! if (!PQsendQueryParams(dmstate->conn, dmstate->query, numParams, ! NULL, values, NULL, NULL, 0)) ! pgfdw_report_error(ERROR, NULL, dmstate->conn, true, dmstate->query); ! ! /* ! * Receive data until PQgetResult is ready to get the result without ! * blocking. ! */ ! while (PQisBusy(dmstate->conn)) ! { ! /* Sleep until there's something to do */ ! wc = WaitLatchOrSocket(MyLatch, ! WL_LATCH_SET | WL_SOCKET_READABLE, ! PQsocket(dmstate->conn), ! -1L); ! ResetLatch(MyLatch); ! ! CHECK_FOR_INTERRUPTS(); ! ! /* Data available in socket */ ! if (wc & WL_SOCKET_READABLE) ! { ! if (!PQconsumeInput(dmstate->conn)) ! pgfdw_report_error(ERROR, NULL, dmstate->conn, true, ! dmstate->query); ! } ! } ! ! /* Get the result */ ! dmstate->result = PQgetResult(dmstate->conn); if (PQresultStatus(dmstate->result) != (dmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK)) pgfdw_report_error(ERROR, dmstate->result, dmstate->conn, true, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Tue, Apr 05, 2016 at 03:22:03PM +0900, Etsuro Fujita wrote: > On 2016/04/04 20:35, Michael Paquier wrote: > >On Mon, Apr 4, 2016 at 7:49 PM, Etsuro Fujita > > wrote: > >>Here is a patch to fix this issue. As proposed by Michael, I modified > >>execute_dml_stmt so that it uses PQsendQueryParams, not PQexecParams. Any > >>comments are welcome. > > >+ * This is based on pqSocketCheck. > >+ */ > >+ bool > >+ CheckSocket(PGconn *conn) > >+ { > >+ intret; > >+ > >+ Assert(conn != NULL); > >Instead of copying again pqSocketQuery, which is as well copied in > >libpqwalreceiver.c, wouldn't it be better to use WaitLatchOrSocket > >with the socket returned by PQsocket? > > Will check. Thanks for the comment! What do you think? This open item's seven-day deadline has passed. It would help keep things moving to know whether you consider your latest patch optimal or whether you wish to change it the way Michael described. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/04/04 20:35, Michael Paquier wrote: On Mon, Apr 4, 2016 at 7:49 PM, Etsuro Fujita wrote: Here is a patch to fix this issue. As proposed by Michael, I modified execute_dml_stmt so that it uses PQsendQueryParams, not PQexecParams. Any comments are welcome. + * This is based on pqSocketCheck. + */ + bool + CheckSocket(PGconn *conn) + { + intret; + + Assert(conn != NULL); Instead of copying again pqSocketQuery, which is as well copied in libpqwalreceiver.c, wouldn't it be better to use WaitLatchOrSocket with the socket returned by PQsocket? Will check. Thanks for the comment! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Mon, Apr 4, 2016 at 7:49 PM, Etsuro Fujita wrote: > On 2016/03/31 16:38, Etsuro Fujita wrote: >> >> On 2016/03/31 14:07, Noah Misch wrote: >>> >>> On Thu, Mar 24, 2016 at 01:02:57PM +0900, Etsuro Fujita wrote: On 2016/03/24 11:14, Michael Paquier wrote: > > On Wed, Mar 23, 2016 at 10:05 PM, Thom Brown wrote: >> >> I've noticed that you now can't cancel a query if there's DML pushdown >> to a foreign server. This previously worked while it was sending >> individual statements as it interrupted and rolled it back. >> >> Here's what the local server sees when trying to cancel: >> >> # DELETE FROM remote.contacts; >> ^CCancel request sent >> DELETE 500 >> >> This should probably be fixed. > > > Looking at what has been committed, execute_dml_stmt is using > PQexecParams, so we'd want to use an asynchronous call and loop on > PQgetResult with CHECK_FOR_INTERRUPTS() in it. > > Will fix. > > >>> [This is a generic notification.] > > >> Sorry for not having taken any action. I've been busy with another task >> lately, but I started working on this. I plan to post a patch early >> next week. > > > Here is a patch to fix this issue. As proposed by Michael, I modified > execute_dml_stmt so that it uses PQsendQueryParams, not PQexecParams. Any > comments are welcome. + * This is based on pqSocketCheck. + */ + bool + CheckSocket(PGconn *conn) + { + intret; + + Assert(conn != NULL); Instead of copying again pqSocketQuery, which is as well copied in libpqwalreceiver.c, wouldn't it be better to use WaitLatchOrSocket with the socket returned by PQsocket? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/03/31 16:38, Etsuro Fujita wrote: On 2016/03/31 14:07, Noah Misch wrote: On Thu, Mar 24, 2016 at 01:02:57PM +0900, Etsuro Fujita wrote: On 2016/03/24 11:14, Michael Paquier wrote: On Wed, Mar 23, 2016 at 10:05 PM, Thom Brown wrote: I've noticed that you now can't cancel a query if there's DML pushdown to a foreign server. This previously worked while it was sending individual statements as it interrupted and rolled it back. Here's what the local server sees when trying to cancel: # DELETE FROM remote.contacts; ^CCancel request sent DELETE 500 This should probably be fixed. Looking at what has been committed, execute_dml_stmt is using PQexecParams, so we'd want to use an asynchronous call and loop on PQgetResult with CHECK_FOR_INTERRUPTS() in it. Will fix. [This is a generic notification.] Sorry for not having taken any action. I've been busy with another task lately, but I started working on this. I plan to post a patch early next week. Here is a patch to fix this issue. As proposed by Michael, I modified execute_dml_stmt so that it uses PQsendQueryParams, not PQexecParams. Any comments are welcome. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/connection.c --- b/contrib/postgres_fdw/connection.c *** *** 12,17 --- 12,20 */ #include "postgres.h" + #include + #include + #include "postgres_fdw.h" #include "access/xact.h" *** *** 20,25 --- 23,38 #include "utils/hsearch.h" #include "utils/memutils.h" + #ifdef HAVE_POLL_H + #include + #endif + #ifdef HAVE_SYS_POLL_H + #include + #endif + #ifdef HAVE_SYS_SELECT_H + #include + #endif + /* * Connection cache hash table entry *** *** 417,422 ReleaseConnection(PGconn *conn) --- 430,484 } /* + * Wait until we can read data. + * + * Returns true if data has become available for reading, false if interrupted + * by signal. + * + * This is based on pqSocketCheck. + */ + bool + CheckSocket(PGconn *conn) + { + int ret; + + Assert(conn != NULL); + if (PQsocket(conn) < 0) + ereport(ERROR, + (errcode_for_socket_access(), + errmsg("invalid socket: %s", PQerrorMessage(conn; + + /* We use poll(2) if available, otherwise select(2) */ + { + #ifdef HAVE_POLL + struct pollfd input_fd; + + input_fd.fd = PQsocket(conn); + input_fd.events = POLLIN | POLLERR; + input_fd.revents = 0; + + ret = poll(&input_fd, 1, -1); + #else /* !HAVE_POLL */ + fd_set input_mask; + + FD_ZERO(&input_mask); + FD_SET(PQsocket(conn), &input_mask); + + ret = select(PQsocket(conn) + 1, &input_mask, NULL, NULL, NULL); + #endif /* HAVE_POLL */ + } + + Assert(ret != 0); + if (ret < 0 && errno == EINTR) + return false; + if (ret < 0) + ereport(ERROR, + (errcode_for_socket_access(), + errmsg("select() failed: %s", strerror(errno; + return true; + } + + /* * Assign a "unique" number for a cursor. * * These really only need to be unique per connection within a transaction. *** *** 598,603 pgfdw_xact_callback(XactEvent event, void *arg) --- 660,684 case XACT_EVENT_ABORT: /* Assume we might have lost track of prepared statements */ entry->have_error = true; + /* + * If we executed a query asynchronously, the query might + * have not yet completed. Check to see if a command is + * still being processed by the remote server, and if so, + * request cancellation of the command; if not, abort + * gracefully. + */ + if (PQtransactionStatus(entry->conn) == PQTRANS_ACTIVE) + { + PGcancel *cancel; + char errbuf[256]; + + if ((cancel = PQgetCancel(entry->conn))) + { + PQcancel(cancel, errbuf, sizeof(errbuf)); + PQfreeCancel(cancel); + } + break; + } /* If we're aborting, abort all remote transactions too */ res = PQexec(entry->conn, "ABORT TRANSACTION"); /* Note: can't throw ERROR, it would be infinite loop */ *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** *** 3151,3158 execute_dml_stmt(ForeignScanState *node) * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ ! dmstate->result = PQexecParams(dmstate->conn, dmstate->query, ! numParams, NULL, values, NULL, NULL, 0); if (PQresultStatus(dmstate->result) != (dmstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK)) pgfdw_report_error(ERROR, dmstate->result, dmstate->conn, true, --- 3151,3181 * We don't use a PG_TRY block here, so be careful not to throw error * without releasing the PGresult. */ ! if (!PQsendQueryParams(dmstate->conn, dmstate->query, numParams, ! NULL, values, NULL, NULL, 0)) ! pgfdw_report_error(ERROR, NULL, dmstate->conn, true, dmstate->query); ! ! /* ! * Rece
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/03/31 14:07, Noah Misch wrote: On Thu, Mar 24, 2016 at 01:02:57PM +0900, Etsuro Fujita wrote: On 2016/03/24 11:14, Michael Paquier wrote: On Wed, Mar 23, 2016 at 10:05 PM, Thom Brown wrote: I've noticed that you now can't cancel a query if there's DML pushdown to a foreign server. This previously worked while it was sending individual statements as it interrupted and rolled it back. Here's what the local server sees when trying to cancel: # DELETE FROM remote.contacts; ^CCancel request sent DELETE 500 This should probably be fixed. Looking at what has been committed, execute_dml_stmt is using PQexecParams, so we'd want to use an asynchronous call and loop on PQgetResult with CHECK_FOR_INTERRUPTS() in it. Will fix. [This is a generic notification.] The above-described topic is currently a PostgreSQL 9.6 open item. Robert, since you committed the patch believed to have created it, you own this open item. If that responsibility lies elsewhere, please let us know whose responsibility it is to fix this. Since new open items may be discovered at any time and I want to plan to have them all fixed well in advance of the ship date, I will appreciate your efforts toward speedy resolution. Please present, within 72 hours, a plan to fix the defect within seven days of this message. Thanks. Sorry for not having taken any action. I've been busy with another task lately, but I started working on this. I plan to post a patch early next week. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Mar 24, 2016 at 01:02:57PM +0900, Etsuro Fujita wrote: > On 2016/03/24 11:14, Michael Paquier wrote: > >On Wed, Mar 23, 2016 at 10:05 PM, Thom Brown wrote: > >>I've noticed that you now can't cancel a query if there's DML pushdown > >>to a foreign server. This previously worked while it was sending > >>individual statements as it interrupted and rolled it back. > >> > >>Here's what the local server sees when trying to cancel: > >> > >># DELETE FROM remote.contacts; > >>^CCancel request sent > >>DELETE 500 > >> > >>This should probably be fixed. > > >Looking at what has been committed, execute_dml_stmt is using > >PQexecParams, so we'd want to use an asynchronous call and loop on > >PQgetResult with CHECK_FOR_INTERRUPTS() in it. > > Will fix. > > Thanks for the report, Thom! Thanks for the advice, Michael! [This is a generic notification.] The above-described topic is currently a PostgreSQL 9.6 open item. Robert, since you committed the patch believed to have created it, you own this open item. If that responsibility lies elsewhere, please let us know whose responsibility it is to fix this. Since new open items may be discovered at any time and I want to plan to have them all fixed well in advance of the ship date, I will appreciate your efforts toward speedy resolution. Please present, within 72 hours, a plan to fix the defect within seven days of this message. Thanks. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Mar 25, 2016 at 3:10 PM, Michael Paquier wrote: > On Thu, Mar 24, 2016 at 1:02 PM, Etsuro Fujita > wrote: >> Thanks for the report, Thom! Thanks for the advice, Michael! > > I am adding that to the list of open items of 9.6 to not forget about it. My bad. Thom did it already :p -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Mar 24, 2016 at 1:02 PM, Etsuro Fujita wrote: > On 2016/03/24 11:14, Michael Paquier wrote: >> >> On Wed, Mar 23, 2016 at 10:05 PM, Thom Brown wrote: >>> >>> I've noticed that you now can't cancel a query if there's DML pushdown >>> to a foreign server. This previously worked while it was sending >>> individual statements as it interrupted and rolled it back. >>> >>> Here's what the local server sees when trying to cancel: >>> >>> # DELETE FROM remote.contacts; >>> ^CCancel request sent >>> DELETE 500 >>> >>> This should probably be fixed. > > >> Looking at what has been committed, execute_dml_stmt is using >> PQexecParams, so we'd want to use an asynchronous call and loop on >> PQgetResult with CHECK_FOR_INTERRUPTS() in it. > > > Will fix. > > Thanks for the report, Thom! Thanks for the advice, Michael! I am adding that to the list of open items of 9.6 to not forget about it. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/03/24 11:14, Michael Paquier wrote: On Wed, Mar 23, 2016 at 10:05 PM, Thom Brown wrote: I've noticed that you now can't cancel a query if there's DML pushdown to a foreign server. This previously worked while it was sending individual statements as it interrupted and rolled it back. Here's what the local server sees when trying to cancel: # DELETE FROM remote.contacts; ^CCancel request sent DELETE 500 This should probably be fixed. Looking at what has been committed, execute_dml_stmt is using PQexecParams, so we'd want to use an asynchronous call and loop on PQgetResult with CHECK_FOR_INTERRUPTS() in it. Will fix. Thanks for the report, Thom! Thanks for the advice, Michael! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Mar 23, 2016 at 10:05 PM, Thom Brown wrote: > I've noticed that you now can't cancel a query if there's DML pushdown > to a foreign server. This previously worked while it was sending > individual statements as it interrupted and rolled it back. > > Here's what the local server sees when trying to cancel: > > # DELETE FROM remote.contacts; > ^CCancel request sent > DELETE 500 > > This should probably be fixed. Looking at what has been committed, execute_dml_stmt is using PQexecParams, so we'd want to use an asynchronous call and loop on PQgetResult with CHECK_FOR_INTERRUPTS() in it. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 22 March 2016 at 02:30, Etsuro Fujita wrote: > On 2016/03/19 3:30, Robert Haas wrote: >> >> On Fri, Mar 18, 2016 at 5:15 AM, Etsuro Fujita >> wrote: >>> >>> Attached is the updated version of the patch. I've noticed that you now can't cancel a query if there's DML pushdown to a foreign server. This previously worked while it was sending individual statements as it interrupted and rolled it back. Here's what the local server sees when trying to cancel: # DELETE FROM remote.contacts; ^CCancel request sent DELETE 500 This should probably be fixed. Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/03/19 3:30, Robert Haas wrote: On Fri, Mar 18, 2016 at 5:15 AM, Etsuro Fujita wrote: Attached is the updated version of the patch. Committed. Thank you. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/03/10 2:56, Robert Haas wrote: I see that you went and changed all of the places that tested for != CMD_SELECT and made them test for == CMD_INSERT || == CMD_UPDATE || == CMD_DELETE instead. I think that's the wrong direction. I think that we should use the != CMD_SELECT version of the test everywhere. That's a single test instead of three, so marginally faster, and maybe marginally more future-proof. I think deparsePushedDownUpdateSql should be renamed to use the new "direct modify" naming, like deparseDirectUpdateSql, maybe. I would suggest not numbering the tests in postgresPlanDirectModify. That just becomes a nuisance to keep up to date as things change. Agreed. I updated the patch to address these comments. Attached is the updated version of the patch. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 1315,1320 deparseUpdateSql(StringInfo buf, PlannerInfo *root, --- 1315,1383 } /* + * deparse remote UPDATE statement + * + * The statement text is appended to buf, and we also create an integer List + * of the columns being retrieved by RETURNING (if any), which is returned + * to *retrieved_attrs. + */ + void + deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root, + Index rtindex, Relation rel, + List *targetlist, + List *targetAttrs, + List *remote_conds, + List **params_list, + List *returningList, + List **retrieved_attrs) + { + RelOptInfo *baserel = root->simple_rel_array[rtindex]; + deparse_expr_cxt context; + int nestlevel; + bool first; + ListCell *lc; + + /* Set up context struct for recursion */ + context.root = root; + context.foreignrel = baserel; + context.buf = buf; + context.params_list = params_list; + + appendStringInfoString(buf, "UPDATE "); + deparseRelation(buf, rel); + appendStringInfoString(buf, " SET "); + + /* Make sure any constants in the exprs are printed portably */ + nestlevel = set_transmission_modes(); + + first = true; + foreach(lc, targetAttrs) + { + int attnum = lfirst_int(lc); + TargetEntry *tle = get_tle_by_resno(targetlist, attnum); + + if (!first) + appendStringInfoString(buf, ", "); + first = false; + + deparseColumnRef(buf, rtindex, attnum, root, false); + appendStringInfoString(buf, " = "); + deparseExpr((Expr *) tle->expr, &context); + } + + reset_transmission_modes(nestlevel); + + if (remote_conds) + { + appendStringInfo(buf, " WHERE "); + appendConditions(remote_conds, &context); + } + + deparseReturningList(buf, root, rtindex, rel, false, + returningList, retrieved_attrs); + } + + /* * deparse remote DELETE statement * * The statement text is appended to buf, and we also create an integer List *** *** 1337,1342 deparseDeleteSql(StringInfo buf, PlannerInfo *root, --- 1400,1442 } /* + * deparse remote DELETE statement + * + * The statement text is appended to buf, and we also create an integer List + * of the columns being retrieved by RETURNING (if any), which is returned + * to *retrieved_attrs. + */ + void + deparseDirectDeleteSql(StringInfo buf, PlannerInfo *root, + Index rtindex, Relation rel, + List *remote_conds, + List **params_list, + List *returningList, + List **retrieved_attrs) + { + RelOptInfo *baserel = root->simple_rel_array[rtindex]; + deparse_expr_cxt context; + + /* Set up context struct for recursion */ + context.root = root; + context.foreignrel = baserel; + context.buf = buf; + context.params_list = params_list; + + appendStringInfoString(buf, "DELETE FROM "); + deparseRelation(buf, rel); + + if (remote_conds) + { + appendStringInfo(buf, " WHERE "); + appendConditions(remote_conds, &context); + } + + deparseReturningList(buf, root, rtindex, rel, false, + returningList, retrieved_attrs); + } + + /* * Add a RETURNING clause, if needed, to an INSERT/UPDATE/DELETE. */ static void *** a/contrib/postgres_fdw/expected/postgres_fdw.out --- b/contrib/postgres_fdw/expected/postgres_fdw.out *** *** 2259,2265 INSERT INTO ft2 (c1,c2,c3) --- 2259,2284 (3 rows) INSERT INTO ft2 (c1,c2,c3) VALUES (1104,204,'ddd'), (1105,205,'eee'); + EXPLAIN (verbose, costs off) + UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3; -- can be pushed down + QUERY PLAN + -- + Update on public.ft2 +-> Foreign Update on public.ft2 + Remote SQL: UPDATE "S 1"."T 1" SET c2 = (c2 + 300), c3 = (c3 || '_update3'::text) WHERE ((("C 1" % 10) = 3)) + (3 rows) + UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3;
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Mar 18, 2016 at 5:15 AM, Etsuro Fujita wrote: > On 2016/03/10 2:56, Robert Haas wrote: >> I see that you went and changed all of the places that tested for != >> CMD_SELECT and made them test for == CMD_INSERT || == CMD_UPDATE || == >> CMD_DELETE instead. I think that's the wrong direction. I think that >> we should use the != CMD_SELECT version of the test everywhere. >> That's a single test instead of three, so marginally faster, and maybe >> marginally more future-proof. >> >> I think deparsePushedDownUpdateSql should be renamed to use the new >> "direct modify" naming, like deparseDirectUpdateSql, maybe. >> >> I would suggest not numbering the tests in postgresPlanDirectModify. >> That just becomes a nuisance to keep up to date as things change. > > Agreed. I updated the patch to address these comments. Attached is the > updated version of the patch. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/03/10 19:51, Robert Haas wrote: On Wed, Mar 9, 2016 at 3:30 PM, Tom Lane wrote: Robert Haas writes: Just taking a guess here, you might be thinking that instead of something like this... Update on public.ft2 -> Foreign Update on public.ft2 We could boil it all the way down to this: Foreign Update on public.ft2 Exactly. I'm not claiming that that would be particularly faster, but it would eliminate a whole bunch of seriously ugly stuff that's in this patch. But can you, really? What if the UPDATE is targeting an inheritance hierarchy containing some local tables and some remote tables? I don't really see why that couldn't be made to work. You'd end up with ForeignUpdates on the remote tables and a ModifyTable handling the rest. I don't get it. I mean, what's the parent node going to be? If it's the ModifyTable, then the plan tree looks the same as what this already does. If not, then what? I don't get it, either. If the ForeignUpdates would be executed separately from the ModifyTable, how would the query's reported row count (ie, estate->es_processed) be handled? Just to recap the history, this patch was rewritten, criticized by Stephen and you and rewritten to match your feedback. Then, both of you ignored it for a long time while I and others but a lot of work into it. Now, we're up against the deadline for this release and you don't like it again. Well, OK. If you want to rewrite it into some form you think is better, I'm cool with that. But it would be really unfair if this slipped out of this release after so much work has been put into making it match the design ideas that *you* put forward just because, at the eleventh hour, you now have new ones. Personally, I think we should just commit the darned thing and you can rewrite it later if you want. If you want to rewrite it now instead, I can live with that, too. But let's figure out how we're going to move this forward. I agree with Robert. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Mar 9, 2016 at 3:30 PM, Tom Lane wrote: > Robert Haas writes: >> On Wed, Mar 9, 2016 at 1:12 PM, Tom Lane wrote: >>> I hadn't been paying any attention to this thread, but I wonder whether >>> this entire approach isn't considerably inferior to what we can do now >>> with the planner pathification patch. To quote from the new docs: > >> Well, I guess I'm not quite seeing it. What do you have in mind? >> Just taking a guess here, you might be thinking that instead of >> something like this... > >> Update on public.ft2 >> -> Foreign Update on public.ft2 > >> We could boil it all the way down to this: > >> Foreign Update on public.ft2 > > Exactly. I'm not claiming that that would be particularly faster, but > it would eliminate a whole bunch of seriously ugly stuff that's in > this patch. Like what? >> But can you, really? What if the UPDATE is targeting an inheritance >> hierarchy containing some local tables and some remote tables? > > I don't really see why that couldn't be made to work. You'd end up > with ForeignUpdates on the remote tables and a ModifyTable handling > the rest. I don't get it. I mean, what's the parent node going to be? If it's the ModifyTable, then the plan tree looks the same as what this already does. If not, then what? Just to recap the history, this patch was rewritten, criticized by Stephen and you and rewritten to match your feedback. Then, both of you ignored it for a long time while I and others but a lot of work into it. Now, we're up against the deadline for this release and you don't like it again. Well, OK. If you want to rewrite it into some form you think is better, I'm cool with that. But it would be really unfair if this slipped out of this release after so much work has been put into making it match the design ideas that *you* put forward just because, at the eleventh hour, you now have new ones. Personally, I think we should just commit the darned thing and you can rewrite it later if you want. If you want to rewrite it now instead, I can live with that, too. But let's figure out how we're going to move this forward. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Robert Haas writes: > On Wed, Mar 9, 2016 at 1:12 PM, Tom Lane wrote: >> I hadn't been paying any attention to this thread, but I wonder whether >> this entire approach isn't considerably inferior to what we can do now >> with the planner pathification patch. To quote from the new docs: > Well, I guess I'm not quite seeing it. What do you have in mind? > Just taking a guess here, you might be thinking that instead of > something like this... > Update on public.ft2 > -> Foreign Update on public.ft2 > We could boil it all the way down to this: > Foreign Update on public.ft2 Exactly. I'm not claiming that that would be particularly faster, but it would eliminate a whole bunch of seriously ugly stuff that's in this patch. > But can you, really? What if the UPDATE is targeting an inheritance > hierarchy containing some local tables and some remote tables? I don't really see why that couldn't be made to work. You'd end up with ForeignUpdates on the remote tables and a ModifyTable handling the rest. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Mar 9, 2016 at 1:12 PM, Tom Lane wrote: > Robert Haas writes: >> Overall, I think this is looking pretty good. > > I hadn't been paying any attention to this thread, but I wonder whether > this entire approach isn't considerably inferior to what we can do now > with the planner pathification patch. To quote from the new docs: > >PlanForeignModify and the other callbacks described in Section 54.2.3 >are designed around the assumption that the foreign relation will be >scanned in the usual way and then individual row updates will be driven >by a local ModifyTable plan node. This approach is necessary for the >general case where an update requires reading local tables as well as >foreign tables. However, if the operation could be executed entirely by >the foreign server, the FDW could generate a path representing that and >insert it into the UPPERREL_FINAL upper relation, where it would >compete against the ModifyTable approach. This approach could also be >used to implement remote SELECT FOR UPDATE, rather than using the row >locking callbacks described in Section 54.2.4. Keep in mind that a path >inserted into UPPERREL_FINAL is responsible for implementing all >behavior of the query. > > I don't really see anything that this patch does that wouldn't be better > done that way. And I'd kind of like to get a working example of that type > of path insertion into 9.6, so that we can find out if we need any hooks > or callbacks that aren't there today. Well, I guess I'm not quite seeing it. What do you have in mind? Just taking a guess here, you might be thinking that instead of something like this... Update on public.ft2 -> Foreign Update on public.ft2 We could boil it all the way down to this: Foreign Update on public.ft2 But can you, really? What if the UPDATE is targeting an inheritance hierarchy containing some local tables and some remote tables? Apologies if I've completely misunderstood what you have in mind here, but you haven't really explained what you have in mind here. IMHO, if you want to do something really cool with the new pathification stuff, the thing to do would be pushing down aggregates to foreign servers. A lot of people would be really happy to see SELECT count(*) FROM ft ship the count operation to the remote side! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Robert Haas writes: > Overall, I think this is looking pretty good. I hadn't been paying any attention to this thread, but I wonder whether this entire approach isn't considerably inferior to what we can do now with the planner pathification patch. To quote from the new docs: PlanForeignModify and the other callbacks described in Section 54.2.3 are designed around the assumption that the foreign relation will be scanned in the usual way and then individual row updates will be driven by a local ModifyTable plan node. This approach is necessary for the general case where an update requires reading local tables as well as foreign tables. However, if the operation could be executed entirely by the foreign server, the FDW could generate a path representing that and insert it into the UPPERREL_FINAL upper relation, where it would compete against the ModifyTable approach. This approach could also be used to implement remote SELECT FOR UPDATE, rather than using the row locking callbacks described in Section 54.2.4. Keep in mind that a path inserted into UPPERREL_FINAL is responsible for implementing all behavior of the query. I don't really see anything that this patch does that wouldn't be better done that way. And I'd kind of like to get a working example of that type of path insertion into 9.6, so that we can find out if we need any hooks or callbacks that aren't there today. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Mar 9, 2016 at 3:30 AM, Etsuro Fujita wrote: > Great! I changed the naming. I also updated docs as proposed by you in a > previous email, and rebased the patch to the latest HEAD. Please find > attached an updated version of the patch. Thanks. The new naming looks much better (and better also than what I suggested). I see that you went and changed all of the places that tested for != CMD_SELECT and made them test for == CMD_INSERT || == CMD_UPDATE || == CMD_DELETE instead. I think that's the wrong direction. I think that we should use the != CMD_SELECT version of the test everywhere. That's a single test instead of three, so marginally faster, and maybe marginally more future-proof. I think deparsePushedDownUpdateSql should be renamed to use the new "direct modify" naming, like deparseDirectUpdateSql, maybe. I would suggest not numbering the tests in postgresPlanDirectModify. That just becomes a nuisance to keep up to date as things change. Overall, I think this is looking pretty good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/03/08 2:35, Robert Haas wrote: On Mon, Mar 7, 2016 at 7:53 AM, Etsuro Fujita wrote: Another option to avoid such a hazard would be to remove the two changes from ExecInitModifyTable and create ExecAuxRowMarks and junk filters even in the pushdown case. I made the changes because we won't use ExecAuxRowMarks in that case since we don't need to do EvalPlanQual rechecks and because we won't use junk filters in that case since we do UPDATE/DELETE in the subplan. But the creating cost is enough small, so simply removing the changes seems like a good idea. Sure, that works. OK, I removed the changes. This issue crops up elsewhere as well. The changes to ExecModifyTable() have the same problem -- in that case, it might be wise to move the code that's going to have to be indented yet another level into a separate function. That code also says this: + /* No need to provide scan tuple to ExecProcessReturning. */ + slot = ExecProcessReturning(resultRelInfo, NULL, planSlot); ...but, uh, why not? The comment says what the code does, but what it should do is explain why it does it. As documented in IterateDMLPushdown in fdwhandler.sgml, the reason for that is that in the pushdown case it's the IterateDMLPushdown's responsiblity to get actually inserted/updated/deleted tuples and make those tuples available to the ExecProcessReturning. I'll add comments. Comments are good things to have. :-) Yeah, I added comments. On a broader level, I'm not very happy with the naming this patch uses. Here's an example: + + If an FDW supports optimizing foreign table updates, it still needs to + provide PlanDMLPushdown, BeginDMLPushdown, + IterateDMLPushdown and EndDMLPushdown + described below. + "Optimizing foreign table updates" is both inaccurate (since it doesn't only optimize updates) and so vague as to be meaningless unless you already know what it means. The actual patch uses terminology like "fdwPushdowns" which is just as bad. We might push a lot of things to the foreign side -- sorts, joins, aggregates, limits -- and this is just one of them. Worse, "pushdown" is itself something of a term of art - will people who haven't been following all of the mammoth, multi-hundred-email threads on this topic know what that means? I think we need some better terminology here. The best thing that I can come up with offhand is "bulk modify". So we'd have PlanBulkModify, BeginBulkModify, IterateBulkModify, EndBulkModify, ExplainBulkModify. Other suggestions welcome. The ResultRelInfo flag could be ri_usesFDWBulkModify. I'm not sure that "bulk modify" is best. Yeah, this would improve the performance especially in the bulk-modification case, but would improve the performance even in the case where an UPDATE/DELETE modifies just a single row. Let me explain using an example. Without the patch, we have the following plan for an UPDATE on a foreign table that updates a single row: postgres=# explain verbose update foo set a = a + 1 where a = 1; QUERY PLAN -- Update on public.foo (cost=100.00..101.05 rows=1 width=14) Remote SQL: UPDATE public.foo SET a = $2 WHERE ctid = $1 -> Foreign Scan on public.foo (cost=100.00..101.05 rows=1 width=14) Output: (a + 1), b, ctid Remote SQL: SELECT a, b, ctid FROM public.foo WHERE ((a = 1)) FOR UPDATE (5 rows) The plan requires two queries, SELECT and UPDATE, to do the update. (Actually, the plan have additional overheads in creating a cursor for the SELECT and establishing a prepared statement for the UPDATE.) But with the patch, we have: postgres=# explain verbose update foo set a = a + 1 where a = 1; QUERY PLAN --- Update on public.foo (cost=100.00..101.05 rows=1 width=14) -> Foreign Update on public.foo (cost=100.00..101.05 rows=1 width=14) Remote SQL: UPDATE public.foo SET a = (a + 1) WHERE ((a = 1)) (3 rows) The optimized plan requires just a single UPDATE query to do that! So, even in the single-row-modification case the patch could improve the performance. How about "Direct Modify"; PlanDirectModify, BeginDirectModify, IterateDirectModify, EndDirectModify, ExplainDirectModify, and ri_usesFDWDirectModify. Works for me! Great! I changed the naming. I also updated docs as proposed by you in a previous email, and rebased the patch to the latest HEAD. Please find attached an updated version of the patch. Best regards, Etsuro Fujita fdw-dml-pushdown-v10.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Mon, Mar 7, 2016 at 7:53 AM, Etsuro Fujita wrote: > Another option to avoid such a hazard would be to remove the two changes > from ExecInitModifyTable and create ExecAuxRowMarks and junk filters even in > the pushdown case. I made the changes because we won't use ExecAuxRowMarks > in that case since we don't need to do EvalPlanQual rechecks and because we > won't use junk filters in that case since we do UPDATE/DELETE in the > subplan. But the creating cost is enough small, so simply removing the > changes seems like a good idea. Sure, that works. >> This issue crops up elsewhere as well. The changes to >> ExecModifyTable() have the same problem -- in that case, it might be >> wise to move the code that's going to have to be indented yet another >> level into a separate function. That code also says this: >> >> + /* No need to provide scan tuple to >> ExecProcessReturning. */ >> + slot = ExecProcessReturning(resultRelInfo, >> NULL, planSlot); >> >> ...but, uh, why not? The comment says what the code does, but what it >> should do is explain why it does it. > > As documented in IterateDMLPushdown in fdwhandler.sgml, the reason for that > is that in the pushdown case it's the IterateDMLPushdown's responsiblity to > get actually inserted/updated/deleted tuples and make those tuples available > to the ExecProcessReturning. I'll add comments. Comments are good things to have. :-) >> On a broader level, I'm not very happy with the naming this patch >> uses. Here's an example: >> >> + >> + If an FDW supports optimizing foreign table updates, it still needs >> to >> + provide PlanDMLPushdown, BeginDMLPushdown, >> + IterateDMLPushdown and EndDMLPushdown >> + described below. >> + >> >> "Optimizing foreign table updates" is both inaccurate (since it >> doesn't only optimize updates) and so vague as to be meaningless >> unless you already know what it means. The actual patch uses >> terminology like "fdwPushdowns" which is just as bad. We might push a >> lot of things to the foreign side -- sorts, joins, aggregates, limits >> -- and this is just one of them. Worse, "pushdown" is itself >> something of a term of art - will people who haven't been following >> all of the mammoth, multi-hundred-email threads on this topic know >> what that means? I think we need some better terminology here. >> >> The best thing that I can come up with offhand is "bulk modify". So >> we'd have PlanBulkModify, BeginBulkModify, IterateBulkModify, >> EndBulkModify, ExplainBulkModify. Other suggestions welcome. The >> ResultRelInfo flag could be ri_usesFDWBulkModify. > > I'm not sure that "bulk modify" is best. Yeah, this would improve the > performance especially in the bulk-modification case, but would improve the > performance even in the case where an UPDATE/DELETE modifies just a single > row. Let me explain using an example. Without the patch, we have the > following plan for an UPDATE on a foreign table that updates a single row: > > postgres=# explain verbose update foo set a = a + 1 where a = 1; > QUERY PLAN > -- > Update on public.foo (cost=100.00..101.05 rows=1 width=14) >Remote SQL: UPDATE public.foo SET a = $2 WHERE ctid = $1 >-> Foreign Scan on public.foo (cost=100.00..101.05 rows=1 width=14) > Output: (a + 1), b, ctid > Remote SQL: SELECT a, b, ctid FROM public.foo WHERE ((a = 1)) FOR > UPDATE > (5 rows) > > The plan requires two queries, SELECT and UPDATE, to do the update. > (Actually, the plan have additional overheads in creating a cursor for the > SELECT and establishing a prepared statement for the UPDATE.) But with the > patch, we have: > > postgres=# explain verbose update foo set a = a + 1 where a = 1; > QUERY PLAN > --- > Update on public.foo (cost=100.00..101.05 rows=1 width=14) >-> Foreign Update on public.foo (cost=100.00..101.05 rows=1 width=14) > Remote SQL: UPDATE public.foo SET a = (a + 1) WHERE ((a = 1)) > (3 rows) > > The optimized plan requires just a single UPDATE query to do that! So, even > in the single-row-modification case the patch could improve the performance. > > How about "Direct Modify"; PlanDirectModify, BeginDirectModify, > IterateDirectModify, EndDirectModify, ExplainDirectModify, and > ri_usesFDWDirectModify. Works for me! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/03/05 5:45, Robert Haas wrote: Some comments on the latest version. I haven't reviewed the postgres_fdw changes in detail here, so this is just about the core changes. Thank you for taking the time to review the patch! I see that show_plan_tlist checks whether the operation is any of CMD_INSERT, CMD_UPDATE, or CMD_DELETE. But practically every place else where a similar test is needed instead tests whether the operation is *not* CMD_SELECT. I think this place should do it that way, too. + resultRelInfo = mtstate->resultRelInfo; for (i = 0; i < nplans; i++) { ExecAuxRowMark *aerm; + /* +* ignore subplan if the FDW pushes down the command to the remote +* server; the ModifyTable won't have anything to do except for +* evaluation of RETURNING expressions +*/ + if (resultRelInfo->ri_FdwPushdown) + { + resultRelInfo++; + continue; + } + subplan = mtstate->mt_plans[i]->plan; aerm = ExecBuildAuxRowMark(erm, subplan->targetlist); mtstate->mt_arowmarks[i] = lappend(mtstate->mt_arowmarks[i], aerm); + resultRelInfo++; } This kind of thing creates a hazard for future people maintaining this code. If somebody adds some code to this loop that needs to execute even when resultRelInfo->ri_FdwPushdown is true, they have to add two copies of it. It's much better to move the three lines of logic that execute only in the non-pushdown case inside of if (!resultRelInfo->ri_FdwPushdown). Another option to avoid such a hazard would be to remove the two changes from ExecInitModifyTable and create ExecAuxRowMarks and junk filters even in the pushdown case. I made the changes because we won't use ExecAuxRowMarks in that case since we don't need to do EvalPlanQual rechecks and because we won't use junk filters in that case since we do UPDATE/DELETE in the subplan. But the creating cost is enough small, so simply removing the changes seems like a good idea. This issue crops up elsewhere as well. The changes to ExecModifyTable() have the same problem -- in that case, it might be wise to move the code that's going to have to be indented yet another level into a separate function. That code also says this: + /* No need to provide scan tuple to ExecProcessReturning. */ + slot = ExecProcessReturning(resultRelInfo, NULL, planSlot); ...but, uh, why not? The comment says what the code does, but what it should do is explain why it does it. As documented in IterateDMLPushdown in fdwhandler.sgml, the reason for that is that in the pushdown case it's the IterateDMLPushdown's responsiblity to get actually inserted/updated/deleted tuples and make those tuples available to the ExecProcessReturning. I'll add comments. On a broader level, I'm not very happy with the naming this patch uses. Here's an example: + + If an FDW supports optimizing foreign table updates, it still needs to + provide PlanDMLPushdown, BeginDMLPushdown, + IterateDMLPushdown and EndDMLPushdown + described below. + "Optimizing foreign table updates" is both inaccurate (since it doesn't only optimize updates) and so vague as to be meaningless unless you already know what it means. The actual patch uses terminology like "fdwPushdowns" which is just as bad. We might push a lot of things to the foreign side -- sorts, joins, aggregates, limits -- and this is just one of them. Worse, "pushdown" is itself something of a term of art - will people who haven't been following all of the mammoth, multi-hundred-email threads on this topic know what that means? I think we need some better terminology here. The best thing that I can come up with offhand is "bulk modify". So we'd have PlanBulkModify, BeginBulkModify, IterateBulkModify, EndBulkModify, ExplainBulkModify. Other suggestions welcome. The ResultRelInfo flag could be ri_usesFDWBulkModify. I'm not sure that "bulk modify" is best. Yeah, this would improve the performance especially in the bulk-modification case, but would improve the performance even in the case where an UPDATE/DELETE modifies just a single row. Let me explain using an example. Without the patch, we have the following plan for an UPDATE on a foreign table that updates a single row: postgres=# explain verbose update foo set a = a + 1 where a = 1; QUERY PLAN -- Update on public.foo (cost=100.00..101.05 rows=1 width=14) Remote SQL: UPDATE public.foo SET a = $2 WHERE ctid = $1 -> Foreign Sc
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Tue, Feb 23, 2016 at 1:18 AM, Etsuro Fujita wrote: > Thanks again for updating the patch and fixing the issues! Some comments on the latest version. I haven't reviewed the postgres_fdw changes in detail here, so this is just about the core changes. I see that show_plan_tlist checks whether the operation is any of CMD_INSERT, CMD_UPDATE, or CMD_DELETE. But practically every place else where a similar test is needed instead tests whether the operation is *not* CMD_SELECT. I think this place should do it that way, too. + resultRelInfo = mtstate->resultRelInfo; for (i = 0; i < nplans; i++) { ExecAuxRowMark *aerm; + /* +* ignore subplan if the FDW pushes down the command to the remote +* server; the ModifyTable won't have anything to do except for +* evaluation of RETURNING expressions +*/ + if (resultRelInfo->ri_FdwPushdown) + { + resultRelInfo++; + continue; + } + subplan = mtstate->mt_plans[i]->plan; aerm = ExecBuildAuxRowMark(erm, subplan->targetlist); mtstate->mt_arowmarks[i] = lappend(mtstate->mt_arowmarks[i], aerm); + resultRelInfo++; } This kind of thing creates a hazard for future people maintaining this code. If somebody adds some code to this loop that needs to execute even when resultRelInfo->ri_FdwPushdown is true, they have to add two copies of it. It's much better to move the three lines of logic that execute only in the non-pushdown case inside of if (!resultRelInfo->ri_FdwPushdown). This issue crops up elsewhere as well. The changes to ExecModifyTable() have the same problem -- in that case, it might be wise to move the code that's going to have to be indented yet another level into a separate function. That code also says this: + /* No need to provide scan tuple to ExecProcessReturning. */ + slot = ExecProcessReturning(resultRelInfo, NULL, planSlot); ...but, uh, why not? The comment says what the code does, but what it should do is explain why it does it. On a broader level, I'm not very happy with the naming this patch uses. Here's an example: + + If an FDW supports optimizing foreign table updates, it still needs to + provide PlanDMLPushdown, BeginDMLPushdown, + IterateDMLPushdown and EndDMLPushdown + described below. + "Optimizing foreign table updates" is both inaccurate (since it doesn't only optimize updates) and so vague as to be meaningless unless you already know what it means. The actual patch uses terminology like "fdwPushdowns" which is just as bad. We might push a lot of things to the foreign side -- sorts, joins, aggregates, limits -- and this is just one of them. Worse, "pushdown" is itself something of a term of art - will people who haven't been following all of the mammoth, multi-hundred-email threads on this topic know what that means? I think we need some better terminology here. The best thing that I can come up with offhand is "bulk modify". So we'd have PlanBulkModify, BeginBulkModify, IterateBulkModify, EndBulkModify, ExplainBulkModify. Other suggestions welcome. The ResultRelInfo flag could be ri_usesFDWBulkModify. The documentation could say something like this: Some inserts, updates, and deletes to foreign tables can be optimized by implementing an alternate set of interfaces. The ordinary interfaces for inserts, updates, and deletes fetch rows from the remote server and then modify those rows one at a time. In some cases, this row-by-row approach is necessary, but it can be inefficient. If it is possible for the foreign server to determine which rows should be modified without actually retrieving them, and if there are no local triggers which would affect the operation, then it is possible to arrange things so that the entire operation is performed on the remote server. The interfaces described below make this possible. + Begin executing a foreign table update directly on the remote server. I think this should say "Prepare to execute a bulk modification directly on the remote server". It shouldn't actually begin the execution phase. + End the table update and release resources. It is normally not important And I think this one should say "Clean up following a bulk modification on the remote server". It's not actually ending the update; the iterate method already did that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/02/22 20:13, Rushabh Lathia wrote: PFA update patch, which includes changes into postgresPlanDMLPushdown() to check for join condition before target columns and also fixed couple of whitespace issues. For pushing down an UPDATE/DELETE on a foreign join to the remote, I created a WIP patch on top of the latest version of the DML pushdown patch. Attached is the WIP patch. I'd like to propose this as part of (I'd like to discuss this as a separate patch, though): https://commitfest.postgresql.org/9/453/ The patch doesn't correctly evaluate the values of system columns of joined relations in RETURNING, other than ctid. I'll fix that ASAP. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 130,135 static void deparseReturningList(StringInfo buf, PlannerInfo *root, --- 130,143 bool trig_after_row, List *returningList, List **retrieved_attrs); + static void deparseJoinedReturningList(List *fdw_scan_tlist, + Index rtindex, + Bitmapset *attrs_used, + List *returningList, + List **retrieved_attrs, + List **result_attrs, + List **result_attrnos, + deparse_expr_cxt *context); static void deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, bool qualify_col); static void deparseRelation(StringInfo buf, Relation rel); *** *** 158,164 static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *joinrel, bool use_alias, List **params_list); /* --- 166,175 static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *joinrel, ! Index ignore_rel, ! bool use_alias, ! List **params_list); /* *** *** 850,856 deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context) * Construct FROM clause */ appendStringInfoString(buf, " FROM "); ! deparseFromExprForRel(buf, root, foreignrel, (foreignrel->reloptkind == RELOPT_JOINREL), context->params_list); } --- 861,867 * Construct FROM clause */ appendStringInfoString(buf, " FROM "); ! deparseFromExprForRel(buf, root, foreignrel, (Index) 0, (foreignrel->reloptkind == RELOPT_JOINREL), context->params_list); } *** *** 1135,1144 deparseExplicitTargetList(List *tlist, List **retrieved_attrs, * The function constructs ... JOIN ... ON ... for join relation. For a base * relation it just returns schema-qualified tablename, with the appropriate * alias if so requested. */ static void ! deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, ! bool use_alias, List **params_list) { PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private; --- 1146,1166 * The function constructs ... JOIN ... ON ... for join relation. For a base * relation it just returns schema-qualified tablename, with the appropriate * alias if so requested. + * + * If constructing FROM clause of UPDATE statement or USING clause of DELETE + * statement, we simply ignore the ignore_rel target relation when deparsing + * the join to the target relation. Note that the join is safely interchanged + * with higher-level outer joins (if any) by outer-join identity 1 since that + * the join won't appear on the nullable side of such outer joins (we currently + * don't allow the result relation to appear on the nullable side of an outer + * join) and that the target relation won't be outer-joined to other relations. */ static void ! deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *foreignrel, ! Index ignore_rel, ! bool use_alias, ! List **params_list) { PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private; *** *** 1148,1169 deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, RelOptInfo *rel_i = fpinfo->innerrel; StringInfoData join_sql_o; StringInfoData join_sql_i; /* Deparse outer relation */ ! initStringInfo(&join_sql_o); ! deparseFromExprForRel(&join_sql_o, root, rel_o, true, params_list); /* Deparse inner relation */ ! initStringInfo(&join_sql_i); ! deparseFromExprForRel(&join_sql_i, root, rel_i, true, params_list); /* * For a join relation FROM clause entry is deparsed as * * ((outer relation) (inner relation) ON (joinclauses) */ ! appendStrin
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/02/22 20:13, Rushabh Lathia wrote: I did another round of review for the latest patch and well as performed the sanity test, and haven't found any functional issues. Found couple of issue, see in-line comments for the same. Thanks! On Thu, Feb 18, 2016 at 3:15 PM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: On 2016/02/12 21:19, Etsuro Fujita wrote: + /* Check point 1 */ + if (operation == CMD_INSERT) + return false; + + /* Check point 2 */ + if (nodeTag(subplan) != T_ForeignScan) + return false; + + /* Check point 3 */ + if (subplan->qual != NIL) + return false; + + /* Check point 4 */ + if (operation == CMD_UPDATE) These comments are referring to something in the function header further up, but you could instead just delete the stuff from the header and mention the actual conditions here. Will fix. Done. The patch doesn't allow the postgres_fdw to push down an UPDATE/DELETE on a foreign join, so I added one more condition here not to handle such cases. (I'm planning to propose a patch to handle such cases, in the next CF.) I think we should place the checking foreign join condition before the target columns, as foreign join condition is less costly then the target columns. Agreed. Other changes: * I keep Rushabh's code change that we call PlanDMLPushdown only when all the required APIs are available with FDW, but for CheckValidResultRel, I left the code as-is (no changes to that function), to match the docs saying that the FDW needs to provide the DML pushdown callback functions together with existing table-updating functions such as ExecForeignInsert, ExecForeignUpdate and ExecForeignDelete. I think we should also update the CheckValidResultRel(), because even though ExecForeignInsert, ExecForeignUpdate and ExecForeignDelete not present, FDW still can perform UPDATE/DELETE/INSERT using DML Pushdown APIs. Lets take committer's view on this. OK PFA update patch, which includes changes into postgresPlanDMLPushdown() to check for join condition before target columns and also fixed couple of whitespace issues. Thanks again for updating the patch and fixing the issues! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
I did another round of review for the latest patch and well as performed the sanity test, and haven't found any functional issues. Found couple of issue, see in-line comments for the same. On Thu, Feb 18, 2016 at 3:15 PM, Etsuro Fujita wrote: > On 2016/02/12 21:19, Etsuro Fujita wrote: > >> Comments on specific points follow. >>> >>> This seems to need minor rebasing in the wake of the join pushdown patch. >>> >> > Will do. >> > > Done. > > + /* Likewise for ForeignScan that has pushed down >>> INSERT/UPDATE/DELETE */ >>> + if (IsA(plan, ForeignScan) && >>> + (((ForeignScan *) plan)->operation == CMD_INSERT || >>> +((ForeignScan *) plan)->operation == CMD_UPDATE || >>> +((ForeignScan *) plan)->operation == CMD_DELETE)) >>> + return; >>> >>> I don't really understand why this is a good idea. Since target lists >>> are only displayed with EXPLAIN (VERBOSE), I don't really understand >>> what is to be gained by suppressing them in any case at all, and >>> therefore I don't understand why it's a good idea to do it here, >>> either. If there is a good reason, maybe the comment should explain >>> what it is. >>> >> > I think that displaying target lists would be confusing for users. >> > > There seems no objection from you (or anyone), I left that as proposed, > and added more comments. > > + /* Check point 1 */ >>> + if (operation == CMD_INSERT) >>> + return false; >>> + >>> + /* Check point 2 */ >>> + if (nodeTag(subplan) != T_ForeignScan) >>> + return false; >>> + >>> + /* Check point 3 */ >>> + if (subplan->qual != NIL) >>> + return false; >>> + >>> + /* Check point 4 */ >>> + if (operation == CMD_UPDATE) >>> >>> These comments are referring to something in the function header >>> further up, but you could instead just delete the stuff from the >>> header and mention the actual conditions here. >>> >> > Will fix. >> > > Done. > > The patch doesn't allow the postgres_fdw to push down an UPDATE/DELETE on > a foreign join, so I added one more condition here not to handle such > cases. (I'm planning to propose a patch to handle such cases, in the next > CF.) > I think we should place the checking foreign join condition before the target columns, as foreign join condition is less costly then the target columns. > > - If the first condition is supposed accept only CMD_UPDATE or >>> CMD_DELETE, say if (operation != CMD_UPDATE || operation != >>> CMD_DELETE) rather than doing it this way. Is-not-insert may in this >>> context be functionally equivalent to is-update-or-delete, but it's >>> better to write the comment and the code so that they exactly match >>> rather than kinda-match. >>> >>> - For point 2, use IsA(subplan, ForiegnScan). >>> >> > Will fix. >> > > Done. > > + /* >>> +* ignore subplan if the FDW pushes down the >>> command to the remote >>> +* server >>> +*/ >>> >>> This comment states what the code does, instead of explaining why it >>> does it. Please update it so that it explains why it does it. >>> >> > Will update. >> > > Done. > > + List *fdwPushdowns; /* per-target-table FDW >>> pushdown flags */ >>> >>> Isn't a list of booleans an awfully inefficient representation? How >>> about a Bitmapset? >>> >> > OK, will do. >> > > Done. > > + /* >>> +* Prepare remote-parameter expressions for evaluation. >>> (Note: in >>> +* practice, we expect that all these expressions will be just >>> Params, so >>> +* we could possibly do something more efficient than using >>> the full >>> +* expression-eval machinery for this. But probably there >>> would be little >>> +* benefit, and it'd require postgres_fdw to know more than is >>> desirable >>> +* about Param evaluation.) >>> +*/ >>> + dpstate->param_exprs = (List *) >>> + ExecInitExpr((Expr *) fsplan->fdw_exprs, >>> +(PlanState *) node); >>> >>> This is an exact copy of an existing piece of code and its associated >>> comment. A good bit of the surrounding code is copied, too. You need >>> to refactor to avoid duplication, like by putting some of the code in >>> a new function that both postgresBeginForeignScan and >>> postgresBeginForeignModify can call. >>> >>> execute_dml_stmt() has some of the same disease. >>> >> > Will do. >> > > Done. > > Other changes: > > * I fixed docs as discussed before with Rushabh Lathia and Thom Brown. > > * I keep Rushabh's code change that we call PlanDMLPushdown only when all > the required APIs are available with FDW, but for CheckValidResultRel, I > left the code as-is (no changes to that function), to match the docs saying > that the FDW needs to provide the DML pushdown callback functions together > with existing table
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/02/12 21:19, Etsuro Fujita wrote: Comments on specific points follow. This seems to need minor rebasing in the wake of the join pushdown patch. Will do. Done. + /* Likewise for ForeignScan that has pushed down INSERT/UPDATE/DELETE */ + if (IsA(plan, ForeignScan) && + (((ForeignScan *) plan)->operation == CMD_INSERT || +((ForeignScan *) plan)->operation == CMD_UPDATE || +((ForeignScan *) plan)->operation == CMD_DELETE)) + return; I don't really understand why this is a good idea. Since target lists are only displayed with EXPLAIN (VERBOSE), I don't really understand what is to be gained by suppressing them in any case at all, and therefore I don't understand why it's a good idea to do it here, either. If there is a good reason, maybe the comment should explain what it is. I think that displaying target lists would be confusing for users. There seems no objection from you (or anyone), I left that as proposed, and added more comments. + /* Check point 1 */ + if (operation == CMD_INSERT) + return false; + + /* Check point 2 */ + if (nodeTag(subplan) != T_ForeignScan) + return false; + + /* Check point 3 */ + if (subplan->qual != NIL) + return false; + + /* Check point 4 */ + if (operation == CMD_UPDATE) These comments are referring to something in the function header further up, but you could instead just delete the stuff from the header and mention the actual conditions here. Will fix. Done. The patch doesn't allow the postgres_fdw to push down an UPDATE/DELETE on a foreign join, so I added one more condition here not to handle such cases. (I'm planning to propose a patch to handle such cases, in the next CF.) - If the first condition is supposed accept only CMD_UPDATE or CMD_DELETE, say if (operation != CMD_UPDATE || operation != CMD_DELETE) rather than doing it this way. Is-not-insert may in this context be functionally equivalent to is-update-or-delete, but it's better to write the comment and the code so that they exactly match rather than kinda-match. - For point 2, use IsA(subplan, ForiegnScan). Will fix. Done. + /* +* ignore subplan if the FDW pushes down the command to the remote +* server +*/ This comment states what the code does, instead of explaining why it does it. Please update it so that it explains why it does it. Will update. Done. + List *fdwPushdowns; /* per-target-table FDW pushdown flags */ Isn't a list of booleans an awfully inefficient representation? How about a Bitmapset? OK, will do. Done. + /* +* Prepare remote-parameter expressions for evaluation. (Note: in +* practice, we expect that all these expressions will be just Params, so +* we could possibly do something more efficient than using the full +* expression-eval machinery for this. But probably there would be little +* benefit, and it'd require postgres_fdw to know more than is desirable +* about Param evaluation.) +*/ + dpstate->param_exprs = (List *) + ExecInitExpr((Expr *) fsplan->fdw_exprs, +(PlanState *) node); This is an exact copy of an existing piece of code and its associated comment. A good bit of the surrounding code is copied, too. You need to refactor to avoid duplication, like by putting some of the code in a new function that both postgresBeginForeignScan and postgresBeginForeignModify can call. execute_dml_stmt() has some of the same disease. Will do. Done. Other changes: * I fixed docs as discussed before with Rushabh Lathia and Thom Brown. * I keep Rushabh's code change that we call PlanDMLPushdown only when all the required APIs are available with FDW, but for CheckValidResultRel, I left the code as-is (no changes to that function), to match the docs saying that the FDW needs to provide the DML pushdown callback functions together with existing table-updating functions such as ExecForeignInsert, ExecForeignUpdate and ExecForeignDelete. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 1316,1321 deparseUpdateSql(StringInfo buf, PlannerInfo *root, --- 1316,1384 } /* + * deparse remote UPDATE statement + * + * The statement text is appended to buf, and we also create an integer List + * of the columns being retrieved by RETURNING (if any), which is returned + * to *retrieved_attrs. + */ + void + deparsePushedDownUpdateSql(StringInfo buf, PlannerInfo *root, + Index rtindex, Relation rel, + List *targetlist, + List *targetAttrs, + List *remote_conds, + List **params_list, + List *returning
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/02/15 15:20, Rushabh Lathia wrote: On Fri, Feb 12, 2016 at 5:40 PM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: As a result of our discussions, we reached a conclusion that the DML pushdown APIs should be provided together with existing APIs such as ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete, IIUC. So, how about (1) leaving the description for the existing APIs as-is and (2) adding a new description for the DML pushdown APIs in parenthesis, like this?: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively. (If the FDW attempts to optimize a foreign table update, it still needs to provide PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown and EndDMLPushdown.) Actually, if the FDW provides the DML pushdown APIs, (pushdown-able) foreign table updates can be done without ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete. So, the above docs are not necessarily correct. But we don't recommend to do that without the existing APIs, so I'm not sure it's worth complicating the docs. Adding a new description for DML pushdown API seems good idea. I would suggest to add that as separate paragraph rather then into brackets. OK, will do. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Feb 12, 2016 at 5:40 PM, Etsuro Fujita wrote: > Hi Rushabh and Thom, > > Thanks for the review! > > On 2016/02/10 22:37, Thom Brown wrote: > >> On 10 February 2016 at 08:00, Rushabh Lathia >> wrote: >> >>> Fujita-san, I am attaching update version of the patch, which added >>> the documentation update. >>> >> > Thanks for updating the patch! > > Once we finalize this, I feel good with the patch and think that we >>> could mark this as ready for committer. >>> >> > I find this wording a bit confusing: >> >> "If the IsForeignRelUpdatable pointer is set to NULL, foreign tables >> are assumed to be insertable, updatable, or deletable either the FDW >> provides ExecForeignInsert,ExecForeignUpdate, or ExecForeignDelete >> respectively or if the FDW optimizes a foreign table update on a >> foreign table using PlanDMLPushdown (PlanDMLPushdown still needs to >> provide BeginDMLPushdown, IterateDMLPushdown and EndDMLPushdown to >> execute the optimized update.)." >> >> This is a very long sentence, and the word "either" doesn't work here. >> > > Agreed. > > As a result of our discussions, we reached a conclusion that the DML > pushdown APIs should be provided together with existing APIs such as > ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete, IIUC. So, how > about (1) leaving the description for the existing APIs as-is and (2) > adding a new description for the DML pushdown APIs in parenthesis, like > this?: > > If the IsForeignRelUpdatable pointer is set to > NULL, foreign tables are assumed to be insertable, > updatable, > or deletable if the FDW provides ExecForeignInsert, > ExecForeignUpdate, or ExecForeignDelete > respectively. > (If the FDW attempts to optimize a foreign table update, it still > needs to provide PlanDMLPushdown, BeginDMLPushdown, > IterateDMLPushdown and EndDMLPushdown.) > > Actually, if the FDW provides the DML pushdown APIs, (pushdown-able) > foreign table updates can be done without ExecForeignInsert, > ExecForeignUpdate or ExecForeignDelete. So, the above docs are not > necessarily correct. But we don't recommend to do that without the > existing APIs, so I'm not sure it's worth complicating the docs. Adding a new description for DML pushdown API seems good idea. I would suggest to add that as separate paragraph rather then into brackets. > > > Also: >> >> "When the query doesn't has the clause, the FDW must also increment >> the row count for the ForeignScanState node in the EXPLAIN ANALYZE >> case." >> >> Should read "doesn't have" >> > > Will fix. > > Best regards, > Etsuro Fujita > > > -- Rushabh Lathia
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/02/12 21:46, Robert Haas wrote: On Fri, Feb 12, 2016 at 7:19 AM, Etsuro Fujita wrote: I think that displaying target lists would be confusing for users. Here is an example: EXPLAIN (verbose, costs off) DELETE FROM rem1; -- can be pushed down QUERY PLAN - Delete on public.rem1 -> Foreign Delete on public.rem1 Output: ctid Remote SQL: DELETE FROM public.loc1 (4 rows) Should we output the "Output" line? I see your point, but what if there's a RETURNING clause? IMO I think that would be confusing in that case. Here is an example: EXPLAIN (verbose, costs off) DELETE FROM rem1 RETURNING *; -- can be pushed down QUERY PLAN -- Delete on public.rem1 Output: f1, f2 -> Foreign Delete on public.rem1 Output: ctid Remote SQL: DELETE FROM public.loc1 RETURNING f1, f2 (5 rows) The Output line beneath the ForeignScan node doesn't match the RETURNING expressions in the remote query as the Output line beneath the ModifyTable node does, so I think displaying that would be confusing even in that case. Another example: postgres=# explain verbose update foo set a = a + 1 returning *; QUERY PLAN -- Update on public.foo (cost=100.00..137.50 rows=1000 width=10) Output: a -> Foreign Update on public.foo (cost=100.00..137.50 rows=1000 width=10) Output: (a + 1), ctid Remote SQL: UPDATE public.foo SET a = (a + 1) RETURNING a (5 rows) Same above. As for case of INSERT .. RETURNING .., I guess there is not such a mismatch, but I'm not sure that displaying that is that helpful, honestly, so I'd vote for suppressing that in all cases, for consistency. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Feb 12, 2016 at 7:19 AM, Etsuro Fujita wrote: > I think that displaying target lists would be confusing for users. Here is > an example: > > EXPLAIN (verbose, costs off) > DELETE FROM rem1; -- can be pushed down > QUERY PLAN > - > Delete on public.rem1 >-> Foreign Delete on public.rem1 > Output: ctid > Remote SQL: DELETE FROM public.loc1 > (4 rows) > > Should we output the "Output" line? I see your point, but what if there's a RETURNING clause? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Hi Robert, Thanks for the review! On 2016/02/11 5:43, Robert Haas wrote: On Wed, Feb 10, 2016 at 3:00 AM, Rushabh Lathia wrote: Fujita-san, I am attaching update version of the patch, which added the documentation update. Once we finalize this, I feel good with the patch and think that we could mark this as ready for committer. It would be nice to hear from Tom and/or Stephen whether the changes that have been made since they last commented on it. I feel like the design is reasonably OK, but I don't want to push this through if they're still not happy with it. One thing I'm not altogether keen on is the use of "pushdown" or "dml pushdown" as a term of art; on the other hand, I'm not sure what other term would be better. I'm open to that naming. Proposals are welcome! Comments on specific points follow. This seems to need minor rebasing in the wake of the join pushdown patch. Will do. + /* Likewise for ForeignScan that has pushed down INSERT/UPDATE/DELETE */ + if (IsA(plan, ForeignScan) && + (((ForeignScan *) plan)->operation == CMD_INSERT || +((ForeignScan *) plan)->operation == CMD_UPDATE || +((ForeignScan *) plan)->operation == CMD_DELETE)) + return; I don't really understand why this is a good idea. Since target lists are only displayed with EXPLAIN (VERBOSE), I don't really understand what is to be gained by suppressing them in any case at all, and therefore I don't understand why it's a good idea to do it here, either. If there is a good reason, maybe the comment should explain what it is. I think that displaying target lists would be confusing for users. Here is an example: EXPLAIN (verbose, costs off) DELETE FROM rem1; -- can be pushed down QUERY PLAN - Delete on public.rem1 -> Foreign Delete on public.rem1 Output: ctid Remote SQL: DELETE FROM public.loc1 (4 rows) Should we output the "Output" line? + /* Check point 1 */ + if (operation == CMD_INSERT) + return false; + + /* Check point 2 */ + if (nodeTag(subplan) != T_ForeignScan) + return false; + + /* Check point 3 */ + if (subplan->qual != NIL) + return false; + + /* Check point 4 */ + if (operation == CMD_UPDATE) These comments are referring to something in the function header further up, but you could instead just delete the stuff from the header and mention the actual conditions here. Also: Will fix. - If the first condition is supposed accept only CMD_UPDATE or CMD_DELETE, say if (operation != CMD_UPDATE || operation != CMD_DELETE) rather than doing it this way. Is-not-insert may in this context be functionally equivalent to is-update-or-delete, but it's better to write the comment and the code so that they exactly match rather than kinda-match. - For point 2, use IsA(subplan, ForiegnScan). Will fix. + /* +* ignore subplan if the FDW pushes down the command to the remote +* server +*/ This comment states what the code does, instead of explaining why it does it. Please update it so that it explains why it does it. Will update. + List *fdwPushdowns; /* per-target-table FDW pushdown flags */ Isn't a list of booleans an awfully inefficient representation? How about a Bitmapset? OK, will do. + /* +* Prepare remote-parameter expressions for evaluation. (Note: in +* practice, we expect that all these expressions will be just Params, so +* we could possibly do something more efficient than using the full +* expression-eval machinery for this. But probably there would be little +* benefit, and it'd require postgres_fdw to know more than is desirable +* about Param evaluation.) +*/ + dpstate->param_exprs = (List *) + ExecInitExpr((Expr *) fsplan->fdw_exprs, +(PlanState *) node); This is an exact copy of an existing piece of code and its associated comment. A good bit of the surrounding code is copied, too. You need to refactor to avoid duplication, like by putting some of the code in a new function that both postgresBeginForeignScan and postgresBeginForeignModify can call. execute_dml_stmt() has some of the same disease. Will do. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Hi Rushabh and Thom, Thanks for the review! On 2016/02/10 22:37, Thom Brown wrote: On 10 February 2016 at 08:00, Rushabh Lathia wrote: Fujita-san, I am attaching update version of the patch, which added the documentation update. Thanks for updating the patch! Once we finalize this, I feel good with the patch and think that we could mark this as ready for committer. I find this wording a bit confusing: "If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable either the FDW provides ExecForeignInsert,ExecForeignUpdate, or ExecForeignDelete respectively or if the FDW optimizes a foreign table update on a foreign table using PlanDMLPushdown (PlanDMLPushdown still needs to provide BeginDMLPushdown, IterateDMLPushdown and EndDMLPushdown to execute the optimized update.)." This is a very long sentence, and the word "either" doesn't work here. Agreed. As a result of our discussions, we reached a conclusion that the DML pushdown APIs should be provided together with existing APIs such as ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete, IIUC. So, how about (1) leaving the description for the existing APIs as-is and (2) adding a new description for the DML pushdown APIs in parenthesis, like this?: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively. (If the FDW attempts to optimize a foreign table update, it still needs to provide PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown and EndDMLPushdown.) Actually, if the FDW provides the DML pushdown APIs, (pushdown-able) foreign table updates can be done without ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete. So, the above docs are not necessarily correct. But we don't recommend to do that without the existing APIs, so I'm not sure it's worth complicating the docs. Also: "When the query doesn't has the clause, the FDW must also increment the row count for the ForeignScanState node in the EXPLAIN ANALYZE case." Should read "doesn't have" Will fix. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Feb 10, 2016 at 3:00 AM, Rushabh Lathia wrote: > Fujita-san, I am attaching update version of the patch, which added > the documentation update. > > Once we finalize this, I feel good with the patch and think that we > could mark this as ready for committer. It would be nice to hear from Tom and/or Stephen whether the changes that have been made since they last commented on it. I feel like the design is reasonably OK, but I don't want to push this through if they're still not happy with it. One thing I'm not altogether keen on is the use of "pushdown" or "dml pushdown" as a term of art; on the other hand, I'm not sure what other term would be better. Comments on specific points follow. This seems to need minor rebasing in the wake of the join pushdown patch. + /* Likewise for ForeignScan that has pushed down INSERT/UPDATE/DELETE */ + if (IsA(plan, ForeignScan) && + (((ForeignScan *) plan)->operation == CMD_INSERT || +((ForeignScan *) plan)->operation == CMD_UPDATE || +((ForeignScan *) plan)->operation == CMD_DELETE)) + return; I don't really understand why this is a good idea. Since target lists are only displayed with EXPLAIN (VERBOSE), I don't really understand what is to be gained by suppressing them in any case at all, and therefore I don't understand why it's a good idea to do it here, either. If there is a good reason, maybe the comment should explain what it is. + /* Check point 1 */ + if (operation == CMD_INSERT) + return false; + + /* Check point 2 */ + if (nodeTag(subplan) != T_ForeignScan) + return false; + + /* Check point 3 */ + if (subplan->qual != NIL) + return false; + + /* Check point 4 */ + if (operation == CMD_UPDATE) These comments are referring to something in the function header further up, but you could instead just delete the stuff from the header and mention the actual conditions here. Also: - If the first condition is supposed accept only CMD_UPDATE or CMD_DELETE, say if (operation != CMD_UPDATE || operation != CMD_DELETE) rather than doing it this way. Is-not-insert may in this context be functionally equivalent to is-update-or-delete, but it's better to write the comment and the code so that they exactly match rather than kinda-match. - For point 2, use IsA(subplan, ForiegnScan). + /* +* ignore subplan if the FDW pushes down the command to the remote +* server +*/ This comment states what the code does, instead of explaining why it does it. Please update it so that it explains why it does it. + List *fdwPushdowns; /* per-target-table FDW pushdown flags */ Isn't a list of booleans an awfully inefficient representation? How about a Bitmapset? + /* +* Prepare remote-parameter expressions for evaluation. (Note: in +* practice, we expect that all these expressions will be just Params, so +* we could possibly do something more efficient than using the full +* expression-eval machinery for this. But probably there would be little +* benefit, and it'd require postgres_fdw to know more than is desirable +* about Param evaluation.) +*/ + dpstate->param_exprs = (List *) + ExecInitExpr((Expr *) fsplan->fdw_exprs, +(PlanState *) node); This is an exact copy of an existing piece of code and its associated comment. A good bit of the surrounding code is copied, too. You need to refactor to avoid duplication, like by putting some of the code in a new function that both postgresBeginForeignScan and postgresBeginForeignModify can call. execute_dml_stmt() has some of the same disease. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 10 February 2016 at 08:00, Rushabh Lathia wrote: > Fujita-san, I am attaching update version of the patch, which added > the documentation update. > > Once we finalize this, I feel good with the patch and think that we > could mark this as ready for committer. I find this wording a bit confusing: "If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable either the FDW provides ExecForeignInsert,ExecForeignUpdate, or ExecForeignDelete respectively or if the FDW optimizes a foreign table update on a foreign table using PlanDMLPushdown (PlanDMLPushdown still needs to provide BeginDMLPushdown, IterateDMLPushdown and EndDMLPushdown to execute the optimized update.)." This is a very long sentence, and the word "either" doesn't work here. Also: "When the query doesn't has the clause, the FDW must also increment the row count for the ForeignScanState node in the EXPLAIN ANALYZE case." Should read "doesn't have" The rest looks fine AFAICT. Regards Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Fujita-san, I am attaching update version of the patch, which added the documentation update. Once we finalize this, I feel good with the patch and think that we could mark this as ready for committer. On Fri, Feb 5, 2016 at 5:33 PM, Rushabh Lathia wrote: > > > On Fri, Feb 5, 2016 at 4:46 PM, Rushabh Lathia > wrote: > >> >> >> On Wed, Feb 3, 2016 at 3:31 PM, Etsuro Fujita < >> fujita.ets...@lab.ntt.co.jp> wrote: >> >>> On 2016/01/28 15:20, Rushabh Lathia wrote: >>> On Thu, Jan 28, 2016 at 11:33 AM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: On 2016/01/27 21:23, Rushabh Lathia wrote: If I understood correctly, above documentation means, that if FDW have DMLPushdown APIs that is enough. But in reality thats not the case, we need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete in case DML is not pushable. And here fact is DMLPushdown APIs are optional for FDW, so that if FDW don't have DMLPushdown APIs they can still very well perform the DML operations using ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. >>> >>> So documentation should be like: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, If FDW provides DMLPushdown APIs and the DML are pushable to the foreign server, then FDW still needs ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete for the non-pushable DML operation. What's your opinion ? >>> >>> I agree that we should add this to the documentation, too. >>> >>> I added docs to the IsForeignRelUpdatable documentation. Also, a brief >>> introductory remark has been added at the beginning of the DML pushdown >>> APIs' documentation. >>> >>> BTW, if I understand correctly, I think we should also modify relation_is_updatabale() accordingly. Am I right? >>> >>> Yep, we need to modify relation_is_updatable(). >>> >>> I thought I'd modify that function in the same way as >>> CheckValidResultRel(), but I noticed that we cannot do that, because we >>> don't have any information on whether each update is pushed down to the >>> remote server by PlanDMLPushdown, during relation_is_updatabale(). >> >> >> Sorry I didn't get you here. Can't resultRelInfo->ri_FdwPushdown gives >> information update whether update is pushed down safe or not ? What my >> concern here is, lets say resultRelInfo->ri_FdwPushdown marked as true >> (PlanDMLPushdown return true), but later into CheckValidResultRel() it >> found out that missing BeginDMLPushdown, IterateDMLPushdown and >> EndDMLPushdown APIs and it will end up with error. >> >> What I think CheckValidResultRel() should do is, if >> resultRelInfo->ri_FdwPushdown is true then check for the DMLPushdown API >> and if it doesn't find those API then check for traditional APIs >> (ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete). And when it >> doesn't find both then it should return an error. >> >> I changed CheckValidResultRel(), where >> >> 1) Don't throw an error if resultRelInfo->ri_FdwPushdown is true and >> DMLPushdown APIs are missing as query can still perform operation with >> traditional ExecForeign APIs. So in this situation just marked >> resultRelInfo->ri_FdwPushdown to false. >> >> (Wondering can we add the checks for DMLPushdown APIs into >> PlanDMLPushdown as additional check? Means PlanDMLPushdown should return >> true only if FDW provides the BeginDMLPushdown & IterateDMLPushdown & >> EndDMLPushdown APIs ? What you say ?) >> >> > On another thought, we should not give responsibility to check for the > APIs to the FDW. So may be we should call PlanDMLPushdown only if > BeginDMLPushdown & IterateDMLPushdown & EndDMLPushdown APIs are present > into FDW. That means prepare DMLPushdown plan only when all the required > APIs are available with FDW. This will also reduce the changes into > CheckValidResultRel(). > > Thanks Ashutosh Bapat for healthy discussion. > > PFA patch. > > > >> 2) Don't throw an error if resultRelInfo->ri_FdwPushdown is true and >> DMLPushdown APIs is present but ExecForeign APIs are missing. >> 3) Throw an error if resultRelInfo->ri_FdwPushdown is false and >> ExecForeign APIs are missing. >> >> Attaching is the WIP patch here, do share your thought. >> (need to apply on top of V6 patch) >> >> >> So, I left that function as-is. relation_is_updatabale() is just used >>> for display in the information_schema views, so ISTM that that function is >>> fine as-is. (As for CheckValidResultRel(), I revised it
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Fri, Feb 5, 2016 at 4:46 PM, Rushabh Lathia wrote: > > > On Wed, Feb 3, 2016 at 3:31 PM, Etsuro Fujita > wrote: > >> On 2016/01/28 15:20, Rushabh Lathia wrote: >> >>> On Thu, Jan 28, 2016 at 11:33 AM, Etsuro Fujita >>> mailto:fujita.ets...@lab.ntt.co.jp>> >>> wrote: >>> >>> On 2016/01/27 21:23, Rushabh Lathia wrote: >>> >>> If I understood correctly, above documentation means, that if >>> FDW have >>> DMLPushdown APIs that is enough. But in reality thats not the >>> case, we >>> need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete >>> in case >>> DML is not pushable. >>> >>> And here fact is DMLPushdown APIs are optional for FDW, so that >>> if FDW >>> don't have DMLPushdown APIs they can still very well perform the >>> DML >>> operations using ExecForeignInsert, ExecForeignUpdate, or >>> ExecForeignDelete. >>> >> >> So documentation should be like: >>> >>> If the IsForeignRelUpdatable pointer is set to NULL, foreign >>> tables are >>> assumed to be insertable, updatable, or deletable if the FDW >>> provides >>> ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete >>> respectively, >>> >>> If FDW provides DMLPushdown APIs and the DML are pushable to the >>> foreign >>> server, then FDW still needs ExecForeignInsert, >>> ExecForeignUpdate, or >>> ExecForeignDelete for the non-pushable DML operation. >>> >>> What's your opinion ? >>> >> >> I agree that we should add this to the documentation, too. >>> >> >> I added docs to the IsForeignRelUpdatable documentation. Also, a brief >> introductory remark has been added at the beginning of the DML pushdown >> APIs' documentation. >> >> BTW, if I understand correctly, I think we should also modify >>> relation_is_updatabale() accordingly. Am I right? >>> >> >> Yep, we need to modify relation_is_updatable(). >>> >> >> I thought I'd modify that function in the same way as >> CheckValidResultRel(), but I noticed that we cannot do that, because we >> don't have any information on whether each update is pushed down to the >> remote server by PlanDMLPushdown, during relation_is_updatabale(). > > > Sorry I didn't get you here. Can't resultRelInfo->ri_FdwPushdown gives > information update whether update is pushed down safe or not ? What my > concern here is, lets say resultRelInfo->ri_FdwPushdown marked as true > (PlanDMLPushdown return true), but later into CheckValidResultRel() it > found out that missing BeginDMLPushdown, IterateDMLPushdown and > EndDMLPushdown APIs and it will end up with error. > > What I think CheckValidResultRel() should do is, if > resultRelInfo->ri_FdwPushdown is true then check for the DMLPushdown API > and if it doesn't find those API then check for traditional APIs > (ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete). And when it > doesn't find both then it should return an error. > > I changed CheckValidResultRel(), where > > 1) Don't throw an error if resultRelInfo->ri_FdwPushdown is true and > DMLPushdown APIs are missing as query can still perform operation with > traditional ExecForeign APIs. So in this situation just marked > resultRelInfo->ri_FdwPushdown to false. > > (Wondering can we add the checks for DMLPushdown APIs into PlanDMLPushdown > as additional check? Means PlanDMLPushdown should return true only if FDW > provides the BeginDMLPushdown & IterateDMLPushdown & EndDMLPushdown APIs ? > What you say ?) > > On another thought, we should not give responsibility to check for the APIs to the FDW. So may be we should call PlanDMLPushdown only if BeginDMLPushdown & IterateDMLPushdown & EndDMLPushdown APIs are present into FDW. That means prepare DMLPushdown plan only when all the required APIs are available with FDW. This will also reduce the changes into CheckValidResultRel(). Thanks Ashutosh Bapat for healthy discussion. PFA patch. > 2) Don't throw an error if resultRelInfo->ri_FdwPushdown is true and > DMLPushdown APIs is present but ExecForeign APIs are missing. > 3) Throw an error if resultRelInfo->ri_FdwPushdown is false and > ExecForeign APIs are missing. > > Attaching is the WIP patch here, do share your thought. > (need to apply on top of V6 patch) > > > So, I left that function as-is. relation_is_updatabale() is just used for >> display in the information_schema views, so ISTM that that function is fine >> as-is. (As for CheckValidResultRel(), I revised it so as to check the >> presence of DML pushdown APIs after checking the existing APIs if the given >> command will be pushed down. The reason is because we assume the presence >> of the existing APIs, anyway.) >> >> > I revised other docs and some comments, mostly for consistency. >> >> Attached is an updated version of the patch, which has been created on >> top of the updated version of the bugfix patch posted by Rober
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Feb 3, 2016 at 3:31 PM, Etsuro Fujita wrote: > On 2016/01/28 15:20, Rushabh Lathia wrote: > >> On Thu, Jan 28, 2016 at 11:33 AM, Etsuro Fujita >> mailto:fujita.ets...@lab.ntt.co.jp>> wrote: >> >> On 2016/01/27 21:23, Rushabh Lathia wrote: >> >> If I understood correctly, above documentation means, that if >> FDW have >> DMLPushdown APIs that is enough. But in reality thats not the >> case, we >> need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete >> in case >> DML is not pushable. >> >> And here fact is DMLPushdown APIs are optional for FDW, so that >> if FDW >> don't have DMLPushdown APIs they can still very well perform the >> DML >> operations using ExecForeignInsert, ExecForeignUpdate, or >> ExecForeignDelete. >> > > So documentation should be like: >> >> If the IsForeignRelUpdatable pointer is set to NULL, foreign >> tables are >> assumed to be insertable, updatable, or deletable if the FDW >> provides >> ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete >> respectively, >> >> If FDW provides DMLPushdown APIs and the DML are pushable to the >> foreign >> server, then FDW still needs ExecForeignInsert, >> ExecForeignUpdate, or >> ExecForeignDelete for the non-pushable DML operation. >> >> What's your opinion ? >> > > I agree that we should add this to the documentation, too. >> > > I added docs to the IsForeignRelUpdatable documentation. Also, a brief > introductory remark has been added at the beginning of the DML pushdown > APIs' documentation. > > BTW, if I understand correctly, I think we should also modify >> relation_is_updatabale() accordingly. Am I right? >> > > Yep, we need to modify relation_is_updatable(). >> > > I thought I'd modify that function in the same way as > CheckValidResultRel(), but I noticed that we cannot do that, because we > don't have any information on whether each update is pushed down to the > remote server by PlanDMLPushdown, during relation_is_updatabale(). Sorry I didn't get you here. Can't resultRelInfo->ri_FdwPushdown gives information update whether update is pushed down safe or not ? What my concern here is, lets say resultRelInfo->ri_FdwPushdown marked as true (PlanDMLPushdown return true), but later into CheckValidResultRel() it found out that missing BeginDMLPushdown, IterateDMLPushdown and EndDMLPushdown APIs and it will end up with error. What I think CheckValidResultRel() should do is, if resultRelInfo->ri_FdwPushdown is true then check for the DMLPushdown API and if it doesn't find those API then check for traditional APIs (ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete). And when it doesn't find both then it should return an error. I changed CheckValidResultRel(), where 1) Don't throw an error if resultRelInfo->ri_FdwPushdown is true and DMLPushdown APIs are missing as query can still perform operation with traditional ExecForeign APIs. So in this situation just marked resultRelInfo->ri_FdwPushdown to false. (Wondering can we add the checks for DMLPushdown APIs into PlanDMLPushdown as additional check? Means PlanDMLPushdown should return true only if FDW provides the BeginDMLPushdown & IterateDMLPushdown & EndDMLPushdown APIs ? What you say ?) 2) Don't throw an error if resultRelInfo->ri_FdwPushdown is true and DMLPushdown APIs is present but ExecForeign APIs are missing. 3) Throw an error if resultRelInfo->ri_FdwPushdown is false and ExecForeign APIs are missing. Attaching is the WIP patch here, do share your thought. (need to apply on top of V6 patch) So, I left that function as-is. relation_is_updatabale() is just used for > display in the information_schema views, so ISTM that that function is fine > as-is. (As for CheckValidResultRel(), I revised it so as to check the > presence of DML pushdown APIs after checking the existing APIs if the given > command will be pushed down. The reason is because we assume the presence > of the existing APIs, anyway.) > > I revised other docs and some comments, mostly for consistency. > > Attached is an updated version of the patch, which has been created on top > of the updated version of the bugfix patch posted by Robert in [1] > (attached). > > > Best regards, > Etsuro Fujita > > [1] > http://www.postgresql.org/message-id/ca+tgmoz40j2uc5ac1nxu03oj4crvolks15xx+ptfp-1u-8z...@mail.gmail.com > -- Rushabh Lathia diff --git a/src/backend/executor/.execMain.c.swp b/src/backend/executor/.execMain.c.swp index 6f54fd9..8a29cc6 100644 Binary files a/src/backend/executor/.execMain.c.swp and b/src/backend/executor/.execMain.c.swp differ diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index eb24051..04d90ea 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -1082,86 +1082,88 @@ CheckValidResu
Re: Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)
On 2016/02/05 12:28, Robert Haas wrote: On Thu, Jan 28, 2016 at 7:36 AM, Etsuro Fujita wrote: Attached is that version of the patch. I think that postgres_fdw might be able to insert a tableoid value in the returned slot in e.g., postgresExecForeignInsert if AFTER ROW Triggers or RETURNING expressions reference that value, but I didn't do anything about that. Thanks. I went with the earlier version, but split it into two commits. Thank you! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/02/04 19:44, Rushabh Lathia wrote: I just started reviewing this and realized that patch is not getting applied cleanly on latest source, it having some conflicts. Can you please upload the correct version of patch. I rebased the patch to the latest HEAD. Attached is a rebased version of the patch. You don't need to apply the patch fdw-foreign-modify-rmh-v2.patch attached before. Thanks for the review! Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 1069,1074 deparseUpdateSql(StringInfo buf, PlannerInfo *root, --- 1069,1134 } /* + * deparse remote UPDATE statement + * + * The statement text is appended to buf, and we also create an integer List + * of the columns being retrieved by RETURNING (if any), which is returned + * to *retrieved_attrs. + */ + void + deparsePushedDownUpdateSql(StringInfo buf, PlannerInfo *root, + Index rtindex, Relation rel, + List *targetlist, + List *targetAttrs, + List *remote_conds, + List **params_list, + List *returningList, + List **retrieved_attrs) + { + RelOptInfo *baserel = root->simple_rel_array[rtindex]; + deparse_expr_cxt context; + int nestlevel; + bool first; + ListCell *lc; + + /* Set up context struct for recursion */ + context.root = root; + context.foreignrel = baserel; + context.buf = buf; + context.params_list = params_list; + + appendStringInfoString(buf, "UPDATE "); + deparseRelation(buf, rel); + appendStringInfoString(buf, " SET "); + + /* Make sure any constants in the exprs are printed portably */ + nestlevel = set_transmission_modes(); + + first = true; + foreach(lc, targetAttrs) + { + int attnum = lfirst_int(lc); + TargetEntry *tle = get_tle_by_resno(targetlist, attnum); + + if (!first) + appendStringInfoString(buf, ", "); + first = false; + + deparseColumnRef(buf, rtindex, attnum, root); + appendStringInfoString(buf, " = "); + deparseExpr((Expr *) tle->expr, &context); + } + + reset_transmission_modes(nestlevel); + + if (remote_conds) + appendWhereClause(remote_conds, &context); + + deparseReturningList(buf, root, rtindex, rel, false, + returningList, retrieved_attrs); + } + + /* * deparse remote DELETE statement * * The statement text is appended to buf, and we also create an integer List *** *** 1091,1096 deparseDeleteSql(StringInfo buf, PlannerInfo *root, --- 1151,1190 } /* + * deparse remote DELETE statement + * + * The statement text is appended to buf, and we also create an integer List + * of the columns being retrieved by RETURNING (if any), which is returned + * to *retrieved_attrs. + */ + void + deparsePushedDownDeleteSql(StringInfo buf, PlannerInfo *root, + Index rtindex, Relation rel, + List *remote_conds, + List **params_list, + List *returningList, + List **retrieved_attrs) + { + RelOptInfo *baserel = root->simple_rel_array[rtindex]; + deparse_expr_cxt context; + + /* Set up context struct for recursion */ + context.root = root; + context.foreignrel = baserel; + context.buf = buf; + context.params_list = params_list; + + appendStringInfoString(buf, "DELETE FROM "); + deparseRelation(buf, rel); + + if (remote_conds) + appendWhereClause(remote_conds, &context); + + deparseReturningList(buf, root, rtindex, rel, false, + returningList, retrieved_attrs); + } + + /* * Add a RETURNING clause, if needed, to an INSERT/UPDATE/DELETE. */ static void *** a/contrib/postgres_fdw/expected/postgres_fdw.out --- b/contrib/postgres_fdw/expected/postgres_fdw.out *** *** 1314,1320 INSERT INTO ft2 (c1,c2,c3) --- 1314,1339 (3 rows) INSERT INTO ft2 (c1,c2,c3) VALUES (1104,204,'ddd'), (1105,205,'eee'); + EXPLAIN (verbose, costs off) + UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3; -- can be pushed down + QUERY PLAN + -- + Update on public.ft2 +-> Foreign Update on public.ft2 + Remote SQL: UPDATE "S 1"."T 1" SET c2 = (c2 + 300), c3 = (c3 || '_update3'::text) WHERE ((("C 1" % 10) = 3)) + (3 rows) + UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3; + EXPLAIN (verbose, costs off) + UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING *; -- can be pushed down + QUERY PLAN +
Re: Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)
On Thu, Jan 28, 2016 at 7:36 AM, Etsuro Fujita wrote: > Attached is that version of the patch. > > I think that postgres_fdw might be able to insert a tableoid value in the > returned slot in e.g., postgresExecForeignInsert if AFTER ROW Triggers or > RETURNING expressions reference that value, but I didn't do anything about > that. Thanks. I went with the earlier version, but split it into two commits. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Feb 3, 2016 at 3:31 PM, Etsuro Fujita wrote: > On 2016/01/28 15:20, Rushabh Lathia wrote: > >> On Thu, Jan 28, 2016 at 11:33 AM, Etsuro Fujita >> mailto:fujita.ets...@lab.ntt.co.jp>> wrote: >> >> On 2016/01/27 21:23, Rushabh Lathia wrote: >> >> If I understood correctly, above documentation means, that if >> FDW have >> DMLPushdown APIs that is enough. But in reality thats not the >> case, we >> need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete >> in case >> DML is not pushable. >> >> And here fact is DMLPushdown APIs are optional for FDW, so that >> if FDW >> don't have DMLPushdown APIs they can still very well perform the >> DML >> operations using ExecForeignInsert, ExecForeignUpdate, or >> ExecForeignDelete. >> > > So documentation should be like: >> >> If the IsForeignRelUpdatable pointer is set to NULL, foreign >> tables are >> assumed to be insertable, updatable, or deletable if the FDW >> provides >> ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete >> respectively, >> >> If FDW provides DMLPushdown APIs and the DML are pushable to the >> foreign >> server, then FDW still needs ExecForeignInsert, >> ExecForeignUpdate, or >> ExecForeignDelete for the non-pushable DML operation. >> >> What's your opinion ? >> > > I agree that we should add this to the documentation, too. >> > > I added docs to the IsForeignRelUpdatable documentation. Also, a brief > introductory remark has been added at the beginning of the DML pushdown > APIs' documentation. > > BTW, if I understand correctly, I think we should also modify >> relation_is_updatabale() accordingly. Am I right? >> > > Yep, we need to modify relation_is_updatable(). >> > > I thought I'd modify that function in the same way as > CheckValidResultRel(), but I noticed that we cannot do that, because we > don't have any information on whether each update is pushed down to the > remote server by PlanDMLPushdown, during relation_is_updatabale(). So, I > left that function as-is. relation_is_updatabale() is just used for > display in the information_schema views, so ISTM that that function is fine > as-is. (As for CheckValidResultRel(), I revised it so as to check the > presence of DML pushdown APIs after checking the existing APIs if the given > command will be pushed down. The reason is because we assume the presence > of the existing APIs, anyway.) > > I revised other docs and some comments, mostly for consistency. > > I just started reviewing this and realized that patch is not getting applied cleanly on latest source, it having some conflicts. Can you please upload the correct version of patch. Attached is an updated version of the patch, which has been created on top > of the updated version of the bugfix patch posted by Robert in [1] > (attached). > > > Best regards, > Etsuro Fujita > > [1] > http://www.postgresql.org/message-id/ca+tgmoz40j2uc5ac1nxu03oj4crvolks15xx+ptfp-1u-8z...@mail.gmail.com > -- Rushabh Lathia
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/01/28 15:20, Rushabh Lathia wrote: On Thu, Jan 28, 2016 at 11:33 AM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: On 2016/01/27 21:23, Rushabh Lathia wrote: If I understood correctly, above documentation means, that if FDW have DMLPushdown APIs that is enough. But in reality thats not the case, we need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete in case DML is not pushable. And here fact is DMLPushdown APIs are optional for FDW, so that if FDW don't have DMLPushdown APIs they can still very well perform the DML operations using ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. So documentation should be like: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, If FDW provides DMLPushdown APIs and the DML are pushable to the foreign server, then FDW still needs ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete for the non-pushable DML operation. What's your opinion ? I agree that we should add this to the documentation, too. I added docs to the IsForeignRelUpdatable documentation. Also, a brief introductory remark has been added at the beginning of the DML pushdown APIs' documentation. BTW, if I understand correctly, I think we should also modify relation_is_updatabale() accordingly. Am I right? Yep, we need to modify relation_is_updatable(). I thought I'd modify that function in the same way as CheckValidResultRel(), but I noticed that we cannot do that, because we don't have any information on whether each update is pushed down to the remote server by PlanDMLPushdown, during relation_is_updatabale(). So, I left that function as-is. relation_is_updatabale() is just used for display in the information_schema views, so ISTM that that function is fine as-is. (As for CheckValidResultRel(), I revised it so as to check the presence of DML pushdown APIs after checking the existing APIs if the given command will be pushed down. The reason is because we assume the presence of the existing APIs, anyway.) I revised other docs and some comments, mostly for consistency. Attached is an updated version of the patch, which has been created on top of the updated version of the bugfix patch posted by Robert in [1] (attached). Best regards, Etsuro Fujita [1] http://www.postgresql.org/message-id/ca+tgmoz40j2uc5ac1nxu03oj4crvolks15xx+ptfp-1u-8z...@mail.gmail.com diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index df3d1ee..d778e61 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -112,6 +112,7 @@ static void deparseTargetList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, + bool is_returning, Bitmapset *attrs_used, List **retrieved_attrs); static void deparseReturningList(StringInfo buf, PlannerInfo *root, @@ -776,7 +777,7 @@ deparseSelectSql(Bitmapset *attrs_used, List **retrieved_attrs, * Construct SELECT list */ appendStringInfoString(buf, "SELECT "); - deparseTargetList(buf, root, foreignrel->relid, rel, attrs_used, + deparseTargetList(buf, root, foreignrel->relid, rel, false, attrs_used, retrieved_attrs); /* @@ -790,7 +791,8 @@ deparseSelectSql(Bitmapset *attrs_used, List **retrieved_attrs, /* * Emit a target list that retrieves the columns specified in attrs_used. - * This is used for both SELECT and RETURNING targetlists. + * This is used for both SELECT and RETURNING targetlists; the is_returning + * parameter is true only for a RETURNING targetlist. * * The tlist text is appended to buf, and we also create an integer List * of the columns being retrieved, which is returned to *retrieved_attrs. @@ -800,6 +802,7 @@ deparseTargetList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, + bool is_returning, Bitmapset *attrs_used, List **retrieved_attrs) { @@ -829,6 +832,8 @@ deparseTargetList(StringInfo buf, { if (!first) appendStringInfoString(buf, ", "); + else if (is_returning) +appendStringInfoString(buf, " RETURNING "); first = false; deparseColumnRef(buf, rtindex, i, root); @@ -846,6 +851,8 @@ deparseTargetList(StringInfo buf, { if (!first) appendStringInfoString(buf, ", "); + else if (is_returning) + appendStringInfoString(buf, " RETURNING "); first = false; appendStringInfoString(buf, "ctid"); @@ -855,7 +862,7 @@ deparseTargetList(StringInfo buf, } /* Don't generate bad syntax if no undropped columns */ - if (first) + if (first && !is_returning) appendStringInfoString(buf, "NULL");
Re: Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)
On 2016/01/28 12:58, Robert Haas wrote: On Thu, Jan 21, 2016 at 4:05 AM, Etsuro Fujita wrote: By the way, I'm not too sure I understand the need for the core changes that are part of this patch, and I think that point merits some discussion. Whenever you change core like this, you're changing the contract between the FDW and core; it's not just postgres_fdw that needs updating, but every FDW. So we'd better be pretty sure we need these changes and they are adequately justified before we think about putting them into the tree. Are these core changes really needed here, or can we fix this whole issue in postgres_fdw and leave the core code alone? Well, if we think it is the FDW's responsibility to insert a valid value for tableoid in the returned slot during ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete, we don't need those core changes. However, I think it would be better that that's done by ModifyTable in the same way as ForeignScan does in ForeignNext, IMO. That eliminates the need for postgres_fdw or any other FDW to do that business in the callback routines. I'm not necessarily opposed to the core changes, but I want to understand better what complexity they are avoiding. Can you send a version of this patch that only touches postgres_fdw, so I can compare? Attached is that version of the patch. I think that postgres_fdw might be able to insert a tableoid value in the returned slot in e.g., postgresExecForeignInsert if AFTER ROW Triggers or RETURNING expressions reference that value, but I didn't do anything about that. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 110,115 static void deparseTargetList(StringInfo buf, --- 110,116 PlannerInfo *root, Index rtindex, Relation rel, + bool is_returning, Bitmapset *attrs_used, List **retrieved_attrs); static void deparseReturningList(StringInfo buf, PlannerInfo *root, *** *** 724,730 deparseSelectSql(StringInfo buf, * Construct SELECT list */ appendStringInfoString(buf, "SELECT "); ! deparseTargetList(buf, root, baserel->relid, rel, attrs_used, retrieved_attrs); /* --- 725,731 * Construct SELECT list */ appendStringInfoString(buf, "SELECT "); ! deparseTargetList(buf, root, baserel->relid, rel, false, attrs_used, retrieved_attrs); /* *** *** 738,744 deparseSelectSql(StringInfo buf, /* * Emit a target list that retrieves the columns specified in attrs_used. ! * This is used for both SELECT and RETURNING targetlists. * * The tlist text is appended to buf, and we also create an integer List * of the columns being retrieved, which is returned to *retrieved_attrs. --- 739,746 /* * Emit a target list that retrieves the columns specified in attrs_used. ! * This is used for both SELECT and RETURNING targetlists; the is_returning ! * parameter is true only for a RETURNING targetlist. * * The tlist text is appended to buf, and we also create an integer List * of the columns being retrieved, which is returned to *retrieved_attrs. *** *** 748,753 deparseTargetList(StringInfo buf, --- 750,756 PlannerInfo *root, Index rtindex, Relation rel, + bool is_returning, Bitmapset *attrs_used, List **retrieved_attrs) { *** *** 777,782 deparseTargetList(StringInfo buf, --- 780,787 { if (!first) appendStringInfoString(buf, ", "); + else if (is_returning) + appendStringInfoString(buf, " RETURNING "); first = false; deparseColumnRef(buf, rtindex, i, root); *** *** 794,799 deparseTargetList(StringInfo buf, --- 799,806 { if (!first) appendStringInfoString(buf, ", "); + else if (is_returning) + appendStringInfoString(buf, " RETURNING "); first = false; appendStringInfoString(buf, "ctid"); *** *** 803,809 deparseTargetList(StringInfo buf, } /* Don't generate bad syntax if no undropped columns */ ! if (first) appendStringInfoString(buf, "NULL"); } --- 810,816 } /* Don't generate bad syntax if no undropped columns */ ! if (first && !is_returning) appendStringInfoString(buf, "NULL"); } *** *** 1022,1032 deparseReturningList(StringInfo buf, PlannerInfo *root, } if (attrs_used != NULL) ! { ! appendStringInfoString(buf, " RETURNING "); ! deparseTargetList(buf, root, rtindex, rel, attrs_used, retrieved_attrs); - } else *retrieved_attrs = NIL; } --- 1029,1036 } if (attrs_used != NULL) ! deparseTargetList(buf, root, rtindex, rel, true, attrs_used, retrieved_attrs); else *retrieved_attrs = NIL; } *** a/contrib/postgres_fdw/expected/postgres_fdw.out --- b/contrib/postgres_fdw/expected/pos
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/01/28 15:20, Rushabh Lathia wrote: On Thu, Jan 28, 2016 at 11:33 AM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: On 2016/01/27 21:23, Rushabh Lathia wrote: If I understood correctly, above documentation means, that if FDW have DMLPushdown APIs that is enough. But in reality thats not the case, we need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete in case DML is not pushable. And here fact is DMLPushdown APIs are optional for FDW, so that if FDW don't have DMLPushdown APIs they can still very well perform the DML operations using ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. I agree with you. I guess I was wrong. sorry. So documentation should be like: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, If FDW provides DMLPushdown APIs and the DML are pushable to the foreign server, then FDW still needs ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete for the non-pushable DML operation. What's your opinion ? I agree that we should add this to the documentation, too. BTW, if I understand correctly, I think we should also modify relation_is_updatabale() accordingly. Am I right? Yep, we need to modify relation_is_updatable(). OK, will do. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Jan 28, 2016 at 11:33 AM, Etsuro Fujita wrote: > On 2016/01/27 21:23, Rushabh Lathia wrote: > >> If I understood correctly, above documentation means, that if FDW have >> DMLPushdown APIs that is enough. But in reality thats not the case, we >> need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete in case >> DML is not pushable. >> >> And here fact is DMLPushdown APIs are optional for FDW, so that if FDW >> don't have DMLPushdown APIs they can still very well perform the DML >> operations using ExecForeignInsert, ExecForeignUpdate, or >> ExecForeignDelete. >> > > I agree with you. I guess I was wrong. sorry. > > So documentation should be like: >> >> If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are >> assumed to be insertable, updatable, or deletable if the FDW provides >> ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, >> >> If FDW provides DMLPushdown APIs and the DML are pushable to the foreign >> server, then FDW still needs ExecForeignInsert, ExecForeignUpdate, or >> ExecForeignDelete for the non-pushable DML operation. >> >> What's your opinion ? >> > > I agree that we should add this to the documentation, too. > > BTW, if I understand correctly, I think we should also modify > relation_is_updatabale() accordingly. Am I right? > Yep, we need to modify relation_is_updatable(). > > Best regards, > Etsuro Fujita > > > -- Rushabh Lathia
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/01/27 21:23, Rushabh Lathia wrote: If I understood correctly, above documentation means, that if FDW have DMLPushdown APIs that is enough. But in reality thats not the case, we need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete in case DML is not pushable. And here fact is DMLPushdown APIs are optional for FDW, so that if FDW don't have DMLPushdown APIs they can still very well perform the DML operations using ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. I agree with you. I guess I was wrong. sorry. So documentation should be like: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, If FDW provides DMLPushdown APIs and the DML are pushable to the foreign server, then FDW still needs ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete for the non-pushable DML operation. What's your opinion ? I agree that we should add this to the documentation, too. BTW, if I understand correctly, I think we should also modify relation_is_updatabale() accordingly. Am I right? Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)
On Thu, Jan 21, 2016 at 4:05 AM, Etsuro Fujita wrote: >> By the way, I'm not too sure I understand the need for the core >> changes that are part of this patch, and I think that point merits >> some discussion. Whenever you change core like this, you're changing >> the contract between the FDW and core; it's not just postgres_fdw that >> needs updating, but every FDW. So we'd better be pretty sure we need >> these changes and they are adequately justified before we think about >> putting them into the tree. Are these core changes really needed >> here, or can we fix this whole issue in postgres_fdw and leave the >> core code alone? > > Well, if we think it is the FDW's responsibility to insert a valid value for > tableoid in the returned slot during ExecForeignInsert, ExecForeignUpdate or > ExecForeignDelete, we don't need those core changes. However, I think it > would be better that that's done by ModifyTable in the same way as > ForeignScan does in ForeignNext, IMO. That eliminates the need for > postgres_fdw or any other FDW to do that business in the callback routines. I'm not necessarily opposed to the core changes, but I want to understand better what complexity they are avoiding. Can you send a version of this patch that only touches postgres_fdw, so I can compare? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Wed, Jan 27, 2016 at 2:50 PM, Etsuro Fujita wrote: > On 2016/01/27 12:20, Etsuro Fujita wrote: > >> On 2016/01/26 22:57, Rushabh Lathia wrote: >> >>> On Tue, Jan 26, 2016 at 4:15 PM, Etsuro Fujita >>> mailto:fujita.ets...@lab.ntt.co.jp>> >>> wrote: >>> >>> On 2016/01/25 17:03, Rushabh Lathia wrote: >>> >> > int >>> IsForeignRelUpdatable (Relation rel); >>> >> > Documentation for IsForeignUpdatable() need to change as it says: >>> >>> If the IsForeignRelUpdatable pointer is set to NULL, foreign >>> tables are >>> assumed >>> to be insertable, updatable, or deletable if the FDW provides >>> ExecForeignInsert, >>> ExecForeignUpdate or ExecForeignDelete respectively. >>> >>> With introduce of DMLPushdown API now this is no more correct, >>> as even if >>> FDW don't provide ExecForeignInsert, ExecForeignUpdate or >>> ExecForeignDelete API >>> still foreign tables are assumed to be updatable or deletable >>> with >>> DMLPushdown >>> API's, right ? >>> >> > That's what I'd like to discuss. >>> >>> I intentionally leave that as-is, because I think we should >>> determine the updatability of a foreign table in the current >>> manner. As you pointed out, even if the FDW doesn't provide eg, >>> ExecForeignUpdate, an UPDATE on a foreign table could be done using >>> the DML pushdown APIs if the UPDATE is *pushdown-safe*. However, >>> since all UPDATEs on the foreign table are not necessarily >>> pushdown-safe, I'm not sure it's a good idea to assume the >>> table-level updatability if the FDW provides the DML pushdown >>> callback routines. To keep the existing updatability decision, I >>> think the FDW should provide the DML pushdown callback routines >>> together with ExecForeignInsert, ExecForeignUpdate, or >>> ExecForeignDelete. What do you think about that? >>> >> > Sorry but I am not in favour of adding compulsion that FDW should provide >>> the DML pushdown callback routines together with existing >>> ExecForeignInsert, >>> ExecForeignUpdate or ExecForeignDelete APIs. >>> >>> May be we should change the documentation in such way, that explains >>> >>> a) If FDW PlanDMLPushdown is NULL, then check for ExecForeignInsert, >>> ExecForeignUpdate or ExecForeignDelete APIs >>> b) If FDW PlanDMLPushdown is non-NULL and plan is not pushable >>> check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs >>> c) If FDW PlanDMLPushdown is non-NULL and plan is pushable >>> check for DMLPushdown APIs. >>> >>> Does this sounds wired ? >>> >> > Yeah, but I think that that would be what is done during executor >> startup (see CheckValidResultRel()), while what the documentation is >> saying is about relation_is_updatable(); that is, how to decide the >> updatability of a given foreign table, not how the executor processes an >> individual INSERT/UPDATE/DELETE on a updatable foreign table. So, I'm >> not sure it's a good idea to modify the documentation in such a way. >> > > However, I agree that we should add a documentation note about the >> compulsion somewhere. Maybe something like this: >> >> The FDW should provide DML pushdown callback routines together with >> table-updating callback routines described above. Even if the callback >> routines are provided, the updatability of a foreign table is determined >> based on the presence of ExecForeignInsert, ExecForeignUpdate or >> ExecForeignDelete if the IsForeignRelUpdatable pointer is set to NULL. >> > > On second thought, I think it might be okay to assume the presence of > PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown, and EndDMLPushdown > is also sufficient for the insertablity, updatability, and deletability of > a foreign table, if the IsForeignRelUpdatable pointer is set to NULL. How > about modifying the documentation like this: > > If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are > assumed to be insertable, updatable, or deletable if the FDW provides > ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, or > if the FDW provides PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown, > and EndDMLPushdown described below. > > Of course, we also need to modify relation_is_updatable() accordingly. > > > What's your opinion? > > If I understood correctly, above documentation means, that if FDW have DMLPushdown APIs that is enough. But in reality thats not the case, we need ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete in case DML is not pushable. And here fact is DMLPushdown APIs are optional for FDW, so that if FDW don't have DMLPushdown APIs they can still very well perform the DML operations using ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. So documentation should be like: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/01/27 12:20, Etsuro Fujita wrote: On 2016/01/26 22:57, Rushabh Lathia wrote: On Tue, Jan 26, 2016 at 4:15 PM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: On 2016/01/25 17:03, Rushabh Lathia wrote: int IsForeignRelUpdatable (Relation rel); Documentation for IsForeignUpdatable() need to change as it says: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete respectively. With introduce of DMLPushdown API now this is no more correct, as even if FDW don't provide ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete API still foreign tables are assumed to be updatable or deletable with DMLPushdown API's, right ? That's what I'd like to discuss. I intentionally leave that as-is, because I think we should determine the updatability of a foreign table in the current manner. As you pointed out, even if the FDW doesn't provide eg, ExecForeignUpdate, an UPDATE on a foreign table could be done using the DML pushdown APIs if the UPDATE is *pushdown-safe*. However, since all UPDATEs on the foreign table are not necessarily pushdown-safe, I'm not sure it's a good idea to assume the table-level updatability if the FDW provides the DML pushdown callback routines. To keep the existing updatability decision, I think the FDW should provide the DML pushdown callback routines together with ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. What do you think about that? Sorry but I am not in favour of adding compulsion that FDW should provide the DML pushdown callback routines together with existing ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs. May be we should change the documentation in such way, that explains a) If FDW PlanDMLPushdown is NULL, then check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs b) If FDW PlanDMLPushdown is non-NULL and plan is not pushable check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs c) If FDW PlanDMLPushdown is non-NULL and plan is pushable check for DMLPushdown APIs. Does this sounds wired ? Yeah, but I think that that would be what is done during executor startup (see CheckValidResultRel()), while what the documentation is saying is about relation_is_updatable(); that is, how to decide the updatability of a given foreign table, not how the executor processes an individual INSERT/UPDATE/DELETE on a updatable foreign table. So, I'm not sure it's a good idea to modify the documentation in such a way. However, I agree that we should add a documentation note about the compulsion somewhere. Maybe something like this: The FDW should provide DML pushdown callback routines together with table-updating callback routines described above. Even if the callback routines are provided, the updatability of a foreign table is determined based on the presence of ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete if the IsForeignRelUpdatable pointer is set to NULL. On second thought, I think it might be okay to assume the presence of PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown, and EndDMLPushdown is also sufficient for the insertablity, updatability, and deletability of a foreign table, if the IsForeignRelUpdatable pointer is set to NULL. How about modifying the documentation like this: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, or if the FDW provides PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown, and EndDMLPushdown described below. Of course, we also need to modify relation_is_updatable() accordingly. What's your opinion? Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/01/26 22:57, Rushabh Lathia wrote: On Tue, Jan 26, 2016 at 4:15 PM, Etsuro Fujita mailto:fujita.ets...@lab.ntt.co.jp>> wrote: On 2016/01/25 17:03, Rushabh Lathia wrote: int IsForeignRelUpdatable (Relation rel); Documentation for IsForeignUpdatable() need to change as it says: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete respectively. With introduce of DMLPushdown API now this is no more correct, as even if FDW don't provide ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete API still foreign tables are assumed to be updatable or deletable with DMLPushdown API's, right ? That's what I'd like to discuss. I intentionally leave that as-is, because I think we should determine the updatability of a foreign table in the current manner. As you pointed out, even if the FDW doesn't provide eg, ExecForeignUpdate, an UPDATE on a foreign table could be done using the DML pushdown APIs if the UPDATE is *pushdown-safe*. However, since all UPDATEs on the foreign table are not necessarily pushdown-safe, I'm not sure it's a good idea to assume the table-level updatability if the FDW provides the DML pushdown callback routines. To keep the existing updatability decision, I think the FDW should provide the DML pushdown callback routines together with ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. What do you think about that? Sorry but I am not in favour of adding compulsion that FDW should provide the DML pushdown callback routines together with existing ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs. May be we should change the documentation in such way, that explains a) If FDW PlanDMLPushdown is NULL, then check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs b) If FDW PlanDMLPushdown is non-NULL and plan is not pushable check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs c) If FDW PlanDMLPushdown is non-NULL and plan is pushable check for DMLPushdown APIs. Does this sounds wired ? Yeah, but I think that that would be what is done during executor startup (see CheckValidResultRel()), while what the documentation is saying is about relation_is_updatable(); that is, how to decide the updatability of a given foreign table, not how the executor processes an individual INSERT/UPDATE/DELETE on a updatable foreign table. So, I'm not sure it's a good idea to modify the documentation in such a way. BTW, I have added the description about that check partially. I added to the PlanDMLPushdown documentation: + If this function succeeds, PlanForeignModify + won't be executed, and BeginDMLPushdown, + IterateDMLPushdown and EndDMLPushdown will + be called at the execution stage, instead. And for example, I added to the BeginDMLPushdown documentation: + If the BeginDMLPushdown pointer is set to + NULL, attempts to execute the table update directly on + the remote server will fail with an error message. However, I agree that we should add a documentation note about the compulsion somewhere. Maybe something like this: The FDW should provide DML pushdown callback routines together with table-updating callback routines described above. Even if the callback routines are provided, the updatability of a foreign table is determined based on the presence of ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete if the IsForeignRelUpdatable pointer is set to NULL. What's your opinion? Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Tue, Jan 26, 2016 at 4:15 PM, Etsuro Fujita wrote: > On 2016/01/25 17:03, Rushabh Lathia wrote: > >> Here are couple of comments: >> > > 1) >> >> int >> IsForeignRelUpdatable (Relation rel); >> > > Documentation for IsForeignUpdatable() need to change as it says: >> >> If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are >> assumed >> to be insertable, updatable, or deletable if the FDW provides >> ExecForeignInsert, >> ExecForeignUpdate or ExecForeignDelete respectively. >> >> With introduce of DMLPushdown API now this is no more correct, as even if >> FDW don't provide ExecForeignInsert, ExecForeignUpdate or >> ExecForeignDelete API >> still foreign tables are assumed to be updatable or deletable with >> DMLPushdown >> API's, right ? >> > > That's what I'd like to discuss. > > I intentionally leave that as-is, because I think we should determine the > updatability of a foreign table in the current manner. As you pointed out, > even if the FDW doesn't provide eg, ExecForeignUpdate, an UPDATE on a > foreign table could be done using the DML pushdown APIs if the UPDATE is > *pushdown-safe*. However, since all UPDATEs on the foreign table are not > necessarily pushdown-safe, I'm not sure it's a good idea to assume the > table-level updatability if the FDW provides the DML pushdown callback > routines. To keep the existing updatability decision, I think the FDW > should provide the DML pushdown callback routines together with > ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. What do you > think about that? > > Sorry but I am not in favour of adding compulsion that FDW should provide the DML pushdown callback routines together with existing ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs. May be we should change the documentation in such way, that explains a) If FDW PlanDMLPushdown is NULL, then check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs b) If FDW PlanDMLPushdown is non-NULL and plan is not pushable check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs c) If FDW PlanDMLPushdown is non-NULL and plan is pushable check for DMLPushdown APIs. Does this sounds wired ? > 2) >> >> + /* SQL statement to execute remotely (as a String node) */ >> + FdwDmlPushdownPrivateUpdateSql, >> >> FdwDmlPushdownPrivateUpdateSql holds the UPDATE/DELETE query, so name >> should >> be something like FdwDmlPushdownPrivateQuery os FdwDmlPushdownPrivateSql ? >> > > Later I realized that for FdwModifyPrivateIndex too the index name is >> FdwModifyPrivateUpdateSql even though its holding any DML query. Not sure >> whether we should consider to change this or not ? >> > > To tell the truth, I imitated FdwModifyPrivateIndex when adding > FdwDmlPushdownPrivateIndex, because I think "UpdateSql" means INSERT, > UPDATE, or DELETE, not just UPDATE. (IsForeignRelUpdatable discussed above > reports not only the updatability but the insertability and deletability of > a foreign table!). So, +1 for leaving that as-is. > > Make sense for now. > Apart from this perform sanity testing on the new patch and things working >> as expected. >> > > Thanks for the review! > > Best regards, > Etsuro Fujita > > > -- Rushabh Lathia
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/01/25 17:03, Rushabh Lathia wrote: Here are couple of comments: 1) int IsForeignRelUpdatable (Relation rel); Documentation for IsForeignUpdatable() need to change as it says: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete respectively. With introduce of DMLPushdown API now this is no more correct, as even if FDW don't provide ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete API still foreign tables are assumed to be updatable or deletable with DMLPushdown API's, right ? That's what I'd like to discuss. I intentionally leave that as-is, because I think we should determine the updatability of a foreign table in the current manner. As you pointed out, even if the FDW doesn't provide eg, ExecForeignUpdate, an UPDATE on a foreign table could be done using the DML pushdown APIs if the UPDATE is *pushdown-safe*. However, since all UPDATEs on the foreign table are not necessarily pushdown-safe, I'm not sure it's a good idea to assume the table-level updatability if the FDW provides the DML pushdown callback routines. To keep the existing updatability decision, I think the FDW should provide the DML pushdown callback routines together with ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete. What do you think about that? 2) + /* SQL statement to execute remotely (as a String node) */ + FdwDmlPushdownPrivateUpdateSql, FdwDmlPushdownPrivateUpdateSql holds the UPDATE/DELETE query, so name should be something like FdwDmlPushdownPrivateQuery os FdwDmlPushdownPrivateSql ? Later I realized that for FdwModifyPrivateIndex too the index name is FdwModifyPrivateUpdateSql even though its holding any DML query. Not sure whether we should consider to change this or not ? To tell the truth, I imitated FdwModifyPrivateIndex when adding FdwDmlPushdownPrivateIndex, because I think "UpdateSql" means INSERT, UPDATE, or DELETE, not just UPDATE. (IsForeignRelUpdatable discussed above reports not only the updatability but the insertability and deletability of a foreign table!). So, +1 for leaving that as-is. Apart from this perform sanity testing on the new patch and things working as expected. Thanks for the review! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On Thu, Jan 21, 2016 at 3:20 PM, Etsuro Fujita wrote: > On 2016/01/20 19:57, Rushabh Lathia wrote: > >> Overall I am quite done with the review of this patch. Patch is in good >> shape and covered most of the things which been discussed earlier >> or been mentioned during review process. Patch pass through the >> make check and also includes good test coverage. >> > > Thanks for the review! > > Here are couple of things which is still open for discussion: >> > > 1) >> .) When Tom Lane and Stephen Frost suggested getting the core >> code involved, >> I thought that we can do the mandatory checks into core it self >> and making >> completely out of dml_is_pushdown_safe(). Please correct me >> > > The reason why I put that function in postgres_fdw.c is Check point 4: >> >> + * 4. We can't push an UPDATE down, if any expressions to assign >> to the target >> + * columns are unsafe to evaluate on the remote server. >> > > Here I was talking about checks related to triggers, or to LIMIT. I think >> earlier thread talked about those mandatory check to the core. So may >> be we can move those checks into make_modifytable() before calling >> the PlanDMLPushdown. >> >> This need to handle by the Owner. >> > > Done. For that, I modified relation_has_row_triggers a bit, renamed it to > has_row_triggers (more shortly), and moved it to plancat.c. And I merged > dml_is_pushdown_safe with postgresPlanDMLPushdown, and revised that > callback routine a bit. Attached is an updated version of the patch > created on top of Robert's version of the patch [1], which fixes handling > of RETURNING tableoid in updating foreign tables. > This looks great. > > 2) Decision on whether we need the separate new node ForeignUpdate, >> ForeignDelete. In my opinion I really don't see the need of this as we >> that will add lot of duplicate. Having said that if committer or someone >> else feel like that will make code more clean that is also true, >> >> This need more comments from the committer. >> > > I agree with you. > > Other changes: > > * In previous version, I assumed that PlanDMLPushdown sets fsSystemCol to > true when rewriting the ForeignScan plan node so as to push down an > UPDATE/DELETE to the remote server, in order to initialize t_tableOid for > the scan tuple in ForeignNext. The reason is that I created the patch so > that the scan tuple is provided to the local query's RETURNING computation, > which might see the tableoid column. In this version, however, I modified > the patch so that the tableoid value is inserted by ModifyTable. This > eliminates the need for postgres_fdw (or any other FDW) to set fsSystemCol > to true in PlanDMLPushdown. > > * Add set_transmission_modes/reset_transmission_modes to > deparsePushedDownUpdateSql. > > * Revise comments a bit further. > > * Revise docs, including a fix for a wrong copy-and-paste. > > Best regards, > Etsuro Fujita > > [1] > http://www.postgresql.org/message-id/ca+tgmoz40j2uc5ac1nxu03oj4crvolks15xx+ptfp-1u-8z...@mail.gmail.com > Here are couple of comments: 1) int IsForeignRelUpdatable (Relation rel); Documentation for IsForeignUpdatable() need to change as it says: If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are assumed to be insertable, updatable, or deletable if the FDW provides ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete respectively. With introduce of DMLPushdown API now this is no more correct, as even if FDW don't provide ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete API still foreign tables are assumed to be updatable or deletable with DMLPushdown API's, right ? 2) + /* SQL statement to execute remotely (as a String node) */ + FdwDmlPushdownPrivateUpdateSql, FdwDmlPushdownPrivateUpdateSql holds the UPDATE/DELETE query, so name should be something like FdwDmlPushdownPrivateQuery os FdwDmlPushdownPrivateSql ? Later I realized that for FdwModifyPrivateIndex too the index name is FdwModifyPrivateUpdateSql even though its holding any DML query. Not sure whether we should consider to change this or not ? Apart from this perform sanity testing on the new patch and things working as expected. -- Rushabh Lathia www.EnterpriseDB.com
Re: Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)
On 2016/01/19 19:04, Thom Brown wrote: On 12 January 2016 at 11:49, Etsuro Fujita wrote: On 2016/01/12 20:36, Thom Brown wrote: On 8 January 2016 at 05:08, Etsuro Fujita wrote: On 2016/01/06 20:37, Thom Brown wrote: I've run into an issue: *# UPDATE master_customers SET id = 22 WHERE id = 16 RETURNING tableoid::regclass; ERROR: CONTEXT: Remote SQL command: UPDATE public.customers SET id = 22 WHERE ((id = 16)) RETURNING NULL While working on this, I noticed that the existing postgres_fdw system shows similar behavior, so I changed the subject. IIUC, the reason for that is when the local query specifies "RETURNING tableoid::regclass", the FDW has fmstate->has_returning=false while the remote query executed at ModifyTable has "RETURNING NULL", as shown in the above example; that would cause an abnormal exit in executing the remote query in postgresExecForeignUpdate, since that the FDW would get PGRES_TUPLES_OK as a result of the query while the FDW would think that the right result to get should be PGRES_COMMAND_OK, from the flag fmstate->has_returning=false. Attached is a patch to fix that. I can't apply this patch in tandem with FDW DML pushdown patch (either v2 or v3). That patch is for fixing the similar issue in the existing postgres_fdw system. So, please apply that patch without the DML pushdown patch. If that patch is reasonable as a fix for the issue, I'll update the DML pushdown patch (v3) on top of that patch. The patch seems to work for me: Before: *# UPDATE master_customers SET id = 22 WHERE id = 1 RETURNING tableoid::regclass; ERROR: CONTEXT: Remote SQL command: UPDATE public.customers SET id = $2 WHERE ctid = $1 RETURNING NULL After: *# UPDATE master_customers SET id = 22 WHERE id = 1 RETURNING tableoid::regclass; tableoid -- remote.customers (1 row) UPDATE 1 Thanks for the testing! I updated the DML pushdown patch on top of Robert's version of this bugfix patch. Please see http://www.postgresql.org/message-id/56a0a9f0.9090...@lab.ntt.co.jp Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2016/01/20 19:57, Rushabh Lathia wrote: Overall I am quite done with the review of this patch. Patch is in good shape and covered most of the things which been discussed earlier or been mentioned during review process. Patch pass through the make check and also includes good test coverage. Thanks for the review! Here are couple of things which is still open for discussion: 1) .) When Tom Lane and Stephen Frost suggested getting the core code involved, I thought that we can do the mandatory checks into core it self and making completely out of dml_is_pushdown_safe(). Please correct me The reason why I put that function in postgres_fdw.c is Check point 4: + * 4. We can't push an UPDATE down, if any expressions to assign to the target + * columns are unsafe to evaluate on the remote server. Here I was talking about checks related to triggers, or to LIMIT. I think earlier thread talked about those mandatory check to the core. So may be we can move those checks into make_modifytable() before calling the PlanDMLPushdown. This need to handle by the Owner. Done. For that, I modified relation_has_row_triggers a bit, renamed it to has_row_triggers (more shortly), and moved it to plancat.c. And I merged dml_is_pushdown_safe with postgresPlanDMLPushdown, and revised that callback routine a bit. Attached is an updated version of the patch created on top of Robert's version of the patch [1], which fixes handling of RETURNING tableoid in updating foreign tables. 2) Decision on whether we need the separate new node ForeignUpdate, ForeignDelete. In my opinion I really don't see the need of this as we that will add lot of duplicate. Having said that if committer or someone else feel like that will make code more clean that is also true, This need more comments from the committer. I agree with you. Other changes: * In previous version, I assumed that PlanDMLPushdown sets fsSystemCol to true when rewriting the ForeignScan plan node so as to push down an UPDATE/DELETE to the remote server, in order to initialize t_tableOid for the scan tuple in ForeignNext. The reason is that I created the patch so that the scan tuple is provided to the local query's RETURNING computation, which might see the tableoid column. In this version, however, I modified the patch so that the tableoid value is inserted by ModifyTable. This eliminates the need for postgres_fdw (or any other FDW) to set fsSystemCol to true in PlanDMLPushdown. * Add set_transmission_modes/reset_transmission_modes to deparsePushedDownUpdateSql. * Revise comments a bit further. * Revise docs, including a fix for a wrong copy-and-paste. Best regards, Etsuro Fujita [1] http://www.postgresql.org/message-id/ca+tgmoz40j2uc5ac1nxu03oj4crvolks15xx+ptfp-1u-8z...@mail.gmail.com *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 823,829 deparseTargetList(StringInfo buf, * * If params is not NULL, it receives a list of Params and other-relation Vars * used in the clauses; these values must be transmitted to the remote server ! * as parameter values. * * If params is NULL, we're generating the query for EXPLAIN purposes, * so Params and other-relation Vars should be replaced by dummy values. --- 823,829 * * If params is not NULL, it receives a list of Params and other-relation Vars * used in the clauses; these values must be transmitted to the remote server ! * as parameter values. Caller is responsible for initializing it to empty. * * If params is NULL, we're generating the query for EXPLAIN purposes, * so Params and other-relation Vars should be replaced by dummy values. *** *** 840,848 appendWhereClause(StringInfo buf, int nestlevel; ListCell *lc; - if (params) - *params = NIL; /* initialize result list to empty */ - /* Set up context struct for recursion */ context.root = root; context.foreignrel = baserel; --- 840,845 *** *** 978,983 deparseUpdateSql(StringInfo buf, PlannerInfo *root, --- 975,1044 } /* + * deparse remote UPDATE statement + * + * The statement text is appended to buf, and we also create an integer List + * of the columns being retrieved by RETURNING (if any), which is returned + * to *retrieved_attrs. + */ + void + deparsePushedDownUpdateSql(StringInfo buf, PlannerInfo *root, + Index rtindex, Relation rel, + List *targetlist, + List *targetAttrs, + List *remote_conds, + List **params_list, + List *returningList, + List **retrieved_attrs) + { + RelOptInfo *baserel = root->simple_rel_array[rtindex]; + deparse_expr_cxt context; + int nestlevel; + bool first; + ListCell *lc; + + if (params_list) + *params_list = NIL; /* initialize result list to empty */ + + /* Set up context struct for recursion */ +
Re: Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)
On 2016/01/21 5:06, Robert Haas wrote: On Wed, Jan 20, 2016 at 4:50 AM, Etsuro Fujita wrote: My concern about that is that would make the code in deparseTargetList() complicated. Essentially, I think your propossal needs a two-pass algorithm for deparseTargetList; (1) create an integer List of the columns being retrieved from the given attrs_used (getRetrievedAttrs()), and (2) print those columns (printRetrievedAttrs()). How about sharing those two functions between deparseTargetList and deparseReturningList?: I don't see why we'd need that. I adjusted the code in postgres_fdw along the lines I had in mind and am attaching the result. It doesn't look complicated to me, and it passes the regression test you wrote. Thanks for the patch! From the patch, I correctly understand what you proposed. Good idea! By the way, I'm not too sure I understand the need for the core changes that are part of this patch, and I think that point merits some discussion. Whenever you change core like this, you're changing the contract between the FDW and core; it's not just postgres_fdw that needs updating, but every FDW. So we'd better be pretty sure we need these changes and they are adequately justified before we think about putting them into the tree. Are these core changes really needed here, or can we fix this whole issue in postgres_fdw and leave the core code alone? Well, if we think it is the FDW's responsibility to insert a valid value for tableoid in the returned slot during ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete, we don't need those core changes. However, I think it would be better that that's done by ModifyTable in the same way as ForeignScan does in ForeignNext, IMO. That eliminates the need for postgres_fdw or any other FDW to do that business in the callback routines. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)
On Wed, Jan 20, 2016 at 4:50 AM, Etsuro Fujita wrote: > My concern about that is that would make the code in deparseTargetList() > complicated. > > Essentially, I think your propossal needs a two-pass algorithm for > deparseTargetList; (1) create an integer List of the columns being retrieved > from the given attrs_used (getRetrievedAttrs()), and (2) print those columns > (printRetrievedAttrs()). How about sharing those two functions between > deparseTargetList and deparseReturningList?: I don't see why we'd need that. I adjusted the code in postgres_fdw along the lines I had in mind and am attaching the result. It doesn't look complicated to me, and it passes the regression test you wrote. By the way, I'm not too sure I understand the need for the core changes that are part of this patch, and I think that point merits some discussion. Whenever you change core like this, you're changing the contract between the FDW and core; it's not just postgres_fdw that needs updating, but every FDW. So we'd better be pretty sure we need these changes and they are adequately justified before we think about putting them into the tree. Are these core changes really needed here, or can we fix this whole issue in postgres_fdw and leave the core code alone? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index e59af2c..12a1031 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -110,6 +110,7 @@ static void deparseTargetList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, + bool is_returning, Bitmapset *attrs_used, List **retrieved_attrs); static void deparseReturningList(StringInfo buf, PlannerInfo *root, @@ -724,7 +725,7 @@ deparseSelectSql(StringInfo buf, * Construct SELECT list */ appendStringInfoString(buf, "SELECT "); - deparseTargetList(buf, root, baserel->relid, rel, attrs_used, + deparseTargetList(buf, root, baserel->relid, rel, false, attrs_used, retrieved_attrs); /* @@ -738,7 +739,8 @@ deparseSelectSql(StringInfo buf, /* * Emit a target list that retrieves the columns specified in attrs_used. - * This is used for both SELECT and RETURNING targetlists. + * This is used for both SELECT and RETURNING targetlists; the is_returning + * parameter is true only for a RETURNING targetlist. * * The tlist text is appended to buf, and we also create an integer List * of the columns being retrieved, which is returned to *retrieved_attrs. @@ -748,6 +750,7 @@ deparseTargetList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, + bool is_returning, Bitmapset *attrs_used, List **retrieved_attrs) { @@ -777,6 +780,8 @@ deparseTargetList(StringInfo buf, { if (!first) appendStringInfoString(buf, ", "); + else if (is_returning) +appendStringInfoString(buf, " RETURNING "); first = false; deparseColumnRef(buf, rtindex, i, root); @@ -794,6 +799,8 @@ deparseTargetList(StringInfo buf, { if (!first) appendStringInfoString(buf, ", "); + else if (is_returning) + appendStringInfoString(buf, " RETURNING "); first = false; appendStringInfoString(buf, "ctid"); @@ -803,7 +810,7 @@ deparseTargetList(StringInfo buf, } /* Don't generate bad syntax if no undropped columns */ - if (first) + if (first && !is_returning) appendStringInfoString(buf, "NULL"); } @@ -1022,11 +1029,8 @@ deparseReturningList(StringInfo buf, PlannerInfo *root, } if (attrs_used != NULL) - { - appendStringInfoString(buf, " RETURNING "); - deparseTargetList(buf, root, rtindex, rel, attrs_used, + deparseTargetList(buf, root, rtindex, rel, true, attrs_used, retrieved_attrs); - } else *retrieved_attrs = NIL; } diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index b471c67..b0d12ac 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -2408,6 +2408,59 @@ SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1; 1104 | 204 | ddd| (819 rows) +EXPLAIN (verbose, costs off) +INSERT INTO ft2 (c1,c2,c3) VALUES (,999,'foo') RETURNING tableoid::regclass; + QUERY PLAN +- + Insert on public.ft2 + Output: (tableoid)::regclass + Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) + -> Result + Output: , 999, NULL::integer, 'foo'::text, NULL::times