On Fri, May 5, 2017 at 7:44 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, May 5, 2017 at 1:01 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> I have tried to verify above point and it seems to me in such a
>> situation the transaction status will be either PQTRANS_INTRANS or
>> PQTRANS_INERROR.  I have tried to mimic "out of memory" failure by
>> making PQmakeEmptyPGresult return NULL (malloc failure).  AFAIU, it
>> will make the state as PQTRANS_IDLE only when the statement is
>> executed successfully.
> Hrm.  OK, you might be right, but still I don't see why having the 30
> second timeout is bad.  People don't want a transaction rollback to
> hang for a long period of time waiting for ABORT TRANSACTION to run if
> we could've just dropped the connection to get the same effect.  Sure,
> the next transaction will then have to reconnect, but that takes a few
> tens of milliseconds; if you wait for one extra second to avoid that
> outcome, you come out behind even if you do manage to avoid the
> reconnect.  Now for a subtransaction you could argue that it's worth
> being more patient, because forcing a toplevel abort sucks.  But what
> are the chances that, after failing to respond to a ROLLBACK; RELEASE
> SAVEPOINT within 30 seconds, the remote side is going to wake up again
> and start behaving normally?  I'd argue that it's not terribly likely
> and most people aren't going to want to wait for multiple minutes to
> see whether it does, but I suppose we could impose the 30 second
> timeout only at the toplevel and let subtransactions wait however long
> the operating system takes to time out.
>>>> Also, does it matter if pgfdw_subxact_callback fails
>>>> during execution of "ROLLBACK TO SAVEPOINT", because anyway user needs
>>>> to execute rollback to savepoint command before proceeding and that
>>>> should be good enough?
>>> I don't understand this.  If pgfdw_subxact_callback doesn't roll the
>>> remote side back to the savepoint, how is it going to get done later?
>>> There's no code in postgres_fdw other than that code to issue the
>>> rollback.
>> It is done via toplevel transaction ABORT.  I think how it works is
>> after "ROLLBACK TO SAVEPOINT" failure, any next execution is going to
>> fail.  Let us say if we issue Commit after failure, it will try to
>> execute RELEASE SAVEPOINT which will fail and lead to toplevel
>> transaction ABORT.  Now if the toplevel transaction ABORT also fails,
>> it will drop the connection.
> Mmmph.  I guess that would work, but I think it's better to track it
> explicitly.  I tried this (bar is a foreign table):
> begin;
> select * from bar;
> savepoint s1;
> select * from bar;
> savepoint s2;
> select * from bar;
> savepoint s3;
> select * from bar;
> commit;
> On the remote side, the commit statement produces this:
> 2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s4
> 2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s3
> 2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s2
> 2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: COMMIT TRANSACTION
> But that's obviously silly.  Somebody could easily come along and
> optimize that by getting rid of the RELEASE SAVEPOINT commands which
> are completely unnecessary if we're going to commit the toplevel
> transaction anyway, and then the argument that you're making wouldn't
> be true any more.  It seems smart to me to explicitly track whether
> the remote side is known to be in a bad state rather than relying on
> the future sequence of commands to be such that we'll get the right
> result anyway.

Sure, tracking it explicitly might be a better way to deal with this,
but always dropping the connection in case of an error might not be
the best thing to do.
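For illustration only, the kind of explicit tracking being discussed could look something like the minimal sketch below. The struct and function names here are hypothetical, not taken from the actual patch; the idea is just that a flag is set before an in-transaction command is sent and cleared only on demonstrated success, so cleanup code can tell a known-good connection from one in an unknown state:

```c
#include <stdbool.h>

/* Hypothetical per-connection cache entry, loosely modeled on the
 * idea of postgres_fdw's connection cache; names are illustrative. */
typedef struct ConnEntry
{
	bool		changing_xact_state;	/* xact command still in flight? */
	bool		broken;					/* known-bad: drop, don't reuse */
} ConnEntry;

/* Set the flag before sending ABORT / ROLLBACK TO SAVEPOINT ... */
static void
begin_xact_command(ConnEntry *entry)
{
	entry->changing_xact_state = true;
}

/* ... and clear it only after the command demonstrably succeeded.
 * On failure, mark the connection as broken instead. */
static void
end_xact_command(ConnEntry *entry, bool succeeded)
{
	if (succeeded)
		entry->changing_xact_state = false;
	else
		entry->broken = true;
}

/* If the flag is still set (or a failure was recorded), the remote
 * state is unknown and the connection must be discarded, not reused. */
static bool
connection_reusable(const ConnEntry *entry)
{
	return !entry->broken && !entry->changing_xact_state;
}
```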

>> Agreed, in such a situation toplevel transaction abort is the only way
>> out.  However, it seems to me that will anyway happen in current code
>> even if we don't try to force it via abort_cleanup_failure.  I
>> understand that it might be better to force it as you have done in the
>> patch, but not sure if it is good to drop the connection where
>> previously it could have done without it (for ex. if the first
>> statement Rollback To Savepoint fails, but ABORT succeeds).  I feel it
>> is worth considering whether such a behavior difference is okay as you
>> have proposed to backpatch it.
> Well, I think the whole point of this patch is that if one command on
> a connection fails, the chances of future commands succeeding are
> pretty poor.  It's not impossible, but it's not very likely either.
> To recap, the problem at present is that if network connectivity is
> interrupted and the statement is then killed by statement_timeout or a
> local cancel request, we try to send a cancel request to the remote
> side, which takes a while to time out.  Even if that fails, we then
> send an ABORT TRANSACTION, hoping against hope that it will succeed
> despite the failure of the cancel request.  The user ends up waiting
> for the cancel request to time out and then for the ABORT TRANSACTION
> to time out, which takes 15-16 minutes no matter how you set the
> keepalive settings and no matter how many times you hit ^C on the
> local side.
> Now, if you want to allow for the possibility that some future
> operation might succeed never mind past failures, then you're
> basically saying that trying the ABORT TRANSACTION even though the
> query cancel has failed is the correct behavior.  And if you don't
> want the ABORT TRANSACTION to have a 30 second timeout either, then
> you're saying we should wait however long it takes the operating
> system to detect the failure, which again is the current behavior.  If
> we rip those things out of the patch, then there's not much left of
> the patch.

I am not saying we should rip out those changes.  Most of your mail
stresses the 30-second timeout which you have added in the patch to
detect timeouts during failure (rollback) processing.  I have no
objection to that part of the patch, except that there is still a race
condition around PQsendQuery; you have indicated that it is better to
deal with that as a separate patch, to which I agree.  The point on
which I am not in total agreement is the handling of failures other
than timeouts.
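As an aside, the general technique behind such a bounded cleanup wait can be sketched in isolation like this. This is not the patch's code (which would use libpq and the backend's latch machinery); it is a self-contained sketch over a plain file descriptor, showing the essential point that the remaining time is recomputed from an absolute deadline on each iteration, so interruptions or spurious wakeups never extend the total wait:

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Milliseconds remaining until the absolute deadline (0 if passed). */
static long
ms_until(const struct timeval *deadline)
{
	struct timeval now;
	long		ms;

	gettimeofday(&now, NULL);
	ms = (deadline->tv_sec - now.tv_sec) * 1000 +
		(deadline->tv_usec - now.tv_usec) / 1000;
	return (ms > 0) ? ms : 0;
}

/* Wait until fd is readable or the absolute deadline expires.
 * Returns true if readable, false on timeout. */
static bool
wait_readable_until(int fd, const struct timeval *deadline)
{
	for (;;)
	{
		long		remaining = ms_until(deadline);
		struct timeval tv;
		fd_set		rfds;
		int			rc;

		if (remaining == 0)
			return false;		/* deadline reached: give up on cleanup */

		tv.tv_sec = remaining / 1000;
		tv.tv_usec = (remaining % 1000) * 1000;
		FD_ZERO(&rfds);
		FD_SET(fd, &rfds);

		rc = select(fd + 1, &rfds, NULL, NULL, &tv);
		if (rc > 0)
			return true;		/* data available: consume the result */
		if (rc == 0)
			return false;		/* timed out */
		/* rc < 0: interrupted (e.g. EINTR); loop, recompute remaining */
	}
}
```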

+pgfdw_exec_cleanup_query(PGconn *conn, const char *query)
+	/* Get the result of the query. */
+	if (pgfdw_get_cleanup_result(conn, endtime, &result))
+		return false;
+	/* Issue a warning if not successful. */
+	if (PQresultStatus(result) != PGRES_COMMAND_OK)
+	{
+		pgfdw_report_error(WARNING, result, conn, true, query);
+		return false;
+	}

In the above code, if there is a failure due to timeout, it will
return from the first statement (pgfdw_get_cleanup_result); however,
on any other failure it will issue a warning and return false.  I am
not sure there is a need to return false in the second case, because
even without that it will work fine (and there is a chance that it
won't drop the connection), but maybe your point is that it is better
to be consistent for all kinds of error handling in this path.
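To restate the two paths in isolation (a condensed, hypothetical sketch, not the patch's actual code): both failure modes funnel into the same false return, which is what would make a caller's "drop on failure" logic uniform rather than having to distinguish a timeout from an error result:

```c
#include <stdbool.h>

/* Simulated outcomes of the cleanup query; names are illustrative. */
typedef enum
{
	SIM_OK,						/* command succeeded */
	SIM_TIMEOUT,				/* timed out waiting for the result */
	SIM_ERROR					/* got a result, but it reports an error */
} SimOutcome;

/* Condensed shape of the cleanup-query path under discussion. */
static bool
exec_cleanup_sketch(SimOutcome outcome)
{
	/* Path 1: timed out waiting for the result. */
	if (outcome == SIM_TIMEOUT)
		return false;

	/* Path 2: the command itself failed.  Returning false here too
	 * means any false return can be treated the same way by the
	 * caller: discard the connection rather than reuse it. */
	if (outcome == SIM_ERROR)
	{
		/* (a warning would be reported here) */
		return false;
	}

	return true;
}
```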

>  The only thing that would left would be making the wait
> for ABORT TRANSACTION interruptible, so that if you hit ^C *again*
> after you've already canceled the query, we then give up on waiting
> for the ABORT TRANSACTION.  I guess that would be an improvement over
> the status quo, but it's not really a solution.  People don't expect
> to have to send multiple cancel requests to kill the same query in a
> timely fashion, and if the commands are being sent by an application
> rather than by a human being it probably won't happen.
> Do you want to propose a competing patch to fix this problem?  I'm
> happy to listen to suggestions.  But I think if you're determined to
> say "let's rely entirely on the OS timeouts rather than having any of
> our own" and you're also determined to say "let's not assume that a
> failure of one operation means that the connection is dead", then
> you're basically arguing that there's nothing really broken here, and
> I don't agree with that.

No, I am not saying either of those things.  I hope the previous point
clarifies what I want to say.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)