On Thu, Oct 6, 2016 at 2:52 PM, Amit Langote
<langote_amit...@lab.ntt.co.jp> wrote:
> On 2016/10/06 17:45, Ashutosh Bapat wrote:
>> On Thu, Oct 6, 2016 at 1:34 PM, Masahiko Sawada <sawada.m...@gmail.com>
>> wrote:
>>> On Thu, Oct 6, 2016 at 1:41 PM, Ashutosh Bapat
>>> <ashutosh.ba...@enterprisedb.com> wrote:
>>>>> My understanding is that basically the local server can not return
>>>>> COMMIT to the client until 2nd phase is completed.
>>>>
>>>> If we do that, the local server may not return to the client at all,
>>>> if the foreign server crashes and never comes up. Practically, it may
>>>> take much longer to finish a COMMIT, depending upon how long it takes
>>>> for the foreign server to reply to a COMMIT message.
>>>
>>> Yes, I think 2PC behaves so, please refer to .
>>> To prevent local server stops forever due to communication failure.,
>>> we could provide the timeout on coordinator side or on participant
>>> side.
>>
>> This too, looks like a heuristic and shouldn't be the default
>> behaviour and hence not part of the first version of this feature.
>
> At any rate, the coordinator should not return to the client until after
> the 2nd phase is completed, which was the original point. If COMMIT
> taking longer is an issue, then it could be handled with one of the
> approaches mentioned so far (even if not in the first version), but no
> version of this feature should really return COMMIT to the client only
> after finishing the first phase. Am I missing something?

There is a small time window between the actual COMMIT and the commit
message being returned. The actual commit happens when we insert a WAL
record saying that transaction X committed; only after that do we return
to the client saying that a COMMIT happened. Note that a transaction may
be committed without our ever returning a commit message to the client,
because the connection was lost or the server crashed. I hope we agree
on this.

COMMITTING the foreign prepared transactions happens after we COMMIT the
local transaction. If we did it before COMMITTING the local transaction
and the local server crashed, we would roll back the local transaction
during subsequent recovery while the foreign servers had already
committed, resulting in an inconsistent state.

If we succeed in COMMITTING the foreign transactions during the
post-commit phase, the COMMIT message is returned after we have
committed all foreign transactions. But if we cannot reach a foreign
server and the request times out, we cannot revert our decision to
commit the transaction. That's my answer to the timeout-based heuristic.

I don't see much point in holding up post-commit processing for a
non-responsive foreign server, which may not respond for days on end.
Can you please elaborate on a use case? Which commercial transaction
manager does that?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
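
P.S. To make the ordering described above concrete, here is a minimal
sketch as a toy Python model. It is not taken from the patch; the names
ForeignServer, Coordinator, commit_prepared and unresolved are all
hypothetical, and keeping a list of unresolved foreign transactions for
later resolution is an assumption about how a timed-out commit might be
handled, not something settled in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class ForeignServer:
    """Hypothetical stand-in for a foreign server holding a prepared transaction."""
    name: str
    reachable: bool = True
    state: str = "prepared"  # becomes "committed" once the second phase succeeds

    def commit_prepared(self) -> bool:
        # Models the second-phase commit; False stands for a timed-out request.
        if not self.reachable:
            return False
        self.state = "committed"
        return True


@dataclass
class Coordinator:
    """Hypothetical local server; `wal` models the local commit records."""
    wal: list = field(default_factory=list)
    unresolved: list = field(default_factory=list)

    def commit(self, xid: int, participants: list) -> str:
        # 1. Writing the local commit record is the decision point.
        #    After this, the transaction cannot be rolled back.
        self.wal.append(("commit", xid))
        # 2. Post-commit phase: commit each foreign prepared transaction.
        for server in participants:
            if not server.commit_prepared():
                # Assumption: do not block on an unreachable server;
                # record it so it can be resolved later.
                self.unresolved.append((xid, server.name))
        # 3. Only now is the commit message returned to the client.
        return "COMMIT"
```

With one reachable and one unreachable participant, the local commit
record is still written and the client still gets its COMMIT, while the
unreachable server's transaction stays prepared and is listed in
`unresolved`.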