On Thu, Jul 27, 2017 at 10:28 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Apr 7, 2017 at 10:56 AM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
>> Vinayak, why did you mark this patch as "Move to next CF"? AFAIU
>> there has been no discussion yet.
> I'd like to discuss this patch.  Clearly, a lot of work has been done
> here, but I am not sure about the approach.

Thank you for the comment. I'd like to reply about the goal of this
feature first.

> If we were to commit this patch set, then you could optionally enable
> two_phase_commit for a postgres_fdw foreign server.  If you did, then,
> modulo bugs and administrator shenanigans, and given proper
> configuration, you would be guaranteed that a successful commit of a
> transaction which touched postgres_fdw foreign tables would eventually
> end up committed or rolled back on all of the nodes, rather than
> committed on some and rolled back on others.  However, you would not
> be guaranteed that all of those commits or rollbacks happen at
> anything like the same time.  So, you would have a sort of eventual
> consistency.  Any given snapshot might not be consistent, but if you
> waited long enough and with all nodes online, eventually all
> distributed transactions would be resolved in a consistent manner.
> That's kinda cool, but I think what people really want is a stronger
> guarantee, namely, that they will get consistent snapshots.  It's not
> clear to me that this patch gets us any closer to that goal.  Does
> anyone have a plan for how we'd get from here to that stronger goal?
> If not, is the patch useful enough to justify committing it for what
> it can already do?  It would be particularly good to hear some
> end-user views on this functionality and whether or not they would use
> it and find it valuable.

Yeah, this patch only guarantees that if you got a commit, the
transaction was either committed or rolled back on all relevant nodes,
and that subsequent transactions can see a consistent result (if a
server fails, we have to recover in-doubt transactions properly after
the crash). But it doesn't guarantee that a concurrent transaction can
see a consistent result. To provide a cluster-wide consistent result,
I think we need a transaction manager for distributed queries that is
responsible for providing consistent snapshots. There have been some
discussions about which type of transaction manager to use, but at
least we need a new transaction manager for distributed queries. I
think providing a consistent result to concurrent transactions and
committing or rolling back a transaction atomically should be separate
features, and should be discussed separately. It would not be useful,
and users would complain, if we provided consistent snapshots but a
distributed transaction could still commit on only some of the nodes.
So this patch could also be an important feature for providing consistent
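
To illustrate the difference between the two guarantees, here is a
minimal toy model in Python. It is only a sketch under stated
assumptions: Node, two_phase_commit and snapshot are hypothetical names
for illustration, not postgres_fdw or backend APIs, and nodes are
in-memory objects rather than real servers. It shows that two-phase
commit resolves every node the same way, while a reader without a
global snapshot can still observe the moment when only some nodes have
committed.

```python
# Toy model only: all names here are hypothetical, not PostgreSQL APIs.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    committed: dict = field(default_factory=dict)  # data visible to readers
    prepared: dict = field(default_factory=dict)   # xid -> pending writes

    def prepare(self, xid, writes):
        # Phase 1: remember the writes without making them visible.
        self.prepared[xid] = dict(writes)

    def commit_prepared(self, xid):
        # Phase 2: make the prepared writes visible on this node only.
        self.committed.update(self.prepared.pop(xid))

    def rollback_prepared(self, xid):
        self.prepared.pop(xid, None)


def snapshot(nodes):
    # A reader with no global snapshot sees each node as it is right now.
    return {n.name: dict(n.committed) for n in nodes}


def two_phase_commit(xid, writes, observe=None):
    """writes is a list of (node, dict) pairs; observe is an optional
    callback run after each per-node commit to mimic a concurrent reader."""
    try:
        for node, node_writes in writes:
            node.prepare(xid, node_writes)
    except Exception:
        # Any prepare failure rolls back everywhere: all-or-nothing.
        for node, _ in writes:
            node.rollback_prepared(xid)
        raise
    seen = []
    for node, _ in writes:
        node.commit_prepared(xid)
        if observe is not None:
            seen.append(observe())
    return seen


a, b = Node("a"), Node("b")
seen = two_phase_commit("tx1", [(a, {"x": 1}), (b, {"y": 1})],
                        observe=lambda: snapshot([a, b]))
```

In this model seen[0] contains the write on node a but not yet the one
on node b, and seen[1] contains both. The final state is atomic, but the
intermediate snapshot is not consistent, which is the gap a distributed
transaction manager would have to close.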

> On a technical level, I am pretty sure that it is not OK to call
> AtEOXact_FDWXacts() from the sections of CommitTransaction,
> AbortTransaction, and PrepareTransaction that are described as
> "non-critical resource releasing".  At that point, it's too late to
> throw an error, and it is very difficult to imagine something that
> involves a TCP connection to another machine not being subject to
> error.  You might say "well, we can just make sure that any problems
> are reported as a WARNING rather than an ERROR", but that's pretty
> hard to guarantee; most backend code assumes it can ERROR, so anything
> you call is a potential hazard.  There is a second problem, too: any
> code that runs from here is not interruptible.  The user can hit ^C
> all day and nothing will happen.  That's a bad situation when you're
> busy doing network I/O.  I'm not exactly sure what the best thing to
> do about this problem would be.
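
To make the concern above concrete, here is a minimal sketch in Python
(not backend code). It assumes a hypothetical resolve_after_commit step
that runs after the local commit and therefore must not raise: every
remote call is wrapped, failures are demoted to warnings, and the
unresolved transaction is recorded as in-doubt for a later retry. The
function name, the in_doubt list and the callables standing in for
remote COMMIT PREPARED calls are all assumptions made for illustration.

```python
# Sketch only: hypothetical names, not the patch's actual code.
import logging

log = logging.getLogger("fdwxact_sketch")


def resolve_after_commit(xid, participants, in_doubt):
    """Runs after the point of no return, so it must never raise.

    participants: callables standing in for a remote COMMIT PREPARED.
    in_doubt: list collecting (xid, participant) pairs to retry later.
    """
    for participant in participants:
        try:
            participant(xid)
        except Exception as exc:
            # Demote the failure to a warning and leave the remote
            # transaction in-doubt instead of failing the local commit.
            log.warning("could not resolve %s: %s", xid, exc)
            in_doubt.append((xid, participant))


def ok(xid):
    return None


def unreachable(xid):
    raise ConnectionError("remote server is down")


pending = []
resolve_after_commit("tx1", [ok, unreachable, ok], pending)
```

After this runs, pending holds one entry for the unreachable
participant, and the other two participants were still resolved. The
sketch also shows why the guarantee is hard to keep: it only holds if
every call made in this phase is wrapped this way, and it does not
model the interruptibility problem at all.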


Masahiko Sawada
NTT Open Source Software Center

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)