On Tue, Sep 26, 2017 at 9:50 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Sep 26, 2017 at 5:06 AM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
>> Based on the review comment from Robert, I'm planning to do the big
>> change to the architecture of this patch so that a backend process
>> work together with a dedicated background worker that is responsible
>> for resolving the foreign transactions. For the usage of this feature,
>> it will be almost the same as what this patch has been doing except
>> for adding a new GUC paramter that controls the number of resovler
>> process launch. That is, we can have multiple resolver process to keep
>> latency down.
> Multiple resolver processes is useful but gets a bit complicated.  For
> example, if process 1 has a connection open to foreign server A and
> process 2 does not, and a request arrives that needs to be handled on
> foreign server A, what happens?  If process 1 is already busy doing
> something else, probably we want process 2 to try to open a new
> connection to foreign server A and handle the request.  But if process
> 1 and 2 are both idle, ideally we'd like 1 to get that request rather
> than 2.  That seems a bit difficult to get working though.  Maybe we
> should just ignore such considerations in the first version.

I understood. I keep it simple in the first version.

>> * Resovler processes
>> 1. Fetch PGPROC entry from the shmem queue and get its XID (say, XID-a).
>> 2. Get the fdw_xact_state entry from shmem hash by XID-a.
>> 3. Iterate fdw_xact entries using the index, and resolve the foreign
>> transactions.
>> 3-a. If even one foreign transaction failed to resolve, raise an error.
>> 4. Change the waiting backend state to FDWXACT_COMPLETED and release it.
> Comments:
> - Note that any error we raise here won't reach the user; this is a
> background process.  We don't want to get into a loop where we just
> error out repeatedly forever -- at least not if there's any other
> reasonable choice.

Thank you for the comments.


> - I suggest that we ought to track the status for each XID separately
> on each server rather than just track the XID status overall.  That
> way, if transaction resolution fails on one server, we don't keep
> trying to reconnect to the others.

Agreed. In the current patch we manage fdw_xact entries that track the
status for each XID separately on each server. I'm going to use the
same mechanism. The resolver process get an target XID from shmem
queue and get the all fdw_xact entries associated with the XID from
the fdw_xact array in shmem. But since the scanning the whole fdw_xact
entries could be slow because the number of entry of fdw_xact array
could be a large number (e.g, max_connections * # of foreign servers),
 I'm considering to have a linked list of the all fdw_xact entries
associated with same XID, and to have a shmem hash pointing to the
first fdw_xact entry of the linked lists for each XID. That way, we
can find the target fdw_xact entries from the array in O(1).

> - If we go to resolve a remote transaction and find that no such
> remote transaction exists, what should we do?  I'm inclined to think
> that we should regard that as if we had succeeded in resolving the
> transaction.  Certainly, if we've retried the server repeatedly, it
> might be that we previously succeeded in resolving the transaction but
> then the network connection was broken before we got the success
> message back from the remote server.  But even if that's not the
> scenario, I think we should assume that the DBA or some other system
> resolved it and therefore we don't need to do anything further.  If we
> assume anything else, then we just go into an infinite error loop,
> which isn't useful behavior.  We could log a message, though (for
> example, LOG: unable to resolve foreign transaction ... because it
> does not exist).



Masahiko Sawada
NTT Open Source Software Center

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to