Is the following sequence possible ?
In pgfdw_inval_callback():

            entry->invalidated = true;
+           have_invalid_connections = true;

At which time the loop in pgfdw_xact_callback() is already running (but
past the above entry).
Then:

+   /* We are done closing all the invalidated connections so reset. */
+   have_invalid_connections = false;

At which time, there is still at least one invalid connection but the
global flag is off.

On Mon, Dec 14, 2020 at 6:39 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> Hi,
>
> As discussed in [1], in postgres_fdw the cached connections to remote
> servers can stay until the lifetime of the local session without
> getting a chance to disconnect (connection leak), if the underlying
> user mapping or foreign server is dropped in another session. Here are
> few scenarios how this can happen:
>
> Use case 1:
> 1) Run a foreign query in session 1 with server 1 and user mapping 1
> 2) Drop user mapping 1 in another session 2, an invalidation message
> gets generated which will have to be processed by all the sessions
> 3) Run the foreign query again in session 1, at the start of txn the
> cached entry gets invalidated via pgfdw_inval_callback() (as part of
> invalidation message processing). Whatever may be the type of foreign
> query (select, update, explain, delete, insert, analyze etc.), upon
> next call to GetUserMapping() from postgres_fdw.c, the cache lookup
> fails(with ERROR:  user mapping not found for "XXXX") since the user
> mapping 1 has been dropped in session 2 and the query will also fail
> before reaching GetConnection() where the connections associated with
> the invalidated entries would have got disconnected.
>
> So, the connection associated with invalidated entry would remain
> until the local session exits.
>
> Use case 2:
> 1) Run a foreign query in session 1 with server 1 and user mapping 1
> 2) Try to drop foreign server 1, then we would not be allowed because
> of dependency. Use CASCADE so that dependent objects i.e. user mapping
> 1 and foreign tables get dropped along with foreign server 1.
> 3) Run the foreign query again in session 1, at the start of txn, the
> cached entry gets invalidated via pgfdw_inval_callback() and the query
> fails because there is no foreign table.
>
> Note that the remote connection remains open in session 1 until the
> local session exits.
>
> To solve the above connection leak problem, it looks like the right
> place to close all the invalid connections is pgfdw_xact_callback(),
> once registered, which gets called at the end of every txn in the
> current session(by then all the sub txns also would have been
> finished). Note that if there are too many invalidated entries, then
> the following txn has to close all of them, but that's okay than
> having leaked connections and it's a one time job for the following
> one txn.
>
> Attaching a patch for the same.
>
> Thoughts?
>
> [1] -
> https://www.postgresql.org/message-id/flat/CALj2ACUOQYs%2BssxkxRvZ6Ja5%2Bsfc6a-s_0e-Nv2A591hEyOgiw%40mail.gmail.com
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com
>

Reply via email to