Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-28 Thread Fujii Masao




On 2020/12/24 23:45, Fujii Masao wrote:



On 2020/12/24 23:30, Bharath Rupireddy wrote:

On Thu, Dec 24, 2020 at 7:43 PM Fujii Masao  wrote:

Even when we're in the midst of transaction, if that transaction has not used
the cached connections yet, we close them immediately. So, to make the
comment more precise, what about updating the comment as follows?

-
  After a change to a pg_foreign_server or pg_user_mapping catalog entry,
  close connections depending on that entry immediately if current
  transaction has not used those connections yet. Otherwise, mark those
  connections as invalid and then make pgfdw_xact_callback() close them
  at the end of current transaction, since they cannot be closed in the 
midst
  of a transaction using them. Closed connections will be remade at the next
  opportunity if necessary.
-


Done.


+   /*
+    * Close the connection if it's not in midst of a xact. 
Otherwise
+    * mark it as invalid so that it can be disconnected at 
the end of
+    * main xact in pgfdw_xact_callback().
+    */

Because of the same reason as the above, what about updating this comment
as follows?

-
  Close the connection immediately if it's not used yet in this transaction.
  Otherwise mark it as invalid so that pgfdw_xact_callback() can close it
  at the end of this transaction.
-


Done.

Attaching v3 patch. Please have a look. Thanks!


Thanks a lot! Barring any objection, I will commit this version.


Pushed. Thanks!

Regards,


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-24 Thread Fujii Masao




On 2020/12/24 23:30, Bharath Rupireddy wrote:

On Thu, Dec 24, 2020 at 7:43 PM Fujii Masao  wrote:

Even when we're in the midst of transaction, if that transaction has not used
the cached connections yet, we close them immediately. So, to make the
comment more precise, what about updating the comment as follows?

-
  After a change to a pg_foreign_server or pg_user_mapping catalog entry,
  close connections depending on that entry immediately if current
  transaction has not used those connections yet. Otherwise, mark those
  connections as invalid and then make pgfdw_xact_callback() close them
  at the end of current transaction, since they cannot be closed in the 
midst
  of a transaction using them. Closed connections will be remade at the next
  opportunity if necessary.
-


Done.


+   /*
+* Close the connection if it's not in midst of a xact. 
Otherwise
+* mark it as invalid so that it can be disconnected at 
the end of
+* main xact in pgfdw_xact_callback().
+*/

Because of the same reason as the above, what about updating this comment
as follows?

-
  Close the connection immediately if it's not used yet in this transaction.
  Otherwise mark it as invalid so that pgfdw_xact_callback() can close it
  at the end of this transaction.
-


Done.

Attaching v3 patch. Please have a look. Thanks!


Thanks a lot! Barring any objection, I will commit this version.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-24 Thread Bharath Rupireddy
On Thu, Dec 24, 2020 at 7:43 PM Fujii Masao  wrote:
> Even when we're in the midst of transaction, if that transaction has not used
> the cached connections yet, we close them immediately. So, to make the
> comment more precise, what about updating the comment as follows?
>
> -
>  After a change to a pg_foreign_server or pg_user_mapping catalog entry,
>  close connections depending on that entry immediately if current
>  transaction has not used those connections yet. Otherwise, mark those
>  connections as invalid and then make pgfdw_xact_callback() close them
>  at the end of current transaction, since they cannot be closed in the 
> midst
>  of a transaction using them. Closed connections will be remade at the 
> next
>  opportunity if necessary.
> -

Done.

> +   /*
> +* Close the connection if it's not in midst of a 
> xact. Otherwise
> +* mark it as invalid so that it can be disconnected 
> at the end of
> +* main xact in pgfdw_xact_callback().
> +*/
>
> Because of the same reason as the above, what about updating this comment
> as follows?
>
> -
>  Close the connection immediately if it's not used yet in this 
> transaction.
>  Otherwise mark it as invalid so that pgfdw_xact_callback() can close it
>  at the end of this transaction.
> -

Done.

Attaching v3 patch. Please have a look. Thanks!

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v3-0001-postgres_fdw-cached-connection-leaks-if-the-assoc.patch
Description: Binary data


Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-24 Thread Fujii Masao




On 2020/12/24 15:42, Bharath Rupireddy wrote:

On Thu, Dec 24, 2020 at 7:21 AM Fujii Masao  wrote:

On 2020/12/23 23:40, Bharath Rupireddy wrote:

On Wed, Dec 23, 2020 at 7:31 PM Fujii Masao  wrote:

I agree to make pgfdw_xact_callback() close the connection when
entry->invalidated == true. But I think that it's better to get rid of
have_invalid_connections flag and make pgfdw_inval_callback() close
the connection immediately if entry->xact_depth == 0, to avoid unnecessary
scan of the hashtable during COMMIT of transaction not accessing to
foreign servers. Attached is the POC patch that I'm thinking. Thought?


We could do that way as well. It seems okay to me. Now the disconnect
code is spread in pgfdw_inval_callback() and pgfdw_xact_callback().
There's also less burden on pgfdw_xact_callback() to close a lot of
connections on a single commit. The behaviour is like this - If
entry->xact_depth == 0, then the entries wouldn't have got any
connection in the current txn so they can be immediately closed in
pgfdw_inval_callback() and pgfdw_xact_callback() can exit immediately
as xact_got_connection is false. If entry->xact_depth > 0 which means
that probably pgfdw_inval_callback() came from a sub txn, we would
have got a connection in the txn i.e. xact_got_connection becomes true
due to which it can get invalidated in pgfdw_inval_callback() and get
closed in pgfdw_xact_callback() at the end of the txn.

And there's no chance of entry->xact_depth > 0 and xact_got_connection false.

I think we need to change the comment before pgfdw_inval_callback() a bit:

   * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
   * mark connections depending on that entry as needing to be remade.
   * We can't immediately destroy them, since they might be in the midst of
   * a transaction, but we'll remake them at the next opportunity.

to

   * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
*  try to close the cached connections associated with them if they
are not in the
*  midst of a transaction otherwise mark them as invalidated. We will
destroy the
   * invalidated connections in pgfdw_xact_callback() at the end of the main 
xact.
   * Closed connections will be remade at the next opportunity.


Yes, I agree that we need to update that comment.



Any thoughts on the earlier point in [1] related to a test case(which
becomes unnecessary with this patch) coverage?



ISTM that we can leave that test as it is because it's still useful to test
the case where the cached connection is discarded in pgfdw_inval_callback().
Thought?


Yes, that test case covers the code this patch adds i.e. closing the
connection in pgfdw_inval_callback() while committing alter foreign
server stmt.


By applying the patch, probably we can get rid of the code to discard
the invalidated cached connection in GetConnection(). But at least for
the back branches, I'd like to leave the code as it is so that we can make
sure that the invalidated cached connection doesn't exist when getting
new connection. Maybe we can improve that in the master, but I'm not
sure if it's really worth doing that against the gain. Thought?


+1 to keep that code as is even after this patch is applied(at least
it works as an assertion that we don't have any leftover invalid
connections). I'm not quite sure, we may need that in some cases, say
if we don't hit disconnect_pg_server() code in pgfdw_xact_callback()
and pgfdw_inval_callback() due to some errors in between. I can not
think of an exact use case though.

Attaching v2 patch that adds the comments and the other test case that
covers disconnecting code in pgfdw_xact_callback. Please review it.


Thanks for updating the patch! It basically looks good to me except
the following minor things.

+ * After a change to a pg_foreign_server or pg_user_mapping catalog entry, try
+ * to close the cached connections associated with them if they are not in the
+ * midst of a transaction otherwise mark them as invalid. We will destroy the
+ * invalidated connections in pgfdw_xact_callback() at the end of the main
+ * xact. Closed connections will be remade at the next opportunity.

Even when we're in the midst of transaction, if that transaction has not used
the cached connections yet, we close them immediately. So, to make the
comment more precise, what about updating the comment as follows?

-
After a change to a pg_foreign_server or pg_user_mapping catalog entry,
close connections depending on that entry immediately if current
transaction has not used those connections yet. Otherwise, mark those
connections as invalid and then make pgfdw_xact_callback() close them
at the end of current transaction, since they cannot be closed in the midst
of a transaction using them. Closed connections will be remade at the next
opportunity if necessary.
-

+   /*
+* Close the connection 

Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-23 Thread Bharath Rupireddy
On Thu, Dec 24, 2020 at 7:21 AM Fujii Masao  wrote:
> On 2020/12/23 23:40, Bharath Rupireddy wrote:
> > On Wed, Dec 23, 2020 at 7:31 PM Fujii Masao  
> > wrote:
> >> I agree to make pgfdw_xact_callback() close the connection when
> >> entry->invalidated == true. But I think that it's better to get rid of
> >> have_invalid_connections flag and make pgfdw_inval_callback() close
> >> the connection immediately if entry->xact_depth == 0, to avoid unnecessary
> >> scan of the hashtable during COMMIT of transaction not accessing to
> >> foreign servers. Attached is the POC patch that I'm thinking. Thought?
> >
> > We could do that way as well. It seems okay to me. Now the disconnect
> > code is spread in pgfdw_inval_callback() and pgfdw_xact_callback().
> > There's also less burden on pgfdw_xact_callback() to close a lot of
> > connections on a single commit. The behaviour is like this - If
> > entry->xact_depth == 0, then the entries wouldn't have got any
> > connection in the current txn so they can be immediately closed in
> > pgfdw_inval_callback() and pgfdw_xact_callback() can exit immediately
> > as xact_got_connection is false. If entry->xact_depth > 0 which means
> > that probably pgfdw_inval_callback() came from a sub txn, we would
> > have got a connection in the txn i.e. xact_got_connection becomes true
> > due to which it can get invalidated in pgfdw_inval_callback() and get
> > closed in pgfdw_xact_callback() at the end of the txn.
> >
> > And there's no chance of entry->xact_depth > 0 and xact_got_connection 
> > false.
> >
> > I think we need to change the comment before pgfdw_inval_callback() a bit:
> >
> >   * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
> >   * mark connections depending on that entry as needing to be remade.
> >   * We can't immediately destroy them, since they might be in the midst of
> >   * a transaction, but we'll remake them at the next opportunity.
> >
> > to
> >
> >   * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
> > *  try to close the cached connections associated with them if they
> > are not in the
> > *  midst of a transaction otherwise mark them as invalidated. We will
> > destroy the
> >   * invalidated connections in pgfdw_xact_callback() at the end of the main 
> > xact.
> >   * Closed connections will be remade at the next opportunity.
>
> Yes, I agree that we need to update that comment.
>
> >
> > Any thoughts on the earlier point in [1] related to a test case(which
> > becomes unnecessary with this patch) coverage?
> >
>
> ISTM that we can leave that test as it is because it's still useful to test
> the case where the cached connection is discarded in pgfdw_inval_callback().
> Thought?

Yes, that test case covers the code this patch adds i.e. closing the
connection in pgfdw_inval_callback() while committing alter foreign
server stmt.

> By applying the patch, probably we can get rid of the code to discard
> the invalidated cached connection in GetConnection(). But at least for
> the back branches, I'd like to leave the code as it is so that we can make
> sure that the invalidated cached connection doesn't exist when getting
> new connection. Maybe we can improve that in the master, but I'm not
> sure if it's really worth doing that against the gain. Thought?

+1 to keep that code as is even after this patch is applied(at least
it works as an assertion that we don't have any leftover invalid
connections). I'm not quite sure, we may need that in some cases, say
if we don't hit disconnect_pg_server() code in pgfdw_xact_callback()
and pgfdw_inval_callback() due to some errors in between. I can not
think of an exact use case though.

Attaching v2 patch that adds the comments and the other test case that
covers disconnecting code in pgfdw_xact_callback. Please review it.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From 20f605a7586a6d6bc3cb671092c5c1e08c383c4f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 24 Dec 2020 11:53:30 +0530
Subject: [PATCH v2] postgres_fdw - cached connection leaks if the associated
 user mapping is dropped

In postgres_fdw the cached connections to remote servers can stay
until the lifetime of the local session, if the underlying user
mapping is dropped in another session.

To solve the above connection leak problem, it looks like the right
place to close the invalid connections is either in pgfdw_inval_callback()
if they are not in midst any xact, or in pgfdw_xact_callback(), which gets
called at the end of every act once registered, in the
current session(by then all the sub xacts also would have been finished).
Note that if there are too many invalidated entries, then the
following xact has to bear running this extra disconnect code, but
that's okay than having leaked connections.
---
 contrib/postgres_fdw/connection.c | 32 ++-
 .../postgres_fdw/expected/postgres_fdw.out| 18 +++
 

Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-23 Thread Fujii Masao




On 2020/12/23 23:40, Bharath Rupireddy wrote:

On Wed, Dec 23, 2020 at 7:31 PM Fujii Masao  wrote:

I agree to make pgfdw_xact_callback() close the connection when
entry->invalidated == true. But I think that it's better to get rid of
have_invalid_connections flag and make pgfdw_inval_callback() close
the connection immediately if entry->xact_depth == 0, to avoid unnecessary
scan of the hashtable during COMMIT of transaction not accessing to
foreign servers. Attached is the POC patch that I'm thinking. Thought?


We could do that way as well. It seems okay to me. Now the disconnect
code is spread in pgfdw_inval_callback() and pgfdw_xact_callback().
There's also less burden on pgfdw_xact_callback() to close a lot of
connections on a single commit. The behaviour is like this - If
entry->xact_depth == 0, then the entries wouldn't have got any
connection in the current txn so they can be immediately closed in
pgfdw_inval_callback() and pgfdw_xact_callback() can exit immediately
as xact_got_connection is false. If entry->xact_depth > 0 which means
that probably pgfdw_inval_callback() came from a sub txn, we would
have got a connection in the txn i.e. xact_got_connection becomes true
due to which it can get invalidated in pgfdw_inval_callback() and get
closed in pgfdw_xact_callback() at the end of the txn.

And there's no chance of entry->xact_depth > 0 and xact_got_connection false.

I think we need to change the comment before pgfdw_inval_callback() a bit:

  * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
  * mark connections depending on that entry as needing to be remade.
  * We can't immediately destroy them, since they might be in the midst of
  * a transaction, but we'll remake them at the next opportunity.

to

  * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
*  try to close the cached connections associated with them if they
are not in the
*  midst of a transaction otherwise mark them as invalidated. We will
destroy the
  * invalidated connections in pgfdw_xact_callback() at the end of the main 
xact.
  * Closed connections will be remade at the next opportunity.


Yes, I agree that we need to update that comment.



Any thoughts on the earlier point in [1] related to a test case(which
becomes unnecessary with this patch) coverage?



ISTM that we can leave that test as it is because it's still useful to test
the case where the cached connection is discarded in pgfdw_inval_callback().
Thought?

By applying the patch, probably we can get rid of the code to discard
the invalidated cached connection in GetConnection(). But at least for
the back branches, I'd like to leave the code as it is so that we can make
sure that the invalidated cached connection doesn't exist when getting
new connection. Maybe we can improve that in the master, but I'm not
sure if it's really worth doing that against the gain. Thought?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-23 Thread Bharath Rupireddy
On Wed, Dec 23, 2020 at 7:31 PM Fujii Masao  wrote:
> I agree to make pgfdw_xact_callback() close the connection when
> entry->invalidated == true. But I think that it's better to get rid of
> have_invalid_connections flag and make pgfdw_inval_callback() close
> the connection immediately if entry->xact_depth == 0, to avoid unnecessary
> scan of the hashtable during COMMIT of transaction not accessing to
> foreign servers. Attached is the POC patch that I'm thinking. Thought?

We could do that way as well. It seems okay to me. Now the disconnect
code is spread in pgfdw_inval_callback() and pgfdw_xact_callback().
There's also less burden on pgfdw_xact_callback() to close a lot of
connections on a single commit. The behaviour is like this - If
entry->xact_depth == 0, then the entries wouldn't have got any
connection in the current txn so they can be immediately closed in
pgfdw_inval_callback() and pgfdw_xact_callback() can exit immediately
as xact_got_connection is false. If entry->xact_depth > 0 which means
that probably pgfdw_inval_callback() came from a sub txn, we would
have got a connection in the txn i.e. xact_got_connection becomes true
due to which it can get invalidated in pgfdw_inval_callback() and get
closed in pgfdw_xact_callback() at the end of the txn.

And there's no chance of entry->xact_depth > 0 and xact_got_connection false.

I think we need to change the comment before pgfdw_inval_callback() a bit:

 * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
 * mark connections depending on that entry as needing to be remade.
 * We can't immediately destroy them, since they might be in the midst of
 * a transaction, but we'll remake them at the next opportunity.

to

 * After a change to a pg_foreign_server or pg_user_mapping catalog entry,
*  try to close the cached connections associated with them if they
are not in the
*  midst of a transaction otherwise mark them as invalidated. We will
destroy the
 * invalidated connections in pgfdw_xact_callback() at the end of the main xact.
 * Closed connections will be remade at the next opportunity.

Any thoughts on the earlier point in [1] related to a test case(which
becomes unnecessary with this patch) coverage?

[1] - 
https://www.postgresql.org/message-id/CALj2ACXymb%3Dd4KeOq%2Bjnh_E6yThn%2Bcf1CDRhtK_crkj0_hVimQ%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-23 Thread Fujii Masao



On 2020/12/15 11:38, Bharath Rupireddy wrote:

Hi,

As discussed in [1], in postgres_fdw the cached connections to remote
servers can stay until the lifetime of the local session without
getting a chance to disconnect (connection leak), if the underlying
user mapping or foreign server is dropped in another session. Here are
few scenarios how this can happen:

Use case 1:
1) Run a foreign query in session 1 with server 1 and user mapping 1
2) Drop user mapping 1 in another session 2, an invalidation message
gets generated which will have to be processed by all the sessions
3) Run the foreign query again in session 1, at the start of txn the
cached entry gets invalidated via pgfdw_inval_callback() (as part of
invalidation message processing). Whatever may be the type of foreign
query (select, update, explain, delete, insert, analyze etc.), upon
next call to GetUserMapping() from postgres_fdw.c, the cache lookup
fails(with ERROR:  user mapping not found for "") since the user
mapping 1 has been dropped in session 2 and the query will also fail
before reaching GetConnection() where the connections associated with
the invalidated entries would have got disconnected.

So, the connection associated with invalidated entry would remain
until the local session exits.

Use case 2:
1) Run a foreign query in session 1 with server 1 and user mapping 1
2) Try to drop foreign server 1, then we would not be allowed because
of dependency. Use CASCADE so that dependent objects i.e. user mapping
1 and foreign tables get dropped along with foreign server 1.
3) Run the foreign query again in session 1, at the start of txn, the
cached entry gets invalidated via pgfdw_inval_callback() and the query
fails because there is no foreign table.

Note that the remote connection remains open in session 1 until the
local session exits.

To solve the above connection leak problem, it looks like the right
place to close all the invalid connections is pgfdw_xact_callback(),
once registered, which gets called at the end of every txn in the
current session(by then all the sub txns also would have been
finished). Note that if there are too many invalidated entries, then
the following txn has to close all of them, but that's okay than
having leaked connections and it's a one time job for the following
one txn.

Attaching a patch for the same.

Thoughts?


Thanks for making the patch!

I agree to make pgfdw_xact_callback() close the connection when
entry->invalidated == true. But I think that it's better to get rid of
have_invalid_connections flag and make pgfdw_inval_callback() close
the connection immediately if entry->xact_depth == 0, to avoid unnecessary
scan of the hashtable during COMMIT of transaction not accessing to
foreign servers. Attached is the POC patch that I'm thinking. Thought?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/contrib/postgres_fdw/connection.c 
b/contrib/postgres_fdw/connection.c
index 66581e5414..dcd14fa4b5 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -940,12 +940,14 @@ pgfdw_xact_callback(XactEvent event, void *arg)
entry->xact_depth = 0;
 
/*
-* If the connection isn't in a good idle state, discard it to
-* recover. Next GetConnection will open a new connection.
+* If the connection isn't in a good idle state or it is marked 
as
+* invalid, then discard it to recover. Next GetConnection will 
open a
+* new connection.
 */
if (PQstatus(entry->conn) != CONNECTION_OK ||
PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
-   entry->changing_xact_state)
+   entry->changing_xact_state ||
+   entry->invalidated)
{
elog(DEBUG3, "discarding connection %p", entry->conn);
disconnect_pg_server(entry);
@@ -1102,7 +1104,15 @@ pgfdw_inval_callback(Datum arg, int cacheid, uint32 
hashvalue)
 entry->server_hashvalue == hashvalue) ||
(cacheid == USERMAPPINGOID &&
 entry->mapping_hashvalue == hashvalue))
-   entry->invalidated = true;
+   {
+   if (entry->xact_depth == 0)
+   {
+   elog(DEBUG3, "discarding connection %p", 
entry->conn);
+   disconnect_pg_server(entry);
+   }
+   else
+   entry->invalidated = true;
+   }
}
 }
 


Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-22 Thread Bharath Rupireddy
On Fri, Dec 18, 2020 at 6:46 PM Bharath Rupireddy
 wrote:
> On Fri, Dec 18, 2020 at 6:39 PM Bharath Rupireddy
>  wrote:
> >
> > On Fri, Dec 18, 2020 at 5:06 PM Hou, Zhijie  
> > wrote:
> > > I have an issue about the existing testcase.
> > >
> > > """
> > > -- Test that alteration of server options causes reconnection
> > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work
> > > ALTER SERVER loopback OPTIONS (SET dbname 'no such database');
> > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should fail
> > > DO $d$
> > > BEGIN
> > > EXECUTE $$ALTER SERVER loopback
> > > OPTIONS (SET dbname '$$||current_database()||$$')$$;
> > > END;
> > > $d$;
> > > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work again
> > > """
> > >
> > > IMO, the above case is designed to test the following code[1]:
> > > With the patch, it seems the following code[1] will not work for this 
> > > case, right?
> > > (It seems the connection will be disconnect in pgfdw_xact_callback)
> > >
> > > I do not know does it matter, or should we add a testcase to cover that?
> > >
> > > [1] /*
> > >  * If the connection needs to be remade due to invalidation, 
> > > disconnect as
> > >  * soon as we're out of all transactions.
> > >  */
> > > if (entry->conn != NULL && entry->invalidated && 
> > > entry->xact_depth == 0)
> > > {
> > > elog(DEBUG3, "closing connection %p for option changes to 
> > > take effect",
> > >  entry->conn);
> > > disconnect_pg_server(entry);
> > > }
> >
> > Yes you are right. With the patch if the server is altered in the same
> > session in which the connection exists, the connection gets closed at
> > the end of that alter query txn, not at the beginning of the next
> > foreign tbl query. So, that part of the code in GetConnection()
> > doesn't get hit. Having said that, this code gets hit when the alter
> > query is run in another session and the connection in the current
> > session gets disconnected at the start of the next foreign tbl query.
> >
> > If we want to cover it with a test case, we might have to alter the
> > foreign server in another session. I'm not sure whether we can open
> > another session from the current psql session and run a sql command.

I further checked on how we can add/move the test case( that is
altering server options in a different session and see if the
connection gets disconnected at the start of the next foreign query in
the current session ) to cover the above code. Upon some initial
analysis, it looks like it is possible to add that under
src/test/isolation tests. Another way could be to add it using the TAP
framework under contrib/postgres_fdw. Having said that, currently
these two places don't have any postgres_fdw related tests, we will be
the first ones to add.

I'm not quite sure whether that's okay or is there any better way of
doing it. Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-18 Thread Bharath Rupireddy
On Fri, Dec 18, 2020 at 6:39 PM Bharath Rupireddy
 wrote:
>
> On Fri, Dec 18, 2020 at 5:06 PM Hou, Zhijie  wrote:
> > I have an issue about the existing testcase.
> >
> > """
> > -- Test that alteration of server options causes reconnection
> > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work
> > ALTER SERVER loopback OPTIONS (SET dbname 'no such database');
> > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should fail
> > DO $d$
> > BEGIN
> > EXECUTE $$ALTER SERVER loopback
> > OPTIONS (SET dbname '$$||current_database()||$$')$$;
> > END;
> > $d$;
> > SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work again
> > """
> >
> > IMO, the above case is designed to test the following code[1]:
> > With the patch, it seems the following code[1] will not work for this case, 
> > right?
> > (It seems the connection will be disconnect in pgfdw_xact_callback)
> >
> > I do not know does it matter, or should we add a testcase to cover that?
> >
> > [1] /*
> >  * If the connection needs to be remade due to invalidation, 
> > disconnect as
> >  * soon as we're out of all transactions.
> >  */
> > if (entry->conn != NULL && entry->invalidated && entry->xact_depth 
> > == 0)
> > {
> > elog(DEBUG3, "closing connection %p for option changes to 
> > take effect",
> >  entry->conn);
> > disconnect_pg_server(entry);
> > }
>
> Yes you are right. With the patch if the server is altered in the same
> session in which the connection exists, the connection gets closed at
> the end of that alter query txn, not at the beginning of the next
> foreign tbl query. So, that part of the code in GetConnection()
> doesn't get hit. Having said that, this code gets hit when the alter
> query is run in another session and the connection in the current
> session gets disconnected at the start of the next foreign tbl query.
>
> If we want to cover it with a test case, we might have to alter the
> foreign server in another session. I'm not sure whether we can open
> another session from the current psql session and run a sql command.
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com




RE: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-18 Thread Hou, Zhijie
Hi

I have an issue about the existing testcase.

"""
-- Test that alteration of server options causes reconnection SELECT c3, c4 
FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work ALTER SERVER loopback OPTIONS 
(SET dbname 'no such database'); SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 
1;  -- should fail DO $d$
BEGIN
EXECUTE $$ALTER SERVER loopback
OPTIONS (SET dbname '$$||current_database()||$$')$$;
END;
$d$;
SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;  -- should work again """

IMO, the above case is designed to test the following code[1]:
With the patch, it seems the following code[1] will not work for this case, 
right?
(It seems the connection will be disconnect in pgfdw_xact_callback)

I do not know does it matter, or should we add a testcase to cover that?

[1] /*
 * If the connection needs to be remade due to invalidation, disconnect 
as
 * soon as we're out of all transactions.
 */
if (entry->conn != NULL && entry->invalidated && entry->xact_depth == 0)
{
elog(DEBUG3, "closing connection %p for option changes to take 
effect",
 entry->conn);
disconnect_pg_server(entry);
}


Best regards,
houzj




Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-14 Thread Bharath Rupireddy
On Tue, Dec 15, 2020 at 8:25 AM Zhihong Yu  wrote:
> Is the following sequence possible ?
> In pgfdw_inval_callback():
>
> entry->invalidated = true;
> +   have_invalid_connections = true;
>
> At which time the loop in pgfdw_xact_callback() is already running (but past 
> the above entry).
> Then:
>
> +   /* We are done closing all the invalidated connections so reset. */
> +   have_invalid_connections = false;
>
> At which time, there is still at least one invalid connection but the global 
> flag is off.

It's not possible, as this backend specific code doesn't run in
multiple threads. We can not have pgfdw_inval_callback() and
pgfdw_xact_callback() running at the same time, so we are safe there.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-14 Thread Zhihong Yu
Is the following sequence possible ?
In pgfdw_inval_callback():

entry->invalidated = true;
+   have_invalid_connections = true;

At which time the loop in pgfdw_xact_callback() is already running (but
past the above entry).
Then:

+   /* We are done closing all the invalidated connections so reset. */
+   have_invalid_connections = false;

At which time, there is still at least one invalid connection but the
global flag is off.

On Mon, Dec 14, 2020 at 6:39 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> Hi,
>
> As discussed in [1], in postgres_fdw the cached connections to remote
> servers can stay until the lifetime of the local session without
> getting a chance to disconnect (connection leak), if the underlying
> user mapping or foreign server is dropped in another session. Here are
> few scenarios how this can happen:
>
> Use case 1:
> 1) Run a foreign query in session 1 with server 1 and user mapping 1
> 2) Drop user mapping 1 in another session 2, an invalidation message
> gets generated which will have to be processed by all the sessions
> 3) Run the foreign query again in session 1, at the start of txn the
> cached entry gets invalidated via pgfdw_inval_callback() (as part of
> invalidation message processing). Whatever may be the type of foreign
> query (select, update, explain, delete, insert, analyze etc.), upon
> next call to GetUserMapping() from postgres_fdw.c, the cache lookup
> fails(with ERROR:  user mapping not found for "") since the user
> mapping 1 has been dropped in session 2 and the query will also fail
> before reaching GetConnection() where the connections associated with
> the invalidated entries would have got disconnected.
>
> So, the connection associated with invalidated entry would remain
> until the local session exits.
>
> Use case 2:
> 1) Run a foreign query in session 1 with server 1 and user mapping 1
> 2) Try to drop foreign server 1, then we would not be allowed because
> of dependency. Use CASCADE so that dependent objects i.e. user mapping
> 1 and foreign tables get dropped along with foreign server 1.
> 3) Run the foreign query again in session 1, at the start of txn, the
> cached entry gets invalidated via pgfdw_inval_callback() and the query
> fails because there is no foreign table.
>
> Note that the remote connection remains open in session 1 until the
> local session exits.
>
> To solve the above connection leak problem, it looks like the right
> place to close all the invalid connections is pgfdw_xact_callback(),
> once registered, which gets called at the end of every txn in the
> current session(by then all the sub txns also would have been
> finished). Note that if there are too many invalidated entries, then
> the following txn has to close all of them, but that's okay than
> having leaked connections and it's a one time job for the following
> one txn.
>
> Attaching a patch for the same.
>
> Thoughts?
>
> [1] -
> https://www.postgresql.org/message-id/flat/CALj2ACUOQYs%2BssxkxRvZ6Ja5%2Bsfc6a-s_0e-Nv2A591hEyOgiw%40mail.gmail.com
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com
>


postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped

2020-12-14 Thread Bharath Rupireddy
Hi,

As discussed in [1], in postgres_fdw the cached connections to remote
servers can stay until the lifetime of the local session without
getting a chance to disconnect (connection leak), if the underlying
user mapping or foreign server is dropped in another session. Here are
few scenarios how this can happen:

Use case 1:
1) Run a foreign query in session 1 with server 1 and user mapping 1
2) Drop user mapping 1 in another session 2, an invalidation message
gets generated which will have to be processed by all the sessions
3) Run the foreign query again in session 1, at the start of txn the
cached entry gets invalidated via pgfdw_inval_callback() (as part of
invalidation message processing). Whatever may be the type of foreign
query (select, update, explain, delete, insert, analyze etc.), upon
next call to GetUserMapping() from postgres_fdw.c, the cache lookup
fails(with ERROR:  user mapping not found for "") since the user
mapping 1 has been dropped in session 2 and the query will also fail
before reaching GetConnection() where the connections associated with
the invalidated entries would have got disconnected.

So, the connection associated with invalidated entry would remain
until the local session exits.

Use case 2:
1) Run a foreign query in session 1 with server 1 and user mapping 1
2) Try to drop foreign server 1, then we would not be allowed because
of dependency. Use CASCADE so that dependent objects i.e. user mapping
1 and foreign tables get dropped along with foreign server 1.
3) Run the foreign query again in session 1, at the start of txn, the
cached entry gets invalidated via pgfdw_inval_callback() and the query
fails because there is no foreign table.

Note that the remote connection remains open in session 1 until the
local session exits.

To solve the above connection leak problem, it looks like the right
place to close all the invalid connections is pgfdw_xact_callback(),
once registered, which gets called at the end of every txn in the
current session(by then all the sub txns also would have been
finished). Note that if there are too many invalidated entries, then
the following txn has to close all of them, but that's okay than
having leaked connections and it's a one time job for the following
one txn.

Attaching a patch for the same.

Thoughts?

[1] - 
https://www.postgresql.org/message-id/flat/CALj2ACUOQYs%2BssxkxRvZ6Ja5%2Bsfc6a-s_0e-Nv2A591hEyOgiw%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From 76ae4aad5c78ace5ab348c024e724669057e755f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 14 Dec 2020 22:39:46 +0530
Subject: [PATCH v1] postgres_fdw - cached connection leaks if the associated
 user mapping is dropped

In postgres_fdw the cached connections to remote servers can stay
until the lifetime of the local session, if the underlying user
mapping is dropped in another session.

To solve the above connection leak problem, it looks like the right
place to close all the invalid connections is pgfdw_xact_callback(),
once registered, which gets called at the end of every txn in the
current session(by then all the sub txns also would have been finished).
Note that if there are too many invalidated entries, then the
following txn has to bear running this extra disconnect code, but
that's okay than having leaked connections.
---
 contrib/postgres_fdw/connection.c | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index ab3226287d..e7ec914e48 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -73,6 +73,12 @@ static unsigned int prep_stmt_number = 0;
 /* tracks whether any work is needed in callback functions */
 static bool xact_got_connection = false;
 
+/*
+ * Tracks whether there exists at least one invalid connection in the
+ * connection cache.
+ */
+static bool have_invalid_connections = false;
+
 /* prototypes of private functions */
 static void make_new_connection(ConnCacheEntry *entry, UserMapping *user);
 static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
@@ -788,8 +794,15 @@ pgfdw_xact_callback(XactEvent event, void *arg)
 	HASH_SEQ_STATUS scan;
 	ConnCacheEntry *entry;
 
-	/* Quick exit if no connections were touched in this transaction. */
-	if (!xact_got_connection)
+	/*
+	 * Quick exit if no connections were touched in this transaction and there
+	 * are no invalid connections in the cache. If there are any invalid
+	 * connections exists in the cache, this is the safe place to close all of
+	 * them so that they will not be leaked in case underlying user mappings or
+	 * foreign servers are dropped in other sessions because of which they
+	 * may not have a chance to get disconnected in GetConnection.
+	 */
+	if (!xact_got_connection && !have_invalid_connections)
 		return;
 
 	/*
@@ -943,12 +956,14 @@