Bharath Rupireddy <[email protected]> writes:
> On Tue, May 4, 2021 at 4:12 AM Tom Lane <[email protected]> wrote:
>> The buildfarm is showing that one of these test queries is not stable
>> under CLOBBER_CACHE_ALWAYS:
> I can reproduce the issue with the failing case. Issue is that the
> backend pid will be null in the pg_stat_activity because of the cache
> invalidation that happens at the beginning of the query and hence
> pg_terminate_backend returns null on null input.
No, that's nonsense: if it were happening that way, the query would
return one row with a NULL result, but actually it's returning no
rows. What's actually happening, it seems, is that because
pgfdw_inval_callback is constantly getting called due to cache
flushes, we invariably drop remote connections immediately during
transaction termination (cf pgfdw_xact_callback). Thus, by the time
we inspect pg_stat_activity, there is no remote session to terminate.
I don't like your patch because what it effectively does is mask
whether termination happened or not; if there were a bug there
causing that not to happen, the test would still appear to pass.
I think the most expedient fix, if we want to keep this test, is
just to transiently disable debug_invalidate_system_caches_always.
(That option wasn't available before v14, but fortunately we
don't need a fix for the back branches.)
I believe the attached will do the trick, but I'm running the test
with debug_invalidate_system_caches_always turned on to verify
that. Should be done in an hour or so...
regards, tom lane
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 8e1cc69508..6f533c745d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9204,6 +9204,12 @@ WARNING: there is no transaction in progress
-- Change application_name of remote connection to special one
-- so that we can easily terminate the connection later.
ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check');
+-- If debug_invalidate_system_caches_always is active, it results in
+-- dropping remote connections after every transaction, making it
+-- impossible to test termination meaningfully. So turn that off
+-- for this test.
+SET debug_invalidate_system_caches_always = 0;
+-- Make sure we have a remote connection.
SELECT 1 FROM ft1 LIMIT 1;
?column?
----------
@@ -9227,9 +9233,8 @@ SELECT 1 FROM ft1 LIMIT 1;
1
(1 row)
--- If the query detects the broken connection when starting new remote
--- subtransaction, it doesn't reestablish new connection and should fail.
--- The text of the error might vary across platforms, so don't show it.
+-- If we detect the broken connection when starting a new remote
+-- subtransaction, we should fail instead of establishing a new connection.
-- Terminate the remote connection and wait for the termination to complete.
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
WHERE application_name = 'fdw_retry_check';
@@ -9239,11 +9244,13 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
(1 row)
SAVEPOINT s;
+-- The text of the error might vary across platforms, so only show SQLSTATE.
\set VERBOSITY sqlstate
SELECT 1 FROM ft1 LIMIT 1; -- should fail
ERROR: 08006
\set VERBOSITY default
COMMIT;
+RESET debug_invalidate_system_caches_always;
-- =============================================================================
-- test connection invalidation cases and postgres_fdw_get_connections function
-- =============================================================================
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index dcd36a9753..000e2534fc 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -2795,6 +2795,14 @@ ROLLBACK;
-- Change application_name of remote connection to special one
-- so that we can easily terminate the connection later.
ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check');
+
+-- If debug_invalidate_system_caches_always is active, it results in
+-- dropping remote connections after every transaction, making it
+-- impossible to test termination meaningfully. So turn that off
+-- for this test.
+SET debug_invalidate_system_caches_always = 0;
+
+-- Make sure we have a remote connection.
SELECT 1 FROM ft1 LIMIT 1;
-- Terminate the remote connection and wait for the termination to complete.
@@ -2806,18 +2814,20 @@ SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
BEGIN;
SELECT 1 FROM ft1 LIMIT 1;
--- If the query detects the broken connection when starting new remote
--- subtransaction, it doesn't reestablish new connection and should fail.
--- The text of the error might vary across platforms, so don't show it.
+-- If we detect the broken connection when starting a new remote
+-- subtransaction, we should fail instead of establishing a new connection.
-- Terminate the remote connection and wait for the termination to complete.
SELECT pg_terminate_backend(pid, 180000) FROM pg_stat_activity
WHERE application_name = 'fdw_retry_check';
SAVEPOINT s;
+-- The text of the error might vary across platforms, so only show SQLSTATE.
\set VERBOSITY sqlstate
SELECT 1 FROM ft1 LIMIT 1; -- should fail
\set VERBOSITY default
COMMIT;
+RESET debug_invalidate_system_caches_always;
+
-- =============================================================================
-- test connection invalidation cases and postgres_fdw_get_connections function
-- =============================================================================