pgsql: Fix the unstable output of tests added by commit 8fccf75834.
Fix the unstable output of tests added by commit 8fccf75834.

The test cases added by that commit were trying to test the exact number
of times a particular transaction has spilled. However, that number can
vary if any background transaction (say by autovacuum) happens in parallel
to the main transaction. So let's not try to verify the exact count.

Author: Amit Kapila
Reviewed-by: Sawada Masahiko
Discussion: https://postgr.es/m/ca+fd4k5_ppayrtdro2pbttoe0ehqpbvuqmcr8ic39utnmr4...@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/2050832d0d637358a376a99514071941aa93ed31

Modified Files
--------------
contrib/test_decoding/expected/stats.out | 20 +++-
contrib/test_decoding/sql/stats.sql      | 12 +++-
2 files changed, 18 insertions(+), 14 deletions(-)
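For illustration, a hedged SQL sketch of the stabilized style of check: assert only that spilling happened at all, not how many times. This assumes the spill counters are exposed through the pg_stat_replication_slots view and uses a placeholder slot name; it is not the committed test.

-- Sketch only: a background transaction (e.g. autovacuum) can bump
-- spill_count, so test "spilled at all" rather than an exact value.
SELECT slot_name,
       spill_txns > 0  AS spilled_txns,
       spill_count > 0 AS spilled_count
FROM pg_stat_replication_slots
WHERE slot_name = 'regression_slot';   -- slot name assumed for illustration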
pgsql: Create ResultRelInfos later in InitPlan, index them by RT index.
Create ResultRelInfos later in InitPlan, index them by RT index.

Instead of allocating all the ResultRelInfos upfront in one big array,
allocate them in ExecInitModifyTable().  es_result_relations is now an
array of ResultRelInfo pointers, rather than an array of structs, and
it is indexed by the RT index.

This simplifies things: we get rid of the separate concept of a "result
rel index", and don't need to set it in setrefs.c anymore.  This also
allows follow-up optimizations (not included in this commit yet) to skip
initializing ResultRelInfos for target relations that were not needed at
runtime, and removal of the es_result_relation_info pointer.

The EState arrays of regular result rels and root result rels are merged
into one array.  Similarly, the resultRelations and rootResultRelations
lists in PlannedStmt are merged into one.  It's not actually clear to me
why they were kept separate in the first place, but now that the
es_result_relations array is indexed by RT index, it certainly seems
pointless.

The PlannedStmt->resultRelations list is now only needed for
ExecRelationIsTargetRelation().  One visible effect of this change is
that ExecRelationIsTargetRelation() will now return 'true' also for the
partition root, if a partitioned table is updated.  That seems like a
good thing, although the function isn't used in core code, and I don't
see any reason for an FDW to call it on a partition root.

Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/1375422c7826a2bf387be29895e961614f69de4b

Modified Files
--------------
src/backend/commands/copy.c              | 24 +--
src/backend/commands/explain.c           | 17 +-
src/backend/commands/tablecmds.c         |  9 +-
src/backend/commands/trigger.c           |  2 +-
src/backend/executor/execMain.c          | 256 +--
src/backend/executor/execParallel.c      |  1 -
src/backend/executor/execUtils.c         | 57 ---
src/backend/executor/nodeModifyTable.c   | 24 ++-
src/backend/nodes/copyfuncs.c            |  3 -
src/backend/nodes/outfuncs.c             |  4 -
src/backend/nodes/readfuncs.c            |  3 -
src/backend/optimizer/plan/createplan.c  |  2 -
src/backend/optimizer/plan/planner.c     |  3 -
src/backend/optimizer/plan/setrefs.c     | 17 +-
src/backend/replication/logical/worker.c | 32 ++--
src/include/executor/executor.h          |  5 +-
src/include/nodes/execnodes.h            | 18 +--
src/include/nodes/pathnodes.h            |  2 -
src/include/nodes/plannodes.h            |  8 -
19 files changed, 184 insertions(+), 303 deletions(-)
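As a rough illustration of the data-structure change (a toy C sketch with made-up stub type and field names, not the executor code): result-relation info becomes an array of pointers indexed by range-table index, and each entry is created on demand rather than allocated up front in one big array of structs.

#include <stdlib.h>

/* Toy stand-ins for ResultRelInfo and EState; names are illustrative only. */
typedef struct ResultRelInfoStub
{
    int     ri_RangeTableIndex;     /* which RT entry this result rel is for */
    /* ... per-result-relation execution state would live here ... */
} ResultRelInfoStub;

typedef struct EStateStub
{
    int                 es_range_table_size;
    ResultRelInfoStub **es_result_relations;    /* one pointer per RT entry */
} EStateStub;

/* Fetch the info for RT index 'rti' (1-based), creating it on first use. */
static ResultRelInfoStub *
get_result_rel(EStateStub *estate, int rti)
{
    ResultRelInfoStub *rri = estate->es_result_relations[rti - 1];

    if (rri == NULL)
    {
        rri = calloc(1, sizeof(ResultRelInfoStub));
        rri->ri_RangeTableIndex = rti;
        estate->es_result_relations[rti - 1] = rri;
    }
    return rri;
}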
Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect
On 2020/10/11 9:16, Tom Lane wrote:
> Fujii Masao writes:
>> Therefore, the easy fix is to make libpq mark the connection as
>> CONNECTION_BAD even in ECONNABORTED, like we do in ECONNRESET.
> So in the wake of commit fe27009cb,
Many thanks for commit fe27009cb!!
> this is what lorikeet is doing:
>
> @@ -9028,9 +9028,7 @@
>   CALL terminate_backend_and_wait('fdw_retry_check');
>   SAVEPOINT s;
>   SELECT 1 FROM ft1 LIMIT 1;    -- should fail
> -ERROR:  server closed the connection unexpectedly
> -  This probably means the server terminated abnormally
> -  before or while processing the request.
> +ERROR:  could not receive data from server: Software caused connection abort
>  CONTEXT:  remote SQL command: SAVEPOINT s2
>  COMMIT;
>  -- Clean up
>
> which is better --- the connection recovery is working --- but
> obviously still not a "pass".
>
> The only way to make this test pass as-is is to hack libpq so that
> it deems ECONNABORTED to be a server crash, which frankly I do not
> think is a good change.  I don't see any principled reason to think
> that it means that rather than a network connectivity failure.
> What I've done instead as a stopgap is to adjust the test case to be
> insensitive to the exact error message text.
For now I haven't come up with a better idea than this fix.
> Maybe there's a better
> way, but I can't think of anything besides having two (or more?)
> expected-output files.  That would be quite unpalatable as things
> stand, though perhaps we could make it tolerable by splitting this
> test off into a second test script.
>
> Meanwhile, now that I've looked at commit 32a9c0bdf, I'm not very
> happy with it:
>
> * The control flow seems rather forced.  I think it was designed
> on the assumption that reindenting the existing code is forbidden.
> Why not use an actual loop, instead of a goto?  I also think that
> it's far from obvious that the loop isn't an infinite loop; it
> took me quite a while to glom onto the fact that the test inside the
> PG_CATCH avoids that by checking retry_conn.  Maybe a comment would
> be enough to improve that, but I feel the control logic could stand
> a rethink.
Isn't it simpler and easier-to-read to just reestablish new connection again
in the retry case instead of using a loop because we retry only once?
Patch attached.
> * The CATCH case is completely failing to clean up after itself.
> At the minimum, there has to be a FlushErrorState() there.
> I don't think it'd be terribly hard to drive this code to an
> error-stack-overflow PANIC.
You're right. Sorry, I forgot to take this into consideration :(
I fixed this issue in the attached patch.
> * As is so often true of proposed patches in which PG_CATCH is
> thought to be an adequate way to catch an error, this is just
> unbelievably fragile.  It will break, and badly so, if it catches
> an error that is anything other than what it is expecting ...
> and it's not even particularly trying to verify that the error is
> what it's expecting.  It might be okay, or at least a little bit
> harder to break, if it verified that the error's SQLSTATE is
> ERRCODE_CONNECTION_FAILURE.
"PQstatus()==CONNECTION_BAD" was checked for that purpose. But that's not
enough?
Anyway, in the patch, I changed the code so that
sqlstate==ERRCODE_CONNECTION_FAILURE
is checked.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 76994f3820..8bbad61c3b 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -74,6 +74,7 @@ static unsigned int prep_stmt_number = 0;
static bool xact_got_connection = false;
/* prototypes of private functions */
+static void make_new_connection(ConnCacheEntry *entry, UserMapping *user);
static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
static void disconnect_pg_server(ConnCacheEntry *entry);
static void check_conn_params(const char **keywords, const char **values,
UserMapping *user);
@@ -108,9 +109,10 @@ PGconn *
GetConnection(UserMapping *user, bool will_prep_stmt)
{
 	bool		found;
-	volatile bool retry_conn = false;
+	bool		retry = false;
 	ConnCacheEntry *entry;
 	ConnCacheKey key;
+	MemoryContext ccxt = CurrentMemoryContext;
/* First time through, initialize connection cache hashtable */
if (ConnectionHash == NULL)
@@ -160,23 +162,14 @@ GetConnection(UserMapping *user, bool will_prep_stmt)
/* Reject further use of connections which failed abort cleanup. */
pgfdw_reject_incomplete_xact_state_change(entry);
-retry:
-
 	/*
 	 * If the connection needs to be remade due to invalidation, disconnect as
-	 * soon as we're out of all transactions.  Also, if previous attempt to
-	 * start new remote transaction failed on the cached connection,
-	 * disconnect it to retry a new connection.
+	 * soon as we're
Re: pgsql: postgres_fdw: reestablish new connection if cached one is detect
Fujii Masao writes:
> On 2020/10/11 9:16, Tom Lane wrote:
>> Meanwhile, now that I've looked at commit 32a9c0bdf, I'm not very
>> happy with it:
>>
>> * The control flow seems rather forced.  I think it was designed
>> on the assumption that reindenting the existing code is forbidden.

> Isn't it simpler and easier-to-read to just reestablish new connection again
> in the retry case instead of using a loop because we retry only once?
> Patch attached.

Yeah, that seems better.

>> * As is so often true of proposed patches in which PG_CATCH is
>> thought to be an adequate way to catch an error, this is just
>> unbelievably fragile.  It will break, and badly so, if it catches
>> an error that is anything other than what it is expecting ...
>> and it's not even particularly trying to verify that the error is
>> what it's expecting.  It might be okay, or at least a little bit
>> harder to break, if it verified that the error's SQLSTATE is
>> ERRCODE_CONNECTION_FAILURE.

> "PQstatus()==CONNECTION_BAD" was checked for that purpose. But that's not
> enough?
> Anyway, in the patch, I changed the code so that
> sqlstate==ERRCODE_CONNECTION_FAILURE
> is checked.

The reason I'm concerned about this is that we could potentially
throw an elog(ERROR) somewhere between return from libpq and the
expected ereport call in pgfdw_report_error.  It wouldn't be wise
to ignore such a condition (it might be out-of-memory or some such).

Personally I'd code the check with explicit tests for *both*
PQstatus()==CONNECTION_BAD and sqlstate==ERRCODE_CONNECTION_FAILURE,
rather than just asserting that the latter must imply the former.
By my reading, pgfdw_report_error will report ERRCODE_CONNECTION_FAILURE
for any libpq-originated error condition, but not all of those will
set CONNECTION_BAD.

Other than that, this seems reasonable by eyeball (I didn't do any
testing).

			regards, tom lane
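For reference, a minimal C sketch of the shape of check being discussed: retry once, and only when both the caught error's SQLSTATE and libpq's connection status say the connection is broken. It reuses the helper names visible in the patch above (begin_remote_xact, disconnect_pg_server, make_new_connection) and the entry/user/ccxt variables from GetConnection(); it is a sketch of the idea, not the committed code.

	PG_TRY();
	{
		/* Start a remote transaction on the (possibly cached) connection. */
		begin_remote_xact(entry);
	}
	PG_CATCH();
	{
		MemoryContext ecxt = MemoryContextSwitchTo(ccxt);
		ErrorData  *errdata = CopyErrorData();

		/*
		 * Retry only if this really looks like a broken connection: the
		 * error's SQLSTATE is ERRCODE_CONNECTION_FAILURE *and* libpq says
		 * the connection is bad, and no remote transaction was already in
		 * progress on it.  Anything else is re-thrown untouched.
		 */
		if (errdata->sqlerrcode != ERRCODE_CONNECTION_FAILURE ||
			PQstatus(entry->conn) != CONNECTION_BAD ||
			entry->xact_depth > 0)
		{
			MemoryContextSwitchTo(ecxt);
			PG_RE_THROW();
		}

		/* Clean up the caught error state before doing anything else. */
		FlushErrorState();
		FreeErrorData(errdata);

		/* Re-establish the connection and retry the remote transaction once. */
		disconnect_pg_server(entry);
		make_new_connection(entry, user);
		begin_remote_xact(entry);
	}
	PG_END_TRY();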
pgsql: Paper over regression failures in infinite_recurse() on PPC64 Li
Paper over regression failures in infinite_recurse() on PPC64 Linux.

Our infinite_recurse() test to verify sane stack-overrun behavior is
affected by a bug of the Linux kernel on PPC64: it will get SIGSEGV
if it receives a signal when the stack depth is (a) over 1MB and
(b) within a few kB of filling the current physical stack allocation.
See https://bugzilla.kernel.org/show_bug.cgi?id=205183.

Since this test is a bit time-consuming and we run it in parallel with
test scripts that do a lot of DDL, it can be expected to get an sinval
catchup interrupt at some point, leading to failure if the timing is
wrong.  This has caused more than 100 buildfarm failures over the past
year or so.

While a fix exists for the kernel bug, it might be years before that
propagates into all production kernels, particularly in some of the
older distros we have in the buildfarm.  For now, let's just back off
and not run this test on Linux PPC64; that loses nothing in test
coverage so far as our own code is concerned.

To do that, split this test into a new script infinite_recurse.sql
and skip the test when the platform name is powerpc64...-linux-gnu.

Back-patch to v12.  Branches before that have not been seen to get this
failure.  No doubt that's because the "errors" test was not run in
parallel with other tests before commit 798070ec0, greatly reducing the
odds of an sinval catchup being necessary.  I also back-patched 3c8553547
into v12, just so the new regression script would look the same in all
branches having it.

Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/20190723162703.GM22387%40telsasoft.com

Branch
------
REL_13_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/855b6f287100f3eab24df0a83998db251ac4fd09

Modified Files
--------------
src/test/regress/expected/errors.out             | 10
src/test/regress/expected/infinite_recurse.out   | 24
src/test/regress/expected/infinite_recurse_1.out | 16 +
src/test/regress/parallel_schedule               |  2 +-
src/test/regress/serial_schedule                 |  1 +
src/test/regress/sql/errors.sql                  |  9
src/test/regress/sql/infinite_recurse.sql        | 29
7 files changed, 71 insertions(+), 20 deletions(-)
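As a sketch of the skip mechanism (assumed for illustration; the committed script may differ in detail, and pairs the skip path with the alternative expected file infinite_recurse_1.out listed above): create the self-recursing SQL function, then call it only when version() does not report a PPC64 Linux build.

-- Sketch only, not the committed infinite_recurse.sql.
create function infinite_recurse() returns int
  as 'select infinite_recurse()' language sql;

do $$
begin
  if version() ~ 'powerpc64[^,]*-linux-gnu' then
    -- kernel bug: deep recursion plus a signal can SIGSEGV here
    raise notice 'skipping infinite_recurse() on PPC64 Linux';
  else
    -- expected to fail cleanly with "stack depth limit exceeded"
    perform infinite_recurse();
  end if;
end
$$;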
pgsql: Paper over regression failures in infinite_recurse() on PPC64 Li
Paper over regression failures in infinite_recurse() on PPC64 Linux.

Our infinite_recurse() test to verify sane stack-overrun behavior is
affected by a bug of the Linux kernel on PPC64: it will get SIGSEGV
if it receives a signal when the stack depth is (a) over 1MB and
(b) within a few kB of filling the current physical stack allocation.
See https://bugzilla.kernel.org/show_bug.cgi?id=205183.

Since this test is a bit time-consuming and we run it in parallel with
test scripts that do a lot of DDL, it can be expected to get an sinval
catchup interrupt at some point, leading to failure if the timing is
wrong.  This has caused more than 100 buildfarm failures over the past
year or so.

While a fix exists for the kernel bug, it might be years before that
propagates into all production kernels, particularly in some of the
older distros we have in the buildfarm.  For now, let's just back off
and not run this test on Linux PPC64; that loses nothing in test
coverage so far as our own code is concerned.

To do that, split this test into a new script infinite_recurse.sql
and skip the test when the platform name is powerpc64...-linux-gnu.

Back-patch to v12.  Branches before that have not been seen to get this
failure.  No doubt that's because the "errors" test was not run in
parallel with other tests before commit 798070ec0, greatly reducing the
odds of an sinval catchup being necessary.  I also back-patched 3c8553547
into v12, just so the new regression script would look the same in all
branches having it.

Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/20190723162703.GM22387%40telsasoft.com

Branch
------
REL_12_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/c7e2364a5f17b3ef3518f92542fb1b1628288382

Modified Files
--------------
src/test/regress/expected/errors.out             |  7 --
src/test/regress/expected/infinite_recurse.out   | 24
src/test/regress/expected/infinite_recurse_1.out | 16 +
src/test/regress/parallel_schedule               |  2 +-
src/test/regress/serial_schedule                 |  1 +
src/test/regress/sql/errors.sql                  |  7 --
src/test/regress/sql/infinite_recurse.sql        | 29
7 files changed, 71 insertions(+), 15 deletions(-)
pgsql: Paper over regression failures in infinite_recurse() on PPC64 Li
Paper over regression failures in infinite_recurse() on PPC64 Linux.

Our infinite_recurse() test to verify sane stack-overrun behavior is
affected by a bug of the Linux kernel on PPC64: it will get SIGSEGV
if it receives a signal when the stack depth is (a) over 1MB and
(b) within a few kB of filling the current physical stack allocation.
See https://bugzilla.kernel.org/show_bug.cgi?id=205183.

Since this test is a bit time-consuming and we run it in parallel with
test scripts that do a lot of DDL, it can be expected to get an sinval
catchup interrupt at some point, leading to failure if the timing is
wrong.  This has caused more than 100 buildfarm failures over the past
year or so.

While a fix exists for the kernel bug, it might be years before that
propagates into all production kernels, particularly in some of the
older distros we have in the buildfarm.  For now, let's just back off
and not run this test on Linux PPC64; that loses nothing in test
coverage so far as our own code is concerned.

To do that, split this test into a new script infinite_recurse.sql
and skip the test when the platform name is powerpc64...-linux-gnu.

Back-patch to v12.  Branches before that have not been seen to get this
failure.  No doubt that's because the "errors" test was not run in
parallel with other tests before commit 798070ec0, greatly reducing the
odds of an sinval catchup being necessary.  I also back-patched 3c8553547
into v12, just so the new regression script would look the same in all
branches having it.

Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/20190723162703.GM22387%40telsasoft.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/ae0f7b11f143d9748e2201c3647330893c4c1c5a

Modified Files
--------------
src/test/regress/expected/errors.out             | 10
src/test/regress/expected/infinite_recurse.out   | 24
src/test/regress/expected/infinite_recurse_1.out | 16 +
src/test/regress/parallel_schedule               |  2 +-
src/test/regress/serial_schedule                 |  1 +
src/test/regress/sql/errors.sql                  |  9
src/test/regress/sql/infinite_recurse.sql        | 29
7 files changed, 71 insertions(+), 20 deletions(-)
pgsql: Correct error message
Correct error message

Apparently copy-and-pasted from nearby.

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/4e118fc33e3ca5244c11a81a71bd25cf9ed3d484

Modified Files
--------------
src/backend/parser/parse_target.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
pgsql: Use https for gnu.org links
Use https for gnu.org links

Mostly already done, but there were some stragglers.

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/39b4a951003a6545268e141272e123929d0d710f

Modified Files
--------------
doc/src/sgml/installation.sgml         | 2 +-
src/include/c.h                        | 3 ++-
src/include/port/atomics/generic-gcc.h | 4 ++--
src/interfaces/ecpg/Makefile           | 2 +-
src/tools/RELEASE_CHANGES              | 2 +-
5 files changed, 7 insertions(+), 6 deletions(-)
