On Tue, 8 Sep 2020 at 10:56, Tom Lane <t...@sss.pgh.pa.us> wrote:

>
> So that's good.  However, a similar experiment with returning from inside
> a PG_TRY does *not* produce a warning.  I suspect maybe the compiler
> throws up its hands when it sees sigsetjmp?
>

I find that odd myself, given that in PG_TRY() we:

    PG_exception_stack = &_local_sigjmp_buf;

and a return within that block should count as a pointer to
_local_sigjmp_buf escaping. Yet I confirm that I don't get a warning on
clang-analyzer 10.0.0 (fedora 32) in my builds either. Even with -O0.

TL;DR:
-------

It's not due to sigsetjmp() at all though; see
https://github.com/ringerc/scrapcode/blob/master/c/clang_return_stack_checks/mask_local_escape.c
for part of what's going on.

I haven't worked out why sometimes an unrelated stack escape gets hidden
though.

Attached patches demonstrate tests and are DEFINITELY NOT proposals for
core!

SUMMARY
-----

I found per test details below (links and demo patches supplied, patches
NOT proposed for inclusion in pg) that scan-build seems to be able to track
a local escape via simple indirection such as copying the address to a
local first, but cannot track it when it is assigned to a struct member in
a global pointer if that global pointer is then replaced, even if it's
restored immediately.

Demonstration code to illustrate this is at
https://github.com/ringerc/scrapcode/blob/master/c/clang_return_stack_checks/mask_local_escape.c

This produces no warnings from scan-build but leaks a pointer to an
auto-variable on the stack, then dereferences it to read garbage. I
deliberately clobber the stack to ensure that it's predictable garbage, so
we get:

$ make run_mask_local_escape
./mask_local_escape
&g = 0x7fff1b31e6bc
&g has escaped: 0x7fff1b31e6bc
value is: 0x7f7f7f7f

I think we're using a similar pattern for PG_error_stack, and that's why we
don't get any warnings.

In this case valgrind detects the issue when the bogus data is actually
read. Run "valgrind ./mask_local_escape". But I'm pretty sure I've seen
other cases where it does not.

So I'm wondering if the explicit __attribute__((callback(cbfn))) approach
has some merits after all, for cassert builds. Assuming we care about this
error, since it's a bit of an academic exercise at this point. I just
needed to understand what was going on and why there was no warning.


DETAILS OF TESTS
-----

If I wrap the sigjmp_buf in a struct defined in Pg itself and allow that to
escape I see the same behaviour, so it's not specific to sigjmp_buf.

I tried adding a dummy guard variable to PG_TRY() and PG_END_TRY() solely
for the purpose of tracking this sort of escape. But I notice that a
warning from scan-build is only emitted in PG_CATCH() or PG_FINALLY(),
never from inside the PG_TRY() body. I'd expect to see a warning from
either branch, as it's escaping either way.

scan-build will raise other warnings from within the PG_CATCH() block, such
as if I introduce a

+       int foo = 0;
+       fprintf(stdout, "%d", 10/foo);

so it's not that scan-build as a whole is failing to take the sigsetjmp()
== 0  branch. It raises *other* warnings. This can also be verified by
replacing sigsetjmp() with a dummy static inline function that always takes
one branch or the other.

I landed up writing the experiment code at
https://github.com/ringerc/scrapcode/blob/master/c/clang_return_stack_checks/
to explore the behaviour some more.

The file sigjmp_escape.c explores various combinations of guard variables
and early returns. The file sigjmp_escape_hdr.c and its companions
sigjmp_escape_hdr_try.{c,h} check whether indirection via separate header
and compilation unit makes any difference (it didn't).

Initially I found that in my PG_TRY()-alike test code I got a warning like:

sigjmp_escape_hdr.c:36:4: warning: Address of stack memory associated with
local variable '_local_sigjmp_buf' is still referred to by the global
variable 'TEST_exception_stack' upon returning to the caller.  This will be
a dangling reference
                        return;
                        ^~~~~~

just like I want to see from inside a PG_TRY() block.

But with some further testing I noticed that it seems the checker may not
notice escapes where there's even the simplest level of indirection, and
also that one candidate escape may mask another. I'm not totally sure yet.

In the sigjmp_escape.c test, if I build it with both inner and outer guards
(-DUSE_INNER_GUARDS), scan-build will only complain about the inner guard
pointer g_branch_1 escaping. It will not complain about the stack pointer
to the outer guard escaping, nor about the possibility of
should_return_early causing the sigjmp_buf to escape into the global
sigjmp_buf * jbuf. Yet running

    ./sigjmp_escape 1 1

shows that both can occur.

This still doesn't explain how it *also* misses the escape of &b into jbuf.
Is it losing track of the unrelated escape when it stops tracking the first
one?

The test case sigjmp_escape_hdr doesn't attempt to use the guards, and it
reports the sigjmp_buf escape fine as shown above.

If sigjmp_escape.c is built with -DUSE_INNER_GUARDS -DPOP_ONLY_INNER_GUARDS
then it raises no warnings at all, BUT has both the sigjmp_buf leak AND a
guard variable stack escape as seen by:

   ./sigjmp_escape_inner_pop 1 1

We've seen above that it loses the outer guard pointer escape because it
doesn't track any sort of indirection of assignments.

What's a bit surprising is that it *also* fails to complain about the
escape of the sigjmp_buf pointer &b into the global variable jbuf in this
case, even though it successfully detects the escape with other
combinations of guard uses. It's like it can't track multiple things at
once properly.

RELATED: detecting return from PG_FINALLY()
--------

Note that even if scan-build did detect the escape of the sigjmp_buf,
returning from PG_FINALLY() is a coding error that we cannot detect this
way since we've restored PG_errror_stack so there's no local auto ptr to
escape. To do that we'd need our own stack that we pushed in PG_TRY() and
didn't pop until PG_END_TRY(), like in the demo patch attached. That works,
but adds another local to each PG_TRY() that's otherwise useless. If we
really felt like it we could use it to produce a poor-man's backtrace
through each PG_TRY() at the cost of one sizeof(void) per PG_TRY() + some
constdata storage, but that seems like a solution looking for a problem.

Also, AFAICS it's actually harmless, if ugly, to return from PG_CATCH(). It
should still be discouraged.


-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise
From fc028d57e4182005dd9510b63aceada89202a4b0 Mon Sep 17 00:00:00 2001
From: Craig Ringer <craig.rin...@2ndquadrant.com>
Date: Wed, 9 Sep 2020 12:54:56 +0800
Subject: [PATCH v999 1/3] Deliberately break something so we can check it

Break postgres_fdw
---
 contrib/postgres_fdw/postgres_fdw.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index a31abce7c9..0565037ac8 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -3441,6 +3441,9 @@ fetch_more_data(ForeignScanState *node)
 
 		/* Must be EOF if we didn't get as many tuples as we asked for. */
 		fsstate->eof_reached = (numrows < fsstate->fetch_size);
+
+		/* Escape PG_TRY() */
+		return;
 	}
 	PG_FINALLY();
 	{
@@ -3844,6 +3847,9 @@ store_returning_result(PgFdwModifyState *fmstate,
 	{
 		HeapTuple	newtup;
 
+		int foo = 0;
+		fprintf(stdout, "%d", 10/foo);
+
 		newtup = make_tuple_from_result_row(res, 0,
 											fmstate->rel,
 											fmstate->attinmeta,
@@ -3856,11 +3862,23 @@ store_returning_result(PgFdwModifyState *fmstate,
 		 * heaptuples directly, so allow for conversion.
 		 */
 		ExecForceStoreHeapTuple(newtup, slot, true);
+
+		/* Escape PG_TRY() */
+		return;
+
 	}
 	PG_CATCH();
 	{
+		int bar = 0;
+		fprintf(stdout, "%d", 10/bar);
+
 		if (res)
 			PQclear(res);
+
+
+		/* Escape PG_CATCH() */
+		return;
+
 		PG_RE_THROW();
 	}
 	PG_END_TRY();
-- 
2.26.2

From 4424cbd218c3ff968d1273edf40ffc3689cc1f5d Mon Sep 17 00:00:00 2001
From: Craig Ringer <craig.rin...@2ndquadrant.com>
Date: Wed, 9 Sep 2020 12:55:29 +0800
Subject: [PATCH v999 3/3] Add guards to detect return from between PG_TRY()
 and PG_END_TRY()

These don't work because of the issue mentioned earlier where llvm's
scan-build loses track of a pointer escape when the containing struct
pointer is cleared from a global then restored to the global again.
---
 src/backend/utils/error/elog.c |  7 +++++++
 src/include/utils/elog.h       | 10 ++++++++++
 2 files changed, 17 insertions(+)

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 9af7155171..c1fb3d4785 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -93,6 +93,9 @@ ErrorContextCallback *error_context_stack = NULL;
 
 PG_sigjmp_buf *PG_exception_stack = NULL;
 
+/* HACK for scan-build testing */
+const PG_try_guard_entry * pg_try_guard = NULL;
+
 extern bool redirection_done;
 
 /*
@@ -580,6 +583,10 @@ errfinish(const char *filename, int lineno, const char *funcname)
 
 	if (elevel >= PANIC)
 	{
+		/* trick clang into thinking we actually use pg_try_guard, hopefully... */
+		if (pg_try_guard)
+			abort();
+
 		/*
 		 * Serious crash time. Postmaster will observe SIGABRT process exit
 		 * status and kill the other backends too.
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index fe82225dbe..bf40ca632c 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -299,12 +299,21 @@ typedef struct PG_sigjmp_buf
 	sigjmp_buf buf;
 } PG_sigjmp_buf;
 
+/* HACK to show that scan-build does detect escapes from PG_CATCH() */
+typedef struct PG_try_guard_entry
+{
+	const struct PG_try_guard_entry * const previous;
+} PG_try_guard_entry;
+extern const PG_try_guard_entry * pg_try_guard;
+
 #define PG_TRY()  \
 	do { \
 		PG_sigjmp_buf *_save_exception_stack = PG_exception_stack; \
 		ErrorContextCallback *_save_context_stack = error_context_stack; \
 		PG_sigjmp_buf _local_sigjmp_buf; \
 		bool _do_rethrow = false; \
+		const PG_try_guard_entry _this_pg_try_guard = { pg_try_guard }; \
+		pg_try_guard = &_this_pg_try_guard; \
 		if (sigsetjmp(_local_sigjmp_buf.buf, 0) == 0) \
 		{ \
 			PG_exception_stack = &_local_sigjmp_buf
@@ -326,6 +335,7 @@ typedef struct PG_sigjmp_buf
 
 #define PG_END_TRY()  \
 		} \
+		pg_try_guard = _this_pg_try_guard.previous; \
 		if (_do_rethrow) \
 				PG_RE_THROW(); \
 		PG_exception_stack = _save_exception_stack; \
-- 
2.26.2

From 816cfc93d9e7123a77759e63edeb87832a258fc2 Mon Sep 17 00:00:00 2001
From: Craig Ringer <craig.rin...@2ndquadrant.com>
Date: Wed, 9 Sep 2020 12:54:01 +0800
Subject: [PATCH v999 2/3] Show that it's not special casing of sigjmp_buf by
 using a wrapper struct

---
 src/backend/postmaster/autovacuum.c   |  8 ++++----
 src/backend/postmaster/bgworker.c     |  4 ++--
 src/backend/postmaster/bgwriter.c     |  4 ++--
 src/backend/postmaster/checkpointer.c |  4 ++--
 src/backend/postmaster/walwriter.c    |  4 ++--
 src/backend/tcop/postgres.c           |  4 ++--
 src/backend/utils/error/elog.c        |  5 +++--
 src/include/utils/elog.h              | 15 +++++++++++----
 8 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1b8cd7bacd..c01953c642 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -431,7 +431,7 @@ StartAutoVacLauncher(void)
 NON_EXEC_STATIC void
 AutoVacLauncherMain(int argc, char *argv[])
 {
-	sigjmp_buf	local_sigjmp_buf;
+	PG_sigjmp_buf	local_sigjmp_buf;
 
 	am_autovacuum_launcher = true;
 
@@ -496,7 +496,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 	 *
 	 * This code is a stripped down version of PostgresMain error recovery.
 	 */
-	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
+	if (sigsetjmp(local_sigjmp_buf.buf, 1) != 0)
 	{
 		/* since not using PG_TRY, must reset error stack by hand */
 		error_context_stack = NULL;
@@ -1502,7 +1502,7 @@ StartAutoVacWorker(void)
 NON_EXEC_STATIC void
 AutoVacWorkerMain(int argc, char *argv[])
 {
-	sigjmp_buf	local_sigjmp_buf;
+	PG_sigjmp_buf	local_sigjmp_buf;
 	Oid			dbid;
 
 	am_autovacuum_worker = true;
@@ -1552,7 +1552,7 @@ AutoVacWorkerMain(int argc, char *argv[])
 	 *
 	 * See notes in postgres.c about the design of this coding.
 	 */
-	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
+	if (sigsetjmp(local_sigjmp_buf.buf, 1) != 0)
 	{
 		/* since not using PG_TRY, must reset error stack by hand */
 		error_context_stack = NULL;
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index d043ced686..d83c0c5309 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -679,7 +679,7 @@ bgworker_sigusr1_handler(SIGNAL_ARGS)
 void
 StartBackgroundWorker(void)
 {
-	sigjmp_buf	local_sigjmp_buf;
+	PG_sigjmp_buf	local_sigjmp_buf;
 	BackgroundWorker *worker = MyBgworkerEntry;
 	bgworker_main_type entrypt;
 
@@ -745,7 +745,7 @@ StartBackgroundWorker(void)
 	 *
 	 * We just need to clean up, report the error, and go away.
 	 */
-	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
+	if (sigsetjmp(local_sigjmp_buf.buf, 1) != 0)
 	{
 		/* Since not using PG_TRY, must reset error stack by hand */
 		error_context_stack = NULL;
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 069e27e427..14770c037b 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -93,7 +93,7 @@ static XLogRecPtr last_snapshot_lsn = InvalidXLogRecPtr;
 void
 BackgroundWriterMain(void)
 {
-	sigjmp_buf	local_sigjmp_buf;
+	PG_sigjmp_buf	local_sigjmp_buf;
 	MemoryContext bgwriter_context;
 	bool		prev_hibernate;
 	WritebackContext wb_context;
@@ -142,7 +142,7 @@ BackgroundWriterMain(void)
 	 *
 	 * See notes in postgres.c about the design of this coding.
 	 */
-	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
+	if (sigsetjmp(local_sigjmp_buf.buf, 1) != 0)
 	{
 		/* Since not using PG_TRY, must reset error stack by hand */
 		error_context_stack = NULL;
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 624a3238b8..259b57544b 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -182,7 +182,7 @@ static void ReqCheckpointHandler(SIGNAL_ARGS);
 void
 CheckpointerMain(void)
 {
-	sigjmp_buf	local_sigjmp_buf;
+	PG_sigjmp_buf	local_sigjmp_buf;
 	MemoryContext checkpointer_context;
 
 	CheckpointerShmem->checkpointer_pid = MyProcPid;
@@ -233,7 +233,7 @@ CheckpointerMain(void)
 	 *
 	 * See notes in postgres.c about the design of this coding.
 	 */
-	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
+	if (sigsetjmp(local_sigjmp_buf.buf, 1) != 0)
 	{
 		/* Since not using PG_TRY, must reset error stack by hand */
 		error_context_stack = NULL;
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 45a2757969..c96e37c7ad 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -87,7 +87,7 @@ int			WalWriterFlushAfter = 128;
 void
 WalWriterMain(void)
 {
-	sigjmp_buf	local_sigjmp_buf;
+	PG_sigjmp_buf	local_sigjmp_buf;
 	MemoryContext walwriter_context;
 	int			left_till_hibernate;
 	bool		hibernating;
@@ -131,7 +131,7 @@ WalWriterMain(void)
 	 *
 	 * This code is heavily based on bgwriter.c, q.v.
 	 */
-	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
+	if (sigsetjmp(local_sigjmp_buf.buf, 1) != 0)
 	{
 		/* Since not using PG_TRY, must reset error stack by hand */
 		error_context_stack = NULL;
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c9424f167c..ab8a732d04 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3780,7 +3780,7 @@ PostgresMain(int argc, char *argv[],
 {
 	int			firstchar;
 	StringInfoData input_message;
-	sigjmp_buf	local_sigjmp_buf;
+	PG_sigjmp_buf	local_sigjmp_buf;
 	volatile bool send_ready_for_query = true;
 	bool		disable_idle_in_transaction_timeout = false;
 
@@ -4039,7 +4039,7 @@ PostgresMain(int argc, char *argv[],
 	 * were inside a transaction.
 	 */
 
-	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
+	if (sigsetjmp(local_sigjmp_buf.buf, 1) != 0)
 	{
 		/*
 		 * NOTE: if you are tempted to add more code in this if-block,
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index d0b368530e..9af7155171 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -91,7 +91,7 @@
 /* Global variables */
 ErrorContextCallback *error_context_stack = NULL;
 
-sigjmp_buf *PG_exception_stack = NULL;
+PG_sigjmp_buf *PG_exception_stack = NULL;
 
 extern bool redirection_done;
 
@@ -589,6 +589,7 @@ errfinish(const char *filename, int lineno, const char *funcname)
 		 */
 		fflush(stdout);
 		fflush(stderr);
+
 		abort();
 	}
 
@@ -1714,7 +1715,7 @@ pg_re_throw(void)
 {
 	/* If possible, throw the error to the next outer setjmp handler */
 	if (PG_exception_stack != NULL)
-		siglongjmp(*PG_exception_stack, 1);
+		siglongjmp(PG_exception_stack->buf, 1);
 	else
 	{
 		/*
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 1e09ee0541..fe82225dbe 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -292,13 +292,20 @@ extern PGDLLIMPORT ErrorContextCallback *error_context_stack;
  * warnings are just about entirely useless for catching such oversights.
  *----------
  */
+
+/* HACK for PoC with scan-build NOT FOR COMMIT */
+typedef struct PG_sigjmp_buf
+{
+	sigjmp_buf buf;
+} PG_sigjmp_buf;
+
 #define PG_TRY()  \
 	do { \
-		sigjmp_buf *_save_exception_stack = PG_exception_stack; \
+		PG_sigjmp_buf *_save_exception_stack = PG_exception_stack; \
 		ErrorContextCallback *_save_context_stack = error_context_stack; \
-		sigjmp_buf _local_sigjmp_buf; \
+		PG_sigjmp_buf _local_sigjmp_buf; \
 		bool _do_rethrow = false; \
-		if (sigsetjmp(_local_sigjmp_buf, 0) == 0) \
+		if (sigsetjmp(_local_sigjmp_buf.buf, 0) == 0) \
 		{ \
 			PG_exception_stack = &_local_sigjmp_buf
 
@@ -337,7 +344,7 @@ extern PGDLLIMPORT ErrorContextCallback *error_context_stack;
 	(pg_re_throw(), pg_unreachable())
 #endif
 
-extern PGDLLIMPORT sigjmp_buf *PG_exception_stack;
+extern PGDLLIMPORT PG_sigjmp_buf *PG_exception_stack;
 
 
 /* Stuff that error handlers might want to use */
-- 
2.26.2

Reply via email to