[off-list for now]

On Tue, Nov 17, 2020 at 10:21 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:

>
> 2. It's difficult for extensions to use. There is a callback mechanism
> for extensions, but it's much less convenient to use than the built-in
> functions. The pgcrypto code uses the callbacks currently, and Michael's
> patch [2] would move that support for tracking OpenSSL contexts to the
> core, which makes it a lot more convenient for pgcrypto. Wouldn't it be
> nice if extensions could have the same ergonomics as built-in code, if
> they need to track external resources?
>

Yes, in general I'm a big fan of making life easier for extensions (see
postscript), and this looks helpful. It's certainly fairly clear to read.
I'm in-principle in favour, though I haven't read the old resowner code
closely enough to have a strong opinion.

I haven't done thorough performance testing of this, but with some quick
> testing with Kyotaro's "catcachebench" to stress-test syscache lookups,
> this performs a little better. In that test, all the activity happens in
> the small array portion, but I believe that's true for most use cases.
>
> Thoughts? Can anyone suggest test scenarios to verify the performance of
> this?
>

I didn't think of much really.

I have tossed together a patch on top of yours that adds some
systemtap/dtrace tracepoints and provides a demo systemtap script that
shows some basic stats collection done using them. My intention was to make
it easier to see where the work in the resource owner code was being done.
But after I did the initial version I realised that in this case it
probably doesn't add a lot of value over using targeted profiling of only
functions in resowner.c and their callees which is easy enough using perf
and regular DWARF debuginfo. So I didn't land up polishing it and adding
stats for the other counters. It's attached anyway, in case it's
interesting or useful to anyone.

P.S. At some point I want to tackle some things similar to what you've done
here for other kinds of backend resources. In particular, to allow
extensions to register their own heavyweight lock types - with the ability
to handle lock timeouts, and add callbacks in the deadlock detector to
allow extensions to register dependency edges that are not natively visible
to the deadlock detector and to choose victim priorities. Also some
enhancements to how pg_depend tracking can be used by extensions. And I
want to help get the proposed custom ProcSignal changes in too. I think
there's a whole lot to be done to make extensions easier and safer to
author in general, like providing a simple way to do error trapping and
recovery in an extension mainloop without copy/pasting a bunch of
PostgresMain code, better default signal handlers, startup/shutdown that
shares more with user backends, etc. Right now it's quite tricky to get
bgworkers to behave well. </wildhandwaving>
From 9f3e6978b1bbfeb8ca3f1cac42e86c4731143404 Mon Sep 17 00:00:00 2001
From: Craig Ringer <craig.rin...@2ndquadrant.com>
Date: Wed, 18 Nov 2020 14:30:08 +0800
Subject: [PATCH] Poc of systemtap probes for resowner timings

---
 resowner.stp                          | 106 ++++++++++++++++++++++++++
 src/backend/utils/probes.d            |  22 ++++++
 src/backend/utils/resowner/resowner.c |  96 +++++++++++++++++++++--
 3 files changed, 216 insertions(+), 8 deletions(-)
 create mode 100644 resowner.stp

diff --git a/resowner.stp b/resowner.stp
new file mode 100644
index 0000000000..24b37b39e6
--- /dev/null
+++ b/resowner.stp
@@ -0,0 +1,106 @@
+# PATH="$(dirname $(realpath $(find build/ -name postgres -type f))):$PATH" /usr/local/systemtap/bin/stap -v ./resowner.stp
+
+/*
+.mark("resowner__created") $arg1:long $arg2:long
+.mark("resowner__delete__done") $arg1:long $arg2:long
+.mark("resowner__delete__start") $arg1:long $arg2:long
+.mark("resowner__enlarge__done") $arg1:long $arg2:long $arg3:long
+.mark("resowner__enlarge__skipped") $arg1:long $arg2:long
+.mark("resowner__enlarge__start") $arg1:long $arg2:long $arg3:long $arg4:long
+.mark("resowner__forget__done") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long
+.mark("resowner__forget__start") $arg1:long $arg2:long $arg3:long $arg4:long
+.mark("resowner__new__parent__done") $arg1:long
+.mark("resowner__new__parent__start") $arg1:long $arg2:long $arg3:long
+.mark("resowner__release__all__done") $arg1:long $arg2:long $arg3:long
+.mark("resowner__release__all__hash")
+.mark("resowner__release__all__start") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long
+.mark("resowner__release__done") $arg1:long $arg2:long
+.mark("resowner__release__plancacherefs__done") $arg1:long $arg2:long
+.mark("resowner__release__plancacherefs__start") $arg1:long $arg2:long
+.mark("resowner__release__start") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long
+.mark("resowner__remembered") $arg1:long $arg2:long $arg3:long $arg4:long
+*/
+
+@define probe_prefix %( sprintf("[%6d] %20s %20s %8x %20s", pid(), application_names[pid()], $$name, owner, owner_name) %)
+
+probe pg = process("postgres") {}
+
+# stats
+global operation_stats;
+
+# backend info
+private application_names;
+
+# in-progress operations
+private track_resowner_deletes;
+
+probe pg.begin {
+	if (@defined(@var("application_name@guc.c","postgres"))) {
+		application_names[pid()] = user_string(@var("application_name@guc.c","postgres"),"<?>")
+	} else {
+		application_names[pid()] = "<?>"
+	}
+	printf("[%6d] started %s\n", pid(), application_names[pid()])
+}
+
+probe pg.end {
+	delete application_names[pid()];
+	delete track_resowner_deletes[pid(),*];
+}
+
+probe pg.mark("resowner__created") {
+	owner = $arg1
+	owner_name = user_string($arg2)
+	printf("%s\n", @probe_prefix)
+}
+
+probe pg.mark("resowner__delete__start") {
+	owner = $arg1
+	owner_name = user_string($arg2)
+	track_resowner_deletes[pid(),owner] = gettimeofday_us()
+}
+
+probe pg.mark("resowner__delete__done") {
+	owner = $arg1
+	owner_name = user_string($arg2)
+	starttime_us = track_resowner_deletes[pid(),owner]
+	if (starttime_us != 0) {
+		elapsed = gettimeofday_us() - starttime_us
+		operation_stats["resowner_delete"] <<< elapsed
+		printf("%s: %s\n", @probe_prefix, usecs_to_string(elapsed));
+	}
+	delete track_resowner_deletes[pid(),owner]
+}
+
+function print_stat(statname) {
+	count = @count(operation_stats[statname])
+	if (count > 0) {
+		min = @min(operation_stats[statname])
+		max = @max(operation_stats[statname])
+		printf("-- %s:\n"
+		       "--    count:  %6d\n"
+		       "--    mean:   %6d\n"
+		       "--    stddev: %6d\n"
+		       "--    min:    %6d\n"
+		       "--    max:    %6d\n",
+		       statname,
+		       count,
+		       @avg(operation_stats[statname]),
+		       @variance(operation_stats[statname])/count,
+		       min, max
+		)
+		println(@hist_log(operation_stats[statname]))
+	}
+}
+
+function print_stats() {
+	print_stat("resowner_delete")
+}
+
+probe timer.ms(5000) {
+	print_stats()
+}
+
+probe end {
+	print_stats()
+ }
diff --git a/src/backend/utils/probes.d b/src/backend/utils/probes.d
index a0b0458108..7be815f9de 100644
--- a/src/backend/utils/probes.d
+++ b/src/backend/utils/probes.d
@@ -21,6 +21,9 @@
 #define Oid unsigned int
 #define ForkNumber int
 #define bool unsigned char
+#define Datum long int
+#define ResourceOwnerData void
+#define ResourceReleasePhase int
 
 provider postgresql {
 
@@ -91,4 +94,23 @@ provider postgresql {
 	probe wal__switch();
 	probe wal__buffer__write__dirty__start();
 	probe wal__buffer__write__dirty__done();
+
+	probe resowner__created(ResourceOwnerData*, const char *);
+	probe resowner__remembered(ResourceOwnerData*, const char *, Datum, const char *);
+	probe resowner__enlarge__skipped(ResourceOwnerData*, const char*);
+	probe resowner__enlarge__start(ResourceOwnerData*, const char *, int, int);
+	probe resowner__enlarge__done(ResourceOwnerData*, const char *, int);
+	probe resowner__forget__start(ResourceOwnerData*, const char *, Datum, const char *);
+	probe resowner__forget__done(ResourceOwnerData*, const char *, Datum, int, int);
+	probe resowner__release__all__start(ResourceOwnerData*, const char *, ResourceReleasePhase, int, int);
+	probe resowner__release__all__hash();
+	probe resowner__release__all__done(ResourceOwnerData*, const char *, ResourceReleasePhase);
+	probe resowner__release__start(ResourceOwnerData*, const char *, int, bool, bool);
+	probe resowner__release__done(ResourceOwnerData*, const char *);
+	probe resowner__delete__start(ResourceOwnerData*, const char *);
+	probe resowner__delete__done(long int, const char *);
+	probe resowner__release__plancacherefs__start(ResourceOwnerData*, const char *);
+	probe resowner__release__plancacherefs__done(ResourceOwnerData*, const char *);
+	probe resowner__new__parent__start(ResourceOwnerData*, ResourceOwnerData*, ResourceOwnerData*);
+	probe resowner__new__parent__done(ResourceOwnerData*);
 };
diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index 2e1ae4f8e0..fcd2c6f61d 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -26,6 +26,7 @@
 #include "storage/proc.h"
 #include "utils/memutils.h"
 #include "utils/plancache.h"
+#include "utils/probes.h"
 #include "utils/resowner.h"
 
 /*
@@ -152,10 +153,18 @@ static ResourceReleaseCallbackItem *ResourceRelease_callbacks = NULL;
 
 
 /* Internal routines */
-static void ResourceOwnerReleaseInternal(ResourceOwner owner,
+static void ResourceOwnerReleaseRecurse(ResourceOwner owner,
 										 ResourceReleasePhase phase,
 										 bool isCommit,
 										 bool isTopLevel);
+static void ResourceOwnerReleaseSelf(ResourceOwner owner,
+										 ResourceReleasePhase phase,
+										 bool isCommit,
+										 bool isTopLevel);
+static void ResourceOwnerReleaseCallbacks(ResourceOwner owner,
+                               ResourceReleasePhase phase,
+                               bool isCommit,
+                               bool isTopLevel);
 static void ReleaseAuxProcessResourcesCallback(int code, Datum arg);
 
 
@@ -194,9 +203,13 @@ void
 ResourceOwnerEnlarge(ResourceOwner owner)
 {
 	if (owner->narr < RESOWNER_ARRAY_SIZE)
+	{
+		TRACE_POSTGRESQL_RESOWNER_ENLARGE_SKIPPED(owner, owner->name);
 		return;					/* no work needed */
+	}
 
 	/* Is there space in the hash? If not, enlarge it. */
+	TRACE_POSTGRESQL_RESOWNER_ENLARGE_START(owner, owner->name, owner->nhash, owner->narr);
 
 	/* Double the capacity of the array (capacity must stay a power of 2!) */
 	if (owner->narr + owner->nhash >= owner->grow_at)
@@ -246,6 +259,8 @@ ResourceOwnerEnlarge(ResourceOwner owner)
 	owner->narr = 0;
 
 	Assert(owner->nhash < owner->grow_at);
+
+	TRACE_POSTGRESQL_RESOWNER_ENLARGE_DONE(owner, owner->name, owner->nhash);
 }
 
 /*
@@ -270,6 +285,8 @@ ResourceOwnerRemember(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind
 	owner->arr[idx].item = value;
 	owner->arr[idx].kind = kind;
 	owner->narr++;
+
+	TRACE_POSTGRESQL_RESOWNER_REMEMBERED(owner, owner->name, value, kind->name);
 }
 
 /*
@@ -292,6 +309,8 @@ ResourceOwnerForget(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind)
 		 resowner_trace_counter++, owner, DatumGetUInt64(value), kind->name);
 #endif
 
+	TRACE_POSTGRESQL_RESOWNER_FORGET_START(owner, owner->name, value, kind->name);
+
 	/* Search through all items, but check the array first. */
 	for (i = 0; i < owner->narr; i++)
 	{
@@ -305,6 +324,7 @@ ResourceOwnerForget(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind)
 			narray_lookups++;
 #endif
 
+			TRACE_POSTGRESQL_RESOWNER_FORGET_DONE(owner, owner->name, value, i, 0);
 			return;
 		}
 	}
@@ -327,6 +347,7 @@ ResourceOwnerForget(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind)
 #ifdef RESOWNER_STATS
 				nhash_lookups++;
 #endif
+				TRACE_POSTGRESQL_RESOWNER_FORGET_DONE(owner, owner->name, value, owner->narr, i);
 				return;
 			}
 			idx = (idx + 1) & mask;
@@ -351,6 +372,9 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase,
 {
 	bool		found;
 
+	TRACE_POSTGRESQL_RESOWNER_RELEASE_ALL_START(owner, owner->name, phase,
+					owner->narr, owner->nhash);
+
 	/* First handle all the entries in the array. */
 	do
 	{
@@ -377,6 +401,9 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase,
 		 */
 	} while (found && owner->narr > 0);
 
+	/* For probe/trace tools to time array and hash release separately */
+	TRACE_POSTGRESQL_RESOWNER_RELEASE_ALL_HASH();
+
 	/* Ok, the array has now been handled. Then the hash */
 	do
 	{
@@ -406,6 +433,8 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase,
 		 */
 	}
 	while (found && owner->nhash > 0);
+
+	TRACE_POSTGRESQL_RESOWNER_RELEASE_ALL_DONE(owner, owner->name, phase);
 }
 
 
@@ -420,6 +449,9 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase,
  *
  * All ResourceOwner objects are kept in TopMemoryContext, since they should
  * only be freed explicitly.
+ *
+ * The owner name is not copied. It should generally be a string constant. Otherwise
+ * the 
  */
 ResourceOwner
 ResourceOwnerCreate(ResourceOwner parent, const char *name)
@@ -442,6 +474,8 @@ ResourceOwnerCreate(ResourceOwner parent, const char *name)
 		 resowner_trace_counter++, owner, name);
 #endif
 
+	TRACE_POSTGRESQL_RESOWNER_CREATED(owner, owner->name);
+
 	return owner;
 }
 
@@ -478,7 +512,7 @@ ResourceOwnerRelease(ResourceOwner owner,
 					 bool isTopLevel)
 {
 	/* There's not currently any setup needed before recursing */
-	ResourceOwnerReleaseInternal(owner, phase, isCommit, isTopLevel);
+	ResourceOwnerReleaseRecurse(owner, phase, isCommit, isTopLevel);
 
 #ifdef RESOWNER_STATS
 	if (isTopLevel)
@@ -491,18 +525,19 @@ ResourceOwnerRelease(ResourceOwner owner,
 }
 
 static void
-ResourceOwnerReleaseInternal(ResourceOwner owner,
+ResourceOwnerReleaseRecurse(ResourceOwner owner,
 							 ResourceReleasePhase phase,
 							 bool isCommit,
 							 bool isTopLevel)
 {
 	ResourceOwner child;
 	ResourceOwner save;
-	ResourceReleaseCallbackItem *item;
+
+	TRACE_POSTGRESQL_RESOWNER_RELEASE_START(owner, owner->name, phase, isCommit, isTopLevel);
 
 	/* Recurse to handle descendants */
 	for (child = owner->firstchild; child != NULL; child = child->nextchild)
-		ResourceOwnerReleaseInternal(child, phase, isCommit, isTopLevel);
+		ResourceOwnerReleaseRecurse(child, phase, isCommit, isTopLevel);
 
 	/*
 	 * Make CurrentResourceOwner point to me, so that ReleaseBuffer etc don't
@@ -511,6 +546,27 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 	save = CurrentResourceOwner;
 	CurrentResourceOwner = owner;
 
+	/* Release directly managed resources */
+	ResourceOwnerReleaseSelf(owner, phase, isCommit, isTopLevel);
+
+	/* And run any release callbacks */
+	ResourceOwnerReleaseCallbacks(owner, phase, isCommit, isTopLevel);
+
+	CurrentResourceOwner = save;
+
+	TRACE_POSTGRESQL_RESOWNER_RELEASE_DONE(owner, owner->name);
+}
+
+/*
+ * Non-recursive part of ResourceOwnerRelease for a single resowner's
+ * resources.
+ */
+static void
+ResourceOwnerReleaseSelf(ResourceOwner owner,
+                               ResourceReleasePhase phase,
+                               bool isCommit,
+                               bool isTopLevel)
+{
 	if (phase == RESOURCE_RELEASE_BEFORE_LOCKS)
 	{
 		/*
@@ -577,12 +633,18 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 		 */
 		ResourceOwnerReleaseAll(owner, phase, isCommit);
 	}
+}
+
+static void
+ResourceOwnerReleaseCallbacks(ResourceOwner owner,
+                               ResourceReleasePhase phase,
+                               bool isCommit,
+                               bool isTopLevel)
+{
+	ResourceReleaseCallbackItem *item;
 
-	/* Let add-on modules get a chance too */
 	for (item = ResourceRelease_callbacks; item; item = item->next)
 		item->callback(phase, isCommit, isTopLevel, item->arg);
-
-	CurrentResourceOwner = save;
 }
 
 /*
@@ -600,6 +662,8 @@ ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner)
 	save = CurrentResourceOwner;
 	CurrentResourceOwner = owner;
 
+	TRACE_POSTGRESQL_RESOWNER_RELEASE_PLANCACHEREFS_START(owner, owner->name);
+
 	/* array first */
 	for (int i = 0; i < owner->narr; i++)
 	{
@@ -633,6 +697,8 @@ ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner)
 	}
 
 	CurrentResourceOwner = save;
+
+	TRACE_POSTGRESQL_RESOWNER_RELEASE_PLANCACHEREFS_DONE(owner, owner->name);
 }
 
 /*
@@ -644,6 +710,8 @@ ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner)
 void
 ResourceOwnerDelete(ResourceOwner owner)
 {
+	const char * const name = owner->name;
+
 	/* We had better not be deleting CurrentResourceOwner ... */
 	Assert(owner != CurrentResourceOwner);
 
@@ -657,6 +725,8 @@ ResourceOwnerDelete(ResourceOwner owner)
 		 resowner_trace_counter++, owner, owner->name);
 #endif
 
+	TRACE_POSTGRESQL_RESOWNER_DELETE_START(owner, owner->name);
+
 	/*
 	 * Delete children.  The recursive call will delink the child from me, so
 	 * just iterate as long as there is a child.
@@ -675,6 +745,12 @@ ResourceOwnerDelete(ResourceOwner owner)
 	if (owner->hash)
 		pfree(owner->hash);
 	pfree(owner);
+
+	/*
+	 * Passing the free'd owner pointer here is deliberate, it serves to
+	 * identify the freed owner when concurrently tracing many backends.
+	 */
+	TRACE_POSTGRESQL_RESOWNER_DELETE_DONE((long int)owner, name);
 }
 
 /*
@@ -695,6 +771,8 @@ ResourceOwnerNewParent(ResourceOwner owner,
 {
 	ResourceOwner oldparent = owner->parent;
 
+	TRACE_POSTGRESQL_RESOWNER_NEW_PARENT_START(owner, oldparent, newparent);
+
 	if (oldparent)
 	{
 		if (owner == oldparent->firstchild)
@@ -726,6 +804,8 @@ ResourceOwnerNewParent(ResourceOwner owner,
 		owner->parent = NULL;
 		owner->nextchild = NULL;
 	}
+
+	TRACE_POSTGRESQL_RESOWNER_NEW_PARENT_DONE(owner);
 }
 
 /*
-- 
2.26.2

Reply via email to