Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2023-11-16 Thread Daniel Gustafsson
> On 30 Jul 2020, at 14:11, Robert Haas  wrote:
> 
> On Mon, Jul 20, 2020 at 3:47 PM Robert Haas  wrote:
>> But I also agree that what pg_start_backup() was doing before v13 was
>> wrong; that's why I committed
>> 303640199d0436c5e7acdf50b837a027b5726594. The only reason I didn't
>> back-patch it is because the consequences are so minor I didn't think
>> it was worth worrying about. We could, though. I'd be somewhat
>> inclined to both do that and also change LLVM to use on_proc_exit() in
>> master, but I don't feel super-strongly about it.
> 
> Unless somebody complains pretty soon, I'm going to go ahead and do
> what is described above.

When backpatching 9dce22033d5d I ran into this in v13 and below, since it needs
llvm_shutdown to happen via on_proc_exit in order for all llvm_release_context
calls to have finished.  Unless anyone objects I will backpatch bab150045bd97
to v12 and v13 as part of my backpatch.

--
Daniel Gustafsson





Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-09-08 Thread Tom Lane
Bharath Rupireddy  writes:
> Attaching a patch with both the comments modification and changing
> DEBUG3 to ERROR. make check and make world-check passes on this patch.

I pushed this after simplifying the ereport down to an elog.  I see
no reason to consider this a user-facing error, so there's no need
to make translators deal with the message.

regards, tom lane




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-09-08 Thread Bharath Rupireddy
On Mon, Sep 7, 2020 at 8:50 PM Tom Lane  wrote:
>
> I think there is agreement that we're not going to change
> cancel_before_shmem_exit's restriction to only allow LIFO popping.
> So we should improve its comment to explain why.  The other thing
> that seems legitimately on-the-table for this CF entry is whether
> we should change cancel_before_shmem_exit to complain, rather than
> silently do nothing, if it fails to pop the stack.  Bharath's
> last patchset proposed to add an elog(DEBUG3) complaint, which
> seems to me to be just about entirely useless.  I'd make it an
> ERROR, or maybe an Assert.
>

Attaching a patch with both the comments modification and changing
DEBUG3 to ERROR. make check and make world-check passes on this patch.


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From 45cb0ada52eba25ce83985cc23edab0d34be9e4a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 7 Sep 2020 23:46:26 -0700
Subject: [PATCH v4] Modify cancel_before_shmem_exit() comments

Currently, there is no mention of safe usage of cancel_before_shmem_exit()
in the function comment and also it has a mis-directing point that, there
is still scope for improving the way cancel_before_shmem_exit() looks
for callback to be removed from before_shmem_exit_list. So, this patch
modifies the comment to reflect how the caller needs to use
before_shmem_exit_list mechanism.

This patch also adds an error in case the before_shmem_exit
function is not found at the latest entry.
---
 src/backend/storage/ipc/ipc.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index bdbc2c3ac4..3c2c30c189 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -381,9 +381,9 @@ on_shmem_exit(pg_on_exit_callback function, Datum arg)
  *		cancel_before_shmem_exit
  *
  *		this function removes a previously-registered before_shmem_exit
- *		callback.  For simplicity, only the latest entry can be
- *		removed.  (We could work harder but there is no need for
- *		current uses.)
+ *		callback.  We only look at the latest entry for removal, as we
+ * 		expect callers to add and remove temporary before_shmem_exit
+ * 		callbacks in strict LIFO order.
  * 
  */
 void
@@ -393,7 +393,18 @@ cancel_before_shmem_exit(pg_on_exit_callback function, Datum arg)
 		before_shmem_exit_list[before_shmem_exit_index - 1].function
 		== function &&
 		before_shmem_exit_list[before_shmem_exit_index - 1].arg == arg)
+	{
 		--before_shmem_exit_index;
+	}
+	else if (before_shmem_exit_index > 0)
+	{
+		ereport(ERROR,
+(errmsg("before_shmem_exit callback %p cannot be removed from the callback list as it doesn't match with the latest entry %p at index %d",
+		function,
+		before_shmem_exit_list[before_shmem_exit_index - 1].function,
+		before_shmem_exit_index),
+ errhint("Make sure the temporary before_shmem_exit callbacks are added and removed in strict Last-In-First-Out(LIFO) order.")));
+	}
 }
 
 /* 
-- 
2.25.1



Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-09-07 Thread Tom Lane
Michael Paquier  writes:
> Is there still something that needs to absolutely be done here knowing
> that we have bab1500 that got rid of the root issue?  Can the CF entry
> be marked as committed?

I think there is agreement that we're not going to change
cancel_before_shmem_exit's restriction to only allow LIFO popping.
So we should improve its comment to explain why.  The other thing
that seems legitimately on-the-table for this CF entry is whether
we should change cancel_before_shmem_exit to complain, rather than
silently do nothing, if it fails to pop the stack.  Bharath's
last patchset proposed to add an elog(DEBUG3) complaint, which
seems to me to be just about entirely useless.  I'd make it an
ERROR, or maybe an Assert.

regards, tom lane




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-09-07 Thread Michael Paquier
On Mon, Aug 10, 2020 at 10:10:08PM -0400, Robert Haas wrote:
> That split dates to the parallel query work, and there are some
> comments in shmem_exit() about it; see in particular the explanation
> in the middle where it says "Call dynamic shared memory callbacks." It
> seemed to me that I needed the re-entrancy behavior that is described
> there, but for a set of callbacks that needed to run before some of
> the existing callbacks and after others.

We still have a CF entry here:
https://commitfest.postgresql.org/29/2649/

Is there still something that needs to absolutely be done here knowing
that we have bab1500 that got rid of the root issue?  Can the CF entry
be marked as committed?

(FWIW, I would move any discussion about improving more stuff related
to shared memory cleanup code at proc exit into a new thread, as that
looks like a new topic.)
--
Michael


signature.asc
Description: PGP signature


Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Robert Haas
On Mon, Aug 10, 2020 at 8:46 PM Tom Lane  wrote:
> It's certainly arguable that PG_ENSURE_ERROR_CLEANUP is a special
> snowflake and needs to use a separate mechanism.  What is not real clear
> to me is why there are any other callers that must use before_shmem_exit
> rather than on_shmem_exit --- IOW, except for P_E_E_C's use, I have never
> been persuaded that the former callback list should exist at all.  The
> expectation for on_shmem_exit is that callbacks correspond to system
> service modules that are initialized in a particular order, and can safely
> be torn down in the reverse order.  Why can't the existing callers just
> make even-later entries into that same callback list?

That split dates to the parallel query work, and there are some
comments in shmem_exit() about it; see in particular the explanation
in the middle where it says "Call dynamic shared memory callbacks." It
seemed to me that I needed the re-entrancy behavior that is described
there, but for a set of callbacks that needed to run before some of
the existing callbacks and after others.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Tom Lane
Andres Freund  writes:
> I think there's two different aspects here: Having before_shmem_exit(),
> and having cancel_before_shmem_exit(). We could just not have the
> latter, and instead use a separate list for PG_ENSURE_ERROR_CLEANUP
> internally. With the callback for PG_ENSURE_ERROR_CLEANUP calling those
> from its private list.  There's no other uses of
> cancel_before_shmem_exit afaict.

It's certainly arguable that PG_ENSURE_ERROR_CLEANUP is a special
snowflake and needs to use a separate mechanism.  What is not real clear
to me is why there are any other callers that must use before_shmem_exit
rather than on_shmem_exit --- IOW, except for P_E_E_C's use, I have never
been persuaded that the former callback list should exist at all.  The
expectation for on_shmem_exit is that callbacks correspond to system
service modules that are initialized in a particular order, and can safely
be torn down in the reverse order.  Why can't the existing callers just
make even-later entries into that same callback list?

regards, tom lane




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Andres Freund
Hi,

On 2020-08-10 15:50:19 -0400, Robert Haas wrote:
> On Mon, Aug 10, 2020 at 3:41 PM Tom Lane  wrote:
> > > What I do think we should do, after thinking about it more,
> > > is discourage the casual use of before_shmem_exit() for things where
> > > on_shmem_exit() or on_proc_exit() would be just as good. I think
> > > that's what would avoid the most problems here.
> >
> > I think we're mostly in violent agreement here.  The interesting
> > question seems to be Andres' one about whether before_shmem_exit
> > actually has any safe use except for PG_ENSURE_ERROR_CLEANUP.
> > It may not, in which case perhaps we oughta rename it?

> If we could eliminate the other places where it's used, that'd be
> great. That's not too clear to me, though, because of the above two
> cases.

I think there's two different aspects here: Having before_shmem_exit(),
and having cancel_before_shmem_exit(). We could just not have the
latter, and instead use a separate list for PG_ENSURE_ERROR_CLEANUP
internally. With the callback for PG_ENSURE_ERROR_CLEANUP calling those
from its private list.  There's no other uses of
cancel_before_shmem_exit afaict.

I guess alternatively we at some point might just need a more complex
callback system, where one can specify where in relation to another
callback a callback needs to be registered etc.

Greetings,

Andres Freund




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Robert Haas
On Mon, Aug 10, 2020 at 3:41 PM Tom Lane  wrote:
> Robert Haas  writes:
> > Perhaps we really have four categories here:
> > (1) Temporary handlers for PG_ENSURE_ERROR_CLEANUP().
> > (2) High-level cleanup that needs to run after aborting out of the
> > current transaction.
> > (3) Per-subsystem shutdown for shared memory stuff.
> > (4) Per-subsystem shutdown for backend-private stuff.
>
> Hmm, I don't think we actually have any of (2) do we?  Or at least
> we aren't using ipc.c callbacks for them.

Well, I was thinking about the place where ShutdownPostgres() does
LockReleaseAll(), and also the stuff in RemoveTempRelationsCallback().
Those are pretty high-level operations that need to happen before we
start shutting down subsystems. Especially the removal of temp
relations.

> > What I do think we should do, after thinking about it more,
> > is discourage the casual use of before_shmem_exit() for things where
> > on_shmem_exit() or on_proc_exit() would be just as good. I think
> > that's what would avoid the most problems here.
>
> I think we're mostly in violent agreement here.  The interesting
> question seems to be Andres' one about whether before_shmem_exit
> actually has any safe use except for PG_ENSURE_ERROR_CLEANUP.
> It may not, in which case perhaps we oughta rename it?

If we could eliminate the other places where it's used, that'd be
great. That's not too clear to me, though, because of the above two
cases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Tom Lane
Robert Haas  writes:
> Perhaps we really have four categories here:
> (1) Temporary handlers for PG_ENSURE_ERROR_CLEANUP().
> (2) High-level cleanup that needs to run after aborting out of the
> current transaction.
> (3) Per-subsystem shutdown for shared memory stuff.
> (4) Per-subsystem shutdown for backend-private stuff.

Hmm, I don't think we actually have any of (2) do we?  Or at least
we aren't using ipc.c callbacks for them.

> What I do think we should do, after thinking about it more,
> is discourage the casual use of before_shmem_exit() for things where
> on_shmem_exit() or on_proc_exit() would be just as good. I think
> that's what would avoid the most problems here.

I think we're mostly in violent agreement here.  The interesting
question seems to be Andres' one about whether before_shmem_exit
actually has any safe use except for PG_ENSURE_ERROR_CLEANUP.
It may not, in which case perhaps we oughta rename it?

regards, tom lane




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Robert Haas
On Mon, Aug 10, 2020 at 10:44 AM Tom Lane  wrote:
> I agree that doing nothing seems like a bad idea.  My concern about
> allowing non-LIFO callback removal is that it seems to me that such
> usage patterns have likely got *other* bugs, so we should discourage
> that.  These callbacks don't exist in a vacuum: they reflect that
> the mainline code path has set up, or torn down, important state.
> Non-LIFO usage requires very strong assumptions that the states in
> question are not interdependent, and that's something I'd rather not
> rely on if we don't have to.

I have mixed feelings about this. On the one hand, I think saying that
it requires "very strong assumptions" about interdependence makes it
sound scarier than it is -- not that your statement is wrong, but that
in most cases we kinda know whether that's true or not, and the idea
that you shouldn't remove a callback upon which some later callback
might be depending is not too difficult for anybody to understand. On
the other hand, I think that in most cases we ought to be discouraging
people who are trying to do per-subsystem cleanup from using
before_shmem_exit() at all. I think such callbacks ought to be done
using on_shmem_exit() if they involve shared memory or on_proc_exit()
if they do not. Those functions don't have cancel_blah_exit()
variants, and I don't think they should: the right way to code those
things is not to remove the callbacks from the stack when they're no
longer needed, but rather to code the callbacks so that they will do
nothing if no work is required, leaving them permanently registered.

And the main reason why I think that such callbacks should be
registered using on_shmem_exit() or on_proc_exit() rather than
before_shmem_exit() is because of (1) the interdependency issue you
raise and (2) the fact that cancel_before_shmem_exit doesn't do what
people tend to think it does. For an example of (1), look at
ShutdownPostgres() and RemoveTempRelationsCallback(). The latter
aborts out of any transaction and then starts a new one to drop your
temp schema. But that might fail, so ShutdownPostgres() also needs to
be prepared to AbortOutOfAnyTransaction(). If you inserted more
cleanup steps that were thematically similar to those, each one of
them would also need to begin with AbortOutOfAnyTransaction(). That
suggests that this whole thing is a bit under-engineered. Some of the
other before_shmem_exit() callbacks don't start with that incantation,
but that's because they are per-subsystem callbacks that likely ought
to be using on_shmem_exit() rather than actually being the same sort
of thing.

Perhaps we really have four categories here:
(1) Temporary handlers for PG_ENSURE_ERROR_CLEANUP().
(2) High-level cleanup that needs to run after aborting out of the
current transaction.
(3) Per-subsystem shutdown for shared memory stuff.
(4) Per-subsystem shutdown for backend-private stuff.

Right now we blend (1), (2), and some of (3) together, but we could
try to create a cleaner line. We could redefine before_shmem_exit() as
being exactly #2, and abort out of any transaction before calling each
step, and document that you shouldn't use it unless you need that
behavior. And we could have a separate stack for #1 that is explicitly
LIFO and not intended for any other use. But then again maybe that's
overkill. What I do think we should do, after thinking about it more,
is discourage the casual use of before_shmem_exit() for things where
on_shmem_exit() or on_proc_exit() would be just as good. I think
that's what would avoid the most problems here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Tom Lane
Robert Haas  writes:
> Well, I don't really care whether or not we change this function to
> iterate over the callback list or whether we add a warning that you
> need to use it in LIFO order, but I think we should do one or the
> other, because this same confusion has come up multiple times. I
> thought that Tom was opposed to making it iterate over the callback
> list (for reasons I don't really understand, honestly) so adding a
> comment and a cross-check seemed like the practical option. Now I also
> think it's fine to iterate over the callback list: this function
> doesn't get used so much that it's likely to be a performance problem,
> and I don't think this is the first bug that would have become a
> non-bug had we done that years and years ago whenever it was first
> proposed. In fact, I'd go so far as to say that the latter is a
> slightly better option. However, doing nothing is clearly worst.

I agree that doing nothing seems like a bad idea.  My concern about
allowing non-LIFO callback removal is that it seems to me that such
usage patterns have likely got *other* bugs, so we should discourage
that.  These callbacks don't exist in a vacuum: they reflect that
the mainline code path has set up, or torn down, important state.
Non-LIFO usage requires very strong assumptions that the states in
question are not interdependent, and that's something I'd rather not
rely on if we don't have to.

regards, tom lane




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Robert Haas
On Fri, Aug 7, 2020 at 5:20 PM Andres Freund  wrote:
> In which situations is the removal actually useful *and* safe, with
> these constraints? You'd have to have a very narrow set of functions
> that are called while the exit hook is present, i.e. basically this
> would only be usable for PG_ENSURE_ERROR_CLEANUP and nothing else.  And
> even there it seems like it's pretty easy to get into a situation where
> it's not safe.

Well, I don't really care whether or not we change this function to
iterate over the callback list or whether we add a warning that you
need to use it in LIFO order, but I think we should do one or the
other, because this same confusion has come up multiple times. I
thought that Tom was opposed to making it iterate over the callback
list (for reasons I don't really understand, honestly) so adding a
comment and a cross-check seemed like the practical option. Now I also
think it's fine to iterate over the callback list: this function
doesn't get used so much that it's likely to be a performance problem,
and I don't think this is the first bug that would have become a
non-bug had we done that years and years ago whenever it was first
proposed. In fact, I'd go so far as to say that the latter is a
slightly better option. However, doing nothing is clearly worst.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-10 Thread Bharath Rupireddy
On Fri, Aug 7, 2020 at 11:09 PM Robert Haas  wrote:
>
> On Fri, Aug 7, 2020 at 1:12 PM Tom Lane  wrote:
> > That's a meaningless statement for any one caller.  So it needs to be more
> > like "we expect callers to add and remove temporary before_shmem_exit
> > callbacks in strict LIFO order".
>
> Sure, that seems fine.
>

v2 patch has the comments modified.

>
> > I wonder whether we ought to change the function to complain if the
> > last list entry doesn't match.  We'd have caught this bug sooner
> > if it did, and it's not very clear why silently doing nothing is
> > a good idea when there's no match.
>
> +1.
>

This is a good idea. v3 patch has both the modified comments(from v2)
as well as a DEBUG3 (DEBUG3 level, because the other
non-error/non-fatal logs in ipc.c are using the same level) log to
report when the latest entry for removal is not matched with the one
the caller cancel_before_shmem_exit() is looking for and a hint on how
to safely use temporary before_shmem_exit() callbacks. In v3 patch,
function pointer is being printed, I'm not sure how much it is helpful
to have function pointers in the logs though there are some other
places printing pointers into the logs, I wish I could print function
names. (Is there a way we could get function names from function
pointers?).

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From 7cdea387c2e18eb537077ccf518747b9cffb405c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 10 Aug 2020 10:42:23 +0530
Subject: [PATCH v2] Modify cancel_before_shmem_exit() comments

Currently, there is no mention of safe usage of cancel_before_shmem_exit()
in the function comment and also it has a point that, there is still
scope for improving the way cancel_before_shmem_exit() looks for callback
to removed from before_shmem_exit_list. So, this patch modifies the comment
to reflect how the caller needs to use before_shmem_exit_list mechanism.
---
 src/backend/storage/ipc/ipc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index bdbc2c3ac4..3ca5e7c89c 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -381,9 +381,9 @@ on_shmem_exit(pg_on_exit_callback function, Datum arg)
  *		cancel_before_shmem_exit
  *
  *		this function removes a previously-registered before_shmem_exit
- *		callback.  For simplicity, only the latest entry can be
- *		removed.  (We could work harder but there is no need for
- *		current uses.)
+ *		callback.  We only look at the latest entry for removal, as we
+ * 		expect callers to add and remove temporary before_shmem_exit
+ * 		callbacks in strict LIFO order.
  * 
  */
 void
-- 
2.25.1

From b41e4fa116f1e6eb5d1cee25f9fc6b0fe020a8af Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 10 Aug 2020 12:13:31 +0530
Subject: [PATCH v3] Modify cancel_before_shmem_exit() comments

Currently, there is no mention of safe usage of cancel_before_shmem_exit()
in the function comment and also it has a point that, there is still
scope for improving the way cancel_before_shmem_exit() looks for callback
to removed from before_shmem_exit_list. So, this patch modifies the comment
to reflect how the caller needs to use before_shmem_exit_list mechanism.
---
 src/backend/storage/ipc/ipc.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index bdbc2c3ac4..3cc5cf9b9b 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -381,9 +381,9 @@ on_shmem_exit(pg_on_exit_callback function, Datum arg)
  *		cancel_before_shmem_exit
  *
  *		this function removes a previously-registered before_shmem_exit
- *		callback.  For simplicity, only the latest entry can be
- *		removed.  (We could work harder but there is no need for
- *		current uses.)
+ *		callback.  We only look at the latest entry for removal, as we
+ * 		expect callers to add and remove temporary before_shmem_exit
+ * 		callbacks in strict LIFO order.
  * 
  */
 void
@@ -393,7 +393,18 @@ cancel_before_shmem_exit(pg_on_exit_callback function, Datum arg)
 		before_shmem_exit_list[before_shmem_exit_index - 1].function
 		== function &&
 		before_shmem_exit_list[before_shmem_exit_index - 1].arg == arg)
+	{
 		--before_shmem_exit_index;
+	}
+	else if (before_shmem_exit_index > 0)
+	{
+		ereport(DEBUG3,
+(errmsg("before_shmem_exit callback %p cannot be removed from the callback list as it doesn't match with the latest entry %p at index %d",
+		function,
+		before_shmem_exit_list[before_shmem_exit_index - 1].function,
+		before_shmem_exit_index),
+ errhint("Make sure the temporary before_shmem_exit callbacks are added and removed in strict Last-In-First-Out(LIFO) order.")));
+	}
 }
 
 /

Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-07 Thread Andres Freund
Hi,

On 2020-08-07 12:29:03 -0400, Robert Haas wrote:
> On Thu, Aug 6, 2020 at 11:46 PM Bharath Rupireddy
>  wrote:
> > I sent the patch previously[1], but attaching here again, modifies
> > cancel_before_shmem_exit() function comment to reflect the safe usage
> > of before_shmem_exit_list callback mechanism and also removes the
> > point "For simplicity, only the latest entry can be removed*"
> > as this gives a meaning that there is still scope for improvement in
> > cancel_before_shmem_exit() search mechanism.
> >
> > Thoughts?
> 
> I think that the first part of the comment change you suggest is a
> good idea and would avoid developer confusion, but I think that the
> statement about unordered removal of comments being risky doesn't add
> much. It's too vague to help anybody and I don't think I believe it,
> either. So I suggest something more like:
> 
> - * callback.  For simplicity, only the latest entry can be
> - * removed.  (We could work harder but there is no need for
> - * current uses.)
> + * callback.  We only look at the latest entry for removal, as we
> + * expect the caller to use before_shmem_exit callback mechanism
> + * in the LIFO order.

In which situations is the removal actually useful *and* safe, with
these constraints? You'd have to have a very narrow set of functions
that are called while the exit hook is present, i.e. basically this
would only be usable for PG_ENSURE_ERROR_CLEANUP and nothing else.  And
even there it seems like it's pretty easy to get into a situation where
it's not safe.

Greetings,

Andres Freund




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-07 Thread Robert Haas
On Fri, Aug 7, 2020 at 1:12 PM Tom Lane  wrote:
> That's a meaningless statement for any one caller.  So it needs to be more
> like "we expect callers to add and remove temporary before_shmem_exit
> callbacks in strict LIFO order".

Sure, that seems fine.

> I wonder whether we ought to change the function to complain if the
> last list entry doesn't match.  We'd have caught this bug sooner
> if it did, and it's not very clear why silently doing nothing is
> a good idea when there's no match.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-07 Thread Tom Lane
Robert Haas  writes:
> ... So I suggest something more like:

> - * callback.  For simplicity, only the latest entry can be
> - * removed.  (We could work harder but there is no need for
> - * current uses.)
> + * callback.  We only look at the latest entry for removal, as we
> + * expect the caller to use before_shmem_exit callback mechanism
> + * in the LIFO order.

That's a meaningless statement for any one caller.  So it needs to be more
like "we expect callers to add and remove temporary before_shmem_exit
callbacks in strict LIFO order".

I wonder whether we ought to change the function to complain if the
last list entry doesn't match.  We'd have caught this bug sooner
if it did, and it's not very clear why silently doing nothing is
a good idea when there's no match.

regards, tom lane




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-07 Thread Robert Haas
On Thu, Aug 6, 2020 at 11:46 PM Bharath Rupireddy
 wrote:
> I sent the patch previously[1], but attaching here again, modifies
> cancel_before_shmem_exit() function comment to reflect the safe usage
> of before_shmem_exit_list callback mechanism and also removes the
> point "For simplicity, only the latest entry can be removed*"
> as this gives a meaning that there is still scope for improvement in
> cancel_before_shmem_exit() search mechanism.
>
> Thoughts?

I think that the first part of the comment change you suggest is a
good idea and would avoid developer confusion, but I think that the
statement about unordered removal of comments being risky doesn't add
much. It's too vague to help anybody and I don't think I believe it,
either. So I suggest something more like:

- * callback.  For simplicity, only the latest entry can be
- * removed.  (We could work harder but there is no need for
- * current uses.)
+ * callback.  We only look at the latest entry for removal, as we
+ * expect the caller to use before_shmem_exit callback mechanism
+ * in the LIFO order.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-06 Thread Bharath Rupireddy
On Thu, Aug 6, 2020 at 11:51 PM Robert Haas  wrote:
>
> On Thu, Jul 30, 2020 at 8:11 AM Robert Haas  wrote:
> > Unless somebody complains pretty soon, I'm going to go ahead and do
> > what is described above.
>
> Done.
>

Thanks!

I have one more request to make: since we are of the opinion to not
change the way cancel_before_shmem_exit() searches
before_shmem_exit_list array, wouldn't it be good to adjust comments
before the function cancel_before_shmem_exit()?

I sent the patch previously[1], but attaching here again, modifies
cancel_before_shmem_exit() function comment to reflect the safe usage
of before_shmem_exit_list callback mechanism and also removes the
point "For simplicity, only the latest entry can be removed*"
as this gives a meaning that there is still scope for improvement in
cancel_before_shmem_exit() search mechanism.

Thoughts?

[1] - 
https://www.postgresql.org/message-id/CALj2ACVwOKZ8qYUsZrU2y2efnYZOLRxPC6k52FQcB3oriH9Kcg%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From b1c8e83b5c81102070a66d7761ee2aa8894d6ec2 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 24 Jul 2020 09:35:54 +0530
Subject: [PATCH v1] Modify cancel_before_shmem_exit() comments

Currently, there is no mention of safe usage of cancel_before_shmem_exit()
in the function comment and also it has a point that, there is still
scope for improving the way cancel_before_shmem_exit() looks for callback
to removed from before_shmem_exit_list. So, this patch modifies the comment
to reflect how the caller needs to use before_shmem_exit_list mechanism.
---
 src/backend/storage/ipc/ipc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index bdbc2c3ac4..4ade61fe38 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -381,9 +381,9 @@ on_shmem_exit(pg_on_exit_callback function, Datum arg)
  *		cancel_before_shmem_exit
  *
  *		this function removes a previously-registered before_shmem_exit
- *		callback.  For simplicity, only the latest entry can be
- *		removed.  (We could work harder but there is no need for
- *		current uses.)
+ *		callback.  We only look at the latest entry for removal, as we
+ * 		expect the caller to use before_shmem_exit callback mechanism
+ * 		in the LIFO order. Un-ordered removal of callbacks may be risky.
  * 
  */
 void
-- 
2.25.1



Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-08-06 Thread Robert Haas
On Thu, Jul 30, 2020 at 8:11 AM Robert Haas  wrote:
> Unless somebody complains pretty soon, I'm going to go ahead and do
> what is described above.

Done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-07-30 Thread Robert Haas
On Mon, Jul 20, 2020 at 3:47 PM Robert Haas  wrote:
> But I also agree that what pg_start_backup() was doing before v13 was
> wrong; that's why I committed
> 303640199d0436c5e7acdf50b837a027b5726594. The only reason I didn't
> back-patch it is because the consequences are so minor I didn't think
> it was worth worrying about. We could, though. I'd be somewhat
> inclined to both do that and also change LLVM to use on_proc_exit() in
> master, but I don't feel super-strongly about it.

Unless somebody complains pretty soon, I'm going to go ahead and do
what is described above.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-07-26 Thread Bharath Rupireddy
On Fri, Jul 24, 2020 at 8:07 PM Robert Haas  wrote:
>
> On Fri, Jul 24, 2020 at 7:10 AM Bharath Rupireddy
>  wrote:
> > I looked at what actually llvm_shutdown() does? It frees up JIT stacks,
also if exists perf related resource, using LLVMOrcDisposeInstance() and
LLVMOrcUnregisterPerf(), that were dynamically allocated in
llvm_session_initialize through a JIT library function
LLVMOrcCreateInstance() [1].
> >
> > It looks like there is no problem in moving llvm_shutdown to either
on_shmem_exit() or on_proc_exit().
>
> If it doesn't involve shared memory, I guess it can be on_proc_exit()
> rather than on_shmem_exit().
>
> I guess the other question is why we're doing it at all. What
> resources are being allocated that wouldn't be freed up by process
> exit anyway?
>

LLVMOrcCreateInstance() and LLVMOrcDisposeInstance() are doing new and
delete respectively, I just found these functions from the link [1]. But I
don't exactly know whether there are any other resources being allocated
that can't be freed up by proc_exit(). Tagging @Andres Freund  for inputs
on whether we have any problem making llvm_shutdown() a on_proc_exit()
callback instead of before_shmem_exit() callback.

And as suggested in the previous mails, we wanted to make it on_proc_exit()
to avoid the abort issue reported in this mail chain, however if we take
the abort issue fix commit # 303640199d0436c5e7acdf50b837a027b5726594 as
mentioned in the previous response[2], then it may not be necessary, right
now, but just to be safer and to avoid any of these similar kind of issues
in future, we can consider this change as well.

[1] - https://llvm.org/doxygen/OrcCBindings_8cpp_source.html

 LLVMOrcJITStackRef LLVMOrcCreateInstance(LLVMTargetMachineRef TM) {
  TargetMachine *TM2(unwrap(TM));
   Triple T(TM2->getTargetTriple());
   auto IndirectStubsMgrBuilder =
  orc::createLocalIndirectStubsManagerBuilder(T);
   OrcCBindingsStack *JITStack =
  new OrcCBindingsStack(*TM2, std::move(IndirectStubsMgrBuilder));
   return wrap(JITStack);
 }

LLVMErrorRef LLVMOrcDisposeInstance(LLVMOrcJITStackRef JITStack) {
  auto *J = unwrap(JITStack);
  auto Err = J->shutdown();
  delete J;
  return wrap(std::move(Err));
 }

[2] -
https://www.postgresql.org/message-id/CALj2ACVwOKZ8qYUsZrU2y2efnYZOLRxPC6k52FQcB3oriH9Kcg%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-07-24 Thread Robert Haas
On Fri, Jul 24, 2020 at 7:10 AM Bharath Rupireddy
 wrote:
> I looked at what actually llvm_shutdown() does? It frees up JIT stacks, also 
> if exists perf related resource, using LLVMOrcDisposeInstance() and 
> LLVMOrcUnregisterPerf(), that were dynamically allocated in 
> llvm_session_initialize through a JIT library function 
> LLVMOrcCreateInstance() [1].
>
> It looks like there is no problem in moving llvm_shutdown to either 
> on_shmem_exit() or on_proc_exit().

If it doesn't involve shared memory, I guess it can be on_proc_exit()
rather than on_shmem_exit().

I guess the other question is why we're doing it at all. What
resources are being allocated that wouldn't be freed up by process
exit anyway?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-07-24 Thread Bharath Rupireddy
On Tue, Jul 21, 2020 at 1:17 AM Robert Haas  wrote:
>
> On Tue, Jul 7, 2020 at 12:55 PM Andres Freund  wrote:
> > What are you proposing? For now we could easily enough work around this
> > by just making it a on_proc_exit() callback, but that doesn't really
> > change the fundamental issue imo.
>
> I think it would be more correct for it to be an on_proc_exit()
> callback, because before_shmem_exit() callbacks can and do perform
> actions which rely on an awful lot of the system being still in a
> working state. RemoveTempRelationsCallback() is a good example: it
> thinks it can start and end transactions and make a bunch of catalog
> changes. I don't know that any of that could use JIT, but moving the
> JIT cleanup to the on_shmem_exit() stage seems better. At that point,
> there shouldn't be anybody doing anything that relies on being able to
> perform logical changes to the database; we're just shutting down
> low-level subsystems at that point, and thus presumably not doing
> anything that could possibly need JIT.
>

I looked at what actually llvm_shutdown() does? It frees up JIT stacks,
also if exists perf related resource, using LLVMOrcDisposeInstance() and
LLVMOrcUnregisterPerf(), that were dynamically allocated in
llvm_session_initialize through a JIT library function
LLVMOrcCreateInstance() [1].

It looks like there is no problem in moving llvm_shutdown to either
on_shmem_exit() or on_proc_exit().

[1] - https://llvm.org/doxygen/OrcCBindings_8cpp_source.html

>
> But I also agree that what pg_start_backup() was doing before v13 was
> wrong; that's why I committed
> 303640199d0436c5e7acdf50b837a027b5726594. The only reason I didn't
> back-patch it is because the consequences are so minor I didn't think
> it was worth worrying about. We could, though. I'd be somewhat
> inclined to both do that and also change LLVM to use on_proc_exit() in
> master, but I don't feel super-strongly about it.
>

Patch: v1-0001-Move-llvm_shutdown-to-on_proc_exit-list-from-befo.patch
Moved llvm_shutdown to on_proc_exit() call back list. Request to consider
this change for master, if possible <=13 versions. Basic JIT use cases and
regression tests are working fine with the patch.

Patches: PG11-0001-Fix-minor-problems-with-non-exclusive-backup-clea.patch
and PG12-0001-Fix-minor-problems-with-non-exclusive-backup-cleanup.patch
Request to consider the commit
303640199d0436c5e7acdf50b837a027b5726594(above two patches are for this
commit) to versions < 13, to fix the abort issue. Please note that the
above two patches have no difference in the code, just I made it applicable
on PG11.

Patch: v1-0001-Modify-cancel_before_shmem_exit-comments.patch
This patch, modifies cancel_before_shmem_exit() function comment to reflect
the safe usage of before_shmem_exit_list callback mechanism and also
removes the point "For simplicity, only the latest entry can be
removed*" as this gives a meaning that there is still scope for
improvement in cancel_before_shmem_exit() search mechanism.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From 5fae4b3a046789fb7b54e689c979b01cb322aaf9 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 23 Jul 2020 17:56:21 +0530
Subject: [PATCH v1] Move llvm_shutdown to on_proc_exit list from
 before_shmem_exit.

llvm_shutdown frees up dynamically allocated memory for jit
compilers in a session/backend. Having this as on_proc_exit doesn't
cause any harm.
---
 src/backend/jit/llvm/llvmjit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index af8b34aaaf..43bed78a52 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -683,7 +683,7 @@ llvm_session_initialize(void)
 	}
 #endif
 
-	before_shmem_exit(llvm_shutdown, 0);
+	on_proc_exit(llvm_shutdown, 0);
 
 	llvm_session_initialized = true;
 
-- 
2.25.1

From 3b0a048afd7ad6d8564a54d13397c72f6eadc5da Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 24 Jul 2020 08:55:52 +0530
Subject: [PATCH v5] Fix minor problems with non-exclusive backup cleanup.

The previous coding imagined that it could call before_shmem_exit()
when a non-exclusive backup began and then remove the previously-added
handler by calling cancel_before_shmem_exit() when that backup
ended. However, this only works provided that nothing else in the
system has registered a before_shmem_exit() hook in the interim,
because cancel_before_shmem_exit() is documented to remove a callback
only if it is the latest callback registered. It also only works
if nothing can ERROR out between the time that sessionBackupState
is reset and the time that cancel_before_shmem_exit(), which doesn't
seem to be strictly true.

To fix, leave the handler installed for the lifetime of the session,
arrange to install it just once, and teach it to quietly do nothing if
there isn't a non-exclusive backup in process.

This is a bug, but for now I'm not going to back-p

Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-07-20 Thread Robert Haas
On Tue, Jul 7, 2020 at 12:55 PM Andres Freund  wrote:
> What are you proposing? For now we could easily enough work around this
> by just making it a on_proc_exit() callback, but that doesn't really
> change the fundamental issue imo.

I think it would be more correct for it to be an on_proc_exit()
callback, because before_shmem_exit() callbacks can and do perform
actions which rely on an awful lot of the system being still in a
working state. RemoveTempRelationsCallback() is a good example: it
thinks it can start and end transactions and make a bunch of catalog
changes. I don't know that any of that could use JIT, but moving the
JIT cleanup to the on_shmem_exit() stage seems better. At that point,
there shouldn't be anybody doing anything that relies on being able to
perform logical changes to the database; we're just shutting down
low-level subsystems at that point, and thus presumably not doing
anything that could possibly need JIT.

But I also agree that what pg_start_backup() was doing before v13 was
wrong; that's why I committed
303640199d0436c5e7acdf50b837a027b5726594. The only reason I didn't
back-patch it is because the consequences are so minor I didn't think
it was worth worrying about. We could, though. I'd be somewhat
inclined to both do that and also change LLVM to use on_proc_exit() in
master, but I don't feel super-strongly about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-07-14 Thread Bharath Rupireddy
On Tue, Jul 7, 2020 at 10:24 PM Andres Freund  wrote:
>
> On 2020-07-07 09:44:41 -0400, Tom Lane wrote:
> > Bharath Rupireddy  writes:
> > > Firstly, pg_start_backup registers nonexclusive_base_backup_cleanup as
> > > on_shmem_exit call back which will
> > > add this function to the before_shmem_exit_list array which is
> > > supposed to be removed on pg_stop_backup
> > > so that we can do the pending cleanup and issue a warning for each
> > > pg_start_backup for which we did not call
> > > the pg_stop backup. Now, I executed a query for which JIT is picked
> > > up, then the the llvm compiler inserts it's
> > > own exit callback i.e. llvm_shutdown at the end of
> > > before_shmem_exit_list array. Now, I executed pg_stop_backup
> > > and call to cancel_before_shmem_exit() is made with the expectation
> > > that the nonexclusive_base_backup_cleanup
> > > callback is removed from before_shmem_exit_list array.
> >
> > I'm of the opinion that the JIT code is abusing this mechanism, and the
> > right thing to do is fix that.
>
> What are you proposing? For now we could easily enough work around this
> by just making it a on_proc_exit() callback, but that doesn't really
> change the fundamental issue imo.
>
> > The restriction you propose to remove is not just there out of
> > laziness, it's an expectation about what safe use of this mechanism
> > would involve.  Un-ordered removal of callbacks seems pretty risky: it
> > would mean that whatever cleanup is needed is not going to be done in
> > LIFO order.
>

I quickly searched(in HEAD) what all the callbacks are getting
registered to before_shmem_exit_list, with the intention to see if
they also call corresponding cancel_before_shmem_exit() after their
intended usage is done.

For few of the callbacks there is no cancel_before_shmem_exit(). This
seems expected; those callbacks ought to be executed before shmem
exit. These callbacks are(let say SET 1): ShutdownPostgres,
logicalrep_worker_onexit, llvm_shutdown, Async_UnlistenOnExit,
RemoveTempRelationsCallback, ShutdownAuxiliaryProcess,
do_pg_abort_backup in xlog.c (this callback exist only in v13 or
later), AtProcExit_Twophase.

Which means, once they are into the before_shmem_exit_list array, in
some order, they are never going to be removed from it as they don't
have corresponding cancel_before_shmem_exit() and the relative order
of execution remains the same.

And there are other callbacks that are getting registered to
before_shmem_exit_list array(let say SET 2): apw_detach_shmem,
_bt_end_vacuum_callback, pg_start_backup_callback,
pg_stop_backup_callback, createdb_failure_callback,
movedb_failure_callback, do_pg_abort_backup(in basebackup.c). They all
have corresponding cancel_before_shmem_exit() to unregister/remove the
callbacks from before_shmem_exit_list array.

I think the callbacks that have no cancel_before_shmem_exit()(SET 1)
may have to be executed in the LIFO order: it makes sense to execute
ShutdownPostgres at the end after let's say other callbacks in SET 1.

And the SET 2 callbacks have cancel_before_shmem_exit() with the only
intention that there's no need to call the callbacks on the
before_shmem_exit(), since they are not needed, and try to remove from
the before_shmem_exit_list array and may fail, if any other callback
gets registered in between.

If I'm not wrong with the above points, we must enhance
cancel_before_shmem_exit() or have cancel_before_shmem_exit_v2() (as
mentioned in my below response).

>
> Maybe I am confused, but isn't it <13's pg_start_backup() that's
> violating the protocol much more clearly than the JIT code? Given that
> it relies on there not being any callbacks registered between two SQL
> function calls?  I mean, what it does is basically:
>
> 1) before_shmem_exit(nonexclusive_base_backup_cleanup...
> 2) arbitrary code executed for a long time
> 3) cancel_before_shmem_exit(nonexclusive_base_backup_cleanup...
>
> which pretty obviously can't at all deal with any other
> before_shmem_exit callbacks being registered in 2).  Won't this be a
> problem for every other before_shmem_exit callback that we register
> on-demand? Say Async_UnlistenOnExit, RemoveTempRelationsCallback?
>

Yes, for versions <13's, clearly pg_start_backup causes the problem
and the issue can also be reproduced with Async_UnlistenOnExit,
RemoveTempRelationsCallback coming in between pg_start_backup and
pg_stop_backup.

We can have it fixed in a few ways: 1) enhance
cancel_before_shmem_exit() as attached in the original patch. 2) have
existing cancel_before_shmem_exit(), whenever called for
nonexclusive_base_backup_cleanup(), we can look for the entire array
instead of just the last entry. 3) have a separate function, say,
cancel_before_shmem_exit_v2(), that searches for the entire
before_shmem_exit_list array(the logic proposed in this patch) so that
it will not disturb the existing cancel_before_shmem_exit(). 4) or try
to have the pg_start_backup code that exists in after > 13 versions.

If okay to

Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-07-07 Thread Andres Freund
Hi,

On 2020-07-07 09:44:41 -0400, Tom Lane wrote:
> Bharath Rupireddy  writes:
> > Firstly, pg_start_backup registers nonexclusive_base_backup_cleanup as
> > on_shmem_exit call back which will
> > add this function to the before_shmem_exit_list array which is
> > supposed to be removed on pg_stop_backup
> > so that we can do the pending cleanup and issue a warning for each
> > pg_start_backup for which we did not call
> > the pg_stop backup. Now, I executed a query for which JIT is picked
> > up, then the the llvm compiler inserts it's
> > own exit callback i.e. llvm_shutdown at the end of
> > before_shmem_exit_list array. Now, I executed pg_stop_backup
> > and call to cancel_before_shmem_exit() is made with the expectation
> > that the nonexclusive_base_backup_cleanup
> > callback is removed from before_shmem_exit_list array.
> 
> I'm of the opinion that the JIT code is abusing this mechanism, and the
> right thing to do is fix that.

What are you proposing? For now we could easily enough work around this
by just making it a on_proc_exit() callback, but that doesn't really
change the fundamental issue imo.


> The restriction you propose to remove is not just there out of
> laziness, it's an expectation about what safe use of this mechanism
> would involve.  Un-ordered removal of callbacks seems pretty risky: it
> would mean that whatever cleanup is needed is not going to be done in
> LIFO order.

Maybe I am confused, but isn't it <13's pg_start_backup() that's
violating the protocol much more clearly than the JIT code? Given that
it relies on there not being any callbacks registered between two SQL
function calls?  I mean, what it does is basically:

1) before_shmem_exit(nonexclusive_base_backup_cleanup...
2) arbitrary code executed for a long time
3) cancel_before_shmem_exit(nonexclusive_base_backup_cleanup...

which pretty obviously can't at all deal with any other
before_shmem_exit callbacks being registered in 2).  Won't this be a
problem for every other before_shmem_exit callback that we register
on-demand? Say Async_UnlistenOnExit, RemoveTempRelationsCallback?

Greetings,

Andres Freund




Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

2020-07-07 Thread Tom Lane
Bharath Rupireddy  writes:
> Firstly, pg_start_backup registers nonexclusive_base_backup_cleanup as
> on_shmem_exit call back which will
> add this function to the before_shmem_exit_list array which is
> supposed to be removed on pg_stop_backup
> so that we can do the pending cleanup and issue a warning for each
> pg_start_backup for which we did not call
> the pg_stop backup. Now, I executed a query for which JIT is picked
> up, then the the llvm compiler inserts it's
> own exit callback i.e. llvm_shutdown at the end of
> before_shmem_exit_list array. Now, I executed pg_stop_backup
> and call to cancel_before_shmem_exit() is made with the expectation
> that the nonexclusive_base_backup_cleanup
> callback is removed from before_shmem_exit_list array.

I'm of the opinion that the JIT code is abusing this mechanism, and the
right thing to do is fix that.  The restriction you propose to remove is
not just there out of laziness, it's an expectation about what safe use of
this mechanism would involve.  Un-ordered removal of callbacks seems
pretty risky: it would mean that whatever cleanup is needed is not going
to be done in LIFO order.

regards, tom lane