Re: Avoid memory leaks during base backups

2022-10-22 Thread Michael Paquier
On Fri, Oct 21, 2022 at 09:02:04PM +0530, Bharath Rupireddy wrote:
> After all, that is what is being discussed here; what if palloc down
> below fails and they're not reset to NULL after MemoryContextReset()?

It does not seem to matter much to me for that, so left these as
proposed.

> On Fri, Oct 21, 2022 at 12:11 PM Kyotaro Horiguchi  
> wrote:
>> I think the "less" is somewhat obscure.  I feel we should be more
>> explicitly. And we don't need to put emphasis on "leak".  I recklessly
>> propose this as the draft.
> 
> I tried to put it simple, please see the attached v10. I'll leave it
> to the committer's discretion for better wording.

I am still not sure what "less" means when referring to a "memory
context".  Anyway, I have gone through the comments and finished with
something much more simplified, and applied the whole.
--
Michael


signature.asc
Description: PGP signature


Re: Avoid memory leaks during base backups

2022-10-21 Thread Bharath Rupireddy
On Fri, Oct 21, 2022 at 11:58 AM Michael Paquier  wrote:
>
> To be exact, it seems to me that tablespace_map and backup_state
> should be reset before deleting backupcontext, but the reset of
> backupcontext should happen after the fact.
>
> +   backup_state = NULL;
> tablespace_map = NULL;
> These two in pg_backup_start() don't matter, do they?  They are
> reallocated a couple of lines down.

After all, that is what is being discussed here; what if palloc down
below fails and they're not reset to NULL after MemoryContextReset()?

> +* across. We keep the memory allocated in this memory context less,
> What does "We keep the memory allocated in this memory context less"
> mean here?

We try to keep it less because we don't want to allocate more memory
and leak it unless pg_start_backup() is called again. Please read the
description. I'll leave it to the committer's discretion whether to
have that part or remove it.

On Fri, Oct 21, 2022 at 12:11 PM Kyotaro Horiguchi
 wrote:
>
>
> +* across. We keep the memory allocated in this memory context less,
> +* because any error before reaching pg_backup_stop() can leak the 
> memory
> +* until pg_backup_start() is called again. While this is not smart, 
> it
> +* helps to keep things simple.
>
> I think the "less" is somewhat obscure.  I feel we should be more
> explicitly. And we don't need to put emphasis on "leak".  I recklessly
> propose this as the draft.

I tried to put it simple, please see the attached v10. I'll leave it
to the committer's discretion for better wording.

On Fri, Oct 21, 2022 at 7:47 PM Robert Haas  wrote:
>
> On Fri, Oct 21, 2022 at 2:18 AM Michael Paquier  wrote:
> > AFAIK, one of the callbacks associated to a memory context could
> > fail, see comments before MemoryContextCallResetCallbacks() in
> > MemoryContextDelete().  I agree that it should not matter here, but I
> > think that it is better to reset the pointers before attempting the
> > deletion of the memory context in this case.
>
> I think this is nitpicking. There's no real danger here, and if there
> were, the error handling would have to take it into account somehow,
> which it doesn't.
>
> I'd probably do it before resetting the context as a matter of style,
> to make it clear that there's no window in which the pointers are set
> but referencing invalid memory. But I do not think it makes any
> practical difference at all.

Please see the attached v10.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 3d8cf10d0bab48afee639359669b69c2901afd34 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 21 Oct 2022 15:04:36 +
Subject: [PATCH v10] Avoid memory leaks during backups using SQL-callable
 functions

Any failure while taking backups using SQL-callable functions can
leak the memory allocated for backup_state and tablespace_map, as
they both need to be long-lived, they are being created in
TopMemoryContext.

To fix the memory leak problem, we create a special session-level
memory context as a direct child of TopMemoryContext so that the
memory allocated is carried across. We delete the memory context at
the end of pg_backup_stop(). We keep the memory allocated in this
memory context less, because any error before reaching
pg_backup_stop() can still leak the memory until pg_backup_start()
is called again. While this is not smart, it helps to keep things
simple.

Author: Bharath Rupireddy
Reviewed-by: Robert Haas, Alvaro Herrera
Reviewed-by: Cary Huang, Michael Paquier
Discussion: https://www.postgresql.org/message-id/CALj2ACXqvfKF2B0beQ=aJMdWnpNohmBPsRg=EDQj_6y1t2O8mQ@mail.gmail.com
---
 src/backend/access/transam/xlogfuncs.c | 42 ++
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index a801a94fe8..e9ec86316b 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -45,6 +45,9 @@
 static BackupState *backup_state = NULL;
 static StringInfo tablespace_map = NULL;
 
+/* A long-lived workspace for SQL-callable backup functions. */
+static MemoryContext backupcontext = NULL;
+
 /*
  * pg_backup_start: set up for taking an on-line backup dump
  *
@@ -72,27 +75,27 @@ pg_backup_start(PG_FUNCTION_ARGS)
 
 	/*
 	 * backup_state and tablespace_map need to be long-lived as they are used
-	 * in pg_backup_stop().
+	 * in pg_backup_stop(). We create a special session-level memory context as
+	 * a direct child of TopMemoryContext so that the memory allocated is
+	 * carried across. We typically delete the memory context at the end of
+	 * pg_backup_stop(), however, an error before it can leak the memory until
+	 * pg_backup_start() is called again. Hence, we try to keep the memory
+	 * allocated in this memory context less. While this is not smart, it helps
+	 * to keep things simple.
 	 */
-	

Re: Avoid memory leaks during base backups

2022-10-21 Thread Robert Haas
On Fri, Oct 21, 2022 at 2:18 AM Michael Paquier  wrote:
> AFAIK, one of the callbacks associated to a memory context could
> fail, see comments before MemoryContextCallResetCallbacks() in
> MemoryContextDelete().  I agree that it should not matter here, but I
> think that it is better to reset the pointers before attempting the
> deletion of the memory context in this case.

I think this is nitpicking. There's no real danger here, and if there
were, the error handling would have to take it into account somehow,
which it doesn't.

I'd probably do it before resetting the context as a matter of style,
to make it clear that there's no window in which the pointers are set
but referencing invalid memory. But I do not think it makes any
practical difference at all.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Avoid memory leaks during base backups

2022-10-21 Thread Kyotaro Horiguchi
At Thu, 20 Oct 2022 19:47:07 +0200, Alvaro Herrera  
wrote in 
> I agree that's a good idea, and the patch looks good to me, but I don't
> think asserting that they are null afterwards is useful.

+1 for this direction. And the patch is fine to me.


>   oldcontext = MemoryContextSwitchTo(backupcontext);
>   Assert(backup_state == NULL);
>   Assert(tablespace_map == NULL);
>   backup_state = (BackupState *) palloc0(sizeof(BackupState));
>   tablespace_map = makeStringInfo();
>   MemoryContextSwitchTo(oldcontext);

We can use MemoryContextAllocZero() for this purpose, but of couse not
mandatory.


+* across. We keep the memory allocated in this memory context less,
+* because any error before reaching pg_backup_stop() can leak the 
memory
+* until pg_backup_start() is called again. While this is not smart, it
+* helps to keep things simple.

I think the "less" is somewhat obscure.  I feel we should be more
explicitly. And we don't need to put emphasis on "leak".  I recklessly
propose this as the draft.

"The context is intended to be used by this function to store only
session-lifetime values.  It is, if left alone, reset at the next call
to blow away orphan memory blocks from the previous failed call.
While this is not smart, it helps to keep things simple."

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Avoid memory leaks during base backups

2022-10-21 Thread Michael Paquier
On Fri, Oct 21, 2022 at 11:34:27AM +0530, Bharath Rupireddy wrote:
> On Fri, Oct 21, 2022 at 12:21 AM Robert Haas  wrote:
>> On Thu, Oct 20, 2022 at 1:35 PM Bharath Rupireddy 
>>  wrote:
>>> I think elsewhere in the code we reset dangling pointers either ways -
>>> before or after deleting/resetting memory context. But placing them
>>> before would give us extra safety in case memory context
>>> deletion/reset fails. Not sure what's the best way.
>>
>> I think it's OK to assume that deallocating memory will always
>> succeed, so it doesn't matter whether you do it just before or just
>> after that. But it's not OK to assume that *allocating* memory will
>> always succeed.
> 
> Right.

To be exact, it seems to me that tablespace_map and backup_state
should be reset before deleting backupcontext, but the reset of
backupcontext should happen after the fact.

+   backup_state = NULL;
tablespace_map = NULL;
These two in pg_backup_start() don't matter, do they?  They are
reallocated a couple of lines down.

+* across. We keep the memory allocated in this memory context less,
What does "We keep the memory allocated in this memory context less"
mean here? 
--
Michael


signature.asc
Description: PGP signature


Re: Avoid memory leaks during base backups

2022-10-21 Thread Michael Paquier
On Thu, Oct 20, 2022 at 02:51:21PM -0400, Robert Haas wrote:
> On Thu, Oct 20, 2022 at 1:35 PM Bharath Rupireddy
>  wrote:
>> I think elsewhere in the code we reset dangling pointers either ways -
>> before or after deleting/resetting memory context. But placing them
>> before would give us extra safety in case memory context
>> deletion/reset fails. Not sure what's the best way.
> 
> I think it's OK to assume that deallocating memory will always
> succeed, so it doesn't matter whether you do it just before or just
> after that. But it's not OK to assume that *allocating* memory will
> always succeed.

AFAIK, one of the callbacks associated to a memory context could
fail, see comments before MemoryContextCallResetCallbacks() in
MemoryContextDelete().  I agree that it should not matter here, but I
think that it is better to reset the pointers before attempting the
deletion of the memory context in this case.
--
Michael


signature.asc
Description: PGP signature


Re: Avoid memory leaks during base backups

2022-10-21 Thread Bharath Rupireddy
On Thu, Oct 20, 2022 at 11:17 PM Alvaro Herrera  wrote:
>
> On 2022-Oct-20, Bharath Rupireddy wrote:
>
> > I think elsewhere in the code we reset dangling pointers either ways -
> > before or after deleting/resetting memory context. But placing them
> > before would give us extra safety in case memory context
> > deletion/reset fails. Not sure what's the best way. However, I'm
> > nullifying the dangling pointers after deleting/resetting memory
> > context.
>
> I agree that's a good idea, and the patch looks good to me, but I don't
> think asserting that they are null afterwards is useful.

+1. Removed those assertions. Please see the attached v9 patch.

On Fri, Oct 21, 2022 at 12:21 AM Robert Haas  wrote:
>
> On Thu, Oct 20, 2022 at 1:35 PM Bharath Rupireddy
>  wrote:
> > I think elsewhere in the code we reset dangling pointers either ways -
> > before or after deleting/resetting memory context. But placing them
> > before would give us extra safety in case memory context
> > deletion/reset fails. Not sure what's the best way.
>
> I think it's OK to assume that deallocating memory will always
> succeed, so it doesn't matter whether you do it just before or just
> after that. But it's not OK to assume that *allocating* memory will
> always succeed.

Right.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v9-0001-Avoid-memory-leaks-during-backups-using-SQL-calla.patch
Description: Binary data


Re: Avoid memory leaks during base backups

2022-10-20 Thread Robert Haas
On Thu, Oct 20, 2022 at 1:35 PM Bharath Rupireddy
 wrote:
> I think elsewhere in the code we reset dangling pointers either ways -
> before or after deleting/resetting memory context. But placing them
> before would give us extra safety in case memory context
> deletion/reset fails. Not sure what's the best way.

I think it's OK to assume that deallocating memory will always
succeed, so it doesn't matter whether you do it just before or just
after that. But it's not OK to assume that *allocating* memory will
always succeed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Avoid memory leaks during base backups

2022-10-20 Thread Alvaro Herrera
On 2022-Oct-20, Bharath Rupireddy wrote:

> I think elsewhere in the code we reset dangling pointers either ways -
> before or after deleting/resetting memory context. But placing them
> before would give us extra safety in case memory context
> deletion/reset fails. Not sure what's the best way. However, I'm
> nullifying the dangling pointers after deleting/resetting memory
> context.

I agree that's a good idea, and the patch looks good to me, but I don't
think asserting that they are null afterwards is useful.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Avoid memory leaks during base backups

2022-10-20 Thread Bharath Rupireddy
On Thu, Oct 20, 2022 at 9:48 PM Robert Haas  wrote:
>
> On Thu, Oct 20, 2022 at 6:47 AM Bharath Rupireddy
>  wrote:
> > I tried implementing this, please see the attached v7 patch.
>
> I haven't checked this in detail but it looks much more reasonable in
> terms of code footprint. However, we should, I think, set backup_state
> = NULL and tablespace_map = NULL before deleting the memory context.
> As you have it, I believe that if backup_state = (BackupState *)
> palloc0(sizeof(BackupState)) fails -- say due to running out of memory
> -- then those variables could end up pointing to garbage because the
> context had already been reset before initializing them. I don't know
> whether it's possible for that to cause any concrete harm, but nulling
> out the pointers seems like cheap insurance.

I think elsewhere in the code we reset dangling pointers either ways -
before or after deleting/resetting memory context. But placing them
before would give us extra safety in case memory context
deletion/reset fails. Not sure what's the best way. However, I'm
nullifying the dangling pointers after deleting/resetting memory
context.
MemoryContextDelete(Conf->buildCxt);
MemoryContextDelete(PostmasterContext);
MemoryContextDelete(rulescxt);

Please see the attached v8 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v8-0001-Avoid-memory-leaks-during-backups-using-SQL-calla.patch
Description: Binary data


Re: Avoid memory leaks during base backups

2022-10-20 Thread Robert Haas
On Thu, Oct 20, 2022 at 6:47 AM Bharath Rupireddy
 wrote:
> I tried implementing this, please see the attached v7 patch.

I haven't checked this in detail but it looks much more reasonable in
terms of code footprint. However, we should, I think, set backup_state
= NULL and tablespace_map = NULL before deleting the memory context.
As you have it, I believe that if backup_state = (BackupState *)
palloc0(sizeof(BackupState)) fails -- say due to running out of memory
-- then those variables could end up pointing to garbage because the
context had already been reset before initializing them. I don't know
whether it's possible for that to cause any concrete harm, but nulling
out the pointers seems like cheap insurance.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Avoid memory leaks during base backups

2022-10-20 Thread Bharath Rupireddy
On Wed, Oct 19, 2022 at 9:23 PM Bharath Rupireddy
 wrote:
>
> On Wed, Oct 19, 2022 at 8:10 PM Robert Haas  wrote:
>
> > One option is to just have do_pg_start_backup() blow
> > away any old memory context before it allocates any new memory, and
> > forget about releasing anything in PostgresMain(). That means memory
> > could remain allocated after a failure until you next retry the
> > operation, but I don't think that really matters. It's not a lot of
> > memory; we just don't want it to accumulate across many repetitions.
>
> This seems reasonable to me.

I tried implementing this, please see the attached v7 patch.
Currently, memory allocated in the new memory context is around 4KB
[1]. In the extreme and rarest of the rare cases where somebody
executes select pg_backup_start(repeat('foo', 1024)); or a failure
occurs before reaching pg_backup_stop() on all of the sessions
(max_connections) at once, the maximum/peak memory bloat/leak is
around max_connections*4KB, which will still be way less than the
total amount of RAM. Hence, I think this approach seems very
reasonable and non-invasive yet can solve the memory leak problem.
Thoughts?

[1]
(gdb) p *backupcontext
$4 = {type = T_AllocSetContext, isReset = false, allowInCritSection =
false, mem_allocated = 4232,
  methods = 0x55c925b81f90 , parent =
0x55c92766d2a0, firstchild = 0x0, prevchild = 0x0,
  nextchild = 0x55c92773f1f0, name = 0x55c9258be05c "on-line backup
context", ident = 0x0, reset_cbs = 0x0}

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 657933ab5cc6001ad13d56b89fbf220af541f216 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 20 Oct 2022 10:09:59 +
Subject: [PATCH v7] Avoid memory leaks during backups using SQL-callable
 functions

---
 src/backend/access/transam/xlogfuncs.c | 43 --
 1 file changed, 20 insertions(+), 23 deletions(-)

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index a801a94fe8..198a99a1ed 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -45,6 +45,9 @@
 static BackupState *backup_state = NULL;
 static StringInfo tablespace_map = NULL;
 
+/* A long-lived workspace for SQL-callable backup functions. */
+static MemoryContext backupcontext = NULL;
+
 /*
  * pg_backup_start: set up for taking an on-line backup dump
  *
@@ -72,27 +75,22 @@ pg_backup_start(PG_FUNCTION_ARGS)
 
 	/*
 	 * backup_state and tablespace_map need to be long-lived as they are used
-	 * in pg_backup_stop().
+	 * in pg_backup_stop(). Create a special session-level memory context as a
+	 * direct child of TopMemoryContext so that the memory allocated is carried
+	 * across. We keep the memory allocated in this memory context less,
+	 * because any error before reaching pg_backup_stop() can leak the memory
+	 * until pg_backup_start() is called again. While this is not smart, it
+	 * helps to keep things simple.
 	 */
-	oldcontext = MemoryContextSwitchTo(TopMemoryContext);
-
-	/* Allocate backup state or reset it, if it comes from a previous run */
-	if (backup_state == NULL)
-		backup_state = (BackupState *) palloc0(sizeof(BackupState));
+	if (backupcontext == NULL)
+		backupcontext = AllocSetContextCreate(TopMemoryContext,
+			  "on-line backup context",
+			  ALLOCSET_START_SMALL_SIZES);
 	else
-		MemSet(backup_state, 0, sizeof(BackupState));
-
-	/*
-	 * tablespace_map may have been created in a previous backup, so take this
-	 * occasion to clean it.
-	 */
-	if (tablespace_map != NULL)
-	{
-		pfree(tablespace_map->data);
-		pfree(tablespace_map);
-		tablespace_map = NULL;
-	}
+		MemoryContextReset(backupcontext);
 
+	oldcontext = MemoryContextSwitchTo(backupcontext);
+	backup_state = (BackupState *) palloc0(sizeof(BackupState));
 	tablespace_map = makeStringInfo();
 	MemoryContextSwitchTo(oldcontext);
 
@@ -157,13 +155,12 @@ pg_backup_stop(PG_FUNCTION_ARGS)
 	values[2] = CStringGetTextDatum(tablespace_map->data);
 
 	/* Deallocate backup-related variables */
-	pfree(backup_state);
-	backup_state = NULL;
-	pfree(tablespace_map->data);
-	pfree(tablespace_map);
-	tablespace_map = NULL;
 	pfree(backup_label);
 
+	/* Clean up the session-level backup memory context */
+	MemoryContextDelete(backupcontext);
+	backupcontext = NULL;
+
 	/* Returns the record as Datum */
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
 }
-- 
2.34.1



Re: Avoid memory leaks during base backups

2022-10-19 Thread Bharath Rupireddy
On Wed, Oct 19, 2022 at 8:10 PM Robert Haas  wrote:
>
> Well this still touches postgres.c. And I still think it's an awful
> lot of machinery for a pretty trivial problem. As a practical matter,
> nobody is going to open a connection and sit there and try to start a
> backup over and over again on the same connection. And even if someone
> wrote a client that did that -- why? -- they'd have to be awfully
> persistent to leak any amount of memory that would actually matter. So
> it is not insane to just think of ignoring this problem entirely.

I understand that the amount of memory allocated by pg_backup_start()
is small compared to the overall RAM, however, I don't think we can
ignore the problem and let postgres cause memory leaks.

> But if we want to fix it, I think we should do it in some more
> localized way.

Agreed.

> One option is to just have do_pg_start_backup() blow
> away any old memory context before it allocates any new memory, and
> forget about releasing anything in PostgresMain(). That means memory
> could remain allocated after a failure until you next retry the
> operation, but I don't think that really matters. It's not a lot of
> memory; we just don't want it to accumulate across many repetitions.

This seems reasonable to me.

> Another option, perhaps, is to delete some memory context from within
> the TRY/CATCH block if non-NULL, although that wouldn't work nicely if
> it might blow away the data we need to generate the error message.

Right.

> A third option is to do something useful inside WalSndErrorCleanup() or
> WalSndResourceCleanup() as I suggested previously.

These functions will not be called for SQL-callable backup functions
pg_backup_start() and pg_backup_stop(). And the memory leak problem
we're trying to solve is for SQL-callable functions, but not for
basebaskups as they already have a memory context named "Replication
command context" that gets deleted in PostgresMain().

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Avoid memory leaks during base backups

2022-10-19 Thread Robert Haas
On Wed, Oct 19, 2022 at 3:04 AM Bharath Rupireddy
 wrote:
> > Echoing with what I mentioned upthread in [1], I don't quite
> > understand why this patch needs to touch basebackup.c, walsender.c
> > and postgres.c.  In the case of a replication command processed by a
> > WAL sender, memory allocations happen in the memory context created
> > for replication commands, which is itself, as far as I understand, the
> > message memory context when we get a 'Q' message for a simple query.
> > Why do we need more code for a cleanup that should be already
> > happening?  Am I missing something obvious?
>
> My bad, I missed that. You are right. We have "Replication command
> context" as a child of "MessageContext" memory context for base backup
> that gets cleaned upon error in PostgresMain() [1].

Well this still touches postgres.c. And I still think it's an awful
lot of machinery for a pretty trivial problem. As a practical matter,
nobody is going to open a connection and sit there and try to start a
backup over and over again on the same connection. And even if someone
wrote a client that did that -- why? -- they'd have to be awfully
persistent to leak any amount of memory that would actually matter. So
it is not insane to just think of ignoring this problem entirely.

But if we want to fix it, I think we should do it in some more
localized way. One option is to just have do_pg_start_backup() blow
away any old memory context before it allocates any new memory, and
forget about releasing anything in PostgresMain(). That means memory
could remain allocated after a failure until you next retry the
operation, but I don't think that really matters. It's not a lot of
memory; we just don't want it to accumulate across many repetitions.
Another option, perhaps, is to delete some memory context from within
the TRY/CATCH block if non-NULL, although that wouldn't work nicely if
it might blow away the data we need to generate the error message. A
third option is to do something useful inside WalSndErrorCleanup() or
WalSndResourceCleanup() as I suggested previously.

I'm not exactly sure what the right solution is here, but I think you
need to put more thought into how to make the code look simple,
elegant, and non-invasive.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Avoid memory leaks during base backups

2022-10-19 Thread Bharath Rupireddy
On Mon, Oct 17, 2022 at 1:09 PM Michael Paquier  wrote:
>
> On Fri, Oct 14, 2022 at 09:56:31PM +, Cary Huang wrote:
> > I applied your v5 patch on the current master and run valgrind on it
> > while doing a basebackup with simulated error. No memory leak
> > related to backup is observed. Regression is also passing.
>
> Echoing with what I mentioned upthread in [1], I don't quite
> understand why this patch needs to touch basebackup.c, walsender.c
> and postgres.c.  In the case of a replication command processed by a
> WAL sender, memory allocations happen in the memory context created
> for replication commands, which is itself, as far as I understand, the
> message memory context when we get a 'Q' message for a simple query.
> Why do we need more code for a cleanup that should be already
> happening?  Am I missing something obvious?
>
> [1]: https://www.postgresql.org/message-id/YzPKpKEk/jmjh...@paquier.xyz

My bad, I missed that. You are right. We have "Replication command
context" as a child of "MessageContext" memory context for base backup
that gets cleaned upon error in PostgresMain() [1].

> xlogfuncs.c, by storing stuff in the TopMemoryContext of the process
> running the SQL commands pg_backup_start/stop() is different, of
> course.  Perhaps the point of centralizing the base backup context in
> xlogbackup.c makes sense, but my guess that it makes more sense to
> keep that with the SQL functions as these are the only ones in need of
> a cleanup, coming down to the fact that the start and stop functions
> happen in different queries, aka these are not bind to a message
> context.

Yes, they're not related to "MessageContext" memory context.

Please see the attached v6 patch that deals with memory leaks for
backup SQL-callable functions.

[1]
(gdb) bt
#0  MemoryContextDelete (context=0x558b7cd0de50) at mcxt.c:378
#1  0x558b7c655733 in MemoryContextDeleteChildren
(context=0x558b7ccda8c0) at mcxt.c:430
#2  0x558b7c65546d in MemoryContextReset (context=0x558b7ccda8c0)
at mcxt.c:309
#3  0x558b7c43b5cd in PostgresMain (dbname=0x558b7cd11fb8 "",
username=0x558b7ccd6298 "ubuntu")
at postgres.c:4358
#4  0x558b7c364a88 in BackendRun (port=0x558b7cd09620) at postmaster.c:4482
#5  0x558b7c36431b in BackendStartup (port=0x558b7cd09620) at
postmaster.c:4210
#6  0x558b7c3603be in ServerLoop () at postmaster.c:1804
#7  0x558b7c35fb1b in PostmasterMain (argc=3, argv=0x558b7ccd4200)
at postmaster.c:1476
#8  0x558b7c229a0e in main (argc=3, argv=0x558b7ccd4200) at main.c:197

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 8fcf6c5c6bd60de215ad8e066cc9c759aa3a7dd5 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 19 Oct 2022 06:18:12 +
Subject: [PATCH v6] Avoid memory leaks during backups using SQL-callable
 functions

Postgres currently can leak memory if a failure occurs while
taking backup using SQL-callable functions pg_backup_start() and
pg_backup_stop(). The palloc'd memory such as backup_state or
tablespace_map in xlogfuncs.c or any other, is left-out which may
cause memory bloat on the server eventually.

To solve the above problem, we create a separate memory context
and add clean up callback in PostgresMain()'s error handling code
using sigsetjmp(). The new memory context is created as a child of
TopMemoryContext in pg_backup_start() which gets deleted upon any
error or at the end of pg_backup_stop().
---
 src/backend/access/transam/xlogfuncs.c | 101 +++--
 src/backend/tcop/postgres.c|   4 +
 src/include/access/xlogbackup.h|   1 +
 3 files changed, 81 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index a801a94fe8..b7dd5cf8f3 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -45,6 +45,30 @@
 static BackupState *backup_state = NULL;
 static StringInfo tablespace_map = NULL;
 
+/* A workspace for SQL-callable backup functions. */
+static MemoryContext backupcontext = NULL;
+
+/*
+ * A session-level variable to define the scope of SQL-callable backup
+ * functions memory context clean up.
+ */
+static bool allowedToCleanBackupMemCxt = false;
+
+/*
+ * Clean up memory context created for SQL-callable backup functions.
+ */
+void
+BackupMemoryContextCleanup(void)
+{
+	if (backupcontext == NULL || allowedToCleanBackupMemCxt == false)
+		return;		/* Nothing to do. */
+
+	MemoryContextDelete(backupcontext);
+	backupcontext = NULL;
+	backup_state = NULL;
+	tablespace_map = NULL;
+}
+
 /*
  * pg_backup_start: set up for taking an on-line backup dump
  *
@@ -71,34 +95,41 @@ pg_backup_start(PG_FUNCTION_ARGS)
  errmsg("a backup is already in progress in this session")));
 
 	/*
-	 * backup_state and tablespace_map need to be long-lived as they are used
-	 * in pg_backup_stop().
+	 * Let's clean backup memory context while 

Re: Avoid memory leaks during base backups

2022-10-17 Thread Michael Paquier
On Fri, Oct 14, 2022 at 09:56:31PM +, Cary Huang wrote:
> I applied your v5 patch on the current master and run valgrind on it
> while doing a basebackup with simulated error. No memory leak
> related to backup is observed. Regression is also passing.

Echoing with what I mentioned upthread in [1], I don't quite
understand why this patch needs to touch basebackup.c, walsender.c
and postgres.c.  In the case of a replication command processed by a
WAL sender, memory allocations happen in the memory context created
for replication commands, which is itself, as far as I understand, the
message memory context when we get a 'Q' message for a simple query.
Why do we need more code for a cleanup that should be already
happening?  Am I missing something obvious?

xlogfuncs.c, by storing stuff in the TopMemoryContext of the process
running the SQL commands pg_backup_start/stop() is different, of
course.  Perhaps the point of centralizing the base backup context in
xlogbackup.c makes sense, but my guess that it makes more sense to
keep that with the SQL functions as these are the only ones in need of
a cleanup, coming down to the fact that the start and stop functions
happen in different queries, aka these are not bind to a message
context.

[1]: https://www.postgresql.org/message-id/YzPKpKEk/jmjh...@paquier.xyz
--
Michael


signature.asc
Description: PGP signature


Re: Avoid memory leaks during base backups

2022-10-14 Thread Cary Huang
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Hello

I applied your v5 patch on the current master and run valgrind on it while 
doing a basebackup with simulated error. No memory leak related to backup is 
observed. Regression is also passing 

thank you

Cary Huang
HighGo Software Canada

Re: Avoid memory leaks during base backups

2022-10-03 Thread Bharath Rupireddy
On Thu, Sep 29, 2022 at 10:38 PM Bharath Rupireddy
 wrote:
>
> Please review the v4 patch.

I used valgrind for testing. Without patch, there's an obvious memory
leak [1], with patch no memory leak.

I used ALLOCSET_START_SMALL_SIZES instead of ALLOCSET_DEFAULT_SIZES
for backup memory context so that it can start small and grow if
required.

I'm attaching v5 patch, please review it further.

[1]
==00:00:01:36.306 145709== VALGRINDERROR-BEGIN
==00:00:01:36.306 145709== 24 bytes in 1 blocks are still reachable in
loss record 122 of 511
==00:00:01:36.306 145709==at 0x98E501: palloc (mcxt.c:1170)
==00:00:01:36.306 145709==by 0x9C1795: makeStringInfo (stringinfo.c:45)
==00:00:01:36.306 145709==by 0x2DE22A: pg_backup_start (xlogfuncs.c:96)
==00:00:01:36.306 145709==by 0x4D2DB6: ExecMakeTableFunctionResult
(execSRF.c:234)
==00:00:01:36.306 145709==by 0x4F08DA: FunctionNext (nodeFunctionscan.c:95)
==00:00:01:36.306 145709==by 0x4D48EA: ExecScanFetch (execScan.c:133)
==00:00:01:36.306 145709==by 0x4D4963: ExecScan (execScan.c:182)
==00:00:01:36.306 145709==by 0x4F0C84: ExecFunctionScan
(nodeFunctionscan.c:270)
==00:00:01:36.306 145709==by 0x4D0255: ExecProcNodeFirst
(execProcnode.c:464)
==00:00:01:36.306 145709==by 0x4C32D4: ExecProcNode (executor.h:259)
==00:00:01:36.306 145709==by 0x4C619C: ExecutePlan (execMain.c:1636)
==00:00:01:36.306 145709==by 0x4C3A0F: standard_ExecutorRun (execMain.c:363)
==00:00:01:36.306 145709==
==00:00:01:36.306 145709== VALGRINDERROR-END

==00:00:01:36.334 145709== VALGRINDERROR-BEGIN
==00:00:01:36.334 145709== 1,024 bytes in 1 blocks are still reachable
in loss record 426 of 511
==00:00:01:36.334 145709==at 0x98E501: palloc (mcxt.c:1170)
==00:00:01:36.334 145709==by 0x9C17CF: initStringInfo (stringinfo.c:63)
==00:00:01:36.334 145709==by 0x9C17A5: makeStringInfo (stringinfo.c:47)
==00:00:01:36.334 145709==by 0x2DE22A: pg_backup_start (xlogfuncs.c:96)
==00:00:01:36.334 145709==by 0x4D2DB6: ExecMakeTableFunctionResult
(execSRF.c:234)
==00:00:01:36.334 145709==by 0x4F08DA: FunctionNext (nodeFunctionscan.c:95)
==00:00:01:36.334 145709==by 0x4D48EA: ExecScanFetch (execScan.c:133)
==00:00:01:36.334 145709==by 0x4D4963: ExecScan (execScan.c:182)
==00:00:01:36.334 145709==by 0x4F0C84: ExecFunctionScan
(nodeFunctionscan.c:270)
==00:00:01:36.334 145709==by 0x4D0255: ExecProcNodeFirst
(execProcnode.c:464)
==00:00:01:36.334 145709==by 0x4C32D4: ExecProcNode (executor.h:259)
==00:00:01:36.334 145709==by 0x4C619C: ExecutePlan (execMain.c:1636)
==00:00:01:36.334 145709==
==00:00:01:36.334 145709== VALGRINDERROR-END

==00:00:01:36.335 145709== VALGRINDERROR-BEGIN
==00:00:01:36.335 145709== 1,096 bytes in 1 blocks are still reachable
in loss record 431 of 511
==00:00:01:36.335 145709==at 0x98E766: palloc0 (mcxt.c:1201)
==00:00:01:36.335 145709==by 0x2DE152: pg_backup_start (xlogfuncs.c:81)
==00:00:01:36.335 145709==by 0x4D2DB6: ExecMakeTableFunctionResult
(execSRF.c:234)
==00:00:01:36.335 145709==by 0x4F08DA: FunctionNext (nodeFunctionscan.c:95)
==00:00:01:36.335 145709==by 0x4D48EA: ExecScanFetch (execScan.c:133)
==00:00:01:36.335 145709==by 0x4D4963: ExecScan (execScan.c:182)
==00:00:01:36.335 145709==by 0x4F0C84: ExecFunctionScan
(nodeFunctionscan.c:270)
==00:00:01:36.335 145709==by 0x4D0255: ExecProcNodeFirst
(execProcnode.c:464)
==00:00:01:36.335 145709==by 0x4C32D4: ExecProcNode (executor.h:259)
==00:00:01:36.335 145709==by 0x4C619C: ExecutePlan (execMain.c:1636)
==00:00:01:36.335 145709==by 0x4C3A0F: standard_ExecutorRun (execMain.c:363)
==00:00:01:36.335 145709==by 0x4C37FA: ExecutorRun (execMain.c:307)
==00:00:01:36.335 145709==
==00:00:01:36.335 145709== VALGRINDERROR-END

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From b99336d3a5510aa10036844ef8d3a443694d5dec Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 3 Oct 2022 13:06:30 +
Subject: [PATCH v5] Avoid memory leaks during backups

Postgres currently can leak memory if a failure occurs during base
backup in do_pg_backup_start() or do_pg_backup_stop() or
perform_base_backup(). The palloc'd memory such as backup_state or
tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo in
perform_base_backup() or any other, is left-out which may cause
memory bloat on the server eventually. To experience this issue,
run pg_basebackup with --label name longer than 1024 characters and
observe memory with watch command, the memory usage goes up. It looks
like the memory leak issue has been there for quite some time.

To solve the above problem, we create separate memory context for
backup and add clean up callback in PostgresMain()'s error handling
code using sigsetjmp().

The design of the patch is as follows:

For backup functions or on-line backups, a new memory context is
created as a child of TopMemoryContext 

Re: Avoid memory leaks during base backups

2022-09-29 Thread Bharath Rupireddy
On Thu, Sep 29, 2022 at 7:05 PM Robert Haas  wrote:
>
> On Thu, Sep 29, 2022 at 4:29 AM Bharath Rupireddy
>  wrote:
> > Please review the attached v3 patch.
>
> template1=# select * from pg_backup_start('sdgkljsdgkjdsg', true);
>  pg_backup_start
> -
>  0/228
> (1 row)
>
> template1=# select 1/0;
> ERROR:  division by zero
> template1=# select * from pg_backup_stop();
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.
> !?>

Thanks! I used a variable to define the scope to clean up the backup
memory context for SQL functions/on-line backup. We don't have this
problem in case of base backup because we don't give control in
between start and stop backup in perform_base_backup().

Please review the v4 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From f22c5e3afcdf2bf0d7bbcab9356f9ad70a7ec201 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 29 Sep 2022 16:31:59 +
Subject: [PATCH v4] Avoid memory leaks during backups

Postgres currently can leak memory if a failure occurs during base
backup in do_pg_backup_start() or do_pg_backup_stop() or
perform_base_backup(). The palloc'd memory such as backup_state or
tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo in
perform_base_backup() or any other, is left-out which may cause
memory bloat on the server eventually. To experience this issue,
run pg_basebackup with --label name longer than 1024 characters and
observe memory with watch command, the memory usage goes up. It looks
like the memory leak issue has been there for quite some time.

To solve the above problem, we create separate memory context for
backup and add clean up callback in PostgresMain()'s error handling
code using sigsetjmp().

The design of the patch is as follows:

For backup functions or on-line backups, a new memory context is
created as a child of TopMemoryContext in pg_backup_start() which
gets deleted upon any error or at the end of pg_backup_stop().

For perform_base_backup() or base backups, a new memory context is
created as a child of CurrentMemoryContext which gets deleted upon
any error or at the end of perform_base_backup().
---
 src/backend/access/transam/xlogfuncs.c | 103 +++--
 src/backend/backup/basebackup.c|  40 ++
 src/backend/replication/walsender.c|   2 +
 src/backend/tcop/postgres.c|   4 +
 src/include/access/xlogbackup.h|   1 +
 src/include/backup/basebackup.h|   1 +
 6 files changed, 127 insertions(+), 24 deletions(-)

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index a801a94fe8..985be59ca8 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -45,6 +45,30 @@
 static BackupState *backup_state = NULL;
 static StringInfo tablespace_map = NULL;
 
+/* A workspace for on-line backup. */
+static MemoryContext backupcontext = NULL;
+
+/*
+ * A session-level variable to define the scope of backup memory context clean
+ * up.
+ */
+static bool allowedToCleanBackupMemCxt = false;
+
+/*
+ * Clean up backup memory context we created.
+ */
+void
+BackupMemoryContextCleanup(void)
+{
+	if (backupcontext == NULL || allowedToCleanBackupMemCxt == false)
+		return;		/* Nothing to do. */
+
+	MemoryContextDelete(backupcontext);
+	backupcontext = NULL;
+	backup_state = NULL;
+	tablespace_map = NULL;
+}
+
 /*
  * pg_backup_start: set up for taking an on-line backup dump
  *
@@ -71,34 +95,45 @@ pg_backup_start(PG_FUNCTION_ARGS)
  errmsg("a backup is already in progress in this session")));
 
 	/*
-	 * backup_state and tablespace_map need to be long-lived as they are used
-	 * in pg_backup_stop().
+	 * We're okay to clean backup memory context while an error occurs within
+	 * this function.
 	 */
-	oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+	allowedToCleanBackupMemCxt = true;
 
-	/* Allocate backup state or reset it, if it comes from a previous run */
-	if (backup_state == NULL)
-		backup_state = (BackupState *) palloc0(sizeof(BackupState));
-	else
-		MemSet(backup_state, 0, sizeof(BackupState));
+	/*
+	 * backup_state and tablespace_map need to be long-lived as they are used
+	 * in pg_backup_stop(). Create a special memory context as a direct child
+	 * of TopMemoryContext so that the memory allocated is carried till the
+	 * pg_backup_stop() and cleaned in case of errors (see error handling code
+	 * using sigsetjmp() in PostgresMain()).
+	 */
+	if (backupcontext == NULL)
+		backupcontext = AllocSetContextCreate(TopMemoryContext,
+			  "on-line backup context",
+			  ALLOCSET_DEFAULT_SIZES);
 
 	/*
-	 * tablespace_map may have been created in a previous 

Re: Avoid memory leaks during base backups

2022-09-29 Thread Robert Haas
On Thu, Sep 29, 2022 at 4:29 AM Bharath Rupireddy
 wrote:
> Please review the attached v3 patch.

template1=# select * from pg_backup_start('sdgkljsdgkjdsg', true);
 pg_backup_start
-
 0/228
(1 row)

template1=# select 1/0;
ERROR:  division by zero
template1=# select * from pg_backup_stop();
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
!?>

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Avoid memory leaks during base backups

2022-09-29 Thread Bharath Rupireddy
On Wed, Sep 28, 2022 at 8:46 PM Robert Haas  wrote:
>
> I feel like we ought to be trying to tie
> the cleanup into WalSndErrorCleanup() or WalSndResourceCleanup() based
> on having the memory context that we ought to be blowing away stored
> in a global variable, rather than using a try/catch block.

Okay, I got rid of the try-catch block. I added two clean up callbacks
(one for SQL backup functions or on-line backup, another for base
backup) that basically delete the respective memory contexts and reset
the file-level variables, they get called from PostgresMain()'s error
handling code.

> Like, maybe there's a function EstablishWalSenderMemoryContext() that
> commands can call before allocating memory that shouldn't survive an
> error. And it's deleted after each command if it exists, or if an
> error occurs then WalSndErrorCleanup() deletes it.

I don't think we need any of the above. I've used file-level variables
to hold memory contexts, allocating them whenever needed and cleaning
them up either at the end of backup operation or upon error.

Please review the attached v3 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v3-0001-Avoid-memory-leaks-during-backups.patch
Description: Binary data


Re: Avoid memory leaks during base backups

2022-09-28 Thread Robert Haas
On Wed, Sep 28, 2022 at 5:30 AM Bharath Rupireddy
 wrote:
> I'm attaching the v2 patch designed as described above. Please review it.
>
> I've added an entry in CF - https://commitfest.postgresql.org/40/3915/

This looks odd to me. In the case of a regular backend, the
sigsetjmp() handler in src/backend/tcop/postgres.c is responsible for
cleaning up memory. It calls AbortCurrentTransaction() which will call
CleanupTransaction() which will call AtCleanup_Memory() which will
block away TopTransactionContext. I think we ought to do something
analogous here, and we almost do already. Some walsender commands are
going to be SQL commands and some aren't. For those that aren't, the
same block calls WalSndErrorCleanup() which does similar kinds of
cleanup, including in some situations calling WalSndResourceCleanup()
which cleans up the resource owner in cases where we have a resource
owner without a transaction. I feel like we ought to be trying to tie
the cleanup into WalSndErrorCleanup() or WalSndResourceCleanup() based
on having the memory context that we ought to be blowing away stored
in a global variable, rather than using a try/catch block.

Like, maybe there's a function EstablishWalSenderMemoryContext() that
commands can call before allocating memory that shouldn't survive an
error. And it's deleted after each command if it exists, or if an
error occurs then WalSndErrorCleanup() deletes it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Avoid memory leaks during base backups

2022-09-28 Thread Bharath Rupireddy
On Wed, Sep 28, 2022 at 10:09 AM Bharath Rupireddy
 wrote:
>
> Here's what I think - for backup functions, we
> can have the new memory context child of TopMemoryContext and for
> perform_base_backup(), we can have the memory context child of
> CurrentMemoryContext. With PG_TRY()-PG_FINALLY()-PG_END_TRY(), we can
> delete those memory contexts upon ERRORs. This approach works for us
> since backup-related code doesn't have any FATALs.
>
> Thoughts?

I'm attaching the v2 patch designed as described above. Please review it.

I've added an entry in CF - https://commitfest.postgresql.org/40/3915/

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 36607bb46a1328a8b4d63169eb1739cf3e6769fd Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 28 Sep 2022 09:04:00 +
Subject: [PATCH v2] Avoid memory leaks during backups

Postgres currently can leak memory if a failure occurs during base
backup in do_pg_backup_start() or do_pg_backup_stop() or
perform_base_backup(). The palloc'd memory such as backup_state or
tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo in
perform_base_backup() or any other, is left-out which may cause
memory bloat on the server eventually. To experience this issue,
run pg_basebackup with --label name longer than 1024 characters and
observe memory with watch command, the memory usage goes up.

It looks like the memory leak issue has been there for quite some
time.

To solve the above problem, leverage PG_TRY()-PG_FINALLY()-PG_END_TRY()
mechanism and memory context. The design of the patch is as follows:

For backup functions or on-line backups, a new memory context is
created as a child of TopMemoryContext in pg_backup_start() which
gets deleted upon ERROR or at the end of pg_backup_stop().

For perform_base_backup() or base backups, a new memory context is
created as a child of CurrentMemoryContext which gets deleted upon
ERROR or at the end of perform_base_backup().
---
 src/backend/access/transam/xlogfuncs.c | 114 ++---
 src/backend/backup/basebackup.c|  41 -
 src/backend/utils/error/elog.c |  17 
 src/include/utils/elog.h   |   1 +
 4 files changed, 141 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index a801a94fe8..6c13da91bd 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -45,6 +45,9 @@
 static BackupState *backup_state = NULL;
 static StringInfo tablespace_map = NULL;
 
+/* A workspace for on-line backup. */
+static MemoryContext backupcontext = NULL;
+
 /*
  * pg_backup_start: set up for taking an on-line backup dump
  *
@@ -71,33 +74,52 @@ pg_backup_start(PG_FUNCTION_ARGS)
  errmsg("a backup is already in progress in this session")));
 
 	/*
-	 * backup_state and tablespace_map need to be long-lived as they are used
-	 * in pg_backup_stop().
+	 * Start the on-line backup, but make sure we clean up the allocated memory
+	 * even if an error occurs.
 	 */
-	oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+	PG_TRY();
+	{
+		/*
+		 * backup_state and tablespace_map need to be long-lived as they are
+		 * used in pg_backup_stop(). Create a special memory context as a
+		 * direct child of TopMemoryContext so that the memory allocated is
+		 * cleaned in case of errors.
+		 */
+		if (backupcontext == NULL)
+		{
+			backupcontext = AllocSetContextCreate(TopMemoryContext,
+  "on-line backup context",
+  ALLOCSET_DEFAULT_SIZES);
+		}
 
-	/* Allocate backup state or reset it, if it comes from a previous run */
-	if (backup_state == NULL)
+		oldcontext = MemoryContextSwitchTo(backupcontext);
+
+		Assert(backup_state == NULL);
+		Assert(tablespace_map == NULL);
 		backup_state = (BackupState *) palloc0(sizeof(BackupState));
-	else
-		MemSet(backup_state, 0, sizeof(BackupState));
+		tablespace_map = makeStringInfo();
 
-	/*
-	 * tablespace_map may have been created in a previous backup, so take this
-	 * occasion to clean it.
-	 */
-	if (tablespace_map != NULL)
+		register_persistent_abort_backup_handler();
+		do_pg_backup_start(backupidstr, fast, NULL, backup_state, tablespace_map);
+
+		MemoryContextSwitchTo(oldcontext);
+	}
+	PG_FINALLY();
 	{
-		pfree(tablespace_map->data);
-		pfree(tablespace_map);
-		tablespace_map = NULL;
+		/*
+		 * Delete on-line backup memory context only when an ERROR occurred,
+		 * because the memory allocated in backupcontext is used in
+		 * pg_backup_stop() for non-ERROR cases.
+		 */
+		if (backupcontext != NULL && geterrlevel() == ERROR)
+		{
+			MemoryContextDelete(backupcontext);
+			backupcontext = NULL;
+			backup_state = NULL;
+			tablespace_map = NULL;
+		}
 	}
-
-	tablespace_map = makeStringInfo();
-	MemoryContextSwitchTo(oldcontext);
-
-	register_persistent_abort_backup_handler();
-	do_pg_backup_start(backupidstr, fast, NULL, 

Re: Avoid memory leaks during base backups

2022-09-28 Thread Kyotaro Horiguchi
At Wed, 28 Sep 2022 13:16:36 +0900, Michael Paquier  wrote 
in 
> On Tue, Sep 27, 2022 at 05:32:26PM +0900, Kyotaro Horiguchi wrote:
> > At Tue, 27 Sep 2022 11:33:56 +0530, Bharath Rupireddy 
> >  wrote in 
> > > On Mon, Sep 26, 2022 at 7:34 PM Tom Lane  wrote:
> > > > This ... seems like inventing your own shape of wheel.  The
> > > > normal mechanism for preventing this type of leak is to put the
> > > > allocations in a memory context that can be reset or deallocated
> > > > in mainline code at the end of the operation.
> > > 
> > > Yes, that's the typical way and the patch attached does it for
> > > perform_base_backup(). What happens if we allocate some memory in the
> > > new memory context and error-out before reaching the end of operation?
> > > How do we deallocate such memory?
> > 
> > Whoever directly or indirectly catches the exception can do that.  For
> > example, SendBaseBackup() seems to catch execptions from
> > perform_base_backup(). bbsinc_cleanup() is already resides there.
> 
> Even with that, what's the benefit in using an extra memory context in
> basebackup.c?  backup_label and tablespace_map are mentioned upthread,
> but we have a tight control of these, and they should be allocated in
> the memory context created for replication commands (grep for
> "Replication command context") anyway.  Using a dedicated memory
> context for the SQL backup functions under TopMemoryContext could be
> interesting, on the other hand..

If I understand you correctly, my point was the usage of error
callbacks.  I meant that we can release that tangling memory blocks in
SendBaseBackup() even by directly pfree()ing then NULLing the pointer,
if the pointer were file-scoped static.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Avoid memory leaks during base backups

2022-09-27 Thread Bharath Rupireddy
On Wed, Sep 28, 2022 at 10:19 AM Tom Lane  wrote:
>
> Bharath Rupireddy  writes:
> > I had the same opinion. Here's what I think - for backup functions, we
> > can have the new memory context child of TopMemoryContext and for
> > perform_base_backup(), we can have the memory context child of
> > CurrentMemoryContext. With PG_TRY()-PG_FINALLY()-PG_END_TRY(), we can
> > delete those memory contexts upon ERRORs. This approach works for us
> > since backup-related code doesn't have any FATALs.
>
> Not following your last point here?  A process exiting on FATAL
> does not especially need to clean up its memory allocations first.
> Which is good, because "backup-related code doesn't have any FATALs"
> seems like an assertion with a very short half-life.

You're right. My bad. For FATALs, we don't need to clean the memory as
the process itself exits.

 * Note: an ereport(FATAL) will not be caught by this construct; control will
 * exit straight through proc_exit().

/*
 * Perform error recovery action as specified by elevel.
 */
if (elevel == FATAL)
{

/*
 * Do normal process-exit cleanup, then return exit code 1 to indicate
 * FATAL termination.  The postmaster may or may not consider this
 * worthy of panic, depending on which subprocess returns it.
 */
proc_exit(1);
}

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Avoid memory leaks during base backups

2022-09-27 Thread Tom Lane
Bharath Rupireddy  writes:
> I had the same opinion. Here's what I think - for backup functions, we
> can have the new memory context child of TopMemoryContext and for
> perform_base_backup(), we can have the memory context child of
> CurrentMemoryContext. With PG_TRY()-PG_FINALLY()-PG_END_TRY(), we can
> delete those memory contexts upon ERRORs. This approach works for us
> since backup-related code doesn't have any FATALs.

Not following your last point here?  A process exiting on FATAL
does not especially need to clean up its memory allocations first.
Which is good, because "backup-related code doesn't have any FATALs"
seems like an assertion with a very short half-life.

regards, tom lane




Re: Avoid memory leaks during base backups

2022-09-27 Thread Bharath Rupireddy
On Wed, Sep 28, 2022 at 9:46 AM Michael Paquier  wrote:
>
> On Tue, Sep 27, 2022 at 05:32:26PM +0900, Kyotaro Horiguchi wrote:
> > At Tue, 27 Sep 2022 11:33:56 +0530, Bharath Rupireddy 
> >  wrote in
> > > On Mon, Sep 26, 2022 at 7:34 PM Tom Lane  wrote:
> > > > This ... seems like inventing your own shape of wheel.  The
> > > > normal mechanism for preventing this type of leak is to put the
> > > > allocations in a memory context that can be reset or deallocated
> > > > in mainline code at the end of the operation.
> > >
> > > Yes, that's the typical way and the patch attached does it for
> > > perform_base_backup(). What happens if we allocate some memory in the
> > > new memory context and error-out before reaching the end of operation?
> > > How do we deallocate such memory?
> >
> > Whoever directly or indirectly catches the exception can do that.  For
> > example, SendBaseBackup() seems to catch execptions from
> > perform_base_backup(). bbsinc_cleanup() is already resides there.
>
> Even with that, what's the benefit in using an extra memory context in
> basebackup.c?  backup_label and tablespace_map are mentioned upthread,
> but we have a tight control of these, and they should be allocated in
> the memory context created for replication commands (grep for
> "Replication command context") anyway.  Using a dedicated memory
> context for the SQL backup functions under TopMemoryContext could be
> interesting, on the other hand..

I had the same opinion. Here's what I think - for backup functions, we
can have the new memory context child of TopMemoryContext and for
perform_base_backup(), we can have the memory context child of
CurrentMemoryContext. With PG_TRY()-PG_FINALLY()-PG_END_TRY(), we can
delete those memory contexts upon ERRORs. This approach works for us
since backup-related code doesn't have any FATALs.

Thoughts?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Avoid memory leaks during base backups

2022-09-27 Thread Michael Paquier
On Tue, Sep 27, 2022 at 05:32:26PM +0900, Kyotaro Horiguchi wrote:
> At Tue, 27 Sep 2022 11:33:56 +0530, Bharath Rupireddy 
>  wrote in 
> > On Mon, Sep 26, 2022 at 7:34 PM Tom Lane  wrote:
> > > This ... seems like inventing your own shape of wheel.  The
> > > normal mechanism for preventing this type of leak is to put the
> > > allocations in a memory context that can be reset or deallocated
> > > in mainline code at the end of the operation.
> > 
> > Yes, that's the typical way and the patch attached does it for
> > perform_base_backup(). What happens if we allocate some memory in the
> > new memory context and error-out before reaching the end of operation?
> > How do we deallocate such memory?
> 
> Whoever directly or indirectly catches the exception can do that.  For
> example, SendBaseBackup() seems to catch execptions from
> perform_base_backup(). bbsinc_cleanup() is already resides there.

Even with that, what's the benefit in using an extra memory context in
basebackup.c?  backup_label and tablespace_map are mentioned upthread,
but we have a tight control of these, and they should be allocated in
the memory context created for replication commands (grep for
"Replication command context") anyway.  Using a dedicated memory
context for the SQL backup functions under TopMemoryContext could be
interesting, on the other hand..
--
Michael


signature.asc
Description: PGP signature


Re: Avoid memory leaks during base backups

2022-09-27 Thread Kyotaro Horiguchi
At Tue, 27 Sep 2022 11:33:56 +0530, Bharath Rupireddy 
 wrote in 
> On Mon, Sep 26, 2022 at 7:34 PM Tom Lane  wrote:
> > This ... seems like inventing your own shape of wheel.  The
> > normal mechanism for preventing this type of leak is to put the
> > allocations in a memory context that can be reset or deallocated
> > in mainline code at the end of the operation.
> 
> Yes, that's the typical way and the patch attached does it for
> perform_base_backup(). What happens if we allocate some memory in the
> new memory context and error-out before reaching the end of operation?
> How do we deallocate such memory?

Whoever directly or indirectly catches the exception can do that.  For
example, SendBaseBackup() seems to catch execptions from
perform_base_backup(). bbsinc_cleanup() is already resides there.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Avoid memory leaks during base backups

2022-09-27 Thread Bharath Rupireddy
On Mon, Sep 26, 2022 at 7:34 PM Tom Lane  wrote:
>
> > I'm proposing a patch that leverages the error callback mechanism and
> > memory context.
>
> This ... seems like inventing your own shape of wheel.  The
> normal mechanism for preventing this type of leak is to put the
> allocations in a memory context that can be reset or deallocated
> in mainline code at the end of the operation.

Yes, that's the typical way and the patch attached does it for
perform_base_backup(). What happens if we allocate some memory in the
new memory context and error-out before reaching the end of operation?
How do we deallocate such memory?
Backup related code has simple-to-generate-error paths in between and
memory can easily be leaked.

Are you suggesting to use sigsetjmp or some other way to prevent memory leaks?

> I do not think that
> having an errcontext callback with side-effects like deallocating
> memory is even remotely safe, and it's certainly a first-order
> abuse of that mechanism.

Are you saying that the error callback might deallocate the memory
that may be needed later in the error processing?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Avoid memory leaks during base backups

2022-09-26 Thread Tom Lane
Bharath Rupireddy  writes:
> Postgres currently can leak memory if a failure occurs during base
> backup in do_pg_backup_start() or do_pg_backup_stop() or
> perform_base_backup(). The palloc'd memory such as backup_state or
> tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo or the
> memory that gets allocated by bbsink_begin_backup() in
> perform_base_backup() or any other, is left-out which may cause memory
> bloat on the server eventually. To experience this issue, run
> pg_basebackup with --label name longer than 1024 characters and
> observe memory with watch command, the memory usage goes up.

> It looks like the memory leak issue has been there for quite some
> time, discussed in [1].

> I'm proposing a patch that leverages the error callback mechanism and
> memory context.

This ... seems like inventing your own shape of wheel.  The
normal mechanism for preventing this type of leak is to put the
allocations in a memory context that can be reset or deallocated
in mainline code at the end of the operation.  I do not think that
having an errcontext callback with side-effects like deallocating
memory is even remotely safe, and it's certainly a first-order
abuse of that mechanism.

regards, tom lane




Avoid memory leaks during base backups

2022-09-26 Thread Bharath Rupireddy
Hi,

Postgres currently can leak memory if a failure occurs during base
backup in do_pg_backup_start() or do_pg_backup_stop() or
perform_base_backup(). The palloc'd memory such as backup_state or
tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo or the
memory that gets allocated by bbsink_begin_backup() in
perform_base_backup() or any other, is left-out which may cause memory
bloat on the server eventually. To experience this issue, run
pg_basebackup with --label name longer than 1024 characters and
observe memory with watch command, the memory usage goes up.

It looks like the memory leak issue has been there for quite some
time, discussed in [1].

I'm proposing a patch that leverages the error callback mechanism and
memory context. The design of the patch is as follows:
1) pg_backup_start() and pg_backup_stop() - the error callback frees
up the backup_state and tablespace_map variables allocated in
TopMemoryContext. We don't need a separate memory context here because
do_pg_backup_start() and do_pg_backup_stop() don't return any
dynamically created memory for now. We can choose to create a separate
memory context for the future changes that may come, but now it is not
required.
2) perform_base_backup() - a new memory context has been created that
gets deleted by the callback upon error.

The error callbacks are typically called for all the elevels, but we
need to free up the memory only when elevel is >= ERROR or ==
COMMERROR. The COMMERROR is a common scenario because the server can
close the connection to the client or vice versa in which case the
base backup fails. For all other elevels like WARNING, NOTICE, DEBUGX,
INFO etc. we don't free up the memory.

I'm attaching v1 patch herewith.

Thoughts?

[1] https://www.postgresql.org/message-id/Yyq15ekNzjZecwMW%40paquier.xyz

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-Avoid-memory-leaks-during-base-backups.patch
Description: Binary data