Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-11-29 Thread Andres Freund
Hi,

Thank for all the work on this topic - I'd somehow accidentally marked this
thread as read when coming back from vacation...

Greetings,

Andres Freund




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-09-01 Thread Amit Kapila
On Wed, Sep 1, 2021 at 5:33 PM Dilip Kumar  wrote:
>
> On Wed, Sep 1, 2021 at 5:23 PM Amit Kapila  wrote:
>>
>> On Wed, Sep 1, 2021 at 1:53 PM Dilip Kumar  wrote:
>> >
>>
>> The latest patch looks good to me. I have made some changes in the
>> comments, see attached. I am planning to push this tomorrow unless you
>> or others have any comments on it.
>
>
> These comments changes look good to me.
>

Pushed.

-- 
With Regards,
Amit Kapila.




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-09-01 Thread Dilip Kumar
On Wed, Sep 1, 2021 at 5:23 PM Amit Kapila  wrote:

> On Wed, Sep 1, 2021 at 1:53 PM Dilip Kumar  wrote:
> >
>
> The latest patch looks good to me. I have made some changes in the
> comments, see attached. I am planning to push this tomorrow unless you
> or others have any comments on it.
>

These comments changes look good to me.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-09-01 Thread Amit Kapila
On Wed, Sep 1, 2021 at 1:53 PM Dilip Kumar  wrote:
>

The latest patch looks good to me. I have made some changes in the
comments, see attached. I am planning to push this tomorrow unless you
or others have any comments on it.

-- 
With Regards,
Amit Kapila.


v8-0001-Optimize-fileset-usage-in-apply-worker.patch
Description: Binary data


Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-09-01 Thread Dilip Kumar
On Tue, Aug 31, 2021 at 3:20 PM Amit Kapila  wrote:
>
> On Fri, Aug 27, 2021 at 12:04 PM Dilip Kumar  wrote:
> >
>
> Few comments on v6-0002*
> =
> 1.
> -BufFileDeleteFileSet(FileSet *fileset, const char *name)
> +BufFileDeleteFileSet(FileSet *fileset, const char *name, bool missing_ok)
>  {
>   char segment_name[MAXPGPATH];
>   int segment = 0;
> @@ -358,7 +369,7 @@ BufFileDeleteFileSet(FileSet *fileset, const char *name)
>   for (;;)
>   {
>   FileSetSegmentName(segment_name, name, segment);
> - if (!FileSetDelete(fileset, segment_name, true))
> + if (!FileSetDelete(fileset, segment_name, !missing_ok))
>
> I don't think the usage of missing_ok is correct here. If you see
> FileSetDelete->PathNameDeleteTemporaryFile, it already tolerates that
> the file doesn't exist but gives an error only when it is unable to
> link. So, with this missing_ok users (say like worker.c) won't even
> get errors when they are not able to remove files whereas I think the
> need for the patch is to not get an error when the file doesn't exist.
> I think you don't need to change anything in the way we invoke
> FileSetDelete.

Right, fixed.

> 2.
> -static HTAB *xidhash = NULL;
> +static FileSet *stream_fileset = NULL;
>
> Can we keep this in LogicalRepWorker and initialize it accordingly?

Done

> 3.
> + /* Open the subxact file, if it does not exist, create it. */
> + fd = BufFileOpenFileSet(stream_fileset, path, O_RDWR, true);
> + if (fd == NULL)
> + fd = BufFileCreateFileSet(stream_fileset, path);
>
> I think retaining the existing comment: "Create the subxact file if it
> not already created, otherwise open the existing file." seems better
> here.

Done

> 4.
>   /*
> - * If there is no subtransaction then nothing to do, but if already have
> - * subxact file then delete that.
> + * If there are no subtransactions, there is nothing to be done, but if
> + * subxacts already exist, delete it.
>   */
>
> How about changing the above comment to something like: "Delete the
> subxacts file, if exists"?

Done

> 5. Can we slightly change the commit message as:
> Optimize fileset usage in apply worker.
>
> Use one fileset for the entire worker lifetime instead of using
> separate filesets for each streaming transaction. Now, the
> changes/subxacts files for every streaming transaction will be created
> under the same fileset and the files will be deleted after the
> transaction is completed.
>
> This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet
> APIs to allow users to specify whether to give an error on missing
> files.

Done


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From fade8ae2779e8aeda0d1fee3902a93eaedb0187f Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Fri, 27 Aug 2021 11:49:12 +0530
Subject: [PATCH v7] Optimize fileset usage in apply worker

Use one fileset for the entire worker lifetime instead of using
separate filesets for each streaming transaction. Now, the
changes/subxacts files for every streaming transaction will be
created under the same fileset and the files will be deleted
after the transaction is completed.

This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet
APIs to allow users to specify whether to give an error on missing
files.
---
 src/backend/replication/logical/launcher.c |   6 +-
 src/backend/replication/logical/worker.c   | 249 +
 src/backend/storage/file/buffile.c |  21 ++-
 src/backend/utils/sort/logtape.c   |   2 +-
 src/backend/utils/sort/sharedtuplestore.c  |   3 +-
 src/include/replication/worker_internal.h  |  11 +-
 src/include/storage/buffile.h  |   5 +-
 7 files changed, 80 insertions(+), 217 deletions(-)

diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 8b1772d..3fb4caa 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -379,6 +379,7 @@ retry:
 	worker->relid = relid;
 	worker->relstate = SUBREL_STATE_UNKNOWN;
 	worker->relstate_lsn = InvalidXLogRecPtr;
+	worker->stream_fileset = NULL;
 	worker->last_lsn = InvalidXLogRecPtr;
 	TIMESTAMP_NOBEGIN(worker->last_send_time);
 	TIMESTAMP_NOBEGIN(worker->last_recv_time);
@@ -648,8 +649,9 @@ logicalrep_worker_onexit(int code, Datum arg)
 
 	logicalrep_worker_detach();
 
-	/* Cleanup filesets used for streaming transactions. */
-	logicalrep_worker_cleanupfileset();
+	/* Cleanup fileset used for streaming transactions. */
+	if (MyLogicalRepWorker->stream_fileset != NULL)
+		FileSetDeleteAll(MyLogicalRepWorker->stream_fileset);
 
 	ApplyLauncherWakeup();
 }
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index bfb7d1a..a222cb3 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -236,20 +236,6 @@ static ApplyErrorCallbackArg apply_error_callback_arg =
 	.ts = 0,
 };
 
-/*
- * Stream xid hash entry. 

Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-31 Thread Amit Kapila
On Fri, Aug 27, 2021 at 12:04 PM Dilip Kumar  wrote:
>

Few comments on v6-0002*
=
1.
-BufFileDeleteFileSet(FileSet *fileset, const char *name)
+BufFileDeleteFileSet(FileSet *fileset, const char *name, bool missing_ok)
 {
  char segment_name[MAXPGPATH];
  int segment = 0;
@@ -358,7 +369,7 @@ BufFileDeleteFileSet(FileSet *fileset, const char *name)
  for (;;)
  {
  FileSetSegmentName(segment_name, name, segment);
- if (!FileSetDelete(fileset, segment_name, true))
+ if (!FileSetDelete(fileset, segment_name, !missing_ok))

I don't think the usage of missing_ok is correct here. If you see
FileSetDelete->PathNameDeleteTemporaryFile, it already tolerates that
the file doesn't exist but gives an error only when it is unable to
link. So, with this missing_ok users (say like worker.c) won't even
get errors when they are not able to remove files whereas I think the
need for the patch is to not get an error when the file doesn't exist.
I think you don't need to change anything in the way we invoke
FileSetDelete.

2.
-static HTAB *xidhash = NULL;
+static FileSet *stream_fileset = NULL;

Can we keep this in LogicalRepWorker and initialize it accordingly?

3.
+ /* Open the subxact file, if it does not exist, create it. */
+ fd = BufFileOpenFileSet(stream_fileset, path, O_RDWR, true);
+ if (fd == NULL)
+ fd = BufFileCreateFileSet(stream_fileset, path);

I think retaining the existing comment: "Create the subxact file if it
not already created, otherwise open the existing file." seems better
here.

4.
  /*
- * If there is no subtransaction then nothing to do, but if already have
- * subxact file then delete that.
+ * If there are no subtransactions, there is nothing to be done, but if
+ * subxacts already exist, delete it.
  */

How about changing the above comment to something like: "Delete the
subxacts file, if exists"?

5. Can we slightly change the commit message as:
Optimize fileset usage in apply worker.

Use one fileset for the entire worker lifetime instead of using
separate filesets for each streaming transaction. Now, the
changes/subxacts files for every streaming transaction will be created
under the same fileset and the files will be deleted after the
transaction is completed.

This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet
APIs to allow users to specify whether to give an error on missing
files.

-- 
With Regards,
Amit Kapila.




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-30 Thread Dilip Kumar
On Tue, Aug 31, 2021 at 7:39 AM houzj.f...@fujitsu.com <
houzj.f...@fujitsu.com> wrote:

> From Mon, Aug 30, 2021 2:15 PM Masahiko Sawada 
> wrote:
> > On Fri, Aug 27, 2021 at 3:34 PM Dilip Kumar 
> wrote:
> > >
> > > On Fri, Aug 27, 2021 at 10:56 AM houzj.f...@fujitsu.com <
> houzj.f...@fujitsu.com> wrote:
> > > >
> > > > On Thu, Aug 26, 2021 2:18 PM Dilip Kumar 
> wrote:
> > > >
> > > > > The patch looks good to me, I have rebased 0002 atop this patch
> > > > > and also done some cosmetic fixes in 0002.
> >
> > Thank you for updating the patch!
> >
> > The patch looks good to me except for the below comment:
> >
> > +/* Delete the subxact file, if it exist. */
> > +subxact_filename(path, subid, xid);
> > +BufFileDeleteFileSet(stream_fileset, path, true);
> >
> > s/it exist/it exists/
>
> Except for Sawada-san's comment, the v6-0002 patch looks good to me.
>

Thanks, I will wait for a day to see if there are any other comments on
this, after that I will fix this one issue and post the updated patch.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


RE: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-30 Thread houzj.f...@fujitsu.com
From Mon, Aug 30, 2021 2:15 PM Masahiko Sawada  wrote:
> On Fri, Aug 27, 2021 at 3:34 PM Dilip Kumar  wrote:
> >
> > On Fri, Aug 27, 2021 at 10:56 AM houzj.f...@fujitsu.com 
> >  wrote:
> > >
> > > On Thu, Aug 26, 2021 2:18 PM Dilip Kumar  wrote:
> > >
> > > > The patch looks good to me, I have rebased 0002 atop this patch
> > > > and also done some cosmetic fixes in 0002.
> 
> Thank you for updating the patch!
> 
> The patch looks good to me except for the below comment:
> 
> +/* Delete the subxact file, if it exist. */
> +subxact_filename(path, subid, xid);
> +BufFileDeleteFileSet(stream_fileset, path, true);
> 
> s/it exist/it exists/

Except for Sawada-san's comment, the v6-0002 patch looks good to me.

Best regards,
Hou zj


Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-30 Thread Amit Kapila
On Mon, Aug 30, 2021 at 6:42 AM Masahiko Sawada  wrote:
>
> On Fri, Aug 27, 2021 at 2:25 PM Dilip Kumar  wrote:
> >
> > On Thu, Aug 26, 2021 at 2:10 PM Amit Kapila  wrote:
> >>
> >>
> >> Thanks, Sawada-San and Dilip for confirmation. I would like to commit
> >> this and the second patch (the second one still needs some more
> >> testing and review) for PG-15 as there is no bug per-se related to
> >> this work in PG-14 but I see an argument to commit this for PG-14 to
> >> keep the code (APIs) consistent. What do you think? Does anybody else
> >> have any opinion on this?
> >>
> >
> > IMHO, this is a fair amount of refactoring and this is actually an 
> > improvement patch so we should push it to v15.
>
> I think so too.
>

I have pushed the first patch in this series.

-- 
With Regards,
Amit Kapila.




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-30 Thread Masahiko Sawada
On Fri, Aug 27, 2021 at 3:34 PM Dilip Kumar  wrote:
>
> On Fri, Aug 27, 2021 at 10:56 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Thu, Aug 26, 2021 2:18 PM Dilip Kumar  wrote:
> >
> > > The patch looks good to me, I have rebased 0002 atop
> > > this patch and also done some cosmetic fixes in 0002.
> >
> > Here are some comments for the 0002 patch.
> >
> > 1)
> >
> > - * toplevel transaction. Each subscription has a separate set of files.
> > + * toplevel transaction. Each subscription has a separate files.
> >
> > a separate files => a separate file
>
> Done
>
> > 2)
> > +* Otherwise, just open it file for writing, in append mode.
> >  */
> >
> > open it file => open the file
>
> Done
>
> >
> > 3)
> > if (subxact_data.nsubxacts == 0)
> > {
> > -   if (ent->subxact_fileset)
> > -   {
> > -   cleanup_subxact_info();
> > -   FileSetDeleteAll(ent->subxact_fileset);
> > -   pfree(ent->subxact_fileset);
> > -   ent->subxact_fileset = NULL;
> > -   }
> > +   cleanup_subxact_info();
> > +   BufFileDeleteFileSet(stream_fileset, path, true);
> > +
> >
> > Before applying the patch, the code only invoke cleanup_subxact_info() when 
> > the
> > file exists. After applying the patch, it will invoke cleanup_subxact_info()
> > either the file exists or not. Is it correct ?
>
> I think this is just structure resetting at the end of the stream.
> Earlier the hash was telling us whether we have ever dirtied that
> structure or not but now we are not maintaining that hash so we just
> reset it at the end of the stream.  I don't think its any bad, in fact
> I think this is much cheaper compared to maining the hash.
>
> >
> > 4)
> > /*
> > -* If this is the first streamed segment, the file must not exist, 
> > so make
> > -* sure we're the ones creating it. Otherwise just open the file for
> > -* writing, in append mode.
> > +* If this is the first streamed segment, create the changes file.
> > +* Otherwise, just open it file for writing, in append mode.
> >  */
> > if (first_segment)
> > -   {
> > ...
> > -   if (found)
> > -   ereport(ERROR,
> > -   
> > (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > -errmsg_internal("incorrect 
> > first-segment flag for streamed replication transaction")));
> > ...
> > -   }
> > +   stream_fd = BufFileCreateFileSet(stream_fileset, path);
> >
> >
> > Since the function BufFileCreateFileSet() doesn't check the file's 
> > existence,
> > the change here seems remove the file existence check which the old code 
> > did.
>
> Not really, we were just doing a sanity check of the in memory hash
> entry, now we don't maintain that so we don't need to do that.

Thank you for updating the patch!

The patch looks good to me except for the below comment:

+/* Delete the subxact file, if it exist. */
+subxact_filename(path, subid, xid);
+BufFileDeleteFileSet(stream_fileset, path, true);

s/it exist/it exists/

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-29 Thread Masahiko Sawada
On Fri, Aug 27, 2021 at 2:25 PM Dilip Kumar  wrote:
>
> On Thu, Aug 26, 2021 at 2:10 PM Amit Kapila  wrote:
>>
>>
>> Thanks, Sawada-San and Dilip for confirmation. I would like to commit
>> this and the second patch (the second one still needs some more
>> testing and review) for PG-15 as there is no bug per-se related to
>> this work in PG-14 but I see an argument to commit this for PG-14 to
>> keep the code (APIs) consistent. What do you think? Does anybody else
>> have any opinion on this?
>>
>
> IMHO, this is a fair amount of refactoring and this is actually an 
> improvement patch so we should push it to v15.

I think so too.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-27 Thread Dilip Kumar
On Fri, Aug 27, 2021 at 10:56 AM houzj.f...@fujitsu.com
 wrote:
>
> On Thu, Aug 26, 2021 2:18 PM Dilip Kumar  wrote:
>
> > The patch looks good to me, I have rebased 0002 atop
> > this patch and also done some cosmetic fixes in 0002.
>
> Here are some comments for the 0002 patch.
>
> 1)
>
> - * toplevel transaction. Each subscription has a separate set of files.
> + * toplevel transaction. Each subscription has a separate files.
>
> a separate files => a separate file

Done

> 2)
> +* Otherwise, just open it file for writing, in append mode.
>  */
>
> open it file => open the file

Done

>
> 3)
> if (subxact_data.nsubxacts == 0)
> {
> -   if (ent->subxact_fileset)
> -   {
> -   cleanup_subxact_info();
> -   FileSetDeleteAll(ent->subxact_fileset);
> -   pfree(ent->subxact_fileset);
> -   ent->subxact_fileset = NULL;
> -   }
> +   cleanup_subxact_info();
> +   BufFileDeleteFileSet(stream_fileset, path, true);
> +
>
> Before applying the patch, the code only invoke cleanup_subxact_info() when 
> the
> file exists. After applying the patch, it will invoke cleanup_subxact_info()
> either the file exists or not. Is it correct ?

I think this is just structure resetting at the end of the stream.
Earlier the hash was telling us whether we have ever dirtied that
structure or not but now we are not maintaining that hash so we just
reset it at the end of the stream.  I don't think its any bad, in fact
I think this is much cheaper compared to maining the hash.

>
> 4)
> /*
> -* If this is the first streamed segment, the file must not exist, so 
> make
> -* sure we're the ones creating it. Otherwise just open the file for
> -* writing, in append mode.
> +* If this is the first streamed segment, create the changes file.
> +* Otherwise, just open it file for writing, in append mode.
>  */
> if (first_segment)
> -   {
> ...
> -   if (found)
> -   ereport(ERROR,
> -   (errcode(ERRCODE_PROTOCOL_VIOLATION),
> -errmsg_internal("incorrect 
> first-segment flag for streamed replication transaction")));
> ...
> -   }
> +   stream_fd = BufFileCreateFileSet(stream_fileset, path);
>
>
> Since the function BufFileCreateFileSet() doesn't check the file's existence,
> the change here seems remove the file existence check which the old code did.

Not really, we were just doing a sanity check of the in memory hash
entry, now we don't maintain that so we don't need to do that.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From dfae7d9e3ae875c8db2de792e9a95ef86fef2b00 Mon Sep 17 00:00:00 2001
From: Amit Kapila 
Date: Wed, 25 Aug 2021 15:00:19 +0530
Subject: [PATCH v6 1/2] Refactor sharedfileset.c to separate out fileset
 implementation.

Move fileset related implementation out of sharedfileset.c to allow its
usage by backends that don't want to share filesets among different
processes. After this split, fileset infrastructure is used by both
sharedfileset.c and worker.c for the named temporary files that survive
across transactions.

Author: Dilip Kumar, based on suggestion by Andres Freund
Reviewed-by: Hou Zhijie, Masahiko Sawada, Amit Kapila
Discussion: https://postgr.es/m/e1mcc6u-0004ik...@gemulon.postgresql.org
---
 src/backend/replication/logical/launcher.c |   3 +
 src/backend/replication/logical/worker.c   |  82 ++
 src/backend/storage/file/Makefile  |   1 +
 src/backend/storage/file/buffile.c |  84 +-
 src/backend/storage/file/fd.c  |   2 +-
 src/backend/storage/file/fileset.c | 205 
 src/backend/storage/file/sharedfileset.c   | 244 +
 src/backend/utils/sort/logtape.c   |   8 +-
 src/backend/utils/sort/sharedtuplestore.c  |   5 +-
 src/include/replication/worker_internal.h  |   1 +
 src/include/storage/buffile.h  |  14 +-
 src/include/storage/fileset.h  |  40 +
 src/include/storage/sharedfileset.h|  14 +-
 src/tools/pgindent/typedefs.list   |   1 +
 14 files changed, 368 insertions(+), 336 deletions(-)
 create mode 100644 src/backend/storage/file/fileset.c
 create mode 100644 src/include/storage/fileset.h

diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index e3b11da..8b1772d 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -648,6 +648,9 @@ logicalrep_worker_onexit(int code, Datum arg)
 
 	logicalrep_worker_detach();
 
+	/* Cleanup filesets used for streaming transactions. */
+	logicalrep_worker_cleanupfileset();
+
 	ApplyLauncherWakeup();
 }
 
diff --git 

RE: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-26 Thread houzj.f...@fujitsu.com
On Fri, Aug 27, 2021 1:25 PM Dilip Kumar   wrote:
> On Thu, Aug 26, 2021 at 2:10 PM Amit Kapila  
> wrote:
> 
>> Thanks, Sawada-San and Dilip for confirmation. I would like to commit
>> this and the second patch (the second one still needs some more
>> testing and review) for PG-15 as there is no bug per-se related to
>> this work in PG-14 but I see an argument to commit this for PG-14 to
>> keep the code (APIs) consistent. What do you think? Does anybody else
>> have any opinion on this?

> IMHO, this is a fair amount of refactoring and this is actually an
> improvement patch so we should push it to v15.

+1

Best regards,
Hou zj


RE: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-26 Thread houzj.f...@fujitsu.com
On Thu, Aug 26, 2021 2:18 PM Dilip Kumar  wrote:

> The patch looks good to me, I have rebased 0002 atop
> this patch and also done some cosmetic fixes in 0002. 

Here are some comments for the 0002 patch.

1)

- * toplevel transaction. Each subscription has a separate set of files.
+ * toplevel transaction. Each subscription has a separate files.

a separate files => a separate file

2)
+* Otherwise, just open it file for writing, in append mode.
 */

open it file => open the file


3)
if (subxact_data.nsubxacts == 0)
{
-   if (ent->subxact_fileset)
-   {
-   cleanup_subxact_info();
-   FileSetDeleteAll(ent->subxact_fileset);
-   pfree(ent->subxact_fileset);
-   ent->subxact_fileset = NULL;
-   }
+   cleanup_subxact_info();
+   BufFileDeleteFileSet(stream_fileset, path, true);
+


Before applying the patch, the code only invoke cleanup_subxact_info() when the
file exists. After applying the patch, it will invoke cleanup_subxact_info()
either the file exists or not. Is it correct ?


4)
/*
-* If this is the first streamed segment, the file must not exist, so 
make
-* sure we're the ones creating it. Otherwise just open the file for
-* writing, in append mode.
+* If this is the first streamed segment, create the changes file.
+* Otherwise, just open it file for writing, in append mode.
 */
if (first_segment)
-   {
...
-   if (found)
-   ereport(ERROR,
-   (errcode(ERRCODE_PROTOCOL_VIOLATION),
-errmsg_internal("incorrect 
first-segment flag for streamed replication transaction")));
...
-   }
+   stream_fd = BufFileCreateFileSet(stream_fileset, path);


Since the function BufFileCreateFileSet() doesn't check the file's existence,
the change here seems remove the file existence check which the old code did.


Best regards,
Hou zj


Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-26 Thread Dilip Kumar
On Thu, Aug 26, 2021 at 2:10 PM Amit Kapila  wrote:

>
> Thanks, Sawada-San and Dilip for confirmation. I would like to commit
> this and the second patch (the second one still needs some more
> testing and review) for PG-15 as there is no bug per-se related to
> this work in PG-14 but I see an argument to commit this for PG-14 to
> keep the code (APIs) consistent. What do you think? Does anybody else
> have any opinion on this?
>
>
IMHO, this is a fair amount of refactoring and this is actually an
improvement patch so we should push it to v15.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-26 Thread Amit Kapila
On Thu, Aug 26, 2021 at 11:47 AM Dilip Kumar  wrote:
>
> On Wed, Aug 25, 2021 at 5:49 PM Amit Kapila  wrote:
>>
>> On Tue, Aug 24, 2021 at 3:55 PM Dilip Kumar  wrote:
>> >
>> > On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila  
>> > wrote:
>> >
>>
>> The first patch looks good to me. I have made minor changes to the
>> attached patch. The changes include: fixing compilation warning, made
>> some comment changes, ran pgindent, and few other cosmetic changes. If
>> you are fine with the attached, then kindly rebase the second patch
>> atop it.
>>
>
> The patch looks good to me,
>

Thanks, Sawada-San and Dilip for confirmation. I would like to commit
this and the second patch (the second one still needs some more
testing and review) for PG-15 as there is no bug per-se related to
this work in PG-14 but I see an argument to commit this for PG-14 to
keep the code (APIs) consistent. What do you think? Does anybody else
have any opinion on this?

Below is a summary of each of the patches for those who are not
following this closely:
Patch-1: The purpose of this patch is to refactor sharedfileset.c to
separate out fileset implementation. Basically, this moves the fileset
related implementation out of sharedfileset.c to allow its usage by
backends that don't want to share filesets among different processes.
After this split, fileset infrastructure is used by both
sharedfileset.c and worker.c for the named temporary files that
survive across transactions. This is suggested by Andres to have a
cleaner API and its usage.

Patch-2: Allow to use single fileset in worker.c (for the lifetime of
worker) instead of using a separate fileset for each remote
transaction. After this, the files to record changes for each remote
transaction will be created under the same fileset and the files will
be deleted after the transaction is completed. This is suggested by
Thomas to allow better resource usage.

-- 
With Regards,
Amit Kapila.




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-26 Thread Dilip Kumar
On Wed, Aug 25, 2021 at 5:49 PM Amit Kapila  wrote:

> On Tue, Aug 24, 2021 at 3:55 PM Dilip Kumar  wrote:
> >
> > On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila 
> wrote:
> >
>
> The first patch looks good to me. I have made minor changes to the
> attached patch. The changes include: fixing compilation warning, made
> some comment changes, ran pgindent, and few other cosmetic changes. If
> you are fine with the attached, then kindly rebase the second patch
> atop it.
>
>
The patch looks good to me, I have rebased 0002 atop this patch and also
done some cosmetic fixes in 0002.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 3558cdc8a4f77e3c9092bbaeded451eaaa1feca4 Mon Sep 17 00:00:00 2001
From: Amit Kapila 
Date: Wed, 25 Aug 2021 15:00:19 +0530
Subject: [PATCH v5 1/2] Refactor sharedfileset.c to separate out fileset
 implementation.

Move fileset related implementation out of sharedfileset.c to allow its
usage by backends that don't want to share filesets among different
processes. After this split, fileset infrastructure is used by both
sharedfileset.c and worker.c for the named temporary files that survive
across transactions.

Author: Dilip Kumar, based on suggestion by Andres Freund
Reviewed-by: Hou Zhijie, Masahiko Sawada, Amit Kapila
Discussion: https://postgr.es/m/e1mcc6u-0004ik...@gemulon.postgresql.org
---
 src/backend/replication/logical/launcher.c |   3 +
 src/backend/replication/logical/worker.c   |  82 ++
 src/backend/storage/file/Makefile  |   1 +
 src/backend/storage/file/buffile.c |  84 +-
 src/backend/storage/file/fd.c  |   2 +-
 src/backend/storage/file/fileset.c | 205 
 src/backend/storage/file/sharedfileset.c   | 244 +
 src/backend/utils/sort/logtape.c   |   8 +-
 src/backend/utils/sort/sharedtuplestore.c  |   5 +-
 src/include/replication/worker_internal.h  |   1 +
 src/include/storage/buffile.h  |  14 +-
 src/include/storage/fileset.h  |  40 +
 src/include/storage/sharedfileset.h|  14 +-
 src/tools/pgindent/typedefs.list   |   1 +
 14 files changed, 368 insertions(+), 336 deletions(-)
 create mode 100644 src/backend/storage/file/fileset.c
 create mode 100644 src/include/storage/fileset.h

diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index e3b11da..8b1772d 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -648,6 +648,9 @@ logicalrep_worker_onexit(int code, Datum arg)
 
 	logicalrep_worker_detach();
 
+	/* Cleanup filesets used for streaming transactions. */
+	logicalrep_worker_cleanupfileset();
+
 	ApplyLauncherWakeup();
 }
 
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 38b493e..6adc6ddd 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -39,13 +39,13 @@
  * BufFile infrastructure supports temporary files that exceed the OS file size
  * limit, (b) provides a way for automatic clean up on the error and (c) provides
  * a way to survive these files across local transactions and allow to open and
- * close at stream start and close. We decided to use SharedFileSet
+ * close at stream start and close. We decided to use FileSet
  * infrastructure as without that it deletes the files on the closure of the
  * file and if we decide to keep stream files open across the start/stop stream
  * then it will consume a lot of memory (more than 8K for each BufFile and
  * there could be multiple such BufFiles as the subscriber could receive
  * multiple start/stop streams for different transactions before getting the
- * commit). Moreover, if we don't use SharedFileSet then we also need to invent
+ * commit). Moreover, if we don't use FileSet then we also need to invent
  * a new way to pass filenames to BufFile APIs so that we are allowed to open
  * the file we desired across multiple stream-open calls for the same
  * transaction.
@@ -231,8 +231,8 @@ typedef struct ApplyExecutionData
 typedef struct StreamXidHash
 {
 	TransactionId xid;			/* xid is the hash key and must be first */
-	SharedFileSet *stream_fileset;	/* shared file set for stream data */
-	SharedFileSet *subxact_fileset; /* shared file set for subxact info */
+	FileSet*stream_fileset; /* file set for stream data */
+	FileSet*subxact_fileset;	/* file set for subxact info */
 } StreamXidHash;
 
 static MemoryContext ApplyMessageContext = NULL;
@@ -255,8 +255,8 @@ static bool in_streamed_transaction = false;
 static TransactionId stream_xid = InvalidTransactionId;
 
 /*
- * Hash table for storing the streaming xid information along with shared file
- * set for streaming and subxact files.
+ * Hash table for storing the streaming xid information along with filesets
+ * for streaming and subxact files.
  */
 static HTAB *xidhash = 

Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-25 Thread Masahiko Sawada
On Wed, Aug 25, 2021 at 9:19 PM Amit Kapila  wrote:
>
> On Tue, Aug 24, 2021 at 3:55 PM Dilip Kumar  wrote:
> >
> > On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila  
> > wrote:
> >
>
> The first patch looks good to me. I have made minor changes to the
> attached patch. The changes include: fixing compilation warning, made
> some comment changes, ran pgindent, and few other cosmetic changes. If
> you are fine with the attached, then kindly rebase the second patch
> atop it.

The patch looks good to me.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-25 Thread Amit Kapila
On Tue, Aug 24, 2021 at 3:55 PM Dilip Kumar  wrote:
>
> On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila  wrote:
>

The first patch looks good to me. I have made minor changes to the
attached patch. The changes include: fixing compilation warning, made
some comment changes, ran pgindent, and few other cosmetic changes. If
you are fine with the attached, then kindly rebase the second patch
atop it.

-- 
With Regards,
Amit Kapila.


v5-0001-Refactor-sharedfileset.c-to-separate-out-fileset-.patch
Description: Binary data


RE: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-24 Thread houzj.f...@fujitsu.com
On Tuesday, August 24, 2021 6:25 PM Dilip Kumar  wrote:
> On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila  wrote:
> 
> > > I was also thinking about the same, does it make sense to name it
> > > just ""%s/%s%lu.%u.fileset"?
> 
> Done
> 
> > I think it is reasonable to use .fileset as proposed by you.
> >
> > Few other comments:
> Done

After applying the patch, I tested the impacted features
(parallel hashjoin/btree build) with different settings
(log_temp_files/maintenance_work_mem/parallel_leader_participation/workers)
, and the patch works well.

One thing I noticed is that when enable log_temp_files, the pathname in log
changed from "xxx.sharedfileset" to "xxx.fileset", and I also noticed some
blogs[1] reference the old path name. Although, I think it doesn't matter, just
share the information here in case someone has different opinions.


Some minor thing I noticed in the patch:

1)
it seems we can add "FileSet" to typedefs.list

2)
The commit message in 0002 used "shared fileset" which should be "fileset",
---
Instead of using a separate shared fileset for each xid, use one shared
fileset for whole lifetime of the worker...
---

[1] https://blog.dbi-services.com/about-temp_tablespaces-in-postgresql/

Best regards,
Hou zj



Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-24 Thread Dilip Kumar
On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila  wrote:

> > I was also thinking about the same, does it make sense to name it just
> > ""%s/%s%lu.%u.fileset"?

Done

> I think it is reasonable to use .fileset as proposed by you.
>
> Few other comments:
> =
> 1.
> + /*
> + * Register before-shmem-exit hook to ensure filesets are dropped while we
> + * can still report stats for underlying temporary files.
> + */
> + before_shmem_exit(worker_cleanup, (Datum) 0);
> +
>
> Do we really need to register a new callback here? Won't the existing
> logical replication worker exit routine (logicalrep_worker_onexit) be
> sufficient for this patch's purpose?

Right, we don't need an extra function for this.

> 2.
> - SharedFileSet *fileset; /* space for segment files if shared */
> - const char *name; /* name of this BufFile if shared */
> + FileSet*fileset; /* space for fileset for fileset based file */
> + const char *name; /* name of this BufFile */
>
> The comments for the above two variables can be written as (a) space
> for fileset based segment files, (b) name of fileset based BufFile.

Done

> 3.
>  /*
> - * Create a new segment file backing a shared BufFile.
> + * Create a new segment file backing a fileset BufFile.
>   */
>  static File
> -MakeNewSharedSegment(BufFile *buffile, int segment)
> +MakeNewSegment(BufFile *buffile, int segment)
>
> I think it is better to name this function as MakeNewFileSetSegment.
> You can slightly change the comment as: "Create a new segment file
> backing a fileset based BufFile."

 Make sense

> 4.
> /*
>   * It is possible that there are files left over from before a crash
> - * restart with the same name.  In order for BufFileOpenShared() not to
> + * restart with the same name.  In order for BufFileOpen() not to
>   * get confused about how many segments there are, we'll unlink the next
>
> Typo. /BufFileOpen/BufFileOpenFileSet

Fixed

> 5.
>  static void
> -SharedSegmentName(char *name, const char *buffile_name, int segment)
> +SegmentName(char *name, const char *buffile_name, int segment)
>
> Can we name this as FileSetSegmentName?

Done

> 6.
>   *
> - * The naming scheme for shared BufFiles is left up to the calling code.  The
> + * The naming scheme for fileset BufFiles is left up to the calling code.
>
> Isn't it better to say "... fileset based BufFiles .."?

Done

> 7.
> + * FileSets provide a temporary namespace (think directory) so that files can
> + * be discovered by name
>
> A full stop is missing at the end of the statement.

Fixed

> 8.
> + *
> + * The callers are expected to explicitly remove such files by using
> + * SharedFileSetDelete/ SharedFileSetDeleteAll.
> + *
> + * Files will be distributed over the tablespaces configured in
> + * temp_tablespaces.
> + *
> + * Under the covers the set is one or more directories which will eventually
> + * be deleted.
> + */
> +void
> +FileSetInit(FileSet *fileset)
>
> Is there a need to mention 'Shared' in API names (SharedFileSetDelete/
> SharedFileSetDeleteAll) in the comments? Also, there doesn't seem to
> be a need for extra space before *DeleteAll API in comments.

Fixed

> 9.
>  * Files will be distributed over the tablespaces configured in
>  * temp_tablespaces.
>  *
>  * Under the covers the set is one or more directories which will eventually
>  * be deleted.
>  */
> void
> SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
>
> I think we can remove the part of the above comment where it says
> "Files will be distributed over ..." as that is already mentioned atop
> FileSetInit.

Done

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 94a8a22726f08f07259b49ccee340ab1d4cba116 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Wed, 18 Aug 2021 15:52:21 +0530
Subject: [PATCH v3 1/2] Sharedfileset refactoring

Currently, sharedfileset.c is designed for a very specific purpose i.e.
one backend could create the fileset and that could be shared across
multiple backends so the fileset should be created in DSM.  But in some
use cases, we need exact behavior as sharedfileset that the files are
created under this should be named files and should also survive the
transaction and should be allowed to be opened and closed.  In this
patch  we have refactored these files such that there will be two
files a) fileset.c which will provide general-purpose interfaces
b) sharedfileset.c, which will internally used fileset.c but this
will be specific for DSM-based filesets.
---
 src/backend/replication/logical/launcher.c |   3 +
 src/backend/replication/logical/worker.c   |  83 ++
 src/backend/storage/file/Makefile  |   1 +
 src/backend/storage/file/buffile.c |  84 +-
 src/backend/storage/file/fd.c  |   2 +-
 src/backend/storage/file/fileset.c | 204 
 src/backend/storage/file/sharedfileset.c   | 244 +
 src/backend/utils/sort/logtape.c   |   8 +-
 

Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-24 Thread Amit Kapila
On Mon, Aug 23, 2021 at 3:13 PM Dilip Kumar  wrote:
>
> On Mon, Aug 23, 2021 at 1:43 PM Amit Kapila  wrote:
>
> Note: merge comments from multiple mails
>
> > > I think we should handle that in worker.c itself, by adding a
> > > before_dsm_detach function before_shmem_exit right?
> > >
> >
> > Yeah, I thought of handling it in worker.c similar to what you've in 0002 
> > patch.
> >
>
> I have done handling in worker.c
>
> On Mon, Aug 23, 2021 at 9:11 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Sat, Aug 21, 2021 8:38 PM Dilip Kumar  wrote:
> >
> > 1)
> > +   TempTablespacePath(tempdirpath, tablespace);
> > +   snprintf(path, MAXPGPATH, "%s/%s%lu.%u.sharedfileset",
> > +tempdirpath, PG_TEMP_FILE_PREFIX,
> > +(unsigned long) fileset->creator_pid, 
> > fileset->number);
> >
> > do we need to use different filename for shared and un-shared fileset ?
>
> I was also thinking about the same, does it make sense to name it just
> ""%s/%s%lu.%u.fileset"?
>

I think it is reasonable to use .fileset as proposed by you.

Few other comments:
=
1.
+ /*
+ * Register before-shmem-exit hook to ensure filesets are dropped while we
+ * can still report stats for underlying temporary files.
+ */
+ before_shmem_exit(worker_cleanup, (Datum) 0);
+

Do we really need to register a new callback here? Won't the existing
logical replication worker exit routine (logicalrep_worker_onexit) be
sufficient for this patch's purpose?

2.
- SharedFileSet *fileset; /* space for segment files if shared */
- const char *name; /* name of this BufFile if shared */
+ FileSet*fileset; /* space for fileset for fileset based file */
+ const char *name; /* name of this BufFile */

The comments for the above two variables can be written as (a) space
for fileset based segment files, (b) name of fileset based BufFile.

3.
 /*
- * Create a new segment file backing a shared BufFile.
+ * Create a new segment file backing a fileset BufFile.
  */
 static File
-MakeNewSharedSegment(BufFile *buffile, int segment)
+MakeNewSegment(BufFile *buffile, int segment)

I think it is better to name this function as MakeNewFileSetSegment.
You can slightly change the comment as: "Create a new segment file
backing a fileset based BufFile."

4.
/*
  * It is possible that there are files left over from before a crash
- * restart with the same name.  In order for BufFileOpenShared() not to
+ * restart with the same name.  In order for BufFileOpen() not to
  * get confused about how many segments there are, we'll unlink the next

Typo. /BufFileOpen/BufFileOpenFileSet

5.
 static void
-SharedSegmentName(char *name, const char *buffile_name, int segment)
+SegmentName(char *name, const char *buffile_name, int segment)

Can we name this as FileSetSegmentName?

6.
  *
- * The naming scheme for shared BufFiles is left up to the calling code.  The
+ * The naming scheme for fileset BufFiles is left up to the calling code.

Isn't it better to say "... fileset based BufFiles .."?

7.
+ * FileSets provide a temporary namespace (think directory) so that files can
+ * be discovered by name

A full stop is missing at the end of the statement.

8.
+ *
+ * The callers are expected to explicitly remove such files by using
+ * SharedFileSetDelete/ SharedFileSetDeleteAll.
+ *
+ * Files will be distributed over the tablespaces configured in
+ * temp_tablespaces.
+ *
+ * Under the covers the set is one or more directories which will eventually
+ * be deleted.
+ */
+void
+FileSetInit(FileSet *fileset)

Is there a need to mention 'Shared' in API names (SharedFileSetDelete/
SharedFileSetDeleteAll) in the comments? Also, there doesn't seem to
be a need for extra space before *DeleteAll API in comments.

9.
 * Files will be distributed over the tablespaces configured in
 * temp_tablespaces.
 *
 * Under the covers the set is one or more directories which will eventually
 * be deleted.
 */
void
SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)

I think we can remove the part of the above comment where it says
"Files will be distributed over ..." as that is already mentioned atop
FileSetInit.

-- 
With Regards,
Amit Kapila.




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-23 Thread Dilip Kumar
On Mon, Aug 23, 2021 at 1:43 PM Amit Kapila  wrote:

Note: merge comments from multiple mails

> > I think we should handle that in worker.c itself, by adding a
> > before_dsm_detach function before_shmem_exit right?
> >
>
> Yeah, I thought of handling it in worker.c similar to what you've in 0002 
> patch.
>

I have done handling in worker.c

On Mon, Aug 23, 2021 at 9:11 AM houzj.f...@fujitsu.com
 wrote:
>
> On Sat, Aug 21, 2021 8:38 PM Dilip Kumar  wrote:
>
> 1)
> +   TempTablespacePath(tempdirpath, tablespace);
> +   snprintf(path, MAXPGPATH, "%s/%s%lu.%u.sharedfileset",
> +tempdirpath, PG_TEMP_FILE_PREFIX,
> +(unsigned long) fileset->creator_pid, 
> fileset->number);
>
> do we need to use different filename for shared and un-shared fileset ?

I was also thinking about the same, does it make sense to name it just
""%s/%s%lu.%u.fileset"?

On Mon, Aug 23, 2021 at 1:08 PM Masahiko Sawada  wrote:

> Here are some comments on 0001 patch:
>
> +/*
> + * Initialize a space for temporary files. This API can be used by shared
> + * fileset as well as if the temporary files are used only by single backend
> + * but the files need to be opened and closed multiple times and also the
> + * underlying files need to survive across transactions.
> + *
> + * Files will be distributed over the tablespaces configured in
> + * temp_tablespaces.
> + *
> + * Under the covers the set is one or more directories which will eventually
> + * be deleted.
> + */
>
> I think it's better to mention cleaning up here like we do in the
> comment of SharedFileSetInit().

Right, done

>
> ---
> I think we need to clean up both stream_fileset and subxact_fileset on
> proc exit. In 0002 patch cleans up the fileset but I think we need to
> do that also in 0001 patch.

Done

> ---
> There still are some comments using "shared fileset" in both buffile.c
> and worker.c.

Done


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From ced3a5e0222c2f89a977add167615f09d99d107a Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Wed, 18 Aug 2021 15:52:21 +0530
Subject: [PATCH v2 1/2] Sharedfileset refactoring

Currently, sharedfileset.c is designed for a very specific purpose i.e.
one backend could create the fileset and  could be shared across multiple
backend so the fileset should be created in DSM.  But in some use cases,
we need exact behavior as sharedfileset that the files created under this
should be named files and should survive the transaction and should allow
to have feature of open and close.  In this patch  we have refactored
these files such that there will be two files a) fileset.c which will provide
general-purpose interfaces b) sharedfileset.c, which will internally used
fileset.c but this will be specific for DSM-based filesets.
---
 src/backend/replication/logical/worker.c  |  89 +++
 src/backend/storage/file/Makefile |   1 +
 src/backend/storage/file/buffile.c|  82 +-
 src/backend/storage/file/fd.c |   2 +-
 src/backend/storage/file/fileset.c| 204 +
 src/backend/storage/file/sharedfileset.c  | 239 +-
 src/backend/utils/sort/logtape.c  |   8 +-
 src/backend/utils/sort/sharedtuplestore.c |   5 +-
 src/include/storage/buffile.h |  12 +-
 src/include/storage/fileset.h |  40 +
 src/include/storage/sharedfileset.h   |  14 +-
 11 files changed, 366 insertions(+), 330 deletions(-)
 create mode 100644 src/backend/storage/file/fileset.c
 create mode 100644 src/include/storage/fileset.h

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ecaed15..d98ebab 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -39,13 +39,13 @@
  * BufFile infrastructure supports temporary files that exceed the OS file size
  * limit, (b) provides a way for automatic clean up on the error and (c) provides
  * a way to survive these files across local transactions and allow to open and
- * close at stream start and close. We decided to use SharedFileSet
+ * close at stream start and close. We decided to use FileSet
  * infrastructure as without that it deletes the files on the closure of the
  * file and if we decide to keep stream files open across the start/stop stream
  * then it will consume a lot of memory (more than 8K for each BufFile and
  * there could be multiple such BufFiles as the subscriber could receive
  * multiple start/stop streams for different transactions before getting the
- * commit). Moreover, if we don't use SharedFileSet then we also need to invent
+ * commit). Moreover, if we don't use FileSet then we also need to invent
  * a new way to pass filenames to BufFile APIs so that we are allowed to open
  * the file we desired across multiple stream-open calls for the same
  * transaction.
@@ -231,8 +231,8 @@ typedef struct 

Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-23 Thread Amit Kapila
On Mon, Aug 23, 2021 at 12:52 PM Dilip Kumar  wrote:
>
> On Mon, Aug 23, 2021 at 11:43 AM Amit Kapila  wrote:
> >
> > On Mon, Aug 23, 2021 at 9:53 AM Dilip Kumar  wrote:
> > >
> > > On Mon, Aug 23, 2021 at 9:11 AM houzj.f...@fujitsu.com
> > >  wrote:
> > >
> > > > 4)
> > > > -extern File SharedFileSetCreate(SharedFileSet *fileset, const char 
> > > > *name);
> > > > -extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
> > > > - int mode);
> > > > -extern bool SharedFileSetDelete(SharedFileSet *fileset, const char 
> > > > *name,
> > > > -   bool 
> > > > error_on_failure);
> > > >  extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
> > > > -extern void SharedFileSetUnregister(SharedFileSet *input_fileset);
> > > >
> > > > I noticed the patch delete part of public api, is it better to keep the 
> > > > old api and
> > > > let them invoke new api internally ? Having said that, I didn’t find 
> > > > any open source
> > > > extension use these old api, so it might be fine to delete them.
> > >
> > > Right, those were internally used by buffile.c but now we have changed
> > > buffile.c to directly use the fileset interfaces, so we better remove
> > > them.
> > >
> >
> > I also don't see any reason to keep those exposed from
> > sharedfileset.h. I see that even in the original commit dc6c4c9dc2,
> > these APIs seem to be introduced to be used by buffile. Andres/Thomas,
> > do let us know if you think otherwise?
> >
> > One more comment:
> > I think v1-0001-Sharedfileset-refactoring doesn't have a way for
> > cleaning up worker.c temporary files on error/exit. It is better to
> > have that to make it an independent patch.
>
> I think we should handle that in worker.c itself, by adding a
> before_dsm_detach function before_shmem_exit right?
>

Yeah, I thought of handling it in worker.c similar to what you've in 0002 patch.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-23 Thread Masahiko Sawada
On Sat, Aug 21, 2021 at 9:38 PM Dilip Kumar  wrote:
>
> On Wed, Aug 18, 2021 at 3:45 PM Dilip Kumar  wrote:
> >
> > On Wed, Aug 18, 2021 at 11:24 AM Amit Kapila  
> > wrote:
> > >
> > > On Tue, Aug 17, 2021 at 4:34 PM Andres Freund  wrote:
> > > >
> > > > On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > > > > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > > > > for non-dsm FileSet? One idea could be that we can have a variable
> > > > > (probably bool) in SharedFileSet structure which will be initialized
> > > > > in SharedFileSetInit based on whether the caller has provided dsm
> > > > > segment. Then in other DSM-based APIs, we can check if it is used for
> > > > > the wrong type.
> > > >
> > > > Well, isn't the issue here that it's not a shared file set in case you
> > > > explicitly don't want to share it? ISTM that the proper way to address
> > > > this would be to split out a FileSet from SharedFileSet that's then used
> > > > for worker.c and sharedfileset.c.
> > > >
> > >
> > > Okay, but note that to accomplish the same, we need to tweak the
> > > BufFile (buffile.c) APIs as well so that they can work with FileSet.
> > > As per the initial analysis, there doesn't seem to be any problem with
> > > that though.
> >
> > I was looking into this, so if we want to do that I think the outline
> > will look like this
> >
> > - There will be a fileset.c and fileset.h files, and we will expose a
> > new structure FileSet, which will be the same as SharedFileSet, except
> > mutext and refcount.  The fileset.c will expose FileSetInit(),
> > FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
> > interfaces.
> >
> > - sharefileset.c will internally call the fileset.c's interfaces.  The
> > SharedFileSet structure will also contain FileSet and other members
> > i.e. mutex and refcount.
> >
> > - the buffile.c's interfaces which are ending with Shared e.g.
> > BufFileCreateShared, BufFileOpenShared, should be converted to
> > BufFileCreate and
> > BufFileOpen respectively.  And the input to these interfaces can be
> > converted to FileSet instead of SharedFileSet.
>
> Here is the first draft based on the idea we discussed, 0001, splits
> sharedfileset.c in sharedfileset.c and fileset.c and 0002 is same
> patch I submitted earlier(use single fileset throughout the worker),
> just it is rebased on top of 0001.  Please let me know your thoughts.

Here are some comments on 0001 patch:

+/*
+ * Initialize a space for temporary files. This API can be used by shared
+ * fileset as well as if the temporary files are used only by single backend
+ * but the files need to be opened and closed multiple times and also the
+ * underlying files need to survive across transactions.
+ *
+ * Files will be distributed over the tablespaces configured in
+ * temp_tablespaces.
+ *
+ * Under the covers the set is one or more directories which will eventually
+ * be deleted.
+ */

I think it's better to mention cleaning up here like we do in the
comment of SharedFileSetInit().

---
I think we need to clean up both stream_fileset and subxact_fileset on
proc exit. In 0002 patch cleans up the fileset but I think we need to
do that also in 0001 patch.

---
There still are some comments using "shared fileset" in both buffile.c
and worker.c.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-23 Thread Dilip Kumar
On Mon, Aug 23, 2021 at 11:43 AM Amit Kapila  wrote:
>
> On Mon, Aug 23, 2021 at 9:53 AM Dilip Kumar  wrote:
> >
> > On Mon, Aug 23, 2021 at 9:11 AM houzj.f...@fujitsu.com
> >  wrote:
> >
> > > 4)
> > > -extern File SharedFileSetCreate(SharedFileSet *fileset, const char 
> > > *name);
> > > -extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
> > > - int mode);
> > > -extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
> > > -   bool 
> > > error_on_failure);
> > >  extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
> > > -extern void SharedFileSetUnregister(SharedFileSet *input_fileset);
> > >
> > > I noticed the patch delete part of public api, is it better to keep the 
> > > old api and
> > > let them invoke new api internally ? Having said that, I didn’t find any 
> > > open source
> > > extension use these old api, so it might be fine to delete them.
> >
> > Right, those were internally used by buffile.c but now we have changed
> > buffile.c to directly use the fileset interfaces, so we better remove
> > them.
> >
>
> I also don't see any reason to keep those exposed from
> sharedfileset.h. I see that even in the original commit dc6c4c9dc2,
> these APIs seem to be introduced to be used by buffile. Andres/Thomas,
> do let us know if you think otherwise?
>
> One more comment:
> I think v1-0001-Sharedfileset-refactoring doesn't have a way for
> cleaning up worker.c temporary files on error/exit. It is better to
> have that to make it an independent patch.

I think we should handle that in worker.c itself, by adding a
before_dsm_detach function before_shmem_exit right?  Or you are
thinking that in FileSetInit, we keep the mechanism of filesetlist
like we were doing in SharedFileSetInit?  I think that will just add
unnecessary complexity in the first patch which will eventually go
away in the second patch.  And if we do that then SharedFileSetInit
can not directly use the FileSetInit, otherwise, the dsm based fileset
will also get registered for cleanup in filesetlist so for that we
might need to pass one parameter to the FileSetInit() that whether to
register for cleanup or not and that will again not look clean because
now we are again adding the conditional cleanup, IMHO that is the same
problem what we are trying to cleanup in SharedFileSetInit by
introducing a new FileSetInit.

I think what we can do is, introduce a new function
FileSetInitInternal(), that will do what FileSetInit() is doing today
and now both SharedFileSetInit() and the FileSetInit() will call this
function, and along with that SharedFileSetInit(), will register the
dsm based cleanup and FileSetInit() will do the filesetlist based
cleanup.  But IMHO, we should try to go in this direction only if we
are sure that we want to commit the first patch and not the second.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-08-23 Thread Amit Kapila
On Mon, Aug 23, 2021 at 9:53 AM Dilip Kumar  wrote:
>
> On Mon, Aug 23, 2021 at 9:11 AM houzj.f...@fujitsu.com
>  wrote:
>
> > 4)
> > -extern File SharedFileSetCreate(SharedFileSet *fileset, const char *name);
> > -extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
> > - int mode);
> > -extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
> > -   bool 
> > error_on_failure);
> >  extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
> > -extern void SharedFileSetUnregister(SharedFileSet *input_fileset);
> >
> > I noticed the patch delete part of public api, is it better to keep the old 
> > api and
> > let them invoke new api internally ? Having said that, I didn’t find any 
> > open source
> > extension use these old api, so it might be fine to delete them.
>
> Right, those were internally used by buffile.c but now we have changed
> buffile.c to directly use the fileset interfaces, so we better remove
> them.
>

I also don't see any reason to keep those exposed from
sharedfileset.h. I see that even in the original commit dc6c4c9dc2,
these APIs seem to be introduced to be used by buffile. Andres/Thomas,
do let us know if you think otherwise?

One more comment:
I think v1-0001-Sharedfileset-refactoring doesn't have a way for
cleaning up worker.c temporary files on error/exit. It is better to
have that to make it an independent patch.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-22 Thread Dilip Kumar
On Mon, Aug 23, 2021 at 9:11 AM houzj.f...@fujitsu.com
 wrote:

> 4)
> -extern File SharedFileSetCreate(SharedFileSet *fileset, const char *name);
> -extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
> - int mode);
> -extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
> -   bool 
> error_on_failure);
>  extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
> -extern void SharedFileSetUnregister(SharedFileSet *input_fileset);
>
> I noticed the patch delete part of public api, is it better to keep the old 
> api and
> let them invoke new api internally ? Having said that, I didn’t find any open 
> source
> extension use these old api, so it might be fine to delete them.

Right, those were internally used by buffile.c but now we have changed
buffile.c to directly use the fileset interfaces, so we better remove
them.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




RE: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-22 Thread houzj.f...@fujitsu.com
On Sat, Aug 21, 2021 8:38 PM Dilip Kumar  wrote:
> On Wed, Aug 18, 2021 at 3:45 PM Dilip Kumar  wrote:
> > I was looking into this, so if we want to do that I think the outline
> > will look like this
> >
> > - There will be a fileset.c and fileset.h files, and we will expose a
> > new structure FileSet, which will be the same as SharedFileSet, except
> > mutext and refcount.  The fileset.c will expose FileSetInit(),
> > FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
> > interfaces.
> >
> > - sharefileset.c will internally call the fileset.c's interfaces.  The
> > SharedFileSet structure will also contain FileSet and other members
> > i.e. mutex and refcount.
> >
> > - the buffile.c's interfaces which are ending with Shared e.g.
> > BufFileCreateShared, BufFileOpenShared, should be converted to
> > BufFileCreate and BufFileOpen respectively.  And the input to these
> > interfaces can be converted to FileSet instead of SharedFileSet.
> 
> Here is the first draft based on the idea we discussed, 0001, splits
> sharedfileset.c in sharedfileset.c and fileset.c and 0002 is same patch I
> submitted earlier(use single fileset throughout the worker), just it is 
> rebased on
> top of 0001.  Please let me know your thoughts.

Hi,

Here are some comments for the new version patches.

1)
+   TempTablespacePath(tempdirpath, tablespace);
+   snprintf(path, MAXPGPATH, "%s/%s%lu.%u.sharedfileset",
+tempdirpath, PG_TEMP_FILE_PREFIX,
+(unsigned long) fileset->creator_pid, fileset->number);

do we need to use different filename for shared and un-shared fileset ?

2)
I think we can remove or adjust the following comments in sharedfileset.c.


 * SharedFileSets can be used by backends when the temporary files need to be
 * opened/closed multiple times and the underlying files need to survive across
 * transactions.

 * We can also use this interface if the temporary files are used only by
 * single backend but the files need to be opened and closed multiple times
 * and also the underlying files need to survive across transactions.  For


3)
The 0002 patch still used the word "shared fileset" in some places, I think we
should change it to "fileset".

4)
-extern File SharedFileSetCreate(SharedFileSet *fileset, const char *name);
-extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
- int mode);
-extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
-   bool 
error_on_failure);
 extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
-extern void SharedFileSetUnregister(SharedFileSet *input_fileset);

I noticed the patch delete part of public api, is it better to keep the old api 
and
let them invoke new api internally ? Having said that, I didn’t find any open 
source
extension use these old api, so it might be fine to delete them.

Best regards,
Hou zj


Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-21 Thread Dilip Kumar
On Wed, Aug 18, 2021 at 3:45 PM Dilip Kumar  wrote:
>
> On Wed, Aug 18, 2021 at 11:24 AM Amit Kapila  wrote:
> >
> > On Tue, Aug 17, 2021 at 4:34 PM Andres Freund  wrote:
> > >
> > > On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > > > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > > > for non-dsm FileSet? One idea could be that we can have a variable
> > > > (probably bool) in SharedFileSet structure which will be initialized
> > > > in SharedFileSetInit based on whether the caller has provided dsm
> > > > segment. Then in other DSM-based APIs, we can check if it is used for
> > > > the wrong type.
> > >
> > > Well, isn't the issue here that it's not a shared file set in case you
> > > explicitly don't want to share it? ISTM that the proper way to address
> > > this would be to split out a FileSet from SharedFileSet that's then used
> > > for worker.c and sharedfileset.c.
> > >
> >
> > Okay, but note that to accomplish the same, we need to tweak the
> > BufFile (buffile.c) APIs as well so that they can work with FileSet.
> > As per the initial analysis, there doesn't seem to be any problem with
> > that though.
>
> I was looking into this, so if we want to do that I think the outline
> will look like this
>
> - There will be a fileset.c and fileset.h files, and we will expose a
> new structure FileSet, which will be the same as SharedFileSet, except
> mutext and refcount.  The fileset.c will expose FileSetInit(),
> FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
> interfaces.
>
> - sharefileset.c will internally call the fileset.c's interfaces.  The
> SharedFileSet structure will also contain FileSet and other members
> i.e. mutex and refcount.
>
> - the buffile.c's interfaces which are ending with Shared e.g.
> BufFileCreateShared, BufFileOpenShared, should be converted to
> BufFileCreate and
> BufFileOpen respectively.  And the input to these interfaces can be
> converted to FileSet instead of SharedFileSet.

Here is the first draft based on the idea we discussed, 0001, splits
sharedfileset.c in sharedfileset.c and fileset.c and 0002 is same
patch I submitted earlier(use single fileset throughout the worker),
just it is rebased on top of 0001.  Please let me know your thoughts.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 7368136290a67e49f6bf0ad5773be67243e99637 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Fri, 20 Aug 2021 11:32:33 +0530
Subject: [PATCH v1 2/2] Better usage of sharedfileset in apply worker

Instead of using a separate shared fileset for each xid, use one shared
fileset for whole lifetime of the worker.  So for each xid, just create
shared buffile under that shared fileset and remove the file whenever we
are done with the file.  For subxact file we only need to create once
we get the first subtransaction and for detecting that we also extend the
buffile open and buffile delete interfaces to allow the missing files.
---
 src/backend/replication/logical/worker.c  | 229 +++---
 src/backend/storage/file/buffile.c|  23 ++-
 src/backend/utils/sort/logtape.c  |   2 +-
 src/backend/utils/sort/sharedtuplestore.c |   3 +-
 src/include/storage/buffile.h |   5 +-
 5 files changed, 76 insertions(+), 186 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 9901cf6..77cad7f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -221,20 +221,6 @@ typedef struct ApplyExecutionData
 	PartitionTupleRouting *proute;	/* partition routing info */
 } ApplyExecutionData;
 
-/*
- * Stream xid hash entry. Whenever we see a new xid we create this entry in the
- * xidhash and along with it create the streaming file and store the fileset handle.
- * The subxact file is created iff there is any subxact info under this xid. This
- * entry is used on the subsequent streams for the xid to get the corresponding
- * fileset handles, so storing them in hash makes the search faster.
- */
-typedef struct StreamXidHash
-{
-	TransactionId xid;			/* xid is the hash key and must be first */
-	FileSet		 *stream_fileset;	/* shared file set for stream data */
-	FileSet		 *subxact_fileset;	/* shared file set for subxact info */
-} StreamXidHash;
-
 static MemoryContext ApplyMessageContext = NULL;
 MemoryContext ApplyContext = NULL;
 
@@ -255,10 +241,12 @@ static bool in_streamed_transaction = false;
 static TransactionId stream_xid = InvalidTransactionId;
 
 /*
- * Hash table for storing the streaming xid information along with shared file
- * set for streaming and subxact files.
+ * Fileset for storing the changes and subxact information for the streaming
+ * transaction.  We will use only one fileset and for each xid a separate
+ * changes and subxact files will be created under the same fileset.
  */
-static HTAB *xidhash = NULL;
+static FileSet *xidfileset = NULL;
+
 
 /* BufFile 

Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-18 Thread Amit Kapila
On Wed, Aug 18, 2021 at 3:45 PM Dilip Kumar  wrote:
>
> On Wed, Aug 18, 2021 at 11:24 AM Amit Kapila  wrote:
> >
> > On Tue, Aug 17, 2021 at 4:34 PM Andres Freund  wrote:
> > >
> > > On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > > > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > > > for non-dsm FileSet? One idea could be that we can have a variable
> > > > (probably bool) in SharedFileSet structure which will be initialized
> > > > in SharedFileSetInit based on whether the caller has provided dsm
> > > > segment. Then in other DSM-based APIs, we can check if it is used for
> > > > the wrong type.
> > >
> > > Well, isn't the issue here that it's not a shared file set in case you
> > > explicitly don't want to share it? ISTM that the proper way to address
> > > this would be to split out a FileSet from SharedFileSet that's then used
> > > for worker.c and sharedfileset.c.
> > >
> >
> > Okay, but note that to accomplish the same, we need to tweak the
> > BufFile (buffile.c) APIs as well so that they can work with FileSet.
> > As per the initial analysis, there doesn't seem to be any problem with
> > that though.
>
> I was looking into this, so if we want to do that I think the outline
> will look like this
>
> - There will be a fileset.c and fileset.h files, and we will expose a
> new structure FileSet, which will be the same as SharedFileSet, except
> mutext and refcount.  The fileset.c will expose FileSetInit(),
> FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
> interfaces.
>
> - sharefileset.c will internally call the fileset.c's interfaces.  The
> SharedFileSet structure will also contain FileSet and other members
> i.e. mutex and refcount.
>
> - the buffile.c's interfaces which are ending with Shared e.g.
> BufFileCreateShared, BufFileOpenShared, should be converted to
> BufFileCreate and
> BufFileOpen respectively.
>

The other alternative to name buffile APIs could be
BufFileCreateFileSet, BufFileOpenFileSet, etc.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-18 Thread Dilip Kumar
On Wed, Aug 18, 2021 at 11:24 AM Amit Kapila  wrote:
>
> On Tue, Aug 17, 2021 at 4:34 PM Andres Freund  wrote:
> >
> > On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > > for non-dsm FileSet? One idea could be that we can have a variable
> > > (probably bool) in SharedFileSet structure which will be initialized
> > > in SharedFileSetInit based on whether the caller has provided dsm
> > > segment. Then in other DSM-based APIs, we can check if it is used for
> > > the wrong type.
> >
> > Well, isn't the issue here that it's not a shared file set in case you
> > explicitly don't want to share it? ISTM that the proper way to address
> > this would be to split out a FileSet from SharedFileSet that's then used
> > for worker.c and sharedfileset.c.
> >
>
> Okay, but note that to accomplish the same, we need to tweak the
> BufFile (buffile.c) APIs as well so that they can work with FileSet.
> As per the initial analysis, there doesn't seem to be any problem with
> that though.

I was looking into this, so if we want to do that I think the outline
will look like this

- There will be a fileset.c and fileset.h files, and we will expose a
new structure FileSet, which will be the same as SharedFileSet, except
mutext and refcount.  The fileset.c will expose FileSetInit(),
FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
interfaces.

- sharefileset.c will internally call the fileset.c's interfaces.  The
SharedFileSet structure will also contain FileSet and other members
i.e. mutex and refcount.

- the buffile.c's interfaces which are ending with Shared e.g.
BufFileCreateShared, BufFileOpenShared, should be converted to
BufFileCreate and
BufFileOpen respectively.  And the input to these interfaces can be
converted to FileSet instead of SharedFileSet.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread Amit Kapila
On Tue, Aug 17, 2021 at 4:34 PM Andres Freund  wrote:
>
> On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > for non-dsm FileSet? One idea could be that we can have a variable
> > (probably bool) in SharedFileSet structure which will be initialized
> > in SharedFileSetInit based on whether the caller has provided dsm
> > segment. Then in other DSM-based APIs, we can check if it is used for
> > the wrong type.
>
> Well, isn't the issue here that it's not a shared file set in case you
> explicitly don't want to share it? ISTM that the proper way to address
> this would be to split out a FileSet from SharedFileSet that's then used
> for worker.c and sharedfileset.c.
>

Okay, but note that to accomplish the same, we need to tweak the
BufFile (buffile.c) APIs as well so that they can work with FileSet.
As per the initial analysis, there doesn't seem to be any problem with
that though.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread Dilip Kumar
On Wed, Aug 18, 2021 at 9:30 AM Amit Kapila  wrote:
>
> On Wed, Aug 18, 2021 at 8:23 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Wed, Aug 18, 2021 9:17 AM houzj.f...@fujitsu.com wrote:
> > > Hi,
> > >
> > > I took a quick look at the v2 patch and noticed a typo.
> > >
> > > + * backends and render it read-only.  If missing_ok is true then it will 
> > > return
> > > + * NULL if file doesn not exist otherwise error.
> > >   */
> > > doesn not=> doesn't
> > >
> >
> > Here are some other comments:
> >
> > 1).
> > +BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode,
> > + bool missing_ok)
> >  {
> > BufFile*file;
> > charsegment_name[MAXPGPATH];
> > ...
> > files = palloc(sizeof(File) * capacity);
> > ...
> > @@ -318,10 +320,15 @@ BufFileOpenShared(SharedFileSet *fileset, const char 
> > *name, int mode)
> >  * name.
> >  */
> > if (nfiles == 0)
> > +   {
> > +   if (missing_ok)
> > +   return NULL;
> > +
> >
> > I think it might be better to pfree(files) when return NULL.
> >
> >
> > 2).
> > /* Delete the subxact file and release the memory, if it exist */
> > -   if (ent->subxact_fileset)
> > -   {
> > -   subxact_filename(path, subid, xid);
> > -   SharedFileSetDeleteAll(ent->subxact_fileset);
> > -   pfree(ent->subxact_fileset);
> > -   ent->subxact_fileset = NULL;
> > -   }
> > -
> > -   /* Remove the xid entry from the stream xid hash */
> > -   hash_search(xidhash, (void *) , HASH_REMOVE, NULL);
> > +   subxact_filename(path, subid, xid);
> > +   SharedFileSetDelete(xidfileset, path, true);
> >
> > Without the patch it doesn't throw an error if not exist,
> > But with the patch, it pass error_on_failure=true to SharedFileSetDelete().
> >
>
> Don't we need to use BufFileDeleteShared instead of
> SharedFileSetDelete as you have used to remove the changes file?

Yeah, it should be BufFileDeleteShared.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread Amit Kapila
On Wed, Aug 18, 2021 at 8:23 AM houzj.f...@fujitsu.com
 wrote:
>
> On Wed, Aug 18, 2021 9:17 AM houzj.f...@fujitsu.com wrote:
> > Hi,
> >
> > I took a quick look at the v2 patch and noticed a typo.
> >
> > + * backends and render it read-only.  If missing_ok is true then it will 
> > return
> > + * NULL if file doesn not exist otherwise error.
> >   */
> > doesn not=> doesn't
> >
>
> Here are some other comments:
>
> 1).
> +BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode,
> + bool missing_ok)
>  {
> BufFile*file;
> charsegment_name[MAXPGPATH];
> ...
> files = palloc(sizeof(File) * capacity);
> ...
> @@ -318,10 +320,15 @@ BufFileOpenShared(SharedFileSet *fileset, const char 
> *name, int mode)
>  * name.
>  */
> if (nfiles == 0)
> +   {
> +   if (missing_ok)
> +   return NULL;
> +
>
> I think it might be better to pfree(files) when return NULL.
>
>
> 2).
> /* Delete the subxact file and release the memory, if it exist */
> -   if (ent->subxact_fileset)
> -   {
> -   subxact_filename(path, subid, xid);
> -   SharedFileSetDeleteAll(ent->subxact_fileset);
> -   pfree(ent->subxact_fileset);
> -   ent->subxact_fileset = NULL;
> -   }
> -
> -   /* Remove the xid entry from the stream xid hash */
> -   hash_search(xidhash, (void *) , HASH_REMOVE, NULL);
> +   subxact_filename(path, subid, xid);
> +   SharedFileSetDelete(xidfileset, path, true);
>
> Without the patch it doesn't throw an error if not exist,
> But with the patch, it pass error_on_failure=true to SharedFileSetDelete().
>

Don't we need to use BufFileDeleteShared instead of
SharedFileSetDelete as you have used to remove the changes file?

-- 
With Regards,
Amit Kapila.




RE: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread houzj.f...@fujitsu.com
On Wed, Aug 18, 2021 9:17 AM houzj.f...@fujitsu.com wrote:
> Hi,
> 
> I took a quick look at the v2 patch and noticed a typo.
> 
> + * backends and render it read-only.  If missing_ok is true then it will 
> return
> + * NULL if file doesn not exist otherwise error.
>   */
> doesn not=> doesn't
>

Here are some other comments:

1).
+BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode,
+ bool missing_ok)
 {
BufFile*file;
charsegment_name[MAXPGPATH];
...
files = palloc(sizeof(File) * capacity);
...
@@ -318,10 +320,15 @@ BufFileOpenShared(SharedFileSet *fileset, const char 
*name, int mode)
 * name.
 */
if (nfiles == 0)
+   {
+   if (missing_ok)
+   return NULL;
+

I think it might be better to pfree(files) when return NULL.


2).
/* Delete the subxact file and release the memory, if it exist */
-   if (ent->subxact_fileset)
-   {
-   subxact_filename(path, subid, xid);
-   SharedFileSetDeleteAll(ent->subxact_fileset);
-   pfree(ent->subxact_fileset);
-   ent->subxact_fileset = NULL;
-   }
-
-   /* Remove the xid entry from the stream xid hash */
-   hash_search(xidhash, (void *) , HASH_REMOVE, NULL);
+   subxact_filename(path, subid, xid);
+   SharedFileSetDelete(xidfileset, path, true);

Without the patch it doesn't throw an error if not exist,
But with the patch, it pass error_on_failure=true to SharedFileSetDelete().
Was it intentional ?

Best regards,
Houzj




RE: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread houzj.f...@fujitsu.com
Hi,

I took a quick look at the v2 patch and noticed a typo.

+ * backends and render it read-only.  If missing_ok is true then it will return
+ * NULL if file doesn not exist otherwise error.
  */
doesn not=> doesn't

Best regards,
Houzj




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread Andres Freund
Hi,

On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> 5. How can we provide a strict mechanism to not allow to use dsm APIs
> for non-dsm FileSet? One idea could be that we can have a variable
> (probably bool) in SharedFileSet structure which will be initialized
> in SharedFileSetInit based on whether the caller has provided dsm
> segment. Then in other DSM-based APIs, we can check if it is used for
> the wrong type.

Well, isn't the issue here that it's not a shared file set in case you
explicitly don't want to share it? ISTM that the proper way to address
this would be to split out a FileSet from SharedFileSet that's then used
for worker.c and sharedfileset.c. Rather than making sharedfileset.c
support a non-shared mode.

Greetings,

Andres Freund




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread Amit Kapila
On Tue, Aug 17, 2021 at 1:30 PM Dilip Kumar  wrote:
>
> On Tue, Aug 17, 2021 at 10:54 AM Amit Kapila  wrote:
> >
>
> > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > for non-dsm FileSet? One idea could be that we can have a variable
> > (probably bool) in SharedFileSet structure which will be initialized
> > in SharedFileSetInit based on whether the caller has provided dsm
> > segment. Then in other DSM-based APIs, we can check if it is used for
> > the wrong type.
>
> Yeah, we can do something like that, can't we just use an existing
> variable instead of adding new, e.g. refcnt is required only when
> multiple processes are attached, so maybe if dsm segment is not passed
> then we can keep refcnt as 0 and based on we can give an error.  For
> example, if we try to call SharedFileSetAttach for the SharedFileSet
> which has refcnt as 0 then we error out?
>

But as of now, we treat refcnt as 0 for SharedFileSet that is already
destroyed. See SharedFileSetAttach.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread Dilip Kumar
On Tue, Aug 17, 2021 at 10:54 AM Amit Kapila  wrote:
>
> On Mon, Aug 16, 2021 at 8:18 PM Dilip Kumar  wrote:
> >
> > On Fri, Aug 13, 2021 at 9:29 PM Andres Freund  wrote:
> >
> > > > I think we can extend this API but I guess it is better to then do it
> > > > for dsm-based as well so that these get tracked via resowner.
> > >
> > > DSM segments are resowner managed already, so it's not obvious that 
> > > that'd buy
> > > us much? Although I guess there could be a few edge cases that'd look 
> > > cleaner,
> > > because we could reliably trigger cleanup in the leader instead of 
> > > relying on
> > > dsm detach hooks + refcounts to manage when a set is physically deleted?
> > >
> >
> > In an off-list discussion with Thomas and Amit, we tried to discuss
> > how to clean up the shared files set in the current use case.  Thomas
> > suggested that instead of using individual shared fileset for storing
> > the data for each XID why don't we just create a single shared fileset
> > for complete worker lifetime and when the worker is exiting we can
> > just remove that shared fileset.  And for storing XID data, we can
> > just create the files under the same shared fileset and delete those
> > files when we longer need them.  I like this idea and it looks much
> > cleaner, after this, we can get rid of the special cleanup mechanism
> > using 'filesetlist'.  I have attached a patch for the same.
> >
>
> It seems to me that this idea would obviate any need for resource
> owners as we will have only one fileset now. I have a few initial
> comments on the patch:
>
> 1.
> + /* do cleanup on worker exit (e.g. after DROP SUBSCRIPTION) */
> + on_shmem_exit(worker_cleanup, (Datum) 0);
>
> It should be registered with before_shmem_exit() hook to allow sending
> stats for file removal.

Done

> 2. After these changes, the comments atop stream_open_file and
> SharedFileSetInit need to be changed.

Done

> 3. In function subxact_info_write(), we are computing subxact file
> path twice which doesn't seem to be required.

Fixed

> 4.
> + if (missing_ok)
> + return NULL;
>   ereport(ERROR,
>   (errcode_for_file_access(),
> - errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
> + errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
>   segment_name, name)));
>
> There seems to be a formatting issue with errmsg. Also, it is better
> to keep an empty line before ereport.

Done

> 5. How can we provide a strict mechanism to not allow to use dsm APIs
> for non-dsm FileSet? One idea could be that we can have a variable
> (probably bool) in SharedFileSet structure which will be initialized
> in SharedFileSetInit based on whether the caller has provided dsm
> segment. Then in other DSM-based APIs, we can check if it is used for
> the wrong type.

Yeah, we can do something like that, can't we just use an existing
variable instead of adding new, e.g. refcnt is required only when
multiple processes are attached, so maybe if dsm segment is not passed
then we can keep refcnt as 0 and based on we can give an error.  For
example, if we try to call SharedFileSetAttach for the SharedFileSet
which has refcnt as 0 then we error out?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v2-0001-Better-usage-of-sharedfileset-in-apply-worker.patch
Description: Binary data


Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread Dilip Kumar
On Tue, Aug 17, 2021 at 12:06 PM Amit Kapila  wrote:

> One more comment:
> @@ -2976,39 +2952,17 @@ subxact_info_write(Oid subid, TransactionId xid)
> ..
> + /* Try to open the subxact file, if it doesn't exist then create it */
> + fd = BufFileOpenShared(xidfileset, path, O_RDWR, true);
> + if (fd == NULL)
> + fd = BufFileCreateShared(xidfileset, path);
> ..
>
> Instead of trying to create the file here based on whether it exists
> or not, can't we create it in subxact_info_add where we are first time
> allocating memory for subxacts? If that works then in the above code,
> the file should always exist.

One problem with this approach is that for now we delay creating the
subxact file till the end of the stream and if by end of the stream
all the subtransactions got aborted within the same stream then we
don't even create that file.  But with this suggestion, we will always
create the file as soon as we get the first subtransaction.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-17 Thread Amit Kapila
On Tue, Aug 17, 2021 at 10:54 AM Amit Kapila  wrote:
>
> On Mon, Aug 16, 2021 at 8:18 PM Dilip Kumar  wrote:
> >
> > On Fri, Aug 13, 2021 at 9:29 PM Andres Freund  wrote:
> >
> > > > I think we can extend this API but I guess it is better to then do it
> > > > for dsm-based as well so that these get tracked via resowner.
> > >
> > > DSM segments are resowner managed already, so it's not obvious that 
> > > that'd buy
> > > us much? Although I guess there could be a few edge cases that'd look 
> > > cleaner,
> > > because we could reliably trigger cleanup in the leader instead of 
> > > relying on
> > > dsm detach hooks + refcounts to manage when a set is physically deleted?
> > >
> >
> > In an off-list discussion with Thomas and Amit, we tried to discuss
> > how to clean up the shared files set in the current use case.  Thomas
> > suggested that instead of using individual shared fileset for storing
> > the data for each XID why don't we just create a single shared fileset
> > for complete worker lifetime and when the worker is exiting we can
> > just remove that shared fileset.  And for storing XID data, we can
> > just create the files under the same shared fileset and delete those
> > files when we longer need them.  I like this idea and it looks much
> > cleaner, after this, we can get rid of the special cleanup mechanism
> > using 'filesetlist'.  I have attached a patch for the same.
> >
>
> It seems to me that this idea would obviate any need for resource
> owners as we will have only one fileset now. I have a few initial
> comments on the patch:
>

One more comment:
@@ -2976,39 +2952,17 @@ subxact_info_write(Oid subid, TransactionId xid)
..
+ /* Try to open the subxact file, if it doesn't exist then create it */
+ fd = BufFileOpenShared(xidfileset, path, O_RDWR, true);
+ if (fd == NULL)
+ fd = BufFileCreateShared(xidfileset, path);
..

Instead of trying to create the file here based on whether it exists
or not, can't we create it in subxact_info_add where we are first time
allocating memory for subxacts? If that works then in the above code,
the file should always exist.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-16 Thread Amit Kapila
On Mon, Aug 16, 2021 at 8:18 PM Dilip Kumar  wrote:
>
> On Fri, Aug 13, 2021 at 9:29 PM Andres Freund  wrote:
>
> > > I think we can extend this API but I guess it is better to then do it
> > > for dsm-based as well so that these get tracked via resowner.
> >
> > DSM segments are resowner managed already, so it's not obvious that that'd 
> > buy
> > us much? Although I guess there could be a few edge cases that'd look 
> > cleaner,
> > because we could reliably trigger cleanup in the leader instead of relying 
> > on
> > dsm detach hooks + refcounts to manage when a set is physically deleted?
> >
>
> In an off-list discussion with Thomas and Amit, we tried to discuss
> how to clean up the shared files set in the current use case.  Thomas
> suggested that instead of using individual shared fileset for storing
> the data for each XID why don't we just create a single shared fileset
> for complete worker lifetime and when the worker is exiting we can
> just remove that shared fileset.  And for storing XID data, we can
> just create the files under the same shared fileset and delete those
> files when we longer need them.  I like this idea and it looks much
> cleaner, after this, we can get rid of the special cleanup mechanism
> using 'filesetlist'.  I have attached a patch for the same.
>

It seems to me that this idea would obviate any need for resource
owners as we will have only one fileset now. I have a few initial
comments on the patch:

1.
+ /* do cleanup on worker exit (e.g. after DROP SUBSCRIPTION) */
+ on_shmem_exit(worker_cleanup, (Datum) 0);

It should be registered with before_shmem_exit() hook to allow sending
stats for file removal.

2. After these changes, the comments atop stream_open_file and
SharedFileSetInit need to be changed.

3. In function subxact_info_write(), we are computing subxact file
path twice which doesn't seem to be required.

4.
+ if (missing_ok)
+ return NULL;
  ereport(ERROR,
  (errcode_for_file_access(),
- errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
+ errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
  segment_name, name)));

There seems to be a formatting issue with errmsg. Also, it is better
to keep an empty line before ereport.

5. How can we provide a strict mechanism to not allow to use dsm APIs
for non-dsm FileSet? One idea could be that we can have a variable
(probably bool) in SharedFileSet structure which will be initialized
in SharedFileSetInit based on whether the caller has provided dsm
segment. Then in other DSM-based APIs, we can check if it is used for
the wrong type.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-16 Thread Dilip Kumar
On Fri, Aug 13, 2021 at 9:29 PM Andres Freund  wrote:

> > I think we can extend this API but I guess it is better to then do it
> > for dsm-based as well so that these get tracked via resowner.
>
> DSM segments are resowner managed already, so it's not obvious that that'd buy
> us much? Although I guess there could be a few edge cases that'd look cleaner,
> because we could reliably trigger cleanup in the leader instead of relying on
> dsm detach hooks + refcounts to manage when a set is physically deleted?
>

In an off-list discussion with Thomas and Amit, we tried to discuss
how to clean up the shared files set in the current use case.  Thomas
suggested that instead of using individual shared fileset for storing
the data for each XID why don't we just create a single shared fileset
for complete worker lifetime and when the worker is exiting we can
just remove that shared fileset.  And for storing XID data, we can
just create the files under the same shared fileset and delete those
files when we longer need them.  I like this idea and it looks much
cleaner, after this, we can get rid of the special cleanup mechanism
using 'filesetlist'.  I have attached a patch for the same.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


0001-Better-usage-of-sharedfileset-in-apply-worker.patch
Description: Binary data


Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-13 Thread Andres Freund
Hi,

(dropping -committers to avoid moderation stalls due xposting to multiple lists 
-
I find that more annoying than helpful)

On 2021-08-13 14:38:37 +0530, Amit Kapila wrote:
> > What I'm wondering is why it is a good idea to have a SharedFileSet specific
> > cleanup mechanism. One that only operates on process lifetime level, rather
> > than something more granular. I get that the of the files here needs to be
> > longer than a transaction, but that can easily be addressed by having a 
> > longer
> > lived resource owner.
> >
> > Process lifetime may work well for the current worker.c, but even there it
> > doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
> > connection errors or configuration changes without restarting the worker, in
> > which case process lifetime obviously isn't a good idea anymore.
> >
> 
> I don't deny that we can't make this at a more granular level. IIRC,
> at that time, we tried to follow AtProcExit_Files which cleans up temp
> files at proc exit and we needed something similar for temporary files
> used via SharedFileSet.

The comparison to AtProcExit_Files isn't convincing to me - normally temp
files are cleaned up long before that via resowner cleanup.

To me the reuse of SharedFileSet for worker.c as executed seems like a bad
design. As things stand there's little code shared between dsm/non-dsm shared
file sets. The cleanup semantics are entirely different. Several functions
don't work if used on the "wrong kind" of set (e.g. SharedFileSetAttach()).


> I think we can extend this API but I guess it is better to then do it
> for dsm-based as well so that these get tracked via resowner.

DSM segments are resowner managed already, so it's not obvious that that'd buy
us much? Although I guess there could be a few edge cases that'd look cleaner,
because we could reliably trigger cleanup in the leader instead of relying on
dsm detach hooks + refcounts to manage when a set is physically deleted?

Greetings,

Andres Freund




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-13 Thread Amit Kapila
On Thu, Aug 12, 2021 at 6:24 PM Andres Freund  wrote:
>
> On 2021-08-12 05:48:19 -0700, Andres Freund wrote:
> > I think SharedFileSetInit() needs a comment explaining that it needs to be
> > called in a process-lifetime memory context if used without dsm
> > segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
> > already freed memory (both for filesetlist and the SharedFileSet itself).
>
> Oh. And I think it's not ok that SharedFileSetDeleteAll() unconditionally does
> SharedFileSetUnregister(). SharedFileSetUnregister() asserts out if there's no
> match, but DSM based sets are never entered into filesetlist. So one cannot
> have a non-DSM and DSM set coexisting. Which seems surprising.
>

Oops, it should be allowed to have both non-DSM and DSM set
coexisting. I think we can remove Assert from
SharedFileSetUnregister(). The other way could be to pass a parameter
to SharedFileSetDeleteAll() to tell whether to unregister or not.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-13 Thread Amit Kapila
On Thu, Aug 12, 2021 at 6:18 PM Andres Freund  wrote:
>
> Hi,
>
> On 2021-08-12 15:06:23 +0530, Amit Kapila wrote:
> > On Thu, Aug 12, 2021 at 1:52 PM Andres Freund  wrote:
> > > I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> > > first place? ISTM that this should be dealt with using resowners, rathers 
> > > than
> > > a sharedfileset specific mechanism?
> > >
>
> > The underlying temporary files need to be closed at xact end but need
> > to survive across transactions.
>
> Why do they need to be closed at xact end? To avoid wasting memory due to too
> many buffered files?
>

Yes.

>
> > These are registered with the resource owner via
> > PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and then closed
> > at xact end. So, we need a way to remove the files used by the process
> > (apply worker in this particular case) before process exit and used
> > this proc_exit hook (possibly on the lines of AtProcExit_Files).
>
> What I'm wondering is why it is a good idea to have a SharedFileSet specific
> cleanup mechanism. One that only operates on process lifetime level, rather
> than something more granular. I get that the of the files here needs to be
> longer than a transaction, but that can easily be addressed by having a longer
> lived resource owner.
>
> Process lifetime may work well for the current worker.c, but even there it
> doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
> connection errors or configuration changes without restarting the worker, in
> which case process lifetime obviously isn't a good idea anymore.
>

I don't deny that we can't make this at a more granular level. IIRC,
at that time, we tried to follow AtProcExit_Files which cleans up temp
files at proc exit and we needed something similar for temporary files
used via SharedFileSet. I think we can extend this API but I guess it
is better to then do it for dsm-based as well so that these get
tracked via resowner.

>
> I think SharedFileSetInit() needs a comment explaining that it needs to be
> called in a process-lifetime memory context if used without dsm
> segments.
>

We already have a comment about proc_exit clean up of files but will
extend that a bit about memory context.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-12 Thread Andres Freund
Hi,

On 2021-08-12 05:48:19 -0700, Andres Freund wrote:
> I think SharedFileSetInit() needs a comment explaining that it needs to be
> called in a process-lifetime memory context if used without dsm
> segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
> already freed memory (both for filesetlist and the SharedFileSet itself).

Oh. And I think it's not ok that SharedFileSetDeleteAll() unconditionally does
SharedFileSetUnregister(). SharedFileSetUnregister() asserts out if there's no
match, but DSM based sets are never entered into filesetlist. So one cannot
have a non-DSM and DSM set coexisting. Which seems surprising.

Greetings,

Andres Freund




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-12 Thread Andres Freund
Hi,

On 2021-08-12 15:06:23 +0530, Amit Kapila wrote:
> On Thu, Aug 12, 2021 at 1:52 PM Andres Freund  wrote:
> > I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> > first place? ISTM that this should be dealt with using resowners, rathers 
> > than
> > a sharedfileset specific mechanism?
> >

> The underlying temporary files need to be closed at xact end but need
> to survive across transactions.

Why do they need to be closed at xact end? To avoid wasting memory due to too
many buffered files?


> These are registered with the resource owner via
> PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and then closed
> at xact end. So, we need a way to remove the files used by the process
> (apply worker in this particular case) before process exit and used
> this proc_exit hook (possibly on the lines of AtProcExit_Files).

What I'm wondering is why it is a good idea to have a SharedFileSet specific
cleanup mechanism. One that only operates on process lifetime level, rather
than something more granular. I get that the of the files here needs to be
longer than a transaction, but that can easily be addressed by having a longer
lived resource owner.

Process lifetime may work well for the current worker.c, but even there it
doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
connection errors or configuration changes without restarting the worker, in
which case process lifetime obviously isn't a good idea anymore.


I think SharedFileSetInit() needs a comment explaining that it needs to be
called in a process-lifetime memory context if used without dsm
segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
already freed memory (both for filesetlist and the SharedFileSet itself).

Greetings,

Andres Freund




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 1:52 PM Andres Freund  wrote:
>
> On 2021-08-12 11:46:09 +0530, Amit Kapila wrote:
> > On Thu, Aug 12, 2021 at 11:38 AM Dilip Kumar  wrote:
> > > On Thu, Aug 12, 2021 at 7:39 AM Masahiko Sawada  
> > > wrote:
> > > > It seems to me that moving the shared fileset cleanup to
> > > > before_shmem_exit() is the right approach to fix this problem. The
> > > > issue is fixed by the attached patch.
> > >
> > > +1, the fix makes sense to me.
>
> I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> first place? ISTM that this should be dealt with using resowners, rathers than
> a sharedfileset specific mechanism?
>

The underlying temporary files need to be closed at xact end but need
to survive across transactions. These are registered with the resource
owner via PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and
then closed at xact end. So, we need a way to remove the files used by
the process (apply worker in this particular case) before process exit
and used this proc_exit hook (possibly on the lines of
AtProcExit_Files).

> That said, I think it's fine to go for the ordering change in the short term.
>
>
> > I have also tested and fix works for me. The fix works because
> > pgstat_initialize() is called before we register clean up in
> > SharedFileSetInit(). I am not sure if we need an Assert to ensure that
> > and if so how we can do that? Any suggestions?
>
> I don't think we need to assert that - we'd see failures soon enough if
> that rule were violated...
>

Fair enough.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-12 Thread Andres Freund
Hi,

On 2021-08-12 11:46:09 +0530, Amit Kapila wrote:
> On Thu, Aug 12, 2021 at 11:38 AM Dilip Kumar  wrote:
> > On Thu, Aug 12, 2021 at 7:39 AM Masahiko Sawada  
> > wrote:
> > > It seems to me that moving the shared fileset cleanup to
> > > before_shmem_exit() is the right approach to fix this problem. The
> > > issue is fixed by the attached patch.
> >
> > +1, the fix makes sense to me.

I'm not so sure. Why does sharedfileset have its own proc exit hook in the
first place? ISTM that this should be dealt with using resowners, rathers than
a sharedfileset specific mechanism?

That said, I think it's fine to go for the ordering change in the short term.


> I have also tested and fix works for me. The fix works because
> pgstat_initialize() is called before we register clean up in
> SharedFileSetInit(). I am not sure if we need an Assert to ensure that
> and if so how we can do that? Any suggestions?

I don't think we need to assert that - we'd see failures soon enough if
that rule were violated...

Greetings,

Andres Freund