Re: Add index scan progress to pg_stat_progress_vacuum

2023-07-10 Thread Masahiko Sawada
On Thu, Jul 6, 2023 at 2:28 PM Masahiko Sawada  wrote:
>
> On Thu, Jul 6, 2023 at 11:15 AM Michael Paquier  wrote:
> >
> > On Thu, Jul 06, 2023 at 11:07:14AM +0900, Masahiko Sawada wrote:
> > > Thank you for updating the patch. It looks good to me too. I've
> > > updated the commit message.
> >
> > Thanks.  I was planning to review this patch today and/or tomorrow now
> > that my stack of things to do is getting slightly lower (I registered my
> > name as committer as well a few weeks ago so as not to forget).
> >
> > One thing I was planning to do is to move the new message processing
> > API for the incremental updates into its own commit for clarity, as
> > that's a separate concept from the actual feature, useful on its own.
>
> +1. I had the same idea. Please find the attached patches.
>
> > Anyway, would you prefer taking care of it?
>
> I can take care of it if you're okay with that.
>

Pushed.

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-07-05 Thread Masahiko Sawada
On Thu, Jul 6, 2023 at 11:15 AM Michael Paquier  wrote:
>
> On Thu, Jul 06, 2023 at 11:07:14AM +0900, Masahiko Sawada wrote:
> > Thank you for updating the patch. It looks good to me too. I've
> > updated the commit message.
>
> Thanks.  I was planning to review this patch today and/or tomorrow now
> that my stack of things to do is getting slightly lower (I registered my
> name as committer as well a few weeks ago so as not to forget).
>
> One thing I was planning to do is to move the new message processing
> API for the incremental updates into its own commit for clarity, as
> that's a separate concept from the actual feature, useful on its own.

+1. I had the same idea. Please find the attached patches.

> Anyway, would you prefer taking care of it?

I can take care of it if you're okay with that.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


v31-0002-Report-index-vacuum-progress.patch
Description: Binary data


v31-0001-Add-new-parallel-message-type-to-progress-report.patch
Description: Binary data


Re: Add index scan progress to pg_stat_progress_vacuum

2023-07-05 Thread Michael Paquier
On Thu, Jul 06, 2023 at 11:07:14AM +0900, Masahiko Sawada wrote:
> Thank you for updating the patch. It looks good to me too. I've
> updated the commit message.

Thanks.  I was planning to review this patch today and/or tomorrow now
that my stack of things to do is getting slightly lower (I registered my
name as committer as well a few weeks ago so as not to forget).

One thing I was planning to do is to move the new message processing
API for the incremental updates into its own commit for clarity, as
that's a separate concept from the actual feature, useful on its own.

Anyway, would you prefer taking care of it?
--
Michael




Re: Add index scan progress to pg_stat_progress_vacuum

2023-07-05 Thread Masahiko Sawada
On Wed, Apr 12, 2023 at 9:22 PM Imseih (AWS), Sami  wrote:
>
> > This should be OK (also checked the code paths where the reports are
> > added). Note that the patch needed a few adjustments for its
> > indentation.
>
> Thanks for the formatting corrections! This looks good to me.

Thank you for updating the patch. It looks good to me too. I've
updated the commit message.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From f92929f21d4052cfbc3b068ca5b8be954741da3d Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 12 Apr 2023 13:45:59 +0900
Subject: [PATCH v30] Report index vacuum progress.

This commit adds two columns: indexes_total and indexes_processed, to
pg_stat_progress_vacuum system view to show the index vacuum
progress. These numbers are reported in the "vacuuming indexes" and
"cleaning up indexes" phases.

It also adds a new type of parallel message, 'P', which is used to
convey the progress updates made by parallel workers to the leader
process; the workers' progress updates are therefore reflected in an
asynchronous manner. Currently it supports only incremental progress
reporting, but it could be extended to support other backend progress
APIs in the future.

Authors: Sami Imseih
Reviewed by: Masahiko Sawada, Michael Paquier, Nathan Bossart, Andres Freund
Discussion: https://www.postgresql.org/message-id/flat/5478DFCD-2333-401A-B2F0-0D186AB09228@amazon.com
---
 doc/src/sgml/monitoring.sgml  | 23 ++
 src/backend/access/heap/vacuumlazy.c  | 70 ---
 src/backend/access/transam/parallel.c | 18 +
 src/backend/catalog/system_views.sql  |  3 +-
 src/backend/commands/vacuumparallel.c |  9 ++-
 src/backend/utils/activity/backend_progress.c | 32 +
 src/include/commands/progress.h   |  2 +
 src/include/utils/backend_progress.h  |  1 +
 src/test/regress/expected/rules.out   |  4 +-
 9 files changed, 150 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 506aeaa879..588b720f57 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6110,6 +6110,29 @@ FROM pg_stat_get_backend_idset() AS backendid;
Number of dead tuples collected since the last index vacuum cycle.
   
  
+
+ 
+  
+   indexes_total bigint
+  
+  
+   Total number of indexes that will be vacuumed or cleaned up. This
+   number is reported at the beginning of the
+   vacuuming indexes phase or the
+   cleaning up indexes phase.
+  
+ 
+
+ 
+  
+   indexes_processed bigint
+  
+  
+   Number of indexes processed. This counter only advances when the
+   phase is vacuuming indexes or
+   cleaning up indexes.
+  
+ 
 

   
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 4eb953f904..6a41ee635d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2319,6 +2319,17 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 {
 	bool		allindexes = true;
 	double		old_live_tuples = vacrel->rel->rd_rel->reltuples;
+	const int	progress_start_index[] = {
+		PROGRESS_VACUUM_PHASE,
+		PROGRESS_VACUUM_INDEXES_TOTAL
+	};
+	const int	progress_end_index[] = {
+		PROGRESS_VACUUM_INDEXES_TOTAL,
+		PROGRESS_VACUUM_INDEXES_PROCESSED,
+		PROGRESS_VACUUM_NUM_INDEX_VACUUMS
+	};
+	int64		progress_start_val[2];
+	int64		progress_end_val[3];
 
 	Assert(vacrel->nindexes > 0);
 	Assert(vacrel->do_index_vacuuming);
@@ -2331,9 +2342,13 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 		return false;
 	}
 
-	/* Report that we are now vacuuming indexes */
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
- PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
+	/*
+	 * Report that we are now vacuuming indexes and the number of indexes to
+	 * vacuum.
+	 */
+	progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX;
+	progress_start_val[1] = vacrel->nindexes;
+	pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val);
 
 	if (!ParallelVacuumIsActive(vacrel))
 	{
@@ -2346,6 +2361,10 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 		  old_live_tuples,
 		  vacrel);
 
+			/* Report the number of indexes vacuumed */
+			pgstat_progress_update_param(PROGRESS_VACUUM_INDEXES_PROCESSED,
+		 idx + 1);
+
 			if (lazy_check_wraparound_failsafe(vacrel))
 			{
 /* Wraparound emergency -- end current index scan */
@@ -2380,14 +2399,17 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
-	 * Increase and report the number of index scans.
+	 * Increase and report the number of index scans.  Also, we reset
+	 * PROGRESS_VACUUM_INDEXES_TOTAL and PROGRESS_VACUUM_INDEXES_PROCESSED.
 	 *
 	 * We deliberately include the case where we started a round of bulk
 	 * deletes that 

Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-12 Thread Imseih (AWS), Sami
> This should be OK (also checked the code paths where the reports are
> added). Note that the patch needed a few adjustments for its
> indentation.

Thanks for the formatting corrections! This looks good to me.

--
Sami





Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-11 Thread Michael Paquier
On Mon, Apr 10, 2023 at 07:20:42PM +, Imseih (AWS), Sami wrote:
> See v28 addressing the comments.

This should be OK (also checked the code paths where the reports are
added).  Note that the patch needed a few adjustments for its
indentation.
--
Michael
From 9267342ba2a1b8b120efaa98871b736a5bbde9a9 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 12 Apr 2023 13:45:59 +0900
Subject: [PATCH v29] Report index vacuum progress.

Add two new columns to pg_stat_progress_vacuum:
indexes_total, the total number of indexes to be vacuumed or cleaned up, and
indexes_processed, the number of indexes vacuumed or cleaned up so far.

Authors: Sami Imseih
Reviewed by: Masahiko Sawada, Michael Paquier, Nathan Bossart, Andres Freund
Discussion: https://www.postgresql.org/message-id/flat/5478dfcd-2333-401a-b2f0-0d186ab09...@amazon.com
---
 src/include/commands/progress.h   |  2 +
 src/include/utils/backend_progress.h  |  1 +
 src/backend/access/heap/vacuumlazy.c  | 70 ---
 src/backend/access/transam/parallel.c | 18 +
 src/backend/catalog/system_views.sql  |  3 +-
 src/backend/commands/vacuumparallel.c |  9 ++-
 src/backend/utils/activity/backend_progress.c | 32 +
 src/test/regress/expected/rules.out   |  4 +-
 doc/src/sgml/monitoring.sgml  | 23 ++
 9 files changed, 150 insertions(+), 12 deletions(-)

diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index e5add41352..2478e87425 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -25,6 +25,8 @@
 #define PROGRESS_VACUUM_NUM_INDEX_VACUUMS		4
 #define PROGRESS_VACUUM_MAX_DEAD_TUPLES			5
 #define PROGRESS_VACUUM_NUM_DEAD_TUPLES			6
+#define PROGRESS_VACUUM_INDEXES_TOTAL			7
+#define PROGRESS_VACUUM_INDEXES_PROCESSED		8
 
 /* Phases of vacuum (as advertised via PROGRESS_VACUUM_PHASE) */
 #define PROGRESS_VACUUM_PHASE_SCAN_HEAP			1
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index a84752ade9..70dea55fc0 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -37,6 +37,7 @@ extern void pgstat_progress_start_command(ProgressCommandType cmdtype,
 		  Oid relid);
 extern void pgstat_progress_update_param(int index, int64 val);
 extern void pgstat_progress_incr_param(int index, int64 incr);
+extern void pgstat_progress_parallel_incr_param(int index, int64 incr);
 extern void pgstat_progress_update_multi_param(int nparam, const int *index,
 			   const int64 *val);
 extern void pgstat_progress_end_command(void);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0a9ebd22bd..e5707098bd 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2314,6 +2314,17 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 {
 	bool		allindexes = true;
 	double		old_live_tuples = vacrel->rel->rd_rel->reltuples;
+	const int	progress_start_index[] = {
+		PROGRESS_VACUUM_PHASE,
+		PROGRESS_VACUUM_INDEXES_TOTAL
+	};
+	const int	progress_end_index[] = {
+		PROGRESS_VACUUM_INDEXES_TOTAL,
+		PROGRESS_VACUUM_INDEXES_PROCESSED,
+		PROGRESS_VACUUM_NUM_INDEX_VACUUMS
+	};
+	int64		progress_start_val[2];
+	int64		progress_end_val[3];
 
 	Assert(vacrel->nindexes > 0);
 	Assert(vacrel->do_index_vacuuming);
@@ -2326,9 +2337,13 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 		return false;
 	}
 
-	/* Report that we are now vacuuming indexes */
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
- PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
+	/*
+	 * Report that we are now vacuuming indexes and the number of indexes to
+	 * vacuum.
+	 */
+	progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX;
+	progress_start_val[1] = vacrel->nindexes;
+	pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val);
 
 	if (!ParallelVacuumIsActive(vacrel))
 	{
@@ -2341,6 +2356,10 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 		  old_live_tuples,
 		  vacrel);
 
+			/* Report the number of indexes vacuumed */
+			pgstat_progress_update_param(PROGRESS_VACUUM_INDEXES_PROCESSED,
+		 idx + 1);
+
 			if (lazy_check_wraparound_failsafe(vacrel))
 			{
 /* Wraparound emergency -- end current index scan */
@@ -2375,14 +2394,17 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
-	 * Increase and report the number of index scans.
+	 * Increase and report the number of index scans.  Also, we reset
+	 * PROGRESS_VACUUM_INDEXES_TOTAL and PROGRESS_VACUUM_INDEXES_PROCESSED.
 	 *
 	 * We deliberately include the case where we started a round of bulk
 	 * deletes that we weren't able to finish due to the failsafe triggering.
 	 */
 	vacrel->num_index_scans++;
-	pgstat_progress_update_param(PROGRESS_VACUUM_NUM_INDEX_VACUUMS,
- vacrel->num_index_scans);
+	progress_end_val[0] 

Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-10 Thread Andres Freund
Hi,

On 2023-04-10 08:14:18 +0900, Michael Paquier wrote:
> On Fri, Apr 07, 2023 at 12:01:17PM -0700, Andres Freund wrote:
> > Why would it mean that? Parallel workers are updated together with the 
> > leader,
> > so there's no compatibility issue?
> 
> My point is that the callback system would still need to be maintained
> in a stable branch, and, while useful, it could be used for much more
> than it is originally written.  I guess that this could be used in
> custom nodes with their own custom parallel nodes.

Hm, I'm somewhat doubtful that that's something we should encourage. And
doubtful we'd get it right without a concrete use case at hand to verify the
design.

Greetings,

Andres Freund




Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-10 Thread Imseih (AWS), Sami
> + case 'P': /* Parallel progress reporting */

I kept this comment as-is, but added more comments inside the case
code block to avoid cluttering the one-line comment.

> + * Increase and report the number of index scans. Also, we reset the progress
> + * counters.


> The counters reset are the two index counts, perhaps this comment
> should mention this fact.

Yes, since we are using the multi_param API here, it makes sense to 
mention the progress fields being reset in the comments.


+ /* update progress */
+ int index = pq_getmsgint(msg, 4);
+ int incr = pq_getmsgint(msg, 1);
[...]
+ pq_beginmessage(&progress_message, 'P');
+ pq_sendint32(&progress_message, index);
+ pq_sendint64(&progress_message, incr);
+ pq_endmessage(&progress_message);


> It seems to me that the receiver side is missing one pq_getmsgend()?
Yes. I added this.

> incr is defined and sent as an int64 on the sender side, hence the
> receiver should use pq_getmsgint64(), no? pq_getmsgint(msg, 1) means
> to receive only one byte, see pqformat.c. 

Ah, correct: incr is an int64, so what we need is:

int64 incr = pq_getmsgint64(msg);

I also added the pq_getmsgend call.


> And the order is reversed?

I don't think so. The index and then incr are sent, and they are read
back in the same order. Testing the patch shows the value increments
correctly.


See v28 addressing the comments.

Regards,

Sami Imseih 
AWS (Amazon Web Services)




v28-0001-Report-index-vacuum-progress.patch
Description: v28-0001-Report-index-vacuum-progress.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-09 Thread Michael Paquier
On Fri, Apr 07, 2023 at 12:01:17PM -0700, Andres Freund wrote:
> Why would it mean that? Parallel workers are updated together with the leader,
> so there's no compatibility issue?

My point is that the callback system would still need to be maintained
in a stable branch, and, while useful, it could be used for much more
than it is originally written.  I guess that this could be used in
custom nodes with their own custom parallel nodes.
--
Michael




Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-09 Thread Michael Paquier
On Fri, Apr 07, 2023 at 07:27:17PM +, Imseih (AWS), Sami wrote:
> If called by a worker, it will send a 'P' message to the leader,
> passing both the progress index, i.e. PROGRESS_VACUUM_INDEXES_PROCESSED,
> and the value to increment by, i.e. 1 for index vacuum progress.
> 
> With that, the additional shared memory counters in ParallelVacuumState
> are not needed, and the poke of the worker to the leader goes directly
> through a generic backend_progress API.

Thanks for the new version.  This has unfortunately not been able to
make the cut for v16, but let's see it done at the beginning of the
v17 cycle.

+void
+pgstat_progress_parallel_incr_param(int index, int64 incr)
+{
+   /*
+* Parallel workers notify a leader through a 'P'
+* protocol message to update progress, passing the
+* progress index and increment value. Leaders can
+* just call pgstat_progress_incr_param directly.
+*/
+   if (IsParallelWorker())
+   {
+   static StringInfoData progress_message;
+
+   initStringInfo(&progress_message);
+
+   pq_beginmessage(&progress_message, 'P');
+   pq_sendint32(&progress_message, index);
+   pq_sendint64(&progress_message, incr);
+   pq_endmessage(&progress_message);
+   }
+   else
+   pgstat_progress_incr_param(index, incr);
+}

I see.  You need to handle both the leader and worker case because
parallel_vacuum_process_one_index() can be called by either of them.

+   case 'P':   /* Parallel progress reporting */

Perhaps this comment should say that this is only about incremental
progress reporting, for the moment.

+* Increase and report the number of index scans. Also, we reset the
+* progress counters.

The counters reset are the two index counts, perhaps this comment
should mention this fact.

+   /* update progress */
+   int index = pq_getmsgint(msg, 4);
+   int incr = pq_getmsgint(msg, 1);
[...]
+   pq_beginmessage(&progress_message, 'P');
+   pq_sendint32(&progress_message, index);
+   pq_sendint64(&progress_message, incr);
+   pq_endmessage(&progress_message);

It seems to me that the receiver side is missing one pq_getmsgend()?
incr is defined and sent as an int64 on the sender side, hence the
receiver should use pq_getmsgint64(), no?  pq_getmsgint(msg, 1) means
to receive only one byte, see pqformat.c.  And the order is reversed?

There may be a case in the future about making 'P' more complicated
with more arguments, but what you have here should be sufficient for
your use-case.  Were there plans to increment more data for some
different and/or new progress indexes in the VACUUM path, by the way?
Most of that looked a bit tricky to me as this was AM-dependent, but I
may have missed something.
--
Michael




Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-07 Thread Imseih (AWS), Sami
> The arguments of pgstat_progress_update_param() would be given by the
> worker directly as components of the 'P' message. It seems to me that
> this approach would have the simplicity to not require the setup of a
> shmem area for the extra counters, and there would be no need for a
> callback. Hence, the only thing the code paths of workers would need
> to do is to call this routine, then the leaders would increment their
> progress when they see a CFI to process the 'P' message. Also, I
> guess that we would only need an interface in backend_progress.c to
> increment counters, like pgstat_progress_incr_param(), but usable by
> workers. Like a pgstat_progress_worker_incr_param()?

So, here is what I think should be workable to give a generic
progress interface.

pgstat_progress_parallel_incr_param will be a new API that
can be called by either a worker or the leader from any parallel
code path that chooses to increment a progress index.

If called by a worker, it will send a 'P' message to the leader,
passing both the progress index, i.e. PROGRESS_VACUUM_INDEXES_PROCESSED,
and the value to increment by, i.e. 1 for index vacuum progress.

With that, the additional shared memory counters in ParallelVacuumState
are not needed, and the poke of the worker to the leader goes directly
through a generic backend_progress API.

Let me know your thoughts.

Thanks!

--
Sami Imseih
Amazon Web Services (AWS)





v27-0001-Report-index-vacuum-progress.patch
Description: v27-0001-Report-index-vacuum-progress.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-07 Thread Andres Freund
Hi,

On 2023-04-06 12:28:04 +0900, Michael Paquier wrote:
> As some say, the introduction of a new message type in pqmq.c would be
> basically a one-way door, because we'd have to maintain it in a stable
> branch.

Why would it mean that? Parallel workers are updated together with the leader,
so there's no compatibility issue?

Greetings,

Andres Freund




Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-07 Thread Michael Paquier
On Thu, Apr 06, 2023 at 03:14:20PM +, Imseih (AWS), Sami wrote:
>> Could it be worth thinking about a different design where
>> the value incremented and the parameters of
>> pgstat_progress_update_param() are passed through the 'P' message
>> instead?
> 
> I am not sure how this is different from the approach suggested.
> In the current design, the 'P' message is used to pass the
> ParallelvacuumState to parallel_vacuum_update_progress which then
> calls pgstat_progress_update_param.

The arguments of pgstat_progress_update_param() would be given by the
worker directly as components of the 'P' message.  It seems to me that
this approach would have the simplicity to not require the setup of a
shmem area for the extra counters, and there would be no need for a
callback.  Hence, the only thing the code paths of workers would need
to do is to call this routine, then the leaders would increment their
progress when they see a CFI to process the 'P' message.  Also, I
guess that we would only need an interface in backend_progress.c to
increment counters, like pgstat_progress_incr_param(), but usable by
workers.  Like a pgstat_progress_worker_incr_param()?
--
Michael




Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-06 Thread Imseih (AWS), Sami
> For one thing, it introduces a dependency on parallel.h to do
> progress reporting without touching backend_progress.h.

Containing the logic in backend_progress.h is a reasonable point
from a maintenance standpoint.

We can create a new function in backend_progress.h called 
pgstat_progress_update_leader which is called from
vacuumparallel.c. 

pgstat_progress_update_leader can then call pq_putmessage('P', NULL, 0)

> Is a callback
> approach combined with a counter in shared memory the best thing there
> could be?  

It seems to be the best way.

The shared memory, ParallelVacuumState, is already tracking the
counters for the Parallel Vacuum.

Also, the callback in ParallelContext is the only way I can see
to let the 'P' message know what to do for updating progress
to the leader.


> Could it be worth thinking about a different design where
> the value incremented and the parameters of
> pgstat_progress_update_param() are passed through the 'P' message
> instead?

I am not sure how this is different from the approach suggested.
In the current design, the 'P' message is used to pass the
ParallelvacuumState to parallel_vacuum_update_progress which then
calls pgstat_progress_update_param.


> It strikes me that gathering data in the leader from a poke
> of the workers is something that could be useful in so much more areas
> than just the parallel index operations done in a vacuum because we do
> more and more things in parallel these days, so the API interface
> ought to have some attention.

We may eventually need an interface that does more than progress
reporting, but I am not sure what those use cases would be at
this point.


> There is also an argument where we could
> have each worker report their progress independently of the leader?

In this case, we don't need ParallelContext at all or to go through the
'P' message.


--
Regards,

Sami Imseih
Amazon Web Services (AWS)



Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-05 Thread Michael Paquier
On Wed, Apr 05, 2023 at 02:31:54PM +, Imseih (AWS), Sami wrote:
>> That seems a valid argument. I was thinking that such an asynchronous
>> state update mechanism would be a good infrastructure for progress
>> reporting of parallel operations. It might be worth considering it for
>> more general usage, but since the current implementation is minimal,
>> can we extend it in the future when we need it for other use
>> cases?
> 
> I don't think we should delay this patch to design a more general
> infrastructure. I agree this can be handled by a future requirement.

Not so sure I agree with that.  As the patch stands, we have a rather
general-purpose new message type and facility (a callback poke from
workers to their leader) used for a not-so-general purpose, while
being documented under that not-so-general purpose, which is
progress reporting.  I agree that relying on pqmq.c to make the
leader more responsive to refreshes is sensible, because it is
cheap, but the interface seems kind of misplaced to me.  For one
thing, it introduces a dependency on parallel.h to do progress
reporting without touching backend_progress.h.  Is a callback
approach combined with a counter in shared memory the best thing there
could be?  Could it be worth thinking about a different design where
the value incremented and the parameters of
pgstat_progress_update_param() are passed through the 'P' message
instead?  It strikes me that gathering data in the leader from a poke
of the workers is something that could be useful in so much more areas
than just the parallel index operations done in a vacuum because we do
more and more things in parallel these days, so the API interface
ought to have some attention.

As some say, the introduction of a new message type in pqmq.c would be
basically a one-way door, because we'd have to maintain it in a stable
branch.  I would not take that lightly.  One idea of interface that
could be used is an extra set of APIs for workers to do progress
reporting, part of backend_progress.h, where we use a pqmq message
type in a centralized location, say something like a
pgstat_progress_parallel_incr_param().

About the callback interface, we may also want to be more careful
about more things, like the handling of callback chains, or even
unregistrations of callbacks?  There could be much more uses to that
than just progress reporting, though this comes to a balance of what
the leader needs to do before the workers are able to poke at it on a
periodic basis to make the refresh of the aggregated progress
reporting data more verbose.  There is also an argument where we could
have each worker report their progress independently of the leader?
--
Michael




Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-05 Thread Imseih (AWS), Sami
> > The key point of the patch is here. From what I understand based on
> > the information of the thread, this is used as a way to make the
> > progress reporting done by the leader more responsive, so that we'd
> > update the index counters each time the leader is poked at with a 'P'
> > message by one of its workers, once a worker is done with the parallel
> > cleanup of one of the indexes. That's appealing, because this design
> > is responsive and cheap, while we'd rely on CFIs to make sure that the
> > leader triggers its callback on a timely basis. Unfortunately,
> > sticking a concept of "Parallel progress reporting" is rather
> > confusing here? This stuff can be used for much more purposes than
> > just progress reporting: the leader would execute the callback it has
> > registered based on the timing given by one or more of its workers,
> > these willing to push an event on the leader. Progress reporting is
> > one application of that to force a refresh and make the information of
> > the leader accurate. What about things like a chain of callbacks, for
> > example? Could the leader want to register more than one callback and
> > act on all of them with one single P message?


> That seems a valid argument. I was thinking that such an asynchronous
> state update mechanism would be a good infrastructure for progress
> reporting of parallel operations. It might be worth considering it for
> more general usage, but since the current implementation is minimal,
> can we extend it in the future when we need it for other use
> cases?

I don't think we should delay this patch to design a more general
infrastructure. I agree this can be handled by a future requirement.


> >
> > Another question I have: could the reporting of each individual worker
> > make sense on its own? The cleanup of the indexes depends on the
> > order they are processed, their number, size and AM with their cleanup
> > strategy, still there may be a point in getting information about how
> > much work a single worker is doing rather than just have the
> > aggregated information given to the leader?


> It would also be useful information for users, but I don't think it can
> substitute for the aggregated information. The aggregated information can
> answer questions from the user like "how many indexes to vacuum are
> remaining?", which helps estimate the remaining time to complete.

The original intention of the thread was to expose stats for both the
aggregate (leader-level) and individual index progress. Both the aggregate
and individual index information have benefits, as mentioned by Sawada-san.

For the individual index progress, a patch was suggested earlier in the
thread, v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch.

However, since this particular thread has focused mainly on the aggregated
stats work, my thoughts have been to start a new thread for the individual
index progress once this gets committed.


> > Btw, Is an assertion really helpful here? If
> > parallel_progress_callback is not set, we'd just crash one line
> > after. It seems to me that it could be cleaner to do nothing if a
> > leader gets a poke message from a worker if there are no callbacks
> > registered.

> Agreed.

I removed the assert and added an if condition instead.

See the attached v26 please.

Regards,

--
Sami Imseih
Amazon Web Services (AWS)





v26-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v26-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-05 Thread Masahiko Sawada
On Wed, Apr 5, 2023 at 4:47 PM Michael Paquier  wrote:
>
> On Fri, Feb 24, 2023 at 03:16:10PM +0900, Masahiko Sawada wrote:
> > Thanks! It looks good to me so I've marked it as Ready for Committer.
>
> +   case 'P':   /* Parallel progress reporting */
> +   {
> +   /* Call the progress reporting callback */
> +   Assert(pcxt->parallel_progress_callback);
> +   pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg);
> +
> +   break;
> +   }
>
> The key point of the patch is here.  From what I understand based on
> the information of the thread, this is used as a way to make the
> progress reporting done by the leader more responsive, so that we'd
> update the index counters each time the leader is poked at with a 'P'
> message by one of its workers, once a worker is done with the parallel
> cleanup of one of the indexes.  That's appealing, because this design
> is responsive and cheap, while we'd rely on CFIs to make sure that the
> leader triggers its callback on a timely basis.  Unfortunately,
> sticking a concept of "Parallel progress reporting" is rather
> confusing here?  This stuff can be used for much more purposes than
> just progress reporting: the leader would execute the callback it has
> registered based on the timing given by one or more of its workers,
> these willing to push an event on the leader.  Progress reporting is
> one application of that to force a refresh and make the information of
> the leader accurate.  What about things like a chain of callbacks, for
> example?  Could the leader want to register more than one callback and
> act on all of them with one single P message?

That seems like a valid argument. I was thinking that such an
asynchronous state update mechanism would be good infrastructure for
progress reporting of parallel operations. It might be worth
considering using it for more general purposes, but since the current
implementation is minimal, can we extend it in the future when we need
it for other use cases?

>
> Another question I have: could the reporting of each individual worker
> make sense on its own?  The cleanup of the indexes depends on the
> order they are processed, their number, size and AM with their cleanup
> strategy, still there may be a point in getting information about how
> much work a single worker is doing rather than just have the
> aggregated information given to the leader?

It would also be useful information for users, but I don't think it
can substitute for the aggregated information. The aggregated
information can answer user questions like "how many indexes are left
to vacuum?", which helps estimate the remaining time to completion.

>
> Btw, Is an assertion really helpful here?  If
> parallel_progress_callback is not set, we'd just crash one line
> after.  It seems to me that it could be cleaner to do nothing if a
> leader gets a poke message from a worker if there are no callbacks
> registered.

Agreed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-04-05 Thread Michael Paquier
On Fri, Feb 24, 2023 at 03:16:10PM +0900, Masahiko Sawada wrote:
> Thanks! It looks good to me so I've marked it as Ready for Committer.
 
+   case 'P':   /* Parallel progress reporting */
+   {
+       /* Call the progress reporting callback */
+       Assert(pcxt->parallel_progress_callback);
+       pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg);
+
+       break;
+   }

The key point of the patch is here.  From what I understand based on
the information of the thread, this is used as a way to make the
progress reporting done by the leader more responsive so as we'd
update the index counters each time the leader is poked at with a 'P'
message by one of its workers, once a worker is done with the parallel
cleanup of one of the indexes.  That's appealing, because this design
is responsive and cheap, while we'd rely on CFIs to make sure that the
leader triggers its callback on a timely basis.  Unfortunately,
sticking a concept of "Parallel progress reporting" is rather
confusing here?  This stuff can be used for much more purposes than
just progress reporting: the leader would execute the callback it has
registered based on the timing given by one or more of its workers,
these willing to push an event on the leader.  Progress reporting is
one application of that to force a refresh and make the information of
the leader accurate.  What about things like a chain of callbacks, for
example?  Could the leader want to register more than one callback and
act on all of them with one single P message?

Another question I have: could the reporting of each individual worker
make sense on its own?  The cleanup of the indexes depends on the
order they are processed, their number, size and AM with their cleanup
strategy, still there may be a point in getting information about how
much work a single worker is doing rather than just have the
aggregated information given to the leader?

Btw, Is an assertion really helpful here?  If
parallel_progress_callback is not set, we'd just crash one line
after.  It seems to me that it could be cleaner to do nothing if a
leader gets a poke message from a worker if there are no callbacks
registered.
--
Michael


signature.asc
Description: PGP signature


Re: Add index scan progress to pg_stat_progress_vacuum

2023-02-23 Thread Masahiko Sawada
On Tue, Feb 21, 2023 at 1:48 AM Imseih (AWS), Sami  wrote:
>
> Thanks!
>
> >  I think PROGRESS_VACUUM_INDEXES_TOTAL and
> >   PROGRESS_VACUUM_INDEXES_PROCESSED are better for consistency. The rest
> >   looks good to me.
>
> Took care of that in v25.

Thanks! It looks good to me so I've marked it as Ready for Committer.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-02-20 Thread Imseih (AWS), Sami
Thanks!

>  I think PROGRESS_VACUUM_INDEXES_TOTAL and
>   PROGRESS_VACUUM_INDEXES_PROCESSED are better for consistency. The rest
>   looks good to me.

Took care of that in v25. 

Regards

--
Sami Imseih
Amazon Web Services



v25-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v25-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-02-19 Thread Masahiko Sawada
On Sat, Feb 18, 2023 at 11:46 AM Imseih (AWS), Sami  wrote:
>
> Thanks for the review!
>
> >+ 
> >+  ParallelVacuumFinish
> >+  Waiting for parallel vacuum workers to finish index
> >vacuum.
> >+ 
>
> >This change is out-of-date.
>
> That was an oversight. Thanks for catching.
>
> >Total number of indexes that will be vacuumed or cleaned up. This
> >number is reported as of the beginning of the vacuuming indexes phase
> >or the cleaning up indexes phase.
>
> This is cleaner. I was being unnecessarily verbose in the original 
> description.
>
> >Number of indexes processed. This counter only advances when the phase
> >is vacuuming indexes or cleaning up indexes.
>
> I agree.
>
> >Also, index_processed sounds accurate to me. What do you think?
>
> At one point, I used index_processed, but decided to change it.
> "processed" makes sense also. I will use this.
>
> >I think these settings are not necessary since the pcxt is palloc0'ed.
>
> Good point.
>
> >Assert(pvs->pcxt->parallel_progress_callback_arg) looks wrong to me.
> >If 'arg' is NULL, a SEGV happens.
>
> Correct, Assert(pvs) is all that is needed.
>
> >I think it's better to update pvs->shared->nindexes_completed by both
> >leader and worker processes who processed the index.
>
> No reason for that, since only the leader process can report progress to
> backend_progress.
>
>
> >I think it's better to make the function type consistent with the
> >existing parallel_worker_main_type. How about
> >parallel_progress_callback_type?
>
> Yes, that makes sense.
>
> > I've attached a patch that incorporates the above comments and has
> >some suggestions of updating comments etc.
>
> I reviewed and incorporated these changes, with a slight change. See v24.
>
> -* Increase and report the number of index. Also, we reset the 
> progress
> -* counters.
>
>
> +* Increase and report the number of index scans. Also, we reset the 
> progress
> +* counters.
>
>
> Thanks

Thanks for updating the patch!

 #define PROGRESS_VACUUM_NUM_INDEX_VACUUMS  4
 #define PROGRESS_VACUUM_MAX_DEAD_TUPLES5
 #define PROGRESS_VACUUM_NUM_DEAD_TUPLES6
+#define PROGRESS_VACUUM_INDEX_TOTAL 7
+#define PROGRESS_VACUUM_INDEX_PROCESSED 8

-s.param7 AS num_dead_tuples
+s.param7 AS num_dead_tuples,
+s.param8 AS indexes_total,
+s.param9 AS indexes_processed

I think PROGRESS_VACUUM_INDEXES_TOTAL and
PROGRESS_VACUUM_INDEXES_PROCESSED are better for consistency. The rest
looks good to me.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-02-17 Thread Imseih (AWS), Sami
Thanks for the review!

>+ 
>+  ParallelVacuumFinish
>+  Waiting for parallel vacuum workers to finish index
>vacuum.
>+ 

>This change is out-of-date.

That was an oversight. Thanks for catching.

>Total number of indexes that will be vacuumed or cleaned up. This
>number is reported as of the beginning of the vacuuming indexes phase
>or the cleaning up indexes phase.

This is cleaner. I was being unnecessarily verbose in the original description.

>Number of indexes processed. This counter only advances when the phase
>is vacuuming indexes or cleaning up indexes.

I agree.

>Also, index_processed sounds accurate to me. What do you think?

At one point, I used index_processed, but decided to change it.
"processed" makes sense also. I will use this.

>I think these settings are not necessary since the pcxt is palloc0'ed.

Good point.

>Assert(pvs->pcxt->parallel_progress_callback_arg) looks wrong to me.
>If 'arg' is NULL, a SEGV happens.

Correct, Assert(pvs) is all that is needed.

>I think it's better to update pvs->shared->nindexes_completed by both
>leader and worker processes who processed the index.

No reason for that, since only the leader process can report progress to
backend_progress.


>I think it's better to make the function type consistent with the
>existing parallel_worker_main_type. How about
>parallel_progress_callback_type?

Yes, that makes sense.

> I've attached a patch that incorporates the above comments and has
>some suggestions of updating comments etc.

I reviewed and incorporated these changes, with a slight change. See v24.

-* Increase and report the number of index. Also, we reset the progress
-* counters.


+* Increase and report the number of index scans. Also, we reset the 
progress
+* counters.


Thanks

--
Sami Imseih
Amazon Web Services (AWS)



v24-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v24-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-02-16 Thread Masahiko Sawada
On Wed, Feb 8, 2023 at 11:03 AM Imseih (AWS), Sami  wrote:
>
> Hi,
>
> Thanks for your reply!
>
> I addressed the latest comments in v23.
>
> 1/ cleaned up the asserts as discussed.
> 2/ used pq_putmessage to send the message on index scan completion.

 Thank you for updating the patch! Here are some comments for v23 patch:

+ 
+  ParallelVacuumFinish
+  Waiting for parallel vacuum workers to finish index
vacuum.
+ 

This change is out-of-date.

---
+ 
+  
+   indexes_total bigint
+  
+  
+   Number of indexes that will be vacuumed or cleaned up. This value will be
+   0 if the phase is not vacuuming indexes
+   or cleaning up indexes, INDEX_CLEANUP
+   is set to OFF, index vacuum is skipped due to very
+   few dead tuples in the table, or vacuum failsafe is triggered.
+   See 
+   for more on vacuum failsafe.
+  
+ 

This explanation looks redundant: setting INDEX_CLEANUP to OFF
essentially means the vacuum doesn't enter the vacuuming indexes
phase. The same is true for the case of skipping index vacuum and
failsafe mode. How about the following?

Total number of indexes that will be vacuumed or cleaned up. This
number is reported as of the beginning of the vacuuming indexes phase
or the cleaning up indexes phase.

---
+ 
+  
+   indexes_completed bigint
+  
+  
+   Number of indexes vacuumed in the current vacuum cycle when the
+   phase is vacuuming indexes, or the number
+   of indexes cleaned up during the cleaning up indexes
+   phase.
+  
+ 

How about simplifying the description to something like:

Number of indexes processed. This counter only advances when the phase
is vacuuming indexes or cleaning up indexes.

Also, index_processed sounds accurate to me. What do you think?

---
+pcxt->parallel_progress_callback = NULL;
+pcxt->parallel_progress_callback_arg = NULL;

I think these settings are not necessary since the pcxt is palloc0'ed.

---
+void
+parallel_vacuum_update_progress(void *arg)
+{
+    ParallelVacuumState *pvs = (ParallelVacuumState *) arg;
+
+    Assert(!IsParallelWorker());
+    Assert(pvs->pcxt->parallel_progress_callback_arg);
+
+    if (pvs)
+        pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
+            pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1));
+}

Assert(pvs->pcxt->parallel_progress_callback_arg) looks wrong to me.
If 'arg' is NULL, a SEGV happens.

I think it's better to update pvs->shared->nindexes_completed by both
leader and worker processes who processed the index.

---
+/* progress callback definition */
+typedef void (*ParallelProgressCallback) (void *parallel_progress_callback_state);
+
 typedef void (*parallel_worker_main_type) (dsm_segment *seg, shm_toc *toc);

I think it's better to make the function type consistent with the
existing parallel_worker_main_type. How about
parallel_progress_callback_type?

I've attached a patch that incorporates the above comments and has
some suggestions of updating comments etc.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


fix_v23_masahiko.patch
Description: Binary data


Re: Add index scan progress to pg_stat_progress_vacuum

2023-02-07 Thread Imseih (AWS), Sami
Hi,

Thanks for your reply!

I addressed the latest comments in v23.

1/ cleaned up the asserts as discussed.
2/ used pq_putmessage to send the message on index scan completion.

Thanks

--
Sami Imseih
Amazon Web Services (AWS)



v23-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v23-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-02-01 Thread Masahiko Sawada
On Fri, Jan 20, 2023 at 11:38 PM Imseih (AWS), Sami  wrote:
>
> >Number of indexes that will be vacuumed or cleaned up. This counter only
> >advances when the phase is vacuuming indexes or cleaning up indexes.
>
> I agree, this reads better.
>
> ---
> -    /* Report that we are now vacuuming indexes */
> -    pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
> -                                 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
> +    /*
> +     * Report that we are now vacuuming indexes
> +     * and the number of indexes to vacuum.
> +     */
> +    progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX;
> +    progress_start_val[1] = vacrel->nindexes;
> +    pgstat_progress_update_multi_param(2, progress_start_index,
> +                                       progress_start_val);
>
> >According to our code style guideline[1], we limit line lengths so
> >that the code is readable in an 80-column window. Some comments
> >updated in this patch seem too short.
>
> I will correct this.
>
> >I think it's better to define "void *arg".
>
> Agree
>
> >---
> >+/*
> >+ * A Leader process that receives this 
> > message
> >+ * must be ready to update progress.
> >+ */
> >+
> > Assert(pcxt->parallel_progress_callback);
> >+
> > Assert(pcxt->parallel_progress_callback_arg);
> >+
> >+/* Report progress */
> >+
> >pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg);
>
> >I think the parallel query infra should not require
> >parallel_progress_callback_arg to always be set. I think it can be
> >NULL.
>
> This assertion is inside the new 'P' message type handling.
> If a leader is consuming this message, they must have a
> progress callback set. Right now we set the callback only
> in the parallel vacuum case, so not all leaders will be prepared
> to handle this case.
>
> Would you agree this is needed for safety?

I think it makes sense that we assume pcxt->parallel_progress_callback
is always not NULL but my point is that in the future one might want
to use this callback without the argument and we should allow it. If
parallel vacuum assumes pcxt->parallel_progress_callback_arg is not
NULL, I think we should add an assertion in
parallel_vacuum_update_progress(), but not in HandleParallelMessage().

>
> case 'P':   /* Parallel progress reporting */
> {
> /*
>  * A Leader process that receives this message
>  * must be ready to update progress.
>  */
> Assert(pcxt->parallel_progress_callback);
> Assert(pcxt->parallel_progress_callback_arg);
>
> ---
> >+void
> >+parallel_vacuum_update_progress(void *arg)
> >+{
> >+    ParallelVacuumState *pvs = (ParallelVacuumState *) arg;
> >+
> >+    Assert(!IsParallelWorker());
> >+
> >+    if (pvs)
> >+        pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
> >+            pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1));
> >+}
>
> >Since parallel vacuum always sets the arg, I think we don't need to 
> > check it.
>
> The arg is only set during parallel vacuum. During non-parallel vacuum,
> it's NULL. This check can be removed, but I realize now that we do need
> an Assert(pvs). Do you agree?

Agreed to add the assertion in this function.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-20 Thread Imseih (AWS), Sami
>Number of indexes that will be vacuumed or cleaned up. This counter only
>advances when the phase is vacuuming indexes or cleaning up indexes.

I agree, this reads better.

---
-    /* Report that we are now vacuuming indexes */
-    pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-                                 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
+    /*
+     * Report that we are now vacuuming indexes
+     * and the number of indexes to vacuum.
+     */
+    progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX;
+    progress_start_val[1] = vacrel->nindexes;
+    pgstat_progress_update_multi_param(2, progress_start_index,
+                                       progress_start_val);

>According to our code style guideline[1], we limit line lengths so
>that the code is readable in an 80-column window. Some comments
>updated in this patch seem too short.

I will correct this.

>I think it's better to define "void *arg".

Agree

>---
>+/*
>+ * A Leader process that receives this message
>+ * must be ready to update progress.
>+ */
>+Assert(pcxt->parallel_progress_callback);
>+Assert(pcxt->parallel_progress_callback_arg);
>+
>+/* Report progress */
>+pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg);

>I think the parallel query infra should not require
>parallel_progress_callback_arg to always be set. I think it can be
>NULL.

This assertion is inside the new 'P' message type handling.
If a leader is consuming this message, they must have a
progress callback set. Right now we set the callback only
in the parallel vacuum case, so not all leaders will be prepared
to handle this case. 

Would you agree this is needed for safety?

case 'P':   /* Parallel progress reporting */
{
/*
 * A Leader process that receives this message
 * must be ready to update progress.
 */
Assert(pcxt->parallel_progress_callback);
Assert(pcxt->parallel_progress_callback_arg);

---
>+void
>+parallel_vacuum_update_progress(void *arg)
>+{
>+    ParallelVacuumState *pvs = (ParallelVacuumState *) arg;
>+
>+    Assert(!IsParallelWorker());
>+
>+    if (pvs)
>+        pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
>+            pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1));
>+}

>Since parallel vacuum always sets the arg, I think we don't need to check 
> it.

The arg is only set during parallel vacuum. During non-parallel vacuum,
it's NULL. This check can be removed, but I realize now that we do need
an Assert(pvs). Do you agree?

--

Regards,

Sami Imseih
Amazon Web Services (AWS)




Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-19 Thread Masahiko Sawada
On Thu, Jan 12, 2023 at 11:02 PM Imseih (AWS), Sami  wrote:
>
> Thanks for the feedback and I apologize for the delay in response.
>
> >I think the problem here is that you're basically trying to work around 
> > the
> >lack of an asynchronous state update mechanism between leader and 
> > workers. The
> >workaround is to add a lot of different places that poll whether there 
> > has
> >been any progress. And you're not doing that by integrating with the 
> > existing
> >machinery for interrupt processing (i.e. CHECK_FOR_INTERRUPTS()), but by
> >developing a new mechanism.
>
> >I think your best bet would be to integrate with 
> > HandleParallelMessages().
>
> You are correct. I have been trying to work around the async nature
> of parallel workers performing the index vacuum. As you have pointed out,
> integrating with HandleParallelMessages does appear to be the proper way.
> Doing so will also avoid having to do any progress updates in the index AMs.

Very interesting idea. I need to study the parallel query
infrastructure more to consider potential downside of this idea but it
seems okay as far as I researched so far.

> In the attached patch, the parallel workers send a new type of protocol
> message to the leader, called 'P', which signals the leader that it should
> handle a progress update. The leader then performs the progress update by
> invoking a callback set in the ParallelContext. This is done inside
> HandleParallelMessages. In the index vacuum case, the callback is
> parallel_vacuum_update_progress.
>
> The new message does not contain a payload, and it's merely used to
> signal the leader that it can invoke a progress update.

Thank you for updating the patch. Here are some comments for v22 patch:

---
+  
+   Number of indexes that will be vacuumed or cleaned up. This value will be
+   0 if the phase is not vacuuming indexes
+   or cleaning up indexes, INDEX_CLEANUP
+   is set to OFF, index vacuum is skipped due to very
+   few dead tuples in the table, or vacuum failsafe is triggered.

I think that if INDEX_CLEANUP is set to OFF or index vacuum is skipped
due to failsafe mode, we enter neither vacuum indexes phase nor
cleanup indexes phase. So probably we can say something like:

Number of indexes that will be vacuumed or cleaned up. This counter only
advances when the phase is vacuuming indexes or cleaning up indexes.

---
-    /* Report that we are now vacuuming indexes */
-    pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
-                                 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
+    /*
+     * Report that we are now vacuuming indexes
+     * and the number of indexes to vacuum.
+     */
+    progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX;
+    progress_start_val[1] = vacrel->nindexes;
+    pgstat_progress_update_multi_param(2, progress_start_index,
+                                       progress_start_val);

According to our code style guideline[1], we limit line lengths so
that the code is readable in an 80-column window. Some comments
updated in this patch seem too short.

---
+    StringInfoData msgbuf;
+
+    pq_beginmessage(&msgbuf, 'P');
+    pq_endmessage(&msgbuf);

I think we can use pq_putmessage() instead.

---
+/* progress callback definition */
+typedef void (*ParallelProgressCallback) (void *parallel_progress_callback_state);

I think it's better to define "void *arg".

---
+/*
+ * A Leader process that receives this message
+ * must be ready to update progress.
+ */
+Assert(pcxt->parallel_progress_callback);
+Assert(pcxt->parallel_progress_callback_arg);
+
+/* Report progress */
+pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg);

I think the parallel query infra should not require
parallel_progress_callback_arg to always be set. I think it can be
NULL.

---
+void
+parallel_vacuum_update_progress(void *arg)
+{
+    ParallelVacuumState *pvs = (ParallelVacuumState *) arg;
+
+    Assert(!IsParallelWorker());
+
+    if (pvs)
+        pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
+            pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1));
+}

Since parallel vacuum always sets the arg, I think we don't need to check it.

Regards,

[1] https://www.postgresql.org/docs/devel/source-format.html

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-12 Thread Imseih (AWS), Sami
Thanks for the feedback and I apologize for the delay in response.

>I think the problem here is that you're basically trying to work around the
>lack of an asynchronous state update mechanism between leader and workers. 
> The
>workaround is to add a lot of different places that poll whether there has
>been any progress. And you're not doing that by integrating with the 
> existing
>machinery for interrupt processing (i.e. CHECK_FOR_INTERRUPTS()), but by
>developing a new mechanism.

>I think your best bet would be to integrate with HandleParallelMessages().

You are correct. I have been trying to work around the async nature
of parallel workers performing the index vacuum. As you have pointed out,
integrating with HandleParallelMessages does appear to be the proper way.
Doing so will also avoid having to do any progress updates in the index AMs.

In the attached patch, the parallel workers send a new type of protocol
message to the leader, called 'P', which signals the leader that it should
handle a progress update. The leader then performs the progress update by
invoking a callback set in the ParallelContext. This is done inside
HandleParallelMessages. In the index vacuum case, the callback is
parallel_vacuum_update_progress.

The new message does not contain a payload, and it's merely used to
signal the leader that it can invoke a progress update.

Also, If the leader performs the index vacuum, it can call 
parallel_vacuum_update_progress
directly inside vacuumparallel.c.

Regards,

Sami Imseih
Amazon Web Services (AWS)






v22-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v22-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-09 Thread Andres Freund
Hi,

On 2023-01-07 01:59:40 +, Imseih (AWS), Sami wrote:
> --- a/src/backend/access/nbtree/nbtree.c
> +++ b/src/backend/access/nbtree/nbtree.c
> @@ -998,6 +998,8 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult 
> *stats,
>   if (info->report_progress)
>       pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
>                                    scanblkno);
> + if (info->report_parallel_progress &&
> +     (scanblkno % REPORT_PARALLEL_VACUUM_EVERY_PAGES) == 0)
> +     parallel_vacuum_update_progress(info->parallel_vacuum_state);
>   }
>   }

I still think it's wrong to need multiple pgstat_progress_*() calls within one
scan. If we really need it, it should be abstracted into a helper function
that wrapps all the vacuum progress stuff needed for an index.


> @@ -688,7 +703,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState 
> *pvs, int num_index_scan
>*/
>   if (nworkers > 0)
>   {
> - /* Wait for all vacuum workers to finish */
> + /*
> +  * Wait for all indexes to be vacuumed while
> +  * updating the parallel vacuum index progress,
> +  * and then wait for all workers to finish.
> +  */
> + parallel_vacuum_wait_to_finish(pvs);
> +
>   WaitForParallelWorkersToFinish(pvs->pcxt);
>  
>   for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)

I don't think it's a good idea to have two different routines that wait for
workers to exit.  And certainly not when one of them basically just polls at a
regular interval, as parallel_vacuum_wait_to_finish() does.


I think the problem here is that you're basically trying to work around the
lack of an asynchronous state update mechanism between leader and workers. The
workaround is to add a lot of different places that poll whether there has
been any progress. And you're not doing that by integrating with the existing
machinery for interrupt processing (i.e. CHECK_FOR_INTERRUPTS()), but by
developing a new mechanism.

I think your best bet would be to integrate with HandleParallelMessages().

Greetings,

Andres Freund




Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-06 Thread Imseih (AWS), Sami
> My point is whether we should show
> indexes_total throughout the vacuum execution (even also in not
>  relevant phases such as heap scanning/vacuum/truncation).

That is a good point. We should only show indexes_total
and indexes_completed only during the relevant phases.

V21 addresses this along with a documentation fix.

indexes_total and indexes_completed can only have a value > 0 while in the
"vacuuming indexes" or "cleaning up indexes" phases of vacuum.

indexes_total is set at the start of the index vacuum cycle or index cleanup,
and set back to 0, along with indexes_completed, at the end of the index
vacuum cycle or index cleanup.

Also, for the progress updates in vacuumlazy.c that should be done atomically,
I made a change to use pgstat_progress_update_multi_param.

Regards,

--
Sami Imseih
Amazon Web Services: https://aws.amazon.com



v21-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v21-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-06 Thread Masahiko Sawada
On Fri, Jan 6, 2023 at 12:07 PM Imseih (AWS), Sami  wrote:
>
> >Similar to above three cases, vacuum can bypass index vacuuming if
> >there are almost zero TIDs. Should we set indexes_total to 0 in this
> >case too? If so, I think we can set both indexes_total and
> >indexes_completed at the beginning of the index vacuuming/cleanup and
> >reset them at the end.
>
> Unlike the other 3 cases, in which the vacuum and cleanup are totally skipped,
> a cleanup still occurs when the index vacuum is bypassed. From what I can 
> tell,
> this is to allow for things like a gin pending list cleanup even if the index
> is not vacuumed.

Right. But since we set indexes_total also at the beginning of index
cleanup (i.e. lazy_cleanup_all_indexes()), can't we still show the
valid value in this case? My point is whether we should show
indexes_total throughout the vacuum execution (even also in not
relevant phases such as heap scanning/vacuum/truncation). For example,
in the analyze progress report, we have child_tables_total and
child_tables_done. child_tables_total is set before the actual work of
sampling rows from child tables, but not the beginning of the analyze.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-05 Thread Imseih (AWS), Sami
>Similar to above three cases, vacuum can bypass index vacuuming if
>there are almost zero TIDs. Should we set indexes_total to 0 in this
>case too? If so, I think we can set both indexes_total and
>indexes_completed at the beginning of the index vacuuming/cleanup and
>reset them at the end. 

Unlike the other 3 cases, in which the vacuum and cleanup are totally skipped,
a cleanup still occurs when the index vacuum is bypassed. From what I can tell,
this is to allow for things like a gin pending list cleanup even if the index
is not vacuumed. There could be other reasons as well.

    if (bypass)
    {
        /*
         * There are almost zero TIDs.  Behave as if there were precisely
         * zero: bypass index vacuuming, but do index cleanup.
         *
         * We expect that the ongoing VACUUM operation will finish very
         * quickly, so there is no point in considering speeding up as a
         * failsafe against wraparound failure. (Index cleanup is expected to
         * finish very quickly in cases where there were no ambulkdelete()
         * calls.)
         */
        vacrel->do_index_vacuuming = false;
    }

So it seems like we should still report the total number of indexes as we are 
currently
doing in the patch.

With that said, the documentation should make this be more clear.

For indexes_total, the description should be:

   Number of indexes that will be vacuumed or cleaned up. This value will be
   0 if there are no indexes to vacuum, INDEX_CLEANUP
   is set to OFF, or vacuum failsafe is triggered.
   See 

For indexes_completed, it should be:

   Number of indexes vacuumed in the current vacuum cycle when the
   phase is vacuuming indexes, or the number
   of indexes cleaned up in the cleaning up indexes
   phase.


Regards,

--
Sami Imseih
Amazon Web Services: https://aws.amazon.com



Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-05 Thread Masahiko Sawada
On Thu, Jan 5, 2023 at 4:24 AM Imseih (AWS), Sami  wrote:
>
> Thanks for the review!
>
> Addressed the comments.
>
> > "Increment the indexes completed." (dot at the end) instead?
>
> Used the commenting format being used in other places in this
> file, with the inclusion of a double-dash, i.e.:
> /* Wraparound emergency -- end current index scan */
>
> > It seems to me that "#define REPORT_PARALLEL_VACUUM_EVERY_PAGES 
> > ((BlockNumber) (1024 * 1024 * 1024 / BLCKSZ))" would be fine too.
>
> I kept this the same as it matches what we are doing in other places such
> as FAILSAFE_EVERY_PAGES
>
> v20 attached.

+ 
+  
+   indexes_total bigint
+  
+  
+   Number of indexes that will be vacuumed. This value will be
+   0 if there are no indexes to vacuum,
INDEX_CLEANUP
+   is set to OFF, or vacuum failsafe is triggered.

Similar to above three cases, vacuum can bypass index vacuuming if
there are almost zero TIDs. Should we set indexes_total to 0 in this
case too? If so, I think we can set both indexes_total and
indexes_completed at the beginning of the index vacuuming/cleanup and
reset them at the end. That is, these values are valid only in index
vacuum phase and index cleanup phase. Otherwise, 0.
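The lifecycle suggested above — both counters are set when an index vacuum/cleanup phase begins, and reset when it ends, so they read as 0 in every other phase — can be sketched with stubs. Everything below (the `progress` array and the three function names) is an invented stand-in for illustration, not PostgreSQL's actual progress-reporting API:

```c
#include <assert.h>

/* Stand-ins for the two progress parameters under discussion. */
enum { IDX_TOTAL, IDX_COMPLETED, NPARAMS };
static long progress[NPARAMS];  /* simulates the backend's progress slots */

/* Values are valid only while an index vacuum/cleanup phase is running. */
static void index_phase_begin(int nindexes)
{
    progress[IDX_TOTAL] = nindexes;
    progress[IDX_COMPLETED] = 0;
}

static void index_phase_done_one(void)
{
    progress[IDX_COMPLETED]++;
}

static void index_phase_end(void)
{
    /* Reset on exit so the values never leak into other phases. */
    progress[IDX_TOTAL] = 0;
    progress[IDX_COMPLETED] = 0;
}
```

Resetting at phase exit rather than at the start of the next cycle avoids the window in which a user could observe stale values from a previous phase.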

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-04 Thread Imseih (AWS), Sami
Thanks for the review!

Addressed the comments.

> "Increment the indexes completed." (dot at the end) instead?

Used the commenting format being used in other places in this
file with an inclusion of a double-dash, i.e.
/* Wraparound emergency -- end current index scan */

> It seems to me that "#define REPORT_PARALLEL_VACUUM_EVERY_PAGES ((BlockNumber) (1024 * 1024 * 1024 / BLCKSZ))" would be fine too.

I kept this the same as it matches what we are doing in other places such
as FAILSAFE_EVERY_PAGES

v20 attached.


Regards,

--
Sami Imseih
Amazon Web Services: https://aws.amazon.com



v20-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v20-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-04 Thread Drouvot, Bertrand

Hi,

On 1/3/23 5:46 PM, Imseih (AWS), Sami wrote:

cirrus-ci.com/task/4557389261701120


I earlier compiled without building with --enable-cassert,
which is why the compilation errors did not reproduce on my
build.

Fixed in v19.

Thanks



Thanks for the updated patch!

Some comments about it:

+
+ 
+  
+   indexes_total bigint
+  
+  
+   Number of indexes that wil be vacuumed. This value will be

Typo: wil

+   /* report number of indexes to vacuum, if we are told to cleanup indexes */
+   if (vacrel->do_index_cleanup)
+   pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_TOTAL, vacrel->nindexes);

"Report" instead? (to looks like the surrounding code)

+   /*
+    * Done vacuuming an index. Increment the indexes completed
+    */
+   pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
+                                idx + 1);

"Increment the indexes completed." (dot at the end) instead?

-* Increase and report the number of index scans.
+* Reset and report the number of indexes scanned.
+* Also, increase and report the number of index
+* scans.

What about "Reset and report zero as the number of indexes scanned."? (just to 
make clear we
don't want to report the value as it was prior to the reset)

-   /* Disable index vacuuming, index cleanup, and heap rel truncation */
+   /*
+    * Disable index vacuuming, index cleanup, and heap rel truncation
+    *

The new "Disable index vacuuming, index cleanup, and heap rel truncation" needs 
a dot at the end.

+* Also, report to progress.h that we are no longer tracking
+* index vacuum/cleanup.
+*/

"Also, report that we are" instead?

+   /*
+    * Done cleaning an index. Increment the indexes completed
+    */

Needs a dot at the end.

-   /* Reset the parallel index processing counter */
+   /* Reset the parallel index processing counter ( index progress counter also ) */

"Reset the parallel index processing and progress counters" instead?

+   /* Update the number of indexes completed. */
+   pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1);

Remove the dot at the end? (to looks like the surrounding code)

+
+/*
+ * Read pvs->shared->nindexes_completed and update progress.h
+ * with indexes vacuumed so far. This function is called periodically

"Read pvs->shared->nindexes_completed and report the number of indexes vacuumed so 
far" instead?

+ * Note: This function should be used by the leader process only,

"called" instead of "used"?

case WAIT_EVENT_XACT_GROUP_UPDATE:
event_name = "XactGroupUpdate";
break;
+   case WAIT_EVENT_PARALLEL_VACUUM_FINISH:
+   event_name = "ParallelVacuumFinish";
+   break;
/* no default case, so that compiler will warn */

It seems to me that the case ordering should follow the alphabetical order 
(that's how it is done currently without the patch).

+#define REPORT_PARALLEL_VACUUM_EVERY_PAGES ((BlockNumber) (((uint64) 1024 * 1024 * 1024) / BLCKSZ))

It seems to me that "#define REPORT_PARALLEL_VACUUM_EVERY_PAGES ((BlockNumber) (1024 * 1024 * 1024 / BLCKSZ))" would be fine too.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-03 Thread Imseih (AWS), Sami
> cirrus-ci.com/task/4557389261701120

I earlier compiled without building with --enable-cassert,
which is why the compilation errors did not reproduce on my
build.

Fixed in v19.

Thanks

--
Sami Imseih
Amazon Web Services: https://aws.amazon.com





v19-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v19-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-03 Thread vignesh C
On Mon, 2 Jan 2023 at 10:04, Imseih (AWS), Sami  wrote:
>
> >cfbot is complaining that this patch no longer applies.  Sami, would you
> >mind rebasing it?
>
> Rebased patch attached.

CFBot shows some compilation errors as in [1], please post an updated
version for the same:
[07:01:58.889] In file included from ../../../src/include/postgres.h:47,
[07:01:58.889] from vacuumparallel.c:27:
[07:01:58.889] vacuumparallel.c: In function ‘parallel_vacuum_update_progress’:
[07:01:58.889] vacuumparallel.c:1118:10: error: ‘IsParallelWorker’
undeclared (first use in this function); did you mean
‘ParallelWorkerMain’?
[07:01:58.889] 1118 | Assert(!IsParallelWorker);
[07:01:58.889] | ^~~~
[07:01:58.889] ../../../src/include/c.h:859:9: note: in definition of
macro ‘Assert’
[07:01:58.889] 859 | if (!(condition)) \
[07:01:58.889] | ^
[07:01:58.889] vacuumparallel.c:1118:10: note: each undeclared
identifier is reported only once for each function it appears in
[07:01:58.889] 1118 | Assert(!IsParallelWorker);
[07:01:58.889] | ^~~~

[1] - https://cirrus-ci.com/task/4557389261701120

Regards,
Vignesh




Re: Add index scan progress to pg_stat_progress_vacuum

2023-01-01 Thread Imseih (AWS), Sami
>cfbot is complaining that this patch no longer applies.  Sami, would you
>mind rebasing it?

Rebased patch attached.

--
Sami Imseih
Amazon Web Services: https://aws.amazon.com



v18-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v18-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-12-30 Thread Nathan Bossart
On Wed, Dec 14, 2022 at 05:09:46AM +, Imseih (AWS), Sami wrote:
> Attached version addresses the above and the previous comments.

cfbot is complaining that this patch no longer applies.  Sami, would you
mind rebasing it?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-12-13 Thread Imseih (AWS), Sami
>First of all, I don't think we need to declare ParallelVacuumProgress
>in vacuum.c since it's set and used only in vacuumparallel.c. But I
>don't even think it's a good idea to declare it in vacuumparallel.c as
>a static variable. The primary reason is that it adds things we need
>   to care about. For example, what if we raise an error during index
>vacuum? The transaction aborts but ParallelVacuumProgress still refers
>to something old. Suppose further that the next parallel vacuum
>doesn't launch any workers, the leader process would still end up
>accessing the old value pointed by ParallelVacuumProgress, which
>causes a SEGV. So we need to reset it anyway at the beginning of the
>parallel vacuum. It's easy to fix at this time but once the parallel
>   vacuum code gets more complex, it could forget to care about it.

>IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different
>story. They are set in vacuumparallel.c and are used in vacuum.c for
>vacuum delay. If they weren't global variables, we would need to pass
>them to every function that could eventually call the vacuum delay
>function. So it makes sense to me to have them as global variables. On
>the other hand, for ParallelVacuumProgress, it's a common pattern that
>ambulkdelete(), amvacuumcleanup() or a common index scan routine like
>btvacuumscan() checks the progress. I don't think index AM needs to
>pass the value down to many of its functions. So it makes sense to me
>to pass it via IndexVacuumInfo.

Thanks for the detailed explanation and especially clearing up
my understanding of VacuumSharedCostBalance and VacuumActiveNWorker.

I do now think that passing ParallelVacuumState in IndexVacuumInfo is
a more optimal choice.

Attached version addresses the above and the previous comments.


Thanks

Sami Imseih
Amazon Web Services (AWS)



v17-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v17-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-12-13 Thread Masahiko Sawada
On Tue, Dec 13, 2022 at 1:40 PM Imseih (AWS), Sami  wrote:
>
> Thanks for the feedback. I agree with the feedback, except
> for
>
> >need to have ParallelVacuumProgress. I see
> >parallel_vacuum_update_progress() uses this value but I think it's
> >better to pass ParallelVacuumState to via IndexVacuumInfo.
>
> I was trying to avoid passing a pointer to
> ParallelVacuumState in IndexVacuuminfo.
>
> ParallelVacuumProgress is implemented in the same
> way as VacuumSharedCostBalance and
> VacuumActiveNWorkers. See vacuum.h
>
> These values are reset at the start of a parallel vacuum cycle
> and reset at the end of an index vacuum cycle.
>
> This seems like a better approach and less invasive.
> What would be a reason not to go with this approach?

First of all, I don't think we need to declare ParallelVacuumProgress
in vacuum.c since it's set and used only in vacuumparallel.c. But I
don't even think it's a good idea to declare it in vacuumparallel.c as
a static variable. The primary reason is that it adds things we need
to care about. For example, what if we raise an error during index
vacuum? The transaction aborts but ParallelVacuumProgress still refers
to something old. Suppose further that the next parallel vacuum
doesn't launch any workers, the leader process would still end up
accessing the old value pointed by ParallelVacuumProgress, which
causes a SEGV. So we need to reset it anyway at the beginning of the
parallel vacuum. It's easy to fix at this time but once the parallel
vacuum code gets more complex, it could forget to care about it.

IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different
story. They are set in vacuumparallel.c and are used in vacuum.c for
vacuum delay. If they weren't global variables, we would need to pass
them to every function that could eventually call the vacuum delay
function. So it makes sense to me to have them as global variables. On
the other hand, for ParallelVacuumProgress, it's a common pattern that
ambulkdelete(), amvacuumcleanup() or a common index scan routine like
btvacuumscan() checks the progress. I don't think index AM needs to
pass the value down to many of its functions. So it makes sense to me
to pass it via IndexVacuumInfo.
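The trade-off described above can be sketched with stub types. All struct and field names below (PVStub, IndexVacuumInfoStub, nindexes_completed) are invented for illustration; they only show the shape of passing the state through the info struct instead of a process-global pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for ParallelVacuumState. */
typedef struct PVStub
{
    unsigned int nindexes_completed;   /* shared progress counter */
} PVStub;

/* Illustrative stand-in for IndexVacuumInfo with the proposed field. */
typedef struct IndexVacuumInfoStub
{
    int     report_parallel_progress;
    PVStub *pvs;                       /* handed to the AM explicitly */
} IndexVacuumInfoStub;

/*
 * Because the index AM reads the state from the info struct it already
 * receives, there is no process-global pointer that could go stale after
 * an error aborts one parallel vacuum and the next one starts.
 */
static unsigned int
am_read_completed(const IndexVacuumInfoStub *info)
{
    if (!info->report_parallel_progress || info->pvs == NULL)
        return 0;
    return info->pvs->nindexes_completed;
}
```

The lifetime of the pointer is then tied to the vacuum that created the info struct, which is the property the paragraph above argues for.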

Having said that, I'd like to hear opinions also from other hackers, I
might be wrong and it's more invasive as you pointed out.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-12-12 Thread Imseih (AWS), Sami
Thanks for the feedback. I agree with the feedback, except
for 

>need to have ParallelVacuumProgress. I see
>parallel_vacuum_update_progress() uses this value but I think it's
>better to pass ParallelVacuumState to via IndexVacuumInfo.

I was trying to avoid passing a pointer to
ParallelVacuumState in IndexVacuuminfo.

ParallelVacuumProgress is implemented in the same
way as VacuumSharedCostBalance and 
VacuumActiveNWorkers. See vacuum.h

These values are reset at the start of a parallel vacuum cycle
and reset at the end of an index vacuum cycle.

This seems like a better approach and less invasive.
What would be a reason not to go with this approach?


> parallel_vacuum_update_progress() is typically called every 1GB so I
> think we don't need to worry about unnecessary update. Also, I think
> this code doesn't work when pgstat_track_activities is false. Instead,
> I think that in parallel_wait_for_workers_to_finish(), we can check
> the value of pvs->nindexes_completed and update the progress if there
> is an update or it's first time.

I agree that we don’t need to worry about unnecessary updates
in parallel_vacuum_update_progress since we are calling it
every 1GB. I also don't think we should do anything additional
in parallel_wait_for_workers_to_finish since here we are only
updating every 1 second.

Thanks,

Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-12-05 Thread Masahiko Sawada
Hi,

Thank you for updating the patch!

On Tue, Nov 29, 2022 at 8:57 AM Imseih (AWS), Sami  wrote:
>
> >   I think that indexes_total should be 0 also when INDEX_CLEANUP is off.
>
> Patch updated for handling of INDEX_CLEANUP = off, with an update to
> the documentation as well.
>
> >I think we don't need to reset it at the end of index vacuuming. There
> >is a small window before switching to the next phase. If we reset this
> >value while showing "index vacuuming" phase, the user might get
> >confused. Instead, we can reset it at the beginning of the index
> >vacuuming.
>
> No, I think the way it's currently done is correct. We want to reset the number
> of indexes completed before we increase num_index_scans (the index vacuum cycle).
> This ensures that when we enter a new index cycle, the number of indexes completed
> is already reset. The two fields that matter here show how many indexes are
> vacuumed in the current cycle, and this is called out in the documentation as such.
>

Agreed.

Here are comments on v16 patch.

--- a/contrib/bloom/blvacuum.c
+++ b/contrib/bloom/blvacuum.c
@@ -15,12 +15,14 @@
 #include "access/genam.h"
 #include "bloom.h"
 #include "catalog/storage.h"
+#include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
+#include "utils/backend_progress.h"

I think we don't need to include them here. Probably the same is true
for other index AMs.

---
vacuum_delay_point();
+   if (info->report_parallel_progress && (blkno % REPORT_PARALLEL_VACUUM_EVERY_PAGES) == 0)
+   parallel_vacuum_update_progress();
+

buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,

There is an extra new line.

---
+ 
+  ParallelVacuumFinish
+  Waiting for parallel vacuum workers to finish computing.
+ 

How about "Waiting for parallel vacuum workers to finish index vacuum"?

---
vacrel->rel = rel;
vac_open_indexes(vacrel->rel, RowExclusiveLock, &vacrel->nindexes,
                 &vacrel->indrels);
+
if (instrument && vacrel->nindexes > 0)
{
/* Copy index names used by instrumentation (not error reporting) */


This added line is not necessary.

---
 /* Counter for vacuuming and cleanup */
 pg_atomic_uint32 idx;
+
+/*
+ * Counter for vacuuming and cleanup progress reporting.
+ * This value is used to report index vacuum/cleanup progress
+ * in parallel_vacuum_progress_report. We keep this
+ * counter to avoid having to loop through
+ * ParallelVacuumState->indstats to determine the number
+ * of indexes completed.
+ */
+pg_atomic_uint32 idx_completed_progress;

I think the name of idx_completed_progress is very confusing. Since
the idx of PVShared refers to the current index in the pvs->indstats[]
whereas idx_completed_progress is the number of vacuumed indexes. How
about "nindexes_completed"?

---
+/*
+ * Set the shared parallel vacuum progress. This will be used
+ * to periodically update progress.h with completed indexes
+ * in a parallel vacuum. See comments in parallel_vacuum_update_progress
+ * for more details.
+ */
+ParallelVacuumProgress = &(pvs->shared->idx_completed_progress);
+

Since we pass pvs to parallel_wait_for_workers_to_finish(), we don't
need to have ParallelVacuumProgress. I see
parallel_vacuum_update_progress() uses this value but I think it's
better to pass ParallelVacuumState to via IndexVacuumInfo.

---
+/*
+ * To wait for parallel workers to finish,
+ * first call parallel_wait_for_workers_to_finish
+ * which is responsible for reporting the
+ * number of indexes completed.
+ *
+ * Afterwards, WaitForParallelWorkersToFinish is called
+ * to do the real work of waiting for parallel workers
+ * to finish.
+ *
+ * Note: Both routines will acquire a WaitLatch in their
+ * respective loops.
+ */

How about something like:

Wait for all indexes to be vacuumed while updating the parallel vacuum
index progress. And then wait for all workers to finish.

---
 RelationGetRelationName(indrel));
 }

+if (ivinfo.report_parallel_progress)
+parallel_vacuum_update_progress();
+

I think it's better to update the progress info after updating
pvs->shared->idx_completed_progress.

---
+/*
+ * Check if we are done vacuuming indexes and report
+ * progress.

How about "Waiting for all indexes to be vacuumed while updating the
parallel index vacuum progress"?

+ *
+ * We nap using a WaitLatch 

Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-28 Thread Imseih (AWS), Sami
>   I think that indexes_total should be 0 also when INDEX_CLEANUP is off.

Patch updated for handling of INDEX_CLEANUP = off, with an update to
the documentation as well.

>I think we don't need to reset it at the end of index vacuuming. There
>is a small window before switching to the next phase. If we reset this
>value while showing "index vacuuming" phase, the user might get
>confused. Instead, we can reset it at the beginning of the index
>vacuuming.

No, I think the way it's currently done is correct. We want to reset the number
of indexes completed before we increase num_index_scans (the index vacuum cycle).
This ensures that when we enter a new index cycle, the number of indexes completed
is already reset. The two fields that matter here show how many indexes are
vacuumed in the current cycle, and this is called out in the documentation as such.

>We should add CHECK_FOR_INTERRUPTS() at the beginning of the loop to
>make the wait interruptible.

Done.

>I think it would be better to update the counter only when the value
>has been increased.

Done. Did so by checking the progress value from the beentry directly
in the function. We can do a more generalized 

>I think we should set a timeout, say 1 sec, to WaitLatch so that it
>can periodically check the progress.

Done.

>Probably it's better to have a new wait event for this wait in order
>to distinguish wait for workers to complete index vacuum from the wait
>for workers to exit.

I was trying to avoid introducing a new wait event, but thinking about it, 
I agree with your suggestion. 

I created a new ParallelVacuumFinish wait event and documentation
for the wait event.


>I think this doesn't work, since ivinfo.idx_completed is in the
>backend-local memory. Instead, I think we can have a function in
>vacuumparallel.c that updates the progress. Then we can have index AM
>call this function.

Yeah, you're absolutely correct. 

To make this work correctly, I did something similar to VacuumActiveNWorkers
and introduced an extern variable called ParallelVacuumProgress.
This variable points to pvs->shared->idx_completed_progress. 

The index AMs then call parallel_vacuum_update_progress which
Is responsible for updating the progress with the current value
of ParallelVacuumProgress. 

ParallelVacuumProgress is also set to NULL at the end of every index cycle.

>   ivinfo.report_parallel_progress = !IsParallelWorker();

Done


Regards,

Sami Imseih
Amazon Web Services (AWS)





v16-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v16-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-18 Thread Masahiko Sawada
On Sat, Nov 12, 2022 at 4:10 AM Imseih (AWS), Sami  wrote:
>
> >I don't think any of these progress callbacks should be done while pinning a
> >buffer and ...
>
> Good point.
>
> >I also don't understand why info->parallel_progress_callback exists? It's only
> >set to parallel_vacuum_progress_report(). Why make this stuff more expensive
> >than it has to already be?
>
> Agreed. Modified the patch to avoid the callback.
>
> >So each of the places that call this need to make an additional external
> >function call for each page, just to find that there's nothing to do or to
> >make yet another indirect function call. This should probably be a static
> >inline function.
>
> Even better to just remove a function call altogether.
>
> >This is called, for every single page, just to read the number of indexes
> >completed by workers? A number that barely ever changes?
>
> I will take the initial suggestion by Sawada-san to update the progress
> every 1GB of blocks scanned.
>
> Also, it seems to me that we don't need to track progress in brin indexes,
> since it is rare, if ever, that this type of index will go through very heavy
> block scans. In fact, I noticed the vacuum AMs for brin don't invoke
> vacuum_delay_point at all.
>
> The attached patch addresses the feedback.
>

Thank you for updating the patch! Here are review comments on v15 patch:

+  
+   Number of indexes that wil be vacuumed. This value will be
+   0 if there are no indexes to vacuum or
+   vacuum failsafe is triggered.

I think that indexes_total should be 0 also when INDEX_CLEANUP is off.

---
+/*
+ * Reset the indexes completed at this point.
+ * If we end up in another index vacuum cycle, we will
+ * start counting from the start.
+ */
+pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, 0);

I think we don't need to reset it at the end of index vacuuming. There
is a small window before switching to the next phase. If we reset this
value while showing "index vacuuming" phase, the user might get
confused. Instead, we can reset it at the beginning of the index
vacuuming.

---
+void
+parallel_wait_for_workers_to_finish(ParallelVacuumState *pvs)
+{
+    while (pg_atomic_read_u32(&(pvs->shared->idx_completed_progress)) < pvs->nindexes)
+    {
+        pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
+                                     pg_atomic_read_u32(&(pvs->shared->idx_completed_progress)));
+
+        (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, -1,
+                         WAIT_EVENT_PARALLEL_FINISH);
+        ResetLatch(MyLatch);
+    }
+}

We should add CHECK_FOR_INTERRUPTS() at the beginning of the loop to
make the wait interruptible.

I think it would be better to update the counter only when the value
has been increased.

I think we should set a timeout, say 1 sec, to WaitLatch so that it
can periodically check the progress.

Probably it's better to have a new wait event for this wait in order
to distinguish wait for workers to complete index vacuum from the wait
for workers to exit.
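The effect of folding these suggestions into the loop — report only when the counter has advanced, check for interrupts each wakeup, wake on a timeout — can be simulated with stubs. `leader_wait_sim`, `report_progress`, and the snapshot array are invented stand-ins; CHECK_FOR_INTERRUPTS and WaitLatch are indicated only in comments:

```c
#include <assert.h>

static int nreports;                                 /* updates actually issued */
static void report_progress(unsigned int v) { (void) v; nreports++; }

/*
 * Simulates the leader's wait loop over a series of counter snapshots,
 * one per WaitLatch wakeup (e.g. a 1 s timeout).  Only an increase of the
 * shared counter triggers a progress report, matching the "update only
 * when the value has been increased" suggestion.
 */
static void
leader_wait_sim(const unsigned int *snapshots, int n)
{
    unsigned int last = 0;

    for (int i = 0; i < n; i++)       /* each iteration = one wakeup */
    {
        /* CHECK_FOR_INTERRUPTS() would go here in real code */
        if (snapshots[i] > last)
        {
            last = snapshots[i];
            report_progress(last);
        }
        /* WaitLatch(..., timeout, new wait event); ResetLatch(...); */
    }
}
```

With a snapshot sequence of {0, 1, 1, 2, 2, 2, 3} the leader reports three times rather than seven, which is the saving the suggestion is after.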

---
@@ -838,7 +867,12 @@ parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
     ivinfo.estimated_count = pvs->shared->estimated_count;
     ivinfo.num_heap_tuples = pvs->shared->reltuples;
     ivinfo.strategy = pvs->bstrategy;
-
+    ivinfo.idx_completed_progress = pvs->shared->idx_completed_progress;

and

@@ -998,6 +998,9 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
     if (info->report_progress)
         pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
                                      scanblkno);
+    if (info->report_parallel_progress && (scanblkno % REPORT_PARALLEL_VACUUM_EVERY_PAGES) == 0)
+        pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
+                                     pg_atomic_read_u32(&(info->idx_completed_progress)));
 }

I think this doesn't work, since ivinfo.idx_completed is in the
backend-local memory. Instead, I think we can have a function in
vacuumparallel.c that updates the progress. Then we can have index AM
call this function.

---
+if (!IsParallelWorker())
+ivinfo.report_parallel_progress = true;
+else
+ivinfo.report_parallel_progress = false;

We can do like:

ivinfo.report_parallel_progress = !IsParallelWorker();

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-11 Thread Imseih (AWS), Sami
>I don't think any of these progress callbacks should be done while pinning a
>buffer and ...

Good point.

>I also don't understand why info->parallel_progress_callback exists? It's only
>set to parallel_vacuum_progress_report(). Why make this stuff more expensive
>than it has to already be?

Agreed. Modified the patch to avoid the callback.

>So each of the places that call this need to make an additional external
>function call for each page, just to find that there's nothing to do or to
>make yet another indirect function call. This should probably be a static
>inline function.

Even better to just remove a function call altogether.

>This is called, for every single page, just to read the number of indexes
>completed by workers? A number that barely ever changes?

I will take the initial suggestion by Sawada-san to update the progress
every 1GB of blocks scanned. 

Also, it seems to me that we don't need to track progress in brin indexes,
since it is rare, if ever, that this type of index will go through very heavy
block scans. In fact, I noticed the vacuum AMs for brin don't invoke
vacuum_delay_point at all.
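Under PostgreSQL's default 8KB block size, reporting once per 1GB of pages means one update every 131072 blocks. A sketch of the throttle arithmetic (the macro mirrors the patch; `updates_for_scan` is an invented helper for counting how often the `blkno %` check fires):

```c
#include <assert.h>

#define BLCKSZ 8192   /* PostgreSQL's default block size; configurable at build time */
#define REPORT_PARALLEL_VACUUM_EVERY_PAGES \
    ((unsigned int) (((unsigned long long) 1024 * 1024 * 1024) / BLCKSZ))

/*
 * How many progress updates a scan of nblocks would issue when throttled
 * to one update per ~1GB of pages via the (blkno % interval) == 0 check.
 */
static unsigned int
updates_for_scan(unsigned int nblocks)
{
    unsigned int n = 0;

    for (unsigned int blkno = 0; blkno < nblocks; blkno++)
        if (blkno % REPORT_PARALLEL_VACUUM_EVERY_PAGES == 0)
            n++;
    return n;
}
```

So even a terabyte-scale index scan issues only on the order of a thousand updates, which is why the per-page cost of the check is the real concern rather than the update itself.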

The attached patch addresses the feedback.


Regards,

Sami Imseih
Amazon Web Services (AWS) 




v15-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v15-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-08 Thread Andres Freund
Hi,

On 2022-11-04 13:27:34 +, Imseih (AWS), Sami wrote:
> diff --git a/src/backend/access/gin/ginvacuum.c b/src/backend/access/gin/ginvacuum.c
> index b4fa5f6bf8..3d5e4600dc 100644
> --- a/src/backend/access/gin/ginvacuum.c
> +++ b/src/backend/access/gin/ginvacuum.c
> @@ -633,6 +633,9 @@ ginbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
>   UnlockReleaseBuffer(buffer);
>   buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
>                               RBM_NORMAL, info->strategy);
> +
> + if (info->report_parallel_progress)
> +     info->parallel_progress_callback(info->parallel_progress_arg);
>   }
>  
>   /* right now we found leftmost page in entry's BTree */

I don't think any of these progress callbacks should be done while pinning a
buffer and ...

> @@ -677,6 +680,9 @@ ginbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
>   buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
>                               RBM_NORMAL, info->strategy);
>   LockBuffer(buffer, GIN_EXCLUSIVE);
> +
> + if (info->report_parallel_progress)
> +     info->parallel_progress_callback(info->parallel_progress_arg);
>   }
>  
>   MemoryContextDelete(gvs.tmpCxt);

... definitely not while holding a buffer lock.


I also don't understand why info->parallel_progress_callback exists? It's only
set to parallel_vacuum_progress_report(). Why make this stuff more expensive
than it has to already be?



> +parallel_vacuum_progress_report(void *arg)
> +{
> +     ParallelVacuumState *pvs = (ParallelVacuumState *) arg;
> +
> +     if (IsParallelWorker())
> +             return;
> +
> +     pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
> +                                  pg_atomic_read_u32(&(pvs->shared->idx_completed_progress)));
> +}

So each of the places that call this need to make an additional external
function call for each page, just to find that there's nothing to do or to
make yet another indirect function call. This should probably be a static
inline function.

This is called, for every single page, just to read the number of indexes
completed by workers? A number that barely ever changes?

This seems all wrong.
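One way to read the static-inline suggestion — sketched here with invented names and a stubbed slow path, not the patch's actual code — is to keep the per-page path to a couple of cheap inline tests and take the out-of-line call only on the reporting interval:

```c
#include <assert.h>

/* Illustrative stand-in for IndexVacuumInfo. */
typedef struct IndexVacuumInfoStub
{
    int report_parallel_progress;
} IndexVacuumInfoStub;

static int ncalls;                    /* counts entries into the slow path */
static void parallel_vacuum_update_progress_stub(void) { ncalls++; }

#define REPORT_EVERY_PAGES 131072     /* ~1GB of 8KB pages */

/*
 * Static inline wrapper: the per-page cost is a flag test and a modulo;
 * the out-of-line (and potentially expensive) progress update happens
 * only once per reporting interval.
 */
static inline void
parallel_vacuum_progress_check(const IndexVacuumInfoStub *info, unsigned int blkno)
{
    if (info->report_parallel_progress &&
        (blkno % REPORT_EVERY_PAGES) == 0)
        parallel_vacuum_update_progress_stub();
}
```

Scanning 262144 pages then enters the slow path only twice (at blocks 0 and 131072), rather than making an external call for every page.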

Greetings,

Andres Freund




Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-07 Thread Masahiko Sawada
On Thu, Nov 3, 2022 at 1:52 AM Imseih (AWS), Sami  wrote:
>
> Attached is v13-0001--Show-progress-for-index-vacuums.patch which addresses
> the latest comments.

Thank you for updating the patch!

> 4/ Went back to calling parallel_vacuum_progress_report inside
> WaitForParallelWorkersToFinish to cover the case when a
> leader is waiting for parallel workers to finish.

I don't think we need to modify WaitForParallelWorkersToFinish to
cover that case. Instead, I think the leader process can execute a new
function. The function will be like WaitForParallelWorkersToFinish()
but simpler; it just updates the progress information if necessary and
then checks if idx_completed_progress is equal to the number of
indexes to vacuum. If yes, return from the function and call
WaitForParallelWorkersToFinish() to wait for all workers to finish.
Otherwise, it naps by using WaitLatch() and does this loop again.
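The proposed control flow — a progress-aware wait until all indexes are done, followed by the existing worker-exit wait — can be sketched as below. All names are invented stand-ins; WaitLatch napping and WaitForParallelWorkersToFinish() are represented by stubs that only record the ordering:

```c
#include <assert.h>

static int phase_log[8];
static int nlog;

/* Stands in for the proposed function: loop updating the progress view
 * and napping on WaitLatch until all indexes are vacuumed. */
static void
wait_for_index_progress(unsigned int completed, unsigned int nindexes)
{
    while (completed < nindexes)
        completed++;              /* stands in for workers advancing */
    phase_log[nlog++] = 1;
}

/* Stands in for WaitForParallelWorkersToFinish(). */
static void
wait_for_workers_to_exit(void)
{
    phase_log[nlog++] = 2;
}

/* Leader: first wait for the index work (reporting progress as it naps),
 * then wait for the worker processes themselves to finish. */
static void
leader_end_of_index_vacuum(unsigned int nindexes)
{
    wait_for_index_progress(0, nindexes);
    wait_for_workers_to_exit();
}
```

Splitting the wait this way keeps WaitForParallelWorkersToFinish() unmodified, which is the point of the suggestion.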

---
@@ -46,6 +46,8 @@ typedef struct ParallelContext
     ParallelWorkerInfo *worker;
     int            nknown_attached_workers;
     bool          *known_attached_workers;
+    void          (*parallel_progress_callback)(void *arg);
+    void          *parallel_progress_arg;
 } ParallelContext;

With the above change I suggested, I think we won't need to have a
callback function in ParallelContext. Instead, I think we can have
index-AMs call parallel_vacuum_report() if report_parallel_progress is
true.

Regards,

--
Masahiko Sawada

Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-04 Thread Imseih (AWS), Sami
Resubmitting patches with proper format.

Thanks

Sami Imseih
Amazon Web Services (AWS)







v14-0002-Function-to-return-currently-vacuumed-or-cleaned.patch
Description: v14-0002-Function-to-return-currently-vacuumed-or-cleaned.patch


v14-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v14-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-03 Thread Ian Lawrence Barwick
On Thu, Nov 3, 2022 at 1:52, Imseih (AWS), Sami wrote:
>
> Attached is v13-0001--Show-progress-for-index-vacuums.patch which addresses
> the latest comments. The main changes are:
>
> 1/ Call the parallel_vacuum_progress_report inside the AMs rather than 
> vacuum_delay_point.
>
> 2/ A Boolean when set to True in vacuumparallel.c will be used to determine 
> if calling
> parallel_vacuum_progress_report is necessary.
>
> 3/ Removed global variable from vacuumparallel.c
>
> 4/ Went back to calling parallel_vacuum_progress_report inside
> WaitForParallelWorkersToFinish to cover the case when a
> leader is waiting for parallel workers to finish.
>
> 5/ I did not see a need to only report progress after 1GB as it's a fairly 
> cheap call to update
> progress.
>
> 6/ v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch is a 
> separate patch
> for exposing the index relid being vacuumed by a backend.

This entry was marked "Needs review" in the CommitFest app but cfbot
reports the patch [1] no longer applies.

[1] this patch:
v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch

We've marked it as "Waiting on Author". As CommitFest 2022-11 is
currently underway, this would be an excellent time update the patch.

Once you think the patchset is ready for review again, you (or any
interested party) can  move the patch entry forward by visiting

https://commitfest.postgresql.org/40/3617/

and changing the status to "Needs review".


Thanks

Ian Barwick




Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-02 Thread Imseih (AWS), Sami
Attached is v13-0001--Show-progress-for-index-vacuums.patch which addresses
the latest comments. The main changes are:

1/ Call the parallel_vacuum_progress_report inside the AMs rather than 
vacuum_delay_point.

2/ A Boolean when set to True in vacuumparallel.c will be used to determine if 
calling
parallel_vacuum_progress_report is necessary.

3/ Removed global variable from vacuumparallel.c

4/ Went back to calling parallel_vacuum_progress_report inside 
WaitForParallelWorkersToFinish to cover the case when a 
leader is waiting for parallel workers to finish.

5/ I did not see a need to only report progress after 1GB as it's a fairly 
cheap call to update
progress.

6/ v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch is a 
separate patch
for exposing the index relid being vacuumed by a backend. 

Thanks

Sami Imseih
Amazon Web Services (AWS)






v13-0001--Show-progress-for-index-vacuums.patch
Description: v13-0001--Show-progress-for-index-vacuums.patch


v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch
Description: v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-10-14 Thread Imseih (AWS), Sami
Thank you for the feedback!

>While it seems to be a good idea to have the atomic counter for the
>number of indexes completed, I think we should not use the global
>variable referencing the counter as follow:

>+static pg_atomic_uint32 *index_vacuum_completed = NULL;
>:
>+void
>+parallel_vacuum_progress_report(void)
>+{
>+   if (IsParallelWorker() || !report_parallel_vacuum_progress)
>+   return;
>+
>+   pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
>+
> pg_atomic_read_u32(index_vacuum_completed));
>+}

>I think we can pass ParallelVacuumState (or PVIndStats) to the
>reporting function so that it can check the counter and report the
>progress.

Yes, that makes sense.

>---
>I am not too sure that the idea of using the vacuum delay points is the 
> best
>plan. I think we should try to avoid piggybacking on such general
>infrastructure if we can, and instead look for a way to tie this to
>   something that is specific to parallel vacuum.
>---

Adding the call to vacuum_delay_point made sense since it's
called in all major vacuum scans. While it is also called
by analyze, it will only do anything if the caller sets the
report_parallel_vacuum_progress flag.


 >   Instead, I think we can add a boolean and the pointer for
 >   ParallelVacuumState to IndexVacuumInfo. If the boolean is true, an
 >   index AM can call the reporting function with the pointer to
 >   ParallelVacuumState while scanning index blocks, for example, for
 >   every 1GB. The boolean can be true only for the leader process.

Will doing this every 1GB be necessary, considering the function
will not do much more than update progress from the value
stored in index_vacuum_completed?


>   Agreed, but with the following change, the leader process waits in a
>busy loop, which should not be acceptable:

Good point.

>Also, I think it's better to check whether idx_completed_progress
>equals the number of indexes instead.

Good point

> 5. Went back to the idea of adding a new view called
> pg_stat_progress_vacuum_index
> which is accomplished by adding a new type called VACUUM_PARALLEL in
> progress.h

>Probably, we can divide the patch into two patches. One for adding the

Makes sense.

Thanks

Sami Imseih
Amazon Web Services (AWS)





Re: Add index scan progress to pg_stat_progress_vacuum

2022-10-12 Thread Masahiko Sawada
On Tue, Oct 11, 2022 at 10:50 PM Imseih (AWS), Sami  wrote:
>
> >One idea would be to add a flag, say report_parallel_vacuum_progress,
> >to IndexVacuumInfo struct and expect index AM to check and update the
> >parallel index vacuum progress, say every 1GB blocks processed. The
> >flag is true only when the leader process is vacuuming an index.
>
> >Regards,
>
> Sorry for the long delay on this. I have taken the approach as suggested
> by Sawada-san and Robert and attached is v12.

Thank you for updating the patch!

>
> 1. The patch introduces a new counter in the same shared memory already
> used by the parallel leader and workers to keep track of the number
> of indexes completed. This way there is no reason to loop through
> the index status every time we want to get the status of indexes completed.

While it seems to be a good idea to have the atomic counter for the
number of indexes completed, I think we should not use the global
variable referencing the counter as follow:

+static pg_atomic_uint32 *index_vacuum_completed = NULL;
:
+void
+parallel_vacuum_progress_report(void)
+{
+   if (IsParallelWorker() || !report_parallel_vacuum_progress)
+   return;
+
+   pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
+pg_atomic_read_u32(index_vacuum_completed));
+}

I think we can pass ParallelVacuumState (or PVIndStats) to the
reporting function so that it can check the counter and report the
progress.

> 2. A new function in vacuumparallel.c will be used to update
> the progress of indexes completed by reading from the
> counter created in point #1.
>
> 3. The function is called during the vacuum_delay_point as a
> matter of convenience, since this is called in all major vacuum
> loops. The function will only do something if the caller
> sets a boolean to report progress. Doing so will also ensure
> progress is being reported in case the parallel workers completed
> before the leader.

Robert pointed out:

---
I am not too sure that the idea of using the vacuum delay points is the best
plan. I think we should try to avoid piggybacking on such general
infrastructure if we can, and instead look for a way to tie this to
something that is specific to parallel vacuum.
---

I agree with this part.

Instead, I think we can add a boolean and the pointer for
ParallelVacuumState to IndexVacuumInfo. If the boolean is true, an
index AM can call the reporting function with the pointer to
ParallelVacuumState while scanning index blocks, for example, for
every 1GB. The boolean can be true only for the leader process.
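As a standalone illustration of that cadence (all names here — DemoVacuumInfo, demo_index_scan, BLOCKS_PER_REPORT — are hypothetical stand-ins for the proposed IndexVacuumInfo fields and ParallelVacuumState pointer, not the actual structures), the AM-side scan loop might look like:

```c
#include <stdint.h>

/* Hypothetical stand-in for the proposed IndexVacuumInfo additions. */
typedef struct DemoVacuumInfo
{
	int		report_parallel_vacuum_progress;	/* true only in the leader */
	void   *pvs;						/* would be ParallelVacuumState * */
	void	(*report) (void *pvs);		/* progress-reporting callback */
} DemoVacuumInfo;

#define BLOCKS_PER_REPORT 131072	/* ~1GB of 8kB blocks */

/*
 * Scan nblocks index blocks, invoking the reporting callback every
 * BLOCKS_PER_REPORT blocks, but only when the flag is set (i.e. in the
 * leader).  Returns how many reports were made, for inspection.
 */
int
demo_index_scan(DemoVacuumInfo *info, int64_t nblocks)
{
	int			reports = 0;

	for (int64_t blkno = 0; blkno < nblocks; blkno++)
	{
		/* ... per-block index vacuuming would happen here ... */
		if (info->report_parallel_vacuum_progress &&
			(blkno + 1) % BLOCKS_PER_REPORT == 0)
		{
			info->report(info->pvs);
			reports++;
		}
	}
	return reports;
}
```

Because the flag is set only for the leader, workers pay just one branch per block and never touch the leader's backend progress slot.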

>
> 4. Rather than adding any complexity to WaitForParallelWorkersToFinish
> and introducing a new callback, vacuumparallel.c will wait until
> the number of vacuum workers is 0 and then call
> WaitForParallelWorkersToFinish as it does currently.

Agreed, but with the following change, the leader process waits in a
busy loop, which should not be acceptable:

+   if (VacuumActiveNWorkers)
+   {
+   while (pg_atomic_read_u32(VacuumActiveNWorkers) > 0)
+   {
+   parallel_vacuum_progress_report();
+   }
+   }
+

Also, I think it's better to check whether idx_completed_progress
equals the number of indexes instead.
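A minimal sketch of that leader-side wait, with C11 threads and atomics standing in for parallel workers and the shared completed-index counter (in the server itself the nap between checks would be a WaitLatch() call rather than nanosleep(); names are hypothetical):

```c
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-in for the shared completed-index counter. */
static atomic_int idx_completed_progress;

static void *
demo_index_worker(void *arg)
{
	(void) arg;
	atomic_fetch_add(&idx_completed_progress, 1);	/* one index finished */
	return NULL;
}

/*
 * Leader side: instead of spinning, re-check the counter at a small
 * interval and stop once it equals the number of indexes.  Returns the
 * final counter value.  Supports up to 16 simulated workers.
 */
int
demo_leader_wait(int nindexes)
{
	pthread_t	workers[16];
	struct timespec nap = {0, 1000000};	/* 1ms between checks */

	atomic_store(&idx_completed_progress, 0);
	for (int i = 0; i < nindexes; i++)
		pthread_create(&workers[i], NULL, demo_index_worker, NULL);

	while (atomic_load(&idx_completed_progress) < nindexes)
	{
		/* a real leader would call pgstat_progress_update_param() here */
		nanosleep(&nap, NULL);
	}

	for (int i = 0; i < nindexes; i++)
		pthread_join(workers[i], NULL);
	return atomic_load(&idx_completed_progress);
}
```

The loop condition is the "equals the number of indexes" check: the leader exits the reporting loop exactly when every index is accounted for, without burning a CPU between updates.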

> 5. Went back to the idea of adding a new view called 
> pg_stat_progress_vacuum_index
> which is accomplished by adding a new type called VACUUM_PARALLEL in 
> progress.h

Probably, we can divide the patch into two patches. One for adding the
new statistics of the number of vacuumed indexes to
pg_stat_progress_vacuum and another one for adding new statistics view
pg_stat_progress_vacuum_index.

Regards,

-- 
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-10-11 Thread Imseih (AWS), Sami
>One idea would be to add a flag, say report_parallel_vacuum_progress,
>to IndexVacuumInfo struct and expect index AM to check and update the
>parallel index vacuum progress, say every 1GB blocks processed. The
>flag is true only when the leader process is vacuuming an index.

>Regards,

Sorry for the long delay on this. I have taken the approach as suggested
by Sawada-san and Robert and attached is v12.

1. The patch introduces a new counter in the same shared memory already
used by the parallel leader and workers to keep track of the number
of indexes completed. This way there is no reason to loop through
the index status every time we want to get the status of indexes completed.

2. A new function in vacuumparallel.c will be used to update
the progress of indexes completed by reading from the
counter created in point #1.

3. The function is called during the vacuum_delay_point as a
matter of convenience, since this is called in all major vacuum
loops. The function will only do something if the caller
sets a boolean to report progress. Doing so will also ensure
progress is being reported in case the parallel workers completed
before the leader.

4. Rather than adding any complexity to WaitForParallelWorkersToFinish
and introducing a new callback, vacuumparallel.c will wait until
the number of vacuum workers is 0 and then call
WaitForParallelWorkersToFinish as it does currently.

5. Went back to the idea of adding a new view called 
pg_stat_progress_vacuum_index
which is accomplished by adding a new type called VACUUM_PARALLEL in progress.h

Thanks,

Sami Imseih
Amazon Web Services (AWS)

FYI: the above message was originally sent yesterday but 
was created under a separate thread. Please ignore this
thread[1]

[1]: 
https://www.postgresql.org/message-id/flat/4CD97E17-B9E4-421E-9A53-4317C90EFF35%40amazon.com










v12-0001--Show-progress-for-index-vacuums.patch
Description: v12-0001--Show-progress-for-index-vacuums.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-10-10 Thread Imseih (AWS), Sami
> One idea would be to add a flag, say report_parallel_vacuum_progress,
> to IndexVacuumInfo struct and expect index AM to check and update the
> parallel index vacuum progress, say every 1GB blocks processed. The
> flag is true only when the leader process is vacuuming an index.

Sorry for the long delay on this. I have taken the approach as suggested
by Sawada-san and Robert and attached is v12.

1. The patch introduces a new counter in the shared memory already
used by the parallel leader and workers to keep track of the number
of indexes completed. This way there is no reason to loop through
the index status every time we want to get the status of indexes completed.

2. A new function in vacuumparallel.c will be used to update
the progress of indexes completed by reading from the
counter created in point #1.

3. The function is called during the vacuum_delay_point as a
matter of convenience, since it's called in all major vacuum
loops. The function will only do anything if the caller
sets a boolean to report progress. Doing so will also ensure
progress is being reported in case the parallel workers completed
before the leader.

4. Rather than adding any complexity to WaitForParallelWorkersToFinish
and introducing a new callback, vacuumparallel.c will wait until
the number of vacuum workers is 0 and then proceed to call
WaitForParallelWorkersToFinish as it does currently.

5. Went back to the idea of adding a new view called 
pg_stat_progress_vacuum_index
which is accomplished by adding a new type called VACUUM_PARALLEL in progress.h


Thanks,

Sami Imseih
Amazon Web Services (AWS)


v12-0001--Show-progress-for-index-vacuums.patch
Description: v12-0001--Show-progress-for-index-vacuums.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-08-02 Thread Jacob Champion
This entry has been waiting on author input for a while (our current
threshold is roughly two weeks), so I've marked it Returned with
Feedback.

Once you think the patchset is ready for review again, you (or any
interested party) can resurrect the patch entry by visiting

https://commitfest.postgresql.org/38/3617/

and changing the status to "Needs Review", and then changing the
status again to "Move to next CF". (Don't forget the second step;
hopefully we will have streamlined this in the near future!)

Thanks,
--Jacob





Re: Add index scan progress to pg_stat_progress_vacuum

2022-06-20 Thread Masahiko Sawada
On Mon, Jun 6, 2022 at 11:42 PM Robert Haas  wrote:
>
> On Thu, May 26, 2022 at 11:43 AM Masahiko Sawada  
> wrote:
> > Another idea I came up with is that we can wait for all index vacuums
> > to finish while checking and updating the progress information, and
> > then calls WaitForParallelWorkersToFinish after confirming all index
> > status became COMPLETED. That way, we don’t need to change the
> > parallel query infrastructure. What do you think?
>
> +1 from me. It doesn't seem to me that we should need to add something
> like parallel_vacuum_progress_callback in order to solve this problem,
> because the parallel index vacuum code could just do the waiting
> itself, as you propose here.
>
> The question Sami asks in his reply is a good one, though -- who is
> to say that the leader only needs to update progress at the end, once
> it's finished the index it's handling locally? There will need to be a
> callback system of some kind to allow the leader to update progress as
> other workers finish, even if the leader is still working. I am not
> too sure that the idea of using the vacuum delay points is the best
> plan. I think we should try to avoid piggybacking on such general
> infrastructure if we can, and instead look for a way to tie this to
> something that is specific to parallel vacuum. However, I haven't
> studied the problem so I'm not sure whether there's a reasonable way
> to do that.

One idea would be to add a flag, say report_parallel_vacuum_progress,
to IndexVacuumInfo struct and expect index AM to check and update the
parallel index vacuum progress, say every 1GB blocks processed. The
flag is true only when the leader process is vacuuming an index.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-06-06 Thread Robert Haas
On Thu, May 26, 2022 at 11:43 AM Masahiko Sawada  wrote:
> Another idea I came up with is that we can wait for all index vacuums
> to finish while checking and updating the progress information, and
> then calls WaitForParallelWorkersToFinish after confirming all index
> status became COMPLETED. That way, we don’t need to change the
> parallel query infrastructure. What do you think?

+1 from me. It doesn't seem to me that we should need to add something
like parallel_vacuum_progress_callback in order to solve this problem,
because the parallel index vacuum code could just do the waiting
itself, as you propose here.

The question Sami asks in his reply is a good one, though -- who is
to say that the leader only needs to update progress at the end, once
it's finished the index it's handling locally? There will need to be a
callback system of some kind to allow the leader to update progress as
other workers finish, even if the leader is still working. I am not
too sure that the idea of using the vacuum delay points is the best
plan. I think we should try to avoid piggybacking on such general
infrastructure if we can, and instead look for a way to tie this to
something that is specific to parallel vacuum. However, I haven't
studied the problem so I'm not sure whether there's a reasonable way
to do that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-06-02 Thread Masahiko Sawada
On Fri, May 27, 2022 at 10:52 AM Imseih (AWS), Sami  wrote:
>
> >Another idea I came up with is that we can wait for all index vacuums
> >to finish while checking and updating the progress information, and
> >then calls WaitForParallelWorkersToFinish after confirming all index
> >status became COMPLETED. That way, we don’t need to change the
> >parallel query infrastructure. What do you think?
>
> Thinking about this a bit more, the idea of using
> WaitForParallelWorkersToFinish
> will not work if you have a leader worker that is
> stuck on a large index. The progress will not be updated
> until the leader completes. Even if the parallel workers
> finish.

Right.

>
> What are your thought about piggybacking on the
> vacuum_delay_point to update progress. The leader can
> perhaps keep a counter to update progress every few thousand
> calls to vacuum_delay_point.
>
> This goes back to your original idea to keep updating progress
> while scanning the indexes.

I think we can have the leader process wait for all index statuses to
become COMPLETED before WaitForParallelWorkersToFinish(). While
waiting for it, the leader can update its progress information. After
the leader confirmed all index statuses became COMPLETED, it can wait
for the workers to finish by WaitForParallelWorkersToFinish().
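A standalone sketch of that ordering — poll per-index statuses, update the progress count, and only then fall through to waiting for the workers — with pthreads standing in for parallel workers and hypothetical Demo* names standing in for PVIndStats:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-index status, mirroring PVIndStats.status. */
typedef enum
{
	DEMO_INDVAC_INITIAL,
	DEMO_INDVAC_COMPLETED
} DemoIndVacStatus;

#define DEMO_NINDEXES 4

static _Atomic DemoIndVacStatus demo_statuses[DEMO_NINDEXES];

static void *
demo_vacuum_index(void *arg)
{
	int			idx = (int) (intptr_t) arg;

	/* ... ambulkdelete()/amvacuumcleanup() work would happen here ... */
	atomic_store(&demo_statuses[idx], DEMO_INDVAC_COMPLETED);
	return NULL;
}

/*
 * Leader loop: recount COMPLETED statuses until every index is done,
 * then — and only then — join the workers, which stands in for the
 * final WaitForParallelWorkersToFinish() call.
 */
int
demo_leader(void)
{
	pthread_t	t[DEMO_NINDEXES];
	struct timespec nap = {0, 1000000};
	int			done = 0;

	for (int i = 0; i < DEMO_NINDEXES; i++)
	{
		atomic_store(&demo_statuses[i], DEMO_INDVAC_INITIAL);
		pthread_create(&t[i], NULL, demo_vacuum_index, (void *) (intptr_t) i);
	}

	while (done < DEMO_NINDEXES)
	{
		done = 0;
		for (int i = 0; i < DEMO_NINDEXES; i++)
			if (atomic_load(&demo_statuses[i]) == DEMO_INDVAC_COMPLETED)
				done++;
		/* the leader would report "done" via the progress machinery here */
		if (done < DEMO_NINDEXES)
			nanosleep(&nap, NULL);
	}

	for (int i = 0; i < DEMO_NINDEXES; i++)
		pthread_join(t[i], NULL);	/* "WaitForParallelWorkersToFinish()" */
	return done;
}
```

Reading each status is a single atomic load, which matches the later observation in the thread that PVIndVacStatus can be read atomically without extra locking.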

Regarding updating progress in vacuum_delay_point, it might have side
effects, as it's called for every page and is used not only by vacuum
but also by other operations such as analyze; still, it seems worth trying.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-05-26 Thread Imseih (AWS), Sami
>Another idea I came up with is that we can wait for all index vacuums
>to finish while checking and updating the progress information, and
>then calls WaitForParallelWorkersToFinish after confirming all index
>status became COMPLETED. That way, we don’t need to change the
>parallel query infrastructure. What do you think?

Thinking about this a bit more, the idea of using 
WaitForParallelWorkersToFinish
will not work if you have a leader worker that is
stuck on a large index. The progress will not be updated
until the leader completes. Even if the parallel workers
finish.

What are your thought about piggybacking on the 
vacuum_delay_point to update progress. The leader can 
perhaps keep a counter to update progress every few thousand
calls to vacuum_delay_point. 

This goes back to your original idea to keep updating progress
while scanning the indexes.

/*
 * vacuum_delay_point --- check for interrupts and cost-based delay.
 *
 * This should be called in each major loop of VACUUM processing,
 * typically once per page processed.
 */
void
vacuum_delay_point(void)
{

---
Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-05-26 Thread Masahiko Sawada
On Fri, May 6, 2022 at 4:26 AM Imseih (AWS), Sami  wrote:
>
> Thank you for the feedback!
>
> >I think we can pass the progress update function to
> >   WaitForParallelWorkersToFinish(), which seems simpler. And we can call
>
> Directly passing the callback to WaitForParallelWorkersToFinish
> will require us to modify the function signature.
>
> To me, it seemed simpler and touches less code to have
> the caller set the callback in the ParallelContext.

Okay, but if we do that, I think we should add comments about when
it's used. The callback is used only when
WaitForParallelWorkersToFinish(), but not when
WaitForParallelWorkersToExit().

Another idea I came up with is that we can wait for all index vacuums
to finish while checking and updating the progress information, and
then calls WaitForParallelWorkersToFinish after confirming all index
status became COMPLETED. That way, we don’t need to change the
parallel query infrastructure. What do you think?

>
> >the function after updating the index status to
> >PARALLEL_INDVAC_STATUS_COMPLETED.
>
> I also like this better. Will make the change.
>
> >BTW, currently we don't need a lock for touching index status since
> >each worker touches different indexes. But after this patch, the
> >leader will touch all index status, do we need a lock for that?
>
> I do not think locking is needed here. The leader and workers
> will continue to touch different indexes to update the status.
>
> However, if the process is a leader, it will call the function
> which will go through indstats and count how many
> Indexes have a status of PARALLEL_INDVAC_STATUS_COMPLETED.
> This value is then reported to the leaders backend only.

I was concerned that the leader process could report the wrong
progress if updating and checking index status happen concurrently.
But I think it should be fine since we can read PVIndVacStatus
atomically.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-05-05 Thread Imseih (AWS), Sami
Thank you for the feedback!

>I think we can pass the progress update function to
>   WaitForParallelWorkersToFinish(), which seems simpler. And we can call

Directly passing the callback to WaitForParallelWorkersToFinish
will require us to modify the function signature.

To me, it seemed simpler and touches less code to have
the caller set the callback in the ParallelContext.

>the function after updating the index status to
>PARALLEL_INDVAC_STATUS_COMPLETED.

I also like this better. Will make the change.

>BTW, currently we don't need a lock for touching index status since
>each worker touches different indexes. But after this patch, the
>leader will touch all index status, do we need a lock for that?

I do not think locking is needed here. The leader and workers
will continue to touch different indexes to update the status.

However, if the process is a leader, it will call the function
which will go through indstats and count how many
Indexes have a status of PARALLEL_INDVAC_STATUS_COMPLETED.
This value is then reported to the leaders backend only.


Regards,

Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-05-01 Thread Masahiko Sawada
On Thu, Apr 14, 2022 at 10:32 AM Imseih (AWS), Sami  wrote:
>
> Taking the above feedback, I modified the patches
> and I believe this approach should be acceptable.
>
> For now, I also reduced the scope to only exposing
> Indexes_total and indexes_completed in
> pg_stat_progress_vacuum. I will create a new CF entry
> for the new view pg_stat_progress_vacuum_index.
>
> V10-0001: This patch adds a callback to ParallelContext
> that could be called by the leader in vacuumparallel.c
> and more importantly while the leader is waiting
> for the parallel workers to complete inside.

Thank you for updating the patch! The new design looks much better to me.

 typedef struct ParallelWorkerInfo
@@ -46,6 +49,8 @@ typedef struct ParallelContext
ParallelWorkerInfo *worker;
int nknown_attached_workers;
bool   *known_attached_workers;
+   ParallelProgressCallback parallel_progress_callback;
+   void*parallel_progress_callback_arg;
 } ParallelContext;

I think we can pass the progress update function to
WaitForParallelWorkersToFinish(), which seems simpler. And we can call
the function after updating the index status to
PARALLEL_INDVAC_STATUS_COMPLETED.

BTW, currently we don't need a lock for touching index status since
each worker touches different indexes. But after this patch, the
leader will touch all index status, do we need a lock for that?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-07 Thread Greg Stark
It looks like this patch got feedback from Andres and Robert with some
significant design change recommendations. I'm marking the patch
Returned with Feedback. Feel free to add it back to a future
commitfest when a new version is ready.




Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-07 Thread Masahiko Sawada
On Thu, Apr 7, 2022 at 10:20 PM Robert Haas  wrote:
>
> On Wed, Apr 6, 2022 at 5:22 PM Imseih (AWS), Sami  wrote:
> > >At the beginning of a parallel operation, we allocate a chunk of
> > >dynamic shared memory which persists even after some or all workers
> > >have exited. It's only torn down at the end of the parallel operation.
> > >That seems like the appropriate place to be storing any kind of data
> > >that needs to be propagated between parallel workers. The current
> > >patch uses the main shared memory segment, which seems unacceptable to
> > >me.
> >
> > Correct, DSM does track shared data. However only participating
> > processes in the parallel vacuum can attach and lookup this data.
> >
> > The purpose of the main shared memory is to allow a process that
> > Is querying the progress views to retrieve the information.
>
> Sure, but I think that you should likely be doing what Andres
> recommended before:
>
> # Why isn't the obvious thing to do here to provide a way to associate workers
> # with their leaders in shared memory, but to use the existing progress fields
> # to report progress? Then, when querying progress, the leader and workers
> # progress fields can be combined to show the overall progress?
>
> That is, I am imagining that you would want to use DSM to propagate
> data from workers back to the leader and then have the leader report
> the data using the existing progress-reporting facilities. Now, if we
> really need a whole row from each worker that doesn't work, but why do
> we need that?

+1

I also proposed the same idea before[1]. The leader can know how many
indexes are processed so far by checking PVIndStats.status allocated
on DSM for each index. We can have the leader check it and update the
progress information before and after vacuuming one index. If we want
to update the progress information more timely, probably we can pass a
callback function to ambulkdelete and amvacuumcleanup so that the
leader can do that periodically, e.g., every 1000 blocks, while
vacuuming an index.

Regards,

[1] 
https://www.postgresql.org/message-id/CAD21AoBW6SMJ96CNoMeu%2Bf_BR4jmatPcfVA016FdD2hkLDsaTA%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-07 Thread Robert Haas
On Wed, Apr 6, 2022 at 5:22 PM Imseih (AWS), Sami  wrote:
> >At the beginning of a parallel operation, we allocate a chunk of
> >dynamic shared memory which persists even after some or all workers
> >have exited. It's only torn down at the end of the parallel operation.
> >That seems like the appropriate place to be storing any kind of data
> >that needs to be propagated between parallel workers. The current
> >patch uses the main shared memory segment, which seems unacceptable to
> >me.
>
> Correct, DSM does track shared data. However only participating
> processes in the parallel vacuum can attach and lookup this data.
>
> The purpose of the main shared memory is to allow a process that
> Is querying the progress views to retrieve the information.

Sure, but I think that you should likely be doing what Andres
recommended before:

# Why isn't the obvious thing to do here to provide a way to associate workers
# with their leaders in shared memory, but to use the existing progress fields
# to report progress? Then, when querying progress, the leader and workers
# progress fields can be combined to show the overall progress?

That is, I am imagining that you would want to use DSM to propagate
data from workers back to the leader and then have the leader report
the data using the existing progress-reporting facilities. Now, if we
really need a whole row from each worker that doesn't work, but why do
we need that?
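A standalone sketch of that lifetime argument, with malloc standing in for the dynamic shared memory segment and threads for the parallel workers (names are hypothetical): workers write into one shared area and exit, and the area survives until the leader tears it down at the end of the operation.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* One shared area for the whole parallel operation. */
typedef struct DemoSharedArea
{
	atomic_int	indexes_completed;
} DemoSharedArea;

static void *
demo_dsm_worker(void *arg)
{
	DemoSharedArea *area = arg;

	atomic_fetch_add(&area->indexes_completed, 1);
	return NULL;				/* worker exits; the shared area persists */
}

/*
 * Allocate the area ("dsm_create"), run the workers, and read the
 * propagated data after every worker has exited; only then tear the
 * area down ("dsm_detach").  Supports up to 16 simulated workers.
 */
int
demo_parallel_operation(int nworkers)
{
	DemoSharedArea *area = malloc(sizeof(DemoSharedArea));
	pthread_t	t[16];
	int			total;

	atomic_store(&area->indexes_completed, 0);
	for (int i = 0; i < nworkers; i++)
		pthread_create(&t[i], NULL, demo_dsm_worker, area);
	for (int i = 0; i < nworkers; i++)
		pthread_join(t[i], NULL);

	/* all workers have exited, but the data they wrote is still readable */
	total = atomic_load(&area->indexes_completed);
	free(area);
	return total;
}
```

This is the property the thread relies on: worker progress outlives the workers themselves because the segment's lifetime is tied to the operation, not to any individual process.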

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-06 Thread Imseih (AWS), Sami
>At the beginning of a parallel operation, we allocate a chunk of
>dynamic shared memory which persists even after some or all workers
>have exited. It's only torn down at the end of the parallel operation.
>That seems like the appropriate place to be storing any kind of data
>that needs to be propagated between parallel workers. The current
>patch uses the main shared memory segment, which seems unacceptable to
>me.

Correct, DSM does track shared data. However only participating
processes in the parallel vacuum can attach and lookup this data.

The purpose of the main shared memory is to allow a process that
Is querying the progress views to retrieve the information.

Regards,

Sami Imseih
Amazon Web Services




Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-05 Thread Imseih (AWS), Sami
>Can't the progress data trivially be inferred by the fact that the worker
>completed?

Yes, at some point, this idea was experimented with in
0004-Expose-progress-for-the-vacuuming-indexes-cleanup-ph.patch.
This patch did the calculation in system_views.sql

However, the view is complex and there could be some edge
cases with inferring the values that lead to wrong values being reported.

Regards,

Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-05 Thread Andres Freund
On 2022-04-05 16:42:28 +, Imseih (AWS), Sami wrote:
> >Why isn't the obvious thing to do here to provide a way to associate 
> > workers
> >with their leaders in shared memory, but to use the existing progress 
> > fields
> >to report progress? Then, when querying progress, the leader and workers
> >progress fields can be combined to show the overall progress?
>
> The original intent was this, however the workers
> can exit before the command completes and the
> worker progress data will be lost.

Can't the progress data trivially be inferred by the fact that the worker
completed?




Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-05 Thread Robert Haas
On Tue, Apr 5, 2022 at 12:42 PM Imseih (AWS), Sami  wrote:
> >Why isn't the obvious thing to do here to provide a way to associate 
> > workers
> >with their leaders in shared memory, but to use the existing progress 
> > fields
> >to report progress? Then, when querying progress, the leader and workers
> >progress fields can be combined to show the overall progress?
>
> The original intent was this, however the workers
> can exit before the command completes and the
> worker progress data will be lost.
> This is why the shared memory was introduced.
> This allows the worker progress to persist for the duration
> of the command.

At the beginning of a parallel operation, we allocate a chunk of
dynamic shared memory which persists even after some or all workers
have exited. It's only torn down at the end of the parallel operation.
That seems like the appropriate place to be storing any kind of data
that needs to be propagated between parallel workers. The current
patch uses the main shared memory segment, which seems unacceptable to
me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-05 Thread Imseih (AWS), Sami
>I nevertheless think that's not acceptable. The whole premise of the 
> progress
>reporting infrastructure is to be low overhead. It's OK to require locking 
> to
>initialize parallel progress reporting, it's definitely not ok to require
>locking to report progress.

Fair point.

>Why isn't the obvious thing to do here to provide a way to associate 
> workers
>with their leaders in shared memory, but to use the existing progress 
> fields
>to report progress? Then, when querying progress, the leader and workers
>progress fields can be combined to show the overall progress?

The original intent was this, however the workers 
can exit before the command completes and the 
worker progress data will be lost.
This is why the shared memory was introduced. 
This allows the worker progress to persist for the duration 
of the command.

Regards, 

Sami Imseih
Amazon Web Services.






Re: Add index scan progress to pg_stat_progress_vacuum

2022-04-03 Thread Andres Freund
Hi,

On 2022-03-29 12:25:52 +, Imseih (AWS), Sami wrote:
> > I think that's an absolute no-go. Adding locking to progress reporting,
> > particularly a single central lwlock, is going to *vastly* increase the
> > overhead incurred by progress reporting.
> 
> Sorry for the late reply.
> 
> The usage of the shared memory will be limited
> to PARALLEL maintenance operations. For now,
> it will only be populated for parallel vacuums. 
> Autovacuum for example will not be required to 
> populate this shared memory.

I nevertheless think that's not acceptable. The whole premise of the progress
reporting infrastructure is to be low overhead. It's OK to require locking to
initialize parallel progress reporting, it's definitely not ok to require
locking to report progress.

Leaving the locking aside, doing a hashtable lookup for each progress report
is pretty expensive.


Why isn't the obvious thing to do here to provide a way to associate workers
with their leaders in shared memory, but to use the existing progress fields
to report progress? Then, when querying progress, the leader and workers
progress fields can be combined to show the overall progress?
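As a standalone illustration of that suggestion (plain C with stand-in types, not the actual PgBackendStatus layout; `NUM_PROGRESS_PARAM`, the struct, and the function name are assumptions for this sketch), the reader-side aggregation could look like:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PROGRESS_PARAM 20   /* stand-in for PGSTAT_NUM_PROGRESS_PARAM */

/* Simplified stand-in for a backend's progress entry. */
typedef struct ProgressEntry
{
    int     pid;
    int     leader_pid;         /* 0 if not a parallel worker */
    int64_t st_progress_param[NUM_PROGRESS_PARAM];
} ProgressEntry;

/*
 * Combine the leader's value for one progress field with those of the
 * workers currently associated with it: each process still writes only
 * its own slot (no locking on the write path), and the reader sums.
 */
static int64_t
combined_progress(const ProgressEntry *entries, int nentries,
                  int leader_pid, int index)
{
    int64_t total = 0;

    for (int i = 0; i < nentries; i++)
    {
        if (entries[i].pid == leader_pid ||
            entries[i].leader_pid == leader_pid)
            total += entries[i].st_progress_param[index];
    }
    return total;
}
```

The catch, raised elsewhere in the thread, is that a worker's contribution disappears from such a sum once the worker exits.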


Greetings,

Andres Freund




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-29 Thread Imseih (AWS), Sami
> I think that's an absolute no-go. Adding locking to progress reporting,
> particularly a single central lwlock, is going to *vastly* increase the
> overhead incurred by progress reporting.

Sorry for the late reply.

The usage of the shared memory will be limited
to PARALLEL maintenance operations. For now,
it will only be populated for parallel vacuums. 
Autovacuum for example will not be required to 
populate this shared memory.

Regards,

---
Sami Imseih
Amazon Web Services




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-29 Thread Imseih (AWS), Sami
Sorry for the late reply.

> additional complexity and a possible lag of progress updates. So if we
> go with the current approach, I think we need to make sure there are
> enough (and not too many) hash table entries.

The hash table can be set 4 times the size of 
max_worker_processes which should give more than
enough padding.
Note that max_parallel_maintenance_workers
is what should be used, but since it's dynamic, it cannot
be used to determine the size of shared memory.
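As a standalone sketch of that sizing rule (the 4x factor is the one proposed above; the function name is made up for illustration):

```c
#include <stddef.h>

/*
 * The shared hash must be sized at server start, so it is derived from
 * max_worker_processes (fixed at startup) rather than
 * max_parallel_maintenance_workers (changeable at runtime), padded 4x.
 */
static size_t
progress_hash_max_entries(int max_worker_processes)
{
    return (size_t) max_worker_processes * 4;
}
```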

Regards,

---
Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-25 Thread Masahiko Sawada
On Wed, Mar 23, 2022 at 6:57 AM Imseih (AWS), Sami  wrote:
>
> >Can the leader pass a callback that checks PVIndStats to ambulkdelete
> >and amvacuumcleanup callbacks? I think that in the passed callback, the
> >leader checks the number of processed indexes and updates its
> >progress information if the current progress needs to be updated.
>
> Thanks for the suggestion.
>
> I looked at this option a bit today and found that passing the callback
> will also require signature changes to the ambulkdelete and
> amvacuumcleanup routines.

I think it would not be a critical problem since it's a new feature.

>
> This will also require us to check after x pages have been
> scanned inside vacuumscan and vacuumcleanup. After x pages
> the callback can then update the leaders progress.
> I am not sure if adding additional complexity to the scan/cleanup path
>  is justified for what this patch is attempting to do.
>
> There will also be a lag in the leader updating the progress, as it
> must scan x amount of pages before updating. Obviously, the more
> pages to scan, the longer the lag will be.

Fair points.

On the other hand, the approach of the current patch requires more
memory for progress tracking, which could fail, e.g., due to running
out of hashtable entries. I think it would be worse for the
parallel operation to fail to start due to not being able to track the
progress than the concerns you mentioned, such as introducing
additional complexity and a possible lag of progress updates. So if we
go with the current approach, I think we need to make sure there are
enough (and not too many) hash table entries.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-22 Thread Imseih (AWS), Sami
>Can the leader pass a callback that checks PVIndStats to ambulkdelete
>and amvacuumcleanup callbacks? I think that in the passed callback, the
>leader checks the number of processed indexes and updates its
>progress information if the current progress needs to be updated.

Thanks for the suggestion.

I looked at this option a bit today and found that passing the callback 
will also require signature changes to the ambulkdelete and 
amvacuumcleanup routines. 

This will also require us to check after x pages have been 
scanned inside vacuumscan and vacuumcleanup. After x pages
the callback can then update the leaders progress.
I am not sure if adding additional complexity to the scan/cleanup path
 is justified for what this patch is attempting to do. 

There will also be a lag in the leader updating the progress, as it
must scan x amount of pages before updating. Obviously, the more
pages to scan, the longer the lag will be.

Would like to hear your thoughts on the above.

Regards,

Sami Imseih
Amazon Web Services.







Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-22 Thread Masahiko Sawada
On Tue, Mar 22, 2022 at 4:27 PM Imseih (AWS), Sami  wrote:
>
> >BTW have we discussed another idea I mentioned before that we have the
> >leader process periodically check the number of completed indexes and
> >advertise it in its progress information? I'm not sure which one is
> >better but this idea would require only changes of vacuum code and
> >probably simpler than the current idea.
>
> >Regards,
>
>
> If I understand correctly, to accomplish this we will need to have the leader
> check the number of indexes completed in the ambulkdelete or amvacuumcleanup
> callbacks. These routines do not know about PVIndStats, and they are called
> by both parallel and non-parallel vacuums.
>
> From what I can see, PVIndStats will need to be passed down to these routines
> or pass a NULL for non-parallel vacuums.
>

Can the leader pass a callback that checks PVIndStats to ambulkdelete
and amvacuumcleanup callbacks? I think that in the passed callback, the
leader checks the number of processed indexes and updates its
progress information if the current progress needs to be updated.
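A standalone sketch of that callback's counting step (plain C; the enum values and struct layout here are stand-ins, not the real PVIndStats definition):

```c
/* Stand-in for parallel index vacuum status values. */
typedef enum
{
    INDVAC_STATUS_INITIAL = 0,
    INDVAC_STATUS_NEED_BULKDELETE,
    INDVAC_STATUS_COMPLETED
} IndVacStatusSim;

typedef struct IndStatsSim
{
    IndVacStatusSim status;
} IndStatsSim;

/*
 * What the leader's callback would compute: the number of indexes whose
 * status has reached "completed".  The result would then be advertised
 * (e.g. via pgstat_progress_update_param()) only when it has changed.
 */
static int
count_completed_indexes(const IndStatsSim *stats, int nindexes)
{
    int done = 0;

    for (int i = 0; i < nindexes; i++)
        if (stats[i].status == INDVAC_STATUS_COMPLETED)
            done++;
    return done;
}
```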

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-22 Thread Imseih (AWS), Sami
>BTW have we discussed another idea I mentioned before that we have the
>leader process periodically check the number of completed indexes and
>advertise it in its progress information? I'm not sure which one is
>better but this idea would require only changes of vacuum code and
>probably simpler than the current idea.

>Regards,


If I understand correctly, to accomplish this we will need to have the leader
check the number of indexes completed in the ambulkdelete or amvacuumcleanup
callbacks. These routines do not know about PVIndStats, and they are called
by both parallel and non-parallel vacuums.

From what I can see, PVIndStats will need to be passed down to these routines
or pass a NULL for non-parallel vacuums.

Sami




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-21 Thread Andres Freund
Hi,

On 2022-03-16 21:47:49 +, Imseih (AWS), Sami wrote:
> From 85c47dfb3bb72f764b9052e74a7282c19ebd9898 Mon Sep 17 00:00:00 2001
> From: "Sami Imseih (AWS)" 
> Date: Wed, 16 Mar 2022 20:39:52 +
> Subject: [PATCH 1/1] Add infrastructure for parallel progress reporting
> 
> Infrastructure to allow a parallel worker to report
> progress. In a PARALLEL command, the workers and
> leader can report progress using a new pgstat_progress
> API.

What happens if we run out of memory for hashtable entries?


> +void
> +pgstat_progress_update_param_parallel(int leader_pid, int index, int64 val)
> +{
> + ProgressParallelEntry *entry;
> + bool found;
> +
> + LWLockAcquire(ProgressParallelLock, LW_EXCLUSIVE);
> +
> + entry = (ProgressParallelEntry *) hash_search(ProgressParallelHash,
> + &leader_pid, HASH_ENTER, &found);
> +
> + /*
> +  * If the entry is not found, set the value for the index'th member,
> +  * else increment the current value of the index'th member.
> +  */
> + if (!found)
> + entry->st_progress_param[index] = val;
> + else
> + entry->st_progress_param[index] += val;
> +
> + LWLockRelease(ProgressParallelLock);
> +}

I think that's an absolute no-go. Adding locking to progress reporting,
particularly a single central lwlock, is going to *vastly* increase the
overhead incurred by progress reporting.

Greetings,

Andres Freund




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-21 Thread Masahiko Sawada
Sorry for the late reply.

On Tue, Mar 15, 2022 at 1:20 AM Imseih (AWS), Sami  wrote:
>
> >I'm still unsure the current design of 0001 patch is better than other
> >approaches we’ve discussed. Even users who don't use parallel vacuum
> >are forced to allocate shared memory for index vacuum progress, with
> >GetMaxBackends() entries from the beginning. Also, it’s likely to
> >extend the progress tracking feature for other parallel operations in
> >the future but I think the current design is not extensible. If we
> >want to do that, we will end up creating similar things for each of
> >them or re-creating index vacuum progress tracking feature while
> >creating a common infra. It might not be a problem as of now but I'm
> >concerned that introducing a feature that is not extensible and forces
> >users to allocate additional shmem might be a blocker in the future.
> >Looking at the precedent example, when we introduced the progress
> >tracking feature, we implemented it in an extensible way. On the other
> >hand, others in this thread seem to agree with this approach, so I'd
> >like to leave it to committers.
>
> Thanks for the review!
>
> I think you make strong arguments as to why we need to take a different 
> approach now than later.
>
> Flaws with current patch set:
>
> 1. GetMaxBackends() is a really heavy-handed overallocation of shared
> memory serving a very specific purpose.
> 2. Going with the approach of a vacuum specific hash breaks the design of 
> progress which is meant to be extensible.
> 3. Even if we go with this current approach as an interim solution, it will 
> be a real pain in the future.
>
> With that said, v7 introduces the new infrastructure. 0001 includes the new 
> infrastructure and 0002 takes advantage of this.
>
> This approach is the following:
>
> 1. Introduces a new API called pgstat_progress_update_param_parallel along 
> with some others support functions. This new infrastructure is in 
> backend_progress.c
>
> 2. There is still shared memory involved, but the size is capped to
> "max_worker_processes", which is the maximum number of parallel workers that
> can be doing work at any given time. The shared memory hash includes a
> st_progress_param array just like the Backend Status array.

I think that there is a corner case where a parallel operation could
fail to start due to the lack of a free shared hash entry, because there
is a window between a parallel worker exiting and the leader
deallocating the hash table entry.

BTW have we discussed another idea I mentioned before that we have the
leader process periodically check the number of completed indexes and
advertise it in its progress information? I'm not sure which one is
better but this idea would require only changes of vacuum code and
probably simpler than the current idea.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-16 Thread Nathan Bossart
On Wed, Mar 16, 2022 at 09:47:49PM +, Imseih (AWS), Sami wrote:
> Spoke to Nathan offline and fixed some more comments/nitpicks in the patch.

I don't have any substantial comments for v9, so I think this can be marked
as ready-for-committer.  However, we probably should first see whether
Sawada-san has any comments on the revised approach.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-14 Thread Imseih (AWS), Sami
>I'm still unsure the current design of 0001 patch is better than other
>approaches we’ve discussed. Even users who don't use parallel vacuum
>are forced to allocate shared memory for index vacuum progress, with
>GetMaxBackends() entries from the beginning. Also, it’s likely to
>extend the progress tracking feature for other parallel operations in
>the future but I think the current design is not extensible. If we
>want to do that, we will end up creating similar things for each of
>them or re-creating index vacuum progress tracking feature while
>creating a common infra. It might not be a problem as of now but I'm
>concerned that introducing a feature that is not extensible and forces
>users to allocate additional shmem might be a blocker in the future.
>Looking at the precedent example, when we introduced the progress
>tracking feature, we implemented it in an extensible way. On the other
>hand, others in this thread seem to agree with this approach, so I'd
>like to leave it to committers.

Thanks for the review!

I think you make strong arguments as to why we need to take a different 
approach now than later. 

Flaws with current patch set:

1. GetMaxBackends() is a really heavy-handed overallocation of shared memory 
serving a very specific purpose.
2. Going with the approach of a vacuum specific hash breaks the design of 
progress which is meant to be extensible.
3. Even if we go with this current approach as an interim solution, it will be 
a real pain in the future.

With that said, v7 introduces the new infrastructure. 0001 includes the new 
infrastructure and 0002 takes advantage of this.

This approach is the following:

1. Introduces a new API called pgstat_progress_update_param_parallel along with 
some others support functions. This new infrastructure is in backend_progress.c

2. There is still shared memory involved, but the size is capped to
"max_worker_processes", which is the maximum number of parallel workers that
can be doing work at any given time. The shared memory hash includes a
st_progress_param array just like the Backend Status array.

typedef struct ProgressParallelEntry
{
pid_t   leader_pid;
int64   st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
} ProgressParallelEntry;

3. The progress update function is "pgstat_progress_update_param_parallel" and
will aggregate totals reported for a specific progress parameter.

For example, it can be called like below. In the case below,
PROGRESS_VACUUM_INDEXES_COMPLETED is incremented by 1 in the shared memory
entry shared by the workers and leader.

case PARALLEL_INDVAC_STATUS_NEED_BULKDELETE:
istat_res = vac_bulkdel_one_index(&ivinfo, istat, 
pvs->dead_items);

pgstat_progress_update_param_parallel(pvs->shared->leader_pid, 
PROGRESS_VACUUM_INDEXES_COMPLETED, 1); <<-
break;

4. pg_stat_get_progress_info will call a function called 
pgstat_progress_set_parallel which will set the parameter value to the total 
from the shared memory hash.

I believe this approach gives proper infrastructure for future use-cases of 
workers reporting progress -and- does not do the heavy-handed shared memory 
allocation.

--
Sami Imseih
Amazon Web Services




v7-0001-Add-infrastructure-for-parallel-progress-reportin.patch
Description: v7-0001-Add-infrastructure-for-parallel-progress-reportin.patch


v7-0002-Show-progress-for-index-vacuums.patch
Description: v7-0002-Show-progress-for-index-vacuums.patch


v7-0003-Expose-indexes-being-processed-in-a-VACUUM-operat.patch
Description: v7-0003-Expose-indexes-being-processed-in-a-VACUUM-operat.patch


v7-0004-Rename-index_vacuum_count-in-pg_stat_pogress_vacu.patch
Description: v7-0004-Rename-index_vacuum_count-in-pg_stat_pogress_vacu.patch


Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-13 Thread Masahiko Sawada
On Sat, Mar 12, 2022 at 4:00 PM Imseih (AWS), Sami  wrote:
>
> > nitpick: Can we remove the extra spaces in the parentheses?
>
> fixed
>
> > What does it mean if there isn't an entry in the map?  Is this actually
> > expected, or should we ERROR instead?
>
> I cleaned up the code here and added comments.
>
> > I think the number of entries should be shared between
> > VacuumWorkerProgressShmemInit() and VacuumWorkerProgressShmemSize().
> > Otherwise, we might update one and not the other.
>
> Fixed
>
> > I think we should elaborate a bit more in this comment.  It's difficult to
> > follow what this is doing without referencing the comment above
> > set_vacuum_worker_progress().
>
> More comments added
>
> I also simplified the 0002 patch as well.

I'm still unsure the current design of 0001 patch is better than other
approaches we’ve discussed. Even users who don't use parallel vacuum
are forced to allocate shared memory for index vacuum progress, with
GetMaxBackends() entries from the beginning. Also, it’s likely to
extend the progress tracking feature for other parallel operations in
the future but I think the current design is not extensible. If we
want to do that, we will end up creating similar things for each of
them or re-creating index vacuum progress tracking feature while
creating a common infra. It might not be a problem as of now but I'm
concerned that introducing a feature that is not extensible and forces
users to allocate additional shmem might be a blocker in the future.
Looking at the precedent example, when we introduced the progress
tracking feature, we implemented it in an extensible way. On the other
hand, others in this thread seem to agree with this approach, so I'd
like to leave it to committers.

Anyway, here are some comments on v5-0001 patch:

+/* in commands/vacuumparallel.c */
+extern void VacuumWorkerProgressShmemInit(void);
+extern Size VacuumWorkerProgressShmemSize(void);
+extern void vacuum_worker_end(int leader_pid);
+extern void vacuum_worker_update(int leader_pid);
+extern void vacuum_worker_end_callback(int code, Datum arg);
+extern void set_vaccum_worker_progress(Datum *values);

These functions' body is not in vacuumparallel.c. As the comment says,
I think these functions should be implemented in vacuumparallel.c.

---
+/*
+ * set_vaccum_worker_progress --- updates the number of indexes that have been
+ * vacuumed or cleaned up in a parallel vacuum.
+ */
+void
+set_vaccum_worker_progress(Datum *values)

s/vaccum/vacuum/

---
+void
+set_vaccum_worker_progress(Datum *values)
+{
+VacProgressEntry *entry;
+int leader_pid = values[0];

I think we should use DatumGetInt32().

---
+entry = (VacProgressEntry *)
hash_search(VacuumWorkerProgressHash, &leader_pid, HASH_ENTER_NULL,
&found);
+
+if (!entry)
+elog(ERROR, "cannot allocate shared memory for vacuum
worker progress");

Since we raise an error in case of out of memory, I think we can use
HASH_ENTER instead of HASH_ENTER_NULL. Or do we want to emit a
detailed error message here?

---
+   VacuumWorkerProgressHash = NULL;

This line is not necessary.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-12 Thread Nathan Bossart
On Sat, Mar 12, 2022 at 07:00:06AM +, Imseih (AWS), Sami wrote:
>> nitpick: Can we remove the extra spaces in the parentheses?
> 
> fixed
> 
>> What does it mean if there isn't an entry in the map?  Is this actually
>> expected, or should we ERROR instead?
> 
> I cleaned up the code here and added comments. 
> 
>> I think the number of entries should be shared between
>> VacuumWorkerProgressShmemInit() and VacuumWorkerProgressShmemSize().
>> Otherwise, we might update one and not the other.
> 
> Fixed
> 
>> I think we should elaborate a bit more in this comment.  It's difficult to
>> follow what this is doing without referencing the comment above
>> set_vacuum_worker_progress().
> 
> More comments added
> 
> I also simplified the 0002 patch as well.

These patches look pretty good to me.  Barring additional feedback, I
intend to mark this as ready-for-committer early next week.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-10 Thread Nathan Bossart
On Thu, Mar 10, 2022 at 09:30:57PM +, Imseih (AWS), Sami wrote:
> Attached v4 which includes accounting for the hash size on startup, removal 
> of the no longer needed comment in pgstatfuncs.c and a change in both 
> code/docs to only reset the indexes_total to 0 when failsafe is triggered.

Thanks for the new patch set.

+/*
+ * Structs for tracking shared Progress information
+ * amongst worker ( and leader ) processes of a vacuum.
+ */

nitpick: Can we remove the extra spaces in the parentheses?

+if (entry != NULL)
+values[PGSTAT_NUM_PROGRESS_COMMON + PROGRESS_VACUUM_INDEXES_COMPLETED] 
= entry->indexes_processed;

What does it mean if there isn't an entry in the map?  Is this actually
expected, or should we ERROR instead?

+/* vacuum worker progress hash table */
+max_table_size = GetMaxBackends();
+size = add_size(size, hash_estimate_size(max_table_size,
+ sizeof(VacProgressEntry)));

I think the number of entries should be shared between
VacuumWorkerProgressShmemInit() and VacuumWorkerProgressShmemSize().
Otherwise, we might update one and not the other.

+/* Call the command specific function to override datum values */
+if (pg_strcasecmp(cmd, "VACUUM") == 0)
+set_vaccum_worker_progress(values);

I think we should elaborate a bit more in this comment.  It's difficult to
follow what this is doing without referencing the comment above
set_vacuum_worker_progress().

IMO the patches are in decent shape, and this should likely be marked as
ready-for-committer in the near future.  Before doing so, I think we should
check that Sawada-san is okay with moving the deeper infrastructure changes
to a separate thread.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-09 Thread Imseih (AWS), Sami
> >   BTreeShmemInit();
> >   SyncScanShmemInit();
> >   AsyncShmemInit();
> >   +   vacuum_worker_init();

> >   Don't we also need to add the size of the hash table to
> >   CalculateShmemSize()?

> No, ShmemInitHash takes the min and max size of the hash and in turn calls 
> ShmemInitStruct to setup the shared memory.

Sorry,  I am wrong here. The size needs to be accounted for at startup. 

 --
 Sami Imseih
 Amazon Web Services




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-09 Thread Imseih (AWS), Sami
>I took a look at the latest patch set.

>+  
>+   indexes_total bigint
>+  
>+  
>+   The number of indexes to be processed in the
>+   vacuuming indexes
>+   or cleaning up indexes phase. It is set to
>+   0 when vacuum is not in any of these phases.
>+  

>Could we avoid resetting it to 0 unless INDEX_CLEANUP was turned off or
>failsafe kicked in?  It might be nice to know how many indexes the vacuum
>intends to process.  I don't feel too strongly about this, so if it would
>add a lot of complexity, it might be okay as is.

Your suggestion is valid. With INDEX_CLEANUP off, it is set to 0 from the start,
and when failsafe kicks in it will be reset to 0. I will remove the reset call
for the common index vacuum path.

 >   BTreeShmemInit();
 >   SyncScanShmemInit();
 >   AsyncShmemInit();
 >   +   vacuum_worker_init();

 >   Don't we also need to add the size of the hash table to
 >   CalculateShmemSize()?

No, ShmemInitHash takes the min and max size of the hash and in turn calls 
ShmemInitStruct to setup the shared memory.

>+ * A command type can optionally define a callback function
>+ * which will derive Datum values rather than use values
>+ * directly from the backends progress array.

>I think this can be removed.

Good catch.

--
Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-09 Thread Nathan Bossart
I took a look at the latest patch set.

+  
+   indexes_total bigint
+  
+  
+   The number of indexes to be processed in the 
+   vacuuming indexes 
+   or cleaning up indexes phase. It is set to
+   0 when vacuum is not in any of these phases.
+  

Could we avoid resetting it to 0 unless INDEX_CLEANUP was turned off or
failsafe kicked in?  It might be nice to know how many indexes the vacuum
intends to process.  I don't feel too strongly about this, so if it would
add a lot of complexity, it might be okay as is.

BTreeShmemInit();
SyncScanShmemInit();
AsyncShmemInit();
+   vacuum_worker_init();

Don't we also need to add the size of the hash table to
CalculateShmemSize()?

+ * A command type can optionally define a callback function
+ * which will derive Datum values rather than use values
+ * directly from the backends progress array.

I think this can be removed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-09 Thread Imseih (AWS), Sami
>I think if it's a better approach we can do that including adding a
>new infrastructure for it.

+1 This is a beneficial idea, especially to other progress reporting, but I see 
this as a separate thread targeting the next major version.





Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-08 Thread Masahiko Sawada
On Wed, Mar 9, 2022 at 11:35 AM Imseih (AWS), Sami  wrote:
>
> >Indeed.
>
> >It might have already been discussed but other than using a new shmem
> >hash for parallel vacuum, I wonder if we can allow workers to change
> >the leader’s progress information. It would break the assumption that
> >the backend status entry is modified by its own backend, though. But
> >it might help for progress updates of other parallel operations too.
> >This essentially does the same thing as what the current patch does
> >but it doesn't require a new shmem hash.
>
> I experimented with this idea, but it did not work. The idea would have been
> to create a pgstat_progress_update function that takes the leader pid;
> however, infrastructure does not exist to allow one backend to manipulate
> another backend's status array.
> pgstat_fetch_stat_beentry returns a local copy only.

I think if it's a better approach we can do that including adding a
new infrastructure for it.

>
> >Another idea I came up with is that the parallel vacuum leader checks
> >PVIndStats.status and updates how many indexes are processed to its
> >progress information. The leader can check it and update the progress
> >information before and after index vacuuming. And possibly we can add
> >a callback to the main loop of index AM's bulkdelete and vacuumcleanup
> >so that the leader can periodically make it up-to-date.
>
> >Regards,
>
> The PVIndStats idea is also one I experimented with, but it did not work. The
> reason is that the backend checking the progress needs to do a shm_toc_lookup
> to access the data, but it is not prepared to do so.

What I imagined is that the leader checks how many PVIndStats.status
values are PARALLEL_INDVAC_STATUS_COMPLETED and updates the result in its
progress information as indexes_processed. That way, the backend
checking the progress can see it.

>
> I have not considered the callback in the index AM's bulkdelete and 
> vacuumcleanup, but I can imagine this is not possible since a leader could be 
> busy vacuuming rather than updating counters, but I may be misunderstanding 
> the suggestion.

Checking PVIndStats.status values is cheap. Probably the leader can
check it every 1GB index block, for example.
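That periodic check could be throttled roughly as follows (a standalone sketch; the 8kB block size is PostgreSQL's default, and the 1GB threshold follows the suggestion above):

```c
#include <stdbool.h>
#include <stdint.h>

#define BLCKSZ_SIM 8192         /* stand-in for the default block size */
#define CHECK_EVERY_BLOCKS ((int64_t) (1024 * 1024 * 1024) / BLCKSZ_SIM) /* ~1GB */

/*
 * Called once per block processed; returns true when roughly 1GB worth
 * of blocks has gone by, at which point the leader would re-count
 * completed indexes and update its progress information.
 */
static bool
time_to_check_progress(int64_t *blocks_since_check)
{
    if (++(*blocks_since_check) >= CHECK_EVERY_BLOCKS)
    {
        *blocks_since_check = 0;
        return true;
    }
    return false;
}
```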

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-08 Thread Imseih (AWS), Sami
>Indeed.

>It might have already been discussed but other than using a new shmem
>hash for parallel vacuum, I wonder if we can allow workers to change
>the leader’s progress information. It would break the assumption that
>the backend status entry is modified by its own backend, though. But
>it might help for progress updates of other parallel operations too.
>This essentially does the same thing as what the current patch does
>but it doesn't require a new shmem hash.

I experimented with this idea, but it did not work. The idea would have been to
create a pgstat_progress_update function that takes the leader pid; however,
infrastructure does not exist to allow one backend to manipulate another
backend's status array.
pgstat_fetch_stat_beentry returns a local copy only.

>Another idea I came up with is that the parallel vacuum leader checks
>PVIndStats.status and updates how many indexes are processed to its
>progress information. The leader can check it and update the progress
>information before and after index vacuuming. And possibly we can add
>a callback to the main loop of index AM's bulkdelete and vacuumcleanup
>so that the leader can periodically make it up-to-date.

>Regards,

The PVIndStats idea is also one I experimented with, but it did not work. The
reason is that the backend checking the progress needs to do a shm_toc_lookup
to access the data, but it is not prepared to do so.

I have not considered the callback in the index AM's bulkdelete and 
vacuumcleanup, but I can imagine this is not possible since a leader could be 
busy vacuuming rather than updating counters, but I may be misunderstanding the 
suggestion.

--
Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-08 Thread Masahiko Sawada
On Wed, Mar 9, 2022 at 12:41 AM Imseih (AWS), Sami  wrote:
>
> ++/*
> ++ * vacuum_worker_init --- initialize this module's shared memory hash
> ++ * to track the progress of a vacuum worker
> ++ */
> ++void
> ++vacuum_worker_init(void)
> ++{
> ++   HASHCTL info;
> ++   long    max_table_size = GetMaxBackends();
> ++
> ++   VacuumWorkerProgressHash = NULL;
> ++
> ++   info.keysize = sizeof(pid_t);
> ++   info.entrysize = sizeof(VacProgressEntry);
> ++
> ++   VacuumWorkerProgressHash = ShmemInitHash("Vacuum Progress Hash",
> ++  max_table_size,
> ++  max_table_size,
> ++  &info,
> ++  HASH_ELEM | HASH_BLOBS);
> ++}
>
> +It seems to me that creating a shmem hash with max_table_size entries
> +for parallel vacuum process tracking is too much. IIRC an old patch
> +had parallel vacuum workers advertise its progress and changed the
> +pg_stat_progress_vacuum view so that it aggregates the results
> +including workers' stats. I think it’s better than the current one.
> +Why did you change that?
>
> +Regards,
>
> I was trying to avoid using shared memory to track completed indexes, but
> aggregating stats does not work with parallel vacuums. This is because a
> parallel worker will exit before the vacuum completes, causing the aggregated
> total to be wrong.
>
> For example
>
> Leader_pid advertises it completed 2 indexes
> Parallel worker advertises it completed 2 indexes
>
> When aggregating we see 4 indexes completed.
>
> After the parallel worker exits, the aggregation will show only 2 indexes 
> completed.

Indeed.

It might have already been discussed but other than using a new shmem
hash for parallel vacuum, I wonder if we can allow workers to change
the leader’s progress information. It would break the assumption that
the backend status entry is modified by its own backend, though. But
it might help for progress updates of other parallel operations too.
This essentially does the same thing as what the current patch does
but it doesn't require a new shmem hash.

Another idea I came up with is that the parallel vacuum leader checks
PVIndStats.status and updates how many indexes are processed to its
progress information. The leader can check it and update the progress
information before and after index vacuuming. And possibly we can add
a callback to the main loop of index AM's bulkdelete and vacuumcleanup
so that the leader can periodically make it up-to-date.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-08 Thread Imseih (AWS), Sami
++/*
++ * vacuum_worker_init --- initialize this module's shared memory hash
++ * to track the progress of a vacuum worker
++ */
++void
++vacuum_worker_init(void)
++{
++   HASHCTL info;
++   long    max_table_size = GetMaxBackends();
++
++   VacuumWorkerProgressHash = NULL;
++
++   info.keysize = sizeof(pid_t);
++   info.entrysize = sizeof(VacProgressEntry);
++
++   VacuumWorkerProgressHash = ShmemInitHash("Vacuum Progress Hash",
++
+  max_table_size,
++
+  max_table_size,
++
+  ,
++
+  HASH_ELEM | HASH_BLOBS);
++}

+It seems to me that creating a shmem hash with max_table_size entries
+for parallel vacuum process tracking is too much. IIRC an old patch
+had parallel vacuum workers advertise their progress and changed the
+pg_stat_progress_vacuum view so that it aggregates the results,
+including workers' stats. I think it's better than the current one.
+Why did you change that?

+Regards,

I was trying to avoid using shared memory to track completed indexes, but
aggregating stats does not work with parallel vacuums. This is because a
parallel worker will exit before the vacuum completes, causing the aggregated
total to be wrong.

For example

Leader_pid advertises it completed 2 indexes
Parallel worker advertises it completed 2 indexes

When aggregating we see 4 indexes completed.

After the parallel worker exits, the aggregation will show only 2 indexes 
completed. 

--
Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-07 Thread Masahiko Sawada
On Thu, Mar 3, 2022 at 2:08 PM Imseih (AWS), Sami  wrote:
>
> >>> If the failsafe kicks in midway through a vacuum, the number
> >>> indexes_total will not be reset to 0. If INDEX_CLEANUP is turned off, then
> >>> the value will be 0 at the start of the vacuum.
> >>
> >> The way that this works with num_index_scans is that we "round up"
> >> when there has been non-zero work in lazy_vacuum_all_indexes(), but
> >> not if the precheck in lazy_vacuum_all_indexes() fails. That seems
> >> like a good model to generalize from here. Note that this makes
> >> INDEX_CLEANUP=off affect num_index_scans in much the same way as a
> >> VACUUM where the failsafe kicks in very early, during the initial heap
> >> pass. That is, if the failsafe kicks in before we reach lazy_vacuum()
> >> for the first time (which is not unlikely), or even in the
> >> lazy_vacuum_all_indexes() precheck, then num_index_scans will remain
> >> at 0, just like INDEX_CLEANUP=off.
> >>
> >> The actual failsafe WARNING shows num_index_scans, possibly before it
> >> gets incremented one last time (by "rounding up"). So it's reasonably
> >> clear how this all works from that context (assuming that the
> >> autovacuum logging stuff, which reports num_index_scans, outputs a
> >> report for a table where the failsafe kicked in).
>
> > I am confused.  If failsafe kicks in during the middle of a vacuum, I
> > (perhaps naively) would expect indexes_total and indexes_processed to go to
> > zero, and I'd expect to no longer see the "vacuuming indexes" and "cleaning
> > up indexes" phases.  Otherwise, how would I know that we are now skipping
> > indexes?  Of course, you won't have any historical context about the index
> > work done before failsafe kicked in, but IMO it is misleading to still
> > include it in the progress view.
>
> After speaking with Nathan offline, the best way forward is to reset
> indexes_total and indexes_processed to 0 at the start of the "vacuuming
> indexes" or "cleaning up indexes" phase.

+1

+/*
+ * vacuum_worker_init --- initialize this module's shared memory hash
+ * to track the progress of a vacuum worker
+ */
+void
+vacuum_worker_init(void)
+{
+   HASHCTL info;
+   long    max_table_size = GetMaxBackends();
+
+   VacuumWorkerProgressHash = NULL;
+
+   info.keysize = sizeof(pid_t);
+   info.entrysize = sizeof(VacProgressEntry);
+
+   VacuumWorkerProgressHash = ShmemInitHash("Vacuum Progress Hash",
+                                            max_table_size,
+                                            max_table_size,
+                                            &info,
+                                            HASH_ELEM | HASH_BLOBS);
+}

It seems to me that creating a shmem hash with max_table_size entries
for parallel vacuum process tracking is too much. IIRC an old patch
had parallel vacuum workers advertise their progress and changed the
pg_stat_progress_vacuum view so that it aggregates the results,
including workers' stats. I think it's better than the current one.
Why did you change that?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2022-02-27 Thread Imseih (AWS), Sami
> On Wed, Feb 23, 2022 at 10:02 AM Imseih (AWS), Sami  wrote:
>> If the failsafe kicks in midway through a vacuum, the number
>> indexes_total will not be reset to 0. If INDEX_CLEANUP is turned off, then
>> the value will be 0 at the start of the vacuum.
>
> The way that this works with num_index_scans is that we "round up"
> when there has been non-zero work in lazy_vacuum_all_indexes(), but
> not if the precheck in lazy_vacuum_all_indexes() fails. That seems
> like a good model to generalize from here. Note that this makes
> INDEX_CLEANUP=off affect num_index_scans in much the same way as a
> VACUUM where the failsafe kicks in very early, during the initial heap
> pass. That is, if the failsafe kicks in before we reach lazy_vacuum()
> for the first time (which is not unlikely), or even in the
> lazy_vacuum_all_indexes() precheck, then num_index_scans will remain
> at 0, just like INDEX_CLEANUP=off.
>
> The actual failsafe WARNING shows num_index_scans, possibly before it
> gets incremented one last time (by "rounding up"). So it's reasonably
> clear how this all works from that context (assuming that the
> autovacuum logging stuff, which reports num_index_scans, outputs a
> report for a table where the failsafe kicked in).

> I am confused.  If failsafe kicks in during the middle of a vacuum, I
> (perhaps naively) would expect indexes_total and indexes_processed to go to
> zero, and I'd expect to no longer see the "vacuuming indexes" and "cleaning
> up indexes" phases.  Otherwise, how would I know that we are now skipping
> indexes?  Of course, you won't have any historical context about the index
> work done before failsafe kicked in, but IMO it is misleading to still
> include it in the progress view.

Having the failsafe kick in in the middle of a vacuum and reset "indexes_total"
to 0 would be misleading. I am thinking that it is a better idea to expose only
one column, "indexes_remaining".

If index_cleanup is set to OFF, the value of indexes_remaining will be 0 at
the start of the vacuum.
If the failsafe kicks in during an in-progress vacuum, "indexes_remaining" will
be calculated as 0.

This approach will report progress based on how many indexes remain, with no
ambiguity.

--
Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-02-25 Thread Nathan Bossart
On Wed, Feb 23, 2022 at 10:41:36AM -0800, Peter Geoghegan wrote:
> On Wed, Feb 23, 2022 at 10:02 AM Imseih (AWS), Sami  wrote:
>> If the failsafe kicks in midway through a vacuum, the number indexes_total 
>> will not be reset to 0. If INDEX_CLEANUP is turned off, then the value will 
>> be 0 at the start of the vacuum.
> 
> The way that this works with num_index_scans is that we "round up"
> when there has been non-zero work in lazy_vacuum_all_indexes(), but
> not if the precheck in lazy_vacuum_all_indexes() fails. That seems
> like a good model to generalize from here. Note that this makes
> INDEX_CLEANUP=off affect num_index_scans in much the same way as a
> VACUUM where the failsafe kicks in very early, during the initial heap
> pass. That is, if the failsafe kicks in before we reach lazy_vacuum()
> for the first time (which is not unlikely), or even in the
> lazy_vacuum_all_indexes() precheck, then num_index_scans will remain
> at 0, just like INDEX_CLEANUP=off.
> 
> The actual failsafe WARNING shows num_index_scans, possibly before it
> gets incremented one last time (by "rounding up"). So it's reasonably
> clear how this all works from that context (assuming that the
> autovacuum logging stuff, which reports num_index_scans, outputs a
> report for a table where the failsafe kicked in).

I am confused.  If failsafe kicks in during the middle of a vacuum, I
(perhaps naively) would expect indexes_total and indexes_processed to go to
zero, and I'd expect to no longer see the "vacuuming indexes" and "cleaning
up indexes" phases.  Otherwise, how would I know that we are now skipping
indexes?  Of course, you won't have any historical context about the index
work done before failsafe kicked in, but IMO it is misleading to still
include it in the progress view.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-02-23 Thread Peter Geoghegan
On Wed, Feb 23, 2022 at 10:02 AM Imseih (AWS), Sami  wrote:
> If the failsafe kicks in midway through a vacuum, the number indexes_total 
> will not be reset to 0. If INDEX_CLEANUP is turned off, then the value will 
> be 0 at the start of the vacuum.

The way that this works with num_index_scans is that we "round up"
when there has been non-zero work in lazy_vacuum_all_indexes(), but
not if the precheck in lazy_vacuum_all_indexes() fails. That seems
like a good model to generalize from here. Note that this makes
INDEX_CLEANUP=off affect num_index_scans in much the same way as a
VACUUM where the failsafe kicks in very early, during the initial heap
pass. That is, if the failsafe kicks in before we reach lazy_vacuum()
for the first time (which is not unlikely), or even in the
lazy_vacuum_all_indexes() precheck, then num_index_scans will remain
at 0, just like INDEX_CLEANUP=off.

The actual failsafe WARNING shows num_index_scans, possibly before it
gets incremented one last time (by "rounding up"). So it's reasonably
clear how this all works from that context (assuming that the
autovacuum logging stuff, which reports num_index_scans, outputs a
report for a table where the failsafe kicked in).

-- 
Peter Geoghegan




Re: Add index scan progress to pg_stat_progress_vacuum

2022-02-23 Thread Imseih (AWS), Sami
+ 
+  
+   indexes_total bigint
+  
+  
+   The number of indexes to be processed in the
+   vacuuming indexes or cleaning up indexes phase
+   of the vacuum.
+  
+ 
+
+ 
+  
+   indexes_processed bigint
+  
+  
+   The number of indexes processed in the
+   vacuuming indexes or cleaning up indexes phase.
+   At the start of an index vacuum cycle, this value is set to 0.
+  
+ 

> Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
> turned off?

If the failsafe kicks in midway through a vacuum, the number indexes_total will 
not be reset to 0. If INDEX_CLEANUP is turned off, then the value will be 0 at 
the start of the vacuum.

+typedef struct VacWorkerProgressInfo
+{
+    int num_vacuums;    /* number of active VACUUMS with parallel workers */
+    int max_vacuums;    /* max number of VACUUMS with parallel workers */
+    slock_t mutex;
+    VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
+} VacWorkerProgressInfo;

> max_vacuums appears to just be a local copy of MaxBackends.  Does this
> information really need to be stored here?  Also, is there a strong reason
> for using a spinlock instead of an LWLock?

First, the BTVacInfo code in backend/access/nbtree/nbtutils.c inspired this, so
I wanted to follow that pattern. With that said, I do see max_vacuums being
redundant here, and I am inclined to replace it with a GetMaxBackends() call.

Second, there is no strong reason to use a spinlock here except that I
incorrectly assumed it would be better for this case. After reading more about
this and reading src/backend/storage/lmgr/README, an LWLock will be better.

+void
+vacuum_worker_end(int leader_pid)
+{
+    SpinLockAcquire(&vacworkerprogress->mutex);
+    for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
+    {
+        VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
+
+        if (vac->leader_pid == leader_pid)
+        {
+            *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
+            vacworkerprogress->num_vacuums--;
+            SpinLockRelease(&vacworkerprogress->mutex);
+            return;
+        }
+    }
+    SpinLockRelease(&vacworkerprogress->mutex);
+}

> I see this loop pattern in a couple of places, and it makes me wonder if
> this information would fit more naturally in a hash table.

I followed the pattern in backend/access/nbtree/nbtutils.c for this as well.
Using dynahash may make sense here if it simplifies the code. I will look into it.

+    if (callback)
+        callback(values, 3);

> Why does this need to be set up as a callback function?  Could we just call
> the function if cmdtype == PROGRESS_COMMAND_VACUUM?  ISTM that is pretty
> much all this is doing.

The intention is for the caller to set the callback early in the function using
the existing "if (pg_strcasecmp(cmd, "VACUUM") == 0), etc." branch. This way we
avoid having to add another if/else block before tuplestore_putvalues() is
called.

--
Sami Imseih
Amazon Web Services



Re: Add index scan progress to pg_stat_progress_vacuum

2022-02-21 Thread Nathan Bossart
On Mon, Feb 21, 2022 at 07:03:39PM +, Imseih (AWS), Sami wrote:
> Sending again with patch files renamed to ensure correct apply order.

I haven't had a chance to test this too much, but I did look through the
patch set and have a couple of small comments.

+ 
+  
+   indexes_total bigint
+  
+  
+   The number of indexes to be processed in the
+   vacuuming indexes or cleaning up indexes phase
+   of the vacuum.
+  
+ 
+
+ 
+  
+   indexes_processed bigint
+  
+  
+   The number of indexes processed in the
+   vacuuming indexes or cleaning up indexes phase.
+   At the start of an index vacuum cycle, this value is set to 0.
+  
+ 

Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
turned off?

+typedef struct VacWorkerProgressInfo
+{
+    int num_vacuums;    /* number of active VACUUMS with parallel workers */
+    int max_vacuums;    /* max number of VACUUMS with parallel workers */
+    slock_t mutex;
+    VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
+} VacWorkerProgressInfo;

max_vacuums appears to just be a local copy of MaxBackends.  Does this
information really need to be stored here?  Also, is there a strong reason
for using a spinlock instead of an LWLock?

+void
+vacuum_worker_end(int leader_pid)
+{
+    SpinLockAcquire(&vacworkerprogress->mutex);
+    for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
+    {
+        VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
+
+        if (vac->leader_pid == leader_pid)
+        {
+            *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
+            vacworkerprogress->num_vacuums--;
+            SpinLockRelease(&vacworkerprogress->mutex);
+            return;
+        }
+    }
+    SpinLockRelease(&vacworkerprogress->mutex);
+}

I see this loop pattern in a couple of places, and it makes me wonder if
this information would fit more naturally in a hash table.

+    if (callback)
+        callback(values, 3);

Why does this need to be set up as a callback function?  Could we just call
the function if cmdtype == PROGRESS_COMMAND_VACUUM?  ISTM that is pretty
much all this is doing.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-01-12 Thread Imseih (AWS), Sami
On 1/12/22, 1:28 PM, "Bossart, Nathan"  wrote:

On 1/11/22, 11:46 PM, "Masahiko Sawada"  wrote:
> Regarding the new pg_stat_progress_vacuum_index view, why do we need
> to have a separate view? Users will have to check two views. If this
> view is expected to be used together with and joined to
> pg_stat_progress_vacuum, why don't we provide one view that has full
> information from the beginning? Especially, I think it's not useful
> that the total number of indexes to vacuum (num_indexes_to_vacuum
> column) and the current number of indexes that have been vacuumed
> (index_ordinal_position column) are shown in separate views.

> I suppose we could add all of the new columns to
> pg_stat_progress_vacuum and just set columns to NULL as appropriate.
> But is that really better than having a separate view?

To add: since a vacuum can utilize parallel worker processes plus the main
vacuum process to perform index vacuuming, it made sense to separate the
backends doing index vacuum/cleanup into their own view.
Besides what Nathan suggested, the only other clean option I can think of is to
create a json column in pg_stat_progress_vacuum which would include all the new
fields. My concern with this approach is usability: users would have to flatten
the JSON themselves.

> Also, I’m not sure how useful index_tuples_removed is; what can we
> infer from this value (without a total number)?

> I think the idea was that you can compare it against max_dead_tuples
> and num_dead_tuples to get an estimate of the current cycle progress.
> Otherwise, it just shows that progress is being made.

The main purpose is really to show that the "index vacuum" phase is actually
making progress. Note that for certain types of indexes, e.g. GIN/GiST, the
number of tuples_removed will end up exceeding num_dead_tuples.




