Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Jul 6, 2023 at 2:28 PM Masahiko Sawada wrote: > > On Thu, Jul 6, 2023 at 11:15 AM Michael Paquier wrote: > > > > On Thu, Jul 06, 2023 at 11:07:14AM +0900, Masahiko Sawada wrote: > > > Thank you for updating the patch. It looks good to me too. I've > > > updated the commit message. > > > > Thanks. I was planning to review this patch today and/or tomorrow now > > that my stack of things to do is getting slightly lower (registered my > > name as committer as well a few weeks ago so as not to forget). > > > > One thing I was planning to do is to move the new message processing > > API for the incremental updates into its own commit for clarity, as > > that's a separate concept from the actual feature, useful on its own. > > +1. I had the same idea. Please find the attached patches. > > > Anyway, would you prefer taking care of it? > > I can take care of it if you're okay. > Pushed. -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Jul 6, 2023 at 11:15 AM Michael Paquier wrote: > > On Thu, Jul 06, 2023 at 11:07:14AM +0900, Masahiko Sawada wrote: > > Thank you for updating the patch. It looks good to me too. I've > > updated the commit message. > > Thanks. I was planning to review this patch today and/or tomorrow now > that my stack of things to do is getting slightly lower (registered my > name as committer as well a few weeks ago so as not to forget). > > One thing I was planning to do is to move the new message processing > API for the incremental updates into its own commit for clarity, as > that's a separate concept from the actual feature, useful on its own. +1. I had the same idea. Please find the attached patches. > Anyway, would you prefer taking care of it? I can take care of it if you're okay. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com v31-0002-Report-index-vacuum-progress.patch Description: Binary data v31-0001-Add-new-parallel-message-type-to-progress-report.patch Description: Binary data
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Jul 06, 2023 at 11:07:14AM +0900, Masahiko Sawada wrote: > Thank you for updating the patch. It looks good to me too. I've > updated the commit message. Thanks. I was planning to review this patch today and/or tomorrow now that my stack of things to do is getting slightly lower (registered my name as committer as well a few weeks ago so as not to forget). One thing I was planning to do is to move the new message processing API for the incremental updates into its own commit for clarity, as that's a separate concept from the actual feature, useful on its own. Anyway, would you prefer taking care of it? -- Michael signature.asc Description: PGP signature
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Apr 12, 2023 at 9:22 PM Imseih (AWS), Sami wrote: > > > This should be OK (also checked the code paths where the reports are > > added). Note that the patch needed a few adjustments for its > > indentation. > > Thanks for the formatting corrections! This looks good to me. Thank you for updating the patch. It looks good to me too. I've updated the commit message. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com From f92929f21d4052cfbc3b068ca5b8be954741da3d Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Wed, 12 Apr 2023 13:45:59 +0900 Subject: [PATCH v30] Report index vacuum progress. This commit adds two columns, indexes_total and indexes_processed, to the pg_stat_progress_vacuum system view to show the index vacuum progress. These numbers are reported in the "vacuuming indexes" and "cleaning up indexes" phases. It also adds a new type of parallel message, 'P', which is used to convey the progress updates made by parallel workers to the leader process. The parallel workers' progress updates are therefore reflected in an asynchronous manner. Currently it supports only incremental progress reporting, but it's possible to extend it to support other backend progress APIs in the future. 
Authors: Sami Imseih Reviewed by: Masahiko Sawada, Michael Paquier, Nathan Bossart, Andres Freund Discussion: https://www.postgresql.org/message-id/flat/5478DFCD-2333-401A-B2F0-0D186AB09228@amazon.com --- doc/src/sgml/monitoring.sgml | 23 ++ src/backend/access/heap/vacuumlazy.c | 70 --- src/backend/access/transam/parallel.c | 18 + src/backend/catalog/system_views.sql | 3 +- src/backend/commands/vacuumparallel.c | 9 ++- src/backend/utils/activity/backend_progress.c | 32 + src/include/commands/progress.h | 2 + src/include/utils/backend_progress.h | 1 + src/test/regress/expected/rules.out | 4 +- 9 files changed, 150 insertions(+), 12 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 506aeaa879..588b720f57 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -6110,6 +6110,29 @@ FROM pg_stat_get_backend_idset() AS backendid; Number of dead tuples collected since the last index vacuum cycle. + + + + indexes_total bigint + + + Total number of indexes that will be vacuumed or cleaned up. This + number is reported at the beginning of the + vacuuming indexes phase or the + cleaning up indexes phase. + + + + + + indexes_processed bigint + + + Number of indexes processed. This counter only advances when the + phase is vacuuming indexes or + cleaning up indexes. 
+ + diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 4eb953f904..6a41ee635d 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -2319,6 +2319,17 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) { bool allindexes = true; double old_live_tuples = vacrel->rel->rd_rel->reltuples; + const int progress_start_index[] = { + PROGRESS_VACUUM_PHASE, + PROGRESS_VACUUM_INDEXES_TOTAL + }; + const int progress_end_index[] = { + PROGRESS_VACUUM_INDEXES_TOTAL, + PROGRESS_VACUUM_INDEXES_PROCESSED, + PROGRESS_VACUUM_NUM_INDEX_VACUUMS + }; + int64 progress_start_val[2]; + int64 progress_end_val[3]; Assert(vacrel->nindexes > 0); Assert(vacrel->do_index_vacuuming); @@ -2331,9 +2342,13 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) return false; } - /* Report that we are now vacuuming indexes */ - pgstat_progress_update_param(PROGRESS_VACUUM_PHASE, - PROGRESS_VACUUM_PHASE_VACUUM_INDEX); + /* + * Report that we are now vacuuming indexes and the number of indexes to + * vacuum. + */ + progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX; + progress_start_val[1] = vacrel->nindexes; + pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val); if (!ParallelVacuumIsActive(vacrel)) { @@ -2346,6 +2361,10 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) old_live_tuples, vacrel); + /* Report the number of indexes vacuumed */ + pgstat_progress_update_param(PROGRESS_VACUUM_INDEXES_PROCESSED, + idx + 1); + if (lazy_check_wraparound_failsafe(vacrel)) { /* Wraparound emergency -- end current index scan */ @@ -2380,14 +2399,17 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) Assert(allindexes || VacuumFailsafeActive); /* - * Increase and report the number of index scans. + * Increase and report the number of index scans. Also, we reset + * PROGRESS_VACUUM_INDEXES_TOTAL and PROGRESS_VACUUM_INDEXES_PROCESSED. * * We deliberately include the case where we started a round of bulk * deletes that
Re: Add index scan progress to pg_stat_progress_vacuum
> This should be OK (also checked the code paths where the reports are > added). Note that the patch needed a few adjustments for its > indentation. Thanks for the formatting corrections! This looks good to me. -- Sami
Re: Add index scan progress to pg_stat_progress_vacuum
On Mon, Apr 10, 2023 at 07:20:42PM +0000, Imseih (AWS), Sami wrote: > See v28 addressing the comments. This should be OK (also checked the code paths where the reports are added). Note that the patch needed a few adjustments for its indentation. -- Michael From 9267342ba2a1b8b120efaa98871b736a5bbde9a9 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Wed, 12 Apr 2023 13:45:59 +0900 Subject: [PATCH v29] Report index vacuum progress. Add 2 new columns to pg_stat_progress_vacuum. The columns are indexes_total as the total indexes to be vacuumed or cleaned and indexes_processed as the number of indexes vacuumed or cleaned up so far. Authors: Sami Imseih Reviewed by: Masahiko Sawada, Michael Paquier, Nathan Bossart, Andres Freund Discussion: https://www.postgresql.org/message-id/flat/5478dfcd-2333-401a-b2f0-0d186ab09...@amazon.com --- src/include/commands/progress.h | 2 + src/include/utils/backend_progress.h | 1 + src/backend/access/heap/vacuumlazy.c | 70 --- src/backend/access/transam/parallel.c | 18 + src/backend/catalog/system_views.sql | 3 +- src/backend/commands/vacuumparallel.c | 9 ++- src/backend/utils/activity/backend_progress.c | 32 + src/test/regress/expected/rules.out | 4 +- doc/src/sgml/monitoring.sgml | 23 ++ 9 files changed, 150 insertions(+), 12 deletions(-) diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h index e5add41352..2478e87425 100644 --- a/src/include/commands/progress.h +++ b/src/include/commands/progress.h @@ -25,6 +25,8 @@ #define PROGRESS_VACUUM_NUM_INDEX_VACUUMS 4 #define PROGRESS_VACUUM_MAX_DEAD_TUPLES 5 #define PROGRESS_VACUUM_NUM_DEAD_TUPLES 6 +#define PROGRESS_VACUUM_INDEXES_TOTAL 7 +#define PROGRESS_VACUUM_INDEXES_PROCESSED 8 /* Phases of vacuum (as advertised via PROGRESS_VACUUM_PHASE) */ #define PROGRESS_VACUUM_PHASE_SCAN_HEAP 1 diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h index a84752ade9..70dea55fc0 100644 --- a/src/include/utils/backend_progress.h +++ 
b/src/include/utils/backend_progress.h @@ -37,6 +37,7 @@ extern void pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid); extern void pgstat_progress_update_param(int index, int64 val); extern void pgstat_progress_incr_param(int index, int64 incr); +extern void pgstat_progress_parallel_incr_param(int index, int64 incr); extern void pgstat_progress_update_multi_param(int nparam, const int *index, const int64 *val); extern void pgstat_progress_end_command(void); diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 0a9ebd22bd..e5707098bd 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -2314,6 +2314,17 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) { bool allindexes = true; double old_live_tuples = vacrel->rel->rd_rel->reltuples; + const int progress_start_index[] = { + PROGRESS_VACUUM_PHASE, + PROGRESS_VACUUM_INDEXES_TOTAL + }; + const int progress_end_index[] = { + PROGRESS_VACUUM_INDEXES_TOTAL, + PROGRESS_VACUUM_INDEXES_PROCESSED, + PROGRESS_VACUUM_NUM_INDEX_VACUUMS + }; + int64 progress_start_val[2]; + int64 progress_end_val[3]; Assert(vacrel->nindexes > 0); Assert(vacrel->do_index_vacuuming); @@ -2326,9 +2337,13 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) return false; } - /* Report that we are now vacuuming indexes */ - pgstat_progress_update_param(PROGRESS_VACUUM_PHASE, - PROGRESS_VACUUM_PHASE_VACUUM_INDEX); + /* + * Report that we are now vacuuming indexes and the number of indexes to + * vacuum. 
+ */ + progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX; + progress_start_val[1] = vacrel->nindexes; + pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val); if (!ParallelVacuumIsActive(vacrel)) { @@ -2341,6 +2356,10 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) old_live_tuples, vacrel); + /* Report the number of indexes vacuumed */ + pgstat_progress_update_param(PROGRESS_VACUUM_INDEXES_PROCESSED, + idx + 1); + if (lazy_check_wraparound_failsafe(vacrel)) { /* Wraparound emergency -- end current index scan */ @@ -2375,14 +2394,17 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) Assert(allindexes || VacuumFailsafeActive); /* - * Increase and report the number of index scans. + * Increase and report the number of index scans. Also, we reset + * PROGRESS_VACUUM_INDEXES_TOTAL and PROGRESS_VACUUM_INDEXES_PROCESSED. * * We deliberately include the case where we started a round of bulk * deletes that we weren't able to finish due to the failsafe triggering. */ vacrel->num_index_scans++; - pgstat_progress_update_param(PROGRESS_VACUUM_NUM_INDEX_VACUUMS, - vacrel->num_index_scans); + progress_end_val[0]
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, On 2023-04-10 08:14:18 +0900, Michael Paquier wrote: > On Fri, Apr 07, 2023 at 12:01:17PM -0700, Andres Freund wrote: > > Why would it mean that? Parallel workers are updated together with the > > leader, > > so there's no compatibility issue? > > My point is that the callback system would still need to be maintained > in a stable branch, and, while useful, it could be used for much more > than it is originally written. I guess that this could be used in > custom nodes with their own custom parallel nodes. Hm, I'm somewhat doubtful that that's something we should encourage. And doubtful we'd get it right without a concrete use case at hand to verify the design. Greetings, Andres Freund
Re: Add index scan progress to pg_stat_progress_vacuum
> + case 'P': /* Parallel progress reporting */ I kept this comment as-is, but inside the case code block I added more comments. This is to avoid cluttering up the one-liner comment. > + * Increase and report the number of index scans. Also, we reset the progress > + * counters. > The counters reset are the two index counts, perhaps this comment > should mention this fact. Yes, since we are using the multi_param API here, it makes sense to mention the progress fields being reset in the comments. + /* update progress */ + int index = pq_getmsgint(msg, 4); + int incr = pq_getmsgint(msg, 1); [...] + pq_beginmessage(&progress_message, 'P'); + pq_sendint32(&progress_message, index); + pq_sendint64(&progress_message, incr); + pq_endmessage(&progress_message); > It seems to me that the receiver side is missing one pq_getmsgend()? Yes. I added this. > incr is defined and sent as an int64 on the sender side, hence the > receiver should use pq_getmsgint64(), no? pq_getmsgint(msg, 1) means > to receive only one byte, see pqformat.c. Ah correct, incr is an int64 so what we need is: int64 incr = pq_getmsgint64(msg); I also added the pq_getmsgend call. > And the order is reversed? I don't think so. The index and then incr are sent, and they are read back in the same order. Testing the patch shows the value increments correctly. See v28 addressing the comments. Regards, Sami Imseih AWS (Amazon Web Services) v28-0001-Report-index-vacuum-progress.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Fri, Apr 07, 2023 at 12:01:17PM -0700, Andres Freund wrote: > Why would it mean that? Parallel workers are updated together with the leader, > so there's no compatibility issue? My point is that the callback system would still need to be maintained in a stable branch, and, while useful, it could be used for much more than what it was originally written for. I guess that this could be used in custom nodes with their own custom parallel nodes. -- Michael
Re: Add index scan progress to pg_stat_progress_vacuum
On Fri, Apr 07, 2023 at 07:27:17PM +0000, Imseih (AWS), Sami wrote: > If called by a worker, it will send a 'P' message to the front end > passing both the progress index, i.e. PROGRESS_VACUUM_INDEXES_PROCESSED > and the value to increment by, i.e. 1 for index vacuum progress. > > With that, the additional shared memory counters in ParallelVacuumState > are not needed, and the poke of the worker to the leader goes directly > through a generic backend_progress API. Thanks for the new version. This has unfortunately not been able to make the cut for v16, but let's see it done at the beginning of the v17 cycle. +void +pgstat_progress_parallel_incr_param(int index, int64 incr) +{ + /* +* Parallel workers notify a leader through a 'P' +* protocol message to update progress, passing the +* progress index and increment value. Leaders can +* just call pgstat_progress_incr_param directly. +*/ + if (IsParallelWorker()) + { + static StringInfoData progress_message; + + initStringInfo(&progress_message); + + pq_beginmessage(&progress_message, 'P'); + pq_sendint32(&progress_message, index); + pq_sendint64(&progress_message, incr); + pq_endmessage(&progress_message); + } + else + pgstat_progress_incr_param(index, incr); +} I see. You need to handle both the leader and worker case because parallel_vacuum_process_one_index() can be called by either of them. + case 'P': /* Parallel progress reporting */ Perhaps this comment should say that this is only about incremental progress reporting, for the moment. +* Increase and report the number of index scans. Also, we reset the progress +* counters. The counters reset are the two index counts, perhaps this comment should mention this fact. + /* update progress */ + int index = pq_getmsgint(msg, 4); + int incr = pq_getmsgint(msg, 1); [...] + pq_beginmessage(&progress_message, 'P'); + pq_sendint32(&progress_message, index); + pq_sendint64(&progress_message, incr); + pq_endmessage(&progress_message); It seems to me that the receiver side is missing one pq_getmsgend()? 
incr is defined and sent as an int64 on the sender side, hence the receiver should use pq_getmsgint64(), no? pq_getmsgint(msg, 1) means to receive only one byte, see pqformat.c. And the order is reversed? There may be a case in the future about making 'P' more complicated with more arguments, but what you have here should be sufficient for your use-case. Were there plans to increment more data for some different and/or new progress indexes in the VACUUM path, by the way? Most of that looked a bit tricky to me as this was AM-dependent, but I may have missed something. -- Michael
Re: Add index scan progress to pg_stat_progress_vacuum
> The arguments of pgstat_progress_update_param() would be given by the > worker directly as components of the 'P' message. It seems to me that > this approach would have the simplicity to not require the setup of a > shmem area for the extra counters, and there would be no need for a > callback. Hence, the only thing the code paths of workers would need > to do is to call this routine, then the leaders would increment their > progress when they see a CFI to process the 'P' message. Also, I > guess that we would only need an interface in backend_progress.c to > increment counters, like pgstat_progress_incr_param(), but usable by > workers. Like a pgstat_progress_worker_incr_param()? So, here is what I think should be workable to give a generic progress interface. pgstat_progress_parallel_incr_param will be a new API that can be called by either worker or leader from any parallel code path that chooses to increment a progress index. If called by a worker, it will send a 'P' message to the front end passing both the progress index, i.e. PROGRESS_VACUUM_INDEXES_PROCESSED, and the value to increment by, i.e. 1 for index vacuum progress. With that, the additional shared memory counters in ParallelVacuumState are not needed, and the poke of the worker to the leader goes directly through a generic backend_progress API. Let me know your thoughts. Thanks! -- Sami Imseih Amazon Web Services (AWS) v27-0001-Report-index-vacuum-progress.patch
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, On 2023-04-06 12:28:04 +0900, Michael Paquier wrote: > As some say, the introduction of a new message type in pqmq.c would be > basically a one-way door, because we'd have to maintain it in a stable > branch. Why would it mean that? Parallel workers are updated together with the leader, so there's no compatibility issue? Greetings, Andres Freund
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Apr 06, 2023 at 03:14:20PM +0000, Imseih (AWS), Sami wrote: >> Could it be worth thinking about a different design where >> the value incremented and the parameters of >> pgstat_progress_update_param() are passed through the 'P' message >> instead? > > I am not sure how this is different than the approach suggested. > In the current design, the 'P' message is used to pass the > ParallelVacuumState to parallel_vacuum_update_progress which then > calls pgstat_progress_update_param. The arguments of pgstat_progress_update_param() would be given by the worker directly as components of the 'P' message. It seems to me that this approach would have the simplicity to not require the setup of a shmem area for the extra counters, and there would be no need for a callback. Hence, the only thing the code paths of workers would need to do is to call this routine, then the leaders would increment their progress when they see a CFI to process the 'P' message. Also, I guess that we would only need an interface in backend_progress.c to increment counters, like pgstat_progress_incr_param(), but usable by workers. Like a pgstat_progress_worker_incr_param()? -- Michael
Re: Add index scan progress to pg_stat_progress_vacuum
> As one thing, > for example, it introduces a dependency to parallel.h to do progress > reporting without touching at backend_progress.h. Containing the logic in backend_progress.h is a reasonable point from a maintenance standpoint. We can create a new function in backend_progress.h called pgstat_progress_update_leader which is called from vacuumparallel.c. pgstat_progress_update_leader can then call pq_putmessage('P', NULL, 0). > Is a callback > approach combined with a counter in shared memory the best thing there > could be? It seems to be the best way. The shared memory, ParallelVacuumState, is already tracking the counters for the Parallel Vacuum. Also, the callback in ParallelContext is the only way I can see to let the 'P' message know what to do for updating progress to the leader. > Could it be worth thinking about a different design where > the value incremented and the parameters of > pgstat_progress_update_param() are passed through the 'P' message > instead? I am not sure how this is different than the approach suggested. In the current design, the 'P' message is used to pass the ParallelVacuumState to parallel_vacuum_update_progress which then calls pgstat_progress_update_param. > It strikes me that gathering data in the leader from a poke > of the workers is something that could be useful in so much more areas > than just the parallel index operations done in a vacuum because we do > more and more things in parallel these days, so the API interface > ought to have some attention. We may need an interface that does more than progress reporting, but I am not sure what those use cases are at this point, besides progress reporting. > There is also an argument where we could > have each worker report their progress independently of the leader? In this case, we don't need ParallelContext at all or to go through the 'P' message. -- Regards, Sami Imseih Amazon Web Services (AWS)
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Apr 05, 2023 at 02:31:54PM +, Imseih (AWS), Sami wrote: >> That seems a valid argument. I was thinking that such an asynchronous >> state update mechanism would be a good infrastructure for progress >> reporting of parallel operations. It might be worth considering to use >> it in more general usage but since the current implementation is >> minimal can we extend it in the future when we need it for other use >> cases? > > I don't think we should delay this patch to design a more general > infrastructure. I agree this can be handled by a future requirement. Not so sure to agree on that. As the patch stands, we have a rather generally-purposed new message type and facility (callback trigger poke from workers to their leader) used for a not-so-general purpose, while being documented under this not-so-general purpose, which is progress reporting. I agree that relying on pqmq.c to force the leader to be more sensible to refreshes is sensible, because it is cheap, but the interface seems kind of misplaced to me. As one thing, for example, it introduces a dependency to parallel.h to do progress reporting without touching at backend_progress.h. Is a callback approach combined with a counter in shared memory the best thing there could be? Could it be worth thinking about a different design where the value incremented and the parameters of pgstat_progress_update_param() are passed through the 'P' message instead? It strikes me that gathering data in the leader from a poke of the workers is something that could be useful in so much more areas than just the parallel index operations done in a vacuum because we do more and more things in parallel these days, so the API interface ought to have some attention. As some say, the introduction of a new message type in pqmq.c would be basically a one-way door, because we'd have to maintain it in a stable branch. I would not take that lightly. 
One idea of interface that could be used is an extra set of APIs for workers to do progress reporting, part of backend_progress.h, where we use a pqmq message type in a centralized location, say something like a pgstat_progress_parallel_incr_param(). About the callback interface, we may also want to be more careful about more things, like the handling of callback chains, or even unregistrations of callbacks? There could be much more uses to that than just progress reporting, though this comes to a balance of what the leader needs to do before the workers are able to poke at it on a periodic basis to make the refresh of the aggregated progress reporting data more verbose. There is also an argument where we could have each worker report their progress independently of the leader? -- Michael
Re: Add index scan progress to pg_stat_progress_vacuum
> > The key point of the patch is here. From what I understand based on > > the information of the thread, this is used as a way to make the > progress reporting done by the leader more responsive so as we'd > > update the index counters each time the leader is poked at with a 'P' > > message by one of its workers, once a worker is done with the parallel > > cleanup of one of the indexes. That's appealing, because this design > > is responsive and cheap, while we'd rely on CFIs to make sure that the > > leader triggers its callback on a timely basis. Unfortunately, > > sticking a concept of "Parallel progress reporting" is rather > > confusing here? This stuff can be used for much more purposes than > > just progress reporting: the leader would execute the callback it has > > registered based on the timing given by one or more of its workers, > > these willing to push an event on the leader. Progress reporting is > > one application of that to force a refresh and make the information of > > the leader accurate. What about things like a chain of callbacks, for > > example? Could the leader want to register more than one callback and > > act on all of them with one single P message? > That seems a valid argument. I was thinking that such an asynchronous > state update mechanism would be a good infrastructure for progress > reporting of parallel operations. It might be worth considering to use > it in more general usage but since the current implementation is > minimal can we extend it in the future when we need it for other use > cases? I don't think we should delay this patch to design a more general infrastructure. I agree this can be handled by a future requirement. > > > > Another question I have: could the reporting of each individual worker > > make sense on its own? 
The cleanup of the indexes depends on the > > order they are processed, their number, size and AM with their cleanup > > strategy, still there may be a point in getting information about how > > much work a single worker is doing rather than just have the > > aggregated information given to the leader? > It would also be useful information for users but I don't think it can > replace the aggregated information. The aggregated information can > answer the question from the user like "how many indexes to vacuum are > remaining?", which helps estimate the remaining time to complete. The original intention of the thread was to expose stats for both aggregate (leader level) and individual index progress. Both the aggregate and individual indexes information have benefit as mentioned by Sawada-san. For the individual index progress, a patch was suggested earlier in the thread, v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch. However, since this particular thread has focused mainly on the aggregated stats work, my thoughts have been to start a new thread for the individual index progress once this gets committed. > > Btw, is an assertion really helpful here? If > > parallel_progress_callback is not set, we'd just crash one line > > after. It seems to me that it could be cleaner to do nothing if a > > leader gets a poke message from a worker if there are no callbacks > > registered. > Agreed. I removed the assert and added an if condition instead. See the attached v26 please. Regards, -- Sami Imseih Amazon Web Services (AWS) v26-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Apr 5, 2023 at 4:47 PM Michael Paquier wrote: > > On Fri, Feb 24, 2023 at 03:16:10PM +0900, Masahiko Sawada wrote: > > Thanks! It looks good to me so I've marked it as Ready for Committer. > > + case 'P': /* Parallel progress reporting */ > + { > + /* Call the progress reporting callback */ > + Assert(pcxt->parallel_progress_callback); > + > pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg); > + > + break; > + } > > The key point of the patch is here. From what I understand based on > the information of the thread, this is used as a way to make the > progress reporting done by the leader more responsive so as we'd > update the index counters each time the leader is poked at with a 'P' > message by one of its workers, once a worker is done with the parallel > cleanup of one of the indexes. That's appealing, because this design > is responsive and cheap, while we'd rely on CFIs to make sure that the > leader triggers its callback on a timely basis. Unfortunately, > sticking a concept of "Parallel progress reporting" is rather > confusing here? This stuff can be used for much more purposes than > just progress reporting: the leader would execute the callback it has > registered based on the timing given by one or more of its workers, > these willing to push an event on the leader. Progress reporting is > one application of that to force a refresh and make the information of > the leader accurate. What about things like a chain of callbacks, for > example? Could the leader want to register more than one callback and > act on all of them with one single P message? That seems a valid argument. I was thinking that such an asynchronous state update mechanism would be a good infrastructure for progress reporting of parallel operations. It might be worth considering to use it in more general usage but since the current implementation is minimal can we extend it in the future when we need it for other use cases? 
> > Another question I have: could the reporting of each individual worker > make sense on its own? The cleanup of the indexes depends on the > order they are processed, their number, size and AM with their cleanup > strategy, still there may be a point in getting information about how > much work a single worker is doing rather than just have the > aggregated information given to the leader? It would also be useful information for users but I don't think it can replace the aggregated information. The aggregated information can answer the question from the user like "how many indexes to vacuum are remaining?", which helps estimate the remaining time to complete. > > Btw, is an assertion really helpful here? If > parallel_progress_callback is not set, we'd just crash one line > after. It seems to me that it could be cleaner to do nothing if a > leader gets a poke message from a worker if there are no callbacks > registered. Agreed. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
On Fri, Feb 24, 2023 at 03:16:10PM +0900, Masahiko Sawada wrote: > Thanks! It looks good to me so I've marked it as Ready for Committer. + case 'P': /* Parallel progress reporting */ + { + /* Call the progress reporting callback */ + Assert(pcxt->parallel_progress_callback); + pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg); + + break; + } The key point of the patch is here. From what I understand based on the information of the thread, this is used as a way to make the progress reporting done by the leader more responsive so as we'd update the index counters each time the leader is poked at with a 'P' message by one of its workers, once a worker is done with the parallel cleanup of one of the indexes. That's appealing, because this design is responsive and cheap, while we'd rely on CFIs to make sure that the leader triggers its callback on a timely basis. Unfortunately, sticking a concept of "Parallel progress reporting" is rather confusing here? This stuff can be used for much more purposes than just progress reporting: the leader would execute the callback it has registered based on the timing given by one or more of its workers, these willing to push an event on the leader. Progress reporting is one application of that to force a refresh and make the information of the leader accurate. What about things like a chain of callbacks, for example? Could the leader want to register more than one callback and act on all of them with one single P message? Another question I have: could the reporting of each individual worker make sense on its own? The cleanup of the indexes depends on the order they are processed, their number, size and AM with their cleanup strategy, still there may be a point in getting information about how much work a single worker is doing rather than just have the aggregated information given to the leader? Btw, Is an assertion really helpful here? If parallel_progress_callback is not set, we'd just crash one line after. 
It seems to me that it could be cleaner to do nothing if a leader gets a poke message from a worker if there are no callbacks registered. -- Michael
Re: Add index scan progress to pg_stat_progress_vacuum
On Tue, Feb 21, 2023 at 1:48 AM Imseih (AWS), Sami wrote: > > Thanks! > > > I think PROGRESS_VACUUM_INDEXES_TOTAL and > > PROGRESS_VACUUM_INDEXES_PROCESSED are better for consistency. The rest > > looks good to me. > > Took care of that in v25. Thanks! It looks good to me so I've marked it as Ready for Committer. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
Thanks! > I think PROGRESS_VACUUM_INDEXES_TOTAL and > PROGRESS_VACUUM_INDEXES_PROCESSED are better for consistency. The rest > looks good to me. Took care of that in v25. Regards -- Sami Imseih Amazon Web Services v25-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Sat, Feb 18, 2023 at 11:46 AM Imseih (AWS), Sami wrote: > > Thanks for the review! > > >+ > >+ ParallelVacuumFinish > >+ Waiting for parallel vacuum workers to finish index > >vacuum. > >+ > > >This change is out-of-date. > > That was an oversight. Thanks for catching. > > >Total number of indexes that will be vacuumed or cleaned up. This > >number is reported as of the beginning of the vacuuming indexes phase > >or the cleaning up indexes phase. > > This is cleaner. I was being unnecessarily verbose in the original > description. > > >Number of indexes processed. This counter only advances when the phase > >is vacuuming indexes or cleaning up indexes. > > I agree. > > >Also, index_processed sounds accurate to me. What do you think? > > At one point, II used index_processed, but decided to change it. > "processed" makes sense also. I will use this. > > >I think these settings are not necessary since the pcxt is palloc0'ed. > > Good point. > > >Assert(pvs->pcxt->parallel_progress_callback_arg) looks wrong to me. > >If 'arg' is NULL, a SEGV happens. > > Correct, Assert(pvs) is all that is needed. > > >I think it's better to update pvs->shared->nindexes_completed by both > >leader and worker processes who processed the index. > > No reason for that, since only the leader process can report process to > backend_progress. > > > >I think it's better to make the function type consistent with the > >existing parallel_worker_main_type. How about > >parallel_progress_callback_type? > > Yes, that makes sense. > > > I've attached a patch that incorporates the above comments and has > >some suggestions of updating comments etc. > > I reviewed and incorporated these changes, with a slight change. See v24. > > -* Increase and report the number of index. Also, we reset the > progress > -* counters. > > > +* Increase and report the number of index scans. Also, we reset the > progress > +* counters. > > > Thanks Thanks for updating the patch! 
#define PROGRESS_VACUUM_NUM_INDEX_VACUUMS 4 #define PROGRESS_VACUUM_MAX_DEAD_TUPLES5 #define PROGRESS_VACUUM_NUM_DEAD_TUPLES6 +#define PROGRESS_VACUUM_INDEX_TOTAL 7 +#define PROGRESS_VACUUM_INDEX_PROCESSED 8 -s.param7 AS num_dead_tuples +s.param7 AS num_dead_tuples, +s.param8 AS indexes_total, +s.param9 AS indexes_processed I think PROGRESS_VACUUM_INDEXES_TOTAL and PROGRESS_VACUUM_INDEXES_PROCESSED are better for consistency. The rest looks good to me. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
Thanks for the review! >+ >+ ParallelVacuumFinish >+ Waiting for parallel vacuum workers to finish index >vacuum. >+ >This change is out-of-date. That was an oversight. Thanks for catching. >Total number of indexes that will be vacuumed or cleaned up. This >number is reported as of the beginning of the vacuuming indexes phase >or the cleaning up indexes phase. This is cleaner. I was being unnecessarily verbose in the original description. >Number of indexes processed. This counter only advances when the phase >is vacuuming indexes or cleaning up indexes. I agree. >Also, index_processed sounds accurate to me. What do you think? At one point, I used index_processed, but decided to change it. "processed" makes sense also. I will use this. >I think these settings are not necessary since the pcxt is palloc0'ed. Good point. >Assert(pvs->pcxt->parallel_progress_callback_arg) looks wrong to me. >If 'arg' is NULL, a SEGV happens. Correct, Assert(pvs) is all that is needed. >I think it's better to update pvs->shared->nindexes_completed by both >leader and worker processes who processed the index. No reason for that, since only the leader process can report progress to backend_progress. >I think it's better to make the function type consistent with the >existing parallel_worker_main_type. How about >parallel_progress_callback_type? Yes, that makes sense. > I've attached a patch that incorporates the above comments and has >some suggestions of updating comments etc. I reviewed and incorporated these changes, with a slight change. See v24. -* Increase and report the number of index. Also, we reset the progress -* counters. +* Increase and report the number of index scans. Also, we reset the progress +* counters. Thanks -- Sami Imseih Amazon Web Services (AWS) v24-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Feb 8, 2023 at 11:03 AM Imseih (AWS), Sami wrote: > > Hi, > > Thanks for your reply! > > I addressed the latest comments in v23. > > 1/ cleaned up the asserts as discussed. > 2/ used pq_putmessage to send the message on index scan completion. Thank you for updating the patch! Here are some comments for v23 patch: + + ParallelVacuumFinish + Waiting for parallel vacuum workers to finish index vacuum. + This change is out-of-date. --- + + + indexes_total bigint + + + Number of indexes that will be vacuumed or cleaned up. This value will be + 0 if the phase is not vacuuming indexes + or cleaning up indexes, INDEX_CLEANUP + is set to OFF, index vacuum is skipped due to very + few dead tuples in the table, or vacuum failsafe is triggered. + See + for more on vacuum failsafe. + + This explanation looks redundant: setting INDEX_CLEANUP to OFF essentially means the vacuum doesn't enter the vacuuming indexes phase. The same is true for the case of skipping index vacuum and failsafe mode. How about the following? Total number of indexes that will be vacuumed or cleaned up. This number is reported as of the beginning of the vacuuming indexes phase or the cleaning up indexes phase. --- + + + indexes_completed bigint + + + Number of indexes vacuumed in the current vacuum cycle when the + phase is vacuuming indexes, or the number + of indexes cleaned up during the cleaning up indexes + phase. + + How about simplifying the description to something like: Number of indexes processed. This counter only advances when the phase is vacuuming indexes or cleaning up indexes. Also, index_processed sounds accurate to me. What do you think? --- +pcxt->parallel_progress_callback = NULL; +pcxt->parallel_progress_callback_arg = NULL; I think these settings are not necessary since the pcxt is palloc0'ed. 
--- +void +parallel_vacuum_update_progress(void *arg) +{ +ParallelVacuumState *pvs = (ParallelVacuumState *)arg; + +Assert(!IsParallelWorker()); +Assert(pvs->pcxt->parallel_progress_callback_arg); + +if (pvs) +pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, + pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1)); +} Assert(pvs->pcxt->parallel_progress_callback_arg) looks wrong to me. If 'arg' is NULL, a SEGV happens. I think it's better to update pvs->shared->nindexes_completed by both leader and worker processes who processed the index. --- +/* progress callback definition */ +typedef void (*ParallelProgressCallback) (void *parallel_progress_callback_state); + typedef void (*parallel_worker_main_type) (dsm_segment *seg, shm_toc *toc); I think it's better to make the function type consistent with the existing parallel_worker_main_type. How about parallel_progress_callback_type? I've attached a patch that incorporates the above comments and has some suggestions of updating comments etc. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com fix_v23_masahiko.patch Description: Binary data
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, Thanks for your reply! I addressed the latest comments in v23. 1/ cleaned up the asserts as discussed. 2/ used pq_putmessage to send the message on index scan completion. Thanks -- Sami Imseih Amazon Web Services (AWS) v23-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Fri, Jan 20, 2023 at 11:38 PM Imseih (AWS), Sami wrote: > > >Number of indexes that will be vacuumed or cleaned up. This counter only > >advances when the phase is vacuuming indexes or cleaning up indexes. > > I agree, this reads better. > > --- > -/* Report that we are now vacuuming indexes */ > -pgstat_progress_update_param(PROGRESS_VACUUM_PHASE, > - > PROGRESS_VACUUM_PHASE_VACUUM_INDEX); > +/* > + * Report that we are now vacuuming indexes > + * and the number of indexes to vacuum. > + */ > +progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX; > +progress_start_val[1] = vacrel->nindexes; > +pgstat_progress_update_multi_param(2, progress_start_index, > progress_start_val); > > >According to our code style guideline[1], we limit line lengths so > >that the code is readable in an 80-column window. Some comments > > updated in this patch seem too short. > > I will correct this. > > >I think it's better to define "void *arg". > > Agree > > >--- > >+/* > >+ * A Leader process that receives this > > message > >+ * must be ready to update progress. > >+ */ > >+ > > Assert(pcxt->parallel_progress_callback); > >+ > > Assert(pcxt->parallel_progress_callback_arg); > >+ > >+/* Report progress */ > >+ > >pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg); > > >I think the parallel query infra should not require > >parallel_progress_callback_arg to always be set. I think it can be > >NULL. > > This assertion is inside the new 'P' message type handling. > If a leader is consuming this message, they must have a > progress callback set. Right now we only set the callback > in the parallel vacuum case only, so not all leaders will be prepared > to handle this case. > > Would you agree this is needed for safety? I think it makes sense that we assume pcxt->parallel_progress_callback is always not NULL but my point is that in the future one might want to use this callback without the argument and we should allow it. 
If parallel vacuum assumes pcxt->parallel_progress_callback_arg is not NULL, I think we should add an assertion in parallel_vacuum_update_progress(), but not in HandleParallelMessage(). > > case 'P': /* Parallel progress reporting */ > { > /* > * A Leader process that receives this message > * must be ready to update progress. > */ > Assert(pcxt->parallel_progress_callback); > Assert(pcxt->parallel_progress_callback_arg); > > --- > >+void > >+parallel_vacuum_update_progress(void *arg) > >+{ > >+ParallelVacuumState *pvs = (ParallelVacuumState *)arg; > >+ > >+Assert(!IsParallelWorker()); > >+ > >+if (pvs) > >+ > > pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, > >+ > > pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1)); > >+} > > >Since parallel vacuum always sets the arg, I think we don't need to > > check it. > > The arg is only set during parallel vacuum. During non-parallel vacuum, > It's NULL. This check can be removed, but I realize now that we do need > an Assert(pvs). Do you agree? Agreed to add the assertion in this function. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
>Number of indexes that will be vacuumed or cleaned up. This counter only >advances when the phase is vacuuming indexes or cleaning up indexes. I agree, this reads better. --- -/* Report that we are now vacuuming indexes */ -pgstat_progress_update_param(PROGRESS_VACUUM_PHASE, - PROGRESS_VACUUM_PHASE_VACUUM_INDEX); +/* + * Report that we are now vacuuming indexes + * and the number of indexes to vacuum. + */ +progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX; +progress_start_val[1] = vacrel->nindexes; +pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val); >According to our code style guideline[1], we limit line lengths so >that the code is readable in an 80-column window. Some comments > updated in this patch seem too short. I will correct this. >I think it's better to define "void *arg". Agree >--- >+/* >+ * A Leader process that receives this > message >+ * must be ready to update progress. >+ */ >+Assert(pcxt->parallel_progress_callback); >+ > Assert(pcxt->parallel_progress_callback_arg); >+ >+/* Report progress */ >+ >pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg); >I think the parallel query infra should not require >parallel_progress_callback_arg to always be set. I think it can be >NULL. This assertion is inside the new 'P' message type handling. If a leader is consuming this message, they must have a progress callback set. Right now we only set the callback in the parallel vacuum case only, so not all leaders will be prepared to handle this case. Would you agree this is needed for safety? case 'P': /* Parallel progress reporting */ { /* * A Leader process that receives this message * must be ready to update progress. 
*/ Assert(pcxt->parallel_progress_callback); Assert(pcxt->parallel_progress_callback_arg); --- >+void >+parallel_vacuum_update_progress(void *arg) >+{ >+ParallelVacuumState *pvs = (ParallelVacuumState *)arg; >+ >+Assert(!IsParallelWorker()); >+ >+if (pvs) >+ > pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, >+ > pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1)); >+} >Since parallel vacuum always sets the arg, I think we don't need to check > it. The arg is only set during parallel vacuum. During non-parallel vacuum, It's NULL. This check can be removed, but I realize now that we do need an Assert(pvs). Do you agree? -- Regards, Sami Imseih Amazon Web Services (AWS)
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Jan 12, 2023 at 11:02 PM Imseih (AWS), Sami wrote: > > Thanks for the feedback and I apologize for the delay in response. > > >I think the problem here is that you're basically trying to work around > > the > >lack of an asynchronous state update mechanism between leader and > > workers. The > >workaround is to add a lot of different places that poll whether there > > has > >been any progress. And you're not doing that by integrating with the > > existing > >machinery for interrupt processing (i.e. CHECK_FOR_INTERRUPTS()), but by > >developing a new mechanism. > > >I think your best bet would be to integrate with > > HandleParallelMessages(). > > You are correct. I have been trying to work around the async nature > of parallel workers performing the index vacuum. As you have pointed out, > integrating with HandleParallelMessages does appear to be the proper way. > Doing so will also avoid having to do any progress updates in the index AMs. Very interesting idea. I need to study the parallel query infrastructure more to consider potential downside of this idea but it seems okay as far as I researched so far. > In the attached patch, the parallel workers send a new type of protocol > message > type to the leader called 'P' which signals the leader that it should handle a > progress update. The leader then performs the progress update by > invoking a callback set in the ParallelContext. This is done inside > HandleParallelMessages. > In the index vacuum case, the callback is parallel_vacuum_update_progress. > > The new message does not contain a payload, and it's merely used to > signal the leader that it can invoke a progress update. Thank you for updating the patch. Here are some comments for v22 patch: --- + + Number of indexes that will be vacuumed or cleaned up. 
This value will be + 0 if the phase is not vacuuming indexes + or cleaning up indexes, INDEX_CLEANUP + is set to OFF, index vacuum is skipped due to very + few dead tuples in the table, or vacuum failsafe is triggered. I think that if INDEX_CLEANUP is set to OFF or index vacuum is skipped due to failsafe mode, we enter neither vacuum indexes phase nor cleanup indexes phase. So probably we can say something like: Number of indexes that will be vacuumed or cleaned up. This counter only advances when the phase is vacuuming indexes or cleaning up indexes. --- -/* Report that we are now vacuuming indexes */ -pgstat_progress_update_param(PROGRESS_VACUUM_PHASE, - PROGRESS_VACUUM_PHASE_VACUUM_INDEX); +/* + * Report that we are now vacuuming indexes + * and the number of indexes to vacuum. + */ +progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX; +progress_start_val[1] = vacrel->nindexes; +pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val); According to our code style guideline[1], we limit line lengths so that the code is readable in an 80-column window. Some comments updated in this patch seem too short. --- +StringInfoData msgbuf; + +pq_beginmessage(, 'P'); +pq_endmessage(); I think we can use pq_putmessage() instead. --- +/* progress callback definition */ +typedef void (*ParallelProgressCallback) (void *parallel_progress_callback_state); I think it's better to define "void *arg". --- +/* + * A Leader process that receives this message + * must be ready to update progress. + */ +Assert(pcxt->parallel_progress_callback); +Assert(pcxt->parallel_progress_callback_arg); + +/* Report progress */ + pcxt->parallel_progress_callback(pcxt->parallel_progress_callback_arg); I think the parallel query infra should not require parallel_progress_callback_arg to always be set. I think it can be NULL. 
--- +void +parallel_vacuum_update_progress(void *arg) +{ +ParallelVacuumState *pvs = (ParallelVacuumState *)arg; + +Assert(!IsParallelWorker()); + +if (pvs) +pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, + pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1)); +} Since parallel vacuum always sets the arg, I think we don't need to check it. Regards, [1] https://www.postgresql.org/docs/devel/source-format.html -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
Thanks for the feedback and I apologize for the delay in response. >I think the problem here is that you're basically trying to work around the >lack of an asynchronous state update mechanism between leader and workers. > The >workaround is to add a lot of different places that poll whether there has >been any progress. And you're not doing that by integrating with the > existing >machinery for interrupt processing (i.e. CHECK_FOR_INTERRUPTS()), but by >developing a new mechanism. >I think your best bet would be to integrate with HandleParallelMessages(). You are correct. I have been trying to work around the async nature of parallel workers performing the index vacuum. As you have pointed out, integrating with HandleParallelMessages does appear to be the proper way. Doing so will also avoid having to do any progress updates in the index AMs. In the attached patch, the parallel workers send a new type of protocol message type to the leader called 'P' which signals the leader that it should handle a progress update. The leader then performs the progress update by invoking a callback set in the ParallelContext. This is done inside HandleParallelMessages. In the index vacuum case, the callback is parallel_vacuum_update_progress. The new message does not contain a payload, and it's merely used to signal the leader that it can invoke a progress update. Also, If the leader performs the index vacuum, it can call parallel_vacuum_update_progress directly inside vacuumparallel.c. Regards, Sami Imseih Amazon Web Services (AWS) v22-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch Description: v22-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, On 2023-01-07 01:59:40 +, Imseih (AWS), Sami wrote: > --- a/src/backend/access/nbtree/nbtree.c > +++ b/src/backend/access/nbtree/nbtree.c > @@ -998,6 +998,8 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult > *stats, > if (info->report_progress) > > pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE, > > scanblkno); > + if (info->report_parallel_progress && (scanblkno % > REPORT_PARALLEL_VACUUM_EVERY_PAGES) == 0) > + > parallel_vacuum_update_progress(info->parallel_vacuum_state); > } > } I still think it's wrong to need multiple pgstat_progress_*() calls within one scan. If we really need it, it should be abstracted into a helper function that wrapps all the vacuum progress stuff needed for an index. > @@ -688,7 +703,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState > *pvs, int num_index_scan >*/ > if (nworkers > 0) > { > - /* Wait for all vacuum workers to finish */ > + /* > + * Wait for all indexes to be vacuumed while > + * updating the parallel vacuum index progress, > + * and then wait for all workers to finish. > + */ > + parallel_vacuum_wait_to_finish(pvs); > + > WaitForParallelWorkersToFinish(pvs->pcxt); > > for (int i = 0; i < pvs->pcxt->nworkers_launched; i++) I don't think it's a good idea to have two difference routines that wait for workers to exit. And certainly not when one of them basically just polls in a regular interval as parallel_vacuum_wait_to_finish(). I think the problem here is that you're basically trying to work around the lack of an asynchronous state update mechanism between leader and workers. The workaround is to add a lot of different places that poll whether there has been any progress. And you're not doing that by integrating with the existing machinery for interrupt processing (i.e. CHECK_FOR_INTERRUPTS()), but by developing a new mechanism. I think your best bet would be to integrate with HandleParallelMessages(). Greetings, Andres Freund
Re: Add index scan progress to pg_stat_progress_vacuum
> My point is whether we should show > indexes_total throughout the vacuum execution (even also in not > relevant phases such as heap scanning/vacuum/truncation). That is a good point. We should show indexes_total and indexes_completed only during the relevant phases. V21 addresses this along with a documentation fix. indexes_total and indexes_completed can only be a value > 0 while in the "vacuuming indexes" or "cleaning up indexes" phases of vacuum. Indexes_total is set at the start of the index vacuum cycle or index cleanup, and reset to 0, along with indexes_completed, at the end of the index vacuum cycle and index cleanup. Also, for the progress updates in vacuumlazy.c that should be done atomically, I made a change to use pgstat_progress_update_multi_param. Regards, -- Sami Imseih Amazon Web Services: https://aws.amazon.com v21-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Fri, Jan 6, 2023 at 12:07 PM Imseih (AWS), Sami wrote: > > >Similar to above three cases, vacuum can bypass index vacuuming if > >there are almost zero TIDs. Should we set indexes_total to 0 in this > >case too? If so, I think we can set both indexes_total and > >indexes_completed at the beginning of the index vacuuming/cleanup and > >reset them at the end. > > Unlike the other 3 cases, in which the vacuum and cleanup are totally skipped, > a cleanup still occurs when the index vacuum is bypassed. From what I can > tell, > this is to allow for things like a gin pending list cleanup even if the index > is not vacuumed. Right. But since we set indexes_total also at the beginning of index cleanup (i.e. lazy_cleanup_all_indexes()), can't we still show the valid value in this case? My point is whether we should show indexes_total throughout the vacuum execution (even also in not relevant phases such as heap scanning/vacuum/truncation). For example, in the analyze progress report, we have child_tables_total and child_tables_done. child_tables_total is set before the actual work of sampling rows from child tables, but not the beginning of the analyze. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
>Similar to above three cases, vacuum can bypass index vacuuming if >there are almost zero TIDs. Should we set indexes_total to 0 in this >case too? If so, I think we can set both indexes_total and >indexes_completed at the beginning of the index vacuuming/cleanup and >reset them at the end. Unlike the other 3 cases, in which the vacuum and cleanup are totally skipped, a cleanup still occurs when the index vacuum is bypassed. From what I can tell, this is to allow for things like a gin pending list cleanup even if the index is not vacuumed. There could be other reasons as well. if (bypass) { /* * There are almost zero TIDs. Behave as if there were precisely * zero: bypass index vacuuming, but do index cleanup. * * We expect that the ongoing VACUUM operation will finish very * quickly, so there is no point in considering speeding up as a * failsafe against wraparound failure. (Index cleanup is expected to * finish very quickly in cases where there were no ambulkdelete() * calls.) */ vacrel->do_index_vacuuming = false; } So it seems like we should still report the total number of indexes as we are currently doing in the patch. With that said, the documentation should make this be more clear. For indexes_total, the description should be: Number of indexes that will be vacuumed or cleaned up. This value will be 0 if there are no indexes to vacuum, INDEX_CLEANUP is set to OFF, or vacuum failsafe is triggered. See For indexes_completed, it should be: Number of indexes vacuumed in the current vacuum cycle when the phase is vacuuming indexes, or the number of indexes cleaned up in the cleaning up indexes phase. Regards, -- Sami Imseih Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Jan 5, 2023 at 4:24 AM Imseih (AWS), Sami wrote: > > Thanks for the review! > > Addressed the comments. > > > "Increment the indexes completed." (dot at the end) instead? > > Used the commenting format being used in other places in this > file with an inclusion of a double-dash. i.,e. > /* Wraparound emergency -- end current index scan */ > > > It seems to me that "#define REPORT_PARALLEL_VACUUM_EVERY_PAGES > > ((BlockNumber) (1024 * 1024 * 1024 / BLCKSZ))" would be fine too. > > I kept this the same as it matches what we are doing in other places such > as FAILSAFE_EVERY_PAGES > > v20 attached. + + + indexes_total bigint + + + Number of indexes that will be vacuumed. This value will be + 0 if there are no indexes to vacuum, INDEX_CLEANUP + is set to OFF, or vacuum failsafe is triggered. Similar to above three cases, vacuum can bypass index vacuuming if there are almost zero TIDs. Should we set indexes_total to 0 in this case too? If so, I think we can set both indexes_total and indexes_completed at the beginning of the index vacuuming/cleanup and reset them at the end. That is, these values are valid only in index vacuum phase and index cleanup phase. Otherwise, 0. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
Thanks for the review! Addressed the comments. > "Increment the indexes completed." (dot at the end) instead? Used the commenting format being used in other places in this file with an inclusion of a double-dash, i.e. /* Wraparound emergency -- end current index scan */ > It seems to me that "#define REPORT_PARALLEL_VACUUM_EVERY_PAGES > ((BlockNumber) (1024 * 1024 * 1024 / BLCKSZ))" would be fine too. I kept this the same as it matches what we are doing in other places such as FAILSAFE_EVERY_PAGES v20 attached. Regards, -- Sami Imseih Amazon Web Services: https://aws.amazon.com v20-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, On 1/3/23 5:46 PM, Imseih (AWS), Sami wrote: cirrus-ci.com/task/4557389261701120 I earlier compiled without building with --enable-cassert, which is why the compilation errors did not produce on my buid. Fixed in v19. Thanks Thanks for the updated patch! Some comments about it: + + + + indexes_total bigint + + + Number of indexes that wil be vacuumed. This value will be Typo: wil + /* report number of indexes to vacuum, if we are told to cleanup indexes */ + if (vacrel->do_index_cleanup) + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_TOTAL, vacrel->nindexes); "Report" instead? (to looks like the surrounding code) + /* +* Done vacuuming an index. Increment the indexes completed +*/ + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, + idx + 1); "Increment the indexes completed." (dot at the end) instead? -* Increase and report the number of index scans. +* Reset and report the number of indexes scanned. +* Also, increase and report the number of index +* scans. What about "Reset and report zero as the number of indexes scanned."? (just to make clear we don't want to report the value as it was prior to the reset) - /* Disable index vacuuming, index cleanup, and heap rel truncation */ + /* +* Disable index vacuuming, index cleanup, and heap rel truncation +* The new "Disable index vacuuming, index cleanup, and heap rel truncation" needs a dot at the end. +* Also, report to progress.h that we are no longer tracking +* index vacuum/cleanup. +*/ "Also, report that we are" instead? + /* +* Done cleaning an index. Increment the indexes completed +*/ Needs a dot at the end. - /* Reset the parallel index processing counter */ + /* Reset the parallel index processing counter ( index progress counter also ) */ "Reset the parallel index processing and progress counters" instead? + /* Update the number of indexes completed. */ + pg_atomic_add_fetch_u32(&(pvs->shared->nindexes_completed), 1); Remove the dot at the end? 
(to look like the surrounding code) + +/* + * Read pvs->shared->nindexes_completed and update progress.h + * with indexes vacuumed so far. This function is called periodically "Read pvs->shared->nindexes_completed and report the number of indexes vacuumed so far" instead? + * Note: This function should be used by the leader process only, "called" instead of "used"? case WAIT_EVENT_XACT_GROUP_UPDATE: event_name = "XactGroupUpdate"; break; + case WAIT_EVENT_PARALLEL_VACUUM_FINISH: + event_name = "ParallelVacuumFinish"; + break; /* no default case, so that compiler will warn */ It seems to me that the case ordering should follow the alphabetical order (that's how it is done currently without the patch). +#define REPORT_PARALLEL_VACUUM_EVERY_PAGES ((BlockNumber) (((uint64) 1024 * 1024 * 1024) / BLCKSZ)) It seems to me that "#define REPORT_PARALLEL_VACUUM_EVERY_PAGES ((BlockNumber) (1024 * 1024 * 1024 / BLCKSZ))" would be fine too. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
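For reference, the macro's arithmetic is easy to sanity-check outside a PostgreSQL build. Below is a stand-alone sketch with BLCKSZ pinned to the default 8192 (BlockNumber is modeled with uint32_t); with either spelling of the macro, a ~1GB reporting interval works out to 131072 blocks:

```c
#include <stdint.h>

/* BLCKSZ is a configure-time constant in PostgreSQL; the default is 8192. */
#define BLCKSZ 8192

/* Stand-alone model of the patch's macro: number of blocks in ~1GB. */
#define REPORT_PARALLEL_VACUUM_EVERY_PAGES \
	((uint32_t) (((uint64_t) 1024 * 1024 * 1024) / BLCKSZ))

/* An index AM's scan loop would refresh progress when this returns true. */
static int
report_due(uint32_t blkno)
{
	return blkno % REPORT_PARALLEL_VACUUM_EVERY_PAGES == 0;
}
```

As Bertrand notes, the uint64 cast is not strictly needed: 1024 * 1024 * 1024 is 2^30, which fits in a 32-bit signed int, so the division cannot overflow either way.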
Re: Add index scan progress to pg_stat_progress_vacuum
> cirrus-ci.com/task/4557389261701120 I earlier compiled without building with --enable-cassert, which is why the compilation errors did not show up on my build. Fixed in v19. Thanks -- Sami Imseih Amazon Web Services: https://aws.amazon.com v19-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch Description: v19-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Mon, 2 Jan 2023 at 10:04, Imseih (AWS), Sami wrote: > > >cfbot is complaining that this patch no longer applies. Sami, would you > >mind rebasing it? > > Rebased patch attached. CFBot shows some compilation errors as in [1], please post an updated version for the same: [07:01:58.889] In file included from ../../../src/include/postgres.h:47, [07:01:58.889] from vacuumparallel.c:27: [07:01:58.889] vacuumparallel.c: In function ‘parallel_vacuum_update_progress’: [07:01:58.889] vacuumparallel.c:1118:10: error: ‘IsParallelWorker’ undeclared (first use in this function); did you mean ‘ParallelWorkerMain’? [07:01:58.889] 1118 | Assert(!IsParallelWorker); [07:01:58.889] | ^~~~ [07:01:58.889] ../../../src/include/c.h:859:9: note: in definition of macro ‘Assert’ [07:01:58.889] 859 | if (!(condition)) \ [07:01:58.889] | ^ [07:01:58.889] vacuumparallel.c:1118:10: note: each undeclared identifier is reported only once for each function it appears in [07:01:58.889] 1118 | Assert(!IsParallelWorker); [07:01:58.889] | ^~~~ [1] - https://cirrus-ci.com/task/4557389261701120 Regards, Vignesh
Re: Add index scan progress to pg_stat_progress_vacuum
>cfbot is complaining that this patch no longer applies. Sami, would you >mind rebasing it? Rebased patch attached. -- Sami Imseih Amazon Web Services: https://aws.amazon.com v18-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch Description: v18-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Dec 14, 2022 at 05:09:46AM +, Imseih (AWS), Sami wrote: > Attached version addresses the above and the previous comments. cfbot is complaining that this patch no longer applies. Sami, would you mind rebasing it? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
>First of all, I don't think we need to declare ParallelVacuumProgress >in vacuum.c since it's set and used only in vacuumparallel.c. But I >don't even think it's a good idea to declare it in vacuumparallel.c as >a static variable. The primary reason is that it adds things we need > to care about. For example, what if we raise an error during index >vacuum? The transaction aborts but ParallelVacuumProgress still refers >to something old. Suppose further that the next parallel vacuum >doesn't launch any workers; the leader process would still end up >accessing the old value pointed to by ParallelVacuumProgress, which >causes a SEGV. So we need to reset it anyway at the beginning of the >parallel vacuum. It's easy to fix at this time but once the parallel > vacuum code gets more complex, it could easily be forgotten. >IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different >story. They are set in vacuumparallel.c and are used in vacuum.c for >vacuum delay. If they weren't global variables, we would need to pass >them to every function that could eventually call the vacuum delay >function. So it makes sense to me to have them as global variables. On >the other hand, for ParallelVacuumProgress, it's a common pattern that >ambulkdelete(), amvacuumcleanup() or a common index scan routine like >btvacuumscan() checks the progress. I don't think index AM needs to >pass the value down to many of its functions. So it makes sense to me >to pass it via IndexVacuumInfo. Thanks for the detailed explanation and especially clearing up my understanding of VacuumSharedCostBalance and VacuumActiveNWorkers. I do now think that passing ParallelVacuumState in IndexVacuumInfo is the better choice. Attached version addresses the above and the previous comments. Thanks Sami Imseih Amazon Web Services (AWS) v17-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch Description: v17-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Tue, Dec 13, 2022 at 1:40 PM Imseih (AWS), Sami wrote: > > Thanks for the feedback. I agree with the feedback, except > for > > >need to have ParallelVacuumProgress. I see > >parallel_vacuum_update_progress() uses this value but I think it's > >better to pass ParallelVacuumState via IndexVacuumInfo. > > I was trying to avoid passing a pointer to > ParallelVacuumState in IndexVacuumInfo. > > ParallelVacuumProgress is implemented in the same > way as VacuumSharedCostBalance and > VacuumActiveNWorkers. See vacuum.h > > These values are set at the start of a parallel vacuum cycle > and reset at the end of an index vacuum cycle. > > This seems like a better approach and less invasive. > What would be a reason not to go with this approach? First of all, I don't think we need to declare ParallelVacuumProgress in vacuum.c since it's set and used only in vacuumparallel.c. But I don't even think it's a good idea to declare it in vacuumparallel.c as a static variable. The primary reason is that it adds things we need to care about. For example, what if we raise an error during index vacuum? The transaction aborts but ParallelVacuumProgress still refers to something old. Suppose further that the next parallel vacuum doesn't launch any workers; the leader process would still end up accessing the old value pointed to by ParallelVacuumProgress, which causes a SEGV. So we need to reset it anyway at the beginning of the parallel vacuum. It's easy to fix at this time but once the parallel vacuum code gets more complex, it could easily be forgotten. IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different story. They are set in vacuumparallel.c and are used in vacuum.c for vacuum delay. If they weren't global variables, we would need to pass them to every function that could eventually call the vacuum delay function.
So it makes sense to me to have them as global variables. On the other hand, for ParallelVacuumProgress, it's a common pattern that ambulkdelete(), amvacuumcleanup() or a common index scan routine like btvacuumscan() checks the progress. I don't think index AM needs to pass the value down to many of its functions. So it makes sense to me to pass it via IndexVacuumInfo. Having said that, I'd like to hear opinions also from other hackers; I might be wrong and it may be more invasive, as you pointed out. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
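The trade-off described above — carrying the counter in the per-call info struct instead of a process-global pointer that must be reset after an abort — can be sketched in a few lines. The struct and field names below are illustrative stand-ins, not the real IndexVacuumInfo definition:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the relevant part of IndexVacuumInfo. */
typedef struct IndexVacuumInfoSketch
{
	int			report_parallel_progress;	/* true only in the leader */
	uint32_t   *nindexes_completed; /* lives in parallel shared memory */
} IndexVacuumInfoSketch;

/* Last value pushed to the progress view (stand-in for pgstat state). */
static uint32_t last_reported;

/*
 * What an index AM would call from its scan loop.  There is no global
 * pointer to reset on error: the counter's lifetime is tied to the
 * struct the caller hands in for this one index vacuum.
 */
static void
update_progress(const IndexVacuumInfoSketch *info)
{
	if (info->report_parallel_progress && info->nindexes_completed != NULL)
		last_reported = *info->nindexes_completed;
}
```

Workers pass report_parallel_progress = false, so only the leader ever touches the progress view.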
Re: Add index scan progress to pg_stat_progress_vacuum
Thanks for the feedback. I agree with the feedback, except for >need to have ParallelVacuumProgress. I see >parallel_vacuum_update_progress() uses this value but I think it's >better to pass ParallelVacuumState via IndexVacuumInfo. I was trying to avoid passing a pointer to ParallelVacuumState in IndexVacuumInfo. ParallelVacuumProgress is implemented in the same way as VacuumSharedCostBalance and VacuumActiveNWorkers. See vacuum.h These values are set at the start of a parallel vacuum cycle and reset at the end of an index vacuum cycle. This seems like a better approach and less invasive. What would be a reason not to go with this approach? > parallel_vacuum_update_progress() is typically called every 1GB so I > think we don't need to worry about unnecessary update. Also, I think > this code doesn't work when pgstat_track_activities is false. Instead, > I think that in parallel_wait_for_workers_to_finish(), we can check > the value of pvs->nindexes_completed and update the progress if there > is an update or it's first time. I agree that we don’t need to worry about unnecessary updates in parallel_vacuum_update_progress since we are calling it every 1GB. I also don't think we should do anything additional in parallel_wait_for_workers_to_finish since here we are only updating every 1 second. Thanks, Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, Thank you for updating the patch! On Tue, Nov 29, 2022 at 8:57 AM Imseih (AWS), Sami wrote: > > > I think that indexes_total should be 0 also when INDEX_CLEANUP is off. > > Patch updated for handling of INDEX_CLEANUP = off, with an update to > the documentation as well. > > >I think we don't need to reset it at the end of index vacuuming. There > >is a small window before switching to the next phase. If we reset this > >value while showing "index vacuuming" phase, the user might get > >confused. Instead, we can reset it at the beginning of the index > >vacuuming. > > No, I think the way it's currently done is correct. We want to reset the > number > of indexes completed before we increase the num_index_scans ( index vacuum > cycle ). > This ensures that when we enter a new index cycle, the number of indexes > completed > are already reset. The 2 fields that matter here is how many indexes are > vacuumed in the > currently cycle and this is called out in the documentation as such. > Agreed. Here are comments on v16 patch. --- a/contrib/bloom/blvacuum.c +++ b/contrib/bloom/blvacuum.c @@ -15,12 +15,14 @@ #include "access/genam.h" #include "bloom.h" #include "catalog/storage.h" +#include "commands/progress.h" #include "commands/vacuum.h" #include "miscadmin.h" #include "postmaster/autovacuum.h" #include "storage/bufmgr.h" #include "storage/indexfsm.h" #include "storage/lmgr.h" +#include "utils/backend_progress.h" I think we don't need to include them here. Probably the same is true for other index AMs. --- vacuum_delay_point(); + if (info->report_parallel_progress && (blkno % REPORT_PARALLEL_VACUUM_EVERY_PAGES) == 0) + parallel_vacuum_update_progress(); + buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno, There is an extra new line. --- + + ParallelVacuumFinish + Waiting for parallel vacuum workers to finish computing. + How about "Waiting for parallel vacuum workers to finish index vacuum"? 
--- vacrel->rel = rel; vac_open_indexes(vacrel->rel, RowExclusiveLock, &vacrel->nindexes, &vacrel->indrels); + if (instrument && vacrel->nindexes > 0) { /* Copy index names used by instrumentation (not error reporting) */ This added line is not necessary. --- /* Counter for vacuuming and cleanup */ pg_atomic_uint32 idx; + +/* + * Counter for vacuuming and cleanup progress reporting. + * This value is used to report index vacuum/cleanup progress + * in parallel_vacuum_progress_report. We keep this + * counter to avoid having to loop through + * ParallelVacuumState->indstats to determine the number + * of indexes completed. + */ +pg_atomic_uint32 idx_completed_progress; I think the name of idx_completed_progress is very confusing, since the idx of PVShared refers to the current index in pvs->indstats[], whereas idx_completed_progress is the number of vacuumed indexes. How about "nindexes_completed"? --- +/* + * Set the shared parallel vacuum progress. This will be used + * to periodically update progress.h with completed indexes + * in a parallel vacuum. See comments in parallel_vacuum_update_progress + * for more details. + */ +ParallelVacuumProgress = &(pvs->shared->idx_completed_progress); + Since we pass pvs to parallel_wait_for_workers_to_finish(), we don't need to have ParallelVacuumProgress. I see parallel_vacuum_update_progress() uses this value but I think it's better to pass ParallelVacuumState via IndexVacuumInfo. --- +/* + * To wait for parallel workers to finish, + * first call parallel_wait_for_workers_to_finish + * which is responsible for reporting the + * number of indexes completed. + * + * Afterwards, WaitForParallelWorkersToFinish is called + * to do the real work of waiting for parallel workers + * to finish. + * + * Note: Both routines will acquire a WaitLatch in their + * respective loops. + */ How about something like: Wait for all indexes to be vacuumed while updating the parallel vacuum index progress. And then wait for all workers to finish.
--- RelationGetRelationName(indrel)); } +if (ivinfo.report_parallel_progress) +parallel_vacuum_update_progress(); + I think it's better to update the progress info after updating pvs->shared->idx_completed_progress. --- +/* + * Check if we are done vacuuming indexes and report + * progress. How about "Waiting for all indexes to be vacuumed while updating the parallel index vacuum progress"? + * + * We nap using with a WaitLatch
Re: Add index scan progress to pg_stat_progress_vacuum
> I think that indexes_total should be 0 also when INDEX_CLEANUP is off. Patch updated for handling of INDEX_CLEANUP = off, with an update to the documentation as well. >I think we don't need to reset it at the end of index vacuuming. There >is a small window before switching to the next phase. If we reset this >value while showing "index vacuuming" phase, the user might get >confused. Instead, we can reset it at the beginning of the index >vacuuming. No, I think the way it's currently done is correct. We want to reset the number of indexes completed before we increase num_index_scans (the index vacuum cycle). This ensures that when we enter a new index cycle, the number of indexes completed is already reset. The two fields that matter here show how many indexes are vacuumed in the current cycle, and this is called out in the documentation as such. >We should add CHECK_FOR_INTERRUPTS() at the beginning of the loop to >make the wait interruptible. Done. >I think it would be better to update the counter only when the value >has been increased. Done. Did so by checking the progress value from the beentry directly in the function. We can do a more generalized >I think we should set a timeout, say 1 sec, to WaitLatch so that it >can periodically check the progress. Done. >Probably it's better to have a new wait event for this wait in order >to distinguish wait for workers to complete index vacuum from the wait >for workers to exit. I was trying to avoid introducing a new wait event, but thinking about it, I agree with your suggestion. I created a new ParallelVacuumFinish wait event and documentation for the wait event. >I think this doesn't work, since ivinfo.idx_completed is in the >backend-local memory. Instead, I think we can have a function in >vacuumparallel.c that updates the progress. Then we can have index AM >call this function. Yeah, you're absolutely correct.
To make this work correctly, I did something similar to VacuumActiveNWorkers and introduced an extern variable called ParallelVacuumProgress. This variable points to pvs->shared->idx_completed_progress. The index AMs then call parallel_vacuum_update_progress, which is responsible for updating the progress with the current value of ParallelVacuumProgress. ParallelVacuumProgress is also set to NULL at the end of every index cycle. > ivinfo.report_parallel_progress = !IsParallelWorker(); Done Regards, Sami Imseih Amazon Web Services (AWS) v16-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch Description: v16-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Sat, Nov 12, 2022 at 4:10 AM Imseih (AWS), Sami wrote: > > >I don't think any of these progress callbacks should be done while > > pinning a > >buffer and ... > > Good point. > > >I also don't understand why info->parallel_progress_callback exists? > > It's only > >set to parallel_vacuum_progress_report(). Why make this stuff more > > expensive > >than it has to already be? > > Agree. Modified the patch to avoid the callback . > > >So each of the places that call this need to make an additional external > >function call for each page, just to find that there's nothing to do or > > to > >make yet another indirect function call. This should probably a static > > inline > >function. > > Even better to just remove a function call altogether. > > >This is called, for every single page, just to read the number of indexes > >completed by workers? A number that barely ever changes? > > I will take the initial suggestion by Sawada-san to update the progress > every 1GB of blocks scanned. > > Also, It sems to me that we don't need to track progress in brin indexes, > Since it is rare, if ever, this type of index will go through very heavy > block scans. In fact, I noticed the vacuum AMs for brin don't invoke the > vacuum_delay_point at all. > > The attached patch addresses the feedback. > Thank you for updating the patch! Here are review comments on v15 patch: + + Number of indexes that wil be vacuumed. This value will be + 0 if there are no indexes to vacuum or + vacuum failsafe is triggered. I think that indexes_total should be 0 also when INDEX_CLEANUP is off. --- +/* + * Reset the indexes completed at this point. + * If we end up in another index vacuum cycle, we will + * start counting from the start. + */ +pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, 0); I think we don't need to reset it at the end of index vacuuming. There is a small window before switching to the next phase. 
If we reset this value while showing "index vacuuming" phase, the user might get confused. Instead, we can reset it at the beginning of the index vacuuming. --- +void +parallel_wait_for_workers_to_finish(ParallelVacuumState *pvs) +{ +while (pg_atomic_read_u32(&(pvs->shared->idx_completed_progress)) < pvs->nindexes) +{ +pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, + pg_atomic_read_u32(&(pvs->shared->idx_completed_progress))); + +(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, -1, + WAIT_EVENT_PARALLEL_FINISH); +ResetLatch(MyLatch); +} +} We should add CHECK_FOR_INTERRUPTS() at the beginning of the loop to make the wait interruptible. I think it would be better to update the counter only when the value has been increased. I think we should set a timeout, say 1 sec, to WaitLatch so that it can periodically check the progress. Probably it's better to have a new wait event for this wait in order to distinguish wait for workers to complete index vacuum from the wait for workers to exit. --- @@ -838,7 +867,12 @@ parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel, ivinfo.estimated_count = pvs->shared->estimated_count; ivinfo.num_heap_tuples = pvs->shared->reltuples; ivinfo.strategy = pvs->bstrategy; - +ivinfo.idx_completed_progress = pvs->shared->idx_completed_progress; and @@ -998,6 +998,9 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats, if (info->report_progress) pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE, scanblkno); +if (info->report_parallel_progress && (scanblkno % REPORT_PARALLEL_VACUUM_EVERY_PAGES) == 0) + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, + pg_atomic_read_u32(&(info->idx_completed_progress))); } I think this doesn't work, since ivinfo.idx_completed is in the backend-local memory. Instead, I think we can have a function in vacuumparallel.c that updates the progress. Then we can have index AM call this function. 
--- +if (!IsParallelWorker()) +ivinfo.report_parallel_progress = true; +else +ivinfo.report_parallel_progress = false; We can do it like: ivinfo.report_parallel_progress = !IsParallelWorker(); Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
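Two of the review suggestions above — update the counter only when the value has increased, and wake up periodically instead of blocking forever — boil down to a monotonic "report if changed" step inside the wait loop. A stand-alone model of that step (the pgstat call is replaced by a cached value; WaitLatch, ResetLatch and CHECK_FOR_INTERRUPTS are omitted):

```c
#include <stdint.h>

/*
 * One pass of the leader's wait loop: push a progress update only when
 * the shared counter has moved past what was last reported.  Returns 1
 * when an update was (notionally) sent to the progress view.
 */
static int
maybe_report(uint32_t shared_completed, uint32_t *last_reported)
{
	if (shared_completed > *last_reported)
	{
		/* pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, ...) */
		*last_reported = shared_completed;
		return 1;
	}
	return 0;
}
```

Since the counter only ever grows within one index vacuum cycle, the cached value makes repeated wakeups with no new completions free of progress-view traffic.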
Re: Add index scan progress to pg_stat_progress_vacuum
>I don't think any of these progress callbacks should be done while pinning > a >buffer and ... Good point. >I also don't understand why info->parallel_progress_callback exists? It's > only >set to parallel_vacuum_progress_report(). Why make this stuff more > expensive >than it has to already be? Agree. Modified the patch to avoid the callback. >So each of the places that call this need to make an additional external >function call for each page, just to find that there's nothing to do or to >make yet another indirect function call. This should probably be a static > inline >function. Even better to just remove a function call altogether. >This is called, for every single page, just to read the number of indexes >completed by workers? A number that barely ever changes? I will take the initial suggestion by Sawada-san to update the progress every 1GB of blocks scanned. Also, it seems to me that we don't need to track progress in BRIN indexes, since it is rare, if ever, that this type of index will go through very heavy block scans. In fact, I noticed the vacuum AMs for BRIN don't invoke vacuum_delay_point at all. The attached patch addresses the feedback. Regards, Sami Imseih Amazon Web Services (AWS) v15-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch Description: v15-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, On 2022-11-04 13:27:34 +, Imseih (AWS), Sami wrote: > diff --git a/src/backend/access/gin/ginvacuum.c > b/src/backend/access/gin/ginvacuum.c > index b4fa5f6bf8..3d5e4600dc 100644 > --- a/src/backend/access/gin/ginvacuum.c > +++ b/src/backend/access/gin/ginvacuum.c > @@ -633,6 +633,9 @@ ginbulkdelete(IndexVacuumInfo *info, > IndexBulkDeleteResult *stats, > UnlockReleaseBuffer(buffer); > buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno, > > RBM_NORMAL, info->strategy); > + > + if (info->report_parallel_progress) > + > info->parallel_progress_callback(info->parallel_progress_arg); > } > > /* right now we found leftmost page in entry's BTree */ I don't think any of these progress callbacks should be done while pinning a buffer and ... > @@ -677,6 +680,9 @@ ginbulkdelete(IndexVacuumInfo *info, > IndexBulkDeleteResult *stats, > buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno, > > RBM_NORMAL, info->strategy); > LockBuffer(buffer, GIN_EXCLUSIVE); > + > + if (info->report_parallel_progress) > + > info->parallel_progress_callback(info->parallel_progress_arg); > } > > MemoryContextDelete(gvs.tmpCxt); ... definitely not while holding a buffer lock. I also don't understand why info->parallel_progress_callback exists? It's only set to parallel_vacuum_progress_report(). Why make this stuff more expensive than it has to already be? > +parallel_vacuum_progress_report(void *arg) > +{ > + ParallelVacuumState *pvs = (ParallelVacuumState *) arg; > + > + if (IsParallelWorker()) > + return; > + > + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, > + > pg_atomic_read_u32(&(pvs->shared->idx_completed_progress))); > +} So each of the places that call this needs to make an additional external function call for each page, just to find that there's nothing to do or to make yet another indirect function call. This should probably be a static inline function. This is called, for every single page, just to read the number of indexes completed by workers?
A number that barely ever changes? This seems all wrong. Greetings, Andres Freund
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Nov 3, 2022 at 1:52 AM Imseih (AWS), Sami wrote: > > Attached is v13-0001--Show-progress-for-index-vacuums.patch which addresses > the latest comments. Thank you for updating the patch! > 4/ Went back to calling parallel_vacuum_progress_report inside > WaitForParallelWorkersToFinish to cover the case when a > leader is waiting for parallel workers to finish. I don't think we need to modify WaitForParallelWorkersToFinish to cover that case. Instead, I think the leader process can execute a new function. The function will be like WaitForParallelWorkersToFinish() but simpler; it just updates the progress information if necessary and then checks if idx_completed_progress is equal to the number of indexes to vacuum. If yes, return from the function and call WaitForParallelWorkersToFinish() to wait for all workers to finish. Otherwise, it naps by using WaitLatch() and does this loop again. --- @@ -46,6 +46,8 @@ typedef struct ParallelContext ParallelWorkerInfo *worker; int nknown_attached_workers; bool *known_attached_workers; +void (*parallel_progress_callback)(void *arg); +void *parallel_progress_arg; } ParallelContext; With the above change I suggested, I think we won't need to have a callback function in ParallelContext. Instead, I think we can have index-AMs call parallel_vacuum_report() if report_parallel_progress is true. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
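The leader-side function described here — loop while updating progress until all indexes are done, then hand off to WaitForParallelWorkersToFinish() — could look roughly like the sketch below. It is a single-threaded model: the nap callback stands in for WaitLatch() with a timeout, and CHECK_FOR_INTERRUPTS(), ResetLatch() and the final handoff are left out:

```c
#include <stdint.h>

/*
 * Model of the proposed wait: refresh the progress row on every pass
 * until the shared counter reaches the number of indexes to vacuum.
 * Returns how many progress updates were issued.
 */
static uint32_t
wait_for_index_vacuum(const uint32_t *completed, uint32_t nindexes,
					  void (*nap) (void))
{
	uint32_t	updates = 0;

	while (*completed < nindexes)
	{
		updates++;				/* pgstat_progress_update_param(...) */
		nap();					/* WaitLatch with a 1s timeout, then ResetLatch */
	}
	return updates;
}

/* Tiny harness: each nap lets one more "worker" finish an index. */
static uint32_t completed_sim;
static void
nap_sim(void)
{
	completed_sim++;
}
```

The point of the separate function is that the generic WaitForParallelWorkersToFinish() stays untouched; the vacuum-specific polling lives entirely in vacuumparallel.c.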
Re: Add index scan progress to pg_stat_progress_vacuum
Resubmitting patches with proper format. Thanks Sami Imseih Amazon Web Services (AWS) v14-0002-Function-to-return-currently-vacuumed-or-cleaned.patch Description: v14-0002-Function-to-return-currently-vacuumed-or-cleaned.patch v14-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch Description: v14-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Nov 3, 2022 at 1:52 AM Imseih (AWS), Sami wrote: > > Attached is v13-0001--Show-progress-for-index-vacuums.patch which addresses > the latest comments. The main changes are: > > 1/ Call the parallel_vacuum_progress_report inside the AMs rather than > vacuum_delay_point. > > 2/ A Boolean when set to True in vacuumparallel.c will be used to determine > if calling > parallel_vacuum_progress_report is necessary. > > 3/ Removed global variable from vacuumparallel.c > > 4/ Went back to calling parallel_vacuum_progress_report inside > WaitForParallelWorkersToFinish to cover the case when a > leader is waiting for parallel workers to finish. > > 5/ I did not see a need to only report progress after 1GB as it's a fairly > cheap call to update > progress. > > 6/ v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch is a > separate patch > for exposing the index relid being vacuumed by a backend. This entry was marked "Needs review" in the CommitFest app but cfbot reports the patch [1] no longer applies. [1] this patch: v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch We've marked it as "Waiting on Author". As CommitFest 2022-11 is currently underway, this would be an excellent time to update the patch. Once you think the patchset is ready for review again, you (or any interested party) can move the patch entry forward by visiting https://commitfest.postgresql.org/40/3617/ and changing the status to "Needs review". Thanks Ian Barwick
Re: Add index scan progress to pg_stat_progress_vacuum
Attached is v13-0001--Show-progress-for-index-vacuums.patch which addresses the latest comments. The main changes are: 1/ Call the parallel_vacuum_progress_report inside the AMs rather than vacuum_delay_point. 2/ A Boolean, when set to true in vacuumparallel.c, will be used to determine whether calling parallel_vacuum_progress_report is necessary. 3/ Removed global variable from vacuumparallel.c 4/ Went back to calling parallel_vacuum_progress_report inside WaitForParallelWorkersToFinish to cover the case when a leader is waiting for parallel workers to finish. 5/ I did not see a need to only report progress after 1GB as it's a fairly cheap call to update progress. 6/ v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch is a separate patch for exposing the index relid being vacuumed by a backend. Thanks Sami Imseih Amazon Web Services (AWS) v13-0001--Show-progress-for-index-vacuums.patch Description: v13-0001--Show-progress-for-index-vacuums.patch v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch Description: v1-0001-Function-to-return-currently-vacuumed-or-cleaned-ind.patch
Re: Add index scan progress to pg_stat_progress_vacuum
Thank you for the feedback! >While it seems to be a good idea to have the atomic counter for the >number of indexes completed, I think we should not use the global >variable referencing the counter as follow: >+static pg_atomic_uint32 *index_vacuum_completed = NULL; >: >+void >+parallel_vacuum_progress_report(void) >+{ >+ if (IsParallelWorker() || !report_parallel_vacuum_progress) >+ return; >+ >+ pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, >+ > pg_atomic_read_u32(index_vacuum_completed)); >+} >I think we can pass ParallelVacuumState (or PVIndStats) to the >reporting function so that it can check the counter and report the >progress. Yes, that makes sense. >--- >I am not too sure that the idea of using the vacuum delay points is the > best >plan. I think we should try to avoid piggybacking on such general >infrastructure if we can, and instead look for a way to tie this to > something that is specific to parallel vacuum. >--- Adding the call to vacuum_delay_point made sense since it's called in all major vacuum scans. While it is also called by analyze, it will only do anything if the caller sets the report_parallel_vacuum_progress flag. > Instead, I think we can add a boolean and the pointer for > ParallelVacuumState to IndexVacuumInfo. If the boolean is true, an > index AM can call the reporting function with the pointer to > ParallelVacuumState while scanning index blocks, for example, for > every 1GB. The boolean can be true only for the leader process. Will doing this every 1GB be necessary, considering the function will not do much more than update progress from the value stored in index_vacuum_completed? > Agreed, but with the following change, the leader process waits in a >busy loop, which should not be acceptable: Good point. >Also, I think it's better to check whether idx_completed_progress equals the number of indexes instead. Good point > 5. 
Went back to the idea of adding a new view called pg_stat_progress_vacuum_index > which is accomplished by adding a new type called VACUUM_PARALLEL in progress.h >Probably, we can devide the patch into two patches. One for adding the Makes sense. Thanks Sami Imseih Amazon Web Services (AWS)
Re: Add index scan progress to pg_stat_progress_vacuum
On Tue, Oct 11, 2022 at 10:50 PM Imseih (AWS), Sami wrote: > > >One idea would be to add a flag, say report_parallel_vacuum_progress, > >to IndexVacuumInfo struct and expect index AM to check and update the > >parallel index vacuum progress, say every 1GB blocks processed. The > >flag is true only when the leader process is vacuuming an index. > > >Regards, > > Sorry for the long delay on this. I have taken the approach as suggested > by Sawada-san and Robert and attached is v12. Thank you for updating the patch! > > 1. The patch introduces a new counter in the same shared memory already > used by the parallel leader and workers to keep track of the number > of indexes completed. This way there is no reason to loop through > the index status every time we want to get the status of indexes completed. While it seems to be a good idea to have the atomic counter for the number of indexes completed, I think we should not use the global variable referencing the counter as follow: +static pg_atomic_uint32 *index_vacuum_completed = NULL; : +void +parallel_vacuum_progress_report(void) +{ + if (IsParallelWorker() || !report_parallel_vacuum_progress) + return; + + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED, +pg_atomic_read_u32(index_vacuum_completed)); +} I think we can pass ParallelVacuumState (or PVIndStats) to the reporting function so that it can check the counter and report the progress. > 2. A new function in vacuumparallel.c will be used to update > the progress of indexes completed by reading from the > counter created in point #1. > > 3. The function is called during the vacuum_delay_point as a > matter of convenience, since this is called in all major vacuum > loops. The function will only do something if the caller > sets a boolean to report progress. Doing so will also ensure > progress is being reported in case the parallel workers completed > before the leader. 
Robert pointed out: --- I am not too sure that the idea of using the vacuum delay points is the best plan. I think we should try to avoid piggybacking on such general infrastructure if we can, and instead look for a way to tie this to something that is specific to parallel vacuum. --- I agree with this part. Instead, I think we can add a boolean and the pointer for ParallelVacuumState to IndexVacuumInfo. If the boolean is true, an index AM can call the reporting function with the pointer to ParallelVacuumState while scanning index blocks, for example, for every 1GB. The boolean can be true only for the leader process. > > 4. Rather than adding any complexity to WaitForParallelWorkersToFinish > and introducing a new callback, vacuumparallel.c will wait until > the number of vacuum workers is 0 and then call > WaitForParallelWorkersToFinish as it does currently. Agreed, but with the following change, the leader process waits in a busy loop, which should not be acceptable:

+	if (VacuumActiveNWorkers)
+	{
+		while (pg_atomic_read_u32(VacuumActiveNWorkers) > 0)
+		{
+			parallel_vacuum_progress_report();
+		}
+	}
+

Also, I think it's better to check whether idx_completed_progress equals the number of indexes instead. > 5. Went back to the idea of adding a new view called > pg_stat_progress_vacuum_index > which is accomplished by adding a new type called VACUUM_PARALLEL in > progress.h Probably, we can divide the patch into two patches. One for adding the new statistics of the number of vacuumed indexes to pg_stat_progress_vacuum and another one for adding a new statistics view pg_stat_progress_vacuum_index. Regards, -- Masahiko Sawada PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
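To make the busy-loop concern concrete, here is a self-contained sketch (plain C with pthreads, not PostgreSQL code; all names are hypothetical) of the suggested alternative: the leader naps briefly between checks and stops once the completed-index count reaches the number of indexes, instead of spinning on the active-worker count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

static atomic_int completed_indexes;	/* stands in for idx_completed_progress */

struct worker_arg { int nindexes; };

static void *
index_worker(void *p)
{
	struct worker_arg *a = p;
	for (int i = 0; i < a->nindexes; i++)
		atomic_fetch_add(&completed_indexes, 1);	/* "vacuumed" one index */
	return NULL;
}

/*
 * Leader-side wait: refresh the (simulated) progress view whenever the
 * counter moves, and sleep between checks rather than busy-waiting.
 * Returns the final progress value reported.
 */
int
leader_wait_for_indexes(int nindexes, int nworkers, int per_worker)
{
	pthread_t	tids[8];
	struct worker_arg a = { per_worker };
	int			last_reported = -1;

	atomic_store(&completed_indexes, 0);
	for (int i = 0; i < nworkers; i++)
		pthread_create(&tids[i], NULL, index_worker, &a);

	for (;;)
	{
		int			done = atomic_load(&completed_indexes);

		if (done != last_reported)
			last_reported = done;	/* pgstat_progress_update_param() here */
		if (done >= nindexes)
			break;
		struct timespec ts = {0, 1000000};	/* 1ms nap, not a busy loop */
		nanosleep(&ts, NULL);
	}
	for (int i = 0; i < nworkers; i++)
		pthread_join(tids[i], NULL);
	return last_reported;
}
```

The termination condition checks the completed-index counter against the number of indexes, which is exactly the check suggested above instead of watching VacuumActiveNWorkers.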
Re: Add index scan progress to pg_stat_progress_vacuum
>One idea would be to add a flag, say report_parallel_vacuum_progress, >to IndexVacuumInfo struct and expect index AM to check and update the >parallel index vacuum progress, say every 1GB blocks processed. The >flag is true only when the leader process is vacuuming an index. >Regards, Sorry for the long delay on this. I have taken the approach as suggested by Sawada-san and Robert and attached is v12. 1. The patch introduces a new counter in the same shared memory already used by the parallel leader and workers to keep track of the number of indexes completed. This way there is no reason to loop through the index status every time we want to get the status of indexes completed. 2. A new function in vacuumparallel.c will be used to update the progress of indexes completed by reading from the counter created in point #1. 3. The function is called during the vacuum_delay_point as a matter of convenience, since this is called in all major vacuum loops. The function will only do something if the caller sets a boolean to report progress. Doing so will also ensure progress is being reported in case the parallel workers completed before the leader. 4. Rather than adding any complexity to WaitForParallelWorkersToFinish and introducing a new callback, vacuumparallel.c will wait until the number of vacuum workers is 0 and then call WaitForParallelWorkersToFinish as it does currently. 5. Went back to the idea of adding a new view called pg_stat_progress_vacuum_index which is accomplished by adding a new type called VACUUM_PARALLEL in progress.h Thanks, Sami Imseih Amazon Web Services (AWS) FYI: the above message was originally sent yesterday but was created under a separate thread. Please ignore this thread[1] [1]: https://www.postgresql.org/message-id/flat/4CD97E17-B9E4-421E-9A53-4317C90EFF35%40amazon.com v12-0001--Show-progress-for-index-vacuums.patch Description: v12-0001--Show-progress-for-index-vacuums.patch
Re: Add index scan progress to pg_stat_progress_vacuum
> One idea would be to add a flag, say report_parallel_vacuum_progress, > to IndexVacuumInfo struct and expect index AM to check and update the > parallel index vacuum progress, say every 1GB blocks processed. The > flag is true only when the leader process is vacuuming an index. Sorry for the long delay on this. I have taken the approach as suggested by Sawada-san and Robert and attached is v12. 1. The patch introduces a new counter in the shared memory already used by the parallel leader and workers to keep track of the number of indexes completed. This way there is no reason to loop through the index status every time we want to get the status of indexes completed. 2. A new function in vacuumparallel.c will be used to update the progress of indexes completed by reading from the counter created in point #1. 3. The function is called during the vacuum_delay_point as a matter of convenience, since it's called in all major vacuum loops. The function will only do anything if the caller sets a boolean to report progress. Doing so will also ensure progress is being reported in case the parallel workers completed before the leader. 4. Rather than adding any complexity to WaitForParallelWorkersToFinish and introducing a new callback, vacuumparallel.c will wait until the number of vacuum workers is 0 and then proceed to call WaitForParallelWorkersToFinish as it does currently. 5. Went back to the idea of adding a new view called pg_stat_progress_vacuum_index which is accomplished by adding a new type called VACUUM_PARALLEL in progress.h Thanks, Sami Imseih Amazon Web Services (AWS) v12-0001--Show-progress-for-index-vacuums.patch Description: v12-0001--Show-progress-for-index-vacuums.patch
Re: Add index scan progress to pg_stat_progress_vacuum
This entry has been waiting on author input for a while (our current threshold is roughly two weeks), so I've marked it Returned with Feedback. Once you think the patchset is ready for review again, you (or any interested party) can resurrect the patch entry by visiting https://commitfest.postgresql.org/38/3617/ and changing the status to "Needs Review", and then changing the status again to "Move to next CF". (Don't forget the second step; hopefully we will have streamlined this in the near future!) Thanks, --Jacob
Re: Add index scan progress to pg_stat_progress_vacuum
On Mon, Jun 6, 2022 at 11:42 PM Robert Haas wrote: > > On Thu, May 26, 2022 at 11:43 AM Masahiko Sawada > wrote: > > Another idea I came up with is that we can wait for all index vacuums > > to finish while checking and updating the progress information, and > > then call WaitForParallelWorkersToFinish after confirming all index > > status became COMPLETED. That way, we don’t need to change the > > parallel query infrastructure. What do you think? > > +1 from me. It doesn't seem to me that we should need to add something > like parallel_vacuum_progress_callback in order to solve this problem, > because the parallel index vacuum code could just do the waiting > itself, as you propose here. > > The question Sami asks in his reply is a good one, though -- who is > to say that the leader only needs to update progress at the end, once > it's finished the index it's handling locally? There will need to be a > callback system of some kind to allow the leader to update progress as > other workers finish, even if the leader is still working. I am not > too sure that the idea of using the vacuum delay points is the best > plan. I think we should try to avoid piggybacking on such general > infrastructure if we can, and instead look for a way to tie this to > something that is specific to parallel vacuum. However, I haven't > studied the problem so I'm not sure whether there's a reasonable way > to do that. One idea would be to add a flag, say report_parallel_vacuum_progress, to IndexVacuumInfo struct and expect index AM to check and update the parallel index vacuum progress, say every 1GB blocks processed. The flag is true only when the leader process is vacuuming an index. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
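The "every 1GB of blocks" idea above can be sketched as follows (self-contained C, hypothetical names; in the real patch the check would sit in the index AM's block loop and call something like parallel_vacuum_progress_report()):

```c
#include <assert.h>
#include <stdint.h>

/* With PostgreSQL's default 8kB block size, ~1GB is 131072 blocks. */
#define REPORT_INTERVAL_BLOCKS (1024u * 1024u * 1024u / 8192u)

typedef void (*progress_cb) (void *state);

/*
 * Simulated index-AM block loop: only when the (hypothetical)
 * report_parallel_vacuum_progress flag is set -- i.e. in the leader
 * process -- does the loop invoke the progress callback, and only once
 * per REPORT_INTERVAL_BLOCKS blocks. Returns how many times it reported.
 */
int
scan_index_blocks(uint32_t nblocks, int report_parallel_vacuum_progress,
				  progress_cb cb, void *state)
{
	int			ncalls = 0;

	for (uint32_t blkno = 0; blkno < nblocks; blkno++)
	{
		/* ... vacuum one index block here ... */
		if (report_parallel_vacuum_progress &&
			(blkno + 1) % REPORT_INTERVAL_BLOCKS == 0)
		{
			if (cb)
				cb(state);	/* e.g. parallel_vacuum_progress_report(pvs) */
			ncalls++;
		}
	}
	return ncalls;
}
```

The interval keeps the reporting cost negligible relative to the block scan itself, while workers (flag off) never pay anything.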
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, May 26, 2022 at 11:43 AM Masahiko Sawada wrote: > Another idea I came up with is that we can wait for all index vacuums > to finish while checking and updating the progress information, and > then call WaitForParallelWorkersToFinish after confirming all index > status became COMPLETED. That way, we don’t need to change the > parallel query infrastructure. What do you think? +1 from me. It doesn't seem to me that we should need to add something like parallel_vacuum_progress_callback in order to solve this problem, because the parallel index vacuum code could just do the waiting itself, as you propose here. The question Sami asks in his reply is a good one, though -- who is to say that the leader only needs to update progress at the end, once it's finished the index it's handling locally? There will need to be a callback system of some kind to allow the leader to update progress as other workers finish, even if the leader is still working. I am not too sure that the idea of using the vacuum delay points is the best plan. I think we should try to avoid piggybacking on such general infrastructure if we can, and instead look for a way to tie this to something that is specific to parallel vacuum. However, I haven't studied the problem so I'm not sure whether there's a reasonable way to do that. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add index scan progress to pg_stat_progress_vacuum
On Fri, May 27, 2022 at 10:52 AM Imseih (AWS), Sami wrote: > > >Another idea I came up with is that we can wait for all index vacuums > >to finish while checking and updating the progress information, and > >then call WaitForParallelWorkersToFinish after confirming all index > >status became COMPLETED. That way, we don’t need to change the > >parallel query infrastructure. What do you think? > > Thinking about this a bit more, the idea of using > WaitForParallelWorkersToFinish > will not work if you have a leader worker that is > stuck on a large index. The progress will not be updated > until the leader completes, even if the parallel workers > finish. Right. > > What are your thoughts about piggybacking on the > vacuum_delay_point to update progress? The leader can > perhaps keep a counter to update progress every few thousand > calls to vacuum_delay_point. > > This goes back to your original idea to keep updating progress > while scanning the indexes. I think we can have the leader process wait for all index statuses to become COMPLETED before WaitForParallelWorkersToFinish(). While waiting for it, the leader can update its progress information. After the leader confirmed all index statuses became COMPLETED, it can wait for the workers to finish by WaitForParallelWorkersToFinish(). Regarding waiting in vacuum_delay_point, it might have side effects as it’s called for every page and is used not only by vacuum but also by other commands such as analyze, but it seems to be worth trying. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
>Another idea I came up with is that we can wait for all index vacuums >to finish while checking and updating the progress information, and >then call WaitForParallelWorkersToFinish after confirming all index >status became COMPLETED. That way, we don’t need to change the >parallel query infrastructure. What do you think? Thinking about this a bit more, the idea of using WaitForParallelWorkersToFinish will not work if you have a leader worker that is stuck on a large index. The progress will not be updated until the leader completes, even if the parallel workers finish. What are your thoughts about piggybacking on the vacuum_delay_point to update progress? The leader can perhaps keep a counter to update progress every few thousand calls to vacuum_delay_point. This goes back to your original idea to keep updating progress while scanning the indexes.

/*
 * vacuum_delay_point --- check for interrupts and cost-based delay.
 *
 * This should be called in each major loop of VACUUM processing,
 * typically once per page processed.
 */
void
vacuum_delay_point(void)
{

--- Sami Imseih Amazon Web Services
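The counter idea above can be sketched like this (self-contained C, hypothetical names; a real version would live near vacuum_delay_point() and only be active for a parallel vacuum leader):

```c
#include <assert.h>

/* Report only every Nth call so the per-page overhead stays negligible. */
#define PROGRESS_REPORT_EVERY 4096

static unsigned int delay_point_calls = 0;

/*
 * Hypothetical hook for vacuum_delay_point(): returns 1 on the calls
 * that would actually refresh pg_stat_progress_vacuum, 0 otherwise.
 */
int
maybe_report_progress(void)
{
	if (++delay_point_calls % PROGRESS_REPORT_EVERY == 0)
		return 1;		/* leader would update its progress view here */
	return 0;
}
```

Since vacuum_delay_point() is called roughly once per page, a threshold of a few thousand keeps the reporting lag to a few tens of MB of scanned pages.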
Re: Add index scan progress to pg_stat_progress_vacuum
On Fri, May 6, 2022 at 4:26 AM Imseih (AWS), Sami wrote: > > Thank you for the feedback! > > >I think we can pass the progress update function to > > WaitForParallelWorkersToFinish(), which seems simpler. And we can call > > Directly passing the callback to WaitForParallelWorkersToFinish > will require us to modify the function signature. > > To me, it seemed simpler and touches less code to have > the caller set the callback in the ParallelContext. Okay, but if we do that, I think we should add comments about when it's used. The callback is used only by WaitForParallelWorkersToFinish(), but not by WaitForParallelWorkersToExit(). Another idea I came up with is that we can wait for all index vacuums to finish while checking and updating the progress information, and then call WaitForParallelWorkersToFinish after confirming all index statuses became COMPLETED. That way, we don’t need to change the parallel query infrastructure. What do you think? > > >the function after updating the index status to > >PARALLEL_INDVAC_STATUS_COMPLETED. > > I also like this better. Will make the change. > > >BTW, currently we don't need a lock for touching index status since > >each worker touches different indexes. But after this patch, the > >leader will touch all index status, do we need a lock for that? > > I do not think locking is needed here. The leader and workers > will continue to touch different indexes to update the status. > > However, if the process is a leader, it will call the function > which will go through indstats and count how many > indexes have a status of PARALLEL_INDVAC_STATUS_COMPLETED. > This value is then reported to the leader's backend only. I was concerned that the leader process could report the wrong progress if updating and checking index status happen concurrently. But I think it should be fine since we can read PVIndVacStatus atomically. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
Thank you for the feedback! >I think we can pass the progress update function to > WaitForParallelWorkersToFinish(), which seems simpler. And we can call Directly passing the callback to WaitForParallelWorkersToFinish will require us to modify the function signature. To me, it seemed simpler and touches less code to have the caller set the callback in the ParallelContext. >the function after updating the index status to >PARALLEL_INDVAC_STATUS_COMPLETED. I also like this better. Will make the change. >BTW, currently we don't need a lock for touching index status since >each worker touches different indexes. But after this patch, the >leader will touch all index status, do we need a lock for that? I do not think locking is needed here. The leader and workers will continue to touch different indexes to update the status. However, if the process is a leader, it will call the function which will go through indstats and count how many indexes have a status of PARALLEL_INDVAC_STATUS_COMPLETED. This value is then reported to the leader's backend only. Regards, Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Apr 14, 2022 at 10:32 AM Imseih (AWS), Sami wrote: > > Taking the above feedback, I modified the patches > and I believe this approach should be acceptable. > > For now, I also reduced the scope to only exposing > indexes_total and indexes_completed in > pg_stat_progress_vacuum. I will create a new CF entry > for the new view pg_stat_progress_vacuum_index. > > V10-0001: This patch adds a callback to ParallelContext > that could be called by the leader in vacuumparallel.c > and more importantly while the leader is waiting > for the parallel workers to complete inside. Thank you for updating the patch! The new design looks much better to me.

 typedef struct ParallelWorkerInfo
@@ -46,6 +49,8 @@ typedef struct ParallelContext
 	ParallelWorkerInfo *worker;
 	int			nknown_attached_workers;
 	bool	   *known_attached_workers;
+	ParallelProgressCallback parallel_progress_callback;
+	void	   *parallel_progress_callback_arg;
 } ParallelContext;

I think we can pass the progress update function to WaitForParallelWorkersToFinish(), which seems simpler. And we can call the function after updating the index status to PARALLEL_INDVAC_STATUS_COMPLETED. BTW, currently we don't need a lock for touching index status since each worker touches different indexes. But after this patch, the leader will touch all index status, do we need a lock for that? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
It looks like this patch got feedback from Andres and Robert with some significant design change recommendations. I'm marking the patch Returned with Feedback. Feel free to add it back to a future commitfest when a new version is ready.
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Apr 7, 2022 at 10:20 PM Robert Haas wrote: > > On Wed, Apr 6, 2022 at 5:22 PM Imseih (AWS), Sami wrote: > > >At the beginning of a parallel operation, we allocate a chunk of > > >dynamic shared memory which persists even after some or all workers > > >have exited. It's only torn down at the end of the parallel operation. > > >That seems like the appropriate place to be storing any kind of data > > >that needs to be propagated between parallel workers. The current > > >patch uses the main shared memory segment, which seems unacceptable to > > >me. > > > > Correct, DSM does track shared data. However only participating > > processes in the parallel vacuum can attach and look up this data. > > > > The purpose of the main shared memory is to allow a process that > > is querying the progress views to retrieve the information. > > Sure, but I think that you should likely be doing what Andres > recommended before: > > # Why isn't the obvious thing to do here to provide a way to associate workers > # with their leaders in shared memory, but to use the existing progress fields > # to report progress? Then, when querying progress, the leader and workers > # progress fields can be combined to show the overall progress? > > That is, I am imagining that you would want to use DSM to propagate > data from workers back to the leader and then have the leader report > the data using the existing progress-reporting facilities. Now, if we > really need a whole row from each worker that doesn't work, but why do > we need that? +1 I also proposed the same idea before[1]. The leader can know how many indexes are processed so far by checking PVIndStats.status allocated on DSM for each index. We can have the leader check it and update the progress information before and after vacuuming one index.
If we want to update the progress information in a more timely manner, we can probably pass a callback function to ambulkdelete and amvacuumcleanup so that the leader can do that periodically, e.g., every 1000 blocks, while vacuuming an index. Regards, [1] https://www.postgresql.org/message-id/CAD21AoBW6SMJ96CNoMeu%2Bf_BR4jmatPcfVA016FdD2hkLDsaTA%40mail.gmail.com -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Apr 6, 2022 at 5:22 PM Imseih (AWS), Sami wrote: > >At the beginning of a parallel operation, we allocate a chunk of > >dynamic shared memory which persists even after some or all workers > >have exited. It's only torn down at the end of the parallel operation. > >That seems like the appropriate place to be storing any kind of data > >that needs to be propagated between parallel workers. The current > >patch uses the main shared memory segment, which seems unacceptable to > >me. > > Correct, DSM does track shared data. However only participating > processes in the parallel vacuum can attach and look up this data. > > The purpose of the main shared memory is to allow a process that > is querying the progress views to retrieve the information. Sure, but I think that you should likely be doing what Andres recommended before: # Why isn't the obvious thing to do here to provide a way to associate workers # with their leaders in shared memory, but to use the existing progress fields # to report progress? Then, when querying progress, the leader and workers # progress fields can be combined to show the overall progress? That is, I am imagining that you would want to use DSM to propagate data from workers back to the leader and then have the leader report the data using the existing progress-reporting facilities. Now, if we really need a whole row from each worker that doesn't work, but why do we need that? -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add index scan progress to pg_stat_progress_vacuum
>At the beginning of a parallel operation, we allocate a chunk of >dynamic shared memory which persists even after some or all workers >have exited. It's only torn down at the end of the parallel operation. >That seems like the appropriate place to be storing any kind of data >that needs to be propagated between parallel workers. The current >patch uses the main shared memory segment, which seems unacceptable to >me. Correct, DSM does track shared data. However, only participating processes in the parallel vacuum can attach and look up this data. The purpose of the main shared memory is to allow a process that is querying the progress views to retrieve the information. Regards, Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
>Can't the progress data trivially be inferred by the fact that the worker >completed? Yes, at some point, this idea was experimented with in 0004-Expose-progress-for-the-vacuuming-indexes-cleanup-ph.patch. This patch did the calculation in system_views.sql. However, the view is complex and there could be some edge cases with inferring the values that could lead to wrong values being reported. Regards, Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
On 2022-04-05 16:42:28 +, Imseih (AWS), Sami wrote: > >Why isn't the obvious thing to do here to provide a way to associate > > workers > >with their leaders in shared memory, but to use the existing progress > > fields > >to report progress? Then, when querying progress, the leader and workers > >progress fields can be combined to show the overall progress? > > The original intent was this, however the workers > can exit before the command completes and the > worker progress data will be lost. Can't the progress data trivially be inferred by the fact that the worker completed?
Re: Add index scan progress to pg_stat_progress_vacuum
On Tue, Apr 5, 2022 at 12:42 PM Imseih (AWS), Sami wrote: > >Why isn't the obvious thing to do here to provide a way to associate > > workers > >with their leaders in shared memory, but to use the existing progress > > fields > >to report progress? Then, when querying progress, the leader and workers > >progress fields can be combined to show the overall progress? > > The original intent was this, however the workers > can exit before the command completes and the > worker progress data will be lost. > This is why the shared memory was introduced. > This allows the worker progress to persist for the duration > of the command. At the beginning of a parallel operation, we allocate a chunk of dynamic shared memory which persists even after some or all workers have exited. It's only torn down at the end of the parallel operation. That seems like the appropriate place to be storing any kind of data that needs to be propagated between parallel workers. The current patch uses the main shared memory segment, which seems unacceptable to me. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Add index scan progress to pg_stat_progress_vacuum
>I nevertheless think that's not acceptable. The whole premise of the > progress >reporting infrastructure is to be low overhead. It's OK to require locking > to >initialize parallel progress reporting, it's definitely not ok to require >locking to report progress. Fair point. >Why isn't the obvious thing to do here to provide a way to associate > workers >with their leaders in shared memory, but to use the existing progress > fields >to report progress? Then, when querying progress, the leader and workers >progress fields can be combined to show the overall progress? The original intent was this, however the workers can exit before the command completes and the worker progress data will be lost. This is why the shared memory was introduced. This allows the worker progress to persist for the duration of the command. Regards, Sami Imseih Amazon Web Services.
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, On 2022-03-29 12:25:52 +, Imseih (AWS), Sami wrote: > > I think that's an absolute no-go. Adding locking to progress reporting, > > particularly a single central lwlock, is going to *vastly* increase the > > overhead incurred by progress reporting. > > Sorry for the late reply. > > The usage of the shared memory will be limited > to PARALLEL maintenance operations. For now, > it will only be populated for parallel vacuums. > Autovacuum for example will not be required to > populate this shared memory. I nevertheless think that's not acceptable. The whole premise of the progress reporting infrastructure is to be low overhead. It's OK to require locking to initialize parallel progress reporting, it's definitely not ok to require locking to report progress. Leaving the locking aside, doing a hashtable lookup for each progress report is pretty expensive. Why isn't the obvious thing to do here to provide a way to associate workers with their leaders in shared memory, but to use the existing progress fields to report progress? Then, when querying progress, the leader and workers progress fields can be combined to show the overall progress? Greetings, Andres Freund
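Andres's suggestion can be illustrated with a small, self-contained sketch (plain C, hypothetical structures; the real per-backend slots would be the existing st_progress_param arrays in backend status shared memory): each backend writes only its own slot, and the view combines leader and worker slots at read time, so the reporting path takes no lock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-backend progress slot. */
struct progress_slot
{
	int			pid;			/* 0 = slot unused */
	int			leader_pid;		/* equals pid for the leader itself */
	int64_t		indexes_done;
};

/*
 * Query-time aggregation: sum the leader's own counter with those of any
 * workers advertising the same leader_pid. Writers never take a lock;
 * only the reader walks the slots.
 */
int64_t
combined_indexes_done(const struct progress_slot *slots, int nslots,
					  int leader_pid)
{
	int64_t		total = 0;

	for (int i = 0; i < nslots; i++)
		if (slots[i].pid != 0 && slots[i].leader_pid == leader_pid)
			total += slots[i].indexes_done;
	return total;
}
```

This also sidesteps the worker-exit problem only partially: a worker's slot disappears when it exits, which is why the thread later moves the counter into the parallel vacuum's DSM instead.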
Re: Add index scan progress to pg_stat_progress_vacuum
> I think that's an absolute no-go. Adding locking to progress reporting, > particularly a single central lwlock, is going to *vastly* increase the > overhead incurred by progress reporting. Sorry for the late reply. The usage of the shared memory will be limited to PARALLEL maintenance operations. For now, it will only be populated for parallel vacuums. Autovacuum for example will not be required to populate this shared memory. Regards, --- Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
Sorry for the late reply. > additional complexity and a possible lag of progress updates. So if we > go with the current approach, I think we need to make sure there are enough (and > not too many) hash table entries. The hash table can be sized at 4 times the value of max_worker_processes, which should give more than enough padding. Note that max_parallel_maintenance_workers is what should be used, but since it's dynamic, it cannot be used to determine the size of shared memory. Regards, --- Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Mar 23, 2022 at 6:57 AM Imseih (AWS), Sami wrote: > > >Can the leader pass a callback that checks PVIndStats to ambulkdelete > >and amvacuumcleanup callbacks? I think that in the passed callback, the > >leader checks the number of processed indexes and updates its > >progress information if the current progress needs to be updated. > > Thanks for the suggestion. > > I looked at this option a bit today and found that passing the callback > will also require signature changes to the ambulkdelete and > amvacuumcleanup routines. I think it would not be a critical problem since it's a new feature. > > This will also require us to check after x pages have been > scanned inside vacuumscan and vacuumcleanup. After x pages > the callback can then update the leader's progress. > I am not sure if adding additional complexity to the scan/cleanup path > is justified for what this patch is attempting to do. > > There will also be a lag of the leader updating the progress as it > must scan x amount of pages before updating. Obviously, the more > pages to scan, the longer the lag will be. Fair points. On the other hand, the approach of the current patch requires more memory for progress tracking, which could fail, e.g., due to running out of hashtable entries. I think that it would be worse if the parallel operation failed to start due to not being able to track the progress than the above concerns you mentioned such as introducing additional complexity and a possible lag of progress updates. So if we go with the current approach, I think we need to make sure there are enough (and not too many) hash table entries. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
>Can the leader pass a callback that checks PVIndStats to ambulkdelete >and amvacuumcleanup callbacks? I think that in the passed callback, the >leader checks the number of processed indexes and updates its >progress information if the current progress needs to be updated. Thanks for the suggestion. I looked at this option a bit today and found that passing the callback will also require signature changes to the ambulkdelete and amvacuumcleanup routines. This will also require us to check after x pages have been scanned inside vacuumscan and vacuumcleanup. After x pages the callback can then update the leader's progress. I am not sure if adding additional complexity to the scan/cleanup path is justified for what this patch is attempting to do. There will also be a lag of the leader updating the progress as it must scan x amount of pages before updating. Obviously, the more pages to scan, the longer the lag will be. Would like to hear your thoughts on the above. Regards, Sami Imseih Amazon Web Services.
Re: Add index scan progress to pg_stat_progress_vacuum
On Tue, Mar 22, 2022 at 4:27 PM Imseih (AWS), Sami wrote: > > >BTW have we discussed another idea I mentioned before that we have the > >leader process periodically check the number of completed indexes and > >advertise it in its progress information? I'm not sure which one is > >better but this idea would require only changes of vacuum code and > >would probably be simpler than the current idea. > > >Regards, > > > If I understand correctly, to accomplish this we will need to have the leader > check the number of indexes completed in the ambulkdelete or amvacuumcleanup > callbacks. These routines do not know about PVIndStats, and they are called > by both parallel and non-parallel vacuums. > > From what I can see, PVIndStats will need to be passed down to these routines > or pass a NULL for non-parallel vacuums. > Can the leader pass a callback that checks PVIndStats to ambulkdelete and amvacuumcleanup callbacks? I think that in the passed callback, the leader checks the number of processed indexes and updates its progress information if the current progress needs to be updated. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
>BTW have we discussed another idea I mentioned before that we have the >leader process periodically check the number of completed indexes and >advertise it in its progress information? I'm not sure which one is >better but this idea would require only changes of vacuum code and >would probably be simpler than the current idea. >Regards, If I understand correctly, to accomplish this we will need to have the leader check the number of indexes completed in the ambulkdelete or amvacuumcleanup callbacks. These routines do not know about PVIndStats, and they are called by both parallel and non-parallel vacuums. From what I can see, PVIndStats will need to be passed down to these routines or pass a NULL for non-parallel vacuums. Sami
Re: Add index scan progress to pg_stat_progress_vacuum
Hi, On 2022-03-16 21:47:49 +, Imseih (AWS), Sami wrote: > From 85c47dfb3bb72f764b9052e74a7282c19ebd9898 Mon Sep 17 00:00:00 2001 > From: "Sami Imseih (AWS)" > Date: Wed, 16 Mar 2022 20:39:52 + > Subject: [PATCH 1/1] Add infrastructure for parallel progress reporting > > Infrastructure to allow a parallel worker to report > progress. In a PARALLEL command, the workers and > leader can report progress using a new pgstat_progress > API. What happens if we run out of memory for hashtable entries?

> +void
> +pgstat_progress_update_param_parallel(int leader_pid, int index, int64 val)
> +{
> +	ProgressParallelEntry *entry;
> +	bool	found;
> +
> +	LWLockAcquire(ProgressParallelLock, LW_EXCLUSIVE);
> +
> +	entry = (ProgressParallelEntry *) hash_search(ProgressParallelHash, &leader_pid, HASH_ENTER, &found);
> +
> +	/*
> +	 * If the entry is not found, set the value for the index'th member,
> +	 * else increment the current value of the index'th member.
> +	 */
> +	if (!found)
> +		entry->st_progress_param[index] = val;
> +	else
> +		entry->st_progress_param[index] += val;
> +
> +	LWLockRelease(ProgressParallelLock);
> +}

I think that's an absolute no-go. Adding locking to progress reporting, particularly a single central lwlock, is going to *vastly* increase the overhead incurred by progress reporting. Greetings, Andres Freund
Re: Add index scan progress to pg_stat_progress_vacuum
Sorry for the late reply. On Tue, Mar 15, 2022 at 1:20 AM Imseih (AWS), Sami wrote: > > >I'm still unsure the current design of 0001 patch is better than other > >approaches we’ve discussed. Even users who don't use parallel vacuum > >are forced to allocate shared memory for index vacuum progress, with > >GetMaxBackends() entries from the beginning. Also, it’s likely to > >extend the progress tracking feature for other parallel operations in > >the future but I think the current design is not extensible. If we > >want to do that, we will end up creating similar things for each of > >them or re-creating index vacuum progress tracking feature while > >creating a common infra. It might not be a problem as of now but I'm > >concerned that introducing a feature that is not extensible and forces > >users to allocate additional shmem might be a blocker in the future. > >Looking at the precedent example, When we introduce the progress > >tracking feature, we implemented it in an extensible way. On the other > >hand, others in this thread seem to agree with this approach, so I'd > >like to leave it to committers. > > Thanks for the review! > > I think you make strong arguments as to why we need to take a different > approach now than later. > > Flaws with current patch set: > > 1. GetMaxBackends() is a really heavy-handed overallocation of a shared > memory serving a very specific purpose. > 2. Going with the approach of a vacuum specific hash breaks the design of > progress which is meant to be extensible. > 3. Even if we go with this current approach as an interim solution, it will > be a real pain in the future. > > With that said, v7 introduces the new infrastructure. 0001 includes the new > infrastructure and 0002 takes advantage of this. > > This approach is the following: > > 1. Introduces a new API called pgstat_progress_update_param_parallel along > with some others support functions. This new infrastructure is in > backend_progress.c > > 2. 
There is still a shared memory involved, but the size is capped to " > max_worker_processes" which is the max to how many parallel workers can be > doing work at any given time. The shared memory hash includes a > st_progress_param array just like the Backend Status array. I think that there is a corner case where a parallel operation could not be performed due to the lack of a free shared hash entry, because there is a window between a parallel worker exiting and the leader deallocating the hash table entry. BTW have we discussed another idea I mentioned before that we have the leader process periodically check the number of completed indexes and advertise it in its progress information? I'm not sure which one is better, but this idea would require only changes to the vacuum code and would probably be simpler than the current idea. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Mar 16, 2022 at 09:47:49PM +, Imseih (AWS), Sami wrote: > Spoke to Nathan offline and fixed some more comments/nitpicks in the patch. I don't have any substantial comments for v9, so I think this can be marked as ready-for-committer. However, we probably should first see whether Sawada-san has any comments on the revised approach. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
>I'm still unsure the current design of 0001 patch is better than other >approaches we’ve discussed. Even users who don't use parallel vacuum >are forced to allocate shared memory for index vacuum progress, with >GetMaxBackends() entries from the beginning. Also, it’s likely to >extend the progress tracking feature for other parallel operations in >the future but I think the current design is not extensible. If we >want to do that, we will end up creating similar things for each of >them or re-creating index vacuum progress tracking feature while >creating a common infra. It might not be a problem as of now but I'm >concerned that introducing a feature that is not extensible and forces >users to allocate additional shmem might be a blocker in the future. >Looking at the precedent example, When we introduce the progress >tracking feature, we implemented it in an extensible way. On the other >hand, others in this thread seem to agree with this approach, so I'd >like to leave it to committers. Thanks for the review! I think you make strong arguments as to why we need to take a different approach now than later. Flaws with current patch set: 1. GetMaxBackends() is a really heavy-handed overallocation of shared memory serving a very specific purpose. 2. Going with the approach of a vacuum specific hash breaks the design of progress which is meant to be extensible. 3. Even if we go with this current approach as an interim solution, it will be a real pain in the future. With that said, v7 introduces the new infrastructure. 0001 includes the new infrastructure and 0002 takes advantage of this. This approach is the following: 1. Introduces a new API called pgstat_progress_update_param_parallel along with some other support functions. This new infrastructure is in backend_progress.c 2. There is still a shared memory involved, but the size is capped to "max_worker_processes", which is the maximum number of parallel workers that can be doing work at any given time. 
The shared memory hash includes a st_progress_param array just like the Backend Status array. typedef struct ProgressParallelEntry { pid_t leader_pid; int64 st_progress_param[PGSTAT_NUM_PROGRESS_PARAM]; } ProgressParallelEntry; 3. The progress update function is "pgstat_progress_update_param_parallel" and will aggregate totals reported for a specific progress parameter. For example, it can be called like below. In the case below, PROGRESS_VACUUM_INDEXES_COMPLETED is incremented by 1 in the shared memory entry shared by the workers and leader. case PARALLEL_INDVAC_STATUS_NEED_BULKDELETE: istat_res = vac_bulkdel_one_index(&ivinfo, istat, pvs->dead_items); pgstat_progress_update_param_parallel(pvs->shared->leader_pid, PROGRESS_VACUUM_INDEXES_COMPLETED, 1); <<- break; 4. pg_stat_get_progress_info will call a function called pgstat_progress_set_parallel which will set the parameter value to the total from the shared memory hash. I believe this approach gives proper infrastructure for future use-cases of workers reporting progress -and- does not do the heavy-handed shared memory allocation. -- Sami Imseih Amazon Web Services v7-0001-Add-infrastructure-for-parallel-progress-reportin.patch Description: v7-0001-Add-infrastructure-for-parallel-progress-reportin.patch v7-0002-Show-progress-for-index-vacuums.patch Description: v7-0002-Show-progress-for-index-vacuums.patch v7-0003-Expose-indexes-being-processed-in-a-VACUUM-operat.patch Description: v7-0003-Expose-indexes-being-processed-in-a-VACUUM-operat.patch v7-0004-Rename-index_vacuum_count-in-pg_stat_pogress_vacu.patch Description: v7-0004-Rename-index_vacuum_count-in-pg_stat_pogress_vacu.patch
Re: Add index scan progress to pg_stat_progress_vacuum
On Sat, Mar 12, 2022 at 4:00 PM Imseih (AWS), Sami wrote: > > > nitpick: Can we remove the extra spaces in the parentheses? > > fixed > > > What does it mean if there isn't an entry in the map? Is this actually > > expected, or should we ERROR instead? > > I cleaned up the code here and added comments. > > > I think the number of entries should be shared between > > VacuumWorkerProgressShmemInit() and VacuumWorkerProgressShmemSize(). > > Otherwise, we might update one and not the other. > > Fixed > > > I think we should elaborate a bit more in this comment. It's difficult to > > follow what this is doing without referencing the comment above > > set_vacuum_worker_progress(). > > More comments added > > I also simplified the 0002 patch as well. I'm still unsure the current design of 0001 patch is better than other approaches we’ve discussed. Even users who don't use parallel vacuum are forced to allocate shared memory for index vacuum progress, with GetMaxBackends() entries from the beginning. Also, it’s likely to extend the progress tracking feature for other parallel operations in the future but I think the current design is not extensible. If we want to do that, we will end up creating similar things for each of them or re-creating index vacuum progress tracking feature while creating a common infra. It might not be a problem as of now but I'm concerned that introducing a feature that is not extensible and forces users to allocate additional shmem might be a blocker in the future. Looking at the precedent example, when we introduced the progress tracking feature, we implemented it in an extensible way. On the other hand, others in this thread seem to agree with this approach, so I'd like to leave it to committers. 
Anyway, here are some comments on v5-0001 patch: +/* in commands/vacuumparallel.c */ +extern void VacuumWorkerProgressShmemInit(void); +extern Size VacuumWorkerProgressShmemSize(void); +extern void vacuum_worker_end(int leader_pid); +extern void vacuum_worker_update(int leader_pid); +extern void vacuum_worker_end_callback(int code, Datum arg); +extern void set_vaccum_worker_progress(Datum *values); These functions' bodies are not in vacuumparallel.c. As the comment says, I think these functions should be implemented in vacuumparallel.c. --- +/* + * set_vaccum_worker_progress --- updates the number of indexes that have been + * vacuumed or cleaned up in a parallel vacuum. + */ +void +set_vaccum_worker_progress(Datum *values) s/vaccum/vacuum/ --- +void +set_vaccum_worker_progress(Datum *values) +{ +VacProgressEntry *entry; +int leader_pid = values[0]; I think we should use DatumGetInt32(). --- +entry = (VacProgressEntry *) hash_search(VacuumWorkerProgressHash, &leader_pid, HASH_ENTER_NULL, &found); + +if (!entry) +elog(ERROR, "cannot allocate shared memory for vacuum worker progress"); Since we raise an error in case of out of memory, I think we can use HASH_ENTER instead of HASH_ENTER_NULL. Or do we want to emit a detailed error message here? --- + VacuumWorkerProgressHash = NULL; This line is not necessary. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
On Sat, Mar 12, 2022 at 07:00:06AM +, Imseih (AWS), Sami wrote: >> nitpick: Can we remove the extra spaces in the parentheses? > > fixed > >> What does it mean if there isn't an entry in the map? Is this actually >> expected, or should we ERROR instead? > > I cleaned up the code here and added comments. > >> I think the number of entries should be shared between >> VacuumWorkerProgressShmemInit() and VacuumWorkerProgressShmemSize(). >> Otherwise, we might update one and not the other. > > Fixed > >> I think we should elaborate a bit more in this comment. It's difficult to >> follow what this is doing without referencing the comment above >> set_vacuum_worker_progress(). > > More comments added > > I also simplified the 0002 patch as well. These patches look pretty good to me. Barring additional feedback, I intend to mark this as ready-for-committer early next week. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Mar 10, 2022 at 09:30:57PM +, Imseih (AWS), Sami wrote: > Attached v4 which includes accounting for the hash size on startup, removal > of the no longer needed comment in pgstatfuncs.c and a change in both > code/docs to only reset the indexes_total to 0 when failsafe is triggered. Thanks for the new patch set. +/* + * Structs for tracking shared Progress information + * amongst worker ( and leader ) processes of a vacuum. + */ nitpick: Can we remove the extra spaces in the parentheses? +if (entry != NULL) +values[PGSTAT_NUM_PROGRESS_COMMON + PROGRESS_VACUUM_INDEXES_COMPLETED] = entry->indexes_processed; What does it mean if there isn't an entry in the map? Is this actually expected, or should we ERROR instead? +/* vacuum worker progress hash table */ +max_table_size = GetMaxBackends(); +size = add_size(size, hash_estimate_size(max_table_size, + sizeof(VacProgressEntry))); I think the number of entries should be shared between VacuumWorkerProgressShmemInit() and VacuumWorkerProgressShmemSize(). Otherwise, we might update one and not the other. +/* Call the command specific function to override datum values */ +if (pg_strcasecmp(cmd, "VACUUM") == 0) +set_vaccum_worker_progress(values); I think we should elaborate a bit more in this comment. It's difficult to follow what this is doing without referencing the comment above set_vacuum_worker_progress(). IMO the patches are in decent shape, and this should likely be marked as ready-for-committer in the near future. Before doing so, I think we should check that Sawada-san is okay with moving the deeper infrastructure changes to a separate thread. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
> > BTreeShmemInit(); > > SyncScanShmemInit(); > > AsyncShmemInit(); > > + vacuum_worker_init(); > > Don't we also need to add the size of the hash table to > > CalculateShmemSize()? > No, ShmemInitHash takes the min and max size of the hash and in turn calls > ShmemInitStruct to setup the shared memory. Sorry, I am wrong here. The size needs to be accounted for at startup. -- Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
>I took a look at the latest patch set. >+ >+ indexes_total bigint >+ >+ >+ The number of indexes to be processed in the >+ vacuuming indexes >+ or cleaning up indexes phase. It is set to >+ 0 when vacuum is not in any of these phases. >+ >Could we avoid resetting it to 0 unless INDEX_CLEANUP was turned off or >failsafe kicked in? It might be nice to know how many indexes the vacuum >intends to process. I don't feel too strongly about this, so if it would >add a lot of complexity, it might be okay as is. Your suggestion is valid. With INDEX_CLEANUP turned off, it is set to 0 from the start, and when failsafe kicks in it will be reset to 0. I will remove the reset call for the common index vacuum path. > BTreeShmemInit(); > SyncScanShmemInit(); > AsyncShmemInit(); > + vacuum_worker_init(); > Don't we also need to add the size of the hash table to > CalculateShmemSize()? No, ShmemInitHash takes the min and max size of the hash and in turn calls ShmemInitStruct to setup the shared memory. >+ * A command type can optionally define a callback function >+ * which will derive Datum values rather than use values >+ * directly from the backends progress array. >I think this can be removed. Good catch. -- Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
I took a look at the latest patch set. + + indexes_total bigint + + + The number of indexes to be processed in the + vacuuming indexes + or cleaning up indexes phase. It is set to + 0 when vacuum is not in any of these phases. + Could we avoid resetting it to 0 unless INDEX_CLEANUP was turned off or failsafe kicked in? It might be nice to know how many indexes the vacuum intends to process. I don't feel too strongly about this, so if it would add a lot of complexity, it might be okay as is. BTreeShmemInit(); SyncScanShmemInit(); AsyncShmemInit(); + vacuum_worker_init(); Don't we also need to add the size of the hash table to CalculateShmemSize()? + * A command type can optionally define a callback function + * which will derive Datum values rather than use values + * directly from the backends progress array. I think this can be removed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
>I think if it's a better approach we can do that including adding a >new infrastructure for it. +1 This is a beneficial idea, especially for other progress reporting, but I see this as a separate thread targeting the next major version.
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Mar 9, 2022 at 11:35 AM Imseih (AWS), Sami wrote: > > >Indeed. > > >It might have already been discussed but other than using a new shmem > >hash for parallel vacuum, I wonder if we can allow workers to change > >the leader’s progress information. It would break the assumption that > >the backend status entry is modified by its own backend, though. But > >it might help for progress updates of other parallel operations too. > >This essentially does the same thing as what the current patch does > >but it doesn't require a new shmem hash. > > I experimented with this idea, but it did not work. The idea would have been > to create a pgstat_progress_update function that takes the leader pid, > however infrastructure does not exist to allow one backend to manipulate > another backends backend status array. > pgstat_fetch_stat_beentry returns a local copy only. I think if it's a better approach we can do that including adding a new infrastructure for it. > > >Another idea I come up with is that the parallel vacuum leader checks > >PVIndStats.status and updates how many indexes are processed to its > >progress information. The leader can check it and update the progress > >information before and after index vacuuming. And possibly we can add > >a callback to the main loop of index AM's bulkdelete and vacuumcleanup > >so that the leader can periodically make it up-to-date. > > >Regards, > > The PVIndStats idea is also one I experimented with but it did not work. The > reason being the backend checking the progress needs to do a shm_toc_lookup > to access the data, but they are not prepared to do so. What I imagined is that the leader checks how many PVIndStats.status entries are PARALLEL_INDVAC_STATUS_COMPLETED and reports the result in its progress information as indexes_processed. That way, the backend checking the progress can see it. 
> > I have not considered the callback in the index AM's bulkdelete and > vacuumcleanup, but I can imagine this is not possible since a leader could be > busy vacuuming rather than updating counters, but I may be misunderstanding > the suggestion. Checking PVIndStats.status values is cheap. Probably the leader can check it every 1GB of index blocks, for example. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
>Indeed. >It might have already been discussed but other than using a new shmem >hash for parallel vacuum, I wonder if we can allow workers to change >the leader’s progress information. It would break the assumption that >the backend status entry is modified by its own backend, though. But >it might help for progress updates of other parallel operations too. >This essentially does the same thing as what the current patch does >but it doesn't require a new shmem hash. I experimented with this idea, but it did not work. The idea would have been to create a pgstat_progress_update function that takes the leader pid; however, the infrastructure does not exist to allow one backend to manipulate another backend's backend status array. pgstat_fetch_stat_beentry returns a local copy only. >Another idea I come up with is that the parallel vacuum leader checks >PVIndStats.status and updates how many indexes are processed to its >progress information. The leader can check it and update the progress >information before and after index vacuuming. And possibly we can add >a callback to the main loop of index AM's bulkdelete and vacuumcleanup >so that the leader can periodically make it up-to-date. >Regards, The PVIndStats idea is also one I experimented with but it did not work. The reason is that the backend checking the progress needs to do a shm_toc_lookup to access the data, but it is not prepared to do so. I have not considered the callback in the index AM's bulkdelete and vacuumcleanup, but I can imagine this is not possible since a leader could be busy vacuuming rather than updating counters, but I may be misunderstanding the suggestion. -- Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Mar 9, 2022 at 12:41 AM Imseih (AWS), Sami wrote: > > ++/* > ++ * vacuum_worker_init --- initialize this module's shared memory hash > ++ * to track the progress of a vacuum worker > ++ */ > ++void > ++vacuum_worker_init(void) > ++{ > ++ HASHCTL info; > ++ long max_table_size = GetMaxBackends(); > ++ > ++ VacuumWorkerProgressHash = NULL; > ++ > ++ info.keysize = sizeof(pid_t); > ++ info.entrysize = sizeof(VacProgressEntry); > ++ > ++ VacuumWorkerProgressHash = ShmemInitHash("Vacuum Progress Hash", > ++ > + max_table_size, > ++ > + max_table_size, > ++ > + &info, > ++ > + HASH_ELEM | HASH_BLOBS); > ++} > > +It seems to me that creating a shmem hash with max_table_size entries > +for parallel vacuum process tracking is too much. IIRC an old patch > +had parallel vacuum workers advertise its progress and changed the > +pg_stat_progress_vacuum view so that it aggregates the results > +including workers' stats. I think it’s better than the current one. > +Why did you change that? > > +Regards, > > I was trying to avoid a shared memory to track completed indexes, but > aggregating stats does not work with parallel vacuums. This is because a > parallel worker will exit before the vacuum completes causing the aggregated > total to be wrong. > > For example > > Leader_pid advertises it completed 2 indexes > Parallel worker advertises it completed 2 indexes > > When aggregating we see 4 indexes completed. > > After the parallel worker exits, the aggregation will show only 2 indexes > completed. Indeed. It might have already been discussed but other than using a new shmem hash for parallel vacuum, I wonder if we can allow workers to change the leader’s progress information. It would break the assumption that the backend status entry is modified by its own backend, though. But it might help for progress updates of other parallel operations too. This essentially does the same thing as what the current patch does but it doesn't require a new shmem hash. 
Another idea I come up with is that the parallel vacuum leader checks PVIndStats.status and updates how many indexes are processed to its progress information. The leader can check it and update the progress information before and after index vacuuming. And possibly we can add a callback to the main loop of index AM's bulkdelete and vacuumcleanup so that the leader can periodically make it up-to-date. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
++/* ++ * vacuum_worker_init --- initialize this module's shared memory hash ++ * to track the progress of a vacuum worker ++ */ ++void ++vacuum_worker_init(void) ++{ ++ HASHCTL info; ++ long max_table_size = GetMaxBackends(); ++ ++ VacuumWorkerProgressHash = NULL; ++ ++ info.keysize = sizeof(pid_t); ++ info.entrysize = sizeof(VacProgressEntry); ++ ++ VacuumWorkerProgressHash = ShmemInitHash("Vacuum Progress Hash", ++ + max_table_size, ++ + max_table_size, ++ + &info, ++ + HASH_ELEM | HASH_BLOBS); ++} +It seems to me that creating a shmem hash with max_table_size entries +for parallel vacuum process tracking is too much. IIRC an old patch +had parallel vacuum workers advertise its progress and changed the +pg_stat_progress_vacuum view so that it aggregates the results +including workers' stats. I think it’s better than the current one. +Why did you change that? +Regards, I was trying to avoid using shared memory to track completed indexes, but aggregating stats does not work with parallel vacuums. This is because a parallel worker will exit before the vacuum completes, causing the aggregated total to be wrong. For example Leader_pid advertises it completed 2 indexes Parallel worker advertises it completed 2 indexes When aggregating we see 4 indexes completed. After the parallel worker exits, the aggregation will show only 2 indexes completed. -- Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
On Thu, Mar 3, 2022 at 2:08 PM Imseih (AWS), Sami wrote: > > >>> If the failsafe kicks in midway through a vacuum, the number > > indexes_total will not be reset to 0. If INDEX_CLEANUP is turned off, then > > the value will be 0 at the start of the vacuum. > >> > >> The way that this works with num_index_scans is that we "round up" > >> when there has been non-zero work in lazy_vacuum_all_indexes(), but > >> not if the precheck in lazy_vacuum_all_indexes() fails. That seems > >> like a good model to generalize from here. Note that this makes > >> INDEX_CLEANUP=off affect num_index_scans in much the same way as a > >> VACUUM where the failsafe kicks in very early, during the initial heap > >> pass. That is, if the failsafe kicks in before we reach lazy_vacuum() > >> for the first time (which is not unlikely), or even in the > >> lazy_vacuum_all_indexes() precheck, then num_index_scans will remain > >> at 0, just like INDEX_CLEANUP=off. > >> > >> The actual failsafe WARNING shows num_index_scans, possibly before it > >> gets incremented one last time (by "rounding up"). So it's reasonably > >> clear how this all works from that context (assuming that the > >> autovacuum logging stuff, which reports num_index_scans, outputs a > >> report for a table where the failsafe kicked in). > > >I am confused. If failsafe kicks in during the middle of a vacuum, I > > (perhaps naively) would expect indexes_total and indexes_processed to go > > to > >zero, and I'd expect to no longer see the "vacuuming indexes" and > > "cleaning > > up indexes" phases. Otherwise, how would I know that we are now skipping > > indexes? Of course, you won't have any historical context about the index > > work done before failsafe kicked in, but IMO it is misleading to still > > include it in the progress view. > > After speaking with Nathan offline, the best forward is to reset > indexes_total and indexes_processed to 0 after the start of "vacuuming > indexes" or "cleaning up indexes" phase. 
+1 +/* + * vacuum_worker_init --- initialize this module's shared memory hash + * to track the progress of a vacuum worker + */ +void +vacuum_worker_init(void) +{ + HASHCTL info; + long max_table_size = GetMaxBackends(); + + VacuumWorkerProgressHash = NULL; + + info.keysize = sizeof(pid_t); + info.entrysize = sizeof(VacProgressEntry); + + VacuumWorkerProgressHash = ShmemInitHash("Vacuum Progress Hash", + max_table_size, + max_table_size, + &info, + HASH_ELEM | HASH_BLOBS); +} It seems to me that creating a shmem hash with max_table_size entries for parallel vacuum process tracking is too much. IIRC an old patch had parallel vacuum workers advertise their progress and changed the pg_stat_progress_vacuum view so that it aggregates the results including workers' stats. I think it’s better than the current one. Why did you change that? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Add index scan progress to pg_stat_progress_vacuum
> On Wed, Feb 23, 2022 at 10:02 AM Imseih (AWS), Sami wrote: >> If the failsafe kicks in midway through a vacuum, the number indexes_total will not be reset to 0. If INDEX_CLEANUP is turned off, then the value will be 0 at the start of the vacuum. > > The way that this works with num_index_scans is that we "round up" > when there has been non-zero work in lazy_vacuum_all_indexes(), but > not if the precheck in lazy_vacuum_all_indexes() fails. That seems > like a good model to generalize from here. Note that this makes > INDEX_CLEANUP=off affect num_index_scans in much the same way as a > VACUUM where the failsafe kicks in very early, during the initial heap > pass. That is, if the failsafe kicks in before we reach lazy_vacuum() > for the first time (which is not unlikely), or even in the > lazy_vacuum_all_indexes() precheck, then num_index_scans will remain > at 0, just like INDEX_CLEANUP=off. > > The actual failsafe WARNING shows num_index_scans, possibly before it > gets incremented one last time (by "rounding up"). So it's reasonably > clear how this all works from that context (assuming that the > autovacuum logging stuff, which reports num_index_scans, outputs a > report for a table where the failsafe kicked in). >I am confused. If failsafe kicks in during the middle of a vacuum, I >(perhaps naively) would expect indexes_total and indexes_processed to go to >zero, and I'd expect to no longer see the "vacuuming indexes" and "cleaning >up indexes" phases. Otherwise, how would I know that we are now skipping >indexes? Of course, you won't have any historical context about the index >work done before failsafe kicked in, but IMO it is misleading to still >include it in the progress view. Failsafe occurring in the middle of a vacuum and resetting "indexes_total" to 0 will be misleading. I am thinking that it is a better idea to expose only one column "indexes_remaining". 
If index_cleanup is set to OFF, the value of indexes_remaining will be 0 at the start of the vacuum. If failsafe kicks in during an in-progress vacuum, "indexes_remaining" will be calculated as 0. This approach provides progress based on how many indexes remain, with no ambiguity. -- Sami Imseih Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Feb 23, 2022 at 10:41:36AM -0800, Peter Geoghegan wrote: > On Wed, Feb 23, 2022 at 10:02 AM Imseih (AWS), Sami > wrote: >> If the failsafe kicks in midway through a vacuum, the number indexes_total >> will not be reset to 0. If INDEX_CLEANUP is turned off, then the value will >> be 0 at the start of the vacuum. > > The way that this works with num_index_scans is that we "round up" > when there has been non-zero work in lazy_vacuum_all_indexes(), but > not if the precheck in lazy_vacuum_all_indexes() fails. That seems > like a good model to generalize from here. Note that this makes > INDEX_CLEANUP=off affect num_index_scans in much the same way as a > VACUUM where the failsafe kicks in very early, during the initial heap > pass. That is, if the failsafe kicks in before we reach lazy_vacuum() > for the first time (which is not unlikely), or even in the > lazy_vacuum_all_indexes() precheck, then num_index_scans will remain > at 0, just like INDEX_CLEANUP=off. > > The actual failsafe WARNING shows num_index_scans, possibly before it > gets incremented one last time (by "rounding up"). So it's reasonably > clear how this all works from that context (assuming that the > autovacuum logging stuff, which reports num_index_scans, outputs a > report for a table where the failsafe kicked in). I am confused. If failsafe kicks in during the middle of a vacuum, I (perhaps naively) would expect indexes_total and indexes_processed to go to zero, and I'd expect to no longer see the "vacuuming indexes" and "cleaning up indexes" phases. Otherwise, how would I know that we are now skipping indexes? Of course, you won't have any historical context about the index work done before failsafe kicked in, but IMO it is misleading to still include it in the progress view. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
On Wed, Feb 23, 2022 at 10:02 AM Imseih (AWS), Sami wrote: > If the failsafe kicks in midway through a vacuum, the number indexes_total > will not be reset to 0. If INDEX_CLEANUP is turned off, then the value will > be 0 at the start of the vacuum. The way that this works with num_index_scans is that we "round up" when there has been non-zero work in lazy_vacuum_all_indexes(), but not if the precheck in lazy_vacuum_all_indexes() fails. That seems like a good model to generalize from here. Note that this makes INDEX_CLEANUP=off affect num_index_scans in much the same way as a VACUUM where the failsafe kicks in very early, during the initial heap pass. That is, if the failsafe kicks in before we reach lazy_vacuum() for the first time (which is not unlikely), or even in the lazy_vacuum_all_indexes() precheck, then num_index_scans will remain at 0, just like INDEX_CLEANUP=off. The actual failsafe WARNING shows num_index_scans, possibly before it gets incremented one last time (by "rounding up"). So it's reasonably clear how this all works from that context (assuming that the autovacuum logging stuff, which reports num_index_scans, outputs a report for a table where the failsafe kicked in). -- Peter Geoghegan
Re: Add index scan progress to pg_stat_progress_vacuum
+     indexes_total bigint
+
+       The number of indexes to be processed in the
+       vacuuming indexes or cleaning up indexes phase
+       of the vacuum.
+
+     indexes_processed bigint
+
+       The number of indexes processed in the
+       vacuuming indexes or cleaning up indexes phase.
+       At the start of an index vacuum cycle, this value is set to 0.

> Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
> turned off?

If the failsafe kicks in midway through a vacuum, the number indexes_total
will not be reset to 0. If INDEX_CLEANUP is turned off, then the value will
be 0 at the start of the vacuum.

+typedef struct VacWorkerProgressInfo
+{
+    int         num_vacuums;    /* number of active VACUUMS with parallel workers */
+    int         max_vacuums;    /* max number of VACUUMS with parallel workers */
+    slock_t     mutex;
+    VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
+} VacWorkerProgressInfo;

> max_vacuums appears to just be a local copy of MaxBackends. Does this
> information really need to be stored here? Also, is there a strong reason
> for using a spinlock instead of an LWLock?

First, the BTVacInfo code in backend/access/nbtree/nbtutils.c inspired this,
so I wanted to follow that pattern. With that said, I do see max_vacuums
being redundant here, and I am inclined to replace it with a MaxBackends()
call.

Second, there is no strong reason to use a spinlock here; I incorrectly
assumed it would be better for this case. After reading more about this and
reading src/backend/storage/lmgr/README, an LWLock will be better.
+void
+vacuum_worker_end(int leader_pid)
+{
+    SpinLockAcquire(&vacworkerprogress->mutex);
+    for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
+    {
+        VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
+
+        if (vac->leader_pid == leader_pid)
+        {
+            *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
+            vacworkerprogress->num_vacuums--;
+            SpinLockRelease(&vacworkerprogress->mutex);
+            break;
+        }
+    }
+    SpinLockRelease(&vacworkerprogress->mutex);
+}

> I see this loop pattern in a couple of places, and it makes me wonder if
> this information would fit more naturally in a hash table.

Followed the pattern in backend/access/nbtree/nbtutils.c for this as well.
Using dynahash may make sense here if it simplifies the code. Will look.

+    if (callback)
+        callback(values, 3);

> Why does this need to be set up as a callback function? Could we just call
> the function if cmdtype == PROGRESS_COMMAND_VACUUM? ISTM that is pretty
> much all this is doing.

The intention will be for the caller to set the callback early on in the
function using the existing "if (pg_strcasecmp(cmd, "VACUUM") == 0), etc."
statement. This way we avoid having to add another if/else block before
tuplestore_putvalues is called.

--
Sami Imseih
Amazon Web Services
Re: Add index scan progress to pg_stat_progress_vacuum
On Mon, Feb 21, 2022 at 07:03:39PM +, Imseih (AWS), Sami wrote:
> Sending again with patch files renamed to ensure correct apply order.

I haven't had a chance to test this too much, but I did look through the
patch set and have a couple of small comments.

+     indexes_total bigint
+
+       The number of indexes to be processed in the
+       vacuuming indexes or cleaning up indexes phase
+       of the vacuum.
+
+     indexes_processed bigint
+
+       The number of indexes processed in the
+       vacuuming indexes or cleaning up indexes phase.
+       At the start of an index vacuum cycle, this value is set to 0.

Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
turned off?

+typedef struct VacWorkerProgressInfo
+{
+    int         num_vacuums;    /* number of active VACUUMS with parallel workers */
+    int         max_vacuums;    /* max number of VACUUMS with parallel workers */
+    slock_t     mutex;
+    VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
+} VacWorkerProgressInfo;

max_vacuums appears to just be a local copy of MaxBackends. Does this
information really need to be stored here? Also, is there a strong reason
for using a spinlock instead of an LWLock?

+void
+vacuum_worker_end(int leader_pid)
+{
+    SpinLockAcquire(&vacworkerprogress->mutex);
+    for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
+    {
+        VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
+
+        if (vac->leader_pid == leader_pid)
+        {
+            *vac = vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
+            vacworkerprogress->num_vacuums--;
+            SpinLockRelease(&vacworkerprogress->mutex);
+            break;
+        }
+    }
+    SpinLockRelease(&vacworkerprogress->mutex);
+}

I see this loop pattern in a couple of places, and it makes me wonder if
this information would fit more naturally in a hash table.

+    if (callback)
+        callback(values, 3);

Why does this need to be set up as a callback function? Could we just call
the function if cmdtype == PROGRESS_COMMAND_VACUUM? ISTM that is pretty
much all this is doing.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
On 1/12/22, 1:28 PM, "Bossart, Nathan" wrote:

On 1/11/22, 11:46 PM, "Masahiko Sawada" wrote:
> Regarding the new pg_stat_progress_vacuum_index view, why do we need
> to have a separate view? Users will have to check two views. If this
> view is expected to be used together with and joined to
> pg_stat_progress_vacuum, why don't we provide one view that has full
> information from the beginning? Especially, I think it's not useful
> that the total number of indexes to vacuum (num_indexes_to_vacuum
> column) and the current number of indexes that have been vacuumed
> (index_ordinal_position column) are shown in separate views.

> I suppose we could add all of the new columns to
> pg_stat_progress_vacuum and just set columns to NULL as appropriate.
> But is that really better than having a separate view?

To add, since a vacuum can utilize parallel worker processes plus the main
vacuum process to perform index vacuuming, it made sense to separate the
backends doing index vacuum/cleanup into a separate view. Besides what
Nathan suggested, the only other clean option I can think of is perhaps to
create a JSON column in pg_stat_progress_vacuum that would include all the
new fields. My concern with this approach is that flattening the JSON would
be difficult for users, which hurts usability.

> Also, I’m not sure how useful index_tuples_removed is; what can we
> infer from this value (without a total number)?

> I think the idea was that you can compare it against max_dead_tuples
> and num_dead_tuples to get an estimate of the current cycle progress.
> Otherwise, it just shows that progress is being made.

The main purpose is to really show that the "index vacuum" phase is
actually making progress. Note that for certain types of indexes, i.e.
GIN/GIST, the number of tuples_removed can end up exceeding
num_dead_tuples.

Nathan

[0] https://postgr.es/m/7874FB21-FAA5-49BD-8386-2866552656C7%40amazon.com