> First of all, I don't think we need to declare ParallelVacuumProgress > in vacuum.c since it's set and used only in vacuumparallel.c. But I > don't even think it's a good idea to declare it in vacuumparallel.c as > a static variable. The primary reason is that it adds things we need > to care about. For example, what if we raise an error during index > vacuum? The transaction aborts but ParallelVacuumProgress still refers > to something old. Suppose further that the next parallel vacuum > doesn't launch any workers, the leader process would still end up > accessing the old value pointed by ParallelVacuumProgress, which > causes a SEGV. So we need to reset it anyway at the beginning of the > parallel vacuum. It's easy to fix at this time but once the parallel > vacuum code gets more complex, it could forget to care about it.
> IMO VacuumSharedCostBalance and VacuumActiveNWorkers have a different > story. They are set in vacuumparallel.c and are used in vacuum.c for > vacuum delay. If they weren't global variables, we would need to pass > them to every function that could eventually call the vacuum delay > function. So it makes sense to me to have them as global variables.On > the other hand, for ParallelVacuumProgress, it's a common pattern that > ambulkdelete(), amvacuumcleanup() or a common index scan routine like > btvacuumscan() checks the progress. I don't think index AM needs to > pass the value down to many of its functions. So it makes sense to me > to pass it via IndexVacuumInfo. Thanks for the detailed explanation and especially clearing up my understanding of VacuumSharedCostBalance and VacuumActiveNWorker. I do now think that passing ParallelVacuumState in IndexVacuumInfo is a more optimal choice. Attached version addresses the above and the previous comments. Thanks Sami Imseih Amazon Web Services (AWS)
v17-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v17-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch