+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>indexes_total</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The number of indexes to be processed in the
+ <literal>vacuuming indexes</literal> or <literal>cleaning up
indexes</literal> phase
+ of the vacuum.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>indexes_processed</structfield> <type>bigint</type>
+ </para>
+ <para>
+ The number of indexes processed in the
+ <literal>vacuuming indexes</literal> or <literal>cleaning up
indexes</literal> phase.
+ At the start of an index vacuum cycle, this value is set to
<literal>0</literal>.
+ </para></entry>
+ </row>
> Will these be set to 0 for failsafe vacuums and vacuums with INDEX_CLEANUP
> turned off?
If the failsafe kicks in midway through a vacuum, the number indexes_total will
not be reset to 0. If INDEX_CLEANUP is turned off, then the value will be 0 at
the start of the vacuum.
+typedef struct VacWorkerProgressInfo
+{
+ int num_vacuums; /* number of active VACUUMS
with parallel workers */
+ int max_vacuums; /* max number of VACUUMS with
parallel workers */
+ slock_t mutex;
+ VacOneWorkerProgressInfo vacuums[FLEXIBLE_ARRAY_MEMBER];
+} VacWorkerProgressInfo;
> max_vacuums appears to just be a local copy of MaxBackends. Does this
> information really need to be stored here? Also, is there a strong reason
> for using a spinlock instead of an LWLock?
First, The BTVacInfo code in backend/access/nbtree/nbtutils.c inspired this, so
I wanted to follow this pattern. With that said, I do see max_vacuums being
redundant here, and I am inclined to replace it with a MaxBackends() call.
Second, There is no strong reason to use spinlock here except I incorrectly
assumed it will be better for this case. After reading more about this and
reading up src/backend/storage/lmgr/README, an LWLock will be better.
+void
+vacuum_worker_end(int leader_pid)
+{
+ SpinLockAcquire(&vacworkerprogress->mutex);
+ for (int i = 0; i < vacworkerprogress->num_vacuums; i++)
+ {
+ VacOneWorkerProgressInfo *vac = &vacworkerprogress->vacuums[i];
+
+ if (vac->leader_pid == leader_pid)
+ {
+ *vac =
vacworkerprogress->vacuums[vacworkerprogress->num_vacuums - 1];
+ vacworkerprogress->num_vacuums--;
+ SpinLockRelease(&vacworkerprogress->mutex);
+ break;
+ }
+ }
+ SpinLockRelease(&vacworkerprogress->mutex);
+}
> I see this loop pattern in a couple of places, and it makes me wonder if
> this information would fit more naturally in a hash table.
Followed the pattern in backend/access/nbtree/nbtutils.c for this as well.
Using dynahash may make sense here if it simplifies the code. Will look.
+ if (callback)
+ callback(values, 3);
> Why does this need to be set up as a callback function? Could we just call
> the function if cmdtype == PROGRESS_COMMAND_VACUUM? ISTM that is pretty
> much all this is doing.
The intention will be for the caller to set the callback early on in the
function using the existing " if (pg_strcasecmp(cmd, "VACUUM") == 0), etc."
statement. This way we avoid having to add another if/else block before
tuplestore_putvalues is called.
--
Sami Imseih
Amazon Web Services