On Fri, Nov 5, 2021 at 6:25 AM Peter Geoghegan <p...@bowt.ie> wrote:
>
> On Thu, Nov 4, 2021 at 12:42 PM Peter Geoghegan <p...@bowt.ie> wrote:
> > Since "The leader process alone processes all parallel-safe indexes in
> > the case where no workers are launched" (no change there), I wonder:
> > how does the leader *know* that it's the leader (and so can always
> > process any indexes) inside its call to
> > lazy_parallel_process_indexes()? Or, does the leader actually process
> > all indexes inside lazy_serial_process_indexes() in this edge case?
> > (The edge case where a parallel VACUUM has no workers at all, because
> > they couldn't be launched by the core parallel infrastructure.)
>
> I think that I might see a related problem. But I'm not sure, so I'll just
> ask:
>
> > +       /* Set data required for parallel index vacuum or cleanup */
> > +       prepare_parallel_index_processing(vacrel, vacuum);
> > +
> > +       /* Reset the parallel index processing counter */
> > +       pg_atomic_write_u32(&(lps->lvshared->idx), 0);
> > +
> >         /* Setup the shared cost-based vacuum delay and launch workers */
> >         if (nworkers > 0)
> >         {
> > +               /* Reinitialize the parallel context to relaunch parallel workers */
> >                 if (vacrel->num_index_scans > 0)
> > -               {
> > -                       /* Reset the parallel index processing counter */
> > -                       pg_atomic_write_u32(&(lps->lvshared->idx), 0);
> > -
> > -                       /* Reinitialize the parallel context to relaunch parallel
> > workers */
> >                         ReinitializeParallelDSM(lps->pcxt);
> > -               }
>
> Is it okay that we don't call ReinitializeParallelDSM() just because
> "nworkers == 0" this time around? I notice that there is a wait for
> "nworkers_launched" workers to finish parallel processing, at the top
> of ReinitializeParallelDSM(). I can see why the
> "vacrel->num_index_scans > 0" test is okay, but I can't see why the
> "nworkers == 0" test is okay.
>
> I just want to be sure that we're not somehow relying on seeing state
> in shared memory (in the LVSharedIndStats array) in all cases, but
> finding that it is not actually there in certain rare edge cases.
> Maybe this didn't matter before, because the leader didn't expect to
> find this information in shared memory in any case. But that is
> changed by your patch, of course, so it's something to be concerned
> about.
If we launch workers (i.e., nworkers > 0), we wait for those workers to
finish after processing all indexes (we call
WaitForParallelWorkersToFinish() after lazy_parallel_process_indexes()).
So it is guaranteed that all workers have finished by the end of
parallel_lazy_vacuum_or_cleanup_all_indexes(). Therefore, even in a
second call to this function, we don't need to wait for the
"nworkers_launched" workers that were previously running to finish. Does
it make sense?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/