On Fri, May 6, 2022 at 4:26 AM Imseih (AWS), Sami <sims...@amazon.com> wrote: > > Thank you for the feedback! > > > I think we can pass the progress update function to > > WaitForParallelWorkersToFinish(), which seems simpler. And we can call > > Directly passing the callback to WaitForParallelWorkersToFinish > will require us to modify the function signature. > > To me, it seemed simpler and touches less code to have > the caller set the callback in the ParallelContext.
Okay, but if we do that, I think we should add comments about when it's used. The callback is used only when WaitForParallelWorkersToFinish(), but not when WaitForParallelWorkersToExit(). Another idea I came up with is that we can wait for all index vacuums to finish while checking and updating the progress information, and then calls WaitForParallelWorkersToFinish after confirming all index status became COMPLETED. That way, we don’t need to change the parallel query infrastructure. What do you think? > > > the function after updating the index status to > > PARALLEL_INDVAC_STATUS_COMPLETED. > > I also like this better. Will make the change. > > > BTW, currently we don't need a lock for touching index status since > > each worker touches different indexes. But after this patch, the > > leader will touch all index status, do we need a lock for that? > > I do not think locking is needed here. The leader and workers > will continue to touch different indexes to update the status. > > However, if the process is a leader, it will call the function > which will go through indstats and count how many > Indexes have a status of PARALLEL_INDVAC_STATUS_COMPLETED. > This value is then reported to the leaders backend only. I was concerned that the leader process could report the wrong progress if updating and checking index status happen concurrently. But I think it should be fine since we can read PVIndVacStatus atomically. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/