>> 2. the leader being interrupted while waiting is also already happening on
>> master due to the pgstat_progress_parallel_incr_param() calls in
>> parallel_vacuum_process_one_index() (that have been added in
>> 46ebdfe164). It has been the case "only" 36 times during my test case.

46ebdfe164 will interrupt the leader's sleep every time a parallel worker
reports progress, and we currently don't handle interrupts by restarting
the sleep with the remaining time. nanosleep does provide the ability
to restart with the remaining time [1], but I don't think it's worth the
effort to ensure more accurate vacuum delays for the leader process.

> 1. Having a time based only approach to throttle

I do agree with a time based approach overall.

> 1.1) the more parallel workers is used, the less the impact of the leader on
> the vacuum index phase duration/workload is (because the repartition is done
> on more processes).

Did you mean "because the vacuum is done on more processes"?

When a leader is operating on one or more large indexes during the entirety
of the vacuum operation, wouldn't more parallel workers end up
interrupting the leader more often? This is why I think reporting even
less often than once per second (more below) will be better.

> 3. A 1 second reporting "throttling" looks a reasonable threshold as:

> 3.1 the idea is to have a significant impact when the leader could have been
> interrupted say hundred/thousand times per second.

> 3.2 it does not make that much sense for any tools to sample
> pg_stat_progress_vacuum
> multiple times per second (so a one second reporting granularity seems ok).

I feel 1 second may still be too frequent.
What about 10 seconds (or 30 seconds)?
I think this metric in particular will be mainly useful for vacuum runs
that are running for minutes or more, so reporting every 10 or 30 seconds
would still be useful.

It also just occurred to me that pgstat_progress_parallel_incr_param
should have a code comment noting that it will interrupt the leader process
and can cause activity such as a sleep to end early.

Regards,

Sami Imseih
Amazon Web Services (AWS)

[1] https://man7.org/linux/man-pages/man2/nanosleep.2.html
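
For reference, restarting the sleep with the remaining time, as described
in [1], could look roughly like the sketch below. This is only an
illustration of the nanosleep() behavior, not a proposed patch:
sleep_uninterrupted() is a made-up name and does not exist in PostgreSQL,
and the sketch ignores everything else the leader would need to do when
it is woken up.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper (not PostgreSQL code): sleep for the full requested
 * number of microseconds, restarting nanosleep() with the remaining time
 * whenever a signal interrupts it.
 *
 * Returns 0 once the whole interval has elapsed, -1 on any other error.
 */
static int
sleep_uninterrupted(long usec)
{
    struct timespec req;
    struct timespec rem;

    req.tv_sec = usec / 1000000L;
    req.tv_nsec = (usec % 1000000L) * 1000L;

    while (nanosleep(&req, &rem) == -1)
    {
        if (errno != EINTR)
            return -1;          /* genuine failure, e.g. EINVAL */

        /* Interrupted by a signal: rem holds the unslept time. */
        req = rem;
    }

    return 0;
}
```

As written, every interrupt costs one extra nanosleep() call, so the more
often workers report progress, the more often the loop restarts.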