Hi,

This thread has been quiet for a while, but I'd like to share some thoughts.

+1 to the idea of improving visibility into parallel worker saturation.
More broadly, we should improve visibility into parallel processing, so
DBAs can detect trends in parallel usage (is the workload doing more
parallel work, or less?) and have enough data to either tune the
workload or change the parallel GUCs.

>> We can output this at the LOG level to avoid running the server at
>> DEBUG1 level. There are a few other cases where we are not able to
>> spawn the worker or process and those are logged at the LOG level. For
>> example, "could not fork autovacuum launcher process .." or "too many
>> background workers". So, not sure, if this should get a separate
>> treatment. If we fear this can happen frequently enough that it can
>> spam the LOG then a GUC may be worthwhile.

> I think we should definitely be afraid of that. I am in favor of a separate
> GUC.

Currently, EXPLAIN (ANALYZE) gives you "Workers Planned" and
"Workers Launched". Logging this via auto_explain is possible, so I am
not sure we need additional GUCs or debug levels for this info.

   ->  Gather  (cost=10430.00..10430.01 rows=2 width=8) (actual time=131.826..134.325 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2

>> What I was wondering was whether we would be better off putting this
>> into the statistics collector, vs. doing it via logging. Both
>> approaches seem to have pros and cons.
>>
>> I think it could be easier for users to process the information if it
>> is available via some view, so there is a benefit in putting this into
>> the stats subsystem.

> Unless we do this instead.

Adding cumulative stats is a much better idea.

Three new columns can be added to pg_stat_database:

workers_planned,
workers_launched,
parallel_operations - There could be more than one operation
per query if, for example, there are multiple Parallel Gather
operations in a plan.

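As a rough sketch of how a monitoring tool might consume such counters:
it could sample them periodically and compare the deltas between two
samples. The three counter names are only the ones proposed in this
mail and do not exist in pg_stat_database today. ParallelCounters and
launch_ratio are made-up names for illustration, and the sample values
are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParallelCounters:
    """One sample of the proposed (hypothetical) cumulative counters."""
    workers_planned: int
    workers_launched: int
    parallel_operations: int


def launch_ratio(prev: ParallelCounters, curr: ParallelCounters) -> Optional[float]:
    """Return workers launched / workers planned between two samples.

    Returns None when the planned delta is not positive, e.g. no parallel
    work ran in the interval or the counters were reset.
    """
    planned = curr.workers_planned - prev.workers_planned
    launched = curr.workers_launched - prev.workers_launched
    if planned <= 0:
        return None
    return launched / planned


if __name__ == "__main__":
    # Invented sample values: 400 workers planned, 300 launched in the interval.
    earlier = ParallelCounters(workers_planned=1000, workers_launched=1000,
                               parallel_operations=500)
    later = ParallelCounters(workers_planned=1400, workers_launched=1300,
                             parallel_operations=700)

    ops = later.parallel_operations - earlier.parallel_operations
    ratio = launch_ratio(earlier, later)
    print(f"parallel operations in interval: {ops}")
    print(f"launch ratio: {ratio:.2f}")
```

Working on deltas rather than raw totals keeps a long-lived database's
history from hiding a recent drop in launched workers.
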
With these columns, monitoring tools can track whether more or less
parallel work is happening over time (by looking at parallel_operations)
and whether the workload is suffering from parallel worker saturation.
A workers_launched/workers_planned ratio below 1 means there is a lack
of available worker processes.

We could also add this information at a per-query level in
pg_stat_statements, but that can be taken up in a separate discussion.

Regards,

--
Sami Imseih
Amazon Web Services (AWS)