On Tue, Aug 4, 2020 at 9:44 PM David Rowley <dgrowle...@gmail.com> wrote:
>
> On Wed, 5 Aug 2020 at 13:21, Justin Pryzby <pry...@telsasoft.com> wrote:
> >
> > I'm testing with a customer's data on pg13dev and got output for which Peak
> > Memory doesn't look right/useful. I reproduced it on 565f16902.
>
> Likely the sanity of those results depends on whether you think that
> the Memory Usage reported outside of the workers is meant to be the
> sum of all processes or the memory usage for the leader backend.
>
> All that's going on here is that the Parallel Append is using some
> parallel safe paths and giving one to each worker. The 2 workers take
> the first 2 subpaths and the leader takes the third. The memory usage
> reported helps confirm that's the case.
>
> Can you explain what you'd want to see changed about this? Or do you
> want to see the non-parallel worker memory be the sum of all workers?
> Sort does not seem to do that, so I'm not sure if we should consider
> hash agg as an exception to that.

I've always found the way we report parallel workers in EXPLAIN quite
confusing. I realize it matches the actual implementation model (the
leader often is also "another worker"), but I think the natural
expectation from a user perspective would be that you'd show as
workers all backends (including the leader) that did work, and then
aggregate into a summary line (where the leader is displayed now).

In the current output there's nothing really to hint to the user that
the model is leader + workers and that the "summary" line is really
the leader. If I were to design this from scratch, I'd want to propose
doing what I said above (summary aggregate line + treat leader as a
worker line, likely with a "leader" tag), but that seems like a big
change to make now. On the other hand, perhaps designating what looks
like a summary line as the "leader" or some such would help clear up
the confusion? Perhaps it could also say "Participating" or
"Non-participating"?

James