On Fri, Jan 09, 2026 at 07:53:08AM +0000, Bertrand Drouvot wrote:
> While working on flushing stats outside of transaction boundaries (patch not
> shared yet but linked to [1]), I realized that parallel workers could lead to
> incomplete and misleading statistics. Indeed, they update "their" relation
> stats during their shutdown regardless of the "main" transaction status.
>
> It means that, for example, stats like seq_scan, last_seq_scan and
> seq_tup_read are updated by the parallel workers during their shutdown while
> the main transaction has not finished. The stats are then somehow incomplete
> because the main worker has not updated its stats yet. I think that could
> lead to misleading stats that a patch like this one could help to address.
> For example, parallel workers could update parallel_* dedicated stats and
> leave the non parallel_* stats update responsibility to the main worker when
> the transaction finishes. That would make the non parallel_* stats
> consistent whether parallel workers are used or not.
(Re-reading the thread to remember the context..)

It depends, I guess. I still doubt that adding parallel worker data at
table and index level is the right move compared to all the information
we have now on HEAD, because this extra information is not actionable in
terms of tuning GUCs or reloptions.

Now, do you think that the extra noise of data flushed by the parallel
workers when they shut down, before the main transaction has committed
in the "main" backend process, could really impact the tuning decisions
users may want to take? Stats are not about precision, they are about
offering trends that help in taking better decisions to drive the
backend server in the direction where its administrator wants to lead
it. If the noise is too high, and this noise leads to incorrect tuning
decisions, the system could go crazy and that would be an issue.

My question is then: does this extra data flushed by the parallel
workers before transaction end, which you are qualifying as noise,
really matter when it comes to the tuning decisions one needs to take?
That stance would apply mostly to analytical queries, of course, where
parallel workers would have more data to flush. Parallel workers could
have a lot of data to report when flushing, but the transaction commit
just delays the availability of this information.

When it comes to what you are describing as a problem, my intuition is
telling me that we don't have a problem to solve at all here, but I'm OK
to be proved wrong, as well.
--
Michael
