On Wed, Apr 19, 2023 at 10:23:26AM -0700, Andres Freund wrote:
> Hi,
>
> I noticed that the numbers in pg_stat_io don't quite add up to what I
> expected in write heavy workloads. Particularly for checkpointer, the numbers
> for "write" in log_checkpoints output are larger than what is visible in
> pg_stat_io.
>
> That partially is because log_checkpoints' "write" covers way too many things,
> but there's an issue with pg_stat_io as well:
>
> Checkpoints, and some other sources of writes, will often end up doing a lot
> of smgrwriteback() calls - which pg_stat_io doesn't track. Nor do any
> pre-existing forms of IO statistics.
>
> It seems pretty clear that we should track writeback as well. I wonder if it's
> worth doing so for 16? It'd give a more complete picture that way. The
> counter-argument I see is that we didn't track the time for it in existing
> stats either, and that nobody complained - but I suspect that's mostly because
> nobody knew to look.

Not complaining about making pg_stat_io more accurate, but what exactly
would we be tracking for smgrwriteback()? I assume you are talking about
IO timing. AFAICT, on Linux, it does sync_file_range() with
SYNC_FILE_RANGE_WRITE, which is asynchronous. Wouldn't we just be
tracking the system call overhead time?

- Melanie
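
To make the question concrete, here is a minimal, Linux-only sketch of what timing such a call would capture. It is not PostgreSQL code: the helper names time_writeback_hint_ns() and elapsed_ns() are hypothetical, and the only real API assumed is sync_file_range(2) with SYNC_FILE_RANGE_WRITE, wrapped in two clock_gettime() readings.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() in glibc */
#include <fcntl.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC readings. */
static int64_t
elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    return (int64_t) (end->tv_sec - start->tv_sec) * 1000000000LL
        + (end->tv_nsec - start->tv_nsec);
}

/*
 * Hypothetical helper (illustration only, not a PostgreSQL function):
 * time a single sync_file_range(SYNC_FILE_RANGE_WRITE) call on fd.
 *
 * Returns the elapsed wall-clock time in nanoseconds, or -1 if the
 * system call fails. The interval covers only the call itself; it does
 * not wait for the pages to reach stable storage.
 */
static int64_t
time_writeback_hint_ns(int fd, off_t offset, off_t nbytes)
{
    struct timespec start;
    struct timespec end;
    int         rc;

    clock_gettime(CLOCK_MONOTONIC, &start);
    rc = sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (rc == 0) ? elapsed_ns(&start, &end) : -1;
}
```

Whatever number this returns is the time spent inside the system call, which is the quantity a writeback timer would record. One caveat from the sync_file_range(2) man page: even SYNC_FILE_RANGE_WRITE may block if more is submitted than the request queue can hold, so the measured time is not guaranteed to be pure call overhead.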