On Mon, Jan 6, 2025 at 6:59 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
> Hi, > > On Mon, Jan 06, 2025 at 06:49:06PM +0900, torikoshia wrote: > > > > > **However, I think the general direction has merit**: Changing this > > > patch to > > > use `ru_inblock`/`ru_oublock` gives very useful insights. `ru_inblock` > > > is 0 when everything is in page cache, and it is very high when stuff > is > > > not. I was only hacking around and basically did this: > > > > > > s/ru_minflt/ru_inblock/g > > > s/ru_majflt/ru_oublock/g > > > [...] > > > Also, maybe there's some better way > > > of getting read/write numbers for the current process than > > > ru_inblock/ru_oublock (but this one seems to work at least reasonably > > > well). > > > > Maybe, but as far as using getrusage(), ru_inblock and ru_outblock seem > the > > best. > > FWIW that's the counters we've been using in pg_stat_kcache / powa to > compute > disk IO and "postgres shared buffers hit VS OS cache hit VS disk read" hit > ratio for many years and we didn't get any problem report for that. Thanks for sharing the information! On Tue, Jan 7, 2025 at 5:09 AM Jelte Fennema-Nio <postg...@jeltef.nl> wrote: > On Mon Jan 6, 2025 at 10:49 AM CET, torikoshia wrote: > > On Tue, Dec 31, 2024 at 7:57 AM Jelte Fennema-Nio <postg...@jeltef.nl> > > Updated the PoC patch to calculate them by KB: > > > > =# EXPLAIN (ANALYZE, STORAGEIO) SELECT * FROM pgbench_accounts; > > QUERY PLAN > > > > > --------------------------------------------------------------------------------------------------------------------------------- > > Seq Scan on pgbench_accounts (cost=0.00..263935.35 rows=10000035 > > width=97) (actual time=1.447..3900.279 rows=10000000 loops=1) > > Buffers: shared hit=2587 read=161348 > > Planning Time: 0.367 ms > > Execution: > > Storage I/O: read=1291856 KB write=0 KB > > Execution Time: 4353.253 ms > > (6 rows) > > > > > > The core functionality works well in my opinion. I think it makes sense > to spend the effort to move this from PoC quality to something > committable. Thanks for the comment and review! If there are no other opinions, I will make a patch based on the direction of the current PoC patch. > 4. I think this setting should be enabled by default for ANALYZE, just > like BUFFERS is now since c2a4078e[1]. > > 5. I'm wondering whether this option deserves its own EXPLAIN option, or > if it should simply be made part of BUFFERS. Although this is not information about PostgreSQL buffers, I now feel that making this addition part of BUFFERS is attractive, since people using BUFFERS would also sometimes want to know about the storage I/O. Since BUFFERS is now ON by default with EXPLAIN ANALYZE, I am concerned about the performance impact. However, if it is limited to just twice—once at the start and once at the end—for the planning phase, execution phase, and each parallel worker, I believe the impact is relatively small. > 9. I think this division by 2 could use some explanation in a comment. I > understand that you're doing this because linux divides its original > bytes using 512 bytes[2] and your additional factor of 2 gets that to > 1024 bytes. But that's not clear immediately from the code. > > I'm also not convinced that 512 is the blocksize if this logic is > even correct on every platform. I'm wondering if maybe we should > simply show the blocks after all. > Maybe so. I'll look into this and then decide the unit. For the other comments, I plan to address them as you suggested. Regards, Atsushi Torikoshi