Re: RFC: Allow EXPLAIN to Output Page Fault Information

Atsushi Torikoshi Tue, 07 Jan 2025 05:42:23 -0800

On Mon, Jan 6, 2025 at 6:59 PM Julien Rouhaud <rjuju...@gmail.com> wrote:


> Hi,
>
> On Mon, Jan 06, 2025 at 06:49:06PM +0900, torikoshia wrote:
> >
> > > **However, I think the general direction has merit**: Changing this
> > > patch to
> > > use `ru_inblock`/`ru_oublock` gives very useful insights. `ru_inblock`
> > > is 0 when everything is in page cache, and it is very high when stuff
> is
> > > not. I was only hacking around and basically did this:
> > >
> > > s/ru_minflt/ru_inblock/g
> > > s/ru_majflt/ru_oublock/g
> > > [...]
> > >  Also, maybe there's some better way
> > > of getting read/write numbers for the current process than
> > > ru_inblock/ru_oublock (but this one seems to work at least reasonably
> > > well).
> >
> > Maybe, but as far as using getrusage(), ru_inblock and ru_outblock seem
> the
> > best.
>
> FWIW that's the counters we've been using in pg_stat_kcache / powa to
> compute
> disk IO and "postgres shared buffers hit VS OS cache hit VS disk read" hit
> ratio for many years and we didn't get any problem report for that.


Thanks for sharing the information!

On Tue, Jan 7, 2025 at 5:09 AM Jelte Fennema-Nio <postg...@jeltef.nl> wrote:

> On Mon Jan 6, 2025 at 10:49 AM CET, torikoshia wrote:
> > On Tue, Dec 31, 2024 at 7:57 AM Jelte Fennema-Nio <postg...@jeltef.nl>
> > Updated the PoC patch to calculate them by KB:
> >
> >    =# EXPLAIN (ANALYZE, STORAGEIO) SELECT * FROM pgbench_accounts;
> >                                                               QUERY PLAN
> >
> >
> ---------------------------------------------------------------------------------------------------------------------------------
> >     Seq Scan on pgbench_accounts  (cost=0.00..263935.35 rows=10000035
> > width=97) (actual time=1.447..3900.279 rows=10000000 loops=1)
> >       Buffers: shared hit=2587 read=161348
> >     Planning Time: 0.367 ms
> >     Execution:
> >       Storage I/O: read=1291856 KB write=0 KB
> >     Execution Time: 4353.253 ms
> >    (6 rows)
> >
> >
>
> The core functionality works well in my opinion. I think it makes sense
> to spend the effort to move this from PoC quality to something
> committable.


Thanks for the comment and review!
If there are no other opinions, I will make a patch based on the direction
of the current PoC patch.


> 4. I think this setting should be enabled by default for ANALYZE, just
>    like BUFFERS is now since c2a4078e[1].
>
> 5. I'm wondering whether this option deserves its own EXPLAIN option, or
>    if it should simply be made part of BUFFERS.


Although this is not information about PostgreSQL buffers, I now feel that
making this addition part of BUFFERS is attractive, since people using
BUFFERS would also sometimes want to know about the storage I/O.

Since BUFFERS is now ON by default with EXPLAIN ANALYZE, I am concerned
about the performance impact.
However, if it is limited to just twice—once at the start and once at the
end—for the planning phase, execution phase, and each parallel worker, I
believe the impact is relatively small.


> 9. I think this division by 2 could use some explanation in a comment. I
>    understand that you're doing this because linux divides its original
>    bytes using 512 bytes[2] and your additional factor of 2 gets that to
>    1024 bytes. But that's not clear immediately from the code.
>
>    I'm also not convinced that 512 is the blocksize if this logic is
>    even correct on every platform. I'm wondering if maybe we should
>    simply show the blocks after all.
>

Maybe so. I'll look into this and then decide the unit.

For the other comments, I plan to address them as you suggested.


Regards,
Atsushi Torikoshi

Re: RFC: Allow EXPLAIN to Output Page Fault Information

Reply via email to