Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE

2023-10-16 Thread Michael Christofides
> EXPLAIN ANALYZE for parallel Bitmap Heap Scans currently only reports
> the number of heap blocks processed by the leader. It's missing the
> per-worker stats.


Hi David,

According to the docs[1]: "In a parallel bitmap heap scan, one process is
chosen as the leader. That process performs a scan of one or more indexes
and builds a bitmap indicating which table blocks need to be visited. These
blocks are then divided among the cooperating processes as in a parallel
sequential scan."

My understanding is that the "Heap Blocks" statistic is only reporting
blocks for the bitmap (i.e. not the subsequent scan). As such, I think it
is correct that the workers do not report additional exact heap blocks.


> explain (analyze, costs off, timing off) select * from foo where col0 >
> 900 or col1 = 1;
>

In your example, if you add the buffers and verbose parameters, do the
buffer numbers reported for each worker show what you are looking for?

i.e. explain (analyze, buffers, verbose, costs off, timing off) select *
from foo where col0 > 900 or col1 = 1;

—
Michael Christofides
Founder, pgMustard <https://pgmustard.com/>

[1]:
https://www.postgresql.org/docs/current/parallel-plans.html#PARALLEL-SCANS


Re: BUFFERS enabled by default in EXPLAIN (ANALYZE)

2021-11-24 Thread Michael Christofides
> I think it *should* be enabled for planning, since that makes the default
> easier to understand and document, and it makes a user's use of "explain"
> easier.


I’d be keen to see BUFFERS off by default with EXPLAIN, and on by default
with EXPLAIN ANALYZE.

The SUMMARY flag was implemented that way, which I think has been easy
enough for folks to understand and document. In fact, I think the only
BUFFERS information shown for EXPLAIN (BUFFERS) goes in the “summary”
section, so maybe it makes perfect sense? It would be great if that made
implementation easier, too.
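
As a rough illustration of the defaulting rule being proposed (a sketch
only, not PostgreSQL's actual code; the function name is hypothetical): an
option the user leaves unspecified would follow ANALYZE, while an explicit
setting always wins.

```python
from typing import Optional


def effective_setting(explicit: Optional[bool], analyze: bool) -> bool:
    """Decide whether an option such as SUMMARY or BUFFERS is in effect.

    `explicit` is None when the user did not mention the option at all.
    """
    if explicit is not None:
        # The user wrote e.g. (BUFFERS off) or (BUFFERS on): honour it.
        return explicit
    # Otherwise the default tracks ANALYZE.
    return analyze


print(effective_setting(None, analyze=False))   # plain EXPLAIN
print(effective_setting(None, analyze=True))    # EXPLAIN ANALYZE
print(effective_setting(False, analyze=True))   # EXPLAIN (ANALYZE, BUFFERS off)
```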

In any case, thank you all, I’m so glad that this is being discussed again.
It’d be so good to start seeing buffers in more plans.

—
Michael


Re: [PATCH] Add extra statistics to explain for Nested Loop

2021-01-18 Thread Michael Christofides
> New version of this patch prints extra statistics for all cases of
> multiple loops, not only for Nested Loop. Also I fixed the example by
> adding VERBOSE.
>
> Please don't hesitate to share any thoughts on this topic!

Thanks a lot for working on this! I really like the extra details, and
including it only with VERBOSE sounds good.

> rows * loops is still an important calculation.
>
> Why not just add total_rows while we are at it - last in the listing?
>
> (actual rows=N loops=N min_rows=N max_rows=N total_rows=N)

This total_rows idea from David would really help us too, especially
in cases where actual rows is rounded down to zero. We make an
explain visualisation tool, and it'd be nice to show people a better
total than loops * actual rows. It would also help the accuracy of
some of our tips that use this number.
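
To make the rounding problem concrete, here is a minimal Python sketch. It
assumes that "actual rows" is a per-loop average displayed as an integer;
the function names and the numbers are made up for illustration.

```python
def reported_rows(total_rows: int, loops: int) -> int:
    """Per-loop average, shown as an integer (assumed display behaviour)."""
    return round(total_rows / loops)


def reconstructed_total(total_rows: int, loops: int) -> int:
    """What a tool can compute today: loops * actual rows."""
    return reported_rows(total_rows, loops) * loops


# Hypothetical node: 1000 loops returning 400 rows in total.
loops, true_total = 1000, 400
print(reported_rows(true_total, loops))        # the average 0.4 is shown as 0
print(reconstructed_total(true_total, loops))  # so loops * rows gives 0
print(true_total)                              # a total_rows field would give 400
```

In this made-up case the reconstructed total is 0 while the node really
produced 400 rows, which is the gap a total_rows field would close.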

Apologies if this input is too late to be helpful.

Cheers,
Michael