RE: Improve EXPLAIN output for multicolumn B-Tree Index

Masahiro.Ikeda Sun, 30 Jun 2024 19:54:29 -0700

On 2024-06-28 21:05, Matthias van de Meent wrote:
> On Fri, 28 Jun 2024 at 10:59, Jelte Fennema-Nio <postg...@jeltef.nl> wrote:
>>
>> On Fri, 28 Jun 2024 at 00:41, Peter Geoghegan <p...@bowt.ie> wrote:
>> > Typically, no, it won't be. But there's really no telling for sure.
>> > The access patterns for a composite index on '(a, b)' with a qual
>> > "WHERE b = 5" are identical to a qual explicitly written "WHERE a =
>> > any(<every possible value in 'a'>) AND b = 5".
>>
>> Hmm, that's true. But in that case the explain plan gives a clear hint
>> that something like that might be going on, because you'll see:
>>
>> Index Cond: a = any(<every possible value in 'a'>) AND b = 5
>>
>> That does make me think of another way, and maybe more "correct" way,
>> of representing Masahiros situation than adding a new "Skip Scan Cond"
>> row to the EXPLAIN output. We could explicitly include a comparison to
>> all prefix columns in the Index Cond:
>>
>> Index Cond: ((test.id1 = 1) AND (test.id2 = ANY) AND (test.id3 = 101))
>>
>> Or if you want to go with syntactically correct SQL we could do the 
>> following:
>>
>> Index Cond: ((test.id1 = 1) AND ((test.id2 IS NULL) OR (test.id2 IS
>> NOT NULL)) AND (test.id3 = 101))
>>
>> An additional benefit this provides is that you now know which
>> additional column you should use a more specific filter on to speed up
>> the query. In this case test.id2
>>
>> OTOH, maybe it's not a great way because actually running that puts
>> the IS NULL+ IS NOT NULL query in the Filter clause (which honestly
>> surprises me because I had expected this "always true expression"
>> would have been optimized away by the planner).
>>
>> > EXPLAIN (VERBOSE, ANALYZE) SELECT id1, id2, id3 FROM test WHERE id1 = 1 
>> > AND (id2 IS NULL OR id2 IS NOT NULL) AND id3 = 101;
>>                                                            QUERY PLAN
>> ─────────────────────────────────────────────────────
>>  Index Only Scan using test_idx on public.test  (cost=0.42..12809.10
>> rows=1 width=12) (actual time=0.027..11.234 rows=1 loops=1)
>>    Output: id1, id2, id3
>>    Index Cond: ((test.id1 = 1) AND (test.id3 = 101))
>>    Filter: ((test.id2 IS NULL) OR (test.id2 IS NOT NULL))
>>
>> > What about cases where we legitimately have to vary our strategy
>> > during the same index scan?
>>
>> Would my new suggestion above address this?
>>
>> > In fact, that'd make sense even today, without skip scan (just with
>> > the 17 work on nbtree SAOP scans). Even with regular SAOP nbtree index
>> > scans, the number of primitive scans is hard to predict, and quite
>> > indicative of what's really going on with the scan.
>>
>> *googles nbtree SAOP scans and finds the very helpful[1]*
>>
>> Yes, I feel like this should definitely be part of the ANALYZE output.
>> Seeing how Lukas has to look at pg_stat_user_tables to get this
>> information seems quite annoying[2] and only possible on systems that
>> have no concurrent queries.
>>
>> So it sounds like we'd want a "Primitive Index Scans" counter in
>> ANALYZE too. In addition to the number of filtered rows by, which if
>> we go with my proposal above should probably be called "Rows Removed
>> by Index Cond".
> 
> This all just made me more confident that this shows a need to enable
> index AMs to provide output for EXPLAIN: The knowledge about how index
> scankeys are actually used is exclusively known to the index AM,
> because the strategies are often unique to the index AM (or even
> chosen operator classes), and sometimes can only be applied at
> runtime: while the index scankeys' sizes, attribute numbers and
> operators are known in advance (even if not all arguments are filled
> in; `FROM a JOIN b ON b.id = ANY (a.ref_array)`), the AM can at least
> show what strategy it is likely going to choose, and how (in)efficient
> that strategy could be.
> 
> Right now, Masahiro-san's patch tries to do that with an additional
> field in IndexPath populated (by proxy) exclusively in btcostestimate.
> I think that design is wrong, because it wires explain- and
> btree-specific data through the planner, adding overhead everywhere
> which is only used for btree- and btree-compatible indexes.
> 
> I think the better choice would be adding an IndexAmRoutine->amexplain
> support function, which would get called in e.g. explain.c's
> ExplainIndexScanDetails to populate a new "Index Scan Details" (name
> to be bikeshed) subsection of explain plans. This would certainly be
> possible, as the essentials for outputting things to EXPLAIN are
> readily available in the explain.h header.


Yes, that's one of my concerns. I agree to add IndexAmRoutine->amexplain
is better because we can support several use cases.

Although I'm not confident to add only IndexAmRoutine->amexplain is enough
now, I'll make a PoC patch to confirm it.

Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION

RE: Improve EXPLAIN output for multicolumn B-Tree Index

Reply via email to