On Wed, May 3, 2023 at 2:00 PM Dirschel, Steve <
steve.dirsc...@thomsonreuters.com> wrote:

> Thanks for the reply Jeff.  Yes-  more of an academic question.  Regarding
> this part:
>
>
>
>    Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY
> ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))
>
>    Filter: (deleted_millis <= 0)
>
>    Buffers: shared hit=24
>
>
>
> For this usage, the =ANY is applied as an "in-index filter".  It only
> descends the index once, to where workflow_id=1070, and then scans forward
> applying the =ANY to each index-tuple until it exhausts the =1070
> condition.  As long as all the =1070 entries fit into just a few buffers,
> the count of buffers accessed by doing this is fewer than doing the
> re-descents.  (Stepping from tuple to tuple in the same index page doesn't
> count as a new access.  While a re-descent releases and reacquires the
> buffer)
>
>
>
> There are 2,981,425 rows where workflow_id = 1070.  Does that change your
> theory of using an “in-index filter” for that plan?
>

Yes.  There is no way that that many index tuples will fit in just 24 index
leaf pages, so I think it must be using the re-descending method for both
plans.  Then I don't know why there is a difference in the number of
buffer accesses.  But the difference seems trivial, so I wouldn't put much
effort into investigating it.



> When you say there was a bit of speculation on the “boundard condition” vs
> “in-index filter” is the speculation on if Postgres has 2 different ways of
> processing a =ANY filter or is the speculation that one is being used by
> one plan and the other is being used by the other plan?
>

The speculation was that this applied to your query.  But going back to
some of my original tests, I see that I remembered some of the details
wrong on the broader topic as well.  When it voluntarily doesn't use the
=ANY as a boundary condition, that shows up in the plan as having the
condition evicted from "Index Cond" line and instead show up in a "Filter"
line, at least in my one test case (which means it is no longer an in-index
filter, as it jumps to the table and verifies visibility before applying
the filter).  So the thing that the plans really don't distinguish is
between when it just chooses not to use the extra index column for cost
reasons, from when it thinks it is unable to use it for correctness/safety
reasons.

Cheers,

Jeff

Reply via email to