Re: Helping planner to chose sequential scan when it improves performance

Ruslan Zakirov Wed, 14 Jun 2023 16:00:44 -0700

On Tue, Jun 13, 2023 at 10:28 PM Patrick O'Toole <patrick.oto...@sturdy.ai>
wrote:


> Hi all,
>



> Questions:
>
>    1. In Plan A, what factors are causing the planner to select a
>    substantially slower plan despite having recent stats about number of rows?
>
> Estimated overall cost. For Plan A it is ~200k. For plans B/C (haven't
noticed any differences in these two) it is ~250k. The planner uses a less
expensive plan.

Also, in the plans you can see that Pg estimates the number of rows
correctly.

>
>    1. Is there a substantial difference between the on-the-fly hash done
>    in Plan B and Plan C compared to the hash-index used in Plan A? Can I
>    assume they are essentially the same? Perhaps there are there differences
>    in how they're applied?
>
> I don't see any difference in plans B and C, but you report timing
changes. To me this looks like just a fluctuation in measurements. So I
wouldn't trust any measurements for plan A either.

I'm not a big expert, but can not say that plan A and B are essentially the
same.

Plan A: DB scans item_text table and for every record looks into the index
of conversation_item table, then looks into the table itself.

Plan B/C: DB scans conversation_item table without looking into its indexes
building a hash table on the fly.



>    1. Is it common to see values for random_page_cost set as high as 8.0?
>    We would of course need to investigate whether we see a net positive or net
>    negative impact on other queries, to adopt this as a general setting, but
>    is it a proposal we should actually consider?
>
> No idea.

>
>    1. Maybe we are barking up the wrong tree with the previous questions.
>    Are there other configuration parameters we should consider first to
>    improve performance in situations like the one illustrated?
>
> Recheck your numbers.

>
>    1. Are there other problems with our schema, query, or plans shown
>    here? Other approaches (or tools/analyses) we should consider?
>
> You can try the following index:

CREATE INDEX conversation_item_ruz1 ON conversation_item(item_uuid,
conversation_uuid, tenant_id);

I believe this index would allow Pg to use "index only scan" as variation
of Plan A and avoid touching the conversation_item table completely.

-- 
Best regards, Ruslan.

Re: Helping planner to chose sequential scan when it improves performance

Reply via email to