Re: Weird seqscan node plan

Игорь Выскорко Tue, 26 Nov 2019 19:54:52 -0800

>
> Try increasing the following parameters to 14 (or even 16, if you are not 
> sure about number of tables involved):
>
> geqo_threshold = 14
>
> from_collapse_limit = 14
>
> join_collapse_limit = 14
>
> “about 12” is too close to default limit, 12.
>
> Regards,
>
> Igor Neyman

Hi Igor,

Is "set geqo = off;" not enough to fully disable geqo? 
I know what is geqo and know about limit when it is in action. Moreover, I 
actually tried to set these parameters to 100 and it doesn't help

> Hi Игорь
> 
> I suggest running Explain and Analyze to see what the actual query results vs 
> the planner are , 
> 
> Post the SQL code   

Hi Justin,
let me show 2 variants of "explain analyze" which differs only by actual rows 
returned by inner node (9th row):

1st case:  https://explain.depesz.com/s/lA4f
45358 rows actually returned and postgres decided to join each row of 45358 set 
with each row in yankee_foxtrot using seq scan:
  Seq Scan on yankee_foxtrot foxtrot_bravo (cost=0.000..267.670 rows=7,467 
width=13) (actual time=0.003..1.090 rows=7,467 loops=45,358)
 and then filter it:
  Rows Removed by Join Filter: 338685224
it was an awful choice =)

2st: case: https://explain.depesz.com/s/zkKY
4130 rows returned and now index only scan in action


Why planner mistakes in determining the number of rows (every time planner 
expects only 1 row) in this step I can understand - inner nodes do some joins 
(inner and outer with filtration) and it's hard to predict result.
But what I can't understand is why seq scan when it is always slower than 
index. Forget to mention that join condition is by unique key. So, when planner 
expects only 1 row then it must join only one row from second table!

> 
> Also pull this part of the query out and run it by itself to see if the 
> "Planner" changes  how  it joins these two tables.  
> 
> May need to increase the statistics collected 
> https://www.postgresql.org/docs/12/planner-stats.html
> 
> also read this 
> https://www.postgresql.org/docs/12/explicit-joins.html

In this step of planning statistics can't help (let me know if I'm wrong). 
Using stats, planner knows everything about second table (yankee_foxtrot) but 
It can only suggest approximate stats of first set (and it actually wrongs 
about it)
And yes I know about the possibility of explicit joining

> forgot to state 
> 
> Generally, it's index scan -> bitmap index scan -> sequential scan, the more 
> rows  expected to be retrieved to number of rows in the table the more likely 
> the planner will go to sequential scan 

The key word is "expected" here I assume and according to expectation of 
planner it must be only 1 row...


So, to conclude: I know how to make my query faster (how to exclude this "seq 
scan") but what I really what to understand: WHY seq scan is in my plan? Why 
planner thinks it's the best choice?
Re: Weird seqscan node plan

Reply via email to