HI, all hackers:

In the GitHub repository for PostgreSQL’s pg_hint_plan extension, there is an 
issue where the generated join order does not match the assigned join order. 
After reviewing the source code, I found that this inconsistency with input 
hints is due to PostgreSQL’s implementation and is not a bug in pg_hint_plan.

PostgreSQL with pg_hint_plan supports disabling certain operators (e.g., hash 
join, seq scan) by setting pg parameters like “set enable_hashjoin = false”. 
This setting causes PostgreSQL to add a high disable_cost (e.g., 1e10) to the 
estimated cost of the hash join operator, effectively preventing the planner 
from selecting hash joins due to the inflated cost. Additionally, pg_hint_plan 
supports enforcing specific join orders. To do this, pg_hint_plan disables all 
join algorithms when it encounters inconsistent join orders, by adding the 
disable_cost to each join operator. As a result, only the assigned join order 
will be selected. This is the mechanism behind pg_hint_plan.

Then, we take an example of the GitHub issue to demonstrate this problem:

Here is a query with pg_hint:

/*+
Leading((rt (it ((n (chn (mc (mi (t (ci an)))))) cn))))
HashJoin(ci an)
NestLoop(ci an t)
NestLoop(ci an t mi)
NestLoop(ci an t mi mc)
NestLoop(ci an t mi mc chn)
NestLoop(ci an t mi mc chn n)
NestLoop(ci an t mi mc chn n cn)
NestLoop(ci an t mi mc chn n cn it)
NestLoop(ci an t mi mc chn n cn it rt)
*/
EXPLAIN (FORMAT TEXT)
SELECT MIN(n.name) AS voicing_actress,
       MIN(t.title) AS voiced_movie
FROM aka_name AS an,
     char_name AS chn,
     cast_info AS ci,
     company_name AS cn,
     info_type AS it,
     movie_companies AS mc,
     movie_info AS mi,
     name AS n,
     role_type AS rt,
     title AS t
WHERE ci.note IN ('(voice)',
                  '(voice: Japanese version)',
                  '(voice) (uncredited)',
                  '(voice: English version)')
  AND cn.country_code ='[us]'
  AND it.info = 'release dates'
  AND mc.note IS NOT NULL
  AND (mc.note LIKE '%(USA)%'
       OR mc.note LIKE '%(worldwide)%')
  AND mi.info IS NOT NULL
  AND (mi.info LIKE 'Japan:%200%'
       OR mi.info LIKE 'USA:%200%')
  AND n.gender ='f'
  AND n.name LIKE '%Ang%'
  AND rt.role ='actress'
  AND t.production_year BETWEEN 2005 AND 2009
  AND t.id = mi.movie_id
  AND t.id = mc.movie_id
  AND t.id = ci.movie_id
  AND mc.movie_id = ci.movie_id
  AND mc.movie_id = mi.movie_id
  AND mi.movie_id = ci.movie_id
  AND cn.id = mc.company_id
  AND it.id = mi.info_type_id
  AND n.id = ci.person_id
  AND rt.id = ci.role_id
  AND n.id = an.person_id
  AND ci.person_id = an.person_id
  AND chn.id = ci.person_role_id;

The hint specifies a join order (rt (it ((n (chn (mc (mi (t (ci an)))))) cn))), 
but the generated join order is (rt (it ((n ((mc (mi ((ci an) t))) chn)) cn))). 
Here, PostgreSQL generates sub-join order ((ci an) t) instead of the assigned 
sub-join order (t (ci an)), and ((mc (mi ((ci an) t))) chn) instead of (chn (mc 
(mi ((ci an) t)))). This discrepancy arises because PostgreSQL estimates 
operator costs in two phases. In the first phase, it filters out paths that are 
obviously suboptimal based on estimated costs; however, it does not factor in 
disable_cost for disabled operators in this phase, only doing so in the second 
phase. As a result, while  (t (ci an)) would use a regular nested loop join 
with a sequential scan on t,  ((ci an) t) uses an index-based nested loop join 
with an index scan on t, which is significantly faster. Consequently, (t (ci 
an)) is filtered out after the first phase of cost estimation. The same 
reasoning applies to (chn (mc (mi ((ci an) t)))).

In the following example, by forcing PostgreSQL to access relations t and chn 
with a sequential scan, PostgreSQL generates the assigned join order. This is 
because forcing a sequential scan for t and chn prevents PostgreSQL from 
considering index-based nested loop joins for them.

/*+
SeqScan(t) SeqScan(chn)
Leading((rt (it ((n (chn (mc (mi (t (ci an)))))) cn))))
HashJoin(ci an)
NestLoop(ci an t)
NestLoop(ci an t mi)
NestLoop(ci an t mi mc)
NestLoop(ci an t mi mc chn)
NestLoop(ci an t mi mc chn n)
NestLoop(ci an t mi mc chn n cn)
NestLoop(ci an t mi mc chn n cn it)
NestLoop(ci an t mi mc chn n cn it rt)
*/
EXPLAIN (FORMAT TEXT)
SELECT MIN(n.name) AS voicing_actress,
       MIN(t.title) AS voiced_movie
FROM aka_name AS an,
     char_name AS chn,
     cast_info AS ci,
     company_name AS cn,
     info_type AS it,
     movie_companies AS mc,
     movie_info AS mi,
     name AS n,
     role_type AS rt,
     title AS t
WHERE ci.note IN ('(voice)',
                  '(voice: Japanese version)',
                  '(voice) (uncredited)',
                  '(voice: English version)')
  AND cn.country_code ='[us]'
  AND it.info = 'release dates'
  AND mc.note IS NOT NULL
  AND (mc.note LIKE '%(USA)%'
       OR mc.note LIKE '%(worldwide)%')
  AND mi.info IS NOT NULL
  AND (mi.info LIKE 'Japan:%200%'
       OR mi.info LIKE 'USA:%200%')
  AND n.gender ='f'
  AND n.name LIKE '%Ang%'
  AND rt.role ='actress'
  AND t.production_year BETWEEN 2005 AND 2009
  AND t.id = mi.movie_id
  AND t.id = mc.movie_id
  AND t.id = ci.movie_id
  AND mc.movie_id = ci.movie_id
  AND mc.movie_id = mi.movie_id
  AND mi.movie_id = ci.movie_id
  AND cn.id = mc.company_id
  AND it.id = mi.info_type_id
  AND n.id = ci.person_id
  AND rt.id = ci.role_id
  AND n.id = an.person_id
  AND ci.person_id = an.person_id
  AND chn.id = ci.person_role_id;

As I explained, the issue arises because PG does not account for the 
disable_cost of disabled operators in the initial phase of cost estimation. To 
address this, I modified the process to include disable_cost for disabled 
operators in the first phase. As a result, the initial query generated the 
desired join order.

I want to submit a patch to PG to solve this issue. How do you like this 
solution?

Kindly regards,

Qilong.


Reply via email to