Here is the v5 patch of Join push-down support for foreign tables. Changes since v4:
- Separete remote conditions into ON and WHERE, per Ashutosh. - Add regression test cases for foreign join. - Don't skip reversed relation combination in OUTER join cases. I'm now working on two issues from Kaigai-san and Ashutosu, whole-row reference handling and use of get_joinrel_parampathinfo(). 2015-03-05 22:00 GMT+09:00 Shigeru Hanada <shigeru.han...@gmail.com>: > Hi Ashutosh, thanks for the review. > > 2015-03-04 19:17 GMT+09:00 Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>: >> In create_foreignscan_path() we have lines like - >> 1587 pathnode->path.param_info = get_baserel_parampathinfo(root, rel, >> 1588 >> required_outer); >> Now, that the same function is being used for creating foreign scan paths >> for joins, we should be calling get_joinrel_parampathinfo() on a join rel >> and get_baserel_parampathinfo() on base rel. > > Got it. Please let me check the difference. > >> >> The patch seems to handle all the restriction clauses in the same way. There >> are two kinds of restriction clauses - a. join quals (specified using ON >> clause; optimizer might move them to the other class if that doesn't affect >> correctness) and b. quals on join relation (specified in the WHERE clause, >> optimizer might move them to the other class if that doesn't affect >> correctness). The quals in "a" are applied while the join is being computed >> whereas those in "b" are applied after the join is computed. For example, >> postgres=# select * from lt; >> val | val2 >> -----+------ >> 1 | 2 >> 1 | 3 >> (2 rows) >> >> postgres=# select * from lt2; >> val | val2 >> -----+------ >> 1 | 2 >> (1 row) >> >> postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2); >> val | val2 | val | val2 >> -----+------+-----+------ >> 1 | 2 | 1 | 2 >> 1 | 3 | | >> (2 rows) >> >> postgres=# select * from lt left join lt2 on (true) where (lt.val2 = >> lt2.val2); >> val | val2 | val | val2 >> -----+------+-----+------ >> 1 | 2 | 1 | 2 >> (1 row) >> >> The difference between these two kinds is evident in case of outer joins, >> for inner join optimizer puts all of them in class "b". The remote query >> sent to the foreign server has all those in ON clause. Consider foreign >> tables ft1 and ft2 pointing to local tables on the same server. >> postgres=# \d ft1 >> Foreign table "public.ft1" >> Column | Type | Modifiers | FDW Options >> --------+---------+-----------+------------- >> val | integer | | >> val2 | integer | | >> Server: loopback >> FDW Options: (table_name 'lt') >> >> postgres=# \d ft2 >> Foreign table "public.ft2" >> Column | Type | Modifiers | FDW Options >> --------+---------+-----------+------------- >> val | integer | | >> val2 | integer | | >> Server: loopback >> FDW Options: (table_name 'lt2') >> >> postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 = >> ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null; >> >> QUERY PLAN >> >> ------------------------------------------------------------------------------------------------------------------------------------------------------- >> ------------------------------------------------------------------------------------ >> Foreign Scan (cost=100.00..125.60 rows=2560 width=16) >> Output: val, val2, val, val2 >> Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2 FROM >> public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt) r (a >> _0, a_1) ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND ((r.a_1 = >> l.a_1)) >> (3 rows) >> >> The result is then wrong >> postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where >> ft1.val + ft2.val > ft1.val2 or ft2.val is null; >> val | val2 | val | val2 >> -----+------+-----+------ >> 1 | 2 | | >> 1 | 3 | | >> (2 rows) >> >> which should match the result obtained by substituting local tables for >> foreign ones >> postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where >> lt.val + lt2.val > lt.val2 or lt2.val is null; >> val | val2 | val | val2 >> -----+------+-----+------ >> 1 | 3 | | >> (1 row) >> >> Once we start distinguishing the two kinds of quals, there is some >> optimization possible. For pushing down a join it's essential that all the >> quals in "a" are safe to be pushed down. But a join can be pushed down, even >> if quals in "a" are not safe to be pushed down. But more clauses one pushed >> down to foreign server, lesser are the rows fetched from the foreign server. >> In postgresGetForeignJoinPath, instead of checking all the restriction >> clauses to be safe to be pushed down, we need to check only those which are >> join quals (class "a"). > > The argument restrictlist of GetForeignJoinPaths contains both > conditions mixed, so I added extract_actual_join_clauses() to separate > it into two lists, join_quals and other clauses. This is similar to > what create_nestloop_plan and siblings do. > > >> >> Following EXPLAIN output seems to be confusing >> ft1 and ft2 both are pointing to same lt on a foreign server. >> postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2 where >> ft1.val + ft1.val2 = ft2.val; >> >> QUERY PLAN >> >> ------------------------------------------------------------------------------------------------------------------------------------------------------- >> -------------------------- >> Foreign Scan (cost=100.00..132.00 rows=2560 width=8) >> Output: (val + val2) >> Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM public.lt) l >> (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1) ON (( >> (r.a_0 + r.a_1) = l.a_0)) >> >> Output just specified val + val2, it doesn't tell, where those val and val2 >> come from, neither it's evident from the rest of the context. >> > > Actually val and val2 come from public.lt in "r" side, but as you say > it's too difficult to know that from EXPLAIN output. Do you have any > idea to make the "Output" item more readable? > > -- > Shigeru HANADA -- Shigeru HANADA
foreign_join_v5.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers