On Mon, Mar 9, 2015 at 5:46 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> Hi Ashutosh, > > Thanks for finding out what we oversight. > Here is still a problem because the new 'relids' field is not updated > on setrefs.c (scanrelid is incremented by rtoffset here). > It is easy to shift the bitmapset by rtoffset, however, I also would > like to see another approach. > I just made it work for explain, but other parts still need work. Sorry about that. If we follow INDEX_VAR, we should be able to get there. > > My idea adds 'List *fdw_sub_paths' field in ForeignPath to inform > planner underlying foreign-scan paths (with scanrelid > 0). > The create_foreignscan_plan() will call create_plan_recurse() to > construct plan nodes based on the path nodes being attached. > Even though these foreign-scan nodes are not actually executed, > setrefs.c can update scanrelid in usual way and ExplainPreScanNode > does not need to take exceptional handling on Foreign/CustomScan > nodes. > In addition, it allows to keep information about underlying foreign > table scan, even if planner will need some other information in the > future version (not only relids). > > How about your thought? > I am not sure about keeping planner nodes, which are not turned into execution nodes. There's no precedence for that in current code. It could be risky. > -- > NEC OSS Promotion Center / PG-Strom Project > KaiGai Kohei <kai...@ak.jp.nec.com> > > > > -----Original Message----- > > From: pgsql-hackers-ow...@postgresql.org > > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat > > Sent: Friday, March 06, 2015 7:26 PM > > To: Kaigai Kouhei(海外 浩平) > > Cc: Shigeru Hanada; Robert Haas; PostgreSQL-development > > Subject: Re: [HACKERS] Join push-down support for foreign tables > > > > Hi Kaigai-san, Hanada-san, > > > > Attached please find a patch to print the column names prefixed by the > relation > > names. I haven't tested the patch fully. The same changes will be needed > for > > CustomPlan node specific code. > > > > Now I am able to make sense out of the Output information > > > > postgres=# explain verbose select * from ft1 join ft2 using (val); > > > > QUERY PLAN > > > > > ---------------------------------------------------------------------------- > > > --------------------------------------------------------------------------- > > ----------------------- > > Foreign Scan (cost=100.00..125.60 rows=2560 width=12) > > Output: ft1.val, ft1.val2, ft2.val2 > > Remote SQL: SELECT r.a_0, r.a_1, l.a_1 FROM (SELECT val, val2 FROM > public.lt) > > l (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1) > > ON ((r.a_0 = l.a_0)) > > (3 rows) > > > > > > > > On Fri, Mar 6, 2015 at 6:41 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> > wrote: > > > > > > > Actually val and val2 come from public.lt in "r" side, but as > you say > > > it's too difficult to know that from EXPLAIN output. Do you > have any > > > idea to make the "Output" item more readable? > > > > > A fundamental reason why we need to have symbolic aliases here is > that > > postgres_fdw has remote query in cstring form. It makes > implementation > > complicated to deconstruct/construct a query that is once > constructed > > on the underlying foreign-path level. > > If ForeignScan keeps items to construct remote query in expression > node > > form (and construction of remote query is delayed to beginning of > the > > executor, probably), we will be able to construct more human > readable > > remote query. > > > > However, I don't recommend to work on this great refactoring stuff > > within the scope of join push-down support project. > > > > Thanks, > > -- > > NEC OSS Promotion Center / PG-Strom Project > > KaiGai Kohei <kai...@ak.jp.nec.com> > > > > > > > -----Original Message----- > > > From: pgsql-hackers-ow...@postgresql.org > > > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Shigeru > > Hanada > > > Sent: Thursday, March 05, 2015 10:00 PM > > > To: Ashutosh Bapat > > > Cc: Kaigai Kouhei(海外 浩平); Robert Haas; PostgreSQL-development > > > Subject: Re: [HACKERS] Join push-down support for foreign tables > > > > > > > > Hi Ashutosh, thanks for the review. > > > > > > 2015-03-04 19:17 GMT+09:00 Ashutosh Bapat > > <ashutosh.ba...@enterprisedb.com>: > > > > In create_foreignscan_path() we have lines like - > > > > 1587 pathnode->path.param_info = > > get_baserel_parampathinfo(root, rel, > > > > 1588 > > > > required_outer); > > > > Now, that the same function is being used for creating foreign > scan > > paths > > > > for joins, we should be calling get_joinrel_parampathinfo() on > a join > > rel > > > > and get_baserel_parampathinfo() on base rel. > > > > > > Got it. Please let me check the difference. > > > > > > > > > > > The patch seems to handle all the restriction clauses in the > same > > way. There > > > > are two kinds of restriction clauses - a. join quals > (specified using > > ON > > > > clause; optimizer might move them to the other class if that > doesn't > > affect > > > > correctness) and b. quals on join relation (specified in the > WHERE > > clause, > > > > optimizer might move them to the other class if that doesn't > affect > > > > correctness). The quals in "a" are applied while the join is > being > > computed > > > > whereas those in "b" are applied after the join is computed. > For > > example, > > > > postgres=# select * from lt; > > > > val | val2 > > > > -----+------ > > > > 1 | 2 > > > > 1 | 3 > > > > (2 rows) > > > > > > > > postgres=# select * from lt2; > > > > val | val2 > > > > -----+------ > > > > 1 | 2 > > > > (1 row) > > > > > > > > postgres=# select * from lt left join lt2 on (lt.val2 = > lt2.val2); > > > > val | val2 | val | val2 > > > > -----+------+-----+------ > > > > 1 | 2 | 1 | 2 > > > > 1 | 3 | | > > > > (2 rows) > > > > > > > > postgres=# select * from lt left join lt2 on (true) where > (lt.val2 > > = > > > > lt2.val2); > > > > val | val2 | val | val2 > > > > -----+------+-----+------ > > > > 1 | 2 | 1 | 2 > > > > (1 row) > > > > > > > > The difference between these two kinds is evident in case of > outer > > joins, > > > > for inner join optimizer puts all of them in class "b". The > remote > > query > > > > sent to the foreign server has all those in ON clause. > Consider foreign > > > > tables ft1 and ft2 pointing to local tables on the same server. > > > > postgres=# \d ft1 > > > > Foreign table "public.ft1" > > > > Column | Type | Modifiers | FDW Options > > > > --------+---------+-----------+------------- > > > > val | integer | | > > > > val2 | integer | | > > > > Server: loopback > > > > FDW Options: (table_name 'lt') > > > > > > > > postgres=# \d ft2 > > > > Foreign table "public.ft2" > > > > Column | Type | Modifiers | FDW Options > > > > --------+---------+-----------+------------- > > > > val | integer | | > > > > val2 | integer | | > > > > Server: loopback > > > > FDW Options: (table_name 'lt2') > > > > > > > > postgres=# explain verbose select * from ft1 left join ft2 on > (ft1.val2 > > = > > > > ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is > null; > > > > > > > > QUERY PLAN > > > > > > > > > > > > > > ---------------------------------------------------------------------------- > > > > > > --------------------------------------------------------------------------- > > > > > > > > > > ---------------------------------------------------------------------------- > > > -------- > > > > Foreign Scan (cost=100.00..125.60 rows=2560 width=16) > > > > Output: val, val2, val, val2 > > > > Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT > val, > > val2 FROM > > > > public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM > public.lt) > > r (a > > > > _0, a_1) ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) > AND > > ((r.a_1 = > > > > l.a_1)) > > > > (3 rows) > > > > > > > > The result is then wrong > > > > postgres=# select * from ft1 left join ft2 on (ft1.val2 = > ft2.val2) > > where > > > > ft1.val + ft2.val > ft1.val2 or ft2.val is null; > > > > val | val2 | val | val2 > > > > -----+------+-----+------ > > > > 1 | 2 | | > > > > 1 | 3 | | > > > > (2 rows) > > > > > > > > which should match the result obtained by substituting local > tables > > for > > > > foreign ones > > > > postgres=# select * from lt left join lt2 on (lt.val2 = > lt2.val2) > > where > > > > lt.val + lt2.val > lt.val2 or lt2.val is null; > > > > val | val2 | val | val2 > > > > -----+------+-----+------ > > > > 1 | 3 | | > > > > (1 row) > > > > > > > > Once we start distinguishing the two kinds of quals, there is > some > > > > optimization possible. For pushing down a join it's essential > that > > all the > > > > quals in "a" are safe to be pushed down. But a join can be > pushed > > down, even > > > > if quals in "a" are not safe to be pushed down. But more > clauses one > > pushed > > > > down to foreign server, lesser are the rows fetched from the > foreign > > server. > > > > In postgresGetForeignJoinPath, instead of checking all the > > restriction > > > > clauses to be safe to be pushed down, we need to check only > those > > which are > > > > join quals (class "a"). > > > > > > The argument restrictlist of GetForeignJoinPaths contains both > > > conditions mixed, so I added extract_actual_join_clauses() to > separate > > > it into two lists, join_quals and other clauses. This is > similar to > > > what create_nestloop_plan and siblings do. > > > > > > > > > > > > > > Following EXPLAIN output seems to be confusing > > > > ft1 and ft2 both are pointing to same lt on a foreign server. > > > > postgres=# explain verbose select ft1.val + ft1.val2 from ft1, > ft2 > > where > > > > ft1.val + ft1.val2 = ft2.val; > > > > > > > > QUERY PLAN > > > > > > > > > > > > > > ---------------------------------------------------------------------------- > > > > > > --------------------------------------------------------------------------- > > > > -------------------------- > > > > Foreign Scan (cost=100.00..132.00 rows=2560 width=8) > > > > Output: (val + val2) > > > > Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM > > public.lt) l > > > > (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r > (a_0, a_1) > > ON (( > > > > (r.a_0 + r.a_1) = l.a_0)) > > > > > > > > Output just specified val + val2, it doesn't tell, where those > val > > and val2 > > > > come from, neither it's evident from the rest of the context. > > > > > > > > > > Actually val and val2 come from public.lt in "r" side, but as > you say > > > it's too difficult to know that from EXPLAIN output. Do you > have any > > > idea to make the "Output" item more readable? > > > > > > -- > > > Shigeru HANADA > > > > > > > > > > > -- > > > Sent via pgsql-hackers mailing list ( > pgsql-hackers@postgresql.org) > > > To make changes to your subscription: > > > http://www.postgresql.org/mailpref/pgsql-hackers > > > > > > > > > > > > -- > > > > Best Wishes, > > Ashutosh Bapat > > EnterpriseDB Corporation > > The Postgres Database Company > > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company