> Actually val and val2 come from public.lt in "r" side, but as you say
> it's too difficult to know that from EXPLAIN output.  Do you have any
> idea to make the "Output" item more readable?
>
A fundamental reason why we need to have symbolic aliases here is that
postgres_fdw has remote query in cstring form. It makes implementation
complicated to deconstruct/construct a query that is once constructed
on the underlying foreign-path level.
If ForeignScan keeps items to construct remote query in expression node
form (and construction of remote query is delayed to beginning of the
executor, probably), we will be able to construct more human readable
remote query.

However, I don't recommend to work on this great refactoring stuff
within the scope of join push-down support project.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Shigeru Hanada
> Sent: Thursday, March 05, 2015 10:00 PM
> To: Ashutosh Bapat
> Cc: Kaigai Kouhei(海外 浩平); Robert Haas; PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
> 
> Hi Ashutosh, thanks for the review.
> 
> 2015-03-04 19:17 GMT+09:00 Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>:
> > In create_foreignscan_path() we have lines like -
> > 1587     pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
> > 1588
> > required_outer);
> > Now, that the same function is being used for creating foreign scan paths
> > for joins, we should be calling get_joinrel_parampathinfo() on a join rel
> > and get_baserel_parampathinfo() on base rel.
> 
> Got it.  Please let me check the difference.
> 
> >
> > The patch seems to handle all the restriction clauses in the same way. There
> > are two kinds of restriction clauses - a. join quals (specified using ON
> > clause; optimizer might move them to the other class if that doesn't affect
> > correctness) and b. quals on join relation (specified in the WHERE clause,
> > optimizer might move them to the other class if that doesn't affect
> > correctness). The quals in "a" are applied while the join is being computed
> > whereas those in "b" are applied after the join is computed. For example,
> > postgres=# select * from lt;
> >  val | val2
> > -----+------
> >    1 |    2
> >    1 |    3
> > (2 rows)
> >
> > postgres=# select * from lt2;
> >  val | val2
> > -----+------
> >    1 |    2
> > (1 row)
> >
> > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
> >  val | val2 | val | val2
> > -----+------+-----+------
> >    1 |    2 |   1 |    2
> >    1 |    3 |     |
> > (2 rows)
> >
> > postgres=# select * from lt left join lt2 on (true) where (lt.val2 =
> > lt2.val2);
> >  val | val2 | val | val2
> > -----+------+-----+------
> >    1 |    2 |   1 |    2
> > (1 row)
> >
> > The difference between these two kinds is evident in case of outer joins,
> > for inner join optimizer puts all of them in class "b". The remote query
> > sent to the foreign server has all those in ON clause. Consider foreign
> > tables ft1 and ft2 pointing to local tables on the same server.
> > postgres=# \d ft1
> >          Foreign table "public.ft1"
> >  Column |  Type   | Modifiers | FDW Options
> > --------+---------+-----------+-------------
> >  val    | integer |           |
> >  val2   | integer |           |
> > Server: loopback
> > FDW Options: (table_name 'lt')
> >
> > postgres=# \d ft2
> >          Foreign table "public.ft2"
> >  Column |  Type   | Modifiers | FDW Options
> > --------+---------+-----------+-------------
> >  val    | integer |           |
> >  val2   | integer |           |
> > Server: loopback
> > FDW Options: (table_name 'lt2')
> >
> > postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 =
> > ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
> >
> > QUERY PLAN
> >
> >
> ----------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> >
> ----------------------------------------------------------------------------
> --------
> >  Foreign Scan  (cost=100.00..125.60 rows=2560 width=16)
> >    Output: val, val2, val, val2
> >    Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2 FROM
> > public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt) r (a
> > _0, a_1)  ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND ((r.a_1 =
> > l.a_1))
> > (3 rows)
> >
> > The result is then wrong
> > postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where
> > ft1.val + ft2.val > ft1.val2 or ft2.val is null;
> >  val | val2 | val | val2
> > -----+------+-----+------
> >    1 |    2 |     |
> >    1 |    3 |     |
> > (2 rows)
> >
> > which should match the result obtained by substituting local tables for
> > foreign ones
> > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where
> > lt.val + lt2.val > lt.val2 or lt2.val is null;
> >  val | val2 | val | val2
> > -----+------+-----+------
> >    1 |    3 |     |
> > (1 row)
> >
> > Once we start distinguishing the two kinds of quals, there is some
> > optimization possible. For pushing down a join it's essential that all the
> > quals in "a" are safe to be pushed down. But a join can be pushed down, even
> > if quals in "a" are not safe to be pushed down. But more clauses one pushed
> > down to foreign server, lesser are the rows fetched from the foreign server.
> > In postgresGetForeignJoinPath, instead of checking all the restriction
> > clauses to be safe to be pushed down, we need to check only those which are
> > join quals (class "a").
> 
> The argument restrictlist of GetForeignJoinPaths contains both
> conditions mixed, so I added extract_actual_join_clauses() to separate
> it into two lists, join_quals and other clauses.  This is similar to
> what create_nestloop_plan and siblings do.
> 
> 
> >
> > Following EXPLAIN output seems to be confusing
> > ft1 and ft2 both are pointing to same lt on a foreign server.
> > postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2 where
> > ft1.val + ft1.val2 = ft2.val;
> >
> > QUERY PLAN
> >
> >
> ----------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> > --------------------------
> >  Foreign Scan  (cost=100.00..132.00 rows=2560 width=8)
> >    Output: (val + val2)
> >    Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM public.lt) l
> > (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)  ON ((
> > (r.a_0 + r.a_1) = l.a_0))
> >
> > Output just specified val + val2, it doesn't tell, where those val and val2
> > come from, neither it's evident from the rest of the context.
> >
> 
> Actually val and val2 come from public.lt in "r" side, but as you say
> it's too difficult to know that from EXPLAIN output.  Do you have any
> idea to make the "Output" item more readable?
> 
> --
> Shigeru HANADA
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to