Re: [HACKERS] Oddity in EXPLAIN for foreign/custom join pushdown plans

Kouhei Kaigai Thu, 28 Jul 2016 21:55:34 -0700

> I wrote:
> >> That may be so, but my point is that the target relations involved in
> >> the foreign join (ie, ft1 and ft2) should be printed somewhere in the
> >> EXPLAIN output by core, as in EXPLAIN for a simple foreign table scan.
> 
> > Why? According to your rule, Hash Join should take "on t0,t1,t2".
> >
> > postgres=# explain select id from t0 natural join t1 natural join t2;
> >                                  QUERY PLAN
> > -----------------------------------------------------------------------------
> >  Hash Join  (cost=6370.00..4560826.24 rows=98784048 width=4)
> >    Hash Cond: (t0.aid = t1.aid)
> >    ->  Hash Join  (cost=3185.00..3199360.58 rows=98784048 width=8)
> >          Hash Cond: (t0.bid = t2.bid)
> >          ->  Seq Scan on t0  (cost=0.00..1833334.80 rows=100000080 width=12)
> >          ->  Hash  (cost=1935.00..1935.00 rows=100000 width=4)
> >                ->  Seq Scan on t2  (cost=0.00..1935.00 rows=100000 width=4)
> >    ->  Hash  (cost=1935.00..1935.00 rows=100000 width=4)
> >          ->  Seq Scan on t1  (cost=0.00..1935.00 rows=100000 width=4)
> > (9 rows)
> 
> I don't think it needs "on t0,t1,t2", because we can see joining
> relations from inner/outer plans in that case.  In a foreign-join case,
> however, we can't see such relations from the EXPLAIN printed *by core*.
>   postgres_fdw avoids this issue by adding such relations to the EXPLAIN
> using ExplainForeignScan as shown in the below example, but since such
> relations are essential, I think that information should be shown by
> core itself.
>
I never opposed to print the relation names by the core, however, it has
to be controllable by the extension which provides ForeignScan/CustomScan
because only extension can know how underlying relation shall be scanned
exactly.


> postgres=# explain select * from (select ft1.a from ft1 left join ft2 on
> ft1.a = ft2.a where ft1.b = 1) ss1(a) full join (select ft3.a from ft3
> left join ft4 on ft3.a = ft4.a where ft3.b = 1) ss2(a) on ss1.a = ss2.a;
>                             QUERY PLAN
> ----------------------------------------------------------------
>   Hash Full Join  (cost=202.06..204.12 rows=1 width=8)
>     Hash Cond: (ft1.a = ft3.a)
>     ->  Foreign Scan  (cost=100.00..102.05 rows=1 width=4)
>           Relations: (public.ft1) LEFT JOIN (public.ft2)
>     ->  Hash  (cost=102.05..102.05 rows=1 width=4)
>           ->  Foreign Scan  (cost=100.00..102.05 rows=1 width=4)
>                 Relations: (public.ft3) LEFT JOIN (public.ft4)
> (7 rows)
> 
>  From the Relations line shown by postgres_fdw, we can see which foreign
> join joins which foreign tables, but if no such lines, we couldn't.
>
In case of postgres_fdw, if extension can indicate the core EXPLAIN to
print all of its underlying relations, the core EXPLAIN routine will
print name of the relations. Here is no problem.

> >>> postgres=# explain select id from t0 natural join t1 natural join t2;
> >>>                                 QUERY PLAN
> >>> ---------------------------------------------------------------------------
> >>>  Custom Scan (GpuJoin)  (cost=12385.67..291245.35 rows=9815465 width=4)
> >>>    GPU Projection: t0.id
> >>>    Depth 1: GpuHashJoin, HashKeys: (t0.bid)
> >>>             JoinQuals: (t0.bid = t2.bid)
> >>>             Nrows (in/out: 98.15%), KDS-Hash (size: 13.47MB, nbatches: 1)
> >>>    Depth 2: GpuHashJoin, HashKeys: (t0.aid)
> >>>             JoinQuals: (t0.aid = t1.aid)
> >>>             Nrows (in/out: 100.00%), KDS-Hash (size: 13.47MB, nbatches: 1)
> >>>    ->  Seq Scan on t0  (cost=0.00..183333.96 rows=9999996 width=12)
> >>>    ->  Seq Scan on t2  (cost=0.00..1935.00 rows=100000 width=4)
> >>>    ->  Seq Scan on t1  (cost=0.00..1935.00 rows=100000 width=4)
> >>> (11 rows)
> 
> > My largest concern for you proposition is, ForeignScan/CustomScan node is
> > enforced to print name of underlying relations, regardless of its actual
> > behavior. The above GpuJoin never scans tables at least, thus, it mislead
> > users if we have no choice to print underlying relation names.
> 
> OK, I understand we would need special handling for such custom joins.
>
It is not a case only for CustomScan.

Please assume an enhancement of postgres_fdw that reads a small local table 
(tbl_1)
and parse them as VALUES() clause within a remote query to execute remote join
with foreign tables (ftbl_2, ftbl_3).
This ForeignScan node has three underlying relations; tbl_1, ftbl_2 and ftbl_3.
Likely, tbl_1 will be scanned by SeqScan, not ForeignScan itself.
In this case, which relations shall be printed around ForeignScan?
Is it possible to choose proper relation names without hint by the extension?
                                               ^^^^^^^^^^^^
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Oddity in EXPLAIN for foreign/custom join pushdown plans

Reply via email to