On Mon, Nov 21, 2016 at 1:59 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:

> Hello,
>
> The attached patch is a revised version of the pass-down-LIMIT-to-FDW/CSP patch.
>
> Below are the updates from the last version.
>
> 'ps_numTuples' of PlanState is now declared as uint64 instead of long,
> to avoid problems on 32-bit machines when a large LIMIT clause is
> supplied.
>
> 'ps_numTuples' is re-interpreted: 0 means that the upper node wants
> to fetch all the tuples. This eliminates a tedious initialization in
> the ExecInit handler of each executor node.
>
> Even though it was not suggested, estimate_path_cost_size() of postgres_fdw
> now adjusts the number of rows if the foreign path sits on top of the
> base relations and the LIMIT clause takes a constant value.
> It produces a more adequate plan, as follows:
>
> * WITHOUT this patch
> --------------------
> postgres=# explain verbose select * from t_a, t_b where t_a.id = t_b.id and t_a.x < t_b.x LIMIT 100;
>                                        QUERY PLAN
> ----------------------------------------------------------------------------------------
>  Limit  (cost=261.17..274.43 rows=100 width=88)
>    Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>    ->  Hash Join  (cost=261.17..581.50 rows=2416 width=88)
>          Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>          Hash Cond: (t_a.id = t_b.id)
>          Join Filter: (t_a.x < t_b.x)
>          ->  Foreign Scan on public.t_a  (cost=100.00..146.12 rows=1204 width=44)
>                Output: t_a.id, t_a.x, t_a.y
>                Remote SQL: SELECT id, x, y FROM public.t
>          ->  Hash  (cost=146.12..146.12 rows=1204 width=44)
>                Output: t_b.id, t_b.x, t_b.y
>                ->  Foreign Scan on public.t_b  (cost=100.00..146.12 rows=1204 width=44)
>                      Output: t_b.id, t_b.x, t_b.y
>                      Remote SQL: SELECT id, x, y FROM public.t
> (14 rows)
>
> * WITH this patch
> -----------------
> postgres=# explain verbose select * from t_a, t_b where t_a.id = t_b.id and t_a.x < t_b.x LIMIT 100;
>                                              QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
>  Limit  (cost=100.00..146.58 rows=100 width=88)
>    Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>    ->  Foreign Scan  (cost=100.00..146.58 rows=100 width=88)
>          Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>          Relations: (public.t_a) INNER JOIN (public.t_b)
>          Remote SQL: SELECT r1.id, r1.x, r1.y, r2.id, r2.x, r2.y FROM (public.t r1 INNER JOIN public.t r2 ON (((r1.x < r2.x)) AND ((r1.id = r2.id))))
> (6 rows)
>
>
> That's nice.


> On the other hand, I noticed it is not safe to attach the LIMIT clause at
> the planner stage, because root->limit_tuples is declared as double.
> Even if the LIMIT clause takes a constant value, it can be larger than
> 2^53, the largest integer that float64 can represent accurately, whereas
> a LIMIT clause allows values up to 2^63-1.
> So, postgres_fdw now attaches the LIMIT clause to the remote query at
> execution time only.
>

I think it's OK.
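
Just to make sure I read the execution-time part correctly: I expect the
executor-side behaviour to be roughly the following (a minimal sketch only;
the helper name and the exact call site are my assumptions, not the patch's
actual code):

/*
 * Sketch: append the LIMIT pushed down from the upper Limit node to the
 * already-deparsed remote query at executor startup.  Per the patch,
 * ps_numTuples is a uint64 in PlanState and 0 means "the upper node
 * wants all the tuples".
 */
#include "postgres.h"
#include "lib/stringinfo.h"
#include "nodes/execnodes.h"

static void
append_remote_limit(ForeignScanState *node, StringInfo sql)
{
    uint64      limit = node->ss.ps.ps_numTuples;   /* set by the Limit node */

    if (limit > 0)
        appendStringInfo(sql, " LIMIT " UINT64_FORMAT, limit);
    /* limit == 0: fetch everything, leave the remote query as-is */
}

If that matches the intent, the execution-time-only approach seems fine to me.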

Here are a few comments on the latest patch:

1.
make/make check is fine; however, I am getting a regression failure in the
postgres_fdw contrib module (attached as regression.diffs).
Please investigate and fix.

2.
+             *
+             * MEMO: root->limit_tuples is not attached when query contains
+             * grouping-clause or aggregate functions. So, we don's adjust
+             * rows even if LIMIT <const> is supplied.

Can you please explain why you are not doing this when the query contains a
grouping clause or aggregate functions?

3.
Typo:

don's  => don't

The rest of the changes look good to me.
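
One more note, on the plan-time side that Robert describes below: my
understanding is that a bound can only be picked up during planning when it
is a plain Const; something like LIMIT twelve() has to wait for execution
time. Roughly this kind of check, I assume (hypothetical helper, not taken
from the patch):

/*
 * Hypothetical helper: report whether a LIMIT expression is a plan-time
 * constant, and if so return its value.  Anything else (e.g. a function
 * call the planner could not fold to a constant) is only known at
 * execution time.
 */
#include "postgres.h"
#include "nodes/primnodes.h"

static bool
limit_is_plan_time_const(Node *limitCount, int64 *count)
{
    if (limitCount != NULL && IsA(limitCount, Const))
    {
        Const  *c = (Const *) limitCount;

        if (!c->constisnull)
        {
            *count = DatumGetInt64(c->constvalue);
            return true;
        }
    }
    return false;
}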

Thanks


> Thanks,
> ----
> PG-Strom Project / NEC OSS Promotion Center
> KaiGai Kohei <kai...@ak.jp.nec.com>
>
>
> > -----Original Message-----
> > From: Robert Haas [mailto:robertmh...@gmail.com]
> > Sent: Thursday, November 10, 2016 3:08 AM
> > To: Kaigai Kouhei(海外 浩平) <kai...@ak.jp.nec.com>
> > Cc: pgsql-hackers@postgresql.org; Jeevan Chalke
> > <jeevan.cha...@enterprisedb.com>; Etsuro Fujita
> > <fujita.ets...@lab.ntt.co.jp>; Andres Freund <and...@anarazel.de>
> > Subject: ##freemail## Re: PassDownLimitBound for ForeignScan/CustomScan
> > [take-2]
> >
> > On Mon, Oct 31, 2016 at 10:20 AM, Kouhei Kaigai <kai...@ak.jp.nec.com>
> > wrote:
> > > As an example, I enhanced postgres_fdw to understand ps_numTuples
> > > if it is set. If and when remote ORDER BY is pushed down, the latest
> > > code tries to sort the entire remote table because it does not know
> > > how many rows are to be returned, so it takes a longer execution time.
> > > On the other hand, the patched one runs the remote query with a LIMIT
> > > clause according to ps_numTuples, which is supplied by the Limit
> > > node on top of the ForeignScan node.
> >
> > So there are two cases here.  If the user says LIMIT 12, we could in
> > theory know that at planner time and optimize accordingly.  If the user
> > says LIMIT twelve(), however, we will need to wait until execution time
> > unless twelve() happens to be capable of being simplified to a constant
> > by the planner.
> >
> > Therefore, it's possible to imagine having two mechanisms here.  In the
> > simple case where the LIMIT and OFFSET values are constants, we could
> > implement a system to get hold of that information during planning and
> > use it for whatever we like.  In addition, we can have an execution-time
> > system that optimizes based on values available at execution (regardless
> > of whether those values were also available during planning).  Those are,
> > basically, two separate things, and this patch has enough to do just
> > focusing on one of them.
> >
> > --
> > Robert Haas
> > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> > Company
>



-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment: regression.diffs
Description: Binary data
