The attached patch is rebased version of pass-down LIMIT clause patch, which was forgotten to register on the last CF.
It allows to inform required number of rows to the sub-plans not only ones we have individually handled at pass_down_bound(). Its primary target is control of number of remote tuple transfer over the network connection by postgres_fdw. According to the past discussion, we add a new field @ps_numTuples on the PlanState to represent the required number of tuples. Limit node assign a particular number on the field of sub-plan, then this sub-plan can know its upper node does not require entire tuples, and adjust its execution storategy. Like MergeAppend, the sub-plan can also pass down the bounds to its sub-plans again, if it makes sense and works correctly. This feature is potentially a basis of GPU-based sorting on top of CustomScan, because it has advantage for a workload to pick up the top-N tuples if its data-size is enough small to load onto GPU-RAM. Thanks, ---- PG-Strom Project / NEC OSS Promotion Center KaiGai Kohei <kai...@ak.jp.nec.com> > -----Original Message----- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kohei KaiGai > Sent: Tuesday, January 03, 2017 12:07 PM > To: Kaigai Kouhei(海外 浩平) <kai...@ak.jp.nec.com> > Cc: Jeevan Chalke <jeevan.cha...@enterprisedb.com>; Robert Haas > <robertmh...@gmail.com>; email@example.com; Etsuro Fujita > <fujita.ets...@lab.ntt.co.jp>; Andres Freund <and...@anarazel.de> > Subject: Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan > [take-2] > > Oops, I oversight this patch was marked as "returned with feedback", not > "moved to the next CF". > > Its status has not been changed since the last update. (Code was revised > according to the last comment by Jeevan, but CF-Nov was time up at that > time.) > > How do I handle the patch? > > 2016-12-05 16:49 GMT+09:00 Kouhei Kaigai <kai...@ak.jp.nec.com>: > > Hello, > > > > Sorry for my late response. > > The attached patch reflects your comments. > > > >> Here are few comments on latest patch: > >> > >> > >> 1. > >> make/make check is fine, however I am getting regression failure in > >> postgres_fdw contrib module (attached regression.diff). > >> Please investigate and fix. > >> > > It was an incorrect interaction when postgres_fdw tries to push down > > sorting to the remote side. We cannot attach LIMIT clause on the plain > > scan path across SORT, however, the previous version estimated the > > cost for the plain scan with LIMIT clause even if local sorting is needed. > > If remote scan may return just 10 rows, estimated cost of the local > > sort is very lightweight, thus, this unreasonable path was chosen. > > (On the other hands, its query execution results were correct because > > ps_numTuples is not delivered across Sort node, so ForeignScan > > eventually scanned all the remote tuples. It made correct results but > > not optimal from the viewpoint of performance.) > > > > The v3 patch estimates the cost with remote LIMIT clause only if > > supplied pathkey strictly matches with the final output order of the > > query, thus, no local sorting is expected. > > > > Some of the regression test cases still have different plans but due > > to the new optimization by remote LIMIT clause. > > Without remote LIMIT clause, some of regression test cases preferred > > remote-JOIN + local-SORT then local-LIMIT. > > Once we have remote-LIMIT option, it allows to discount the cost for > > remote-SORT by choice of top-k heap sorting. > > It changed the optimizer's decision on some test cases. > > > > Potential one big change is the test case below. > > > > -- CROSS JOIN, not pushed down > > EXPLAIN (VERBOSE, COSTS OFF) > > SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, > > t2.c1 OFFSET 100 LIMIT 10; > > > > It assumed CROSS JOIN was not pushed down due to the cost for network > > traffic, however, remote LIMIT reduced the estimated number of tuples > > to be moved. So, all of the CROSS JOIN + ORDER BY + LIMIT became to > > run on the remote side. > > > >> 2. > >> + * > >> + * MEMO: root->limit_tuples is not attached when query > >> contains > >> + * grouping-clause or aggregate functions. So, we don's > adjust > >> + * rows even if LIMIT <const> is supplied. > >> > >> Can you please explain why you are not doing this for grouping-clause > >> or aggregate functions. > >> > > See grouping_planner() at optimizer/plan/planner.c It puts an invalid > > value on the root->limit_tuples if query has GROUP BY clause, so we > > cannot know number of tuples to be returned even if we have upper > > limit actually. > > > > /* > > * Figure out whether there's a hard limit on the number of rows > that > > * query_planner's result subplan needs to return. Even if we > know a > > * hard limit overall, it doesn't apply if the query has any > > * grouping/aggregation operations, or SRFs in the tlist. > > */ > > if (parse->groupClause || > > parse->groupingSets || > > parse->distinctClause || > > parse->hasAggs || > > parse->hasWindowFuncs || > > parse->hasTargetSRFs || > > root->hasHavingQual) > > root->limit_tuples = -1.0; > > else > > root->limit_tuples = limit_tuples; > > > >> 3. > >> Typo: > >> > >> don's => don't > >> > > Fixed, > > > > best regards, > > ---- > > PG-Strom Project / NEC OSS Promotion Center KaiGai Kohei > > <kai...@ak.jp.nec.com> > > > > > > > > -- > > Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To > > make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-hackers > > > > > > -- > KaiGai Kohei <kai...@kaigai.gr.jp>
-- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers