> I could have a discussion with Fujita-san about this topic.
>
Also, let me share our discussion toward a complete solution.

The root cause of this problem is that a Scan node with scanrelid==0
represents a relation join that can involve multiple relations, so the
TupleDesc of its records does not match any base relation; however,
ExecScanFetch() was not updated when scanrelid==0 became supported.

The FDW/CSP behind a Scan node with scanrelid==0 is responsible for
generating records according to the fdw_scan_tlist/custom_scan_tlist that
reflects the definition of the relation join, and only the FDW/CSP knows
how to combine these base relations.
In addition, host-side expressions (like Plan->qual) are initialized to
reference the records generated by the FDW/CSP, so I think the least
invasive approach is to let the FDW/CSP have its own recheck logic.

Below is the structure of ExecScanFetch().

    ExecScanFetch(ScanState *node,
                  ExecScanAccessMtd accessMtd,
                  ExecScanRecheckMtd recheckMtd)
    {
        EState     *estate = node->ps.state;

        if (estate->es_epqTuple != NULL)
        {
            /*
             * We are inside an EvalPlanQual recheck.  Return the test tuple if
             * one is available, after rechecking any access-method-specific
             * conditions.
             */
            Index       scanrelid = ((Scan *) node->ps.plan)->scanrelid;

            Assert(scanrelid > 0);
            if (estate->es_epqTupleSet[scanrelid - 1])
            {
                TupleTableSlot *slot = node->ss_ScanTupleSlot;
                    :
                return slot;
            }
        }
        return (*accessMtd) (node);
    }

When we are inside EPQ, it fetches a tuple from the es_epqTuple[] array and
checks its visibility (ForeignRecheck() always says 'yep, it is visible'),
then ExecScan() applies its qualifiers with ExecQual().
So, as long as the FDW/CSP can return a record that satisfies the TupleDesc
of this relation, built from the tuples in the es_epqTuple[] array, the rest
of the code path is common.

I have an idea to solve the problem.
It adds a recheckMtd() call just before the assertion above when
scanrelid==0, and adds an FDW callback in ForeignRecheck().
The role of this new callback is to set up the supplied TupleTableSlot and
check its visibility, but the interface does not define how to do this.
That is left to the FDW driver, for example by invoking an alternative plan
that consists only of built-in logic.

Invoking an alternative plan is one of the most feasible ways to implement
EPQ logic in an FDW, so I think FDW also needs a mechanism that takes child
path nodes, like custom_paths in the CustomPath node.
Once a valid path node is linked to this list, createplan.c transforms it
into the corresponding plan node, and the FDW can then initialize and invoke
this plan node during execution, for example from ForeignRecheck().

This design can also solve another problem Fujita-san has mentioned.
If a scan qualifier is pushed down to the remote query and its expression
node is saved in the private area of ForeignScan, the callback in
ForeignRecheck() can evaluate the qualifier by itself. (Note that only the
FDW driver knows where and how the pushed-down expression node is saved in
the private area.)

In summary, the following three enhancements are a straightforward way to
fix the problem he reported.

1. Add a special path in ExecScanFetch that calls recheckMtd if scanrelid==0.
2. Add an FDW callback in ForeignRecheck()
   - to construct a record according to the fdw_scan_tlist definition and
     to evaluate its visibility, or to evaluate the pushed-down qualifier
     if it is a base relation.
3. Add List *fdw_paths to ForeignPath, like custom_paths of CustomPath,
   to construct plan nodes for EPQ evaluation.

On the other hand, we also need to pay attention to the development
timeline. This is really a v9.5 problem; however, it looks to me that the
straightforward solution needs an enhancement of the FDW APIs.

I'd like to hear people's comments.
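
To make enhancements 1 and 2 above more concrete, here is a minimal,
self-contained C sketch of the proposed control flow. It is only an
illustration under stated assumptions, not the actual patch: it does not use
the real PostgreSQL headers, the structs are simplified stand-ins,
es_epqTuple[] holds plain ints instead of HeapTuples, and the names
sketch_ExecScanFetch, sketch_ForeignRecheck, fdw_recheck, demo_join_recheck,
demo_access and demo_epq_join are hypothetical. The toy callback treats
"both keys are equal" as the pushed-down join qualifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for executor types; NOT the real PostgreSQL structs. */
typedef unsigned int Index;

typedef struct TupleTableSlot
{
    bool        empty;          /* true when the slot holds no record */
    int         value;          /* toy payload instead of a real tuple */
} TupleTableSlot;

typedef struct EState
{
    const int  *es_epqTuple;    /* toy: one int per base rel; NULL outside EPQ */
    const bool *es_epqTupleSet; /* which entries of es_epqTuple are valid */
} EState;

typedef struct ScanState ScanState;
typedef TupleTableSlot *(*ExecScanAccessMtd) (ScanState *node);
typedef bool (*ExecScanRecheckMtd) (ScanState *node, TupleTableSlot *slot);

struct ScanState
{
    EState     *state;
    Index       scanrelid;      /* 0 means "this scan represents a join" */
    TupleTableSlot ss_ScanTupleSlot;
    /* hypothetical FDW-provided callback (enhancement 2) */
    bool        (*fdw_recheck) (ScanState *node, TupleTableSlot *slot);
};

/* Enhancement 2: ForeignRecheck() delegates to the FDW when a callback exists. */
bool
sketch_ForeignRecheck(ScanState *node, TupleTableSlot *slot)
{
    if (node->fdw_recheck != NULL)
        return node->fdw_recheck(node, slot);
    return true;                /* existing behaviour: always "visible" */
}

/* Enhancement 1: special path for scanrelid == 0 inside an EPQ recheck. */
TupleTableSlot *
sketch_ExecScanFetch(ScanState *node,
                     ExecScanAccessMtd accessMtd,
                     ExecScanRecheckMtd recheckMtd)
{
    EState     *estate = node->state;

    if (estate->es_epqTuple != NULL)
    {
        Index       scanrelid = node->scanrelid;
        TupleTableSlot *slot = &node->ss_ScanTupleSlot;

        if (scanrelid == 0)
        {
            /* Let the FDW build the joined record and check it. */
            slot->empty = true;
            if (!recheckMtd(node, slot))
                slot->empty = true;     /* recheck failed: return empty slot */
            return slot;
        }

        assert(scanrelid > 0);
        if (estate->es_epqTupleSet[scanrelid - 1])
        {
            slot->value = estate->es_epqTuple[scanrelid - 1];
            slot->empty = false;
            if (!recheckMtd(node, slot))
                slot->empty = true;
            return slot;
        }
    }
    return accessMtd(node);
}

/* Toy FDW callback: joins base rels 1 and 2 on equal keys. */
bool
demo_join_recheck(ScanState *node, TupleTableSlot *slot)
{
    const EState *estate = node->state;

    if (!estate->es_epqTupleSet[0] || !estate->es_epqTupleSet[1])
        return false;
    if (estate->es_epqTuple[0] != estate->es_epqTuple[1])
        return false;           /* pushed-down join qualifier not satisfied */
    slot->value = estate->es_epqTuple[0];
    slot->empty = false;
    return true;
}

/* Toy access method for the non-EPQ path: reports end of scan. */
TupleTableSlot *
demo_access(ScanState *node)
{
    node->ss_ScanTupleSlot.empty = true;
    return &node->ss_ScanTupleSlot;
}

/* Runs an EPQ recheck over a join scan; returns the key, or -1 if empty. */
int
demo_epq_join(int left, int right)
{
    int         tuples[2] = {left, right};
    bool        tuple_set[2] = {true, true};
    EState      estate = {tuples, tuple_set};
    ScanState   node = {&estate, 0, {true, 0}, demo_join_recheck};
    TupleTableSlot *slot;

    slot = sketch_ExecScanFetch(&node, demo_access, sketch_ForeignRecheck);
    return slot->empty ? -1 : slot->value;
}
```

In this toy setup, demo_epq_join(7, 7) produces a joined record, while
demo_epq_join(7, 8) leaves the slot empty because the callback rejects the
pair. The point of the design is that the core code only decides when to
call the callback; how the record is built stays inside the FDW.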

--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>

> -----Original Message-----
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
> Sent: Saturday, August 01, 2015 10:35 PM
> To: Robert Haas; Etsuro Fujita
> Cc: PostgreSQL-development; 花田茂
> Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
>
> > On Fri, Jul 3, 2015 at 6:25 AM, Etsuro Fujita
> > <fujita.ets...@lab.ntt.co.jp> wrote:
> > > Can't FDWs get the join information through the root, which I think we
> > > would pass to the API as the argument?
> >
> > This is exactly what Tom suggested originally, and it has some appeal,
> > but neither KaiGai nor I could see how to make it work. Do you have
> > an idea? It's not too late to go back and change the API.
> >
> > The problem that was bothering us (or at least what was bothering me)
> > is that the PlannerInfo provides only a list of SpecialJoinInfo
> > structures, which don't directly give you the original join order. In
> > fact, min_righthand and min_lefthand are intended to constrain the
> > *possible* join orders, and are deliberately designed *not* to specify
> > a single join order. If you're sending a query to a remote PostgreSQL
> > node, you don't want to know what all the possible join orders are;
> > it's the remote side's job to plan the query. You do, however, need
> > an easy way to identify one join order that you can use to construct a
> > query. It didn't seem easy to do that without duplicating
> > make_join_rel(), which seemed like a bad idea.
> >
> > But maybe there's a good way to do it. Tom wasn't crazy about this
> > hook both because of the frequency of calls and also because of the
> > long argument list. I think those concerns are legitimate; I just
> > couldn't see how to make the other way work.
> >
> I could have a discussion with Fujita-san about this topic.
> He has an idea to tackle this matter that is a little bit tricky, but I
> didn't have a clear reason to reject it.
> At the line just above set_cheapest() in standard_join_search(),
> at least one path from built-in join logic has already been added to the
> RelOptInfo, so the FDW driver can reference the cheapest path chosen by
> built-in logic and the lefttree and righttree that construct the joinrel.
> The assumption is that the best path found by built-in logic has a join
> order at least as reasonable as the other potential ones.
>
> Thanks,
> --
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei <kai...@ak.jp.nec.com>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers