Hello,

At Fri, 2 Oct 2015 03:10:01 +0000, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote in <9a28c8860f777e439aa12e8aea7694f80114d...@bpxm15gp.gisp.nec.co.jp>
> > > As long as the FDW author can choose their best way to produce a joined
> > > tuple, it may be worth investigating.
> > >
> > > My comments are:
> > > * ForeignRecheck is the best location to call RefetchForeignJoinRow
> > >   when scanrelid==0, not ExecScanFetch. Why do you try to add a special
> > >   case for FDW in the common routine?
> > > * It is the FDW's choice where the remote join tuple is kept, even though
> > >   most FDWs will keep it in a private field of ForeignScanState.
> >
> > I think that scanrelid == 0 means that the node in focus is not a
> > scan node in the current executor semantics.
> > EvalPlanQualFetchRowMarks fetches the possibly modified row, then
> > EvalPlanQualNext rechecks the new row. Those are the roles of the
> > respective functions.
> >
> > By this criterion, recheck routines are not the place for
> > refetching. EvalPlanQualFetchRowMarks is.
> >
> I never said the FDW should refetch tuples in the recheck routine.
> All I suggest is that the projection to generate a joined tuple and
> the recheck against the pushed-down qualifiers are the role of the
> FDW driver, because it knows the best strategy to do the job.

I have no objection to rechecking being the FDW's job.

I think you are assuming that all ROW_MARK_COPY base rows are held
in ss_ScanTupleSlot, so simply calling recheckMtd on the slot gives
the function enough data. (EPQState would also be needed to retrieve
them, though.) Right? All the underlying foreign tables should be
marked as ROW_MARK_COPY to call recheckMtd safely. And it is somehow
required to know which column stores which base tuple.

> It looks to me like all of that makes the problem more complicated.
> I have never heard why it is difficult for a "foreign-join" scan node
> to construct a joined tuple using the EPQ slots that are already loaded.
>
> Regardless of early or late locking, the EPQ slots of the base relations
> are already filled, aren't they?

So recheckMtd would need to take EState as a parameter?

> The whole mission of the "foreign-join" scan node is to return a joined
> tuple as if it had been produced by local join logic.
> A local join consumes two tuples, then generates one tuple.
> The "foreign-join" scan node can do the equivalent, even
> under the EPQ recheck context.
>
> So, the job of the FDW driver is...
> Step-1) Fetch tuples from the EPQ slots of the base foreign relations
>         to be joined. Please note that it is just a pointer reference.
> Step-2) Try to join these two (or more) tuples according to the
>         join condition (only the FDW knows it, because it is kept in
>         private).
> Step-3) If the result is valid, the FDW driver makes a projection from
>         these tuples, then returns it.
>
> If you are concerned about re-inventing the code for each FDW, core
> can provide a utility routine that covers 95% of the FDW structure.
>
> I want to keep EvalPlanQualFetchRowMarks on a per-base-relation basis.
> It is a bad choice to consider joins at this point.
>
> > > Apart from the FDW requirement, custom-scan/join needs recheckMtd to
> > > be called when scanrelid==0 to avoid an assertion failure. I hope FDW
> > > has a symmetric structure; however, it is not a mandatory requirement
> > > for me.
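
To check that I read Step-1 to Step-3 quoted above correctly, here is a
minimal standalone C sketch of them. Nothing in it is PostgreSQL API:
MockEPQState, BaseTuple, JoinedTuple and the mock_* functions are all
hypothetical stand-ins, and the equality join condition and the
inner-join semantics are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_RELS 8

/* Hypothetical stand-in for a base-relation tuple (not a PostgreSQL type). */
typedef struct BaseTuple {
    int         key;
    const char *payload;
} BaseTuple;

/* Hypothetical stand-in for the joined tuple the scan node returns. */
typedef struct JoinedTuple {
    int         key;
    const char *outer_payload;
    const char *inner_payload;
} JoinedTuple;

/* Mock of the per-base-relation EPQ slots, indexed by (rti - 1). */
typedef struct MockEPQState {
    const BaseTuple *slots[MOCK_MAX_RELS];
} MockEPQState;

/* Step-1: look up the already-loaded tuple; only a pointer is returned. */
static const BaseTuple *
mock_epq_get_tuple(const MockEPQState *epq, int rti)
{
    assert(rti >= 1 && rti <= MOCK_MAX_RELS);
    return epq->slots[rti - 1];
}

/* Step-2: the join condition the FDW keeps privately (assumed: key equality). */
static bool
mock_join_qual(const BaseTuple *outer, const BaseTuple *inner)
{
    return outer->key == inner->key;
}

/*
 * Steps 1-3 combined. Returns true and fills *out when the join condition
 * still holds for the tuples in the EPQ slots; returns false otherwise.
 * Inner-join semantics are assumed: a missing tuple means no joined row.
 */
static bool
mock_recheck_foreign_join(const MockEPQState *epq,
                          int outer_rti, int inner_rti,
                          JoinedTuple *out)
{
    const BaseTuple *outer = mock_epq_get_tuple(epq, outer_rti);
    const BaseTuple *inner = mock_epq_get_tuple(epq, inner_rti);

    if (outer == NULL || inner == NULL)
        return false;
    if (!mock_join_qual(outer, inner))
        return false;

    /* Step-3: project the joined tuple from the two base tuples. */
    out->key = outer->key;
    out->outer_payload = outer->payload;
    out->inner_payload = inner->payload;
    return true;
}
```

In this sketch the caller passes the EPQ state explicitly, which is the
same question as whether recheckMtd needs EState (or EPQState) as a
parameter in order to reach the base-relation slots.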
> >
> > It wouldn't be needed if EvalPlanQualFetchRowMarks works as
> > expected. Is this wrong?
> >
> Yes, it does not work.
> The expected behavior of EvalPlanQualFetchRowMarks is to load the tuple
> to be rechecked onto the EPQ slot, using heap_fetch or a copied image.
> It is on a per-base-relation basis.

Hmm. What I meant by "works as expected" is that the function stores
the tuple for the "foreign join" scan node. If it doesn't, you're right.

> Who can provide a projection to generate the joined tuple?
> It is the job of the individual plan-state node walked during
> EvalPlanQualNext().

EvalPlanQualNext simply rechecks the tuples stored in epqTuples,
which are designed to be provided by EvalPlanQualFetchRowMarks. I
think that premise shouldn't be broken for convenience...

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center