On Tue, Nov 25, 2014 at 3:44 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> Today, I had a talk with Hanada-san to clarify which portion of them can be
> common and how to implement it. We concluded that both features can share
> most of the infrastructure.
> Let me introduce join replacement by foreign-/custom-scan below.
>
> The overall design intends to inject a foreign-/custom-scan node in place of
> the built-in join logic (based on the estimated cost). From the viewpoint of
> the core backend, it looks like a sub-query scan that contains a join of
> relations internally.
>
> What we need to do is below:
>
> (1) Add a hook in add_paths_to_joinrel()
> It gives extensions (including FDW drivers and custom-scan providers) a chance
> to add alternative paths for a particular join of relations, using
> ForeignScanPath or CustomScanPath, if they can run instead of the built-in ones.
>
> (2) Inform the core backend of the varno/varattno mapping
> One thing we need to pay attention to is that a foreign-/custom-scan node that
> runs instead of the built-in join node must return a mixture of values that
> come from both relations. When an FDW driver fetches a remote record (or a
> record computed by an external computing resource), the most reasonable way is
> to store it in ecxt_scantuple of ExprContext, then kick projection with
> varnodes that reference this slot.
> This needs an infrastructure that tracks the relationship between the original
> varnode and the alternative varno/varattno. We think it should be mapped to
> INDEX_VAR and a virtual attribute number to reference ecxt_scantuple naturally,
> and this infrastructure is quite helpful for both ForeignScan and CustomScan.
> We'd like to add a List *fdw_varmap/*custom_varmap variable to both plan
> nodes.
> It contains a list of the original Var nodes, each mapped to the position
> given by its list index (e.g., the first varnode is varno=INDEX_VAR and
> varattno=1).
>
> (3) Reverse mapping on EXPLAIN
> For EXPLAIN support, the above varnodes on the pseudo relation scan need to be
> resolved. All we need to do is initialize dpns->inner_tlist in
> set_deparse_planstate() according to the above mapping.
>
> (4) Case of scanrelid == 0
> To skip open/close of (foreign) tables, we need a mark that instructs the
> backend to initialize the scan node not according to a table definition, but
> according to the pseudo varnodes list.
> As the earlier custom-scan patch did, scanrelid == 0 is a straightforward mark
> to show that the scan node is not tied to a particular real relation.
> So, we also need to add special-case handling around the foreign-/custom-scan
> code.
>
> We expect the above changes are small enough to implement basic join push-down
> functionality (that does not involve external computing of complicated
> expression nodes), but valuable to support in v9.5.
>
> Please comment on the proposition above.
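
To make the mapping in (2) and its reverse in (3) concrete, here is a rough
toy model in Python. It is only a sketch, not backend code: Var is cut down to
two fields, the helper names build_varmap, to_scan_var and from_scan_var are
hypothetical, and the numeric value given to INDEX_VAR is an assumption that
merely stands in for the backend's special varno.

```python
from typing import List, NamedTuple

# Stand-in for the backend's INDEX_VAR special varno. The numeric value is an
# assumption made for this sketch; only its role as a marker matters here.
INDEX_VAR = 65002


class Var(NamedTuple):
    """Simplified Var node: just the two fields the mapping cares about."""
    varno: int     # range-table index of the relation, or a special varno
    varattno: int  # 1-based attribute number


def build_varmap(referenced: List[Var]) -> List[Var]:
    """Collect each distinct original Var once, in first-seen order.

    The list position (plus one) becomes the virtual attribute number.
    """
    varmap: List[Var] = []
    for var in referenced:
        if var not in varmap:
            varmap.append(var)
    return varmap


def to_scan_var(var: Var, varmap: List[Var]) -> Var:
    """Rewrite an original Var so that it references the scan tuple slot."""
    return Var(INDEX_VAR, varmap.index(var) + 1)


def from_scan_var(var: Var, varmap: List[Var]) -> Var:
    """Reverse mapping, of the kind EXPLAIN needs to print original columns."""
    if var.varno != INDEX_VAR:
        return var  # not a pseudo-scan reference; leave it untouched
    return varmap[var.varattno - 1]
```

For a join whose target list references Var(1, 2), Var(2, 1) and Var(2, 3),
build_varmap() yields those three entries in order, to_scan_var() turns
Var(2, 1) into Var(INDEX_VAR, 2), and from_scan_var() turns it back.
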

I don't really have any technical comments on this design right at the
moment, but I think it's an important area where PostgreSQL needs to
make some progress sooner rather than later, so I hope that we can get
something committed in time for 9.5.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company