On Thu, Sep 4, 2014 at 7:57 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> Regarding to the attached three patches:
> [1] custom-path and hook
> It adds register_custom_path_provider() interface for registration of
> custom-path entrypoint. Callbacks are invoked on set_plain_rel_pathlist
> to offer alternative scan path on regular relations.
> I may need to explain the terms in use. I calls the path-node custom-path
> that is the previous step of population of plan-node (like custom-scan
> and potentially custom-join and so on). The node object created by
> CreateCustomPlan() is called custom-plan because it is abstraction for
> all the potential custom-xxx node; custom-scan is the first of all.

I don't think it's a good thing that add_custom_path_type is declared
as void (*)(void *) rather than having a real type.  I suggest we add
the path-creation callback function to CustomPlanMethods instead, like

void (*CreateCustomScanPath)(PlannerInfo *root, RelOptInfo *baserel,
RangeTblEntry *rte);

Then, register_custom_path_provider() can just take CustomPathMethods
* as an argument; and create_customscan_paths can just walk the list
of CustomPlanMethods objects and call CreateCustomScanPath for each
one where that is non-NULL.  This conflates the path generation
mechanism with the type of path getting generated a little bit, but I
don't see any real downside to that.  I don't see a reason why you'd
ever want two different providers to offer the same type of

Don't the changes to src/backend/optimizer/plan/createplan.c belong in patch #2?

> [2] custom-scan node
> It adds custom-scan node support. The custom-scan node is expected to
> generate contents of a particular relation or sub-plan according to its
> custom-logic.
> Custom-scan provider needs to implement callbacks of CustomScanMethods
> and CustomExecMethods. Once a custom-scan node is populated from
> custom-path node, the backend calls back these methods in the planning
> and execution stage.

It looks to me like this patch is full of holdovers from its earlier
life as a more-generic CustomPlan node.  In particular, it contains
numerous defenses against the case where scanrelid != 0.  These are
confusingly written as scanrelid > 0, but I think really they're just
bogus altogether: if this is specifically a CustomScan, not a
CustomPlan, then the relid should always be filled in.  Please
consider what can be simplified here.

The comment in _copyCustomScan looks bogus to me.  I think we should
*require* a static method table.

In create_custom_plan, you if (IsA(custom_plan, CustomScan)) { lots of
stuff; } else elog(ERROR, ...).  I think it would be clearer to write
if (!IsA(custom_plan, CustomScan)) elog(ERROR, ...); lots of stuff;

> Also, regarding to the use-case of multi-exec interface.
> Below is an EXPLAIN output of PG-Strom. It shows the custom GpuHashJoin has
> two sub-plans; GpuScan and MultiHash.
> GpuHashJoin is stacked on the GpuScan. It is a case when these nodes utilize
> multi-exec interface for more efficient data exchange between the nodes.
> GpuScan already keeps a data structure that is suitable to send to/recv from
> GPU devices and constructed on the memory segment being DMA available.
> If we have to form a tuple, pass it via row-by-row interface, then deform it,
> it will become a major performance degradation in this use case.
> postgres=# explain select * from t10 natural join t8 natural join t9 where x 
> < 10;
>                                           QUERY PLAN
> -----------------------------------------------------------------------------------------------
>  Custom (GpuHashJoin)  (cost=10979.56..90064.15 rows=333 width=49)
>    pseudo scan tlist: 1:(t10.bid), 3:(t10.aid), 4:<t10.x>, 2:<t8.data>, 
> 5:[t8.aid], 6:[t9.bid]
>    hash clause 1: ((t8.aid = t10.aid) AND (t9.bid = t10.bid))
>    ->  Custom (GpuScan) on t10  (cost=10000.00..88831.26 rows=3333327 
> width=16)
>          Host References: aid, bid, x
>          Device References: x
>          Device Filter: (x < 10::double precision)
>    ->  Custom (MultiHash)  (cost=464.56..464.56 rows=1000 width=41)
>          hash keys: aid, bid
>          ->  Hash Join  (cost=60.06..464.56 rows=1000 width=41)
>                Hash Cond: (t9.data = t8.data)
>                ->  Index Scan using t9_pkey on t9  (cost=0.29..357.29 
> rows=10000 width=37)
>                ->  Hash  (cost=47.27..47.27 rows=1000 width=37)
>                      ->  Index Scan using t8_pkey on t8  (cost=0.28..47.27 
> rows=1000 width=37)
>  Planning time: 0.810 ms
> (15 rows)

Why can't the Custom(GpuHashJoin) node build the hash table internally
instead of using a separate node?

Also, for this patch we are only considering custom scan.  Custom join
is another patch.  We don't need to provide infrastructure for that
patch in this one.

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to