Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Hi Ashutosh,

Sorry for leaving the thread.

2015-07-20 16:09 GMT+09:00 Ashutosh Bapat ashutosh.ba...@enterprisedb.com:
> In find_user_mapping(), if the first cache search returns a valid tuple,
> it is checked twice for validity, unnecessarily. Instead, if the first
> search returns a valid tuple, it should be returned immediately. I see
> that this code was copied from GetUserMapping(), which had the same
> problem as well.

Oops. I changed find_user_mapping() to exit immediately when a valid
cached tuple is found.

> In build_join_rel(), we check whether the user mappings of the two
> joining relations are the same. If the user mappings are the same,
> doesn't that imply that the foreign server and local user are the same
> too?

Yes, the validity of umid is identical to that of serverid, so we can
remove the check on serverid and save some cycles. One idea is to put an
Assert on serverid inside the if-statement block.

> Rest of the patch looks fine.

Thanks

-- 
Shigeru HANADA

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [idea] more aggressive join pushdown on postgres_fdw
2015/06/05 6:43, Robert Haas robertmh...@gmail.com wrote:
> On Sat, May 30, 2015 at 9:03 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
> Neat idea. This ties into something I've thought about and mentioned
> before: what if the innerrel is local, but there's a replicated copy on
> the remote server? Perhaps both cases are worth thinking about at some
> point.

Interesting, but I'm not sure I understood the situation. Which kind of
replication method do you mean here? I guess you assume some kind of
per-table replication, such as Slony-I, or, in the postgres_fdw case,
materialized views over postgres_fdw or dblink. If this assumption is
correct, we need a mapping between a local ordinary table and a foreign
table which points to the remote replicated table.

-- 
Shigeru HANADA shigeru.han...@gmail.com
Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Thanks for your comments.

2015-05-21 23:11 GMT+09:00 Robert Haas robertmh...@gmail.com:
> On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada shigeru.han...@gmail.com wrote:
>> d) All relations must have the same effective user id
>>
>> This check is done with the userid stored in PgFdwRelationInfo, which
>> is valid only when the underlying relations have the same effective
>> user id. Here "effective user id" means the id of the user executing
>> the query, or the owner of the view when the foreign table is accessed
>> via a view. Tom suggested that it is necessary to check that the user
>> mapping matches, for security reasons, and now I think that is better
>> than checking the effective user as current postgres_fdw does.
>
> So, should this be a separate patch?

Yes, it should be an additional patch on top of the Custom/Foreign join
API which is already committed. The patch would contain these changes:

* Add a new field usermappingid to RelOptInfo; it is InvalidOid for
  non-foreign tables.
* Obtain the oid of the user mapping for a foreign table, and store it
  in the RelOptInfo (we already have serverid in RelOptInfo, so userid
  would be enough to identify the user mapping, though).
* Propagate usermappingid up to the join relation when the outer and
  inner relations have the same valid value.
* Check that the user mappings match before calling
  GetForeignJoinPaths, rather than comparing serverid.

> One of my concerns about this patch is that it's got a lot of stuff in
> it that isn't obviously related to the patch. Anything that is a
> separate change should be separated out into its own patch. Perhaps you
> can post a set of patches that apply one on top of the next, with the
> changes for each one clearly separated.

IIUC, each patch should not break compilation, and should contain only
one complete logical change which can't be separated into pieces. I
think the whole of the patch is necessary to implement:

e) Each source relation must not have any local filter

Evaluating conditions of a join source table potentially produces
different results in OUTER join cases.
>> This can be relaxed for the cases when the join is INNER and the local
>> filters don't contain any volatile function/operator, but that is left
>> as a future enhancement.
>
> I think this restriction is a sign that you're not really doing this
> right. Consider:
>
> (1) SELECT * FROM a LEFT JOIN b ON a.x = b.x AND b.x = 3;
> (2) SELECT * FROM a LEFT JOIN b ON a.x = b.x WHERE b.x = 3;
>
> If you push down the scan of b, you can include the b.x = 3 qual in
> case (1) but not in case (2). If you push down the join, you can
> include the qual in either case, but you must attach it in the same
> place where it was before.

>> One big change about deparsing base relations is aliasing. This patch
>> adds column aliases to the SELECT clause even when the original query
>> is a simple single-table SELECT.
>>
>> fdw=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM pgbench_branches b;
>>                               QUERY PLAN
>> -----------------------------------------------------------------------
>>  Foreign Scan on public.pgbench_branches b
>>    Output: bid, bbalance, filler
>>    Remote SQL: SELECT bid a9, bbalance a10, filler a11 FROM public.pgbench_branches
>> (3 rows)
>>
>> As you see, every column has an alias in the format a%d, with an index
>> value derived from pg_attribute.attnum. The index value is attnum + 8,
>> and the magic number 8 comes from FirstLowInvalidHeapAttributeNumber,
>> an adjustment that makes the attribute numbers of system attributes
>> positive.
>
> Yeah. I'm not sure this is a good idea. The column labels are pointless
> at the outermost level. I'm not sure it isn't a good idea, either, but
> I have some doubts.

I fixed the patch to not add column aliases to remote queries for a
single table. This change also slightly reduces the amount of difference
from the master branch.

>> One tricky thing is the pseudo projection done by deparseProjectionSql
>> for whole-row references: it puts the query string in a FROM subquery
>> and adds the whole-row reference in the outer SELECT clause. This was
>> done by ExecProjection in 9.4 and older, but we need to do it in
>> postgres_fdw because ExecProjection is not called any more.
>
> What commit changed this?
No commit changed this behavior, as Kaigai-san says. If you still have
comments, please refer to my response to Kaigai-san.

> Thanks for your work on this. Although I know progress has been slow, I
> think this work is really important to the project.

I agree. I'll take more time for this work.

-- 
Shigeru HANADA
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015-05-15 8:43 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
> Regarding FDWs, as Hanada-san mentioned, I'm uncertain whether a
> similar feature is also needed, because the join-pushdown feature scans
> the result set of remotely joined relations, and thus has no need for
> local child Path nodes. So, I put this custom_children list on the
> Custom structure only.

AFAIS most FDWs won't need child paths to process their external data.
The most plausible case is an FDW that uses the output of a ForeignScan
plan node handled by the FDW itself, but such work should be done by
another CSP (or at least via the CSP interface).

-- 
Shigeru HANADA
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015-05-12 10:24 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
> option-2) Tom's suggestion. Add a new list field of Path nodes on
> CustomPath, then create_customscan_plan() will call the static
> create_plan_recurse() function to construct child plan nodes. Probably,
> the attached patch will be an image of this enhancement, but not tested
> yet, of course. Once we adopt this approach, I'll adjust my PG-Strom
> code towards the new interface within 2 weeks and report problems if
> any.

+1. This design achieves the functionality required for custom join in
Kaigai-san's use case, instantiating sub-plans of a CustomScan plan
recursively, without exposing create_plan_recurse. A CSP can use the
index number to distinguish information, like FDWs do with fdw_private.

IMO it isn't necessary to apply the change to ForeignPath/ForeignScan;
an FDW would have its own way to keep and use sub-paths as private
information. In fact, I once wrote postgres_fdw to use
create_plan_recurse to generate the SQL statements for the inner/outer
relations to be joined, but as of now SQL construction for join pushdown
is accomplished by calling the private function deparseSelectSql
recursively at the top of a join tree. Some FDW might hope to use
sub-plan generation, but we don't have any concrete use case as of now.

-- 
Shigeru HANADA
Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Thanks for the comments.

2015/05/01 22:35, Robert Haas robertmh...@gmail.com wrote:
> On Mon, Apr 27, 2015 at 5:07 AM, Shigeru Hanada shigeru.han...@gmail.com wrote:
>> 2015-04-27 11:00 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
>>> Hanada-san, could you adjust your postgres_fdw patch according to the
>>> above new (previous?) definition.
>>
>> The attached v14 patch is the revised version for your v13 patch. It
>> also contains changes for Ashutosh's comments.
>
> We should probably move this discussion to a new thread now that the
> other patch is committed. Changing subject line accordingly.
>
> Generally, there's an awful lot of changes in this patch - it is over
> 2000 insertions and more than 450 deletions - and it's not awfully
> obvious why all of those changes are there. I think this patch needs a
> detailed README to accompany it explaining what the various changes in
> the patch are and why those things got changed; or maybe there is a way
> to break it up into multiple patches so that we can take a more
> incremental approach. I am really suspicious of the amount of wholesale
> reorganization of code that this patch is doing. It's really hard to
> validate that a reorganization like that is necessary, or that it's
> correct, and it's gonna make back-patching noticeably harder in the
> future. If we really need this much code churn it needs careful
> justification; if we don't, we shouldn't do it.

I agree. I'll write a detailed description for the patch and repost a
new one rebased onto current HEAD. I'm sorry, but it will take a day or
so...

> +SET enable_mergejoin = off; -- planner choose MergeJoin even it has higher costs, so disable it for testing.
>
> This seems awfully strange. Why would the planner choose a plan if it
> had a higher cost?

I thought so too, but I couldn't reproduce that situation now. I'll
investigate it more. If the issue has gone, I'll remove that SET
statement for straightforward tests.
> -     * If the table or the server is configured to use remote estimates,
> -     * identify which user to do remote access as during planning.  This
> +     * Identify which user to do remote access as during planning.  This
>       * should match what ExecCheckRTEPerms() does.  If we fail due to lack of
>       * permissions, the query would have failed at runtime anyway.
>       */
> -    if (fpinfo->use_remote_estimate)
> -    {
> -        RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
> -        Oid            userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
> -
> -        fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
> -    }
> -    else
> -        fpinfo->user = NULL;
> +    rte = planner_rt_fetch(baserel->relid, root);
> +    fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
>
> So, wait a minute, remote estimates aren't optional any more?

No, it seems to have been removed accidentally. I'll check the reason
again, but I'll revert the change unless I find any problem.

-- 
Shigeru HANADA shigeru.han...@gmail.com
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
, but your suggestion makes it look like this:

fdw=# explain (verbose) select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid;
WARNING:  restrictlist: ({RESTRICTINFO :clause {OPEXPR :opno 96 :opfuncid 65 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 85} {VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno 2 :location 77}) :location -1} :is_pushed_down true :outerjoin_delayed false :can_join true :pseudoconstant false :clause_relids (b 1 2) :required_relids (b 1 2) :outer_relids (b) :nullable_relids (b) :left_relids (b 1) :right_relids (b 2) :orclause <> :norm_selec 0.2000 :outer_selec -1. :mergeopfamilies (o 1976) :left_em {EQUIVALENCEMEMBER :em_expr {VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 85} :em_relids (b 1) :em_nullable_relids (b) :em_is_const false :em_is_child false :em_datatype 23} :right_em {EQUIVALENCEMEMBER :em_expr {VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno 2 :location 77} :em_relids (b 2) :em_nullable_relids (b) :em_is_const false :em_is_child false :em_datatype 23} :outer_is_left false :hashjoinoperator 96})
WARNING:  joinclauses:
WARNING:  otherclauses: ({OPEXPR :opno 96 :opfuncid 65 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 85} {VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno 2 :location 77}) :location -1})
                                  QUERY PLAN
------------------------------------------------------------------------------
 Foreign Scan  (cost=100.00..101.00 rows=50 width=716)
   Output: b.bid, b.bbalance, b.filler, t.tid, t.bid, t.tbalance, t.filler
   Relations: (public.pgbench_branches b) CROSS JOIN (public.pgbench_tellers t)
   Remote SQL: SELECT l.a1, l.a2, l.a3, r.a1, r.a2, r.a3, r.a4 FROM (SELECT l.a9, l.a10, l.a11 FROM (SELECT bid a9, bbalance a10, filler a11 FROM public.pgbench_branches) l) l (a1, a2, a3) CROSS JOIN (SELECT r.a9, r.a10, r.a11, r.a12 FROM (SELECT tid a9, bid a10, tbalance a11, filler a12 FROM public.pgbench_tellers) r) r (a1, a2, a3, a4) WHERE ((l.a1 = r.a2))
(4 rows)

Thoughts?

Regards,

-- 
Shigeru HANADA shigeru.han...@gmail.com
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
* GetForeignJoinPaths in make_join_rel

This FDW API is intended to provide a chance to add ForeignPaths for a
join relation. It is called only once per join relation, so the FDW
should consider the reversed combination if that is meaningful in its
own mechanism. Note that it is called only when the join relation was
*NOT* found in the PlannerInfo, to avoid redundant calls. The parameters
seem sufficient for postgres_fdw to process an N-way join on the remote
side, pushing down join conditions and remote filters.

* Propagate FDW information through bottom-up planning

An FDW can, of course, handle a join which uses foreign tables managed
by that FDW. We obtain the FDW routine entry when planning a scan
against a foreign table, so propagating that information up to the join
phase helps the core planner check whether all sources are managed by
one FDW. It also avoids repeated catalog accesses.

* Make create_plan_recurse non-static

This is for CSPs and FDWs which want the underlying plan nodes of a
join. For example, a CSP might want the outer/inner plan nodes as input
sources of a join.

* Treat scanrelid == 0 as pseudo scan

A foreign/custom join is represented by a scan against a pseudo
relation, i.e. the result of a join. Usually a Scan has a valid
scanrelid, the oid of the relation being scanned, and many functions
assume that it is always valid. The patch adds additional code paths
that treat scanrelid == 0 as a custom/foreign join scan.

* Pseudo scan target list support

CustomScan and ForeignScan have csp_ps_tlist and fdw_ps_tlist
respectively, for column reference tracking. A scan generated for a
custom/foreign join has columns from multiple relations in its target
list, i.e. its output columns. Ordinary scans have all valid columns of
the relation as output, so references to them can be resolved easily,
but we need an additional mechanism to determine where a reference in
the target list of a custom/foreign scan comes from.
This is very similar to what IndexOnlyScan does, so we reuse INDEX_VAR
as the mark of an indirect reference to another relation's Var. For this
mechanism, set_plan_refs is changed to fix Vars in the ps_tlist of
CustomScan and ForeignScan. For this change, a new Bitmapset function
bms_shift_members is added. set_deparse_planstate is also changed to
pass the ps_tlist as the namespace for deparsing.

These changes seem reasonable, so I mark this patch as "ready for
committer" to hear committers' thoughts.

Regards,

-- 
Shigeru HANADA shigeru.han...@gmail.com
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
KaiGai-san,

2015/04/14 14:04, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
>> * Fix typos
>>
>> Please review the v11 patch, and mark it as "ready for committer" if
>> it's ok.
>
> It's OK for me, and it should be reviewed by other people to get it
> committed.

Thanks!

>> In addition to the essential features, I tried to implement relation
>> listing in EXPLAIN output. The attached explain_forein_join.patch adds
>> the capability to show the join combination of a ForeignScan in
>> EXPLAIN output as an additional item, "Relations". I thought that
>> using an array to list relations is a good way too, but I chose one
>> string value because users would like to know the order and type of
>> joins too.
>
> A bit different from my expectation... I expected to display the names
> of the local foreign tables (and their aliases), not the remote ones,
> because all the local join logic displays local foreign table names.
> It is easy to adjust, isn't it? Probably, all you need to do is put the
> local relation name in the text buffer (at deparseSelectSql) instead of
> the deparsed remote relation.

Oops, that's right. Attached is the revised version. I chose the fully
qualified name, schema.relname [alias], for the output. It would waste
some cycles during planning if that were not for EXPLAIN, but it seems
difficult to get the list of relation names in the ExplainForeignScan()
phase, because the planning information is gone by that time.

-- 
Shigeru HANADA shigeru.han...@gmail.com

explain_foreign_join_v2.patch
Description: Binary data
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015/03/25 12:59, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
>>> At this moment, I'm not 100% certain about its logic. Especially, I
>>> didn't test SEMI- and ANTI-join cases yet. However, time is money - I
>>> want people to check the overall design first, rather than doing
>>> detailed debugging. Please tell me if I misunderstood the logic to
>>> break down join relations.
>>
>> With your patch applied, the regression tests for "updatable view"
>> failed. regression.diff contains some errors like this:
>>
>> ! ERROR:  could not find RelOptInfo for given relids
>>
>> Could you check that?
>
> It is a bug around the logic to find the two RelOptInfos that can
> construct another RelOptInfo for the joinrel. Even though I'm now
> working to correct the logic, it is not obvious how to identify two
> relids that satisfy joinrel->relids. (Yep, the law of entropy
> enhancement...)

IIUC, this problem affects only non-INNER JOINs, because relations
joined with only INNER JOIN can be treated in arbitrary order. But
supporting OUTER JOINs would be necessary even for the first cut.

> On the other hand, we may have a solution that does not need a
> complicated reconstruction process. The original concern was that an
> FDW driver may add paths that replace an entire join subtree with a
> foreign scan on a remote join multiple times, repeatedly, even though
> these paths would be identical. If we put a hook for FDW/CSP at the
> bottom of build_join_rel(), we may be able to solve the problem in a
> more straightforward and simple way. Because build_join_rel() first
> looks for a cache in root->join_rel_hash and returns immediately if the
> required joinrelids already has its RelOptInfo, the bottom of this
> function is never reached twice for a particular set of joinrelids.
> Once the FDW/CSP constructs a path that replaces the entire join
> subtree, added to the joinrel just after its construction, it will be
> kept unless cheaper built-in paths are added (if any exist). This idea
> has one other positive side effect: we expect a remote join to be
> cheaper than a local join of two remote scans in most cases.
> Once a much cheaper path is added prior to local join consideration,
> add_path_precheck() cuts off path consideration earlier. Please
> comment.

Or the bottom of make_join_rel(). IMO build_join_rel() is responsible
for just building (or finding in the cache) a RelOptInfo for the given
relids. After that, make_join_rel() calls add_paths_to_joinrel() with
the appropriate arguments per join type to generate the actual Paths
that implement the join. make_join_rel() is called only once for a
particular relid combination, and it has the SpecialJoinInfo and
restrictlist (the conditions specified in JOIN-ON and WHERE), so it
seems promising for the FDW case. Though I'm not sure it also fits
custom join providers' requirements.

-- 
Shigeru HANADA
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015/03/25 19:09, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
>> On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA shigeru.han...@gmail.com wrote:
>>> Or the bottom of make_join_rel(). IMO build_join_rel() is responsible
>>> for just building (or finding in the cache) a RelOptInfo for the
>>> given relids. After that, make_join_rel() calls
>>> add_paths_to_joinrel() with the appropriate arguments per join type
>>> to generate the actual Paths that implement the join.
>>> make_join_rel() is called only once for a particular relid
>>> combination, and it has the SpecialJoinInfo and restrictlist
>>> (conditions specified in JOIN-ON and WHERE), so it seems promising
>>> for the FDW case.
>>
>> I like that idea, but I think we will have a complex hook signature;
>> it won't remain as simple as hook(root, joinrel).
>
> In this case, GetForeignJoinPaths() will take root, joinrel, rel1,
> rel2, sjinfo and restrictlist. It is not too simple, but not a
> complicated signature. Even if we reconstruct rel1 and rel2 using
> sjinfo, we also need to compute the restrictlist using
> build_joinrel_restrictlist() again, which is a static function in
> relnode.c. So, I don't think either of them has a definitive advantage
> from the standpoint of simplicity.
The bottom of make_join_rel() seems good from the viewpoint of available
information, but it is called multiple times for join combinations which
are essentially identical, in an INNER JOIN case like this:

fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
INFO:  postgresGetForeignJoinPaths() 1x2
INFO:  postgresGetForeignJoinPaths() 1x4
INFO:  postgresGetForeignJoinPaths() 2x4
INFO:  standard_join_search() old hook point
INFO:  standard_join_search() old hook point
INFO:  standard_join_search() old hook point
INFO:  postgresGetForeignJoinPaths() 0x4
INFO:  postgresGetForeignJoinPaths() 0x2
INFO:  postgresGetForeignJoinPaths() 0x1
INFO:  standard_join_search() old hook point
                        QUERY PLAN
----------------------------------------------------------
 Foreign Scan  (cost=100.00..102.11 rows=211 width=1068)
(1 row)

Here I've put a probe point at the beginning of the GetForeignJoinPaths
handler, and another just before the set_cheapest() call in
standard_join_search() as the "old hook point". In this example 1, 2,
and 4 are base relations, and at join level 3 the planner calls
GetForeignJoinPaths() three times, for the combinations:

1) (1x2)x4
2) (1x4)x2
3) (2x4)x1

Tom's suggestion aims at providing a chance to consider join pushdown at
a more abstract level, IIUC, so it would be good to call the handler
only once in that case, for the flattened combination (1x2x4). Hmm, how
about skipping the call to the handler (or hook) if the joinrel was
found by find_join_rel()? At least that suppresses the redundant calls
for different join orders, and the handler can determine whether the
combination can be flattened by checking that every RelOptInfo with
RELOPT_JOINREL under the joinrel has JOIN_INNER as its jointype.

-- 
Shigeru HANADA
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015/03/25 18:53, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote:
>> On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA shigeru.han...@gmail.com wrote:
>>> Or the bottom of make_join_rel(). IMO build_join_rel() is responsible
>>> for just building (or finding in the cache) a RelOptInfo for the
>>> given relids. After that, make_join_rel() calls
>>> add_paths_to_joinrel() with the appropriate arguments per join type
>>> to generate the actual Paths that implement the join.
>>> make_join_rel() is called only once for a particular relid
>>> combination, and it has the SpecialJoinInfo and restrictlist
>>> (conditions specified in JOIN-ON and WHERE), so it seems promising
>>> for the FDW case.
>
> I like that idea, but I think we will have a complex hook signature; it
> won't remain as simple as hook(root, joinrel).

The signature of the hook (or the FDW API handler) would be like this:

typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
                                              RelOptInfo *joinrel,
                                              RelOptInfo *outerrel,
                                              RelOptInfo *innerrel,
                                              JoinType jointype,
                                              SpecialJoinInfo *sjinfo,
                                              List *restrictlist);

This is very similar to add_paths_to_joinrel(), but lacks semifactors
and extra_lateral_rels. semifactors can be obtained with
compute_semi_anti_join_factors(), and extra_lateral_rels can be
constructed from root->placeholder_list, as add_paths_to_joinrel() does.

From the viewpoint of postgres_fdw, jointype and restrictlist are
necessary to generate the SELECT statement, so a signature of
hook(root, joinrel) would require redoing most of the work done in
make_join_rel. sjinfo will be necessary for supporting SEMI/ANTI joins,
but currently they are not in the scope of postgres_fdw. I guess that
other FDWs require at least jointype and restrictlist as well.

-- 
Shigeru HANADA
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015/03/25 19:47, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
> The reason why the FDW handler was called multiple times in your
> example is that your modified make_join_rel() does not check whether
> build_join_rel() actually built a new RelOptInfo or just returned a
> cache reference, doesn't it?

Yep. After that change the call counts look like this:

fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
INFO:  postgresGetForeignJoinPaths() 1x2
INFO:  postgresGetForeignJoinPaths() 1x4
INFO:  postgresGetForeignJoinPaths() 2x4
INFO:  standard_join_search() old hook point
INFO:  standard_join_search() old hook point
INFO:  standard_join_search() old hook point
INFO:  postgresGetForeignJoinPaths() 0x4
INFO:  standard_join_search() old hook point
                        QUERY PLAN
----------------------------------------------------------
 Foreign Scan  (cost=100.00..102.11 rows=211 width=1068)
(1 row)

fdw=#

> If so, I'm inclined toward your proposition. A new bool *found argument
> of build_join_rel() would reduce the number of FDW handler calls, while
> keeping reasonable information to build the remote-join query.

Another idea is to pass "found" as a parameter to the FDW handler, and
let the FDW decide whether to skip. Some FDWs (and some CSPs?) might
want to be conscious of the join combination.

-- 
Shigeru HANADA
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015/03/26 10:51, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
> The attached patch adds the GetForeignJoinPaths call in make_join_rel()
> only when 'joinrel' is actually built and both of the child relations
> are managed by the same FDW driver, prior to any other built-in join
> paths. I adjusted the hook definition a little bit, because jointype
> can be reproduced using SpecialJoinInfo. Right?

OK.

> Probably, it will solve the original concern about multiple calls of
> the FDW handler in the case when it tries to replace an entire join
> subtree with a foreign scan on the result of a remote join query. How
> about your opinion?

Seems fine. I've fixed my postgres_fdw code to fit the new version, and
am working on handling a whole join tree. It would be difficult in the
9.5 cycle, but a hook point where we can handle the whole joinrel might
allow us to optimize a query which accesses multiple parent tables, each
inherited by foreign tables and partitioned with an identical join key,
by building a path tree which joins the sharded tables first and then
unions those results.

-- 
Shigeru HANADA shigeru.han...@gmail.com
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015/03/23 9:12, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
> Sorry for my late response. It was not easy to write code during a
> business trip. The attached patch adds a hook for FDW/CSP to replace an
> entire join subtree with a foreign/custom scan, according to the
> discussion upthread. The GetForeignJoinPaths handler of FDW is
> simplified as follows:
>
> typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
>                                               RelOptInfo *joinrel);

It's not a critical issue, but I'd like to propose renaming
add_joinrel_extra_paths() to add_extra_paths_to_joinrel(), because the
latter makes it clearer that it does extra work in addition to
add_paths_to_joinrel().

> It takes the PlannerInfo and the RelOptInfo of the join relation to be
> replaced, if available. RelOptInfo contains the 'relids' bitmap, so the
> FDW driver will be able to know the relations involved and construct a
> remote join query. However, it is not obvious from the RelOptInfo how
> the relations are joined. The function below will help implement an FDW
> driver that supports remote join:
>
> List *
> get_joinrel_broken_down(PlannerInfo *root, RelOptInfo *joinrel,
>                         SpecialJoinInfo **p_sjinfo)
>
> It returns a list of the RelOptInfos involved in the join represented
> by 'joinrel', and also sets a SpecialJoinInfo through the third
> argument if the join is not an inner join. In the case of an inner
> join, it returns multiple (two or more) relations to be inner-joined.
> Otherwise, it returns two relations and a valid SpecialJoinInfo. As far
> as I tested, it works fine for SEMI and ANTI.
>
> # I want a dump function for Bitmapset for debugging, as Node has
> # nodeToString()...
>
> At this moment, I'm not 100% certain about its logic. Especially, I
> didn't test SEMI- and ANTI-join cases yet. However, time is money - I
> want people to check the overall design first, rather than doing
> detailed debugging. Please tell me if I misunderstood the logic to
> break down join relations.

With your patch applied, the regression tests for "updatable view"
failed.
regression.diff contains some errors like this: ! ERROR: could not find RelOptInfo for given relids Could you check that? — Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
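To illustrate the decomposition get_joinrel_broken_down() is described as performing, here is a hypothetical Python analogy (the real function works on PlannerInfo/RelOptInfo in C; all names and data structures below are illustrative, not the PostgreSQL API):

```python
# Hypothetical analogy of get_joinrel_broken_down(); not the PostgreSQL C API.
def break_down_joinrel(joinrel_relids, special_join_infos):
    """Return (components, sjinfo). For an outer/semi/anti join, components
    are the two sides and sjinfo is the matching SpecialJoinInfo; for a
    plain inner join, components are the individual base relations."""
    for sjinfo in special_join_infos:
        lhs = sjinfo["min_lefthand"]
        rhs = sjinfo["min_righthand"]
        if lhs | rhs == joinrel_relids:
            return [lhs, rhs], sjinfo
    # No special join covers the whole relid set: plain inner join.
    return [frozenset([r]) for r in sorted(joinrel_relids)], None
```

For example, a LEFT JOIN over relids {1, 2} comes back as its two sides plus the SpecialJoinInfo, while an inner join over {1, 2, 3} comes back as three base relations and no SpecialJoinInfo.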
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015-03-14 7:18 GMT+09:00 Robert Haas robertmh...@gmail.com:

I think the foreign data wrapper join pushdown case, which also aims to substitute a scan for a join, is interesting to think about, even though it's likely to be handled by a new FDW method instead of via the hook.

Where should the FDW method get called from? Currently, the FDW method in KaiGai's patch is GetForeignJoinPaths, and that gets called from add_paths_to_joinrel(). The patch at http://www.postgresql.org/message-id/CAEZqfEfy7p=urpwn-q-nngzb8kwhbfqf82ysb9ztfzg7zn6...@mail.gmail.com uses that to implement join pushdown in postgres_fdw; if you have A JOIN B JOIN C all on server X, we'll notice that the join with A and B can be turned into a foreign scan on A JOIN B, and similarly for A-C and B-C. Then, if it turns out that the cheapest path for A-B is the foreign join, and the cheapest path for C is a foreign scan, we'll arrive at the idea of a foreign scan on A-B-C, and we'll realize the same thing in each of the other combinations as well. So, eventually the foreign join gets pushed down.

From the viewpoint of postgres_fdw, the incremental approach seems a natural way, although postgres_fdw should consider paths in the pathlist in addition to the cheapest one, as you mentioned in another thread. This approach allows the FDW to use the SQL statements generated for underlying scans as parts of the FROM clause, as postgres_fdw does in the join push-down patch.

But there's another possible approach: suppose that join_search_one_level, after considering left-sided and right-sided joins and after considering bushy joins, checks whether every relation it's got is from the same foreign server, and if so, asks that foreign server whether it would like to contribute any paths. Would that be better or worse?
A disadvantage is that if you've got something like A LEFT JOIN B LEFT JOIN C LEFT JOIN D LEFT JOIN E LEFT JOIN F LEFT JOIN G LEFT JOIN H LEFT JOIN I but none of the joins can be pushed down (say, each join clause calls a non-pushdown-safe function) you'll end up examining a pile of joinrels - at every level of the join tree - and individually rejecting each one. With the build-it-up-incrementally approach, you'll figure that all out at level 2, and then after that there's nothing to do but give up quickly.

On the other hand, I'm afraid the incremental approach might miss a trick: consider small LEFT JOIN (big INNER JOIN huge ON big.x = huge.x) ON small.y = big.y AND small.z = huge.z, where all three are foreign tables on the same server. If the output of the big/huge join is big, none of those paths are going to survive at level 2, but the overall join size might be very small, so we surely want a chance to recover at level 3. (We discussed test cases of this form quite a bit in the context of e2fa76d80ba571d4de8992de6386536867250474.)

Interesting, I overlooked that pattern. As you pointed out, a join between big foreign tables might be dominated, perhaps by a MergeJoin path. Leaving a dominated ForeignPath in the pathlist for further optimization later (at a higher join level) is an idea, but it would make planning time longer (and use more cycles and memory).

Tom's idea sounds good for saving path b), but I worry about whether an FDW can get enough information at that timing, just before set_cheapest. It would not be a good interface if each FDW needs to copy much code from joinrel.c...
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
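The incremental dynamic-programming search Robert describes can be sketched with a toy model (entirely illustrative; real planning happens in make_join_rel() and friends, and the costs here are made up): at each level we keep the cheapest path per relation set, and a pushed-down foreign-join path is considered whenever every base relation in the set lives on the same foreign server.

```python
# Toy model of the build-it-up-incrementally join search discussed above.
# Names, costs, and the server map are assumptions for illustration only.
import itertools

def cheapest_paths(rels, server_of, remote_cost, local_join_cost):
    """rels: base relation names; returns {frozenset: (path_kind, cost)}."""
    best = {frozenset([r]): ("foreign_scan", remote_cost(frozenset([r])))
            for r in rels}
    names = list(rels)
    for level in range(2, len(names) + 1):
        for combo in itertools.combinations(names, level):
            relset = frozenset(combo)
            candidates = []
            # Local joins built from the cheapest sub-paths (all 2-way splits).
            for k in range(1, level):
                for left in itertools.combinations(sorted(combo), k):
                    l = frozenset(left)
                    r = relset - l
                    candidates.append(
                        ("local_join", best[l][1] + best[r][1] + local_join_cost))
            # A pushed-down join survives if everything is on one server.
            if len({server_of[n] for n in combo}) == 1:
                candidates.append(("foreign_join", remote_cost(relset)))
            best[relset] = min(candidates, key=lambda c: c[1])
    return best
```

With three tables on one server and a cheap remote cost function, the level-2 winner is the pushed-down A-B join, and its survival is exactly what lets the level-3 foreign join on A-B-C be discovered — the "miss a trick" risk arises when the level-2 foreign path is dominated and discarded.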
Re: [HACKERS] Join push-down support for foreign tables
Here is the v5 patch of join push-down support for foreign tables. Changes since v4:
- Separate remote conditions into ON and WHERE, per Ashutosh.
- Add regression test cases for foreign joins.
- Don't skip reversed relation combinations in OUTER join cases.

I'm now working on two issues from Kaigai-san and Ashutosh: whole-row reference handling and use of get_joinrel_parampathinfo().

2015-03-05 22:00 GMT+09:00 Shigeru Hanada shigeru.han...@gmail.com:

Hi Ashutosh, thanks for the review.

2015-03-04 19:17 GMT+09:00 Ashutosh Bapat ashutosh.ba...@enterprisedb.com:

In create_foreignscan_path() we have lines like -

1587 pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
1588 required_outer);

Now that the same function is being used for creating foreign scan paths for joins, we should be calling get_joinrel_parampathinfo() on a join rel and get_baserel_parampathinfo() on a base rel.

Got it. Please let me check the difference.

The patch seems to handle all the restriction clauses in the same way. There are two kinds of restriction clauses - a. join quals (specified using the ON clause; the optimizer might move them to the other class if that doesn't affect correctness) and b. quals on the join relation (specified in the WHERE clause; the optimizer might move them to the other class if that doesn't affect correctness). The quals in a are applied while the join is being computed, whereas those in b are applied after the join is computed. For example,

postgres=# select * from lt;
 val | val2
-----+------
   1 |    2
   1 |    3
(2 rows)

postgres=# select * from lt2;
 val | val2
-----+------
   1 |    2
(1 row)

postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
 val | val2 | val | val2
-----+------+-----+------
   1 |    2 |   1 |    2
   1 |    3 |     |
(2 rows)

postgres=# select * from lt left join lt2 on (true) where (lt.val2 = lt2.val2);
 val | val2 | val | val2
-----+------+-----+------
   1 |    2 |   1 |    2
(1 row)

The difference between these two kinds is evident in case of outer joins; for an inner join the optimizer puts all of them in class b.
The remote query sent to the foreign server has all of those in the ON clause. Consider foreign tables ft1 and ft2 pointing to local tables on the same server.

postgres=# \d ft1
        Foreign table "public.ft1"
 Column |  Type   | Modifiers | FDW Options
--------+---------+-----------+-------------
 val    | integer |           |
 val2   | integer |           |
Server: loopback
FDW Options: (table_name 'lt')

postgres=# \d ft2
        Foreign table "public.ft2"
 Column |  Type   | Modifiers | FDW Options
--------+---------+-----------+-------------
 val    | integer |           |
 val2   | integer |           |
Server: loopback
FDW Options: (table_name 'lt2')

postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Foreign Scan  (cost=100.00..125.60 rows=2560 width=16)
   Output: val, val2, val, val2
   Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2 FROM public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1) ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL)) AND ((r.a_1 = l.a_1)))
(3 rows)

The result is then wrong

postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
 val | val2 | val | val2
-----+------+-----+------
   1 |    2 |     |
   1 |    3 |     |
(2 rows)

which should match the result obtained by substituting local tables for foreign ones

postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where lt.val + lt2.val > lt.val2 or lt2.val is null;
 val | val2 | val | val2
-----+------+-----+------
   1 |    3 |     |
(1 row)

Once we start distinguishing the two kinds of quals, there is some optimization possible. For pushing down a join it's essential that all the quals in a are safe to be pushed down. But a join can be pushed down even if the quals in b are not safe to be pushed down. But the more clauses are pushed down to the foreign server, the fewer rows are fetched from the foreign server. In postgresGetForeignJoinPath, instead of checking that all the restriction clauses are safe to be pushed down, we need to check only those which are join quals (class a).
The restrictlist argument of GetForeignJoinPaths contains both kinds of conditions mixed together, so I added a call to extract_actual_join_clauses() to separate it into two lists, join quals and other clauses. This is similar to what create_nestloop_plan and its siblings do.

The following EXPLAIN output seems confusing; ft1 and ft2 both point to the same table lt:
Re: [HACKERS] Join push-down support for foreign tables
public.lt) l (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1) ON (((r.a_0 + r.a_1) = l.a_0))

The Output line just specifies val + val2; it doesn't tell where those val and val2 come from, nor is it evident from the rest of the context.

Actually, val and val2 come from public.lt on the r side, but as you say, it's too difficult to tell that from the EXPLAIN output. Do you have any ideas for making the Output item more readable?
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
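The ON-versus-WHERE distinction Ashutosh raised can be simulated in a few lines (illustrative Python only; the table contents mirror the lt/lt2 example from the review): for LEFT JOIN, the same qual gives different results as an ON-clause join qual (class "a") versus a WHERE-clause filter (class "b"), which is exactly why an FDW must not fold WHERE-class quals into the remote query's ON clause.

```python
# Tiny LEFT JOIN simulation showing join quals (ON) vs filter quals (WHERE).
def left_join(lt, lt2, on_qual, where_qual=lambda row: True):
    out = []
    for l in lt:
        matches = [r for r in lt2 if on_qual(l, r)]
        # Unmatched left rows are NULL-extended before the WHERE filter runs.
        rows = [l + r for r in matches] if matches else [l + (None, None)]
        out.extend(row for row in rows if where_qual(row))
    return out

lt = [(1, 2), (1, 3)]
lt2 = [(1, 2)]

# ON (lt.val2 = lt2.val2): both lt rows survive, one NULL-extended.
on_result = left_join(lt, lt2, lambda l, r: l[1] == r[1])

# ON (true) WHERE (lt.val2 = lt2.val2): the NULL-extended row is filtered out.
where_result = left_join(lt, lt2, lambda l, r: True,
                         lambda row: row[3] is not None and row[1] == row[3])
```

on_result reproduces the two-row psql output from the review, while where_result keeps only the matched row — the two qual classes are not interchangeable for outer joins.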
Re: [HACKERS] Join push-down support for foreign tables
2015-03-04 17:00 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp:

On 2015/03/03 21:34, Shigeru Hanada wrote: I rebased the join push-down patch onto Kaigai-san's Custom/Foreign Join v6 patch.

Thanks for the work, Hanada-san and KaiGai-san! Maybe I'm missing something, but did we agree to take this approach, ie, join push-down on top of custom join?

There is a comment about that [1].

I just thought it'd be better to achieve a consensus before implementing the feature further.

As Kaigai-san says, foreign join push-down sits beside custom scan, and both are built on the custom/foreign join API patch.

but still the patch has an issue about joins underlying UPDATE or DELETE. Now I'm working on fixing this issue.

Is that something like UPDATE foo ... FROM bar ... where both foo and bar are remote? If so, I think it'd be better to push such an update down to the remote, as discussed in [2], and I'd like to work on that together!

A part of it, perhaps. But at the moment I see many issues to solve around pushing down complex UPDATE/DELETE. So for now I tightened the restriction so that joins between foreign tables are pushed down only if they are part of a SELECT statement. Please see the next v4 patch, which I'll post soon.

Sorry for having been late for the party. Best regards, Etsuro Fujita
[1] http://www.postgresql.org/message-id/23343.1418658...@sss.pgh.pa.us
[2] http://www.postgresql.org/message-id/31942.1410534...@sss.pgh.pa.us
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
Here is the v4 patch of join push-down support for foreign tables. This patch requires the Custom/Foreign join patch v7 posted by Kaigai-san.

In this version I added a check of the query type which gives up pushing down joins when the join is part of an underlying query of UPDATE/DELETE. As of now postgres_fdw builds a proper remote query, but it can't bring the ctid value up to postgresExecForeignUpdate()... I'm still working on supporting such queries, but I'm not sure that supporting UPDATE/DELETE is required in the first version. I attached a patch foreign_join_update.patch to share the WIP for supporting UPDATE/DELETE on top of foreign joins.

To reproduce the error, please execute the query below after running the attached init_fdw.sql to build the test environment. Note that the script drops user1, and creates the databases fdw and pgbench.

fdw=# explain (verbose) update pgbench_branches b set filler = 'foo' from pgbench_tellers t where t.bid = b.bid and t.tid < 10;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Update on public.pgbench_branches b  (cost=100.00..100.67 rows=67 width=390)
   Remote SQL: UPDATE public.pgbench_branches SET filler = $2 WHERE ctid = $1
   ->  Foreign Scan  (cost=100.00..100.67 rows=67 width=390)
         Output: b.bid, b.bbalance, 'foo '::character(88), b.ctid, *
         Remote SQL: SELECT r.a_0, r.a_1, r.a_2, l FROM (SELECT tid, bid, tbalance, filler FROM public.pgbench_tellers WHERE ((tid < 10))) l (a_0, a_1, a_2, a_3) INNER JOIN (SELECT bid, bbalance, NULL, ctid FROM public.pgbench_branches FOR UPDATE) r (a_0, a_1, a_2, a_3) ON ((r.a_0 = l.a_1))
(5 rows)

fdw=# explain (analyze, verbose) update pgbench_branches b set filler = 'foo' from pgbench_tellers t where t.bid = b.bid and t.tid < 10;
ERROR: ctid is NULL

2015-03-03 21:34 GMT+09:00 Shigeru Hanada shigeru.han...@gmail.com:

I rebased the join push-down patch onto Kaigai-san's Custom/Foreign Join v6 patch.
I posted some comments to the v6 patch in this post: http://www.postgresql.org/message-id/CAEZqfEcNvjqq-P=jxnw1pb4t9wvpcporcn7g6cc46jgub7d...@mail.gmail.com

Before applying my v3 patch, please apply Kaigai-san's v6 patch and my mod_cjv6.patch. Sorry for the complex patch combination. Those patches will be arranged soon by Kaigai-san and me. I fixed the issues pointed out by Thom and Kohei, but the patch still has an issue with joins underlying UPDATE or DELETE. Now I'm working on fixing this issue. Besides this issue, the existing regression tests passed.

2015-03-03 19:48 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:

* Bug reported by Thom Brown -
# EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN countries ON people.country_id = countries.id LIMIT 3) x;
ERROR: could not open relation with OID 0

Sorry, it was a problem caused by my portion. The patched setrefs.c checks fdw_/custom_ps_tlist to determine whether a Foreign/CustomScan node is associated with a certain base relation. If *_ps_tlist is valid, it also expects scanrelid == 0. However, the thing we should check was incorrect. We may have a case with an empty *_ps_tlist if the remote join expects no columns. So, I adjusted the condition to check scanrelid instead.

Is this issue fixed by the v5 custom/foreign join patch?

Yes, please rebase it.
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com
--
Shigeru HANADA
--
Shigeru HANADA
foreign_join_v4.patch Description: Binary data
init_fdw.sql Description: Binary data
foreign_join_update.patch Description: Binary data
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
2015-03-02 23:07 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:

I seem to be getting a problem with whole-row references:
# SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people p on e.person_id = p.id inner join countries c on p.country_id = c.id;
ERROR: table "r" has 3 columns available but 4 columns specified
CONTEXT: Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM (SELECT id, country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name, country_id FROM public.people) r (a_0, a_1, a_2, a_3) ON ((r.a_3 = l.a_0))

In this case, the 4th target entry should be l, not l.a_1.

Actually, I fixed that part.

And the error message could be somewhat confusing. This mentions table "r", but there's no such table or alias in my actual query.

However, do we have a mechanical/simple way to distinguish the cases where we need a relation alias from those where we don't? In cases like a self-join, we have to construct a remote query even when the same table is referenced multiple times in the query. Do you have a good idea?

I'd like to vote for keeping the current aliasing style: use l and r for join source relations, and a_0, a_1, ... for each of their columns.
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
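The aliasing style voted for above — each join side wrapped in a subquery aliased l or r, with columns renamed positionally to a_0, a_1, ... — can be sketched as follows (hypothetical helper names; postgres_fdw's real deparser is written in C in deparse.c):

```python
# Sketch of the l/r + a_N aliasing scheme: positional column aliases make
# even a self-join unambiguous, since neither side keeps its original name.
def deparse_side(alias, table, columns):
    col_aliases = ", ".join("a_%d" % i for i in range(len(columns)))
    return "(SELECT %s FROM %s) %s (%s)" % (
        ", ".join(columns), table, alias, col_aliases)

def deparse_join(left, right, cond):
    return "SELECT * FROM %s INNER JOIN %s ON (%s)" % (
        deparse_side("l", *left), deparse_side("r", *right), cond)

# A self-join of people with itself stays unambiguous:
sql = deparse_join(("public.people", ["id", "name"]),
                   ("public.people", ["id", "name"]),
                   "l.a_0 = r.a_0")
```

The trade-off discussed in the thread remains: the generated names (l, r, a_0, ...) never collide, but they also never resemble anything in the user's original query, which is what makes the error messages and EXPLAIN output hard to read.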
Re: [HACKERS] Join push-down support for foreign tables
Thanks for reviewing my patch.

2015-03-02 22:50 GMT+09:00 Thom Brown t...@linux.com:

I seem to be getting a problem with whole-row references:
# SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people p on e.person_id = p.id inner join countries c on p.country_id = c.id;
ERROR: table "r" has 3 columns available but 4 columns specified
CONTEXT: Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM (SELECT id, country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name, country_id FROM public.people) r (a_0, a_1, a_2, a_3) ON ((r.a_3 = l.a_0))

And the error message could be somewhat confusing. This mentions table "r", but there's no such table or alias in my actual query.

Your concern is not limited to the feature I'm proposing, because FDW options about object names also introduce such a mismatch between the query the user constructed and the one postgres_fdw constructed for remote execution. Currently we put a CONTEXT line there for that purpose, but it might be hard to understand quickly from the error messages alone. So I'd like to add a section about query debugging for postgres_fdw.

Another issue:
# EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN countries ON people.country_id = countries.id LIMIT 3) x;
ERROR: could not open relation with OID 0

Good catch. In my quick trial, removing LIMIT 3 avoids this error. I'll check it right now.
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
Kaigai-san,

The v6 patch applied cleanly on the master branch. I'll rebase my patch onto it, but before that I have a comment about the name of the new FDW API handler GetForeignJoinPath. Obviously an FDW can add multiple paths at a time, like GetForeignPaths, so IMO it should be renamed to GetForeignJoinPaths, in plural form. In addition to that, the new member of RelOptInfo, fdw_handler, should be initialized explicitly in build_simple_rel. Please see the attached patch for these changes. I'll review the v6 patch afterwards.

2015-03-03 20:20 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:

Sorry, I made a mistake in patch creation. The attached one is the correct version.
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
Sent: Tuesday, March 03, 2015 6:31 PM
To: Kaigai Kouhei(海外 浩平); Robert Haas
Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada
Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

The attached version of the custom/foreign-join interface patch fixes up the problem reported on the join-pushdown support thread. The previous version referenced *_ps_tlist in setrefs.c to check whether a Custom/ForeignScan node is associated with a particular base relation, or not. This logic considered that the above nodes perform a base relation scan if *_ps_tlist is valid. However, it was incorrect in the case when the underlying pseudo-scan relation has an empty target list. Instead of the previous logic, it is revised to check scanrelid itself. If zero, it means the Custom/ForeignScan node is not associated with a particular base relation, and thus its slot descriptor for the scan shall be constructed based on *_ps_tlist.

Also, I noticed a potential problem if a CSP/FDW driver wants to display expression nodes using deparse_expression() but a varnode within the expression does not appear in the *_ps_tlist.
For example, the remote query below shall return rows with two columns.

SELECT atext, btext FROM tbl_a, tbl_b WHERE aid = bid;

Thus, the ForeignScan will perform like a scan on a relation with two columns, and the FDW driver will set two TargetEntry nodes on the fdw_ps_tlist. If the FDW is designed to keep the join condition (aid = bid) in expression node form, it is expected to be saved in the custom/fdw_expr variable, and then setrefs.c rewrites the varnodes according to *_ps_tlist. It means we also have to add both aid and bid to the *_ps_tlist to avoid failure on variable lookup. However, these additional entries change the definition of the slot descriptor. So, I adjusted ExecInitForeignScan and ExecInitCustomScan to use ExecCleanTypeFromTL(), not ExecTypeFromTL(), when constructing the slot descriptor based on the *_ps_tlist. It expects CSP/FDW drivers to add target entries with resjunk=true if they want to have additional entries for variable lookups on the EXPLAIN command. Fortunately or unfortunately, postgres_fdw keeps its remote query in cstring form, so it does not need to add junk entries on the fdw_ps_tlist.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
Sent: Sunday, February 15, 2015 11:01 PM
To: Kaigai Kouhei(海外 浩平); Robert Haas
Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada
Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

The attached patch is a rebased version of the join replacement with foreign-/custom-scan. There are no feature updates at this moment, but SGML documentation is added (according to Michael's comment). This infrastructure allows a foreign-data wrapper and custom-scan provider to add alternative scan paths towards a relations join.
From the viewpoint of the executor, it looks like a scan on a pseudo-relation that is materialized from multiple relations, even though the FDW/CSP internally processes the relations join with its own logic. The basic idea is: (1) scanrelid==0 indicates that this foreign/custom scan node runs on a pseudo relation, and (2) fdw_ps_tlist and custom_ps_tlist introduce the definition of the pseudo relation, because it is not associated with a tangible relation (unlike the simple scan case), so the planner cannot know the expected record type to be returned without this additional information. These two enhancements enable extensions to process a relations join internally, and to perform like an existing scan node from the viewpoint of the core backend.

Also, as an aside, I had a discussion with Hanada-san about this interface off-list. He had an idea to keep create_plan_recurse() static, using a special list field in the CustomPath structure to chain the underlying Path nodes. If the core backend translates the Path node
Re: [HACKERS] Join push-down support for foreign tables
I rebased the join push-down patch onto Kaigai-san's Custom/Foreign Join v6 patch. I posted some comments to the v6 patch in this post: http://www.postgresql.org/message-id/CAEZqfEcNvjqq-P=jxnw1pb4t9wvpcporcn7g6cc46jgub7d...@mail.gmail.com

Before applying my v3 patch, please apply Kaigai-san's v6 patch and my mod_cjv6.patch. Sorry for the complex patch combination. Those patches will be arranged soon by Kaigai-san and me. I fixed the issues pointed out by Thom and Kohei, but the patch still has an issue with joins underlying UPDATE or DELETE. Now I'm working on fixing this issue. Besides this issue, the existing regression tests passed.

2015-03-03 19:48 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:

* Bug reported by Thom Brown -
# EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN countries ON people.country_id = countries.id LIMIT 3) x;
ERROR: could not open relation with OID 0

Sorry, it was a problem caused by my portion. The patched setrefs.c checks fdw_/custom_ps_tlist to determine whether a Foreign/CustomScan node is associated with a certain base relation. If *_ps_tlist is valid, it also expects scanrelid == 0. However, the thing we should check was incorrect. We may have a case with an empty *_ps_tlist if the remote join expects no columns. So, I adjusted the condition to check scanrelid instead.

Is this issue fixed by the v5 custom/foreign join patch?

Yes, please rebase it.
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com
--
Shigeru HANADA
foreign_join_v3.patch Description: Binary data
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
* regression test crash

The query below crashes:

UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;

According to the crash dump, tidin() got a cstring input with an unexpected format. I guess a column number is incorrectly assigned, but I have no clear scenario at this moment.

#0 0x7f9b45a11513 in conversion_error_callback (arg=0x7fffc257ecc0) at postgres_fdw.c:3293
#1 0x008e51d6 in errfinish (dummy=0) at elog.c:436
#2 0x008935cd in tidin (fcinfo=0x7fffc257e8a0) at tid.c:69
/home/kaigai/repo/sepgsql/src/backend/utils/adt/tid.c:69:1734:beg:0x8935cd
(gdb) p str
$6 = 0x1d17cf7 "foo"

Join push-down underlying UPDATE or DELETE requires ctid in its output, but that seems not fully supported yet. I'm fixing this issue now.

Regards,
--
Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
Attached is the revised/rebased version of the $SUBJECT. This patch is based on Kaigai-san's custom/foreign join patch, so please apply it before this patch. In this version I changed some points from the original postgres_fdw.

1) Disabled SELECT clause optimization
Up to 9.4, postgres_fdw lists only the columns actually used in the SELECT clause, but AFAIS it makes SQL generation complex. So I disabled such optimization and put NULL for unnecessary columns in the SELECT clause of the remote query.

2) Extended deparse context
To allow deparsing based on multiple source relations, I added some members to the context structure. They are unnecessary for a simple query with a single foreign table, but IMO it should be integrated.

With Kaigai-san's advice, the changes for supporting foreign joins in postgres_fdw are minimized into postgres_fdw itself. But I added a new FDW API named GetForeignJoinPaths() to keep the policy that all interfaces between the core and an FDW should be in FdwRoutine, instead of using a hook function. Now I'm writing documentation about it, and will post it in a day.

2015-02-19 16:19 GMT+09:00 Shigeru Hanada shigeru.han...@gmail.com:

2015-02-17 10:39 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:

Let me put some comments in addition to where you're checking now.

[design issues]
* Cost estimation
Estimation and evaluation of the cost of a remote join query is not an obvious issue. In principle, the local side cannot determine the cost to run a remote join without remote EXPLAIN, because the local side has no information about the JOIN logic applied on the remote side. Probably, we have to put in an assumption about the remote join algorithm, because the local planner has no idea about the remote planner's choice unless the foreign join uses use_remote_estimate. I think it is a reasonable assumption (even if it is incorrect) to calculate the remote join cost based on the local hash-join algorithm. If the user wants a more correct estimation, remote EXPLAIN will make a more reliable cost estimate.
Hm, I guess that you chose hash join as the least-costed join. In the pgbench model, most combinations of two tables generate a hash join as the cheapest path. Remote EXPLAIN is very expensive in the context of planning, so it would easily make the plan optimization meaningless. But giving an option to users is good, I agree.

It also needs a consensus whether the cost of remote CPU execution is equivalent to local CPU. If we think the local CPU is a scarcer resource than the remote one, a discount rate will make the planner prefer a remote join over a local one.

Something like cpu_cost_ratio as a new server-level FDW option?

Once we assume a join algorithm for the remote join and a unit cost for the remote CPU, we can calculate a cost for the foreign join based on the local join logic plus the cost of network transfer (maybe fdw_tuple_cost?).

Yes, the sum of these costs is the total cost of a remote join.
o fdw_startup_cost
o hash-join cost, estimated as a local join
o fdw_tuple_cost * rows * width

* FDW options
Unlike a table scan, which FDW options we should refer to is unclear. Table-level FDW options are associated with a foreign table as literal. I think we have two options here:
1. A foreign join refers to the FDW options for the foreign server, but the ones for the foreign tables are ignored.
2. A foreign join is prohibited when the relations don't have identical FDW options.

My preference is 2. Even for an N-way foreign join, it ensures that all the tables involved in the (N-1)-way foreign join have identical FDW options, which leads to being able to make the N-way foreign join with all-identical FDW options. One exception is the updatable flag of postgres_fdw. It does not make sense for a remote join, so I think a mixture of updatable and non-updatable foreign tables should be admitted; however, it is a decision for the FDW driver. Probably, the points above need time to get consensus. I'd like to see your opinion prior to editing your patch.
postgres_fdw can't push down a join which contains foreign tables on multiple servers, so use_remote_estimate and fdw_startup_cost are the only FDW options to consider. So we have choices for each option:
1-a. If all foreign tables in the join have identical use_remote_estimate, allow pushing down.
1-b. If any foreign table in the join has use_remote_estimate set to true, use remote estimates.
2-a. If all foreign tables in the join have identical fdw_startup_cost, allow pushing down.
2-b. Always use the max value in the join. (cost would be more expensive)
2-c. Always use the min value in the join. (cost would be cheaper)

I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but still have a reasonable startup cost. I agree about the updatable option.

[implementation issues]
The interface does not intend to add a new Path/Plan type for each scan that replaces foreign joins. What postgres_fdw should do is add a ForeignPath for a particular joinrel, and then populate a ForeignScan with the remote join query once it got chosen
Re: [HACKERS] Join push-down support for foreign tables
2015-02-17 10:39 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:

Let me put some comments in addition to where you're checking now.

[design issues]
* Cost estimation
Estimation and evaluation of the cost of a remote join query is not an obvious issue. In principle, the local side cannot determine the cost to run a remote join without remote EXPLAIN, because the local side has no information about the JOIN logic applied on the remote side. Probably, we have to put in an assumption about the remote join algorithm, because the local planner has no idea about the remote planner's choice unless the foreign join uses use_remote_estimate. I think it is a reasonable assumption (even if it is incorrect) to calculate the remote join cost based on the local hash-join algorithm. If the user wants a more correct estimation, remote EXPLAIN will make a more reliable cost estimate.

Hm, I guess that you chose hash join as the least-costed join. In the pgbench model, most combinations of two tables generate a hash join as the cheapest path. Remote EXPLAIN is very expensive in the context of planning, so it would easily make the plan optimization meaningless. But giving an option to users is good, I agree.

It also needs a consensus whether the cost of remote CPU execution is equivalent to local CPU. If we think the local CPU is a scarcer resource than the remote one, a discount rate will make the planner prefer a remote join over a local one.

Something like cpu_cost_ratio as a new server-level FDW option?

Once we assume a join algorithm for the remote join and a unit cost for the remote CPU, we can calculate a cost for the foreign join based on the local join logic plus the cost of network transfer (maybe fdw_tuple_cost?).

Yes, the sum of these costs is the total cost of a remote join.
o fdw_startup_cost
o hash-join cost, estimated as a local join
o fdw_tuple_cost * rows * width

* FDW options
Unlike a table scan, which FDW options we should refer to is unclear. Table-level FDW options are associated with a foreign table as literal. I think we have two options here:
1.
A foreign join refers to the FDW options for the foreign server, but the ones for the foreign tables are ignored.
2. A foreign join is prohibited when the relations don't have identical FDW options.

My preference is 2. Even for an N-way foreign join, it ensures that all the tables involved in the (N-1)-way foreign join have identical FDW options, which leads to being able to make the N-way foreign join with all-identical FDW options. One exception is the updatable flag of postgres_fdw. It does not make sense for a remote join, so I think a mixture of updatable and non-updatable foreign tables should be admitted; however, it is a decision for the FDW driver. Probably, the points above need time to get consensus. I'd like to see your opinion prior to editing your patch.

postgres_fdw can't push down a join which contains foreign tables on multiple servers, so use_remote_estimate and fdw_startup_cost are the only FDW options to consider. So we have choices for each option:
1-a. If all foreign tables in the join have identical use_remote_estimate, allow pushing down.
1-b. If any foreign table in the join has use_remote_estimate set to true, use remote estimates.
2-a. If all foreign tables in the join have identical fdw_startup_cost, allow pushing down.
2-b. Always use the max value in the join. (cost would be more expensive)
2-c. Always use the min value in the join. (cost would be cheaper)

I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but still have a reasonable startup cost. I agree about the updatable option.

[implementation issues]
The interface does not intend to add a new Path/Plan type for each scan that replaces foreign joins. What postgres_fdw should do is add a ForeignPath for a particular joinrel, and then populate a ForeignScan with the remote join query once it gets chosen by the planner.

That idea is interesting, and makes many things simpler. Please let me consider it.

A few functions added in src/backend/foreign/foreign.c are not called from anywhere at this moment.

create_plan_recurse() is reverted to static.
It is needed for custom- join enhancement, if no other infrastructure can support. I made it back to static because I thought that create_plan_recurse can be called by core before giving control to FDWs. But I'm not sure it can be applied to custom scans. I'll recheck that part. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
Hi I've revised the patch based on Kaigai-san's custom/foreign join patch posted in the thread below. http://www.postgresql.org/message-id/9a28c8860f777e439aa12e8aea7694f80108c...@bpxm15gp.gisp.nec.co.jp Basically not changed from the version in the last CF, but as Robert commented before, N-way (not only 2-way) joins should be supported in the first version by construct SELECT SQL by containing source query in FROM clause as inline views (a.k.a. from clause subquery). 2014-12-26 13:48 GMT+09:00 Shigeru Hanada shigeru.han...@gmail.com: 2014-12-16 1:22 GMT+09:00 Robert Haas robertmh...@gmail.com: On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada shigeru.han...@gmail.com wrote: I'm working on $SUBJECT and would like to get comments about the design. Attached patch is for the design below. I'm glad you are working on this. 1. Join source relations As described above, postgres_fdw (and most of SQL-based FDWs) needs to check that 1) all foreign tables in the join belong to a server, and 2) all foreign tables have same checkAsUser. In addition to that, I add extra limitation that both inner/outer should be plain foreign tables, not a result of foreign join. This limiation makes SQL generator simple. Fundamentally it's possible to join even join relations, so N-way join is listed as enhancement item below. It seems pretty important to me that we have a way to push the entire join nest down. Being able to push down a 2-way join but not more seems like quite a severe limitation. Hmm, I agree to support N-way join is very useful. Postgres-XC's SQL generator seems to give us a hint for such case, I'll check it out again. -- Shigeru HANADA -- Shigeru HANADA foreign_join.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
Kaigai-san, Oops. I rebased the patch onto your v4 custom/foreign join patch. But as you mentioned off-list, I found a flaw about inappropriate change about NestPath still remains in the patch... I might have made my dev branch into unexpected state. I'll check it soon. 2015-02-16 13:13 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com: Hanada-san, Your patch mixtures enhancement of custom-/foreign-scan interface and enhancement of contrib/postgres_fdw... Probably, it is a careless mis- operation. Please make your patch as differences from my infrastructure portion. Also, I noticed this Join pushdown support for foreign tables patch is unintentionally rejected in the last commit fest. https://commitfest.postgresql.org/3/20/ I couldn't register myself as reviewer. How do I operate it on the new commitfest application? Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com -Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Shigeru Hanada Sent: Monday, February 16, 2015 1:03 PM To: Robert Haas Cc: PostgreSQL-development Subject: Re: [HACKERS] Join push-down support for foreign tables Hi I've revised the patch based on Kaigai-san's custom/foreign join patch posted in the thread below. http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F80 108c...@bpxm15gp.gisp.nec.co.jp Basically not changed from the version in the last CF, but as Robert commented before, N-way (not only 2-way) joins should be supported in the first version by construct SELECT SQL by containing source query in FROM clause as inline views (a.k.a. from clause subquery). 2014-12-26 13:48 GMT+09:00 Shigeru Hanada shigeru.han...@gmail.com: 2014-12-16 1:22 GMT+09:00 Robert Haas robertmh...@gmail.com: On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada shigeru.han...@gmail.com wrote: I'm working on $SUBJECT and would like to get comments about the design. Attached patch is for the design below. 
I'm glad you are working on this. 1. Join source relations As described above, postgres_fdw (and most of SQL-based FDWs) needs to check that 1) all foreign tables in the join belong to a server, and 2) all foreign tables have same checkAsUser. In addition to that, I add extra limitation that both inner/outer should be plain foreign tables, not a result of foreign join. This limiation makes SQL generator simple. Fundamentally it's possible to join even join relations, so N-way join is listed as enhancement item below. It seems pretty important to me that we have a way to push the entire join nest down. Being able to push down a 2-way join but not more seems like quite a severe limitation. Hmm, I agree to support N-way join is very useful. Postgres-XC's SQL generator seems to give us a hint for such case, I'll check it out again. -- Shigeru HANADA -- Shigeru HANADA -- Shigeru HANADA foreign_join.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
2014-12-16 0:45 GMT+09:00 Tom Lane t...@sss.pgh.pa.us: Shigeru Hanada shigeru.han...@gmail.com writes: I'm working on $SUBJECT and would like to get comments about the design. Attached patch is for the design below. Note that the patch requires Kaigai-san's custom foriegn join patch[1] For the record, I'm not particularly on-board with custom scan, and even less so with custom join. I don't want FDW features like this depending on those kluges, especially not when you're still in need of core-code changes (which really points up the inadequacy of those concepts). This design derived from discussion about custom scan/join. In that discussion Robert suggested common infrastructure for replacing Join path with Scan node. Here I agree to user such common infrastructure. One concern is introducing hook function I/F which seems to break FdwRoutine I/F boundary... Also, please don't redefine struct NestPath like that. That adds a whole bunch of notational churn (and backpatch risk) for no value that I can see. It might've been better to do it like that in a green field, but you're about twenty years too late to do it now. Ok, will revert it. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
2014-12-16 1:22 GMT+09:00 Robert Haas robertmh...@gmail.com: On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada shigeru.han...@gmail.com wrote: I'm working on $SUBJECT and would like to get comments about the design. Attached patch is for the design below. I'm glad you are working on this. 1. Join source relations As described above, postgres_fdw (and most of SQL-based FDWs) needs to check that 1) all foreign tables in the join belong to a server, and 2) all foreign tables have same checkAsUser. In addition to that, I add extra limitation that both inner/outer should be plain foreign tables, not a result of foreign join. This limiation makes SQL generator simple. Fundamentally it's possible to join even join relations, so N-way join is listed as enhancement item below. It seems pretty important to me that we have a way to push the entire join nest down. Being able to push down a 2-way join but not more seems like quite a severe limitation. Hmm, I agree to support N-way join is very useful. Postgres-XC's SQL generator seems to give us a hint for such case, I'll check it out again. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
Hi hackers, I'm working on $SUBJECT and would like to get comments about the design. Attached patch is for the design below. Note that the patch requires Kaigai-san's custom foriegn join patch[1] [1] http://www.postgresql.org/message-id/9a28c8860f777e439aa12e8aea7694f80108c...@bpxm15gp.gisp.nec.co.jp Joins to be pushed down === We have two levels of decision about Join push-down, core and FDW. I think we should allow them to push down joins as much as we can unless it doesn't break the semantics of join. Anyway FDWs should decide whether the join can be pushed down or not, on the basis of the FDW's capability. Here is the list of checks which should be done in core: 1. Join source relations All of foreign tables used in a join should be managed by one foreign data wrapper. I once proposed that all source tables should belong to one server, because at that time I assumed that FDWs use SERVER to express physical place of data source. But Robert's comment gave me an idea that SERVER is not important for some FDWs, so now I think check about server matching should be done by FDWs. USER MAPPING is another important attribute of foreign scan/join, and IMO it should be checked by FDW because some of FDWs don't require USER MAPPING. If an FDW want to check user mapping, all tables in the join should belong to the same server and have same RangeTablEntry#checkAsUser to ensure that only one user mapping is derived. 2. Join type Join type can be any, except JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER, though most of FDWs would support only INNER and OUTER. Pushing down CROSS joins might seem inefficient, because obviously CROSS JOIN always produces more result than retrieving all rows from each foreign table separately. However, some FDW might be able to accelerate such join with cache or something. So I think we should leave such decision to FDWs. Here is the list of checks which shold be done in postgres_fdw: 1. 
Join source relations As described above, postgres_fdw (and most of SQL-based FDWs) needs to check that 1) all foreign tables in the join belong to a server, and 2) all foreign tables have same checkAsUser. In addition to that, I add extra limitation that both inner/outer should be plain foreign tables, not a result of foreign join. This limiation makes SQL generator simple. Fundamentally it's possible to join even join relations, so N-way join is listed as enhancement item below. 2. Join type In the first proposal, postgres_fdw allows INNER and OUTER joins to be pushed down. CROSS, SEMI and ANTI would have much less use cases. 3. Join conditions and WHERE clauses Join conditions should consist of semantically safe expressions. Where the semantically safe means is same as WHERE clause push-down. Planned enhancements for 9.5 These features will be proposed as enhancements, hopefully in the 9.5 development cycle, but probably in 9.6. 1. Remove unnecessary column from SELECT clause Columns which are used for only join conditions can be removed from the target list, as postgres_fdw does in simple foreign scans. 2. Support N-way joins Mostly for difficulty of SQL generation, I didn't add support of N-Way joins. 3. Proper cost estimation Currently postgres_fdw always gives 0 as the cost of a foreign join, as a compromise. This is because estimating costs of each join without round-trip (EXPLAIN) is not easy. A rough idea about that I currently have is to use local statistics, but determining join method used at remote might require whole planner to run for the join subtree. Regards, -- Shigeru HANADA join_pushdown.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Typo in bgworker.sgml
I found a minor typo in bgworker.sgml. Patch attached. It should be applied to 9.4 too. -- Shigeru HANADA fix_typo_in_bgworker.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
2014-09-08 8:07 GMT+09:00 Shigeru HANADA shigeru.han...@gmail.com: (2014/09/04 21:37), Robert Haas wrote: On Wed, Sep 3, 2014 at 5:16 AM, Probably both the initial cost and final cost calculations should be delegated to the FDW, but maybe within postgres_fdw, the initial cost should do only the work that can be done without contacting the remote server; then, let the final cost step do that if appropriate. But I'm not entirely sure what is best here. Agreed. I'll design planner API along that way for now. I tried some patterns of implementation but I've not gotten feasible way yet. So I'd like to hear hackers' idea. * Foreign join hook point First I thought that following existing cost estimating manner is the way to go, but I tend to think it doesn't fit foreign joins because join method is tightly-coupled to sort-ness, but foreign join would not. In current planner, add_paths_to_joinrel is conscious of sort-ness, and functions directly called from it are conscious of join method. But this seems not fit the abstraction level of FDW. FDW is highly abstracted, say differ from custom plan providers, so most of work should be delegated to FDW, including pathkeys consideration, IMO. Besides that, order of join consideration is another issue. First I try to add foreign join consideration at the last (after hash join consideration), but after some thought I noticed that early-elimination would work better if we try foreign join first, because in most cases foreign join is the cheapest way to accomplish a join between two foreign relations. So currently I'm thinking that delegating whole join consideration to FDWs before other join consideration in add_paths_to_joinrel, by calling new FDW API would be promising. This means that FDWs can add multiple arbitrary paths to RelOptInfo in a call. Of course this allows FDWs to do round-trip per path, but it would be optimization issue, and they can compare their own candidates they can get without round-trip. 
* Supported join types INNER and (LEFT|RIGHT|FULL) OUTER would be safe to push down, even though some of OUTER JOIN might not be much faster than local join. I'm not sure that SEMI and ANTI joins are safe to push-down. Can we leave the matter to FDWs, or should we forbid FDWs pushing down by not calling foreign join API? Anyway SEMI/ANTI would not be supported in the first version. * Blockers of join push-down Pushing down join means that foreign scans for inner and outer are skipped, so some elements blocks pushing down. Basically the criteria is same as scan push-down and update push-down. After some thoughts, we should check only unsafe expression in join qual and restrict qual. This limitation is necessary to avoid difference between results of pushe-down or not. Target list seems to contain only Var for necessary columns, but we should check that too. * WIP patch Attached is WIP patch for reviewing the design. Works should be done are 1) judging push-down or not, and 2) generating join SQL. For 2), I'm thinking about referring Postgres-XC's join shipping mechanism. Any comments or questions are welcome. -- Shigeru HANADA join_pushdown_wip.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join push-down support for foreign tables
(2014/09/04 21:37), Robert Haas wrote: On Wed, Sep 3, 2014 at 5:16 AM, Shigeru Hanada shigeru.han...@gmail.com wrote: (1) Separate cost estimation phases? For existing join paths, planner estimates their costs in two phaeses. In the first phase initial_cost_foo(), here foo is one of nestloop/mergejoin/hashjoin, produces lower-bound estimates for elimination. The second phase is done for only promising paths which passed add_path_precheck(), by final_cost_foo() for cost and result size. I'm not sure that we need to follow this manner, since FDWs would be able to estimate final cost/size with their own methods. The main problem I see here is that accurate costing may require a round-trip to the remote server. If there is only one path that is probably OK; the cost of asking the question will usually be more than paid for by hearing that the pushed-down join clobbers the other possible methods of executing the query. But if there are many paths, for example because there are multiple sets of useful pathkeys, it might start to get a bit expensive. I agree that requiring round-trip per path is unbearable, so main source of plan cost should be local statistics gathered by ANALYZE command. If an FDW needs extra information for planning, it should obtain that from FDW options or its private catalogs to avoid undesirable round-trips. FDWs like postgres_fdw would want to optimize plan by providing paths with pathkeys (not only use remote index, but it also allows MergeJoin at upper level), I noticed that order of join considering is an issue too. Planner compares currently-cheapest path to newly generated path, and mostly foreign join path would be the cheapest, so considering foreign join would reduce planner overhead. 
Probably both the initial cost and final cost calculations should be delegated to the FDW, but maybe within postgres_fdw, the initial cost should do only the work that can be done without contacting the remote server; then, let the final cost step do that if appropriate. But I'm not entirely sure what is best here. Agreed. I'll design planner API along that way for now. (2) How to reflect cost of transfer Cost of transfer is dominant in foreign table operations, including foreign scans. It would be nice to have some mechanism to reflect actual time of transfer to the cost estimation. An idea is to have a FDW option which represents cost factor of transfer, say transfer_cost. That would be reasonable. I assume users would normally wish to specify this per-server, and the default should be something reasonable for a LAN. This enhancement could be applied separately from foreign join patch. (4) criteria for push-down It is assumed that FDWs can push joins down to remote when all foreign tables are in same server. IMO a SERVER objects represents a logical data source. For instance database for postgres_fdw and other connection-based FDWs, and disk volumes (or directory?) for file_fdw. Is this reasonable assumption? I think it's probably good to give an FDW the option of producing a ForeignJoinPath for any join against a ForeignPath *or ForeignJoinPath* for the same FDW. It's perhaps unlikely that an FDW can perform a join efficiently between two data sources with different server definitions, but why not give it the option? It should be pretty fast for the FDW to realize, oh, the server OIDs don't match - and at that point it can exit without doing anything further if that seems desirable. And there might be some kinds of data sources where cross-server joins actually can be executed quickly (e.g. when the underlying data is just in two files in different places on the local machine). 
Indeed how to separate servers is left to users, or author of FDWs, though postgres_fdw and most of other FDWs can join foreign tables in a server. I think it would be good if we can know two foreign tables are managed by same FDW, from FdwRoutine, maybe adding new API which returns FDW identifier? (5) Terminology I used foreign join as a process which joins foreign tables on *remote* side, but is this enough intuitive? Another idea is using remote join, is this more appropriate for this kind of process? I hesitate to use remote join because it implies client-server FDWs, but foreign join is not limited to such FDWs, e.g. file_fdw can have extra file which is already joined files accessed via foreign tables. Foreign join is perfect. As I alluded to above, it's pretty important to make sure that this works with large join trees; that is, if I join four foreign tables, I don't want it to push down a join between two of the tables and a join between the other two tables and then join the results of those joins locally. Instead, I want to push the entire join tree to the foreign server and execute the whole thing there. Some care may be needed in designing the hooks to make sure this works as desired. I think so
Re: [HACKERS] Join push-down support for foreign tables
(2014/09/05 0:56), Bruce Momjian wrote: On Thu, Sep 4, 2014 at 08:41:43PM +0530, Atri Sharma wrote: On Thursday, September 4, 2014, Bruce Momjian br...@momjian.us wrote: On Thu, Sep 4, 2014 at 08:37:08AM -0400, Robert Haas wrote: The main problem I see here is that accurate costing may require a round-trip to the remote server. If there is only one path that is probably OK; the cost of asking the question will usually be more than paid for by hearing that the pushed-down join clobbers the other possible methods of executing the query. But if there are many paths, for example because there are multiple sets of useful pathkeys, it might start to get a bit expensive. Probably both the initial cost and final cost calculations should be delegated to the FDW, but maybe within postgres_fdw, the initial cost should do only the work that can be done without contacting the remote server; then, let the final cost step do that if appropriate. But I'm not entirely sure what is best here. I am thinking eventually we will need to cache the foreign server statistics on the local server. Wouldn't that lead to issues where the statistics get outdated and we have to anyways query the foreign server before planning any joins? Or are you thinking of dropping the foreign table statistics once the foreign join is complete? I am thinking we would eventually have to cache the statistics, then get some kind of invalidation message from the foreign server. I am also thinking that cache would have to be global across all backends, I guess similar to our invalidation cache. If a FDW needs to know more information than pg_statistics and pg_class have, yes, it should cache some statistics on the local side. But such statistics would have FDW-specific shape so it would be hard to have API to manage. FDW can have their own functions and tables to manage their own statistics, and it can have even background-worker for messaging. But it would be another story. 
Regards, -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Join push-down support for foreign tables
operations of a plan node, so we need to provide them for ForeignJoin node. Issues == (1) Separate cost estimation phases? For existing join paths, planner estimates their costs in two phaeses. In the first phase initial_cost_foo(), here foo is one of nestloop/mergejoin/hashjoin, produces lower-bound estimates for elimination. The second phase is done for only promising paths which passed add_path_precheck(), by final_cost_foo() for cost and result size. I'm not sure that we need to follow this manner, since FDWs would be able to estimate final cost/size with their own methods. (2) How to reflect cost of transfer Cost of transfer is dominant in foreign table operations, including foreign scans. It would be nice to have some mechanism to reflect actual time of transfer to the cost estimation. An idea is to have a FDW option which represents cost factor of transfer, say transfer_cost. (3) SELECT-with-Join SQL generation in postgres_fdw Probably Postgres-XC's shipping code would help us for implementing deparse JOIN SQL, but I've not studied it fully, I'll continue the study. (4) criteria for push-down It is assumed that FDWs can push joins down to remote when all foreign tables are in same server. IMO a SERVER objects represents a logical data source. For instance database for postgres_fdw and other connection-based FDWs, and disk volumes (or directory?) for file_fdw. Is this reasonable assumption? Perhaps more issues would come out later, but I'd like to get comments about the design. (5) Terminology I used foreign join as a process which joins foreign tables on *remote* side, but is this enough intuitive? Another idea is using remote join, is this more appropriate for this kind of process? I hesitate to use remote join because it implies client-server FDWs, but foreign join is not limited to such FDWs, e.g. file_fdw can have extra file which is already joined files accessed via foreign tables. 
-- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
I confirmed performance gain accomplished by this patch. This patch makes update queries ~50x faster, and even hit-miss update is 3x faster than original. Of course benefit is only for queries whose qualifiers are enough simple so that they can be pushied down fully, but this improvement is remarkable. This patch avoids 1) SELECT for determining target rows, and 2) repeated per-row UPDATE/DELETE in particular situation, so I assumed that the gain is larger for bulk update, and it's true indeed, but in fact even hit-miss update (0 row affected) become faster enough. This would come from the omission of SELECT preceding repeated UPDATE/DELETE. I was little worried about overhead in planning phase, but fluctuation was less than 1ms, so it's negligible. Measurement Result == Note: numbers below are execution time of EXPLAIN ANALYZE, and average of five runs --+-+---+ rows affected |original | patched | gain --+-+---+ 0 | 4.841 | 1.548 | 3.13x 1 | 6.944 | 1.793 | 3.87x 100 | 174.420 | 5.167 | 33.76x 10,000 | 8,215.551 | 163.832 | 50.15x 100,000 | 78,135.905 | 1,595.739 | 48.97x 200,000 | 179,784.928 | 4,305.856 | 41.75x --+-+---+ Measurement procedure = [Local side] 1) Create foreign table which refers pgbench_accounts on the remote side [Remote side] 2) pgbench -i -s 100 3) Execute ANALYZE 4) Restart PostgreSQL to clear shared buffers [Local side] 5) Execute ANALYZE against foreign table 6) Execute UPDATE SQL against foreign table once for warm the cache 7) Execute UPDATE SQL against foreign table five times Test SQL for 1-rows cas is below, only aid condition is changed according to measurement variation. 
EXPLAIN ANALYZE VERBOSE UPDATE ft_pgbench_accounts SET bid=bid+1, abalance=abalance+1, filler='update test' WHERE aid=1; 2014-08-29 12:59 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: (2014/08/26 12:20), Etsuro Fujita wrote: (2014/08/25 21:58), Albe Laurenz wrote: I played with it, and apart from Hanada's comments I have found the following: test= EXPLAIN (ANALYZE, VERBOSE) UPDATE rtest SET val=NULL WHERE id 3; QUERY PLAN -- Update on laurenz.rtest (cost=100.00..14134.40 rows=299970 width=10) (actual time=0.005..0.005 rows=0 loops=1) - Foreign Scan on laurenz.rtest (cost=100.00..14134.40 rows=299970 width=10) (actual time=0.002..0.002 rows=27 loops=1) Output: id, val, ctid Remote SQL: UPDATE laurenz.test SET val = NULL::text WHERE ((id 3)) Planning time: 0.179 ms Execution time: 3706.919 ms (6 rows) Time: 3708.272 ms The actual time readings are surprising. Shouldn't these similar to the actual execution time, since most of the time is spent in the foreign scan node? I was also thinkng that this is confusing to the users. I think this is because the patch executes the UPDATE/DELETE statement during postgresBeginForeignScan, not postgresIterateForeignScan, as you mentioned below: Reading the code, I noticed that the pushed down UPDATE or DELETE statement is executed during postgresBeginForeignScan rather than during postgresIterateForeignScan. I'll modify the patch so as to execute the statement during postgresIterateForeignScan. Done. It is not expected that postgresReScanForeignScan is called when the UPDATE/DELETE is pushed down, right? Maybe it would make sense to add an assertion for that. IIUC, that is right. As ModifyTable doesn't support rescan currently, postgresReScanForeignScan needn't to be called in the update pushdown case. The assertion is a good idea. I'll add it. Done. 
You can find the updated version of the patch at http://www.postgresql.org/message-id/53fffa50.6020...@lab.ntt.co.jp Thanks, Best regards, Etsuro Fujita -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Compute attr_needed for child relations (was Re: [HACKERS] inherit support for foreign tables)
Hi Fujita-san, I reviewed the v4 patch, and here are some comments from me. * After applying this patch, pull_varattnos() should not called in unnecessary places. For developers who want list of columns-to-be-processed for some purpose, it would be nice to mention when they should use pull_varattnos() and when they should not, maybe at the comments of pull_varattnos() implementation. * deparseReturningList() and postgresGetForeignRelSize() in contrib/postgres_fdw/ also call pull_varattnos() to determine which column to be in the SELECT clause of remote query. Shouldn't these be replaced in the same manner? Other pull_varattnos() calls are for restrictions, so IIUC they can't be replaced. * Through this review I thought up that lazy evaluation approach might fit attr_needed. I mean that we leave attr_needed for child relations empty, and fill it at the first request for it. This would avoid useless computation of attr_needed for child relations, though this idea has been considered but thrown away before... 2014-08-20 18:55 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: Hi Ashutish, (2014/08/14 22:30), Ashutosh Bapat wrote: On Thu, Aug 14, 2014 at 10:05 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp mailto:fujita.ets...@lab.ntt.co.jp wrote: (2014/08/08 18:51), Etsuro Fujita wrote: (2014/06/30 22:48), Tom Lane wrote: I wonder whether it isn't time to change that. It was coded like that originally only because calculating the values would've been a waste of cycles at the time. But this is at least the third place where it'd be useful to have attr_needed for child rels. I've revised the patch. There was a problem with the previous patch, which will be described below. Attached is the updated version of the patch addressing that. Here are some more comments Thank you for the further review! 
attr_needed also has the attributes used in the restriction clauses as seen in distribute_qual_to_rels(), so, it looks unnecessary to call pull_varattnos() on the clauses in baserestrictinfo in functions check_selective_binary_conversion(), remove_unused_subquery_outputs(), check_index_only(). IIUC, I think it's *necessary* to do that in those functions since the attributes used in the restriction clauses in baserestrictinfo are not added to attr_needed due the following code in distribute_qual_to_rels. /* * If it's a join clause (either naturally, or because delayed by * outer-join rules), add vars used in the clause to targetlists of their * relations, so that they will be emitted by the plan nodes that scan * those relations (else they won't be available at the join node!). * * Note: if the clause gets absorbed into an EquivalenceClass then this * may be unnecessary, but for now we have to do it to cover the case * where the EC becomes ec_broken and we end up reinserting the original * clauses into the plan. */ if (bms_membership(relids) == BMS_MULTIPLE) { List *vars = pull_var_clause(clause, PVC_RECURSE_AGGREGATES, PVC_INCLUDE_PLACEHOLDERS); add_vars_to_targetlist(root, vars, relids, false); list_free(vars); } Although in case of RTE_RELATION, the 0th entry in attr_needed corresponds to FirstLowInvalidHeapAttributeNumber + 1, it's always safer to use it is RelOptInfo::min_attr, in case get_relation_info() wants to change assumption or somewhere down the line some other part of code wants to change attr_needed information. It may be unlikely, that it would change, but it will be better to stick to comments in RelOptInfo 443 AttrNumber min_attr; /* smallest attrno of rel (often 0) */ 444 AttrNumber max_attr; /* largest attrno of rel */ 445 Relids *attr_needed;/* array indexed [min_attr .. max_attr] */ Good point! Attached is the revised version of the patch. 
Thanks, Best regards, Etsuro Fujita -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Hi Fujita-san, The issues addressed by Eitoku-san were fixed properly, but he found a bug and a possible enhancement in the v2 patch.

* Push-down check misses delete triggers
update_is_pushdown_safe() seems to have a bug: it misses the existence of a row-level delete trigger. A DELETE statement executed against a foreign table which has a row-level delete trigger is pushed down to the remote side, and consequently no row-level delete trigger is fired.

* Further optimization
Is there any chance to consider further optimization by passing the operation type (UPDATE|DELETE) of the statement being executed to update_is_pushdown_safe()? It seems safe to push down an UPDATE statement when the target foreign table has no update trigger, even if it has a delete trigger (and of course the opposite combination would also be fine).

* Documentation
The requirements for pushing down UPDATE/DELETE statements would not be easy for non-expert users to understand, so there seems to be room to enhance the documentation. One idea is to first define which expressions are safe to send to the remote side (it might need to mention the difference in semantics), and then refer to that definition from the places describing the push-down requirements for SELECT, UPDATE and DELETE.

2014-08-04 20:30 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: (2014/07/30 17:22), Etsuro Fujita wrote: (2014/07/29 0:58), Robert Haas wrote: On Fri, Jul 25, 2014 at 3:39 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote: Shigeru Hanada wrote: * Naming of new behavior You named this optimization Direct Update, but I'm not sure that this is intuitive enough to express this behavior. I would like to hear opinions of native speakers. How about batch foreign update or batch foreign modification? (Disclaimer: I'm not a native speaker either.) I think direct update sounds pretty good. Batch does not sound as good to me, since it doesn't clearly describe what makes this patch special as opposed to some other grouping of updates that happens to produce a speedup. 
I agree with Robert on that point. Another term that might be used is update pushdown, since we are pushing the whole update to the remote server instead of having the local server participate. Without looking at the patch, I don't have a strong opinion on whether that's better than direct update in this context. Update Pushdown is fine with me. If there are no objections of others, I'll change the name from Direct Update to Update Pushdown. Done. (I've left deparseDirectUpdateSql/deparseDirectDeleteSql as-is, though.) Other changes: * Address the comments from Eitoku-san. * Add regression tests. * Fix a bug, which fails to show the actual row counts in EXPLAIN ANALYZE for UPDATE/DELETE without a RETURNING clause. * Rebase to HEAD. Please find attached an updated version of the patch. Thanks, Best regards, Etsuro Fujita -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Hi Fujita-san, Here is a new review result from Eitoku-san. 2014-07-25 16:30 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: (2014/07/24 18:30), Shigeru Hanada wrote: I'm not sure that I understand your question correctly, but the reason for that is that foreign tables cannot have INSTEAD OF triggers. Now I see the reason, but then I worry (though it is unlikely to happen) about the case where a new trigger type might be added in the future. The code says that only BEFORE and AFTER triggers are unsafe for direct update, but it would be safer to code it so that any trigger other than an event trigger is unsafe for direct update. We found that this patch speeds up the DELETE case remarkably, as you describe above, but we saw less than a 2x speed-up in UPDATE cases. Do you have any numbers for UPDATE cases? Here is the result for an UPDATE case. Hmm, the performance gain in UPDATE cases seems similar to our results, except for planning times. In your environment the patch reduces planning time too, but we got longer planning times with your patch (only once in six trials did we get a shorter planning time than the average of the patched version). Could you try multiple runs in your environment? I think that the precise effect of this optimization for DELETE/UPDATE would depend on e.g. data, queries (incl. with or without RETURNING clauses) and server/network performance. Could you tell me this information about the UPDATE evaluation? I tried on CentOS 6.5 on VMware on a laptop with a Core i3 1.17GHz, 2.0GB memory and a single HDD, so the performance is poor. The SQL statements used for the performance test are quite simple: update 10 thousand rows at a time, and repeat for a different section of the table each of six times. The definition of the foreign table ft is the same as the one in your case. 
EXPLAIN ANALYZE VERBOSE UPDATE ft SET data = 'abcdefg' WHERE id >= 0 AND id < 1;
EXPLAIN ANALYZE VERBOSE UPDATE ft SET data = 'abcdefg' WHERE id >= 1 AND id < 2;
EXPLAIN ANALYZE VERBOSE UPDATE ft SET data = 'abcdefg' WHERE id >= 2 AND id < 3;
EXPLAIN ANALYZE VERBOSE UPDATE ft SET data = 'abcdefg' WHERE id >= 3 AND id < 4;
EXPLAIN ANALYZE VERBOSE UPDATE ft SET data = 'abcdefg' WHERE id >= 4 AND id < 5;
EXPLAIN ANALYZE VERBOSE UPDATE ft SET data = 'abcdefg' WHERE id >= 5 AND id < 6;

Some more random thoughts: * Naming of new behavior You named this optimization Direct Update, but I'm not sure that this is intuitive enough to express this behavior. I would like to hear the opinions of native speakers. Update push-down seems nice, according to the others. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Hi Fujita-san, My coworker Takaaki EITOKU reviewed the patch, and here are some comments from him. 2014-07-08 16:07 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: ... In the patch, postgresPlanForeignModify has been modified so that if, in addition to the above condition, the following conditions are satisfied, then the ForeignScan and ModifyTable nodes will work that way. - There are no local BEFORE/AFTER triggers. The reason the check ignores INSTEAD OF triggers here is that an INSTEAD OF trigger would prevent executing an UPDATE/DELETE statement against a foreign table at all, right? - In UPDATE it's safe to evaluate expressions to assign to the target columns on the remote server. IIUC, in addition to the target expressions, the whole WHERE clause should be safe to push down. That is obviously required, but mentioning it in the documentation and source comments would help users and developers. Here is a simple performance test.

On the remote side:

postgres=# create table t (id serial primary key, inserted timestamp default clock_timestamp(), data text);
CREATE TABLE
postgres=# insert into t(data) select random() from generate_series(0, 9);
INSERT 0 10
postgres=# vacuum t;
VACUUM

On the local side:

postgres=# create foreign table ft (id integer, inserted timestamp, data text) server myserver options (table_name 't');
CREATE FOREIGN TABLE

Unpatched:

postgres=# explain analyze verbose delete from ft where id < 1;
                                               QUERY PLAN
---
 Delete on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual time=1275.255..1275.255 rows=0 loops=1)
   Remote SQL: DELETE FROM public.t WHERE ctid = $1
   ->  Foreign Scan on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual time=1.180..52.095 rows= loops=1)
         Output: ctid
         Remote SQL: SELECT ctid FROM public.t WHERE ((id < 1)) FOR UPDATE
 Planning time: 0.112 ms
 Execution time: 1275.733 ms
(7 rows)

Patched (note that the DELETE command has been pushed down):

postgres=# explain analyze verbose delete from ft where id < 1;
                                               QUERY PLAN
---
 Delete on public.ft 
(cost=100.00..162.32 rows=910 width=6) (actual time=0.006..0.006 rows=0 loops=1)
   ->  Foreign Scan on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual time=0.001..0.001 rows=0 loops=1)
         Output: ctid
         Remote SQL: DELETE FROM public.t WHERE ((id < 1))
 Planning time: 0.101 ms
 Execution time: 8.808 ms
(6 rows)

We found that this patch speeds up the DELETE case remarkably, as you describe above, but we saw less than a 2x speed-up in UPDATE cases. Do you have any numbers for UPDATE cases?

Some more random thoughts:

* Naming of new behavior
You named this optimization Direct Update, but I'm not sure that this is intuitive enough to express this behavior. I would like to hear the opinions of native speakers.

* Macros for # of fdw_private elements
In postgres_fdw.c you defined MaxFdwScanFdwPrivateLength and MinFdwScanFdwPrivateLength for determining the mode, but the number of fdw_private elements is not a ranged value but an enumerated one (I mean that fdw_private holds 3 or 7 elements, never 4~6), so Max and Min seem inappropriate as prefixes. IMO those macros should be named so that the names represent the behavior, or a macro should be defined to determine the mode, say IS_SHIPPABLE(foo).

* Comparison of macros
Comparisons against MaxFdwScanFdwPrivateLength and MinFdwScanFdwPrivateLength should use == instead of >= or <= in order to detect unexpected values. Adding an assert macro seems good too.

-- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.5] Custom Plan API
Kaigai-san, 2014-07-14 22:18 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp: Hanada-san, Thanks for your checking. The attached v4 patch is rebased on the latest master branch. Indeed, the merge conflict was trivial. Updates from the v3 are below: - custom-plan.sgml was renamed to custom-plan-provider.sgml - fixed up the comments in pg_custom_plan_provider.h that mentioned an old field name - applied your patch to fix up typos (thanks so much!) - put costs off on the EXPLAIN command in the regression test of the ctidscan extension. Checked, but the patch fails the sanity-check test; you need to modify the expected file of the test. Nothing to comment on the design and implementation from your viewpoint any more? As far as I can tell, the design seems reasonable. After a fix for the small issue above, I'll move the patch status to Ready for Committer. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.5] Custom Plan API
Kaigai-san, 2014-07-15 21:37 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com: Sorry, expected result of sanity-check test was not updated on renaming to pg_custom_plan_provider. The attached patch fixed up this point. I confirmed that all regression tests passed, so I marked the patch as Ready for committer. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.5] Custom Plan API
Kaigai-san, The v3 patch had a conflict in src/backend/tcop/utility.c with the newly added IMPORT FOREIGN SCHEMA statement, but it was trivial. 2014-07-08 20:55 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:

* The system catalog name was changed to pg_custom_plan_provider; that reflects the role of the object being defined.
ISTM that doc/src/sgml/custom-plan.sgml should also be renamed to custom-plan-provider.sgml.

* Also, the prefix of its variable names was changed to cpp; that means custom-plan-provider.
A custclass remains in a comment in src/include/catalog/pg_custom_plan_provider.h.

* The syntax also reflects better what the command does. The new syntax to define a custom plan provider is: CREATE CUSTOM PLAN PROVIDER cpp_name FOR cpp_class HANDLER cpp_function;

* Added custom-plan.sgml to introduce the interface functions defined for path/plan/exec methods.

* The FinalizeCustomPlan() callback was simplified to support scan (and join in the future) as the starting point. As long as only scan/join is required, there is no need to control the paramids bitmap in an arbitrary way.

* Unnecessary forward declarations in relation.h and plannode.h were removed, but a few structures still need forward declarations.

* Fixed the typos that were pointed out.

Check. I found some typos and a wording, datatype, which is not used anywhere else. Please refer to the attached patch. -- Shigeru HANADA fix_typo_in_v3.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrect comment in postgres_fdw.c
Fujita-san, 2014-07-11 18:22 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: I think the following comment for store_returning_result() in postgres_fdw.c is not right. /* PGresult must be released before leaving this function. */ I think the PGresult should not be released before leaving that function *on success*. (I guess the comment was copied and pasted from get_remote_estimate().) +1 for just removing the comment, because the header comment clearly mentions the necessity of releasing the PGresult. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inherit support for foreign tables
Hi Fujita-san, Sorry for leaving this thread for a long time. 2014-06-24 16:30 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: (2014/06/23 18:35), Ashutosh Bapat wrote: Hi, Selecting tableoid on the parent causes an error, ERROR: cannot extract system attribute from virtual tuple. The foreign table has an OID which can be reported as tableoid for the rows coming from that foreign table. Do we want to do that? No. I think it's a bug. I'll fix it. IIUC, you mean that tableoid can't be retrieved when a foreign table is accessed via its parent table, but that sounds strange to me, because one of the main purposes of tableoid is to determine the actual source table in appended results. Am I missing the point? -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.5] Custom Plan API
/relation.h seem unnecessary. Other headers might have the same issue.

* Unnecessary externing of replace_nestloop_params()
replace_nestloop_params() is extern-ed but it's never called outside createplan.c.

* Externing fix_scan_expr()
If it's necessary for all custom plan providers to call fix_scan_expr (via the fix_scan_list macro), couldn't it be done in set_plan_refs() before calling SetCustomPlanRef()?

* What does T_CustomPlanMarkPos mean?
It's not clear to me when CustomPlanMarkPos works. Is it for a custom plan provider which supports marking a position and rewinding to that position, and ctidscan just lacks the capability to do that, so it is not used anywhere?

* Unnecessary changes in allpaths.c
Some comments about Subquery and CTE are changed (perhaps) accidentally.

* Typos
  * planenr -> planner, implements -> implement, in create_custom_plan.sgml
  * CustomScan in nodeCustom.h should be CustomPlan?
  * delivered -> derived, in src/backend/optimizer/util/pathnode.c

* A document "Writing a Custom Plan Provider" is not provided
Custom plan provider authors would (and I DO!) hope for a document about writing a custom plan provider.

Regards, 2014-06-17 23:12 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp: Hanada-san, Thanks for your checks. I overlooked those points when I submitted the patch, sorry. The attached one is a revised one for the documentation stuff and contrib/Makefile. Thanks, 2014-06-16 17:29 GMT+09:00 Shigeru Hanada shigeru.han...@gmail.com: Kaigai-san, I've just applied the v1 patch and tried build and install, but I found two issues: 1) contrib/ctidscan is not automatically built/installed because it's not described in contrib/Makefile. Is this expected behavior? 2) I got the error message below when building the documentation. $ cd doc/src/sgml $ make openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . 
-d stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml openjade:catalogs.sgml:2525:45:X: reference to non-existent ID SQL-CREATECUSTOMPLAN make: *** [HTML.index] Error 1 make: *** Deleting file `HTML.index' I'll review another part of the patch, including the design. 2014-06-14 10:59 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp: According to the discussion upthread, I revised the custom-plan patch to focus on regular relation scan but no join support right now, and to support DDL command to define custom-plan providers. Planner integration with custom logic to scan a particular relation is enough simple, unlike various join cases. It's almost similar to what built-in logic are doing now - custom-plan provider adds a path node with its cost estimation if it can offer alternative way to scan referenced relation. (in case of no idea, it does not need to add any paths) A new DDL syntax I'd like to propose is below: CREATE CUSTOM PLAN name FOR class PROVIDER function_name; name is as literal, put a unique identifier. class is workload type to be offered by this custom-plan provider. scan is the only option right now, that means base relation scan. function_name is also as literal; it shall perform custom-plan provider. A custom-plan provider function is assumed to take an argument of internal type to deliver a set of planner information that is needed to construct custom-plan pathnode. In case of scan class, pointer towards an customScanArg object shall be delivered on invocation of custom-plan provider. 
typedef struct {
    uint32         custom_class;
    PlannerInfo   *root;
    RelOptInfo    *baserel;
    RangeTblEntry *rte;
} customScanArg;

In case the custom-plan provider function being invoked thinks it can offer an alternative scan path on the relation of baserel, the things to do are: (1) construct a CustomPath (or an inherited data type) object with a table of callback function pointers, (2) put in its own cost estimation, and (3) call add_path() to register this path as an alternative one. Once the custom path is chosen by the query planner, its CreateCustomPlan callback is called to populate a CustomPlan node based on the path node. It also has a table of callback function pointers to handle the planner's various jobs in setrefs.c and so on. Similarly, its CreateCustomPlanState callback is called to populate a CustomPlanState node based on the plan node. It also has a table of callback function pointers to handle the executor's various jobs during query execution. Most of the callback design is unchanged from the prior proposal in the v9.4 development cycle; however, here are a few changes. * CustomPlan now inherits Scan, and CustomPlanState now inherits ScanState. Because some useful routines for implementing scan logic, like ExecScan, expect the state node to have ScanState as its base type, this is kinder to the extension side. (I'd like to avoid each extension reinventing ExecScan by copy and paste!) I'm not sure whether it should be a union of Join in the future, however
Re: [HACKERS] pg_stat directory and pg_stat_statements
Fujii-san, I found the right description in REL9_3_STABLE branch, thanks for pointing out the commit. Sorry for noise. 2014-06-18 12:39 GMT+09:00 Fujii Masao masao.fu...@gmail.com: On Tue, Jun 17, 2014 at 2:11 PM, Shigeru Hanada shigeru.han...@gmail.com wrote: Fujii-san, I agree not to backpatch, but I noticed that the 9.3 document about stats collector doesn't mention $PGDATA/pg_stat. http://www.postgresql.org/docs/current/static/monitoring-stats.html It just says: When the server shuts down, a permanent copy of the statistics data is stored in the global subdirectory, so that statistics can be retained across server restarts. I'm not sure whether we should modify the 9.3 document at the moment, but actually the description made me confused a bit. Could you check 8dc90b9c4c45fa30a8f59313e21d294529ef7182 in REL9_3_STABLE branch? You can see that I added the description of pg_stat directory into the document in 9.3. Regards, -- Fujii Masao -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.5] Custom Plan API
: On 8 May 2014 22:55, Tom Lane t...@sss.pgh.pa.us wrote: We're past the prototyping stage and into productionising what we know works, AFAIK. If that point is not clear, then we need to discuss that first. OK, I'll bite: what here do we know works? Not a damn thing AFAICS; it's all speculation that certain hooks might be useful, and speculation that's not supported by a lot of evidence. If you think this isn't prototyping, I wonder what you think *is* prototyping. My research contacts advise me of this recent work http://www.ntu.edu.sg/home/bshe/hashjoinonapu_vldb13.pdf and also that they expect a prototype to be ready by October, which I have been told will be open source. So there are at least two groups looking at this as a serious option for Postgres (not including the above paper's authors). That isn't *now*, but it is at least a time scale that fits with acting on this in the next release, if we can separate out the various ideas and agree we wish to proceed. I'll submerge again... Through the discussion last week, our minimum consensus are: 1. Deregulated enhancement of FDW is not a way to go 2. Custom-path that can replace built-in scan makes sense as a first step towards the future enhancement. Its planner integration is enough simple to do. 3. Custom-path that can replace built-in join takes investigation how to integrate existing planner structure, to avoid (3a) reinvention of whole of join handling in extension side, and (3b) unnecessary extension calls towards the case obviously unsupported. So, I'd like to start the (2) portion towards the upcoming 1st commit-fest on the v9.5 development cycle. Also, we will be able to have discussion for the (3) portion concurrently, probably, towards 2nd commit-fest. Unfortunately, I cannot participate PGcon/Ottawa this year. Please share us the face-to-face discussion here. 
Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com -- KaiGai Kohei kai...@kaigai.gr.jp -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat directory and pg_stat_statements
Fujii-san, I agree not to backpatch, but I noticed that the 9.3 document about the stats collector doesn't mention $PGDATA/pg_stat. http://www.postgresql.org/docs/current/static/monitoring-stats.html It just says: When the server shuts down, a permanent copy of the statistics data is stored in the global subdirectory, so that statistics can be retained across server restarts. I'm not sure whether we should modify the 9.3 document at the moment, but the description actually confused me a bit. 2014-05-29 12:22 GMT+09:00 Fujii Masao masao.fu...@gmail.com: On Thu, May 29, 2014 at 4:55 AM, Tomas Vondra t...@fuzzy.cz wrote: On 28.5.2014 19:52, Fujii Masao wrote: On Thu, May 29, 2014 at 12:37 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, May 28, 2014 at 7:01 AM, Fujii Masao masao.fu...@gmail.com wrote: But the pg_stat_statements file is still saved under $PGDATA/global. Is this intentional or just an oversight? I think it's an oversight. OK, patch attached. I'm afraid that it's not okay to change the file layout in $PGDATA at this beta1 stage because that change basically seems to need initdb. Otherwise something like a no such file or directory error can happen. But in this case what we need to change is only the location of the pg_stat_statements permanent stats file. So, without initdb, the server will not be able to find the pg_stat_statements stats file, but this is not so harmful. The only problem is that the pg_stat_statements stats which were collected in the past would disappear. OTOH, the server can keep running successfully from then on, and no critical data will disappear. Therefore I think we can commit this patch even at beta1. Thoughts? For HEAD, probably yes. But what about backpatching 9.3? I think No. So we should not backpatch this to 9.3. 
Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgres_fdw and connection management
2014-05-24 0:09 GMT+09:00 Sandro Santilli s...@keybit.net: Indeed I tried DISCARD ALL in the hope it would have helped, so I find your idea of allowing extensions to register a hook there good. Still, I'd like the FDW handler itself to possibly be configured to disable the pool completely as a server-specific configuration. Connection management seems an FDW-specific feature to me. How about adding an FDW option, say pool_connection=true|false, to postgres_fdw, which would allow per-server configuration? -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] fix worker_spi to run as non-dynamic background worker
Hi all, I noticed that contrib/worker_spi can't run as a non-dynamic background worker (IOW, loaded via shared_preload_libraries), because of an uninitialized bgw_notify_pid. I got the log lines below when starting PostgreSQL with shared_preload_libraries = 'worker_spi'.

$ pg_ctl start -w
waiting for server to start...
LOG: registering background worker worker 1
LOG: background worker worker 1: only dynamic background workers can request notification
LOG: registering background worker worker 2
LOG: background worker worker 2: only dynamic background workers can request notification
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory pg_log.

The attached patch fixes this issue. Please apply it onto HEAD and 9.4. -- Shigeru HANADA fix_worker_spi.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.5] Custom Plan API
2014-05-07 18:06 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com: Let me randomly list up the things to be clarified / developed. * Join replacement by FDW; We still don't have consensus about join replacement by FDW. Probably, it will be designed primarily for remote-join implementation; however, the things to do are similar. We may need to revisit Hanada-san's proposition from the past. I can't recall the details right now, but the reason I gave up was introducing the ForeignJoinPath node, IIRC. I'll revisit the discussion and my proposal. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
2014-02-26 16:46 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com: Just to come back to this- the other two contrib module patches, at least as I read over their initial submission, were *also* patching portions of backend code which it was apparently discovered that they needed. That's a good bit of my complaint regarding this approach. ?? Sorry, are you still negative on the portions of the backend patched by the part-2 and part-3 patches?? Perhaps he meant that the patches should be separated on a feature basis. IMO, if exposing those utilities is essential for the Custom Scan API in a practical sense, IOW to implement and maintain an extension which uses the Custom Scan API, they should go into the first patch. IIUC the two contrib modules are also PoCs for the API, so the part-2/3 patches should contain only changes to contrib and its documentation. Besides that, some typo fixes are mixed into the part-2 patch. They should go into the part-1 patch, where the typos were introduced. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
2014-02-26 17:31 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com: IIUC, his approach was integration of join-pushdown within the FDW APIs; however, it does not mean the idea of remote join was rejected. I believe it is still one of our killer features if we can revise the implementation. Hanada-san, could you explain why your proposition was rejected before? IIUC it was not rejected, just returned-with-feedback. We could not reach consensus about how join push-down should work. A discussion point was multiple paths for a joinrel, but it was not such a serious point. Here is the tail of the thread. http://www.postgresql.org/message-id/4f058241.2000...@enterprisedb.com Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Hmm, so you're saying that the FDW function needs to be able to return multiple paths for a single joinrel. Fair enough, and that's not specific to remote joins. Even a single-table foreign scan could be implemented differently depending on whether you prefer fast-start or cheapest total. ... or ordered vs unordered, etc. Yeah, good point, we already got this wrong with the PlanForeignScan API. Good thing we didn't promise that would be stable. The discussion withered down here... I think the advice to Shigeru-san is to work on the API. We didn't reach a consensus on what exactly it should look like, but at least you need to be able to return multiple paths for a single joinrel, and should look at fixing the PlanForeignScan API to allow that too. And I gave up for lack of time, IOW, in order to finish more fundamental portions of the FDW API. http://www.postgresql.org/message-id/4f39fc1a.7090...@gmail.com -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Hi Kaigai-san, 2014-02-25 13:28 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com: The reason why I asked the question above is that I haven't been 100% certain about its usage. Indeed, semifactors has limited usage, but it is quite easy for an extension to reproduce it later (using clauselist_selectivity) if the extension wants this factor. So, I agree with removing the semifactors here. Agreed. It would be nice to mention how to obtain semifactors for people who want to implement advanced join overriding. mergeclause_list and param_source_rels seem a little easier to use, but anyway it should be documented how to use those parameters. The mergeclause_list might not be sufficient for extensions to determine whether their own mergejoin is applicable here. See the comment below; it is at the head of select_mergejoin_clauses. | * *mergejoin_allowed is normally set to TRUE, but it is set to FALSE if | * this is a right/full join and there are nonmergejoinable join clauses. | * The executor's mergejoin machinery cannot handle such cases, so we have | * to avoid generating a mergejoin plan. (Note that this flag does NOT | * consider whether there are actually any mergejoinable clauses. This is | * correct because in some cases we need to build a clauseless mergejoin. | * Simply returning NIL is therefore not enough to distinguish safe from | * unsafe cases.) | It says that mergejoin_clause == NIL is not a sufficient check to determine whether the mergejoin logic is applicable on the target join. So, either of them is probably an option for an extension that tries to implement Perhaps you mean both of them? its own mergejoin logic; (1) putting both mergejoin_allowed and mergeclause_list as arguments of the hook, or (2) re-defining select_mergejoin_clauses() as an extern function to reproduce the variables on demand. Which one is preferable? I prefer (1), because exposing the inside of the planner might block changes to those internal functions. 
If that information is enough (at the moment) for overriding merge join for a CSP, let's provide it as parameters.
-- Shigeru HANADA
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Hi Kaigai-san,

2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp: (1) Interface to add alternative paths in addition to built-in join paths

I found that create_custom_path is not used at all in your patch. I revised postgresql_fdw.c to use it like this (the '&' characters below were lost in the archive and are restored):

...
/* Create join information which is stored as private information. */
memset(&jinfo, 0, sizeof(PgRemoteJoinInfo));
jinfo.fdw_server_oid = o_server_oid;
jinfo.fdw_user_oid = o_user_oid;
jinfo.relids = joinrel->relids;
jinfo.jointype = jointype;
jinfo.outer_rel = o_relinfo;
jinfo.inner_rel = i_relinfo;
jinfo.remote_conds = j_remote_conds;
jinfo.local_conds = j_local_conds;

/* OK, make a CustomScan node to run the remote join */
cpath = create_customscan_path(root, joinrel,
                               0, 0, 0, /* estimate later */
                               NIL, required_outer, "postgres-fdw",
                               0, packPgRemoteJoinInfo(&jinfo));
estimate_remote_join_cost(root, cpath, &jinfo, sjinfo);
add_path(joinrel, &cpath->path);
...

This seems to work fine. Is this the right approach? If so, this portion would be a good example for authors of custom scan providers of how to replace a local join with a custom scan. One thing I worry about is the case where you've intentionally avoided calling create_customscan_path.
-- Shigeru HANADA
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Hi Kaigai-san,

Sorry to leave the thread for a while.

2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp: (1) Interface to add alternative paths in addition to built-in join paths This patch adds add_join_path_hook to add_paths_to_joinrel to allow extensions to provide alternative scan paths in addition to the built-in join paths. The custom-scan path being added is assumed to perform a scan over a (virtual) relation that is the result set of the joined relations. My concern is about the arguments to be passed. This hook is declared as follows:

/* Hook for plugins to add custom join paths, in addition to default ones */
typedef void (*add_join_path_hook_type)(PlannerInfo *root,
                                        RelOptInfo *joinrel,
                                        RelOptInfo *outerrel,
                                        RelOptInfo *innerrel,
                                        JoinType jointype,
                                        SpecialJoinInfo *sjinfo,
                                        List *restrictlist,
                                        List *mergeclause_list,
                                        SemiAntiJoinFactors *semifactors,
                                        Relids param_source_rels,
                                        Relids extra_lateral_rels);
extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;

The arguments up to restrictlist should clearly be passed to extensions, because they are also arguments of add_paths_to_joinrel(). However, I'm not 100% certain how the other arguments should be handled. Probably it makes sense to pass param_source_rels and extra_lateral_rels, to check whether the path is sensible for parameterized paths. On the other hand, I doubt whether mergeclause_list is useful to deliver. (It may make sense if someone tries to implement their own merge-join implementation??) I'd like to see ideas to improve the current interface specification.

I've read the code path to add custom joins again, and felt that providing semifactors is not necessary for the first cut, because it is used only in initial_cost_nestloop (final_cost_nestloop receives semifactors but does not use them), and external modules would not become that sophisticated before the 9.5 development cycle. The matter seems complex enough to postpone determining whether it's essential for add_join_path_hook.
Do you have any concrete use case for the parameter? mergeclause_list and param_source_rels seem a little easier to use, but anyway it should be documented how to use those parameters. IMHO, a minimal interface seems better than a fully-fledged but half-baked one, especially in the first cut.
-- Shigeru HANADA
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp: Folks, Let me remind you of the custom-scan patches; they are a basis for the remote join feature of postgres_fdw, cache-only scan, the (upcoming) GPU acceleration feature, and various alternative ways to scan/join relations. Unfortunately, we could have only a small amount of discussion in this commit fest, even though Hanada-san volunteered to move the patches into the ready-for-committer state at the CF-Nov.

I found some cosmetic flaws and a .gitignore omission in the patches. Please see the attached patch for details.
-- Shigeru HANADA

custom_scan_cosmetic_fix.patch
Description: Binary data
Re: [HACKERS] inherit support for foreign tables
Hi Horiguchi-san,

2014-02-18 19:29 GMT+09:00 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp: Could you think of any use cases in which we are happy with ALTER TABLE's inheritance-tree walking? IMHO, ALTER FOREIGN TABLE always comes with some change of the data source, so implicit invocation of such commands should be turned off by default.

Imagine a case where the foreign data source has attributes (A, B, C, D) but the foreign tables and their parent were defined as (A, B, C). If a user wants to use D as well, ALTER TABLE parent ADD COLUMN D type would be useful (rather, necessary?) to keep consistency. Changing a data type to a compatible one (i.e., int to numeric, varchar(n) to text) or adding a CHECK/NOT NULL constraint would also be possible.
-- Shigeru HANADA
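The consistency-keeping case described above might look like the following sketch (the server name fs and the CSV options are illustrative, not taken from any patch):

```sql
-- Remote data source originally has columns (a, b, c); local definitions match.
CREATE TABLE parent (a int, b int, c int);
CREATE FOREIGN TABLE child () INHERITS (parent)
    SERVER fs OPTIONS (filename '/path/to/data.csv', format 'csv');

-- The remote source later gains a column d; propagating it through the
-- parent keeps the parent and the foreign child aligned with one command.
ALTER TABLE parent ADD COLUMN d int;
```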
Re: [HACKERS] inherit support for foreign tables
Hi Fujita-san,

Thanks for the review.

2014-02-10 21:00 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: (2014/02/07 21:31), Etsuro Fujita wrote: So, I've modified the patch so that we continue to disallow SET STORAGE on a foreign table *in the same manner as before*, but, as your patch does, allow it on an inheritance hierarchy that contains foreign tables, with the semantics that we quietly ignore the foreign tables and apply the operation to the plain tables, by modifying the ALTER TABLE simple recursion mechanism. Attached is the updated version of the patch.

I'm not sure that ALTER TABLE against a parent table should affect its descendants even when some of them are foreign tables. I think the rule should be simple enough for users to understand; of course it should also be consistent and backward compatible. If a foreign table can be modified through the inheritance tree, this kind of inconsistency can arise:

1) create a foreign table as a child of an ordinary table
2) run ALTER TABLE on the parent; the foreign table is also changed
3) remove the foreign table from the inheritance tree with ALTER TABLE child NO INHERIT parent
4) now we can't do the same thing as 2), because it is not a child anymore

So IMO we should determine which ALTER TABLE features are allowed for foreign tables, and allow them regardless of the recursion. Comments?
-- Shigeru HANADA
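The four-step inconsistency can be sketched as follows (the server name is illustrative):

```sql
-- 1) create a foreign table as a child of an ordinary table
CREATE TABLE parent (id int);
CREATE FOREIGN TABLE child () INHERITS (parent) SERVER fs;

-- 2) ALTER the parent; the foreign child is changed too
ALTER TABLE parent ADD COLUMN note text;

-- 3) detach the foreign table from the inheritance tree
ALTER TABLE child NO INHERIT parent;

-- 4) the same ALTER no longer reaches the former child,
--    so the two definitions silently diverge from here on
ALTER TABLE parent ADD COLUMN extra int;
```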
Re: [HACKERS] inherit support for foreign tables
(2014/01/27 21:52), Shigeru Hanada wrote: 2014-01-27 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: While still reviewing this patch, I feel this patch has given enough consideration to interactions with other commands, but I found the following incorrect(?) behavior:

postgres=# CREATE TABLE product (id INTEGER, description TEXT);
CREATE TABLE
postgres=# CREATE FOREIGN TABLE product1 () INHERITS (product) SERVER fs OPTIONS (filename '/home/foo/product1.csv', format 'csv');
CREATE FOREIGN TABLE
postgres=# ALTER TABLE product ALTER COLUMN description SET STORAGE EXTERNAL;
ERROR: product1 is not a table or materialized view

ISTM the ALTER TABLE simple recursion mechanism (i.e., ATSimpleRecursion()) should be modified for the ALTER COLUMN SET STORAGE case. I just wanted to quickly tell you this so you can take time to consider it. Thanks for the review. It must be an oversight, so I'll fix it up soon.

It turned out to be a little more complex than I expected. Currently foreign tables reject ALTER TABLE SET STORAGE with a message like the one below, because foreign tables don't have storage in the sense of PG heap tables.

ERROR: pgbench1_accounts_c1 is not a table or materialized view

At the moment we don't use attstorage for foreign tables, so allowing SET STORAGE on foreign tables never introduces a visible change except in the \d+ output for foreign tables. But IMO such an operation should not be allowed, because users would be confused. So I changed ATExecSetStorage() to skip foreign tables. This allows us to issue ALTER TABLE SET STORAGE against an ordinary table at an upper level of an inheritance tree, but it has an effect only on the ordinary tables in the tree. This also allows a direct ALTER FOREIGN TABLE SET STORAGE against a foreign table, but the command is silently ignored. SET STORAGE support for foreign tables is not documented, because it might confuse users.

With the attached patch, SET STORAGE against a wrong relation produces a message like this. Is it confusing to mention foreign tables here?
ERROR: pgbench1_accounts_pkey is not a table, materialized view, or foreign table

Another idea is to accept SET STORAGE against foreign tables even though it has no effect. That would let tables and foreign tables carry the same storage option for the same column, which might be more understandable behavior for users. Thoughts?

FYI, here are lists of the ALTER TABLE features which are allowed/denied for foreign tables.

Allowed
- ADD|DROP COLUMN
- SET DATA TYPE
- SET|DROP NOT NULL
- SET STATISTICS
- SET (attribute_option = value)
- RESET (attribute_option)
- SET|DROP DEFAULT
- ADD table_constraint
- ALTER CONSTRAINT
- DROP CONSTRAINT
- [NO] INHERIT parent_table
- OWNER
- RENAME
- SET SCHEMA
- SET STORAGE (moved here by the attached patch)

Denied
- ADD table_constraint_using_index
- VALIDATE CONSTRAINT
- DISABLE|ENABLE TRIGGER
- DISABLE|ENABLE RULE
- CLUSTER ON
- SET WITHOUT CLUSTER
- SET WITH|WITHOUT OIDS
- SET (storage_parameter = value)
- RESET (storage_parameter)
- OF type
- NOT OF
- SET TABLESPACE
- REPLICA IDENTITY

-- Shigeru HANADA

foreign_inherit-v2.patch
Description: Binary data
Re: [HACKERS] [Review] inherit support for foreign tables
2014-01-27 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: (2014/01/25 11:27), Shigeru Hanada wrote: Yeah, the consistency is essential for its ease of use. But I'm not sure that inherited stats ignoring foreign tables are actually useful for query optimization. What I think about the consistency is: a) the ANALYZE command with no table names skips ANALYZEing inheritance trees that include at least one foreign table as a child, but b) the ANALYZE command with a table name indicating an inheritance tree that includes foreign tables does compute the inherited stats, in the same way as for an inheritance tree consisting only of ordinary tables, by acquiring sample rows from each foreign table on the far side.

b) sounds a little complex to understand or explain. Is it too big a change to make the ANALYZE command handle foreign tables even when no table name is specified? IIRC, a performance issue was the reason for excluding foreign tables from auto-analyze processing; ANALYZEing a large database containing huge local data also takes a long time. One idea to avoid unexpectedly long processing is to add an option to foreign tables to mark them as not auto-analyzable. Thoughts?
-- Shigeru HANADA
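The opt-out idea might look like a per-table option such as the sketch below; note that the option name auto_analyze is purely hypothetical, made up here only to illustrate the proposal:

```sql
-- Hypothetical: exempt one foreign table from automatic analyze
ALTER FOREIGN TABLE remote_orders
    OPTIONS (ADD auto_analyze 'false');
```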
Re: [HACKERS] inherit support for foreign tables
2014-01-27 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: While still reviewing this patch, I feel this patch has given enough consideration to interactions with other commands, but I found the following incorrect(?) behavior:

postgres=# CREATE TABLE product (id INTEGER, description TEXT);
CREATE TABLE
postgres=# CREATE FOREIGN TABLE product1 () INHERITS (product) SERVER fs OPTIONS (filename '/home/foo/product1.csv', format 'csv');
CREATE FOREIGN TABLE
postgres=# ALTER TABLE product ALTER COLUMN description SET STORAGE EXTERNAL;
ERROR: product1 is not a table or materialized view

ISTM the ALTER TABLE simple recursion mechanism (i.e., ATSimpleRecursion()) should be modified for the ALTER COLUMN SET STORAGE case. I just wanted to quickly tell you this so you can take time to consider it.

Thanks for the review. It must be an oversight, so I'll fix it up soon.
-- Shigeru HANADA
Re: [HACKERS] [Review] inherit support for foreign tables
Hi Fujita-san,

Thanks for the review.

2014/1/23 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: Shigeru Hanada wrote: Though this would be debatable, in the current implementation, constraints defined on a foreign table (now only NOT NULL and CHECK are supported) are not enforced during INSERT or UPDATE executed against foreign tables. This means that retrieved data might violate the constraints defined on the local side. This is a debatable issue because integrity of data is important for a DBMS, but in the first cut it is just documented as a note. I agree with you, but shouldn't we introduce a new keyword such as ASSERTIVE or something in 9.4, in preparation for the enforcement of constraints on the local side in a future release? What I'm concerned about is that, otherwise, users will have to rewrite those DDL queries at that point. No?

In the original thread, whether the new syntax is necessary (maybe ASSERTIVE will not be used, though) is under discussion. Anyway, your point, minimizing the DDL rewrites users have to do, is important. Could you post your review result and/or your opinion as a reply to the original thread, so that other people interested in this feature can join the discussion?

Because I don't see a practical case for having a foreign table as a parent, and it avoids a problem with recursive ALTER TABLE operations, foreign tables can't be a parent. I agree with you on that point.

For other recursively processed commands such as ANALYZE, foreign tables at the leaves of the inheritance tree are ignored. I'm not sure that in the processing of the ANALYZE command we should ignore child foreign tables. It seems to me that the recursive processing is not that hard. Rather, what I'm concerned about is that if the recursive processing is allowed, then autovacuum will probably have to access foreign tables on the far side in order to acquire those sample rows. That might be undesirable from the viewpoint of security or of efficiency.
As you say, it's not difficult to do recursive ANALYZE including foreign tables. The reason why ANALYZE ignores descendant foreign tables is consistent behavior: at the moment, ANALYZE ignores top-level (non-child) foreign tables when no table name is given, so if we want to ANALYZE foreign tables we need to specify them explicitly. I think it's not so bad to show a WARNING or INFO message about ignoring foreign tables, including those which are not children.

--- a/doc/src/sgml/ref/create_foreign_table.sgml
+++ b/doc/src/sgml/ref/create_foreign_table.sgml
@@ -22,6 +22,7 @@ CREATE FOREIGN TABLE [ IF NOT EXISTS ] replaceable class=PARAMETERtable_name
 replaceable class=PARAMETERcolumn_name/replaceable replaceable class=PARAMETERdata_type/replaceable [ OPTIONS ( replaceable class=PARAMETERoption/replaceable 'replaceable class=PARAMETERvalue/replaceable' [, ... ] ) ] [ COLLATE replaceablecollation/replaceable ] [ replaceable class=PARAMETERcolumn_constraint/replaceable [ ... ] ] [, ... ] ] )
+[ INHERITS ( replaceableparent_table/replaceable [, ... ] ) ]
 SERVER replaceable class=parameterserver_name/replaceable [ OPTIONS ( replaceable class=PARAMETERoption/replaceable 'replaceable class=PARAMETERvalue/replaceable' [, ... ] ) ]
@@ -159,6 +160,17 @@ CREATE FOREIGN TABLE [ IF NOT EXISTS ] replaceable class=PARAMETERtable_name
 /varlistentry
 varlistentry
+ termreplaceable class=PARAMETERparent_table/replaceable/term
+ listitem
+ para
+ The name of an existing table or foreign table from which the new foreign
+ table automatically inherits all columns, see
+ xref linkend=ddl-inherit for details of table inheritance.
+ /para
+ /listitem
+ /varlistentry

Correct? I think we should not allow a foreign table to be a parent. Oops, good catch.
-- Shigeru HANADA
Re: [HACKERS] inherit support for foreign tables
Thanks for the comments.

2014/1/21 KaiGai Kohei kai...@ak.jp.nec.com: In addition, an idea which I can't throw away is to assume all constraints defined on foreign tables to be ASSERTIVE. Foreign tables are potentially in danger of holding wrong data, through updates to the source data made not through the foreign tables. This is not specific to one FDW, so IMO constraints defined on foreign tables are basically ASSERTIVE. Of course PG can try to keep the data correct, but somebody might always break it.

Does it make sense to apply an assertive CHECK constraint as a qual of the ForeignScan, to filter out tuples with violating values on the local side, as the row-level security feature does? That enables handling a situation where the planner expects only clean tuples to be returned but the FDW driver is unable to prevent anomalies. Probably this additional check could be turned on/off on the fly, if the FDW driver had a way to inform the core system of its capability, like FDW_CAN_ENFORCE_CHECK_CONSTRAINT, which would tell the planner to skip local checks.

Hmm, IIUC you mean that local users can't (or don't need to) know that data which violates the local constraints exists on the remote side. Applying constraints to data which is modified through the FDW would be necessary as well. In that design, the FDW is a bidirectional filter which provides these features:

1) Don't push wrong data into the remote data source, by applying local constraints to the result of the modifying query executed on local PG. This is not a perfect filter, because remote constraints aren't mapped automatically or perfectly (imagine constraints which are available remotely but are not supported in PG).
2) Don't retrieve wrong data from the remote side into local PG, by applying local constraints.

I have a concern about consistency. It is not supported yet, but let's think of aggregate push-down invoked by the query below.
SELECT count(*) FROM remote_table;

If this query were fully pushed down, the result would be the number of records that exist on the remote side, but it would be the number of valid records when we don't push down the aggregate. This would confuse users.

Besides CHECK constraints, currently NOT NULL constraints are virtually ASSERTIVE (not enforcing). Should that also be noted explicitly? Backward compatibility... Yep, backward compatibility (especially changes visible to users) should be minimal, ideally zero. NOT NULL [ASSERTIVE] might be an option.

Treating [ASSERTIVE | NOT ASSERTIVE] like DEFERRABLE, and allowing ASSERTIVE only for foreign tables? It makes sense, though we need to consider exclusiveness. But it needs to default to ASSERTIVE on foreign tables, and NOT ASSERTIVE (meaning enforced) on others. Isn't it too complicated?

CREATE FOREIGN TABLE foo (
    id int NOT NULL ASSERTIVE CHECK (id > 1) ASSERTIVE,
    ...
    CONSTRAINT chk_foo_name_upper CHECK (upper(name) = name) ASSERTIVE
) SERVER server;

BTW, I noticed that this is like push-down-able expressions in JOIN/WHERE. Do we need to check that a CHECK constraint defined on a foreign table contains only expressions which have the same semantics as on the remote side (in practice, built-in and immutable)?
-- Shigeru HANADA
Re: [HACKERS] inherit support for foreign tables
2013/11/18 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: I think it's been previously proposed that we have some version of a CHECK constraint that effectively acts as an assertion for query optimization purposes, but isn't actually enforced by the system. I can see that being useful in a variety of situations, including this one. Yeah, I think it would be much smarter to provide a different syntax to explicitly represent the notion that we're only assuming the condition is true, and not trying to enforce it.

I'd like to revisit this feature. Explicit notation to represent not-enforcing (assertive?) constraints is an interesting idea. I'm not sure which word is most appropriate, but for example, let's use the word ASSERTIVE as a new constraint attribute.

CREATE TABLE parent (
    id int NOT NULL,
    group int NOT NULL,
    name text
);
CREATE FOREIGN TABLE child_grp1 (
    /* no additional column */
) INHERITS (parent) SERVER server1;
ALTER TABLE child_grp1 ADD CONSTRAINT chk_group1 CHECK (group = 1) ASSERTIVE;

If ASSERTIVE is specified, it's not guaranteed that the constraint is enforced completely, so it should be treated as a hint for the planner. As Robert mentioned, enforcing as much as we can during INSERT/UPDATE is one option for this issue. In addition, an idea which I can't throw away is to assume all constraints defined on foreign tables to be ASSERTIVE. Foreign tables are potentially in danger of holding wrong data, through updates to the source data made not through the foreign tables. This is not specific to one FDW, so IMO constraints defined on foreign tables are basically ASSERTIVE. Of course PG can try to keep the data correct, but somebody might always break it.

Besides CHECK constraints, currently NOT NULL constraints are virtually ASSERTIVE (not enforcing). Should that also be noted explicitly? Thoughts?
-- Shigeru HANADA
Re: [HACKERS] inherit support for foreign tables
Hi all,

2014/1/14 Shigeru Hanada shigeru.han...@gmail.com: I'd like to revisit this feature.

The attached patch allows a foreign table to be a child of a table. It also allows foreign tables to have CHECK constraints. These changes give us a chance to spread query load over multiple servers via constraint exclusion. If an FDW supports asynchronous operation against remote servers, parallel processing (not stable yet, but the read-only case would be fine) can be achieved, though the overhead of the FDW mechanism is still there.

Though this would be debatable, in the current implementation, constraints defined on a foreign table (now only NOT NULL and CHECK are supported) are not enforced during INSERT or UPDATE executed against foreign tables. This means that retrieved data might violate the constraints defined on the local side. This is a debatable issue because integrity of data is important for a DBMS, but in the first cut it is just documented as a note.

Because I don't see a practical case for having a foreign table as a parent, and it avoids a problem with recursive ALTER TABLE operations, foreign tables can't be a parent. An example of such a problem is adding a constraint which is not supported for foreign tables to the parent of a foreign table: the propagated operation can be applied to the ordinary tables in the inheritance tree, but not to the foreign tables. If we allowed foreign tables to be parents, it would be difficult to process ordinary tables below foreign tables in the current traffic-cop mechanism.

For other recursively processed commands such as ANALYZE, foreign tables at the leaves of the inheritance tree are ignored.

Any comments or questions are welcome.
-- Shigeru HANADA

foreign_inherit.patch
Description: Binary data
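The constraint-exclusion use case could be sketched like this (server and constraint names are illustrative): with a CHECK constraint on each foreign child, the planner can exclude children whose constraint contradicts the query's WHERE clause and touch only the relevant remote server.

```sql
CREATE TABLE orders (id int, region text);

CREATE FOREIGN TABLE orders_east () INHERITS (orders) SERVER east_srv;
ALTER TABLE orders_east ADD CONSTRAINT chk_east CHECK (region = 'east');

CREATE FOREIGN TABLE orders_west () INHERITS (orders) SERVER west_srv;
ALTER TABLE orders_west ADD CONSTRAINT chk_west CHECK (region = 'west');

-- With constraint exclusion, only orders_east needs to be scanned:
SELECT * FROM orders WHERE region = 'east';
```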
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
KaiGai-san,

2013/12/16 KaiGai Kohei kai...@ak.jp.nec.com: (2013/12/16 14:15), Shigeru Hanada wrote: (1) ctidscan Is session_preload_libraries available to enable the feature, like shared_*** and local_***? According to my trial it works fine, like the two similar GUCs. It shall be available; nothing different from the two parameters that we have supported for a long time. Sorry, I missed mentioning that new feature. Check.

(2) postgres_fdw JOIN push-down is a killer application of the Custom Scan Provider feature, so I think it's good to mention it in the Remote Query Optimization section. I added an explanation about remote join execution to the section. Probably it helps users understand why a Custom Scan node is here instead of a Join node. Thanks for your suggestion. Check.

I think that these patches are well enough considered to be marked Ready for Committer.

Regards,
-- Shigeru HANADA
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Hi Kaigai-san,

2013/12/11 Kohei KaiGai kai...@kaigai.gr.jp: 2013/12/10 Shigeru Hanada shigeru.han...@gmail.com: The patches could be applied cleanly, but I saw a compiler warning about get_rel_relkind() in foreign.c; it's a minor issue. Please just add an #include of utils/lsyscache.h there. Fixed.

Check.

I have some more random comments about EXPLAIN. 1) You use Operation as the label of Custom Scan nodes in non-text format, but it seems to me to be rather the provider name. What is the string shown there? I tried free-riding on the existing properties, but it does not make sense indeed, as you pointed out. I adjusted explain.c to show a Custom-Provider property for the CustomScan node, as follows. The new name seems better; it is what the node expresses.

2) It would be nice if we could see information about what the Custom Scan node replaced in the EXPLAIN output (even only in verbose mode). I know that we can't show the plan tree below custom scan nodes, because the CS provider can't obtain the other candidates. But even just the relation names used in the join or the scan would help users understand what is going on in a Custom Scan. Even though I agree that it helps users understand the plan, it is also a headache to implement, because the CustomScan node (and its superclass) does not carry information about which relations underlie it. Probably this functionality needs to show the underlying relations in ExplainTargetRel() if the CustomScan node represents a scan instead of a join. What data source can produce the list of underlying relations here? So, if it is not a significant restriction for users, I'd like to work on this feature later.

Agreed. It would be enough that custom scan providers can add arbitrary information, such as the Remote SQL of postgres_fdw, to the EXPLAIN result via the core API. Some kind of framework to help authors of custom scan providers would be nice, but it should be considered after the first cut.
The attached patch fixes up a minor warning around get_rel_relkind and the name of the property for the custom provider. Please check it. The patch can be applied onto the 2013-12-16 HEAD cleanly, and gives no unexpected error/warning.

I'm sorry to post separately, but I have some comments on the documentation.

(1) ctidscan
Is session_preload_libraries available to enable the feature, like shared_*** and local_***? According to my trial it works fine, like the two similar GUCs.

(2) postgres_fdw
JOIN push-down is a killer application of the Custom Scan Provider feature, so I think it's good to mention it in the Remote Query Optimization section.

The code for core and contrib seems fine, so I'll mark the patches Ready for Committer after the documentation enhancement.

Regards,
-- Shigeru HANADA
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Hi KaiGai-san,

2013/12/8 Kohei KaiGai kai...@kaigai.gr.jp: The attached patches include the documentation fixup by Hanada-san, and relocation of is_managed_relation (the portion that checks whether the relation is a foreign table managed by a particular FDW) and has_wholerow_reference. I didn't touch the EXPLAIN logic because I'm uncertain whether the cost of a remote join is reasonable as the cost of an alternative path to local joins. Please check it. Thanks.

The patches could be applied cleanly, but I saw a compiler warning about get_rel_relkind() in foreign.c; it's a minor issue. Please just add an #include of utils/lsyscache.h there.

I have some more random comments about EXPLAIN.
1) You use Operation as the label of Custom Scan nodes in non-text format, but it seems to me to be rather the provider name. What is the string shown there?
2) It would be nice if we could see information about what the Custom Scan node replaced in the EXPLAIN output (even only in verbose mode). I know that we can't show the plan tree below custom scan nodes, because the CS provider can't obtain the other candidates. But even just the relation names used in the join or the scan would help users understand what is going on in a Custom Scan.

Regards,
-- Shigeru HANADA
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Hi KaiGai-san,

2013/11/29 Kohei KaiGai kai...@kaigai.gr.jp: The attached ones are the revised patches. I merged all the propositions from Jim. Thanks, it made the documentation quality better. Also, I fixed up cosmetic stuff around whitespace vs. tab. The actual code changes follow the changes in FunctionScan when a CustomScan replaces a FunctionScan: it puts a List * object instead of a single expression tree, to support multiple functions. Nothing else was changed from the previous version.

I first reviewed the postgres_fdw portion of the patches to learn the outline of Custom Plan. The wiki page is also a good textbook for the feature. I have some random comments about the basic design of Custom Plan:

(1) IIUC, add_join_path and add_scan_path are added to allow extensions to plug their code into the planner.
(2) The FDW framework has executor callbacks based on existing executor nodes. Is there any plan to integrate the two into one framework, or to wrap one with the other? I'm not sure that we should have two similar frameworks side by side. (I'm sorry if I've missed past discussion about this issue.)
(3) Internal routines such as is_self_managed_relation and has_wholerow_reference seem to be useful for other FDWs. Could they be moved into core?
(4) postgres_fdw estimates the costs of a join by calculating local numbers. How about supporting remote estimation by issuing an EXPLAIN query when use_remote_estimate = true?
-- Shigeru HANADA
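Regarding point (4), postgres_fdw's existing use_remote_estimate server/table option (note the singular spelling) already makes it issue EXPLAIN on the remote side to cost single-table scans, so the suggestion is essentially to extend that behavior to joins. The server name below is illustrative:

```sql
-- Enable remote EXPLAIN-based estimation for one foreign server
ALTER SERVER remote_srv OPTIONS (ADD use_remote_estimate 'true');
```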
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
2013/11/29 Kohei KaiGai kai...@kaigai.gr.jp: I merged all the propositions from Jim. Thanks, it made the documentation quality better. Also, I fixed up cosmetic stuff around whitespace vs. tab.

I found some typos in the documentation and comments. Please see the attached patch for details.
-- Shigeru HANADA

fix_typo.patch
Description: Binary data
Re: [HACKERS] Status of FDW pushdowns
Hi Merlin,

2013/11/22 Merlin Moncure mmonc...@gmail.com: On Thu, Nov 21, 2013 at 6:43 PM, Shigeru Hanada shigeru.han...@gmail.com wrote: 2013/11/22 Tom Lane t...@sss.pgh.pa.us: Merlin Moncure mmonc...@gmail.com writes: On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian br...@momjian.us wrote: I know join pushdowns seem insignificant, but it helps to restrict what data must be passed back because you would only pass back joined rows. By 'insignificant' you mean 'necessary to do any non-trivial real work'. Personally, I'd prefer it if FDW was extended to allow arbitrary parameterized queries like every other database connectivity API ever made ever. [ shrug... ] So use dblink. For better or worse, the FDW stuff is following the SQL standard's SQL/MED design, which does not do it like that. Pass-through mode mentioned in the SQL/MED standard might be what he wants. happen to have a link handy?

The SQL/MED standard doesn't say much about PASS THROUGH mode, especially about the interaction with the client. Besides that, I think it would be nice to allow an arbitrary FDW as the backend of the dblink interface, like this:

postgres=> SELECT dblink_connect('con1', 'server name of an FDW');
postgres=> SELECT * FROM dblink('con1', 'some query written in remote syntax') as t(/* record type definition */ ...);

This provides a way to execute a query without defining a foreign table.

-- Shigeru HANADA
Re: [HACKERS] Status of FDW pushdowns
2013/11/27 Dimitri Fontaine dimi...@2ndquadrant.fr: Alvaro Herrera alvhe...@2ndquadrant.com writes: Seems to me that if you want to read remote tables without creating a foreign table, you could define them locally using something like the WITH syntax and then use them normally in the rest of the query. I guess what's needed here is a kind of barrier that allows pushing a whole arbitrary subquery (with joins and quals and whatnot) down to the remote side.

Yes, a big problem is how to skip parsing the remote query in the PG context. A bare query string (other than a string literal) is always parsed by the PG parser, but the remote side may have different syntax and semantics; as Dimitri says, we need to pass the whole arbitrary query string to the remote side as-is.

My current thinking about how to solve that would be to add a notion of FOREIGN VIEW in our system, which would basically implement that barrier and send the view definition on the remote, with known quals values as constants, or something like that.

I'm sorry, but I don't see the point here. Do you mean that the user executes CREATE FOREIGN VIEW in advance and uses the view in a subsequent query? Or do you mean allowing new syntax like WITH alias AS FOREIGN VIEW (remote query)? I think it's nice to support executing an ad-hoc remote query, written in syntax which is valid only on the remote data source, through an FDW, and at the moment the dblink interface seems feasible for that purpose.

-- Shigeru HANADA
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
2013/11/19 Kohei KaiGai kai...@kaigai.gr.jp: OK, I split them off. The part-1 is the custom-scan API itself, the part-2 is the ctidscan portion, and the part-3 is remote join on postgres_fdw.

These three patches can be applied with no conflict onto the 2013-11-27 HEAD, but some fixes are necessary to build, because commit 784e762e886e6f72f548da86a27cd2ead87dbd1c (committed on 2013-11-21) allows a FunctionScan node to have multiple function expressions, so Node *funcexpr in CustomScan should be List *functions now. I'll continue to review by applying the patches onto the 2013-11-19 HEAD.

-- Shigeru HANADA
Re: [HACKERS] Status of FDW pushdowns
2013/11/22 Kohei KaiGai kai...@kaigai.gr.jp: 2013/11/21 Bruce Momjian br...@momjian.us: Where are we on the remaining possible pushdowns for foreign data wrappers, particularly the Postgres one? I know we do WHERE restriction pushdowns in 9.3, but what about join and aggregate pushdowns? Is anyone working on those? I know join pushdowns seem insignificant, but it helps to restrict what data must be passed back because you would only pass back joined rows. Do we document these missing features anywhere? Probably, the custom-scan API will provide a more flexible way to push down aggregates, sorts or other stuff performed on regular tables, not only foreign tables. It allows extensions to offer an alternative scan/join path at the planning stage; the executor then calls back into their custom logic instead of the built-in one, if its cost is cheaper.

IIRC, sort push-down is already supported: we can provide sorted paths by setting pathkeys on an additional ForeignPath. postgres_fdw doesn't support this feature because we couldn't get consensus about how to limit the sort variations. One idea was to allow defining a foreign index on foreign tables to indicate which column combination is reasonably sortable.

-- Shigeru HANADA
Re: [HACKERS] Status of FDW pushdowns
2013/11/22 Tom Lane t...@sss.pgh.pa.us: Merlin Moncure mmonc...@gmail.com writes: On Thu, Nov 21, 2013 at 9:05 AM, Bruce Momjian br...@momjian.us wrote: I know join pushdowns seem insignificant, but it helps to restrict what data must be passed back because you would only pass back joined rows. By 'insignificant' you mean 'necessary to do any non-trivial real work'. Personally, I'd prefer it if FDW was extended to allow arbitrary parameterized queries like every other database connectivity API ever made ever. [ shrug... ] So use dblink. For better or worse, the FDW stuff is following the SQL standard's SQL/MED design, which does not do it like that. Pass-through mode mentioned in SQL/MED standard might be what he wants. -- Shigeru HANADA
[HACKERS] inherit support for foreign tables
Hi hackers,

I'd like to propose adding inheritance support for foreign tables. David Fetter mentioned this feature last July, but it seems stalled. http://www.postgresql.org/message-id/20130719005601.ga5...@fetter.org

Supporting inheritance for foreign tables allows us to distribute a query to remote servers by using foreign tables as partition tables of a (perhaps ordinary) parent table. For this purpose, I think that constraint exclusion is necessary. As a result of extending David's patch for a PoC, AFAIS we need these changes:

1) Add an INHERITS(rel, ...) clause to CREATE/ALTER FOREIGN TABLE. Apparently we need new syntax to define the parent table(s) of a foreign table. We have options for the position of the INHERITS clause, but I'd prefer before the SERVER clause, because having options specific to foreign tables at the tail would be most extensible.

a) CREATE FOREIGN TABLE child (...) INHERITS(p1, p2) SERVER server;
b) CREATE FOREIGN TABLE child (...) SERVER server INHERITS(p1, p2);

2) Allow foreign tables to have CHECK constraints. Like NOT NULL, I think we don't need to enforce the check during INSERT/UPDATE against a foreign table.

3) Allow a foreign table as a child node of Append. Currently prepunion.c assumes that children of Append always have RELKIND_RELATION as relkind, so we need to set the relkind of the child RTE explicitly.

Please see the attached PoC patch. I'll enhance the implementation, tests and documentation and submit the patch for the next CF.
Regards,
-- Shigeru HANADA

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0b31f55..2f2dc88 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -465,10 +465,25 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
 				 errmsg("ON COMMIT can only be used on temporary tables")));
-	if (stmt->constraints != NIL && relkind == RELKIND_FOREIGN_TABLE)
-		ereport(ERROR,
-				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("constraints are not supported on foreign tables")));
+	/*
+	 * Shouldn't this have been checked in parser?
+	 */
+	if (relkind == RELKIND_FOREIGN_TABLE)
+	{
+		ListCell   *lc;
+		foreach(lc, stmt->constraints)
+		{
+			NewConstraint *nc = lfirst(lc);
+
+			if (nc->contype != CONSTR_CHECK &&
+				nc->contype != CONSTR_DEFAULT &&
+				nc->contype != CONSTR_NULL &&
+				nc->contype != CONSTR_NOTNULL)
+				ereport(ERROR,
+						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+						 errmsg("only check constraints are supported on foreign tables")));
+		}
+	}

 	/*
 	 * Look up the namespace in which we are supposed to create the relation,
@@ -1463,10 +1478,11 @@ MergeAttributes(List *schema, List *supers, char relpersistence,
 	 */
 	relation = heap_openrv(parent, ShareUpdateExclusiveLock);

-	if (relation->rd_rel->relkind != RELKIND_RELATION)
+	if (relation->rd_rel->relkind != RELKIND_RELATION &&
+		relation->rd_rel->relkind != RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("inherited relation \"%s\" is not a table or foreign table",
 						parent->relname)));

 	/* Permanent rels cannot inherit from temporary ones */
 	if (relpersistence != RELPERSISTENCE_TEMP
@@ -3043,7 +3059,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 			pass = AT_PASS_ADD_INDEX;
 			break;
 		case AT_AddConstraint:	/* ADD CONSTRAINT */
-			ATSimplePermissions(rel, ATT_TABLE);
+			ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
 			/* Recursion occurs during execution phase */
 			/* No command-specific prep needed except saving recurse flag */
 			if (recurse)
@@ -3057,7 +3073,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 			pass = AT_PASS_ADD_CONSTR;
 			break;
 		case AT_DropConstraint:	/* DROP CONSTRAINT */
-			ATSimplePermissions(rel, ATT_TABLE);
+			ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
 			/* Recursion
Re: [HACKERS] Writable foreign tables: how to identify rows
On Wed, Mar 6, 2013 at 12:35 PM, Pavan Deolasee pavan.deola...@gmail.com wrote: In the context of postgres_fdw, I am not sure if we need an additional system column like a node_id. Would there be a possibility where tuples to-be-modified are coming from different foreign tables and at runtime we need to decide where to execute the UPDATE/DELETE operation?

If we start supporting inheritance involving foreign tables as child tables, this will become a reality. Foreign tables have a tableoid system column which holds the pg_class.oid of the foreign table. That seems sufficient to determine the source server.

-- Shigeru HANADA
Re: [HACKERS] Writable foreign tables: how to identify rows
On Wed, Mar 6, 2013 at 9:30 AM, Tom Lane t...@sss.pgh.pa.us wrote: For postgres_fdw, that would really be enough, since it could just cause a ctid column to be created with the usual definition. Then it could put the remote ctid into the usual t_self field in returned tuples.

I'm not sure whether postgres_fdw should support this, but updatable views have no system columns, including ctid. So we would need a magic identifier, perhaps the set of primary key value(s), to support updating remote updatable views via foreign tables. Just a heads up.

-- Shigeru HANADA
Re: [HACKERS] FDW for PostgreSQL
On Fri, Feb 15, 2013 at 12:58 AM, Tom Lane t...@sss.pgh.pa.us wrote: [ rereads that... ] Hm, I did make some good points. But having seen the end result of this way, I'm still not very happy; it still looks like a maintenance problem. Maybe some additional flags in ruleutils.c is the least evil way after all. Needs more thought.

I'm working on revising the deparser so that it uses ruleutils routines to construct the remote query, and re-found an FDW-specific problem which I encountered some months ago. The ruleutils routines require a deparse context, which is a list of namespace information. Currently deparse_context_for() seems to fit postgres_fdw's purpose, but it always uses the names stored in the catalogs (pg_class, pg_attribute and pg_namespace), whereas postgres_fdw wants to replace a column/table/schema name with the name specified in the relevant FDW option, if any. A proper remote query could be generated if postgres_fdw could modify the deparse context, but deparse_context is a hidden detail of ruleutils.c, and IMO disclosing it is a bad idea.

Given these, I'm thinking of adding a new deparse context generator which basically constructs namespaces from the catalogs, but replaces a name if the FDW option *_name was specified for that object. With this context, the existing ruleutils code would generate expression strings with the proper names, without any change. Is this idea acceptable?

-- Shigeru HANADA
Re: [HACKERS] FDW for PostgreSQL
On Sun, Feb 17, 2013 at 8:44 AM, Tom Lane t...@sss.pgh.pa.us wrote: Continuing to look at this patch ... I'm wondering if any particular discussion went into choosing the FDW option names nspname, relname, and colname.

IIRC, there was no deep discussion about those option names. I simply chose relname and nspname from pg_class and pg_namespace. At that time I thought users would understand those options easily if the names used in the catalogs were used.

These don't seem to me like names that we ought to be exposing at the SQL command level. Why not just schema, table, column? Or perhaps schema_name, table_name, column_name if you feel it's essential to distinguish that these are names.

I now think the not-shortened names (the words used in documents and conversations) are better. I prefer table_name to table, because it would be easy to distinguish as a name, even if we add new options like table_foo.

Besides, I found a strange(?) behavior of the psql \d+ command in the no-postfix case, though it wouldn't be a serious problem. In the psql \d+ result for postgres_fdw foreign tables, table and column are quoted, but schema is not. Is this behavior of quote_ident() intentional?

postgres=# \d+ pgbench1_branches
                  Foreign table "public.pgbench1_branches"
  Column  |     Type      | Modifiers |   FDW Options    | Storage  | Stats target | Description
----------+---------------+-----------+------------------+----------+--------------+-------------
 bid      | integer       | not null  | ("column" 'bid') | plain    |              |
 bbalance | integer       |           |                  | plain    |              |
 filler   | character(88) |           |                  | extended |              |
Server: pgbench1
FDW Options: (schema 'public', "table" 'foo')
Has OIDs: no

We can use the table and column options without quoting (or with quotes, of course) in CREATE/ALTER FOREIGN TABLE commands, so this is not a barrier against choosing the no-postfix names.

-- Shigeru HANADA
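The quoting asymmetry observed above follows from quote_ident()'s rule: an identifier is double-quoted when it is a reserved keyword (TABLE and COLUMN are reserved; SCHEMA is unreserved) or contains characters that need quoting. A simplified, self-contained sketch of that rule follows; it is an illustration only, not PostgreSQL's code, and the three-entry keyword list is a stand-in for the full scanner keyword table the real function consults.

```c
#include <ctype.h>
#include <string.h>

/* Tiny stand-in for the reserved-keyword list; the real quote_ident()
 * consults PostgreSQL's full scanner keyword table.  Note "schema" is
 * deliberately absent: SCHEMA is an unreserved keyword, which is why
 * psql shows it bare in the \d+ output above. */
static const char *reserved[] = {"table", "column", "select", NULL};

/* Return 1 if "ident" would be double-quoted by a quote_ident()-style
 * rule: it starts with a digit, contains anything other than lower-case
 * letters, digits and underscores, or is a reserved keyword. */
int
needs_quotes(const char *ident)
{
	int			i;

	if (isdigit((unsigned char) ident[0]))
		return 1;
	for (i = 0; ident[i] != '\0'; i++)
	{
		char		c = ident[i];

		if (!(islower((unsigned char) c) ||
			  isdigit((unsigned char) c) || c == '_'))
			return 1;
	}
	for (i = 0; reserved[i] != NULL; i++)
		if (strcmp(ident, reserved[i]) == 0)
			return 1;
	return 0;
}
```

With this rule, "table" and "column" come back quoted while "schema" and ordinary column names like "bid" pass through bare, matching the \d+ output in the message.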
Re: [HACKERS] FDW for PostgreSQL
This doesn't seem like a maintainable approach. Was there a specific reason not to try to use ruleutils.c for this? I'd much rather tweak ruleutils to expose some additional APIs, if that's what it takes, than have all this redundant logic.

I got a comment about that issue from you, but I might have misunderstood.

(2012/03/09 1:18), Tom Lane wrote: I've been looking at this patch a little bit over the past day or so. I'm pretty unhappy with deparse.c --- it seems like a real kluge, inefficient and full of corner-case bugs. After some thought I believe that you're ultimately going to have to abandon depending on ruleutils.c for reverse-listing services, and it would be best to bite that bullet now and rewrite this code from scratch.

I thought that writing a ruleutils-free SQL constructor in postgres_fdw was the better approach, because ruleutils might be changed from its own purpose in the future. Besides that, as KaiGai-san mentioned upthread, ruleutils seems to have insufficient capability for building a remote PostgreSQL query.

-- Shigeru HANADA
Re: [HACKERS] FDW for PostgreSQL
On Thu, Feb 14, 2013 at 6:45 PM, Albe Laurenz laurenz.a...@wien.gv.at wrote: Shigeru Hanada wrote: Tom Lane wrote: It ought to be pulling the rows back a few at a time, and that's not going to work well if multiple scans are sharing the same connection. (We might be able to dodge that by declaring a cursor for each scan, but I'm not convinced that such a solution will scale up to writable foreign tables, nested queries, subtransactions, etc.) Indeed the FDW used a CURSOR in older versions. Sorry that I have not looked at the writable foreign table patch closely yet, but would it require (maybe multiple) remote update query executions during scanning? It would for example call ExecForeignUpdate after each call to IterateForeignScan that produces a row that meets the UPDATE condition.

Thanks! It seems that ExecForeignUpdate needs another connection for the update query, or we need to retrieve all results at the first Iterate call to prepare for possible subsequent update queries.

-- Shigeru HANADA
Re: [HACKERS] proposal - assign result of query to psql variable
On Sat, Feb 2, 2013 at 7:30 PM, Pavel Stehule pavel.steh...@gmail.com wrote: possible variants
a) don't store NULL values - and remove existing variable when NULL be assigned - it is probably best, but should be confusing for users
b) implement flag IS NULL - for variables
c) use nullPrint
d) use empty String

+1 for a). If users want to determine whether the result was NULL, or want to use a substitute string for a NULL result, they can use coalesce in the SELECT clause. Anyway, the feature should be documented clearly.

-- Shigeru HANADA
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
Hi Tomas,

On Wed, Jan 9, 2013 at 6:38 AM, Tomas Vondra t...@fuzzy.cz wrote: Well, you need to ensure that the initial palloc is an array of that size. Oh, right - I forgot to modify the palloc() call. Thanks for spotting this. Attached is a patch with increased threshold and fixed palloc call.

OK, both the threshold and the initial palloc were fixed correctly. I'll mark this patch as Ready for committer. Good work! :-)

Regards,
-- Shigeru HANADA
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
Hi Tomas,

I reviewed the v6 patch and found several places that need fixes. Sorry for the extra nitpicking.

* I found another extra space after the asterisk.
+ RelFileNode * nodes;

* A curly bracket which starts a code block should be at the head of the next line.
+ /* extend the array if needed (double the size) */
+ if (maxrels <= nrels) {

* There are several trailing tabs in src/backend/catalog/storage.c.

* Naming of DROP_RELATIONS_BSEARCH_LIMIT (or an off-by-one bug?). IIUC, bsearch is used when the # of relations to be dropped is *more* than the value of DROP_RELATIONS_BSEARCH_LIMIT (10). IMO this behavior is not what the macro name implies; I thought that bsearch would be used for <= 10 relations. Besides, the word LIMIT is used as an *upper limit* in other macros. How about MIN_DROP_REL[ATION]S_BSEARCH or DROP_REL[ATION]S_LINEAR_SEARCH_LIMIT? # using RELS instead of RELATIONS would be better to shorten the name

* +1 for Alvaro's suggestion about avoiding palloc traffic by starting with 8 elements or so.

Regards,
-- Shigeru HANADA
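For readers following the threshold discussion, the pattern under review can be sketched outside the backend as follows. This is a standalone illustration, not the patch itself: FileNode, count_dropped_buffers and the threshold value of 10 are stand-ins (the exact name and value, 10 vs. 15 vs. 20, are precisely what the thread is debating).

```c
#include <stdlib.h>

/* Illustrative stand-in for RelFileNode: a single OID-like key. */
typedef unsigned int FileNode;

/* Below this many dropped relations, a plain linear probe wins;
 * above it, sort once and use bsearch per buffer. */
#define DROP_RELS_BSEARCH_THRESHOLD 10

static int
filenode_cmp(const void *a, const void *b)
{
	FileNode	x = *(const FileNode *) a;
	FileNode	y = *(const FileNode *) b;

	return (x > y) - (x < y);
}

/*
 * Count how many entries of "buffers" belong to one of the "nrels"
 * relations in "nodes" - the same shape as the patched code: one pass
 * over every buffer, with the per-buffer membership test switching
 * from linear scan to bsearch() once nrels exceeds the threshold.
 * Note: "nodes" may be reordered by qsort().
 */
int
count_dropped_buffers(const FileNode *buffers, int nbuf,
					  FileNode *nodes, int nrels)
{
	int			use_bsearch = (nrels > DROP_RELS_BSEARCH_THRESHOLD);
	int			hits = 0;
	int			i,
				j;

	if (use_bsearch)
		qsort(nodes, nrels, sizeof(FileNode), filenode_cmp);

	for (i = 0; i < nbuf; i++)
	{
		if (use_bsearch)
		{
			if (bsearch(&buffers[i], nodes, nrels,
						sizeof(FileNode), filenode_cmp))
				hits++;
		}
		else
		{
			for (j = 0; j < nrels; j++)
			{
				if (nodes[j] == buffers[i])
				{
					hits++;
					break;
				}
			}
		}
	}
	return hits;
}
```

The one-time qsort() cost is amortized over the pass across all buffers, which is why a cutoff on the number of dropped relations (rather than buffers) is the knob being tuned.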
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
Hi Tomas,

On Tue, Jan 8, 2013 at 7:08 AM, Tomas Vondra t...@fuzzy.cz wrote:

* I found another extra space after the asterisk.
+ RelFileNode * nodes;
Thanks, fixed.

check

* A curly bracket which starts a code block should be at the head of the next line.
+ /* extend the array if needed (double the size) */
+ if (maxrels <= nrels) {
Fixed.

check

* There are several trailing tabs in src/backend/catalog/storage.c.
Fixed (I believe).

check

* Naming of DROP_RELATIONS_BSEARCH_LIMIT (or an off-by-one bug?). IIUC, bsearch is used when the # of relations to be dropped is *more* than the value of DROP_RELATIONS_BSEARCH_LIMIT (10). IMO this behavior is not what the macro name implies; I thought that bsearch would be used for <= 10 relations. Besides, the word LIMIT is used as an *upper limit* in other macros. How about MIN_DROP_REL[ATION]S_BSEARCH or DROP_REL[ATION]S_LINEAR_SEARCH_LIMIT? # using RELS instead of RELATIONS would be better to shorten the name
I've changed the name to DROP_RELS_BSEARCH_THRESHOLD. I believe this naming is consistent with options like geqo_threshold - when the number of relations reaches the specified value, the bsearch is used. I've also increased the value from 10 to 20, in accordance with the previous discussion.

The new name sounds good to me, but the #define says that the value is 15. Should it be fixed to 20?

* +1 for Alvaro's suggestion about avoiding palloc traffic by starting with 8 elements or so.
Done.

Not yet: the initial size of the srels array is still 1 element.

Regards,
-- Shigeru HANADA
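The last point, starting the array at more than one element and doubling it as needed, can be shown with a self-contained sketch. This is an illustration, not the patch: malloc/realloc stand in for the backend's palloc/repalloc, and the starting capacity of 8 is the value Alvaro suggested in the thread.

```c
#include <stdlib.h>

/* Grow-by-doubling array of ints; malloc/realloc stand in for the
 * backend's palloc/repalloc. */
typedef struct
{
	int		   *items;
	int			nitems;		/* elements in use */
	int			maxitems;	/* allocated capacity */
} IntArray;

void
array_init(IntArray *a)
{
	/* Start at 8 elements, per the suggestion in the thread, rather
	 * than 1, so the first few appends cause no realloc traffic. */
	a->maxitems = 8;
	a->nitems = 0;
	a->items = malloc(a->maxitems * sizeof(int));
}

void
array_append(IntArray *a, int v)
{
	/* Double the capacity when full: total copying over n appends is
	 * O(n), versus O(n^2) if the array grew one element at a time. */
	if (a->nitems >= a->maxitems)
	{
		a->maxitems *= 2;
		a->items = realloc(a->items, a->maxitems * sizeof(int));
	}
	a->items[a->nitems++] = v;
}
```

(Error handling for failed allocations is omitted here; palloc in the backend errors out on failure, so the real patch does not need it either.)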
Re: [HACKERS] proposal - assign result of query to psql variable
On Tue, Dec 18, 2012 at 2:52 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Attached updated patch

Seems fine. I'll mark this as ready for committer. Nice work!

-- Shigeru HANADA
Re: [HACKERS] proposal - assign result of query to psql variable
On Sun, Oct 28, 2012 at 7:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello here is updated patch main change - it doesn't touch psql lexer - like Tom proposed other points respect Phil's notices

I reviewed the v12 patch. It provides the \gset command, which works consistently with other psql commands such as \g and \set, the implementation seems reasonable, and it follows the other reviewers' comments properly. I think we can mark it as ready for committer once you have fixed some minor issues below.

* Skipping leading blanks in the inner while loop of command.c seems unnecessary, because (IIUC) psql's scanner skips blanks. Is there any case where the scanner returns a token with a leading/trailing blank?

* ISTM that VARLIST_INITIAL can be removed. AFAIS it's the same state as VARLIST_EXPECTED_COMMA_OR_IDENT.

* I found some cosmetic flaws and typos. Please see the attached patch for details.

* How about pulling up the code for the PGRES_TUPLES_OK case in StoreQueryResult into a new static function, say StoreQueryTuple? It would make StoreQueryResult more similar to PrintQueryResult's style, and IMO it makes the code more readable.

Regards,
-- Shigeru HANADA

gset_fix.patch Description: Binary data
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
On Sun, Dec 9, 2012 at 1:07 AM, Tomas Vondra t...@fuzzy.cz wrote: * If you're dropping a single table, it really does not matter - the difference will be like 100 ms vs. 200 ms or something like that.

Did you try dropping 10,000 tables in batches of 100, the same as in the previous post? If so, the overhead introduced by the patch is only 1.52 ms per table. That seems acceptable to me.

So I'd vote to ditch that special case optimization.

+1. It seems very difficult to determine a reasonable threshold for that kind of optimization. As described above, the overhead seems small enough, so using bsearch in every case seems fine.

Besides, the v3.2 patch needs some more fixes for minor issues.

* The opening curly bracket of if/for/while/etc. should be on the next line, like this:
if (foo == bar)		/* { <= not here */
{
	...
}

* Use hard tabs instead of spaces to indent variable names in declarations.

For these two issues, please see the page below: http://www.postgresql.org/docs/9.2/static/source-format.html

* PostgreSQL allows only C89 features, so we can't use variable-length arrays.

* Contains unintentional blank lines?

Please see the attached patch for details of the fixes above.

-- Shigeru HANADA

drop-in-tx.diff Description: Binary data
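As an illustration of the style points above (this is a toy example, not code from the patch), a function written to those conventions might look like this: opening brace on its own line, tab indentation in declarations, and a heap allocation instead of a C99 variable-length array, since only C89 features are allowed.

```c
#include <stdlib.h>

/* Toy function illustrating the review's style points: brace on its
 * own line, declarations before statements, and no variable-length
 * array (C89) - allocate on the heap instead. */
int
sum_first_n(int n)
{
	int		   *values;			/* C89: not "int values[n]" */
	int			sum = 0;
	int			i;

	values = malloc(n * sizeof(int));
	if (values == NULL)
		return -1;

	for (i = 0; i < n; i++)
	{
		values[i] = i + 1;
		sum += values[i];
	}

	free(values);
	return sum;
}
```

In backend code the allocation would be palloc (which errors out rather than returning NULL); malloc is used here only to keep the sketch self-contained.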
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
On Mon, Nov 12, 2012 at 4:36 AM, Tomas Vondra t...@fuzzy.cz wrote: Hi, I've prepared a slightly updated patch, based on the previous review. See it attached.

All the changes in the v3 patch seem good, however I found some places which require cosmetic changes. Please see the attached v3.1 patch for those changes.

Oops, you're right. I omitted 1-3 and 1-4 in the actual measurement, but IMO loading data is necessary to fill the shared buffers up, because the # of buffers which are deleted during COMMIT is the major factor for this patch. And, yes, I verified that all shared buffers were in use after all the loading had finished.

I don't think it's an important factor, as the original code does this:

for (i = 0; i < NBuffers; i++)
{
    volatile BufferDesc *bufHdr = &BufferDescriptors[i];
    ...
}

in DropRelFileNodeAllBuffers. That loops through all shared buffers no matter what, so IMHO the performance in this case depends on the total size of the shared buffers. But maybe I'm missing something important.

I worry about the effect of the continue in the loop on performance. If most of the shared buffers are not in use at the moment of the DROP, the load of the DROP doesn't contain the overhead of LockBufHdr and the few subsequent lines. Do you expect such a situation in your use cases?

I've done a simple benchmark on my laptop with 2GB shared buffers, it's attached in drop-test.py (it's a bit messy, but it works). [snip] With those parameters, I got these numbers on the laptop:

creating 10000 tables
all tables created in 3.298694 seconds
dropping 10000 tables one by one ...
all tables dropped in 32.692478 seconds
creating 10000 tables
all tables created in 3.458178 seconds
dropping 10000 tables in batches by 100...
all tables dropped in 3.28268 seconds

So it's 33 seconds vs. 3.3 seconds, i.e. 10x speedup. On AWS we usually get larger differences, as we use larger shared buffers and the memory is significantly slower there.

Do you have performance numbers for the same test on unpatched PG?
This patch aims to improve the performance of bulk DROP, so the 4th number in the results above should be compared to the 4th number on unpatched PG?

-- Shigeru HANADA

drop-in-transaction-v3.1.patch Description: Binary data
Re: [HACKERS] FDW for PostgreSQL
After playing with some big SQL statements for testing, I came to feel that showing every remote query in EXPLAIN output is annoying, especially when SELECT * is unfolded into a long column list. AFAIK no plan node shows that much information on one line, so I'm inclined to make postgres_fdw show it only when VERBOSE is specified. This would make EXPLAIN output easier to read, even when many foreign tables are used in a query. Thoughts?

-- Shigeru HANADA
Re: [HACKERS] FDW for PostgreSQL
On Wed, Nov 21, 2012 at 7:31 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: At execute_query(), it stores the retrieved rows onto the tuplestore of festate->tuples at once. Doesn't it make problems when the remote table has a very big number of rows?

No. postgres_fdw uses the single-row processing mode of libpq when retrieving query results in execute_query, so memory usage will stay stable at a certain level.

IIRC, the previous code used the cursor feature to fetch a set of rows to avoid over-consumption of local memory. Do we have some restriction if we fetch a certain number of rows with FETCH? It seems to me we can fetch 1000 rows, for example, and tentatively store them onto the tuplestore within one PG_TRY() block (so, no need to worry about PQclear() timing), then we can fetch remote rows again when IterateForeignScan reaches the end of the tuplestore.

As you say, postgres_fdw had used a cursor to avoid possible memory exhaustion on large result sets. I switched to single-row processing mode (a protocol-level cursor, so to speak), which was added in 9.2, because it accomplishes the same task with fewer SQL calls than a cursor.

Regards,
-- Shigeru HANADA
Re: [HACKERS] Move postgresql_fdw_validator into dblink
Sorry for the long absence.

On Sat, Oct 20, 2012 at 4:24 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: IIRC, the reason for postgresql_fdw instead of pgsql_fdw was that no other fdw module has a shortened name such as ora_fdw for Oracle. However, I doubt whether that is a strong enough reason to force solving the technical difficulty; the name conflicts with existing user-visible features. Isn't it worth considering going back to the pgsql_fdw_validator naming again?

AFAIR, in the discussion about naming the new FDW, another name, postgres_fdw, was suggested as well as postgresql_fdw, and I chose the one more familiar to me at that time. I think that only a few people feel that postgres is a shortened name of postgresql. How about using postgres_fdw for the PG FDW?

Regards,
-- Shigeru HANADA
Re: [HACKERS] FDW for PostgreSQL
Sorry for the delayed response.

On Sun, Oct 21, 2012 at 3:16 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: 2012/10/11 Etsuro Fujita fujita.ets...@lab.ntt.co.jp: I've reviewed your patch quickly. I noticed that the patch has been created in a slightly different way from the guidelines: http://www.postgresql.org/docs/devel/static/fdw-planning.html The guidelines say the following, but the patch identifies the clauses in baserel->baserestrictinfo in GetForeignRelSize, not GetForeignPaths. Also, it has been implemented so that most sub_expressions are evaluated at the remote end, not the local end, though I'm missing something. For postgresql_fdw to be a good reference for FDW developers, ISTM it would be better for it to be consistent with the guidelines. I think it would be necessary to update the following document or redesign the function to be consistent with the following document. Hmm. It seems to me Fujita-san's comment is right.

Indeed postgresql_fdw touches baserestrictinfo in GetForeignRelSize, but that is an optimization for a better width estimate. The doc Fujita-san pointed to says:

The actual identification of such a clause should happen during GetForeignPaths, since it would affect the cost estimate for the path.

I understood this description to say that you need to touch baserestrictinfo *before* GetForeignPlan to estimate the costs of an optimized path. I don't feel that this description prohibits an FDW from touching baserestrictinfo in GetForeignRelSize, but mentioning that clearly might be better.

Regards,
-- Shigeru HANADA
Re: [HACKERS] Move postgresql_fdw_validator into dblink
Sorry for the long absence.

On Sat, Oct 20, 2012 at 4:24 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: IIRC, the reason for postgresql_fdw instead of pgsql_fdw was that no other fdw module has a shortened name such as ora_fdw for Oracle. However, I doubt whether that is a strong enough reason to force solving the technical difficulty; the name conflicts with existing user-visible features. Isn't it worth considering going back to the pgsql_fdw_validator naming again?

AFAIR, in the discussion about naming the new FDW, another name, postgres_fdw, was suggested as well as postgresql_fdw, and I chose the longer one at that time. Perhaps only a few people feel that postgres is a shortened name of postgresql. How about using postgres_fdw for the PG FDW? Once we choose the different name, postgresql_fdw_validator can live alongside postgres_fdw, though their names seem a little confusing. In addition, it would be worth mentioning that using postgresql_fdw_validator as the validator of a third-party FDW is not recommended, to avoid the dependency.

Regards,
-- Shigeru HANADA