On 2017/03/08 2:27, Robert Haas wrote:
> On Tue, Mar 7, 2017 at 12:11 AM, Ashutosh Bapat
> <ashutosh.ba...@enterprisedb.com> wrote:
>> I see that all the changes by Amit and myself to what was earlier 0003
>> patch are now part of 0002 patch. The patch looks ready for committer.
> 
> Reviewing 0002:

Thanks for the review.

> 
> This patch seems to have falsified the header comments for
> expand_inherited_rtentry.

Oops, updated.

> I don't quite understand the need for the change to set_rel_size().
> The rte->inh case is handled before reaching the new code, and IIUC
> the !rte->inh case should never happen given the changes to
> expand_inherited_rtentry.  Or am I confused?

In expand_inherited_rtentry(), patch only changes the rule about the
minimum number of child RTEs required to keep rte->inh true.  In the
traditional case, it is 2 (the table itself and at least one child).  For
a partitioned table, since we don't want to scan the table itself, that
becomes 1 (at least one leaf partition).

So, it's still possible for rte->inh to become false if the required
minimum is not met, even for partitioned tables.

> If not, I think we
> should just add an Assert() to the "plain relation" case that this is
> not RELKIND_PARTITIONED_TABLE (with a comment explaining why it can't
> happen).

How about adding Assert(rte->relkind != RELKIND_PARTITIONED_TABLE) at the
beginning of the following functions as the updated patch does:

set_plain_rel_size()
set_tablesample_rel_size()
set_plain_rel_pathlist()
set_tablesample_rel_pathlist()

> In inheritance_planner, this comment addition does not seem adequate:
> 
> +         * If the parent is a partitioned table, we already set the nominal
> +         * relation.
> 
> That basically contradicts the previous paragraph.  I think you need
> to adjust the previous paragraph instead of adding a new one, probably
> in multiple places.  For example, the parenthesized bit now only
> applies in the non-partitioned case.

Hmm, that's true.  I rewrote the entire comment.

> 
> -    rel = mtstate->resultRelInfo->ri_RelationDesc;
> +    nominalRTE = rt_fetch(node->nominalRelation, estate->es_range_table);
> +    nominalRel = heap_open(nominalRTE->relid, NoLock);
> 
> No lock?

Hmm, I think I missed that a partitioned parent table would not be locked
by the *executor* at all after applying this patch.  As of now, InitPlan()
takes care of locking all the result relations in the
PlannedStmt.resultRelations list.  This patch removes partitioned
parent(s) from appearing in this list.  But that makes me wonder - aren't
the locks taken by expand_inherited_rtentry() kept long enough?  Why does
InitPlan need to acquire them again?  I see this comment in
expand_inherited_rtentry:

   * If the parent relation is the query's result relation, then we need
   * RowExclusiveLock.  Otherwise, if it's accessed FOR UPDATE/SHARE, we
   * need RowShareLock; otherwise AccessShareLock.  We can't just grab
   * AccessShareLock because then the executor would be trying to upgrade
   * the lock, leading to possible deadlocks.  (This code should match the
   * parser and rewriter.)

So, I assumed all the necessary locking is being taken care of here.

Anyway, I changed NoLock to RowExclusiveLock here for now, but maybe it's
either not necessary or a way should be found to do that in InitPlan along
with other result relations.

> Another thing that bothers me about this is that, apparently, the use
> of nominalRelation is no longer strictly for EXPLAIN, as the comments
> in plannodes.h/relation.h still claim that it is.  I'm not sure how to
> adjust that exactly; there's not a lot of room in those comments to
> give all the details.  Maybe they should simply say something like /*
> Parent RT index */ instead of /* Parent RT index for use of EXPLAIN
> */.  But we can't just go around changing the meaning of things
> without updating the comments accordingly.

It seems that this comment and one more need to be updated.  The other
comment change is in explain.c as follows:

  * Adds the relid of each referenced RTE to *rels_used.  The result controls
  * which RTEs are assigned aliases by select_rtable_names_for_explain.
  * This ensures that we don't confusingly assign un-suffixed aliases to RTEs
- * that never appear in the EXPLAIN output (such as inheritance parents).
+ * that never appear in the EXPLAIN output (such as non-partitioned
+ * inheritance parents).
  */
 static bool
 ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)

I studied the EXPLAIN code a bit to see whether the problem cited for
using inheritance parent RTE as nominalRelation (for example, in comments
in inheritance_planner) apply to the partitioned parent RTE case, which it
doesn't.  The partitioned parent RTE won't appear anywhere else in the
plan.  So, different aliases for what's the same table/RTE won't happen in
the partitioned parent case.

> A related question is
> whether we should really be using nominalRelation for this or whether
> there's some other way we should be getting at the parent -- I don't
> have another idea, though.

This is just for the ModifyTable node.  Instead of adding any new member,
I thought nominalRelation would work fine.  But as I said above regarding
locking, we *may* need to find a different place for the partitioned
parent RTE (other than ModifyTable.nominalRelation).

Thanks,
Amit
>From b65f546ee13bef408521d3f7b75f680b609c761f Mon Sep 17 00:00:00 2001
From: amit <amitlangot...@gmail.com>
Date: Mon, 6 Feb 2017 17:26:48 +0900
Subject: [PATCH 1/2] Avoid creating scan nodes for partitioned tables

* Currently, we create scan nodes for inheritance parents in their role
  as an inheritance set member.  Partitioned tables do not contain any
  data, so it's useless to create scan nodes for them.  So we need not
  create AppendRelInfo's for them in the planner prep phase.

* The planner prep phase turns off inheritance on the parent RTE if
  there isn't at least one child member (other than the parent itself
  which in normal cases is also a child member), which means the latter
  phases will not consider creating an Append plan and instead create a
  scan node. Avoid this if the RTE is a partitioned table, by noticing
  such cases in set_rel_size().  Per suggestion from Ashutosh Bapat.

* Since we do not add the RTE corresponding to the root partitioned
  table as the 1st child member of the inheritance set,
  inheritance_planner() must not assume the same when assigning
  nominalRelation to a ModifyTable node.

* In ExecInitModifyTable(), use nominalRelation to initialize tuple
  routing information, instead of the first resultRelInfo.  We set
  ModifyTable.nominalRelation to the root parent RTE in the
  partitioned table case (implicitly in the INSERT case, whereas
  explicitly in the UPDATE/DELETE cases).
---
 src/backend/commands/explain.c            |   3 +-
 src/backend/executor/nodeModifyTable.c    |  20 +++--
 src/backend/optimizer/path/allpaths.c     |  32 ++++++++
 src/backend/optimizer/plan/planner.c      |  34 +++++---
 src/backend/optimizer/prep/prepunion.c    |  47 +++++++++--
 src/include/nodes/plannodes.h             |   2 +-
 src/include/nodes/relation.h              |   2 +-
 src/test/regress/expected/inherit.out     | 124 ++++++++++--------------------
 src/test/regress/expected/tablesample.out |   4 +-
 src/test/regress/sql/inherit.sql          |   8 ++
 10 files changed, 161 insertions(+), 115 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index c9e0a3e42d..a640ef55a9 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -764,7 +764,8 @@ elapsed_time(instr_time *starttime)
  * Adds the relid of each referenced RTE to *rels_used.  The result controls
  * which RTEs are assigned aliases by select_rtable_names_for_explain.
  * This ensures that we don't confusingly assign un-suffixed aliases to RTEs
- * that never appear in the EXPLAIN output (such as inheritance parents).
+ * that never appear in the EXPLAIN output (such as non-partitioned
+ * inheritance parents).
  */
 static bool
 ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 95e158970c..3bdf778d34 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -45,6 +45,7 @@
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
+#include "parser/parsetree.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "utils/builtins.h"
@@ -1634,7 +1635,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	Plan	   *subplan;
 	ListCell   *l;
 	int			i;
-	Relation	rel;
+	Relation		nominalRel;
+	RangeTblEntry  *nominalRTE;
 
 	/* check for unsupported flags */
 	Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
@@ -1726,9 +1728,12 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	estate->es_result_relation_info = saved_resultRelInfo;
 
 	/* Build state for INSERT tuple routing */
-	rel = mtstate->resultRelInfo->ri_RelationDesc;
+	nominalRTE = rt_fetch(node->nominalRelation, estate->es_range_table);
+
+	/* This is the first time we're locking nominalRelation in the executor */
+	nominalRel = heap_open(nominalRTE->relid, RowExclusiveLock);
 	if (operation == CMD_INSERT &&
-		rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		nominalRel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
 	{
 		PartitionDispatch *partition_dispatch_info;
 		ResultRelInfo *partitions;
@@ -1737,7 +1742,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 		int			num_parted,
 					num_partitions;
 
-		ExecSetupPartitionTupleRouting(rel,
+		ExecSetupPartitionTupleRouting(nominalRel,
 									   &partition_dispatch_info,
 									   &partitions,
 									   &partition_tupconv_maps,
@@ -1801,7 +1806,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* varno = node->nominalRelation */
 			mapped_wcoList = map_partition_varattnos(wcoList,
 													 node->nominalRelation,
-													 partrel, rel);
+													 partrel, nominalRel);
 			foreach(ll, mapped_wcoList)
 			{
 				WithCheckOption *wco = (WithCheckOption *) lfirst(ll);
@@ -1876,7 +1881,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 			/* varno = node->nominalRelation */
 			rlist = map_partition_varattnos(returningList,
 											node->nominalRelation,
-											partrel, rel);
+											partrel, nominalRel);
 			rliststate = (List *) ExecInitExpr((Expr *) rlist, &mtstate->ps);
 			resultRelInfo->ri_projectReturning =
 				ExecBuildProjectionInfo(rliststate, econtext, slot,
@@ -1897,6 +1902,9 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 		mtstate->ps.ps_ExprContext = NULL;
 	}
 
+	/* We're done using nominalRel, but keep the lock */
+	heap_close(nominalRel, NoLock);
+
 	/*
 	 * If needed, Initialize target list, projection and qual for ON CONFLICT
 	 * DO UPDATE.
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 87a3faff09..11c55d7847 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -342,6 +342,14 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 					/* Foreign table */
 					set_foreign_size(root, rel, rte);
 				}
+				else if (rte->relkind == RELKIND_PARTITIONED_TABLE)
+				{
+					/*
+					 * A partitioned table without leaf partitions is marked
+					 * as a dummy rel.
+					 */
+					set_dummy_rel_pathlist(rel);
+				}
 				else if (rte->tablesample != NULL)
 				{
 					/* Sampled relation */
@@ -484,6 +492,12 @@ static void
 set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 {
 	/*
+	 * A partitioned table can never happen, because it's either handled using
+	 * an Append (rte->inh == true) or is a dummy rel.
+	 */
+	Assert(rte->relkind != RELKIND_PARTITIONED_TABLE);
+
+	/*
 	 * Test any partial indexes of rel for applicability.  We must do this
 	 * first since partial unique indexes can affect size estimates.
 	 */
@@ -650,6 +664,12 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 	Relids		required_outer;
 
 	/*
+	 * A partitioned table can never happen, because it's either handled using
+	 * an Append (rte->inh == true) or is a dummy rel.
+	 */
+	Assert(rte->relkind != RELKIND_PARTITIONED_TABLE);
+
+	/*
 	 * We don't support pushing join clauses into the quals of a seqscan, but
 	 * it could still have required parameterization due to LATERAL refs in
 	 * its tlist.
@@ -702,6 +722,12 @@ set_tablesample_rel_size(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 	double		tuples;
 
 	/*
+	 * A partitioned table can never happen, because it's either handled using
+	 * an Append (rte->inh == true) or is a dummy rel.
+	 */
+	Assert(rte->relkind != RELKIND_PARTITIONED_TABLE);
+
+	/*
 	 * Test any partial indexes of rel for applicability.  We must do this
 	 * first since partial unique indexes can affect size estimates.
 	 */
@@ -740,6 +766,12 @@ set_tablesample_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *
 	Path	   *path;
 
 	/*
+	 * A partitioned table can never happen, because it's either handled using
+	 * an Append (rte->inh == true) or is a dummy rel.
+	 */
+	Assert(rte->relkind != RELKIND_PARTITIONED_TABLE);
+
+	/*
 	 * We don't support pushing join clauses into the quals of a samplescan,
 	 * but it could still have required parameterization due to LATERAL refs
 	 * in its tlist or TABLESAMPLE arguments.
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index ca0ae7883e..c3472e4c25 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -996,6 +996,7 @@ inheritance_planner(PlannerInfo *root)
 	RelOptInfo *final_rel;
 	ListCell   *lc;
 	Index		rti;
+	RangeTblEntry *parent_rte;
 
 	Assert(parse->commandType != CMD_INSERT);
 
@@ -1054,13 +1055,21 @@ inheritance_planner(PlannerInfo *root)
 	}
 
 	/*
+	 * If the parent RTE is a partitioned table, we should use that as the
+	 * nominal relation, because we do not have the duplicate parent RTE,
+	 * unlike in the case of a non-partitioned inheritance parent.
+	 */
+	parent_rte = rt_fetch(parentRTindex, root->parse->rtable);
+	if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE)
+		nominalRelation = parentRTindex;
+
+	/*
 	 * And now we can get on with generating a plan for each child table.
 	 */
 	foreach(lc, root->append_rel_list)
 	{
 		AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
 		PlannerInfo *subroot;
-		RangeTblEntry *parent_rte;
 		RangeTblEntry *child_rte;
 		RelOptInfo *sub_final_rel;
 		Path	   *subpath;
@@ -1205,15 +1214,20 @@ inheritance_planner(PlannerInfo *root)
 		grouping_planner(subroot, true, 0.0 /* retrieve all tuples */ );
 
 		/*
-		 * We'll use the first child relation (even if it's excluded) as the
-		 * nominal target relation of the ModifyTable node.  Because of the
-		 * way expand_inherited_rtentry works, this should always be the RTE
-		 * representing the parent table in its role as a simple member of the
-		 * inheritance set.  (It would be logically cleaner to use the
-		 * inheritance parent RTE as the nominal target; but since that RTE
-		 * will not be otherwise referenced in the plan, doing so would give
-		 * rise to confusing use of multiple aliases in EXPLAIN output for
-		 * what the user will think is the "same" table.)
+		 * Set the nomimal target relation of the ModifyTable node if not
+		 * already done.  We use the inheritance parent RTE as the nominal
+		 * target relation if it's a partitioned table (see just above this
+		 * loop).  In the non-partitioned parent case, we'll use the first
+		 * child relation (even if it's excluded) as the nominal target
+		 * relation.  Because of the way expand_inherited_rtentry works, the
+		 * latter should be the RTE representing the parent table in its role
+		 * as a simple member of the inheritance set.  It would be logically
+		 * cleaner to *always* use the inheritance parent RTE; but since that
+		 * RTE will not be otherwise referenced in the plan, doing so would
+		 * give rise to confusing use of multiple aliases in EXPLAIN output
+		 * for what the user will think is the "same" table in the
+		 * non-partitioned parent case.  It's not a problem in the partitioned
+		 * parent case, because there is only one parent RTE in that case.
 		 */
 		if (nominalRelation < 0)
 			nominalRelation = appinfo->child_relid;
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 1389db18ba..1a62a53b75 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1348,10 +1348,15 @@ expand_inherited_tables(PlannerInfo *root)
  * Note that the original RTE is considered to represent the whole
  * inheritance set.  The first of the generated RTEs is an RTE for the same
  * table, but with inh = false, to represent the parent table in its role
- * as a simple member of the inheritance set.
+ * as a simple member of the inheritance set.  That's not true if the table
+ * is a partitioned table.  In the case of partitioned tables, we generate
+ * child RTEs only for the leaf partitions, because other relations in the
+ * partition hierarchy do not contain any rows.
  *
- * A childless table is never considered to be an inheritance set; therefore
- * a parent RTE must always have at least two associated AppendRelInfos.
+ * A childless table (which in the partitioned table case means one without
+ * leaf partitions) is never considered to be an inheritance set; therefore
+ * a parent RTE must always have at least one associated AppendRelInfo that
+ * is not itself.
  */
 static void
 expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
@@ -1364,6 +1369,7 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
 	List	   *inhOIDs;
 	List	   *appinfos;
 	ListCell   *l;
+	bool		has_child;
 
 	/* Does RT entry allow inheritance? */
 	if (!rte->inh)
@@ -1435,6 +1441,7 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
 
 	/* Scan the inheritance set and expand it */
 	appinfos = NIL;
+	has_child = false;
 	foreach(l, inhOIDs)
 	{
 		Oid			childOID = lfirst_oid(l);
@@ -1450,6 +1457,22 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
 			newrelation = oldrelation;
 
 		/*
+		 * Since partitioned tables themselves do not contain data, later
+		 * planning steps should not create scan nodes for them.  So do not
+		 * create the child RTE and AppendRelInfo.
+		 */
+		if (newrelation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		{
+			/*
+			 * Don't lose the lock taken for us by find_all_inheritors()
+			 * lest its partitions change under us.
+			 */
+			if (newrelation != oldrelation)
+				heap_close(newrelation, NoLock);
+			continue;
+		}
+
+		/*
 		 * It is possible that the parent table has children that are temp
 		 * tables of other backends.  We cannot safely access such tables
 		 * (because of buffering issues), and the best thing to do seems to be
@@ -1462,6 +1485,13 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
 		}
 
 		/*
+		 * We have at least one child table other than the parent table (or a
+		 * leaf partition in the partitioned table case.)
+		 */
+		if (newrelation != oldrelation)
+			has_child = true;
+
+		/*
 		 * Build an RTE for the child, and attach to query's rangetable list.
 		 * We copy most fields of the parent's RTE, but replace relation OID
 		 * and relkind, and set inh = false.  Also, set requiredPerms to zero
@@ -1545,12 +1575,13 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
 	heap_close(oldrelation, NoLock);
 
 	/*
-	 * If all the children were temp tables, pretend it's a non-inheritance
-	 * situation.  The duplicate RTE we added for the parent table is
-	 * harmless, so we don't bother to get rid of it; ditto for the useless
-	 * PlanRowMark node.
+	 * If all the children were temp tables of other backends or if the parent
+	 * is a partitioned table without any leaf partitions, pretend it's a
+	 * non-inheritance situation.  The duplicate RTE for the parent table we
+	 * added in the non-partitioned table case is harmless, so we don't bother
+	 * to get rid of it; ditto for the useless PlanRowMark node.
 	 */
-	if (list_length(appinfos) < 2)
+	if (!has_child)
 	{
 		/* Clear flag before returning */
 		rte->inh = false;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index f72f7a8978..6413700a93 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -201,7 +201,7 @@ typedef struct ModifyTable
 	Plan		plan;
 	CmdType		operation;		/* INSERT, UPDATE, or DELETE */
 	bool		canSetTag;		/* do we set the command tag/es_processed? */
-	Index		nominalRelation;	/* Parent RT index for use of EXPLAIN */
+	Index		nominalRelation;	/* Parent RT index */
 	List	   *resultRelations;	/* integer list of RT indexes */
 	int			resultRelIndex; /* index of first resultRel in plan's list */
 	List	   *plans;			/* plan(s) producing source data */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index f7ac6f600f..b9872d861c 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -1468,7 +1468,7 @@ typedef struct ModifyTablePath
 	Path		path;
 	CmdType		operation;		/* INSERT, UPDATE, or DELETE */
 	bool		canSetTag;		/* do we set the command tag/es_processed? */
-	Index		nominalRelation;	/* Parent RT index for use of EXPLAIN */
+	Index		nominalRelation;	/* Parent RT index */
 	List	   *resultRelations;	/* integer list of RT indexes */
 	List	   *subpaths;		/* Path(s) producing source data */
 	List	   *subroots;		/* per-target-table PlannerInfos */
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index 795d9f575c..a172497c83 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1598,6 +1598,14 @@ reset enable_bitmapscan;
 create table list_parted (
 	a	varchar
 ) partition by list (a);
+-- test scanning partitioned table without any partitions
+explain (costs off) select * from list_parted;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
 create table part_ab_cd partition of list_parted for values in ('ab', 'cd');
 create table part_ef_gh partition of list_parted for values in ('ef', 'gh');
 create table part_null_xy partition of list_parted for values in (null, 'xy');
@@ -1605,77 +1613,74 @@ explain (costs off) select * from list_parted;
            QUERY PLAN           
 --------------------------------
  Append
-   ->  Seq Scan on list_parted
    ->  Seq Scan on part_ab_cd
    ->  Seq Scan on part_ef_gh
    ->  Seq Scan on part_null_xy
-(5 rows)
+(4 rows)
 
 explain (costs off) select * from list_parted where a is null;
            QUERY PLAN           
 --------------------------------
  Append
-   ->  Seq Scan on list_parted
-         Filter: (a IS NULL)
    ->  Seq Scan on part_null_xy
          Filter: (a IS NULL)
-(5 rows)
+(3 rows)
 
 explain (costs off) select * from list_parted where a is not null;
            QUERY PLAN            
 ---------------------------------
  Append
-   ->  Seq Scan on list_parted
-         Filter: (a IS NOT NULL)
    ->  Seq Scan on part_ab_cd
          Filter: (a IS NOT NULL)
    ->  Seq Scan on part_ef_gh
          Filter: (a IS NOT NULL)
    ->  Seq Scan on part_null_xy
          Filter: (a IS NOT NULL)
-(9 rows)
+(7 rows)
 
 explain (costs off) select * from list_parted where a in ('ab', 'cd', 'ef');
                         QUERY PLAN                        
 ----------------------------------------------------------
  Append
-   ->  Seq Scan on list_parted
-         Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[]))
    ->  Seq Scan on part_ab_cd
          Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[]))
    ->  Seq Scan on part_ef_gh
          Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[]))
-(7 rows)
+(5 rows)
 
 explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd');
                                       QUERY PLAN                                       
 ---------------------------------------------------------------------------------------
  Append
-   ->  Seq Scan on list_parted
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
    ->  Seq Scan on part_ab_cd
          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
    ->  Seq Scan on part_ef_gh
          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
    ->  Seq Scan on part_null_xy
          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-(9 rows)
+(7 rows)
 
 explain (costs off) select * from list_parted where a = 'ab';
                 QUERY PLAN                
 ------------------------------------------
  Append
-   ->  Seq Scan on list_parted
-         Filter: ((a)::text = 'ab'::text)
    ->  Seq Scan on part_ab_cd
          Filter: ((a)::text = 'ab'::text)
-(5 rows)
+(3 rows)
 
 create table range_list_parted (
 	a	int,
 	b	char(2)
 ) partition by range (a);
 create table part_1_10 partition of range_list_parted for values from (1) to (10) partition by list (b);
+-- test scanning partitioned table without any leaf partitions
+explain (costs off) select * from range_list_parted;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
 create table part_1_10_ab partition of part_1_10 for values in ('ab');
 create table part_1_10_cd partition of part_1_10 for values in ('cd');
 create table part_10_20 partition of range_list_parted for values from (10) to (20) partition by list (b);
@@ -1689,14 +1694,9 @@ create table part_40_inf_ab partition of part_40_inf for values in ('ab');
 create table part_40_inf_cd partition of part_40_inf for values in ('cd');
 create table part_40_inf_null partition of part_40_inf for values in (null);
 explain (costs off) select * from range_list_parted;
-             QUERY PLAN              
--------------------------------------
+             QUERY PLAN             
+------------------------------------
  Append
-   ->  Seq Scan on range_list_parted
-   ->  Seq Scan on part_1_10
-   ->  Seq Scan on part_10_20
-   ->  Seq Scan on part_21_30
-   ->  Seq Scan on part_40_inf
    ->  Seq Scan on part_1_10_ab
    ->  Seq Scan on part_1_10_cd
    ->  Seq Scan on part_10_20_ab
@@ -1706,36 +1706,22 @@ explain (costs off) select * from range_list_parted;
    ->  Seq Scan on part_40_inf_ab
    ->  Seq Scan on part_40_inf_cd
    ->  Seq Scan on part_40_inf_null
-(15 rows)
+(10 rows)
 
 explain (costs off) select * from range_list_parted where a = 5;
-             QUERY PLAN              
--------------------------------------
+           QUERY PLAN           
+--------------------------------
  Append
-   ->  Seq Scan on range_list_parted
-         Filter: (a = 5)
-   ->  Seq Scan on part_1_10
-         Filter: (a = 5)
    ->  Seq Scan on part_1_10_ab
          Filter: (a = 5)
    ->  Seq Scan on part_1_10_cd
          Filter: (a = 5)
-(9 rows)
+(5 rows)
 
 explain (costs off) select * from range_list_parted where b = 'ab';
-             QUERY PLAN              
--------------------------------------
+             QUERY PLAN             
+------------------------------------
  Append
-   ->  Seq Scan on range_list_parted
-         Filter: (b = 'ab'::bpchar)
-   ->  Seq Scan on part_1_10
-         Filter: (b = 'ab'::bpchar)
-   ->  Seq Scan on part_10_20
-         Filter: (b = 'ab'::bpchar)
-   ->  Seq Scan on part_21_30
-         Filter: (b = 'ab'::bpchar)
-   ->  Seq Scan on part_40_inf
-         Filter: (b = 'ab'::bpchar)
    ->  Seq Scan on part_1_10_ab
          Filter: (b = 'ab'::bpchar)
    ->  Seq Scan on part_10_20_ab
@@ -1744,27 +1730,19 @@ explain (costs off) select * from range_list_parted where b = 'ab';
          Filter: (b = 'ab'::bpchar)
    ->  Seq Scan on part_40_inf_ab
          Filter: (b = 'ab'::bpchar)
-(19 rows)
+(9 rows)
 
 explain (costs off) select * from range_list_parted where a between 3 and 23 and b in ('ab');
                            QUERY PLAN                            
 -----------------------------------------------------------------
  Append
-   ->  Seq Scan on range_list_parted
-         Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
-   ->  Seq Scan on part_1_10
-         Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
-   ->  Seq Scan on part_10_20
-         Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
-   ->  Seq Scan on part_21_30
-         Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
    ->  Seq Scan on part_1_10_ab
          Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
    ->  Seq Scan on part_10_20_ab
          Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
    ->  Seq Scan on part_21_30_ab
          Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
-(15 rows)
+(7 rows)
 
 /* Should select no rows because range partition key cannot be null */
 explain (costs off) select * from range_list_parted where a is null;
@@ -1776,37 +1754,17 @@ explain (costs off) select * from range_list_parted where a is null;
 
 /* Should only select rows from the null-accepting partition */
 explain (costs off) select * from range_list_parted where b is null;
-             QUERY PLAN              
--------------------------------------
+             QUERY PLAN             
+------------------------------------
  Append
-   ->  Seq Scan on range_list_parted
-         Filter: (b IS NULL)
-   ->  Seq Scan on part_1_10
-         Filter: (b IS NULL)
-   ->  Seq Scan on part_10_20
-         Filter: (b IS NULL)
-   ->  Seq Scan on part_21_30
-         Filter: (b IS NULL)
-   ->  Seq Scan on part_40_inf
-         Filter: (b IS NULL)
    ->  Seq Scan on part_40_inf_null
          Filter: (b IS NULL)
-(13 rows)
+(3 rows)
 
 explain (costs off) select * from range_list_parted where a is not null and a < 67;
                    QUERY PLAN                   
 ------------------------------------------------
  Append
-   ->  Seq Scan on range_list_parted
-         Filter: ((a IS NOT NULL) AND (a < 67))
-   ->  Seq Scan on part_1_10
-         Filter: ((a IS NOT NULL) AND (a < 67))
-   ->  Seq Scan on part_10_20
-         Filter: ((a IS NOT NULL) AND (a < 67))
-   ->  Seq Scan on part_21_30
-         Filter: ((a IS NOT NULL) AND (a < 67))
-   ->  Seq Scan on part_40_inf
-         Filter: ((a IS NOT NULL) AND (a < 67))
    ->  Seq Scan on part_1_10_ab
          Filter: ((a IS NOT NULL) AND (a < 67))
    ->  Seq Scan on part_1_10_cd
@@ -1825,23 +1783,19 @@ explain (costs off) select * from range_list_parted where a is not null and a <
          Filter: ((a IS NOT NULL) AND (a < 67))
    ->  Seq Scan on part_40_inf_null
          Filter: ((a IS NOT NULL) AND (a < 67))
-(29 rows)
+(19 rows)
 
 explain (costs off) select * from range_list_parted where a >= 30;
-             QUERY PLAN              
--------------------------------------
+             QUERY PLAN             
+------------------------------------
  Append
-   ->  Seq Scan on range_list_parted
-         Filter: (a >= 30)
-   ->  Seq Scan on part_40_inf
-         Filter: (a >= 30)
    ->  Seq Scan on part_40_inf_ab
          Filter: (a >= 30)
    ->  Seq Scan on part_40_inf_cd
          Filter: (a >= 30)
    ->  Seq Scan on part_40_inf_null
          Filter: (a >= 30)
-(11 rows)
+(7 rows)
 
 drop table list_parted;
 drop table range_list_parted;
diff --git a/src/test/regress/expected/tablesample.out b/src/test/regress/expected/tablesample.out
index b18e420e9b..d3794140fb 100644
--- a/src/test/regress/expected/tablesample.out
+++ b/src/test/regress/expected/tablesample.out
@@ -322,12 +322,10 @@ explain (costs off)
                 QUERY PLAN                 
 -------------------------------------------
  Append
-   ->  Sample Scan on parted_sample
-         Sampling: bernoulli ('100'::real)
    ->  Sample Scan on parted_sample_1
          Sampling: bernoulli ('100'::real)
    ->  Sample Scan on parted_sample_2
          Sampling: bernoulli ('100'::real)
-(7 rows)
+(5 rows)
 
 drop table parted_sample, parted_sample_1, parted_sample_2;
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
index 836ec22c20..a78a585430 100644
--- a/src/test/regress/sql/inherit.sql
+++ b/src/test/regress/sql/inherit.sql
@@ -570,6 +570,10 @@ reset enable_bitmapscan;
 create table list_parted (
 	a	varchar
 ) partition by list (a);
+
+-- test scanning partitioned table without any partitions
+explain (costs off) select * from list_parted;
+
 create table part_ab_cd partition of list_parted for values in ('ab', 'cd');
 create table part_ef_gh partition of list_parted for values in ('ef', 'gh');
 create table part_null_xy partition of list_parted for values in (null, 'xy');
@@ -586,6 +590,10 @@ create table range_list_parted (
 	b	char(2)
 ) partition by range (a);
 create table part_1_10 partition of range_list_parted for values from (1) to (10) partition by list (b);
+
+-- test scanning partitioned table without any leaf partitions
+explain (costs off) select * from range_list_parted;
+
 create table part_1_10_ab partition of part_1_10 for values in ('ab');
 create table part_1_10_cd partition of part_1_10 for values in ('cd');
 create table part_10_20 partition of range_list_parted for values from (10) to (20) partition by list (b);
-- 
2.11.0

>From 3dd4c9ddf2bbf6982e1701065544e56dade9b8ed Mon Sep 17 00:00:00 2001
From: amit <amitlangot...@gmail.com>
Date: Tue, 24 Jan 2017 14:22:34 +0900
Subject: [PATCH 2/2] Do not allocate storage for partitioned tables.

Currently, it is not possible to insert any data into a partitioned
table.  So, they're empty at all times, which means it is wasteful to
allocate relfilenode and related storage objects.
---
 src/backend/access/common/reloptions.c |  3 +--
 src/backend/catalog/heap.c             | 17 ++++++++++-------
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index c50649135f..58df48a9dc 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -968,7 +968,6 @@ extractRelOptions(HeapTuple tuple, TupleDesc tupdesc,
 		case RELKIND_RELATION:
 		case RELKIND_TOASTVALUE:
 		case RELKIND_MATVIEW:
-		case RELKIND_PARTITIONED_TABLE:
 			options = heap_reloptions(classForm->relkind, datum, false);
 			break;
 		case RELKIND_VIEW:
@@ -978,6 +977,7 @@ extractRelOptions(HeapTuple tuple, TupleDesc tupdesc,
 			options = index_reloptions(amoptions, datum, false);
 			break;
 		case RELKIND_FOREIGN_TABLE:
+		case RELKIND_PARTITIONED_TABLE:
 			options = NULL;
 			break;
 		default:
@@ -1420,7 +1420,6 @@ heap_reloptions(char relkind, Datum reloptions, bool validate)
 			return (bytea *) rdopts;
 		case RELKIND_RELATION:
 		case RELKIND_MATVIEW:
-		case RELKIND_PARTITIONED_TABLE:
 			return default_reloptions(reloptions, validate, RELOPT_KIND_HEAP);
 		default:
 			/* other relkinds are not supported */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 41c0056556..2f5090b183 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -291,6 +291,7 @@ heap_create(const char *relname,
 		case RELKIND_VIEW:
 		case RELKIND_COMPOSITE_TYPE:
 		case RELKIND_FOREIGN_TABLE:
+		case RELKIND_PARTITIONED_TABLE:
 			create_storage = false;
 
 			/*
@@ -1345,14 +1346,13 @@ heap_create_with_catalog(const char *relname,
 	if (oncommit != ONCOMMIT_NOOP)
 		register_on_commit_action(relid, oncommit);
 
-	if (relpersistence == RELPERSISTENCE_UNLOGGED)
-	{
-		Assert(relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW ||
-			   relkind == RELKIND_TOASTVALUE ||
-			   relkind == RELKIND_PARTITIONED_TABLE);
-
+	/*
+	 * We do not want to create any storage objects for a partitioned
+	 * table, including the init fork.
+	 */
+	if (relpersistence == RELPERSISTENCE_UNLOGGED &&
+		relkind != RELKIND_PARTITIONED_TABLE)
 		heap_create_init_fork(new_rel_desc);
-	}
 
 	/*
 	 * ok, the relation has been cataloged, so close our relations and return
@@ -1376,6 +1376,9 @@ heap_create_with_catalog(const char *relname,
 void
 heap_create_init_fork(Relation rel)
 {
+	Assert(rel->rd_rel->relkind == RELKIND_RELATION ||
+		   rel->rd_rel->relkind == RELKIND_MATVIEW ||
+		   rel->rd_rel->relkind == RELKIND_TOASTVALUE);
 	RelationOpenSmgr(rel);
 	smgrcreate(rel->rd_smgr, INIT_FORKNUM, false);
 	log_smgrcreate(&rel->rd_smgr->smgr_rnode.node, INIT_FORKNUM);
-- 
2.11.0

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to