At Fri, 15 Sep 2017 15:36:26 +0900, Amit Langote
<[email protected]> wrote in
<[email protected]>
> Hi.
>
> On 2017/08/28 18:28, Kyotaro HORIGUCHI wrote:
> > << the following is another topic >>
> >
> >>>> BTW, in the partitioned table case, the parent is always locked first
> >>>> using an AccessExclusiveLock. There are other considerations in that
> >>>> case
> >>>> such as needing to recreate the partition descriptor upon termination of
> >>>> inheritance (both the DETACH PARTITION and also DROP TABLE child cases).
> >>>
> >>> Apart from the degree of concurrency, if we keep parent->children
> >>> order of locking, such recreation does not seem to be
> >>> needed. Maybe I'm missing something.
> >>
> >> Sorry to have introduced that topic in this thread, but I will try to
> >> explain anyway why things are the way they are currently:
> >>
> >> Once a table is no longer a partition of the parent (detached or dropped),
> >> we must make sure that the next commands in the transaction don't see it
> >> as one. That information is currently present in the relcache
> >> (rd_partdesc), which is used by a few callers, most notably the
> >> tuple-routing code. Next commands must recreate the entry so that the
> >> correct thing happens based on the updated information. More precisely,
> >> we must invalidate the current entry. RelationClearRelation() will either
> >> delete the entry or rebuild it. If it's being referenced somewhere, it
> >> will be rebuilt. The place holding the reference may also be looking at
> >> the content of rd_partdesc, which we don't require them to make a copy of,
> >> so we must preserve its content while other fields of RelationData are
> >> being read anew from the catalog. We don't have to preserve it if there
> >> has been any change (partition added/dropped), but to make such a change
> >> one would need to take a strong enough lock on the relation (parent). We
> >> assume here that anyone who wants to reference rd_partdesc takes at least
> >> AccessShareLock lock on the relation, and anyone who wants to change its
> >> content must take a lock that will conflict with it, so
> >> AccessExclusiveLock. Note that in all of this, we are only talking about
> >> one relation, that is the parent, so parent -> child ordering of taking
> >> locks may be irrelevant.
> >
> > I think I understand this, anyway DropInherit and DropPartition
> > is different-but-at-the-same-level operations so surely needs
> > amendment for drop/detach cases. Is there already a solution? Or
> > reproducing steps?
>
> Sorry, I think I forgot to reply to this. Since you seem to have chosen
> the other solution (checking that child is still a child), maybe this
> reply is a bit too late, but anyway.
I choosed it at that time for the reason mentioned upthread, but
haven't decided which is better.
> DropInherit or NO INHERIT is seen primarily as changing a child table's
> (which is the target table of the command) property that it is no longer a
> child of the parent, so we lock the child table to block concurrent
> operations from considering it a child of parent anymore. The fact that
> parent is locked after the child and with ShareUpdateExclusiveLock instead
> of AccessExclusiveLock, we observe this race condition when SELECTing from
> the parent.
>
> DropPartition or DETACH PARTITION is seen primarily as changing the parent
> table's (which is the target table of the command) property that one of
> the partitions is removed, so we lock the parent. Any concurrent
> operations that rely on the parent's relcache to get the partition list
> will wait for the session that is dropping the partition to finish, so
> that they get the fresh information from the relcache (or more importantly
> do not end up with information obtained from the relcache going invalid
> under them without notice). Note that the lock on the partition/child is
> also present and it plays more or less the the same role as it does in the
> DropInherit case, but due to different order of locking, reported race
> condition does not occur between SELECT on partitioned table and
> DROP/DETACH PARTITION.
Thank you for the explanation. I understand that the difference
comes from which of parent and children has the information about
inheritance/partitioning. DROP child and ALTER child NO INHERITS
are (I think) the only two operations that intiated from children
side. The parent-locking patch results in different solutions for
similar problems.
However, parent locking on DROPped child requires a new index
InheritsRelidIndex to avoid full scan on pg_inherits, but the NO
INHERITS case doesn't. This might be a good reason for the
difference.
I noticed that DROP TABLE partition acquires lock on the parent
in a callback for RangeVarGetRelid(Extended). The same way seems
reasonable for the NO INHERIT case. As the result the patch
becomes far small and less invasive than the previous
one. (dropinh_lock_parent_v2.patch)
As a negative PoC, attacched dropchild_lock_parent_PoC.patch only
to show how the same method on DROP TABLE case looks.
> By the way, I will take a look at your patch when I come back from the
> vacation. Meanwhile, I noticed that it needs another rebase after
> 0a480502b092 [1].
Great. Thanks.
> Thanks,
> Amit
>
> [1]
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0a480502b092
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 563bcda..b7c87b9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13175,7 +13175,28 @@ RangeVarCallbackForAlterRelation(const RangeVar *rv, Oid relid, Oid oldrelid,
reltype = ((AlterObjectSchemaStmt *) stmt)->objectType;
else if (IsA(stmt, AlterTableStmt))
- reltype = ((AlterTableStmt *) stmt)->relkind;
+ {
+ AlterTableStmt *alterstmt = (AlterTableStmt *)stmt;
+ ListCell *lc;
+
+ reltype = alterstmt->relkind;
+
+ foreach (lc, alterstmt->cmds)
+ {
+ AlterTableCmd *cmd = lfirst_node(AlterTableCmd, lc);
+ Assert(IsA(cmd, AlterTableCmd));
+
+ /*
+ * Though NO INHERIT doesn't modify the parent, concurrent
+ * expansion of inheritances will see a false child. We must
+ * acquire lock on the parent when dropping inheritance, but no
+ * need to ascend further.
+ */
+ if (cmd->subtype == AT_DropInherit)
+ RangeVarGetRelid((RangeVar *)cmd->def,
+ AccessExclusiveLock, false);
+ }
+ }
else
{
reltype = OBJECT_TABLE; /* placate compiler */
diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 245a374..b83e248 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -126,19 +126,6 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
{
/* Get the lock to synchronize against concurrent drop */
LockRelationOid(inhrelid, lockmode);
-
- /*
- * Now that we have the lock, double-check to see if the relation
- * really exists or not. If not, assume it was dropped while we
- * waited to acquire lock, and ignore it.
- */
- if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid)))
- {
- /* Release useless lock */
- UnlockRelationOid(inhrelid, lockmode);
- /* And ignore this relation */
- continue;
- }
}
list = lappend_oid(list, inhrelid);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b7c87b9..62c8860 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1120,10 +1120,10 @@ RemoveRelations(DropStmt *drop)
}
/*
- * Before acquiring a table lock, check whether we have sufficient rights.
- * In the case of DROP INDEX, also try to lock the table before the index.
- * Also, if the table to be dropped is a partition, we try to lock the parent
- * first.
+ * Before acquiring a table lock, check whether we have sufficient rights. In
+ * the case of DROP INDEX, also try to lock the table before the index. Also,
+ * if the table to be dropped is a partition or inheritance children, we try
+ * to lock the parent first.
*/
static void
RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
@@ -1229,6 +1229,26 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
if (OidIsValid(state->partParentOid))
LockRelationOid(state->partParentOid, AccessExclusiveLock);
}
+
+ /* the same for all inheritance parents */
+ if (relOid != oldRelOid)
+ {
+ Relation inhrel;
+ SysScanDesc scan;
+ HeapTuple inhtup;
+
+ inhrel = heap_open(InheritsRelationId, AccessShareLock);
+ scan = systable_beginscan(inhrel, InvalidOid, false, NULL, 0, NULL);
+ while (HeapTupleIsValid(inhtup = systable_getnext(scan)))
+ {
+ Form_pg_inherits inh = (Form_pg_inherits) GETSTRUCT(inhtup);
+
+ if (inh->inhrelid == relOid && OidIsValid(inh->inhparent))
+ LockRelationOid(inh->inhparent, AccessExclusiveLock);
+ }
+ systable_endscan(scan);
+ heap_close(inhrel, AccessShareLock);
+ }
}
/*
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers