On 2015/12/03 15:30, Amit Langote wrote: > On 2015/12/03 13:09, Tom Lane wrote: >> Amit Langote <langote_amit...@lab.ntt.co.jp> writes: >>> Currently find_inheritance_children() is smart enough to skip a child >>> table that it finds has been dropped concurrently after it gets a lock on >>> the same. It does so by looking up the child relid in syscache. It seems >>> it should also check if the table is still in the list of children of the >>> parent. > >> I wonder whether we could improve matters by rechecking validity of the >> pg_inherits tuple (which we saw already and could presumably retain the >> TID of). There is at least one place where we do something like that now, >> IIRC. > > Not sure whether sane but how about performing ordered scan on pg_inherits > (systable_getnext_ordered())and using systable_recheck_tuple() in step > with it? Does using ordered catalog scan ensure safety against deadlocks > that the existing approach of ordered locking of child tables does? > Perhaps I'm missing something.
Just leaving here a patch that does this. It still returns the list in order determined by qsort(), IOW, not in the pg_inherits_parent_index order to avoid broken tests. I could not figure how to do it the way you suggested. Thanks, Amit
diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c index 04687c1..bfef40d 100644 --- a/src/backend/catalog/pg_inherits.c +++ b/src/backend/catalog/pg_inherits.c @@ -50,9 +50,10 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode) { List *list = NIL; Relation relation; + Relation index; SysScanDesc scan; ScanKeyData key[1]; - HeapTuple inheritsTuple; + HeapTuple tup; Oid inhrelid; Oid *oidarr; int maxoids, @@ -74,46 +75,30 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode) numoids = 0; relation = heap_open(InheritsRelationId, AccessShareLock); + index = index_open(InheritsParentIndexId, AccessShareLock); ScanKeyInit(&key[0], Anum_pg_inherits_inhparent, BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(parentrelId)); - scan = systable_beginscan(relation, InheritsParentIndexId, true, - NULL, 1, key); + scan = systable_beginscan_ordered(relation, index, NULL, 1, key); - while ((inheritsTuple = systable_getnext(scan)) != NULL) + while ((tup = systable_getnext_ordered(scan, ForwardScanDirection)) != NULL) { - inhrelid = ((Form_pg_inherits) GETSTRUCT(inheritsTuple))->inhrelid; + inhrelid = ((Form_pg_inherits) GETSTRUCT(tup))->inhrelid; if (numoids >= maxoids) { maxoids *= 2; oidarr = (Oid *) repalloc(oidarr, maxoids * sizeof(Oid)); } - oidarr[numoids++] = inhrelid; - } - - systable_endscan(scan); - - heap_close(relation, AccessShareLock); - - /* - * If we found more than one child, sort them by OID. This ensures - * reasonably consistent behavior regardless of the vagaries of an - * indexscan. This is important since we need to be sure all backends - * lock children in the same order to avoid needless deadlocks. - */ - if (numoids > 1) - qsort(oidarr, numoids, sizeof(Oid), oid_cmp); - - /* - * Acquire locks and build the result list. - */ - for (i = 0; i < numoids; i++) - { - inhrelid = oidarr[i]; + /* + * Following code assumes that inhrelids are returned in the same + * order by systable_getnext_ordered() in all backends. This is + * important since we need to be sure all backends lock children + * in the same order to avoid needless deadlocks. + */ if (lockmode != NoLock) { /* Get the lock to synchronize against concurrent drop */ @@ -124,7 +109,8 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode) * really exists or not. If not, assume it was dropped while we * waited to acquire lock, and ignore it. */ - if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid))) + if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid)) || + !systable_recheck_tuple(scan, tup)) { /* Release useless lock */ UnlockRelationOid(inhrelid, lockmode); @@ -133,10 +119,22 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode) } } - list = lappend_oid(list, inhrelid); + oidarr[numoids++] = inhrelid; } - pfree(oidarr); + /* + * Do not break regression tests - return the oids in the same order as + * would have been previously. + */ + if (numoids > 1) + qsort(oidarr, numoids, sizeof(Oid), oid_cmp); + + for (i = 0; i < numoids; i++) + list = lappend_oid(list, oidarr[i]); + + systable_endscan_ordered(scan); + index_close(index, AccessShareLock); + heap_close(relation, AccessShareLock); return list; }
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers