Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-04-10 Thread Peter Eisentraut
On 3/31/17 20:25, Petr Jelinek wrote:
> On 01/04/17 01:57, Petr Jelinek wrote:
>> That being said, looking at use-cases for SetSubscriptionRelState that's
>> basically CREATE SUBSCRIPTION, ALTER SUBSCRIPTION REFRESH and tablesync
>> worker. So the DDL thing applies to first ones as well and tablesync
>> should not be running in case the record does not exist so it's fine if
>> it fails. In terms of RemoveSubscriptionRel that's only called from
>> heap_drop_with_catalog and tablesync holds relation lock so there is no
>> way heap_drop_with_catalog will happen on the same relation. This leads
>> me to thinking that RowExclusiveLock is fine for both
>> SetSubscriptionRelState and RemoveSubscriptionRel as long as we document
>> that callers should be aware that SetSubscriptionRelState has
>> concurrency issues and fail on unique index check.
>>
> 
> And a simple patch to do so. Peter do you see any problem with doing this?

committed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Petr Jelinek
On 01/04/17 01:57, Petr Jelinek wrote:
> On 01/04/17 01:20, Tom Lane wrote:
>> Petr Jelinek  writes:
>>> But the pg_subscription_rel is also not accessed on heap_open, the
>>> problematic code is called from heap_drop_with_catalog. And VACUUM FULL
>>> pg_class calls heap_drop_with_catalog() when doing the heap swap (and it
>>> goes through performDeletion so through dependency info which is what I
>>> mean by everything else does this).
>>
>> Hmmm.  What the heap_drop_with_catalog call is being applied to is a
>> short-lived table that is not pg_class.  It happens to own the disk
>> storage that used to belong to pg_class, but it is not pg_class;
>> in particular it doesn't have the same OID, and it is not what would
>> be looked in if someone happened to need to fetch a pg_class row
>> at that point.  So there's no circularity involved.
>>
>> On further reflection it seems like you were right, the problem is
>> taking a self-exclusive lock on pg_subscription_rel during low-level
>> catalog operations.  We're going to have to find a way to reduce that
>> lock strength, or we're going to have a lot of deadlock problems.
>>
> 
> Well we have heavy lock because we were worried about failure scenarios
> in our dumb upsert in SetSubscriptionRelState which does cache lookup
> and if it finds something it updates, otherwise inserts. We have similar
> pattern in other places but they are called from DDL statements usually
> so the worst that can happen is DDL fails with weird errors when done
> concurrently with same name so using RowExclusiveLock is okay.
> 
> That being said, looking at use-cases for SetSubscriptionRelState that's
> basically CREATE SUBSCRIPTION, ALTER SUBSCRIPTION REFRESH and tablesync
> worker. So the DDL thing applies to first ones as well and tablesync
> should not be running in case the record does not exist so it's fine if
> it fails. In terms of RemoveSubscriptionRel that's only called from
> heap_drop_with_catalog and tablesync holds relation lock so there is no
> way heap_drop_with_catalog will happen on the same relation. This leads
> me to thinking that RowExclusiveLock is fine for both
> SetSubscriptionRelState and RemoveSubscriptionRel as long as we document
> that callers should be aware that SetSubscriptionRelState has
> concurrency issues and fail on unique index check.
> 

And a simple patch to do so. Peter do you see any problem with doing this?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From a12259efe7b4d0af570e7d2d7cb48b249158a1bd Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Sat, 1 Apr 2017 02:20:16 +0200
Subject: [PATCH] Use weaker locks when updating subscription relation state

---
 src/backend/catalog/pg_subscription.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index e7a1634..b0bd248 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -215,6 +215,9 @@ textarray_to_stringlist(ArrayType *textarray)
 
 /*
  * Set the state of a subscription table.
+ *
+ * The insert or update logic in this function is not concurrency safe so
+ * it may raise an error.
  */
 Oid
 SetSubscriptionRelState(Oid subid, Oid relid, char state,
@@ -227,7 +230,7 @@ SetSubscriptionRelState(Oid subid, Oid relid, char state,
 	Datum		values[Natts_pg_subscription_rel];
 
 	/* Prevent concurrent changes. */
-	rel = heap_open(SubscriptionRelRelationId, ShareRowExclusiveLock);
+	rel = heap_open(SubscriptionRelRelationId, RowExclusiveLock);
 
 	/* Try finding existing mapping. */
 	tup = SearchSysCacheCopy2(SUBSCRIPTIONRELMAP,
@@ -358,8 +361,7 @@ RemoveSubscriptionRel(Oid subid, Oid relid)
 	HeapTuple	tup;
 	int			nkeys = 0;
 
-	/* Prevent concurrent changes (see SetSubscriptionRelState()). */
-	rel = heap_open(SubscriptionRelRelationId, ShareRowExclusiveLock);
+	rel = heap_open(SubscriptionRelRelationId, RowExclusiveLock);
 
 	if (OidIsValid(subid))
 	{
@@ -387,7 +389,7 @@ RemoveSubscriptionRel(Oid subid, Oid relid)
 	}
 	heap_endscan(scan);
 
-	heap_close(rel, ShareRowExclusiveLock);
+	heap_close(rel, RowExclusiveLock);
 }
 
 
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Petr Jelinek
On 01/04/17 01:57, Petr Jelinek wrote:
> On 01/04/17 01:20, Tom Lane wrote:
>> Petr Jelinek  writes:
>>> But the pg_subscription_rel is also not accessed on heap_open, the
>>> problematic code is called from heap_drop_with_catalog. And VACUUM FULL
>>> pg_class calls heap_drop_with_catalog() when doing the heap swap (and it
>>> goes through performDeletion so through dependency info which is what I
>>> mean by everything else does this).
>>
>> Hmmm.  What the heap_drop_with_catalog call is being applied to is a
>> short-lived table that is not pg_class.  It happens to own the disk
>> storage that used to belong to pg_class, but it is not pg_class;
>> in particular it doesn't have the same OID, and it is not what would
>> be looked in if someone happened to need to fetch a pg_class row
>> at that point.  So there's no circularity involved.
>>
>> On further reflection it seems like you were right, the problem is
>> taking a self-exclusive lock on pg_subscription_rel during low-level
>> catalog operations.  We're going to have to find a way to reduce that
>> lock strength, or we're going to have a lot of deadlock problems.
>>
> 
> Well we have heavy lock because we were worried about failure scenarios
> in our dump upsert in SetSubscriptionRelState which does cache lookup

*dumb (ie, like me ;) )


-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Petr Jelinek
On 01/04/17 01:20, Tom Lane wrote:
> Petr Jelinek  writes:
>> But the pg_subscription_rel is also not accessed on heap_open, the
>> problematic code is called from heap_drop_with_catalog. And VACUUM FULL
>> pg_class calls heap_drop_with_catalog() when doing the heap swap (and it
>> goes through performDeletion so through dependency info which is what I
>> mean by everything else does this).
> 
> Hmmm.  What the heap_drop_with_catalog call is being applied to is a
> short-lived table that is not pg_class.  It happens to own the disk
> storage that used to belong to pg_class, but it is not pg_class;
> in particular it doesn't have the same OID, and it is not what would
> be looked in if someone happened to need to fetch a pg_class row
> at that point.  So there's no circularity involved.
> 
> On further reflection it seems like you were right, the problem is
> taking a self-exclusive lock on pg_subscription_rel during low-level
> catalog operations.  We're going to have to find a way to reduce that
> lock strength, or we're going to have a lot of deadlock problems.
> 

Well we have heavy lock because we were worried about failure scenarios
in our dump upsert in SetSubscriptionRelState which does cache lookup
and if it finds something it updates, otherwise inserts. We have similar
pattern in other places but they are called from DDL statements usually
so the worst that can happen is DDL fails with weird errors when done
concurrently with same name so using RowExclusiveLock is okay.

That being said, looking at use-cases for SetSubscriptionRelState that's
basically CREATE SUBSCRIPTION, ALTER SUBSCRIPTION REFRESH and tablesync
worker. So the DDL thing applies to first ones as well and tablesync
should not be running in case the record does not exist so it's fine if
it fails. In terms of RemoveSubscriptionRel that's only called from
heap_drop_with_catalog and tablesync holds relation lock so there is no
way heap_drop_with_catalog will happen on the same relation. This leads
me to thinking that RowExclusiveLock is fine for both
SetSubscriptionRelState and RemoveSubscriptionRel as long as we document
that callers should be aware that SetSubscriptionRelState has
concurrency might fail on unique index check.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Tom Lane
Petr Jelinek  writes:
> But the pg_subscription_rel is also not accessed on heap_open, the
> problematic code is called from heap_drop_with_catalog. And VACUUM FULL
> pg_class calls heap_drop_with_catalog() when doing the heap swap (and it
> goes through performDeletion so through dependency info which is what I
> mean by everything else does this).

Hmmm.  What the heap_drop_with_catalog call is being applied to is a
short-lived table that is not pg_class.  It happens to own the disk
storage that used to belong to pg_class, but it is not pg_class;
in particular it doesn't have the same OID, and it is not what would
be looked in if someone happened to need to fetch a pg_class row
at that point.  So there's no circularity involved.

However ... by that same token, it ought to be okay to consult
pg_subscription_rel during that drop step.  Indeed, if we put
an IsCatalogRelation check in there, it wouldn't fire, because
the target table has OID 16409 according to your stack trace.
So that line of thought is indeed not very helpful.

On further reflection it seems like you were right, the problem is
taking a self-exclusive lock on pg_subscription_rel during low-level
catalog operations.  We're going to have to find a way to reduce that
lock strength, or we're going to have a lot of deadlock problems.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Petr Jelinek
On 01/04/17 00:52, Tom Lane wrote:
> Petr Jelinek  writes:
>> On 31/03/17 21:00, Tom Lane wrote:
>>> Looking at dependency info isn't going to fix this, it only moves the
>>> unsafe catalog access somewhere else (ie pg_depend instead of
>>> pg_subscription_rel).  I suspect the only safe solution is doing an
>>> IsCatalogRelation or similar test pretty early in the logical replication
>>> code paths.
> 
>> I don't follow, everything else does dependency info check in this
>> situation, how is this any different?
> 
> What do you mean by "everything else"?  The key point here is that
> access to the bootstrap catalogs like pg_class and pg_attribute can't
> be dependent on accesses to other catalogs, or we get into circularities.
> We certainly aren't trying to look in pg_depend when we do a heap_open.
> 
> (Or, if you tell me that we are now because the logical replication
> patch added it, I'm going to start agitating for reverting the patch
> and sending it back for redesign.)

But the pg_subscription_rel is also not accessed on heap_open, the
problematic code is called from heap_drop_with_catalog. And VACUUM FULL
pg_class calls heap_drop_with_catalog() when doing the heap swap (and it
goes through performDeletion so through dependency info which is what I
mean by everything else does this).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Tom Lane
Petr Jelinek  writes:
> On 31/03/17 21:00, Tom Lane wrote:
>> Looking at dependency info isn't going to fix this, it only moves the
>> unsafe catalog access somewhere else (ie pg_depend instead of
>> pg_subscription_rel).  I suspect the only safe solution is doing an
>> IsCatalogRelation or similar test pretty early in the logical replication
>> code paths.

> I don't follow, everything else does dependency info check in this
> situation, how is this any different?

What do you mean by "everything else"?  The key point here is that
access to the bootstrap catalogs like pg_class and pg_attribute can't
be dependent on accesses to other catalogs, or we get into circularities.
We certainly aren't trying to look in pg_depend when we do a heap_open.

(Or, if you tell me that we are now because the logical replication
patch added it, I'm going to start agitating for reverting the patch
and sending it back for redesign.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Petr Jelinek
On 31/03/17 21:00, Tom Lane wrote:
> Petr Jelinek  writes:
>> On 31/03/17 20:23, Tom Lane wrote:
>>> No, the problematic part is that there is any heap_open happening at
>>> all.  That open could very easily result in a recursive attempt to read
>>> pg_class, for example, which is going to be fatal if we're in the midst
>>> of vacuum full'ing or reindex'ing pg_class.  It's frankly astonishing
>>> to me that this patch seems to have survived testing under
>>> CLOBBER_CACHE_ALWAYS, because it's only the catalog caches that are
>>> preventing such recursive lookups.
> 
>> Hmm okay, so the solution is to either use standard dependency info for
>> this so that it's only called for tables that are actually know to be
>> subscribed or have some exceptions in the current code to call the
>> function only for user catalogs. Any preferences?
> 
> Looking at dependency info isn't going to fix this, it only moves the
> unsafe catalog access somewhere else (ie pg_depend instead of
> pg_subscription_rel).  I suspect the only safe solution is doing an
> IsCatalogRelation or similar test pretty early in the logical replication
> code paths.
> 

I don't follow, everything else does dependency info check in this
situation, how is this any different?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Tom Lane
Petr Jelinek  writes:
> On 31/03/17 20:23, Tom Lane wrote:
>> No, the problematic part is that there is any heap_open happening at
>> all.  That open could very easily result in a recursive attempt to read
>> pg_class, for example, which is going to be fatal if we're in the midst
>> of vacuum full'ing or reindex'ing pg_class.  It's frankly astonishing
>> to me that this patch seems to have survived testing under
>> CLOBBER_CACHE_ALWAYS, because it's only the catalog caches that are
>> preventing such recursive lookups.

> Hmm okay, so the solution is to either use standard dependency info for
> this so that it's only called for tables that are actually know to be
> subscribed or have some exceptions in the current code to call the
> function only for user catalogs. Any preferences?

Looking at dependency info isn't going to fix this, it only moves the
unsafe catalog access somewhere else (ie pg_depend instead of
pg_subscription_rel).  I suspect the only safe solution is doing an
IsCatalogRelation or similar test pretty early in the logical replication
code paths.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Petr Jelinek
On 31/03/17 20:23, Tom Lane wrote:
> Petr Jelinek  writes:
>> On 31/03/17 19:35, Tom Lane wrote:
>>> At the very least, it would be a good idea to exclude the system
>>> catalogs from logical replication, wouldn't it?
> 
>> They are excluded.
> 
> Well, the exclusion is apparently not checked early enough.  I would say
> that touches of system catalogs should never result in any attempts to
> access the logical relation infrastructure, but what we have here is
> such an access.
> 
>> The problematic part is that the pg_subscription_rel manipulation
>> functions acquire ShareRowExclusiveLock and not the usual
>> RowExclusiveLock because there were some worries about concurrency.
> 
> No, the problematic part is that there is any heap_open happening at
> all.  That open could very easily result in a recursive attempt to read
> pg_class, for example, which is going to be fatal if we're in the midst
> of vacuum full'ing or reindex'ing pg_class.  It's frankly astonishing
> to me that this patch seems to have survived testing under
> CLOBBER_CACHE_ALWAYS, because it's only the catalog caches that are
> preventing such recursive lookups.
> 

Hmm okay, so the solution is to either use standard dependency info for
this so that it's only called for tables that are actually know to be
subscribed or have some exceptions in the current code to call the
function only for user catalogs. Any preferences?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Tom Lane
Petr Jelinek  writes:
> On 31/03/17 19:35, Tom Lane wrote:
>> At the very least, it would be a good idea to exclude the system
>> catalogs from logical replication, wouldn't it?

> They are excluded.

Well, the exclusion is apparently not checked early enough.  I would say
that touches of system catalogs should never result in any attempts to
access the logical relation infrastructure, but what we have here is
such an access.

> The problematic part is that the pg_subscription_rel manipulation
> functions acquire ShareRowExclusiveLock and not the usual
> RowExclusiveLock because there were some worries about concurrency.

No, the problematic part is that there is any heap_open happening at
all.  That open could very easily result in a recursive attempt to read
pg_class, for example, which is going to be fatal if we're in the midst
of vacuum full'ing or reindex'ing pg_class.  It's frankly astonishing
to me that this patch seems to have survived testing under
CLOBBER_CACHE_ALWAYS, because it's only the catalog caches that are
preventing such recursive lookups.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Petr Jelinek
On 31/03/17 19:35, Tom Lane wrote:
> Masahiko Sawada  writes:
>> On Fri, Mar 31, 2017 at 9:53 AM, Petr Jelinek
>>  wrote:
>>> On 30/03/17 07:25, Tom Lane wrote:
 I await with interest an explanation of what "VACUUM FULL pg_class" is
 doing trying to acquire ShareRowExclusiveLock on pg_subscription_rel, not
 to mention why a DROP SEQUENCE is holding some fairly strong lock on that
 relation.
> 
>> VACUUM FULL of any table acquires ShareRowExclusiveLock on
>> pg_subscription_rel because when doDeletion removes old heap the
>> RemoveSubscriptionRel is called in heap_drop_with_catalog.
> 
> This seems entirely horrid: it *guarantees* deadlock possibilities.
> And I wonder what happens when I VACUUM FULL pg_subscription_rel
> itself.
> 
> At the very least, it would be a good idea to exclude the system
> catalogs from logical replication, wouldn't it?
> 

They are excluded. It works same way for triggers and many other objects
so I would not say it's horrid.

The problematic part is that the pg_subscription_rel manipulation
functions acquire ShareRowExclusiveLock and not the usual
RowExclusiveLock because there were some worries about concurrency. I
think though that it's not needed though given the access patterns
there. It's only updated by CREATE SUBSCRIPTION/ALTER SUBSCRIPTION
REFRESH and then by tablesync which holds exclusive lock on the table
itself anyway.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Tom Lane
Masahiko Sawada  writes:
> On Fri, Mar 31, 2017 at 9:53 AM, Petr Jelinek
>  wrote:
>> On 30/03/17 07:25, Tom Lane wrote:
>>> I await with interest an explanation of what "VACUUM FULL pg_class" is
>>> doing trying to acquire ShareRowExclusiveLock on pg_subscription_rel, not
>>> to mention why a DROP SEQUENCE is holding some fairly strong lock on that
>>> relation.

> VACUUM FULL of any table acquires ShareRowExclusiveLock on
> pg_subscription_rel because when doDeletion removes old heap the
> RemoveSubscriptionRel is called in heap_drop_with_catalog.

This seems entirely horrid: it *guarantees* deadlock possibilities.
And I wonder what happens when I VACUUM FULL pg_subscription_rel
itself.

At the very least, it would be a good idea to exclude the system
catalogs from logical replication, wouldn't it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-31 Thread Masahiko Sawada
On Fri, Mar 31, 2017 at 9:53 AM, Petr Jelinek
 wrote:
> On 30/03/17 07:25, Tom Lane wrote:
>> I noticed this failure report:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi=2017-03-29%2019%3A45%3A27
>>
>> in which we find
>>
>> *** 
>> /home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/regress/expected/updatable_views.out
>>  Thu Mar 30 04:45:43 2017
>> --- 
>> /home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/regress/results/updatable_views.out
>>   Thu Mar 30 05:32:37 2017
>> ***
>> *** 349,354 
>> --- 349,358 
>>   DROP VIEW ro_view10, ro_view12, ro_view18;
>>   DROP SEQUENCE seq CASCADE;
>>   NOTICE:  drop cascades to view ro_view19
>> + ERROR:  deadlock detected
>> + DETAIL:  Process 7576 waits for AccessShareLock on relation 1259 of 
>> database 16384; blocked by process 7577.
>> + Process 7577 waits for ShareRowExclusiveLock on relation 6102 of database 
>> 16384; blocked by process 7576.
>> + HINT:  See server log for query details.
>>   -- simple updatable view
>>   CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
>>   INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) 
>> g(i);
>>
>> and the referenced bit of log is
>>
>> [58dc19dd.1d98:175] ERROR:  deadlock detected
>> [58dc19dd.1d98:176] DETAIL:  Process 7576 waits for AccessShareLock on 
>> relation 1259 of database 16384; blocked by process 7577.
>>   Process 7577 waits for ShareRowExclusiveLock on relation 6102 of 
>> database 16384; blocked by process 7576.
>>   Process 7576: DROP SEQUENCE seq CASCADE;
>>   Process 7577: VACUUM FULL pg_class;
>> [58dc19dd.1d98:177] HINT:  See server log for query details.
>> [58dc19dd.1d98:178] STATEMENT:  DROP SEQUENCE seq CASCADE;
>>
>> Of course, 1259 is pg_class and 6102 is pg_subscription_rel.
>>
>> I await with interest an explanation of what "VACUUM FULL pg_class" is
>> doing trying to acquire ShareRowExclusiveLock on pg_subscription_rel, not
>> to mention why a DROP SEQUENCE is holding some fairly strong lock on that
>> relation.  *Especially* in a situation where no subscriptions exist ---
>> but even if any did, this seems unacceptable on its face.  Access to core
>> catalogs like pg_class cannot depend on random other stuff.
>>
>
> Hmm, the DROP SEQUENCE is result of not having dependency info for
> relations/subscriptions I think. I was told during review it's needless
> bloat of dependency catalog. I guess we should revisit that. It's also
> likely that RemoveSubscriptionRel will work fine with lower lock level.
>
> I have no idea why VACUUM FULL of pg_class would touch the
> pg_subscription_rel though, I'll have to dig into that.

VACUUM FULL of any table acquires ShareRowExclusiveLock on
pg_subscription_rel because when doDeletion removes old heap the
RemoveSubscriptionRel is called in heap_drop_with_catalog. The
following stack trace is what I got when run VACUUM FULL pg_class.

#2  0x0084b9d2 in LockRelationOid (relid=6102, lockmode=6) at lmgr.c:115
#3  0x004cec37 in relation_open (relationId=6102, lockmode=6)
at heapam.c:1122
#4  0x004ceef9 in heap_open (relationId=6102, lockmode=6) at
heapam.c:1288
#5  0x00599a02 in RemoveSubscriptionRel (subid=0, relid=16409)
at pg_subscription.c:361
#6  0x0056037f in heap_drop_with_catalog (relid=16409) at heap.c:1842
#7  0x0055b420 in doDeletion (object=0x1bca758, flags=1) at
dependency.c:1125
#8  0x0055b192 in deleteOneObject (object=0x1bca758,
depRel=0x7fff000716a0, flags=1) at dependency.c:1028
#9  0x0055a169 in deleteObjectsInList
(targetObjects=0x1b99708, depRel=0x7fff000716a0, flags=1) at
dependency.c:263
#10 0x0055a21a in performDeletion (object=0x7fff00071750,
behavior=DROP_RESTRICT, flags=1) at dependency.c:344
#11 0x00615597 in finish_heap_swap (OIDOldHeap=1259,
OIDNewHeap=16409, is_system_catalog=1 '\001', swap_toast_by_content=0
'\000', check_constraints=0 '\000', is_internal=1 '\001',
frozenXid=571, cutoffMulti=1, newrelpersistence=112 'p') at
cluster.c:1574
#12 0x00613bd3 in rebuild_relation (OldHeap=0x7f541f0d24a0,
indexOid=0, verbose=0 '\000') at cluster.c:590
#13 0x0061362a in cluster_rel (tableOid=1259, indexOid=0,
recheck=0 '\000', verbose=0 '\000') at cluster.c:404
#14 0x0069f40f in vacuum_rel (relid=1259, relation=0x1b9a770,
options=17, params=0x7fff00071a80) at vacuum.c:1441
#15 0x0069db9b in vacuum (options=17, relation=0x1b9a770,
relid=0, params=0x7fff00071a80, va_cols=0x0, bstrategy=0x1bf2f50,
isTopLevel=1 '\001') at vacuum.c:304
#16 0x0069d7ec in ExecVacuum (vacstmt=0x1b9a7c8, isTopLevel=1
'\001') at vacuum.c:122
#17 0x00873a9c in standard_ProcessUtility (pstmt=0x1b9ab28,
queryString=0x1b99d50 "vacuum FULL pg_class;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x1b9ac20,
completionTag=0x7fff00071eb0 "") at utility.c:670
#18 0x00873222 in 

Re: [HACKERS] Somebody has not thought through subscription locking considerations

2017-03-30 Thread Petr Jelinek
On 30/03/17 07:25, Tom Lane wrote:
> I noticed this failure report:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi=2017-03-29%2019%3A45%3A27
> 
> in which we find
> 
> *** 
> /home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/regress/expected/updatable_views.out
>  Thu Mar 30 04:45:43 2017
> --- 
> /home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/regress/results/updatable_views.out
>   Thu Mar 30 05:32:37 2017
> ***
> *** 349,354 
> --- 349,358 
>   DROP VIEW ro_view10, ro_view12, ro_view18;
>   DROP SEQUENCE seq CASCADE;
>   NOTICE:  drop cascades to view ro_view19
> + ERROR:  deadlock detected
> + DETAIL:  Process 7576 waits for AccessShareLock on relation 1259 of 
> database 16384; blocked by process 7577.
> + Process 7577 waits for ShareRowExclusiveLock on relation 6102 of database 
> 16384; blocked by process 7576.
> + HINT:  See server log for query details.
>   -- simple updatable view
>   CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
>   INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);
> 
> and the referenced bit of log is
> 
> [58dc19dd.1d98:175] ERROR:  deadlock detected
> [58dc19dd.1d98:176] DETAIL:  Process 7576 waits for AccessShareLock on 
> relation 1259 of database 16384; blocked by process 7577.
>   Process 7577 waits for ShareRowExclusiveLock on relation 6102 of 
> database 16384; blocked by process 7576.
>   Process 7576: DROP SEQUENCE seq CASCADE;
>   Process 7577: VACUUM FULL pg_class;
> [58dc19dd.1d98:177] HINT:  See server log for query details.
> [58dc19dd.1d98:178] STATEMENT:  DROP SEQUENCE seq CASCADE;
> 
> Of course, 1259 is pg_class and 6102 is pg_subscription_rel.
> 
> I await with interest an explanation of what "VACUUM FULL pg_class" is
> doing trying to acquire ShareRowExclusiveLock on pg_subscription_rel, not
> to mention why a DROP SEQUENCE is holding some fairly strong lock on that
> relation.  *Especially* in a situation where no subscriptions exist ---
> but even if any did, this seems unacceptable on its face.  Access to core
> catalogs like pg_class cannot depend on random other stuff.
> 

Hmm, the DROP SEQUENCE is result of not having dependency info for
relations/subscriptions I think. I was told during review it's needless
bloat of dependency catalog. I guess we should revisit that. It's also
likely that RemoveSubscriptionRel will work fine with lower lock level.

I have no idea why VACUUM FULL of pg_class would touch the
pg_subscription_rel though, I'll have to dig into that.

I can see that locking for example pg_trigger or pg_depend in
ShareRowExclusiveLock will also block VACUUM FULL on pg_class (and other
related tables like pg_attribute). So maybe we just need to be careful
about not taking such a strong lock...

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Somebody has not thought through subscription locking considerations

2017-03-29 Thread Tom Lane
I noticed this failure report:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi=2017-03-29%2019%3A45%3A27

in which we find

*** 
/home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/regress/expected/updatable_views.out
   Thu Mar 30 04:45:43 2017
--- 
/home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/regress/results/updatable_views.out
Thu Mar 30 05:32:37 2017
***
*** 349,354 
--- 349,358 
  DROP VIEW ro_view10, ro_view12, ro_view18;
  DROP SEQUENCE seq CASCADE;
  NOTICE:  drop cascades to view ro_view19
+ ERROR:  deadlock detected
+ DETAIL:  Process 7576 waits for AccessShareLock on relation 1259 of database 
16384; blocked by process 7577.
+ Process 7577 waits for ShareRowExclusiveLock on relation 6102 of database 
16384; blocked by process 7576.
+ HINT:  See server log for query details.
  -- simple updatable view
  CREATE TABLE base_tbl (a int PRIMARY KEY, b text DEFAULT 'Unspecified');
  INSERT INTO base_tbl SELECT i, 'Row ' || i FROM generate_series(-2, 2) g(i);

and the referenced bit of log is

[58dc19dd.1d98:175] ERROR:  deadlock detected
[58dc19dd.1d98:176] DETAIL:  Process 7576 waits for AccessShareLock on relation 
1259 of database 16384; blocked by process 7577.
Process 7577 waits for ShareRowExclusiveLock on relation 6102 of 
database 16384; blocked by process 7576.
Process 7576: DROP SEQUENCE seq CASCADE;
Process 7577: VACUUM FULL pg_class;
[58dc19dd.1d98:177] HINT:  See server log for query details.
[58dc19dd.1d98:178] STATEMENT:  DROP SEQUENCE seq CASCADE;

Of course, 1259 is pg_class and 6102 is pg_subscription_rel.

I await with interest an explanation of what "VACUUM FULL pg_class" is
doing trying to acquire ShareRowExclusiveLock on pg_subscription_rel, not
to mention why a DROP SEQUENCE is holding some fairly strong lock on that
relation.  *Especially* in a situation where no subscriptions exist ---
but even if any did, this seems unacceptable on its face.  Access to core
catalogs like pg_class cannot depend on random other stuff.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers