Re: Hitting CheckRelationLockedByMe() ASSERT with force_generic_plan

2018-12-14 Thread Amit Kapila
On Fri, Dec 14, 2018 at 11:32 AM Rushabh Lathia
 wrote:
>
> While looking code further around this, I realized that we need
> similar kind of fix for bitmap as well as index only scan as well.
>

I think if we want to move forward with this patch, we need to first
investigate the deadlock hazard mentioned by Tom in other related
thread [1].   I and others have also responded on that thread, see if
you can respond to it.

[1] - https://www.postgresql.org/message-id/3046.1542990312%40sss.pgh.pa.us

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Hitting CheckRelationLockedByMe() ASSERT with force_generic_plan

2018-12-13 Thread Rushabh Lathia
While looking code further around this, I realized that we need
similar kind of fix for bitmap as well as index only scan as well.

Here is the patch, which does similar fix for bitmap and indexonly
scans.

Thanks,


On Fri, Nov 23, 2018 at 6:47 PM Rushabh Lathia 
wrote:

>
>
> On Fri, Nov 23, 2018 at 3:33 AM David Rowley 
> wrote:
>
>> On Thu, 22 Nov 2018 at 22:33, Rushabh Lathia 
>> wrote:
>> > CREATE TABLE foo (x int primary key);
>> > INSERT INTO foo VALUES (1), (2), (3), (4), (5);
>> >
>> > CREATE OR REPLACE FUNCTION f1(a int) RETURNS int
>> > AS $$
>> > BEGIN
>> >  DELETE FROM foo where x = a;
>> >  return 0;
>> > END;
>> > $$ LANGUAGE plpgsql;
>> >
>> > postgres@100858=#set plan_cache_mode = force_generic_plan;
>> > SET
>> > postgres@100858=#select f1(4);
>> >  f1
>> > 
>> >   0
>> > (1 row)
>> >
>> > postgres@100858=#select f1(4);
>> > server closed the connection unexpectedly
>>
>>
>> > Now, to fix this issue either we need to hold proper lock before
>> reaching
>> > to ExecInitIndexScan() or teach ExecInitIndexScan() to take
>> AccessShareLock
>> > on the scan coming from CMD_DELETE.
>>
>> I'd say the following comment and code in nodeIndexscan.c is outdated:
>>
>> /*
>> * Open the index relation.
>> *
>> * If the parent table is one of the target relations of the query, then
>> * InitPlan already opened and write-locked the index, so we can avoid
>> * taking another lock here.  Otherwise we need a normal reader's lock.
>> */
>> relistarget = ExecRelationIsTargetRelation(estate, node->scan.scanrelid);
>> indexstate->iss_RelationDesc = index_open(node->indexid,
>>   relistarget ? NoLock : AccessShareLock);
>>
>> Despite what the above comment claims, these indexes have not been
>> opened in InitPlan since 389af951552ff2. As you mentioned, they're
>> opened in nodeModifyTable.c for non-delete statements.
>>
>>
> +1.
>
> I tried the same approach and with further testing I haven't found
> any issues.
>
>
>> I think we either need to update the above code to align it to what's
>> now in nodeModifyTable.c, or just obtain an AccessShareLock
>> regardless. I kinda think we should just take the lock regardless as
>> determining if the relation is a result relation may be much more
>> expensive than just taking another lower-level lock on the index.
>>
>> Anyway. I've attached a small patch to update the above fragment.
>>
>> --
>>  David Rowley   http://www.2ndQuadrant.com/
>>  PostgreSQL Development, 24x7 Support, Training & Services
>>
>
>
> --
> Rushabh Lathia
>


-- 
Rushabh Lathia
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index d04f490..c615316 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -210,7 +210,7 @@ BitmapIndexScanState *
 ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
 {
 	BitmapIndexScanState *indexstate;
-	bool		relistarget;
+	bool		indexlocked;
 
 	/* check for unsupported flags */
 	Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
@@ -266,9 +266,17 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
 	 * InitPlan already opened and write-locked the index, so we can avoid
 	 * taking another lock here.  Otherwise we need a normal reader's lock.
 	 */
-	relistarget = ExecRelationIsTargetRelation(estate, node->scan.scanrelid);
+	/*
+	 * Open the index relation.
+	 *
+	 * For non-DELETE statement when the parent table is one of the target
+	 * relations of the query, ExecInitModifyTable will already have obtained
+	 * a lock on the index, otherwise we must obtain a read lock here.
+	 */
+	indexlocked = estate->es_plannedstmt->commandType != CMD_DELETE &&
+  ExecRelationIsTargetRelation(estate, node->scan.scanrelid);
 	indexstate->biss_RelationDesc = index_open(node->indexid,
-			   relistarget ? NoLock : AccessShareLock);
+			   indexlocked ? NoLock : AccessShareLock);
 
 	/*
 	 * Initialize index-specific scan state
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index d1201a1..d9caf00 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -493,8 +493,8 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
 {
 	IndexOnlyScanState *indexstate;
 	Relation	currentRelation;
-	bool		relistarget;
 	TupleDesc	tupDesc;
+	bool		indexlocked;
 
 	/*
 	 * create state structure
@@ -558,13 +558,14 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
 	/*
 	 * Open the index relation.
 	 *
-	 * If the parent table is one of the target relations of the query, then
-	 * InitPlan already opened and write-locked the index, so we can avoid
-	 * taking another lock here.  Otherwise we need a normal reader's lock.
+	 * For non-DELETE statement when the parent table is one of the target
+	 * relations of the query, ExecInitModifyTable will already have 

Re: Hitting CheckRelationLockedByMe() ASSERT with force_generic_plan

2018-11-23 Thread Rushabh Lathia
On Fri, Nov 23, 2018 at 3:33 AM David Rowley 
wrote:

> On Thu, 22 Nov 2018 at 22:33, Rushabh Lathia 
> wrote:
> > CREATE TABLE foo (x int primary key);
> > INSERT INTO foo VALUES (1), (2), (3), (4), (5);
> >
> > CREATE OR REPLACE FUNCTION f1(a int) RETURNS int
> > AS $$
> > BEGIN
> >  DELETE FROM foo where x = a;
> >  return 0;
> > END;
> > $$ LANGUAGE plpgsql;
> >
> > postgres@100858=#set plan_cache_mode = force_generic_plan;
> > SET
> > postgres@100858=#select f1(4);
> >  f1
> > 
> >   0
> > (1 row)
> >
> > postgres@100858=#select f1(4);
> > server closed the connection unexpectedly
>
>
> > Now, to fix this issue either we need to hold proper lock before reaching
> > to ExecInitIndexScan() or teach ExecInitIndexScan() to take
> AccessShareLock
> > on the scan coming from CMD_DELETE.
>
> I'd say the following comment and code in nodeIndexscan.c is outdated:
>
> /*
> * Open the index relation.
> *
> * If the parent table is one of the target relations of the query, then
> * InitPlan already opened and write-locked the index, so we can avoid
> * taking another lock here.  Otherwise we need a normal reader's lock.
> */
> relistarget = ExecRelationIsTargetRelation(estate, node->scan.scanrelid);
> indexstate->iss_RelationDesc = index_open(node->indexid,
>   relistarget ? NoLock : AccessShareLock);
>
> Despite what the above comment claims, these indexes have not been
> opened in InitPlan since 389af951552ff2. As you mentioned, they're
> opened in nodeModifyTable.c for non-delete statements.
>
>
+1.

I tried the same approach and with further testing I haven't found
any issues.


> I think we either need to update the above code to align it to what's
> now in nodeModifyTable.c, or just obtain an AccessShareLock
> regardless. I kinda think we should just take the lock regardless as
> determining if the relation is a result relation may be much more
> expensive than just taking another lower-level lock on the index.
>
> Anyway. I've attached a small patch to update the above fragment.
>
> --
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


-- 
Rushabh Lathia


Re: Hitting CheckRelationLockedByMe() ASSERT with force_generic_plan

2018-11-22 Thread Tom Lane
David Rowley  writes:
> On Thu, 22 Nov 2018 at 22:33, Rushabh Lathia  wrote:
>> Now, to fix this issue either we need to hold proper lock before reaching
>> to ExecInitIndexScan() or teach ExecInitIndexScan() to take AccessShareLock
>> on the scan coming from CMD_DELETE.

> I'd say the following comment and code in nodeIndexscan.c is outdated:

I complained about this previously in

https://www.postgresql.org/message-id/19465.1541636...@sss.pgh.pa.us

but got nothing but crickets :-(

regards, tom lane



Re: Hitting CheckRelationLockedByMe() ASSERT with force_generic_plan

2018-11-22 Thread David Rowley
On Thu, 22 Nov 2018 at 22:33, Rushabh Lathia  wrote:
> CREATE TABLE foo (x int primary key);
> INSERT INTO foo VALUES (1), (2), (3), (4), (5);
>
> CREATE OR REPLACE FUNCTION f1(a int) RETURNS int
> AS $$
> BEGIN
>  DELETE FROM foo where x = a;
>  return 0;
> END;
> $$ LANGUAGE plpgsql;
>
> postgres@100858=#set plan_cache_mode = force_generic_plan;
> SET
> postgres@100858=#select f1(4);
>  f1
> 
>   0
> (1 row)
>
> postgres@100858=#select f1(4);
> server closed the connection unexpectedly


> Now, to fix this issue either we need to hold proper lock before reaching
> to ExecInitIndexScan() or teach ExecInitIndexScan() to take AccessShareLock
> on the scan coming from CMD_DELETE.

I'd say the following comment and code in nodeIndexscan.c is outdated:

/*
* Open the index relation.
*
* If the parent table is one of the target relations of the query, then
* InitPlan already opened and write-locked the index, so we can avoid
* taking another lock here.  Otherwise we need a normal reader's lock.
*/
relistarget = ExecRelationIsTargetRelation(estate, node->scan.scanrelid);
indexstate->iss_RelationDesc = index_open(node->indexid,
  relistarget ? NoLock : AccessShareLock);

Despite what the above comment claims, these indexes have not been
opened in InitPlan since 389af951552ff2. As you mentioned, they're
opened in nodeModifyTable.c for non-delete statements.

I think we either need to update the above code to align it to what's
now in nodeModifyTable.c, or just obtain an AccessShareLock
regardless. I kinda think we should just take the lock regardless as
determining if the relation is a result relation may be much more
expensive than just taking another lower-level lock on the index.

Anyway. I've attached a small patch to update the above fragment.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


fix_index_locking_for_deletes.patch
Description: Binary data


Hitting CheckRelationLockedByMe() ASSERT with force_generic_plan

2018-11-22 Thread Rushabh Lathia
Hi,

Commit b04aeb0a053e7cf7faad89f7d47844d8ba0dc839 add assertions that we hold
some relevant lock during relation open as opening a relation with no lock
at all is unsafe;

With above commit below test case is failing and hitting the newly added
Assert().

Test case:
===

CREATE TABLE foo (x int primary key);
INSERT INTO foo VALUES (1), (2), (3), (4), (5);

CREATE OR REPLACE FUNCTION f1(a int) RETURNS int
AS $$
BEGIN
 DELETE FROM foo where x = a;
 return 0;
END;
$$ LANGUAGE plpgsql;

postgres@100858=#set plan_cache_mode = force_generic_plan;
SET
postgres@100858=#select f1(4);
 f1

  0
(1 row)

postgres@100858=#select f1(4);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.


Call stack:
=

#0  0x7f230b2d61d7 in raise () from /lib64/libc.so.6
#1  0x7f230b2d78c8 in abort () from /lib64/libc.so.6
#2  0x00a26ce5 in ExceptionalCondition (
conditionName=0xabdca8 "!(lockmode != 0 || (Mode ==
BootstrapProcessing) || CheckRelationLockedByMe(r, 1, 1))",
errorType=0xabc529 "FailedAssertion",
fileName=0xabc668 "heapam.c", lineNumber=1146) at assert.c:54
#3  0x004dacd8 in relation_open (relationId=16387, lockmode=0) at
heapam.c:1144
#4  0x004f94bf in index_open (relationId=16387, lockmode=0) at
indexam.c:155
#5  0x00701403 in ExecInitIndexScan (node=0x1cf8fd8,
estate=0x1cec418, eflags=0) at nodeIndexscan.c:994
#6  0x006dde31 in ExecInitNode (node=0x1cf8fd8, estate=0x1cec418,
eflags=0) at execProcnode.c:217
#7  0x0070d7b8 in ExecInitModifyTable (node=0x1cf8da8,
estate=0x1cec418, eflags=0) at nodeModifyTable.c:2190
#8  0x006ddd39 in ExecInitNode (node=0x1cf8da8, estate=0x1cec418,
eflags=0) at execProcnode.c:174
#9  0x006d48a5 in InitPlan (queryDesc=0x1cdc378, eflags=0) at
execMain.c:1024
#10 0x006d36e9 in standard_ExecutorStart (queryDesc=0x1cdc378,
eflags=0) at execMain.c:265
#11 0x006d3485 in ExecutorStart (queryDesc=0x1cdc378, eflags=0) at
execMain.c:147
#12 0x00725880 in _SPI_pquery (queryDesc=0x1cdc378,
fire_triggers=true, tcount=0) at spi.c:2469
#13 0x007252d4 in _SPI_execute_plan (plan=0x1cc4748,
paramLI=0x1ce23c8, snapshot=0x0, crosscheck_snapshot=0x0, read_only=false,
fire_triggers=true,
tcount=0) at spi.c:2235
#14 0x007222a1 in SPI_execute_plan_with_paramlist (plan=0x1cc4748,
params=0x1ce23c8, read_only=false, tcount=0) at spi.c:516
#15 0x7f23004a5f9a in exec_stmt_execsql (estate=0x7fff0bcaa4d0,
stmt=0x1cc78f0) at pl_exec.c:4114

Here DELETE statement pick the IndexScan plan, and later during execution
when it tries to open index relation, ExecInitIndexScan() assumes that index
relation should already hold require a lock.  But in this case, that's not
the
case - which got exposed through the newly added ASSERT() for NoLock.

When query get's execute first time build_simple_rel() ->
get_relation_info(),
takes a lock on the indexes of a relation (this do happen from the
BuildCachePlan()).  When we execute the same query second time, a plan is
caches
and CheckCachePlan() calls the AquireExecutorLocks() take the require locks
needed for execution of the caches plan.  Here it takes a lock on the rtable
list - but doesn't take any lock on the index relation.

I observed that ExecInitModifyTable() to open the index relation if indices
on
the result relation. But here also - it doesn't do anything for CMD_DELETE:

/*
 * If there are indices on the result relation, open them and save
 * descriptors in the result relation info, so that we can add new
 * index entries for the tuples we add/update.  We need not do this
 * for a DELETE, however, since deletion doesn't affect indexes.
Also,
 * inside an EvalPlanQual operation, the indexes might be open
 * already, since we share the resultrel state with the original
 * query.
 */
if (resultRelInfo->ri_RelationDesc->rd_rel->relhasindex &&
operation != CMD_DELETE &&
resultRelInfo->ri_IndexRelationDescs == NULL)
ExecOpenIndices(resultRelInfo,
node->onConflictAction != ONCONFLICT_NONE);

Now, to fix this issue either we need to hold proper lock before reaching
to ExecInitIndexScan() or teach ExecInitIndexScan() to take AccessShareLock
on the scan coming from CMD_DELETE.

Thoughts/Comments?

Thanks,
Rushabh Lathia
www.EnterpriseDB.com