Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-19 Thread Alvaro Herrera
Alvaro Herrera escribió:
 Tom Lane escribió:

  I think the enum idea you floated is probably worth doing.  It's
  certainly no more complex than passing a string, no?
 
 Okay, done that way, attached.  I think this one solves all concerns
 there were.

I hope the silence meant assent, because I have pushed this patch now.

Thanks,

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-19 Thread Christian Kruse
Hi,

On 19/03/14 15:12, Alvaro Herrera wrote:
 I hope the silence meant assent, because I have pushed this patch now.

Great, thanks!

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



pgpYfuVjR_gg_.pgp
Description: PGP signature


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Greg Stark
On Tue, Mar 18, 2014 at 3:42 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 The message for exclusive lock on tuple print the database information.

It is true that it is possible to have a deadlock or lock chains that
involves locks on other databases.
In this example the table test is not in the database that just
logged the deadlock.

STATEMENT:  create role test;
ERROR:  deadlock detected
DETAIL:  Process 8968 waits for ShareLock on transaction 1067; blocked
by process 8973.
Process 8973 waits for ShareLock on transaction 1064; blocked
by process 8971.
Process 8971 waits for ShareLock on transaction 1062; blocked
by process 8968.
Process 8968: create role test;
Process 8973: insert into test values (2);
Process 8971: create role test2;


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Robert Haas
On Tue, Mar 18, 2014 at 11:59 AM, Greg Stark st...@mit.edu wrote:
 On Tue, Mar 18, 2014 at 3:42 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 The message for exclusive lock on tuple print the database information.

 It is true that it is possible to have a deadlock or lock chains that
 involves locks on other databases.
 In this example the table test is not in the database that just
 logged the deadlock.

 STATEMENT:  create role test;
 ERROR:  deadlock detected
 DETAIL:  Process 8968 waits for ShareLock on transaction 1067; blocked
 by process 8973.
 Process 8973 waits for ShareLock on transaction 1064; blocked
 by process 8971.
 Process 8971 waits for ShareLock on transaction 1062; blocked
 by process 8968.
 Process 8968: create role test;
 Process 8973: insert into test values (2);
 Process 8971: create role test2;

This is a good point, but I think it's acceptable to leave out the
database name as Tom proposes.  The context message applies to what
the current backend was doing when the message got printed, and that's
always relative to the current database.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Robert Haas
On Tue, Mar 18, 2014 at 12:00 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Therefore I think the only case worth considering here is when
 database/relation/TID are all defined.  The other cases where there is
 partial information are dead code; and the case where there is nothing
 defined (such as the one in SnapBuildFindSnapshot) is already handled by
 simply not setting up a context at all.

 Right. So I think we should just keep one version of message.

Well, if we're back to one version of the message, and I'm glad we
are, can we go back to saying:

CONTEXT:  while updating tuple (0,2) in relation public.foo of
database postgres

Instead of:

CONTEXT:  while operating on tuple (0,2) in relation public.foo of
database postgres: updating tuple

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Alvaro Herrera
Robert Haas escribió:
 On Tue, Mar 18, 2014 at 12:00 AM, Amit Kapila amit.kapil...@gmail.com wrote:
  Therefore I think the only case worth considering here is when
  database/relation/TID are all defined.  The other cases where there is
  partial information are dead code; and the case where there is nothing
  defined (such as the one in SnapBuildFindSnapshot) is already handled by
  simply not setting up a context at all.
 
  Right. So I think we should just keep one version of message.
 
 Well, if we're back to one version of the message, and I'm glad we
 are, can we go back to saying:
 
 CONTEXT:  while updating tuple (0,2) in relation public.foo of
 database postgres

That isn't easy.  The callers would need to pass the whole message in
order for it to be translatable; and generating a format string in one
module and having the arguments be stapled on top in a separate module
doesn't seem a very good idea to me.  Currently we have this:

/* wait for regular transaction to end */
XactLockTableWait(xwait, relation, tp.t_data-t_ctid,
/* translator: string is XactLockTableWait operation 
name */
  deleting tuple);

And if we go with what you propose, we would have this:

/* wait for regular transaction to end */
XactLockTableWait(xwait, relation, tp.t_data-t_ctid,
while updating tuple (%u,%u) in relation \%s\)

which doesn't seem an improvement to me.

Now another idea would be to pass a code or something; so callers would
do
XactLockTableWait(xwait, relation, TID, XLTW_OPER_UPDATE);

and the callback would select the appropriate message.  Not really
excited about that, though.

In the current version of the patch, attached, I've reduced the callback
to this:

if (ItemPointerIsValid(info-ctid)  RelationIsValid(info-rel))
{
errcontext(while operating on tuple (%u,%u) in relation 
\%s\: %s,
   ItemPointerGetBlockNumber(info-ctid),
   ItemPointerGetOffsetNumber(info-ctid),
   RelationGetRelationName(info-rel),
   _(info-opr_name));
}

Note that:
1. the database name is gone, as discussed
2. the schema name is gone too, because it requires a syscache lookup

Now we can go back to having a schema name, but I think we would have to
add an IsTransactionState() check first or something like that.  Or save
the schema name when setting up the callback, but this doesn't sound so
good (particularly considering that lock waits might end without
actually waiting at all, so this'd be wasted effort.)


If you have a better idea, I'm all ears.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 105,115  static void GetMultiXactIdHintBits(MultiXactId multi, uint16 *new_infomask,
  	   uint16 *new_infomask2);
  static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
  		uint16 t_infomask);
! static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
! int *remaining, uint16 infomask);
! static bool ConditionalMultiXactIdWait(MultiXactId multi,
! 		   MultiXactStatus status, int *remaining,
! 		   uint16 infomask);
  static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
  static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_modified,
  		bool *copy);
--- 105,116 
  	   uint16 *new_infomask2);
  static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
  		uint16 t_infomask);
! static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status, uint16 infomask,
! Relation rel, ItemPointer ctid, const char *opr,
! int *remaining);
! static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status,
! 		   uint16 infomask, Relation rel, ItemPointer ctid,
! 		   const char *opr, int *remaining);
  static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
  static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_modified,
  		bool *copy);
***
*** 2714,2721  l1:
  		if (infomask  HEAP_XMAX_IS_MULTI)
  		{
  			/* wait for multixact */
! 			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
! 			NULL, infomask);
  			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  			/*
--- 2715,2724 
  		if (infomask  HEAP_XMAX_IS_MULTI)
  		{
  			/* wait for multixact */
! 			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate, infomask,
! 			/* translator: string is XactLockTableWait operation name */
! 			relation, tp.t_data-t_ctid, deleting tuple,
! 			NULL);
  			

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Well, if we're back to one version of the message, and I'm glad we
 are, can we go back to saying:

 CONTEXT:  while updating tuple (0,2) in relation public.foo of
 database postgres

If I end up being the one who commits this, it's going to say

while updating tuple (0,2) in table foo

Not more, and not less.  It is not project style to include schema names
(much less database names) in error messages where they're not central to
the meaning.

One reason why not is that schema and database names are generally not
available without an extra lookup step, which you don't really want to do
in an error-reporting code path.  Every extra action you take increases
the risk of a cascading failure, so that the user will get something
unhelpful like out of memory rather than the oh-so-extra-helpful message
you wanted to print.  The added utility of the extra information, for most
cases, is insufficient to justify that risk.

Even without that argument, it's still not project style; why should this
message be randomly different from many hundreds of others?  If you want
to start a push to include schema names anywhere a table name is given,
that should be debated separately and then done in a reasonably uniform
fashion.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Alvaro Herrera
Tom Lane escribió:
 Robert Haas robertmh...@gmail.com writes:
  Well, if we're back to one version of the message, and I'm glad we
  are, can we go back to saying:
 
  CONTEXT:  while updating tuple (0,2) in relation public.foo of
  database postgres
 
 If I end up being the one who commits this, it's going to say
 
 while updating tuple (0,2) in table foo
 
 Not more, and not less.

Please see my reply to Robert.  My proposal (in form of a patch) is
  while operating on tuple (0,2) in table foo: updating tuple
Would this work for you?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Please see my reply to Robert.  My proposal (in form of a patch) is
   while operating on tuple (0,2) in table foo: updating tuple
 Would this work for you?

It's pretty lousy from a readability standpoint, even in English;
I shudder to think what it might come out as after translation.

I think the enum idea you floated is probably worth doing.  It's
certainly no more complex than passing a string, no?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Robert Haas
On Tue, Mar 18, 2014 at 12:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Please see my reply to Robert.  My proposal (in form of a patch) is
   while operating on tuple (0,2) in table foo: updating tuple
 Would this work for you?

 It's pretty lousy from a readability standpoint, even in English;
 I shudder to think what it might come out as after translation.

 I think the enum idea you floated is probably worth doing.  It's
 certainly no more complex than passing a string, no?

+1.  We've done similar things in tablecmds.c.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-18 Thread Alvaro Herrera
Tom Lane escribió:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Please see my reply to Robert.  My proposal (in form of a patch) is
while operating on tuple (0,2) in table foo: updating tuple
  Would this work for you?
 
 It's pretty lousy from a readability standpoint, even in English;
 I shudder to think what it might come out as after translation.

Well, the same thing actually.  I didn't think it was too bad.

 I think the enum idea you floated is probably worth doing.  It's
 certainly no more complex than passing a string, no?

Okay, done that way, attached.  I think this one solves all concerns
there were.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 105,115  static void GetMultiXactIdHintBits(MultiXactId multi, uint16 *new_infomask,
  	   uint16 *new_infomask2);
  static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
  		uint16 t_infomask);
! static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
! int *remaining, uint16 infomask);
! static bool ConditionalMultiXactIdWait(MultiXactId multi,
! 		   MultiXactStatus status, int *remaining,
! 		   uint16 infomask);
  static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
  static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_modified,
  		bool *copy);
--- 105,116 
  	   uint16 *new_infomask2);
  static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
  		uint16 t_infomask);
! static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status, uint16 infomask,
! Relation rel, ItemPointer ctid, XLTW_Oper oper,
! int *remaining);
! static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status,
! 		   uint16 infomask, Relation rel, ItemPointer ctid,
! 		   XLTW_Oper oper, int *remaining);
  static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
  static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_modified,
  		bool *copy);
***
*** 2714,2721  l1:
  		if (infomask  HEAP_XMAX_IS_MULTI)
  		{
  			/* wait for multixact */
! 			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
! 			NULL, infomask);
  			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  			/*
--- 2715,2723 
  		if (infomask  HEAP_XMAX_IS_MULTI)
  		{
  			/* wait for multixact */
! 			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate, infomask,
! 			relation, tp.t_data-t_ctid, XLTW_Delete,
! 			NULL);
  			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  			/*
***
*** 2741,2747  l1:
  		else
  		{
  			/* wait for regular transaction to end */
! 			XactLockTableWait(xwait);
  			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  			/*
--- 2743,2749 
  		else
  		{
  			/* wait for regular transaction to end */
! 			XactLockTableWait(xwait, relation, tp.t_data-t_ctid, XLTW_Delete);
  			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  			/*
***
*** 3266,3273  l2:
  			int			remain;
  
  			/* wait for multixact */
! 			MultiXactIdWait((MultiXactId) xwait, mxact_status, remain,
! 			infomask);
  			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  			/*
--- 3268,3276 
  			int			remain;
  
  			/* wait for multixact */
! 			MultiXactIdWait((MultiXactId) xwait, mxact_status, infomask,
! 			relation, oldtup.t_data-t_ctid, XLTW_Update,
! 			remain);
  			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  			/*
***
*** 3341,3347  l2:
  			else
  			{
  /* wait for regular transaction to end */
! XactLockTableWait(xwait);
  LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  /*
--- 3344,3351 
  			else
  			{
  /* wait for regular transaction to end */
! XactLockTableWait(xwait, relation, oldtup.t_data-t_ctid,
!   XLTW_Update);
  LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
  
  /*
***
*** 4402,4415  l3:
  if (nowait)
  {
  	if (!ConditionalMultiXactIdWait((MultiXactId) xwait,
! 	status, NULL, infomask))
  		ereport(ERROR,
  (errcode(ERRCODE_LOCK_NOT_AVAILABLE),
   errmsg(could not obtain lock on row in relation \%s\,
  		RelationGetRelationName(relation;
  }
  else
! 	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
  
  /* if there are updates, follow the update chain */
  if (follow_updates 
--- 4406,4423 
  if (nowait)
  {
  	if (!ConditionalMultiXactIdWait((MultiXactId) xwait,
!   status, infomask, relation,
! 	tuple-t_data-t_ctid,
! 	XLTW_Lock, NULL))
  		ereport(ERROR,
  (errcode(ERRCODE_LOCK_NOT_AVAILABLE),
   errmsg(could not obtain lock on row in relation \%s\,
  		

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Christian Kruse
Hi Amit,

I've been ill the last few days, so sorry for my late response.

 I have updated the patch to pass TID and operation information in
 error context and changed some of the comments in code.
 Let me know if the added operation information is useful, else
 we can use better generic message in context.

I don't think that this fixes the translation guideline issues brought
up by Robert. This produces differing strings for the different cases
as well and it also passes in altering data directly to the error
string.

It also may produce error messages that are really weird. You
initialize the string with while attempting to . The remaining part
of the function is covered by if()s which may lead to error messages
like this:

„while attempting to “
„while attempting to in relation public.xyz of database abc“
„while attempting to in database abc“

Although this may not be very likely (because
ItemPointerIsValid((info-ctid))) should in this case not return
false).

Attached you will find a new version of this patch; it slightly
violates the translation guidelines as well: it assembles an error
string (but it doesn't pass in altering data like ctid or things like
that). I simply couldn't think of a nice solution without doing so,
and after looking through the code there are a few cases
(e.g. CheckTableNotInUse()) where this is done, too. If we insist of
having complete strings in this case we would have to have 6 * 3 * 2
error strings in the code.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e2337ac..c0f5881 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitExtended(Relation rel, ItemPointerData ctid, const char *opr_name,
+	MultiXactId multi, MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2714,8 +2716,9 @@ l1:
 		if (infomask  HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWaitExtended(relation, tp.t_data-t_ctid, delete,
+	(MultiXactId) xwait,
+	MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2741,7 +2744,8 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitExtended(relation, tp.t_data-t_ctid,
+	  delete, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3266,8 +3270,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, remain,
-			infomask);
+			MultiXactIdWaitExtended(relation, oldtup.t_data-t_ctid, update,
+	   (MultiXactId) xwait, mxact_status, remain, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3306,7 +3310,7 @@ l2:
 			/*
 			 * There was no UPDATE in the MultiXact; or it aborted. No
 			 * TransactionIdIsInProgress() call needed here, since we called
-			 * MultiXactIdWait() above.
+			 * MultiXactIdWaitExtended() above.
 			 */
 			if (!TransactionIdIsValid(update_xact) ||
 TransactionIdDidAbort(update_xact))
@@ -3341,7 +3345,8 @@ l2:
 			else
 			{
 /* wait for regular transaction to end */
-XactLockTableWait(xwait);
+XactLockTableWaitExtended(relation, oldtup.t_data-t_ctid,
+		  update, xwait);
 LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 /*
@@ -4409,7 +4414,10 @@ l3:
 		RelationGetRelationName(relation;
 }
 else
-	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+	MultiXactIdWaitExtended(relation, tuple-t_data-t_ctid,
+			lock, (MultiXactId) xwait,
+			status, NULL,
+			infomask);
 
 /* if there are updates, follow the update chain */
 if (follow_updates 
@@ -4464,7 +4472,8 @@ l3:
 		RelationGetRelationName(relation;
 }
 else
-	XactLockTableWait(xwait);
+	XactLockTableWaitExtended(relation, tuple-t_data-t_ctid,
+			  lock, xwait);
 
 /* if there are updates, follow the update chain */
 if (follow_updates 
@@ -5151,7 +5160,9 @@ l4:
 	if (needwait)
 	{
 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-		XactLockTableWait(members[i].xid);
+		XactLockTableWaitExtended(rel, mytup.t_data-t_ctid,
+  lock updated,
+  members[i].xid);
 		pfree(members);
 		goto l4;
 	}
@@ -5211,7 +5222,8 @@ l4:
 if (needwait)
 {
 	LockBuffer(buf, 

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Amit Kapila
On Mon, Mar 17, 2014 at 4:20 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 Hi Amit,

 I've been ill the last few days, so sorry for my late response.

   Sorry to hear and no problem for delay.

 I don't think that this fixes the translation guideline issues brought
 up by Robert. This produces differing strings for the different cases
 as well and it also passes in altering data directly to the error
 string.

 It also may produce error messages that are really weird. You
 initialize the string with while attempting to . The remaining part
 of the function is covered by if()s which may lead to error messages
 like this:

 while attempting to 
 while attempting to in relation public.xyz of database abc
 while attempting to in database abc

 Although this may not be very likely (because
 ItemPointerIsValid((info-ctid))) should in this case not return
 false).

Yes, this is good point and even I have thought of it before sending
the patch. The reason why it seems not good to duplicate the
message is that I am not aware even of single case where
tuple info (chid, relinfo) will not be available by the time it will come
to wait on tuple in new call (XactLockTableWaitExtended), do you think
there exists any such case or it's just based on apprehension.
If there exists any such, then isn't it better to use earlier version of
API for that call site as is case we have left for
SnapBuildFindSnapshot?

If it's just apprehension, then I think it's better to have a check such
that if any of the tuple info is missing then don't add context and in
future if we have any such usecase then we can have multiple
messages.

 Attached you will find a new version of this patch; it slightly
 violates the translation guidelines as well: it assembles an error
 string (but it doesn't pass in altering data like ctid or things like
 that). I simply couldn't think of a nice solution without doing so,
 and after looking through the code there are a few cases
 (e.g. CheckTableNotInUse()) where this is done, too.

I could not see any problem in the modifications done by you,
but if the function is generic enough that it doesn't assume what
caller has set info, then why to assume that opr_name will always
be filled.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Alvaro Herrera
Here's an adjusted version.  In this one, the extra info is not used to
construct a string from pieces, but instead it puts it at the end, like
this:

LOG:  process 18899 still waiting for ShareLock on transaction 697 after 
1000.203 ms
CONTEXT:  while operating on tuple (0,2) in relation public.foo of database 
postgres: updating tuple

This way, each part can sensibly be translated.  In fact I did translate
one instance to test it at work, and it looks good to me:

LOG:  el proceso 22555 adquirió ShareLock en transacción 705 después de 
1514.017 ms
CONTEXT:  mientras se operaba en la tupla (0,2) en la relación public.foo 
de la base de datos «postgres»: actualizando tupla

Now there might be bikeshedding on the exact wording I've chosen for
each instance of context setup, but I expect it's a fairly minor point
now.

One remaining issue is that now ConditionalXactLockTableWait doesn't set
up error context info.  We could solve this by having a common routine
that serves both that one and XactLockTableWait (much like
Do_MultiXactIdWait does), but I'm not sure it's worth the trouble.
Thoughts?

Another thing that jumped at me is that passing a TID but not a Relation
is fairly useless as it stands.  We might try to add some more stuff
later, such as printing tuple contents as previous versions of the patch
did, but given the opposition the idea had previously, I'm not sure
that's ever going to happen.  If we decide that TID-but-not-Relation is
not a useful case, then we can trim the number of messages to translate.

Opinions on these points please?  I intend to push this patch tomorrow.


Note: the changes to backend/po/es.po are illustrative only.  Those are
not going in with the patch.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e2337ac..14837f9 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -105,11 +105,12 @@ static void GetMultiXactIdHintBits(MultiXactId multi, uint16 *new_infomask,
 	   uint16 *new_infomask2);
 static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
-static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
-int *remaining, uint16 infomask);
-static bool ConditionalMultiXactIdWait(MultiXactId multi,
-		   MultiXactStatus status, int *remaining,
-		   uint16 infomask);
+static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status, uint16 infomask,
+Relation rel, ItemPointer ctid, const char *opr,
+int *remaining);
+static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status,
+		   uint16 infomask, Relation rel, ItemPointer ctid,
+		   const char *opr, int *remaining);
 static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
 static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_modified,
 		bool *copy);
@@ -2714,8 +2715,9 @@ l1:
 		if (infomask  HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate, infomask,
+			relation, tp.t_data-t_ctid, deleting tuple,
+			NULL);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2741,7 +2743,8 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWait(xwait, relation, tp.t_data-t_ctid,
+			  deleting tuple);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3266,8 +3269,9 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, remain,
-			infomask);
+			MultiXactIdWait((MultiXactId) xwait, mxact_status, infomask,
+			relation, oldtup.t_data-t_ctid, updating tuple,
+			remain);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3341,7 +3345,8 @@ l2:
 			else
 			{
 /* wait for regular transaction to end */
-XactLockTableWait(xwait);
+XactLockTableWait(xwait, relation, oldtup.t_data-t_ctid,
+  updating tuple);
 LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 /*
@@ -4402,14 +4407,18 @@ l3:
 if (nowait)
 {
 	if (!ConditionalMultiXactIdWait((MultiXactId) xwait,
-	status, NULL, infomask))
+	status, infomask, relation,
+	tuple-t_data-t_ctid,
+	locking tuple, NULL))
 		ereport(ERROR,
 (errcode(ERRCODE_LOCK_NOT_AVAILABLE),
  errmsg(could not obtain lock on row in relation \%s\,
 		RelationGetRelationName(relation;
 }
 else
-	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+	MultiXactIdWait((MultiXactId) xwait, status, infomask,
+	relation, tuple-t_data-t_ctid,
+	locking tuple, NULL);
 
 /* if there are updates, follow the 

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Alvaro Herrera
Alvaro Herrera escribió:

 Another thing that jumped at me is that passing a TID but not a Relation
 is fairly useless as it stands.  We might try to add some more stuff
 later, such as printing tuple contents as previous versions of the patch
 did, but given the opposition the idea had previously, I'm not sure
 that's ever going to happen.  If we decide that TID-but-not-Relation is
 not a useful case, then we can trim the number of messages to translate.

Andres and I chatted a bit about this and came to these conclusions:

1. MyProcPort contains the database name; no need for the
get_database_name() call in there.

2. Since this is dealing with tuples obtained from tables, it's hard to
see a case where there would be no table or no database.  (What does a
TID mean without an accompanying relation, anyway?)

3. The MyProcPort thing is initialized quite early in the startup
sequence.  I don't think it's possible to get to a tuple-lock wait
without having the database initialized.

Therefore I think the only case worth considering here is when
database/relation/TID are all defined.  The other cases where there is
partial information are dead code; and the case where there is nothing
defined (such as the one in SnapBuildFindSnapshot) is already handled by
simply not setting up a context at all.


4. There are a few extant get_database_name(MyDatabaseId) calls that
could presumably be replaced by MyProcPort-database_name.  Or if we
want to get cute, hack get_database_name so that if dbid == MyDatabaseId
return MyProcPort-database_name.  (This would also affect callers of
get_database_name that use a value not hardcoded to MyDatabaseId).

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 1. MyProcPort contains the database name; no need for the
 get_database_name() call in there.

Wait. A. Minute.  This patch wants to print the current database name in
the message?  What on earth for?  What other error messages do you see
doing that?

It should print a table name and a TID.  Not more, not less.  If there's
any doubt as to whether you have both of those pieces of information,
the patch still needs work.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Andres Freund
On 2014-03-17 20:21:18 -0400, Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  1. MyProcPort contains the database name; no need for the
  get_database_name() call in there.
 
 Wait. A. Minute.  This patch wants to print the current database name in
 the message?  What on earth for?  What other error messages do you see
 doing that?

+many. At least Christian and I argued against it on those grounds. I am
not sure why Amit had at least temporarily won that battle.

Anybody wanting information about which database a message happened in
should just add %b to log_line_prefix.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Alvaro Herrera
Tom Lane escribió:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  1. MyProcPort contains the database name; no need for the
  get_database_name() call in there.
 
 Wait. A. Minute.  This patch wants to print the current database name in
 the message?  What on earth for?  What other error messages do you see
 doing that?

Heh, well, that's settled then :-)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Amit Kapila
On Tue, Mar 18, 2014 at 5:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 1. MyProcPort contains the database name; no need for the
 get_database_name() call in there.

 Wait. A. Minute.  This patch wants to print the current database name in
 the message?  What on earth for?  What other error messages do you see
 doing that?

The message for exclusive lock on tuple print the database information.

LOG:  process 3012 still waiting for ExclusiveLock on tuple (0,1) of
relation 57
499 of database 12045 after 1014.000 ms

I think database related info (dbid) will be printed in other Lock
related messages
as well, for example:

LOG:  process 1332 still waiting for AccessExclusiveLock on relation 16384 of da
tabase 12077 after 1014.000 ms

I believe the reason is to uniquely identify a tuple on which
transaction is waiting.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-17 Thread Amit Kapila
On Tue, Mar 18, 2014 at 3:38 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Here's an adjusted version.  In this one, the extra info is not used to
 construct a string from pieces, but instead it puts it at the end, like
 this:

 LOG:  process 18899 still waiting for ShareLock on transaction 697 after 
 1000.203 ms
 CONTEXT:  while operating on tuple (0,2) in relation public.foo of 
 database postgres: updating tuple

 This way, each part can sensibly be translated.  In fact I did translate
 one instance to test it at work, and it looks good to me:

 LOG:  el proceso 22555 adquirió ShareLock en transacción 705 después de 
 1514.017 ms
 CONTEXT:  mientras se operaba en la tupla (0,2) en la relación public.foo 
 de la base de datos «postgres»: actualizando tupla

 Now there might be bikeshedding on the exact wording I've chosen for
 each instance of context setup, but I expect it's a fairly minor point
 now.

 One remaining issue is that now ConditionalXactLockTableWait doesn't set
 up error context info.

ConditionalXactLockTableWait() is not going to block on lock which
is when this new context will be printed. So I think there is no need
to change it. Is there a case where it will be needed which I am
missing?

  We could solve this by having a common routine
 that serves both that one and XactLockTableWait (much like
 Do_MultiXactIdWait does), but I'm not sure it's worth the trouble.
 Thoughts?


 Therefore I think the only case worth considering here is when
 database/relation/TID are all defined.  The other cases where there is
 partial information are dead code; and the case where there is nothing
 defined (such as the one in SnapBuildFindSnapshot) is already handled by
 simply not setting up a context at all.

Right. So I think we should just keep one version of message.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-15 Thread Amit Kapila
On Thu, Mar 13, 2014 at 8:06 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, Mar 13, 2014 at 7:10 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Mar 13, 2014 at 12:45 AM, Amit Kapila amit.kapil...@gmail.com 
 wrote:
 _bt_doinsert - insert index tuple (X,Y) (here it will refer to index tuple
 location)

 I don't think that giving the index tuple location is going to be very
 helpful; can we get the TID for conflicting heap tuple?

 Yes, each index tuple contains reference to TID of heap tuple.

I have updated the patch to pass TID and operation information in
error context and changed some of the comments in code.
Let me know if the added operation information is useful, else
we can use better generic message in context.

Christian, can you please once confirm if the attached patch looks
okay to you?
According to me all the comments raised till now for this patch are
addressed, so I will mark it as Ready For Committer unless some
one feels we should do something else for this patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


show_table_name_and_tuple_in_lock_log_v9.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-13 Thread Robert Haas
On Thu, Mar 13, 2014 at 12:45 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 While attempting to operate in?  That seems like unhelpful
 weasel-wording.  I wonder if we ought to have separate messages for
 each possibility, like delete tuple (X,Y) when called from
 heap_delete(), update tuple (X,Y), check exclusion constraint on
 tuple (X,Y) when called from check_exclusion_constraint, etc.  That
 seems like it would be handy information to have.

 Okay, below are the distinct places from where we need to pass
 such information.

 heap_delete - delete tuple (X,Y)
 heap_update - update tuple (X,Y)
 heap_lock_tuple - lock tuple (X,Y)
 heap_lock_updated_tuple_rec - lock updated tuple (X,Y)
 _bt_doinsert - insert index tuple (X,Y) (here it will refer to index tuple
 location)

I don't think that giving the index tuple location is going to be very
helpful; can we get the TID for conflicting heap tuple?

 IndexBuildHeapScan - scan tuple (X,Y)
 EvalPlanQualFetch - fetch tuple (X,Y)

These two seem unhelpful to me.  For EvalPlanQualFetch maybe recheck
updated tuple would be good, and for IndexBuildHeapScan perhaps
checking uniqueness of tuple.

 check_exclusion_constraint - check exclusion constraint on tuple (X,Y)

 I think it might not be a big deal to update the patch to pass such info.
 Won't it effect the translatability guidelines as we need to have different
 translation message for each op?

Yes, we'll need a separate message for each.

 The other option could be we need to ensure which places are safe to
 pass tuple so that we can display whole tuple instead of just TID,
 for example the tuple we are passing from heap_lock_tuple() has been
 fetched using Dirty Snapshot (refer EvalPlanQualFetch() caller of
 heap_lock_tuple()), but still we can use it in error as it has been decided
 that it is live tuple and transaction can update it by the time it reaches
 XactLockTableWaitWithInfo(), so it is safe. I think we need to discuss
 and validate all places where ever we use Dirty/Any Snapshot to ensure
 that we can pass tuple from such a call, may be at end the result is
 we can pass tuple from most of locations, but still it needs to be done
 carefully.

Well, it's sounding like we can only display the whole tuple if (1)
the message level is less than ERROR and (2) the snapshot is an MVCC
snapshot.  That's an annoying and hard-to-document set of limitations.
 But we should be able to display the TID always, so I think we should
drop the part of the patch that tries to show tuple data and revisit
that in a future release if need be.

I don't feel too bad about that because it seems to me that showing
the TID is a big improvement over the status quo; right now, when you
get the information that transaction A is waiting for transaction B,
you know they're fighting over some tuple, but you have no idea which
one.  Even just having the relation name would help a lot, I bet, but
if you have the TID also, you can use a SELECT query with WHERE ctid =
'(X,Y)' to find the specific tuple of interest.  That's maybe not as
convenient as having all the data printed out in the log, and there
are certainly use cases it won't answer, but it's still a WHOLE lot
better than having absolutely NO information, which is what we've got
today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Well, it's sounding like we can only display the whole tuple if (1)
 the message level is less than ERROR and (2) the snapshot is an MVCC
 snapshot.  That's an annoying and hard-to-document set of limitations.
  But we should be able to display the TID always, so I think we should
 drop the part of the patch that tries to show tuple data and revisit
 that in a future release if need be.

+1.  That avoids the thing that was really bothering me about this
patch, which is that it seemed likely to create a whole new set of
failure modes.  Every case in which the data-printing effort could
fail is a case where the patch makes things *less* debuggable, not
more so.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-13 Thread Amit Kapila
On Thu, Mar 13, 2014 at 7:10 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Mar 13, 2014 at 12:45 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 _bt_doinsert - insert index tuple (X,Y) (here it will refer to index tuple
 location)

 I don't think that giving the index tuple location is going to be very
 helpful; can we get the TID for conflicting heap tuple?

Yes, each index tuple contains reference to TID of heap tuple.

 IndexBuildHeapScan - scan tuple (X,Y)
 EvalPlanQualFetch - fetch tuple (X,Y)

 These two seem unhelpful to me.  For EvalPlanQualFetch maybe recheck
 updated tuple would be good, and for IndexBuildHeapScan perhaps
 checking uniqueness of tuple.

For IndexBuildHeapScan, checking uniqueness of tuple may not be
always the case, as in case of HEAPTUPLE_DELETE_IN_PROGRESS,
it waits on tuple even for HOT.

 check_exclusion_constraint - check exclusion constraint on tuple (X,Y)

 I think it might not be a big deal to update the patch to pass such info.
 Won't it effect the translatability guidelines as we need to have different
 translation message for each op?

 Yes, we'll need a separate message for each.

In that case, can't we think of some generic word to use, because in Log
along with this message, it will print the SQL Statement causing this log
as well, so it might not be completely vague to use some generic word.


 Well, it's sounding like we can only display the whole tuple if (1)
 the message level is less than ERROR and (2) the snapshot is an MVCC
 snapshot.  That's an annoying and hard-to-document set of limitations.
  But we should be able to display the TID always, so I think we should
 drop the part of the patch that tries to show tuple data and revisit
 that in a future release if need be.

Agreed.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-13 Thread Alvaro Herrera
In this loop,

 + for (i = 0; i  desc-natts; i++)
 + {
 + char   *val;
 + int vallen;
 +

 + vallen = strlen(val);
 + if (vallen = maxfieldlen)
 + appendStringInfoString(buf, val);
 + else
 + {
 + vallen = pg_mbcliplen(val, vallen, 
 maxfieldlen);
 + appendBinaryStringInfo(buf, val, 
 vallen);
 + appendStringInfoString(buf, ...);
 + }
 + }

you're checking that each individual field doesn't go over maxfieldlen
chars (30), but if the fields are numerous, this could end up being very
long anyway.  I think you need to limit total length as well, and as
soon as buf.len exceeds some number of chars, exit the loop.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-12 Thread Amit Kapila
On Tue, Mar 11, 2014 at 2:32 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 On 11/03/14 13:23, Amit Kapila wrote:
 Could you please once check (if you are comfortable doing so) wherever
 this patch is passing tuple, whether it is okay to pass it based on 
 visibility
 rules, else I will again verify it once.

 I think I got all places, but it would be nice to have a confirmation.

How about in IndexBuildHeapScan()? This also uses SnapShotAny to
scan the heap data.

Normally in this function, passing tuple to lock routine should not be a
problem as it would have ensured to have exclusive lock on relation
before reaching this stage, but below comment in this function leads
me to think, that there can be problem during system catalog scan.

/*
* Since caller should hold ShareLock or better, normally
* the only way to see this is if it was inserted earlier
* in our own transaction. However, it can happen in
* system catalogs, since we tend to release write lock
* before commit there.  Give a warning if neither case
* applies.
*/

I could not immediately think of testcase which can validate it, but this is
certainly a point to ponder.
Do you think it is safe to pass tuple to XactLockTableWaitWithInfo()
in this function?


I think now other things in your patch are good, just these tuple visibility
validations are tricky and it is taking time to validate the paths, because
these gets called in nested paths where in few cases even dirty or snapshot
any scans also seems to be safe w.r.t displaying tuple. Anyway, today I
have checked most paths, may be one more time I will give a look with fresh
mind and then pass on to Committer.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-12 Thread Robert Haas
On Tue, Mar 11, 2014 at 3:53 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Places where tuple info not available

 LOG:  process 5788 still waiting for ShareLock on transaction 679 after 
 1014.000
  ms
 CONTEXT:  while attempting to operate in relation public.idx_t1 of 
 database
 postgres

The way the context message is assembled piecemeal in
XactLockTableWaitErrorContextCallback violates translation guidelines.
 You need to have completely separate strings for each variant.

While attempting to operate in?  That seems like unhelpful
weasel-wording.  I wonder if we ought to have separate messages for
each possibility, like delete tuple (X,Y) when called from
heap_delete(), update tuple (X,Y), check exclusion constraint on
tuple (X,Y) when called from check_exclusion_constraint, etc.  That
seems like it would be handy information to have.

Why can't check_exclusion_constraint, for example, pass the TID, so
that at least that much information is available?

I'm not very happy with the idea of including the tuple details only
when the level is less than ERROR.  For one thing, to do that in a way
that respects translatability guidelines will require two versions of
every string that would otherwise require only one.  For another
thing, it seems like it's punting a pretty important case.  If we're
gonna add context detail to lots of cases (instead only the still
waiting case that people probably mostly care about) then we should
actually print the details more-or-less consistently in all of those
cases, not pretend like a solution that only works in the narrow case
is more general than it really is.  I think we should really try hard
to make the amount of detail provided as uniform as possible across
all the cases, even if that means removing information from some cases
where it might have been available.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-12 Thread Amit Kapila
On Thu, Mar 13, 2014 at 12:16 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Mar 11, 2014 at 3:53 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Places where tuple info not available

 LOG:  process 5788 still waiting for ShareLock on transaction 679 after 
 1014.000
  ms
 CONTEXT:  while attempting to operate in relation public.idx_t1 of 
 database
 postgres

 The way the context message is assembled piecemeal in
 XactLockTableWaitErrorContextCallback violates translation guidelines.
  You need to have completely separate strings for each variant.

 While attempting to operate in?  That seems like unhelpful
 weasel-wording.  I wonder if we ought to have separate messages for
 each possibility, like delete tuple (X,Y) when called from
 heap_delete(), update tuple (X,Y), check exclusion constraint on
 tuple (X,Y) when called from check_exclusion_constraint, etc.  That
 seems like it would be handy information to have.

Okay, below are the distinct places from where we need to pass
such information.

heap_delete - delete tuple (X,Y)
heap_update - update tuple (X,Y)
heap_lock_tuple - lock tuple (X,Y)
heap_lock_updated_tuple_rec - lock updated tuple (X,Y)
_bt_doinsert - insert index tuple (X,Y) (here it will refer to index tuple
location)
IndexBuildHeapScan - scan tuple (X,Y)
EvalPlanQualFetch - fetch tuple (X,Y)
check_exclusion_constraint - check exclusion constraint on tuple (X,Y)

I think it might not be a big deal to update the patch to pass such info.
Won't it effect the translatability guidelines as we need to have different
translation message for each op?

 Why can't check_exclusion_constraint, for example, pass the TID, so
 that at least that much information is available?

I don't think there is as such any problem in passing TID, rather I think
if we can pass TID from all places, it will be a better way.

The other option could be we need to ensure which places are safe to
pass tuple so that we can display whole tuple instead of just TID,
for example the tuple we are passing from heap_lock_tuple() has been
fetched using Dirty Snapshot (refer EvalPlanQualFetch() caller of
heap_lock_tuple()), but still we can use it in error as it has been decided
that it is live tuple and transaction can update it by the time it reaches
XactLockTableWaitWithInfo(), so it is safe. I think we need to discuss
and validate all places where ever we use Dirty/Any Snapshot to ensure
that we can pass tuple from such a call, may be at end the result is
we can pass tuple from most of locations, but still it needs to be done
carefully.

 I'm not very happy with the idea of including the tuple details only
 when the level is less than ERROR.  For one thing, to do that in a way
 that respects translatability guidelines will require two versions of
 every string that would otherwise require only one.  For another
 thing, it seems like it's punting a pretty important case.  If we're
 gonna add context detail to lots of cases (instead only the still
 waiting case that people probably mostly care about) then we should
 actually print the details more-or-less consistently in all of those
 cases, not pretend like a solution that only works in the narrow case
 is more general than it really is.  I think we should really try hard
 to make the amount of detail provided as uniform as possible across
 all the cases, even if that means removing information from some cases
 where it might have been available.

Agreed and if we use TID, then it will address your concern.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-11 Thread Amit Kapila
On Mon, Mar 10, 2014 at 5:14 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 On 09/03/14 12:15, Amit Kapila wrote:
 [...] due to which the message it displays seems to be
 incomplete. Message it displays is as below:

 LOG:  process 1800 still waiting for ShareLock on transaction 679 after 
 1014.000
  ms
 CONTEXT:  while attempting to lock in relation public.idx_t1 of database 
 pos
 tgres

 Well, there is no tuple information available, so we have to leave it out.

 Here it is not clear what it attempts *lock in*

 Ok, I reworded the message as you suggested below. Should make this
 clearer as well.

As per your recent changes, wherever we have tuple info, it will be used in
message, else it will just use relinfo. So the message difference will be as
below which I think is okay.

Places where tuple info not available

LOG:  process 5788 still waiting for ShareLock on transaction 679 after 1014.000
 ms
CONTEXT:  while attempting to operate in relation public.idx_t1 of database
postgres

Places where tuple info available

LOG:  process 5788 still waiting for ShareLock on transaction 686 after 1014.000
 ms
CONTEXT:  while attempting to operate on tuple (0,1) with values (1, aaa, bbb) i
n relation public.t1 of database postgres

Actually, I had thought of passing tuple info from places where currently
it is not passing, but it will need some extra code which I think is not
worthy for message information.
For Example in case of _bt_doinsert(), if we have to pass index tuple
info, we might need to change prototype of XactLockTableWaitWithInfo()
such that it can accept both IndexTuple and HeapTuple and then use
inside function
accordingly. Also I think this is okay to not display Index tuple here, as it
is still not visible. For other cases where we are using DirtySnapshot,
the things can be more complex to identify if tuple can be logged. So
I think it's better to leave logging it as you have done in patch.


 Can you explain why you prefer the WithInfo naming? Just curious, it
 seemed to me the more descriptive name should have been preferred.

Actually, apart from MultiXactIdWaitWithErrorContext(), I had
thought of below options:

1. Change the prototype of MultiXactIdWait()/XactLockTableWait()
to pass rel and tuple info and make new functions XactLockTableWait_Internal()
which will do the actual work, but to achieve that we need call
internal function from some places where top level is not getting
called. Such coding convetion is used at few places (heap_beginscan_internal).

2. Name new functions as  MultiXactIdWaitExtended()/XactLockTableWaitExtended()
or MultiXactIdWaitEx()/XactLockTableWaitEx().
You can find some other similar functions (ReadBufferExtended,
relation_openrv_extended)

3. MultiXactIdWaitWithInfo()/XactLockTableWaitWithInfo()

Earlier I found option 3 as a good choice, but now again thinking on
it I am leaning towards option 2.


 6.
 @@ -1981,7 +1981,8 @@ EvalPlanQualFetch(EState *estate, Relation
 relation, int lockmode,
   if (TransactionIdIsValid(SnapshotDirty.xmax))
   {
   ReleaseBuffer(buffer);
 - XactLockTableWait(SnapshotDirty.xmax);
 + XactLockTableWaitWithInfo(relation, tuple,
 +  SnapshotDirty.xmax);

 In function EvalPlanQualFetch(), we are using SnapshotDirty to fetch
 the tuple, so I think there is a chance that it will log the tuple which
 otherwise will not be visible. I don't think this is right.

 Hm, after checking source I tend to agree. I replaced it with NULL.

Today, again looking into it, I found that function
heap_lock_updated_tuple_rec() is using SnapshotAny to fetch the tuple
and I think for this case also it's not safe to Log the tuple.

Could you please once check (if you are comfortable doing so) wherever
this patch is passing tuple, whether it is okay to pass it based on visibility
rules, else I will again verify it once.

 8.
  void
  WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode)
  {
 - List   *l;
 + List   *l;

 Extra spacing not needed.

 What is the policy to do here?

The simple rule I follow is don't touch the code which has no relation
to current patch.

 [...] and recently this is used in function SnapBuildFindSnapshot().

 Hm, should we add the context there, too?

I don't think you can use it here as there is no relation or tuple
information available. This is a unique kind of usage for this API where
it is used to wait for a transaction to complete without operating on
relation/tuple, whereas all other places this is used to wait if some other
transaction is operating on a tuple.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-11 Thread Christian Kruse
Hi,

On 11/03/14 13:23, Amit Kapila wrote:
 [… snip …]
 So I think it's better to leave logging it as you have done in
 patch.

Agreed.

 […]
 2. Name new functions as  
 MultiXactIdWaitExtended()/XactLockTableWaitExtended()
 or MultiXactIdWaitEx()/XactLockTableWaitEx().
 You can find some other similar functions (ReadBufferExtended,
 relation_openrv_extended)
 
 3. MultiXactIdWaitWithInfo()/XactLockTableWaitWithInfo()
 
 Earlier I found option 3 as a good choice, but now again thinking on
 it I am leaning towards option 2.

Changing it once again will only cause more work and won't do a big
difference. So I suggest sticking with the current function names.

 Today, again looking into it, I found that function
 heap_lock_updated_tuple_rec() is using SnapshotAny to fetch the tuple
 and I think for this case also it's not safe to Log the tuple.
 
 Could you please once check (if you are comfortable doing so) wherever
 this patch is passing tuple, whether it is okay to pass it based on visibility
 rules, else I will again verify it once.

I think I got all places, but it would be nice to have a confirmation.

  [… policy regarding whitespaces …]
 The simple rule I follow is don't touch the code which has no relation
 to current patch.

OK. Thanks.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 71ec740..bcc656a 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithInfo(Relation rel, HeapTuple tup, MultiXactId multi,
+	MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2714,8 +2716,8 @@ l1:
 		if (infomask  HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWaitWithInfo(relation, tp, (MultiXactId) xwait,
+	MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2741,7 +2743,7 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithInfo(relation, tp, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3266,8 +3268,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, remain,
-			infomask);
+			MultiXactIdWaitWithInfo(relation, oldtup,
+	   (MultiXactId) xwait, mxact_status, remain, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3306,7 +3308,7 @@ l2:
 			/*
 			 * There was no UPDATE in the MultiXact; or it aborted. No
 			 * TransactionIdIsInProgress() call needed here, since we called
-			 * MultiXactIdWait() above.
+			 * MultiXactIdWaitWithInfo() above.
 			 */
 			if (!TransactionIdIsValid(update_xact) ||
 TransactionIdDidAbort(update_xact))
@@ -3341,7 +3343,7 @@ l2:
 			else
 			{
 /* wait for regular transaction to end */
-XactLockTableWait(xwait);
+XactLockTableWaitWithInfo(relation, oldtup, xwait);
 LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 /*
@@ -4409,7 +4411,8 @@ l3:
 		RelationGetRelationName(relation;
 }
 else
-	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+	MultiXactIdWaitWithInfo(relation, tuple,
+(MultiXactId) xwait, status, NULL, infomask);
 
 /* if there are updates, follow the update chain */
 if (follow_updates 
@@ -4464,7 +4467,7 @@ l3:
 		RelationGetRelationName(relation;
 }
 else
-	XactLockTableWait(xwait);
+	XactLockTableWaitWithInfo(relation, tuple, xwait);
 
 /* if there are updates, follow the update chain */
 if (follow_updates 
@@ -5151,7 +5154,7 @@ l4:
 	if (needwait)
 	{
 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-		XactLockTableWait(members[i].xid);
+		XactLockTableWaitWithInfo(rel, NULL, members[i].xid);
 		pfree(members);
 		goto l4;
 	}
@@ -5211,7 +5214,7 @@ l4:
 if (needwait)
 {
 	LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-	XactLockTableWait(rawxmax);
+	XactLockTableWaitWithInfo(rel, NULL, rawxmax);
 	goto l4;
 }
 if (res != HeapTupleMayBeUpdated)
@@ -6169,6 +6172,33 @@ MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 }
 
 /*
+ * MultiXactIdWaitWithInfo
+ *		Sets up the error context callback for reporting tuple
+ *		information and relation for a lock to aquire
+ *
+ * Use this instead of calling MultiXactIdWait()
+ */
+static void

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-10 Thread Christian Kruse
Hi,

thanks for the continued review.

On 09/03/14 12:15, Amit Kapila wrote:
 1.
  ISTM that you should be printing just the value and the unique index
  there, and not any information about the tuple proper.
 
 Good point, I will have a look at this.
 
 This point is still not handled in patch, […]

There have been some complaints about this by Andres, so I left it.

 […] due to which the message it displays seems to be
 incomplete. Message it displays is as below:
 
 LOG:  process 1800 still waiting for ShareLock on transaction 679 after 
 1014.000
  ms
 CONTEXT:  while attempting to lock in relation public.idx_t1 of database 
 pos
 tgres

Well, there is no tuple information available, so we have to leave it out.

 Here it is not clear what it attempts *lock in*

Ok, I reworded the message as you suggested below. Should make this
clearer as well.

 2. In function XactLockTableWaitErrorContextCallback(), ignore dropped
 columns in tuple, else it will lead to following failure:
 […]
 Problem is in Session-2 (cache lookup failed), when it tries to print values
 for dropped attribute.

Urghs. I really thought I had done this in the patch before. Thanks
for pointing out, fixed in attached patch.

 3.
  in relation \%s\.\%s\ of database %s,
 Why to display only relation name in quotes?
 I think database name should also be displayed in quotes.

Didn't think that it would make sense; the quoting makes sense for the
schema and relation name, but I did not see any need to quote the
database name. However, fixed in attached patch.

 4.
 Context message:
 while attempting to lock tuple (0,2) .
 
 I think it will be better to rephrase it (may be by using 'operate on'
 instead of 'lock').
 The reason is that actually we trying to lock on transaction, so it doesn't 
 make
 good sense to use lock on tuple

Fixed.

 5.
 + XactLockTableWaitWithInfo(relation, tp, xwait);
 
 + MultiXactIdWaitWithErrorContext(relation, tp, (MultiXactId) xwait,
 +  MultiXactStatusUpdate, NULL, infomask);
 
 I think we should make the name of MultiXactIdWaitWithErrorContext()
 similar to XactLockTableWaitWithInfo.

Fixed as well.

Can you explain why you prefer the WithInfo naming? Just curious, it
seemed to me the more descriptive name should have been preferred.

 6.
 @@ -1981,7 +1981,8 @@ EvalPlanQualFetch(EState *estate, Relation
 relation, int lockmode,
   if (TransactionIdIsValid(SnapshotDirty.xmax))
   {
   ReleaseBuffer(buffer);
 - XactLockTableWait(SnapshotDirty.xmax);
 + XactLockTableWaitWithInfo(relation, tuple,
 +  SnapshotDirty.xmax);
 
 In function EvalPlanQualFetch(), we are using SnapshotDirty to fetch
 the tuple, so I think there is a chance that it will log the tuple which
 otherwise will not be visible. I don't think this is right.

Hm, after checking source I tend to agree. I replaced it with NULL.

 8.
  void
  WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode)
  {
 - List   *l;
 + List   *l;
 
 Extra spacing not needed.

What is the policy to do here? The extra spaces have been added by
pg_indent, and there have been more changes in a couple of other files
which I had to drop manually as well. How should this been handled
generally?

 9.
 /*
  * XactLockTableWaitErrorContextCallback
  * error context callback set up by
  * XactLockTableWaitWithInfo. It reports some
  * tuple information and the relation of a lock to aquire
  *
  */
 Please improve indentation of above function header

Heh. Seems that Emacs' fill-paragraph messed this up. Thanks, fixed.

 10.
 +#include access/htup.h
 +
 +struct XactLockTableWaitLockInfo
 +{
 + Relation rel;
 + HeapTuple tuple;
 +};
 
 I think it is better to add comments for this structure.
 You can refer comments for struct HeapUpdateFailureData

Fixed.

 
 11.
 + *
 + * Use this instead of calling XactTableLockWait()
 
 In above comment, function name used is wrong and I am not sure this
 is right statement to use because still we are using XactLockTableWait()
 at few places like in function Do_MultiXactIdWait() […]

Fixed function name, but I'd like to keep the wording;
Do_MultiXactIdWait() is wrapped by MultiXactIdWaitWithInfo() and
therefore sets up the context as well.

 […] and recently this is used in function SnapBuildFindSnapshot().

Hm, should we add the context there, too?

 12.
 heap_update()
 {
 ..
 ..
 /*
  * There was no UPDATE in the MultiXact; or it aborted. No
  * TransactionIdIsInProgress() call needed here, since we called
  * MultiXactIdWait() above.
 
 Change the function name in above comment.

Fixed.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 71ec740..0cf3537 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void 

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-10 Thread Robert Haas
On Mon, Mar 10, 2014 at 7:44 AM, Christian Kruse
christ...@2ndquadrant.com wrote:
 [ response to review ]

This response seems to have made no mention of point #7 from Amit's
review, which seems to me to be a rather important one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-10 Thread Christian Kruse
Hi,

On 10/03/14 14:59, Robert Haas wrote:
 On Mon, Mar 10, 2014 at 7:44 AM, Christian Kruse
 christ...@2ndquadrant.com wrote:
  [ response to review ]
 
 This response seems to have made no mention of point #7 from Amit's
 review, which seems to me to be a rather important one.

Just didn't notice it because the previous point was the same. NULL'd
the tuple there, too:

--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1307,7 +1307,7 @@ retry:
if (TransactionIdIsValid(xwait))
{
index_endscan(index_scan);
-   XactLockTableWait(xwait);
+   XactLockTableWaitWithInfo(heap, NULL, xwait);
goto retry;
}

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



pgpkcb2npz9m2.pgp
Description: PGP signature


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-08 Thread Amit Kapila
On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 On 25/02/14 16:11, Robert Haas wrote:
 Reading this over, I'm not sure I understand why this is a CONTEXT at
 all and not just a DETAIL for the particular error message that it's
 supposed to be decorating.  Generally CONTEXT should be used for
 information that will be relevant to all errors in a given code path,
 and DETAIL for extra information specific to a particular error.

 Because there is more than one scenario where this context is useful,
 not just log_lock_wait messages.

 If we're going to stick with CONTEXT, we could rephrase it like this:

 CONTEXT: while attempting to lock tuple (1,2) in relation with OID 3456

 or when the relation name is known:

 CONTEXT: while attempting to lock tuple (1,2) in relation public.foo

 Accepted. Patch attached.

Thanks. I have done review of this patch and found couple of things
which I feel should be addressed.

1.
 ISTM that you should be printing just the value and the unique index
 there, and not any information about the tuple proper.

Good point, I will have a look at this.

This point is still not handled in patch, due to which the message
it displays seems to be incomplete. Message it displays is as below:

LOG:  process 1800 still waiting for ShareLock on transaction 679 after 1014.000
 ms
CONTEXT:  while attempting to lock in relation public.idx_t1 of database pos
tgres

Here it is not clear what it attempts *lock in*

2. In function XactLockTableWaitErrorContextCallback(), ignore dropped
columns in tuple, else it will lead to following failure:

Session-1

postgres=# select * from t1;
 c1 | c2
+
  2 | 99
(1 row)


postgres=# Alter table t1 drop column c2;
ALTER TABLE
postgres=# begin;
BEGIN
postgres=# delete from t1 where c1=2;

Session-2
postgres=# begin;
BEGIN
postgres=# delete from t1 where c1=2;
ERROR:  cache lookup failed for type 0
CONTEXT:  while attempting to lock tuple (0,2) in relation public.t1 of data
base postgres

Problem is in Session-2 (cache lookup failed), when it tries to print values
for dropped attribute.

3.
 in relation \%s\.\%s\ of database %s,
Why to display only relation name in quotes?
I think database name should also be displayed in quotes.

4.
Context message:
while attempting to lock tuple (0,2) .

I think it will be better to rephrase it (may be by using 'operate on'
instead of 'lock').
The reason is that actually we trying to lock on transaction, so it doesn't make
good sense to use lock on tuple

5.
+ XactLockTableWaitWithInfo(relation, tp, xwait);

+ MultiXactIdWaitWithErrorContext(relation, tp, (MultiXactId) xwait,
+  MultiXactStatusUpdate, NULL, infomask);

I think we should make the name of MultiXactIdWaitWithErrorContext()
similar to XactLockTableWaitWithInfo.

6.
@@ -1981,7 +1981,8 @@ EvalPlanQualFetch(EState *estate, Relation
relation, int lockmode,
  if (TransactionIdIsValid(SnapshotDirty.xmax))
  {
  ReleaseBuffer(buffer);
- XactLockTableWait(SnapshotDirty.xmax);
+ XactLockTableWaitWithInfo(relation, tuple,
+  SnapshotDirty.xmax);

In function EvalPlanQualFetch(), we are using SnapshotDirty to fetch
the tuple, so I think there is a chance that it will log the tuple which
otherwise will not be visible. I don't think this is right.


7.
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1307,7 +1307,7 @@ retry:
  if (TransactionIdIsValid(xwait))
  {
  index_endscan(index_scan);
- XactLockTableWait(xwait);
+ XactLockTableWaitWithInfo(heap, tup, xwait);

In function check_exclusion_constraint(), DirtySnapshot is used,
so it seems to me that this will also have the problem of logging
of tuple which otherwise will not be visible.

8.
 void
 WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode)
 {
- List   *l;
+ List   *l;

Extra spacing not needed.

9.
/*
 * XactLockTableWaitErrorContextCallback
 * error context callback set up by
 * XactLockTableWaitWithInfo. It reports some
 * tuple information and the relation of a lock to aquire
 *
 */
Please improve indentation of above function header

10.
+#include access/htup.h
+
+struct XactLockTableWaitLockInfo
+{
+ Relation rel;
+ HeapTuple tuple;
+};

I think it is better to add comments for this structure.
You can refer comments for struct HeapUpdateFailureData


11.
+ *
+ * Use this instead of calling XactTableLockWait()

In above comment, function name used is wrong and I am not sure this
is right statement to use because still we are using XactLockTableWait()
at few places like in function Do_MultiXactIdWait() and recently this is used
in function SnapBuildFindSnapshot().

12.
heap_update()
{
..
..
/*
 * There was no UPDATE in the MultiXact; or it aborted. No
 * TransactionIdIsInProgress() call needed here, since we called
 * MultiXactIdWait() above.

Change the function name in above comment.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To 

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-04 Thread Simon Riggs
On 4 March 2014 04:18, Amit Kapila amit.kapil...@gmail.com wrote:
 On Mon, Mar 3, 2014 at 3:38 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-28 20:55:20 +0530, Amit Kapila wrote:
 On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse
  Well, as I already stated: we don't. I copied the behavior we use in
  CHECK constraints (ExecBuildSlotValueDescription()).

 I think now more people are of opinion that displaying whole tuple is
 not useful. I believe it is good to go ahead by displaying just primary key
 for this case and move ahead.

 The patch does not, and has never, printed the full tuple. What it does
 is copying the behaviour of printing the tuple for check constraint
 violations, which is truncating the tuple after a certain length.

 I know that patch truncates the values if they are greater than certain
 length (30), but the point is why it is not sufficient to have tuple location
 (and primary key if available) which uniquely identifies any tuple?

The patch follows a pattern established elsewhere, so arguing for this
change would be a change in existing behaviour that is outside the
scope of this patch. Please raise a new thread if you wish that
change, especially since it is opposed here.

This patch is small, but adds important functionality for diagnosing
user's locking problems.

If you have alterations to the patch please list them concisely so
people can understand them and progress towards getting this committed
or rejected. Thanks.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-04 Thread Amit Kapila
On Tue, Mar 4, 2014 at 2:18 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 4 March 2014 04:18, Amit Kapila amit.kapil...@gmail.com wrote:
 I know that patch truncates the values if they are greater than certain
 length (30), but the point is why it is not sufficient to have tuple location
 (and primary key if available) which uniquely identifies any tuple?

 The patch follows a pattern established elsewhere, so arguing for this
 change would be a change in existing behaviour that is outside the
 scope of this patch. Please raise a new thread if you wish that
 change, especially since it is opposed here.

Okay, I very well got this point and I was also not completely sure what
is the best thing to do for this specific point, thats why I had asked for
opinion of others upthread and there is a mixed feedback about it.
I think best thing is to leave this point for final committer's decision
and complete the other review/verification of patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-03 Thread Andres Freund
On 2014-02-28 20:55:20 +0530, Amit Kapila wrote:
 On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse
  I wouldn't be inclined to dump the whole tuple under any
  circumstances.  That could be a lot more data than what you want
  dumped in your log.  The PK could already be somewhat unreasonably
  large, but the whole tuple could be a lot more unreasonably large.
 
  Well, as I already stated: we don't. I copied the behavior we use in
  CHECK constraints (ExecBuildSlotValueDescription()).
 
 I think now more people are of opinion that displaying whole tuple is
 not useful. I believe it is good to go ahead by displaying just primary key
 for this case and move ahead.

The patch does not, and has never, printed the full tuple. What it does
is copying the behaviour of printing the tuple for check constraint
violations, which is truncating the tuple after a certain length.

I don't think changing this in an isolated manner to the primary key
would be a good idea. Let's use the same logic here, and if somebody
wants to argue for always using the primary key instead of a truncated
tuple, let them submit a separate patch.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-03 Thread Andres Freund
On 2014-03-01 13:29:18 +0530, Amit Kapila wrote:
 With new patch, the message while updating locked rows will be displayed
 as below:
 
 LOG:  process 364 still waiting for ShareLock on transaction 678 after
 1014.000ms
 CONTEXT:  while attempting to lock tuple (0,2) with values (2) in relation 
 publ
 ic.t1 of database postgres
 
 LOG:  process 364 acquired ShareLock on transaction 678 after 60036.000 ms
 CONTEXT:  while attempting to lock tuple (0,2) with values (2) in relation 
 publ
 ic.t1 of database postgres
 
 Now I am not sure, if the second message is an improvement, as what it sounds
 is while attempting to lock tuple, it got shared lock on
 transaction'. If you, Robert
 or other feels it is okay, then we can retain it as it is in patch
 else I think either
 we need to rephrase it or may be try with some other way (global variable) 
 such
 that it appears only for required case. I feel the way Robert has
 suggested i.e to
 make it as Detail of particular message (we might need to use global variable 
 to
 pass certain info) is better way and will have minimal impact on the cases 
 where
 this additional information needs to be displayed.

I really don't understand the origins of your arguments here. Why
shouldn't a row lock caused by an UPDATE be relevant? It's currenty very
hard to debug those, just as it's hard to debug tuple/row locks caused
by explicit FOR SHARE/UPDATE/...
And if your argument is that the message might be displayed when a
explicit query cancel (statement timeout) instead of a deadlock_timeout
arrives, so what? It's still correct, and important information? After
all it seems to have waited long enough to get cancelled.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-03 Thread Amit Kapila
On Mon, Mar 3, 2014 at 3:46 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-03-01 13:29:18 +0530, Amit Kapila wrote:
 With new patch, the message while updating locked rows will be displayed
 as below:

 LOG:  process 364 still waiting for ShareLock on transaction 678 after
 1014.000ms
 CONTEXT:  while attempting to lock tuple (0,2) with values (2) in relation 
 publ
 ic.t1 of database postgres

 LOG:  process 364 acquired ShareLock on transaction 678 after 60036.000 ms
 CONTEXT:  while attempting to lock tuple (0,2) with values (2) in relation 
 publ
 ic.t1 of database postgres

 Now I am not sure, if the second message is an improvement, as what it sounds
 is while attempting to lock tuple, it got shared lock on
 transaction'. If you, Robert
 or other feels it is okay, then we can retain it as it is in patch
 else I think either
 we need to rephrase it or may be try with some other way (global variable) 
 such
 that it appears only for required case. I feel the way Robert has
 suggested i.e to
 make it as Detail of particular message (we might need to use global 
 variable to
 pass certain info) is better way and will have minimal impact on the cases 
 where
 this additional information needs to be displayed.

 I really don't understand the origins of your arguments here. Why
 shouldn't a row lock caused by an UPDATE be relevant?

The point I am trying to say is about below part of statement in
Context message:
while attempting to lock tuple (0,2) .

In above situation, we are actually trying to acquire a lock on transaction to
operate on a tuple, so I think it will be better to rephrase it (may be by using
'operate on' instead of 'lock').


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-03 Thread Amit Kapila
On Mon, Mar 3, 2014 at 3:38 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-28 20:55:20 +0530, Amit Kapila wrote:
 On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse
  Well, as I already stated: we don't. I copied the behavior we use in
  CHECK constraints (ExecBuildSlotValueDescription()).

 I think now more people are of opinion that displaying whole tuple is
 not useful. I believe it is good to go ahead by displaying just primary key
 for this case and move ahead.

 The patch does not, and has never, printed the full tuple. What it does
 is copying the behaviour of printing the tuple for check constraint
 violations, which is truncating the tuple after a certain length.

I know that patch truncates the values if they are greater than certain
length (30), but the point is why it is not sufficient to have tuple location
(and primary key if available) which uniquely identifies any tuple?

I understand that by displaying more data from tuple, you can locate the
tuple more easily, but still adding more information than required doesn't
seem to advisable.

One more thing, I have noticed that in ExecConstraints() where we are
using ExecBuildSlotValueDescription() to display tuple values doesn't
use tuple location, which we are using in current message, so shouldn't
that be sufficient reason not to display whole tuple in current case?

 I don't think changing this in an isolated manner to the primary key
 would be a good idea. Let's use the same logic here, and if somebody
 wants to argue for always using the primary key instead of a truncated
 tuple, let them submit a separate patch.

I am not sure if we can say that this is an isolated way, as we are already
displaying just tuple location for other cases where we take Exclusive
Lock on tuple, for example:
LOG:  process 3012 still waiting for ExclusiveLock on tuple (0,1) of
relation 57499 of database 12045 after 1014.000 ms.

Even if we display all  (truncated) column values for current case,
do you intend to display such information for above message as
well, otherwise we might improve the situation for one case but not
for other.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-01 Thread Amit Kapila
On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 On 25/02/14 16:11, Robert Haas wrote:
 Reading this over, I'm not sure I understand why this is a CONTEXT at
 all and not just a DETAIL for the particular error message that it's
 supposed to be decorating.  Generally CONTEXT should be used for
 information that will be relevant to all errors in a given code path,
 and DETAIL for extra information specific to a particular error.

 Because there is more than one scenario where this context is useful,
 not just log_lock_wait messages.

 If we're going to stick with CONTEXT, we could rephrase it like this:

 CONTEXT: while attempting to lock tuple (1,2) in relation with OID 3456

 or when the relation name is known:

 CONTEXT: while attempting to lock tuple (1,2) in relation public.foo

 Accepted. Patch attached.

With new patch, the message while updating locked rows will be displayed
as below:

LOG:  process 364 still waiting for ShareLock on transaction 678 after
1014.000ms
CONTEXT:  while attempting to lock tuple (0,2) with values (2) in relation publ
ic.t1 of database postgres

LOG:  process 364 acquired ShareLock on transaction 678 after 60036.000 ms
CONTEXT:  while attempting to lock tuple (0,2) with values (2) in relation publ
ic.t1 of database postgres

Now I am not sure, if the second message is an improvement, as what it sounds
is while attempting to lock tuple, it got shared lock on
transaction'. If you, Robert
or other feels it is okay, then we can retain it as it is in patch
else I think either
we need to rephrase it or may be try with some other way (global variable) such
that it appears only for required case. I feel the way Robert has
suggested i.e to
make it as Detail of particular message (we might need to use global variable to
pass certain info) is better way and will have minimal impact on the cases where
this additional information needs to be displayed.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-28 Thread Amit Kapila
On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 Hi,

 On 25/02/14 16:11, Robert Haas wrote:
 On Mon, Feb 24, 2014 at 10:13 AM, Christian Kruse
 christ...@2ndquadrant.com wrote:
  To be honest, I don't like the idea of setting up this error context
  only for log_lock_wait messages. This sounds unnecessary complex to me
  and I think that in the few cases where this message doesn't add a
  value (and thus is useless) don't justify such complexity.

 Reading this over, I'm not sure I understand why this is a CONTEXT at
 all and not just a DETAIL for the particular error message that it's
 supposed to be decorating.  Generally CONTEXT should be used for
 information that will be relevant to all errors in a given code path,
 and DETAIL for extra information specific to a particular error.

 Because there is more than one scenario where this context is useful,
 not just log_lock_wait messages.

I think using this context in more scenario's can do harm for certain cases
as mentioned upthread.

However I think the other reason for not putting in Detail is that for other
cases when this message is displayed, it already display similar info in
message as below:

LOG:  process 4656 still waiting for ExclusiveLock on tuple (0,1) of relation 57
513 of database 12045 after 1046.000 ms

Also we might not have the appropriate info available at the place where
this message is generated unless we set global variable.

We have only information what LockTag has and for shared lock case it
doesn't have the info about tuple and relation.

 If we're going to stick with CONTEXT, we could rephrase it like this:

 CONTEXT: while attempting to lock tuple (1,2) in relation with OID 3456

 or when the relation name is known:

 CONTEXT: while attempting to lock tuple (1,2) in relation public.foo

 Accepted. Patch attached.


 I wouldn't be inclined to dump the whole tuple under any
 circumstances.  That could be a lot more data than what you want
 dumped in your log.  The PK could already be somewhat unreasonably
 large, but the whole tuple could be a lot more unreasonably large.

 Well, as I already stated: we don't. I copied the behavior we use in
 CHECK constraints (ExecBuildSlotValueDescription()).

I think now more people are of opinion that displaying whole tuple is
not useful. I believe it is good to go ahead by displaying just primary key
for this case and move ahead.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-27 Thread Christian Kruse
Hi,

On 25/02/14 16:11, Robert Haas wrote:
 On Mon, Feb 24, 2014 at 10:13 AM, Christian Kruse
 christ...@2ndquadrant.com wrote:
  To be honest, I don't like the idea of setting up this error context
  only for log_lock_wait messages. This sounds unnecessary complex to me
  and I think that in the few cases where this message doesn't add a
  value (and thus is useless) don't justify such complexity.
 
 Reading this over, I'm not sure I understand why this is a CONTEXT at
 all and not just a DETAIL for the particular error message that it's
 supposed to be decorating.  Generally CONTEXT should be used for
 information that will be relevant to all errors in a given code path,
 and DETAIL for extra information specific to a particular error.

Because there is more than one scenario where this context is useful,
not just log_lock_wait messages.

 If we're going to stick with CONTEXT, we could rephrase it like this:
 
 CONTEXT: while attempting to lock tuple (1,2) in relation with OID 3456
 
 or when the relation name is known:
 
 CONTEXT: while attempting to lock tuple (1,2) in relation public.foo

Accepted. Patch attached.

  Displaying whole tuple doesn't seem to be the most right way
  to get debug information and especially in this case we are
  already displaying tuple offset(ctid) which is unique identity
  to identify a tuple. It seems to me that it is sufficient to display
  unique value of tuple if present.
  I understand that there is no clear issue here, so may be if others also
  share their opinion then it will be quite easy to take a call.
 
 I wouldn't be inclined to dump the whole tuple under any
 circumstances.  That could be a lot more data than what you want
 dumped in your log.  The PK could already be somewhat unreasonably
 large, but the whole tuple could be a lot more unreasonably large.

Well, as I already stated: we don't. I copied the behavior we use in
CHECK constraints (ExecBuildSlotValueDescription()).

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a771ccb..a04414e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithErrorContext(Relation rel, HeapTuple tup, MultiXactId multi,
+	MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2703,8 +2705,8 @@ l1:
 		if (infomask  HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWaitWithErrorContext(relation, tp, (MultiXactId) xwait,
+	  MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2730,7 +2732,7 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithInfo(relation, tp, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3255,8 +3257,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, remain,
-			infomask);
+			MultiXactIdWaitWithErrorContext(relation, oldtup,
+	   (MultiXactId) xwait, mxact_status, remain, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3330,7 +3332,7 @@ l2:
 			else
 			{
 /* wait for regular transaction to end */
-XactLockTableWait(xwait);
+XactLockTableWaitWithInfo(relation, oldtup, xwait);
 LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 /*
@@ -4398,7 +4400,8 @@ l3:
 		RelationGetRelationName(relation;
 }
 else
-	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+	MultiXactIdWaitWithErrorContext(relation, tuple,
+(MultiXactId) xwait, status, NULL, infomask);
 
 /* if there are updates, follow the update chain */
 if (follow_updates 
@@ -4453,7 +4456,7 @@ l3:
 		RelationGetRelationName(relation;
 }
 else
-	XactLockTableWait(xwait);
+	XactLockTableWaitWithInfo(relation, tuple, xwait);
 
 /* if there are updates, follow the update chain */
 if (follow_updates 
@@ -5140,7 +5143,7 @@ l4:
 	if (needwait)
 	{
 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-		XactLockTableWait(members[i].xid);
+		XactLockTableWaitWithInfo(rel, mytup, members[i].xid);
 		pfree(members);
 		goto l4;
 	}
@@ -5200,7 +5203,7 @@ l4:
 if (needwait)
 {
 	LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-	XactLockTableWait(rawxmax);
+	XactLockTableWaitWithInfo(rel, mytup, 

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-25 Thread Robert Haas
On Mon, Feb 24, 2014 at 10:13 AM, Christian Kruse
christ...@2ndquadrant.com wrote:
 To be honest, I don't like the idea of setting up this error context
 only for log_lock_wait messages. This sounds unnecessary complex to me
 and I think that in the few cases where this message doesn't add a
 value (and thus is useless) don't justify such complexity.

Reading this over, I'm not sure I understand why this is a CONTEXT at
all and not just a DETAIL for the particular error message that it's
supposed to be decorating.  Generally CONTEXT should be used for
information that will be relevant to all errors in a given code path,
and DETAIL for extra information specific to a particular error.

If we're going to stick with CONTEXT, we could rephrase it like this:

CONTEXT: while attempting to lock tuple (1,2) in relation with OID 3456

or when the relation name is known:

CONTEXT: while attempting to lock tuple (1,2) in relation public.foo

 Displaying whole tuple doesn't seem to be the most right way
 to get debug information and especially in this case we are
 already displaying tuple offset(ctid) which is unique identity
 to identify a tuple. It seems to me that it is sufficient to display
 unique value of tuple if present.
 I understand that there is no clear issue here, so may be if others also
 share their opinion then it will be quite easy to take a call.

I wouldn't be inclined to dump the whole tuple under any
circumstances.  That could be a lot more data than what you want
dumped in your log.  The PK could already be somewhat unreasonably
large, but the whole tuple could be a lot more unreasonably large.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-24 Thread Christian Kruse
Hi,

attached you will find a new version of the patch, removing the ctid
text but leaving the ctid itself in the message.

On 23/02/14 11:14, Amit Kapila wrote:
  In general, why I am suggesting to restrict display of newly added
  context for the case it is added to ensure that it doesn't get started
  displaying in other cases where it might make sense or might not
  make sense.
 
  Anything failing while inside an XactLockTableWait() et al. will benefit
  from the context. In which scenarios could that be unneccessary?
 
 This path is quite deep, so I have not verified that whether
 this new context can make sense for all error cases.
 For example, in below path (error message), it doesn't seem to
 make any sense (I have tried to generate it through debugger,
 actual message will vary).
 
 XactLockTableWait-SubTransGetParent-SimpleLruReadPage_ReadOnly-SimpleLruReadPage-SlruReportIOError
 
 ERROR:  could not access status of transaction 927
 DETAIL:  Could not open file pg_subtrans/: No error.
 CONTEXT:  relation public.t1
 tuple ctid (0,2)

To be honest, I don't like the idea of setting up this error context
only for log_lock_wait messages. This sounds unnecessary complex to me
and I think that in the few cases where this message doesn't add a
value (and thus is useless) don't justify such complexity.

 I have not checked that, but the reason I mentioned about database name
 is due to display of database oid in similar message, please see the
 message below. I think in below context we get it from lock tag and
 I think for the patch case also, it might be better to display, but not
 a mandatory thing. Consider it as a suggestion which if you also feel
 good, then do it, else ignore it.
 
 LOG:  process 4656 still waiting for ExclusiveLock on tuple (0,1) of relation 
 57
 513 of database 12045 after 1046.000 ms

After thinking a bit about this I added the database name to the
context message, see attached patch.

   Currently I simply display the whole tuple with truncating long
   fields. This is plain easy, no distinction necessary. I did some code
   reading and it seems somewhat complex to get the PK columns and it
   seems that we need another lock for this, too - maybe not the best
   idea when we are already in locking trouble, do you disagree?
 
  I don't think you need to take another lock for this, it must already
  have appropriate lock by that time. There should not be any problem
  in calling relationHasPrimaryKey except that currently it is static, you
  need to expose it.
 
  Other locations that deal with similar things (notably
  ExecBuildSlotValueDescription()) doesn't either. I don't think this
  patch should introduce it then.
 
 Displaying whole tuple doesn't seem to be the most right way
 to get debug information and especially in this case we are
 already displaying tuple offset(ctid) which is unique identity
 to identify a tuple. It seems to me that it is sufficient to display
 unique value of tuple if present.
 I understand that there is no clear issue here, so may be if others also
 share their opinion then it will be quite easy to take a call.

That would be nice. I didn't change it, yet.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a771ccb..a04414e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithErrorContext(Relation rel, HeapTuple tup, MultiXactId multi,
+	MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2703,8 +2705,8 @@ l1:
 		if (infomask  HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWaitWithErrorContext(relation, tp, (MultiXactId) xwait,
+	  MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2730,7 +2732,7 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithInfo(relation, tp, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3255,8 +3257,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, remain,
-			infomask);
+			MultiXactIdWaitWithErrorContext(relation, oldtup,
+	   (MultiXactId) xwait, mxact_status, remain, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3330,7 +3332,7 @@ l2:
 			else
 			{
 /* wait 

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-22 Thread Andres Freund
On 2014-02-22 11:53:24 +0530, Amit Kapila wrote:
 On Fri, Feb 21, 2014 at 4:55 PM, Christian Kruse
 christ...@2ndquadrant.com wrote:
  Hi,
  1. New context added by this patch is display at wrong place
  [...]
  Do this patch expect to print the context at cancel request
  as well?
 
  To be honest, I don't see a problem here. If you cancel the request in
  most cases it is because it doesn't finish in an acceptable time. So
  displaying the context seems logical to me.
 
 Isn't it a matter of consistency, we might not be able to display it
 on all long running updates, rather only when it goes for update on tuple
 on which some other transaction is operating.

We'll display it when we are waiting for a lock. Which quite possibly is
the reason it's cancelled, so logging the locking information is highly
pertinent.

 Is it possible that we set callback to error_context_stack, only
 when we go for displaying this message. The way to do could be that
 rather than forming callback in local variable in fuction
 XactLockTableWaitWithErrorCallback(), store it in global variable
 and then use that at appropriate place to set error_context_stack.

That seems like unneccessary complexity.

 In general, why I am suggesting to restrict display of newly added
 context for the case it is added to ensure that it doesn't get started
 displaying in other cases where it might make sense or might not
 make sense.

Anything failing while inside an XactLockTableWait() et al. will benefit
from the context. In which scenarios could that be unneccessary?

  (according to Andres we would have to look at
  rel-rd_node.dbNode) and since other commands dealing with names don't
  do so, either, I think we should leave out the database name.
 
 For this case as I have shown that in similar message, we already display
 database oid, it should not be a problem to display here as well.
 It will be more information to user.

I don't think we do for any where we actually display the relation name,
do we?


  Currently I simply display the whole tuple with truncating long
  fields. This is plain easy, no distinction necessary. I did some code
  reading and it seems somewhat complex to get the PK columns and it
  seems that we need another lock for this, too - maybe not the best
  idea when we are already in locking trouble, do you disagree?
 
 I don't think you need to take another lock for this, it must already
 have appropriate lock by that time. There should not be any problem
 in calling relationHasPrimaryKey except that currently it is static, you
 need to expose it.

Other locations that deal with similar things (notably
ExecBuildSlotValueDescription()) doesn't either. I don't think this
patch should introduce it then.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-22 Thread Amit Kapila
On Sat, Feb 22, 2014 at 3:17 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-22 11:53:24 +0530, Amit Kapila wrote:

 In general, why I am suggesting to restrict display of newly added
 context for the case it is added to ensure that it doesn't get started
 displaying in other cases where it might make sense or might not
 make sense.

 Anything failing while inside an XactLockTableWait() et al. will benefit
 from the context. In which scenarios could that be unneccessary?

This path is quite deep, so I have not verified that whether
this new context can make sense for all error cases.
For example, in below path (error message), it doesn't seem to
make any sense (I have tried to generate it through debugger,
actual message will vary).

XactLockTableWait-SubTransGetParent-SimpleLruReadPage_ReadOnly-SimpleLruReadPage-SlruReportIOError

ERROR:  could not access status of transaction 927
DETAIL:  Could not open file pg_subtrans/: No error.
CONTEXT:  relation public.t1
tuple ctid (0,2)


  (according to Andres we would have to look at
  rel-rd_node.dbNode) and since other commands dealing with names don't
  do so, either, I think we should leave out the database name.

 For this case as I have shown that in similar message, we already display
 database oid, it should not be a problem to display here as well.
 It will be more information to user.

 I don't think we do for any where we actually display the relation name,
 do we?

I have not checked that, but the reason I mentioned about database name
is due to display of database oid in similar message, please see the
message below. I think in below context we get it from lock tag and
I think for the patch case also, it might be better to display, but not
a mandatory thing. Consider it as a suggestion which if you also feel
good, then do it, else ignore it.

LOG:  process 4656 still waiting for ExclusiveLock on tuple (0,1) of relation 57
513 of database 12045 after 1046.000 ms


  Currently I simply display the whole tuple with truncating long
  fields. This is plain easy, no distinction necessary. I did some code
  reading and it seems somewhat complex to get the PK columns and it
  seems that we need another lock for this, too - maybe not the best
  idea when we are already in locking trouble, do you disagree?

 I don't think you need to take another lock for this, it must already
 have appropriate lock by that time. There should not be any problem
 in calling relationHasPrimaryKey except that currently it is static, you
 need to expose it.

 Other locations that deal with similar things (notably
 ExecBuildSlotValueDescription()) doesn't either. I don't think this
 patch should introduce it then.

Displaying whole tuple doesn't seem to be the most right way
to get debug information and especially in this case we are
already displaying tuple offset(ctid) which is unique identity
to identify a tuple. It seems to me that it is sufficient to display
unique value of tuple if present.
I understand that there is no clear issue here, so may be if others also
share their opinion then it will be quite easy to take a call.


PS - I will be out for work assignment for next week, so might not be
able to complete the review, so others are welcome to takeup and
complete it in the meantime.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-21 Thread Christian Kruse
Hi,

On 19.02.2014 08:11, Amit Kapila wrote:
 I have looked into this patch. Below are some points which I
 think should be improved in this patch:

Thank you for your review.

 1. New context added by this patch is display at wrong place
 […]
 Do this patch expect to print the context at cancel request
 as well?
 I don't find it useful. I think the reason is that after setup of
 context, if any other error happens, it will display the newly setup
 context.

To be honest, I don't see a problem here. If you cancel the request in
most cases it is because it doesn't finish in an acceptable time. So
displaying the context seems logical to me.

 2a. About the message:
 LOG:  process 1936 still waiting for ShareLock on transaction 842
 after 1014.000ms
 CONTEXT:  relation name: foo (OID 57496)
 tuple (ctid (0,1)): (1, 2, abc  )
 
 Isn't it better to display database name as well, as if we see in
 one of related messages as below, it displays database name as well.
 
 LOG:  process 3012 still waiting for ExclusiveLock on tuple (0,1) of
 relation 57
 499 of database 12045 after 1014.000 ms

Maybe. But then we either have duplicate infos in some cases (e.g. on
a table lock) or we have to distinguish error cases and show distinct
contextes. And also getting the database name from a relation is not
really straight forward (according to Andres we would have to look at
rel-rd_node.dbNode) and since other commands dealing with names don't
do so, either, I think we should leave out the database name.

 2b. I think there is no need to display *ctid* before tuple offset, it seems
 to be obvious as well as consistent with other similar message.

Ok, I'll drop it.

 3. About callback I think rather than calling setup/tear down
 functions from multiple places, it will be better to have wrapper
 functions for XactTableLockWait() and MultiXactIdWait(). Just check
 in plpgsql_exec_function(), how the errorcallback is set, can we do
 something similar without to avoid palloc?

OK, Attached.

 4. About the values of tuple
 I think it might be useful to display some unique part of tuple for
 debugging part, but what will we do incase there is no primary
 key on table, so may be we can display initial few columns (2 or 3)
 or just display tuple offset as is done in other similar message
 shown above.

Currently I simply display the whole tuple with truncating long
fields. This is plain easy, no distinction necessary. I did some code
reading and it seems somewhat complex to get the PK columns and it
seems that we need another lock for this, too – maybe not the best
idea when we are already in locking trouble, do you disagree?

Also showing just a few columns when no PK is available might not be
helpful since the tuple is not necessarily identified by the columns
showed. And since we drop the CTID there is no alternative solution to
identify the tuple.

IMHO simply displaying the whole tuple is the best solution in this
case, do you agree?

 More to the point, I have observed that in few cases
 displaying values of tuple can be confusing as well. For Example:
 […]
 Now here it will be bit confusing to display values with tuple, as
 in session-3 statement has asked to update value (3), but we have
 started waiting for value (4). Although it is right, but might not
 make much sense.

What do you suggest to avoid confusion? I can see what you are talking
about, but I'm not sure what to do about it. Keep in mind that this
patch should help debugging locking, so IMHO it is essential to be
able to identify a tuple and therefore knowing its values.

 Also after session-2 commits the transaction, session-3 will say
 acquired lock, however still it will not be able to update it.

To be honest, I don't think that this is a problem of the
patch. Concurrency is not easy and it might lead to confusing
situations. I don't think that we should drop the tuple values because
of this, since they provide useful and precious debugging information.

Attached you will find a new version of the patch, mainly using
wrapper functions for XactLockTableWait() and MultiXactIdWait().

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a771ccb..71aaa93 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithErrorContext(Relation rel, HeapTuple tup, MultiXactId multi,
+	MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2703,8 +2705,8 @@ l1:
 		if (infomask  

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-21 Thread Amit Kapila
On Fri, Feb 21, 2014 at 4:55 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 Hi,
 1. New context added by this patch is display at wrong place
 [...]
 Do this patch expect to print the context at cancel request
 as well?

 To be honest, I don't see a problem here. If you cancel the request in
 most cases it is because it doesn't finish in an acceptable time. So
 displaying the context seems logical to me.

Isn't it a matter of consistency, we might not be able to display it
on all long running updates, rather only when it goes for update on tuple
on which some other transaction is operating.

Also though this functionality is mainly going to be useful for the case
when log_lock_waits = on, but it will display newly added context even
without that.

Is it possible that we set callback to error_context_stack, only
when we go for displaying this message. The way to do could be that
rather than forming callback in local variable in fuction
XactLockTableWaitWithErrorCallback(), store it in global variable
and then use that at appropriate place to set error_context_stack.

In general, why I am suggesting to restrict display of newly added
context for the case it is added to ensure that it doesn't get started
displaying in other cases where it might make sense or might not
make sense.

I think if it is too tedious to do it or there are some problems
due to which we cannot restrict it for case it has been added, then
we can think of going with it.

 2a. About the message:
 LOG:  process 1936 still waiting for ShareLock on transaction 842
 after 1014.000ms
 CONTEXT:  relation name: foo (OID 57496)
 tuple (ctid (0,1)): (1, 2, abc  )

 Isn't it better to display database name as well, as if we see in
 one of related messages as below, it displays database name as well.

 LOG:  process 3012 still waiting for ExclusiveLock on tuple (0,1) of
 relation 57
 499 of database 12045 after 1014.000 ms

 Maybe. But then we either have duplicate infos in some cases (e.g. on
 a table lock) or we have to distinguish error cases and show distinct
 contextes.

I think if we go with the way suggested above, then this should
not be problem, or do you think it can still create duplicate
info problem.

 And also getting the database name from a relation is not
 really straight forward

I think there should not be any complication, we can do something like
get_database_name(rel-rd_node.dbNode);

 (according to Andres we would have to look at
 rel-rd_node.dbNode) and since other commands dealing with names don't
 do so, either, I think we should leave out the database name.

For this case as I have shown that in similar message, we already display
database oid, it should not be a problem to display here as well.
It will be more information to user.

 2b. I think there is no need to display *ctid* before tuple offset, it seems
 to be obvious as well as consistent with other similar message.

 Ok, I'll drop it.

I have asked to just remove *ctid* from info not the actual info of
tuple location. It should look like tuple (0,1).

The way you have currently changed the code, it is crashing the backend.

Another thing related to this is that below code should  also not use *ctid*
+ if (geterrlevel() = ERROR)
+ {
+ errcontext(tuple ctid (%u,%u),
+   BlockIdGetBlockNumber((info-tuple-t_self.ip_blkid)),
+   info-tuple-t_self.ip_posid);
+ }


 3. About callback I think rather than calling setup/tear down
 functions from multiple places, it will be better to have wrapper
 functions for XactTableLockWait() and MultiXactIdWait(). Just check
 in plpgsql_exec_function(), how the errorcallback is set, can we do
 something similar without to avoid palloc?

 OK, Attached.

It looks better than previous, one minor suggestion:
Function name XactLockTableWaitWithErrorCallback() looks bit awkward
to me, shall we name something like XactLockTableWaitWithInfo()?

 4. About the values of tuple
 I think it might be useful to display some unique part of tuple for
 debugging part, but what will we do incase there is no primary
 key on table, so may be we can display initial few columns (2 or 3)
 or just display tuple offset as is done in other similar message
 shown above.

 Currently I simply display the whole tuple with truncating long
 fields. This is plain easy, no distinction necessary. I did some code
 reading and it seems somewhat complex to get the PK columns and it
 seems that we need another lock for this, too - maybe not the best
 idea when we are already in locking trouble, do you disagree?

I don't think you need to take another lock for this, it must already
have appropriate lock by that time. There should not be any problem
in calling relationHasPrimaryKey except that currently it is static, you
need to expose it.

 Also showing just a few columns when no PK is available might not be
 helpful since the tuple is not necessarily identified by the columns
 showed. And since we drop the CTID there is no alternative solution to
 identify the tuple.


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-18 Thread Amit Kapila
On Thu, Jan 2, 2014 at 4:38 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 Hi,

 On 02/01/14 10:02, Andres Freund wrote:
  Christian's idea of a context line seems plausible to me.  I don't
  care for this implementation too much --- a global variable?  Ick.

 Yea, the data should be stored in ErrorContextCallback.arg instead.

 Fixed.

I have looked into this patch. Below are some points which I
think should be improved in this patch:

1. New context added by this patch is display at wrong place
Session-1
Create table foo(c1 int);
insert into foo values(generate_series(1,10);
Begin;
update foo set c1 =11;

Session-2
postgres=# begin;
BEGIN
postgres=# update foo set val1 = 13 where val1=3;
Cancel request sent
ERROR:  canceling statement due to user request
CONTEXT:  relation name: foo (OID 57496)
tuple ctid (0,7)

Do this patch expect to print the context at cancel request
as well?
I don't find it useful. I think the reason is that after setup of
context, if any other error happens, it will display the newly setup
context.

2a. About the message:
LOG:  process 1936 still waiting for ShareLock on transaction 842
after 1014.000ms
CONTEXT:  relation name: foo (OID 57496)
tuple (ctid (0,1)): (1, 2, abc  )

Isn't it better to display database name as well, as if we see in
one of related messages as below, it displays database name as well.

LOG:  process 3012 still waiting for ExclusiveLock on tuple (0,1) of
relation 57
499 of database 12045 after 1014.000 ms

2b. I think there is no need to display *ctid* before tuple offset, it seems
to be obvious as well as consistent with other similar message.

3. About callback
I think rather than calling setup/tear down functions from multiple places,
it will be better to have wrapper functions for
XactTableLockWait() and MultiXactIdWait().

Just check in plpgsql_exec_function(), how the errorcallback is set,
can we do something similar without to avoid palloc?

I think we still need to see how to avoid issue mentioned in point-1.

4. About the values of tuple
I think it might be useful to display some unique part of tuple for
debugging part, but what will we do incase there is no primary
key on table, so may be we can display initial few columns (2 or 3)
or just display tuple offset as is done in other similar message
shown above. More to the point, I have observed that in few cases
displaying values of tuple can be confusing as well. For Example:

Session-1
Create table foo(c1 int);
postgres=# insert into foo values(generate_series(1,3));
INSERT 0 3
postgres=# begin;
BEGIN
postgres=# update foo set c1 = 4;
UPDATE 3

Session-2
postgres=# update foo set c1=6;
LOG:  process 1936 still waiting for ShareLock on transaction 884 after 1014.000
 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,1)): (1)

Session-3
postgres=# begin;
BEGIN
postgres=# update foo set c1=5 where c1 =3;
LOG:  process 3012 still waiting for ShareLock on transaction 884 after 1015.000
 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,3)): (3)

Session-1
Commit;

Session-2
LOG:  process 1936 acquired ShareLock on transaction 884 after 25294.000 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,1)): (1)
UPDATE 3

Session-3
LOG:  process 3012 acquired ShareLock on transaction 884 after 13217.000 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,3)): (3)
LOG:  process 3012 still waiting for ShareLock on transaction 885 after 1014.000
 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,6)): (4)

Now here it will be bit confusing to display values with tuple,
as in session-3 statement has asked to update value (3), but we have started
waiting for value (4). Although it is right, but might not make much sense.
Also after session-2 commits the transaction, session-3 will say acquired lock,
however still it will not be able to update it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-01-02 Thread Andres Freund
On 2013-12-31 13:56:53 -0800, Peter Geoghegan wrote:
 ISTM that you should be printing just the value and the unique index
 there, and not any information about the tuple proper. Doing any less
 could be highly confusing.

I agree that the message needs improvement, but I don't agree that we
shouldn't lock the tuple's location. If you manually investigate the
situation that's where you'll find the conflicting tuple - I don't see
what we gain by not logging the ctid.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-01-02 Thread Andres Freund
On 2013-12-31 11:36:36 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On 31 December 2013 09:12, Christian Kruse christ...@2ndquadrant.com 
  wrote:
  Output with patch:
  
  LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 
  ms
  CONTEXT:  relation name: foo (OID 16385)
  tuple (ctid (0,1)): (1)
 
  That is useful info.
 
  I think the message should be changed to say this only, without a context 
  line
 
  LOG:  process 24774 acquired ShareLock on relation foo (OID 16385)
  tuple (0,1) after 11688.720 ms
 
  My reason is that pid 24774 was waiting for a *tuple lock* and it was
  eventually granted, so thats what it should say.
 
 No, that wasn't what it was waiting for, and lying to the user like that
 isn't going to make things better.

Agreed.

 Christian's idea of a context line seems plausible to me.  I don't
 care for this implementation too much --- a global variable?  Ick.

Yea, the data should be stored in ErrorContextCallback.arg instead.

 Make a wrapper function for XactLockTableWait instead, please.

I don't think that'd work out all too nicely - we do the
XactLockTableWait() inside other functions like MultiXactIdWait(),
passing all the detail along for those would end up being ugly. So using
error context callbacks properly seems like the best way in the end.

 And I'm not real sure that showing the whole tuple contents is a good
 thing; I can see people calling that a security leak, not to mention
 that the performance consequences could be dire.

I don't think that the security argument has too much merit given
today's PG - we already log the full tuple for various constraint
violations. And we also accept the performance penalty there.  I don't
see any easy way to select a sensible subset of columns to print here,
and printing the columns is what would make investigating issues around
this so much easier. At the very least we'd need to print the pkey
columns and the columns of the unique key that might have been violated.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-01-02 Thread Peter Geoghegan
On Thu, Jan 2, 2014 at 12:56 AM, Andres Freund and...@2ndquadrant.com wrote:
 I agree that the message needs improvement, but I don't agree that we
 shouldn't lock the tuple's location. If you manually investigate the
 situation that's where you'll find the conflicting tuple - I don't see
 what we gain by not logging the ctid.

I suppose so, but the tuple probably isn't going to be visible anyway,
at least when the message is initially logged.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-01-02 Thread Andres Freund
On 2014-01-02 01:40:38 -0800, Peter Geoghegan wrote:
 On Thu, Jan 2, 2014 at 12:56 AM, Andres Freund and...@2ndquadrant.com wrote:
  I agree that the message needs improvement, but I don't agree that we
  shouldn't lock the tuple's location. If you manually investigate the
  situation that's where you'll find the conflicting tuple - I don't see
  what we gain by not logging the ctid.
 
 I suppose so, but the tuple probably isn't going to be visible anyway,
 at least when the message is initially logged.

But that's the case for all these, otherwise we wouldn't wait on the
row.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-01-02 Thread Christian Kruse
Hi,

On 02/01/14 10:02, Andres Freund wrote:
  Christian's idea of a context line seems plausible to me.  I don't
  care for this implementation too much --- a global variable?  Ick.
 
 Yea, the data should be stored in ErrorContextCallback.arg instead.

Fixed.

I also palloc() the ErrorContextCallback itself. But it doesn't feel
right since I could not find a piece of code doing so. What do you
think?

  Make a wrapper function for XactLockTableWait instead, please.
 
 I don't think that'd work out all too nicely - we do the
 XactLockTableWait() inside other functions like MultiXactIdWait(),
 passing all the detail along for those would end up being ugly. So using
 error context callbacks properly seems like the best way in the end.

Wrapping XactLockTableWait()/MultiXactIdWait() at least allows the
ErrorContextCallback and ErrorContextCallback.arg to live on the
stack. So what we could do is wrap this in a macro instead of a
function (like e.g. PG_TRY) or write two different functions. And I
don't like the two functions approach since it means duplication of
code.

While writing the response I really think writing a macro in PG_TRY
style for setting up and cleaning the error context callback I begin
to think that this would be the best way to go.

  And I'm not real sure that showing the whole tuple contents is a good
  thing; I can see people calling that a security leak, not to mention
  that the performance consequences could be dire.
 
 I don't think that the security argument has too much merit given
 today's PG - we already log the full tuple for various constraint
 violations. And we also accept the performance penalty there.  I don't
 see any easy way to select a sensible subset of columns to print here,
 and printing the columns is what would make investigating issues around
 this so much easier. At the very least we'd need to print the pkey
 columns and the columns of the unique key that might have been violated.

I agree. Even printing only the PK values would be much more
complex. As far as I can see we would even have to gain another lock
for this (see comment for relationHasPrimaryKey() in
src/backend/catalog/index.c).

Regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index f8545c1..cfa49c2 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2702,9 +2702,11 @@ l1:
 
 		if (infomask  HEAP_XMAX_IS_MULTI)
 		{
+			XactLockTableWaitSetupErrorContextCallback(relation, tp);
 			/* wait for multixact */
 			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
 			NULL, infomask);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2730,7 +2732,10 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
+			XactLockTableWaitSetupErrorContextCallback(relation, tp);
+
 			XactLockTableWait(xwait);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3254,9 +3259,11 @@ l2:
 			TransactionId update_xact;
 			int			remain;
 
+			XactLockTableWaitSetupErrorContextCallback(relation, oldtup);
 			/* wait for multixact */
 			MultiXactIdWait((MultiXactId) xwait, mxact_status, remain,
 			infomask);
+			XactLockTableWaitCleanupErrorContextCallback();
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3330,7 +3337,10 @@ l2:
 			else
 			{
 /* wait for regular transaction to end */
+XactLockTableWaitSetupErrorContextCallback(relation, oldtup);
+
 XactLockTableWait(xwait);
+XactLockTableWaitCleanupErrorContextCallback();
 LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 /*
@@ -4398,7 +4408,11 @@ l3:
 		RelationGetRelationName(relation;
 }
 else
+{
+	XactLockTableWaitSetupErrorContextCallback(relation, tuple);
 	MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+	XactLockTableWaitCleanupErrorContextCallback();
+}
 
 /* if there are updates, follow the update chain */
 if (follow_updates 
@@ -4453,7 +4467,11 @@ l3:
 		RelationGetRelationName(relation;
 }
 else
+{
+	XactLockTableWaitSetupErrorContextCallback(relation, tuple);
 	XactLockTableWait(xwait);
+	XactLockTableWaitCleanupErrorContextCallback();
+}
 
 /* if there are updates, follow the update chain */
 if (follow_updates 
@@ -5139,9 +5157,14 @@ l4:
 
 	if (needwait)
 	{
+		XactLockTableWaitSetupErrorContextCallback(rel, mytup);
+
 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
 		XactLockTableWait(members[i].xid);
 		pfree(members);
+
+		XactLockTableWaitCleanupErrorContextCallback();
+
 		goto l4;
 	}
 	if (res != HeapTupleMayBeUpdated)
@@ -5199,8 +5222,13 @@ l4:
  needwait);
 if (needwait)
 {
+	

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-01-01 Thread Christian Kruse
Hi,

On 31/12/13 13:56, Peter Geoghegan wrote:
 I think that this is a worthwhile effort, but like Tom I don't think
 that it's universally true that these waiters are waiting on a tuple
 lock. Most obviously, the XactLockTableWait() call you've modified
 within nbtinsert.c is not.

This is why I don't set the tuple data in this case:

 XactLockTableWaitSetupErrorContextCallback(rel, NULL);

The second argument is the tuple argument. If it is set to NULL in the
error context callback, all output regarding tuple are suppressed.

 ISTM that you should be printing just the value and the unique index
 there, and not any information about the tuple proper.

Good point, I will have a look at this.

 For better direction about where that new
 infrastructure ought to live, you might find some useful insight from
 commit 991f3e5ab3f8196d18d5b313c81a5f744f3baaea.

Thanks for the pointer!

But, to be honest, I am still unsure where to put this. As far as I
understand this commit has substantial parts in relcache.c and
elog.c – both don't seem to be very good fitting places?

Regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



pgpFTITwzCXuK.pgp
Description: PGP signature


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-01-01 Thread Christian Kruse
On 31/12/13 11:36, Tom Lane wrote:
 Christian's idea of a context line seems plausible to me.  I don't
 care for this implementation too much --- a global variable?  Ick.
 Make a wrapper function for XactLockTableWait instead, please.

Point taken. You are right.

 And I'm not real sure that showing the whole tuple contents is a good
 thing; I can see people calling that a security leak, not to mention
 that the performance consequences could be dire.

What do you suggest to include? Having some information to identify
the tuple may be very helpful for debugging. This is why I did it this
way.

Regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



pgphhZyFupMjQ.pgp
Description: PGP signature


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-01-01 Thread Simon Riggs
On 31 December 2013 17:06, Simon Riggs si...@2ndquadrant.com wrote:
 On 31 December 2013 16:36, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 31 December 2013 09:12, Christian Kruse christ...@2ndquadrant.com 
 wrote:
 Output with patch:

 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 
 ms
 CONTEXT:  relation name: foo (OID 16385)
 tuple (ctid (0,1)): (1)

 That is useful info.

 I think the message should be changed to say this only, without a context 
 line

 LOG:  process 24774 acquired ShareLock on relation foo (OID 16385)
 tuple (0,1) after 11688.720 ms

 My reason is that pid 24774 was waiting for a *tuple lock* and it was
 eventually granted, so thats what it should say.

 No, that wasn't what it was waiting for, and lying to the user like that
 isn't going to make things better.

 Like that is ambiguous and I don't understand you or what you are
 objecting to.

 When we run SELECT ... FOR SHARE we are attempting to lock rows. Why
 is the transaction we are waiting for important when we wait to lock
 rows, but when we wait to lock relations it isn't?

My thought is that this message refers to a lock wait for a
transaction to end, which is made as part of a request to lock a row.
But it is not 100% of the lock request and a race condition exists
that means that we might need to wait multiple times in rare
circumstances.

So reporting that tuple lock has been *acquired* would be inaccurate,
since at that point it isn't true. So we need to describe the
situation better, e.g. as part of waiting for a tuple lock we waited
on a transaction. Not very snappy.

Anyway, the important thing is that we can work out the tie being
waited for. Maybe logging the PK value would be useful as well, but
not the whole row.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Simon Riggs
On 31 December 2013 09:12, Christian Kruse christ...@2ndquadrant.com wrote:

 Output w/o patch:

 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms

 Output with patch:

 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms
 CONTEXT:  relation name: foo (OID 16385)
 tuple (ctid (0,1)): (1)

That is useful info.

I think the message should be changed to say this only, without a context line

LOG:  process 24774 acquired ShareLock on relation foo (OID 16385)
tuple (0,1) after 11688.720 ms

My reason is that pid 24774 was waiting for a *tuple lock* and it was
eventually granted, so thats what it should say. ALL locks wait for
the current transaction that holds the lock to complete, but only
tuple locks currently refer to a lock wait as a wait for a
transaction. That is confusing for users, as well as being
inconsistent with other existing messages.

Replacing the old text with the new info would represent a net
improvement in usefulness and understandability.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 31 December 2013 09:12, Christian Kruse christ...@2ndquadrant.com wrote:
 Output with patch:
 
 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms
 CONTEXT:  relation name: foo (OID 16385)
 tuple (ctid (0,1)): (1)

 That is useful info.

 I think the message should be changed to say this only, without a context line

 LOG:  process 24774 acquired ShareLock on relation foo (OID 16385)
 tuple (0,1) after 11688.720 ms

 My reason is that pid 24774 was waiting for a *tuple lock* and it was
 eventually granted, so thats what it should say.

No, that wasn't what it was waiting for, and lying to the user like that
isn't going to make things better.

Christian's idea of a context line seems plausible to me.  I don't
care for this implementation too much --- a global variable?  Ick.
Make a wrapper function for XactLockTableWait instead, please.
And I'm not real sure that showing the whole tuple contents is a good
thing; I can see people calling that a security leak, not to mention
that the performance consequences could be dire.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Simon Riggs
On 31 December 2013 16:36, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 31 December 2013 09:12, Christian Kruse christ...@2ndquadrant.com wrote:
 Output with patch:

 LOG:  process 24774 acquired ShareLock on transaction 696 after 11688.720 ms
 CONTEXT:  relation name: foo (OID 16385)
 tuple (ctid (0,1)): (1)

 That is useful info.

 I think the message should be changed to say this only, without a context 
 line

 LOG:  process 24774 acquired ShareLock on relation foo (OID 16385)
 tuple (0,1) after 11688.720 ms

 My reason is that pid 24774 was waiting for a *tuple lock* and it was
 eventually granted, so thats what it should say.

 No, that wasn't what it was waiting for, and lying to the user like that
 isn't going to make things better.

Like that is ambiguous and I don't understand you or what you are
objecting to.

When we run SELECT ... FOR SHARE we are attempting to lock rows. Why
is the transaction we are waiting for important when we wait to lock
rows, but when we wait to lock relations it isn't?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2013-12-31 Thread Peter Geoghegan
On Tue, Dec 31, 2013 at 1:12 AM, Christian Kruse
christ...@2ndquadrant.com wrote:
 I created a patch improving the log_lock_wait messages by adding
 relation infos (name and OID) as well as tuple infos (if present –
 ctid and, if available, the tuple itself) in the context.

I think that this is a worthwhile effort, but like Tom I don't think
that it's universally true that these waiters are waiting on a tuple
lock. Most obviously, the XactLockTableWait() call you've modified
within nbtinsert.c is not. In actuality, it's waiting on what I
informally refer to as a value lock for a value being inserted into
a unique index. Unlike the tuple locking code, a LockTuple() call in
advance of the XactLockTableWait() call, to arbitrate ordering, is not
present. No tuple is locked at all.

ISTM that you should be printing just the value and the unique index
there, and not any information about the tuple proper. Doing any less
could be highly confusing. You should do an analysis of where this
kind of exception applies. For better direction about where that new
infrastructure ought to live, you might find some useful insight from
commit 991f3e5ab3f8196d18d5b313c81a5f744f3baaea.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers