Re: [PATCHES] Autovacuum cancellation

2007-10-28 Thread Simon Riggs
On Sat, 2007-10-27 at 23:22 +0100, Simon Riggs wrote:
 On Fri, 2007-10-26 at 10:32 +0100, Heikki Linnakangas wrote:
  Alvaro Herrera wrote:
   /*
* Look for a blocking autovacuum. There will only ever
* be one, since the autovacuum workers are careful
* not to operate concurrently on the same table. 
*/
  
  I think that's a bit unaccurate. You could have multiple autovacuum
  workers operating on different tables participating in a deadlock. The
  reason that can't happen is that autovacuum never holds a lock while
  waiting for another.
 
 I wrote that code comment; as you say it is true only when there are at
 least 4 processes in the lock graph where 2+ normal backends are
 deadlocking and there are 2+ autovacuums holding existing locks. The
 comment should have said If blocking is caused by an autovacuum process
 then ... (there will).

Sorry...this should read as you say it is **not** true.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Autovacuum cancellation

2007-10-27 Thread Simon Riggs
On Thu, 2007-10-25 at 17:35 -0400, Tom Lane wrote:
 There's some things still to be desired here: if an autovac process is
 involved in a hard deadlock, the patch doesn't favor zapping it over
 anybody else, nor consider cancelling the autovac as an alternative to
 rearranging queues for a soft deadlock.  But dealing with that will
 open cans of worms that I don't think we want to open for 8.3.

I did look at doing that but decided it would not be appropriate to do
that in all cases. i.e. there are hard deadlock cases where the autovac
can be the head of the lock queue and yet a deadlock still exists
between two other processes. The deadlock detector doesn't get called
twice for the same deadlock, so it wasn't possible to speculatively do
that and then re-catch it second time around.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Autovacuum cancellation

2007-10-27 Thread Simon Riggs
On Fri, 2007-10-26 at 10:32 +0100, Heikki Linnakangas wrote:
 Alvaro Herrera wrote:
  /*
   * Look for a blocking autovacuum. There will only ever
   * be one, since the autovacuum workers are careful
   * not to operate concurrently on the same table. 
   */
 
 I think that's a bit unaccurate. You could have multiple autovacuum
 workers operating on different tables participating in a deadlock. The
 reason that can't happen is that autovacuum never holds a lock while
 waiting for another.

I wrote that code comment; as you say it is true only when there are at
least 4 processes in the lock graph where 2+ normal backends are
deadlocking and there are 2+ autovacuums holding existing locks. The
comment should have said If blocking is caused by an autovacuum process
then ... (there will).

The blocking_autovacuum_proc doesn't react unless there are no hard
deadlocks, so the code works.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Autovacuum cancellation

2007-10-27 Thread Simon Riggs
On Fri, 2007-10-26 at 17:50 -0300, Alvaro Herrera wrote:

 Ok, committed; I snuck that in as well, but I'm not sure how to test
 that it works.

I've had time to review that now. I didn't reply to your original post
because you'd taken my name off the copy list for some reason and I've
been too busy to read non-addressed mail.

The committed patch is pretty much the same as my original AFAICS.

I'm sure you didn't mean to forget that, but can you please acknowledge
my contribution in CVS? Thanks.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] Autovacuum cancellation

2007-10-26 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 I think there's a window where the process waiting directly on
 autovacuum could have already fired its deadlock check before it was
 waiting directly on autovacuum.

 I think you don't understand what that code is doing.  If there's an
 autovac anywhere in the dependency graph, it'll find it.

That'll teach me to try to read code from a patch directly without trying to
apply it or at least read the original source next to it. I thought I had seen
this code recently enough to apply the patch from memory -- clearly not.

I assume the right thing happens if multiple deadlock check signals fire for
the same autovacuum?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Autovacuum cancellation

2007-10-26 Thread Heikki Linnakangas
Alvaro Herrera wrote:
 /*
  * Look for a blocking autovacuum. There will only ever
  * be one, since the autovacuum workers are careful
  * not to operate concurrently on the same table. 
  */

I think that's a bit unaccurate. You could have multiple autovacuum
workers operating on different tables participating in a deadlock. The
reason that can't happen is that autovacuum never holds a lock while
waiting for another.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Autovacuum cancellation

2007-10-26 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Alvaro Herrera wrote:
 /*
 * Look for a blocking autovacuum. There will only ever
 * be one, since the autovacuum workers are careful
 * not to operate concurrently on the same table. 
 */

 I think that's a bit unaccurate. You could have multiple autovacuum
 workers operating on different tables participating in a deadlock. The
 reason that can't happen is that autovacuum never holds a lock while
 waiting for another.

And that's not true either.  It may only want low-grade locks (eg
AccessShareLock on a system catalog) but deadlock is nonetheless
entirely possible in principle.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Autovacuum cancellation

2007-10-26 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 I assume the right thing happens if multiple deadlock check signals fire for
 the same autovacuum?

Multiple signals shouldn't be a problem, but late-arriving ones could be.
It might be worth having autovac explicitly clear QueryCancelPending
after it's finished a table, so that a SIGINT sent because of activity
on one table couldn't force cancellation of vacuum on the next one.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] Autovacuum cancellation

2007-10-26 Thread Alvaro Herrera
Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  I assume the right thing happens if multiple deadlock check signals fire for
  the same autovacuum?
 
 Multiple signals shouldn't be a problem, but late-arriving ones could be.
 It might be worth having autovac explicitly clear QueryCancelPending
 after it's finished a table, so that a SIGINT sent because of activity
 on one table couldn't force cancellation of vacuum on the next one.

Ok, committed; I snuck that in as well, but I'm not sure how to test
that it works.

I adjusted the comments -- I think they're more correct now.  I also
added a puny paragraph to lmgr/README.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PATCHES] Autovacuum cancellation

2007-10-25 Thread Alvaro Herrera
Simon Riggs wrote:

 Just noticed you've made these changes. I was working on them, but
 hadn't fully tested the patch because of all the different touch points.
 Sorry for the delay.
 
 Would you like me to refresh my earlier patch against the newly
 committed state (or did you commit that aspect already as well?).

Patch attached, please comment.  It only avoids cancelling when the
vacuum is for wraparound.

What we're missing here is doc updates (mainly to lmgr/README, I think)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Index: src/backend/storage/lmgr/deadlock.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/storage/lmgr/deadlock.c,v
retrieving revision 1.48
diff -c -p -r1.48 deadlock.c
*** src/backend/storage/lmgr/deadlock.c	19 Jun 2007 20:13:21 -	1.48
--- src/backend/storage/lmgr/deadlock.c	25 Oct 2007 20:06:36 -
*** static int	maxPossibleConstraints;
*** 109,114 
--- 109,117 
  static DEADLOCK_INFO *deadlockDetails;
  static int	nDeadlockDetails;
  
+ /* PGPROC pointer of any blocking av worker found */
+ static PGPROC *blocking_autovacuum_proc = NULL; 
+ 
  
  /*
   * InitDeadLockChecking -- initialize deadlock checker during backend startup
*** DeadLockCheck(PGPROC *proc)
*** 206,211 
--- 209,217 
  	nPossibleConstraints = 0;
  	nWaitOrders = 0;
  
+ 	/* Initialize to not blocked by an autovacuum worker */
+ 	blocking_autovacuum_proc = NULL;
+ 
  	/* Search for deadlocks and possible fixes */
  	if (DeadLockCheckRecurse(proc))
  	{
*** DeadLockCheck(PGPROC *proc)
*** 255,265 
--- 261,289 
  	/* Return code tells caller if we had to escape a deadlock or not */
  	if (nWaitOrders  0)
  		return DS_SOFT_DEADLOCK;
+ 	else if (blocking_autovacuum_proc != NULL)
+ 		return DS_BLOCKED_BY_AUTOVACUUM;
  	else
  		return DS_NO_DEADLOCK;
  }
  
  /*
+  * Return the PGPROC of the autovacuum that's blocking a process.
+  *
+  * We reset the saved pointer as soon as we pass it back.
+  */
+ PGPROC *
+ GetBlockingAutoVacuumPgproc(void)
+ {
+ 	PGPROC	*ptr;
+ 
+ 	ptr = blocking_autovacuum_proc;
+ 	blocking_autovacuum_proc = NULL;
+ 
+ 	return ptr;
+ }
+ 
+ /*
   * DeadLockCheckRecurse -- recursively search for valid orderings
   *
   * curConstraints[] holds the current set of constraints being considered
*** FindLockCycleRecurse(PGPROC *checkProc,
*** 497,502 
--- 521,534 
  if ((proclock-holdMask  LOCKBIT_ON(lm)) 
  	(conflictMask  LOCKBIT_ON(lm)))
  {
+ 	/*
+ 	 * Look for a blocking autovacuum. There will only ever
+ 	 * be one, since the autovacuum workers are careful
+ 	 * not to operate concurrently on the same table. 
+ 	 */
+ 	if (proc-vacuumFlags  PROC_IS_AUTOVACUUM)
+ 		blocking_autovacuum_proc = proc;
+ 
  	/* This proc hard-blocks checkProc */
  	if (FindLockCycleRecurse(proc, depth + 1,
  			 softEdges, nSoftEdges))
Index: src/backend/storage/lmgr/proc.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/storage/lmgr/proc.c,v
retrieving revision 1.195
diff -c -p -r1.195 proc.c
*** src/backend/storage/lmgr/proc.c	24 Oct 2007 20:55:36 -	1.195
--- src/backend/storage/lmgr/proc.c	25 Oct 2007 20:03:58 -
*** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 734,739 
--- 734,740 
  	PROC_QUEUE *waitQueue = (lock-waitProcs);
  	LOCKMASK	myHeldLocks = MyProc-heldLocks;
  	bool		early_deadlock = false;
+ 	bool 		allow_autovacuum_cancel = true;
  	int			myWaitStatus;
  	PGPROC	   *proc;
  	int			i;
*** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 894,899 
--- 895,949 
  		myWaitStatus = MyProc-waitStatus;
  
  		/*
+ 		 * If we are not deadlocked, but are waiting on an autovacuum-induced
+ 		 * task, send a signal to interrupt it.  
+ 		 */
+ 		if (deadlock_state == DS_BLOCKED_BY_AUTOVACUUM  allow_autovacuum_cancel)
+ 		{
+ 			PGPROC	*autovac = GetBlockingAutoVacuumPgproc();
+ 
+ 			LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ 
+ 			/*
+ 			 * only do it if the worker is not working to protect against Xid
+ 			 * wraparound 
+ 			 */
+ 			if ((autovac != NULL) 
+ !(autovac-vacuumFlags  PROC_VACUUM_FOR_WRAPAROUND))
+ 			{
+ int		pid = autovac-pid;
+ 
+ elog(DEBUG2, sending cancel to blocking autovacuum pid = %d,
+ 	 pid);
+ 
+ /* don't hold the lock across the kill() syscall */
+ LWLockRelease(ProcArrayLock);
+ 
+ /*
+  * Send the autovacuum worker Back to Old Kent Road
+  *
+  * If we have setsid(), signal the backend's whole process group 
+  */
+ #ifdef HAVE_SETSID
+ if (kill(-pid, SIGINT))
+ #else
+ 	if (kill(pid, SIGINT))
+ #endif
+ 	{
+ 		/* Just a warning to allow multiple callers */
+ 		ereport(WARNING,
+ (errmsg(could not send 

Re: [PATCHES] Autovacuum cancellation

2007-10-25 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Patch attached, please comment.  It only avoids cancelling when the
 vacuum is for wraparound.

I'm not entirely convinced that there can be only one autovac proc in
the portion of the waits-for graph explored by DeadlockCheck.  If there
is more than one, we'll cancel a random one of them, which seems OK ---
but the comment added to FindLockCycleRecurse is bogus.

I thought about suggesting that we test PROC_VACUUM_FOR_WRAPAROUND
before setting blocking_autovacuum_proc at all, but I guess the reason
you don't do that is you don't want to take ProcArrayLock there (and we
decided it was unsafe to check the bit without the lock).  So the other
thing that comment block needs is a note that it seems OK to check
PROC_IS_AUTOVACUUM without the lock, because it never changes for an
existing PGPROC, but not PROC_VACUUM_FOR_WRAPAROUND.

Otherwise it looks OK --- a bit ugly but I don't have a better idea.

There's some things still to be desired here: if an autovac process is
involved in a hard deadlock, the patch doesn't favor zapping it over
anybody else, nor consider cancelling the autovac as an alternative to
rearranging queues for a soft deadlock.  But dealing with that will open
cans of worms that I don't think we want to open for 8.3.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Autovacuum cancellation

2007-10-25 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 There's some things still to be desired here: if an autovac process is
 involved in a hard deadlock, the patch doesn't favor zapping it over
 anybody else, nor consider cancelling the autovac as an alternative to
 rearranging queues for a soft deadlock.  But dealing with that will open
 cans of worms that I don't think we want to open for 8.3.

Can autovacuum actually get into a hard deadlock? Does it take more than one
lock that can block others at the same time?

I think there's a window where the process waiting directly on autovacuum
could have already fired its deadlock check before it was waiting directly on
autovacuum. But the only way I can see it happening is if another process is
cancelled before its deadlock check fires and the signals are processed out of
order. I'm not sure that's a case we really need to worry about.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Autovacuum cancellation

2007-10-25 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Can autovacuum actually get into a hard deadlock?

It can certainly be part of a deadlock loop, though the practical cases
might be few.  It will be holding more than one lock, eg a lock on its
target table and various transient locks on system catalogs, and other
processes taking or trying for exclusive locks on those things could
create issues.  I think it's OK to leave the issue go for now, until
we see if anyone hits it in practice --- I just wanted to mention that
we might need to consider the problem in future.

 I think there's a window where the process waiting directly on
 autovacuum could have already fired its deadlock check before it was
 waiting directly on autovacuum.

I think you don't understand what that code is doing.  If there's an
autovac anywhere in the dependency graph, it'll find it.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Autovacuum cancellation

2007-10-25 Thread Alvaro Herrera
Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:

  I think there's a window where the process waiting directly on
  autovacuum could have already fired its deadlock check before it was
  waiting directly on autovacuum.
 
 I think you don't understand what that code is doing.  If there's an
 autovac anywhere in the dependency graph, it'll find it.

Thanks for making that explicit, I was trying to build a test case where
it would lock :-)

(If we had concurrent psql we could even have it in the regression tests
...)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 6: explain analyze is your friend