Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-25 Thread Simon Riggs
On Mon, 2007-09-24 at 21:26 -0500, Jaime Casanova wrote:
 On 9/24/07, Simon Riggs [EMAIL PROTECTED] wrote:
  On Sat, 2007-09-22 at 23:49 -0500, Jaime Casanova wrote:
   On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote:
   
related TODO items:
- add a WAIT n clause in same SQL locations as NOWAIT
- add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting)
   
to provide better control over lock waits.
   
  
   are these actual TODO items? i can't find them on the TODO list and i
   don't remember any discussion nor patch about this
 
  They are my proposals for TODO items to assist with application
  development.
 
 
 while i'm not at all comfortable with the idea of a GUC for this, the
 WAIT clause seems to be useful.
 just out of curiosity, why the NOWAIT patch wasn't do it that way in
 first place, i mean like a WAIT clause and when receiving NOWAIT
 transform it in WAIT 0?
 maybe dicussion?

NOWAIT is used by Oracle. 

DB2 supports a lock wait timeout.

What I didn't know before googling this was that SQLServer uses NOWAIT
also. SQLServer also implement WAIT [n seconds] *and* a parameter called
lock wait period, which is pretty spooky.

Another reason to implement this is to help avoid global deadlocks in
distributed transactions (e.g. two phase).

SQLServer and DB2 have more need of this than PostgreSQL, but we do
still need it.

 there's concensus in adding a WAIT clause?

Just do it, but take careful note of any comments against things.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-25 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 SQLServer and DB2 have more need of this than PostgreSQL, but we do
 still need it.

Why?  What does it do that statement_timeout doesn't do better?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-25 Thread Simon Riggs
On Tue, 2007-09-25 at 09:16 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  SQLServer and DB2 have more need of this than PostgreSQL, but we do
  still need it.
 
 Why?  What does it do that statement_timeout doesn't do better?

If the execution time is negligible, then setting statement_timeout is
the same thing as setting a lock timeout.

If execution time is not negligible, then you may want to tell the
difference between waiting for completion against waiting forever
without doing anything useful at all.

It's also easier to set an all encompassing lock timeout at User level
than it is to set statement_timeout on individual transactions issued by
that user.

Plus, if applications are written using these concepts it is easier to
port them to PostgreSQL.

Not planning to work on this myself, but I think it is a valid TODO.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-25 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 On Tue, 2007-09-25 at 09:16 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  SQLServer and DB2 have more need of this than PostgreSQL, but we do
  still need it.
 
 Why?  What does it do that statement_timeout doesn't do better?

 If the execution time is negligible, then setting statement_timeout is
 the same thing as setting a lock timeout.

To make this explicit, I think the typical scenario where it would make a
difference is where you're running some large job in a plpgsql function. You
might be processing millions of records but want for a single step of that
process to not wait for a lock. You still want to process all the records you
can though.

So for example if you're updating all the user profiles on your system but
don't want to block on any user-profiles which are locked by active users --
especially if you use database locks for user-visible operations which users
can drag out for long periods of time. (Not saying I agree with that design
but there are arguments for it and people do do it)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-25 Thread Jaime Casanova
On 9/25/07, Simon Riggs [EMAIL PROTECTED] wrote:
 On Tue, 2007-09-25 at 09:16 -0400, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   SQLServer and DB2 have more need of this than PostgreSQL, but we do
   still need it.
 
  Why?  What does it do that statement_timeout doesn't do better?

 If the execution time is negligible, then setting statement_timeout is
 the same thing as setting a lock timeout.

 If execution time is not negligible, then you may want to tell the
 difference between waiting for completion against waiting forever
 without doing anything useful at all.


[...thinking on this a bit...]
mmm... i think we can emulate WAIT number_of_seconds using the NOWAIT
and a bit of logic...

point for tom


 Plus, if applications are written using these concepts it is easier to
 port them to PostgreSQL.


no words... point for simon...

 Not planning to work on this myself, but I think it is a valid TODO.


i will make a try for 8.4

-- 
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
   Richard Cook

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-24 Thread Simon Riggs
On Sat, 2007-09-22 at 23:49 -0500, Jaime Casanova wrote:
 On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote:
 
  related TODO items:
  - add a WAIT n clause in same SQL locations as NOWAIT
  - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting)
 
  to provide better control over lock waits.
 
 
 are these actual TODO items? i can't find them on the TODO list and i
 don't remember any discussion nor patch about this

They are my proposals for TODO items to assist with application
development. 

Waiting around DDL is just one reason to want those, though there are
other similar issues with locking. I had one client with a program that
waited for 57 hours before they noticed. statement_timeout was not an
appropriate way to manage that and some form of timeout makes better
sense. NOWAIT means extra programming to handle that case, rather than
raising a normal error, so isn't always possible to change the app
source.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-24 Thread Jaime Casanova
On 9/24/07, Simon Riggs [EMAIL PROTECTED] wrote:
 On Sat, 2007-09-22 at 23:49 -0500, Jaime Casanova wrote:
  On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote:
  
   related TODO items:
   - add a WAIT n clause in same SQL locations as NOWAIT
   - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting)
  
   to provide better control over lock waits.
  
 
  are these actual TODO items? i can't find them on the TODO list and i
  don't remember any discussion nor patch about this

 They are my proposals for TODO items to assist with application
 development.


while i'm not at all comfortable with the idea of a GUC for this, the
WAIT clause seems to be useful.
just out of curiosity, why the NOWAIT patch wasn't do it that way in
first place, i mean like a WAIT clause and when receiving NOWAIT
transform it in WAIT 0?
maybe dicussion?

there's concensus in adding a WAIT clause?

-- 
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
   Richard Cook

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-22 Thread Jaime Casanova
On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote:

 related TODO items:
 - add a WAIT n clause in same SQL locations as NOWAIT
 - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting)

 to provide better control over lock waits.


are these actual TODO items? i can't find them on the TODO list and i
don't remember any discussion nor patch about this

-- 
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
   Richard Cook

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-20 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Applied with some further revisions to improve the usefulness of the log
 messages.  There's now one issued when the deadlock timeout elapses, and
 another when the lock is finally obtained:

 LOG:  process 14945 still waiting for AccessExclusiveLock on relation 86076 
 of database 86042 after 1003.354 ms
 ...
 LOG:  process 14945 acquired AccessExclusiveLock on relation 86076 of 
 database 86042 after 5918.002 ms

Is it possible for unlocking the semaphore to wake another process other than
our own? In which case checking log_lock_waits before signalling the semaphore
arguably locks us into having log_lock_waits be PGC_POSTMASTER. Currently it's
PGC_SIGHUP which is odd since it could have been USERSET in the old regime.

Also, I did just think of a reason why maybe having the times in the messages
could be annoying: it makes it hard to write regression tests. I suppose
having the pids already interferes with regression tests though. Maybe we
should do something like optionally postprocess .out files with some sed
script like s/[0-9]+/###/ before running diff.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-20 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Is it possible for unlocking the semaphore to wake another process other than
 our own? In which case checking log_lock_waits before signalling the semaphore
 arguably locks us into having log_lock_waits be PGC_POSTMASTER.

How you figure that?

 Currently it's PGC_SIGHUP which is odd since it could have been
 USERSET in the old regime.

Actually I changed it to SUSET yesterday.  I don't see any strong reason
why we should disallow different processes having different settings;
however, if the DBA is trying to gather this info, random users
shouldn't be able to turn it off.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-20 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Is it possible for unlocking the semaphore to wake another process other than
 our own? In which case checking log_lock_waits before signalling the 
 semaphore
 arguably locks us into having log_lock_waits be PGC_POSTMASTER.

 How you figure that?

Well I'm not clear exactly what's going on with the semaphores here. If it's
possible for to be printing the messages only as a result of another backend
unlocking the semaphore then making the PGSemaphoreUnlock conditional on
log_lock_waits means you can't enable log_lock_waits after startup and get
deterministic behaviour because whether you get messages will depend on which
other backend happens to wake you up.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-20 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 How you figure that?

 Well I'm not clear exactly what's going on with the semaphores here. If it's
 possible for to be printing the messages only as a result of another backend
 unlocking the semaphore then making the PGSemaphoreUnlock conditional on
 log_lock_waits means you can't enable log_lock_waits after startup and get
 deterministic behaviour because whether you get messages will depend on which
 other backend happens to wake you up.

I don't see how you arrive at that conclusion.  The message is printed
by the backend that is waiting for (or just obtained) a lock, dependent
on its own local setting of log_lock_waits, and not dependent on who
woke it up.

BTW, I just noticed that GUC allows deadlock_timeout to be set all the
way down to zero.  This seems bad --- surely the minimum value should at
least be positive?  As CVS HEAD stands, you're likely to get a lot of
spurious/useless log messages if you have log_lock_waits = true and
deadlock_timeout = 0.  Do we care?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-20 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I don't see how you arrive at that conclusion.  The message is printed
 by the backend that is waiting for (or just obtained) a lock, dependent
 on its own local setting of log_lock_waits, and not dependent on who
 woke it up.

 But in your version of the patch you're not calling PGSemaphoreUnlock() unless
 log_lock_waits is set in the process doing the waking. 

Which is always the same process:
PGSemaphoreUnlock(MyProc-sem);

 BTW, I just noticed that GUC allows deadlock_timeout to be set all the
 way down to zero.  This seems bad --- surely the minimum value should at
 least be positive?  As CVS HEAD stands, you're likely to get a lot of
 spurious/useless log messages if you have log_lock_waits = true and
 deadlock_timeout = 0.  Do we care?

 Does that actually work? I would expect setitimer to turn off the alarm in
 that case.

Good point, which renders it definitely broken.  I propose we just tweak
GUC to set a minimum deadlock_timeout of 1 msec.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-20 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Which is always the same process:
   PGSemaphoreUnlock(MyProc-sem);

Aaah! I just grokked what's going on with the semaphores here. It all makes a
lot more sense now. Nevermind.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-19 Thread Gregory Stark

 * Tom Lane ([EMAIL PROTECTED]) wrote:

 Yeah, I wouldn't want one per second.  Do we already track how long
 we've been waiting?

 No, because we're *asleep*.  You'd have to add an additional
 timeout-interrupt reason.  Plus there's a ton of interesting questions
 about what's safe to do from an interrupt service routine.

 In fact, I am scandalized to see that someone has inserted a boatload
 of elog calls into CheckDeadLock since 8.2 --- that seems entirely
 unsafe.  [ checks revision history... ]

Attached is a patch which moves the messages to ProcSleep(). To do this I had
to move the semaphore signal to unconditionally be signalled whenever
CheckDeadLock() is called regardless of whether it finds a hard deadlock. I'm
not 100% sure that's correct but afaik we only use semaphores to signal state
changes and deal with spurious semaphore firings everywhere.

Incidentally in looking at this I found that the early deadlock detection
never seems to fire. Reading the source it seems it ought to be firing
whenever we have a simple two-process deadlock. But instead I only get the
timeout-based detection.



checkpoint-log-messages-fix.patch.gz
Description: Binary data

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-19 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes:

 Incidentally in looking at this I found that the early deadlock detection
 never seems to fire. Reading the source it seems it ought to be firing
 whenever we have a simple two-process deadlock. But instead I only get the
 timeout-based detection.

Ok, I understand now that early deadlock detection only kicks in when doing
something like LOCK TABLE and even then only if you're deadlocking because
you're upgrading a lock. So this works as intended though it's much less
useful than I thought.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-19 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Attached is a patch which moves the messages to ProcSleep(). To do this I had
 to move the semaphore signal to unconditionally be signalled whenever
 CheckDeadLock() is called regardless of whether it finds a hard deadlock. I'm
 not 100% sure that's correct but afaik we only use semaphores to signal state
 changes and deal with spurious semaphore firings everywhere.

It's a bit itchy-seeming, but I think what you are actually doing is
going over to a regime where every wait on the semaphore checks and
re-waits if it didn't get the condition it wants.  Before, I think we
did that for every use of the semaphore except this one (which is an
outgrowth of the fact that originally this was indeed the only use of
the sema, and the others got shoehorned in over time).

I'll check it over.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-19 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Attached is a patch which moves the messages to ProcSleep().

BTW, with the messages moved out of the ISR it would be safe to make it
identify the specific lock being waited on (in the same terms used in
the existing deadlock messages).  Is there a reason not to do that?
I suppose it would eat a few more cycles ...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-19 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 * Tom Lane ([EMAIL PROTECTED]) wrote:
 In fact, I am scandalized to see that someone has inserted a boatload
 of elog calls into CheckDeadLock since 8.2 --- that seems entirely
 unsafe.  [ checks revision history... ]

 Attached is a patch which moves the messages to ProcSleep().

Applied with some further revisions to improve the usefulness of the log
messages.  There's now one issued when the deadlock timeout elapses, and
another when the lock is finally obtained:

LOG:  process 14945 still waiting for AccessExclusiveLock on relation 86076 of 
database 86042 after 1003.354 ms
...
LOG:  process 14945 acquired AccessExclusiveLock on relation 86076 of database 
86042 after 5918.002 ms

although I just realized that the wording of the second one is
misleading; it actually comes out when the lock wait ends, whether we
acquired the lock or not.  (The other possibility is that our statement
was aborted, eg by SIGINT or statement_timeout.)

Is it worth having two messages for the two cases?  I'm tempted to just
not log anything if the statement is aborted --- the cause of the abort
should be reflected in some later error message, and reporting how long
we waited before giving up seems not within the charter of
log_lock_waits.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-06-19 Thread Simon Riggs
On Tue, 2007-06-19 at 16:24 -0400, Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  * Tom Lane ([EMAIL PROTECTED]) wrote:
  In fact, I am scandalized to see that someone has inserted a boatload
  of elog calls into CheckDeadLock since 8.2 --- that seems entirely
  unsafe.  [ checks revision history... ]
 
  Attached is a patch which moves the messages to ProcSleep().

Thanks Greg for taking this on; it would still be in my queue now if you
hadn't, so much appreciated.

 Applied with some further revisions to improve the usefulness of the log
 messages.  There's now one issued when the deadlock timeout elapses, and
 another when the lock is finally obtained:
 
 LOG:  process 14945 still waiting for AccessExclusiveLock on relation 86076 
 of database 86042 after 1003.354 ms
 ...
 LOG:  process 14945 acquired AccessExclusiveLock on relation 86076 of 
 database 86042 after 5918.002 ms
 
 although I just realized that the wording of the second one is
 misleading; it actually comes out when the lock wait ends, whether we
 acquired the lock or not.  (The other possibility is that our statement
 was aborted, eg by SIGINT or statement_timeout.)
 
 Is it worth having two messages for the two cases?  I'm tempted to just
 not log anything if the statement is aborted --- the cause of the abort
 should be reflected in some later error message, and reporting how long
 we waited before giving up seems not within the charter of
 log_lock_waits.

Sounds good; thanks Tom.


related TODO items:
- add a WAIT n clause in same SQL locations as NOWAIT
- add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting)

to provide better control over lock waits.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate