Re: [PATCHES] [HACKERS] 'Waiting on lock'
On Mon, 2007-09-24 at 21:26 -0500, Jaime Casanova wrote: On 9/24/07, Simon Riggs [EMAIL PROTECTED] wrote: On Sat, 2007-09-22 at 23:49 -0500, Jaime Casanova wrote: On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote: related TODO items: - add a WAIT n clause in same SQL locations as NOWAIT - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting) to provide better control over lock waits. are these actual TODO items? i can't find them on the TODO list and i don't remember any discussion nor patch about this They are my proposals for TODO items to assist with application development. while i'm not at all comfortable with the idea of a GUC for this, the WAIT clause seems to be useful. just out of curiosity, why the NOWAIT patch wasn't do it that way in first place, i mean like a WAIT clause and when receiving NOWAIT transform it in WAIT 0? maybe dicussion? NOWAIT is used by Oracle. DB2 supports a lock wait timeout. What I didn't know before googling this was that SQLServer uses NOWAIT also. SQLServer also implement WAIT [n seconds] *and* a parameter called lock wait period, which is pretty spooky. Another reason to implement this is to help avoid global deadlocks in distributed transactions (e.g. two phase). SQLServer and DB2 have more need of this than PostgreSQL, but we do still need it. there's concensus in adding a WAIT clause? Just do it, but take careful note of any comments against things. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Simon Riggs [EMAIL PROTECTED] writes: SQLServer and DB2 have more need of this than PostgreSQL, but we do still need it. Why? What does it do that statement_timeout doesn't do better? regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] 'Waiting on lock'
On Tue, 2007-09-25 at 09:16 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: SQLServer and DB2 have more need of this than PostgreSQL, but we do still need it. Why? What does it do that statement_timeout doesn't do better? If the execution time is negligible, then setting statement_timeout is the same thing as setting a lock timeout. If execution time is not negligible, then you may want to tell the difference between waiting for completion against waiting forever without doing anything useful at all. It's also easier to set an all encompassing lock timeout at User level than it is to set statement_timeout on individual transactions issued by that user. Plus, if applications are written using these concepts it is easier to port them to PostgreSQL. Not planning to work on this myself, but I think it is a valid TODO. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Simon Riggs [EMAIL PROTECTED] writes: On Tue, 2007-09-25 at 09:16 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: SQLServer and DB2 have more need of this than PostgreSQL, but we do still need it. Why? What does it do that statement_timeout doesn't do better? If the execution time is negligible, then setting statement_timeout is the same thing as setting a lock timeout. To make this explicit, I think the typical scenario where it would make a difference is where you're running some large job in a plpgsql function. You might be processing millions of records but want for a single step of that process to not wait for a lock. You still want to process all the records you can though. So for example if you're updating all the user profiles on your system but don't want to block on any user-profiles which are locked by active users -- especially if you use database locks for user-visible operations which users can drag out for long periods of time. (Not saying I agree with that design but there are arguments for it and people do do it) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] 'Waiting on lock'
On 9/25/07, Simon Riggs [EMAIL PROTECTED] wrote: On Tue, 2007-09-25 at 09:16 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: SQLServer and DB2 have more need of this than PostgreSQL, but we do still need it. Why? What does it do that statement_timeout doesn't do better? If the execution time is negligible, then setting statement_timeout is the same thing as setting a lock timeout. If execution time is not negligible, then you may want to tell the difference between waiting for completion against waiting forever without doing anything useful at all. [...thinking on this a bit...] mmm... i think we can emulate WAIT number_of_seconds using the NOWAIT and a bit of logic... point for tom Plus, if applications are written using these concepts it is easier to port them to PostgreSQL. no words... point for simon... Not planning to work on this myself, but I think it is a valid TODO. i will make a try for 8.4 -- regards, Jaime Casanova Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning. Richard Cook ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] 'Waiting on lock'
On Sat, 2007-09-22 at 23:49 -0500, Jaime Casanova wrote: On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote: related TODO items: - add a WAIT n clause in same SQL locations as NOWAIT - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting) to provide better control over lock waits. are these actual TODO items? i can't find them on the TODO list and i don't remember any discussion nor patch about this They are my proposals for TODO items to assist with application development. Waiting around DDL is just one reason to want those, though there are other similar issues with locking. I had one client with a program that waited for 57 hours before they noticed. statement_timeout was not an appropriate way to manage that and some form of timeout makes better sense. NOWAIT means extra programming to handle that case, rather than raising a normal error, so isn't always possible to change the app source. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] 'Waiting on lock'
On 9/24/07, Simon Riggs [EMAIL PROTECTED] wrote: On Sat, 2007-09-22 at 23:49 -0500, Jaime Casanova wrote: On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote: related TODO items: - add a WAIT n clause in same SQL locations as NOWAIT - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting) to provide better control over lock waits. are these actual TODO items? i can't find them on the TODO list and i don't remember any discussion nor patch about this They are my proposals for TODO items to assist with application development. while i'm not at all comfortable with the idea of a GUC for this, the WAIT clause seems to be useful. just out of curiosity, why the NOWAIT patch wasn't do it that way in first place, i mean like a WAIT clause and when receiving NOWAIT transform it in WAIT 0? maybe dicussion? there's concensus in adding a WAIT clause? -- regards, Jaime Casanova Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning. Richard Cook ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [HACKERS] 'Waiting on lock'
On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote: related TODO items: - add a WAIT n clause in same SQL locations as NOWAIT - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting) to provide better control over lock waits. are these actual TODO items? i can't find them on the TODO list and i don't remember any discussion nor patch about this -- regards, Jaime Casanova Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning. Richard Cook ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Tom Lane [EMAIL PROTECTED] writes: Applied with some further revisions to improve the usefulness of the log messages. There's now one issued when the deadlock timeout elapses, and another when the lock is finally obtained: LOG: process 14945 still waiting for AccessExclusiveLock on relation 86076 of database 86042 after 1003.354 ms ... LOG: process 14945 acquired AccessExclusiveLock on relation 86076 of database 86042 after 5918.002 ms Is it possible for unlocking the semaphore to wake another process other than our own? In which case checking log_lock_waits before signalling the semaphore arguably locks us into having log_lock_waits be PGC_POSTMASTER. Currently it's PGC_SIGHUP which is odd since it could have been USERSET in the old regime. Also, I did just think of a reason why maybe having the times in the messages could be annoying: it makes it hard to write regression tests. I suppose having the pids already interferes with regression tests though. Maybe we should do something like optionally postprocess .out files with some sed script like s/[0-9]+/###/ before running diff. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Gregory Stark [EMAIL PROTECTED] writes: Is it possible for unlocking the semaphore to wake another process other than our own? In which case checking log_lock_waits before signalling the semaphore arguably locks us into having log_lock_waits be PGC_POSTMASTER. How you figure that? Currently it's PGC_SIGHUP which is odd since it could have been USERSET in the old regime. Actually I changed it to SUSET yesterday. I don't see any strong reason why we should disallow different processes having different settings; however, if the DBA is trying to gather this info, random users shouldn't be able to turn it off. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Tom Lane [EMAIL PROTECTED] writes: Gregory Stark [EMAIL PROTECTED] writes: Is it possible for unlocking the semaphore to wake another process other than our own? In which case checking log_lock_waits before signalling the semaphore arguably locks us into having log_lock_waits be PGC_POSTMASTER. How you figure that? Well I'm not clear exactly what's going on with the semaphores here. If it's possible for to be printing the messages only as a result of another backend unlocking the semaphore then making the PGSemaphoreUnlock conditional on log_lock_waits means you can't enable log_lock_waits after startup and get deterministic behaviour because whether you get messages will depend on which other backend happens to wake you up. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: How you figure that? Well I'm not clear exactly what's going on with the semaphores here. If it's possible for to be printing the messages only as a result of another backend unlocking the semaphore then making the PGSemaphoreUnlock conditional on log_lock_waits means you can't enable log_lock_waits after startup and get deterministic behaviour because whether you get messages will depend on which other backend happens to wake you up. I don't see how you arrive at that conclusion. The message is printed by the backend that is waiting for (or just obtained) a lock, dependent on its own local setting of log_lock_waits, and not dependent on who woke it up. BTW, I just noticed that GUC allows deadlock_timeout to be set all the way down to zero. This seems bad --- surely the minimum value should at least be positive? As CVS HEAD stands, you're likely to get a lot of spurious/useless log messages if you have log_lock_waits = true and deadlock_timeout = 0. Do we care? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: I don't see how you arrive at that conclusion. The message is printed by the backend that is waiting for (or just obtained) a lock, dependent on its own local setting of log_lock_waits, and not dependent on who woke it up. But in your version of the patch you're not calling PGSemaphoreUnlock() unless log_lock_waits is set in the process doing the waking. Which is always the same process: PGSemaphoreUnlock(MyProc-sem); BTW, I just noticed that GUC allows deadlock_timeout to be set all the way down to zero. This seems bad --- surely the minimum value should at least be positive? As CVS HEAD stands, you're likely to get a lot of spurious/useless log messages if you have log_lock_waits = true and deadlock_timeout = 0. Do we care? Does that actually work? I would expect setitimer to turn off the alarm in that case. Good point, which renders it definitely broken. I propose we just tweak GUC to set a minimum deadlock_timeout of 1 msec. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Tom Lane [EMAIL PROTECTED] writes: Which is always the same process: PGSemaphoreUnlock(MyProc-sem); Aaah! I just grokked what's going on with the semaphores here. It all makes a lot more sense now. Nevermind. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [HACKERS] 'Waiting on lock'
* Tom Lane ([EMAIL PROTECTED]) wrote: Yeah, I wouldn't want one per second. Do we already track how long we've been waiting? No, because we're *asleep*. You'd have to add an additional timeout-interrupt reason. Plus there's a ton of interesting questions about what's safe to do from an interrupt service routine. In fact, I am scandalized to see that someone has inserted a boatload of elog calls into CheckDeadLock since 8.2 --- that seems entirely unsafe. [ checks revision history... ] Attached is a patch which moves the messages to ProcSleep(). To do this I had to move the semaphore signal to unconditionally be signalled whenever CheckDeadLock() is called regardless of whether it finds a hard deadlock. I'm not 100% sure that's correct but afaik we only use semaphores to signal state changes and deal with spurious semaphore firings everywhere. Incidentally in looking at this I found that the early deadlock detection never seems to fire. Reading the source it seems it ought to be firing whenever we have a simple two-process deadlock. But instead I only get the timeout-based detection. checkpoint-log-messages-fix.patch.gz Description: Binary data -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Gregory Stark [EMAIL PROTECTED] writes: Incidentally in looking at this I found that the early deadlock detection never seems to fire. Reading the source it seems it ought to be firing whenever we have a simple two-process deadlock. But instead I only get the timeout-based detection. Ok, I understand now that early deadlock detection only kicks in when doing something like LOCK TABLE and even then only if you're deadlocking because you're upgrading a lock. So this works as intended though it's much less useful than I thought. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Gregory Stark [EMAIL PROTECTED] writes: Attached is a patch which moves the messages to ProcSleep(). To do this I had to move the semaphore signal to unconditionally be signalled whenever CheckDeadLock() is called regardless of whether it finds a hard deadlock. I'm not 100% sure that's correct but afaik we only use semaphores to signal state changes and deal with spurious semaphore firings everywhere. It's a bit itchy-seeming, but I think what you are actually doing is going over to a regime where every wait on the semaphore checks and re-waits if it didn't get the condition it wants. Before, I think we did that for every use of the semaphore except this one (which is an outgrowth of the fact that originally this was indeed the only use of the sema, and the others got shoehorned in over time). I'll check it over. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Gregory Stark [EMAIL PROTECTED] writes: Attached is a patch which moves the messages to ProcSleep(). BTW, with the messages moved out of the ISR it would be safe to make it identify the specific lock being waited on (in the same terms used in the existing deadlock messages). Is there a reason not to do that? I suppose it would eat a few more cycles ... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] 'Waiting on lock'
Gregory Stark [EMAIL PROTECTED] writes: * Tom Lane ([EMAIL PROTECTED]) wrote: In fact, I am scandalized to see that someone has inserted a boatload of elog calls into CheckDeadLock since 8.2 --- that seems entirely unsafe. [ checks revision history... ] Attached is a patch which moves the messages to ProcSleep(). Applied with some further revisions to improve the usefulness of the log messages. There's now one issued when the deadlock timeout elapses, and another when the lock is finally obtained: LOG: process 14945 still waiting for AccessExclusiveLock on relation 86076 of database 86042 after 1003.354 ms ... LOG: process 14945 acquired AccessExclusiveLock on relation 86076 of database 86042 after 5918.002 ms although I just realized that the wording of the second one is misleading; it actually comes out when the lock wait ends, whether we acquired the lock or not. (The other possibility is that our statement was aborted, eg by SIGINT or statement_timeout.) Is it worth having two messages for the two cases? I'm tempted to just not log anything if the statement is aborted --- the cause of the abort should be reflected in some later error message, and reporting how long we waited before giving up seems not within the charter of log_lock_waits. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] 'Waiting on lock'
On Tue, 2007-06-19 at 16:24 -0400, Tom Lane wrote: Gregory Stark [EMAIL PROTECTED] writes: * Tom Lane ([EMAIL PROTECTED]) wrote: In fact, I am scandalized to see that someone has inserted a boatload of elog calls into CheckDeadLock since 8.2 --- that seems entirely unsafe. [ checks revision history... ] Attached is a patch which moves the messages to ProcSleep(). Thanks Greg for taking this on; it would still be in my queue now if you hadn't, so much appreciated. Applied with some further revisions to improve the usefulness of the log messages. There's now one issued when the deadlock timeout elapses, and another when the lock is finally obtained: LOG: process 14945 still waiting for AccessExclusiveLock on relation 86076 of database 86042 after 1003.354 ms ... LOG: process 14945 acquired AccessExclusiveLock on relation 86076 of database 86042 after 5918.002 ms although I just realized that the wording of the second one is misleading; it actually comes out when the lock wait ends, whether we acquired the lock or not. (The other possibility is that our statement was aborted, eg by SIGINT or statement_timeout.) Is it worth having two messages for the two cases? I'm tempted to just not log anything if the statement is aborted --- the cause of the abort should be reflected in some later error message, and reporting how long we waited before giving up seems not within the charter of log_lock_waits. Sounds good; thanks Tom. related TODO items: - add a WAIT n clause in same SQL locations as NOWAIT - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting) to provide better control over lock waits. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate