On Fri, 2007-06-29 at 09:29 +0900, ITAGAKI Takahiro wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> wrote:
>
> > What I'm requesting here is that the sleep in count_nondeletable_pages()
> > be removed and that change backpatched to 8.2 and 8.1.
>
> Agreed. We'd better to shorten the exclusive locking as far as possible.

That is just one possibility, but I'd like to consider other
possibilities before we go for that, especially backpatched.

ISTM holding the lock across many I/Os is the thing that is causing long
lock times. Removing the sleep may not substantially reduce the time on
a busy system. Alvaro's example also shows that the number of blocks
removed could be a substantial number - reminding us that the time the
lock is held would still be O(N), whereas we would like it to be O(1).
This is important since we don't even attempt truncation until the
number of blocks is large enough to be worth bothering with.

Would it be better to keep the sleep in there, but release and
re-acquire the lock either side of the sleep? That would allow other
transactions to progress without long lock waits.

Currently, releasing the lock is a problem because the new FSM entries
are added after truncation, so any updates and inserts would probably
try to extend the relation, thus preventing further truncation. If we
did things differently, we would have no reason to fail when we attempt
to re-acquire the lock:

- analyze where the truncation point would be on the vacuum pass

- add FSM entries for all blocks below the truncation point. If that is
  below a minimum of 5% of the entries/16 blocks then we can move the
  truncation point higher so that the FSM entry is large enough to allow
  us time to truncate.

- truncate the file, one bite at a time as we sleep (or max 16 blocks at
  a time if no sleep requested), possibly scanning forwards not back

I would still like to see VACUUM spin a few times trying to acquire the
lock before it gives up attempting to truncate. Re-running the whole
VACUUM just to get another split-second chance to truncate is not very
useful behaviour either.
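
To make the shape of the proposal concrete, here is a toy model in Python. It is not PostgreSQL code: the function truncate_in_bites, the dict standing in for a relation, and the parameters bite, nap and retries are all hypothetical names chosen for illustration. It assumes the FSM step has already steered inserts below the truncation point, so the tail blocks stay empty while the lock is released. It truncates from the end, takes the lock only for one bite at a time, sleeps with the lock released, and spins a few times for the lock before giving up.

```python
import threading
import time


def truncate_in_bites(lock, rel, new_len, bite=16, nap=0.0, retries=3):
    """Shrink rel["nblocks"] down to new_len, at most `bite` blocks per lock hold.

    Returns the number of blocks removed. Stops early (returning what was
    removed so far) if the lock cannot be obtained after `retries` tries.
    """
    removed = 0
    while rel["nblocks"] > new_len:
        # Spin a few times for the lock rather than abandoning truncation
        # on the first failed attempt.
        for _ in range(retries):
            if lock.acquire(blocking=False):
                break
            time.sleep(0.001)
        else:
            return removed  # never got the lock: give up, keep progress

        try:
            # Work done under the lock is bounded by `bite`, not by the
            # total number of blocks to remove.
            step = min(bite, rel["nblocks"] - new_len)
            rel["nblocks"] -= step
            removed += step
        finally:
            lock.release()

        if nap:
            time.sleep(nap)  # the sleep happens with the lock released
    return removed


if __name__ == "__main__":
    rel = {"nblocks": 100}
    print(truncate_in_bites(threading.Lock(), rel, 36), rel["nblocks"])
```

In this model each lock hold covers at most `bite` blocks, so the wait seen by other transactions no longer grows with the total number of blocks removed, and a failed lock attempt loses only the remaining bites rather than the whole truncation.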

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com