On Wed, 2005-07-13 at 14:58 -0400, Tom Lane wrote:
> Ian Westmacott <[EMAIL PROTECTED]> writes:
> > On Wed, 2005-07-13 at 11:55, Simon Riggs wrote:
> >> On Tue, 2005-07-12 at 13:50 -0400, Ian Westmacott wrote:
> >>> It appears not to matter whether it is one of the tables
> >>> being written to that is ANALYZEd. I can ANALYZE an old,
> >>> quiescent table, or a system table and see this effect.
> >>
> >> Can you confirm that this effect is still seen even when the ANALYZE
> >> doesn't touch *any* of the tables being accessed?
>
> > Yes.
>
> This really isn't making any sense at all.
Agreed. I think all of this indicates that some weirdness (technical
term) is happening at a different level in the computing stack. It
points fairly strongly to this *not* being a PostgreSQL algorithm
problem; that is, if the code were executed by an idealised Knuth-like
CPU, we would not see this problem. Plus, I have faith that if it were
a problem in that "plane", then you or another would have uncovered it
by now.

> However, these certainly do not explain Ian's problem, because (a) these
> only apply to VACUUM, not ANALYZE; (b) they would only lock the table
> being VACUUMed, not other ones; (c) if these locks were to block the
> reader or writer thread, it'd manifest as blocking on a semaphore, not
> as a surge in LWLock thrashing.

I've seen enough circumstantial evidence to connect the time spent
inside LWLockAcquire/Release with the semaphore ops within them, rather
than with the other aspects of that code.

Months ago we discussed the problem of false sharing on closely packed
arrays of shared variables, caused by the large cache line size of the
Xeon MP. When we last touched on that thought, I focused on the idea
that the LWLock array was too tightly packed for the predefined locks.
What we didn't discuss (because I was too focused on the other array)
is that the shared PGPROC array is equally tightly packed, which could
cause problems for the semaphores used by the LWLocks. Intel's
documentation says fairly clearly that this would be an issue.

> >> Is that Xeon MP then?
>
> > Yes.
>
> The LWLock activity is certainly suggestive of prior reports of
> excessive buffer manager lock contention, but it makes *no* sense that
> that would be higher with vacuum cost delay than without. I'd have
> expected the other way around.
>
> I'd really like to see a test case for this...

My feeling is that a "micro-architecture" test would be more likely to
reveal some interesting information.
Best Regards, Simon Riggs

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend