I can at least report that the problem does not seem to occur with Postgres 8.0.1 running on a dual Opteron.
--Ian On Wed, 2005-07-13 at 16:39, Simon Riggs wrote: > On Wed, 2005-07-13 at 14:58 -0400, Tom Lane wrote: > > Ian Westmacott <[EMAIL PROTECTED]> writes: > > > On Wed, 2005-07-13 at 11:55, Simon Riggs wrote: > > >> On Tue, 2005-07-12 at 13:50 -0400, Ian Westmacott wrote: > > >>> It appears not to matter whether it is one of the tables > > >>> being written to that is ANALYZEd. I can ANALYZE an old, > > >>> quiescent table, or a system table and see this effect. > > >> > > >> Can you confirm that this effect is still seen even when the ANALYZE > > >> doesn't touch *any* of the tables being accessed? > > > > > Yes. > > > > This really isn't making any sense at all. > > Agreed. I think all of this indicates that some wierdness (technical > term) is happening at a different level in the computing stack. I think > all of this points fairly strongly to it *not* being a PostgreSQL > algorithm problem, i.e. if the code was executed by an idealised Knuth- > like CPU then we would not get this problem. Plus, I have faith that if > it was a problem in that "plane" then you or another would have > uncovered it by now. > > > However, these certainly do not explain Ian's problem, because (a) these > > only apply to VACUUM, not ANALYZE; (b) they would only lock the table > > being VACUUMed, not other ones; (c) if these locks were to block the > > reader or writer thread, it'd manifest as blocking on a semaphore, not > > as a surge in LWLock thrashing. > > I've seen enough circumstantial evidence to connect the time spent > inside LWLockAcquire/Release as being connected to the Semaphore ops > within them, not the other aspects of the code. > > Months ago we discussed the problem of false sharing on closely packed > arrays of shared variables because of the large cache line size of the > Xeon MP. When last we touched on that thought, I focused on the thought > that the LWLock array was too tightly packed for the predefined locks. > What we didn't discuss (because I was too focused on the other array) > was the PGPROC shared array is equally tightly packed, which could give > problems on the semaphores in LWLock. > > Intel says fairly clearly that this would be an issue. > > > >> Is that Xeon MP then? > > > > > Yes. > > > > The LWLock activity is certainly suggestive of prior reports of > > excessive buffer manager lock contention, but it makes *no* sense that > > that would be higher with vacuum cost delay than without. I'd have > > expected the other way around. > > > > I'd really like to see a test case for this... > > My feeling is that a "micro-architecture" test would be more likely to > reveal some interesting information. > > Best Regards, Simon Riggs > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq