Re: [PERFORM] cost-based vacuum

Ian Westmacott Wed, 13 Jul 2005 14:58:21 -0700

I can at least report that the problem does not seem to
occur with Postgres 8.0.1 running on a dual Opteron.


        --Ian


On Wed, 2005-07-13 at 16:39, Simon Riggs wrote:
> On Wed, 2005-07-13 at 14:58 -0400, Tom Lane wrote:
> > Ian Westmacott <[EMAIL PROTECTED]> writes:
> > > On Wed, 2005-07-13 at 11:55, Simon Riggs wrote:
> > >> On Tue, 2005-07-12 at 13:50 -0400, Ian Westmacott wrote:
> > >>> It appears not to matter whether it is one of the tables
> > >>> being written to that is ANALYZEd.  I can ANALYZE an old,
> > >>> quiescent table, or a system table and see this effect.
> > >> 
> > >> Can you confirm that this effect is still seen even when the ANALYZE
> > >> doesn't touch *any* of the tables being accessed?
> > 
> > > Yes.
> > 
> > This really isn't making any sense at all. 
> 
> Agreed. I think all of this indicates that some wierdness (technical
> term) is happening at a different level in the computing stack. I think
> all of this points fairly strongly to it *not* being a PostgreSQL
> algorithm problem, i.e. if the code was executed by an idealised Knuth-
> like CPU then we would not get this problem. Plus, I have faith that if
> it was a problem in that "plane" then you or another would have
> uncovered it by now.
> 
> > However, these certainly do not explain Ian's problem, because (a) these
> > only apply to VACUUM, not ANALYZE; (b) they would only lock the table
> > being VACUUMed, not other ones; (c) if these locks were to block the
> > reader or writer thread, it'd manifest as blocking on a semaphore, not
> > as a surge in LWLock thrashing.
> 
> I've seen enough circumstantial evidence to connect the time spent
> inside LWLockAcquire/Release as being connected to the Semaphore ops
> within them, not the other aspects of the code.
> 
> Months ago we discussed the problem of false sharing on closely packed
> arrays of shared variables because of the large cache line size of the
> Xeon MP. When last we touched on that thought, I focused on the thought
> that the LWLock array was too tightly packed for the predefined locks.
> What we didn't discuss (because I was too focused on the other array)
> was the PGPROC shared array is equally tightly packed, which could give
> problems on the semaphores in LWLock.
> 
> Intel says fairly clearly that this would be an issue. 
> 
> > >> Is that Xeon MP then?
> > 
> > > Yes.
> > 
> > The LWLock activity is certainly suggestive of prior reports of
> > excessive buffer manager lock contention, but it makes *no* sense that
> > that would be higher with vacuum cost delay than without.  I'd have
> > expected the other way around.
> > 
> > I'd really like to see a test case for this...
> 
> My feeling is that a "micro-architecture" test would be more likely to
> reveal some interesting information.
> 
> Best Regards, Simon Riggs
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: [PERFORM] cost-based vacuum

Reply via email to