Simon,

In the 16/16 (16 buffer partitions/16 lock partitions) test, the
WALInsertLock lock had 14643080 acquisition attempts and 12057678
successful acquisitions. That's 2585402 retries on the lock; in other
words, PGSemaphoreLock was invoked 2585402 times.

In the 128/128 test, the WALInsertLock lock had 14991208 acquisition
attempts and 12324765 successful acquisitions. That's 2666443 retries.

The 128/128 test attempted 348128 more lock acquisitions than the 16/16
test and retried 81041 more times than the 16/16 test. We attribute the
rise in WALInsertLock lock accesses to the reduction in time spent
acquiring the BufMapping and LockMgr partition locks. Does this seem
reasonable?
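
For anyone who wants to recheck the arithmetic, here is a minimal sketch
in C. It uses only the counts quoted above; the struct and function
names are made up for illustration and are not part of PostgreSQL.

```c
#include <stdint.h>

/* Counter pair for one benchmark run, as reported in this message.
 * (Illustrative type, not a PostgreSQL structure.) */
typedef struct {
    uint64_t attempts;   /* WALInsertLock acquisition attempts */
    uint64_t acquired;   /* successful acquisitions */
} LockCounts;

/* Figures from the 16/16 and 128/128 runs. */
const LockCounts run_16  = { 14643080u, 12057678u };
const LockCounts run_128 = { 14991208u, 12324765u };

/* Retries are the attempts that did not end in a successful acquisition. */
uint64_t lock_retries(LockCounts c)
{
    return c.attempts - c.acquired;
}

/* Fraction of attempts that had to retry (0.0 if there were no attempts). */
double lock_retry_fraction(LockCounts c)
{
    if (c.attempts == 0)
        return 0.0;
    return (double) lock_retries(c) / (double) c.attempts;
}
```

lock_retries(run_16) gives 2585402 and lock_retries(run_128) gives
2666443, matching the numbers above. The retry fraction comes out at
roughly 17.7% and 17.8% of attempts respectively, so the share of
attempts that retry barely moves between the two runs.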

The overhead of any monitoring is of great concern to us. We've tried
both clock_gettime() and gettimeofday() calls. Both seem to have the
same overhead, ~1 us/call (measured against the TSC of the CPU), and
both seem to be accurate. We realize this can be a delicate point, so
we would be happy to rerun any tests with a different timing mechanism.
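
For anyone wanting to check the per-call cost on their own hardware, a
rough sketch in C follows. It is not the harness used for the numbers
above: it times a loop of calls with CLOCK_MONOTONIC rather than
measuring against the TSC, and the function names are invented for
illustration. It assumes a POSIX system.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* expose clock_gettime() and gettimeofday() */
#endif
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds, used to time the loops below. */
static double now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec * 1e9 + (double) ts.tv_nsec;
}

/* Average cost, in ns, of one clock_gettime(CLOCK_REALTIME) over n calls. */
double avg_ns_clock_gettime(long n)
{
    struct timespec ts;
    volatile long sink = 0;   /* keeps the compiler from dropping the loop */
    double start = now_ns();

    for (long i = 0; i < n; i++) {
        clock_gettime(CLOCK_REALTIME, &ts);
        sink += ts.tv_nsec;
    }
    (void) sink;
    return (now_ns() - start) / (double) n;
}

/* Average cost, in ns, of one gettimeofday() over n calls. */
double avg_ns_gettimeofday(long n)
{
    struct timeval tv;
    volatile long sink = 0;
    double start = now_ns();

    for (long i = 0; i < n; i++) {
        gettimeofday(&tv, NULL);
        sink += (long) tv.tv_usec;
    }
    (void) sink;
    return (now_ns() - start) / (double) n;
}
```

Calling each function with a large n (say, one million) and comparing
the two averages gives a per-call figure. The result includes loop
overhead and depends heavily on the kernel and clock source, so treat it
as an estimate only.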

David

-----Original Message-----
From: Simon Riggs [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 13, 2006 2:22 AM
To: Tom Lane
Cc: Strong, David; PostgreSQL-development
Subject: Re: [HACKERS] Lock partitions

On Tue, 2006-09-12 at 12:40 -0400, Tom Lane wrote:
> "Strong, David" <[EMAIL PROTECTED]> writes:
> > When using 16 buffer and 16 lock partitions, we see that BufMapping
> > takes 809 seconds to acquire locks and 174 seconds to release locks.
> > The LockMgr takes 362 seconds to acquire locks and 26 seconds to
> > release locks.
> 
> > When using 128 buffer and 128 lock partitions, we see that
> > BufMapping takes 277 seconds (532 seconds improvement) to acquire
> > locks and 78 seconds (96 seconds improvement) to release locks. The
> > LockMgr takes 235 seconds (127 seconds improvement) to acquire locks
> > and 22 seconds (4 seconds improvement) to release locks.
> 
> While I don't see any particular penalty to increasing
> NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very
> significant penalty (increasing PGPROC size as well as the work needed
> during LockReleaseAll, which is executed at every transaction end).
> I think 128 lock partitions is probably verging on the ridiculous
> ... particularly if your benchmark only involves touching half a dozen
> tables.  I'd be more interested in comparisons between 4 and 16 lock
> partitions.  Also, please vary the two settings independently rather
> than confusing the issue by changing them both at once.

Good thinking, David. Even if 128 is fairly high, it does seem worth
exploring higher values - I was just stuck in "fewer == better"
thoughts.

> > With the improvements in the various locking times, one might expect
> > an improvement in the overall benchmark result. However, a 16
> > partition run produces a result of 198.74 TPS and a 128 partition
> > run produces a result of 203.24 TPS.
> 
> > Part of the time saved from BufMapping and LockMgr partitions is
> > absorbed into the WALInsertLock lock. For a 16 partition run, the
> > total time to lock/release the WALInsertLock lock is 5845 seconds.
> > For 128 partitions, the WALInsertLock lock takes 6172 seconds, an
> > increase of 327 seconds. Perhaps we have our WAL configured
> > incorrectly?
> 
> I fear this throws your entire measurement procedure into question.
> For a fixed workload the number of acquisitions of WALInsertLock ought
> to be fixed, so you shouldn't see any more contention for
> WALInsertLock if the transaction rate didn't change materially.

David's results were to do with lock acquire/release time, not the
number of acquisitions, so that in itself doesn't make me doubt these
measurements. Perhaps we can ask whether there was a substantially
different number of lock acquisitions? As Tom says, that would be an
issue.

It seems reasonable that, having relieved the bottleneck on the
BufMapping and LockMgr locks, we would then queue longer on the next
bottleneck, WALInsertLock. So again, those tests seem reasonable to me
so far.
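
To put rough numbers on that, here is a small sketch in C built only
from the figures quoted above. The struct and function names are
illustrative, not PostgreSQL code, and nothing in this thread says how
the seconds are summed across backends, so the ratios are indicative at
best.

```c
/* Lock times in seconds and throughput for one run, from the quoted
 * benchmark figures. (Illustrative type, not a PostgreSQL structure.) */
typedef struct {
    double bufmapping_acquire;
    double bufmapping_release;
    double lockmgr_acquire;
    double lockmgr_release;
    double walinsert_total;   /* lock + release time on WALInsertLock */
    double tps;
} RunTimes;

const RunTimes times_16  = { 809.0, 174.0, 362.0, 26.0, 5845.0, 198.74 };
const RunTimes times_128 = { 277.0,  78.0, 235.0, 22.0, 6172.0, 203.24 };

/* Total seconds spent on the BufMapping and LockMgr partition locks. */
double partition_lock_seconds(RunTimes r)
{
    return r.bufmapping_acquire + r.bufmapping_release
         + r.lockmgr_acquire + r.lockmgr_release;
}

/* Seconds saved on the partition locks going from one run to the other. */
double partition_seconds_saved(RunTimes before, RunTimes after)
{
    return partition_lock_seconds(before) - partition_lock_seconds(after);
}

/* Extra seconds spent on WALInsertLock in the second run. */
double walinsert_seconds_added(RunTimes before, RunTimes after)
{
    return after.walinsert_total - before.walinsert_total;
}

/* Relative throughput change, e.g. 0.05 means 5% more TPS. */
double tps_gain_fraction(RunTimes before, RunTimes after)
{
    return (after.tps - before.tps) / before.tps;
}
```

With these inputs, the partition locks save 759 seconds in total
(532 + 96 + 127 + 4), WALInsertLock takes 327 seconds more, and TPS
rises by a little over 2%.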

These seem to be the beginnings of accurate wait time analysis, so I'm
listening closely.

Are you using a lightweight timer?

-- 
  Simon Riggs             
  EnterpriseDB   http://www.enterprisedb.com

