Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

David Rowley Mon, 06 Mar 2017 20:32:38 -0800

On 2 March 2017 at 16:06, Amit Kapila <[email protected]> wrote:
> On Wed, Mar 1, 2017 at 5:32 PM, David Rowley
> <[email protected]> wrote:
>> Hackers,
>>
>> I've attached a small patch which aims to improve the performance of
>> AccessExclusiveLock when there are many AccessExclusiveLock stored.
>>
>
> I could see that your idea is quite straightforward to improve
> performance (basically, convert recovery lock list to recovery lock
> hash table based on xid), however, it is not clear under what kind of
> workload this will be helpful?  Do you expect that many concurrent
> sessions are trying to acquire different AEL locks?


Many thanks for looking over this.

The problem case is when many AccessExclusiveLocks are in the list,
each transaction which commits must look through the entire list for
locks which belong to it. This can be a problem when a small number of
transactions create large amount of temp tables, perhaps some batch
processing job. Other transactions, even ones which don't create any
temp tables must look through the list to release any locks belonging
to them. All we need to do to recreate this is have a bunch of
transactions creating temp tables, then try and execute some very
small and fast transactions at the same time.

It's possible to reproduce the problem by running the attached
temptable_bench.sql with:

pgbench -f temptable_bench.sql -j 10 -c 10 -T 60 -n postgres

while concurrently running

pgbench -f updatecounter.sql -T 60 -n postgres

after having done:

CREATE TABLE t1 (value int not null);
insert into t1 values(1);

The hot physical standby lag grows significantly during the run. I saw
up to 500MB on a short 1 minute run and perf record shows that
StandbyReleaseLocks() is consuming 90% of CPU time.

With the patch StandbyReleaseLocks() is down at about 2% CPU.

> Few comments on the patch:
> 1.
> +/*
> + * Number of buckets to split RecoveryLockTable into.
> + * This must be a power of two.
> + */
>
> +#define RECOVERYLOCKTABLE_SIZE 1024
>
> On what basis did you decide to keep the lock table size as 1024, is
> it just a guess, if so, then also adding a comment to indicate that
> you think this is sufficient for current use cases seems better.
> However, if it is based on some data, then it would be more
> appropriate.

That may need tweaking. Likely it could be smaller if we had some sort
of bloom filter to mark if the transaction had obtained any AEL locks,
that way it could skip. Initially I really didn't want to make the
patch too complex. I had thought that a fairly large hash table would
fix the problem well enough, as quite possibly most buckets would be
empty and non empty buckets have short lists.

> 2.
> @@ -607,6 +627,8 @@ StandbyAcquireAccessExclusiveLock(TransactionId
> xid, Oid dbOid, Oid relOid)
>  {
>   xl_standby_lock *newlock;
>   LOCKTAG locktag;
> + size_t pidx;
> +
>
> Comment on top of this function needs some change, it still seems to
> refer to RelationLockList.

Will fix that.

>
> * We keep a single dynamically expandible list of locks in local memory,
>  * RelationLockList, so we can keep track of the various entries made by
>  * the Startup process's virtual xid in the shared lock table.
>
> 3.
> + size_t pidx;
> +
> + if (!TransactionIdIsValid(xid))
> + {
> + StandbyReleaseAllLocks();
> + return;
> + }
>
> What is the need of this new code?

I was not much of a fan of the old way that the function coped with an
invalid xid. The test was being performed inside of the for loop. My
new for loop was not really compatible with that test, so I moved
handling of invalid xids to outside the loop. I think doing that with
today's code would be an improvement as it would mean less work inside
the slow loop.

> 4.
> In some cases, it might slow down the performance where you have to
> traverse the complete hash table like StandbyReleaseOldLocks, however,
> I think those cases will be less relative to other lock release calls
> which are called at every commit, so probably this is okay.

Yes, that may be true. Perhaps making the table smaller would help
bring that back to what it was, or perhaps the patch needs rethought
to use bloom filters to help identify if we need to release any locks.

Opinions on that would be welcome. Meanwhile I'll go off and play with
bloom filters to see if I can knock some percentage points of the perf
report for StandbyReleaseLocks().

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

temptable_bench.sql.bz2
Description: BZip2 compressed data

updatecounter.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

Reply via email to