On 6/12/19 10:29 AM, Jamison, Kirk wrote: > >> From a user POW, the main issue with relation truncation is that it can block >> queries on standby server during truncation replay. >> >> It could be interesting if you can test this case and give results of your >> path. >> Maybe by performing read queries on standby server and counting wait_event >> with pg_wait_sampling? > > Thanks for the suggestion. I tried using the extension pg_wait_sampling, > But I wasn't sure that I could replicate the problem of blocked queries on > standby server. > Could you advise? > Here's what I did for now, similar to my previous test with hot standby setup, > but with additional read queries of wait events on standby server. > > 128MB shared_buffers > SELECT create_tables(10000); > SELECT insert_tables(10000); > SELECT delfrom_tables(10000); > > [Before VACUUM] > Standby: SELECT the following view from pg_stat_waitaccum > > wait_event_type | wait_event | calls | microsec > -----------------+-----------------+-------+---------- > Client | ClientRead | 2 | 20887759 > IO | DataFileRead | 175 | 2788 > IO | RelationMapRead | 4 | 26 > IO | SLRURead | 2 | 38 > > Primary: Execute VACUUM (induces relation truncates) > > [After VACUUM] > Standby: > wait_event_type | wait_event | calls | microsec > -----------------+-----------------+-------+---------- > Client | ClientRead | 7 | 77662067 > IO | DataFileRead | 284 | 4523 > IO | RelationMapRead | 10 | 51 > IO | SLRURead | 3 | 57 >
(Sorry for the delay, I forgot to answer you)
As far as I remember, you should see "relation" wait events (type lock) on
standby server. This is due to startup process acquiring AccessExclusiveLock for
the truncation and other backend waiting to acquire a lock to read the table.
On primary server, vacuum is able to cancel truncation:
/*
* We need full exclusive lock on the relation in order to do
* truncation. If we can't get it, give up rather than waiting --- we
* don't want to block other backends, and we don't want to deadlock
* (which is quite possible considering we already hold a lower-grade
* lock).
*/
vacrelstats->lock_waiter_detected = false;
lock_retry = 0;
while (true)
{
if (ConditionalLockRelation(onerel, AccessExclusiveLock))
break;
/*
* Check for interrupts while trying to (re-)acquire the exclusive
* lock.
*/
CHECK_FOR_INTERRUPTS();
if (++lock_retry > (VACUUM_TRUNCATE_LOCK_TIMEOUT /
VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL))
{
/*
* We failed to establish the lock in the specified number of
* retries. This means we give up truncating.
*/
vacrelstats->lock_waiter_detected = true;
ereport(elevel,
(errmsg("\"%s\": stopping truncate due to conflicting lock
request",
RelationGetRelationName(onerel))));
return;
}
pg_usleep(VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL * 1000L);
}
To maximize chances to reproduce we can use big shared_buffers. But I am afraid
it is not easy to perform reproducible tests to compare results. Unfortunately I
don't have servers to perform tests.
Regards,
signature.asc
Description: OpenPGP digital signature
