Both pg_prewarm() and the autoprewarm background worker hold
AccessShareLock on the target relation for the entire duration of
prewarming. On large tables this can take a long time, which means
that any DDL that needs a stronger lock (TRUNCATE, DROP TABLE, ALTER TABLE,
etc.) is blocked for the full duration.

VACUUM already solves this same problem during heap truncation: it
periodically calls LockHasWaitersRelation() and backs off when a
conflicting waiter is detected (see lazy_truncate_heap()).

The attached patch applies the same pattern to pg_prewarm and autoprewarm.
Every 1024 blocks, each code path checks for a conflicting waiter and, if
one is found, releases the lock so that the DDL can proceed. If the
relation was dropped or truncated while the lock was released, the patch
reports this with an error message. If the relation was only partially
truncated, the endpoint is adjusted downward and prewarming continues.

When no DDL is waiting, the only overhead is one lock-table probe per 1024
blocks.
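
To illustrate the control flow only, here is a self-contained simulation
of the loop described above. It is a sketch, not code from the patch:
SimRelation, sim_lock_has_waiters(), sim_yield_to_ddl() and
prewarm_range() are made-up names that stand in for the real relation,
the LockHasWaitersRelation() check, and the unlock/relock step. The
assumption that truncation below the current position is treated as an
error is my reading of the description, labeled as such in the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define YIELD_CHECK_INTERVAL 1024	/* blocks between waiter checks */

typedef enum { PREWARM_OK, PREWARM_ERROR } PrewarmStatus;

/* Hypothetical stand-in for a relation; not a PostgreSQL structure. */
typedef struct SimRelation
{
	int64_t		nblocks;		/* current size in blocks */
	bool		dropped;		/* set once a simulated DROP has run */
	int64_t		ddl_arrives_at;	/* block at which a DDL waiter appears, -1 = never */
	int64_t		size_after_ddl;	/* size once the DDL has run, -1 = dropped */
	int			probes;			/* number of waiter checks performed */
} SimRelation;

/* Simulated waiter check: one probe, true if DDL is queued behind us. */
static bool
sim_lock_has_waiters(SimRelation *rel, int64_t blk)
{
	rel->probes++;
	return rel->ddl_arrives_at >= 0 && blk >= rel->ddl_arrives_at;
}

/* Simulated unlock, DDL execution, and relock. */
static void
sim_yield_to_ddl(SimRelation *rel)
{
	if (rel->size_after_ddl < 0)
		rel->dropped = true;
	else
		rel->nblocks = rel->size_after_ddl;
	rel->ddl_arrives_at = -1;	/* the waiter has been served */
}

/* Prewarm blocks first..last, returning how many blocks were processed. */
static int64_t
prewarm_range(SimRelation *rel, int64_t first, int64_t last,
			  PrewarmStatus *status)
{
	int64_t		done = 0;

	assert(first >= 0 && first <= last);
	*status = PREWARM_OK;

	for (int64_t blk = first; blk <= last; blk++)
	{
		/* Probe only once per interval, so the idle cost stays small. */
		if (blk > first &&
			(blk - first) % YIELD_CHECK_INTERVAL == 0 &&
			sim_lock_has_waiters(rel, blk))
		{
			sim_yield_to_ddl(rel);

			/* Assumption: dropped, or truncated below blk, is an error. */
			if (rel->dropped || blk >= rel->nblocks)
			{
				*status = PREWARM_ERROR;
				return done;
			}

			/* Partial truncation: clamp the endpoint and keep going. */
			if (last >= rel->nblocks)
				last = rel->nblocks - 1;
		}

		done++;					/* stands in for reading one block */
	}

	return done;
}
```

With no waiter, a 5000-block relation costs four probes in this model
(at blocks 1024, 2048, 3072 and 4096), which matches the "one probe per
1024 blocks" overhead claimed above.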

While developing this patch I discovered that LockHasWaiters() crashes with
a segfault when the lock in question was acquired via the fast-path
optimization; details are in [1].

The patch includes a TAP test (t/002_lock_yield.pl) that exercises the
TRUNCATE and DROP TABLE scenarios using injection points.

Thanks,
Satya

[1]:
https://www.postgresql.org/message-id/CAHg%2BQDe_%3DZahnRx37bzrqYenKn_S5YDQ00fTfwe-ZUmjqO%3DqLg%40mail.gmail.com

Attachment: 0001-pg_prewarm-yield-lock-for-ddl.patch