On 2025/06/09 7:41, Michael Paquier wrote:
On Sun, Jun 08, 2025 at 10:30:45PM +0800, Xuneng Zhou wrote:
This is more problematic in logical replication scenarios where these
waits can be very long - for example, when creating a logical
replication slot on a busy system. Without a specific wait event, it's
hard to distinguish legitimate wait from other issues.
Gotcha.
Based on suggestions from Fujii and Kevin [1], the patch introduces
WAIT_EVENT_XACT_DONE ("Waiting for a transaction to commit or abort")
and instructs both functions to report this event during their
pg_usleep() calls With patch applied, when backends are waiting in
these functions, pg_stat_activity will show what they're waiting for.
+ pgstat_report_wait_start(WAIT_EVENT_XACT_DONE);
[...]
+ pgstat_report_wait_start(WAIT_EVENT_XACT_DONE);
Wouldn't it be better to use two wait events named differently to be
able to make the difference between the two code paths?
I think it's fine to use the same wait event, since both code paths
are waiting for the same thing, even though they're in different places.
Here are a few review comments on the patch:
WAL_RECEIVER_WAIT_START "Waiting for startup process to send initial data
for streaming replication."
WAL_SUMMARY_READY "Waiting for a new WAL summary to be generated."
XACT_GROUP_UPDATE "Waiting for the group leader to update transaction status
at transaction end."
+XACT_DONE "Waiting for a transaction to commit or abort."
XACT_DONE should be listed before XACT_GROUP_UPDATE to maintain
alphabetical order.
Also, for the existing description:
transactionid "Waiting for a transaction to finish."
This could be confusing alongside XACT_DONE. Maybe update it to
something like:
"Waiting to acquire a transaction ID lock; see <xref
linkend='transaction-id'/>."
This would help users better understand the difference between
the two wait events.
Regards,
--
Fujii Masao
NTT DATA Japan Corporation