On 2025/06/09 7:41, Michael Paquier wrote:
On Sun, Jun 08, 2025 at 10:30:45PM +0800, Xuneng Zhou wrote:
This is more problematic in logical replication scenarios where these
waits can be very long - for example, when creating a logical
replication slot on a busy system. Without a specific wait event, it's
hard to distinguish legitimate wait from other issues.

Gotcha.

Based on suggestions from Fujii and Kevin [1], the patch introduces
WAIT_EVENT_XACT_DONE ("Waiting for a transaction to commit or abort")
and instructs both functions to report this event during their
pg_usleep() calls  With patch applied, when backends are waiting in
these functions, pg_stat_activity will show what they're waiting for.

+     pgstat_report_wait_start(WAIT_EVENT_XACT_DONE);
[...]
+     pgstat_report_wait_start(WAIT_EVENT_XACT_DONE);

Wouldn't it be better to use two wait events named differently to be
able to make the difference between the two code paths?

I think it's fine to use the same wait event, since both code paths
are waiting for the same thing, even though they're in different places.


Here are a few review comments on the patch:

 WAL_RECEIVER_WAIT_START        "Waiting for startup process to send initial data 
for streaming replication."
 WAL_SUMMARY_READY      "Waiting for a new WAL summary to be generated."
 XACT_GROUP_UPDATE      "Waiting for the group leader to update transaction status 
at transaction end."
+XACT_DONE      "Waiting for a transaction to commit or abort."

XACT_DONE should be listed before XACT_GROUP_UPDATE to maintain
alphabetical order.


Also, for the existing description:
transactionid   "Waiting for a transaction to finish."

This could be confusing alongside XACT_DONE. Maybe update it to
something like:

"Waiting to acquire a transaction ID lock; see <xref 
linkend='transaction-id'/>."

This would help users better understand the difference between
the two wait events.

Regards,

--
Fujii Masao
NTT DATA Japan Corporation



Reply via email to