> > > There is also a discussion [0] about wait event/activity field
> > > inconsistency with pg_stat_activity with a repro in [1].
> >
> > The repro I was referring to in [1] is actually
> > https://www.postgresql.org/message-id/ab1c0a7d-e789-5ef5-1180-42708ac6fe2d%40postgrespro.ru
>
> That is inherent. The wait event is updated in an unsynchronized fashion. As
> noted in that thread.
>
> Making it synchronized (via st_changecount) would make wait event overhead
> vastly higher.
Correct, and I don't think we should pay the overhead of ensuring
strict synchronization.
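For context, strict synchronization would mean pushing every wait-event
transition through the changecount write protocol. A minimal sketch of what
that would look like, using the existing PGSTAT_BEGIN/END_WRITE_ACTIVITY
macros from backend_status.h; the function name is hypothetical, and it
assumes the st_wait_event_info field from Heikki's patch [0]:
```
#include "postgres.h"
#include "miscadmin.h"
#include "utils/backend_status.h"

/*
 * Hypothetical only: a changecount-protected wait event update. Every
 * transition would pay a critical section, two write barriers, and two
 * changecount increments, on a path hit for every lock and I/O wait.
 */
static inline void
pgstat_report_wait_start_strict(volatile PgBackendStatus *beentry,
								uint32 wait_event_info)
{
	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
	/* st_wait_event_info is the field added by the patch in [0] */
	beentry->st_wait_event_info = wait_event_info;
	PGSTAT_END_WRITE_ACTIVITY(beentry);
}
```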
The patch that Heikki is proposing [0] updates wait_event_info while
bypassing st_changecount:
```
+++ b/src/include/utils/backend_status.h
@@ -174,6 +174,15 @@ typedef struct PgBackendStatus
/* plan identifier, optionally computed using planner_hook */
int64 st_plan_id;
+
+ /*
+ * Proc's wait information. This is *not* protected by the changecount
+ * mechanism, because reading and writing a uint32 is assumed to be
+ * atomic. This is updated very frequently, so we want to keep the
+ * overhead as small as possible.
+ */
+ uint32 st_wait_event_info;
+
```
This is the same assumption we already make today in pgstat_report_wait_start():
```
static inline void
pgstat_report_wait_start(uint32 wait_event_info)
{
	/*
	 * Since this is a four-byte field which is always read and written as
	 * four bytes, updates are atomic.
	 */
	*(volatile uint32 *) my_wait_event_info = wait_event_info;
}
```
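On the read side, pg_stat_activity copies each entry under a changecount
retry loop, while the wait event is fetched separately. A simplified sketch,
loosely modeled on pgstat_read_current_status() and pg_stat_get_activity()
(the function name sample_backend is mine, not the actual source):
```
#include "postgres.h"
#include "miscadmin.h"
#include "storage/proc.h"
#include "utils/backend_status.h"

static void
sample_backend(volatile PgBackendStatus *beentry, PGPROC *proc,
			   PgBackendStatus *local, uint32 *raw_wait_event)
{
	for (;;)
	{
		int			before_changecount;
		int			after_changecount;

		pgstat_begin_read_activity(beentry, before_changecount);
		memcpy(local, unvolatize(PgBackendStatus *, beentry), sizeof(*local));
		pgstat_end_read_activity(beentry, after_changecount);

		/* retry until the writer did not modify the entry mid-copy */
		if (pgstat_read_activity_complete(before_changecount, after_changecount))
			break;
		CHECK_FOR_INTERRUPTS();
	}

	/*
	 * The wait event is read outside the retry loop, so it can reflect a
	 * different instant than the state copied above.
	 */
	*raw_wait_event = *((volatile uint32 *) &proc->wait_event_info);
}
```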
So without st_changecount, wait_events could still be out of sync with
other backend status fields (e.g., state), but my testing shows better
results with [0] applied. "Better" here means no samples with state =
active, wait_event = ClientRead, and backend_type = client backend, using
the attached repro.sh:
## without patch
```
# of samples of state = active, wait_event = ClientRead and backend_type = client backend
238
# of samples of state = active + wait_event
11774 transactionid
4054 tuple
489 CPU
444 LockManager
343 WALWrite
238 ClientRead
69 WalSync
15 BufferExclusive
2 VacuumDelay
2 BufferShared
1 DataFileWrite
```
## with patch [0]
```
# of samples of state = active, wait_event = ClientRead and backend_type = client backend
0
# of samples of state = active + wait_event
12516 transactionid
3188 tuple
504 WALWrite
414 CPU
411 LockManager
86 WalSync
7 BufferExclusive
4 BufferShared
2 VacuumDelay
```
Also, it's worth noting that we already bypass st_changecount for the
VACUUM and ANALYZE delay times. This was discussed in [2].
vacuumlazy.c:
```
if (track_cost_delay_timing)
{
	/*
	 * We bypass the changecount mechanism because this value is
	 * only updated by the calling process.
	 */
	appendStringInfo(&buf, _("delay time: %.3f ms\n"),
					 (double) MyBEEntry->st_progress_param[PROGRESS_VACUUM_DELAY_TIME] / 1000000.0);
}
```
commands/analyze.c:
```
if (track_cost_delay_timing)
{
	/*
	 * We bypass the changecount mechanism because this value is
	 * only updated by the calling process. We also rely on the
	 * above call to pgstat_progress_end_command() to not clear
	 * the st_progress_param array.
	 */
	appendStringInfo(&buf, _("delay time: %.3f ms\n"),
					 (double) MyBEEntry->st_progress_param[PROGRESS_ANALYZE_DELAY_TIME] / 1000000.0);
}
```
[0] https://www.postgresql.org/message-id/459e78c0-927f-4347-86df-ca431567c95a%40iki.fi
[1] https://www.postgresql.org/message-id/ab1c0a7d-e789-5ef5-1180-42708ac6fe2d%40postgrespro.ru
[2] https://www.postgresql.org/message-id/Z6-tiXbLwHYyDeNy%40nathan
--
Sami Imseih
Amazon Web Services (AWS)