> On Sep 29, 2025, at 08:48, Michael Paquier <[email protected]> wrote:
>
>
> The failure is not surprising, because the stats reports can happen in
> a concurrent fashion when a point is run for example, contrary to
> other fixed-sized stats kind where the reports are only done by a
> single process (archiver, bgwriter, checkpointer). So this is just a
> matter of acquiring a lock that was forgotten, to make sure that the
> changes are consistent. Far from critical as this is template code,
> still embarrassing.
>
> Thoughts or comments?
I saw pg_state_begin_changecount_write() is called multiple places, as you
mention, for example bgwriter. But there are not the same lock acquired in
other places, for example, in bgwriter:
void
pgstat_report_bgwriter(void)
{
PgStatShared_BgWriter *stats_shmem = &pgStatLocal.shmem->bgwriter;
Assert(!pgStatLocal.shmem->is_shutdown);
pgstat_assert_is_up();
/*
* This function can be called even if nothing at all has happened. In
* this case, avoid unnecessarily modifying the stats entry.
*/
if (pg_memory_is_all_zeros(&PendingBgWriterStats,
sizeof(struct PgStat_BgWriterStats)))
return;
pgstat_begin_changecount_write(&stats_shmem->changecount);
#define BGWRITER_ACC(fld) stats_shmem->stats.fld += PendingBgWriterStats.fld
BGWRITER_ACC(buf_written_clean);
BGWRITER_ACC(maxwritten_clean);
BGWRITER_ACC(buf_alloc);
#undef BGWRITER_ACC
pgstat_end_changecount_write(&stats_shmem->changecount);
Only adding the lock in pg_report_inj_fixed() won’t prevent the race conditions
from bgwriter. So I wonder, do we need to add the same lock in the other places?
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/