> On Sep 29, 2025, at 08:48, Michael Paquier <[email protected]> wrote:
> 
> 
> The failure is not surprising, because the stats reports can happen in
> a concurrent fashion when a point is run for example, contrary to
> other fixed-sized stats kind where the reports are only done by a
> single process (archiver, bgwriter, checkpointer).  So this is just a
> matter of acquiring a lock that was forgotten, to make sure that the
> changes are consistent.  Far from critical as this is template code,
> still embarrassing.
> 
> Thoughts or comments?


I saw pg_state_begin_changecount_write() is called multiple places, as you 
mention, for example bgwriter. But there are not the same lock acquired in 
other places, for example, in bgwriter:

void
pgstat_report_bgwriter(void)
{
    PgStatShared_BgWriter *stats_shmem = &pgStatLocal.shmem->bgwriter;

    Assert(!pgStatLocal.shmem->is_shutdown);
    pgstat_assert_is_up();

    /*
     * This function can be called even if nothing at all has happened. In
     * this case, avoid unnecessarily modifying the stats entry.
     */
    if (pg_memory_is_all_zeros(&PendingBgWriterStats,
                               sizeof(struct PgStat_BgWriterStats)))
        return;

    pgstat_begin_changecount_write(&stats_shmem->changecount);

#define BGWRITER_ACC(fld) stats_shmem->stats.fld += PendingBgWriterStats.fld
    BGWRITER_ACC(buf_written_clean);
    BGWRITER_ACC(maxwritten_clean);
    BGWRITER_ACC(buf_alloc);
#undef BGWRITER_ACC

    pgstat_end_changecount_write(&stats_shmem->changecount);

Only adding the lock in pg_report_inj_fixed() won’t prevent the race conditions 
from bgwriter. So I wonder, do we need to add the same lock in the other places?

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




Reply via email to