Hi,

for many years I have been running a backup check system for my databases that continuously does the following:

- upgrade to the latest available PG minor version (Debian PGDG)
- restore a DB from a basebackup on S3
- replay all available WAL
- perform a ton of consistency checks
- repeat the same with the next DB
- when all DBs are done, start again from the beginning

All the DBs are PG14. After 14.21 was released last week, I saw some of our bigger DBs failing after replaying a few thousand WAL files. The error message looks like this:

2026-02-14 01:53:59.595 UTC [2441074] LOG: restored log file "0000000500017F8D0000004E" from archive
2026-02-14 01:53:59.605 UTC [2441074] FATAL: could not access status of transaction 2030956544
2026-02-14 01:53:59.605 UTC [2441074] DETAIL: Could not read from file "pg_multixact/offsets/790D" at offset 245760: read too few bytes.
2026-02-14 01:53:59.605 UTC [2441074] CONTEXT: WAL redo at 17F8D/4E1E03E8 for MultiXact/CREATE_ID: 2030956543 offset 1335629905 nmembers 2: 691151655 (keysh) 691151658 (keysh)

It does not happen every time. A freshly taken backup succeeded in restoring ~3000 WAL files. In the next round it failed at ~5000 WAL files. Once it fails, the failure is reproducible: it fails at the same multixact offset again.

The multixact offsets file where it fails does not exist in the base backup. It is built during replay. In all cases I saw, the offset mentioned in the error message equals the length of the file. So PG apparently wants to read beyond the end of the file.

After rolling back to PG 14.20, everything started working again.

The release notes mention a few multixact changes from 14.20 to 14.21. I can't claim to understand the changes fully, but https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=81416e101 looks like the most likely culprit to me.

All the best,
Torsten
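
P.S. As a sanity check on the numbers in the error message, here is a small sketch that maps a multixact ID to its position in pg_multixact/offsets. It assumes a stock PG14 build (8 kB blocks, 4-byte offset entries, 32 pages per SLRU segment); none of these constants come from my logs, and the function name is made up for illustration.

```python
# Assumed defaults for a stock PostgreSQL 14 build; these values are
# assumptions and do not appear in the log output above.
BLCKSZ = 8192            # bytes per SLRU page
OFFSET_SIZE = 4          # bytes per offsets entry (MultiXactOffset is a uint32)
PAGES_PER_SEGMENT = 32   # pages per SLRU segment file

ENTRIES_PER_PAGE = BLCKSZ // OFFSET_SIZE  # 2048 entries per page


def locate_multixact_offset(multixact_id: int) -> tuple[str, int, int, int]:
    """Hypothetical helper: where does this multixact's offsets entry live?

    Returns (segment file name, page within the segment,
             byte offset of that page in the file, entry index on the page).
    """
    page = multixact_id // ENTRIES_PER_PAGE
    entry_in_page = multixact_id % ENTRIES_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    page_in_segment = page % PAGES_PER_SEGMENT
    return (f"{segment:04X}", page_in_segment,
            page_in_segment * BLCKSZ, entry_in_page)


if __name__ == "__main__":
    # Multixact from the CREATE_ID record, then the one from the FATAL line.
    for mxid in (2030956543, 2030956544):
        print(mxid, locate_multixact_offset(mxid))
```

Under those assumptions, 2030956544 maps to file 790D, page 30, byte offset 245760, which is exactly what the DETAIL line reports. The multixact in the CREATE_ID record, 2030956543, is the last entry on page 29 of the same file. So the failing read seems to target the page right after the one that holds the multixact being replayed, which would fit the observation that the offset equals the current file length.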
