Hi,

I am running Postgres 17.7 and have configured incremental backup with:

wal_summary_keep_time=10d
summarize_wal=on

On the standby, the following message is logged every 10 seconds by the
walsummarizer process:

2026-02-17 11:40:22.548 CET [544677]: [297-1]
user=,db=,client=,application= ERROR:  could not read WAL from timeline 6
at 698/460000A0: invalid record length at 698/460000A0: expected at least
24, got 0
2026-02-17 11:40:32.559 CET [544677]: [298-1]
user=,db=,client=,application= ERROR:  could not read WAL from timeline 6
at 698/460000A0: invalid record length at 698/460000A0: expected at least
24, got 0

postgres  544677  544615  0 10:50 ?        00:00:00 postgres: dev_ge:
walsummarizer

The most recent summary files are:

-rw-------. 1 postgres postgres  1808 Feb  5 18:00
00000007000006998F2205F0000006998F2FCAA8.summary
-rw-------. 1 postgres postgres  1158 Feb  5 18:30
00000007000006998F2FCAA8000006998F328C48.summary
-rw-------. 1 postgres postgres   448 Feb  5 19:00
00000007000006998F328C48000006998F3419C0.summary
-rw-------. 1 postgres postgres   566 Feb  5 19:30
00000007000006998F3419C0000006998F35D2C8.summary
-rw-------. 1 postgres postgres   616 Feb  5 20:00
00000007000006998F35D2C8000006998F379C60.summary
-rw-------. 1 postgres postgres  1108 Feb  5 20:30
00000007000006998F379C60000006998F3A88C8.summary
-rw-------. 1 postgres postgres   330 Feb  5 20:43
00000007000006998F3A88C80000069990000028.summary
-rw-------. 1 postgres postgres    32 Feb  5 20:43
00000007000006999000002800000699900000A0.summary
-rw-------. 1 postgres postgres    32 Feb  5 20:43
0000000800000699900000A000000699900000D8.summary

Summary files for timeline 6:

-rw-------. 1 postgres postgres   780 Jan  8 20:10
0000000600000698452AB9680000069846000028.summary
-rw-------. 1 postgres postgres    32 Jan  8 20:10
00000006000006984600002800000698460000A0.summary
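
For readers following along, here is a minimal sketch of how the summary file names above can be decoded. It assumes the name is 8 hex digits of timeline ID followed by the start LSN and end LSN, each written as two 8-digit halves; the helper names (parse_summary_name, format_lsn) are mine and not part of PostgreSQL. Decoded this way, the last timeline 6 summary ends at 698/460000A0, the same LSN that appears in the error message.

```python
import re

# Assumed layout: TLI (8 hex) + start LSN (8+8 hex) + end LSN (8+8 hex) + ".summary".
SUMMARY_RE = re.compile(
    r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})\.summary$"
)


def parse_summary_name(name):
    """Return (tli, start_lsn, end_lsn) as integers for one summary file name."""
    m = SUMMARY_RE.match(name)
    if m is None:
        raise ValueError(f"not a WAL summary file name: {name!r}")
    tli, s_hi, s_lo, e_hi, e_lo = (int(g, 16) for g in m.groups())
    return tli, (s_hi << 32) | s_lo, (e_hi << 32) | e_lo


def format_lsn(lsn):
    """Format a 64-bit LSN in the usual hi/lo hexadecimal notation."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


if __name__ == "__main__":
    for name in (
        "0000000600000698452AB9680000069846000028.summary",
        "00000006000006984600002800000698460000A0.summary",
    ):
        tli, start, end = parse_summary_name(name)
        print(f"tli={tli} start={format_lsn(start)} end={format_lsn(end)}")
```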

pg_waldump 000000060000069800000046
rmgr: XLOG        len (rec/tot):    114/   114, tx:          0, lsn:
698/46000028, prev 698/452D6AA8, desc: CHECKPOINT_SHUTDOWN redo
698/46000028; tli 6; prev tli 6; fpw true; wal_level logical; xid
0:212216474; oid 6953284; multi 86737; offset 217538; oldest xid 16464388
in DB 18769; oldest multi 1 in DB 17257; oldest/newest commit timestamp
xid: 167966575/212216473; oldest running xid 0; shutdown
pg_waldump: error: error in WAL record at 698/46000028: invalid record
length at 698/460000A0: expected at least 24, got 0
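
To show how the failing LSN relates to the segment passed to pg_waldump, here is a small sketch of the mapping from timeline and LSN to a WAL segment file name. It assumes the default 16 MB wal_segment_size (not stated above), and the function names are hypothetical helpers. Under that assumption, 698/460000A0 on timeline 6 falls in segment 000000060000069800000046 at byte offset 0xA0.

```python
# Assumption: default WAL segment size of 16 MB; adjust if initdb used --wal-segsize.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert 'hi/lo' hexadecimal LSN text into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_segment_name(tli, lsn, seg_size=WAL_SEGMENT_SIZE):
    """Return the 24-character segment file name that contains this LSN."""
    segs_per_xlogid = 0x100000000 // seg_size
    segno = lsn // seg_size
    return f"{tli:08X}{segno // segs_per_xlogid:08X}{segno % segs_per_xlogid:08X}"


def segment_offset(lsn, seg_size=WAL_SEGMENT_SIZE):
    """Return the byte offset of this LSN inside its segment file."""
    return lsn % seg_size


if __name__ == "__main__":
    lsn = parse_lsn("698/460000A0")
    print(wal_segment_name(6, lsn), hex(segment_offset(lsn)))
```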

On the primary, the walsummarizer process works fine:

-rw-------. 1 postgres postgres   2340 Feb 17 10:05
000000090000069A7F9595980000069A7F9B2D08.summary
-rw-------. 1 postgres postgres   1568 Feb 17 10:35
000000090000069A7F9B2D080000069A7F9EF6C8.summary
-rw-------. 1 postgres postgres   5300 Feb 17 11:05
000000090000069A7F9EF6C80000069A7FA47288.summary
-rw-------. 1 postgres postgres    642 Feb 17 11:35
000000090000069A7FA472880000069A7FA6B820.summary

There is no lag between the primary and the standby.

Recovery on the standby is active and up to date:
postgres  544620  544615  0 10:50 ?        00:00:00 postgres: dev_ge:
startup recovering 000000090000069A00000080
postgres  544675  544615  0 10:50 ?        00:00:00 postgres: dev_ge:
walreceiver streaming 69A/800D0808

History files:

cat 00000007.history
1       5F4/BB000C68    no recovery target specified

2       618/9C0000A0    no recovery target specified

3       619/E70000A0    no recovery target specified

4       669/96FFFFD0    no recovery target specified

5       693/CC0000A0    no recovery target specified

6       698/460000A0    no recovery target specified

cat 00000008.history
1       5F4/BB000C68    no recovery target specified

2       618/9C0000A0    no recovery target specified

3       619/E70000A0    no recovery target specified

4       669/96FFFFD0    no recovery target specified

5       693/CC0000A0    no recovery target specified

6       698/460000A0    no recovery target specified

7       699/900000A0    no recovery target specified

cat 00000009.history
1       5F4/BB000C68    no recovery target specified

2       618/9C0000A0    no recovery target specified

3       619/E70000A0    no recovery target specified

4       669/96FFFFD0    no recovery target specified

5       693/CC0000A0    no recovery target specified

6       698/460000A0    no recovery target specified

7       699/900000A0    no recovery target specified

8       699/910000A0    no recovery target specified
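
As a cross-check of the history files above, here is a minimal sketch that extracts the switch point recorded for each parent timeline. It assumes each non-empty line has the form "parent TLI, switch LSN, reason", and parse_history is a hypothetical helper of mine. The switch point recorded for timeline 6 is 698/460000A0, which is exactly the LSN the walsummarizer reports it cannot read on timeline 6.

```python
def parse_history(text):
    """Return {parent_tli: switch_lsn_text} from the content of a .history file."""
    switch_points = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue
        switch_points[int(fields[0])] = fields[1]
    return switch_points


if __name__ == "__main__":
    sample = """
5       693/CC0000A0    no recovery target specified

6       698/460000A0    no recovery target specified
"""
    points = parse_history(sample)
    print("timeline 6 ended at", points[6])
```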

In the code:
walsummarizer.c
    /*
     * Main loop: read xlog records one by one.
     */
    while (1)
    {
        int         block_id;
        char       *errormsg;
        XLogRecord *record;
        uint8       rmid;

        ProcessWalSummarizerInterrupts();

        /* We shouldn't go backward. */
        Assert(summary_start_lsn <= xlogreader->EndRecPtr);

        /* Now read the next record. */
        record = XLogReadRecord(xlogreader, &errormsg);
        if (record == NULL)
        {

xlogreader.c
XLogReadRecord --> XLogReadAhead(XLogReaderState *state, bool nonblocking)
--> result = XLogDecodeNextRecord(state, nonblocking); => error
  --> XLogNextRecord


    /* There may be no next page if it's too small. */
    if (total_len < SizeOfXLogRecord)
    {
        report_invalid_record(state,
                              "invalid record length at %X/%08X: expected at least %u, got %u",
                              LSN_FORMAT_ARGS(RecPtr),
                              (uint32) SizeOfXLogRecord, total_len);
        goto err;
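
To make the check above concrete, here is a rough sketch of the same length test applied to raw segment bytes. It assumes the total length is the first 4-byte field of the record header and that the host is little-endian; both function names are hypothetical. Zeroed bytes at the given offset produce "got 0", which is what both the walsummarizer and pg_waldump report at 698/460000A0.

```python
import struct

# Minimum record length named in the error message ("expected at least 24").
SIZE_OF_XLOG_RECORD = 24


def read_tot_len(segment_bytes, offset):
    """Read the 4-byte total-length field at offset (assumes little-endian)."""
    (tot_len,) = struct.unpack_from("<I", segment_bytes, offset)
    return tot_len


def check_record_length(segment_bytes, offset):
    """Return an error string if the length is too small, otherwise None."""
    tot_len = read_tot_len(segment_bytes, offset)
    if tot_len < SIZE_OF_XLOG_RECORD:
        return (
            f"invalid record length: expected at least "
            f"{SIZE_OF_XLOG_RECORD}, got {tot_len}"
        )
    return None


if __name__ == "__main__":
    # Synthetic buffer standing in for a segment: one 114-byte record at 0x28,
    # followed by zeroes, similar in shape to the pg_waldump output above.
    buf = bytearray(256)
    struct.pack_into("<I", buf, 0x28, 114)
    print(check_record_length(bytes(buf), 0x28))
    print(check_record_length(bytes(buf), 0xA0))
```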

Why does the walsummarizer work fine on the primary but get stuck on the
standby?

When the walsummarizer process was stopped for a while (24h) and then
restarted, it resumed, but the summary history was lost:

-rw-------. 1 postgres postgres   580 Feb 18 11:09
000000090000069AB9B511680000069AB9B7F260.summary
-rw-------. 1 postgres postgres  1880 Feb 18 11:09
000000090000069AB9B7F2600000069AB9BB6780.summary
-rw-------. 1 postgres postgres   530 Feb 18 11:09
000000090000069ABF2082880000069ABF23C690.summary
-rw-------. 1 postgres postgres  8582 Feb 18 11:35
000000090000069ABF23C6900000069AC01B1EB0.summary
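
To illustrate what "lost" looks like in the file names, here is a small sketch that reports LSN ranges not covered by a list of summary files. It assumes the name layout of timeline ID, start LSN and end LSN in hexadecimal, ignores timelines for simplicity, and treats the four files listed above as consecutive; find_gaps is a hypothetical helper. On those four names it reports one uncovered range, from 69A/B9BB6780 to 69A/BF208288.

```python
def parse_summary_name(name):
    """Return (tli, start_lsn, end_lsn) from a summary file name (assumed layout)."""
    hexpart = name.split(".")[0]
    if len(hexpart) != 40:
        raise ValueError(f"unexpected summary file name: {name!r}")
    tli = int(hexpart[0:8], 16)
    start = int(hexpart[8:24], 16)
    end = int(hexpart[24:40], 16)
    return tli, start, end


def format_lsn(lsn):
    """Format a 64-bit LSN in hi/lo hexadecimal notation."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def find_gaps(names):
    """Return (gap_start, gap_end) pairs where consecutive summaries do not meet."""
    # Timelines are ignored here; ranges are simply ordered by start LSN.
    ranges = sorted(parse_summary_name(n)[1:] for n in names)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        if next_start > prev_end:
            gaps.append((prev_end, next_start))
    return gaps


if __name__ == "__main__":
    names = [
        "000000090000069AB9B511680000069AB9B7F260.summary",
        "000000090000069AB9B7F2600000069AB9BB6780.summary",
        "000000090000069ABF2082880000069ABF23C690.summary",
        "000000090000069ABF23C6900000069AC01B1EB0.summary",
    ]
    for gap_start, gap_end in find_gaps(names):
        print("no summary covers", format_lsn(gap_start), "to", format_lsn(gap_end))
```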

Any clue about what happened would be much appreciated.

Fabrice
