Hi hackers,

I've encountered a bug in PostgreSQL's streaming replication where cascading
standbys fail to reconnect after falling back to archive recovery. The issue
occurs when the upstream standby uses archive-only recovery.

The standby requests streaming from the wrong WAL position (the next segment
boundary instead of the current position), causing connection failures with
this error:

    ERROR: requested starting point 0/A000000 is ahead of the WAL flush
    position of this server 0/9000000

Attached are two shell scripts that reliably reproduce the issue on
PostgreSQL 17.x and 18.x:

1. reproducer_restart_upstream_portable.sh - triggers the bug by restarting
   the upstream standby
2. reproducer_cascade_restart_portable.sh - triggers the bug by restarting
   the cascade itself

The scripts set up this topology:
- Primary with archiving enabled
- Standby using only archive recovery (no streaming from primary)
- Cascading standby streaming from the archive-only standby

When the cascade loses its streaming connection and falls back to archive
recovery, it cannot reconnect. The issue appears to be in xlogrecovery.c
around line 3880, where the position passed to RequestXLogStreaming()
determines which segment boundary is requested.

The cascade restart reproducer shows that even restarting the cascade itself
triggers the bug, which affects routine maintenance operations.

Scripts require PostgreSQL binaries in PATH and use ports 15432-15434.

Best regards,
Marco
# BUG: Cascading standby fails to reconnect after falling back to archive recovery

## Summary

PostgreSQL has a bug where cascading standbys fail to reconnect to their upstream after falling back to archive recovery. The standby requests streaming from an incorrect WAL position (next segment boundary instead of current position), causing connection failures when the upstream uses archive-only recovery.

## Affected Versions

Confirmed on PostgreSQL 17.x and 18.x. The relevant code exists in earlier versions as well.

## Reproduction

The bug occurs in cascading replication setups where:
1. Primary server with archiving enabled
2. Standby server using archive-only recovery (no streaming from primary)
3. Cascading standby streaming from the archive-only standby

### Steps to Reproduce

Setup topology:
```
Primary (15432) → Archive
                    ↓
            Standby (15433) [archive-only]
                    ↓
            Cascade (15434) [streaming]
```
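
The attached scripts are the authoritative reproducers; the sketch below only
illustrates the shape of the configuration they build, assuming otherwise
default settings. The data directories and archive path under /tmp are
illustrative; the ports and socket directory match the scripts.

```
# Primary (15432) with archiving enabled.
ARCHIVE=/tmp/wal_archive && mkdir -p "$ARCHIVE"
initdb -D /tmp/primary
cat >> /tmp/primary/postgresql.conf <<EOF
port = 15432
unix_socket_directories = '/tmp'
archive_mode = on
archive_command = 'cp %p $ARCHIVE/%f'
EOF
pg_ctl start -D /tmp/primary -l /tmp/primary.log

# Standby (15433): archive-only recovery, deliberately no primary_conninfo.
pg_basebackup -D /tmp/standby -h /tmp -p 15432
cat >> /tmp/standby/postgresql.conf <<EOF
port = 15433
restore_command = 'cp $ARCHIVE/%f %p'
EOF
touch /tmp/standby/standby.signal
pg_ctl start -D /tmp/standby -l /tmp/standby.log

# Cascade (15434): streams from the standby, with the archive as fallback.
pg_basebackup -D /tmp/cascade -h /tmp -p 15433
cat >> /tmp/cascade/postgresql.conf <<EOF
port = 15434
primary_conninfo = 'host=/tmp port=15433'
restore_command = 'cp $ARCHIVE/%f %p'
EOF
touch /tmp/cascade/standby.signal
pg_ctl start -D /tmp/cascade -l /tmp/cascade.log
```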

The bug triggers when:
1. The cascading standby loses its streaming connection
2. Falls back to archive recovery
3. Attempts to reconnect for streaming

### Error Message

```
FATAL: could not receive data from WAL stream:
ERROR: requested starting point 0/A000000 is ahead of the WAL flush position of this server 0/9000000
```

The cascade requests position 0/A000000 (the start of the next 16 MB segment), while the upstream's flush position is still in the current segment at 0/9000000.
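
The off-by-one-segment relationship between the two LSNs can be checked with
plain shell arithmetic, assuming the default 16 MB wal_segment_size (an LSN
"X/Y" is the 64-bit byte position (X << 32) + Y):

```
$ echo $(( 0x9000000 / (16 * 1024 * 1024) ))   # upstream flush position
9
$ echo $(( 0xA000000 / (16 * 1024 * 1024) ))   # position the cascade requests
10
```

The requested LSN is the first byte of segment 10, a full segment ahead of
where the upstream actually is.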

## Reproducer Scripts

Two shell scripts are provided that demonstrate the issue:

1. **reproducer_restart_upstream_portable.sh** - Triggers bug by restarting the upstream standby
2. **reproducer_cascade_restart_portable.sh** - Triggers bug by restarting the cascade itself

Both scripts:
- Create a three-node setup (primary, archive-only standby, cascading standby)
- Use ports 15432-15434 and /tmp for unix sockets
- Require PostgreSQL binaries in PATH
- Reliably reproduce the error
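
The orchestration details live in the scripts; the trigger itself is just a
restart of one of the nodes after some WAL has been generated, roughly along
these lines (the data-directory paths are placeholders):

```
# Variant 1: restart the archive-only upstream. The cascade loses its
# streaming connection, falls back to archive recovery, and then fails to
# reconnect with the "requested starting point ..." error.
pg_ctl restart -D /tmp/standby -m fast

# Variant 2: restart the cascade itself. It replays from the archive and
# then requests streaming from the wrong segment boundary on reconnect.
pg_ctl restart -D /tmp/cascade -m fast
```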

## Technical Details

When a cascading standby falls back to archive recovery and then attempts to reconnect:

1. The cascade reads WAL from the archive up to some position (e.g., 0/9XXXXXX)
2. It calls RequestXLogStreaming() to reconnect
3. That function rounds the position to a segment boundary
4. Because the wrong position variable is passed, the request lands on the next segment (0/A000000)
5. The archive-only upstream cannot provide that future segment
6. The connection fails and the retry loop continues

The issue appears to be in `src/backend/access/transam/xlogrecovery.c` around line 3880, where the position passed to RequestXLogStreaming() determines which segment boundary is requested.
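
While the retry loop is running, the mismatch can be observed from the
outside with something like the queries below (a rough sketch;
pg_stat_wal_receiver is only populated while a walreceiver process is alive,
so the first query may need to be repeated):

```
# On the cascade (15434): the start point its walreceiver requested.
psql -h /tmp -p 15434 -d postgres -Atc \
  "SELECT status, receive_start_lsn FROM pg_stat_wal_receiver;"

# On the archive-only upstream (15433): how far it has replayed from the
# archive, which is roughly the most it can serve to the cascade.
psql -h /tmp -p 15433 -d postgres -Atc \
  "SELECT pg_last_wal_replay_lsn();"
```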

## Impact

Affects cascading replication topologies where the intermediate standby does not stream directly from the primary, including:
- Intermediate standbys using archive-only recovery
- Disaster recovery setups that relay WAL through archive storage
- Replicas cascading through non-streaming standbys

Once triggered, the cascading standby cannot reconnect without manual intervention. The cascade restart reproducer shows that even restarting the cascade itself triggers the bug, affecting routine maintenance operations.

## Example Log Output

From cascading standby after connection loss:

```
2025-01-27 10:15:23.456 CASCADE: starting WAL streaming at 0/9000000 (timeline 1)
2025-01-27 10:15:23.457 CASCADE: could not receive data from WAL stream: ERROR: requested starting point 0/A000000 is ahead of the WAL flush position of this server 0/9000148
2025-01-27 10:15:28.462 CASCADE: starting WAL streaming at 0/9000000 (timeline 1)
2025-01-27 10:15:28.463 CASCADE: could not receive data from WAL stream: ERROR: requested starting point 0/A000000 is ahead of the WAL flush position of this server 0/9000148
```

The pattern repeats indefinitely with the cascade requesting 0/A000000 while the upstream remains at 0/9XXXXXX.

Attachment: reproducer_restart_upstream_portable.sh
Description: application/shellscript

Attachment: reproducer_cascade_restart_portable.sh
Description: application/shellscript
