> On Jan 23, 2026, at 06:54, Michael Paquier <[email protected]> wrote:
>
> On Thu, Jan 22, 2026 at 06:02:54PM +0800, Xuneng Zhou wrote:
>>> Before this patch, this is no state change here, thus rest logic
>>> can handle STOPPING. After this patch, if the race occurs, in dev
>>> mode, the Assert is fired; in production mode, STOPPING is overwritten
>>> by STREAMING, which is wrong.
>>>
>>> So, instead of Assert(), I think we should check if current state
>>> is CONNECTING, if not, it should not proceed.
>>
>> It makes sense to me. Please check the updated patch.
>
> + if (state != WALRCV_CONNECTING)
> + {
> + if (state == WALRCV_STOPPING)
> + proc_exit(0);
> + elog(FATAL, "unexpected walreceiver state");
> + }
>
> Sorry, but this addition does not make sense to me. Before this
> patch, if the startup process decides to mark a WAL receiver as
> stopping after it has been started in WalReceiverMain(), then we would
> run at least one loop until we reach WalRcvWaitForStartPosition(),
> which is the path where the WAL receiver exits. Aka, I don't think we
> should increase the number of paths where we decide if a WAL receiver
> should fast-exit or not (I'd rather try to reduce them, but I don't
> think we can do that currently based on the ping-pong game between the
> startup process and WAL receiver process). Saying that, you are right
> about the fact that the assertion is wrong, and that we should not
> upgrade the status to STREAMING if the WAL receiver is not CONNECTING.
>
> So it seems to me that this code should remain as simple as followed,
> documenting that we switch to STREAMING when the first connection has
> been established after the WAL receiver has started, or when the WAL
> receiver is switched after WalRcvWaitForStartPosition() once
> startstreaming() has acknowledged that streaming is happening (I would
> add a comment saying that):
> if (state == WALRCV_CONNECTING)
> walrcv->walRcvState = WALRCV_STREAMING;
> --
> Michael
+1.
If current state is not CONNECTING, then leave it as is, which will then follow
the original logic. Unless we find some issue in future, I think we can take
this approach for now.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/