Hi Chao,

Thanks for looking into this.

>
> Hi Xuneng,
>
> I just reviewed the patch. It splits the current STREAMING state into 
> CONNECTING and STREAMING, where CONNECTING represents the connection 
> establishing phase, which makes sense. The patch is solid, I have just a few 
> small comments:
>
> 1 - 0001
> ```
> +         <para>
> +          <literal>stopped</literal>: WAL receiver process is not running.
> +          This state is normally not visible because the view returns no row
> +          when the WAL receiver is not running.
> +         </para>
> ```
>
> “Is not running” appears twice in this short paragraph, looks redundant. 
> Suggestion:
> ```
> <literal>stopped</literal>: WAL receiver process is not running.
> This state is normally not visible because the view returns no row in this 
> case.
> ```
>
> 2 - 0002
> ```
> +       WALRCV_CONNECTING,                      /* connecting to primary, 
> replication not yet
> +                                                                * started */
> ```
>
> I think “primary” is inaccurate. A connection destination could be a primary, 
> or the other standby, so maybe change “primary” to “upstream server”.

Yeah, "upstream server" is more precise. There are multiple
user-facing log messages in this file that also use “primary”. I’m
wondering whether we should update those as well.

>
>
> 3 - 0002
> ```
>      * Mark walreceiver as running in shared memory.
> ```
>
> I think we should update this comment, changing “running” to “connecting”. 
> There is not a “running” state, before this patch, “streaming” can roughly 
> equal to “running”, after this patch, “running” is kinda unclear.
>

It has been changed.

> 4 - 0002
> ```
> +                       /* Connection established, switch to streaming state 
> */
> +                       SpinLockAcquire(&walrcv->mutex);
> +                       Assert(walrcv->walRcvState == WALRCV_CONNECTING);
> +                       walrcv->walRcvState = WALRCV_STREAMING;
> +                       SpinLockRelease(&walrcv->mutex);
> ```
>
> I don’t feel good with this Assert(). Looking at ShutdownWalRcv(), the 
> startup process might set the state to STOPPING, so there is a race.
>
> Before this patch, this is no state change here, thus rest logic can handle 
> STOPPING. After this patch, if the race occurs, in dev mode, the Assert is 
> fired; in production mode, STOPPING is overwritten by STREAMING, which is 
> wrong.
>
> So, instead of Assert(), I think we should check if current state is 
> CONNECTING, if not, it should not proceed.

It makes sense to me. Please check the updated patch.

--
Best,
Xuneng

Attachment: v6-0001-Add-WALRCV_CONNECTING-state-to-walreceiver.patch
Description: Binary data

Reply via email to