Looks good.

Reviewed-by: Pavel Tikhomirov <[email protected]>

On 5/22/26 18:14, Denis V. Lunev wrote:
> QEMU's CPR live update issues VHOST_VSOCK_SET_RUNNING(0) on the source
> before VHOST_RESET_OWNER, nulling vq->private_data via
> vhost_vsock_drop_backends(). The fast-fail in vhost_transport_send_pkt()
> added by commit 4ff28534c799 ("ms/vhost/vsock: Refuse the connection
> immediately when guest isn't ready") then rejects every host send with
> -EHOSTUNREACH until the destination calls SET_RUNNING(1) -- the entire
> CPR window becomes a hard outage for host AF_VSOCK clients
> (VSTOR-131956).
> 
> Add a cpr_paused flag set inside vhost_vsock_drop_backends() when the
> backend was previously live, cleared by vhost_vsock_start(). When set,
> vhost_transport_send_pkt() queues the skb instead of fast-failing; the
> existing kick of send_pkt_work in vhost_vsock_start() drains it on
> resume. A device that has never run keeps cpr_paused == false and the
> boot-time fast-fail behaviour is preserved.
> 
> The flag is set before dropping backends so a concurrent sender on
> another CPU never observes (NULL backend, !paused).
> 
> https://virtuozzo.atlassian.net/browse/VSTOR-131956
> Signed-off-by: Denis V. Lunev <[email protected]>
> ---
>  drivers/vhost/vsock.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index 0a518c3d1596..9d8083b065c3 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -57,6 +57,7 @@ struct vhost_vsock {
>  
>       u32 guest_cid;
>       bool seqpacket_allow;
> +     bool cpr_paused;        /* between stop and next start; queues sends */
>  };
>  
>  static u32 vhost_transport_get_local_cid(void)
> @@ -295,7 +296,9 @@ vhost_transport_send_pkt(struct sk_buff *skb)
>        * all the outcomes covered: if the backend becomes NULL right after the check,
>        * vhost_transport_do_send_pkt() will check it under the mutex anyway.
>        */
> -     if (unlikely(!data_race(vhost_vq_get_backend(&vsock->vqs[VSOCK_VQ_RX])))) {
> +     /* cpr_paused: queue across CPR; else NULL backend means not ready. */
> +     if (unlikely(!data_race(vhost_vq_get_backend(&vsock->vqs[VSOCK_VQ_RX])) &&
> +                  !READ_ONCE(vsock->cpr_paused))) {
>               rcu_read_unlock();
>               kfree_skb(skb);
>               return -EHOSTUNREACH;
> @@ -610,6 +613,8 @@ static int vhost_vsock_start(struct vhost_vsock *vsock)
>               mutex_unlock(&vq->mutex);
>       }
>  
> +     WRITE_ONCE(vsock->cpr_paused, false);
> +
>       /* Some packets may have been queued before the device was started,
>        * let's kick the send worker to send them.
>        */
> @@ -641,6 +646,9 @@ static void vhost_vsock_drop_backends(struct vhost_vsock *vsock)
>  
>       lockdep_assert_held(&vsock->dev.mutex);
>  
> +     if (vhost_vq_get_backend(&vsock->vqs[VSOCK_VQ_RX]))
> +             WRITE_ONCE(vsock->cpr_paused, true);
> +
>       for (i = 0; i < ARRAY_SIZE(vsock->vqs); i++) {
>               vq = &vsock->vqs[i];
>  
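
For readers following the state transitions, the behaviour described in
the commit message can be sketched as a small userspace model. This is
only an illustration under stated assumptions, not the kernel code: the
model_* names and struct model_vsock are hypothetical, C11 seq_cst
atomics stand in for READ_ONCE()/WRITE_ONCE() and the vq mutex, and the
send worker is reduced to a synchronous drain.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the patch; not kernel code. */
struct model_vsock {
	_Atomic(void *) backend;   /* stands in for vq->private_data */
	atomic_bool cpr_paused;    /* set between stop and next start */
	int queued;                /* packets waiting for the worker */
	int delivered;             /* packets handed to the guest */
};

/* Worker stand-in: delivers queued packets only with a live backend. */
static void model_drain(struct model_vsock *v)
{
	if (!atomic_load(&v->backend))
		return;
	v->delivered += v->queued;
	v->queued = 0;
}

/* Fast-fail only when the backend is NULL and we are not paused. */
static int model_send_pkt(struct model_vsock *v)
{
	if (!atomic_load(&v->backend) && !atomic_load(&v->cpr_paused))
		return -EHOSTUNREACH;
	v->queued++;
	model_drain(v);
	return 0;
}

/* Start: install backend, clear the flag, then kick the worker. */
static void model_start(struct model_vsock *v, void *backend)
{
	atomic_store(&v->backend, backend);
	atomic_store(&v->cpr_paused, false);
	model_drain(v);
}

/* Stop: set the flag before the backend goes NULL, so a sender
 * never observes (NULL backend, !paused) on a device that was live. */
static void model_drop_backends(struct model_vsock *v)
{
	if (atomic_load(&v->backend))
		atomic_store(&v->cpr_paused, true);
	atomic_store(&v->backend, (void *)NULL);
}
```

In this model a device that has never been started still fails sends
with -EHOSTUNREACH, while a send issued between model_drop_backends()
and the next model_start() stays queued and is delivered on resume.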

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.

_______________________________________________
Devel mailing list
[email protected]
https://lists.openvz.org/mailman/listinfo/devel
