Ilya Maximets <[email protected]> writes:

> The commit referenced in the Fixes tag attempted to fix the issue in
> the following sequence of calls:
>
>     do_output
>     -> ovs_vport_send
>        -> dev_queue_xmit
>           -> __dev_queue_xmit
>              -> netdev_core_pick_tx
>                 -> skb_tx_hash
>
> When a device is unregistering, 'dev->real_num_tx_queues' goes to
> zero and the 'while (unlikely(hash >= qcount))' loop inside the
> 'skb_tx_hash' becomes infinite, locking up the core forever.
>
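
For reference, a minimal sketch of the loop pattern described above; the
helper name and standalone form are illustrative only and assume the usual
kernel 'u16'/'u32' types and the 'unlikely()' macro, not the exact upstream
skb_tx_hash() code:

    /* Illustrative sketch: why a zero queue count never terminates.
     * 'qcount' stands in for dev->real_num_tx_queues, which drops to
     * zero while the device is unregistering.
     */
    static u16 pick_tx_queue_sketch(u32 hash, u16 qcount)
    {
            while (unlikely(hash >= qcount))   /* always true if qcount == 0 */
                    hash -= qcount;            /* hash -= 0: no progress */
            return (u16)hash;
    }

With qcount == 0 the loop condition can never become false, so the core
executing do_output() spins here forever.
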
> But unfortunately, checking just the carrier status is not enough to
> fix the issue, because some devices may still be in an unregistering
> state while reporting the carrier as OK.
>
> One example of such a device is net/dummy.  It sets the carrier ON
> on start, but it doesn't implement .ndo_stop to set the carrier off.
> That makes sense, because dummy doesn't really have a carrier.
> Therefore, while this device is unregistering, it's still easy to hit
> the infinite loop in the skb_tx_hash() from the OVS datapath.  There
> might be other drivers that do the same, but dummy by itself is
> important for the OVS ecosystem, because it is frequently used as a
> packet sink for tcpdump while debugging OVS deployments.  And when the
> issue is hit, the only way to recover is to reboot.
>
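
To illustrate that pattern, here is a hypothetical driver fragment (not the
actual drivers/net/dummy.c code; the names are made up and it assumes
<linux/netdevice.h>): the carrier is raised when the device is opened and
nothing ever lowers it, so netif_carrier_ok() keeps returning true even
while the device is being unregistered.

    /* Hypothetical example of the pattern described above. */
    static int example_open(struct net_device *dev)
    {
            netif_carrier_on(dev);  /* carrier reported as OK from now on */
            return 0;
    }

    static const struct net_device_ops example_netdev_ops = {
            .ndo_open = example_open,
            /* No .ndo_stop calling netif_carrier_off(), so the carrier
             * check alone cannot tell that the device is going away.
             */
    };
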
> Fix that by also checking if the device is running.  The running
> state is handled by the net core during unregistering, so it covers
> unregistering case better, and we don't really need to send packets
> to devices that are not running anyway.
>
> While only checking the running state might be enough, the carrier
> check is preserved.  The running and the carrier states seem to be
> handled independently throughout the core code and in different
> drivers.  And other core functions
> like __dev_direct_xmit() check both before attempting to transmit
> a packet.  So, it seems safer to check both flags in OVS as well.
>
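
As a side note, the combined check mentioned above has roughly the
following shape in the core transmit paths (paraphrased, not an exact
quote of __dev_direct_xmit()):

    /* Roughly the guard used before handing an skb to the driver. */
    if (unlikely(!netif_running(dev) || !netif_carrier_ok(dev)))
            goto drop;

which matches the condition the patch below adds to do_output().
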
> Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
> Reported-by: Friedrich Weber <[email protected]>
> Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
> Signed-off-by: Ilya Maximets <[email protected]>
> ---
>  net/openvswitch/actions.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 16e260014684..704c858cf209 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -934,7 +934,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
>  {
>       struct vport *vport = ovs_vport_rcu(dp, out_port);
>  
> -     if (likely(vport && netif_carrier_ok(vport->dev))) {
> +     if (likely(vport &&
> +                netif_running(vport->dev) &&
> +                netif_carrier_ok(vport->dev))) {
>               u16 mru = OVS_CB(skb)->mru;
>               u32 cutlen = OVS_CB(skb)->cutlen;

Reviewed-by: Aaron Conole <[email protected]>

I tried with my VMs to reproduce the issue as described in the email
report, but I probably didn't give enough resources (or gave too many)
to get the race condition to bubble up.  I was also using kernel
6.13-rc5 (0bc21e701a6f).

In any case, I think the change makes sense.
