On 5/10/26 00:44, Michael S. Tsirkin wrote:
> On Sat, May 09, 2026 at 06:31:47PM +0200, Simon Schippers wrote:
>> On 5/8/26 17:10, Simon Schippers wrote:
>>> +static void tun_queue_purge(struct tun_struct *tun, struct tun_file *tfile)
>>>  {
>>>     void *ptr;
>>>  
>>> -   while ((ptr = ptr_ring_consume(&tfile->tx_ring)) != NULL)
>>> +   while ((ptr = tun_ring_consume(tun, tfile)) != NULL)
>>>             tun_ptr_free(ptr);
>>>  
>>>     skb_queue_purge(&tfile->sk.sk_write_queue);
>>
>> Sashiko is right once again. tun_ring_consume() in tun_queue_purge()
>> operates on a tfile that is being torn down. Its queue_index is no
>> longer valid. After the swap in __tun_detach(), it points to the
>> netdev subqueue of a different tfile.
>> --> We should not wake there.
> 
> Does it not exactly point at ntfile which is what we want to wake?
> 

I see your point. But calling tun_ring_consume() as done here is still
wrong, because it does not wake the queue if the tx_ring of the tfile
that is being torn down is empty. We could change tun_ring_consume()
to call __tun_wake_queue() with consumed=0 if !ptr, but I think this
would slow down the consumer path.

> 
>> I will swap tun_ring_consume() with ptr_ring_consume() again and
>> submit a v12 :)
> 
> If so then maybe
> netif_tx_wake_queue(netdev_get_tx_queue(tun->dev, index));
> 

But we should only do this if there is space in the ntfile.
My approach:

@@ -586,12 +588,18 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
                BUG_ON(index >= tun->numqueues);
 
                rcu_assign_pointer(tun->tfiles[index],
                                   tun->tfiles[tun->numqueues - 1]);
                ntfile = rtnl_dereference(tun->tfiles[index]);
+               spin_lock(&ntfile->tx_ring.consumer_lock);
                ntfile->queue_index = index;
                ntfile->xdp_rxq.queue_index = index;
+               ntfile->cons_cnt = 0;
+               if (__ptr_ring_empty(&ntfile->tx_ring)) {
+                       netif_wake_subqueue(tun->dev, index);
+               }
+               spin_unlock(&ntfile->tx_ring.consumer_lock);
                rcu_assign_pointer(tun->tfiles[tun->numqueues - 1],
                                   NULL);

ntfile->cons_cnt is invalid at this point, because the new queue might
not be stopped. That is why I reset it to 0.
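
As a rough illustration of the intent only, here is a userspace toy model
of the swap. It is not kernel code: toy_tfile, toy_tun, toy_detach() and
the stopped[] array are made-up names for this sketch, ring_len stands in
for the ptr_ring contents, and locking and RCU are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 4

/* Hypothetical stand-in for struct tun_file: only the fields discussed. */
struct toy_tfile {
	int queue_index;
	int cons_cnt;	/* consumed-entry counter used for wake decisions */
	int ring_len;	/* entries currently pending in the tx ring */
};

/* Hypothetical stand-in for struct tun_struct plus the netdev subqueues. */
struct toy_tun {
	struct toy_tfile *tfiles[TOY_MAX_QUEUES];
	bool stopped[TOY_MAX_QUEUES];	/* per-subqueue "stopped" flag */
	int numqueues;
};

/*
 * Model of the swap: the last tfile moves into slot 'index'. Its counter
 * was accumulated against a different subqueue, so it is reset, and the
 * subqueue at 'index' is woken only if the moved ring is empty.
 */
static void toy_detach(struct toy_tun *tun, int index)
{
	struct toy_tfile *ntfile = tun->tfiles[tun->numqueues - 1];

	tun->tfiles[index] = ntfile;
	ntfile->queue_index = index;
	ntfile->cons_cnt = 0;
	if (ntfile->ring_len == 0)
		tun->stopped[index] = false;
	tun->tfiles[tun->numqueues - 1] = NULL;
	tun->numqueues--;
}
```

In this model, detaching slot 0 while it is stopped and the last tfile's
ring is empty leaves slot 0 woken, with the moved tfile's cons_cnt at 0.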
