On 2/25/25 15:13, Hangbin Liu wrote:
> On Tue, Feb 25, 2025 at 01:05:24PM +0200, Nikolay Aleksandrov wrote:
>>> @@ -592,15 +611,17 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
>>>     real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
>>>  out:
>>>     netdev_put(real_dev, &tracker);
>>> -   mutex_lock(&bond->ipsec_lock);
>>> -   list_for_each_entry(ipsec, &bond->ipsec_list, list) {
>>> -           if (ipsec->xs == xs) {
>>> -                   list_del(&ipsec->list);
>>> -                   kfree(ipsec);
>>> -                   break;
>>> -           }
>>> -   }
>>> -   mutex_unlock(&bond->ipsec_lock);
>>> +
>>> +   xfrm_work = kmalloc(sizeof(*xfrm_work), GFP_ATOMIC);
>>> +   if (!xfrm_work)
>>> +           return;
>>> +
>>
>> What happens if this allocation fails? I think you'll leak memory and
>> potentially call the xdo_dev callbacks for this xs again because it's
>> still in the list. Also this xfrm_work memory doesn't get freed anywhere, so
>> you're leaking it as well.
> 
> Yes, I thought about this too simplistically and forgot to free the memory.
>>
>> Perhaps you can do this allocation in add_sa, it seems you can sleep
>> there and potentially return an error if it fails, so this can never
>> fail later. You'll have to be careful with the freeing dance though.
> 
> Hmm, if we allocate this in add_sa, how do we get the xfrm_work
> in del_sa? Adding the xfrm_work to another list would mean we need
> to sleep again to find it in del_sa.
> 

Well, you have struct bond_ipsec and its lifetime is tied to the work's,
so you can stick the work in there. :)
I haven't looked closely at how feasible it is.
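
Something along these lines, as a very rough sketch (the gc_work field and
bond_ipsec_gc_work() are invented here, not from the patch, and how del_sa
finds the right entry without sleeping is exactly the freeing dance I
mentioned):

/* Sketch only: allocate everything in add_sa, where sleeping and
 * returning an error are possible, so del_sa itself can never fail.
 */
struct bond_ipsec {
	struct list_head list;
	struct xfrm_state *xs;
	struct bonding *bond;
	struct work_struct gc_work;	/* initialized in add_sa */
};

static void bond_ipsec_gc_work(struct work_struct *work)
{
	struct bond_ipsec *ipsec = container_of(work, struct bond_ipsec,
						gc_work);
	struct bonding *bond = ipsec->bond;

	/* process context, so taking the mutex is fine here */
	mutex_lock(&bond->ipsec_lock);
	list_del(&ipsec->list);
	mutex_unlock(&bond->ipsec_lock);

	xfrm_state_put(ipsec->xs);	/* drop the ref del_sa took */
	kfree(ipsec);
}

/* del_sa is then reduced to:
 *	xfrm_state_hold(xs);
 *	queue_work(bond->wq, &ipsec->gc_work);
 */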

>> Alternatively, make the work a part of struct bond so it doesn't need
>> memory management, but then you need a mechanism to queue these items (e.g.
>> a separate list with a spinlock) and would have more complexity with freeing
>> in parallel.
> 
> I used a delayed work queue in bond for my draft patch. As you said,
> it needs another list to queue the xs. And during the gc work, we need
> to take the spinlock again to get the xs out...
> 

Correct, it's a different kind of mess. :)
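
For reference, the shape of that variant (every name here, xs_gc_lock,
xs_gc_list, xs_gc_work and the entry struct, is invented for illustration,
assuming struct bonding grows those fields):

struct bond_xs_gc_entry {
	struct list_head list;
	struct xfrm_state *xs;
};

static void bond_xs_gc_queue(struct bonding *bond,
			     struct bond_xs_gc_entry *entry)
{
	/* callable from atomic context; the entry itself must still come
	 * from somewhere non-sleeping (GFP_ATOMIC, or pre-allocated in
	 * add_sa as above)
	 */
	spin_lock_bh(&bond->xs_gc_lock);
	list_add_tail(&entry->list, &bond->xs_gc_list);
	spin_unlock_bh(&bond->xs_gc_lock);
	queue_work(bond->wq, &bond->xs_gc_work);
}

static void bond_xs_gc_work(struct work_struct *work)
{
	struct bonding *bond = container_of(work, struct bonding, xs_gc_work);
	struct bond_xs_gc_entry *e, *tmp;
	LIST_HEAD(gc_list);

	/* splice the pending entries out under the lock, then drop the
	 * references without holding it
	 */
	spin_lock_bh(&bond->xs_gc_lock);
	list_splice_init(&bond->xs_gc_list, &gc_list);
	spin_unlock_bh(&bond->xs_gc_lock);

	list_for_each_entry_safe(e, tmp, &gc_list, list) {
		xfrm_state_put(e->xs);
		kfree(e);
	}
}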

>>
>>> +   INIT_WORK(&xfrm_work->work, bond_xfrm_state_gc_work);
>>> +   xfrm_work->bond = bond;
>>> +   xfrm_work->xs = xs;
>>> +   xfrm_state_hold(xs);
>>> +
>>> +   queue_work(bond->wq, &xfrm_work->work);
>>
>> Note that nothing waits for this work anywhere and .ndo_uninit runs before
>> bond's .priv_destructor which means ipsec_lock will be destroyed and will be
>> used afterwards when destroying bond->wq from the destructor if there were
>> any queued works.
> 
> Do you mean we need to register the work queue in bond_init and cancel
> it in bond_work_cancel_all()?
> 
> Thanks
> Hangbin

That is one way; the other is that if you have access to the work queue items,
you can cancel them directly, which should be easier (e.g. cancel_delayed_work_sync()).
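
With the per-bond gc list variant sketched above, that could look something
like this (again only a sketch; bond_xs_gc_cancel() and its call site are
invented):

/* Called before .ndo_uninit tears down the ipsec state, e.g. from
 * bond_work_cancel_all(), so that ipsec_lock and bond->wq can be
 * destroyed safely afterwards.
 */
static void bond_xs_gc_cancel(struct bonding *bond)
{
	struct bond_xs_gc_entry *e, *tmp;

	/* also waits for an already-running gc work to finish */
	cancel_work_sync(&bond->xs_gc_work);

	/* entries queued after the last gc run still hold references;
	 * no lock needed if nothing can queue new entries at this point
	 */
	list_for_each_entry_safe(e, tmp, &bond->xs_gc_list, list) {
		list_del(&e->list);
		xfrm_state_put(e->xs);
		kfree(e);
	}
}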

Regardless of which way you choose to solve this (gc in bond or work in
bond_ipsec), there will be some dance to be done around the sequence of
events, and it will not be straightforward.

Cheers,
 Nik

