On Thu, Jan 27, 2022, at 11:38, Ilya Maximets wrote:
> On 1/27/22 07:42, Sriharsha Basavapatna via dev wrote:
>> In dp_netdev_pmd_remove_flow() we schedule the deletion of an
>> offloaded flow, if a mark has been assigned to the flow. But if
>> this occurs in the window in which the offload thread completes
>> offloading the flow and assigns a mark to the flow, then we miss
>> deleting the flow. This problem has been observed while adding
>> and deleting flows in a loop. To fix this, always enqueue flow
>> deletion regardless of the flow->mark being set.
>>
>> Fixes: 241bad15d99a("dpif-netdev: associate flow with a mark id")
>> Signed-off-by: Sriharsha Basavapatna <[email protected]>
>> ---
>> lib/dpif-netdev.c | 4 +---
>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index e28e0b554..22c5f19a8 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -3029,9 +3029,7 @@ dp_netdev_pmd_remove_flow(struct dp_netdev_pmd_thread
>> *pmd,
>> dp_netdev_simple_match_remove(pmd, flow);
>> cmap_remove(&pmd->flow_table, node, dp_netdev_flow_hash(&flow->ufid));
>> ccmap_dec(&pmd->n_flows, odp_to_u32(in_port));
>> - if (flow->mark != INVALID_FLOW_MARK) {
>> - queue_netdev_flow_del(pmd, flow);
>> - }
>> + queue_netdev_flow_del(pmd, flow);
>> flow->dead = true;
>>
>> dp_netdev_flow_unref(flow);
>>
>
> Thanks for the patch, but it looks like that it's not that simple...
> A lot of tests are crashing in GHA and ASAN reports use-after-free
> on flow disassociation if the dpif already destroyed:
> https://github.com/ovsrobot/ovs/actions/runs/1754866321
>
> I think that the problem is that offload thread holds the ref count
> for PMD thread, but PMD thread structure doesn't hold the ref count
> for the dp, because it doesn't expect that dp_netdev_pmd structure
> will be used while PMD thread is destroyed. I guess, we should fix
> that. OTOH, I'm not sure if offload thread actually needs a reference
> to dp_netdev_pmd structure. If I didn't miss something, it only
> uses pmd pointer to access pmd->dp. So, maybe, dp_offload_thread_item
> should hold and ref the dp pointer instead?
>
> What do you think? Gaetan?
>
Hi Ilya,
I've been looking into this issue, you are right that the offload threads don't
actually need the pmd pointer, only the dp one.
The problem I see with a proper reference taking on dp to protect against this
UAF, is that it is prohibitively slow to manage those dp references, and it
relies on
a global lock.
I have tried to find an alternative (the flush op could be tagged to mark that
it was issued during a dp_netdev_free), but I always end-up with a
race-condition, even if the window is reduced.
So given that the refcounting is too slow but necessary, I think the only
practical solution
is to rewrite it to be fast enough. I will send an RFC to that end, let me know
if you think
of something better.
--
Gaetan Rivet
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev