On 1/7/2021 9:10 PM, Sriharsha Basavapatna wrote:
Hi Eli,

On Mon, Dec 28, 2020 at 11:43 PM Eli Britstein <[email protected]> wrote:
Netdev datapath offloads are done in a separate thread, using messaging
between the threads. With port removal there is a race between the offload
thread removing the offloaded rules and the actual port removal, so some
rules will not be removed.
Can you provide details on this sequence, to get more clarity on how
some rules will not be removed ?

A deletion sequence from dpif point of view starts with dpif_port_del. It calls dpif->dpif_class->port_del (implemented by dpif_netdev_port_del) and netdev_ports_remove.

flow_mark_flush is called (dpif_netdev_port_del->do_del_port->reconfigure_datapath->reload_affected_pmds). This function only posts deletion requests to the queue (queue_netdev_flow_del), to be later handled by the offload thread.

When the offload thread wakes up to handle the request (dp_netdev_flow_offload_del->mark_to_flow_disassociate) it looks for the netdev object by netdev_ports_get in port_to_netdev map, racing the execution of netdev_ports_remove that removes that mapping.

If the mapping is removed before the handling, the removal of the HW rule won't take place, leaking it and the related allocated memory.


In OVS the offload objects are not freed (memory
leak).
Since dp_netdev_free_flow_offload() frees the offload object, not sure
how the memory leak happens ?
As eventually netdev_offload_dpdk_flow_del is not called, the objects in ufid_to_rte_flow that should have been freed by ufid_to_rte_flow_disassociate are leaked (as well as the rte_flows not destroyed).

In HW the remining of the rules depend on PMD behavior.
A few related questions:

1.  In Patch-1, netdev_flow_flush() is called from do_del_port().
Doesn't it violate the layering rules, since port deletion shouldn't
be directly involved with netdev/flow related operations (that would
otherwise be processed as a part of datapath reconfig) ?

netdev/flow operations are called from dpif-netdev layer, that includes do_del_port. We can call the offload functions under dp->port_mutex which is locked in that function.

Which layering rules are violated?


2. Since the rte_flow is deleted first, doesn't it leave the
dpif_netdev_flow in an inconsistent state, since the flow is still
active from the pmd thread's perspective (megaflow_to_mark and
mark_to_flow cmap entries are still valid, indicating that it is
offloaded).

That flow is a sequence not related to this patch set, that occurs normally.

Currently there are two types of offloads (both are ingress):

- Partial. Packets are handled by SW with or without offload. Packets are fully processed correctly by the SW either if they don't have mark or have mark that is not found by mark_to_flow_find.

- Full. Packets will be handled correctly by the SW as they arrive with no mark.

The mark<->megaflow mapping disassociation is handled correctly even if the rte_flow is not removed.

However, indeed, this is a good point to take into consideration when implementing egress offloading.


3. Also, how does this work with partially offloaded flows (same
megaflow mapped to multiple pmd threads) ?
Also not related to this patch-set. The flow is ref-counted. Upon the first ref the rte_flow is created, and it is destroyed with the last unref (see mark_to_flow_disassociate).

Thanks,
-Harsha
This patch-set resolves this issue using flush.

v2-v1:
- Fix for pending offload requests.
v3-v2:
- Rebase.

Travis:
v1: https://travis-ci.org/github/elibritstein/OVS/builds/747022942
v2: https://travis-ci.org/github/elibritstein/OVS/builds/748788786
v3: https://travis-ci.org/github/elibritstein/OVS/builds/751787939

GitHub Actions:
v1: https://github.com/elibritstein/OVS/actions/runs/394296553
v2: https://github.com/elibritstein/OVS/actions/runs/413379295
v3: https://github.com/elibritstein/OVS/actions/runs/448541155

Eli Britstein (4):
   dpif-netdev: Flush offload rules upon port deletion
   netdev-offload-dpdk: Keep netdev in offload object
   netdev-offload-dpdk: Refactor disassociate and flow destroy
   netdev-offload-dpdk: Implement flow flush

  lib/dpif-netdev.c         |  2 +
  lib/netdev-offload-dpdk.c | 85 +++++++++++++++++++++++++--------------
  2 files changed, 56 insertions(+), 31 deletions(-)

--
2.28.0.546.g385c171

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to