Hi, I found a potential memory leak in OVS-DPDK hardware offload. When a DPDK port is removed from the datapath, dpif calls dpif_port_del, which eventually removes the netdev from the *port_to_netdev* map (in netdev-offload). Meanwhile, the resulting datapath reconfiguration flushes all marked flows by calling *flow_mark_flush*. However, this function only puts the offload requests on the queue; it does not wait for the flow deletions to finish. So the offload thread may try to delete a hardware flow after the netdev has already been removed. In that case *netdev_ports_get* fails, so *netdev_flow_del* is never called and the flow entry (ufid_to_rte_flow_data) is never freed, which leaks memory.
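To make the ordering problem concrete, here is a minimal single-threaded sketch of the failure path. It is a toy model under my own assumptions, not OVS code: every toy_* name and both arrays are hypothetical stand-ins (port_present for the port_to_netdev map, hw_flows for the per-flow offload entries), and the real code runs the delete in a separate offload thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins only; none of these are real OVS types or functions. */
struct toy_flow {
    int port_no;                      /* port the flow was offloaded on */
};

static bool port_present[4];          /* models the port_to_netdev map */
static struct toy_flow *hw_flows[4];  /* models the offloaded flow entries */

/* Models the port lookup: it fails once the port has been removed. */
static bool toy_port_lookup(int port_no)
{
    return port_present[port_no];
}

/* Models the hardware flow delete, the only place an entry is freed. */
static void toy_flow_del(int slot)
{
    free(hw_flows[slot]);
    hw_flows[slot] = NULL;
}

/* Models the offload thread handling one queued delete request. */
static bool toy_handle_delete(int slot)
{
    struct toy_flow *flow = hw_flows[slot];

    if (!flow || !toy_port_lookup(flow->port_no)) {
        return false;                 /* lookup failed: delete is skipped */
    }
    toy_flow_del(slot);
    return true;
}

/* Runs one port-removal sequence and returns how many flow entries
 * were left allocated after the queued delete was handled. */
static int toy_run(bool port_removed_before_delete)
{
    int leaked = 0;

    port_present[1] = true;
    hw_flows[0] = malloc(sizeof *hw_flows[0]);
    assert(hw_flows[0] != NULL);
    hw_flows[0]->port_no = 1;

    if (port_removed_before_delete) {
        port_present[1] = false;      /* port gone before the queue drains */
    }
    toy_handle_delete(0);             /* the queued flush request runs now */
    port_present[1] = false;          /* port is gone in both orderings */

    for (int i = 0; i < 4; i++) {
        if (hw_flows[i]) {
            leaked++;
            free(hw_flows[i]);        /* tidy up so the model can be rerun */
            hw_flows[i] = NULL;
        }
    }
    return leaked;
}
```

In this model toy_run(false) returns 0, because the delete runs while the port is still in the map, and toy_run(true) returns 1, because the lookup fails and nothing else ever frees the entry.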
In fact, since any port removal holds dp->port_mutex while *flow_mark_flush* is called, no flow deletion can actually happen during that time: the offload thread cannot acquire *dp->port_mutex*, which it needs in order to call *netdev_flow_del*. To fix this, we should design a wait-done mechanism so that the flush waits for the offload thread to finish all pending deletions. I guess we also need to stop holding dp->port_mutex in the offload thread and instead hold netdev->mutex in the offload layer. I remember that some patches, including the one that queries the hardware stats, also depend on dp->port_mutex; these need to be fixed as well.

--
hepeng
Bytedance
