On 11.02.2019 6:22, Lilijun (Jerry, Cloud Networking) wrote: > Yes, Thanks for your help again. > > Let's forget the patch :). But in my opinion, the function > reload_affected_pmds() doesn't need be locked within dp->port_mutex. Is > that true?
That's true that 'reload_affected_pmds()' itself doesn't need to be locked. However, you can't just unlock the mutex and lock it again, because upper layer functions that calls 'reconfigure_datapath()' expects that mutex will be held all the way down. For example, 'dp_netdev_free()' iterates over the 'dp->ports' hmap and it doesn't expect that 'do_del_port()' could unlock the mutex. If hmap will be changed (some ports added), the iterator could crash. Another example is 'dpif_dummy_change_port_number()' that calls the reconfiguration twice in a row expecting no changes in 'dp->ports' between. Best regards, Ilya Maximets. > >> -----Original Message----- >> From: [email protected] [mailto:ovs-dev- >> [email protected]] On Behalf Of Stokes, Ian >> Sent: Thursday, February 07, 2019 5:25 PM >> To: Ilya Maximets <[email protected]>; [email protected] >> Subject: Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock >> >>> On 31.01.2019 11:48, Lilijun wrote: >>>> This patch fix the dead lock when using dpdk userspace datapath. The >>>> problem is described as follows: >>>> 1) when add or delete port, the main thread will call >>>> reconfigure_datapath() in the function dpif_netdev_run() >>>> 2) Here the dp->port_mutex is locked. In dp_netdev_reload_pmd__(), >>>> it will notify each pmd to reload. >>>> 3) If pmd is doing packet upcall in fast_path_processing() and try >>>> to get >>>> dp->port_mutex in >>>> do_xlate_actions()-->tnl_route_lookup_flow()-- >>>> dpif_netdev_port_query_by_name(). >>>> Here pmd get lock failed because the main thread has got the lock in >>>> step 2. >>>> 4) So the main thread was stuck for waiting pmd to reload done. Now >>>> we got a dead lock. >>>> >>>> Here reload_affected_pmds() may not need to lock dp->port_mutex. So >>>> we release the lock temporarily when calling reload_affected_pmds(). >>>> >>>> Signed-off-by: Lilijun <[email protected]> >>> >>> Replying just to keep answers on a list/patchwork consistent. >>> The deadlock caused by some local changes done by the user. >>> Not possible in upstream master. See discussion for the previous >>> version of this patch that is missing in patchwork for some reason: >>> >>> >>> https://mail.openvswitch.org/pipermail/ovs-dev/2019- >> January/355756.htm >>> l >>> >>> Best regards, Ilya Maximets. >>> >> >> Thanks for clarifying Ilya, there was discussion of this at the community >> meeting yesterday but it seems a non-issue now. >> >> Thanks >> Ian >> _______________________________________________ >> dev mailing list >> [email protected] >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
