On Mon, Nov 28, 2016 at 05:37:48PM -0800, Joe Stringer wrote: > On 28 November 2016 at 17:00, Ben Pfaff <[email protected]> wrote: > > On Fri, Oct 21, 2016 at 11:47:54AM -0700, Pravin Shelar wrote: > >> On Fri, Oct 21, 2016 at 11:39 AM, Joe Stringer <[email protected]> wrote: > >> > On 21 October 2016 at 11:37, Joe Stringer <[email protected]> wrote: > >> >> On 21 October 2016 at 10:55, Pravin B Shelar <[email protected]> wrote: > >> >>> On upstream kernel datapath OVS make use of networking devices > >> >>> where networking stack does check if device is UP. following > >> >>> patch adds same check in case of compat tunneling implementation. > >> >>> This check also fixes kernel crash in case vxlan device was > >> >>> brought down by user. > >> >>> > >> >>> CPU: 4 PID: 12988 Comm: handler903 Tainted: > >> >>> RIP: 0010:[<ffffffffa05e5407>] vxlan_xmit_one.constprop.50+0x47/0x1210 > >> >>> [openvswitch] > >> >>> Call Trace: > >> >>> [<ffffffffa05e6625>] rpl_vxlan_xmit+0x55/0x80 [openvswitch] > >> >>> [<ffffffffa05d5ad4>] ovs_vport_send+0x44/0xb0 [openvswitch] > >> >>> [<ffffffffa05c62a5>] do_output+0x65/0x180 [openvswitch] > >> >>> [<ffffffffa05c70dc>] do_execute_actions+0x10c/0x860 [openvswitch] > >> >>> [<ffffffffa05c7870>] ovs_execute_actions+0x40/0x130 [openvswitch] > >> >>> [<ffffffffa05cbb59>] ovs_packet_cmd_execute+0x2c9/0x2f0 [openvswitch] > >> >>> [<ffffffff8155f31d>] genl_family_rcv_msg+0x1cd/0x400 > >> >>> [<ffffffff8155f5e1>] genl_rcv_msg+0x91/0xd0 > >> >>> [<ffffffff8155d549>] netlink_rcv_skb+0xa9/0xc0 > >> >>> [<ffffffff8155da78>] genl_rcv+0x28/0x40 > >> >>> [<ffffffff8155ceba>] netlink_unicast+0x16a/0x210 > >> >>> [<ffffffff8155d277>] netlink_sendmsg+0x317/0x430 > >> >>> [<ffffffff81514fd0>] sock_sendmsg+0xb0/0xf0 > >> >>> [<ffffffff81515409>] ___sys_sendmsg+0x3a9/0x3c0 > >> >>> [<ffffffff815162f1>] __sys_sendmsg+0x51/0x90 > >> >>> [<ffffffff81516342>] SyS_sendmsg+0x12/0x20 > >> >>> [<ffffffff81649609>] system_call_fastpath+0x16/0x1b > >> >>> > >> >>> Reported-by: Huanglili (lee) <[email protected]> > >> >>> Signed-off-by: Pravin B Shelar <[email protected]> > >> >> > >> >> I'm not 100% convinced that this solves the issue (or whether this > >> >> affects upstream as well), but it seems like this at least makes it > >> >> significantly less likely to crash - If I understand correctly, > >> >> setting the vxlan port down currently guarantees a crash, whereas this > >> >> patch would mean it either doesn't crash, or it only crashes based on > >> >> a race condition. Specifically, one where the reconfiguration gets > >> >> past the synchronize_net() into ndo_stop() (before clearing IFF_UP) > >> >> while OVS manages to execute all of a flow up until output and > >> >> proceeds past this check before the reconfiguration thread clears it. > >> >> > >> >> Probably the type of test to check this would be running heavy traffic > >> >> through a vxlan port, then toggle the port up and down. > >> >> > >> >> Will you also check upstream? > >> >> > >> >> Anyway, this LGTM so: > >> >> Acked-by: Joe Stringer <[email protected]> > >> > > >> > Hmm, the original reporter said that the issue is not fixed. Perhaps > >> > we should hold off on this? > >> > >> Yes, I would like to see another solution before merging this patch. > > > > Joe, do you know whether this needs further action? It's not merged and > > I don't know whether it should be. > > No, it looks like commit 76e671eade40 ("datapath: backport: vxlan: > avoid using stale vxlan socket.") in the OVS tree fixes this issue, > rendering this version of the fix unnecessary.
Thanks, I marked it in patchwork as "not applicable". _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
