Re: [ovs-dev] [ovs-discuss] crash when restart openvswitch with huge vxlan traffic running
Thanks a lot. I confirm your fix works.

Br,
Zhike Wang

JDCloud, Product Development, IaaS
Mobile/+86 13466719566
E-mail/wangzh...@jd.com
Address/5F Building A, North-Star Century Center, 8 Beichen West Street, Chaoyang District, Beijing
https://JDCloud.com

-----Original Message-----
From: Lorenzo Bianconi [mailto:lorenzo.bianc...@redhat.com]
Sent: Friday, December 28, 2018 4:33 AM
To: Ben Pfaff
Cc: 王志克; Gregory Rose; ovs-disc...@openvswitch.org; ovs-dev@openvswitch.org
Subject: Re: [ovs-discuss] crash when restart openvswitch with huge vxlan traffic running

> Greg, this is a kernel issue. If you have the time, will you take a
> look at it sometime?

Hi all,

I worked on a pretty similar issue a couple of weeks ago. Could you please
take a look at the commit below (it is already in Linus's tree):

commit 8e1da73acded4751a93d4166458a7e640f37d26c
Author: Lorenzo Bianconi
Date:   Wed Dec 19 23:23:00 2018 +0100

    gro_cell: add napi_disable in gro_cells_destroy

    Add napi_disable routine in gro_cells_destroy since, starting from
    commit c42858eaf492 ("gro_cells: remove spinlock protecting receive
    queues"), gro_cell_poll and gro_cells_destroy can run concurrently on
    the napi_skbs list, producing a kernel Oops if the tunnel interface
    is removed while gro_cell_poll is running. The following Oops has
    been triggered by removing a vxlan device while the interface is
    receiving traffic.

Regards,
Lorenzo

> On Thu, Dec 20, 2018 at 12:42:43PM +, 王志克 wrote:
> > Hi All,
> >
> > I ran the test below and hit a system crash. Does anyone know whether
> > there is already a fix for it?
> >
> > Setup:
> > CentOS 7.4, kernel 3.10.0-693.el7.x86_64
> > OVS: 2.10.1
> >
> > Steps:
> > 1. Build OVS for userspace only, and reuse the kernel-builtin
> >    openvswitch module.
> > 2. On Host1, create 1 vxlan interface and add 1 VF_rep to OVS.
> > 3. Attach the VF to one VM; the VM does a 5-tuple swap using a DPDK app.
> > 4. Use a traffic generator to send heavy traffic (7 Mpps with several k
> >    connections) to the Host1 PF.
> > 5. Configure the OVS rules as below:
> >
> > VM1_PORTNAME=$1
> > VXLAN_PORTNAME=$2
> > VM1_PORT=$(ovs-vsctl list interface | grep $VM1_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
> > VXLAN_PORT=$(ovs-vsctl list interface | grep $VXLAN_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
> > ZONE=8
> > ovs-ofctl del-flows ovs-sriov
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=1000 table=0,arp, actions=NORMAL"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VM1_PORT,action=set_field:$VM1_PORT->reg6,goto_table:5"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VXLAN_PORT, tun_id=0x242, action=set_field:$VXLAN_PORT->reg6,set_field:$VM1_PORT->reg7,goto_table:5"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=5, priority=100, ip,actions=ct(table=10,zone=$ZONE)"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new+est-rel-inv+trk actions=goto_table:15"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel+inv+trk actions=drop"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel-inv-trk actions=drop"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=+new-rel-inv+trk actions=ct(commit,table=15,zone=$ZONE)"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VM1_PORT, action=set_field:0x242->tun_id,set_field:$VXLAN_PORT->reg7,goto_table:20"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VXLAN_PORT, actions=goto_table:20"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=20, priority=100, ip,action=output:NXM_NX_REG7[0..15]"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=200, priority=100,action=drop"
> >
> > 6. Execute "systemctl restart openvswitch" several times; the host then
> >    crashes.
> >
> > Crash stack (2 kinds):
> > One:
> > [  575.459905] device vxlan_sys_4789 left promiscuous mode
> > [  575.460103] BUG: unable to handle kernel NULL pointer dereference at 0008
> > [  575.460133] IP: [] gro_cell_poll+0x4b/0x80 [vxlan]
> > [  575.460210] PGD 0
> > [  575.460226] Oops: 0002 [#1] SMP
> > [  575.460254] Modules linked in: vhost_net vhost macvtap macvlan vxlan
> > ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6
> > nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle
> > ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
> > nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter
> > ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE)
> > ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE)
> > ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE)
> > mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas
> > sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel
> > kvm irqbypass crc32_pclmul
> > [  575.460619] ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
> > ablk_helper cryptd ipmi_ssif joydev pcspkr sg mei_me mei lpc_ich ipmi_si
> > shpchp ipmi_devintf ipmi_msghandler wmi acpi_power_meter knem(OE) nfsd
> > auth_rpcgss nfs_acl lockd grace
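[Editor's note] For readers following along, the fix Lorenzo references amounts to disabling each per-CPU NAPI instance before its skb queue is torn down, so gro_cell_poll can no longer run concurrently with the teardown. The sketch below paraphrases gro_cells_destroy() in net/core/gro_cells.c as of commit 8e1da73acded; it is an illustration of the ordering, not a verbatim copy of the patch, and it assumes the surrounding kernel definitions (struct gro_cells, struct gro_cell, for_each_possible_cpu, etc.).

    /* Paraphrase of gro_cells_destroy() after commit 8e1da73acded
     * ("gro_cell: add napi_disable in gro_cells_destroy"). */
    void gro_cells_destroy(struct gro_cells *gcells)
    {
            int i;

            if (!gcells->cells)
                    return;
            for_each_possible_cpu(i) {
                    struct gro_cell *cell = per_cpu_ptr(gcells->cells, i);

                    /* Wait for any in-flight gro_cell_poll() to finish and
                     * prevent the NAPI instance from being scheduled again... */
                    napi_disable(&cell->napi);
                    /* ...before unlinking it and purging the queue it polls. */
                    netif_napi_del(&cell->napi);
                    __skb_queue_purge(&cell->napi_skbs);
            }
            free_percpu(gcells->cells);
            gcells->cells = NULL;
    }

Before the fix, the queue could be purged while gro_cell_poll() was still walking napi_skbs on another CPU, which matches the NULL pointer dereference in gro_cell_poll reported below.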