Greg, this is a kernel issue.  If you have the time, will you take a
look at it sometime?

On Thu, Dec 20, 2018 at 12:42:43PM +0000, 王志克 wrote:
> Hi All,
> 
> I ran the test below and the system crashed. Does anyone know whether there is 
> already a fix for it?
> 
> Setup:
> CentOS7.4 3.10.0-693.el7.x86_64,
> OVS: 2.10.1
> 
> Steps:
> 1.  Build OVS for userspace only, and reuse the kernel's built-in openvswitch module.
> 2.  On Host1, create 1 vxlan interface and add 1 VF_rep to OVS.
> 3.  Attach the VF to one VM; the VM performs a 5-tuple swap using a DPDK app.
> 4.  Use a traffic generator to send heavy traffic (7 Mpps with several thousand 
> connections) to the Host1 PF.
> 5.  Configure the OVS rules as below.
> 
> VM1_PORTNAME=$1
> VXLAN_PORTNAME=$2
> VM1_PORT=$(ovs-vsctl list interface | grep $VM1_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
> VXLAN_PORT=$(ovs-vsctl list interface | grep $VXLAN_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
> ZONE=8
> ovs-ofctl del-flows ovs-sriov
> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=1000 table=0,arp, actions=NORMAL"
> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VM1_PORT,action=set_field:$VM1_PORT->reg6,goto_table:5"
> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VXLAN_PORT, tun_id=0x242, action=set_field:$VXLAN_PORT->reg6,set_field:$VM1_PORT->reg7,goto_table:5"
> 
> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=5, priority=100, ip,actions=ct(table=10,zone=$ZONE)"
> 
> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new+est-rel-inv+trk actions= goto_table:15"
> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel+inv+trk actions=drop"
> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel-inv-trk actions=drop"
> 
> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=+new-rel-inv+trk actions= ct(commit,table=15,zone=$ZONE)"
> 
> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VM1_PORT, action=set_field:0x242->tun_id,set_field:$VXLAN_PORT->reg7,goto_table:20"
> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VXLAN_PORT, actions=goto_table:20"
> 
> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=20, priority=100, ip,action=output:NXM_NX_REG7[0..15]"
> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=200, priority=100,action=drop"
> 6.  Run "systemctl restart openvswitch" several times; the system then crashes.
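
For anyone trying to reproduce step 6, the repeated restart could be scripted
roughly as below. This is only a sketch: the function name restart_loop and
the RESTART_PAUSE variable are made up for illustration, and the pause length
is a guess, since the report does not say how quickly the restarts were issued.

```shell
# Hypothetical helper (not from the original report): run a command N times,
# pausing between runs so traffic can flow again before the next restart.
# Usage: restart_loop COUNT COMMAND [ARGS...]
restart_loop() {
    rl_count=$1
    shift
    rl_i=1
    while [ "$rl_i" -le "$rl_count" ]; do
        echo "restart $rl_i/$rl_count"
        # Stop at the first failing run so the failure is not masked.
        "$@" || return 1
        # Pause length is an assumption; override with RESTART_PAUSE=<seconds>.
        sleep "${RESTART_PAUSE:-5}"
        rl_i=$((rl_i + 1))
    done
}
```

With traffic running, something like "restart_loop 10 systemctl restart
openvswitch" would then exercise the same path as the manual restarts.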
> 
> Crash stacks (2 kinds):
> One:
> [  575.459905] device vxlan_sys_4789 left promiscuous mode
> [  575.460103] BUG: unable to handle kernel NULL pointer dereference at 
> 0000000000000008
> [  575.460133] IP: [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
> [  575.460210] PGD 0
> [  575.460226] Oops: 0002 [#1] SMP
> [  575.460254] Modules linked in: vhost_net vhost macvtap macvlan vxlan 
> ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 
> nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle 
> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat 
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
> nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter 
> ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) 
> ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) 
> ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE) 
> mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas 
> sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm 
> irqbypass crc32_pclmul
> [  575.460619]  ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper 
> ablk_helper cryptd ipmi_ssif joydev pcspkr sg mei_me mei lpc_ich ipmi_si 
> shpchp ipmi_devintf ipmi_msghandler wmi acpi_power_meter knem(OE) nfsd 
> auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod 
> crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea ixgbe 
> sysfillrect igb sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common 
> crc32c_intel drm ahci libahci megaraid_sas libata i2c_algo_bit i2c_core mdio 
> ptp dca pps_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
> devlink]
> [  575.460885] CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G           OE  
> ------------   3.10.0-693.el7.x86_64 #1
> [  575.460912] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6 
> 06/03/2015
> [  575.460933] task: ffff880152ef1fa0 ti: ffff880152efc000 task.ti: 
> ffff880152efc000
> [  575.460954] RIP: 0010:[<ffffffffc09b330b>]  [<ffffffffc09b330b>] 
> gro_cell_poll+0x4b/0x80 [vxlan]
> [  575.460990] RSP: 0018:ffff880152effd68  EFLAGS: 00010202
> [  575.461004] RAX: 0000000000000000 RBX: ffffe8dfff448818 RCX: 
> 0000000000000000
> [  575.461024] RDX: 0000000000000001 RSI: ffff881fa42ebf00 RDI: 
> ffffe8dfff448818
> [  575.461042] RBP: ffff880152effd88 R08: 0000000000019c40 R09: 
> ffffffff815710d7
> [  575.461061] R10: ffff881ffec59c40 R11: ffffea007e90ba00 R12: 
> 0000000000000002
> [  575.461079] R13: 0000000000000040 R14: ffffe8dfff448800 R15: 
> 0000000000000001
> [  575.461098] FS:  0000000000000000(0000) GS:ffff881ffec40000(0000) 
> knlGS:0000000000000000
> [  575.461119] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  575.461134] CR2: 0000000000000008 CR3: 00000000019f2000 CR4: 
> 00000000001427e0
> [  575.461153] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> [  575.461172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000400
> [  575.461190] Stack:
> [  575.461198]  ffffe8dfff448818 0000000000000000 0000000000000040 
> 0000000000000000
> [  575.461221]  ffff880152effe08 ffffffff8158799d ffff881ffec57950 
> ffff881ffec57940
> [  575.461254]  00000001000432b7 0000012c52f09428 ffff881ffd57eb40 
> ffff881ffd57eb40
> [  575.461277] Call Trace:
> [  575.461290]  [<ffffffff8158799d>] net_rx_action+0x16d/0x380
> [  575.461308]  [<ffffffff81090b3f>] __do_softirq+0xef/0x280
> [  575.461324]  [<ffffffff81090d08>] run_ksoftirqd+0x38/0x50
> [  575.462074]  [<ffffffff810b909f>] smpboot_thread_fn+0x12f/0x180
> [  575.462780]  [<ffffffff810b8f70>] ? lg_double_unlock+0x40/0x40
> [  575.463464]  [<ffffffff810b098f>] kthread+0xcf/0xe0
> [  575.464169]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
> [  575.464862]  [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
> [  575.465497]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
> [  575.466192] Code: 49 39 f6 74 40 48 85 f6 74 3b 83 6b f8 01 48 89 df 41 83 
> c4 01 48 8b 0e 48 8b 46 08 48 c7 06 00 00 00 00 48 c7 46 08 00 00 00 00 <48> 
> 89 41 08 48 89 08 e8 29 4f bd c0 45 39 ec 74 14 48 8b 73 e8
> [  575.467663] RIP  [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
> [  575.468412]  RSP <ffff880152effd68>
> [  575.469197] CR2: 0000000000000008
> 
> Two:
> [  390.626080] device vxlan_sys_4789 left promiscuous mode
> [  390.626345] BUG: unable to handle kernel NULL pointer dereference at 
> 0000000000000008
> [  390.626411] IP: [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
> [  390.626462] PGD 0
> [  390.626499] Oops: 0002 [#1] SMP
> [  390.626529] Modules linked in: vhost_net vhost macvtap macvlan vxlan 
> ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 
> nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle 
> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat 
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
> nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter 
> ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) 
> ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) 
> ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE) 
> mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas 
> sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm 
> irqbypass crc32_pclmul
> [  390.627152]  ghash_clmulni_intel ipmi_ssif aesni_intel lrw gf128mul 
> glue_helper ablk_helper cryptd ipmi_si pcspkr joydev ipmi_devintf 
> ipmi_msghandler mei_me mei sg lpc_ich shpchp acpi_power_meter wmi nfsd 
> auth_rpcgss nfs_acl lockd knem(OE) grace sunrpc ip_tables xfs libcrc32c 
> sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea 
> sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common 
> ixgbe crc32c_intel ahci igb libahci libata megaraid_sas mdio i2c_algo_bit ptp 
> i2c_core pps_core dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
> devlink]
> [  390.627626] CPU: 11 PID: 6303 Comm: ovs-vswitchd Tainted: G           OE  
> ------------   3.10.0-693.el7.x86_64 #1
> [  390.627690] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6 
> 06/03/2015
> [  390.627738] task: ffff881fe0e89fa0 ti: ffff881fa3590000 task.ti: 
> ffff881fa3590000
> [  390.627786] RIP: 0010:[<ffffffffc09c8b4a>]  [<ffffffffc09c8b4a>] 
> vxlan_dellink+0x9a/0xf0 [vxlan]
> [  390.627848] RSP: 0018:ffff881fa3593888  EFLAGS: 00010206
> [  390.627883] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 
> 0000000000000000
> [  390.627929] RDX: 0000000000000000 RSI: ffffea007fd7f600 RDI: 
> ffff881ff5fd8c00
> [  390.627975] RBP: ffff881fa35938b0 R08: ffff881ff5fd8b00 R09: 
> 000000018040000d
> [  390.628020] R10: 00000000f5fd8a01 R11: ffffea007fd7f600 R12: 
> ffff88015270e000
> [  390.628066] R13: ffffffff81b1caa0 R14: ffff881fa35938c0 R15: 
> ffffe8dfff60a1d8
> [  390.628112] FS:  00007f4ea1168ac0(0000) GS:ffff883ffe540000(0000) 
> knlGS:0000000000000000
> [  390.628163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  390.628201] CR2: 0000000000000008 CR3: 0000001ff9055000 CR4: 
> 00000000001427e0
> [  390.628246] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> [  390.628292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000400
> [  390.628337] Stack:
> [  390.628354]  ffff881fa35938c0 ffffffff81ad9d40 0000000000000001 
> 0000000000000000
> [  390.628411]  ffffffff81ad9d40 ffff881fa35938e0 ffffffff81599023 
> ffff881fa35938c0
> [  390.628468]  ffff881fa35938c0 000000001d8239fc ffff883ffd864a00 
> ffff881fa3593a70
> [  390.628535] Call Trace:
> [  390.628561]  [<ffffffff81599023>] rtnl_delete_link+0x43/0x80
> [  390.628610]  [<ffffffff8159b761>] rtnl_dellink+0x91/0xf0
> [  390.628649]  [<ffffffff81599bd4>] rtnetlink_rcv_msg+0xa4/0x270
> [  390.630373]  [<ffffffff815bacd0>] ? __netlink_lookup+0xc0/0x110
> [  390.632066]  [<ffffffff81599b30>] ? rtnetlink_rcv+0x30/0x30
> [  390.633751]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
> [  390.635426]  [<ffffffff81599b28>] rtnetlink_rcv+0x28/0x30
> [  390.637081]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
> [  390.638721]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
> [  390.640371]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
> [  390.642037]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
> [  390.643722]  [<ffffffff8156a88f>] ? sock_recvmsg+0xbf/0x100
> [  390.645411]  [<ffffffff8132c312>] ? put_dec+0x72/0x90
> [  390.647075]  [<ffffffff8132d303>] ? number.isra.2+0x323/0x360
> [  390.648724]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
> [  390.650362]  [<ffffffff811de9d2>] ? kmem_cache_free+0x1e2/0x200
> [  390.652010]  [<ffffffff81217af5>] ? __d_free+0x35/0x40
> [  390.653623]  [<ffffffff812181b0>] ? d_free+0x60/0x70
> [  390.655181]  [<ffffffff812186b4>] ? dentry_kill+0x154/0x1b0
> [  390.656702]  [<ffffffff81222744>] ? mntput+0x24/0x40
> [  390.658173]  [<ffffffff81203053>] ? __fput+0x183/0x260
> [  390.659606]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
> [  390.660988]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
> [  390.662325]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
> [  390.663624] Code: a0 bb c0 49 8b 3f 49 39 ff 74 be 48 85 ff 74 b9 41 83 6f 
> 10 01 48 8b 0f 48 8b 57 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 
> 89 51 08 48 89 0a e8 8a a2 ba c0 49 8b 3f 49 39 ff 75 cc eb
> [  390.666406] RIP  [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
> [  390.667674]  RSP <ffff881fa3593888>
> [  390.668892] CR2: 0000000000000008
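
When comparing the two traces, it can help to look at the bytes that follow
the <..> marker in each "Code:" line, since that marker is where the trapping
instruction starts. The sketch below pulls those bytes out; the function name
fault_bytes is made up for illustration, and it assumes the "Code:" line has
been rejoined into a single line first (it is wrapped in the mail above).

```shell
# Hypothetical helper: print the opcode bytes of an Oops "Code:" line,
# starting at the trapping byte (the one shown as <xx>).
# Assumes the complete, unwrapped "Code:" line arrives on stdin.
fault_bytes() {
    # Keep the marked byte plus everything after it, then tidy the spacing.
    sed -n 's/.*<\([0-9a-f][0-9a-f]\)>\(.*\)/\1\2/p' | tr -s ' ' | sed 's/ *$//'
}
```

For example, "echo 'Code: 00 00 <48> 89 41 08 48 89 08' | fault_bytes" prints
"48 89 41 08 48 89 08". In trace One the bytes at the marker are 48 89 41 08
(mov %rax,0x8(%rcx)) and in trace Two they are 48 89 51 08
(mov %rdx,0x8(%rcx)); RCX is 0 in both register dumps, which matches
CR2 = 0000000000000008. My reading, not confirmed by anything in this thread,
is that both paths are unlinking an entry from a list whose next pointer is
NULL. The kernel tree's scripts/decodecode can give a full disassembly.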
> 
> Br,
> Zhike Wang
> JDCloud, Product Development, IaaS
> ------------------------------------------------------------------------------------------------
> Mobile/+86 13466719566
> E- mail/[email protected]<mailto:[email protected]>
> Address/5F Building A,North-Star Century Center,8 Beichen West 
> Street,Chaoyang District Beijing
> Https://JDCloud.com<https://jdcloud.com/>
> ------------------------------------------------------------------------------------------------
> 



> _______________________________________________
> discuss mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss