Sorry, clicked send too fast...

I think this line causes the kernel panic (KP):
https://github.com/torvalds/linux/blob/v3.14/net/core/skbuff.c#L1039
But this is weird because, from what I have read on this mailing list, OVS
doesn't allow shared skbs.
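
To make the suspicion concrete: as far as I can tell, the check at that
line in v3.14 is `if (skb_shared(skb)) BUG();`, and skb_shared() is true
when skb->users is not 1. Below is a minimal userspace sketch of that idea,
not kernel code. Every toy_* name is hypothetical and made up for
illustration, the refusal returns -1 where the kernel would BUG(), and
toy_unshare() does a full copy, which is a simplification of what
skb_share_check() does in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct sk_buff: a holder count and a data area only. */
struct toy_skb {
    int users;            /* number of holders, in the spirit of skb->users */
    size_t len;           /* bytes currently in data */
    unsigned char *data;  /* heap-allocated payload */
};

/* Same idea as skb_shared(): anything other than one holder is "shared". */
static bool toy_skb_shared(const struct toy_skb *skb)
{
    return skb->users != 1;
}

/*
 * Grow the data area by `extra` zeroed bytes.
 * Returns 0 on success, -1 if the buffer is shared or allocation fails.
 * The kernel's pskb_expand_head() calls BUG() in the shared case instead.
 */
static int toy_expand(struct toy_skb *skb, size_t extra)
{
    unsigned char *p;

    assert(skb != NULL);
    if (toy_skb_shared(skb))
        return -1;
    if (extra == 0)
        return 0;

    p = realloc(skb->data, skb->len + extra);
    if (p == NULL)
        return -1;
    memset(p + skb->len, 0, extra);
    skb->data = p;
    skb->len += extra;
    return 0;
}

/*
 * Hand back a buffer that is safe to modify: the original if it has a
 * single holder, otherwise a private copy (dropping our reference on the
 * original). Returns NULL if allocation fails.
 */
static struct toy_skb *toy_unshare(struct toy_skb *skb)
{
    struct toy_skb *copy;

    assert(skb != NULL);
    if (!toy_skb_shared(skb))
        return skb;

    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return NULL;
    copy->data = malloc(skb->len ? skb->len : 1);
    if (copy->data == NULL) {
        free(copy);
        return NULL;
    }
    memcpy(copy->data, skb->data, skb->len);
    copy->len = skb->len;
    copy->users = 1;
    skb->users--;
    return copy;
}
```

In this model a caller that might hold a shared buffer calls toy_unshare()
before toy_expand(). If the trace below is what it looks like, the skb
reaching skb_checksum_help() in the upcall path still had more than one
holder.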

I tried turning off GRO: it made the receive side slower, but it eventually
crashes too. Any ideas?

Thanks.
-Simon


On Thu, Feb 12, 2015 at 8:32 PM, Xu (Simon) Chen <[email protected]> wrote:

> Hi folks,
>
>
> I can now consistently reproduce a kernel panic on my system. I am using
> OVS 2.3.0 on a 3.14.29 kernel, with a sender and a receiver (two VMs) on
> two identical hypervisors and a VXLAN tunnel connecting the two VMs. Iperf
> is used inside the VMs to generate traffic. The sender side has no
> problem, while the hypervisor with the receiving VM consistently crashes
> after a certain amount of time (or rather packets).
>
>
> The kernel panic seems to be related to the skb_shared check inside the
> pskb_expand_head function:
>
>  [ 7318.405112] ------------[ cut here ]------------
>
> [ 7318.409796] kernel BUG at net/core/skbuff.c:1041!
>
> [ 7318.414563] invalid opcode: 0000 [#1] SMP
>
> [ 7318.418868] Modules linked in: ip6table_filter ip6_tables xt_mac
> xt_tcpudp xt_state xt_physdev xt_set xt_multiport iptable_filter
> iptable_nat nf_nat_ipv4 nf_nat iptable_raw ip_tables x_tables
> ip_set_hash_ip ip_set nfnetlink vhost_net vhost macvtap macvlan tun veth
> openvswitch(O) gre vxlan libcrc32c bridge 8021q garp stp llc bonding
> joydev hid_generic usbhid hid deflate ctr twofish_generic
> twofish_avx_x86_64 nfsd twofish_x86_64_3way twofish_x86_64 twofish_common
> auth_rpcgss oid_registry nfs_acl camellia_generic
> camellia_aesni_avx_x86_64 camellia_x86_64 nfs lockd serpent_avx_x86_64
> fscache serpent_sse2_x86_64 xts serpent_generic sunrpc blowfish_generic
> blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common
> des_generic cbc cmac binfmt_misc xcbc rmd160 sha512_generic sha256_generic
> hmac crypto_null af_key xfrm_algo iTCO_wdt iTCO_vendor_support
> x86_pkg_temp_thermal coretemp kvm_intel kvm crc32_pclmul crc32c_intel
> ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper
> ablk_helper cryptd microcode evdev ehci_pci sb_edac ehci_hcd edac_core
> usbcore lpc_ich ioatdma i2c_i801 usb_common mfd_core tpm_tis wmi tpm
> acpi_cpufreq processor thermal_sys button nf_conntrack_ipv4 nf_defrag_ipv4
> nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack ipmi_devintf ipmi_si
> ipmi_msghandler loop tcp_scalable autofs4 ext4 crc16 jbd2 mbcache crc32c
> btrfs xor raid6_pq dm_mod mlx4_en(O) sg sd_mod crc_t10dif crct10dif_common
> igb isci i2c_algo_bit ahci libsas i2c_core libahci dca mlx4_core(O)
> megaraid_sas scsi_transport_sas ptp libata pps_core compat(O) scsi_mod
>
> [ 7318.568195] CPU: 14 PID: 54124 Comm: vhost-54120 Tainted: G           O
> 3.14.25-ts1 #1
>
> [ 7318.576227] Hardware name: Supermicro SYS-F617R2-R72+/X9DRFR, BIOS 3.0b
> 04/24/2014
>
> [ 7318.583944] task: ffff887f25dde240 ti: ffff883ef6a32000 task.ti:
> ffff883ef6a32000
>
> [ 7318.591562] RIP: 0010:[<ffffffff813eb634>]  [<ffffffff813eb634>]
> pskb_expand_head+0x234/0x270
>
> [ 7318.600295] RSP: 0018:ffff887f7f103978  EFLAGS: 00010202
>
> [ 7318.605770] RAX: 0000000000000002 RBX: ffff887f23417700 RCX:
> 0000000000000020
>
> [ 7318.613016] RDX: 00000000000002ee RSI: 0000000000000000 RDI:
> ffff887f23417700
>
> [ 7318.620278] RBP: ffff887f7f1039b8 R08: 000000005ff00000 R09:
> ffff887f2113e040
>
> [ 7318.627523] R10: 00000000ffffee43 R11: 0000000000000002 R12:
> 0000000000000000
>
> [ 7318.634805] R13: ffff887f23417700 R14: 000000000000000d R15:
> ffff887f23417700
>
> [ 7318.642032] FS:  0000000000000000(0000) GS:ffff887f7f100000(0000)
> knlGS:0000000000000000
>
> [ 7318.650238] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>
> [ 7318.656110] CR2: 00002ba78155e680 CR3: 0000007f1f8cd000 CR4:
> 00000000001427e0
>
> [ 7318.663378] Stack:
>
> [ 7318.665464]  ffff887f7f1039f8 ffffffff8142ae14 ffffffffa07b50f0
> ffff887f23417700
>
> [ 7318.673226]  ffff887f7f103b58 ffff887f7f103a70 000000000000000d
> ffff887f23417700
>
> [ 7318.681000]  ffff887f7f103a08 ffffffff813eb6fc ffff887f23417700
> 000001f823417700
>
> [ 7318.688790] Call Trace:
>
> [ 7318.691309]  <IRQ>
>
> [ 7318.693290]  [<ffffffff8142ae14>] ? nf_hook_slow+0x74/0x130
>
> [ 7318.699285]  [<ffffffffa07b50f0>] ? deliver_clone+0x60/0x60 [bridge]
>
> [ 7318.705710]  [<ffffffff813eb6fc>] __pskb_pull_tail+0x4c/0x330
>
> [ 7318.711571]  [<ffffffff813f8ca7>] skb_checksum_help+0x147/0x1a0
>
> [ 7318.717599]  [<ffffffffa07de8b0>] queue_userspace_packet+0x3f0/0x440
> [openvswitch]
>
> [ 7318.725289]  [<ffffffffa07dfcd5>] ovs_dp_upcall+0x65/0x70 [openvswitch]
>
> [ 7318.732037]  [<ffffffffa07dc7b6>] do_execute_actions+0x366/0xc00
> [openvswitch]
>
> [ 7318.739403]  [<ffffffff8142ae14>] ? nf_hook_slow+0x74/0x130
>
> [ 7318.745072]  [<ffffffff812a7c9a>] ? arch_fast_hash2+0xa/0x10
>
> [ 7318.750883]  [<ffffffffa07dc7ec>] do_execute_actions+0x39c/0xc00
> [openvswitch]
>
> [ 7318.758221]  [<ffffffffa07b570d>] ? br_forward+0x5d/0x70 [bridge]
>
> [ 7318.764419]  [<ffffffffa07dd0c6>] ovs_execute_actions+0x76/0x110
> [openvswitch]
>
> [ 7318.771773]  [<ffffffffa07dfd6f>]
> ovs_dp_process_packet_with_key+0x8f/0xf0 [openvswitch]
>
> [ 7318.779988]  [<ffffffffa07e0efa>] ? ovs_flow_extract+0x89a/0xab0
> [openvswitch]
>
> [ 7318.787355]  [<ffffffffa07dfe10>]
> ovs_dp_process_received_packet+0x40/0x60 [openvswitch]
>
> [ 7318.795535]  [<ffffffffa07e616a>] ovs_vport_receive+0x2a/0x30
> [openvswitch]
>
> [ 7318.802634]  [<ffffffffa07e7cf5>] netdev_frame_hook+0xc5/0x120
> [openvswitch]
>
> [ 7318.809773]  [<ffffffff813f9f42>] __netif_receive_skb_core+0x332/0x7f0
>
> [ 7318.816418]  [<ffffffffa07e7c30>] ? netdev_create+0x150/0x150
> [openvswitch]
>
> [ 7318.823475]  [<ffffffff813fa426>] __netif_receive_skb+0x26/0x70
>
> [ 7318.829472]  [<ffffffff813fa514>] process_backlog+0xa4/0x180
>
> [ 7318.835223]  [<ffffffff813fa979>] net_rx_action+0x139/0x220
>
> [ 7318.840894]  [<ffffffff81053218>] __do_softirq+0xf8/0x280
>
> [ 7318.846391]  [<ffffffff81504b5c>] do_softirq_own_stack+0x1c/0x30
>
> [ 7318.852517]  <EOI>
>
> [ 7318.854504]  [<ffffffff81053425>] do_softirq+0x45/0x50
>
> [ 7318.860084]  [<ffffffff813f9759>] netif_rx_ni+0x39/0x70
>
> [ 7318.865416]  [<ffffffffa07f1ab3>] tun_get_user+0x413/0x840 [tun]
>
> [ 7318.871506]  [<ffffffffa07f1f3a>] tun_sendmsg+0x5a/0x80 [tun]
>
> [ 7318.877357]  [<ffffffffa0819e32>] handle_tx+0x382/0x400 [vhost_net]
>
> [ 7318.883712]  [<ffffffffa0819ee5>] handle_tx_kick+0x15/0x20 [vhost_net]
>
> [ 7318.890333]  [<ffffffffa080d4f6>] vhost_worker+0xf6/0x190 [vhost]
>
> [ 7318.896528]  [<ffffffffa080d400>] ? vhost_log_access_ok+0x30/0x30
> [vhost]
>
> [ 7318.903454]  [<ffffffff81070c69>] kthread+0xc9/0xe0
>
> [ 7318.908412]  [<ffffffff81070ba0>] ? flush_kthread_worker+0x80/0x80
>
> [ 7318.914674]  [<ffffffff8150342c>] ret_from_fork+0x7c/0xb0
>
> [ 7318.920163]  [<ffffffff81070ba0>] ? flush_kthread_worker+0x80/0x80
>
> [ 7318.926426] Code: 55 c0 e8 f0 38 d4 ff 48 8b 55 c0 84 c0 0f 85 0b ff ff
> ff e9 02 ff ff ff 0f 1f 80 00 00 00 00 41 81 cf 00 20 00 00 e9 1f fe ff ff
> <0f> 0b 0f 0b 44 89 fe 4c 89 ef e8 ad e8 ff ff 85 c0 74 12 48 89
>
> [ 7318.950040] RIP  [<ffffffff813eb634>] pskb_expand_head+0x234/0x270
>
> [ 7318.956385]  RSP <ffff887f7f103978>
>
> [ 7318.959988] ---[ end trace 221c17dcc65b8372 ]---
>
> [ 7319.076935] Kernel panic - not syncing: Fatal exception in interrupt
>
> [ 7319.086993] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation
> range: 0xffffffff80000000-0xffffffff9fffffff)
>
> [ 7319.204560] Rebooting in 10 seconds..
>
>
>
>
>
_______________________________________________
discuss mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/discuss