[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report[0]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

Please follow the instructions on the wiki page[0]. The first step is to email the appropriate mailing list. If no response is received, then a bug may be opened on bugzilla.kernel.org.

Once this bug is reported upstream, please add the tag: 'kernel-bug-reported-upstream'.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

** Also affects: linux (Ubuntu Bionic)
   Importance: High
   Status: Confirmed

** Changed in: linux (Ubuntu Bionic)
   Status: Confirmed => Triaged

** Changed in: linux (Ubuntu Artful)
   Status: Confirmed => Triaged

** Changed in: linux (Ubuntu Xenial)
   Status: Confirmed => Triaged

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1755268

Title:
  Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1755268/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
Additionally, I've tested the latest HWE kernel (4.13.0-37-generic) with the built-in driver, and the same problem occurs:

[ 1011.070739] kvm [16361]: vcpu0, guest rIP: 0x810644d8 disabled perfctr wrmsr: 0xc2 data 0x
[ 1011.528347] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192
[ 1011.927642] general protection fault: [#1] SMP PTI
[ 1012.185439] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192

Driver version:

# ethtool -i eno1
driver: mlx4_en
version: 4.0-0
firmware-version: 2.42.5004
expansion-rom-version:
bus-info: :11:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
Hi,

I can't build the new Mellanox drivers for the new kernel, so I used the default one:

# ethtool -i eno1
driver: mlx4_en
version: 4.0-0
firmware-version: 2.42.5004
expansion-rom-version:
bus-info: :11:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

After creating a virtual machine, the kernel got stuck with these messages:

[ 2170.511433] kvm [28464]: vcpu0, guest rIP: 0x810644d8 disabled perfctr wrmsr: 0xc2 data 0x
[ 2170.963166] [ cut here ]

** Tags added: kernel-bug-exists-upstream

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

** Changed in: linux (Ubuntu Xenial)
   Status: Incomplete => Confirmed

** Changed in: linux (Ubuntu Artful)
   Status: Incomplete => Confirmed
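Several messages in this thread identify the driver build by reading "ethtool -i" output (for example, version 4.0-0 for the in-kernel mlx4_en driver). The following is a minimal sketch of how a single field could be extracted from that output for comparison between nodes. The helper name ethtool_field is hypothetical and is not part of ethtool; only the "key: value" layout is taken from the output quoted in the thread.

```shell
#!/bin/sh
# Sketch: pull one field out of "ethtool -i" style output read from stdin.
# ethtool_field is a hypothetical helper, not an ethtool feature.

ethtool_field() {
    # $1 is the key to look up, e.g. "version" or "firmware-version".
    # Lines look like "key: value". The key is compared exactly, so
    # asking for "version" does not also match "firmware-version".
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}
```

A possible invocation, assuming the interface name eno1 from the report: ethtool -i eno1 | ethtool_field version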
[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
Hi Joseph, thanks for your answer.

Currently I have two different hardware configurations:
- HPE ProLiant m710p Server Cartridge (does not have this problem)
- HPE ProLiant m710x Server Cartridge (has this problem)

> Did this issue start happening after an update/upgrade?
> Was there a prior kernel version where you were not having this particular
> problem?

Well, I use a debootstrap script to install all the needed software automatically and build an image with the base system. I then use this image to boot my nodes via PXE, so on each boot I have a system that was installed from scratch.

I've tested the following kernels:
- Ubuntu 16.04 with stock kernel: 4.4.0-116-generic
- Ubuntu 16.04 with hwe kernel: 4.13.0-36-generic
- Ubuntu 16.04 with pve kernel: 4.13.13-6-pve
- Debian 9 with pve kernel: 4.13.13-6-pve
- Debian 9 with stock kernel: 4.9.0-6-amd64

All of them have this problem, although the stock kernels may only crash after some time. (The only case where I did not see this error was Debian with 4.9.0-6-amd64, but I presume it exists there too, because I have not tested it properly.)

Another thing: if I run these steps AFTER the system has booted up:

rmmod mlx4_en mlx4_ib mlx4_core
modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1
systemctl restart networking

everything starts working fine.

> Please test the latest v4.16 kernel[0].

OK, I'll do this.
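The post-boot reload sequence reported in this message could be wrapped in a small script so that it can be reviewed before it touches the network stack. This is only a sketch: the module names, module parameters, and the networking restart are copied from the message, while the DRY_RUN variable and the run and reload_mlx4 function names are hypothetical additions for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: reload the mlx4 modules with SR-IOV parameters after boot.
# Commands are taken from the report; DRY_RUN, run and reload_mlx4
# are illustrative assumptions, not part of any Mellanox tooling.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run() {
    # Always show the command; execute it only when DRY_RUN=0.
    printf '%s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reload_mlx4() {
    # Unload the Ethernet and InfiniBand modules together with the core,
    # then load the core again with the SR-IOV options from the report.
    run rmmod mlx4_en mlx4_ib mlx4_core || return 1
    run modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1 || return 1
    run systemctl restart networking
}
```

Calling reload_mlx4 with DRY_RUN left at its default prints the three commands in order; setting DRY_RUN=0 as root would execute them, and the sequence stops at the first command that fails. Restarting networking will interrupt connectivity, so this is best run from a console rather than over the affected interface.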
[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.16 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc6

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Also affects: linux (Ubuntu Artful)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Artful)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Artful)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Artful)
   Status: Confirmed => Incomplete

** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete

** Tags added: kernel-da-key

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Xenial)
   Status: New => Incomplete

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High
[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
** Attachment added: "lspci -v output"
   https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1755268/+attachment/5081271/+files/lspci-v.txt
[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
I have just tested this, and the problem also occurs on the stock kernel.

# uname -a
Linux m5c43 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

The bug seems to be tied to specific hardware; I'll provide more information about it.

I use an HP Moonshot m710x cartridge:

# lspci
00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 0a)
00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) (rev 0a)
00:02.0 VGA compatible controller: Intel Corporation Device 193a (rev 09)
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
00:17.0 SATA controller: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] (rev 31)
00:1b.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Root Port #17 (rev f1)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #1 (rev f1)
00:1c.3 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #4 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1)
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #9 (rev f1)
00:1d.5 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #14 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support (rev 06)
01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH (rev 01)
01:00.2 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Management Processor Support and Messaging (rev 06)
01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard Virtual USB Controller (rev 03)
0e:00.0 Non-Volatile memory controller: Intel Corporation Device f1a5 (rev 03)
11:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
11:00.1 Ethernet controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

# ethtool -i eno1
driver: mlx4_en
version: 4.3-1.0.1
firmware-version: 2.40.5540
expansion-rom-version:
bus-info: :11:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

# mst status
MST modules:
    MST PCI module loaded
    MST PCI configuration module loaded
MST devices:
/dev/mst/mt4103_pci_cr0 - PCI direct access. domain:bus:dev.fn=:11:00.0 bar=0x7f10 size=0x10 Chip revision is: 00
/dev/mst/mt4103_pciconf0 - PCI configuration cycles access. domain:bus:dev.fn=:11:00.0 addr.reg=88 data.reg=92 Chip revision is: 00

# ibv_devinfo
hca_id: mlx4_1
    transport: InfiniBand (0)
    fw_ver: 2.40.5540
    node_guid: 0014:0500:d300:bc52
    sys_image_guid: f403:4303:00fd:102d
    vendor_id: 0x02c9
    vendor_part_id: 4100
    hw_ver: 0x0
    board_id: HP_1690110017
    phys_port_cnt: 2
    Device ports:
        port: 1
            state: PORT_DOWN (1)
            max_mtu: 4096 (5)
            active_mtu: 1024 (3)
            sm_lid: 0
            port_lid: 0
            port_lmc: 0x00
            link_layer: Ethernet
        port: 2
            state: PORT_DOWN (1)
            max_mtu: 4096 (5)
            active_mtu: 1024 (3)
            sm_lid: 0
            port_lid: 0
            port_lmc: 0x00
            link_layer: Ethernet
hca_id: mlx4_0
    transport: InfiniBand (0)
    fw_ver: 2.40.5540
    node_guid:
[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
** Attachment added: "Dmesg output (before kernel panic)"
   https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1755268/+attachment/5081272/+files/dmesg.txt

** Package changed: linux-hwe (Ubuntu) => linux (Ubuntu)
[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)
I tried loading the modules directly instead of using the /etc/init.d/openibd script:

rmmod mlx4_en mlx4_core
modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1

The result is the same:

[ 193.469331] mlx4_core :11:00.0: bond for multifunction failed
[ 193.770126] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192
[ 194.170171] mlx4_en: eno1d1: Fail to bond device
[ 194.170178] nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables xt_set ip_set_list_set ip_set_hash_net veth beegfs(OE) dummy nf_conntrack_netlink xt_nat xt_tcpudp xt_recent ip_set nfnetlink ip_vs rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace sunrpc fscache xt_comment xt_mark netconsole mlx4_ib(OE) mlx4_en(OE) mlx4_core(OE) ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack libcrc32c br_netfilter 8021q garp mrp ib_core(OE) mlx_compat(OE) bridge stp llc bonding ipmi_ssif intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 hpilo crypto_simd
[ 194.170482] glue_helper cryptd intel_cstate mei_me ipmi_si ipmi_devintf ipmi_msghandler intel_rapl_perf shpchp mei acpi_power_meter mac_hid ie31200_edac knem(OE) autofs4 overlay nbd ptp pps_core i915 mgag200 video ttm i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm nvme ahci nvme_core libahci devlink [last unloaded: mlx4_core]
[ 194.170550] CPU: 0 PID: 7 Comm: ksoftirqd/0 Tainted: GW OE 4.13.0-36-generic #40~16.04.1-Ubuntu
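The kernel messages quoted in this thread share a few recurring signatures: the slab cache mismatch, the general protection fault, and the mlx4 bonding failures. When comparing nodes or kernels, it may help to count these in a saved log. The following is a rough sketch under that assumption; the function name summarize_mlx4_log and the output labels are hypothetical, and the patterns are simply the message fragments that appear in the reports.

```shell
#!/bin/sh
# Sketch: count the failure signatures seen in this bug in a kernel log
# read from stdin. summarize_mlx4_log and its labels are hypothetical.

summarize_mlx4_log() {
    awk '
        /cache_from_obj: Wrong slab cache/ { slab++ }
        /general protection fault/         { gpf++ }
        /bond for multifunction failed/    { bond++ }
        /Fail to bond device/              { bond++ }
        END {
            # Counters that never matched print as 0.
            printf "slab_cache_mismatch=%d\n", slab
            printf "general_protection_fault=%d\n", gpf
            printf "bond_failure=%d\n", bond
        }
    '
}
```

A possible invocation on a live system is: dmesg | summarize_mlx4_log. The attached dmesg.txt could be fed in the same way through a shell redirect. Non-zero counts only indicate that the same messages are present, not that the root cause is identical.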