[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-20 Thread Joseph Salisbury
This issue appears to be an upstream bug, since you tested the latest
upstream kernel. Would it be possible for you to open an upstream bug
report[0]? That will allow the upstream developers to examine the issue,
and may provide a quicker resolution to the bug.

Please follow the instructions on the wiki page[0]. The first step is to
email the appropriate mailing list. If no response is received, then a
bug may be opened on bugzilla.kernel.org.

Once this bug is reported upstream, please add the tag:
'kernel-bug-reported-upstream'.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

** Also affects: linux (Ubuntu Bionic)
   Importance: High
   Status: Confirmed

** Changed in: linux (Ubuntu Bionic)
   Status: Confirmed => Triaged

** Changed in: linux (Ubuntu Artful)
   Status: Confirmed => Triaged

** Changed in: linux (Ubuntu Xenial)
   Status: Confirmed => Triaged

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1755268

Title:
  Kernel panic when using KVM and Mellanox OFED driver (bonding and
  sriov enabled)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1755268/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-20 Thread kvaps
Additionally, I've tested the latest HWE kernel (4.13.0-37-generic) with
the built-in driver; the same problem occurs:

[ 1011.070739] kvm [16361]: vcpu0, guest rIP: 0x810644d8 disabled perfctr wrmsr: 0xc2 data 0x
[ 1011.528347] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192
[ 1011.927642] general protection fault:  [#1] SMP PTI
[ 1012.185439] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192

driver version:

# ethtool -i eno1
driver: mlx4_en
version: 4.0-0
firmware-version: 2.42.5004
expansion-rom-version: 
bus-info: :11:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
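
For comparing the in-tree driver with the OFED build, the two `ethtool -i`
outputs posted in this thread can be diffed field by field. The following
is only a minimal sketch: the helper names `parse_ethtool_info` and
`diff_info` are hypothetical, and the sample text is copied from the
messages in this thread (truncated values are left as posted).

```python
def parse_ethtool_info(text):
    """Parse `ethtool -i` output into a dict of field name -> value."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as ":11:00.0" survive.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def diff_info(a, b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    keys = set(a) | set(b)
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# In-tree driver on the HWE kernel, as posted above.
in_tree = parse_ethtool_info("""driver: mlx4_en
version: 4.0-0
firmware-version: 2.42.5004
expansion-rom-version:
bus-info: :11:00.0
""")

# OFED driver, as posted in the earlier message of 2018-03-16.
ofed = parse_ethtool_info("""driver: mlx4_en
version: 4.3-1.0.1
firmware-version: 2.40.5540
expansion-rom-version:
bus-info: :11:00.0
""")

changed = diff_info(in_tree, ofed)
for field in sorted(changed):
    print(field, changed[field])
```

Only the driver version and the firmware version differ between the two
posted outputs; the failure is reported with both.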

[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-20 Thread kvaps
Hi, I can't build the new Mellanox drivers for the new kernel, so I used
the default one:

# ethtool -i eno1
driver: mlx4_en
version: 4.0-0
firmware-version: 2.42.5004
expansion-rom-version: 
bus-info: :11:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

After I created a virtual machine, the kernel hung with these messages:
[ 2170.511433] kvm [28464]: vcpu0, guest rIP: 0x810644d8 disabled perfctr wrmsr: 0xc2 data 0x
[ 2170.963166] [ cut here ]

** Tags added: kernel-bug-exists-upstream

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

** Changed in: linux (Ubuntu Xenial)
   Status: Incomplete => Confirmed

** Changed in: linux (Ubuntu Artful)
   Status: Incomplete => Confirmed

[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-19 Thread kvaps
Hi Joseph, thanks for your answer.

Currently I have two different hardware configurations:

- HPE ProLiant m710p Server Cartridge (does not have this problem)
- HPE ProLiant m710x Server Cartridge (has this problem)

> Did this issue start happening after an update/upgrade?
> Was there a prior kernel version where you were not having this particular 
> problem?

Well, I use a debootstrap script to install all the needed software
automatically and build an image with the base system.
After that I use this image to boot my nodes via PXE, so on each boot I have
a system that was installed from scratch.
I've tested the following kernels:
- Ubuntu 16.04 with stock kernel: 4.4.0-116-generic
- Ubuntu 16.04 with hwe kernel: 4.13.0-36-generic
- Ubuntu 16.04 with pve kernel: 4.13.13-6-pve
- Debian 9 with pve kernel: 4.13.13-6-pve
- Debian 9 with stock kernel: 4.9.0-6-amd64

All of them have this problem, although the stock kernels may crash only
after some time.
(The only case where I did not see this error was Debian with 4.9.0-6-amd64,
but I presume it exists there too, because I have not tested it properly.)

Another thing: if I run these steps AFTER the system has booted:

rmmod mlx4_en mlx4_ib mlx4_core
modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1
systemctl restart networking

Everything starts working fine.
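
The reload sequence above can be wrapped in a small script so that it is
applied the same way on every node. This is only a sketch of the reported
workaround, not a fix: the function name `reload_mlx4` and the dry-run
default are assumptions for illustration, while the commands themselves are
copied from the message. Running it for real requires root.

```python
import subprocess

# Commands copied verbatim from the message above.
RELOAD_STEPS = [
    ["rmmod", "mlx4_en", "mlx4_ib", "mlx4_core"],
    ["modprobe", "mlx4_core", "num_vfs=1", "port_type_array=2,2", "probe_vf=1"],
    ["systemctl", "restart", "networking"],
]


def reload_mlx4(dry_run=True):
    """Run the reload steps in order and return the command lines handled.

    With dry_run=True (the default) nothing is executed. With dry_run=False
    each step runs via subprocess and the first failure raises
    CalledProcessError, so later steps are skipped.
    """
    handled = []
    for cmd in RELOAD_STEPS:
        if not dry_run:
            subprocess.run(cmd, check=True)
        handled.append(" ".join(cmd))
    return handled


planned = reload_mlx4(dry_run=True)
for line in planned:
    print(line)
```

Calling `reload_mlx4(dry_run=False)` executes the steps; the dry run only
lists them, which is useful for checking the parameters before touching a
production node.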

> Please test the latest v4.16 kernel[0].

Ok, I'll do this

[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-19 Thread Joseph Salisbury
Did this issue start happening after an update/upgrade?  Was there a
prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer
to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest
v4.16 kernel[0].

If this bug is fixed in the mainline kernel, please add the following
tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag:
'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as
"Confirmed".


Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc6

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Also affects: linux (Ubuntu Artful)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Artful)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Artful)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Artful)
   Status: Confirmed => Incomplete

** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete

** Tags added: kernel-da-key

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Xenial)
   Status: New => Incomplete

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-16 Thread kvaps
** Attachment added: "lspci -v output"
   
https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1755268/+attachment/5081271/+files/lspci-v.txt

[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-16 Thread kvaps
I have just tested it: the problem also occurs on the stock kernel.

# uname -a
Linux m5c43 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

The bug seems to be connected only with specific hardware; I'll provide more
information about it.
I use an HP Moonshot m710x cartridge:

# lspci
00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 0a)
00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) (rev 0a)
00:02.0 VGA compatible controller: Intel Corporation Device 193a (rev 09)
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
00:17.0 SATA controller: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] (rev 31)
00:1b.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Root Port #17 (rev f1)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #1 (rev f1)
00:1c.3 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #4 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1)
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #9 (rev f1)
00:1d.5 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #14 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support (rev 06)
01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH (rev 01)
01:00.2 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Management Processor Support and Messaging (rev 06)
01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard Virtual USB Controller (rev 03)
0e:00.0 Non-Volatile memory controller: Intel Corporation Device f1a5 (rev 03)
11:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
11:00.1 Ethernet controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

# ethtool -i eno1
driver: mlx4_en
version: 4.3-1.0.1
firmware-version: 2.40.5540
expansion-rom-version: 
bus-info: :11:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

# mst status
MST modules:

MST PCI module loaded
MST PCI configuration module loaded

MST devices:

/dev/mst/mt4103_pci_cr0  - PCI direct access.
   domain:bus:dev.fn=:11:00.0 bar=0x7f10 size=0x10
   Chip revision is: 00
/dev/mst/mt4103_pciconf0 - PCI configuration cycles access.
   domain:bus:dev.fn=:11:00.0 addr.reg=88 data.reg=92
   Chip revision is: 00



# ibv_devinfo 
hca_id: mlx4_1
        transport:      InfiniBand (0)
        fw_ver:         2.40.5540
        node_guid:      0014:0500:d300:bc52
        sys_image_guid: f403:4303:00fd:102d
        vendor_id:      0x02c9
        vendor_part_id: 4100
        hw_ver:         0x0
        board_id:       HP_1690110017
        phys_port_cnt:  2
        Device ports:
                port:   1
                        state:      PORT_DOWN (1)
                        max_mtu:    4096 (5)
                        active_mtu: 1024 (3)
                        sm_lid:     0
                        port_lid:   0
                        port_lmc:   0x00
                        link_layer: Ethernet

                port:   2
                        state:      PORT_DOWN (1)
                        max_mtu:    4096 (5)
                        active_mtu: 1024 (3)
                        sm_lid:     0
                        port_lid:   0
                        port_lmc:   0x00
                        link_layer: Ethernet

hca_id: mlx4_0
        transport:      InfiniBand (0)
        fw_ver:         2.40.5540
        node_guid:

[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-16 Thread kvaps
** Attachment added: "Dmesg output (before kernel panic)"
   
https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1755268/+attachment/5081272/+files/dmesg.txt

** Package changed: linux-hwe (Ubuntu) => linux (Ubuntu)

[Bug 1755268] Re: Kernel panic when using KVM and Mellanox OFED driver (bonding and sriov enabled)

2018-03-12 Thread kvaps
I tried loading the modules directly instead of using the
/etc/init.d/openibd script:

rmmod mlx4_en mlx4_core
modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1

The result is the same:

[  193.469331] mlx4_core :11:00.0: bond for multifunction failed
[  193.770126] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192
[  194.170171] mlx4_en: eno1d1: Fail to bond device
[  194.170178]  nf_reject_ipv4 ebtable_filter ebtables ip6table_filter 
ip6_tables xt_set ip_set_list_set ip_set_hash_net veth beegfs(OE) dummy 
nf_conntrack_netlink xt_nat xt_tcpudp xt_recent ip_set nfnetlink ip_vs 
rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace sunrpc fscache xt_comment 
xt_mark netconsole mlx4_ib(OE) mlx4_en(OE) mlx4_core(OE) ipt_MASQUERADE 
nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 
nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack 
x_tables nf_nat nf_conntrack libcrc32c br_netfilter 8021q garp mrp ib_core(OE) 
mlx_compat(OE) bridge stp llc bonding ipmi_ssif intel_rapl x86_pkg_temp_thermal 
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul 
ghash_clmulni_intel pcbc aesni_intel aes_x86_64 hpilo crypto_simd
[  194.170482]  glue_helper cryptd intel_cstate mei_me ipmi_si ipmi_devintf 
ipmi_msghandler intel_rapl_perf shpchp mei acpi_power_meter mac_hid 
ie31200_edac knem(OE) autofs4 overlay nbd ptp pps_core i915 mgag200 video ttm 
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm 
nvme ahci nvme_core libahci devlink [last unloaded: mlx4_core]
[  194.170550] CPU: 0 PID: 7 Comm: ksoftirqd/0 Tainted: GW  OE   4.13.0-36-generic #40~16.04.1-Ubuntu
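
When checking several nodes or kernels for the same symptom, the
"Wrong slab cache" message can be pulled out of a saved dmesg or netconsole
log automatically. The following is a minimal sketch under stated
assumptions: the function name `find_slab_mismatches` is hypothetical, the
pattern is written against the message text as it appears in this thread,
and lines wrapped by a mail client are rejoined before matching.

```python
import re

# Pattern for the slab-mismatch line as quoted in this thread.
SLAB_MISMATCH = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+cache_from_obj: Wrong slab cache\.\s+"
    r"(?P<expected>\S+) but object is\s+from (?P<actual>\S+)"
)


def find_slab_mismatches(log_text):
    """Return (timestamp, expected_cache, actual_cache) tuples from a log."""
    # Rejoin wrapped lines: anything not starting with "[" is treated as a
    # continuation of the previous log line.
    joined = []
    for line in log_text.splitlines():
        if line.startswith("[") or not joined:
            joined.append(line.rstrip())
        else:
            joined[-1] += " " + line.strip()

    results = []
    for line in joined:
        match = SLAB_MISMATCH.search(line)
        if match:
            results.append(
                (float(match.group("ts")),
                 match.group("expected"),
                 match.group("actual"))
            )
    return results


# Sample copied from the log excerpt above, including the mail line wrap.
sample = """[  193.469331] mlx4_core :11:00.0: bond for multifunction failed
[  193.770126] cache_from_obj: Wrong slab cache. kmalloc-256 but object is 
from kmalloc-192
[  194.170171] mlx4_en: eno1d1: Fail to bond device
"""

mismatches = find_slab_mismatches(sample)
for ts, expected, actual in mismatches:
    print(f"{ts}: freed into {expected}, allocated from {actual}")
```

Feeding it the saved logs from each tested kernel gives a quick yes/no on
whether that boot hit the slab mismatch before the panic.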
