[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2019-07-24 Thread Brad Figg
** Tags added: cscc

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1776389

Title:
  [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  == Comment: #0 - ABDUL HALEEM <> - 2018-02-16 08:26:15 ==
  Problem:
  
  Injecting error multiple times causes kernel crash.

  echo 0x0:1:4:0x60800:0xfff8 > /sys/kernel/debug/powerpc/PCI/err_injct
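
Since the report says the crash needs several injections, the step above can be wrapped in a small loop. This is only a sketch under assumptions: the function name `inject_eeh_errors`, the `ERR_INJECT_FILE` variable, and the optional delay are hypothetical and not part of the report, and the default path is copied exactly as it is printed above, so it may need adjusting to the real per-PHB debugfs file on the test machine. On an affected kernel this is expected to crash the system, so it belongs on a test box only.

```shell
#!/bin/sh
# Sketch only: repeat the EEH error injection from the report.
# ERR_INJECT_FILE (hypothetical name) defaults to the debugfs path exactly
# as printed in the report; point it at the real file on the test machine.
ERR_INJECT_FILE="${ERR_INJECT_FILE:-/sys/kernel/debug/powerpc/PCI/err_injct}"

# inject_eeh_errors COUNT [DELAY_SECONDS]
# Writes the injection string COUNT times, pausing DELAY_SECONDS after each.
inject_eeh_errors() {
    count="$1"
    delay="${2:-0}"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "0x0:1:4:0x60800:0xfff8" > "$ERR_INJECT_FILE" || return 1
        echo "injection $i of $count written"
        sleep "$delay"
        i=$((i + 1))
    done
}
```

A call such as `inject_eeh_errors 6 60` would match the "failed 6 times" line in the log below; the 60 second pause is a guess at how long recovery needs between injections, not a value taken from the report.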

  EEH: PHB#0 failure detected, location: N/A
  EEH: PHB#0-PE#0 has failed 6 times in the
  last hour and has been permanently disabled.
  EEH: Unable to recover from failure from PHB#0-PE#0.
  Please try reseating or replacing it
  ixgbe :01:00.1: Adapter removed
  kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: rpcsec_gss_krb5 nfsv4 nfs fscache joydev input_leds mac_hid idt_89hpesx ofpart ipmi_powernv cmdlinepart ipmi_devintf ipmi_msghandler at24 powernv_flash mtd opal_prd ibmpowernv uio_pdrv_genirq vmx_crypto uio sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas qla2xxx ast hid_generic ttm drm_kms_helper ixgbe syscopyarea usbhid igb sysfillrect sysimgblt nvme_fc fb_sys_fops hid nvme_fabrics crct10dif_vpmsum crc32c_vpmsum drm i40e scsi_transport_fc aacraid i2c_algo_bit mdio
  CPU: 28 PID: 972 Comm: eehd Not tainted 4.15.0-10-generic #11-Ubuntu
  NIP:  c077f080 LR: c077f070 CTR: c00aac30
  REGS: c00ff1deb5a0 TRAP: 0700   Not tainted  (4.15.0-10-generic)
  MSR:  90029033   CR: 24002822  XER: 2004
  CFAR: c018bddc SOFTE: 1
  GPR00: c077f070 c00ff1deb820 c16ea600 c00fbb5fac00
  GPR04: 02c5   
  GPR08: c00fbb5fac00 0001 c00fec617a00 c00fdfd86488
  GPR12: 0040 c7a33400 c0138be8 c00ff90ec1c0
  GPR16:    
  GPR20:    c0f48d10
  GPR24: c0f48ce8 c000200e4fcf4000 c00fc6900b18 c000200e4fcf4000
  GPR28: c000200e4fcf4288 c00810624480  c00fbb633ea0
  NIP [c077f080] free_msi_irqs+0xa0/0x260
  LR [c077f070] free_msi_irqs+0x90/0x260
  Call Trace:
  [c00ff1deb820] [c077f070] free_msi_irqs+0x90/0x260 (unreliable)
  [c00ff1deb880] [c077fa68] pci_disable_msix+0x128/0x170
  [c00ff1deb8c0] [c0081060b5c8] ixgbe_reset_interrupt_capability+0x90/0xd0 [ixgbe]
  [c00ff1deb8f0] [c008105d52f4] ixgbe_remove+0xec/0x240 [ixgbe]
  [c00ff1deb990] [c07670ec] pci_device_remove+0x6c/0x110
  [c00ff1deb9d0] [c085d194] device_release_driver_internal+0x224/0x310
  [c00ff1deba20] [c075b398] pci_stop_bus_device+0x98/0xe0
  [c00ff1deba60] [c075b588] pci_stop_and_remove_bus_device+0x28/0x40
  [c00ff1deba90] [c005e1d0] pci_hp_remove_devices+0x90/0x130
  [c00ff1debb20] [c005e184] pci_hp_remove_devices+0x44/0x130
  [c00ff1debbb0] [c003ec04] eeh_handle_normal_event+0x134/0x580
  [c00ff1debc60] [c003f160] eeh_handle_event+0x30/0x338
  [c00ff1debd10] [c003f830] eeh_event_handler+0x140/0x200
  [c00ff1debdc0] [c0138d88] kthread+0x1a8/0x1b0
  [c00ff1debe30] [c000b528] ret_from_kernel_thread+0x5c/0xb4
  Instruction dump:
  419effe0 3bc0 480c 6042 807f0010 7c7e1a14 78630020 4ba0cd3d
  6000 e9430158 312a 7d295110 <0b09> 813f0014 395e0001 7d5e07b4
  ---[ end trace 23c446a470e60864 ]---
  ixgbe :01:00.0: Adapter removed

  Sending IPI to other CPUs
  OPAL: Switch to big-endian OS
  OPAL: Switch to little-endian OS
  PHB#[0:0]: eeh_freeze_clear on fenced PHB

   
  ---uname output---
  Linux ltciofvtr-bostonlc1 4.15.0-10-generic #11-Ubuntu SMP Tue Feb 13 18:21:52 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
   
  Machine Type = Boston-LC 

  :00:00.0 PCI bridge [0604]: IBM Device [1014:04c1]
  :01:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
  :01:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)

  # ethtool  -

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-08-23 Thread Andrew Cloke
** Changed in: ubuntu-power-systems
   Status: Fix Committed => Fix Released

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-08-23 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.15.0-33.36

---
linux (4.15.0-33.36) bionic; urgency=medium

  * linux: 4.15.0-33.36 -proposed tracker (LP: #1787149)

  * RTNL assertion failure on ipvlan (LP: #1776927)
    - ipvlan: drop ipv6 dependency
    - ipvlan: use per device spinlock to protect addrs list updates
    - SAUCE: fix warning from "ipvlan: drop ipv6 dependency"

  * ubuntu_bpf_jit test failed on Bionic s390x systems (LP: #1753941)
    - test_bpf: flag tests that cannot be jited on s390

  * HDMI/DP audio can't work on the laptop of Dell Latitude 5495 (LP: #1782689)
    - drm/nouveau: fix nouveau_dsm_get_client_id()'s return type
    - drm/radeon: fix radeon_atpx_get_client_id()'s return type
    - drm/amdgpu: fix amdgpu_atpx_get_client_id()'s return type
    - platform/x86: apple-gmux: fix gmux_get_client_id()'s return type
    - ALSA: hda: use PCI_BASE_CLASS_DISPLAY to replace PCI_CLASS_DISPLAY_VGA
    - vga_switcheroo: set audio client id according to bound GPU id

  * locking sockets broken due to missing AppArmor socket mediation patches
    (LP: #1780227)
    - UBUNTU SAUCE: apparmor: fix apparmor mediating locking non-fs, unix
      sockets

  * Update2 for ocxl driver (LP: #1781436)
    - ocxl: Fix page fault handler in case of fault on dying process

  * netns: unable to follow an interface that moves to another netns
    (LP: #1774225)
    - net: core: Expose number of link up/down transitions
    - dev: always advertise the new nsid when the netns iface changes
    - dev: advertise the new ifindex when the netns iface changes

  * [Bionic] Disk IO hangs when using BFQ as io scheduler (LP: #1780066)
    - block, bfq: fix occurrences of request finish method's old name
    - block, bfq: remove batches of confusing ifdefs
    - block, bfq: add requeue-request hook

  * HP ProBook 455 G5 needs mute-led-gpio fixup (LP: #1781763)
    - ALSA: hda: add mute led support for HP ProBook 455 G5

  * [Bionic] bug fixes to improve stability of the ThunderX2 i2c driver
    (LP: #1781476)
    - i2c: xlp9xx: Fix issue seen when updating receive length
    - i2c: xlp9xx: Make sure the transfer size is not more than
      I2C_SMBUS_BLOCK_SIZE

  * x86/kvm: fix LAPIC timer drift when guest uses periodic mode (LP: #1778486)
    - x86/kvm: fix LAPIC timer drift when guest uses periodic mode

  * Please include ax88179_178a and r8152 modules in d-i udeb (LP: #1771823)
    - [Config:] d-i: Add ax88179_178a and r8152 to nic-modules

  * Nvidia fails after switching its mode (LP: #1778658)
    - PCI: Restore config space on runtime resume despite being unbound

  * Kernel error "task zfs:pid blocked for more than 120 seconds" (LP: #1781364)
    - SAUCE: (noup) zfs to 0.7.5-1ubuntu16.3

  * CVE-2018-12232
    - PATCH 1/1] socket: close race condition between sock_close() and
      sockfs_setattr()

  * CVE-2018-10323
    - xfs: set format back to extents if xfs_bmap_extents_to_btree

  * change front mic location for more lenovo m7/8/9xx machines (LP: #1781316)
    - ALSA: hda/realtek - Fix the problem of two front mics on more machines
    - ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION

  * Cephfs + fscache: unable to handle kernel NULL pointer dereference at
    IP: jbd2__journal_start+0x22/0x1f0 (LP: #1783246)
    - ceph: track read contexts in ceph_file_info

  * Touchpad of ThinkPad P52 failed to work with message "lost sync at byte"
    (LP: #1779802)
    - Input: elantech - fix V4 report decoding for module with middle key
    - Input: elantech - enable middle button of touchpads on ThinkPad P52

  * xhci_hcd :00:14.0: Root hub is not suspended (LP: #1779823)
    - usb: xhci: dbc: Fix lockdep warning
    - usb: xhci: dbc: Don't decrement runtime PM counter if DBC is not started

  * CVE-2018-13406
    - video: uvesafb: Fix integer overflow in allocation

  * CVE-2018-10840
    - ext4: correctly handle a zero-length xattr with a non-zero e_value_offs

  * CVE-2018-11412
    - ext4: do not allow external inodes for inline data

  * CVE-2018-10881
    - ext4: clear i_data in ext4_inode_info when removing inline data

  * CVE-2018-12233
    - jfs: Fix inconsistency between memory allocation and ea_buf->max_size

  * CVE-2018-12904
    - kvm: nVMX: Enforce cpl=0 for VMX instructions

  * Error parsing PCC subspaces from PCCT (LP: #1528684)
    - mailbox: PCC: erroneous error message when parsing ACPI PCCT

  * CVE-2018-13094
    - xfs: don't call xfs_da_shrink_inode with NULL bp

  * other users' coredumps can be read via setgid directory and killpriv bypass
    (LP: #1779923) // CVE-2018-13405
    - Fix up non-directory creation in SGID directories

  * Invoking obsolete 'firmware_install' target breaks snap build (LP: #1782166)
    - snapcraft.yaml: stop invoking the obsolete (and non-existing)
      'firmware_install' target

  * snapcraft.yaml: missing ubuntu-retpoline-extract-one script breaks the build
    (LP: #1782116)
    - snapcraft.yaml: copy

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-08-22 Thread Kleber Sacilotto de Souza
Hi Mauro,

That is the correct kernel, thanks for verifying it!

I have marked the verification as done on our side.

Thank you.

** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Committed

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-08-07 Thread Brad Figg
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-bionic' to 'verification-done-bionic'. If the
problem still exists, change the tag 'verification-needed-bionic' to
'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!
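
For testers following the request above, a quick sanity check before re-running the reproduction is to confirm that the booted kernel is at least the fixed one. The sketch below assumes the fixed Bionic package is 4.15.0-33.36, as stated in the later "This bug was fixed in the package linux" notification in this thread, and that GNU `sort -V` is available; the function names `version_ge` and `kernel_has_fix` are made up for illustration. It compares only the ABI part of a `uname -r` style string (4.15.0-33), so it cannot tell different uploads with the same ABI apart.

```shell
#!/bin/sh
# Sketch only: check whether a kernel release string is at least the ABI of
# the fixed Bionic package (4.15.0-33.36, i.e. ABI 4.15.0-33).
FIXED_ABI="4.15.0-33"

# version_ge A B: succeed if version A sorts at or after version B.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# kernel_has_fix RELEASE: RELEASE is a `uname -r` style string such as
# 4.15.0-33-generic; the flavour suffix is stripped before comparing.
kernel_has_fix() {
    abi="$(printf '%s\n' "$1" | sed 's/-[a-z][a-z0-9-]*$//')"
    version_ge "$abi" "$FIXED_ABI"
}
```

On the test machine this could be used as `kernel_has_fix "$(uname -r)" && echo "running kernel is at or past the fixed ABI"`. The kernel from the original report, 4.15.0-10-generic, fails the check.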


** Tags added: verification-needed-bionic

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Committed

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-08-03 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.17.0-6.7

---
linux (4.17.0-6.7) cosmic; urgency=medium

  * linux: 4.17.0-6.7 -proposed tracker (LP: #1783396)

  * [Regression] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383:
    comm stress-ng: bg 4705: bad block bitmap checksum (LP: #1781709)
    - SAUCE: Revert "UBUNTU: SAUCE: ext4: fix ext4_validate_inode_bitmap: comm
      stress-ng: Corrupt inode bitmap"
    - SAUCE: ext4: check for allocation block validity with block group locked

  * Cosmic update to 4.17.9 stable release (LP: #1783201)
    - userfaultfd: hugetlbfs: fix userfaultfd_huge_must_wait() pte access
    - mm: hugetlb: yield when prepping struct pages
    - mm: teach dump_page() to correctly output poisoned struct pages
    - PCI / ACPI / PM: Resume bridges w/o drivers on suspend-to-RAM
    - ACPICA: Drop leading newlines from error messages
    - ACPI / battery: Safe unregistering of hooks
    - drm/amdgpu: Make struct amdgpu_atif private to amdgpu_acpi.c
    - tracing: Avoid string overflow
    - tracing: Fix missing return symbol in function_graph output
    - scsi: sg: mitigate read/write abuse
    - scsi: aacraid: Fix PD performance regression over incorrect qd being set
    - scsi: target: Fix truncated PR-in ReadKeys response
    - s390: Correct register corruption in critical section cleanup
    - drbd: fix access after free
    - vfio: Use get_user_pages_longterm correctly
    - ARM: dts: imx51-zii-rdu1: fix touchscreen pinctrl
    - ARM: dts: omap3: Fix am3517 mdio and emac clock references
    - ARM: dts: dra7: Disable metastability workaround for USB2
    - cifs: Fix use after free of a mid_q_entry
    - cifs: Fix memory leak in smb2_set_ea()
    - cifs: Fix slab-out-of-bounds in send_set_info() on SMB2 ACE setting
    - cifs: Fix infinite loop when using hard mount option
    - drm: Use kvzalloc for allocating blob property memory
    - drm/udl: fix display corruption of the last line
    - drm/amdgpu: Add amdgpu_atpx_get_dhandle()
    - drm/amdgpu: Dynamically probe for ATIF handle (v2)
    - jbd2: don't mark block as modified if the handle is out of credits
    - ext4: add corruption check in ext4_xattr_set_entry()
    - ext4: always verify the magic number in xattr blocks
    - ext4: make sure bitmaps and the inode table don't overlap with bg
      descriptors
    - ext4: always check block group bounds in ext4_init_block_bitmap()
    - ext4: only look at the bg_flags field if it is valid
    - ext4: verify the depth of extent tree in ext4_find_extent()
    - ext4: include the illegal physical block in the bad map ext4_error msg
    - ext4: clear i_data in ext4_inode_info when removing inline data
    - ext4: never move the system.data xattr out of the inode body
    - ext4: avoid running out of journal credits when appending to an inline
      file
    - ext4: add more inode number paranoia checks
    - ext4: add more mount time checks of the superblock
    - ext4: check superblock mapped prior to committing
    - HID: i2c-hid: Fix "incomplete report" noise
    - HID: hiddev: fix potential Spectre v1
    - HID: debug: check length before copy_to_user()
    - HID: core: allow concurrent registration of drivers
    - i2c: core: smbus: fix a potential missing-check bug
    - i2c: smbus: kill memory leak on emulated and failed DMA SMBus xfers
    - fs: allow per-device dax status checking for filesystems
    - dax: change bdev_dax_supported() to support boolean returns
    - dax: check for QUEUE_FLAG_DAX in bdev_dax_supported()
    - dm: prevent DAX mounts if not supported
    - mtd: cfi_cmdset_0002: Change definition naming to retry write operation
    - mtd: cfi_cmdset_0002: Change erase functions to retry for error
    - mtd: cfi_cmdset_0002: Change erase functions to check chip good only
    - netfilter: nf_log: don't hold nf_log_mutex during user access
    - staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write()
    - Revert mm/vmstat.c: fix vmstat_update() preemption BUG
    - Linux 4.17.6
    - bpf: reject passing modified ctx to helper functions
    - MIPS: Call dump_stack() from show_regs()
    - MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
    - MIPS: Fix ioremap() RAM check
    - drm/etnaviv: Check for platform_device_register_simple() failure
    - drm/etnaviv: Fix driver unregistering
    - drm/etnaviv: bring back progress check in job timeout handler
    - ACPICA: Clear status of all events when entering S5
    - mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
    - mmc: dw_mmc: fix card threshold control configuration
    - mmc: renesas_sdhi_internal_dmac: Cannot clear the RX_IN_USE in abort
    - ibmasm: don't write out of bounds in read handler
    - staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
    - staging: r8822be: Fix RTL8822be can't find any wireless AP
    - ata: Fix ZBC_OUT command block check
    - ata: Fix ZBC_OUT all bit handling
    - mei: discard messages

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-07-09 Thread Manoj Iyer
** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed

Bug description:
  == Comment: #0 - ABDUL HALEEM <> - 2018-02-16 08:26:15 ==
  Problem:
  
  Injecting error multiple times causes kernel crash.

  echo 0x0:1:4:0x60800:0xfff8 >
  /sys/kernel/debug/powerpc/PCI/err_injct

  EEH: PHB#0 failure detected, location: N/A
  EEH: PHB#0-PE#0 has failed 6 times in the
  last hour and has been permanently disabled.
  EEH: Unable to recover from failure from PHB#0-PE#0.
  Please try reseating or replacing it
  ixgbe :01:00.1: Adapter removed
  kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: rpcsec_gss_krb5 nfsv4 nfs fscache joydev input_leds 
mac_hid idt_89hpesx ofpart ipmi_powernv cmdlinepart ipmi_devintf 
ipmi_msghandler at24 powernv_flash mtd opal_prd ibmpowernv uio_pdrv_genirq 
vmx_crypto uio sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc ib_iser 
rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas qla2xxx 
ast hid_generic ttm drm_kms_helper ixgbe syscopyarea usbhid igb sysfillrect 
sysimgblt nvme_fc fb_sys_fops hid nvme_fabrics crct10dif_vpmsum crc32c_vpmsum 
drm i40e scsi_transport_fc aacraid i2c_algo_bit mdio
  CPU: 28 PID: 972 Comm: eehd Not tainted 4.15.0-10-generic #11-Ubuntu
  NIP:  c077f080 LR: c077f070 CTR: c00aac30
  REGS: c00ff1deb5a0 TRAP: 0700   Not tainted  (4.15.0-10-generic)
  MSR:  90029033   CR: 24002822  XER: 2004
  CFAR: c018bddc SOFTE: 1
  GPR00: c077f070 c00ff1deb820 c16ea600 c00fbb5fac00
  GPR04: 02c5   
  GPR08: c00fbb5fac00 0001 c00fec617a00 c00fdfd86488
  GPR12: 0040 c7a33400 c0138be8 c00ff90ec1c0
  GPR16:    
  GPR20:    c0f48d10
  GPR24: c0f48ce8 c000200e4fcf4000 c00fc6900b18 c000200e4fcf4000
  GPR28: c000200e4fcf4288 c00810624480  c00fbb633ea0
  NIP [c077f080] free_msi_irqs+0xa0/0x260
  LR [c077f070] free_msi_irqs+0x90/0x260
  Call Trace:
  [c00ff1deb820] [c077f070] free_msi_irqs+0x90/0x260 (unreliable)
  [c00ff1deb880] [c077fa68] pci_disable_msix+0x128/0x170
  [c00ff1deb8c0] [c0081060b5c8] 
ixgbe_reset_interrupt_capability+0x90/0xd0 [ixgbe]
  [c00ff1deb8f0] [c008105d52f4] ixgbe_remove+0xec/0x240 [ixgbe]
  [c00ff1deb990] [c07670ec] pci_device_remove+0x6c/0x110
  [c00ff1deb9d0] [c085d194] 
device_release_driver_internal+0x224/0x310
  [c00ff1deba20] [c075b398] pci_stop_bus_device+0x98/0xe0
  [c00ff1deba60] [c075b588] pci_stop_and_remove_bus_device+0x28/0x40
  [c00ff1deba90] [c005e1d0] pci_hp_remove_devices+0x90/0x130
  [c00ff1debb20] [c005e184] pci_hp_remove_devices+0x44/0x130
  [c00ff1debbb0] [c003ec04] eeh_handle_normal_event+0x134/0x580
  [c00ff1debc60] [c003f160] eeh_handle_event+0x30/0x338
  [c00ff1debd10] [c003f830] eeh_event_handler+0x140/0x200
  [c00ff1debdc0] [c0138d88] kthread+0x1a8/0x1b0
  [c00ff1debe30] [c000b528] ret_from_kernel_thread+0x5c/0xb4
  Instruction dump:
  419effe0 3bc0 480c 6042 807f0010 7c7e1a14 78630020 4ba0cd3d
  6000 e9430158 312a 7d295110 <0b09> 813f0014 395e0001 7d5e07b4
  ---[ end trace 23c446a470e60864 ]---
  ixgbe :01:00.0: Adapter removed

  Sending IPI to other CPUs
  OPAL: Switch to big-endian OS
  OPAL: Switch to little-endian OS
  PHB#[0:0]: eeh_freeze_clear on fenced PHB

   
  ---uname output---
  Linux ltciofvtr-bostonlc1 4.15.0-10-generic #11-Ubuntu SMP Tue Feb 13 18:21:52 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
   
  Machine Type = Boston-LC 

  :00:00.0 PCI bridge [0604]: IBM Device [1014:04c1]
  :01:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-06-21 Thread Andrew Cloke
** Changed in: ubuntu-power-systems
   Status: Triaged => Fix Committed

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-06-20 Thread Kleber Sacilotto de Souza
** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Bionic)
   Status: New => Fix Committed

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-06-19 Thread Khaled El Mously
** Changed in: linux (Ubuntu)
   Status: Triaged => Fix Committed

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Fix Committed

Bug description:
  == Comment: #0 - ABDUL HALEEM <> - 2018-02-16 08:26:15 ==
  Problem:
  
  Injecting error multiple times causes kernel crash.

  echo 0x0:1:4:0x60800:0xfff8 >
  /sys/kernel/debug/powerpc/PCI/err_injct

  EEH: PHB#0 failure detected, location: N/A
  EEH: PHB#0-PE#0 has failed 6 times in the
  last hour and has been permanently disabled.
  EEH: Unable to recover from failure from PHB#0-PE#0.
  Please try reseating or replacing it
  ixgbe :01:00.1: Adapter removed
  kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: rpcsec_gss_krb5 nfsv4 nfs fscache joydev input_leds 
mac_hid idt_89hpesx ofpart ipmi_powernv cmdlinepart ipmi_devintf 
ipmi_msghandler at24 powernv_flash mtd opal_prd ibmpowernv uio_pdrv_genirq 
vmx_crypto uio sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc ib_iser 
rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas qla2xxx 
ast hid_generic ttm drm_kms_helper ixgbe syscopyarea usbhid igb sysfillrect 
sysimgblt nvme_fc fb_sys_fops hid nvme_fabrics crct10dif_vpmsum crc32c_vpmsum 
drm i40e scsi_transport_fc aacraid i2c_algo_bit mdio
  CPU: 28 PID: 972 Comm: eehd Not tainted 4.15.0-10-generic #11-Ubuntu
  NIP:  c077f080 LR: c077f070 CTR: c00aac30
  REGS: c00ff1deb5a0 TRAP: 0700   Not tainted  (4.15.0-10-generic)
  MSR:  90029033   CR: 24002822  XER: 2004
  CFAR: c018bddc SOFTE: 1
  GPR00: c077f070 c00ff1deb820 c16ea600 c00fbb5fac00
  GPR04: 02c5   
  GPR08: c00fbb5fac00 0001 c00fec617a00 c00fdfd86488
  GPR12: 0040 c7a33400 c0138be8 c00ff90ec1c0
  GPR16:    
  GPR20:    c0f48d10
  GPR24: c0f48ce8 c000200e4fcf4000 c00fc6900b18 c000200e4fcf4000
  GPR28: c000200e4fcf4288 c00810624480  c00fbb633ea0
  NIP [c077f080] free_msi_irqs+0xa0/0x260
  LR [c077f070] free_msi_irqs+0x90/0x260
  Call Trace:
  [c00ff1deb820] [c077f070] free_msi_irqs+0x90/0x260 (unreliable)
  [c00ff1deb880] [c077fa68] pci_disable_msix+0x128/0x170
  [c00ff1deb8c0] [c0081060b5c8] ixgbe_reset_interrupt_capability+0x90/0xd0 [ixgbe]
  [c00ff1deb8f0] [c008105d52f4] ixgbe_remove+0xec/0x240 [ixgbe]
  [c00ff1deb990] [c07670ec] pci_device_remove+0x6c/0x110
  [c00ff1deb9d0] [c085d194] device_release_driver_internal+0x224/0x310
  [c00ff1deba20] [c075b398] pci_stop_bus_device+0x98/0xe0
  [c00ff1deba60] [c075b588] pci_stop_and_remove_bus_device+0x28/0x40
  [c00ff1deba90] [c005e1d0] pci_hp_remove_devices+0x90/0x130
  [c00ff1debb20] [c005e184] pci_hp_remove_devices+0x44/0x130
  [c00ff1debbb0] [c003ec04] eeh_handle_normal_event+0x134/0x580
  [c00ff1debc60] [c003f160] eeh_handle_event+0x30/0x338
  [c00ff1debd10] [c003f830] eeh_event_handler+0x140/0x200
  [c00ff1debdc0] [c0138d88] kthread+0x1a8/0x1b0
  [c00ff1debe30] [c000b528] ret_from_kernel_thread+0x5c/0xb4
  Instruction dump:
  419effe0 3bc0 480c 6042 807f0010 7c7e1a14 78630020 4ba0cd3d
  6000 e9430158 312a 7d295110 <0b09> 813f0014 395e0001 7d5e07b4
  ---[ end trace 23c446a470e60864 ]---
  ixgbe :01:00.0: Adapter removed

  Sending IPI to other CPUs
  OPAL: Switch to big-endian OS
  OPAL: Switch to little-endian OS
  PHB#[0:0]: eeh_freeze_clear on fenced PHB

   
  ---uname output---
  Linux ltciofvtr-bostonlc1 4.15.0-10-generic #11-Ubuntu SMP Tue Feb 13 18:21:52 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
   
  Machine Type = Boston-LC 

  :00:00.0 PCI bridge [0604]: IBM Device [1014:04c1]
  :01:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
  :01:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)

  # ethtool  -i enp1s0f0
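
The reproduction step in the description above (writing the same error
specification to err_injct several times until EEH gives up on the PE) could
be scripted roughly as follows. This is only a sketch: the function name, the
ERR_INJCT and ERR_SPEC variables, and the delay between injections are
illustrative assumptions and not part of the report. The debugfs path and the
error string are copied from the description as-is.

```shell
#!/bin/sh
# Sketch only: repeat the EEH error injection shown in the bug description.
# ERR_INJCT and ERR_SPEC are hypothetical knobs; their defaults are the path
# and the value taken verbatim from the report.
ERR_INJCT="${ERR_INJCT:-/sys/kernel/debug/powerpc/PCI/err_injct}"
ERR_SPEC="${ERR_SPEC:-0x0:1:4:0x60800:0xfff8}"

# inject_eeh_errors COUNT DELAY
#   COUNT  number of injections to attempt
#   DELAY  seconds to wait after each injection (an assumed pause so that
#          recovery can run between injections)
inject_eeh_errors() {
    count="$1"
    delay="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "injection $i/$count: $ERR_SPEC"
        # Same write as in the report; needs root and a mounted debugfs.
        echo "$ERR_SPEC" > "$ERR_INJCT" || return 1
        sleep "$delay"
        i=$((i + 1))
    done
}
```

Sourcing the script only defines the function. On a disposable test machine
it might be called as "inject_eeh_errors 6 30", where 6 matches the failure
count in the EEH log message and 30 seconds is a guess. On an affected kernel
the last injection is expected to trigger the BUG shown above, so it should
not be run on a system that matters.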

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-06-18 Thread Manoj Iyer
** Changed in: linux (Ubuntu)
 Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) => Canonical Kernel Team (canonical-kernel-team)

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-06-12 Thread Joseph Salisbury
** Changed in: linux (Ubuntu)
   Status: New => Triaged

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1776389] Re: [Ubuntu 1804][boston][ixgbe] EEH causes kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/drivers/pci/msi.c:352 (i2S)

2018-06-11 Thread Frank Heimes
** Also affects: ubuntu-power-systems
   Importance: Undecided
   Status: New

** Changed in: ubuntu-power-systems
   Status: New => Triaged

** Changed in: ubuntu-power-systems
   Importance: Undecided => High

** Changed in: ubuntu-power-systems
 Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Tags added: triage-g

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  New
