[Expired for linux (Ubuntu) because there has been no activity for 60
days.]

** Changed in: linux (Ubuntu)
       Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1936958

Title:
  mlx5_core crash, taking down a bond

Status in linux package in Ubuntu:
  Expired

Bug description:
  Jul 20 14:40:23 anonster kernel: [ 1716.692818] mlx5_core 0000:03:00.0: 
assert_var[0] 0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.698541] mlx5_core 0000:03:00.0: 
assert_var[1] 0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.704240] mlx5_core 0000:03:00.0: 
assert_var[2] 0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.709945] mlx5_core 0000:03:00.0: 
assert_var[3] 0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.715641] mlx5_core 0000:03:00.0: 
assert_var[4] 0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.721343] mlx5_core 0000:03:00.0: 
assert_exit_ptr 0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.727214] mlx5_core 0000:03:00.0: 
assert_callra 0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.732917] mlx5_core 0000:03:00.0: 
fw_ver 65535.65535.65535
  Jul 20 14:40:23 anonster kernel: [ 1716.738617] mlx5_core 0000:03:00.0: hw_id 
0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.743620] mlx5_core 0000:03:00.0: 
irisc_index 255
  Jul 20 14:40:23 anonster kernel: [ 1716.748530] mlx5_core 0000:03:00.0: synd 
0xff: unrecognized error
  Jul 20 14:40:23 anonster kernel: [ 1716.754662] mlx5_core 0000:03:00.0: 
ext_synd 0xffff
  Jul 20 14:40:23 anonster kernel: [ 1716.759578] mlx5_core 0000:03:00.0: raw 
fw_ver 0xffffffff
  Jul 20 14:40:23 anonster kernel: [ 1716.765038] WARNING: CPU: 0 PID: 0 at 
/build/linux-hwe-EPHQQp/linux-hwe-4.15.0/kernel/time/timer.c:898 
mod_timer+0x3e4/0x400
  Jul 20 14:40:23 anonster kernel: [ 1716.765039] Modules linked in: 
binfmt_misc lkp_Ubuntu_4_15_0_142_146_generic_78(OEK) bonding nls_iso8859_1 xfs 
edac_mce_amd ipmi_ssif kvm_amd hpilo kvm i2c_piix4 irqbypass ipmi_si
  Jul 20 14:40:23 anonster kernel: [ 1716.765051] mlx5_core 0000:03:00.0: 
health_care:194:(pid 29045): handling bad device here
  Jul 20 14:40:23 anonster kernel: [ 1716.765052]  ipmi_devintf ipmi_msghandler 
shpchp acpi_power_meter
  Jul 20 14:40:23 anonster kernel: [ 1716.765057] mlx5_core 0000:03:00.0: 
mlx5_handle_bad_state:152:(pid 29045): Expected to see disabled NIC but it is 
has invalid value 3
  Jul 20 14:40:23 anonster kernel: [ 1716.765058]  k10temp mac_hid ib_iser
  Jul 20 14:40:23 anonster kernel: [ 1716.765062] mlx5_core 0000:03:00.0: 
mlx5_pci_err_detected was called
  Jul 20 14:40:23 anonster kernel: [ 1716.765063]  rdma_cm iw_cm ib_cm
  Jul 20 14:40:23 anonster kernel: [ 1716.765067] mlx5_core 0000:03:00.0: 
mlx5_enter_error_state:121:(pid 29045): start
  Jul 20 14:40:23 anonster kernel: [ 1716.765067]  ib_core iscsi_tcp 
libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs zstd_compress raid10 
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid1 raid0 multipath linear bcache ses enclosure crct10dif_pclmul 
crc32_pclmul mgag200 ghash_clmulni_intel pcbc ttm drm_kms_helper aesni_intel 
mlx5_core syscopyarea sysfillrect igb sysimgblt aes_x86_64 fb_sys_fops 
crypto_simd glue_helper mlxfw dca nvme cryptd drm devlink i2c_algo_bit smartpqi 
nvme_core ptp scsi_transport_sas pps_core wmi
  Jul 20 14:40:23 anonster kernel: [ 1716.772598] CPU: 0 PID: 0 Comm: swapper/0 
Tainted: G           OE K  4.15.0-142-generic #146~16.04.1-Ubuntu
  Jul 20 14:40:23 anonster kernel: [ 1716.772598] Hardware name: HPE ProLiant 
DL325 Gen10 Plus/ProLiant DL325 Gen10 Plus, BIOS A43 05/11/2020
  Jul 20 14:40:23 anonster kernel: [ 1716.772600] RIP: 
0010:mod_timer+0x3e4/0x400
  Jul 20 14:40:23 anonster kernel: [ 1716.772601] RSP: 0018:ffff91e55e603e30 
EFLAGS: 00010093
  Jul 20 14:40:23 anonster kernel: [ 1716.772603] RAX: 0000000100056792 RBX: 
00000001000567c4 RCX: 000000010005678a
  Jul 20 14:40:23 anonster kernel: [ 1716.772603] RDX: 000000010005678c RSI: 
ffff91e55e603e48 RDI: ffff91e55e61a700
  Jul 20 14:40:23 anonster kernel: [ 1716.772604] RBP: ffff91e55e603e80 R08: 
ffff91e55e010800 R09: ffff91e55dc01ff0
  Jul 20 14:40:23 anonster kernel: [ 1716.772605] R10: 0000000000000000 R11: 
0000000000000040 R12: ffff91e54bb4d8d8
  Jul 20 14:40:23 anonster kernel: [ 1716.772606] R13: ffff91e54bb4d8d8 R14: 
ffff91e55e61a700 R15: ffff91e54bb4d8d8
  Jul 20 14:40:23 anonster kernel: [ 1716.772607] FS:  0000000000000000(0000) 
GS:ffff91e55e600000(0000) knlGS:0000000000000000
  Jul 20 14:40:23 anonster kernel: [ 1716.772607] CS:  0010 DS: 0000 ES: 0000 
CR0: 0000000080050033
  Jul 20 14:40:23 anonster kernel: [ 1716.772608] CR2: 00007fd20bd2e000 CR3: 
0000000816294000 CR4: 0000000000340ef0
  Jul 20 14:40:23 anonster kernel: [ 1716.772609] Call Trace:
  Jul 20 14:40:23 anonster kernel: [ 1716.772611]  <IRQ>
  Jul 20 14:40:23 anonster kernel: [ 1716.772617]  ? 
fbcon_add_cursor_timer+0xc0/0xc0
  Jul 20 14:40:23 anonster kernel: [ 1716.772620]  
cursor_timer_handler+0x45/0x50
  Jul 20 14:40:23 anonster kernel: [ 1716.772622] mlx5_core 0000:03:00.0: 
mlx5_enter_error_state:128:(pid 29045): end
  Jul 20 14:40:23 anonster kernel: [ 1716.779975]  call_timer_fn+0x32/0x140
  Jul 20 14:40:23 anonster kernel: [ 1716.779976]  run_timer_softirq+0x1e9/0x430
  Jul 20 14:40:23 anonster kernel: [ 1716.779978]  ? ktime_get+0x3e/0xb0
  Jul 20 14:40:23 anonster kernel: [ 1716.779981]  ? lapic_next_event+0x20/0x30
  Jul 20 14:40:23 anonster kernel: [ 1716.779985]  __do_softirq+0xf5/0x2a8
  Jul 20 14:40:23 anonster kernel: [ 1716.779988]  irq_exit+0xca/0xd0
  Jul 20 14:40:23 anonster kernel: [ 1716.779989]  
smp_apic_timer_interrupt+0x79/0x150
  Jul 20 14:40:23 anonster kernel: [ 1716.779990]  
apic_timer_interrupt+0x90/0xa0
  Jul 20 14:40:23 anonster kernel: [ 1716.779991]  </IRQ>
  Jul 20 14:40:23 anonster kernel: [ 1716.779994] RIP: 
0010:cpuidle_enter_state+0xa7/0x300
  Jul 20 14:40:23 anonster kernel: [ 1716.779995] RSP: 0018:ffffffff9c803e08 
EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
  Jul 20 14:40:23 anonster kernel: [ 1716.779996] RAX: ffff91e55e621900 RBX: 
0000000000000002 RCX: 000000000000001f
  Jul 20 14:40:23 anonster kernel: [ 1716.779997] RDX: 0000000000000000 RSI: 
0000000028133c6f RDI: 0000000000000000
  Jul 20 14:40:23 anonster kernel: [ 1716.779997] RBP: ffffffff9c803e40 R08: 
ffffffe48aae298f R09: 0000000000000008
  Jul 20 14:40:23 anonster kernel: [ 1716.779998] R10: ffffffff9c803dd8 R11: 
0000000000002c8b R12: 0000000000000002
  Jul 20 14:40:23 anonster kernel: [ 1716.779998] R13: ffff91e54d043800 R14: 
ffffffff9c981c98 R15: 0000018fb282ae03
  Jul 20 14:40:23 anonster kernel: [ 1716.780000]  ? 
cpuidle_enter_state+0x96/0x300
  Jul 20 14:40:23 anonster kernel: [ 1716.780002]  cpuidle_enter+0x17/0x20
  Jul 20 14:40:23 anonster kernel: [ 1716.780004]  call_cpuidle+0x23/0x40
  Jul 20 14:40:23 anonster kernel: [ 1716.780006]  do_idle+0x197/0x200
  Jul 20 14:40:23 anonster kernel: [ 1716.780007]  cpu_startup_entry+0x73/0x80
  Jul 20 14:40:23 anonster kernel: [ 1716.780010]  rest_init+0xaa/0xb0
  Jul 20 14:40:23 anonster kernel: [ 1716.780013]  start_kernel+0x4fa/0x51e
  Jul 20 14:40:23 anonster kernel: [ 1716.780015]  
x86_64_start_reservations+0x24/0x26
  Jul 20 14:40:23 anonster kernel: [ 1716.780016]  x86_64_start_kernel+0x74/0x77
  Jul 20 14:40:23 anonster kernel: [ 1716.780019]  
secondary_startup_64+0xa5/0xb0
  Jul 20 14:40:23 anonster kernel: [ 1716.780020] Code: b1 fc ff ff 49 89 46 10 
48 89 45 c0 e9 a4 fc ff ff 0f 0b 45 8b 7c 24 20 e9 5d fd ff ff 49 89 55 10 45 
8b 7c 24 20 e9 4f fd ff ff <0f> 0b e9 a4 fc ff ff 49 89 46 10 e9 9b fc ff ff e8 
97 f9 f7 ff 
  Jul 20 14:40:23 anonster kernel: [ 1716.780035] ---[ end trace 
3e92c45954bacae0 ]---
  Jul 20 14:40:24 anonster kernel: [ 1717.204835] mlx5_core 0000:03:00.1: 
assert_var[0] 0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.210539] mlx5_core 0000:03:00.1: 
assert_var[1] 0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.216242] mlx5_core 0000:03:00.1: 
assert_var[2] 0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.221940] mlx5_core 0000:03:00.1: 
assert_var[3] 0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.227645] mlx5_core 0000:03:00.1: 
assert_var[4] 0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.233342] mlx5_core 0000:03:00.1: 
assert_exit_ptr 0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.239218] mlx5_core 0000:03:00.1: 
assert_callra 0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.244917] mlx5_core 0000:03:00.1: 
fw_ver 65535.65535.65535
  Jul 20 14:40:24 anonster kernel: [ 1717.250617] mlx5_core 0000:03:00.1: hw_id 
0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.255615] mlx5_core 0000:03:00.1: 
irisc_index 255
  Jul 20 14:40:24 anonster kernel: [ 1717.260533] mlx5_core 0000:03:00.1: synd 
0xff: unrecognized error
  Jul 20 14:40:24 anonster kernel: [ 1717.266666] mlx5_core 0000:03:00.1: 
ext_synd 0xffff
  Jul 20 14:40:24 anonster kernel: [ 1717.271584] mlx5_core 0000:03:00.1: raw 
fw_ver 0xffffffff
  Jul 20 14:40:24 anonster kernel: [ 1717.277053] mlx5_core 0000:03:00.1: 
health_care:194:(pid 16512): handling bad device here
  Jul 20 14:40:24 anonster kernel: [ 1717.277057] mlx5_core 0000:03:00.1: 
mlx5_handle_bad_state:152:(pid 16512): Expected to see disabled NIC but it is 
has invalid value 3
  Jul 20 14:40:24 anonster kernel: [ 1717.277060] mlx5_core 0000:03:00.1: 
mlx5_pci_err_detected was called
  Jul 20 14:40:24 anonster kernel: [ 1717.277063] mlx5_core 0000:03:00.1: 
mlx5_enter_error_state:121:(pid 16512): start
  Jul 20 14:40:24 anonster kernel: [ 1717.284625] mlx5_core 0000:03:00.1: 
mlx5_enter_error_state:128:(pid 16512): end
  Jul 20 14:40:24 anonster kernel: [ 1717.300353] mlx5_core 0000:03:00.0: 
mlx5_wait_for_vf_pages:576:(pid 29045): Skipping wait for vf pages stage
  Jul 20 14:40:24 anonster kernel: [ 1717.321544] mlx5_core 0000:03:00.0 
ens2f0: mlx5e_get_link_ksettings: query port ptys failed: -5
  Jul 20 14:40:24 anonster kernel: [ 1717.330315] mlx5_core 0000:03:00.0 
ens2f0: speed changed to 0 for port ens2f0
  Jul 20 14:40:24 anonster kernel: [ 1717.337814] mlx5_core 0000:03:00.1 
ens2f1: mlx5e_get_link_ksettings: query port ptys failed: -5
  Jul 20 14:40:24 anonster kernel: [ 1717.346576] mlx5_core 0000:03:00.1 
ens2f1: speed changed to 0 for port ens2f1
  Jul 20 14:40:24 anonster kernel: [ 1717.354089] mlx5_core 0000:03:00.1: 
mlx5_wait_for_vf_pages:576:(pid 16512): Skipping wait for vf pages stage
  Jul 20 14:40:24 anonster kernel: [ 1717.360907] bond0: link status definitely 
down for interface ens2f0, disabling it
  Jul 20 14:40:24 anonster kernel: [ 1717.360946] bond0: link status definitely 
down for interface ens2f1, disabling it
  Jul 20 14:41:25 anonster kernel: [ 1778.646176] mlx5_core 0000:03:00.0: 
health recovery flow aborted since the nic state is invalid
  Jul 20 14:41:25 anonster kernel: [ 1778.646180] mlx5_core 0000:03:00.1: 
health recovery flow aborted since the nic state is invalid

  
  == ApportVersion =================================
  2.20.1-0ubuntu2.30

  == Architecture =================================
  amd64

  == Date =================================
  Tue Jul 20 16:52:44 2021

  == Dependencies =================================
  adduser 3.113+nmu3ubuntu4
  apt 1.2.35
  apt-utils 1.2.35
  busybox-initramfs 1:1.22.0-15ubuntu1.4
  coreutils 8.25-2ubuntu3~16.04
  cpio 2.11+dfsg-5ubuntu1.1
  debconf 1.5.58ubuntu2
  debconf-i18n 1.5.58ubuntu2
  debianutils 4.7
  dpkg 1.18.4ubuntu1.7+ppa1 [origin: LP-PPA-canonical-is-sa-launchpad]
  e2fslibs 1.42.13-1ubuntu1.2
  e2fsprogs 1.42.13-1ubuntu1.2
  gcc-5-base 5.4.0-6ubuntu1~16.04.12
  gcc-6-base 6.0.1-0ubuntu1
  gnupg 1.4.20-1ubuntu3.3
  gpgv 1.4.20-1ubuntu3.3
  init-system-helpers 1.29ubuntu4
  initramfs-tools 0.122ubuntu8.17
  initramfs-tools-bin 0.122ubuntu8.17
  initramfs-tools-core 0.122ubuntu8.17
  initscripts 2.88dsf-59.3ubuntu2
  insserv 1.14.0-5ubuntu3
  klibc-utils 2.0.4-8ubuntu1.16.04.4
  kmod 22-1ubuntu5.2
  libacl1 2.2.52-3
  libapt-inst2.0 1.2.35
  libapt-pkg5.0 1.2.35
  libattr1 1:2.4.47-2
  libaudit-common 1:2.4.5-1ubuntu2.1
  libaudit1 1:2.4.5-1ubuntu2.1
  libblkid1 2.27.1-6ubuntu3.10
  libbz2-1.0 1.0.6-8ubuntu0.2
  libc6 2.23-0ubuntu11.3
  libcomerr2 1.42.13-1ubuntu1.2
  libdb5.3 5.3.28-11ubuntu0.2
  libfdisk1 2.27.1-6ubuntu3.10
  libgcc1 1:6.0.1-0ubuntu1
  libgcrypt20 1.6.5-2ubuntu0.6
  libgpg-error0 1.21-2ubuntu1
  libgpm2 1.20.4-6.1
  libklibc 2.0.4-8ubuntu1.16.04.4
  libkmod2 22-1ubuntu5.2
  liblocale-gettext-perl 1.07-1build1
  liblz4-1 0.0~r131-2ubuntu2
  liblzma5 5.1.1alpha+20120614-2ubuntu2
  libmount1 2.27.1-6ubuntu3.10
  libncurses5 6.0+20160213-1ubuntu1
  libncursesw5 6.0+20160213-1ubuntu1
  libpam-modules 1.1.8-3.2ubuntu2.3
  libpam-modules-bin 1.1.8-3.2ubuntu2.3
  libpam0g 1.1.8-3.2ubuntu2.3
  libpcre3 2:8.38-3.1
  libprocps4 2:3.3.10-4ubuntu2.5
  libreadline6 6.3-8ubuntu2
  libselinux1 2.4-3build2
  libsemanage-common 2.3-1build3
  libsemanage1 2.3-1build3
  libsepol1 2.4-2
  libsmartcols1 2.27.1-6ubuntu3.10
  libss2 1.42.13-1ubuntu1.2
  libstdc++6 5.4.0-6ubuntu1~16.04.12
  libsystemd0 229-4ubuntu21.31
  libtext-charwidth-perl 0.04-7build5
  libtext-iconv-perl 1.7-5build4
  libtext-wrapi18n-perl 0.06-7.1
  libtinfo5 6.0+20160213-1ubuntu1
  libudev1 229-4ubuntu21.31
  libusb-0.1-4 2:0.1.12-28
  libustr-1.0-1 1.0.4-5
  libuuid1 2.27.1-6ubuntu3.10
  libzstd1 1.3.1+dfsg-1~ubuntu0.16.04.1
  linux-base 4.5ubuntu1.2~16.04.1
  linux-modules-4.15.0-142-generic 4.15.0-142.146~16.04.1
  lsb-base 9.20160110ubuntu0.2
  mount 2.27.1-6ubuntu3.10
  multiarch-support 2.23-0ubuntu11.3
  passwd 1:4.2-3.1ubuntu5.4
  perl-base 5.22.1-9ubuntu0.9
  procps 2:3.3.10-4ubuntu2.5
  psmisc 22.21-2.1ubuntu0.1
  readline-common 6.3-8ubuntu2
  sensible-utils 0.0.9ubuntu0.16.04.1
  sysv-rc 2.88dsf-59.3ubuntu2
  sysvinit-utils 2.88dsf-59.3ubuntu2
  tar 1.28-2.1ubuntu0.2
  ubuntu-keyring 2012.05.19.1
  udev 229-4ubuntu21.31
  util-linux 2.27.1-6ubuntu3.10
  uuid-runtime 2.27.1-6ubuntu3.10
  zlib1g 1:1.2.8.dfsg-2ubuntu4.3

  == DistroRelease =================================
  Ubuntu 16.04

  == NonfreeKernelModules =================================
  lkp_Ubuntu_4_15_0_142_146_generic_78

  == Package =================================
  linux-image-4.15.0-142-generic 4.15.0-142.146~16.04.1

  == PackageArchitecture =================================
  amd64

  == ProblemType =================================
  Bug

  == ProcCpuinfoMinimal =================================
  processor       : 15
  vendor_id       : AuthenticAMD
  cpu family      : 23
  model           : 49
  model name      : AMD EPYC 7262 8-Core Processor
  stepping        : 0
  microcode       : 0x8301038
  cpu MHz         : 1795.684
  cache size      : 512 KB
  physical id     : 0
  siblings        : 16
  core id         : 28
  cpu cores       : 8
  apicid          : 57
  initial apicid  : 57
  fpu             : yes
  fpu_exception   : yes
  cpuid level     : 16
  wp              : yes
  flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid 
aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes 
xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a 
misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core 
perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd ibrs ibpb 
stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt 
clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total 
cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save 
tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic 
v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
  bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
  bogomips        : 6387.44
  TLB size        : 3072 4K pages
  clflush size    : 64
  cache_alignment : 64
  address sizes   : 48 bits physical, 48 bits virtual
  power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

  == ProcEnviron =================================
  TERM=xterm-256color
  PATH=(custom, no user)
  XDG_RUNTIME_DIR=<set>
  LANG=en_US.UTF-8
  SHELL=/bin/bash

  == ProcVersionSignature =================================
  Ubuntu 4.15.0-142.146~16.04.1-generic 4.15.18

  == SourcePackage =================================
  linux-signed-hwe

  == Tags =================================
  xenial third-party-packages

  == Uname =================================
  Linux 4.15.0-142-generic x86_64

  == UpgradeStatus =================================
  No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1936958/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
