This change was made by a bot.

** Changed in: linux (Ubuntu)
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1827343

Title:
  CPU hard lockup when turning CPU back online on Bionic P9

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Found on another Boston Power9 box "dradis".

  Steps to reproduce:
  1. Check online CPUs
       $ cat /sys/devices/system/cpu/online 
       0-159
  2. Take one CPU offline via CPU hotplug:
       $ echo 0 | sudo tee /sys/devices/system/cpu/cpu159/online 
       0
  3. Check dmesg; you should see:
       [  410.890106] IRQ 174: no longer affine to CPU159
  4. Put that CPU back online and check dmesg again:
       $ echo 1 | sudo tee /sys/devices/system/cpu/cpu159/online
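
  The steps above can be scripted so the offline/online cycle is easy to
  repeat. This is only a sketch: the function names and the CPU_ROOT and
  DELAY variables are made up for illustration, and writing to the real
  sysfs files requires root.

```shell
#!/bin/sh
# Hypothetical helpers for repeating the hotplug cycle from the steps above.
# CPU_ROOT and DELAY are illustrative knobs, not part of any existing tool.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

set_cpu_online() {
    # $1 = CPU number, $2 = 0 (offline) or 1 (online)
    echo "$2" > "$CPU_ROOT/cpu$1/online"
}

cycle_cpu() {
    # Take CPU $1 offline, wait DELAY seconds (default 1), bring it back.
    set_cpu_online "$1" 0 &&
        sleep "${DELAY:-1}" &&
        set_cpu_online "$1" 1
}

has_hard_lockup() {
    # Reads kernel log text on stdin; succeeds if a hard lockup was reported.
    grep -q 'Hard LOCKUP'
}
```

  Run as root, for example:
  cycle_cpu 159; dmesg | has_hard_lockup && echo "lockup reported"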

  The system reports a CPU hard lockup:
  [  410.890106] IRQ 174: no longer affine to CPU159
  [  421.168052] Watchdog CPU:128 Hard LOCKUP
  [  421.168054] Modules linked in: joydev input_leds mac_hid idt_89hpesx ipmi_powernv opal_prd ipmi_devintf ibmpowernv ofpart at24 cmdlinepart uio_pdrv_genirq uio powernv_flash mtd ipmi_msghandler vmx_crypto sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas uas usb_storage ast hid_generic i2c_algo_bit ttm drm_kms_helper usbhid syscopyarea sysfillrect sysimgblt hid fb_sys_fops crct10dif_vpmsum crc32c_vpmsum drm i40e aacraid
  [  421.168108] CPU: 128 PID: 778 Comm: watchdog/128 Not tainted 4.15.0-48-generic #51-Ubuntu
  [  421.168109] NIP:  c000000000d082e8 LR: c00000000016c3b0 CTR: c000000000ac5d80
  [  421.168111] REGS: c00000003f9ffd80 TRAP: 0900   Not tainted  (4.15.0-48-generic)
  [  421.168112] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24000484  XER: 00000000
  [  421.168118] CFAR: c00000000016c3ac SOFTE: 0
                 GPR00: c00000000016c3b0 c000200e55743af0 c0000000016eb400 c000200e614a8f20
                 GPR04: 000000000000088c c000200e614a4360 c0000000fd6879b8 0000000000000008
                 GPR08: 000000000054cd92 00000000389fd980 0000000080000080 0000000000000005
                 GPR12: c000000000ac5d80 c00000000fad8000 c00000000013e648 c000000ff90e9640
                 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
                 GPR20: 0000000000000000 0000000000000000 c000200e55852b80 0000200e602d0000
                 GPR24: c000200e614a8580 0000000000000000 c000000000d01ff0 c000200e614a6fb8
                 GPR28: c000200e614a8880 000000000000088c c000200e614a8580 c000200e614a8f20
  [  421.168142] NIP [c000000000d082e8] _raw_spin_lock+0x28/0xe0
  [  421.168146] LR [c00000000016c3b0] update_curr_rt+0x1d0/0x3f0
  [  421.168147] Call Trace:
  [  421.168150] [c000200e55743af0] [c00000000171dd78] __per_cpu_offset+0x0/0x4000 (unreliable)
  [  421.168154] [c000200e55743b20] [c00000000016c2b0] update_curr_rt+0xd0/0x3f0
  [  421.168156] [c000200e55743bb0] [c00000000016c7bc] dequeue_task_rt+0x3c/0xf0
  [  421.168159] [c000200e55743bf0] [c00000000014e9b0] deactivate_task+0xb0/0x160
  [  421.168161] [c000200e55743c70] [c000000000d0187c] __schedule+0x3bc/0xaf0
  [  421.168164] [c000200e55743d40] [c000000000d01ff0] schedule+0x40/0xc0
  [  421.168167] [c000200e55743d60] [c000000000144bd4] smpboot_thread_fn+0x284/0x290
  [  421.168169] [c000200e55743dc0] [c00000000013e7e8] kthread+0x1a8/0x1b0
  [  421.168172] [c000200e55743e30] [c00000000000b658] ret_from_kernel_thread+0x5c/0x84
  [  421.168173] Instruction dump:
  [  421.168175] 7c0803a6 4bffff98 3c4c009e 38423140 7c0802a6 60000000 fbe1fff8 f821ffd1
  [  421.168179] 7c7f1b78 39400000 994d028d 814d0008 <7d201829> 2c090000 40c20010 7d40192d

  But the CPU is actually back online:
  $ cat /sys/devices/system/cpu/online 
  0-159
  $ cat /sys/devices/system/cpu/cpu159/online 
  1
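
  To confirm after each cycle that every hardware thread is back, the range
  list in /sys/devices/system/cpu/online can be turned into a count. With 40
  cores at SMT=4, as reported in the apport data below, 160 is expected. The
  helper below is a hypothetical sketch (count_cpus is not an existing tool)
  and assumes the usual comma-separated list of numbers and ranges.

```shell
#!/bin/sh
# Hypothetical helper: count the CPUs named in a sysfs-style CPU list,
# e.g. "0-159" or "0-3,8,10-11".
count_cpus() {
    total=0
    old_ifs=$IFS
    IFS=,
    # $1 is left unquoted on purpose so it is split on commas.
    for part in $1; do
        case $part in
            *-*) total=$((total + ${part#*-} - ${part%-*} + 1)) ;;  # range a-b
            *)   total=$((total + 1)) ;;                             # single CPU
        esac
    done
    IFS=$old_ifs
    echo "$total"
}
```

  For example, count_cpus "$(cat /sys/devices/system/cpu/online)" prints 160
  when the file contains 0-159.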

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-48-generic 4.15.0-48.51
  ProcVersionSignature: Ubuntu 4.15.0-48.51-generic 4.15.18
  Uname: Linux 4.15.0-48-generic ppc64le
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 May  2 08:06 seq
   crw-rw---- 1 root audio 116, 33 May  2 08:06 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: ppc64el
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Thu May  2 08:16:36 2019
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  PciMultimedia:
   
  ProcFB: 0 astdrmfb
  ProcKernelCmdLine: root=UUID=82644f4d-d7cb-4abf-b5e9-d8b5644f77dd ro console=hvc0
  ProcLoadAvg: 0.02 0.42 0.37 1/1332 5365
  ProcLocks:
   1: FLOCK  ADVISORY  WRITE 4460 00:17:336 0 EOF
   2: POSIX  ADVISORY  WRITE 3994 00:17:604 0 EOF
   3: FLOCK  ADVISORY  WRITE 3913 00:17:586 0 EOF
   4: POSIX  ADVISORY  WRITE 3984 00:17:620 0 EOF
   5: POSIX  ADVISORY  WRITE 1816 00:17:356 0 EOF
  ProcSwaps:
   Filename    Type  Size     Used  Priority
   /swap.img   file  8388544  0     -2
  ProcVersion: Linux version 4.15.0-48-generic (buildd@bos02-ppc64el-010) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #51-Ubuntu SMP Wed Apr 3 08:26:19 UTC 2019
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-48-generic N/A
   linux-backports-modules-4.15.0-48-generic  N/A
   linux-firmware                             1.173.5
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  VarLogDump_list: total 0
  cpu_cores: Number of cores present = 40
  cpu_coreson: Number of cores online = 40
  cpu_dscr: DSCR is 16
  cpu_freq:
   min: 2.862 GHz (cpu 159)
   max: 2.862 GHz (cpu 81)
   avg: 2.862 GHz
  cpu_runmode:
   Could not retrieve current diagnostics mode,
   No kernel interface to firmware
  cpu_smt: SMT=4

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1827343/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
