Public bug reported:
Found on another Boston Power9 box "dradis".
Steps to reproduce:
1. Check online CPUs
$ cat /sys/devices/system/cpu/online
0-159
2. Take one CPU offline via CPU hotplug:
$ echo 0 | sudo tee /sys/devices/system/cpu/cpu159/online
0
3. Check dmesg; you should see:
[ 410.890106] IRQ 174: no longer affine to CPU159
4. Put that CPU back online and check dmesg again:
$ echo 1 | sudo tee /sys/devices/system/cpu/cpu159/online
The system reports a CPU hard lockup:
[ 410.890106] IRQ 174: no longer affine to CPU159
[ 421.168052] Watchdog CPU:128 Hard LOCKUP
[ 421.168054] Modules linked in: joydev input_leds mac_hid idt_89hpesx
ipmi_powernv opal_prd ipmi_devintf ibmpowernv ofpart at24 cmdlinepart
uio_pdrv_genirq uio powernv_flash mtd ipmi_msghandler vmx_crypto sch_fq_codel
ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas uas
usb_storage ast hid_generic i2c_algo_bit ttm drm_kms_helper usbhid syscopyarea
sysfillrect sysimgblt hid fb_sys_fops crct10dif_vpmsum crc32c_vpmsum drm i40e
aacraid
[ 421.168108] CPU: 128 PID: 778 Comm: watchdog/128 Not tainted 4.15.0-48-generic #51-Ubuntu
[ 421.168109] NIP: c000000000d082e8 LR: c00000000016c3b0 CTR: c000000000ac5d80
[ 421.168111] REGS: c00000003f9ffd80 TRAP: 0900 Not tainted (4.15.0-48-generic)
[ 421.168112] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24000484 XER: 00000000
[ 421.168118] CFAR: c00000000016c3ac SOFTE: 0
GPR00: c00000000016c3b0 c000200e55743af0 c0000000016eb400 c000200e614a8f20
GPR04: 000000000000088c c000200e614a4360 c0000000fd6879b8 0000000000000008
GPR08: 000000000054cd92 00000000389fd980 0000000080000080 0000000000000005
GPR12: c000000000ac5d80 c00000000fad8000 c00000000013e648 c000000ff90e9640
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 c000200e55852b80 0000200e602d0000
GPR24: c000200e614a8580 0000000000000000 c000000000d01ff0 c000200e614a6fb8
GPR28: c000200e614a8880 000000000000088c c000200e614a8580 c000200e614a8f20
[ 421.168142] NIP [c000000000d082e8] _raw_spin_lock+0x28/0xe0
[ 421.168146] LR [c00000000016c3b0] update_curr_rt+0x1d0/0x3f0
[ 421.168147] Call Trace:
[ 421.168150] [c000200e55743af0] [c00000000171dd78] __per_cpu_offset+0x0/0x4000 (unreliable)
[ 421.168154] [c000200e55743b20] [c00000000016c2b0] update_curr_rt+0xd0/0x3f0
[ 421.168156] [c000200e55743bb0] [c00000000016c7bc] dequeue_task_rt+0x3c/0xf0
[ 421.168159] [c000200e55743bf0] [c00000000014e9b0] deactivate_task+0xb0/0x160
[ 421.168161] [c000200e55743c70] [c000000000d0187c] __schedule+0x3bc/0xaf0
[ 421.168164] [c000200e55743d40] [c000000000d01ff0] schedule+0x40/0xc0
[ 421.168167] [c000200e55743d60] [c000000000144bd4] smpboot_thread_fn+0x284/0x290
[ 421.168169] [c000200e55743dc0] [c00000000013e7e8] kthread+0x1a8/0x1b0
[ 421.168172] [c000200e55743e30] [c00000000000b658] ret_from_kernel_thread+0x5c/0x84
[ 421.168173] Instruction dump:
[ 421.168175] 7c0803a6 4bffff98 3c4c009e 38423140 7c0802a6 60000000 fbe1fff8 f821ffd1
[ 421.168179] 7c7f1b78 39400000 994d028d 814d0008 <7d201829> 2c090000 40c20010 7d40192d
But the CPU is actually back online:
$ cat /sys/devices/system/cpu/online
0-159
$ cat /sys/devices/system/cpu/cpu159/online
1
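The steps above can be scripted roughly as follows. This is only a sketch, not something taken from the original report: the function names (set_cpu_online, count_hard_lockups, repro), the CPU_SYSFS override, and the 15-second default wait are all assumptions made for illustration. It needs root and should only be run on a test machine.

```shell
#!/usr/bin/env bash
# Sketch: take one CPU offline, bring it back, then count hard-lockup
# lines in the kernel log. All names here are hypothetical helpers.
set -u

# Directory holding the per-CPU "online" files (overridable for dry runs).
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# set_cpu_online <cpu-number> <0|1>: write the requested state to sysfs.
set_cpu_online() {
    echo "$2" > "$CPU_SYSFS/cpu$1/online"
}

# count_hard_lockups: read kernel log text on stdin and print the number
# of lines containing "Hard LOCKUP" (prints 0 if there are none).
count_hard_lockups() {
    grep -c 'Hard LOCKUP' || true
}

# repro <cpu-number> [wait-seconds]: offline, online, wait, report.
repro() {
    local cpu="$1" wait_s="${2:-15}"
    set_cpu_online "$cpu" 0
    set_cpu_online "$cpu" 1
    # In the log above the lockup was printed about 10 s after the
    # "no longer affine" message; 15 s is an assumed safety margin.
    sleep "$wait_s"
    dmesg | count_hard_lockups
}
```

After sourcing the file as root, `repro 159` would print the number of hard-lockup lines currently in dmesg; a non-zero value after a clean boot suggests the problem was reproduced.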
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-48-generic 4.15.0-48.51
ProcVersionSignature: Ubuntu 4.15.0-48.51-generic 4.15.18
Uname: Linux 4.15.0-48-generic ppc64le
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 May 2 08:06 seq
crw-rw---- 1 root audio 116, 33 May 2 08:06 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu May 2 08:16:36 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
PciMultimedia:
ProcFB: 0 astdrmfb
ProcKernelCmdLine: root=UUID=82644f4d-d7cb-4abf-b5e9-d8b5644f77dd ro console=hvc0
ProcLoadAvg: 0.02 0.42 0.37 1/1332 5365
ProcLocks:
1: FLOCK ADVISORY WRITE 4460 00:17:336 0 EOF
2: POSIX ADVISORY WRITE 3994 00:17:604 0 EOF
3: FLOCK ADVISORY WRITE 3913 00:17:586 0 EOF
4: POSIX ADVISORY WRITE 3984 00:17:620 0 EOF
5: POSIX ADVISORY WRITE 1816 00:17:356 0 EOF
ProcSwaps:
Filename Type Size Used Priority
/swap.img file 8388544 0 -2
ProcVersion: Linux version 4.15.0-48-generic (buildd@bos02-ppc64el-010) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #51-Ubuntu SMP Wed Apr 3 08:26:19 UTC 2019
RelatedPackageVersions:
linux-restricted-modules-4.15.0-48-generic N/A
linux-backports-modules-4.15.0-48-generic N/A
linux-firmware 1.173.5
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
VarLogDump_list: total 0
cpu_cores: Number of cores present = 40
cpu_coreson: Number of cores online = 40
cpu_dscr: DSCR is 16
cpu_freq:
min: 2.862 GHz (cpu 159)
max: 2.862 GHz (cpu 81)
avg: 2.862 GHz
cpu_runmode:
Could not retrieve current diagnostics mode,
No kernel interface to firmware
cpu_smt: SMT=4
** Affects: linux (Ubuntu)
Importance: Undecided
Status: Confirmed
** Tags: apport-bug bionic ppc64el uec-images
--
https://bugs.launchpad.net/bugs/1827343
Title:
CPU hard lockup when turning CPU back online on Bionic P9