This issue can be reproduced on X-4.15 Oracle (4.15.0-1032.35~16.04.1),
on instances BM.Standard1.36 and VM.Standard1.4 (the test passed on
VM.DenseIO1.8)
** Summary changed:
- Bionic powerpc hang with ubuntu_kernel_selftests (kernel oops)
+ Segmentation fault (kernel oops) with memory-hotplug in
ubuntu_kernel_selftests on Bionic kernel
** Description changed:
- It looks like some test inside the ubuntu_kernel_selftests has triggered
- this issue, the jenkins job "sru-misc__B_ppc64el-
+ It looks like the memory-hotplug test in ubuntu_kernel_selftests
+ triggers this issue.
+
+ This issue cannot be reproduced with the kernel in -updates, but can be
+ reproduced quite easily with the proposed kernel.
+
+ On jenkins you will see the jenkins job "sru-misc__B_ppc64el-
generic__using_baltar__for_kernel" hung at the same spot (the beginning
of the KVM unit test) for two out of two attempts:
05:06:37 INFO | GOOD ubuntu_kvm_unit_tests.setup
ubuntu_kvm_unit_tests.setup timestamp=1580792797 localtime=Feb 04
05:06:37 completed successfully
- 05:06:37 INFO | END GOOD ubuntu_kvm_unit_tests.setup
ubuntu_kvm_unit_tests.setup timestamp=1580792797 localtime=Feb 04
05:06:37
+ 05:06:37 INFO | END GOOD ubuntu_kvm_unit_tests.setup
ubuntu_kvm_unit_tests.setup timestamp=1580792797 localtime=Feb 04
05:06:37
05:06:37 DEBUG| Persistent state client._record_indent now set to 1
05:06:37 DEBUG| Persistent state client.unexpected_reboot deleted
- 05:06:37 INFO | START ubuntu_kvm_unit_tests.emulator
ubuntu_kvm_unit_tests.emulator timestamp=1580792797 localtime=Feb 04
05:06:37
+ 05:06:37 INFO | START ubuntu_kvm_unit_tests.emulator
ubuntu_kvm_unit_tests.emulator timestamp=1580792797 localtime=Feb 04
05:06:37
05:06:37 DEBUG| Persistent state client._record_indent now set to 2
05:06:37 DEBUG| Persistent state client.unexpected_reboot now set to
('ubuntu_kvm_unit_tests.emulator', 'ubuntu_kvm_unit_tests.emulator')
05:06:37 DEBUG| Running 'kvm-ok'
05:06:37 DEBUG| [stdout] INFO: /dev/kvm exists
05:06:37 DEBUG| [stdout] KVM acceleration can be used
05:06:37 DEBUG| Running 'ppc64_cpu --smt=off'
Build was aborted
The syslog shows a call trace after the pages are offlined and before the
test_bpf output:
[ 1195.321441] Offlined Pages 4096
[ 1195.335056] Offlined Pages 4096
[ 1195.354614] Offlined Pages 4096
[ 1198.491967] Offlined Pages 4096
[ 1199.457587] Injecting error (-12) to MEM_GOING_ONLINE
[ 1200.473838] ------------[ cut here ]------------
[ 1200.473841] kernel BUG at
/build/linux-CWyQTi/linux-4.15.0/kernel/rcu/sync.c:128!
[ 1200.473909] Oops: Exception in kernel mode, sig: 5 [#1]
[ 1200.473953] LE SMP NR_CPUS=2048 NUMA PowerNV
[ 1200.473999] Modules linked in: memory_notifier_error_inject
notifier_error_inject overlay veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter binfmt_misc joydev
input_leds mac_hid idt_89hpesx opal_prd ofpart at24 cmdlinepart powernv_flash
ipmi_powernv uio_pdrv_genirq uio mtd ipmi_devintf ibmpowernv ipmi_msghandler
sch_fq_codel vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp
libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs
zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure
scsi_transport_sas ast i2c_algo_bit hid_generic ttm drm_kms_helper
[ 1200.474641] syscopyarea usbhid sysfillrect sysimgblt hid fb_sys_fops
crct10dif_vpmsum crc32c_vpmsum drm i40e aacraid [last unloaded: test_bpf]
[ 1200.474792] CPU: 12 PID: 139071 Comm: mem-on-off-test Not tainted
4.15.0-87-generic #87-Ubuntu
[ 1200.474894] NIP: c0000000001a8490 LR: c0000000001a8478 CTR:
c00000000026c5e0
[ 1200.474981] REGS: c000000c830ff7c0 TRAP: 0700 Not tainted
(4.15.0-87-generic)
[ 1200.475084] MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
CR: 28222888 XER: 20040000
[ 1200.475219] CFAR: c00000000001940c SOFTE: 1
[ 1200.475219] GPR00: c0000000001a8434 c000000c830ffa40 c00000000172c900
0000000000000001
[ 1200.475219] GPR04: 00000000000001f0 c000000c7a4d2480 0000000028228882
c00000000001e730
[ 1200.475219] GPR08: 0000000ff9a10000 0000000000000001 0000000000000000
c000000c61bab790
[ 1200.475219] GPR12: 0000000000002000 c00000000fa88400 0000058d97936070
0000000000000000
[ 1200.475219] GPR16: 0000058d6b6e9690 0000058d6b776ab0 0000058d6b7a8204
0000058d6b776ae8
[ 1200.475219] GPR20: 0000058d6b7ad5d8 0000000000000001 0000000000000000
00007fffd1cb80e4
[ 1200.475219] GPR24: 00007fffd1cb80e0 c000000001763428 c0000000015f6ba8
0000000000000000
[ 1200.475219] GPR28: 0000000000000020 c0000000015f6bb0 ffffffffffffffff
c0000000015f6ba8
[ 1200.476036] NIP [c0000000001a8490] rcu_sync_enter+0xa0/0x1e0
[ 1200.476124] LR [c0000000001a8478] rcu_sync_enter+0x88/0x1e0
[ 1200.476180] Call Trace:
[ 1200.476215] [c000000c830ffa40] [c000000c830ffaa0] 0xc000000c830ffaa0
(unreliable)
[ 1200.476311] [c000000c830ffab0] [c0000000001889a8]
percpu_down_write+0x38/0x140
[ 1200.476407] [c000000c830ffb00] [c00000000039fa6c] online_pages+0x1fc/0x440
[ 1200.476456] [c000000c830ffbd0] [c0000000008a7320]
memory_subsys_online+0x180/0x250
[ 1200.476495] [c000000c830ffc60] [c000000000879f54] device_online+0x84/0x120
[ 1200.476528] [c000000c830ffca0] [c0000000008a7ee8]
store_mem_state+0xb8/0x180
[ 1200.476566] [c000000c830ffce0] [c0000000008744bc] dev_attr_store+0x3c/0x60
[ 1200.476599] [c000000c830ffd00] [c0000000004ae254] sysfs_kf_write+0x64/0x90
[ 1200.476631] [c000000c830ffd20] [c0000000004acf2c]
kernfs_fop_write+0x1ac/0x240
[ 1200.476670] [c000000c830ffd70] [c0000000003e147c] __vfs_write+0x3c/0x70
[ 1200.476703] [c000000c830ffd90] [c0000000003e16d8] vfs_write+0xd8/0x220
[ 1200.476735] [c000000c830ffde0] [c0000000003e1a38] SyS_write+0x78/0x140
[ 1200.476768] [c000000c830ffe30] [c00000000000b288] system_call+0x5c/0x70
[ 1200.476799] Instruction dump:
[ 1200.476819] 409e00b0 7c2004ac 39200000 38600001 913f0008 4be70f85 60000000
2fbe0000
[ 1200.476858] 39200000 419e000c 7f9c0034 5789d97e <0b090000> 4092008c
813f0038 3d42fffb
[ 1200.476909] ---[ end trace 5ef11694541f2535 ]---
[ 1200.527850]
[ 1224.784549] test_bpf: #0 TAX jited:1 36 35 33 PASS
[ 1224.785669] test_bpf: #1 TXA jited:1 11 11 11 PASS
[ 1224.786073] test_bpf: #2 ADD_SUB_MUL_K jited:1 10 PASS
[ 1224.786236] test_bpf: #3 DIV_MOD_KX jited:1 15 PASS
[ 1224.786444] test_bpf: #4 AND_OR_LSH_K jited:1 10 10 PASS
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-87-generic 4.15.0-87.87
ProcVersionSignature: Ubuntu 4.15.0-87.87-generic 4.15.18
Uname: Linux 4.15.0-87-generic ppc64le
.sys.firmware.opal.msglog: Error: [Errno 13] Permission denied:
'/sys/firmware/opal/msglog'
AlsaDevices:
- total 0
- crw-rw---- 1 root audio 116, 1 Feb 6 06:35 seq
- crw-rw---- 1 root audio 116, 33 Feb 6 06:35 timer
+ total 0
+ crw-rw---- 1 root audio 116, 1 Feb 6 06:35 seq
+ crw-rw---- 1 root audio 116, 33 Feb 6 06:35 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.10
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord':
'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
'/dev/snd/timer'] failed with exit code 1:
CurrentDmesg:
-
+
Date: Fri Feb 7 07:57:32 2020
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
- Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
- Bus 001 Device 003: ID 0451:80ff Texas Instruments, Inc.
- Bus 001 Device 004: ID 0557:2419 ATEN International Co., Ltd
- Bus 001 Device 002: ID 0557:7000 ATEN International Co., Ltd Hub
- Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
+ Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
+ Bus 001 Device 003: ID 0451:80ff Texas Instruments, Inc.
+ Bus 001 Device 004: ID 0557:2419 ATEN International Co., Ltd
+ Bus 001 Device 002: ID 0557:7000 ATEN International Co., Ltd Hub
+ Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
PciMultimedia:
-
+
ProcFB: 0 astdrmfb
ProcKernelCmdLine: root=UUID=acd1a0d7-f6fc-4130-928c-c8b11ad6e4be ro
console=hvc0
ProcLoadAvg: 2.02 1.31 1.11 1/1377 37783
ProcSwaps:
- Filename Type Size Used Priority
- /swap.img file 8388544 0 -2
+ Filename Type Size Used Priority
+ /swap.img file 8388544 0 -2
ProcVersion: Linux version 4.15.0-87-generic (buildd@bos02-ppc64el-002) (gcc
version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #87-Ubuntu SMP Fri Jan 31
19:32:29 UTC 2020
RelatedPackageVersions:
- linux-restricted-modules-4.15.0-87-generic N/A
- linux-backports-modules-4.15.0-87-generic N/A
- linux-firmware 1.173.15
+ linux-restricted-modules-4.15.0-87-generic N/A
+ linux-backports-modules-4.15.0-87-generic N/A
+ linux-firmware 1.173.15
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
VarLogDump_list: total 0
cpu_cores: Number of cores present = 40
cpu_coreson: Number of cores online = 39
cpu_smt: SMT=4
** Also affects: linux (Ubuntu Bionic)
Importance: Undecided
Status: New
** Description changed:
It looks like the memory-hotplug test in ubuntu_kernel_selftests
triggers this issue.
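A minimal sketch of the kind of step that appears to hit the BUG, based
on the syslog line "Injecting error (-12) to MEM_GOING_ONLINE" and the
store_mem_state -> online_pages call trace in this report. It is not the
selftest itself: the function name and the block number are hypothetical,
and the two paths are the usual notifier-error-inject debugfs and memory
sysfs locations, assumed to be present on the test machine. Run it only
as root on a disposable system.

```python
from pathlib import Path


def try_online_with_injected_error(block, errno=-12,
                                   sysfs=Path("/sys"),
                                   debugfs=Path("/sys/kernel/debug")):
    """Arm an error for MEM_GOING_ONLINE, then try to online one memory block.

    Returns True if the write of "online" succeeded, False if it was rejected.
    The block number is a placeholder; pick a block that is currently offline.
    """
    # Assumed location of the memory notifier error injection knob.
    inject = (debugfs / "notifier-error-inject" / "memory" / "actions"
              / "MEM_GOING_ONLINE" / "error")
    # Assumed location of the per-block state file.
    state = sysfs / "devices" / "system" / "memory" / f"memory{block}" / "state"

    inject.write_text(f"{errno}\n")
    try:
        state.write_text("online\n")
        return True
    except OSError:
        # With the error armed, the kernel is expected to refuse the request.
        return False
    finally:
        # Always disarm the injection again.
        inject.write_text("0\n")
```

On an unaffected kernel the call is expected to return False while the
error is armed; on the affected kernels the write path is where the oops
in this report was logged.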
This issue cannot be reproduced with the kernel in -updates, but can be
- reproduced quite easily with the proposed kernel.
+ reproduced quite easily with the proposed kernel (X-4.15 Oracle
+ 4.15.0-1032.35~16.04.1).
- On jenkins you will see the jenkins job "sru-misc__B_ppc64el-
- generic__using_baltar__for_kernel" hung at the same spot (the beginning
- of the KVM unit test) for two out of two attempts:
+ This was spotted on the following kernels (for now):
+ * X-oracle-4.15
+ * B
+ * B-oracle-4.15
+
+ This is not easy to spot, because the jenkins job just hangs and no test
result appears on the report page.
+ For example, the jenkins job
"sru-misc__B_ppc64el-generic__using_baltar__for_kernel" hung at the same spot
(the beginning of the KVM unit test) in two out of two attempts:
05:06:37 INFO | GOOD ubuntu_kvm_unit_tests.setup
ubuntu_kvm_unit_tests.setup timestamp=1580792797 localtime=Feb 04
05:06:37 completed successfully
05:06:37 INFO | END GOOD ubuntu_kvm_unit_tests.setup
ubuntu_kvm_unit_tests.setup timestamp=1580792797 localtime=Feb 04
05:06:37
05:06:37 DEBUG| Persistent state client._record_indent now set to 1
05:06:37 DEBUG| Persistent state client.unexpected_reboot deleted
05:06:37 INFO | START ubuntu_kvm_unit_tests.emulator
ubuntu_kvm_unit_tests.emulator timestamp=1580792797 localtime=Feb 04
05:06:37
05:06:37 DEBUG| Persistent state client._record_indent now set to 2
05:06:37 DEBUG| Persistent state client.unexpected_reboot now set to
('ubuntu_kvm_unit_tests.emulator', 'ubuntu_kvm_unit_tests.emulator')
05:06:37 DEBUG| Running 'kvm-ok'
05:06:37 DEBUG| [stdout] INFO: /dev/kvm exists
05:06:37 DEBUG| [stdout] KVM acceleration can be used
05:06:37 DEBUG| Running 'ppc64_cpu --smt=off'
Build was aborted
The syslog shows a call trace after the pages are offlined and before the
test_bpf output:
[ 1195.321441] Offlined Pages 4096
[ 1195.335056] Offlined Pages 4096
[ 1195.354614] Offlined Pages 4096
[ 1198.491967] Offlined Pages 4096
[ 1199.457587] Injecting error (-12) to MEM_GOING_ONLINE
[ 1200.473838] ------------[ cut here ]------------
[ 1200.473841] kernel BUG at
/build/linux-CWyQTi/linux-4.15.0/kernel/rcu/sync.c:128!
[ 1200.473909] Oops: Exception in kernel mode, sig: 5 [#1]
[ 1200.473953] LE SMP NR_CPUS=2048 NUMA PowerNV
[ 1200.473999] Modules linked in: memory_notifier_error_inject
notifier_error_inject overlay veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter binfmt_misc joydev
input_leds mac_hid idt_89hpesx opal_prd ofpart at24 cmdlinepart powernv_flash
ipmi_powernv uio_pdrv_genirq uio mtd ipmi_devintf ibmpowernv ipmi_msghandler
sch_fq_codel vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp
libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs
zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure
scsi_transport_sas ast i2c_algo_bit hid_generic ttm drm_kms_helper
[ 1200.474641] syscopyarea usbhid sysfillrect sysimgblt hid fb_sys_fops
crct10dif_vpmsum crc32c_vpmsum drm i40e aacraid [last unloaded: test_bpf]
[ 1200.474792] CPU: 12 PID: 139071 Comm: mem-on-off-test Not tainted
4.15.0-87-generic #87-Ubuntu
[ 1200.474894] NIP: c0000000001a8490 LR: c0000000001a8478 CTR:
c00000000026c5e0
[ 1200.474981] REGS: c000000c830ff7c0 TRAP: 0700 Not tainted
(4.15.0-87-generic)
[ 1200.475084] MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
CR: 28222888 XER: 20040000
[ 1200.475219] CFAR: c00000000001940c SOFTE: 1
[ 1200.475219] GPR00: c0000000001a8434 c000000c830ffa40 c00000000172c900
0000000000000001
[ 1200.475219] GPR04: 00000000000001f0 c000000c7a4d2480 0000000028228882
c00000000001e730
[ 1200.475219] GPR08: 0000000ff9a10000 0000000000000001 0000000000000000
c000000c61bab790
[ 1200.475219] GPR12: 0000000000002000 c00000000fa88400 0000058d97936070
0000000000000000
[ 1200.475219] GPR16: 0000058d6b6e9690 0000058d6b776ab0 0000058d6b7a8204
0000058d6b776ae8
[ 1200.475219] GPR20: 0000058d6b7ad5d8 0000000000000001 0000000000000000
00007fffd1cb80e4
[ 1200.475219] GPR24: 00007fffd1cb80e0 c000000001763428 c0000000015f6ba8
0000000000000000
[ 1200.475219] GPR28: 0000000000000020 c0000000015f6bb0 ffffffffffffffff
c0000000015f6ba8
[ 1200.476036] NIP [c0000000001a8490] rcu_sync_enter+0xa0/0x1e0
[ 1200.476124] LR [c0000000001a8478] rcu_sync_enter+0x88/0x1e0
[ 1200.476180] Call Trace:
[ 1200.476215] [c000000c830ffa40] [c000000c830ffaa0] 0xc000000c830ffaa0
(unreliable)
[ 1200.476311] [c000000c830ffab0] [c0000000001889a8]
percpu_down_write+0x38/0x140
[ 1200.476407] [c000000c830ffb00] [c00000000039fa6c] online_pages+0x1fc/0x440
[ 1200.476456] [c000000c830ffbd0] [c0000000008a7320]
memory_subsys_online+0x180/0x250
[ 1200.476495] [c000000c830ffc60] [c000000000879f54] device_online+0x84/0x120
[ 1200.476528] [c000000c830ffca0] [c0000000008a7ee8]
store_mem_state+0xb8/0x180
[ 1200.476566] [c000000c830ffce0] [c0000000008744bc] dev_attr_store+0x3c/0x60
[ 1200.476599] [c000000c830ffd00] [c0000000004ae254] sysfs_kf_write+0x64/0x90
[ 1200.476631] [c000000c830ffd20] [c0000000004acf2c]
kernfs_fop_write+0x1ac/0x240
[ 1200.476670] [c000000c830ffd70] [c0000000003e147c] __vfs_write+0x3c/0x70
[ 1200.476703] [c000000c830ffd90] [c0000000003e16d8] vfs_write+0xd8/0x220
[ 1200.476735] [c000000c830ffde0] [c0000000003e1a38] SyS_write+0x78/0x140
[ 1200.476768] [c000000c830ffe30] [c00000000000b288] system_call+0x5c/0x70
[ 1200.476799] Instruction dump:
[ 1200.476819] 409e00b0 7c2004ac 39200000 38600001 913f0008 4be70f85 60000000
2fbe0000
[ 1200.476858] 39200000 419e000c 7f9c0034 5789d97e <0b090000> 4092008c
813f0038 3d42fffb
[ 1200.476909] ---[ end trace 5ef11694541f2535 ]---
[ 1200.527850]
[ 1224.784549] test_bpf: #0 TAX jited:1 36 35 33 PASS
[ 1224.785669] test_bpf: #1 TXA jited:1 11 11 11 PASS
[ 1224.786073] test_bpf: #2 ADD_SUB_MUL_K jited:1 10 PASS
[ 1224.786236] test_bpf: #3 DIV_MOD_KX jited:1 15 PASS
[ 1224.786444] test_bpf: #4 AND_OR_LSH_K jited:1 10 10 PASS
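Since the jenkins job only hangs, one way to notice this failure is to
scan the saved syslog for the BUG marker. The following is a rough
sketch, not part of the test suite: the function name and the returned
keys are made up for illustration, and the patterns are written only
against the log format shown in this report.

```python
import re

# "kernel BUG at <path>:<line>!" may be wrapped across two log lines.
BUG_RE = re.compile(r"kernel BUG at\s+(\S+):(\d+)!")
# "NIP [<address>] <function>+0x<offset>/0x<size>"
NIP_RE = re.compile(r"NIP \[[0-9a-f]+\] (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(log_text):
    """Return the source location and faulting function of the first BUG found."""
    bug = BUG_RE.search(log_text)
    nip = NIP_RE.search(log_text)
    return {
        "file": bug.group(1) if bug else None,
        "line": int(bug.group(2)) if bug else None,
        "function": nip.group(1) if nip else None,
    }


sample = """[ 1200.473841] kernel BUG at
/build/linux-CWyQTi/linux-4.15.0/kernel/rcu/sync.c:128!
[ 1200.476036] NIP [c0000000001a8490] rcu_sync_enter+0xa0/0x1e0
"""
summary = summarize_oops(sample)
print(summary["file"], summary["line"], summary["function"])
```

For the excerpt above this reports kernel/rcu/sync.c line 128 in
rcu_sync_enter, matching the trace.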
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-87-generic 4.15.0-87.87
ProcVersionSignature: Ubuntu 4.15.0-87.87-generic 4.15.18
Uname: Linux 4.15.0-87-generic ppc64le
.sys.firmware.opal.msglog: Error: [Errno 13] Permission denied:
'/sys/firmware/opal/msglog'
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Feb 6 06:35 seq
crw-rw---- 1 root audio 116, 33 Feb 6 06:35 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.10
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord':
'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
'/dev/snd/timer'] failed with exit code 1:
CurrentDmesg:
Date: Fri Feb 7 07:57:32 2020
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 0451:80ff Texas Instruments, Inc.
Bus 001 Device 004: ID 0557:2419 ATEN International Co., Ltd
Bus 001 Device 002: ID 0557:7000 ATEN International Co., Ltd Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
PciMultimedia:
ProcFB: 0 astdrmfb
ProcKernelCmdLine: root=UUID=acd1a0d7-f6fc-4130-928c-c8b11ad6e4be ro
console=hvc0
ProcLoadAvg: 2.02 1.31 1.11 1/1377 37783
ProcSwaps:
Filename Type Size Used Priority
/swap.img file 8388544 0 -2
ProcVersion: Linux version 4.15.0-87-generic (buildd@bos02-ppc64el-002) (gcc
version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #87-Ubuntu SMP Fri Jan 31
19:32:29 UTC 2020
RelatedPackageVersions:
linux-restricted-modules-4.15.0-87-generic N/A
linux-backports-modules-4.15.0-87-generic N/A
linux-firmware 1.173.15
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
VarLogDump_list: total 0
cpu_cores: Number of cores present = 40
cpu_coreson: Number of cores online = 39
cpu_smt: SMT=4
--
https://bugs.launchpad.net/bugs/1862312
Title:
Segmentation fault (kernel oops) with memory-hotplug in
ubuntu_kernel_selftests on Bionic kernel