That one completed two runs, but on the second run, dmesg included the
following message at one point:
[ 240.841694] kernel BUG at
/home/jsalisbury/bugs/lp1733662/ubuntu-artful/mm/slub.c:3878!
[ 240.842765] invalid opcode: 0000 [#1] SMP
[ 240.843718] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel kvm irqbypass intel_cstate intel_rapl_perf
ipmi_ssif joydev input_leds ipmi_si ipmi_devintf ipmi_msghandler
acpi_power_meter lpc_ich shpchp acpi_pad mac_hid mei_me mei ib_iser rdma_cm
iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure
scsi_transport_sas crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc fnic
mgag200 ttm hid_generic drm_kms_helper syscopyarea igb sysfillrect aesni_intel
sysimgblt usbhid libfcoe fb_sys_fops aes_x86_64 dca hid crypto_simd
i2c_algo_bit mxm_wmi glue_helper ptp cryptd ahci libfc libahci
[ 240.851457] drm pps_core megaraid_sas scsi_transport_fc enic wmi
[ 240.852693] CPU: 8 PID: 2724 Comm: irqbalance Not tainted 4.13.0-13-generic
#14~lp1733662Commitac2fc5adab0f4
[ 240.853965] Hardware name: Cisco Systems Inc UCSC-C240-M4L/UCSC-C240-M4L,
BIOS C240M4.2.0.10c.0.032320160820 03/23/2016
[ 240.855281] task: ffff9b62a76645c0 task.stack: ffffb973cf6fc000
[ 240.856603] RIP: 0010:kfree+0x11c/0x160
[ 240.857937] RSP: 0018:ffffb973cf6ffa08 EFLAGS: 00010246
[ 240.859280] RAX: fffff8803cff0020 RBX: ffff9b6200000000 RCX: 0000000000000000
[ 240.860632] RDX: 0000000000000000 RSI: ffff9b62b0eb5348 RDI: 000064dcc0000000
[ 240.861995] RBP: ffffb973cf6ffa20 R08: ffff9b62b22f70f0 R09: 0000000180220021
[ 240.863367] R10: fffff8803d000000 R11: 0000000000000001 R12: ffff9b62b1648780
[ 240.864756] R13: ffffffffb65dd4e0 R14: ffff9b62a872f0d8 R15: ffff9b62a872fac0
[ 240.866145] FS: 00007ff8c4d06740(0000) GS:ffff9b62bf200000(0000)
knlGS:0000000000000000
[ 240.867562] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 240.868986] CR2: 00007fff9ef860f8 CR3: 0000003fe7876000 CR4: 00000000001406e0
[ 240.870438] Call Trace:
[ 240.871882] kfree_const+0x20/0x30
[ 240.873328] kernfs_put+0x71/0x180
[ 240.874778] kernfs_dop_release+0x12/0x20
[ 240.876218] __dentry_kill+0xe5/0x150
[ 240.877644] shrink_dentry_list+0x11f/0x2e0
[ 240.879078] d_invalidate+0x67/0x110
[ 240.880526] lookup_fast+0x2b9/0x310
[ 240.881968] ? dput.part.23+0x2d/0x1e0
[ 240.883393] walk_component+0x49/0x340
[ 240.884811] ? kernfs_iop_permission+0x4f/0x60
[ 240.886253] link_path_walk+0x1bc/0x590
[ 240.887690] ? path_init+0x177/0x2f0
[ 240.889105] path_lookupat+0x56/0x1f0
[ 240.890529] filename_lookup+0xb6/0x190
[ 240.891964] ? sprintf+0x51/0x70
[ 240.893387] ? __check_object_size+0xaf/0x1b0
[ 240.894822] ? strncpy_from_user+0x4d/0x170
[ 240.896240] user_path_at_empty+0x36/0x40
[ 240.897673] ? user_path_at_empty+0x36/0x40
[ 240.899101] vfs_statx+0x76/0xe0
[ 240.900517] SYSC_newstat+0x3d/0x70
[ 240.901934] ? ____fput+0xe/0x10
[ 240.903365] ? task_work_run+0x7b/0x90
[ 240.904783] ? exit_to_usermode_loop+0x9b/0xd0
[ 240.906181] SyS_newstat+0xe/0x10
[ 240.907559] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 240.908900] RIP: 0033:0x7ff8c3df6bb5
[ 240.910196] RSP: 002b:00007ffe6cf8a928 EFLAGS: 00000246 ORIG_RAX:
0000000000000004
[ 240.911496] RAX: ffffffffffffffda RBX: 0000000000fe9a40 RCX: 00007ff8c3df6bb5
[ 240.912763] RDX: 00007ffe6cf8a980 RSI: 00007ffe6cf8a980 RDI: 00007ffe6cf8c210
[ 240.913985] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000039
[ 240.915181] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 240.916320] R13: 00007ffe6cf8b22b R14: 0000000000fe9a40 R15: 0000000000fe92f0
[ 240.917447] Code: 08 49 83 c4 18 48 89 da 4c 89 ee ff d0 49 8b 04 24 48 85
c0 75 e6 e9 0e ff ff ff 49 8b 02 f6 c4 80 75 0a 49 8b 42 20 a8 01 75 02 <0f> 0b
49 8b 02 31 f6 f6 c4 80 74 04 41 8b 72 6c 4c 89 d7 e8 2c
[ 240.919769] RIP: kfree+0x11c/0x160 RSP: ffffb973cf6ffa08
[ 240.920909] ---[ end trace 67fe147f4dd931eb ]---
A third run produced a hang when offlining CPU 8, with the following
dmesg output:
[ 352.776303] EDAC MC1: Giving out device to module sb_edac.c controller
Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0 (INTERRUPT)
[ 352.776572] EDAC sbridge: Some needed devices are missing
[ 352.801614] EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#1_Ha#0:
DEV 0000:ff:12.0
[ 352.825588] EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#0_Ha#0:
DEV 0000:7f:12.0
[ 352.826090] EDAC sbridge: Couldn't find mci handler
[ 352.826457] EDAC sbridge: Couldn't find mci handler
[ 352.826826] EDAC sbridge: Failed to register device with error -19.
[ 353.286163] BUG: unable to handle kernel paging request at 0000317865646e69
[ 353.286790] IP: __kmalloc_node+0x135/0x2a0
[ 353.287303] PGD 0
[ 353.287304] P4D 0
[ 353.288695] Oops: 0000 [#2] SMP
[ 353.289158] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel kvm irqbypass intel_cstate intel_rapl_perf
ipmi_ssif joydev input_leds ipmi_si ipmi_devintf ipmi_msghandler
acpi_power_meter lpc_ich shpchp acpi_pad mac_hid mei_me mei ib_iser rdma_cm
iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure
scsi_transport_sas crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc fnic
mgag200 ttm hid_generic drm_kms_helper syscopyarea igb sysfillrect aesni_intel
sysimgblt usbhid libfcoe fb_sys_fops aes_x86_64 dca hid crypto_simd
i2c_algo_bit mxm_wmi glue_helper ptp cryptd ahci libfc libahci
[ 353.294318] drm pps_core megaraid_sas scsi_transport_fc enic wmi
[ 353.295246] CPU: 8 PID: 56 Comm: cpuhp/8 Tainted: G D
4.13.0-13-generic #14~lp1733662Commitac2fc5adab0f4
[ 353.296231] Hardware name: Cisco Systems Inc UCSC-C240-M4L/UCSC-C240-M4L,
BIOS C240M4.2.0.10c.0.032320160820 03/23/2016
[ 353.297274] task: ffff9b62b8fc0000 task.stack: ffffb973cc780000
[ 353.298341] RIP: 0010:__kmalloc_node+0x135/0x2a0
[ 353.299416] RSP: 0018:ffffb973cc783bb0 EFLAGS: 00010246
[ 353.300511] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000008a2
[ 353.301652] RDX: 00000000000008a1 RSI: 0000000000000000 RDI: 000000000001f3e0
[ 353.302793] RBP: ffffb973cc783bf0 R08: ffff9b62bf21f3e0 R09: ffff9b42bf807c00
[ 353.303960] R10: 000000000000024c R11: 0000000000020dd1 R12: 00000000014080c0
[ 353.305155] R13: 0000000000000008 R14: 0000317865646e69 R15: ffff9b42bf807c00
[ 353.306379] FS: 0000000000000000(0000) GS:ffff9b62bf200000(0000)
knlGS:0000000000000000
[ 353.307637] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 353.308901] CR2: 0000317865646e69 CR3: 0000002343409000 CR4: 00000000001406e0
[ 353.310205] Call Trace:
[ 353.311531] ? alloc_cpumask_var_node+0x1f/0x30
[ 353.312881] alloc_cpumask_var_node+0x1f/0x30
[ 353.314245] zalloc_cpumask_var+0x14/0x20
[ 353.315616] cpudl_init+0x6a/0xe0
[ 353.316992] init_rootdomain+0x7a/0xd0
[ 353.318393] build_sched_domains+0x26a/0xdd0
[ 353.319817] ? call_rcu_sched+0x17/0x20
[ 353.321249] ? cpu_attach_domain+0x1af/0x6a0
[ 353.322698] ? kfree+0x14a/0x160
[ 353.324146] partition_sched_domains+0x1c6/0x2f0
[ 353.325623] ? sched_cpu_activate+0xd0/0xd0
[ 353.327122] cpuset_update_active_cpus+0x17/0x40
[ 353.328583] sched_cpu_deactivate+0x94/0xd0
[ 353.330052] ? call_rcu_bh+0x20/0x20
[ 353.331495] ? call_rcu_bh+0x20/0x20
[ 353.332894] ? trace_raw_output_rcu_utilization+0x50/0x50
[ 353.334320] ? pick_next_task_fair+0x48e/0x560
[ 353.335736] cpuhp_invoke_callback+0x84/0x3b0
[ 353.337164] cpuhp_down_callbacks+0x42/0x80
[ 353.338579] cpuhp_thread_fun+0x88/0xe0
[ 353.339971] smpboot_thread_fn+0xec/0x160
[ 353.341346] kthread+0x125/0x140
[ 353.342723] ? sort_range+0x30/0x30
[ 353.344106] ? kthread_create_on_node+0x70/0x70
[ 353.345521] ret_from_fork+0x25/0x30
[ 353.346928] Code: 89 cf 4c 89 4d c0 e8 0b 7f 01 00 49 89 c7 4c 8b 4d c0 4d
85 ff 0f 85 47 ff ff ff 45 31 f6 eb 3c 49 63 47 20 49 8b 3f 48 8d 4a 01 <49> 8b
1c 06 4c 89 f0 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 20 ff
[ 353.349833] RIP: __kmalloc_node+0x135/0x2a0 RSP: ffffb973cc783bb0
[ 353.351218] CR2: 0000317865646e69
[ 353.352559] ---[ end trace 67fe147f4dd931ec ]---
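A side note that is not part of the original report: the faulting address shown in CR2 and R14 above (0000317865646e69) consists entirely of printable ASCII bytes plus two zero bytes when read in x86-64 little-endian order. The sketch below decodes it; the helper name is made up for illustration. A pointer that reads as text may hint that a freed slab object was overwritten with string data, but that is an inference, not a confirmed diagnosis.

```python
import struct


def decode_pointer_as_ascii(value: int) -> str:
    """Render a 64-bit value as the bytes it occupies in little-endian memory.

    Non-printable bytes are shown as '.' so the result is safe to print.
    """
    raw = struct.pack("<Q", value)  # x86-64 stores integers little-endian
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # Faulting address taken from the CR2 line of the oops above.
    print(decode_pointer_as_ascii(0x0000317865646E69))  # prints "index1.."
```
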
Although the test script hung, I was able to continue using my other
terminal normally, run other programs, log out, log back in, etc. An
attempt to shut down ("sudo shutdown -h now") did not succeed; the system
hung with "[ OK ] Stopped target Multi-User System" on the console.
After forcing a restart via the BMC, I ran the test script again. It
completed one run but then hung on the second run, and the system had
only limited functionality afterwards. The dmesg output on the second
run included the following:
[ 103.752641] ------------[ cut here ]------------
[ 103.752643] kernel BUG at
/home/jsalisbury/bugs/lp1733662/ubuntu-artful/mm/slub.c:3878!
[ 103.753548] invalid opcode: 0000 [#1] SMP
[ 103.754440] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal
intel_powerclamp ipmi_ssif coretemp joydev input_leds intel_cstate ipmi_si
intel_rapl_perf mei_me ipmi_devintf ipmi_msghandler kvm_intel kvm irqbypass mei
mac_hid shpchp acpi_power_meter lpc_ich acpi_pad ib_iser rdma_cm iw_cm ib_cm
ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor
raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure
scsi_transport_sas crct10dif_pclmul mgag200 crc32_pclmul igb ttm hid_generic
ghash_clmulni_intel drm_kms_helper fnic pcbc usbhid dca syscopyarea aesni_intel
sysfillrect i2c_algo_bit sysimgblt fb_sys_fops hid libfcoe aes_x86_64 ahci ptp
crypto_simd libfc glue_helper mxm_wmi cryptd drm
[ 103.762134] libahci pps_core enic scsi_transport_fc megaraid_sas wmi
[ 103.763369] CPU: 0 PID: 3649 Comm: python3 Not tainted 4.13.0-13-generic
#14~lp1733662Commitac2fc5adab0f4
[ 103.764641] Hardware name: Cisco Systems Inc UCSC-C240-M4L/UCSC-C240-M4L,
BIOS C240M4.2.0.10c.0.032320160820 03/23/2016
[ 103.765948] task: ffff8e90a5999740 task.stack: ffff9dbb4e320000
[ 103.767263] RIP: 0010:kfree+0x11c/0x160
[ 103.768601] RSP: 0018:ffff9dbb4e323cb0 EFLAGS: 00010246
[ 103.769941] RAX: fffffa5b3cff0020 RBX: ffff8eb000000000 RCX: 0000000000000000
[ 103.771301] RDX: 0000000000000000 RSI: 0000000000000028 RDI: 0000718ec0000000
[ 103.772663] RBP: ffff9dbb4e323cc8 R08: dead000000000100 R09: ffffffff985ed7a8
[ 103.774049] R10: fffffa5b3d000000 R11: 0000000000000000 R12: 0000000000000028
[ 103.775426] R13: ffffffff97eead09 R14: 000000000000000a R15: ffffffff977143f0
[ 103.776809] FS: 00007f1e1c29f700(0000) GS:ffff8e90bfc00000(0000)
knlGS:0000000000000000
[ 103.778214] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 103.779645] CR2: 000055be9d7243a8 CR3: 0000003ff74a3000 CR4: 00000000001406f0
[ 103.781094] Call Trace:
[ 103.782527] free_cpumask_var+0x9/0x10
[ 103.783961] smpcfd_dead_cpu+0x24/0x40
[ 103.785415] cpuhp_invoke_callback+0x84/0x3b0
[ 103.786859] ? flow_cache_lookup+0x4c0/0x4c0
[ 103.788303] cpuhp_down_callbacks+0x42/0x80
[ 103.789745] _cpu_down+0xc2/0x100
[ 103.791191] do_cpu_down+0x33/0x50
[ 103.792624] cpu_down+0x10/0x20
[ 103.794056] cpu_subsys_offline+0x14/0x20
[ 103.795492] device_offline+0x73/0xc0
[ 103.796926] online_store+0x4c/0xa0
[ 103.798351] dev_attr_store+0x18/0x30
[ 103.799779] sysfs_kf_write+0x37/0x40
[ 103.801201] kernfs_fop_write+0x11c/0x1a0
[ 103.802634] __vfs_write+0x18/0x40
[ 103.804065] vfs_write+0xb1/0x1a0
[ 103.805485] SyS_write+0x55/0xc0
[ 103.806888] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 103.808310] RIP: 0033:0x7f1e1be7f4a0
[ 103.809730] RSP: 002b:00007ffc4ead2768 EFLAGS: 00000246 ORIG_RAX:
0000000000000001
[ 103.811181] RAX: ffffffffffffffda RBX: 0000000001d8b410 RCX: 00007f1e1be7f4a0
[ 103.812648] RDX: 0000000000000002 RSI: 0000000001ea1060 RDI: 0000000000000003
[ 103.814122] RBP: 0000000000a3e020 R08: 0000000000000000 R09: 0000000000000001
[ 103.815600] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000003
[ 103.817048] R13: 0000000000501520 R14: 00007ffc4ead2bd0 R15: 00007f1e1ad98240
[ 103.818475] Code: 08 49 83 c4 18 48 89 da 4c 89 ee ff d0 49 8b 04 24 48 85
c0 75 e6 e9 0e ff ff ff 49 8b 02 f6 c4 80 75 0a 49 8b 42 20 a8 01 75 02 <0f> 0b
49 8b 02 31 f6 f6 c4 80 74 04 41 8b 72 6c 4c 89 d7 e8 2c
[ 103.821390] RIP: kfree+0x11c/0x160 RSP: ffff9dbb4e323cb0
[ 103.822826] ---[ end trace 7c1d545f713a5ad1 ]---
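For readers who want to exercise the same code path: the trace above shows a python3 process reaching cpu_down through a write to a sysfs "online" attribute (online_store). The actual test script is not included in this message, so the following is only a minimal sketch of that kind of offline/online cycle. The function names, the `root` parameter, and the CPU numbers are assumptions for illustration; writing to the real sysfs files requires root, and on an affected kernel it may hang the machine.

```python
from pathlib import Path

# Standard sysfs location of per-CPU hotplug controls on Linux.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def set_cpu_online(cpu: int, online: bool, root: Path = SYSFS_CPU_ROOT) -> None:
    """Write "1" or "0" to cpu<N>/online to bring a CPU up or take it down."""
    (root / f"cpu{cpu}" / "online").write_text("1\n" if online else "0\n")


def cycle_cpus(cpus, root: Path = SYSFS_CPU_ROOT) -> None:
    """Offline every listed CPU, then bring each one back online."""
    cpus = list(cpus)
    for cpu in cpus:
        set_cpu_online(cpu, False, root)
    for cpu in cpus:
        set_cpu_online(cpu, True, root)
```

A hypothetical invocation would be `cycle_cpus(range(1, 16))` as root. CPU 0 is skipped in that example because it often cannot be offlined. The `root` parameter exists only so the functions can be pointed at a scratch directory for a dry run.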
--
https://bugs.launchpad.net/bugs/1733662
Title:
System hang with Linux kernel 4.13, not with 4.10