Update: I could not recreate this bug using zbud as the allocator.

I need to test newer kernels and see if there's an easier way to trigger
this.
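
When comparing allocators like this, it may help to record the active zswap
settings alongside each test run. Below is a minimal sketch, assuming zswap
exposes its module parameters under /sys/module/zswap/parameters on the
running kernel. The helper name read_zswap_params and the list of parameters
read are illustrative choices, not something taken from this report.

```python
from pathlib import Path

# Assumed location of the zswap module parameters on the running kernel.
ZSWAP_PARAMS = Path("/sys/module/zswap/parameters")


def read_zswap_params(base=ZSWAP_PARAMS,
                      names=("enabled", "zpool", "compressor", "max_pool_percent")):
    """Return {name: value} for each parameter; None if it cannot be read."""
    params = {}
    for name in names:
        try:
            params[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Missing file (zswap not built in) or insufficient permissions.
            params[name] = None
    return params


if __name__ == "__main__":
    for name, value in read_zswap_params().items():
        print(f"{name}: {value if value is not None else '(unavailable)'}")
```

Running it before each reproduction attempt shows whether the pool was backed
by z3fold or zbud at the time; "(unavailable)" is printed for any parameter
that is missing or unreadable.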

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-gcp in Ubuntu.
https://bugs.launchpad.net/bugs/1785234

Title:
  Kernel panic softlockup with z3fold while running bitcoind

Status in linux-gcp package in Ubuntu:
  New

Bug description:
  So far I have only triggered this while running bitcoind (or
  bitcoin-qt) with the z3fold module loaded. I will do further testing
  to see if I can trigger it any other way.

  [54075.373045] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kswapd0:41]
  [54075.377042] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bitcoin-msghand:7509]
  [54075.377042] Modules linked in: cachefiles ip6table_filter ip6_tables fscache iptable_filter ip_tables x_tables crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf psmouse serio_raw virtio_net pvpanic i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc autofs4 aesni_intel aes_x86_64 crypto_simd cryptd glue_helper z3fold lz4hc lz4hc_compress
  [54075.377068] CPU: 1 PID: 7509 Comm: bitcoin-msghand Not tainted 4.15.0-1014-gcp #14~16.04.1-Ubuntu
  [54075.377069] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  [54075.377076] RIP: 0010:native_queued_spin_lock_slowpath+0x25/0x1a0
  [54075.377077] RSP: 0000:ffffaec7c13b7420 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff11
  [54075.377079] RAX: 0000000000000001 RBX: ffffdb7b82e96a80 RCX: 0000000000000000
  [54075.377080] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8c687a5aa010
  [54075.377081] RBP: ffffaec7c13b7420 R08: ffff8c6879c749d8 R09: 0000000000000002
  [54075.377082] R10: ffffaec7c13b74d8 R11: ffffaec7c13b7628 R12: ffff8c687a5aa000
  [54075.377083] R13: ffff8c687a5aa010 R14: ffff8c687a5aa000 R15: ffffdb7b82de7580
  [54075.377084] FS:  00007f133e7fc700(0000) GS:ffff8c687fd00000(0000) knlGS:0000000000000000
  [54075.377086] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [54075.377087] CR2: 00007f3c521ba000 CR3: 00000000b4d24006 CR4: 00000000003606e0
  [54075.377090] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [54075.377091] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [54075.377092] Call Trace:
  [54075.377099]  _raw_spin_lock+0x20/0x30
  [54075.377104]  z3fold_zpool_map+0x72/0xf0 [z3fold]
  [54075.377108]  zpool_map_handle+0x1c/0x20
  [54075.377111]  zswap_writeback_entry+0x47/0x350
  [54075.377114]  z3fold_zpool_evict+0x2b/0x40 [z3fold]
  [54075.377116]  z3fold_zpool_shrink+0x2a5/0x350 [z3fold]
  [54075.377118]  zpool_shrink+0x1c/0x20
  [54075.377119]  zswap_frontswap_store+0x271/0x4d0
  [54075.377123]  __frontswap_store+0x78/0x100
  [54075.377125]  swap_writepage+0x3f/0x80
  [54075.377128]  pageout.isra.53+0x1e6/0x340
  [54075.377131]  shrink_page_list+0x992/0xbe0
  [54075.377133]  shrink_inactive_list+0x296/0x5e0
  [54075.377135]  ? __switch_to_asm+0x40/0x70
  [54075.377137]  ? syscall_return_via_sysret+0x5/0x75
  [54075.377140]  shrink_node_memcg+0x367/0x7e0
  [54075.377142]  ? __switch_to_asm+0x40/0x70
  [54075.377144]  ? __switch_to_asm+0x40/0x70
  [54075.377146]  shrink_node+0xe1/0x310
  [54075.377147]  ? shrink_node+0xe1/0x310
  [54075.377149]  do_try_to_free_pages+0xee/0x360
  [54075.377152]  try_to_free_pages+0xf1/0x1c0
  [54075.377155]  __alloc_pages_slowpath+0x405/0xec0
  [54075.377158]  __alloc_pages_nodemask+0x265/0x280
  [54075.377162]  alloc_pages_current+0x6a/0xe0
  [54075.377165]  __page_cache_alloc+0x86/0x90
  [54075.377167]  generic_file_read_iter+0x817/0xb60
  [54075.377169]  ? __switch_to_asm+0x40/0x70
  [54075.377170]  ? __switch_to_asm+0x34/0x70
  [54075.377172]  ? __switch_to_asm+0x40/0x70
  [54075.377173]  ? __switch_to_asm+0x34/0x70
  [54075.377174]  ? __switch_to_asm+0x40/0x70
  [54075.377176]  ? __switch_to_asm+0x34/0x70
  [54075.377177]  ? __switch_to_asm+0x40/0x70
  [54075.377178]  ? __switch_to_asm+0x34/0x70
  [54075.377181]  ext4_file_read_iter+0x56/0xf0
  [54075.377184]  new_sync_read+0xe2/0x130
  [54075.377187]  __vfs_read+0x29/0x40
  [54075.377188]  vfs_read+0x93/0x130
  [54075.377190]  SyS_read+0x55/0xc0
  [54075.377194]  do_syscall_64+0x7b/0x150
  [54075.377196]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
  [54075.377197] RIP: 0033:0x7f1360bed27d
  [54075.377198] RSP: 002b:00007f133e7f8a00 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
  [54075.377200] RAX: ffffffffffffffda RBX: 00007f128e42c7d0 RCX: 00007f1360bed27d
  [54075.377201] RDX: 0000000000001000 RSI: 00007f128dfe27d0 RDI: 0000000000000014
  [54075.377201] RBP: 000000000000004f R08: 228def3009585ceb R09: 60c81007fca0c815
  [54075.377202] R10: 6156858af7255092 R11: 0000000000000293 R12: 000000000000003c
  [54075.377203] R13: 000000000000008b R14: 00007f128d6d03cf R15: 000000000000008b
  [54075.377205] Code: 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 0f 1f 44 00 00 ba 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb> ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
  [54075.377234] Kernel panic - not syncing: softlockup: hung tasks
  [54075.377236] CPU: 1 PID: 7509 Comm: bitcoin-msghand Tainted: G             L   4.15.0-1014-gcp #14~16.04.1-Ubuntu
  [54075.377237] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  [54075.377237] Call Trace:
  [54075.377239]  <IRQ>
  [54075.377241]  dump_stack+0x85/0xcb
  [54075.377244]  panic+0xe9/0x254
  [54075.377248]  watchdog_timer_fn+0x225/0x230
  [54075.377251]  ? watchdog+0x30/0x30
  [54075.377254]  __hrtimer_run_queues+0xe7/0x230
  [54075.377256]  hrtimer_interrupt+0xb1/0x200
  [54075.377259]  smp_apic_timer_interrupt+0x67/0x140
  [54075.377261]  apic_timer_interrupt+0x8e/0xa0
  [54075.377262]  </IRQ>
  [54075.377264] RIP: 0010:native_queued_spin_lock_slowpath+0x25/0x1a0
  [54075.377265] RSP: 0000:ffffaec7c13b7420 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff11
  [54075.377266] RAX: 0000000000000001 RBX: ffffdb7b82e96a80 RCX: 0000000000000000
  [54075.377267] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8c687a5aa010
  [54075.377268] RBP: ffffaec7c13b7420 R08: ffff8c6879c749d8 R09: 0000000000000002
  [54075.377269] R10: ffffaec7c13b74d8 R11: ffffaec7c13b7628 R12: ffff8c687a5aa000
  [54075.377270] R13: ffff8c687a5aa010 R14: ffff8c687a5aa000 R15: ffffdb7b82de7580
  [54075.377272]  _raw_spin_lock+0x20/0x30
  [54075.377274]  z3fold_zpool_map+0x72/0xf0 [z3fold]
  [54075.377276]  zpool_map_handle+0x1c/0x20
  [54075.377278]  zswap_writeback_entry+0x47/0x350
  [54075.377281]  z3fold_zpool_evict+0x2b/0x40 [z3fold]
  [54075.377283]  z3fold_zpool_shrink+0x2a5/0x350 [z3fold]
  [54075.377285]  zpool_shrink+0x1c/0x20
  [54075.377286]  zswap_frontswap_store+0x271/0x4d0
  [54075.377289]  __frontswap_store+0x78/0x100
  [54075.377291]  swap_writepage+0x3f/0x80
  [54075.377292]  pageout.isra.53+0x1e6/0x340
  [54075.377295]  shrink_page_list+0x992/0xbe0
  [54075.377297]  shrink_inactive_list+0x296/0x5e0
  [54075.377299]  ? __switch_to_asm+0x40/0x70
  [54075.377301]  ? syscall_return_via_sysret+0x5/0x75
  [54075.377303]  shrink_node_memcg+0x367/0x7e0
  [54075.377305]  ? __switch_to_asm+0x40/0x70
  [54075.377307]  ? __switch_to_asm+0x40/0x70
  [54075.377309]  shrink_node+0xe1/0x310
  [54075.377310]  ? shrink_node+0xe1/0x310
  [54075.377312]  do_try_to_free_pages+0xee/0x360
  [54075.377314]  try_to_free_pages+0xf1/0x1c0
  [54075.377317]  __alloc_pages_slowpath+0x405/0xec0
  [54075.377320]  __alloc_pages_nodemask+0x265/0x280
  [54075.377322]  alloc_pages_current+0x6a/0xe0
  [54075.377325]  __page_cache_alloc+0x86/0x90
  [54075.377327]  generic_file_read_iter+0x817/0xb60
  [54075.377329]  ? __switch_to_asm+0x40/0x70
  [54075.377330]  ? __switch_to_asm+0x34/0x70
  [54075.377332]  ? __switch_to_asm+0x40/0x70
  [54075.377333]  ? __switch_to_asm+0x34/0x70
  [54075.377334]  ? __switch_to_asm+0x40/0x70
  [54075.377336]  ? __switch_to_asm+0x34/0x70
  [54075.377337]  ? __switch_to_asm+0x40/0x70
  [54075.377338]  ? __switch_to_asm+0x34/0x70
  [54075.377340]  ext4_file_read_iter+0x56/0xf0
  [54075.377342]  new_sync_read+0xe2/0x130
  [54075.377344]  __vfs_read+0x29/0x40
  [54075.377345]  vfs_read+0x93/0x130
  [54075.377347]  SyS_read+0x55/0xc0
  [54075.377349]  do_syscall_64+0x7b/0x150
  [54075.377351]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
  [54075.377352] RIP: 0033:0x7f1360bed27d
  [54075.377353] RSP: 002b:00007f133e7f8a00 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
  [54075.377354] RAX: ffffffffffffffda RBX: 00007f128e42c7d0 RCX: 00007f1360bed27d
  [54075.377355] RDX: 0000000000001000 RSI: 00007f128dfe27d0 RDI: 0000000000000014
  [54075.377356] RBP: 000000000000004f R08: 228def3009585ceb R09: 60c81007fca0c815
  [54075.377357] R10: 6156858af7255092 R11: 0000000000000293 R12: 000000000000003c
  [54075.377358] R13: 000000000000008b R14: 00007f128d6d03cf R15: 000000000000008b
  [54076.121563] Modules linked in: cachefiles ip6table_filter ip6_tables fscache iptable_filter ip_tables x_tables crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf psmouse serio_raw virtio_net pvpanic i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc autofs4 aesni_intel aes_x86_64 crypto_simd cryptd glue_helper z3fold lz4hc lz4hc_compress
  [54076.155373] CPU: 0 PID: 41 Comm: kswapd0 Tainted: G             L   4.15.0-1014-gcp #14~16.04.1-Ubuntu
  [54076.165164] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  [54076.175167] RIP: 0010:native_queued_spin_lock_slowpath+0x25/0x1a0
  [54076.181385] RSP: 0018:ffffaec7c086f840 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff11
  [54076.189128] RAX: 0000000000000001 RBX: ffffdb7b82e96a80 RCX: 0000000000000000
  [54076.196729] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8c687a5aa010
  [54076.204247] RBP: ffffaec7c086f840 R08: ffff8c6879c749d8 R09: 0000000000000000
  [54076.212191] R10: ffff8c6879c749e0 R11: ffffaec7c086fa48 R12: ffff8c687a5aa000
  [54076.219535] R13: ffff8c687a5aa010 R14: ffff8c687a5aa000 R15: ffffdb7b82de7500
  [54076.227001] FS:  0000000000000000(0000) GS:ffff8c687fc00000(0000) knlGS:0000000000000000
  [54076.235520] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [54076.241564] CR2: 00007f12ff830098 CR3: 000000005c20a004 CR4: 00000000003606f0
  [54076.249134] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [54076.256448] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [54076.263940] Call Trace:
  [54076.266522]  _raw_spin_lock+0x20/0x30
  [54076.270445]  z3fold_zpool_map+0x72/0xf0 [z3fold]
  [54076.275233]  zpool_map_handle+0x1c/0x20
  [54076.279274]  zswap_writeback_entry+0x47/0x350
  [54076.283750]  ? __alloc_pages_nodemask+0x265/0x280
  [54076.288661]  z3fold_zpool_evict+0x2b/0x40 [z3fold]
  [54076.293659]  z3fold_zpool_shrink+0x2a5/0x350 [z3fold]
  [54076.298834]  zpool_shrink+0x1c/0x20
  [54076.302446]  zswap_frontswap_store+0x271/0x4d0
  [54076.307008]  __frontswap_store+0x78/0x100
  [54076.311137]  swap_writepage+0x3f/0x80
  [54076.315040]  pageout.isra.53+0x1e6/0x340
  [54076.319086]  shrink_page_list+0x992/0xbe0
  [54076.323405]  shrink_inactive_list+0x296/0x5e0
  [54076.327953]  shrink_node_memcg+0x367/0x7e0
  [54076.332176]  ? __switch_to_asm+0x34/0x70
  [54076.336217]  ? __switch_to_asm+0x40/0x70
  [54076.340284]  ? __switch_to_asm+0x40/0x70
  [54076.344324]  shrink_node+0xe1/0x310
  [54076.348019]  ? shrink_node+0xe1/0x310
  [54076.351975]  kswapd+0x32a/0x770
  [54076.355291]  kthread+0x105/0x140
  [54076.358638]  ? mem_cgroup_shrink_node+0x190/0x190
  [54076.363549]  ? kthread_associate_blkcg+0xa0/0xa0
  [54076.368307]  ? kthread_associate_blkcg+0xa0/0xa0
  [54076.373048]  ret_from_fork+0x3a/0x50
  [54076.376885] Code: 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 0f 1f 44 00 00 ba 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb> ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
  [54076.475882] Shutting down cpus with NMI
  [54076.482263] Kernel Offset: 0x3e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  [54076.494218] Rebooting in 10 seconds..
  [54086.492912] ACPI MEMORY or I/O RESET_REG.

  I thought it might be related to the amount of RAM in use or
  available, so I increased the available RAM and reduced bitcoind's
  cache size (dbcache), but the panic still occurred.

  The bitcoind log shows nothing of value, but it was visibly cut off
  "mid-sentence".

  The panic leaves my LVM volumes in a dirty state and the VM refuses to
  boot. The root partition isn't on an LVM volume so I can work around
  it.

  Will update asap.

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.15.0-1014-gcp 4.15.0-1014.14~16.04.1
  ProcVersionSignature: Ubuntu 4.15.0-1014.14~16.04.1-gcp 4.15.18
  Uname: Linux 4.15.0-1014-gcp x86_64
  ApportVersion: 2.20.1-0ubuntu2.18
  Architecture: amd64
  Date: Fri Aug  3 12:30:07 2018
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_GB.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-gcp
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-gcp/+bug/1785234/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
