Public bug reported:
Hi all.
We have an NFSv4 server hosted in GCP under relatively heavy load (reads
over NFS, writes over ssh/rsync). After the upgrade from 14.04 to 18.04,
this server started to crash roughly once every day or two. We have a
similar box with the same configuration but without NFS load (a cold
spare), and this problem does not affect it.
I've tried the kernels listed below; none of them helped.
linux-image-4.15.0-1026-gcp
linux-image-4.15.0-55-generic
linux-image-4.18.0-1008-gcp
linux-image-4.18.0-1015-gcp
linux-image-4.19.36
linux-image-5.0.0-1011-gcp
linux-image-5.2.5
NFS packages:
ii  libnfsidmap2:amd64  0.25-5.1              amd64  NFS idmapping library
ii  nfs-common          1:1.3.4-2.1ubuntu5.2  amd64  NFS support files common to client and server
ii  nfs-kernel-server   1:1.3.4-2.1ubuntu5.2  amd64  support for NFS kernel server
Crash info:
KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-55-generic
DUMPFILE: /var/crash_/201908190304/dump.201908190304 [PARTIAL DUMP]
CPUS: 32
DATE: Mon Aug 19 03:04:28 2019
UPTIME: 06:05:53
LOAD AVERAGE: 7.56, 6.82, 6.77
TASKS: 627
NODENAME: storage-gce-be-1.project.domain.net
RELEASE: 4.15.0-55-generic
VERSION: #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019
MACHINE: x86_64 (2300 Mhz)
MEMORY: 120 GB
PANIC: "BUG: unable to handle kernel paging request at ffff9e9119d39b78"
PID: 112
COMMAND: "ksoftirqd/17"
TASK: ffff9e96bfcc8000 [THREAD_INFO: ffff9e96bfcc8000]
CPU: 17
STATE: TASK_RUNNING (PANIC)
Part of the crash log:
[86915.179808] WARNING: CPU: 17 PID: 3917 at
/build/linux-aAn8fZ/linux-4.15.0/lib/radix-tree.c:783 delete_node+0x87/0x1f0
[86915.179810] Modules linked in: tcp_diag inet_diag binfmt_misc
ip6table_filter ip6_tables iptable_filter sch_fq_codel ib_iser rdma_cm iw_cm
ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
nls_iso8859_1 sb_edac intel_rapl_perf input_leds mac_hid serio_raw pvpanic nfsd
auth_rpcgss nfs_acl lockd grace sunrpc netconsole ip_tables x_tables autofs4
btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64
crypto_simd glue_helper cryptd psmouse virtio_net virtio_scsi i2c_piix4
[86915.179859] CPU: 17 PID: 3917 Comm: kworker/u64:3 Not tainted
4.15.0-55-generic #60-Ubuntu
[86915.179860] Hardware name: Google Google Compute Engine/Google Compute
Engine, BIOS Google 01/01/2011
[86915.179877] Workqueue: nfsd4_callbacks nfsd4_run_cb_work [nfsd]
[86915.179880] RIP: 0010:delete_node+0x87/0x1f0
[86915.179881] RSP: 0018:ffffb97087e0fd78 EFLAGS: 00010206
[86915.179882] RAX: ffff9e81839df6d8 RBX: ffff9e81839df6c0 RCX: 00000000ffffffff
[86915.179883] RDX: 0000000000000000 RSI: ffff9e9119d39b60 RDI: ffff9e9119d39b78
[86915.179884] RBP: ffffb97087e0fda0 R08: 0000000000000000 R09: 0000000000000034
[86915.179884] R10: ffff9e9119d39b88 R11: 0000000000000035 R12: ffff9e96b6304840
[86915.179885] R13: 0000000000000000 R14: ffffffff84583630 R15: 0000000000000000
[86915.179887] FS: 0000000000000000(0000) GS:ffff9e96c7240000(0000)
knlGS:0000000000000000
[86915.179887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[86915.179888] CR2: 00007fdbe3df9ba8 CR3: 00000002fba0a006 CR4: 00000000001606e0
[86915.179893] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[86915.179894] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[86915.179895] Call Trace:
[86915.179901] __radix_tree_delete+0x7f/0xa0
[86915.179903] radix_tree_delete_item+0x6a/0xc0
[86915.179910] nfs4_put_stid+0x3d/0x90 [nfsd]
[86915.179915] nfsd4_cb_recall_release+0x15/0x20 [nfsd]
[86915.179920] nfsd4_run_cb_work+0xd4/0xf0 [nfsd]
[86915.179924] process_one_work+0x1de/0x410
[86915.179926] worker_thread+0x32/0x410
[86915.179928] kthread+0x121/0x140
[86915.179930] ? process_one_work+0x410/0x410
[86915.179932] ? kthread_create_worker_on_cpu+0x70/0x70
[86915.179935] ret_from_fork+0x35/0x40
[86915.179936] Code: c2 41 8b 04 24 a9 00 00 00 02 75 09 25 ff ff ff 03 41 89
04 24 49 c7 44 24 08 00 00 00 00 48 8b 46 18 48 39 f8 0f 84 2d 01 00 00 <0f> 0b
4c 89 f6 e8 2f f3 77 ff 48 85 db 75 ab 41 bf 01 00 00 00
[86915.179961] ---[ end trace d94747d62f40d46c ]---
[86915.220219] kernel tried to execute NX-protected page - exploit attempt?
(uid: 0)
[86915.229005] BUG: unable to handle kernel paging request at ffff9e9119d39b78
[86915.236105] IP: 0xffff9e9119d39b78
[86915.241016] PGD 2fc143067 P4D 2fc143067 PUD 756c2f063 PMD 8000001819c000e3
[86915.249604] Oops: 0011 [#1] SMP PTI
[86915.253220] Modules linked in: tcp_diag inet_diag binfmt_misc
ip6table_filter ip6_tables iptable_filter sch_fq_codel ib_iser rdma_cm iw_cm
ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
nls_iso8859_1 sb_edac intel_rapl_perf input_leds mac_hid serio_raw pvpanic nfsd
auth_rpcgss nfs_acl lockd grace sunrpc netconsole ip_tables x_tables autofs4
btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64
crypto_simd glue_helper cryptd psmouse virtio_net virtio_scsi i2c_piix4
[86915.312337] CPU: 17 PID: 112 Comm: ksoftirqd/17 Tainted: G W
4.15.0-55-generic #60-Ubuntu
[86915.321858] Hardware name: Google Google Compute Engine/Google Compute
Engine, BIOS Google 01/01/2011
[86915.332620] RIP: 0010:0xffff9e9119d39b78
[86915.336670] RSP: 0000:ffffb970866b7df0 EFLAGS: 00010292
[86915.343398] RAX: ffff9e9119d39b78 RBX: ffff9e96c7263640 RCX: ffff9e8048d0a000
[86915.350649] RDX: ffff9e9119d39b78 RSI: ffffb970866b7e08 RDI: ffff9e9119d39b78
[86915.359282] RBP: ffffb970866b7e58 R08: ffff9e96ad5c4240 R09: 0000000000000100
[86915.366540] R10: ffffb970866b7dc0 R11: 00000000ffffff00 R12: ffffffff850a9280
[86915.373797] R13: ffff9e96c7263678 R14: ffffb970866b7e08 R15: 000000000000000a
[86915.382434] FS: 0000000000000000(0000) GS:ffff9e96c7240000(0000)
knlGS:0000000000000000
[86915.390845] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[86915.398097] CR2: ffff9e9119d39b78 CR3: 00000002fba0a004 CR4: 00000000001606e0
[86915.405360] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[86915.413995] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[86915.421351] Call Trace:
[86915.425345] ? rcu_process_callbacks+0x1a7/0x4c0
[86915.430100] __do_softirq+0xe4/0x2bb
[86915.433794] run_ksoftirqd+0x22/0x60
[86915.438880] smpboot_thread_fn+0xfc/0x170
[86915.443011] kthread+0x121/0x140
[86915.446353] ? sort_range+0x30/0x30
[86915.451395] ? kthread_create_worker_on_cpu+0x70/0x70
[86915.456579] ret_from_fork+0x35/0x40
[86915.461654] Code: 00 00 00 00 00 00 00 00 00 00 00 ec d7 d0 79 55 4b 74 68
00 03 00 00 00 00 00 00 c0 f6 9d 83 81 9e ff ff 40 48 30 b6 96 9e ff ff <78> 9b
d3 19 91 9e ff ff 78 9b d3 19 91 9e ff ff 00 00 00 00 00
[86915.482097] RIP: 0xffff9e9119d39b78 RSP: ffffb970866b7df0
[86915.487621] CR2: ffff9e9119d39b78
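A note on reading the "Oops: 0011" line in the log above: on x86 this value is the page-fault error code, printed in hexadecimal. The following is a minimal Python sketch for decoding it, not part of the original report. The function name decode_pf_error is hypothetical; the bit meanings are the standard x86 page-fault error-code bits. For 0x0011 it reports a kernel-mode instruction fetch from a present page, which is consistent with the "kernel tried to execute NX-protected page" message and with RIP being equal to CR2.

```python
def decode_pf_error(code: int) -> dict:
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 0x01),            # bit 0: page was present (protection fault)
        "write": bool(code & 0x02),              # bit 1: access was a write
        "user_mode": bool(code & 0x04),          # bit 2: fault happened in user mode
        "reserved_bit": bool(code & 0x08),       # bit 3: reserved bit set in a page-table entry
        "instruction_fetch": bool(code & 0x10),  # bit 4: fault was an instruction fetch
    }


if __name__ == "__main__":
    # "Oops: 0011" from the log; the kernel prints the code as hex.
    flags = decode_pf_error(int("0011", 16))
    for name, value in flags.items():
        print(f"{name}: {value}")
```

This only interprets the error code; it does not by itself say why the kernel jumped to that address.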
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1840650
Title:
System crashes under nfs heavy load