First I must admit that I'm not 100% sure if this is f2fs/NMI/something-else bug, but it only triggers after significant disk access.
I've hit this bug several times since kernel 5.0.7 (have not tried earlier kernels) while compiling many packages in batch. But in those occasions it would hang all cores and loose the latest changes to the filesystem so I could not get any logs. This time it triggered while I was in the process of moving two of my installations to new partitions and locked only one core, so I was able to get the log after rebooting. The I did to trigger was make a new partition and run commands: #mkfs-f2fs -a 1 -f -i -l NVME_Gentoo-alt -o7 /dev/nvme0n1p7 #mount -o "rw,realtime,lazytime,background_gc=on,disable_ext_identity,discard,heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,data_flush,mode=adaptive,active_logs=6,whint_mode=fs-based,alloc_mode=default,fsync_mode=strict" /dev/nvme0n1p7 /mnt/gentoo-alt-new #cd /mnt/gentoo-alt #cp -a . /mnt/gentoo-alt-new #umount /nmt/gentoo-alt The bug triggers just after last command and last command was run within 20 seconds ofter the second-last command returned. "/mnt/gentoo-alt" was already mounted with same options as "/mnt/gentoo-alt-new", and contained a Gentoo-rootfs of 64GB data (according df -h). This was on kernel 5.0.10 with "[PATCH] f2fs: fix potential recursive call when enabling data_flush" by Chao Yu Syslog for bug follows: Apr 29 07:02:52 40o2 kernel: watchdog: BUG: soft lockup - CPU#4 stuck for 21s! [irq/61-nvme0q5:383] Apr 29 07:02:52 40o2 kernel: Modules linked in: ipv6 crc_ccitt 8021q garp stp llc nls_cp437 vfat fat sd_mod iTCO_wdt x86_pkg_temp_thermal kvm_intel kvm irqbypass ghash_clmulni_intel serio_raw i2c_i801 firewire_ohci firewire_core igb crc_itu_t uas lpc_ich dca usb_storage processor_thermal_device ahci intel_soc_dts_iosf int340x_thermal_zone libahci pcc_cpufreq efivarfs Apr 29 07:02:52 40o2 kernel: CPU: 4 PID: 383 Comm: irq/61-nvme0q5 Not tainted 5.0.10-gentoo #1 Apr 29 07:02:52 40o2 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./C226 WS, BIOS P3.40 06/25/2018 Apr 29 07:02:52 40o2 kernel: RIP: 0010:_raw_spin_unlock_irqrestore+0x36/0x40 Apr 29 07:02:52 40o2 kernel: Code: f6 c7 02 75 1e 56 9d e8 38 a0 69 ff bf 01 00 00 00 e8 ce 32 60 ff 65 8b 05 37 8a 70 5b 85 c0 74 0b 5b c3 e8 3c 9f 69 ff 53 9d <eb> e0 e8 bb 58 4f ff 5b c3 90 e8 6b 50 0f 00 b8 00 fe ff ff f0 0f Apr 29 07:02:52 40o2 kernel: RSP: 0018:ffff99a4003d7d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 Apr 29 07:02:52 40o2 kernel: RAX: 0000000000000202 RBX: 0000000000000246 RCX: dead000000000200 Apr 29 07:02:52 40o2 kernel: RDX: ffff8ab7fd8852e0 RSI: 0000000000000000 RDI: ffffffffa490c264 Apr 29 07:02:52 40o2 kernel: RBP: ffffbef2cc242db8 R08: 0000000000000002 R09: 0000000000024688 Apr 29 07:02:52 40o2 kernel: R10: ffffffffffffffb8 R11: ffffffffffffffb8 R12: ffff8ab7fd885000 Apr 29 07:02:52 40o2 kernel: R13: ffff8ab7fd8852d8 R14: ffff8ab7fef42b40 R15: 6db6db6db6db6db7 Apr 29 07:02:52 40o2 kernel: FS: 0000000000000000(0000) GS:ffff8ab81ed00000(0000) knlGS:0000000000000000 Apr 29 07:02:52 40o2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 29 07:02:52 40o2 kernel: CR2: 00007e6278ff9000 CR3: 0000000792e0c004 CR4: 00000000003606e0 Apr 29 07:02:52 40o2 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 29 07:02:52 40o2 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Apr 29 07:02:52 40o2 kernel: Call Trace: Apr 29 07:02:52 40o2 kernel: ? f2fs_del_fsync_node_entry+0x9f/0xd0 Apr 29 07:02:52 40o2 kernel: ? f2fs_write_end_io+0xb6/0x1e0 Apr 29 07:02:52 40o2 kernel: ? blk_update_request+0xc0/0x270 Apr 29 07:02:52 40o2 kernel: ? blk_mq_end_request+0x1a/0x130 Apr 29 07:02:52 40o2 kernel: ? blk_mq_complete_request+0x92/0x110 Apr 29 07:02:52 40o2 kernel: ? irq_finalize_oneshot.part.46+0xe0/0xe0 Apr 29 07:02:52 40o2 kernel: ? nvme_irq+0xf9/0x260 Apr 29 07:02:52 40o2 kernel: ? irq_finalize_oneshot.part.46+0xe0/0xe0 Apr 29 07:02:52 40o2 kernel: ? irq_forced_thread_fn+0x30/0x80 Apr 29 07:02:52 40o2 kernel: ? irq_thread+0xe7/0x160 Apr 29 07:02:52 40o2 kernel: ? wake_threads_waitq+0x30/0x30 Apr 29 07:02:52 40o2 kernel: ? irq_thread_check_affinity+0x80/0x80 Apr 29 07:02:52 40o2 kernel: ? kthread+0x116/0x130 Apr 29 07:02:52 40o2 kernel: ? kthread_create_worker_on_cpu+0x70/0x70 Apr 29 07:02:52 40o2 kernel: ? ret_from_fork+0x3a/0x50 Apr 29 07:02:55 40o2 kernel: rcu: INFO: rcu_preempt self-detected stall on CPU Apr 29 07:02:55 40o2 kernel: rcu: \x095-....: (13949 ticks this GP) idle=a26/1/0x4000000000000002 softirq=191162/191162 fqs=4843 Apr 29 07:02:55 40o2 kernel: rcu: \x09 (t=21000 jiffies g=184461 q=674) Apr 29 07:02:55 40o2 kernel: Sending NMI from CPU 5 to CPUs 4: Apr 29 07:02:55 40o2 kernel: NMI backtrace for cpu 4 Apr 29 07:02:55 40o2 kernel: CPU: 4 PID: 383 Comm: irq/61-nvme0q5 Tainted: G L 5.0.10-gentoo #1 Apr 29 07:02:55 40o2 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./C226 WS, BIOS P3.40 06/25/2018 Apr 29 07:02:55 40o2 kernel: RIP: 0010:queued_spin_lock_slowpath+0x52/0x1b0 Apr 29 07:02:55 40o2 kernel: Code: 0f b6 c0 c1 e0 08 89 c2 8b 03 30 e4 09 d0 a9 00 01 ff ff 75 1b 85 c0 75 0f b8 01 00 00 00 66 89 03 5b 5d 41 5c c3 f3 90 8b 03 <84> c0 75 f8 eb e9 f6 c4 01 75 04 c6 43 01 00 e8 3a fe 28 00 48 c7 Apr 29 07:02:55 40o2 kernel: RSP: 0018:ffff99a4003d7d30 EFLAGS: 00000002 Apr 29 07:02:55 40o2 kernel: RAX: 0000000000200101 RBX: ffff8ab7fd8852d8 RCX: 0000000000000000 Apr 29 07:02:55 40o2 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8ab7fd8852d8 Apr 29 07:02:55 40o2 kernel: RBP: 0000000000000246 R08: 0000000000000002 R09: 0000000000024688 Apr 29 07:02:55 40o2 kernel: R10: ffffffffffffffb8 R11: ffffffffffffffb8 R12: ffff8ab7fd885000 Apr 29 07:02:55 40o2 kernel: R13: ffff8ab7fd8852d8 R14: ffff8ab5ffdbd840 R15: 6db6db6db6db6db7 Apr 29 07:02:55 40o2 kernel: FS: 0000000000000000(0000) GS:ffff8ab81ed00000(0000) knlGS:0000000000000000 Apr 29 07:02:55 40o2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 29 07:02:55 40o2 kernel: CR2: 00007e6278ff9000 CR3: 0000000792e0c004 CR4: 00000000003606e0 Apr 29 07:02:55 40o2 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 29 07:02:55 40o2 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Apr 29 07:02:55 40o2 kernel: Call Trace: Apr 29 07:02:55 40o2 kernel: ? _raw_spin_lock_irqsave+0x39/0x40 Apr 29 07:02:55 40o2 kernel: ? f2fs_del_fsync_node_entry+0x20/0xd0 Apr 29 07:02:55 40o2 kernel: ? f2fs_write_end_io+0xb6/0x1e0 Apr 29 07:02:55 40o2 kernel: ? blk_update_request+0xc0/0x270 Apr 29 07:02:55 40o2 kernel: ? blk_mq_end_request+0x1a/0x130 Apr 29 07:02:55 40o2 kernel: ? blk_mq_complete_request+0x92/0x110 Apr 29 07:02:55 40o2 kernel: ? irq_finalize_oneshot.part.46+0xe0/0xe0 Apr 29 07:02:55 40o2 kernel: ? nvme_irq+0xf9/0x260 Apr 29 07:02:55 40o2 kernel: ? irq_finalize_oneshot.part.46+0xe0/0xe0 Apr 29 07:02:55 40o2 kernel: ? irq_forced_thread_fn+0x30/0x80 Apr 29 07:02:55 40o2 kernel: ? irq_thread+0xe7/0x160 Apr 29 07:02:55 40o2 kernel: ? wake_threads_waitq+0x30/0x30 Apr 29 07:02:55 40o2 kernel: ? irq_thread_check_affinity+0x80/0x80 Apr 29 07:02:55 40o2 kernel: ? kthread+0x116/0x130 Apr 29 07:02:55 40o2 kernel: ? kthread_create_worker_on_cpu+0x70/0x70 Apr 29 07:02:55 40o2 kernel: ? ret_from_fork+0x3a/0x50 Apr 29 07:02:55 40o2 kernel: NMI backtrace for cpu 5 Apr 29 07:02:55 40o2 kernel: CPU: 5 PID: 384 Comm: irq/62-nvme0q6 Tainted: G L 5.0.10-gentoo #1 Apr 29 07:02:55 40o2 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./C226 WS, BIOS P3.40 06/25/2018 Apr 29 07:02:55 40o2 kernel: Call Trace: Apr 29 07:02:55 40o2 kernel: <IRQ> Apr 29 07:02:55 40o2 kernel: ? dump_stack+0x67/0x90 Apr 29 07:02:55 40o2 kernel: ? nmi_cpu_backtrace.cold.4+0x13/0x4e Apr 29 07:02:55 40o2 kernel: ? nmi_trigger_cpumask_backtrace+0xb1/0xc8 Apr 29 07:02:55 40o2 kernel: ? rcu_dump_cpu_stacks+0x85/0xb1 Apr 29 07:02:55 40o2 kernel: ? rcu_check_callbacks.cold.69+0x19a/0x459 Apr 29 07:02:55 40o2 kernel: ? tick_sched_do_timer+0x60/0x60 Apr 29 07:02:55 40o2 kernel: ? update_process_times+0x28/0x60 Apr 29 07:02:55 40o2 kernel: ? tick_sched_handle+0x22/0x60 Apr 29 07:02:55 40o2 kernel: ? tick_sched_timer+0x37/0x70 Apr 29 07:02:55 40o2 kernel: ? __hrtimer_run_queues+0xfb/0x2c0 Apr 29 07:02:55 40o2 kernel: ? hrtimer_interrupt+0xfd/0x210 Apr 29 07:02:55 40o2 kernel: ? smp_apic_timer_interrupt+0x7d/0x180 Apr 29 07:02:55 40o2 kernel: ? apic_timer_interrupt+0xf/0x20 Apr 29 07:02:55 40o2 kernel: </IRQ> Apr 29 07:02:55 40o2 kernel: ? _raw_spin_unlock_irqrestore+0x34/0x40 Apr 29 07:02:55 40o2 kernel: ? _raw_spin_unlock_irqrestore+0x36/0x40 Apr 29 07:02:55 40o2 kernel: ? f2fs_del_fsync_node_entry+0x9f/0xd0 Apr 29 07:02:55 40o2 kernel: ? f2fs_write_end_io+0xb6/0x1e0 Apr 29 07:02:55 40o2 kernel: ? blk_update_request+0xc0/0x270 Apr 29 07:02:55 40o2 kernel: ? blk_mq_end_request+0x1a/0x130 Apr 29 07:02:55 40o2 kernel: ? blk_mq_complete_request+0x92/0x110 Apr 29 07:02:55 40o2 kernel: ? irq_finalize_oneshot.part.46+0xe0/0xe0 Apr 29 07:02:55 40o2 kernel: ? nvme_irq+0xf9/0x260 Apr 29 07:02:55 40o2 kernel: ? irq_finalize_oneshot.part.46+0xe0/0xe0 Apr 29 07:02:55 40o2 kernel: ? irq_forced_thread_fn+0x30/0x80 Apr 29 07:02:55 40o2 kernel: ? irq_thread+0xe7/0x160 Apr 29 07:02:55 40o2 kernel: ? wake_threads_waitq+0x30/0x30 Apr 29 07:02:55 40o2 kernel: ? irq_thread_check_affinity+0x80/0x80 Apr 29 07:02:55 40o2 kernel: ? kthread+0x116/0x130 Apr 29 07:02:55 40o2 kernel: ? kthread_create_worker_on_cpu+0x70/0x70 Apr 29 07:02:55 40o2 kernel: ? ret_from_fork+0x3a/0x50 Apr 29 07:03:02 40o2 kernel: nvme nvme0: I/O 310 QID 5 timeout, aborting Apr 29 07:03:02 40o2 kernel: nvme nvme0: I/O 210 QID 6 timeout, aborting Apr 29 07:03:02 40o2 kernel: nvme nvme0: Abort status: 0x0 Apr 29 07:03:02 40o2 kernel: nvme nvme0: Abort status: 0x0 _______________________________________________ Linux-f2fs-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
