Hi Hagbard,

At a glance, may I ask what's your device size? Since I notice that you used -i
option during mkfs, if size is large enough, checkpoint may crash nat/sit_bitmap
online.

mkfs-f2fs -a 1 -f -i -l NVME_Gentoo-alt -o7 /dev/nvme0n1p7

On 2019/4/29 15:25, Hagbard Celine wrote:
> First I must admit that I'm not 100% sure if this is
> f2fs/NMI/something-else bug, but it only triggers after significant
> disk access.
> 
> I've hit this bug several times since kernel 5.0.7 (have not tried
> earlier kernels) while compiling many packages in batch. But in those
> occasions it would hang all cores and loose the latest changes to the
> filesystem so I could not get any logs.
> This time it triggered while I was in the process of moving two of my
> installations to new partitions and locked only one core, so I was
> able to get the log after rebooting.
> The I did to trigger was make a new partition and run commands:
> #mkfs-f2fs -a 1 -f -i -l NVME_Gentoo-alt -o7 /dev/nvme0n1p7
> #mount -o 
> "rw,realtime,lazytime,background_gc=on,disable_ext_identity,discard,heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,data_flush,mode=adaptive,active_logs=6,whint_mode=fs-based,alloc_mode=default,fsync_mode=strict"
> /dev/nvme0n1p7 /mnt/gentoo-alt-new
> #cd /mnt/gentoo-alt
> #cp -a . /mnt/gentoo-alt-new
> #umount /nmt/gentoo-alt
> The bug triggers just after last command and last command was run
> within 20 seconds ofter the second-last command returned.
> "/mnt/gentoo-alt" was already mounted with same options as
> "/mnt/gentoo-alt-new", and contained a Gentoo-rootfs of 64GB data
> (according df -h).
> 
> This was on kernel 5.0.10 with "[PATCH] f2fs: fix potential recursive
> call when enabling data_flush" by Chao Yu
> 
> Syslog for bug follows:
> 
> Apr 29 07:02:52 40o2 kernel: watchdog: BUG: soft lockup - CPU#4 stuck
> for 21s! [irq/61-nvme0q5:383]
> Apr 29 07:02:52 40o2 kernel: Modules linked in: ipv6 crc_ccitt 8021q
> garp stp llc nls_cp437 vfat fat sd_mod iTCO_wdt x86_pkg_temp_thermal
> kvm_intel kvm irqbypass ghash_clmulni_intel serio_raw i2c_i801
> firewire_ohci firewire_core igb crc_itu_t uas lpc_ich dca usb_storage
> processor_thermal_device ahci intel_soc_dts_iosf int340x_thermal_zone
> libahci pcc_cpufreq efivarfs
> Apr 29 07:02:52 40o2 kernel: CPU: 4 PID: 383 Comm: irq/61-nvme0q5 Not
> tainted 5.0.10-gentoo #1
> Apr 29 07:02:52 40o2 kernel: Hardware name: To Be Filled By O.E.M. To
> Be Filled By O.E.M./C226 WS, BIOS P3.40 06/25/2018
> Apr 29 07:02:52 40o2 kernel: RIP: 0010:_raw_spin_unlock_irqrestore+0x36/0x40
> Apr 29 07:02:52 40o2 kernel: Code: f6 c7 02 75 1e 56 9d e8 38 a0 69 ff
> bf 01 00 00 00 e8 ce 32 60 ff 65 8b 05 37 8a 70 5b 85 c0 74 0b 5b c3
> e8 3c 9f 69 ff 53 9d <eb> e0 e8 bb 58 4f ff 5b c3 90 e8 6b 50 0f 00 b8
> 00 fe ff ff f0 0f
> Apr 29 07:02:52 40o2 kernel: RSP: 0018:ffff99a4003d7d58 EFLAGS:
> 00000246 ORIG_RAX: ffffffffffffff13
> Apr 29 07:02:52 40o2 kernel: RAX: 0000000000000202 RBX:
> 0000000000000246 RCX: dead000000000200
> Apr 29 07:02:52 40o2 kernel: RDX: ffff8ab7fd8852e0 RSI:
> 0000000000000000 RDI: ffffffffa490c264
> Apr 29 07:02:52 40o2 kernel: RBP: ffffbef2cc242db8 R08:
> 0000000000000002 R09: 0000000000024688
> Apr 29 07:02:52 40o2 kernel: R10: ffffffffffffffb8 R11:
> ffffffffffffffb8 R12: ffff8ab7fd885000
> Apr 29 07:02:52 40o2 kernel: R13: ffff8ab7fd8852d8 R14:
> ffff8ab7fef42b40 R15: 6db6db6db6db6db7
> Apr 29 07:02:52 40o2 kernel: FS:  0000000000000000(0000)
> GS:ffff8ab81ed00000(0000) knlGS:0000000000000000
> Apr 29 07:02:52 40o2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Apr 29 07:02:52 40o2 kernel: CR2: 00007e6278ff9000 CR3:
> 0000000792e0c004 CR4: 00000000003606e0
> Apr 29 07:02:52 40o2 kernel: DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> Apr 29 07:02:52 40o2 kernel: DR3: 0000000000000000 DR6:
> 00000000fffe0ff0 DR7: 0000000000000400
> Apr 29 07:02:52 40o2 kernel: Call Trace:
> Apr 29 07:02:52 40o2 kernel:  ? f2fs_del_fsync_node_entry+0x9f/0xd0
> Apr 29 07:02:52 40o2 kernel:  ? f2fs_write_end_io+0xb6/0x1e0
> Apr 29 07:02:52 40o2 kernel:  ? blk_update_request+0xc0/0x270
> Apr 29 07:02:52 40o2 kernel:  ? blk_mq_end_request+0x1a/0x130
> Apr 29 07:02:52 40o2 kernel:  ? blk_mq_complete_request+0x92/0x110
> Apr 29 07:02:52 40o2 kernel:  ? irq_finalize_oneshot.part.46+0xe0/0xe0
> Apr 29 07:02:52 40o2 kernel:  ? nvme_irq+0xf9/0x260
> Apr 29 07:02:52 40o2 kernel:  ? irq_finalize_oneshot.part.46+0xe0/0xe0
> Apr 29 07:02:52 40o2 kernel:  ? irq_forced_thread_fn+0x30/0x80
> Apr 29 07:02:52 40o2 kernel:  ? irq_thread+0xe7/0x160
> Apr 29 07:02:52 40o2 kernel:  ? wake_threads_waitq+0x30/0x30
> Apr 29 07:02:52 40o2 kernel:  ? irq_thread_check_affinity+0x80/0x80
> Apr 29 07:02:52 40o2 kernel:  ? kthread+0x116/0x130
> Apr 29 07:02:52 40o2 kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
> Apr 29 07:02:52 40o2 kernel:  ? ret_from_fork+0x3a/0x50
> Apr 29 07:02:55 40o2 kernel: rcu: INFO: rcu_preempt self-detected stall on CPU
> Apr 29 07:02:55 40o2 kernel: rcu: \x095-....: (13949 ticks this GP)
> idle=a26/1/0x4000000000000002 softirq=191162/191162 fqs=4843
> Apr 29 07:02:55 40o2 kernel: rcu: \x09 (t=21000 jiffies g=184461 q=674)
> Apr 29 07:02:55 40o2 kernel: Sending NMI from CPU 5 to CPUs 4:
> Apr 29 07:02:55 40o2 kernel: NMI backtrace for cpu 4
> Apr 29 07:02:55 40o2 kernel: CPU: 4 PID: 383 Comm: irq/61-nvme0q5
> Tainted: G             L    5.0.10-gentoo #1
> Apr 29 07:02:55 40o2 kernel: Hardware name: To Be Filled By O.E.M. To
> Be Filled By O.E.M./C226 WS, BIOS P3.40 06/25/2018
> Apr 29 07:02:55 40o2 kernel: RIP: 0010:queued_spin_lock_slowpath+0x52/0x1b0
> Apr 29 07:02:55 40o2 kernel: Code: 0f b6 c0 c1 e0 08 89 c2 8b 03 30 e4
> 09 d0 a9 00 01 ff ff 75 1b 85 c0 75 0f b8 01 00 00 00 66 89 03 5b 5d
> 41 5c c3 f3 90 8b 03 <84> c0 75 f8 eb e9 f6 c4 01 75 04 c6 43 01 00 e8
> 3a fe 28 00 48 c7
> Apr 29 07:02:55 40o2 kernel: RSP: 0018:ffff99a4003d7d30 EFLAGS: 00000002
> Apr 29 07:02:55 40o2 kernel: RAX: 0000000000200101 RBX:
> ffff8ab7fd8852d8 RCX: 0000000000000000
> Apr 29 07:02:55 40o2 kernel: RDX: 0000000000000000 RSI:
> 0000000000000000 RDI: ffff8ab7fd8852d8
> Apr 29 07:02:55 40o2 kernel: RBP: 0000000000000246 R08:
> 0000000000000002 R09: 0000000000024688
> Apr 29 07:02:55 40o2 kernel: R10: ffffffffffffffb8 R11:
> ffffffffffffffb8 R12: ffff8ab7fd885000
> Apr 29 07:02:55 40o2 kernel: R13: ffff8ab7fd8852d8 R14:
> ffff8ab5ffdbd840 R15: 6db6db6db6db6db7
> Apr 29 07:02:55 40o2 kernel: FS:  0000000000000000(0000)
> GS:ffff8ab81ed00000(0000) knlGS:0000000000000000
> Apr 29 07:02:55 40o2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Apr 29 07:02:55 40o2 kernel: CR2: 00007e6278ff9000 CR3:
> 0000000792e0c004 CR4: 00000000003606e0
> Apr 29 07:02:55 40o2 kernel: DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> Apr 29 07:02:55 40o2 kernel: DR3: 0000000000000000 DR6:
> 00000000fffe0ff0 DR7: 0000000000000400
> Apr 29 07:02:55 40o2 kernel: Call Trace:
> Apr 29 07:02:55 40o2 kernel:  ? _raw_spin_lock_irqsave+0x39/0x40
> Apr 29 07:02:55 40o2 kernel:  ? f2fs_del_fsync_node_entry+0x20/0xd0
> Apr 29 07:02:55 40o2 kernel:  ? f2fs_write_end_io+0xb6/0x1e0
> Apr 29 07:02:55 40o2 kernel:  ? blk_update_request+0xc0/0x270
> Apr 29 07:02:55 40o2 kernel:  ? blk_mq_end_request+0x1a/0x130
> Apr 29 07:02:55 40o2 kernel:  ? blk_mq_complete_request+0x92/0x110
> Apr 29 07:02:55 40o2 kernel:  ? irq_finalize_oneshot.part.46+0xe0/0xe0
> Apr 29 07:02:55 40o2 kernel:  ? nvme_irq+0xf9/0x260
> Apr 29 07:02:55 40o2 kernel:  ? irq_finalize_oneshot.part.46+0xe0/0xe0
> Apr 29 07:02:55 40o2 kernel:  ? irq_forced_thread_fn+0x30/0x80
> Apr 29 07:02:55 40o2 kernel:  ? irq_thread+0xe7/0x160
> Apr 29 07:02:55 40o2 kernel:  ? wake_threads_waitq+0x30/0x30
> Apr 29 07:02:55 40o2 kernel:  ? irq_thread_check_affinity+0x80/0x80
> Apr 29 07:02:55 40o2 kernel:  ? kthread+0x116/0x130
> Apr 29 07:02:55 40o2 kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
> Apr 29 07:02:55 40o2 kernel:  ? ret_from_fork+0x3a/0x50
> Apr 29 07:02:55 40o2 kernel: NMI backtrace for cpu 5
> Apr 29 07:02:55 40o2 kernel: CPU: 5 PID: 384 Comm: irq/62-nvme0q6
> Tainted: G             L    5.0.10-gentoo #1
> Apr 29 07:02:55 40o2 kernel: Hardware name: To Be Filled By O.E.M. To
> Be Filled By O.E.M./C226 WS, BIOS P3.40 06/25/2018
> Apr 29 07:02:55 40o2 kernel: Call Trace:
> Apr 29 07:02:55 40o2 kernel:  <IRQ>
> Apr 29 07:02:55 40o2 kernel:  ? dump_stack+0x67/0x90
> Apr 29 07:02:55 40o2 kernel:  ? nmi_cpu_backtrace.cold.4+0x13/0x4e
> Apr 29 07:02:55 40o2 kernel:  ? nmi_trigger_cpumask_backtrace+0xb1/0xc8
> Apr 29 07:02:55 40o2 kernel:  ? rcu_dump_cpu_stacks+0x85/0xb1
> Apr 29 07:02:55 40o2 kernel:  ? rcu_check_callbacks.cold.69+0x19a/0x459
> Apr 29 07:02:55 40o2 kernel:  ? tick_sched_do_timer+0x60/0x60
> Apr 29 07:02:55 40o2 kernel:  ? update_process_times+0x28/0x60
> Apr 29 07:02:55 40o2 kernel:  ? tick_sched_handle+0x22/0x60
> Apr 29 07:02:55 40o2 kernel:  ? tick_sched_timer+0x37/0x70
> Apr 29 07:02:55 40o2 kernel:  ? __hrtimer_run_queues+0xfb/0x2c0
> Apr 29 07:02:55 40o2 kernel:  ? hrtimer_interrupt+0xfd/0x210
> Apr 29 07:02:55 40o2 kernel:  ? smp_apic_timer_interrupt+0x7d/0x180
> Apr 29 07:02:55 40o2 kernel:  ? apic_timer_interrupt+0xf/0x20
> Apr 29 07:02:55 40o2 kernel:  </IRQ>
> Apr 29 07:02:55 40o2 kernel:  ? _raw_spin_unlock_irqrestore+0x34/0x40
> Apr 29 07:02:55 40o2 kernel:  ? _raw_spin_unlock_irqrestore+0x36/0x40
> Apr 29 07:02:55 40o2 kernel:  ? f2fs_del_fsync_node_entry+0x9f/0xd0
> Apr 29 07:02:55 40o2 kernel:  ? f2fs_write_end_io+0xb6/0x1e0
> Apr 29 07:02:55 40o2 kernel:  ? blk_update_request+0xc0/0x270
> Apr 29 07:02:55 40o2 kernel:  ? blk_mq_end_request+0x1a/0x130
> Apr 29 07:02:55 40o2 kernel:  ? blk_mq_complete_request+0x92/0x110
> Apr 29 07:02:55 40o2 kernel:  ? irq_finalize_oneshot.part.46+0xe0/0xe0
> Apr 29 07:02:55 40o2 kernel:  ? nvme_irq+0xf9/0x260
> Apr 29 07:02:55 40o2 kernel:  ? irq_finalize_oneshot.part.46+0xe0/0xe0
> Apr 29 07:02:55 40o2 kernel:  ? irq_forced_thread_fn+0x30/0x80
> Apr 29 07:02:55 40o2 kernel:  ? irq_thread+0xe7/0x160
> Apr 29 07:02:55 40o2 kernel:  ? wake_threads_waitq+0x30/0x30
> Apr 29 07:02:55 40o2 kernel:  ? irq_thread_check_affinity+0x80/0x80
> Apr 29 07:02:55 40o2 kernel:  ? kthread+0x116/0x130
> Apr 29 07:02:55 40o2 kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
> Apr 29 07:02:55 40o2 kernel:  ? ret_from_fork+0x3a/0x50
> Apr 29 07:03:02 40o2 kernel: nvme nvme0: I/O 310 QID 5 timeout, aborting
> Apr 29 07:03:02 40o2 kernel: nvme nvme0: I/O 210 QID 6 timeout, aborting
> Apr 29 07:03:02 40o2 kernel: nvme nvme0: Abort status: 0x0
> Apr 29 07:03:02 40o2 kernel: nvme nvme0: Abort status: 0x0
> 
> 
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

Reply via email to