On Wed, Jul 17, 2024 at 09:02:21PM GMT, Hongbo Li wrote:
> 
> 
> On 2024/7/12 22:29, Kent Overstreet wrote:
> > On Fri, Jul 12, 2024 at 05:41:31PM GMT, Hongbo Li wrote:
> > > Hi, Kent,
> > > I found that the latest master branch (at
> > > 69558c638c465a79be3a08bfeb3d5a15979cbe42, "bcachefs: fix
> > > ei_update_lock lock ordering") hangs during bcachefs umount.
> > > 
> > > Here are the test steps:
> > > 
> > > ```
> > > mount -t bcachefs /dev/loop1 /mnt/bcachefs
> > > umount /mnt/bcachefs                      ---- stuck here !!!
> > 
> > try a lockdep build
> 
> I enabled lockdep in the config, and here is the trace:

this has been fixed, run linus's tree or my master or testing branch

it was the SLAB_ACCOUNT patch, combined with not using kmem_cache_alloc_lru
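
A rough userspace model of that failure mode, in case it helps. This is only
a sketch, not kernel code: every name below (toy_lru, toy_alloc,
toy_alloc_lru, toy_lru_add, MAX_CGROUPS) is hypothetical. It assumes the LRU
keeps one list per cgroup, and that only the LRU-aware allocation path sets
that list up, so an accounted object allocated the plain way has no list to
be added to later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: none of these types or functions are the kernel's. */
#define MAX_CGROUPS 4

struct toy_list {
	size_t nr_items;
};

struct toy_lru {
	/* One list per cgroup; a slot stays NULL until it is set up. */
	struct toy_list *per_cgroup[MAX_CGROUPS];
};

/* Plain allocation: the object belongs to `cgroup`, but the LRU is untouched. */
static void *toy_alloc(int cgroup, size_t size)
{
	(void)cgroup;
	return calloc(1, size);
}

/* LRU-aware allocation: also makes sure the LRU has a list for `cgroup`. */
static void *toy_alloc_lru(struct toy_lru *lru, int cgroup, size_t size)
{
	assert(cgroup >= 0 && cgroup < MAX_CGROUPS);
	if (!lru->per_cgroup[cgroup]) {
		lru->per_cgroup[cgroup] = calloc(1, sizeof(struct toy_list));
		if (!lru->per_cgroup[cgroup])
			return NULL;
	}
	return calloc(1, size);
}

/*
 * Add one object owned by `cgroup` to the LRU.
 * Returns 0 on success, or -1 when the per-cgroup list was never set up.
 * The model checks for NULL; code that skipped this check would fault.
 */
static int toy_lru_add(struct toy_lru *lru, int cgroup)
{
	struct toy_list *l = lru->per_cgroup[cgroup];

	if (!l)
		return -1;
	l->nr_items++;
	return 0;
}
```

In this model, calling toy_alloc() for cgroup 1 and then toy_lru_add() for
cgroup 1 returns -1, while going through toy_alloc_lru() first makes the same
toy_lru_add() call succeed.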

> 
> ```
> [ 1262.139731] bcachefs (loop1): mounting version 1.9: disk_accounting_v2
> [ 1262.139781] bcachefs (loop1): recovering from unclean shutdown
> [ 1262.139813] bcachefs (loop1): starting journal read
> [ 1262.395735] bcachefs (loop1): journal read done on device loop1, ret 0
> [ 1262.395819] bcachefs (loop1): journal read done, replaying entries 9-9
> [ 1262.395974] bcachefs (loop1): Journal keys: 0 read, 0 after sorting and
> compacting
> [ 1262.417883] bcachefs (loop1): accounting_read... done
> [ 1262.444861] bcachefs (loop1): alloc_read... done
> [ 1262.444904] bcachefs (loop1): stripes_read... done
> [ 1262.444943] bcachefs (loop1): snapshots_read... done
> [ 1262.454903] bcachefs (loop1): going read-write
> [ 1262.455463] bcachefs (loop1): journal_replay... done
> [ 1262.455498] bcachefs (loop1): resume_logged_ops... done
> [ 1262.455518] bcachefs (loop1): delete_dead_inodes... done
> [ 1262.456851] bcachefs (loop1): done starting filesystem
> [ 1267.770276] BUG: kernel NULL pointer dereference, address:
> 0000000000000008
> [ 1267.770305] #PF: supervisor read access in kernel mode
> [ 1267.770317] #PF: error_code(0x0000) - not-present page
> [ 1267.770332] PGD 126009067 P4D 109dd8067 PUD 109ddb067 PMD 0
> [ 1267.770347] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 1267.770369] CPU: 3 PID: 1804 Comm: umount Kdump: loaded Not tainted
> 6.10.0-rc4+ #42
> [ 1267.770398] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.15.0-1 04/01/2014
> [ 1267.770417] RIP: 0010:list_lru_add+0x86/0x130
> [ 1267.770487] Code: cc cc 48 8b 04 24 4d 89 f4 48 85 c0 74 12 41 80 7f 1c
> 00 48 63 b0 e8 08 00 00 74 04 85 f6 79 7d 49 03 2f 48 83 c5 40 48 89 ea <4c>
> 8b 75 08 48 89 df 48 89 54 24 08 4c 89 f6 e8 16 a7 32 00 84 c0
> [ 1267.770524] RSP: 0018:ff6096c18275fd98 EFLAGS: 00010246
> [ 1267.770538] RAX: 0000000000000000 RBX: ff13502246670f20 RCX:
> ff6096c18275fcf4
> [ 1267.770566] RDX: 0000000000000000 RSI: ffffffff9b9e6da0 RDI:
> ff13502269690000
> [ 1267.770581] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [ 1267.770598] R10: 00000000000000ff R11: ff13502269690ed0 R12:
> 0000000000000000
> [ 1267.770613] R13: ff13502269522500 R14: 0000000000000000 R15:
> ff135022688307d0
> [ 1267.770633] FS:  00007f1524643840(0000) GS:ff1350313fb80000(0000)
> knlGS:0000000000000000
> [ 1267.770651] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1267.771024] CR2: 0000000000000008 CR3: 0000000104cc2002 CR4:
> 0000000000771ef0
> [ 1267.771368] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 1267.771698] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [ 1267.772014] PKRU: 55555554
> [ 1267.772326] Call Trace:
> [ 1267.772631]  <TASK>
> [ 1267.772934]  ? __die+0x24/0x70
> [ 1267.773290]  ? page_fault_oops+0x80/0x140
> [ 1267.773624]  ? find_held_lock+0x2b/0x80
> [ 1267.773964]  ? exc_page_fault+0x6c/0x1f0
> [ 1267.774312]  ? asm_exc_page_fault+0x26/0x30
> [ 1267.774636]  ? list_lru_add+0x86/0x130
> [ 1267.774948]  ? list_lru_add+0x102/0x130
> [ 1267.775257]  __inode_add_lru+0x70/0x90
> [ 1267.775591]  iput_final+0x11b/0x140
> [ 1267.775892]  ? dput+0x124/0x230
> [ 1267.776191]  __dentry_kill+0x77/0x190
> [ 1267.776490]  ? dput+0x124/0x230
> [ 1267.776784]  dput+0x150/0x230
> [ 1267.777083]  shrink_dcache_for_umount+0x83/0x110
> [ 1267.777380]  generic_shutdown_super+0x20/0x170
> [ 1267.777692]  bch2_kill_sb+0x16/0x20 [bcachefs]
> [ 1267.778077]  deactivate_locked_super+0x32/0xb0
> [ 1267.778370]  cleanup_mnt+0x100/0x160
> [ 1267.778670]  task_work_run+0x59/0x90
> [ 1267.778995]  syscall_exit_to_user_mode+0x1f5/0x200
> [ 1267.779300]  do_syscall_64+0x69/0x170
> [ 1267.779622]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 1267.779922] RIP: 0033:0x7f152450d1ab
> [ 1267.780220] Code: 7b 3c 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 90 f3 0f 1e
> fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48>
> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 41 3c 0e 00 f7 d8
> [ 1267.780845] RSP: 002b:00007fff04b19eb8 EFLAGS: 00000246 ORIG_RAX:
> 00000000000000a6
> [ 1267.781190] RAX: 0000000000000000 RBX: 00007f15247c2264 RCX:
> 00007f152450d1ab
> [ 1267.781513] RDX: ffffffffffffff70 RSI: 0000000000000000 RDI:
> 0000000003a5bd40
> [ 1267.781837] RBP: 0000000003a57200 R08: 0000000000000000 R09:
> 00007fff04b18c60
> [ 1267.782158] R10: 0000000000000000 R11: 0000000000000246 R12:
> 0000000000000000
> [ 1267.782471] R13: 0000000003a5bd40 R14: 0000000003a57310 R15:
> 0000000003a57430
> [ 1267.782780]  </TASK>
> ```
> 
> shrink_dcache_for_umount
> ----> do_one_tree
> --------> dput
> ------------> __dentry_kill
> ----------------> dentry_unlink_inode
> --------------------> iput
> ------------------------> iput_final
> -----------------------------> __inode_add_lru
> ---------------------------------> list_lru_add_obj
> --------------------------------------> list_lru_add
> which causes the NULL pointer dereference.
> 
> 
> Thanks,
> Hongbo
