> On 2024-10-17 1:39 AM PDT Kent Overstreet <[email protected]> wrote:

> ...

> Again - bcachefs was only merged in 6.7, clearly marked experimental,
> and you're running 6.9; this kind of bug is exactly the sort of thing we
> try to shake out in the experimental phase.

This isn't a bcachefs problem as such, but as a distribution user I would
have had no idea that bcachefs was experimental. Every major distribution
I've looked at recently ships the bcachefs module and tools, and nothing
tells the user that it's experimental. Only the person who actually
configured the kernel (or people who read the mailing lists) would know.

Perhaps, if this is to be expected right now, the bcachefs command line tool
should print a big warning letting users know that bcachefs is experimental
and might eat their data?

> Also, a fsck would have sufficed, if you haven't ran that already.

That must be a different bug, because fsck doesn't help. I still have the old
filesystem images, so I just ran fsck again, mounted, then tried to unmount,
and immediately got the same filesystem lockup. Time to reboot. See below.

Carl


fsck test:
---

[clip carl]# fsck.bcachefs /dev/clip/local.old
Running userspace offline fsck
starting version 1.12: (unknown version) opts=ro,compression=lz4,nopromote_whole_extents,degraded,fsck,fix_errors=ask,read_only
recovering from clean shutdown, journal seq 5004
Version downgrade required:
  running recovery passes: check_allocations
accounting_read... done
alloc_read... done
stripes_read... done
snapshots_read... done
check_allocations... done
going read-write
journal_replay... done
check_alloc_info... done
check_lrus... done
check_btree_backpointers... done
check_backpointers_to_extents... done
check_extents_to_backpointers... done
check_alloc_to_lru_refs... done
check_snapshot_trees... done
check_snapshots... done
check_subvols... done
check_subvol_children... done
delete_dead_snapshots... done
check_inodes... done
check_extents... done
check_indirect_extents... done
check_dirents... done
check_xattrs... done
check_root... done
check_subvolume_structure... done
check_directory_structure... done
check_nlinks... done
resume_logged_ops... done
delete_dead_inodes... done
shutdown complete, journal seq 5005


[clip carl]# mount /dev/clip/local.old /cet

[clip carl]# umount /cet



^C^C^C^C^C



From dmesg:
---

[10679.819992] bcachefs (dm-7): mounting version 1.11: (unknown version) opts=compression=lz4
[10679.820007] bcachefs (dm-7): recovering from clean shutdown, journal seq 5005
[10679.820009] bcachefs (dm-7): Version downgrade required:
[10679.828985] bcachefs (dm-7): alloc_read... done
[10679.828999] bcachefs (dm-7): stripes_read... done
[10679.829004] bcachefs (dm-7): snapshots_read... done
[10679.829439] bcachefs (dm-7): journal_replay... done
[10679.829445] bcachefs (dm-7): resume_logged_ops... done
[10679.829454] bcachefs (dm-7): going read-write
[10929.797197] Process accounting resumed
[10930.010868] r8169 0000:02:00.0 eth0: Link is Down
[10935.961246] INFO: task bch-copygc/dm-7:28532 blocked for more than 122 seconds.
[10935.961252]       Not tainted 6.9.4-arch1-1 #1
[10935.961253] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10935.961255] task:bch-copygc/dm-7 state:D stack:0     pid:28532 tgid:28532 ppid:2      flags:0x00024000
[10935.961259] Call Trace:
[10935.961261]  <TASK>
[10935.961265]  __schedule+0x3c7/0x1510
[10935.961275]  schedule+0x27/0xf0
[10935.961278]  __closure_sync+0x7e/0x140
[10935.961283]  __bch2_write+0x136b/0x1660 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961336]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.961339]  ? __kmalloc+0x1a7/0x440
[10935.961343]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.961346]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.961351]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.961353]  ? local_clock_noinstr+0xd/0xd0
[10935.961355]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.961357]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.961360]  ? bch2_moving_ctxt_do_pending_writes+0x11a/0x220 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961397]  bch2_moving_ctxt_do_pending_writes+0x11a/0x220 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961426]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.961428]  ? bch2_btree_path_traverse_one+0x958/0xcf0 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961467]  bch2_data_update_init+0x68b/0x1420 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961512]  ? bch2_move_extent+0x3da/0xed0 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961549]  bch2_move_extent+0x3da/0xed0 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961584]  ? bch2_evacuate_bucket+0x9d4/0xc00 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961615]  bch2_evacuate_bucket+0x9d4/0xc00 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961653]  ? bch2_copygc+0x210/0x880 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961684]  bch2_copygc+0x210/0x880 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961718]  bch2_copygc_thread+0x152/0x3d0 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961749]  ? bch2_copygc_thread+0xcf/0x3d0 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961784]  ? __pfx_bch2_copygc_thread+0x10/0x10 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961813]  kthread+0xcf/0x100
[10935.961818]  ? __pfx_kthread+0x10/0x10
[10935.961821]  ret_from_fork+0x31/0x50
[10935.961825]  ? __pfx_kthread+0x10/0x10
[10935.961828]  ret_from_fork_asm+0x1a/0x30
[10935.961834]  </TASK>
[10935.961835] INFO: task umount:28561 blocked for more than 122 seconds.
[10935.961837]       Not tainted 6.9.4-arch1-1 #1
[10935.961838] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10935.961839] task:umount          state:D stack:0     pid:28561 tgid:28561 ppid:27681  flags:0x00004004
[10935.961842] Call Trace:
[10935.961844]  <TASK>
[10935.961846]  __schedule+0x3c7/0x1510
[10935.961850]  ? schedule+0x27/0xf0
[10935.961855]  schedule+0x27/0xf0
[10935.961857]  schedule_timeout+0x12f/0x160
[10935.961862]  wait_for_completion+0x86/0x170
[10935.961866]  kthread_stop+0x6a/0x180
[10935.961869]  bch2_copygc_stop+0x1e/0x80 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961900]  __bch2_fs_read_only+0x3b/0x210 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961938]  bch2_fs_read_only+0x140/0x3f0 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.961972]  ? __pfx_autoremove_wake_function+0x10/0x10
[10935.961976]  __bch2_fs_stop+0x5a/0x380 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.962009]  generic_shutdown_super+0x77/0x170
[10935.962014]  bch2_kill_sb+0x16/0x20 [bcachefs 8d6f6bf430dcfbb124cd4a016333997e24e1fc8a]
[10935.962053]  deactivate_locked_super+0x30/0xb0
[10935.962057]  cleanup_mnt+0xba/0x150
[10935.962061]  task_work_run+0x59/0x90
[10935.962065]  syscall_exit_to_user_mode+0x1fe/0x210
[10935.962067]  do_syscall_64+0x8f/0x190
[10935.962071]  ? syscall_exit_to_user_mode+0x75/0x210
[10935.962073]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.962075]  ? do_syscall_64+0x8f/0x190
[10935.962077]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.962079]  ? srso_alias_return_thunk+0x5/0xfbef5
[10935.962081]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[10935.962085] RIP: 0033:0x79801ee197a9
[10935.962127] RSP: 002b:00007ffd0559c690 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[10935.962130] RAX: 0000000000000000 RBX: 000079801edcfb50 RCX: 000079801ee197a9
[10935.962132] RDX: 000000000000034a RSI: 0000000000000000 RDI: 000056a4fc1f53b0
[10935.962133] RBP: 000056a4fc1f53b0 R08: 00000000ffffff9c R09: 0000000000000000
[10935.962135] R10: 00000000fffffffe R11: 0000000000000246 R12: 0000000000000000
[10935.962136] R13: 000079801edcfc58 R14: 000056a4fc1f5740 R15: 0000000000000000
[10935.962140]  </TASK>
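
For anyone triaging logs like the one above, here is a small Python sketch
that lists which tasks the kernel reported as hung. It is only an
illustration: the function name hung_tasks is made up, and the regex assumes
the "INFO: task <comm>:<pid> blocked for more than <N> seconds" format seen
in this dmesg output (other kernel versions may word it differently). It
tolerates the message being wrapped across lines by a mail client.

```python
import re

# Assumed message format, taken from the hung-task lines in the dmesg above:
#   INFO: task <comm>:<pid> blocked for more than <N> seconds.
# \s+ is used around the number so a line wrapped by a mailer still matches.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) blocked for more than\s+"
    r"(?P<secs>\d+)\s+seconds"
)


def hung_tasks(dmesg_text):
    """Return a list of (comm, pid, seconds) for each hung-task report."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(dmesg_text)
    ]
```

Fed the dmesg text above (for example, saved to a file and read back in), it
should report bch-copygc/dm-7 and umount, which matches the two traces:
umount is waiting in kthread_stop for the copygc thread, and the copygc
thread is itself blocked in __closure_sync.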
