This one is still an issue even with today's fixes.

Thanks,
Carl

> On 2023-12-28 1:59 PM PST Carl E. Thompson <[email protected]> wrote:
>
> Hello, there appears to be a bug in bcachefs in which certain changes to
> subvolumes and snapshots can result in an inability to suspend the system.
> Specifically, if a bcachefs snapshot is taken of a subvolume, then a file is
> removed or modified in either the subvolume or snapshot, then the subvolume
> and snapshot are deleted, then after that s2idle will fail until the system
> is rebooted. This is 100% reproducible on my laptop running rc7.
>
> Here is a short example of something that will trigger the bug:
> ---
> [carl@clip test]$ bcachefs subvolume create subvol
>
> [carl@clip test]$ touch subvol/file
>
> [carl@clip test]$ bcachefs subvolume snapshot subvol snapshot_of_subvol
>
> [carl@clip test]$ rm subvol/file
>
> [carl@clip test]$ bcachefs subvolume delete subvol
>
> [carl@clip test]$ bcachefs subvolume delete snapshot_of_subvol
> ---
>
> After this, suspending the system will fail and produce kernel messages like
> the following:
> ---
> [10898.793676] Freezing remaining freezable tasks
> [10918.797255] Freezing remaining freezable tasks failed after 20.003 seconds
> (0 tasks refusing to freeze, wq_busy=1):
> [10918.797270] Showing freezable workqueues that are still busy:
> [10918.797273] workqueue events_freezable: flags=0x4
> [10918.797277] pwq 28: cpus=14 node=0 flags=0x0 nice=0 active=0/0 refcnt=2
> [10918.797289] inactive: pci_pme_list_scan
> [10918.797309] workqueue bcachefs_write_ref: flags=0x4
> [10918.797314] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=2/0 refcnt=3
> [10918.797322] in-flight:
> 12451:bch2_subvolume_wait_for_pagecache_and_delete [bcachefs]
> bch2_subvolume_wait_for_pagecache_and_delete [bcachefs]
> [10918.797519] workqueue bcachefs_io: flags=0x1c
> [10918.797525] pwq 9: cpus=4 node=0 flags=0x0 nice=-20 active=0/0 refcnt=2
> [10918.797532] inactive: journal_write_work [bcachefs]
> [10918.797616] workqueue bcachefs_write_ref: flags=0x4
> [10918.797620] pwq 18: cpus=9 node=0 flags=0x0 nice=0 active=2/0 refcnt=3
> [10918.797626] in-flight:
> 17562:bch2_subvolume_wait_for_pagecache_and_delete [bcachefs]
> bch2_subvolume_wait_for_pagecache_and_delete [bcachefs]
> [10918.798386] Restarting kernel threads ... done.
> [10918.799643] OOM killer enabled.
> [10918.799647] Restarting tasks ... done.
> [10918.803749] random: crng reseeded on system resumption
> [10919.295422] PM: suspend exit
> ---
>
> - I have only tested this on bcachefs filesystems with 4k blocks. If this is
> related to the issue causing one of my other bug reports today then it's
> possible it may not happen on filesystems with 512 byte blocks (untested).
>
> Thank you,
> Carl Thompson
