This one is still an issue even with today's fixes.

Thanks,
Carl

> On 2023-12-28 1:59 PM PST Carl E. Thompson <[email protected]> wrote:
> 
> Hello, there appears to be a bug in bcachefs in which certain changes to 
> subvolumes and snapshots leave the system unable to suspend. Specifically, 
> if a bcachefs snapshot is taken of a subvolume, a file is then removed or 
> modified in either the subvolume or the snapshot, and both the subvolume 
> and the snapshot are then deleted, s2idle fails until the system is 
> rebooted. This is 100% reproducible on my laptop running rc7. 
> 
> Here is a short example of something that will trigger the bug:
> ---
> [carl@clip test]$ bcachefs subvolume create subvol
> 
> [carl@clip test]$ touch subvol/file
> 
> [carl@clip test]$ bcachefs subvolume snapshot subvol snapshot_of_subvol
> 
> [carl@clip test]$ rm subvol/file
> 
> [carl@clip test]$ bcachefs subvolume delete subvol
> 
> [carl@clip test]$ bcachefs subvolume delete snapshot_of_subvol
> ---
> 
> After this, suspending the system fails and produces kernel messages like 
> the following:
> ---
> [10898.793676] Freezing remaining freezable tasks
> [10918.797255] Freezing remaining freezable tasks failed after 20.003 seconds (0 tasks refusing to freeze, wq_busy=1):
> [10918.797270] Showing freezable workqueues that are still busy:
> [10918.797273] workqueue events_freezable: flags=0x4
> [10918.797277]   pwq 28: cpus=14 node=0 flags=0x0 nice=0 active=0/0 refcnt=2
> [10918.797289]     inactive: pci_pme_list_scan
> [10918.797309] workqueue bcachefs_write_ref: flags=0x4
> [10918.797314]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=2/0 refcnt=3
> [10918.797322]     in-flight: 12451:bch2_subvolume_wait_for_pagecache_and_delete [bcachefs] bch2_subvolume_wait_for_pagecache_and_delete [bcachefs]
> [10918.797519] workqueue bcachefs_io: flags=0x1c
> [10918.797525]   pwq 9: cpus=4 node=0 flags=0x0 nice=-20 active=0/0 refcnt=2
> [10918.797532]     inactive: journal_write_work [bcachefs]
> [10918.797616] workqueue bcachefs_write_ref: flags=0x4
> [10918.797620]   pwq 18: cpus=9 node=0 flags=0x0 nice=0 active=2/0 refcnt=3
> [10918.797626]     in-flight: 17562:bch2_subvolume_wait_for_pagecache_and_delete [bcachefs] bch2_subvolume_wait_for_pagecache_and_delete [bcachefs]
> [10918.798386] Restarting kernel threads ... done.
> [10918.799643] OOM killer enabled.
> [10918.799647] Restarting tasks ... done.
> [10918.803749] random: crng reseeded on system resumption
> [10919.295422] PM: suspend exit
> ---
> 
> - I have only tested this on bcachefs filesystems with 4k blocks. If this is 
> related to the issue behind one of my other bug reports today, then it 
> may not happen on filesystems with 512-byte blocks (untested).
> 
> Thank you,
> Carl Thompson
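
For anyone who wants to retry this quickly, the reproduction steps quoted above could be wrapped in a small shell function. This is only a sketch: the names `run`, `reproduce` and `DRY_RUN` are made up for illustration, and it assumes the current directory is on a mounted bcachefs filesystem and that the `bcachefs` tool is in the PATH. It does not trigger the suspend itself.

```shell
# Sketch of the reproducer from the report. The helper names (run, reproduce)
# and the DRY_RUN variable are hypothetical, chosen only for this example.

# Echo a command, then execute it unless DRY_RUN is set to a non-empty value.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

# Same sequence as in the report: create a subvolume, snapshot it, change the
# subvolume, then delete both. Stops at the first command that fails.
reproduce() {
    run bcachefs subvolume create subvol &&
    run touch subvol/file &&
    run bcachefs subvolume snapshot subvol snapshot_of_subvol &&
    run rm subvol/file &&
    run bcachefs subvolume delete subvol &&
    run bcachefs subvolume delete snapshot_of_subvol
}
```

Calling `DRY_RUN=1 reproduce` only prints the six commands; calling `reproduce` on its own runs them. After that, attempting to suspend should show whether the `bcachefs_write_ref` workqueue is still reported as busy.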
