On Wed, Mar 26, 2025 at 08:55:48AM -0400, Damien Le Moal wrote: > On 2025/03/21 13:52, Benjamin Marzinski wrote: > > On Fri, Mar 21, 2025 at 08:18:16AM +0100, Christoph Hellwig wrote: > >> Add support for zoned device by passing through report_zoned to the > >> underlying read device. > >> > >> This is required to make enable xfstests xfs/311 on zoned devices. > > > > On suspend, delay_presuspend() stops delaying and it doesn't guarantee > > that new bios coming in will always be submitted after the delayed bios > > it is flushing. That can mess things up for zoned devices. I didn't > > check if that matters for the specific test. Setting > > > > ti->emulate_zone_append = true; > > > > would enforce write ordering, at the expense of adding a whole other > > layer of delays to zoned dm-delay devices. Since this isn't really > > useful outside of testing, I think that could be acceptable if necessary > > (it would require us to support table reloads of zoned devices with > > emulated zone append, since tests often want to change the delay). > > However it would probably be better to see if we can just make dm-delay > > preserve write ordering during a suspend. > > delay_presuspend() calls flush_delayed_bios() with flush_all == true. So all > BIOs will be flushed in the order they are queued in the delay list, which as > far as I can tell is the order in which the user of dm-delay issued the BIOs. > So > for writes, the order is preserved as far as I can tell.
delay_presuspend() is called before we set the DMF_BLOCK_IO_FOR_SUSPEND bit, which will stop incoming bio from getting mapped, and also before lock_fs() is called. This means it's common for new bios to continue to come into delay_map(), while delay_presuspend() is running. The moment delay_presuspend() sets dc->may_delay = false, those new bios will stop getting queued by delay_bio(). They will get remapped immeditately to the underlying device. flush_delayed_bios() doesn't even get called until after dc->may_delay is set to false, and if there are a lot of bios on the delayed_bios list, flush_delayed_bios() will schedule. So, it's actually very common for new incoming bios to get passed to underlying device before all the bios on the dc->delayed_bios list do. Solving this without grabbing the dc->process_bios_lock mutex for every bio sent to dm-delay probably involves keeping the incoming bios going to dc->delayed_bios during suspend, at least until we can guarantee that it's empty and no bios are being flushed. -Ben > > -- > Damien Le Moal > Western Digital Research