On Wed, 25 Jun 2025, Damien Le Moal wrote:

> Any zoned DM target that requires zone append emulation will use the
> block layer zone write plugging. In such a case, DM target drivers must
> not split BIOs using dm_accept_partial_bio() as doing so can potentially
> lead to deadlocks with queue freeze operations. Regular write operations
> used to emulate zone append operations also cannot be split by the
> target driver as that would result in an invalid written sector value
> being returned using the BIO sector.
> 
> In order for zoned DM target drivers to avoid such incorrect BIO
> splitting, we must ensure that large BIOs are split before being passed
> to the map() function of the target, thus guaranteeing that the
> limits for the mapped device are not exceeded.
> 
> dm-crypt and dm-flakey are the only target drivers supporting zoned
> devices and using dm_accept_partial_bio().
> 
> In the case of dm-crypt, this function is used to split BIOs to the
> internal max_write_size limit (which will be suppressed in a different
> patch). However, crypt_alloc_buffer() uses a bioset allowing only
> up to BIO_MAX_VECS (256) vectors in a BIO, so the dm-crypt device
> max_segments limit, which is not set and so defaults to BLK_MAX_SEGMENTS
> (128), must be respected and write BIOs split accordingly.
> 
> In the case of dm-flakey, since zone append emulation is not required,
> the block layer zone write plugging is not used and no splitting of BIOs
> is required.
> 
> Modify the function dm_zone_bio_needs_split() to use the block layer
> helper function bio_needs_zone_write_plugging() to force a call to
> bio_split_to_limits() in dm_split_and_process_bio(). This allows DM
> target drivers to avoid using dm_accept_partial_bio() for write
> operations on zoned DM devices.
> 
> Fixes: f211268ed1f9 ("dm: Use the block layer zone append emulation")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Damien Le Moal <dlem...@kernel.org>

Reviewed-by: Mikulas Patocka <mpato...@redhat.com>

> ---
>  drivers/md/dm.c | 29 ++++++++++++++++++++++-------
>  1 file changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index e477765cdd27..f1e63c1808b4 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1773,12 +1773,29 @@ static inline bool dm_zone_bio_needs_split(struct mapped_device *md,
>                                          struct bio *bio)
>  {
>       /*
> -      * For mapped device that need zone append emulation, we must
> -      * split any large BIO that straddles zone boundaries.
> +      * Special case the zone operations that cannot or should not be split.
>        */
> -     return dm_emulate_zone_append(md) && bio_straddles_zones(bio) &&
> -             !bio_flagged(bio, BIO_ZONE_WRITE_PLUGGING);
> +     switch (bio_op(bio)) {
> +     case REQ_OP_ZONE_APPEND:
> +     case REQ_OP_ZONE_FINISH:
> +     case REQ_OP_ZONE_RESET:
> +     case REQ_OP_ZONE_RESET_ALL:
> +             return false;
> +     default:
> +             break;
> +     }
> +
> +     /*
> +      * Mapped devices that require zone append emulation will use the block
> +      * layer zone write plugging. In such case, we must split any large BIO
> +      * to the mapped device limits to avoid potential deadlocks with queue
> +      * freeze operations.
> +      */
> +     if (!dm_emulate_zone_append(md))
> +             return false;
> +     return bio_needs_zone_write_plugging(bio) || bio_straddles_zones(bio);
>  }
> +
>  static inline bool dm_zone_plug_bio(struct mapped_device *md, struct bio *bio)
>  {
>       if (!bio_needs_zone_write_plugging(bio))
> @@ -1927,9 +1944,7 @@ static void dm_split_and_process_bio(struct mapped_device *md,
>  
>       is_abnormal = is_abnormal_io(bio);
>       if (static_branch_unlikely(&zoned_enabled)) {
> -             /* Special case REQ_OP_ZONE_RESET_ALL as it cannot be split. */
> -             need_split = (bio_op(bio) != REQ_OP_ZONE_RESET_ALL) &&
> -                     (is_abnormal || dm_zone_bio_needs_split(md, bio));
> +             need_split = is_abnormal || dm_zone_bio_needs_split(md, bio);
>       } else {
>               need_split = is_abnormal;
>       }
> -- 
> 2.49.0
> 
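
For anyone following the logic, the decision made by the patched
dm_zone_bio_needs_split() can be modeled in user space as below. This is
only a sketch and not kernel code: the model_* names are hypothetical,
the enum stands in for the REQ_OP_* values, and the two boolean fields
stand in for the results of bio_needs_zone_write_plugging() and
bio_straddles_zones(), which the real function evaluates on the BIO.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's REQ_OP_* operation codes. */
enum model_op {
	MODEL_OP_READ,
	MODEL_OP_WRITE,
	MODEL_OP_ZONE_APPEND,
	MODEL_OP_ZONE_FINISH,
	MODEL_OP_ZONE_RESET,
	MODEL_OP_ZONE_RESET_ALL,
};

/* Hypothetical model of the BIO properties the patch looks at. */
struct model_bio {
	enum model_op op;
	bool needs_zone_write_plugging;	/* models bio_needs_zone_write_plugging() */
	bool straddles_zones;		/* models bio_straddles_zones() */
};

/*
 * Mirrors the control flow of the patched function:
 * 1. zone management and zone append operations are never split;
 * 2. without zone append emulation, no zone-specific split is forced;
 * 3. otherwise split when the BIO will be write plugged or crosses a
 *    zone boundary.
 */
bool model_zone_bio_needs_split(bool emulate_zone_append,
				const struct model_bio *bio)
{
	switch (bio->op) {
	case MODEL_OP_ZONE_APPEND:
	case MODEL_OP_ZONE_FINISH:
	case MODEL_OP_ZONE_RESET:
	case MODEL_OP_ZONE_RESET_ALL:
		return false;
	default:
		break;
	}

	if (!emulate_zone_append)
		return false;

	return bio->needs_zone_write_plugging || bio->straddles_zones;
}
```

In this model, a regular write on a device that emulates zone append
always takes the split path, while the same write on a device that does
not need emulation (the dm-flakey case above) does not.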

