On Wed, Sep 6, 2017 at 7:38 PM, Ilya Dryomov <[email protected]> wrote:
> sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to
> permit trying WRITE SAME on older SCSI devices, unless ->no_write_same
> is set. This means blkdev_issue_zeroout() must cope with WRITE SAME
> failing with BLK_STS_TARGET/-EREMOTEIO and explicitly write zeroes,
> unless BLKDEV_ZERO_NOFALLBACK is specified.
>
> Commit 71027e97d796 ("block: stop using discards for zeroing") added
> the following to __blkdev_issue_zeroout() comment:
>
> "Note that this function may fail with -EOPNOTSUPP if the driver signals
> zeroing offload support, but the device fails to process the command (for
> some devices there is no non-destructive way to verify whether this
> operation is actually supported). In this case the caller should call
> retry the call to blkdev_issue_zeroout() and the fallback path will be
> used."
>
> But __blkdev_issue_zeroout() doesn't fail in this case: if WRITE SAME
> support is indicated, a REQ_OP_WRITE_ZEROES bio is built and returned
> to blkdev_issue_zeroout(). -EREMOTEIO is then propagated up:
>
> $ fallocate -zn -l 1k /dev/sdg
> fallocate: fallocate failed: Remote I/O error
> $ fallocate -zn -l 1k /dev/sdg # OK
>
> (The second fallocate(1) succeeds because sd_done() sets ->no_write_same
> in response to a sense that would become BLK_STS_TARGET.)
>
> Retry __blkdev_issue_zeroout() if the I/O fails with -EREMOTEIO. This
> is roughly what we did until 4.12, sans BLKDEV_ZERO_NOFALLBACK knob.
>
> Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing")
> Cc: Christoph Hellwig <[email protected]>
> Cc: "Martin K. Petersen" <[email protected]>
> Cc: Hannes Reinecke <[email protected]>
> Cc: [email protected] # 4.12+
> Signed-off-by: Ilya Dryomov <[email protected]>
> ---
> block/blk-lib.c | 49 +++++++++++++++++++++++++++++++++----------------
> 1 file changed, 33 insertions(+), 16 deletions(-)
>
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 3fe0aec90597..876b5478c1a2 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -287,12 +287,6 @@ static unsigned int
> __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> * Zero-fill a block range, either using hardware offload or by explicitly
> * writing zeroes to the device.
> *
> - * Note that this function may fail with -EOPNOTSUPP if the driver signals
> - * zeroing offload support, but the device fails to process the command (for
> - * some devices there is no non-destructive way to verify whether this
> - * operation is actually supported). In this case the caller should call
> - * retry the call to blkdev_issue_zeroout() and the fallback path will be
> used.
> - *
> * If a device is using logical block provisioning, the underlying space
> will
> * not be released if %flags contains BLKDEV_ZERO_NOUNMAP.
> *
> @@ -343,6 +337,34 @@ int __blkdev_issue_zeroout(struct block_device *bdev,
> sector_t sector,
> }
> EXPORT_SYMBOL(__blkdev_issue_zeroout);
>
> +/*
> + * This function may fail with -EREMOTEIO if the driver signals zeroing
> + * offload support, but the device fails to process the command (for
> + * some devices there is no non-destructive way to verify whether this
> + * operation is actually supported). In this case the caller should
> + * retry and the fallback path in __blkdev_issue_zeroout() will be used
> + * unless %flags contains BLKDEV_ZERO_NOFALLBACK.
> + */
> +static int __blkdev_issue_zeroout_wait(struct block_device *bdev,
> + sector_t sector, sector_t nr_sects,
> + gfp_t gfp_mask, unsigned flags)
> +{
> + int ret;
> + struct bio *bio = NULL;
> + struct blk_plug plug;
> +
> + blk_start_plug(&plug);
> + ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
> + &bio, flags);
> + if (ret == 0 && bio) {
> + ret = submit_bio_wait(bio);
> + bio_put(bio);
> + }
> + blk_finish_plug(&plug);
> +
> + return ret;
> +}
> +
> /**
> * blkdev_issue_zeroout - zero-fill a block range
> * @bdev: blockdev to write
> @@ -360,17 +382,12 @@ int blkdev_issue_zeroout(struct block_device *bdev,
> sector_t sector,
> sector_t nr_sects, gfp_t gfp_mask, unsigned flags)
> {
> int ret;
> - struct bio *bio = NULL;
> - struct blk_plug plug;
>
> - blk_start_plug(&plug);
> - ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
> - &bio, flags);
> - if (ret == 0 && bio) {
> - ret = submit_bio_wait(bio);
> - bio_put(bio);
> - }
> - blk_finish_plug(&plug);
> + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects,
> + gfp_mask, flags);
> + if (ret == -EREMOTEIO)
> + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects,
> + gfp_mask, flags);
>
> return ret;
> }
Ping... This fixes a regression introduced in 4.12. Christoph,
Martin, could you please take a look?
Thanks,
Ilya