On Thu, Oct 5, 2017 at 7:13 PM, Christoph Hellwig <[email protected]> wrote:
> On Wed, Oct 04, 2017 at 05:03:16PM +0200, Ilya Dryomov wrote:
>> sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to
>> permit trying WRITE SAME on older SCSI devices, unless ->no_write_same
>> is set. Because REQ_OP_WRITE_ZEROES is implemented in terms of WRITE
>> SAME, blkdev_issue_zeroout() may fail with -EREMOTEIO:
>>
>> $ fallocate -zn -l 1k /dev/sdg
>> fallocate: fallocate failed: Remote I/O error
>> $ fallocate -zn -l 1k /dev/sdg # OK
>> $ fallocate -zn -l 1k /dev/sdg # OK
>
> Can we wire this up for blktests somehow?
This is covered by Darrick's generic/351, part of fstests blockdev
group.
>
>>
>> The following calls succeed because sd_done() sets ->no_write_same in
>> response to a sense that would become BLK_STS_TARGET/-EREMOTEIO, causing
>> __blkdev_issue_zeroout() to fall back to generating ZERO_PAGE bios.
>>
>> This means blkdev_issue_zeroout() must cope with WRITE ZEROES failing
>> and fall back to manually zeroing, unless BLKDEV_ZERO_NOFALLBACK is
>> specified. For BLKDEV_ZERO_NOFALLBACK case, return -EOPNOTSUPP if
>> sd_done() has just set ->no_write_same thus indicating lack of offload
>> support.
>>
>> Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing")
>> Cc: Christoph Hellwig <[email protected]>
>> Cc: "Martin K. Petersen" <[email protected]>
>> Cc: Hannes Reinecke <[email protected]>
>> Signed-off-by: Ilya Dryomov <[email protected]>
>> ---
>> block/blk-lib.c | 41 +++++++++++++++++++++++++++++++----------
>> 1 file changed, 31 insertions(+), 10 deletions(-)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index 9d2ab8bba52a..17494275673e 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -321,12 +321,6 @@ static int __blkdev_issue_zero_pages(struct
>> block_device *bdev,
>> * Zero-fill a block range, either using hardware offload or by explicitly
>> * writing zeroes to the device.
>> *
>> - * Note that this function may fail with -EOPNOTSUPP if the driver signals
>> - * zeroing offload support, but the device fails to process the command
>> (for
>> - * some devices there is no non-destructive way to verify whether this
>> - * operation is actually supported). In this case the caller should call
>> - * retry the call to blkdev_issue_zeroout() and the fallback path will be
>> used.
>> - *
>> * If a device is using logical block provisioning, the underlying space
>> will
>> * not be released if %flags contains BLKDEV_ZERO_NOUNMAP.
>> *
>> @@ -370,18 +364,45 @@ EXPORT_SYMBOL(__blkdev_issue_zeroout);
>> int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>> sector_t nr_sects, gfp_t gfp_mask, unsigned flags)
>> {
>> - int ret;
>> - struct bio *bio = NULL;
>> + int ret = 0;
>> + sector_t bs_mask;
>> + struct bio *bio;
>> struct blk_plug plug;
>> + bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev);
>> +
>> + bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
>> + if ((sector | nr_sects) & bs_mask)
>> + return -EINVAL;
>>
>> +retry:
>> + bio = NULL;
>> blk_start_plug(&plug);
>> - ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
>> - &bio, flags);
>> + if (try_write_zeroes) {
>> + ret = __blkdev_issue_write_zeroes(bdev, sector, nr_sects,
>> + gfp_mask, &bio, flags);
>> + } else if (!(flags & BLKDEV_ZERO_NOFALLBACK)) {
>> + ret = __blkdev_issue_zero_pages(bdev, sector, nr_sects,
>> + gfp_mask, &bio);
>> + } else if (!bdev_write_zeroes_sectors(bdev)) {
>> + /*
>> + * Manual zeroout is not allowed and either:
>> + * - no zeroing offload support
>> + * - zeroing offload support was indicated, but the device
>> + * reported ILLEGAL REQUEST (for some devices there is no
>> + * non-destructive way to verify whether WRITE ZEROES is
>> + * actually supported)
>> + */
>> + ret = -EOPNOTSUPP;
>
> I don't understand the conditional above this error return - if
> we can't zero using either method we should always return an error.
This is to avoid returning -EREMOTEIO in the following case: device
doesn't support WRITE SAME but scsi_disk::max_ws_blocks != 0, zeroout
is called with BLKDEV_ZERO_NOFALLBACK. Enter blkdev_issue_zeroout(),
bdev_write_zeroes_sectors() != 0, so we issue WRITE ZEROES. The
request fails with ILLEGAL REQUEST, sd_done() sets ->no_write_same and
updates queue_limits, ILLEGAL REQUEST is translated into -EREMOTEIO,
which is returned from submit_bio_wait(). Manual zeroing is not
allowed, so we must return an error, but it shouldn't be -EREMOTEIO if
queue_limits just got updated because of ILLEGAL REQUEST. Without this
conditional, we'd get
$ fallocate -pn -l 1k /dev/sdg
fallocate: fallocate failed: Remote I/O error
$ fallocate -pn -l 1k /dev/sdg # -EOPNOTSUPP
fallocate: keep size mode (-n option) unsupported
$ fallocate -pn -l 1k /dev/sdg # -EOPNOTSUPP
fallocate: keep size mode (-n option) unsupported
I tried to explain this between the comment and the commit message.
Basically, just mopping up after sd_config_write_same().
Thanks,
Ilya