Re: [PATCH] block: don't try Write Same from __blkdev_issue_zeroout

2017-02-05 Thread Junichi Nomura
On 02/06/17 02:10, Christoph Hellwig wrote:
> Write Same can return an error asynchronously if it turns out the
> underlying SCSI device does not support Write Same, which makes a
> proper fallback to other methods in __blkdev_issue_zeroout impossible.
> Thus only issue a Write Same from blkdev_issue_zeroout and don't try it
> at all from __blkdev_issue_zeroout as a non-invasive workaround.
> 
> Signed-off-by: Christoph Hellwig <h...@lst.de>
> Reported-by: Junichi Nomura <j-nom...@ce.jp.nec.com>
> Fixes: e73c23ff ("block: add async variant of blkdev_issue_zeroout")

Thank you. I tested your patch and confirmed it works for me.

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.


[REGRESSION v4.10-rc1] blkdev_issue_zeroout() returns -EREMOTEIO on the first call for SCSI device that doesn't support WRITE SAME

2017-02-02 Thread Junichi Nomura
I found that the following ext4 error occurs on a certain storage device
since v4.10-rc1:
  EXT4-fs (sdc1): Delayed block allocation failed for inode 12 at logical offset 100 with max blocks 2 with error 121
  EXT4-fs (sdc1): This should not happen!! Data will be lost

Error 121 (EREMOTEIO) was returned from blkdev_issue_zeroout().
It came from the sd driver, because WRITE SAME was sent to a device
that doesn't support it.

The problem was introduced by commit e73c23ff736e ("block: add async
variant of blkdev_issue_zeroout"). Before that commit, blkdev_issue_zeroout
fell back to writing zeroes normally when WRITE SAME failed, and it seems
the sd driver's heuristics depend on that behaviour.

Below is a band-aid fix to restore the fallback behaviour for sd. There
should be a better fix, though, as retrying blindly is not a good idea...

v4.10-rc6:
  # cat /sys/block/sdc/queue/write_same_max_bytes
  33553920
  # fallocate -v -z -l 512 /dev/sdc1
  fallocate: fallocate failed: Remote I/O error
  # cat /sys/block/sdc/queue/write_same_max_bytes
  0
  # fallocate -v -z -l 512 /dev/sdc1
  # echo $?
  0

v4.9 or v4.10-rc6 + this patch:
  # grep . /sys/block/sdc/queue/write_same_max_bytes
  33553920
  # fallocate -v -z -l 512 /dev/sdc1
  # echo $?
  0
  # grep . /sys/block/sdc/queue/write_same_max_bytes
  0
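
The two transcripts above can be reproduced with a small userspace model.
This is only a sketch under stated assumptions, not kernel code: the
toy_disk structure and every toy_* function are hypothetical names made
up for illustration, and clearing the limit on failure is a simplified
stand-in for what the sd driver appears to do. It shows why the first
call fails without a retry, and why retrying once succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121 /* Linux value, for platforms that lack it */
#endif

/* Hypothetical stand-in for a SCSI disk; not a kernel structure. */
struct toy_disk {
	unsigned int write_same_max_bytes; /* advertised limit, 0 = disabled */
	bool supports_write_same;          /* what the hardware really does */
	int plain_zero_writes;             /* fallback writes performed */
};

/*
 * Models the driver side: a WRITE SAME sent to a device that rejects it
 * fails with -EREMOTEIO, and the limit is cleared so later callers skip it.
 */
static int toy_write_same(struct toy_disk *d)
{
	assert(d != NULL);
	if (d->supports_write_same)
		return 0;
	d->write_same_max_bytes = 0;
	return -EREMOTEIO;
}

/* One pass: pick a method from the advertised limit, then wait for it. */
static int toy_zeroout_once(struct toy_disk *d)
{
	if (d->write_same_max_bytes)
		return toy_write_same(d); /* error only known at completion */
	d->plain_zero_writes++;           /* ordinary zero-filled write */
	return 0;
}

/* v4.10-rc6 behaviour in this model: no second attempt. */
static int toy_zeroout_no_retry(struct toy_disk *d)
{
	return toy_zeroout_once(d);
}

/* Band-aid behaviour: on failure, retry once so another method is picked. */
static int toy_zeroout_retry(struct toy_disk *d)
{
	int ret = toy_zeroout_once(d);

	if (ret)
		ret = toy_zeroout_once(d);
	return ret;
}
```

With a disk that advertises 33553920 but rejects WRITE SAME, the first
toy_zeroout_no_retry() call returns -EREMOTEIO and the second returns 0,
while a single toy_zeroout_retry() call returns 0 and leaves the limit
at 0, matching the two transcripts.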

diff --git a/block/blk-lib.c b/block/blk-lib.c
index f8c82a9..8e53474 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -360,6 +360,7 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 			 sector_t nr_sects, gfp_t gfp_mask, bool discard)
 {
 	int ret;
+	int pass = 0;
 	struct bio *bio = NULL;
 	struct blk_plug plug;
 
@@ -369,6 +370,7 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 			return 0;
 	}
 
+  retry_other_method:
 	blk_start_plug(&plug);
 	ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
 			&bio, discard);
@@ -378,6 +380,11 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 	}
 	blk_finish_plug(&plug);
 
+	if (ret && pass++ == 0) {
+		bio = NULL;
+		goto retry_other_method;
+	}
+
 	return ret;
 }
 EXPORT_SYMBOL(blkdev_issue_zeroout);
-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/16] dm: always defer request allocation to the owner of the request_queue

2017-01-25 Thread Junichi Nomura
On 01/25/17 01:39, Mike Snitzer wrote:
> On Tue, Jan 24 2017 at  9:20am -0500, Christoph Hellwig  wrote:
>> On Tue, Jan 24, 2017 at 05:05:39AM -0500, Mike Snitzer wrote:
>>> possible and is welcomed cleanup.  The only concern I have is that using
>>> get_request() for the old request_fn request_queue eliminates the
>>> guaranteed availability of requests to allow for forward progress (on
>>> path failure or for the purposes of swap over mpath, etc).  This isn't a
>>> concern for blk-mq because as you know we have a fixed set of tags (and
>>> associated preallocated requests).
>>>
>>> So I'm left unconvinced old request_fn request-based DM multipath isn't
>>> regressing in how request resubmission can be assured a request will be
>>> available when needed on retry in the face of path failure.
>>
>> Mempools only need a size where we can make guaranteed requests, so for
>> get_request based drivers under dm the theoretical minimum size would be
>> one as we never rely on a second request to finish the first one,
>> and each request_queue has its own mempool(s) to start with.
> 
> Fair enough.  Cc'ing Junichi just in case he sees anything we're
> missing.

DM multipath could not use blk_get_request() because the function
could not be called from an interrupt-disabled context such as request_fn.

However, since the current code no longer calls blk_get_request()
from such a context, the change should be ok.
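
For reference, the point quoted above about a minimum pool size of one
can be illustrated with a toy reserve pool. This is only a userspace
sketch under stated assumptions, not the kernel's mempool: toy_pool and
the toy_pool_* functions are hypothetical names, and where a real pool
would wait for an element to be freed, this one simply returns NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical one-element reserve pool; not the kernel's mempool_t. */
struct toy_pool {
	void *reserve;        /* the single preallocated element, or NULL */
	size_t elem_size;
	bool allocator_fails; /* set to simulate memory pressure */
};

/* Preallocate the one guaranteed element. Returns 0 on success. */
static int toy_pool_init(struct toy_pool *p, size_t elem_size)
{
	assert(p != NULL);
	p->elem_size = elem_size;
	p->allocator_fails = false;
	p->reserve = malloc(elem_size);
	return p->reserve ? 0 : -1;
}

/* Try the normal allocator first, then fall back to the reserve. */
static void *toy_pool_alloc(struct toy_pool *p)
{
	void *elem = p->allocator_fails ? NULL : malloc(p->elem_size);

	if (!elem) {
		elem = p->reserve;
		p->reserve = NULL;
	}
	return elem; /* NULL if the reserve is in use; a real pool would wait */
}

/* Refill the reserve before handing memory back to the allocator. */
static void toy_pool_free(struct toy_pool *p, void *elem)
{
	if (!p->reserve)
		p->reserve = elem;
	else
		free(elem);
}
```

Under simulated memory pressure, one allocation still succeeds from the
reserve, and the next one succeeds as soon as the first element is
freed. That is enough for forward progress as long as finishing one
request never depends on obtaining a second one.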

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.