Re: [PATCH 5/8] nowait aio: return on congested block device

2017-03-16 Thread Ming Lei
On Thu, Mar 16, 2017 at 10:33 PM, Jens Axboe wrote: > On 03/15/2017 03:51 PM, Goldwyn Rodrigues wrote: >> diff --git a/block/blk-core.c b/block/blk-core.c >> index 0eeb99e..2e5cba2 100644 >> --- a/block/blk-core.c >> +++ b/block/blk-core.c >> @@ -2014,7 +2019,7 @@ blk_qc_t

Re: [PATCH 1/2] blk-mq: don't complete un-started request in timeout handler

2017-03-16 Thread Ming Lei
On Fri, Mar 17, 2017 at 5:35 AM, Bart Van Assche wrote: > On Thu, 2017-03-16 at 08:07 +0800, Ming Lei wrote: >> > * Check whether REQ_ATOM_STARTED has been set. >> > * Check whether REQ_ATOM_COMPLETE has not yet been set. >> > * If both conditions have been met, set

Re: [PATCH 1/2] blk-mq: don't complete un-started request in timeout handler

2017-03-16 Thread Bart Van Assche
On Thu, 2017-03-09 at 21:02 +0800, Ming Lei wrote: > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 159187a28d66..0aff380099d5 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -697,17 +697,8 @@ static void blk_mq_check_expired(struct blk_mq_hw_ctx > *hctx, > { > struct

Re: [PATCH 1/2] blk-mq: don't complete un-started request in timeout handler

2017-03-16 Thread Bart Van Assche
On Thu, 2017-03-16 at 08:07 +0800, Ming Lei wrote: > > * Check whether REQ_ATOM_STARTED has been set. > > * Check whether REQ_ATOM_COMPLETE has not yet been set. > > * If both conditions have been met, set REQ_ATOM_COMPLETE. > > > > I don't think there is another solution than using a single
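
The check-and-claim ordering described above boils down to one atomic test-and-set; an illustrative model follows (not the actual blk_mq_check_expired() body; the flag and field names follow 2017-era blk-mq):

    static bool timeout_claim_request(struct request *rq)
    {
            /* A request that was never started cannot have timed out. */
            if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
                    return false;

            /*
             * test_and_set_bit() folds "check COMPLETE" and "set COMPLETE"
             * into one atomic step, so a racing normal completion cannot
             * complete the request twice.
             */
            if (test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags))
                    return false;   /* the completion path got there first */

            return true;            /* the timeout path now owns the request */
    }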

Re: [PATCH 5/8] nowait aio: return on congested block device

2017-03-16 Thread Dave Chinner
On Wed, Mar 15, 2017 at 04:51:04PM -0500, Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues > > A new flag BIO_NOWAIT is introduced to identify bios > originating from an iocb with IOCB_NOWAIT. This flag indicates > that the block layer should return immediately if a request cannot be made instead > of
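
A minimal sketch of how such a flag flows from the iocb into the block layer (BIO_NOWAIT is the patch's new flag; bio_set_flag(), bio_flagged() and blk_queue_enter()'s nowait argument are existing 4.11-era interfaces; the error-path details are assumptions, not the patch's exact code):

    /* Submission side: propagate IOCB_NOWAIT onto the bio. */
    if (iocb->ki_flags & IOCB_NOWAIT)
            bio_set_flag(bio, BIO_NOWAIT);

    /* generic_make_request(): fail fast instead of sleeping when the
     * request queue cannot take the bio right now. */
    if (blk_queue_enter(q, bio_flagged(bio, BIO_NOWAIT)) < 0) {
            bio->bi_error = -EAGAIN;
            bio_endio(bio);
            return BLK_QC_T_NONE;
    }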

[PATCH v3 00/14] md: cleanup on direct access to bvec table

2017-03-16 Thread Ming Lei
In MD's resync I/O path, there are lots of direct accesses to the bio's bvec table. This patchset kills almost all of them, and the conversion is quite straightforward. One root cause of the direct access is that resync I/O uses the bio's bvecs to manage its pages. In V1, as suggested by Shaohua, a new
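
The replacement bookkeeping amounts to one small structure per resync bio; a sketch under the usual 64KiB-resync-block / 4KiB-page assumption (the names follow the helpers introduced later in the series):

    #define RESYNC_BLOCK_SIZE   (64 * 1024)
    #define RESYNC_PAGES        (RESYNC_BLOCK_SIZE / PAGE_SIZE)  /* 16 with 4K pages */

    /* One of these hangs off each resync bio instead of abusing bi_io_vec. */
    struct resync_pages {
            void         *raid_bio;              /* owning r1_bio / r10_bio */
            struct page  *pages[RESYNC_PAGES];
    };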

[PATCH v3 06/14] md: raid1: retrieve page from pre-allocated resync page array

2017-03-16 Thread Ming Lei
Now one page array is allocated for each resync bio, and we can retrieve each page from this array directly. Signed-off-by: Ming Lei --- drivers/md/raid1.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/md/raid1.c
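
With the array in place, the lookup reduces to an index; a sketch of the helper shape (the patch's exact signature may differ):

    static inline struct page *resync_fetch_page(struct resync_pages *rp,
                                                 unsigned int idx)
    {
            if (WARN_ON_ONCE(idx >= RESYNC_PAGES))
                    return NULL;
            return rp->pages[idx];  /* no peeking into bio->bi_io_vec */
    }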

[PATCH v3 04/14] md: raid1: simplify r1buf_pool_free()

2017-03-16 Thread Ming Lei
This patch takes a reference on each page of each resync bio, so r1buf_pool_free() gets simplified a lot. The same policy is applied to raid10's buf pool allocation/free too. Signed-off-by: Ming Lei --- drivers/md/raid1.c | 15 +++ 1 file changed, 7
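
The simplification rests on a uniform refcount rule, roughly (illustrative, not the patch itself):

    /* Alloc side: when another resync bio shares the first bio's pages,
     * take one extra reference per page instead of tracking ownership. */
    get_page(page);

    /* Free side: every bio drops one reference per page, unconditionally;
     * the final put_page() is what actually frees the page. */
    put_page(rp->pages[i]);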

[PATCH v3 07/14] md: raid1: use bio helper in process_checks()

2017-03-16 Thread Ming Lei
Avoid direct access to the bvec table. Signed-off-by: Ming Lei --- drivers/md/raid1.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index b1345aa4efd8..4034a2963da8 100644 --- a/drivers/md/raid1.c

[PATCH v3 13/14] md: raid10: retrieve page from preallocated resync page array

2017-03-16 Thread Ming Lei
Now one page array is allocated for each resync bio, and we can retrieve each page from this array directly. Signed-off-by: Ming Lei --- drivers/md/raid10.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid10.c

[PATCH v3 09/14] md: raid1: move 'offset' out of loop

2017-03-16 Thread Ming Lei
The 'offset' local variable never changes inside the loop, so move it out. Signed-off-by: Ming Lei --- drivers/md/raid1.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 4034a2963da8..2f3622c695ce
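
The transformation is plain loop-invariant hoisting; an illustrative before/after (copy_chunk() and the surrounding variables are placeholders, not the raid1 code):

    /* before: recomputed every iteration although it never changes */
    for (i = 0; i < vcnt; i++) {
            size_t offset = r1_bio->sector * 512;
            copy_chunk(dst + offset, i);
    }

    /* after: computed once, outside the loop */
    size_t offset = r1_bio->sector * 512;
    for (i = 0; i < vcnt; i++)
            copy_chunk(dst + offset, i);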

[PATCH v3 08/14] block: introduce bio_copy_data_partial

2017-03-16 Thread Ming Lei
It turns out we can use bio_copy_data() in raid1's write behind, and we can make alloc_behind_pages() cleaner and more efficient, but we need a partial version of bio_copy_data(). Signed-off-by: Ming Lei --- block/bio.c | 60
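
The new interface is bio_copy_data() restricted to a window; a sketch of the prototype (the argument names are assumed, not taken from the patch):

    /*
     * Like bio_copy_data(), but copy only @size bytes of @src starting
     * @offset bytes in -- what raid1 write-behind needs when the master
     * bio holds more data than one behind bio should carry.
     */
    void bio_copy_data_partial(struct bio *dst, struct bio *src,
                               int offset, int size);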

[PATCH v3 12/14] md: raid10: don't use bio's vec table to manage resync pages

2017-03-16 Thread Ming Lei
Now we allocate one page array for managing resync pages, instead of using the bio's vec table to do that; the old way is very hacky and won't work any more once multipage bvec is enabled. The introduced cost is that we need to allocate (128 + 16) * copies bytes per r10_bio, which is fine because
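
The quoted figure follows from the page-array layout, assuming 4KiB pages and 64-bit pointers:

    /* RESYNC_PAGES == 16, so the page-pointer array costs
     *      16 * sizeof(struct page *) == 16 * 8 == 128 bytes,
     * and the changelog books the struct's remaining fields at 16 bytes,
     * hence (128 + 16) * copies bytes per r10_bio. */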

[PATCH v3 10/14] md: raid1: improve write behind

2017-03-16 Thread Ming Lei
This patch improves handling of write behind in the following ways: - introduce a behind master bio to hold all write-behind pages - fast-clone bios from the behind master bio - avoid changing the bvec table directly - use bio_copy_data() and make the code cleaner Suggested-by: Shaohua Li
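
A rough sketch of the shape this takes (bio_alloc_bioset(), bio_add_page(), bio_copy_data() and bio_clone_fast() are existing helpers; vcnt, pages, master and bs are placeholders):

    /* One behind master bio owns all the write-behind pages ... */
    struct bio *behind = bio_alloc_bioset(GFP_NOIO, vcnt, bs);
    int i;

    for (i = 0; i < vcnt; i++)
            bio_add_page(behind, pages[i], PAGE_SIZE, 0);
    bio_copy_data(behind, master);          /* copy the payload once */

    /* ... and each write-behind device gets a cheap clone of it, so
     * nothing edits a bvec table by hand. */
    struct bio *mbio = bio_clone_fast(behind, GFP_NOIO, bs);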

[PATCH v3 14/14] md: raid10: avoid direct access to bvec table in handle_reshape_read_error

2017-03-16 Thread Ming Lei
All reshape I/O shares the pages from the 1st copy's device, so just use those pages to avoid direct access to the bvec table in handle_reshape_read_error(). Signed-off-by: Ming Lei --- drivers/md/raid10.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git

[PATCH v3 05/14] md: raid1: don't use bio's vec table to manage resync pages

2017-03-16 Thread Ming Lei
Now we allocate one page array for managing resync pages, instead of using the bio's vec table to do that; the old way is very hacky and won't work any more once multipage bvec is enabled. The introduced cost is that we need to allocate (128 + 16) * raid_disks bytes per r1_bio, which is fine

[PATCH v3 01/14] md: raid1/raid10: don't handle failure of bio_add_page()

2017-03-16 Thread Ming Lei
Every bio_add_page() call here adds one page into a resync bio, which is big enough to hold RESYNC_PAGES pages, and the current bio_add_page() doesn't check queue limits any more, so it won't fail at all. Signed-off-by: Ming Lei --- drivers/md/raid1.c | 21
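
Which is why the error-handling branches can simply go; schematically (illustrative, not the patch's exact code):

    bio = bio_alloc(GFP_KERNEL, RESYNC_PAGES);      /* room reserved up front */
    for (i = 0; i < RESYNC_PAGES; i++)
            /* never exceeds bi_max_vecs, and no queue-limit check is left,
             * so this always adds the full PAGE_SIZE */
            bio_add_page(bio, rp->pages[i], PAGE_SIZE, 0);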

Re: [PATCH 5/8] nowait aio: return on congested block device

2017-03-16 Thread Jens Axboe
On 03/15/2017 03:51 PM, Goldwyn Rodrigues wrote: > diff --git a/block/blk-core.c b/block/blk-core.c > index 0eeb99e..2e5cba2 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -2014,7 +2019,7 @@ blk_qc_t generic_make_request(struct bio *bio) > do { > struct

Re: [PATCH 3/8] nowait aio: return if direct write will trigger writeback

2017-03-16 Thread Goldwyn Rodrigues
On 03/16/2017 08:08 AM, Matthew Wilcox wrote: > On Wed, Mar 15, 2017 at 04:51:02PM -0500, Goldwyn Rodrigues wrote: >> This introduces a new function filemap_range_has_page() which >> returns true if the file's mapping has a page within the range >> mentioned. > > I thought you were going to

Re: [PATCH 3/8] nowait aio: return if direct write will trigger writeback

2017-03-16 Thread Goldwyn Rodrigues
On 03/16/2017 08:20 AM, Matthew Wilcox wrote: > On Wed, Mar 15, 2017 at 04:51:02PM -0500, Goldwyn Rodrigues wrote: >> From: Goldwyn Rodrigues >> >> Find out if the write will trigger a wait due to writeback. If yes, >> return -EAGAIN. >> >> This introduces a new function

Re: [PATCH 3/8] nowait aio: return if direct write will trigger writeback

2017-03-16 Thread Matthew Wilcox
On Wed, Mar 15, 2017 at 04:51:02PM -0500, Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues > > Find out if the write will trigger a wait due to writeback. If yes, > return -EAGAIN. > > This introduces a new function filemap_range_has_page() which > returns true if the

Re: [PATCH 3/8] nowait aio: return if direct write will trigger writeback

2017-03-16 Thread Matthew Wilcox
On Wed, Mar 15, 2017 at 04:51:02PM -0500, Goldwyn Rodrigues wrote: > This introduces a new function filemap_range_has_page() which > returns true if the file's mapping has a page within the range > mentioned. I thought you were going to replace this patch with one that starts writeback for these
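
For reference, the helper under discussion can be sketched with a single pagevec lookup (consistent with the description above, though not necessarily the patch's exact code):

    bool filemap_range_has_page(struct address_space *mapping,
                                loff_t start_byte, loff_t end_byte)
    {
            pgoff_t index = start_byte >> PAGE_SHIFT;
            pgoff_t end = end_byte >> PAGE_SHIFT;
            struct pagevec pvec;
            bool ret;

            if (end_byte < start_byte || mapping->nrpages == 0)
                    return false;

            pagevec_init(&pvec, 0);
            if (!pagevec_lookup(&pvec, mapping, index, 1))
                    return false;
            /* lookup starts at 'index'; a hit past 'end' is outside the range */
            ret = (pvec.pages[0]->index <= end);
            pagevec_release(&pvec);
            return ret;
    }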