[PATCH V2 6/7] blk-mq: don't allocate driver tag beforehand for flush rq

2017-09-15 Thread Ming Lei
The idea behind this is simple: 1) for the none scheduler, a driver tag has to be borrowed for the flush rq, otherwise we may run out of tags and cause an IO hang. get/put driver tag is actually a nop, so reordering tags isn't necessary at all. 2) for a real I/O scheduler, we needn't allocate driver tag

[PATCH V2 7/7] blk-mq-sched: warning on inserting a req with driver tag allocated

2017-09-15 Thread Ming Lei
In case of an IO scheduler, no request should have a tag assigned before dispatching, so add a warning to catch possible bugs. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq-sched.c

[PATCH V2 5/7] blk-mq: move blk_mq_put_driver_tag*() into blk-mq.h

2017-09-15 Thread Ming Lei
We need this helper to put the tag for flush rq, since we will not share the tag in the flush request sequences in case of I/O scheduler. Also the driver tag needs to be released before requeuing. Signed-off-by: Ming Lei --- block/blk-mq.c | 32

[PATCH V2 4/7] blk-mq: decide how to handle flush rq via RQF_FLUSH_SEQ

2017-09-15 Thread Ming Lei
Now we always preallocate one driver tag before blk_insert_flush(), and flush request will be marked as RQF_FLUSH_SEQ once it is in flush machinery. So if RQF_FLUSH_SEQ isn't set, we call blk_insert_flush() to handle the request, otherwise the flush request is dispatched to ->dispatch list

[PATCH V2 2/7] block: pass 'run_queue' to blk_mq_request_bypass_insert

2017-09-15 Thread Ming Lei
Block flush needs this function without running the queue, so introduce the parameter. Signed-off-by: Ming Lei --- block/blk-core.c | 2 +- block/blk-mq.c | 5 +++-- block/blk-mq.h | 2 +- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/block/blk-core.c

[PATCH V2 3/7] blk-flush: use blk_mq_request_bypass_insert()

2017-09-15 Thread Ming Lei
In the following patch, we will use RQF_FLUSH_SEQ to decide: - if the flag isn't set, the flush rq needs to be inserted via blk_insert_flush() - otherwise, the flush rq needs to be dispatched directly since it is in flush machinery now. So we use blk_mq_request_bypass_insert() for requests of

[PATCH V2 0/7] blk-mq: don't allocate driver tag beforehand for flush rq

2017-09-15 Thread Ming Lei
Hi, This patchset avoids allocating a driver tag beforehand for flush rq in case of I/O scheduler, so flush rq isn't treated specially wrt. get/put driver tag, and the code gets cleaned up a lot: for example, reorder_tags_to_front() is removed, and we needn't worry about request order in dispatch list for

[PATCH V2 1/7] blk-flush: don't run queue for requests of bypassing flush

2017-09-15 Thread Ming Lei
blk_insert_flush() should only insert the request, since a run queue always follows it. For the case of bypassing flush, we don't need to run the queue since every blk_insert_flush() is followed by one run queue. Signed-off-by: Ming Lei --- block/blk-flush.c | 2 +- 1 file changed, 1

Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code

2017-09-15 Thread Bart Van Assche
On Sat, 2017-09-16 at 07:35 +0900, Damien Le Moal wrote: > rw16 is mandatory for ZBC drives. So it has to be set to true. If the > HBA does not support rw16 (why would that happen ?), then the disk > should not be used. It's good that all HBAs support rw16. But it's nontrivial to analyze whether

Re: [PATCH 0/6] blk-mq: don't allocate driver tag beforehand for flush rq

2017-09-15 Thread Ming Lei
On Fri, Sep 15, 2017 at 12:06:41PM -0600, Jens Axboe wrote: > On 09/15/2017 08:29 AM, Jens Axboe wrote: > > On 09/14/2017 08:20 PM, Ming Lei wrote: > >> On Thu, Sep 14, 2017 at 12:51:24PM -0600, Jens Axboe wrote: > >>> On 09/14/2017 10:42 AM, Ming Lei wrote: > Hi, > > This patchset

Re: [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation

2017-09-15 Thread Damien Le Moal
On 9/15/17 19:44, Hannes Reinecke wrote: > On 09/15/2017 12:06 PM, Damien Le Moal wrote: >> Fix comments style (do not use documented comment style) and add some >> comments to clarify some functions. Also fix some functions signature >> indentation and remove a useless blank line in

Re: [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions

2017-09-15 Thread Damien Le Moal
On 9/16/17 02:45, Christoph Hellwig wrote: > On Fri, Sep 15, 2017 at 07:06:34PM +0900, Damien Le Moal wrote: >> __blk_mq_debugfs_rq_show() and blk_mq_debugfs_rq_show() are exported >> symbols but ar eonly declared in the block internal file >> block/blk-mq-debugfs.h. which is not cleanly

Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code

2017-09-15 Thread Damien Le Moal
On 9/16/17 06:02, Bart Van Assche wrote: > On Fri, 2017-09-15 at 19:51 +0200, h...@lst.de wrote: >> On Fri, Sep 15, 2017 at 02:51:03PM +, Bart Van Assche wrote: >>> On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw

Re: [PATCH 5/5] dm-mpath: improve I/O schedule

2017-09-15 Thread Bart Van Assche
On Sat, 2017-09-16 at 00:44 +0800, Ming Lei wrote: > +static void save_path_queue_depth(struct pgpath *p) > +{ > + struct request_queue *q = bdev_get_queue(p->path.dev->bdev); > + > + p->old_nr_requests = q->nr_requests; > + p->queue_depth = q->queue_depth; > + > + /* one extra

Re: [PATCH 5/5] dm-mpath: improve I/O schedule

2017-09-15 Thread Bart Van Assche
On Sat, 2017-09-16 at 00:44 +0800, Ming Lei wrote: > 1) lpfc.lpfc_lun_queue_depth=3, so that it is same with .cmd_per_lun Nobody I know uses such a low queue depth for the lpfc driver. Please also include performance results for a more realistic queue depth. Thanks, Bart.

Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code

2017-09-15 Thread Bart Van Assche
On Fri, 2017-09-15 at 19:51 +0200, h...@lst.de wrote: > On Fri, Sep 15, 2017 at 02:51:03PM +, Bart Van Assche wrote: > > On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: > > > Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw > > > assignments and move the calculation

Re: [PATCH 5/5] dm-mpath: improve I/O schedule

2017-09-15 Thread Bart Van Assche
On Sat, 2017-09-16 at 00:44 +0800, Ming Lei wrote: > --- > |v4.13+ |v4.13+ > |+scsi_mq_perf |+scsi_mq_perf+patches > - > IOPS(K) |MQ-DEADLINE|MQ-DEADLINE >

Re: [PATCH 2/5] dm-mpath: return DM_MAPIO_REQUEUE in case of rq allocation failure

2017-09-15 Thread Bart Van Assche
On Fri, 2017-09-15 at 16:06 -0400, Mike Snitzer wrote: > The problem is that multipath_clone_and_map() is now treated as common > code (thanks to both blk-mq and old .request_fn now enjoying the use of > blk_get_request) BUT: Ming please understand that this code is used by > old .request_fn too.

Re: [PATCH 5/5] dm-mpath: improve I/O schedule

2017-09-15 Thread Mike Snitzer
On Fri, Sep 15 2017 at 12:44pm -0400, Ming Lei wrote: > The actual I/O schedule is done in dm-mpath layer, and the > underlying I/O schedule is simply bypassed. > > This patch sets underlying queue's nr_requests as its queue's > queue_depth, then we can get its queue busy

Re: [PATCH 2/5] dm-mpath: return DM_MAPIO_REQUEUE in case of rq allocation failure

2017-09-15 Thread Mike Snitzer
On Fri, Sep 15 2017 at 1:29pm -0400, Bart Van Assche wrote: > On Sat, 2017-09-16 at 00:44 +0800, Ming Lei wrote: > > blk-mq will rerun queue via RESTART after one request is completion, > > so not necessary to wait random time for requeuing, it should trust > > blk-mq to

Re: [PATCH 00/10] nvme multipath support on top of nvme-4.13 branch

2017-09-15 Thread Christoph Hellwig
Hi Anish, I looked over the code a bit, and I'm rather confused by the newly added commands. Which controller supports them? Also the NVMe working group went down a very different way with the ALUA approach, which uses different grouping concepts and doesn't require path activations - for Linux

Re: [PATCH 0/6] blk-mq: don't allocate driver tag beforehand for flush rq

2017-09-15 Thread Jens Axboe
On 09/15/2017 08:29 AM, Jens Axboe wrote: > On 09/14/2017 08:20 PM, Ming Lei wrote: >> On Thu, Sep 14, 2017 at 12:51:24PM -0600, Jens Axboe wrote: >>> On 09/14/2017 10:42 AM, Ming Lei wrote: Hi, This patchset avoids to allocate driver tag beforehand for flush rq in case of I/O

Re: [PATCH 1/5] block: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-15 Thread Bart Van Assche
On Sat, 2017-09-16 at 00:44 +0800, Ming Lei wrote: > If .queue_rq() returns BLK_STS_RESOURCE, blk-mq will rerun > the queue in the three situations: > > 1) if BLK_MQ_S_SCHED_RESTART is set > - queue is rerun after one rq is completed, see blk_mq_sched_restart() > which is run from

Re: [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()

2017-09-15 Thread Christoph Hellwig
Looks fine, Reviewed-by: Christoph Hellwig

Re: [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros

2017-09-15 Thread Christoph Hellwig
Looks fine, Reviewed-by: Christoph Hellwig

Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code

2017-09-15 Thread h...@lst.de
On Fri, Sep 15, 2017 at 02:51:03PM +, Bart Van Assche wrote: > On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: > > Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw > > assignments and move the calculation of sdkp->zone_shift together > > with the assignment of the

Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code

2017-09-15 Thread Christoph Hellwig
Looks fine, Reviewed-by: Christoph Hellwig

Re: [PATCH V3 04/12] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h

2017-09-15 Thread Christoph Hellwig
Looks fine, Reviewed-by: Christoph Hellwig

Re: [PATCH V3 03/12] block: Add zoned block device information to request queue

2017-09-15 Thread Christoph Hellwig
> +struct blk_zoned { > + unsigned int nr_zones; > + unsigned long *seq_zones; > +}; > + > struct blk_zone_report_hdr { > unsigned int nr_zones; > u8 padding[60]; > @@ -492,6 +497,10 @@ struct request_queue { > struct blk_integrity integrity; >

Re: [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions

2017-09-15 Thread Christoph Hellwig
Same as for patch 1: this should stay local to block/ - we don't want random drivers to grow I/O schedulers.

Re: [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions

2017-09-15 Thread Christoph Hellwig
On Fri, Sep 15, 2017 at 07:06:34PM +0900, Damien Le Moal wrote: > __blk_mq_debugfs_rq_show() and blk_mq_debugfs_rq_show() are exported > symbols but ar eonly declared in the block internal file > block/blk-mq-debugfs.h. which is not cleanly accessible to files outside > of the block directory. >

Re: [PATCH 2/5] dm-mpath: return DM_MAPIO_REQUEUE in case of rq allocation failure

2017-09-15 Thread Bart Van Assche
On Sat, 2017-09-16 at 00:44 +0800, Ming Lei wrote: > blk-mq will rerun queue via RESTART after one request is completion, > so not necessary to wait random time for requeuing, it should trust > blk-mq to do it. > > Signed-off-by: Ming Lei > --- > drivers/md/dm-mpath.c | 2

[PATCH 3/5] dm-mpath: remove annoying message of 'blk_get_request() returned -11'

2017-09-15 Thread Ming Lei
It is very normal to see allocation failure, so not necessary to dump it and annoy people. Signed-off-by: Ming Lei --- drivers/md/dm-mpath.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index f5a1088a6e79..f57ad8621c4c

[PATCH 5/5] dm-mpath: improve I/O schedule

2017-09-15 Thread Ming Lei
The actual I/O schedule is done in dm-mpath layer, and the underlying I/O schedule is simply bypassed. This patch sets underlying queue's nr_requests as its queue's queue_depth, then we can get its queue busy feedback by simply checking if blk_get_request() returns successfully. In this way,

[PATCH 4/5] block: export blk_update_nr_requests

2017-09-15 Thread Ming Lei
dm-mpath needs this API for improving IO scheduling. The IO schedule is actually done on dm-rq(mpath) queue, instead of underlying devices. If we set q->nr_requests as q->queue_depth on underlying devices, we can get the queue's busy feedback by simply checking if blk_get_request() returns

[PATCH 2/5] dm-mpath: return DM_MAPIO_REQUEUE in case of rq allocation failure

2017-09-15 Thread Ming Lei
blk-mq will rerun the queue via RESTART after one request completes, so it is not necessary to wait a random time before requeuing; we should trust blk-mq to do it. Signed-off-by: Ming Lei --- drivers/md/dm-mpath.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[PATCH 0/5] dm-mpath: improve I/O schedule

2017-09-15 Thread Ming Lei
Hi, We depend on I/O scheduler in dm-mpath layer, and underlying I/O scheduler is bypassed basically. I/O scheduler depends on queue busy condition to trigger I/O merge, unfortunately inside dm-mpath, the underlying queue busy feedback is not accurate enough, and we just allocate one request and

Re: [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros

2017-09-15 Thread Bart Van Assche
On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: > instead of open coding, use the min() macro to calculate a report zones > reply buffer length in sd_zbc_check_zone_size() and the round_up() > macro for calculating the number of zones in sd_zbc_setup(). Reviewed-by: Bart Van Assche

Re: [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones

2017-09-15 Thread Bart Van Assche
On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: > + * There is no write constraints on conventional zones. So any write ^^^ Should this have been "There are no"? > - if (sdkp->zones_wlock && > - test_and_set_bit(zno, sdkp->zones_wlock)) > + if

Re: [PATCH V3 1/4] kthread: add a mechanism to store cgroup info

2017-09-15 Thread Tejun Heo
On Thu, Sep 14, 2017 at 02:02:04PM -0700, Shaohua Li wrote: > From: Shaohua Li > > kthread usually runs jobs on behalf of other threads. The jobs should be > charged to cgroup of original threads. But the jobs run in a kthread, > where we lose the cgroup context of original threads.

Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code

2017-09-15 Thread Bart Van Assche
On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: > Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw > assignments and move the calculation of sdkp->zone_shift together > with the assignment of the verified zone_blocks value in > sd_zbc_check_zone_size(). Both functions

Re: [PATCH V3 03/12] block: Add zoned block device information to request queue

2017-09-15 Thread Bart Van Assche
On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: > @@ -492,6 +497,10 @@ struct request_queue { > struct blk_integrity integrity; > #endif /* CONFIG_BLK_DEV_INTEGRITY */ > > +#ifdef CONFIG_BLK_DEV_ZONED > + struct blk_zoned zoned; > +#endif > + > #ifdef CONFIG_PM

Re: [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions

2017-09-15 Thread Bart Van Assche
On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: > The functions blk_mq_sched_free_hctx_data(), blk_mq_sched_try_merge(), > blk_mq_sched_try_insert_merge() and blk_mq_sched_request_inserted() are > all exported symbols but are declared only internally in > block/blk-mq-sched.h. Move these

Re: [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions

2017-09-15 Thread Bart Van Assche
On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote: > __blk_mq_debugfs_rq_show() and blk_mq_debugfs_rq_show() are exported > symbols but ar eonly declared in the block internal file are only? > block/blk-mq-debugfs.h. which is not cleanly accessible to

Re: [PATCH 0/6] blk-mq: don't allocate driver tag beforehand for flush rq

2017-09-15 Thread Jens Axboe
On 09/14/2017 08:20 PM, Ming Lei wrote: > On Thu, Sep 14, 2017 at 12:51:24PM -0600, Jens Axboe wrote: >> On 09/14/2017 10:42 AM, Ming Lei wrote: >>> Hi, >>> >>> This patchset avoids to allocate driver tag beforehand for flush rq >>> in case of I/O scheduler, then flush rq isn't treated specially

Re: [PATCH v2 0/2] Add wrapper for blkcg policy operatins

2017-09-15 Thread weiping zhang
On Fri, Sep 01, 2017 at 10:16:45PM +0800, weiping zhang wrote: > The first patch is the V2 of [PATCH] blkcg: check pol->cpd_free_fn > before free cpd, it fixes a checking before free cpd. > > The second patch add some wrappers for struct blkcg_policy->xxx_fn, because > not > every block cgroup

[PATCH] lightnvm: pblk: remove redundant check on read path

2017-09-15 Thread Javier González
A partial read I/O in pblk is an I/O where some sectors reside in the write buffer in main memory and some are persisted on the device. Such an I/O must at least contain 2 lbas, therefore checking for the case where a single lba is mapped is not necessary. Signed-off-by: Javier González

[PATCH] lightnvm: pblk: check lba sanity on read path

2017-09-15 Thread Javier González
As part of pblk's recovery scheme, we store the lba mapped to each physical sector on the device's out-of-bound (OOB) area. On the read path, we can use this information to validate that the data being delivered to the upper layers corresponds to the lba being requested. The cost of this check is
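The check described above can be pictured with a small userspace sketch. This is not pblk's code: the function name `check_read_lbas`, the flat array of OOB lbas, and the assumption that a read covers consecutive lbas are all illustrative choices made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compare the lba recorded in each sector's OOB metadata with the lba the
 * caller asked for. Assumes (for illustration only) that the request covers
 * nr consecutive lbas starting at first_lba.
 *
 * Returns -1 if every sector matches, otherwise the index of the first
 * mismatching sector.
 */
static long check_read_lbas(const uint64_t *oob_lbas, uint64_t first_lba,
			    size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (oob_lbas[i] != first_lba + i)
			return (long)i;
	}
	return -1;
}
```

A caller would run this after the device read completes and before handing data to the upper layers, treating any non-negative return value as a mapping inconsistency.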

[PATCH] lightnvm: pblk: remove I/O dependency on write path

2017-09-15 Thread Javier González
pblk schedules user I/O, metadata I/O and erases on the write path in order to minimize collisions at the media level. Until now, there has been a dependency between user and metadata I/Os that could lead to a deadlock as both take the per-LUN semaphore to schedule submission. This patch removes

[PATCH] lightnvm: pblk: ensure right bad block calculation

2017-09-15 Thread Javier González
Make sure that the variable controlling block threshold for allocating extra metadata sectors in case of a line with bad blocks does not get a negative value. Otherwise, the line will be marked as corrupted and wasted. Signed-off-by: Javier González ---

[PATCH] lightnvm: pblk: enable 1 LUN configuration

2017-09-15 Thread Javier González
Metadata I/Os are scheduled to minimize their impact on user data I/Os. When there are enough LUNs instantiated (i.e., enough bandwidth), it is easy to interleave metadata and data one after the other so that metadata I/Os are the ones being blocked and not vice versa. We do this by calculating

[PATCH] lightnvm: pblk: guarantee line integrity on reads

2017-09-15 Thread Javier González
When a line is recycled during garbage collection, reads can still be issued to the line. If the line is freed in the middle of this process, data corruption might occur. This patch guarantees that lines are not freed in the middle of reads that target them (lines). Specifically, we use the

[PATCH 01/11] lightnvm: pblk: use constant for GC max inflight

2017-09-15 Thread Javier González
Use a constant to set the maximum number of inflight GC requests allowed. Signed-off-by: Javier González --- drivers/lightnvm/pblk-gc.c | 4 ++-- drivers/lightnvm/pblk.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/lightnvm/pblk-gc.c

[PATCH 02/11] lightnvm: pblk: normalize ppa namings

2017-09-15 Thread Javier González
Normalize the way we name ppa variables to improve code readability. Signed-off-by: Javier González --- drivers/lightnvm/pblk-core.c | 48 +++- 1 file changed, 25 insertions(+), 23 deletions(-) diff --git

[PATCH 03/11] lightnvm: pblk: refactor read lba sanity check

2017-09-15 Thread Javier González
Refactor lba sanity check on read path to avoid code duplication. Signed-off-by: Javier González --- drivers/lightnvm/pblk-read.c | 29 ++--- 1 file changed, 10 insertions(+), 19 deletions(-) diff --git a/drivers/lightnvm/pblk-read.c

[PATCH 04/11] lightnvm: pblk: simplify data validity check on GC

2017-09-15 Thread Javier González
When a line is selected for recycling by the garbage collector (GC), the line state changes and the invalid bitmap is frozen, preventing invalidations from happening. Throughout the GC, the L2P map is checked to verify that no data being recycled has been updated. The last check is done before

[PATCH 06/11] lightnvm: pblk: put bio on bio completion

2017-09-15 Thread Javier González
Simplify put bio by doing it on bio end_io instead of manually putting it on the completion path. Signed-off-by: Javier González --- drivers/lightnvm/pblk-core.c | 10 +++--- drivers/lightnvm/pblk-read.c | 1 - drivers/lightnvm/pblk-recovery.c | 1 -

[PATCH 05/11] lightnvm: pblk: refactor read path on GC

2017-09-15 Thread Javier González
Simplify the part of the garbage collector where data is read from the line being recycled and moved into an internal queue before being copied to the memory buffer. This allows to get rid of a dedicated function, which introduces an unnecessary dependency on the code. Signed-off-by: Javier

[PATCH 07/11] lightnvm: pblk: simplify path on REQ_PREFLUSH

2017-09-15 Thread Javier González
On REQ_PREFLUSH, directly tag the I/O context flags to signal a flush in the write to cache path, instead of finding the correct entry context and imposing a memory barrier. This simplifies the code and might potentially prevent race conditions when adding functionality to the write path.

[PATCH 09/11] lightnvm: pblk: improve naming for internal req.

2017-09-15 Thread Javier González
Each request type sent to the LightNVM subsystem requires different metadata. Until now, we have tailored this metadata based on write, read and erase commands. However, pblk uses different metadata for internal writes that do not hit the write buffer. Instead of abusing the metadata for reads,

[PATCH 10/11] lightnvm: pblk: refactor rqd alloc/free

2017-09-15 Thread Javier González
Refactor the rqd allocation and free functions so that all I/O types can use these helper functions. Signed-off-by: Javier González --- drivers/lightnvm/pblk-core.c | 40 ++-- drivers/lightnvm/pblk-read.c | 2 --

[PATCH 08/11] lightnvm: pblk: allocate bio size more accurately

2017-09-15 Thread Javier González
Wait until we know the exact number of ppas to be sent to the device, before allocating the bio. Signed-off-by: Javier González --- drivers/lightnvm/pblk-rb.c | 5 +++-- drivers/lightnvm/pblk-write.c | 20 ++-- drivers/lightnvm/pblk.h | 4 ++-- 3

[PATCH 11/11] lightnvm: pblk: use rqd->end_io for completion

2017-09-15 Thread Javier González
For consistency with the rest of pblk, use rqd->end_io to point to the function taking care of ending the request on the completion path. Signed-off-by: Javier González --- drivers/lightnvm/pblk-core.c | 7 --- drivers/lightnvm/pblk-read.c | 5 ++--- 2 files changed, 2

[PATCH 00/11] lightnvm: pblk: cleanup

2017-09-15 Thread Javier González
This patchset is a general cleanup to improve code readability. Javier González (11): lightnvm: pblk: use constant for GC max inflight lightnvm: pblk: normalize ppa namings lightnvm: pblk: refactor read lba sanity check lightnvm: pblk: simplify data validity check on GC lightnvm: pblk:

Re: [PATCH V3 12/12] block: Introduce zoned I/O scheduler

2017-09-15 Thread Hannes Reinecke
On 09/15/2017 12:06 PM, Damien Le Moal wrote: > The zoned I/O scheduler is mostly identical to mq-deadline and retains > the same configuration attributes. The main difference is that the > zoned scheduler will ensure that at any time at most one write request > per sequential zone is in flight

Re: [PATCH V3 11/12] scsi: sd_zbc: Disable zone write locking with scsi-mq

2017-09-15 Thread Hannes Reinecke
On 09/15/2017 12:06 PM, Damien Le Moal wrote: > In the case of a ZBC disk used with scsi-mq, zone write locking does > not prevent write reordering in sequential zones. Unlike the legacy > case, zone locking is done after the command request is removed from > the scheduler dispatch queue. That is,

Re: [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones

2017-09-15 Thread Hannes Reinecke
On 09/15/2017 12:06 PM, Damien Le Moal wrote: > Zoned block devices have no write constraints for conventional zones. > So write locking of conventional zones is not necessary and can even > hurt performance by unnecessarily operating the disk under low queue > depth. To avoid this, use the disk

Re: [PATCH V3 09/12] scsi: sd_zbc: Initialize device queue zoned structure

2017-09-15 Thread Hannes Reinecke
On 09/15/2017 12:06 PM, Damien Le Moal wrote: > Allocate and initialize the disk request queue zoned structure on disk > revalidate. As the bitmap allocation for the seq_zones field of the > zoned structure is identical to the allocation of the zones write lock > bitmap, introduce the helper

Re: [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()

2017-09-15 Thread Hannes Reinecke
On 09/15/2017 12:06 PM, Damien Le Moal wrote: > The three values starting at byte 8 of the Zoned Block Device > Characteristics VPD page B6h are 32 bits values, not 64bits. So use > get_unaligned_be32() to retrieve the values and not get_unaligned_be64() > > Fixes: 89d947561077 ("sd: Implement

Re: [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros

2017-09-15 Thread Hannes Reinecke
On 09/15/2017 12:06 PM, Damien Le Moal wrote: > instead of open coding, use the min() macro to calculate a report zones > reply buffer length in sd_zbc_check_zone_size() and the round_up() > macro for calculating the number of zones in sd_zbc_setup(). > > No functional change is introduced by

Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code

2017-09-15 Thread Hannes Reinecke
On 09/15/2017 12:06 PM, Damien Le Moal wrote: > Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw > assignments and move the calculation of sdkp->zone_shift together > with the assignment of the verified zone_blocks value in > sd_zbc_check_zone_size(). > > No functional change

Re: [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation

2017-09-15 Thread Hannes Reinecke
On 09/15/2017 12:06 PM, Damien Le Moal wrote: > Fix comments style (do not use documented comment style) and add some > comments to clarify some functions. Also fix some functions signature > indentation and remove a useless blank line in sd_zbc_read_zones(). > > No functional change is

[PATCH V3 12/12] block: Introduce zoned I/O scheduler

2017-09-15 Thread Damien Le Moal
The zoned I/O scheduler is mostly identical to mq-deadline and retains the same configuration attributes. The main difference is that the zoned scheduler will ensure that at any time at most one write request per sequential zone is in flight (has been dispatched to the disk) in order to protect

[PATCH V3 09/12] scsi: sd_zbc: Initialize device queue zoned structure

2017-09-15 Thread Damien Le Moal
Allocate and initialize the disk request queue zoned structure on disk revalidate. As the bitmap allocation for the seq_zones field of the zoned structure is identical to the allocation of the zones write lock bitmap, introduce the helper sd_zbc_alloc_zone_bitmap(). Using this helper, wait for the

[PATCH V3 11/12] scsi: sd_zbc: Disable zone write locking with scsi-mq

2017-09-15 Thread Damien Le Moal
In the case of a ZBC disk used with scsi-mq, zone write locking does not prevent write reordering in sequential zones. Unlike the legacy case, zone locking is done after the command request is removed from the scheduler dispatch queue. That is, at the time of zone locking, the write command may

[PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones

2017-09-15 Thread Damien Le Moal
Zoned block devices have no write constraints for conventional zones. So write locking of conventional zones is not necessary and can even hurt performance by unnecessarily operating the disk under low queue depth. To avoid this, use the disk request queue seq_zones bitmap to allow any write to be
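The idea of locking only sequential zones can be sketched in plain C. The bitmap names `seq_zones` and `zones_wlock` and the zone number `zno` follow the patch series; the helper functions themselves are hypothetical, and the test-and-set here is deliberately non-atomic, unlike the kernel's test_and_set_bit().

```c
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return true if the given bit is set in the bitmap. */
static bool bitmap_test(const unsigned long *map, unsigned int bit)
{
	return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* Set the bit and report its previous value (non-atomic illustration). */
static bool bitmap_test_and_set(unsigned long *map, unsigned int bit)
{
	unsigned long mask = 1UL << (bit % BITS_PER_WORD);
	unsigned long *word = &map[bit / BITS_PER_WORD];
	bool was_set = (*word & mask) != 0;

	*word |= mask;
	return was_set;
}

/*
 * Decide whether a write to zone zno may be dispatched now.
 * Conventional zones (bit clear in seq_zones) are never locked.
 * Sequential zones allow one write at a time via zones_wlock.
 */
static bool zone_write_may_dispatch(const unsigned long *seq_zones,
				    unsigned long *zones_wlock,
				    unsigned int zno)
{
	if (!bitmap_test(seq_zones, zno))
		return true;
	return !bitmap_test_and_set(zones_wlock, zno);
}

/* Release the zone write lock when a sequential-zone write completes. */
static void zone_write_done(const unsigned long *seq_zones,
			    unsigned long *zones_wlock, unsigned int zno)
{
	if (bitmap_test(seq_zones, zno))
		zones_wlock[zno / BITS_PER_WORD] &=
			~(1UL << (zno % BITS_PER_WORD));
}
```

With this shape, writes to conventional zones never touch the lock bitmap, so they can be queued at full depth, while a second write to a locked sequential zone is held back until the first completes.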

[PATCH V3 07/12] scsi: sd_zbc: Use well defined macros

2017-09-15 Thread Damien Le Moal
instead of open coding, use the min() macro to calculate a report zones reply buffer length in sd_zbc_check_zone_size() and the round_up() macro for calculating the number of zones in sd_zbc_setup(). No functional change is introduced by this patch. Signed-off-by: Damien Le Moal
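As a rough illustration of what replacing open-coded arithmetic with such macros looks like, here is a userspace sketch. `MIN` and `ROUND_UP_POW2` are stand-ins written for this example, not the kernel's min() and round_up(); the operands (capacity in blocks, zone size, a zone_shift equal to log2 of the zone size) are assumptions, since the listing does not show the actual expressions in sd_zbc.c.

```c
#include <stdint.h>

/* Stand-in for min(): evaluates its arguments more than once. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Round x up to a multiple of step; step must be a power of two. */
#define ROUND_UP_POW2(x, step) ((((x) - 1) | ((step) - 1)) + 1)

/*
 * Number of zones needed to cover capacity_blocks when each zone holds
 * zone_blocks blocks (a power of two, zone_blocks == 1 << zone_shift).
 * A partial last zone counts as a full zone.
 */
static uint64_t count_zones(uint64_t capacity_blocks, uint64_t zone_blocks,
			    unsigned int zone_shift)
{
	return ROUND_UP_POW2(capacity_blocks, zone_blocks) >> zone_shift;
}

/* Clamp a wanted reply buffer length to an upper bound. */
static uint32_t reply_buf_len(uint32_t wanted, uint32_t max_len)
{
	return MIN(wanted, max_len);
}
```

For example, a capacity of 1000 blocks with 256-block zones rounds up to 1024 blocks, which gives 4 zones.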

[PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()

2017-09-15 Thread Damien Le Moal
The three values starting at byte 8 of the Zoned Block Device Characteristics VPD page B6h are 32-bit values, not 64-bit. So use get_unaligned_be32() to retrieve the values and not get_unaligned_be64(). Fixes: 89d947561077 ("sd: Implement support for ZBC devices") Cc:
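To see why the field width matters, the following userspace sketch parses three consecutive big-endian 32-bit fields starting at byte 8 of a buffer. `read_be32` stands in for the kernel's get_unaligned_be32(); the struct and its field names are placeholders invented for this example, and the layout is taken only from the description above (three values at bytes 8, 12 and 16).

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value from a possibly unaligned pointer. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Placeholder container; the field names are illustrative only. */
struct zbc_vpd_values {
	uint32_t first;		/* bytes  8..11 */
	uint32_t second;	/* bytes 12..15 */
	uint32_t third;		/* bytes 16..19 */
};

/*
 * Parse the three 32-bit fields starting at byte 8.
 * Returns 0 on success, -1 if the buffer is shorter than 20 bytes.
 */
static int parse_zbc_vpd(const uint8_t *buf, size_t len,
			 struct zbc_vpd_values *out)
{
	if (len < 20)
		return -1;
	out->first = read_be32(&buf[8]);
	out->second = read_be32(&buf[12]);
	out->third = read_be32(&buf[16]);
	return 0;
}
```

Reading 64 bits at byte 8 instead would fold the first and second fields into a single number, and a 64-bit read at byte 16 would run past the third field.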

[PATCH V3 06/12] scsi: sd_zbc: Rearrange code

2017-09-15 Thread Damien Le Moal
Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw assignments and move the calculation of sdkp->zone_shift together with the assignment of the verified zone_blocks value in sd_zbc_check_zone_size(). No functional change is introduced by this patch. Signed-off-by: Damien Le Moal

[PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions

2017-09-15 Thread Damien Le Moal
__blk_mq_debugfs_rq_show() and blk_mq_debugfs_rq_show() are exported symbols but ar eonly declared in the block internal file block/blk-mq-debugfs.h. which is not cleanly accessible to files outside of the block directory. Move the declaration of these functions to the new file

[PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation

2017-09-15 Thread Damien Le Moal
Fix comments style (do not use documented comment style) and add some comments to clarify some functions. Also fix some functions signature indentation and remove a useless blank line in sd_zbc_read_zones(). No functional change is introduced by this patch. Signed-off-by: Damien Le Moal

[PATCH V3 03/12] block: Add zoned block device information to request queue

2017-09-15 Thread Damien Le Moal
Components relying only on the request_queue structure for managing and controlling block devices (e.g. I/O schedulers) have a limited view/knowledge of the device being controlled. For instance, the device capacity cannot be known easily, which for a zoned block device also results in the

[PATCH V3 04/12] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h

2017-09-15 Thread Damien Le Moal
Move standard macro definitions for the zone types and zone conditions to scsi_proto.h together with the definitions related to the REPORT ZONES command. While at it, define all values in the enums to be clear. Also remove unnecessary includes in sd_zbc.c. No functional change is introduced by

[PATCH V3 00/12] scsi-mq support for ZBC disks

2017-09-15 Thread Damien Le Moal
This series implements support for ZBC disks used through the scsi-mq I/O path. The current scsi level support of ZBC disks guarantees write request ordering using a per-zone write lock which prevents issuing multiple write commands to a zone simultaneously; doing so avoids reordering of

[PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions

2017-09-15 Thread Damien Le Moal
The functions blk_mq_sched_free_hctx_data(), blk_mq_sched_try_merge(), blk_mq_sched_try_insert_merge() and blk_mq_sched_request_inserted() are all exported symbols but are declared only internally in block/blk-mq-sched.h. Move these declarations to the new file include/linux/blk-mq-sched.h to make

[PATCH] block: consider merge of segments when merge bio into rq

2017-09-15 Thread Jianchao Wang
When accounting nr_phys_segments while merging bios into a rq, only segment merging within an individual bio is considered, not across all the bios in the rq. This leads to a bigger nr_phys_segments for the rq than the real one when the segments of bios in the rq are contiguous and mergeable. The nr_phys_segments of rq

[PATCH] block: move sanity checking ahead of bi_front/back_seg_size updating

2017-09-15 Thread Jianchao Wang
If bio_integrity_merge_rq() returns false or nr_phys_segments exceeds max_segments, the merging fails, but the bi_front/back_seg_size may have been modified. To avoid this, move the sanity checking ahead. Signed-off-by: Jianchao Wang --- block/blk-merge.c | 16
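The general pattern, checking everything that can fail before touching any state, can be sketched as follows. The struct, its fields and the `try_merge` function are hypothetical and much simpler than the code in block/blk-merge.c; the `integrity_ok` flag merely stands in for the result of an integrity check such as bio_integrity_merge_rq().

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the state a merge would update. */
struct seg_state {
	unsigned int nr_segments;
	unsigned int back_seg_size;
};

/*
 * Try to merge new_segs segments into rq. All checks that can reject the
 * merge run first, so a failed merge leaves *rq exactly as it was.
 */
static bool try_merge(struct seg_state *rq, unsigned int new_segs,
		      unsigned int new_seg_size, unsigned int max_segments,
		      bool integrity_ok)
{
	/* Sanity checks: nothing has been modified yet. */
	if (!integrity_ok)
		return false;
	if (rq->nr_segments + new_segs > max_segments)
		return false;

	/* Only now is it safe to update the accounting. */
	rq->nr_segments += new_segs;
	if (new_seg_size > rq->back_seg_size)
		rq->back_seg_size = new_seg_size;
	return true;
}
```

If the size update ran before the checks, a rejected merge would still leave a changed back_seg_size behind, which is the kind of stale state the patch description is concerned with.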