Re: [dm-devel] [PATCH 05/16] dm: always defer request allocation to the owner of the request_queue
On 01/25/17 01:39, Mike Snitzer wrote:
> On Tue, Jan 24 2017 at 9:20am -0500, Christoph Hellwig wrote:
>> On Tue, Jan 24, 2017 at 05:05:39AM -0500, Mike Snitzer wrote:
>>> possible and is welcomed cleanup. The only concern I have is that using
>>> get_request() for the old request_fn request_queue eliminates the
>>> guaranteed availability of requests to allow for forward progress (on
>>> path failure or for the purposes of swap over mpath, etc). This isn't a
>>> concern for blk-mq because as you know we have a fixed set of tags (and
>>> associated preallocated requests).
>>>
>>> So I'm left unconvinced old request_fn request-based DM multipath isn't
>>> regressing in how request resubmission can be assured a request will be
>>> available when needed on retry in the face of path failure.
>>
>> Mempools only need a size where we can make guaranteed requests, so for
>> get_request based drivers under dm the theoretical minimum size would be
>> one, as we never rely on a second request to finish the first one,
>> and each request_queue has its own mempool(s) to start with.
>
> Fair enough. Cc'ing Junichi just in case he sees anything we're
> missing.

DM multipath could not use blk_get_request() because the function was not
callable from interrupt-disabled context, e.g. request_fn. However, since
the current code no longer calls blk_get_request() from such a context,
the change should be ok.

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
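For reference, the forward-progress guarantee being relied on above is the
standard mempool one: a pool created with a single reserved element cannot
fail a blocking allocation, it can only wait for an element to be returned.
The sketch below is illustrative only, not code from this series, and the
example_* names are made up.

/*
 * Minimal sketch of a one-element mempool.  One reserved element is
 * sufficient as long as completing a request never depends on allocating
 * a second one.
 */
#include <linux/blkdev.h>
#include <linux/mempool.h>
#include <linux/slab.h>

static struct kmem_cache *example_rq_cache;	/* hypothetical slab cache */
static mempool_t *example_rq_pool;		/* hypothetical pool */

static int example_pool_init(void)
{
	example_rq_cache = kmem_cache_create("example_rq",
					     sizeof(struct request),
					     0, 0, NULL);
	if (!example_rq_cache)
		return -ENOMEM;

	/* min_nr = 1: a blocking mempool_alloc() from this pool always
	 * makes forward progress once the in-flight request completes. */
	example_rq_pool = mempool_create_slab_pool(1, example_rq_cache);
	if (!example_rq_pool) {
		kmem_cache_destroy(example_rq_cache);
		return -ENOMEM;
	}
	return 0;
}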
Re: [dm-devel] [PATCH 18/18] block: don't assign cmd_flags in __blk_rq_prep_clone
> "Christoph" == Christoph Hellwig writes:

Christoph> These days we have the proper flags set since request
Christoph> allocation time.

Acked-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 15/18] scsi: allocate scsi_cmnd structures as part of struct request
> "Christoph" == Christoph Hellwig writes:

Christoph> Rely on the new block layer functionality to allocate
Christoph> additional driver specific data behind struct request instead
Christoph> of implementing it in SCSI itself.

Acked-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 14/18] scsi: remove __scsi_alloc_queue
> "Christoph" == Christoph Hellwig writes:

Christoph> Instead do an internal export of __scsi_init_queue for the
Christoph> transport classes that export BSG nodes.

Acked-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 13/18] scsi: remove scsi_cmd_dma_pool
> "Christoph" == Christoph Hellwig writes:

Christoph> There is no need for GFP_DMA allocations of the scsi_cmnd
Christoph> structures themselves, all that might be DMAed to or from is
Christoph> the actual payload, or the sense buffers.

Acked-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 12/18] scsi: respect unchecked_isa_dma for blk-mq
> "Christoph" == Christoph Hellwig writes:

Christoph> Currently blk-mq always allocates the sense buffer using
Christoph> normal GFP_KERNEL allocation. Refactor the cmd pool code to
Christoph> split the cmd and sense allocation and share the code to
Christoph> allocate the sense buffers as well as the sense buffer slab
Christoph> caches between the legacy and blk-mq path.

Acked-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 11/18] scsi: remove gfp_flags member in scsi_host_cmd_pool
> "Christoph" == Christoph Hellwig writes:

Christoph> When using the slab allocator we already decide at cache
Christoph> creation time if an allocation comes from a GFP_DMA pool
Christoph> using the SLAB_CACHE_DMA flag, and there is no point passing
Christoph> the kmalloc-family only GFP_DMA flag to kmem_cache_alloc.
Christoph> Drop all the infrastructure for doing so.

Acked-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 08/18] scsi_dh_rdac: switch to scsi_execute_req_flags()
> "Christoph" == Christoph Hellwig writes:

Christoph> From: Hannes Reinecke
Christoph>
Christoph> Switch to scsi_execute_req_flags() and scsi_get_vpd_page()
Christoph> instead of open-coding it. Using scsi_execute_req_flags()
Christoph> will set REQ_QUIET and REQ_PREEMPT, but this is okay as we're
Christoph> evaluating the errors anyway and should be able to send the
Christoph> command even if the device is quiesced.

Acked-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 05/18] block: allow specifying size for extra command data
> "Christoph" == Christoph Hellwig writes:

Christoph,

Christoph> This mirrors the blk-mq capabilities to allocate extra
Christoph> drivers-specific data behind struct request by setting a
Christoph> cmd_size field, as well as having a constructor / destructor
Christoph> for it.

Nice! A couple of minor nits:

+static void *alloc_request_size(gfp_t gfp_mask, void *data)

I like alloc_request_simple() but alloc_request_size() seems a bit
contrived. _reserve? _extra? _special? Don't have any good suggestions,
I'm afraid.

Also a bit heavy on the else brackets a couple of places. But no biggie.

Reviewed-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 04/18] block: simplify blk_init_allocated_queue
> "Christoph" == Christoph Hellwig writes:

Christoph> Return an errno value instead of the passed in queue so that
Christoph> the callers don't have to keep track of two queues, and move
Christoph> the assignment of the request_fn and lock to the caller as
Christoph> passing them as argument doesn't simplify anything. While
Christoph> we're at it also remove two pointless NULL assignments, given
Christoph> that the request structure is zeroed on allocation.

Reviewed-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 03/18] block: fix elevator init check
> "Christoph" == Christoph Hellwig writes:

Christoph> We can't initalize the elevator fields for flushes as flush
Christoph> share space in struct request with the elevator data. But
Christoph> currently we can't commnicate that a request is a flush

communicate

Christoph> through blk_get_request as we can only pass READ or WRITE,
Christoph> and the low-level code looks at the possible NULL bio to
Christoph> check for a flush.

Christoph> Fix this by allowing to pass any block op and flags, and by
Christoph> checking for the flush flags in __get_request.

Reviewed-by: Martin K. Petersen

-- 
Martin K. Petersen	Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 01/18] block: add a op_is_flush helper
This centralizes the checks for bios that needs to be go into the flush state machine. Signed-off-by: Christoph Hellwig--- block/blk-core.c | 8 block/blk-mq-sched.c | 5 ++--- block/blk-mq.c | 4 ++-- drivers/md/bcache/request.c | 2 +- drivers/md/dm-cache-target.c | 13 +++-- drivers/md/dm-thin.c | 13 + include/linux/blk_types.h| 9 + 7 files changed, 26 insertions(+), 28 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index a61f140..b830e14 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1035,7 +1035,7 @@ static bool blk_rq_should_init_elevator(struct bio *bio) * Flush requests do not use the elevator so skip initialization. * This allows a request to share the flush and elevator data. */ - if (bio->bi_opf & (REQ_PREFLUSH | REQ_FUA)) + if (op_is_flush(bio->bi_opf)) return false; return true; @@ -1641,7 +1641,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio) return BLK_QC_T_NONE; } - if (bio->bi_opf & (REQ_PREFLUSH | REQ_FUA)) { + if (op_is_flush(bio->bi_opf)) { spin_lock_irq(q->queue_lock); where = ELEVATOR_INSERT_FLUSH; goto get_rq; @@ -2145,7 +2145,7 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq) */ BUG_ON(blk_queued_rq(rq)); - if (rq->cmd_flags & (REQ_PREFLUSH | REQ_FUA)) + if (op_is_flush(rq->cmd_flags)) where = ELEVATOR_INSERT_FLUSH; add_acct_request(q, rq, where); @@ -3256,7 +3256,7 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule) /* * rq is already accounted, so use raw insert */ - if (rq->cmd_flags & (REQ_PREFLUSH | REQ_FUA)) + if (op_is_flush(rq->cmd_flags)) __elv_add_request(q, rq, ELEVATOR_INSERT_FLUSH); else __elv_add_request(q, rq, ELEVATOR_INSERT_SORT_MERGE); diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index d05061f..3bd66e5 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -111,7 +111,6 @@ struct request *blk_mq_sched_get_request(struct request_queue *q, struct blk_mq_hw_ctx *hctx; struct blk_mq_ctx *ctx; struct request *rq; - const bool is_flush = op & (REQ_PREFLUSH | REQ_FUA); blk_queue_enter_live(q); ctx = blk_mq_get_ctx(q); @@ -126,7 +125,7 @@ struct request *blk_mq_sched_get_request(struct request_queue *q, * Flush requests are special and go directly to the * dispatch list. 
*/ - if (!is_flush && e->type->ops.mq.get_request) { + if (!op_is_flush(op) && e->type->ops.mq.get_request) { rq = e->type->ops.mq.get_request(q, op, data); if (rq) rq->rq_flags |= RQF_QUEUED; @@ -138,7 +137,7 @@ struct request *blk_mq_sched_get_request(struct request_queue *q, } if (rq) { - if (!is_flush) { + if (!op_is_flush(op)) { rq->elv.icq = NULL; if (e && e->type->icq_cache) blk_mq_sched_assign_ioc(q, rq, bio); diff --git a/block/blk-mq.c b/block/blk-mq.c index ee69e5e..e229f8a 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1378,7 +1378,7 @@ static void blk_mq_try_issue_directly(struct request *rq, blk_qc_t *cookie) static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) { const int is_sync = op_is_sync(bio->bi_opf); - const int is_flush_fua = bio->bi_opf & (REQ_PREFLUSH | REQ_FUA); + const int is_flush_fua = op_is_flush(bio->bi_opf); struct blk_mq_alloc_data data; struct request *rq; unsigned int request_count = 0, srcu_idx; @@ -1498,7 +1498,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio) { const int is_sync = op_is_sync(bio->bi_opf); - const int is_flush_fua = bio->bi_opf & (REQ_PREFLUSH | REQ_FUA); + const int is_flush_fua = op_is_flush(bio->bi_opf); struct blk_plug *plug; unsigned int request_count = 0; struct blk_mq_alloc_data data; diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c index 76d2087..01035e7 100644 --- a/drivers/md/bcache/request.c +++ b/drivers/md/bcache/request.c @@ -666,7 +666,7 @@ static inline struct search *search_alloc(struct bio *bio, s->iop.write_prio = 0; s->iop.error= 0; s->iop.flags= 0; - s->iop.flush_journal= (bio->bi_opf & (REQ_PREFLUSH|REQ_FUA)) !=
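The include/linux/blk_types.h hunk that actually defines the new helper is
truncated in this excerpt. Judging from the call sites it replaces, the
helper presumably amounts to the following sketch (not the verbatim hunk):

/* Sketch of op_is_flush() implied by the call sites above: true when the
 * request/bio carries a preflush or FUA flag and must go through the
 * flush state machine. */
static inline bool op_is_flush(unsigned int op)
{
	return op & (REQ_PREFLUSH | REQ_FUA);
}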
[dm-devel] [PATCH 04/18] block: simplify blk_init_allocated_queue
Return an errno value instead of the passed in queue so that the callers don't have to keep track of two queues, and move the assignment of the request_fn and lock to the caller as passing them as argument doesn't simplify anything. While we're at it also remove two pointless NULL assignments, given that the request structure is zeroed on allocation. Signed-off-by: Christoph HellwigReviewed-by: Hannes Reinecke --- block/blk-core.c | 38 +++--- drivers/md/dm-rq.c | 3 ++- include/linux/blkdev.h | 3 +-- 3 files changed, 18 insertions(+), 26 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index a84c1b9..54b5512 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -823,15 +823,19 @@ EXPORT_SYMBOL(blk_init_queue); struct request_queue * blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id) { - struct request_queue *uninit_q, *q; + struct request_queue *q; - uninit_q = blk_alloc_queue_node(GFP_KERNEL, node_id); - if (!uninit_q) + q = blk_alloc_queue_node(GFP_KERNEL, node_id); + if (!q) return NULL; - q = blk_init_allocated_queue(uninit_q, rfn, lock); - if (!q) - blk_cleanup_queue(uninit_q); + q->request_fn = rfn; + if (lock) + q->queue_lock = lock; + if (blk_init_allocated_queue(q) < 0) { + blk_cleanup_queue(q); + return NULL; + } return q; } @@ -839,30 +843,19 @@ EXPORT_SYMBOL(blk_init_queue_node); static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio); -struct request_queue * -blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn, -spinlock_t *lock) -{ - if (!q) - return NULL; +int blk_init_allocated_queue(struct request_queue *q) +{ q->fq = blk_alloc_flush_queue(q, NUMA_NO_NODE, 0); if (!q->fq) - return NULL; + return -ENOMEM; if (blk_init_rl(>root_rl, q, GFP_KERNEL)) goto fail; INIT_WORK(>timeout_work, blk_timeout_work); - q->request_fn = rfn; - q->prep_rq_fn = NULL; - q->unprep_rq_fn = NULL; q->queue_flags |= QUEUE_FLAG_DEFAULT; - /* Override internal queue lock with supplied lock pointer */ - if (lock) - q->queue_lock = lock; - /* * This also sets hw/phys segments, boundary and size */ @@ -880,13 +873,12 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn, } mutex_unlock(>sysfs_lock); - - return q; + return 0; fail: blk_free_flush_queue(q->fq); wbt_exit(q); - return NULL; + return -ENOMEM; } EXPORT_SYMBOL(blk_init_allocated_queue); diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index 9d7275f..93f6e9f 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -823,7 +823,8 @@ static void dm_old_request_fn(struct request_queue *q) int dm_old_init_request_queue(struct mapped_device *md) { /* Fully initialize the queue */ - if (!blk_init_allocated_queue(md->queue, dm_old_request_fn, NULL)) + md->queue->request_fn = dm_old_request_fn; + if (blk_init_allocated_queue(md->queue) < 0) return -EINVAL; /* disable dm_old_request_fn's merge heuristic by default */ diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 8e0b57e..a036c4a 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1131,8 +1131,7 @@ extern void blk_unprep_request(struct request *); extern struct request_queue *blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id); extern struct request_queue *blk_init_queue(request_fn_proc *, spinlock_t *); -extern struct request_queue *blk_init_allocated_queue(struct request_queue *, - request_fn_proc *, spinlock_t *); +extern int blk_init_allocated_queue(struct request_queue *); extern void blk_cleanup_queue(struct request_queue *); extern void 
blk_queue_make_request(struct request_queue *, make_request_fn *); extern void blk_queue_bounce_limit(struct request_queue *, u64); -- 2.1.4 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 18/18] block: don't assign cmd_flags in __blk_rq_prep_clone
These days we have the proper flags set since request allocation time.

Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
 block/blk-core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 33c5d05e..6bf5ba0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2983,7 +2983,6 @@ EXPORT_SYMBOL_GPL(blk_rq_unprep_clone);
 static void __blk_rq_prep_clone(struct request *dst, struct request *src)
 {
 	dst->cpu = src->cpu;
-	dst->cmd_flags = src->cmd_flags | REQ_NOMERGE;
 	dst->cmd_type = src->cmd_type;
 	dst->__sector = blk_rq_pos(src);
 	dst->__data_len = blk_rq_bytes(src);
-- 
2.1.4

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] split scsi passthrough fields out of struct request V2
Hi all,

this series splits the support for SCSI passthrough commands from the main
struct request used all over the block layer into a separate scsi_request
structure that drivers wanting to support SCSI passthrough need to embed
as the first thing into their request-private data, similar to how we
handle NVMe passthrough commands.

To support this I've added support for private data after the request
structure to the legacy request path as well, so that it can be treated
the same way as the blk-mq path. Compared to the current scsi_cmnd
allocator that actually is a major simplification.

Changes since V1:
 - fix handling of a NULL sense pointer in __scsi_execute
 - clean up handling of the flush flags in the block layer and MD
 - additional small cleanup in dm-rq

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
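As a rough illustration of the layout the cover letter describes (not code
taken from the series; struct example_cmnd and its accessor are
hypothetical), a passthrough-capable driver embeds the new scsi_request as
the first member of its per-request data, which in turn sits directly
behind struct request thanks to the cmd_size/init_rq_fn infrastructure
added later in the series:

/*
 * Illustrative layout only.  struct scsi_request is the structure this
 * series introduces; everything named example_* is made up.
 */
struct example_cmnd {
	struct scsi_request req;	/* must be the first member */
	void *driver_private;		/* whatever else the driver needs */
};

static inline struct example_cmnd *example_cmnd(struct request *rq)
{
	/* the per-request data starts right after the request itself */
	return (void *)(rq + 1);
}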
[dm-devel] [PATCH 03/18] block: fix elevator init check
We can't initalize the elevator fields for flushes as flush share space in struct request with the elevator data. But currently we can't commnicate that a request is a flush through blk_get_request as we can only pass READ or WRITE, and the low-level code looks at the possible NULL bio to check for a flush. Fix this by allowing to pass any block op and flags, and by checking for the flush flags in __get_request. Signed-off-by: Christoph HellwigReviewed-by: Hannes Reinecke --- block/blk-core.c | 26 -- 1 file changed, 4 insertions(+), 22 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index b830e14..a84c1b9 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1022,25 +1022,6 @@ int blk_update_nr_requests(struct request_queue *q, unsigned int nr) return 0; } -/* - * Determine if elevator data should be initialized when allocating the - * request associated with @bio. - */ -static bool blk_rq_should_init_elevator(struct bio *bio) -{ - if (!bio) - return true; - - /* -* Flush requests do not use the elevator so skip initialization. -* This allows a request to share the flush and elevator data. -*/ - if (op_is_flush(bio->bi_opf)) - return false; - - return true; -} - /** * __get_request - get a free request * @rl: request list to allocate from @@ -1119,10 +1100,13 @@ static struct request *__get_request(struct request_list *rl, unsigned int op, * request is freed. This guarantees icq's won't be destroyed and * makes creating new ones safe. * +* Flush requests do not use the elevator so skip initialization. +* This allows a request to share the flush and elevator data. +* * Also, lookup icq while holding queue_lock. If it doesn't exist, * it will be created after releasing queue_lock. */ - if (blk_rq_should_init_elevator(bio) && !blk_queue_bypass(q)) { + if (!op_is_flush(op) && !blk_queue_bypass(q)) { rq_flags |= RQF_ELVPRIV; q->nr_rqs_elvpriv++; if (et->icq_cache && ioc) @@ -1276,8 +1260,6 @@ static struct request *blk_old_get_request(struct request_queue *q, int rw, { struct request *rq; - BUG_ON(rw != READ && rw != WRITE); - /* create ioc upfront */ create_io_context(gfp_mask, q->node); -- 2.1.4 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 15/18] scsi: allocate scsi_cmnd structures as part of struct request
Rely on the new block layer functionality to allocate additional driver specific data behind struct request instead of implementing it in SCSI itѕelf. Signed-off-by: Christoph HellwigReviewed-by: Hannes Reinecke --- drivers/scsi/hosts.c | 20 +-- drivers/scsi/scsi.c | 319 -- drivers/scsi/scsi_error.c | 17 ++- drivers/scsi/scsi_lib.c | 122 -- drivers/scsi/scsi_priv.h | 8 +- include/scsi/scsi_host.h | 3 - 6 files changed, 95 insertions(+), 394 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 6d29c4a..831a1c8 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -230,19 +230,6 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, } } - /* -* Note that we allocate the freelist even for the MQ case for now, -* as we need a command set aside for scsi_reset_provider. Having -* the full host freelist and one command available for that is a -* little heavy-handed, but avoids introducing a special allocator -* just for this. Eventually the structure of scsi_reset_provider -* will need a major overhaul. -*/ - error = scsi_setup_command_freelist(shost); - if (error) - goto out_destroy_tags; - - if (!shost->shost_gendev.parent) shost->shost_gendev.parent = dev ? dev : _bus; if (!dma_dev) @@ -262,7 +249,7 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, error = device_add(>shost_gendev); if (error) - goto out_destroy_freelist; + goto out_disable_runtime_pm; scsi_host_set_state(shost, SHOST_RUNNING); get_device(shost->shost_gendev.parent); @@ -312,13 +299,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, device_del(>shost_dev); out_del_gendev: device_del(>shost_gendev); - out_destroy_freelist: + out_disable_runtime_pm: device_disable_async_suspend(>shost_gendev); pm_runtime_disable(>shost_gendev); pm_runtime_set_suspended(>shost_gendev); pm_runtime_put_noidle(>shost_gendev); - scsi_destroy_command_freelist(shost); - out_destroy_tags: if (shost_use_blk_mq(shost)) scsi_mq_destroy_tags(shost); fail: @@ -359,7 +344,6 @@ static void scsi_host_dev_release(struct device *dev) kfree(dev_name(>shost_dev)); } - scsi_destroy_command_freelist(shost); if (shost_use_blk_mq(shost)) { if (shost->tag_set.tags) scsi_mq_destroy_tags(shost); diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 2e24f31..3d8d215 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -98,163 +98,6 @@ EXPORT_SYMBOL(scsi_sd_probe_domain); ASYNC_DOMAIN_EXCLUSIVE(scsi_sd_pm_domain); EXPORT_SYMBOL(scsi_sd_pm_domain); -struct scsi_host_cmd_pool { - struct kmem_cache *cmd_slab; - unsigned intusers; - char*cmd_name; -}; - -static struct scsi_host_cmd_pool scsi_cmd_pool = { - .cmd_name = "scsi_cmd_cache", -}; - -static DEFINE_MUTEX(host_cmd_pool_mutex); - -/** - * scsi_host_free_command - internal function to release a command - * @shost: host to free the command for - * @cmd: command to release - * - * the command must previously have been allocated by - * scsi_host_alloc_command. 
- */ -static void -scsi_host_free_command(struct Scsi_Host *shost, struct scsi_cmnd *cmd) -{ - struct scsi_host_cmd_pool *pool = shost->cmd_pool; - - if (cmd->prot_sdb) - kmem_cache_free(scsi_sdb_cache, cmd->prot_sdb); - scsi_free_sense_buffer(shost, cmd->sense_buffer); - kmem_cache_free(pool->cmd_slab, cmd); -} - -/** - * scsi_host_alloc_command - internal function to allocate command - * @shost: SCSI host whose pool to allocate from - * @gfp_mask: mask for the allocation - * - * Returns a fully allocated command with sense buffer and protection - * data buffer (where applicable) or NULL on failure - */ -static struct scsi_cmnd * -scsi_host_alloc_command(struct Scsi_Host *shost, gfp_t gfp_mask) -{ - struct scsi_host_cmd_pool *pool = shost->cmd_pool; - struct scsi_cmnd *cmd; - - cmd = kmem_cache_zalloc(pool->cmd_slab, gfp_mask); - if (!cmd) - goto fail; - - cmd->sense_buffer = scsi_alloc_sense_buffer(shost, gfp_mask, - NUMA_NO_NODE); - if (!cmd->sense_buffer) - goto fail_free_cmd; - - if (scsi_host_get_prot(shost) >= SHOST_DIX_TYPE0_PROTECTION) { - cmd->prot_sdb = kmem_cache_zalloc(scsi_sdb_cache, gfp_mask); - if (!cmd->prot_sdb) - goto fail_free_sense; - } - - return cmd; - -fail_free_sense: -
[dm-devel] [PATCH 09/18] scsi_dh_emc: switch to scsi_execute_req_flags()
From: Hannes ReineckeSwitch to scsi_execute_req_flags() and scsi_get_vpd_page() instead of open-coding it. Using scsi_execute_req_flags() will set REQ_QUIET and REQ_PREEMPT, but this is okay as we're evaluating the errors anyway and should be able to send the command even if the device is quiesced. Signed-off-by: Hannes Reinecke Signed-off-by: Christoph Hellwig --- drivers/scsi/device_handler/scsi_dh_emc.c | 247 +++--- 1 file changed, 56 insertions(+), 191 deletions(-) diff --git a/drivers/scsi/device_handler/scsi_dh_emc.c b/drivers/scsi/device_handler/scsi_dh_emc.c index 5b80746..4a7679f 100644 --- a/drivers/scsi/device_handler/scsi_dh_emc.c +++ b/drivers/scsi/device_handler/scsi_dh_emc.c @@ -88,12 +88,6 @@ struct clariion_dh_data { */ unsigned char buffer[CLARIION_BUFFER_SIZE]; /* -* SCSI sense buffer for commands -- assumes serial issuance -* and completion sequence of all commands for same multipath. -*/ - unsigned char sense[SCSI_SENSE_BUFFERSIZE]; - unsigned int senselen; - /* * LUN state */ int lun_state; @@ -116,44 +110,38 @@ struct clariion_dh_data { /* * Parse MODE_SELECT cmd reply. */ -static int trespass_endio(struct scsi_device *sdev, char *sense) +static int trespass_endio(struct scsi_device *sdev, + struct scsi_sense_hdr *sshdr) { int err = SCSI_DH_IO; - struct scsi_sense_hdr sshdr; - - if (!scsi_normalize_sense(sense, SCSI_SENSE_BUFFERSIZE, )) { - sdev_printk(KERN_ERR, sdev, "%s: Found valid sense data 0x%2x, " - "0x%2x, 0x%2x while sending CLARiiON trespass " - "command.\n", CLARIION_NAME, sshdr.sense_key, - sshdr.asc, sshdr.ascq); - if ((sshdr.sense_key == 0x05) && (sshdr.asc == 0x04) && -(sshdr.ascq == 0x00)) { - /* -* Array based copy in progress -- do not send -* mode_select or copy will be aborted mid-stream. -*/ - sdev_printk(KERN_INFO, sdev, "%s: Array Based Copy in " - "progress while sending CLARiiON trespass " - "command.\n", CLARIION_NAME); - err = SCSI_DH_DEV_TEMP_BUSY; - } else if ((sshdr.sense_key == 0x02) && (sshdr.asc == 0x04) && - (sshdr.ascq == 0x03)) { - /* -* LUN Not Ready - Manual Intervention Required -* indicates in-progress ucode upgrade (NDU). -*/ - sdev_printk(KERN_INFO, sdev, "%s: Detected in-progress " - "ucode upgrade NDU operation while sending " - "CLARiiON trespass command.\n", CLARIION_NAME); - err = SCSI_DH_DEV_TEMP_BUSY; - } else - err = SCSI_DH_DEV_FAILED; - } else { - sdev_printk(KERN_INFO, sdev, - "%s: failed to send MODE SELECT, no sense available\n", - CLARIION_NAME); - } + sdev_printk(KERN_ERR, sdev, "%s: Found valid sense data 0x%2x, " + "0x%2x, 0x%2x while sending CLARiiON trespass " + "command.\n", CLARIION_NAME, sshdr->sense_key, + sshdr->asc, sshdr->ascq); + + if (sshdr->sense_key == 0x05 && sshdr->asc == 0x04 && + sshdr->ascq == 0x00) { + /* +* Array based copy in progress -- do not send +* mode_select or copy will be aborted mid-stream. +*/ + sdev_printk(KERN_INFO, sdev, "%s: Array Based Copy in " + "progress while sending CLARiiON trespass " + "command.\n", CLARIION_NAME); + err = SCSI_DH_DEV_TEMP_BUSY; + } else if (sshdr->sense_key == 0x02 && sshdr->asc == 0x04 && + sshdr->ascq == 0x03) { + /* +* LUN Not Ready - Manual Intervention Required +* indicates in-progress ucode upgrade (NDU). 
+*/ + sdev_printk(KERN_INFO, sdev, "%s: Detected in-progress " + "ucode upgrade NDU operation while sending " + "CLARiiON trespass command.\n", CLARIION_NAME); + err = SCSI_DH_DEV_TEMP_BUSY; + } else + err = SCSI_DH_DEV_FAILED; return err; } @@ -257,103 +245,15 @@ static char * parse_sp_model(struct scsi_device *sdev, unsigned char *buffer) return sp_model; } -/* - * Get block request for REQ_BLOCK_PC command issued to path. Currently - *
[dm-devel] [PATCH 02/18] md: cleanup bio op / flags handling in raid1_write_request
No need for the local variables, the bio is still live and we can just
assign the bits we want directly. Makes me wonder why we can't assign all
the bio flags to start with.

Signed-off-by: Christoph Hellwig
---
 drivers/md/raid1.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 7b0f647..67b0365 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1170,10 +1170,6 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 	int i, disks;
 	struct bitmap *bitmap = mddev->bitmap;
 	unsigned long flags;
-	const int op = bio_op(bio);
-	const unsigned long do_sync = (bio->bi_opf & REQ_SYNC);
-	const unsigned long do_flush_fua = (bio->bi_opf &
-						(REQ_PREFLUSH | REQ_FUA));
 	struct md_rdev *blocked_rdev;
 	struct blk_plug_cb *cb;
 	struct raid1_plug_cb *plug = NULL;
@@ -1389,7 +1385,8 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 						   conf->mirrors[i].rdev->data_offset);
 		mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
 		mbio->bi_end_io	= raid1_end_write_request;
-		bio_set_op_attrs(mbio, op, do_flush_fua | do_sync);
+		mbio->bi_opf = bio_op(bio) |
+			(bio->bi_opf & (REQ_SYNC | REQ_PREFLUSH | REQ_FUA));
 		if (test_bit(FailFast, &conf->mirrors[i].rdev->flags) &&
 		    !test_bit(WriteMostly, &conf->mirrors[i].rdev->flags) &&
 		    conf->raid_disks - mddev->degraded > 1)
-- 
2.1.4

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 08/18] scsi_dh_rdac: switch to scsi_execute_req_flags()
From: Hannes ReineckeSwitch to scsi_execute_req_flags() and scsi_get_vpd_page() instead of open-coding it. Using scsi_execute_req_flags() will set REQ_QUIET and REQ_PREEMPT, but this is okay as we're evaluating the errors anyway and should be able to send the command even if the device is quiesced. Signed-off-by: Hannes Reinecke Signed-off-by: Christoph Hellwig --- drivers/scsi/device_handler/scsi_dh_rdac.c | 174 + 1 file changed, 51 insertions(+), 123 deletions(-) diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c index 00d9c32..b64eaae 100644 --- a/drivers/scsi/device_handler/scsi_dh_rdac.c +++ b/drivers/scsi/device_handler/scsi_dh_rdac.c @@ -205,7 +205,6 @@ struct rdac_dh_data { #define RDAC_NON_PREFERRED 1 charpreferred; - unsigned char sense[SCSI_SENSE_BUFFERSIZE]; union { struct c2_inquiry c2; struct c4_inquiry c4; @@ -262,40 +261,12 @@ do { \ sdev_printk(KERN_INFO, sdev, RDAC_NAME ": " f "\n", ## arg); \ } while (0); -static struct request *get_rdac_req(struct scsi_device *sdev, - void *buffer, unsigned buflen, int rw) +static unsigned int rdac_failover_get(struct rdac_controller *ctlr, + struct list_head *list, + unsigned char *cdb) { - struct request *rq; - struct request_queue *q = sdev->request_queue; - - rq = blk_get_request(q, rw, GFP_NOIO); - - if (IS_ERR(rq)) { - sdev_printk(KERN_INFO, sdev, - "get_rdac_req: blk_get_request failed.\n"); - return NULL; - } - blk_rq_set_block_pc(rq); - - if (buflen && blk_rq_map_kern(q, rq, buffer, buflen, GFP_NOIO)) { - blk_put_request(rq); - sdev_printk(KERN_INFO, sdev, - "get_rdac_req: blk_rq_map_kern failed.\n"); - return NULL; - } - - rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | -REQ_FAILFAST_DRIVER; - rq->retries = RDAC_RETRIES; - rq->timeout = RDAC_TIMEOUT; - - return rq; -} - -static struct request *rdac_failover_get(struct scsi_device *sdev, - struct rdac_dh_data *h, struct list_head *list) -{ - struct request *rq; + struct scsi_device *sdev = ctlr->ms_sdev; + struct rdac_dh_data *h = sdev->handler_data; struct rdac_mode_common *common; unsigned data_size; struct rdac_queue_data *qdata; @@ -332,27 +303,17 @@ static struct request *rdac_failover_get(struct scsi_device *sdev, lun_table[qdata->h->lun] = 0x81; } - /* get request for block layer packet command */ - rq = get_rdac_req(sdev, >ctlr->mode_select, data_size, WRITE); - if (!rq) - return NULL; - /* Prepare the command. */ if (h->ctlr->use_ms10) { - rq->cmd[0] = MODE_SELECT_10; - rq->cmd[7] = data_size >> 8; - rq->cmd[8] = data_size & 0xff; + cdb[0] = MODE_SELECT_10; + cdb[7] = data_size >> 8; + cdb[8] = data_size & 0xff; } else { - rq->cmd[0] = MODE_SELECT; - rq->cmd[4] = data_size; + cdb[0] = MODE_SELECT; + cdb[4] = data_size; } - rq->cmd_len = COMMAND_SIZE(rq->cmd[0]); - - rq->sense = h->sense; - memset(rq->sense, 0, SCSI_SENSE_BUFFERSIZE); - rq->sense_len = 0; - return rq; + return data_size; } static void release_controller(struct kref *kref) @@ -400,46 +361,14 @@ static struct rdac_controller *get_controller(int index, char *array_name, return ctlr; } -static int submit_inquiry(struct scsi_device *sdev, int page_code, - unsigned int len, struct rdac_dh_data *h) -{ - struct request *rq; - struct request_queue *q = sdev->request_queue; - int err = SCSI_DH_RES_TEMP_UNAVAIL; - - rq = get_rdac_req(sdev, >inq, len, READ); - if (!rq) - goto done; - - /* Prepare the command. 
*/ - rq->cmd[0] = INQUIRY; - rq->cmd[1] = 1; - rq->cmd[2] = page_code; - rq->cmd[4] = len; - rq->cmd_len = COMMAND_SIZE(INQUIRY); - - rq->sense = h->sense; - memset(rq->sense, 0, SCSI_SENSE_BUFFERSIZE); - rq->sense_len = 0; - - err = blk_execute_rq(q, NULL, rq, 1); - if (err == -EIO) - err = SCSI_DH_IO; - - blk_put_request(rq); -done: - return err; -} - static int get_lun_info(struct scsi_device *sdev, struct rdac_dh_data *h, char *array_name, u8 *array_id) { - int err, i; - struct c8_inquiry *inqp; +
[dm-devel] [PATCH 07/18] dm: always defer request allocation to the owner of the request_queue
DM already calls blk_mq_alloc_request on the request_queue of the underlying device if it is a blk-mq device. But now that we allow drivers to allocate additional data and initialize it ahead of time we need to do the same for all drivers. Doing so and using the new cmd_size infrastructure in the block layer greatly simplifies the dm-rq and mpath code, and should also make arbitrary combinations of SQ and MQ devices with SQ or MQ device mapper tables easily possible as a further step. Signed-off-by: Christoph HellwigReviewed-by: Hannes Reinecke Reviewed-by: Mike Snitzer --- drivers/md/dm-core.h | 1 - drivers/md/dm-mpath.c | 132 -- drivers/md/dm-rq.c| 251 ++ drivers/md/dm-rq.h| 2 +- drivers/md/dm-target.c| 7 -- drivers/md/dm.c | 30 ++--- drivers/md/dm.h | 3 +- include/linux/device-mapper.h | 3 - 8 files changed, 85 insertions(+), 344 deletions(-) diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h index 40ceba1..136fda3 100644 --- a/drivers/md/dm-core.h +++ b/drivers/md/dm-core.h @@ -92,7 +92,6 @@ struct mapped_device { * io objects are allocated from here. */ mempool_t *io_pool; - mempool_t *rq_pool; struct bio_set *bs; diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 6400cff..784f237 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -92,12 +92,6 @@ struct multipath { unsigned queue_mode; - /* -* We must use a mempool of dm_mpath_io structs so that we -* can resubmit bios on error. -*/ - mempool_t *mpio_pool; - struct mutex work_mutex; struct work_struct trigger_event; @@ -115,8 +109,6 @@ struct dm_mpath_io { typedef int (*action_fn) (struct pgpath *pgpath); -static struct kmem_cache *_mpio_cache; - static struct workqueue_struct *kmultipathd, *kmpath_handlerd; static void trigger_event(struct work_struct *work); static void activate_path(struct work_struct *work); @@ -209,7 +201,6 @@ static struct multipath *alloc_multipath(struct dm_target *ti) init_waitqueue_head(>pg_init_wait); mutex_init(>work_mutex); - m->mpio_pool = NULL; m->queue_mode = DM_TYPE_NONE; m->ti = ti; @@ -229,16 +220,7 @@ static int alloc_multipath_stage2(struct dm_target *ti, struct multipath *m) m->queue_mode = DM_TYPE_MQ_REQUEST_BASED; else m->queue_mode = DM_TYPE_REQUEST_BASED; - } - - if (m->queue_mode == DM_TYPE_REQUEST_BASED) { - unsigned min_ios = dm_get_reserved_rq_based_ios(); - - m->mpio_pool = mempool_create_slab_pool(min_ios, _mpio_cache); - if (!m->mpio_pool) - return -ENOMEM; - } - else if (m->queue_mode == DM_TYPE_BIO_BASED) { + } else if (m->queue_mode == DM_TYPE_BIO_BASED) { INIT_WORK(>process_queued_bios, process_queued_bios); /* * bio-based doesn't support any direct scsi_dh management; @@ -263,7 +245,6 @@ static void free_multipath(struct multipath *m) kfree(m->hw_handler_name); kfree(m->hw_handler_params); - mempool_destroy(m->mpio_pool); kfree(m); } @@ -272,38 +253,6 @@ static struct dm_mpath_io *get_mpio(union map_info *info) return info->ptr; } -static struct dm_mpath_io *set_mpio(struct multipath *m, union map_info *info) -{ - struct dm_mpath_io *mpio; - - if (!m->mpio_pool) { - /* Use blk-mq pdu memory requested via per_io_data_size */ - mpio = get_mpio(info); - memset(mpio, 0, sizeof(*mpio)); - return mpio; - } - - mpio = mempool_alloc(m->mpio_pool, GFP_ATOMIC); - if (!mpio) - return NULL; - - memset(mpio, 0, sizeof(*mpio)); - info->ptr = mpio; - - return mpio; -} - -static void clear_request_fn_mpio(struct multipath *m, union map_info *info) -{ - /* Only needed for non blk-mq (.request_fn) multipath */ - if (m->mpio_pool) { - struct dm_mpath_io *mpio = 
info->ptr; - - info->ptr = NULL; - mempool_free(mpio, m->mpio_pool); - } -} - static size_t multipath_per_bio_data_size(void) { return sizeof(struct dm_mpath_io) + sizeof(struct dm_bio_details); @@ -530,16 +479,17 @@ static bool must_push_back_bio(struct multipath *m) /* * Map cloned requests (request-based multipath) */ -static int __multipath_map(struct dm_target *ti, struct request *clone, - union map_info *map_context, - struct request *rq, struct request **__clone) +static int multipath_clone_and_map(struct dm_target *ti,
[dm-devel] [PATCH 13/18] scsi: remove scsi_cmd_dma_pool
There is no need for GFP_DMA allocations of the scsi_cmnd structures themselves, all that might be DMAed to or from is the actual payload, or the sense buffers. Signed-off-by: Christoph HellwigReviewed-by: Hannes Reinecke --- drivers/scsi/scsi.c | 15 +-- 1 file changed, 1 insertion(+), 14 deletions(-) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 469aa0f..2e24f31 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -102,17 +102,10 @@ struct scsi_host_cmd_pool { struct kmem_cache *cmd_slab; unsigned intusers; char*cmd_name; - unsigned intslab_flags; }; static struct scsi_host_cmd_pool scsi_cmd_pool = { .cmd_name = "scsi_cmd_cache", - .slab_flags = SLAB_HWCACHE_ALIGN, -}; - -static struct scsi_host_cmd_pool scsi_cmd_dma_pool = { - .cmd_name = "scsi_cmd_cache(DMA)", - .slab_flags = SLAB_HWCACHE_ALIGN|SLAB_CACHE_DMA, }; static DEFINE_MUTEX(host_cmd_pool_mutex); @@ -290,8 +283,6 @@ scsi_find_host_cmd_pool(struct Scsi_Host *shost) { if (shost->hostt->cmd_size) return shost->hostt->cmd_pool; - if (shost->unchecked_isa_dma) - return _cmd_dma_pool; return _cmd_pool; } @@ -318,10 +309,6 @@ scsi_alloc_host_cmd_pool(struct Scsi_Host *shost) return NULL; } - pool->slab_flags = SLAB_HWCACHE_ALIGN; - if (shost->unchecked_isa_dma) - pool->slab_flags |= SLAB_CACHE_DMA; - if (hostt->cmd_size) hostt->cmd_pool = pool; @@ -349,7 +336,7 @@ scsi_get_host_cmd_pool(struct Scsi_Host *shost) if (!pool->users) { pool->cmd_slab = kmem_cache_create(pool->cmd_name, cmd_size, 0, - pool->slab_flags, NULL); + SLAB_HWCACHE_ALIGN, NULL); if (!pool->cmd_slab) goto out_free_pool; } -- 2.1.4 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 05/18] block: allow specifying size for extra command data
This mirrors the blk-mq capabilities to allocate extra drivers-specific data behind struct request by setting a cmd_size field, as well as having a constructor / destructor for it. Signed-off-by: Christoph HellwigReviewed-by: Hannes Reinecke --- block/blk-core.c | 59 -- block/blk-flush.c | 5 ++--- block/blk-sysfs.c | 7 -- include/linux/blkdev.h | 7 ++ 4 files changed, 61 insertions(+), 17 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 54b5512..7de7164 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -606,17 +606,41 @@ void blk_cleanup_queue(struct request_queue *q) EXPORT_SYMBOL(blk_cleanup_queue); /* Allocate memory local to the request queue */ -static void *alloc_request_struct(gfp_t gfp_mask, void *data) +static void *alloc_request_simple(gfp_t gfp_mask, void *data) { - int nid = (int)(long)data; - return kmem_cache_alloc_node(request_cachep, gfp_mask, nid); + struct request_queue *q = data; + + return kmem_cache_alloc_node(request_cachep, gfp_mask, q->node); } -static void free_request_struct(void *element, void *unused) +static void free_request_simple(void *element, void *data) { kmem_cache_free(request_cachep, element); } +static void *alloc_request_size(gfp_t gfp_mask, void *data) +{ + struct request_queue *q = data; + struct request *rq; + + rq = kmalloc_node(sizeof(struct request) + q->cmd_size, gfp_mask, + q->node); + if (rq && q->init_rq_fn && q->init_rq_fn(q, rq, gfp_mask) < 0) { + kfree(rq); + rq = NULL; + } + return rq; +} + +static void free_request_size(void *element, void *data) +{ + struct request_queue *q = data; + + if (q->exit_rq_fn) + q->exit_rq_fn(q, element); + kfree(element); +} + int blk_init_rl(struct request_list *rl, struct request_queue *q, gfp_t gfp_mask) { @@ -629,10 +653,15 @@ int blk_init_rl(struct request_list *rl, struct request_queue *q, init_waitqueue_head(>wait[BLK_RW_SYNC]); init_waitqueue_head(>wait[BLK_RW_ASYNC]); - rl->rq_pool = mempool_create_node(BLKDEV_MIN_RQ, alloc_request_struct, - free_request_struct, - (void *)(long)q->node, gfp_mask, - q->node); + if (q->cmd_size) { + rl->rq_pool = mempool_create_node(BLKDEV_MIN_RQ, + alloc_request_size, free_request_size, + q, gfp_mask, q->node); + } else { + rl->rq_pool = mempool_create_node(BLKDEV_MIN_RQ, + alloc_request_simple, free_request_simple, + q, gfp_mask, q->node); + } if (!rl->rq_pool) return -ENOMEM; @@ -846,12 +875,15 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio); int blk_init_allocated_queue(struct request_queue *q) { - q->fq = blk_alloc_flush_queue(q, NUMA_NO_NODE, 0); + q->fq = blk_alloc_flush_queue(q, NUMA_NO_NODE, q->cmd_size); if (!q->fq) return -ENOMEM; + if (q->init_rq_fn && q->init_rq_fn(q, q->fq->flush_rq, GFP_KERNEL)) + goto out_free_flush_queue; + if (blk_init_rl(>root_rl, q, GFP_KERNEL)) - goto fail; + goto out_exit_flush_rq; INIT_WORK(>timeout_work, blk_timeout_work); q->queue_flags |= QUEUE_FLAG_DEFAULT; @@ -869,13 +901,16 @@ int blk_init_allocated_queue(struct request_queue *q) /* init elevator */ if (elevator_init(q, NULL)) { mutex_unlock(>sysfs_lock); - goto fail; + goto out_exit_flush_rq; } mutex_unlock(>sysfs_lock); return 0; -fail: +out_exit_flush_rq: + if (q->exit_rq_fn) + q->exit_rq_fn(q, q->fq->flush_rq); +out_free_flush_queue: blk_free_flush_queue(q->fq); wbt_exit(q); return -ENOMEM; diff --git a/block/blk-flush.c b/block/blk-flush.c index d7de34e..bf3ba3c 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -547,11 +547,10 @@ struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q, if 
(!fq) goto fail; - if (q->mq_ops) { + if (q->mq_ops) spin_lock_init(>mq_flush_lock); - rq_sz = round_up(rq_sz + cmd_size, cache_line_size()); - } + rq_sz = round_up(rq_sz + cmd_size, cache_line_size()); fq->flush_rq = kzalloc_node(rq_sz, GFP_KERNEL, node); if (!fq->flush_rq) goto fail_rq; diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 1dbce05..894f773 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -814,10 +814,13 @@
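For context, a hedged usage sketch of the new hooks follows (not taken from
the series; the signatures are inferred from how the patch above calls
q->init_rq_fn and q->exit_rq_fn, and all example_* names are invented). A
legacy request_fn driver asks for per-request data by filling in cmd_size
and the init/exit callbacks before calling blk_init_allocated_queue():

/* Usage sketch only.  q->cmd_size, q->init_rq_fn and q->exit_rq_fn are
 * the fields this patch adds; everything named example_* is hypothetical. */
#include <linux/blkdev.h>
#include <linux/slab.h>

struct example_pdu {
	void *bounce;			/* driver-private per-request data */
};

static int example_init_rq(struct request_queue *q, struct request *rq,
			   gfp_t gfp)
{
	struct example_pdu *pdu = (void *)(rq + 1);	/* behind the request */

	pdu->bounce = kmalloc(64, gfp);
	return pdu->bounce ? 0 : -ENOMEM;
}

static void example_exit_rq(struct request_queue *q, struct request *rq)
{
	struct example_pdu *pdu = (void *)(rq + 1);

	kfree(pdu->bounce);
}

static void example_request_fn(struct request_queue *q)
{
	/* fetch requests from the queue and dispatch them; elided */
}

static int example_init_queue(struct request_queue *q)
{
	q->cmd_size = sizeof(struct example_pdu);
	q->init_rq_fn = example_init_rq;
	q->exit_rq_fn = example_exit_rq;
	q->request_fn = example_request_fn;

	return blk_init_allocated_queue(q);
}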
[dm-devel] [PATCH 12/18] scsi: respect unchecked_isa_dma for blk-mq
Currently blk-mq always allocates the sense buffer using normal GFP_KERNEL allocation. Refactor the cmd pool code to split the cmd and sense allocation and share the code to allocate the sense buffers as well as the sense buffer slab caches between the legacy and blk-mq path. Note that this switches to lazy allocation of the sense slab caches - the slab caches (not the actual allocations) won't be destroy until the scsi module is unloaded instead of keeping track of hosts using them. Signed-off-by: Christoph HellwigReviewed-by: Hannes Reinecke --- drivers/scsi/hosts.c | 4 drivers/scsi/scsi.c | 24 --- drivers/scsi/scsi_lib.c | 62 +--- drivers/scsi/scsi_priv.h | 5 4 files changed, 73 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 258a3f9..6d29c4a 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -213,6 +213,10 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, goto fail; } + error = scsi_init_sense_cache(shost); + if (error) + goto fail; + if (shost_use_blk_mq(shost)) { error = scsi_mq_setup_tags(shost); if (error) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 0f93892..469aa0f 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -100,22 +100,18 @@ EXPORT_SYMBOL(scsi_sd_pm_domain); struct scsi_host_cmd_pool { struct kmem_cache *cmd_slab; - struct kmem_cache *sense_slab; unsigned intusers; char*cmd_name; - char*sense_name; unsigned intslab_flags; }; static struct scsi_host_cmd_pool scsi_cmd_pool = { .cmd_name = "scsi_cmd_cache", - .sense_name = "scsi_sense_cache", .slab_flags = SLAB_HWCACHE_ALIGN, }; static struct scsi_host_cmd_pool scsi_cmd_dma_pool = { .cmd_name = "scsi_cmd_cache(DMA)", - .sense_name = "scsi_sense_cache(DMA)", .slab_flags = SLAB_HWCACHE_ALIGN|SLAB_CACHE_DMA, }; @@ -136,7 +132,7 @@ scsi_host_free_command(struct Scsi_Host *shost, struct scsi_cmnd *cmd) if (cmd->prot_sdb) kmem_cache_free(scsi_sdb_cache, cmd->prot_sdb); - kmem_cache_free(pool->sense_slab, cmd->sense_buffer); + scsi_free_sense_buffer(shost, cmd->sense_buffer); kmem_cache_free(pool->cmd_slab, cmd); } @@ -158,7 +154,8 @@ scsi_host_alloc_command(struct Scsi_Host *shost, gfp_t gfp_mask) if (!cmd) goto fail; - cmd->sense_buffer = kmem_cache_alloc(pool->sense_slab, gfp_mask); + cmd->sense_buffer = scsi_alloc_sense_buffer(shost, gfp_mask, + NUMA_NO_NODE); if (!cmd->sense_buffer) goto fail_free_cmd; @@ -171,7 +168,7 @@ scsi_host_alloc_command(struct Scsi_Host *shost, gfp_t gfp_mask) return cmd; fail_free_sense: - kmem_cache_free(pool->sense_slab, cmd->sense_buffer); + scsi_free_sense_buffer(shost, cmd->sense_buffer); fail_free_cmd: kmem_cache_free(pool->cmd_slab, cmd); fail: @@ -301,7 +298,6 @@ scsi_find_host_cmd_pool(struct Scsi_Host *shost) static void scsi_free_host_cmd_pool(struct scsi_host_cmd_pool *pool) { - kfree(pool->sense_name); kfree(pool->cmd_name); kfree(pool); } @@ -317,8 +313,7 @@ scsi_alloc_host_cmd_pool(struct Scsi_Host *shost) return NULL; pool->cmd_name = kasprintf(GFP_KERNEL, "%s_cmd", hostt->proc_name); - pool->sense_name = kasprintf(GFP_KERNEL, "%s_sense", hostt->proc_name); - if (!pool->cmd_name || !pool->sense_name) { + if (!pool->cmd_name) { scsi_free_host_cmd_pool(pool); return NULL; } @@ -357,12 +352,6 @@ scsi_get_host_cmd_pool(struct Scsi_Host *shost) pool->slab_flags, NULL); if (!pool->cmd_slab) goto out_free_pool; - - pool->sense_slab = kmem_cache_create(pool->sense_name, -SCSI_SENSE_BUFFERSIZE, 0, -pool->slab_flags, NULL); - if (!pool->sense_slab) - goto out_free_slab; } 
pool->users++; @@ -371,8 +360,6 @@ scsi_get_host_cmd_pool(struct Scsi_Host *shost) mutex_unlock(_cmd_pool_mutex); return retval; -out_free_slab: - kmem_cache_destroy(pool->cmd_slab); out_free_pool: if (hostt->cmd_size) { scsi_free_host_cmd_pool(pool); @@ -398,7 +385,6 @@ static void scsi_put_host_cmd_pool(struct Scsi_Host *shost) if (!--pool->users) { kmem_cache_destroy(pool->cmd_slab); -
[dm-devel] [PATCH 11/18] scsi: remove gfp_flags member in scsi_host_cmd_pool
When using the slab allocator we already decide at cache creation time if an allocation comes from a GFP_DMA pool using the SLAB_CACHE_DMA flag, and there is no point passing the kmalloc-family only GFP_DMA flag to kmem_cache_alloc. Drop all the infrastructure for doing so. Signed-off-by: Christoph HellwigReviewed-by: Johannes Thumshirn --- drivers/scsi/scsi.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 75455d4..0f93892 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -105,7 +105,6 @@ struct scsi_host_cmd_pool { char*cmd_name; char*sense_name; unsigned intslab_flags; - gfp_t gfp_mask; }; static struct scsi_host_cmd_pool scsi_cmd_pool = { @@ -118,7 +117,6 @@ static struct scsi_host_cmd_pool scsi_cmd_dma_pool = { .cmd_name = "scsi_cmd_cache(DMA)", .sense_name = "scsi_sense_cache(DMA)", .slab_flags = SLAB_HWCACHE_ALIGN|SLAB_CACHE_DMA, - .gfp_mask = __GFP_DMA, }; static DEFINE_MUTEX(host_cmd_pool_mutex); @@ -156,12 +154,11 @@ scsi_host_alloc_command(struct Scsi_Host *shost, gfp_t gfp_mask) struct scsi_host_cmd_pool *pool = shost->cmd_pool; struct scsi_cmnd *cmd; - cmd = kmem_cache_zalloc(pool->cmd_slab, gfp_mask | pool->gfp_mask); + cmd = kmem_cache_zalloc(pool->cmd_slab, gfp_mask); if (!cmd) goto fail; - cmd->sense_buffer = kmem_cache_alloc(pool->sense_slab, -gfp_mask | pool->gfp_mask); + cmd->sense_buffer = kmem_cache_alloc(pool->sense_slab, gfp_mask); if (!cmd->sense_buffer) goto fail_free_cmd; @@ -327,10 +324,8 @@ scsi_alloc_host_cmd_pool(struct Scsi_Host *shost) } pool->slab_flags = SLAB_HWCACHE_ALIGN; - if (shost->unchecked_isa_dma) { + if (shost->unchecked_isa_dma) pool->slab_flags |= SLAB_CACHE_DMA; - pool->gfp_mask = __GFP_DMA; - } if (hostt->cmd_size) hostt->cmd_pool = pool; @@ -424,7 +419,6 @@ static void scsi_put_host_cmd_pool(struct Scsi_Host *shost) */ int scsi_setup_command_freelist(struct Scsi_Host *shost) { - const gfp_t gfp_mask = shost->unchecked_isa_dma ? GFP_DMA : GFP_KERNEL; struct scsi_cmnd *cmd; spin_lock_init(>free_list_lock); @@ -437,7 +431,7 @@ int scsi_setup_command_freelist(struct Scsi_Host *shost) /* * Get one backup command for this host. */ - cmd = scsi_host_alloc_command(shost, gfp_mask); + cmd = scsi_host_alloc_command(shost, GFP_KERNEL); if (!cmd) { scsi_put_host_cmd_pool(shost); shost->cmd_pool = NULL; -- 2.1.4 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 10/18] scsi_dh_hp_sw: switch to scsi_execute_req_flags()
From: Hannes ReineckeSwitch to scsi_execute_req_flags() instead of using the block interface directly. This will set REQ_QUIET and REQ_PREEMPT, but this is okay as we're evaluating the errors anyway and should be able to send the command even if the device is quiesced. Signed-off-by: Hannes Reinecke Signed-off-by: Christoph Hellwig --- drivers/scsi/device_handler/scsi_dh_hp_sw.c | 222 1 file changed, 65 insertions(+), 157 deletions(-) diff --git a/drivers/scsi/device_handler/scsi_dh_hp_sw.c b/drivers/scsi/device_handler/scsi_dh_hp_sw.c index 308e871..be43c94 100644 --- a/drivers/scsi/device_handler/scsi_dh_hp_sw.c +++ b/drivers/scsi/device_handler/scsi_dh_hp_sw.c @@ -38,13 +38,10 @@ #define HP_SW_PATH_PASSIVE 1 struct hp_sw_dh_data { - unsigned char sense[SCSI_SENSE_BUFFERSIZE]; int path_state; int retries; int retry_cnt; struct scsi_device *sdev; - activate_complete callback_fn; - void*callback_data; }; static int hp_sw_start_stop(struct hp_sw_dh_data *); @@ -56,43 +53,34 @@ static int hp_sw_start_stop(struct hp_sw_dh_data *); * * Returns SCSI_DH_DEV_OFFLINED if the sdev is on the passive path */ -static int tur_done(struct scsi_device *sdev, unsigned char *sense) +static int tur_done(struct scsi_device *sdev, struct hp_sw_dh_data *h, + struct scsi_sense_hdr *sshdr) { - struct scsi_sense_hdr sshdr; - int ret; + int ret = SCSI_DH_IO; - ret = scsi_normalize_sense(sense, SCSI_SENSE_BUFFERSIZE, ); - if (!ret) { - sdev_printk(KERN_WARNING, sdev, - "%s: sending tur failed, no sense available\n", - HP_SW_NAME); - ret = SCSI_DH_IO; - goto done; - } - switch (sshdr.sense_key) { + switch (sshdr->sense_key) { case UNIT_ATTENTION: ret = SCSI_DH_IMM_RETRY; break; case NOT_READY: - if ((sshdr.asc == 0x04) && (sshdr.ascq == 2)) { + if (sshdr->asc == 0x04 && sshdr->ascq == 2) { /* * LUN not ready - Initialization command required * * This is the passive path */ - ret = SCSI_DH_DEV_OFFLINED; + h->path_state = HP_SW_PATH_PASSIVE; + ret = SCSI_DH_OK; break; } /* Fallthrough */ default: sdev_printk(KERN_WARNING, sdev, "%s: sending tur failed, sense %x/%x/%x\n", - HP_SW_NAME, sshdr.sense_key, sshdr.asc, - sshdr.ascq); + HP_SW_NAME, sshdr->sense_key, sshdr->asc, + sshdr->ascq); break; } - -done: return ret; } @@ -105,131 +93,36 @@ static int tur_done(struct scsi_device *sdev, unsigned char *sense) */ static int hp_sw_tur(struct scsi_device *sdev, struct hp_sw_dh_data *h) { - struct request *req; - int ret; + unsigned char cmd[6] = { TEST_UNIT_READY }; + struct scsi_sense_hdr sshdr; + int ret = SCSI_DH_OK, res; + u64 req_flags = REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | + REQ_FAILFAST_DRIVER; retry: - req = blk_get_request(sdev->request_queue, WRITE, GFP_NOIO); - if (IS_ERR(req)) - return SCSI_DH_RES_TEMP_UNAVAIL; - - blk_rq_set_block_pc(req); - req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | - REQ_FAILFAST_DRIVER; - req->cmd_len = COMMAND_SIZE(TEST_UNIT_READY); - req->cmd[0] = TEST_UNIT_READY; - req->timeout = HP_SW_TIMEOUT; - req->sense = h->sense; - memset(req->sense, 0, SCSI_SENSE_BUFFERSIZE); - req->sense_len = 0; - - ret = blk_execute_rq(req->q, NULL, req, 1); - if (ret == -EIO) { - if (req->sense_len > 0) { - ret = tur_done(sdev, h->sense); - } else { + res = scsi_execute_req_flags(sdev, cmd, DMA_NONE, NULL, 0, , +HP_SW_TIMEOUT, HP_SW_RETRIES, +NULL, req_flags, 0); + if (res) { + if (scsi_sense_valid()) + ret = tur_done(sdev, h, ); + else { sdev_printk(KERN_WARNING, sdev, "%s: sending tur failed with %x\n", - HP_SW_NAME, req->errors); + HP_SW_NAME, res); ret = SCSI_DH_IO; } } else { 
h->path_state = HP_SW_PATH_ACTIVE;
Re: [dm-devel] deterministic io throughput in multipath
On Wed, Jan 25, 2017 at 11:48:33AM +, Muneendra Kumar M wrote:
> Hi Ben,
> Thanks for the review. I will consider the below points and will make the
> necessary changes.
>
> I have two general questions which may not be related to this.
> 1) Are there any standard tests that we need to run to check the
>    functionality of the multipath daemon?

No. multipath doesn't have a standard set of regression tests. You need to do
your own testing.

> 2) I am new to git. Are there any standard steps which we generally follow
>    to push the changes?

You don't need to use git to push a patch, but it is easier to process if your
patch is inline in the email instead of as an attachment (assuming your mail
client doesn't mangle the patch). If you want to use git, you just need to
commit your patches to a branch off the head of master. Then you can build
patches with

# git format-patch --cover-letter -s -n -o origin

and send them with

# git send-email --to "device-mapper development" --cc "Christophe Varoqui " --no-chain-reply-to --suppress-from

You may first need to set up your git name and email.

-Ben

> Regards,
> Muneendra.
>
> -----Original Message-----
> From: Benjamin Marzinski [mailto:bmarz...@redhat.com]
> Sent: Wednesday, January 25, 2017 2:59 PM
> To: Muneendra Kumar M
> Cc: dm-devel@redhat.com
> Subject: Re: [dm-devel] deterministic io throughput in multipath
>
> This looks fine to me. If this is what you want to push, I'm o.k. with it.
> But I'd like to make some suggestions that you are free to ignore.
>
> Right now you have to check in two places to see if the path failed (in
> update_multipath and check_path). If you look at the delayed_*_checks code,
> it flags the path failures when you reinstate the path in check_path, since
> this will only happen there.
>
> Next, right now you use the disable_reinstate code to deal with the devices
> when they shouldn't be reinstated. The issue with this is that the path
> appears to be up when people look at its state, but still isn't being used.
> If you do the check early and set the path state to PATH_DELAYED, like
> delayed_*_checks does, then the path is clearly marked when users look to
> see why it isn't being used. Also, if you exit check_path early, then you
> won't be running the prioritizer on these likely-unstable paths.
>
> Finally, the way you use dis_reinstate_time, a flakey device can get
> reinstated as soon as it comes back up, as long as it was down for long
> enough, simply because pp->dis_reinstate_time reached
> mpp->san_path_err_recovery_time while the device was failed.
> delayed_*_checks depends on a number of successful path checks, so you know
> that the device has at least been nominally functional for
> san_path_err_recovery_time.
>
> Like I said, you don't have to change any of this to make me happy with your
> patch. But if you did change all of these, then the current delay_*_checks
> code would just end up being a special case of your code. I'd really like to
> pull out the delayed_*_checks code and just keep your version, since it
> seems more useful. It would be nice to keep the same functionality. But even
> if you don't make these changes, I still think we should pull out the
> delayed_*_checks code, since they both do the same general thing, and your
> code does it better.
>
> -Ben
>
> On Mon, Jan 23, 2017 at 11:02:42AM +, Muneendra Kumar M wrote:
> > Hi Ben,
> > I have made the changes as per the below review comments.
> >
> > Could you please review the attached patch and provide us your valuable
> > comments.
> > Below are the files that have been changed:
> >
> > libmultipath/config.c      |  3 +++
> > libmultipath/config.h      |  9 +
> > libmultipath/configure.c   |  3 +++
> > libmultipath/defaults.h    |  3 ++-
> > libmultipath/dict.c        | 84
> > libmultipath/dict.h        |  3 +--
> > libmultipath/propsel.c     | 48 ++--
> > libmultipath/propsel.h     |  3 +++
> > libmultipath/structs.h     | 14 ++
> > libmultipath/structs_vec.c |  6 ++
> > multipath/multipath.conf.5 | 57 +
> > multipathd/main.c          | 70 +++---
> >
> > Regards,
> > Muneendra.
> >
> > _____
> > From: Muneendra Kumar M
> > Sent: Tuesday, January 17, 2017 4:13 PM
> > To: 'Benjamin Marzinski'
> > Cc: dm-devel@redhat.com
> > Subject: RE: [dm-devel] deterministic io throughput in multipath
> >
> > Hi Ben,
> > Thanks for the review.
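For the last step Ben mentions above (setting up your git name and email), a
minimal sketch of the commands involved; the name, address, and SMTP server
are placeholders, and the sendemail settings depend entirely on your outgoing
mail setup:

# git config --global user.name "Your Name"
# git config --global user.email "you@example.com"
# git config --global sendemail.smtpServer smtp.example.com

user.name and user.email are also what the -s flag to git format-patch uses
for the Signed-off-by: line, so they should match the identity you post
patches under.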
Re: [dm-devel] deterministic io throughput in multipath
Hi Ben,
Thanks for the review. I will consider the below points and will make the
necessary changes.

I have two general questions which may not be related to this.
1) Are there any standard tests that we need to run to check the
   functionality of the multipath daemon?
2) I am new to git. Are there any standard steps which we generally follow
   to push the changes?

Regards,
Muneendra.

-----Original Message-----
From: Benjamin Marzinski [mailto:bmarz...@redhat.com]
Sent: Wednesday, January 25, 2017 2:59 PM
To: Muneendra Kumar M
Cc: dm-devel@redhat.com
Subject: Re: [dm-devel] deterministic io throughput in multipath

This looks fine to me. If this is what you want to push, I'm o.k. with it.
But I'd like to make some suggestions that you are free to ignore.

Right now you have to check in two places to see if the path failed (in
update_multipath and check_path). If you look at the delayed_*_checks code,
it flags the path failures when you reinstate the path in check_path, since
this will only happen there.

Next, right now you use the disable_reinstate code to deal with the devices
when they shouldn't be reinstated. The issue with this is that the path
appears to be up when people look at its state, but still isn't being used.
If you do the check early and set the path state to PATH_DELAYED, like
delayed_*_checks does, then the path is clearly marked when users look to
see why it isn't being used. Also, if you exit check_path early, then you
won't be running the prioritizer on these likely-unstable paths.

Finally, the way you use dis_reinstate_time, a flakey device can get
reinstated as soon as it comes back up, as long as it was down for long
enough, simply because pp->dis_reinstate_time reached
mpp->san_path_err_recovery_time while the device was failed.
delayed_*_checks depends on a number of successful path checks, so you know
that the device has at least been nominally functional for
san_path_err_recovery_time.

Like I said, you don't have to change any of this to make me happy with your
patch. But if you did change all of these, then the current delay_*_checks
code would just end up being a special case of your code. I'd really like to
pull out the delayed_*_checks code and just keep your version, since it
seems more useful. It would be nice to keep the same functionality. But even
if you don't make these changes, I still think we should pull out the
delayed_*_checks code, since they both do the same general thing, and your
code does it better.

-Ben

On Mon, Jan 23, 2017 at 11:02:42AM +, Muneendra Kumar M wrote:
> Hi Ben,
> I have made the changes as per the below review comments.
>
> Could you please review the attached patch and provide us your valuable
> comments.
> Below are the files that have been changed:
>
> libmultipath/config.c      |  3 +++
> libmultipath/config.h      |  9 +
> libmultipath/configure.c   |  3 +++
> libmultipath/defaults.h    |  3 ++-
> libmultipath/dict.c        | 84
> libmultipath/dict.h        |  3 +--
> libmultipath/propsel.c     | 48 ++--
> libmultipath/propsel.h     |  3 +++
> libmultipath/structs.h     | 14 ++
> libmultipath/structs_vec.c |  6 ++
> multipath/multipath.conf.5 | 57 +
> multipathd/main.c          | 70 +++---
>
> Regards,
> Muneendra.
>
> _____
> From: Muneendra Kumar M
> Sent: Tuesday, January 17, 2017 4:13 PM
> To: 'Benjamin Marzinski'
> Cc: dm-devel@redhat.com
> Subject: RE: [dm-devel] deterministic io throughput in multipath
>
> Hi Ben,
> Thanks for the review.
> In dict.c I will make sure I make generic functions which will be used by
> both delay_checks and err_checks.
>
> We want to increment the path failures every time the path goes down,
> regardless of whether multipathd or the kernel noticed the failure of the
> path. Thanks for pointing this out.
>
> I completely agree with the idea you mentioned below of replacing
> san_path_err_threshold_window with san_path_err_forget_rate. This will
> avoid counting time when the path was down as time where the path wasn't
> having problems.
>
> I will incorporate all the changes mentioned below and will resend the
> patch once the testing is done.
>
> Regards,
> Muneendra.
>
> -----Original Message-----
> From: Benjamin Marzinski [mailto:bmarz...@redhat.com]
> Sent: Tuesday, January 17, 2017 6:35 AM
> To: Muneendra Kumar M <mmand...@brocade.com>
> Cc:
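To make the tunables under discussion concrete: like other multipath options
they are meant to be set in /etc/multipath.conf. A hypothetical excerpt of a
defaults section follows; the option names, values, and units here are only
illustrative, since the exact names and semantics were still being settled in
this thread:

defaults {
        # how many failures a path may accumulate before it is taken out of use
        san_path_err_threshold      10
        # how quickly accumulated failures are forgotten while the path stays healthy
        san_path_err_forget_rate    60
        # how long a path that crossed the threshold stays out of use before
        # it is considered for reinstatement
        san_path_err_recovery_time  120
}

multipathd would need a reconfigure (for example via multipathd -k'reconfigure')
to pick up changed values.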
Re: [dm-devel] deterministic io throughput in multipath
This looks fine to me. If this is what you want to push, I'm o.k. with it.
But I'd like to make some suggestions that you are free to ignore.

Right now you have to check in two places to see if the path failed (in
update_multipath and check_path). If you look at the delayed_*_checks code,
it flags the path failures when you reinstate the path in check_path, since
this will only happen there.

Next, right now you use the disable_reinstate code to deal with the devices
when they shouldn't be reinstated. The issue with this is that the path
appears to be up when people look at its state, but still isn't being used.
If you do the check early and set the path state to PATH_DELAYED, like
delayed_*_checks does, then the path is clearly marked when users look to
see why it isn't being used. Also, if you exit check_path early, then you
won't be running the prioritizer on these likely-unstable paths.

Finally, the way you use dis_reinstate_time, a flakey device can get
reinstated as soon as it comes back up, as long as it was down for long
enough, simply because pp->dis_reinstate_time reached
mpp->san_path_err_recovery_time while the device was failed.
delayed_*_checks depends on a number of successful path checks, so you know
that the device has at least been nominally functional for
san_path_err_recovery_time.

Like I said, you don't have to change any of this to make me happy with your
patch. But if you did change all of these, then the current delay_*_checks
code would just end up being a special case of your code. I'd really like to
pull out the delayed_*_checks code and just keep your version, since it
seems more useful. It would be nice to keep the same functionality. But even
if you don't make these changes, I still think we should pull out the
delayed_*_checks code, since they both do the same general thing, and your
code does it better.

-Ben

On Mon, Jan 23, 2017 at 11:02:42AM +, Muneendra Kumar M wrote:
> Hi Ben,
> I have made the changes as per the below review comments.
>
> Could you please review the attached patch and provide us your valuable
> comments.
> Below are the files that have been changed:
>
> libmultipath/config.c      |  3 +++
> libmultipath/config.h      |  9 +
> libmultipath/configure.c   |  3 +++
> libmultipath/defaults.h    |  3 ++-
> libmultipath/dict.c        | 84
> libmultipath/dict.h        |  3 +--
> libmultipath/propsel.c     | 48 ++--
> libmultipath/propsel.h     |  3 +++
> libmultipath/structs.h     | 14 ++
> libmultipath/structs_vec.c |  6 ++
> multipath/multipath.conf.5 | 57 +
> multipathd/main.c          | 70 +++---
>
> Regards,
> Muneendra.
>
> _____
> From: Muneendra Kumar M
> Sent: Tuesday, January 17, 2017 4:13 PM
> To: 'Benjamin Marzinski'
> Cc: dm-devel@redhat.com
> Subject: RE: [dm-devel] deterministic io throughput in multipath
>
> Hi Ben,
> Thanks for the review.
> In dict.c I will make sure I make generic functions which will be used by
> both delay_checks and err_checks.
>
> We want to increment the path failures every time the path goes down,
> regardless of whether multipathd or the kernel noticed the failure of the
> path. Thanks for pointing this out.
>
> I completely agree with the idea you mentioned below of replacing
> san_path_err_threshold_window with san_path_err_forget_rate. This will
> avoid counting time when the path was down as time where the path wasn't
> having problems.
>
> I will incorporate all the changes mentioned below and will resend the
> patch once the testing is done.
>
> Regards,
> Muneendra.
>
> -----Original Message-----
> From: Benjamin Marzinski [mailto:bmarz...@redhat.com]
> Sent: Tuesday, January 17, 2017 6:35 AM
> To: Muneendra Kumar M <mmand...@brocade.com>
> Cc: dm-devel@redhat.com
> Subject: Re: [dm-devel] deterministic io throughput in multipath
>
> On Mon, Jan 16, 2017 at 11:19:19AM +, Muneendra Kumar M wrote:
> > Hi Ben,
> > After the below discussion we came up with an approach which will meet
> > our requirement.
> > I have attached the patch which is working well in our field tests.
> > Could you please review the attached patch and provide us your valuable
> > comments.
>
> I can see a number of issues with this patch.
>
> First, some nit-picks:
> - I assume "dis_reinstante_time" should be "dis_reinstate_time"
>
> - The indenting in
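As noted earlier in the thread, there is no standard multipath regression
suite, so the flaky-path behaviour discussed above is easiest to exercise by
hand. A minimal sketch of one way to do that against a SCSI path device; the
device name sdX, the loop count, and the sleep intervals are placeholders to
be matched to your own configuration:

# Take one path member of the multipath map (not the dm device itself)
# offline and back a few times so it looks like a flaky path, then watch
# how multipathd accounts the failures and when it reinstates the path.
for i in 1 2 3 4 5; do
    echo offline > /sys/block/sdX/device/state    # path goes down
    sleep 5
    echo running > /sys/block/sdX/device/state    # path comes back
    sleep 5
done
multipath -ll                  # map and path states as seen by multipath
multipathd -k'show paths'      # per-path checker state reported by the daemon

Running the loop with intervals shorter and longer than the configured
recovery settings is a simple way to confirm that an unstable path is held
back while a path with occasional, well-spaced failures is not.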