RE: [PATCH 0/5] blk-mq/scsi-mq: support global tags & introduce force_blk_mq

2018-02-04 Thread Kashyap Desai
> -----Original Message-----
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Monday, February 5, 2018 12:28 PM
> To: Ming Lei; Jens Axboe; linux-block@vger.kernel.org; Christoph Hellwig;
> Mike Snitzer
> Cc: linux-s...@vger.kernel.org; Arun Easi; Omar Sandoval; Martin K .
> Petersen;
> James Bottomley; Christoph Hellwig; Don Brace; Kashyap Desai; Peter
> Rivera;
> Paolo Bonzini; Laurence Oberman
> Subject: Re: [PATCH 0/5] blk-mq/scsi-mq: support global tags & introduce
> force_blk_mq
>
> On 02/03/2018 05:21 AM, Ming Lei wrote:
> > Hi All,
> >
> > This patchset supports global tags which was started by Hannes
> > originally:
> >
> > https://marc.info/?l=linux-block=149132580511346=2
> >
> > Also introduce 'force_blk_mq' to 'struct scsi_host_template', so that
> > drivers can avoid supporting two IO paths (legacy and blk-mq),
> > especially as recent discussion mentioned that SCSI_MQ will be enabled
> > by default soon.
> >
> > https://marc.info/?l=linux-scsi=151727684915589=2
> >
> > With the above two changes, it should be easier to convert SCSI drivers'
> > reply queue into blk-mq's hctx, then the automatic irq affinity issue
> > can be solved easily; please see the detailed description in the commit log.
> >
> > Also drivers may require to complete request on the submission CPU for
> > avoiding hard/soft deadlock, which can be done easily with blk_mq too.
> >
> > https://marc.info/?t=15160185141=1=2
> >
> > The final patch uses the introduced 'force_blk_mq' to fix virtio_scsi
> > so that IO hang issue can be avoided inside legacy IO path, this issue
> > is a bit generic, at least HPSA/virtio-scsi are found broken with
> > v4.15+.
> >
> > Thanks
> > Ming
> >
> > Ming Lei (5):
> >   blk-mq: tags: define several fields of tags as pointer
> >   blk-mq: introduce BLK_MQ_F_GLOBAL_TAGS
> >   block: null_blk: introduce module parameter of 'g_global_tags'
> >   scsi: introduce force_blk_mq
> >   scsi: virtio_scsi: fix IO hang by irq vector automatic affinity
> >
> >  block/bfq-iosched.c|  4 +--
> >  block/blk-mq-debugfs.c | 11 
> >  block/blk-mq-sched.c   |  2 +-
> >  block/blk-mq-tag.c | 67
> > +-
> >  block/blk-mq-tag.h | 15 ---
> >  block/blk-mq.c | 31 -
> >  block/blk-mq.h |  3 ++-
> >  block/kyber-iosched.c  |  2 +-
> >  drivers/block/null_blk.c   |  6 +
> >  drivers/scsi/hosts.c   |  1 +
> >  drivers/scsi/virtio_scsi.c | 59
> > +++-
> >  include/linux/blk-mq.h |  2 ++
> >  include/scsi/scsi_host.h   |  3 +++
> >  13 files changed, 105 insertions(+), 101 deletions(-)
> >
> Thanks Ming for picking this up.
>
> I'll give it a shot and see how it behaves on other hardware.

Ming -

There is no way we can enable global tags from the SCSI stack in this patch
series. I still think this series has no solution for the issue described in
the thread below.
https://marc.info/?t=15160185141=1=2

All we will be doing is using global tags HBA-wide instead of per h/w queue.
We still have more than one reply queue ending up completing on one CPU.
Try reducing the number of MSI-x vectors of the megaraid_sas or mpt3sas
driver via module parameter to simulate the issue. We need more online CPUs
than reply queues.
We may see completions redirected to the original CPU because of
"QUEUE_FLAG_SAME_FORCE", but the ISR of the low-level driver can still keep
one CPU busy in its local ISR routine.


Kashyap

>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke  Teamlead Storage & Networking
> h...@suse.de +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284
> (AG Nürnberg)


Re: [PATCH 0/5] blk-mq/scsi-mq: support global tags & introduce force_blk_mq

2018-02-04 Thread Hannes Reinecke
On 02/03/2018 05:21 AM, Ming Lei wrote:
> Hi All,
> 
> This patchset supports global tags which was started by Hannes originally:
> 
>   https://marc.info/?l=linux-block=149132580511346=2
> 
> Also introduce 'force_blk_mq' to 'struct scsi_host_template', so that
> drivers can avoid supporting two IO paths (legacy and blk-mq), especially
> as recent discussion mentioned that SCSI_MQ will be enabled by default soon.
> 
>   https://marc.info/?l=linux-scsi=151727684915589=2
> 
> With the above two changes, it should be easier to convert SCSI drivers'
> reply queue into blk-mq's hctx, then the automatic irq affinity issue can
> be solved easily; please see the detailed description in the commit log.
> 
> Also drivers may require to complete request on the submission CPU
> for avoiding hard/soft deadlock, which can be done easily with blk_mq
> too.
> 
>   https://marc.info/?t=15160185141=1=2
> 
> The final patch uses the introduced 'force_blk_mq' to fix virtio_scsi
> so that IO hang issue can be avoided inside legacy IO path, this issue is
> a bit generic, at least HPSA/virtio-scsi are found broken with v4.15+.
> 
> Thanks
> Ming
> 
> Ming Lei (5):
>   blk-mq: tags: define several fields of tags as pointer
>   blk-mq: introduce BLK_MQ_F_GLOBAL_TAGS
>   block: null_blk: introduce module parameter of 'g_global_tags'
>   scsi: introduce force_blk_mq
>   scsi: virtio_scsi: fix IO hang by irq vector automatic affinity
> 
>  block/bfq-iosched.c|  4 +--
>  block/blk-mq-debugfs.c | 11 
>  block/blk-mq-sched.c   |  2 +-
>  block/blk-mq-tag.c | 67 
> +-
>  block/blk-mq-tag.h | 15 ---
>  block/blk-mq.c | 31 -
>  block/blk-mq.h |  3 ++-
>  block/kyber-iosched.c  |  2 +-
>  drivers/block/null_blk.c   |  6 +
>  drivers/scsi/hosts.c   |  1 +
>  drivers/scsi/virtio_scsi.c | 59 +++-
>  include/linux/blk-mq.h |  2 ++
>  include/scsi/scsi_host.h   |  3 +++
>  13 files changed, 105 insertions(+), 101 deletions(-)
> 
Thanks Ming for picking this up.

I'll give it a shot and see how it behaves on other hardware.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 1/5] blk-mq: tags: define several fields of tags as pointer

2018-02-04 Thread Hannes Reinecke
On 02/03/2018 05:21 AM, Ming Lei wrote:
> This patch changes tags->breserved_tags, tags->bitmap_tags and
> tags->active_queues as pointer, and prepares for supporting global tags.
> 
> No functional change.
> 
> Cc: Laurence Oberman 
> Cc: Mike Snitzer 
> Cc: Christoph Hellwig 
> Signed-off-by: Ming Lei 
> ---
>  block/bfq-iosched.c|  4 ++--
>  block/blk-mq-debugfs.c | 10 +-
>  block/blk-mq-tag.c | 48 ++--
>  block/blk-mq-tag.h | 10 +++---
>  block/blk-mq.c |  2 +-
>  block/kyber-iosched.c  |  2 +-
>  6 files changed, 42 insertions(+), 34 deletions(-)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 5/5] scsi: virtio_scsi: fix IO hang by irq vector automatic affinity

2018-02-04 Thread Hannes Reinecke
On 02/03/2018 05:21 AM, Ming Lei wrote:
> Now 84676c1f21e8ff5(genirq/affinity: assign vectors to all possible CPUs)
> has been merged to V4.16-rc, and it is easy to allocate all offline CPUs
> for some irq vectors, this can't be avoided even though the allocation
> is improved.
> 
> For example, on a 8cores VM, 4~7 are not-present/offline, 4 queues of
> virtio-scsi, the irq affinity assigned can become the following shape:
> 
>   irq 36, cpu list 0-7
>   irq 37, cpu list 0-7
>   irq 38, cpu list 0-7
>   irq 39, cpu list 0-1
>   irq 40, cpu list 4,6
>   irq 41, cpu list 2-3
>   irq 42, cpu list 5,7
> 
> Then IO hang is triggered in case of non-SCSI_MQ.
> 
> Given that storage IO always follows a client/server (C/S) model, there is
> no such issue with SCSI_MQ (blk-mq), because no IO can be submitted to a
> hw queue that has no online CPUs.
> 
> Fix this issue by forcing the use of blk_mq.
> 
> BTW, I have been using virtio-scsi (scsi_mq) for several years, and it has
> been quite stable, so it shouldn't cause extra risk.
> 
> Cc: Hannes Reinecke 
> Cc: Arun Easi 
> Cc: Omar Sandoval ,
> Cc: "Martin K. Petersen" ,
> Cc: James Bottomley ,
> Cc: Christoph Hellwig ,
> Cc: Don Brace 
> Cc: Kashyap Desai 
> Cc: Peter Rivera 
> Cc: Paolo Bonzini 
> Cc: Laurence Oberman 
> Cc: Mike Snitzer 
> Signed-off-by: Ming Lei 
> ---
>  drivers/scsi/virtio_scsi.c | 59 
> +++---
>  1 file changed, 3 insertions(+), 56 deletions(-)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
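
The affinity table in the commit message above can be checked mechanically.
What follows is only a minimal sketch, not kernel code: the type and function
names (cpumask8_t, struct irq_vec, count_dead_vectors, first_dead_irq) are
made up for illustration, and the masks are simply the cpu lists from the
example, with CPUs 0-3 assumed online and 4-7 offline.

```c
/* One bit per CPU; bit n set means CPU n is in the mask (8 CPUs at most). */
typedef unsigned int cpumask8_t;

/* Hypothetical model of one irq vector and its assigned affinity. */
struct irq_vec {
    int irq;             /* irq number as printed in the commit message */
    cpumask8_t affinity; /* CPUs the vector may be delivered to */
};

/* The seven vectors from the example: "irq N, cpu list ...". */
const struct irq_vec example_vecs[7] = {
    { 36, 0xff }, /* cpu list 0-7 */
    { 37, 0xff }, /* cpu list 0-7 */
    { 38, 0xff }, /* cpu list 0-7 */
    { 39, 0x03 }, /* cpu list 0-1 */
    { 40, 0x50 }, /* cpu list 4,6 */
    { 41, 0x0c }, /* cpu list 2-3 */
    { 42, 0xa0 }, /* cpu list 5,7 */
};

/* Count vectors whose affinity mask contains no online CPU at all. */
int count_dead_vectors(const struct irq_vec *v, int n, cpumask8_t online)
{
    int dead = 0;

    for (int i = 0; i < n; i++)
        if ((v[i].affinity & online) == 0)
            dead++;
    return dead;
}

/* Return the irq number of the first such vector, or -1 if there is none. */
int first_dead_irq(const struct irq_vec *v, int n, cpumask8_t online)
{
    for (int i = 0; i < n; i++)
        if ((v[i].affinity & online) == 0)
            return v[i].irq;
    return -1;
}
```

With an online mask of 0x0f (CPUs 0-3), irq 40 and irq 42 are the two
vectors left without any online CPU, which is the situation the commit
message describes for the legacy path.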


Re: [PATCH 4/5] scsi: introduce force_blk_mq

2018-02-04 Thread Hannes Reinecke
On 02/03/2018 05:21 AM, Ming Lei wrote:
> From scsi driver view, it is a bit troublesome to support both blk-mq
> and non-blk-mq at the same time, especially when drivers need to support
> multi hw-queue.
> 
> This patch introduces 'force_blk_mq' to scsi_host_template so that drivers
> can provide blk-mq only support, so driver code can avoid the trouble
> for supporting both.
> 
> This patch may clean up drivers a lot by providing blk-mq only support;
> in particular, it becomes easier to convert multiple reply queues into
> blk_mq's MQ for the following purposes:
> 
> 1) use blk_mq multiple hw queues to deal with irq vectors whose allocated
> affinity contains only offline CPUs[1]:
> 
>   [1] https://marc.info/?l=linux-kernel=151748144730409=2
> 
> Now 84676c1f21e8ff5(genirq/affinity: assign vectors to all possible CPUs)
> has been merged to V4.16-rc, and it is easy to allocate all offline CPUs
> for some irq vectors, this can't be avoided even though the allocation
> is improved.
> 
> So all these drivers have to avoid to ask HBA to complete request in
> reply queue which hasn't online CPUs assigned.
> 
> This issue can be solved generically and easily via blk_mq(scsi_mq) multiple
> hw queue by mapping each reply queue into hctx.
> 
> 2) some drivers[2] require completing requests on the submission CPU to
> avoid hard/soft lockups, which is easily done with blk_mq, so there is no
> need to reinvent the wheel to solve the problem.
> 
>   [2] https://marc.info/?t=15160185141=1=2
> 
> Solving the above issues for the non-MQ path may not be easy, or may
> introduce unnecessary work, especially as we plan to enable SCSI_MQ soon,
> as discussed recently[3]:
> 
>   [3] https://marc.info/?l=linux-scsi=151727684915589=2
> 
> Cc: Hannes Reinecke 
> Cc: Arun Easi 
> Cc: Omar Sandoval ,
> Cc: "Martin K. Petersen" ,
> Cc: James Bottomley ,
> Cc: Christoph Hellwig ,
> Cc: Don Brace 
> Cc: Kashyap Desai 
> Cc: Peter Rivera 
> Cc: Laurence Oberman 
> Cc: Mike Snitzer 
> Signed-off-by: Ming Lei 
> ---
>  drivers/scsi/hosts.c | 1 +
>  include/scsi/scsi_host.h | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index fe3a0da3ec97..c75cebd7911d 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -471,6 +471,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
>   shost->dma_boundary = 0xffffffff;
>  
>   shost->use_blk_mq = scsi_use_blk_mq;
> + shost->use_blk_mq = scsi_use_blk_mq || !!shost->hostt->force_blk_mq;
>  
>   device_initialize(&shost->shost_gendev);
>   dev_set_name(&shost->shost_gendev, "host%d", shost->host_no);
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index a8b7bf879ced..4118760e5c32 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -452,6 +452,9 @@ struct scsi_host_template {
>   /* True if the controller does not support WRITE SAME */
>   unsigned no_write_same:1;
>  
> + /* tell scsi core we support blk-mq only */
> + unsigned force_blk_mq:1;
> +
>   /*
>* Countdown for host blocking with no commands outstanding.
>*/
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
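
To make the effect of the one-line change in scsi_host_alloc() concrete, here
is a small stand-alone model of it. This is a sketch under stated assumptions:
struct host_template_model, struct host_model and host_model_init() are
hypothetical, trimmed-down stand-ins, and only the final assignment is taken
from the patch itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of struct scsi_host_template. */
struct host_template_model {
    unsigned no_write_same:1;
    unsigned force_blk_mq:1;  /* tell the core the driver supports blk-mq only */
};

/* Hypothetical stand-in for the relevant bits of struct Scsi_Host. */
struct host_model {
    const struct host_template_model *hostt;
    bool use_blk_mq;
};

/*
 * Models the assignment added by the patch: the module-wide default
 * (scsi_use_blk_mq) can enable blk-mq, and a driver that sets
 * force_blk_mq gets blk-mq regardless of that default.
 */
void host_model_init(struct host_model *shost,
                     const struct host_template_model *sht,
                     bool scsi_use_blk_mq)
{
    shost->hostt = sht;
    shost->use_blk_mq = scsi_use_blk_mq || !!shost->hostt->force_blk_mq;
}
```

The only combination that still selects the legacy path is a default of
"off" together with a template that leaves force_blk_mq clear.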


Re: [PATCH 2/5] blk-mq: introduce BLK_MQ_F_GLOBAL_TAGS

2018-02-04 Thread Hannes Reinecke
On 02/03/2018 05:21 AM, Ming Lei wrote:
> Quite a few HBAs(such as HPSA, megaraid, mpt3sas, ..) support multiple
> reply queues, but tags is often HBA wide.
> 
> These HBAs have switched to use pci_alloc_irq_vectors(PCI_IRQ_AFFINITY)
> for automatic affinity assignment.
> 
> Now 84676c1f21e8ff5(genirq/affinity: assign vectors to all possible CPUs)
> has been merged to V4.16-rc, and it is easy to allocate all offline CPUs
> for some irq vectors, this can't be avoided even though the allocation
> is improved.
> 
> So all these drivers have to avoid to ask HBA to complete request in
> reply queue which hasn't online CPUs assigned, and HPSA has been broken
> with v4.15+:
> 
>   https://marc.info/?l=linux-kernel=151748144730409=2
> 
> This issue can be solved generically and easily via blk_mq(scsi_mq) multiple
> hw queue by mapping each reply queue into hctx, but one tricky thing is
> the HBA wide(instead of hw queue wide) tags.
> 
> This patch is based on the following Hannes's patch:
> 
>   https://marc.info/?l=linux-block=149132580511346=2
> 
> One big difference with Hannes's is that this patch only makes the tags
> sbitmap and active_queues data structures HBA wide; the others, such as
> request, hctx, tags, ..., keep their NUMA locality.
> 
> The following patch will support global tags on null_blk, also the performance
> data is provided, no obvious performance loss is observed when the whole
> hw queue depth is same.
> 
> Cc: Hannes Reinecke 
> Cc: Arun Easi 
> Cc: Omar Sandoval ,
> Cc: "Martin K. Petersen" ,
> Cc: James Bottomley ,
> Cc: Christoph Hellwig ,
> Cc: Don Brace 
> Cc: Kashyap Desai 
> Cc: Peter Rivera 
> Cc: Laurence Oberman 
> Cc: Mike Snitzer 
> Signed-off-by: Ming Lei 
> ---
>  block/blk-mq-debugfs.c |  1 +
>  block/blk-mq-sched.c   |  2 +-
>  block/blk-mq-tag.c | 23 ++-
>  block/blk-mq-tag.h |  5 -
>  block/blk-mq.c | 29 -
>  block/blk-mq.h |  3 ++-
>  include/linux/blk-mq.h |  2 ++
>  7 files changed, 52 insertions(+), 13 deletions(-)
> 
> diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
> index 0dfafa4b655a..0f0fafe03f5d 100644
> --- a/block/blk-mq-debugfs.c
> +++ b/block/blk-mq-debugfs.c
> @@ -206,6 +206,7 @@ static const char *const hctx_flag_name[] = {
>   HCTX_FLAG_NAME(SHOULD_MERGE),
>   HCTX_FLAG_NAME(TAG_SHARED),
>   HCTX_FLAG_NAME(SG_MERGE),
> + HCTX_FLAG_NAME(GLOBAL_TAGS),
>   HCTX_FLAG_NAME(BLOCKING),
>   HCTX_FLAG_NAME(NO_SCHED),
>  };
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 55c0a745b427..191d4bc95f0e 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -495,7 +495,7 @@ static int blk_mq_sched_alloc_tags(struct request_queue *q,
>   int ret;
>  
>   hctx->sched_tags = blk_mq_alloc_rq_map(set, hctx_idx, q->nr_requests,
> -set->reserved_tags);
> +set->reserved_tags, false);
>   if (!hctx->sched_tags)
>   return -ENOMEM;
>  
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index 571797dc36cb..66377d09eaeb 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -379,9 +379,11 @@ static struct blk_mq_tags *blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
>   return NULL;
>  }
>  
> -struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
> +struct blk_mq_tags *blk_mq_init_tags(struct blk_mq_tag_set *set,
> +  unsigned int total_tags,
>unsigned int reserved_tags,
> -  int node, int alloc_policy)
> +  int node, int alloc_policy,
> +  bool global_tag)
>  {
>   struct blk_mq_tags *tags;
>  
> @@ -397,6 +399,14 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
>   tags->nr_tags = total_tags;
>   tags->nr_reserved_tags = reserved_tags;
>  
> + WARN_ON(global_tag && !set->global_tags);
> + if (global_tag && set->global_tags) {
> + tags->bitmap_tags = set->global_tags->bitmap_tags;
> + tags->breserved_tags = set->global_tags->breserved_tags;
> + tags->active_queues = set->global_tags->active_queues;
> + tags->global_tags = true;
> + return tags;
> + }
>   tags->bitmap_tags = &tags->__bitmap_tags;
>   tags->breserved_tags = &tags->__breserved_tags;
>   tags->active_queues = &tags->__active_queues;
Do we really need the 'global_tag' flag here?
Can't we just rely on the ->global_tags pointer to be set?

> @@ -406,8 +416,10 @@ struct blk_mq_tags 
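
The pointer indirection discussed above (per-hctx tags whose bitmap pointer
refers either to its own storage or to one HBA-wide bitmap) can be modelled
in a few lines. This is only a sketch of the variant Hannes asks about, where
the decision relies on the set's global_tags pointer alone; it is not the
code from the patch. All names ending in _model, the own_bitmap_tags field
and the single-word bitmap are simplifications invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for an sbitmap: one word, so depth must not exceed 32 here. */
struct bitmap_model {
    unsigned long word;
};

/* Hypothetical per-hctx tags: the pointer targets own or shared storage. */
struct tags_model {
    struct bitmap_model *bitmap_tags;
    struct bitmap_model own_bitmap_tags;
    bool global_tags;
};

/* Hypothetical tag set: global_tags is NULL unless tags are HBA wide. */
struct tag_set_model {
    struct tags_model *global_tags;
};

/* Point the per-hctx tags at the shared bitmap if the set provides one. */
void tags_model_init(struct tags_model *tags, const struct tag_set_model *set)
{
    tags->own_bitmap_tags.word = 0;
    if (set->global_tags) {
        tags->bitmap_tags = set->global_tags->bitmap_tags;
        tags->global_tags = true;
        return;
    }
    tags->bitmap_tags = &tags->own_bitmap_tags;
    tags->global_tags = false;
}

/* Take the lowest free tag below depth; return -1 if none is free. */
int tags_model_get(struct tags_model *tags, unsigned int depth)
{
    for (unsigned int i = 0; i < depth; i++) {
        if (!(tags->bitmap_tags->word & (1UL << i))) {
            tags->bitmap_tags->word |= 1UL << i;
            return (int)i;
        }
    }
    return -1;
}
```

With a shared bitmap, two hw queues allocating one tag each receive distinct
tags (0 and 1), so the tag space is HBA wide; with private bitmaps, each
queue hands out tag 0 independently.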

Re: [PATCH 3/5] block: null_blk: introduce module parameter of 'g_global_tags'

2018-02-04 Thread Hannes Reinecke
On 02/03/2018 05:21 AM, Ming Lei wrote:
> This patch introduces the parameter 'g_global_tags' so that we can
> test this feature easily with null_blk.
> 
> No obvious performance drop is seen with global_tags when the whole hw
> depth is kept the same:
> 
> 1) no 'global_tags', each hw queue depth is 1, and 4 hw queues
> modprobe null_blk queue_mode=2 nr_devices=4 shared_tags=1 global_tags=0 submit_queues=4 hw_queue_depth=1
> 
> 2) 'global_tags', global hw queue depth is 4, and 4 hw queues
> modprobe null_blk queue_mode=2 nr_devices=4 shared_tags=1 global_tags=1 submit_queues=4 hw_queue_depth=4
> 
> 3) fio test done in above two settings:
>fio --bs=4k --size=512G  --rw=randread --norandommap --direct=1 
> --ioengine=libaio --iodepth=4 --runtime=$RUNTIME --group_reporting=1  
> --name=nullb0 --filename=/dev/nullb0 --name=nullb1 --filename=/dev/nullb1 
> --name=nullb2 --filename=/dev/nullb2 --name=nullb3 --filename=/dev/nullb3
> 
> 1M IOPS can be reached in both of the above tests, which were run in one VM.
> 
> Cc: Hannes Reinecke 
> Cc: Arun Easi 
> Cc: Omar Sandoval ,
> Cc: "Martin K. Petersen" ,
> Cc: James Bottomley ,
> Cc: Christoph Hellwig ,
> Cc: Don Brace 
> Cc: Kashyap Desai 
> Cc: Peter Rivera 
> Cc: Laurence Oberman 
> Cc: Mike Snitzer 
> Signed-off-by: Ming Lei 
> ---
>  drivers/block/null_blk.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
> index 287a09611c0f..ad0834efad42 100644
> --- a/drivers/block/null_blk.c
> +++ b/drivers/block/null_blk.c
> @@ -163,6 +163,10 @@ static int g_submit_queues = 1;
>  module_param_named(submit_queues, g_submit_queues, int, S_IRUGO);
>  MODULE_PARM_DESC(submit_queues, "Number of submission queues");
>  
> +static int g_global_tags = 0;
> +module_param_named(global_tags, g_global_tags, int, S_IRUGO);
> +MODULE_PARM_DESC(global_tags, "All submission queues share one tags");
> +
>  static int g_home_node = NUMA_NO_NODE;
>  module_param_named(home_node, g_home_node, int, S_IRUGO);
>  MODULE_PARM_DESC(home_node, "Home node for the device");
> @@ -1622,6 +1626,8 @@ static int null_init_tag_set(struct nullb *nullb, struct blk_mq_tag_set *set)
>   set->flags = BLK_MQ_F_SHOULD_MERGE;
>   if (g_no_sched)
>   set->flags |= BLK_MQ_F_NO_SCHED;
> + if (g_global_tags)
> + set->flags |= BLK_MQ_F_GLOBAL_TAGS;
>   set->driver_data = NULL;
>  
>   if ((nullb && nullb->dev->blocking) || g_blocking)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


[GIT PULL] Block fixes for 4.16-rc

2018-02-04 Thread Jens Axboe
Hi Linus,

A followup pull request for this merge window, though most of this is
fixes and not new code/features. This pull request contains:

- skd fix from Arnd, fixing a build error dependent on slab allocator
  type.

- blk-mq scheduler discard merging fixes, one from me and one from
  Keith. This fixes a segment miscalculation for blk-mq-sched, where we
  mistakenly think two segments are physically contiguous even though
  the request isn't carrying real data. Also fixes a bio-to-rq merge
  case.

- Don't re-set a bit on the buffer_head flags, if it's already set. This
  can cause scalability concerns on bigger machines and workloads. From
  Kemi Wang.

- Add BLK_STS_DEV_RESOURCE return value to blk-mq, allowing us to
  distinguish between a local (device related) resource starvation and a
  global one. The latter might happen without IO being in flight, so it
  has to be handled a bit differently. From Ming.

Please pull!


  git://git.kernel.dk/linux-block.git tags/for-linus-20180204



Arnd Bergmann (1):
  block: skd: fix incorrect linux/slab_def.h inclusion

Jens Axboe (1):
  blk-mq: fix discard merge with scheduler attached

Keith Busch (1):
  blk-mq-sched: Enable merging discard bio into request

Kemi Wang (1):
  buffer: Avoid setting buffer bits that are already set

Ming Lei (1):
  blk-mq: introduce BLK_STS_DEV_RESOURCE

 block/blk-core.c |  3 +++
 block/blk-merge.c| 29 ++---
 block/blk-mq-sched.c |  2 ++
 block/blk-mq.c   | 20 
 drivers/block/null_blk.c |  2 +-
 drivers/block/skd_main.c |  7 ---
 drivers/block/virtio_blk.c   |  2 +-
 drivers/block/xen-blkfront.c |  2 +-
 drivers/md/dm-rq.c   |  5 ++---
 drivers/nvme/host/fc.c   | 12 ++--
 drivers/scsi/scsi_lib.c  |  6 +++---
 include/linux/blk_types.h| 18 ++
 include/linux/buffer_head.h  |  5 -
 13 files changed, 83 insertions(+), 30 deletions(-)

-- 
Jens Axboe
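
The distinction drawn for BLK_STS_DEV_RESOURCE above can be illustrated with
a toy decision function. This is a sketch, not the blk-mq implementation: the
enum names are hypothetical, and the "rerun after a delay" choice for the
global case is an assumption about one plausible way to handle starvation
when no IO may be in flight to restart the queue.

```c
/* Hypothetical model of the status a driver returns from its queue_rq. */
enum sts_model {
    STS_OK,           /* request accepted */
    STS_RESOURCE,     /* global resource shortage (e.g. memory) */
    STS_DEV_RESOURCE  /* device-local shortage; IO is in flight on it */
};

/* Hypothetical model of what the core then does with the queue. */
enum rerun_model {
    RERUN_NONE,          /* nothing to retry */
    RERUN_ON_COMPLETION, /* a completing in-flight IO restarts the queue */
    RERUN_AFTER_DELAY    /* no completion is guaranteed, so retry on a timer */
};

/* Map a dispatch status to the retry strategy it allows. */
enum rerun_model rerun_policy(enum sts_model sts)
{
    switch (sts) {
    case STS_DEV_RESOURCE:
        /* The device has IO in flight, so its completion can rerun us. */
        return RERUN_ON_COMPLETION;
    case STS_RESOURCE:
        /* May happen with nothing in flight; waiting would stall forever. */
        return RERUN_AFTER_DELAY;
    default:
        return RERUN_NONE;
    }
}
```

The point of having two status values is that only the device-local case can
safely wait for a completion; the global case needs some other trigger.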



Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification

2018-02-04 Thread Javier Gonzalez

> On 4 Feb 2018, at 13.55, Matias Bjørling  wrote:
> 
> On 02/04/2018 11:37 AM, Javier Gonzalez wrote:
>>> On 31 Jan 2018, at 19.24, Matias Bjørling  wrote:
>>> 
>>> On 01/31/2018 10:13 AM, Javier Gonzalez wrote:
>>>>> On 31 Jan 2018, at 16.51, Matias Bjørling  wrote:
>>>>> 
>>>> I have a patches abstracting this, which I think it makes it cleaner. I 
>>>> can push it next week for review. I’m traveling this week. (If you want to 
>>>> get a glimpse I can point you to the code).
>>> 
>>> Yes, please do. Thanks
>> This is the release candidate for 2.0 support based on 4.17. I'll rebase
>> on top of your 2.0 support. We'll see if all changes make it to 4.17
>> then.
>> https://github.com/OpenChannelSSD/linux/tree/for-4.17/spec20
>> Javier
> 
> Great. I look forward to the patches being cleaned up and posted. I do see 
> some nitpicks here and there, which we probably can take a couple of stabs at.

Sure. This is still in development; just wanted to point to the abstractions 
I’m thinking of so that we don’t do the same work twice. 

I’ll wait for posting until you do the 2.0 identify, since the old version is 
implemented on the first patch of this series. 

> One thing that generally stands out to me is the "if 1.2 support", else, ... 
> statements. These could be structured better by having dedicated setup 
> functions for 1.2 and 2.0.

We have this construction both in pblk and in core for address translation. 
Note that we need to have them separated to support multi instance and keep 
channels decoupled from each instance. 

I assume two if...then checks are cheaper than two dereferences of function 
pointers. This is how it is done on legacy paths in other places (e.g., non-mq 
SCSI), but I can look into what function pointers would look like and measure 
the performance impact. 

Javier
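
The two structures being weighed here (a version check on every call versus
a dedicated setup function that selects function pointers once) can be put
side by side. This is a minimal sketch with made-up names: enum spec_version,
struct addr_ops, to_dev_12/to_dev_20 and their placeholder arithmetic are all
hypothetical and say nothing about how pblk translates addresses.

```c
/* Hypothetical spec versions of the device. */
enum spec_version { SPEC_12, SPEC_20 };

/* Placeholder translations; the arithmetic is invented for illustration. */
unsigned long to_dev_12(unsigned long lba) { return lba << 1; }
unsigned long to_dev_20(unsigned long lba) { return lba + 100; }

/* Variant A: branch on the version at every call site. */
unsigned long translate_branch(enum spec_version v, unsigned long lba)
{
    if (v == SPEC_12)
        return to_dev_12(lba);
    return to_dev_20(lba);
}

/* Variant B: a per-instance ops table filled in once at setup time. */
struct addr_ops {
    unsigned long (*to_dev)(unsigned long lba);
};

void setup_ops(struct addr_ops *ops, enum spec_version v)
{
    ops->to_dev = (v == SPEC_12) ? to_dev_12 : to_dev_20;
}
```

Both variants return the same result for a given version; the difference is
a predictable branch per call in A versus an indirect call per call in B,
which is the cost that would need to be measured.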

Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification

2018-02-04 Thread Matias Bjørling

On 02/04/2018 11:37 AM, Javier Gonzalez wrote:
>> On 31 Jan 2018, at 19.24, Matias Bjørling  wrote:
>> 
>> On 01/31/2018 10:13 AM, Javier Gonzalez wrote:
>>>> On 31 Jan 2018, at 16.51, Matias Bjørling  wrote:
>>>> 
>>> I have a patches abstracting this, which I think it makes it cleaner. I can 
>>> push it next week for review. I’m traveling this week. (If you want to get a 
>>> glimpse I can point you to the code).
>> 
>> Yes, please do. Thanks
> 
> This is the release candidate for 2.0 support based on 4.17. I'll rebase
> on top of your 2.0 support. We'll see if all changes make it to 4.17
> then.
> 
> https://github.com/OpenChannelSSD/linux/tree/for-4.17/spec20
> 
> Javier



Great. I look forward to the patches being cleaned up and posted. I do 
see some nitpicks here and there, which we probably can take a couple of 
stabs at.


One thing that generally stands out to me is the "if 1.2 support", else, 
... statements. These could be structured better by having dedicated 
setup functions for 1.2 and 2.0.


Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification

2018-02-04 Thread Javier Gonzalez

> On 31 Jan 2018, at 19.24, Matias Bjørling  wrote:
> 
> On 01/31/2018 10:13 AM, Javier Gonzalez wrote:
>>> On 31 Jan 2018, at 16.51, Matias Bjørling  wrote:
>>> 
>> I have a patches abstracting this, which I think it makes it cleaner. I can 
>> push it next week for review. I’m traveling this week. (If you want to get a 
>> glimpse I can point you to the code).
> 
> Yes, please do. Thanks

This is the release candidate for 2.0 support based on 4.17. I'll rebase
on top of your 2.0 support. We'll see if all changes make it to 4.17
then.

https://github.com/OpenChannelSSD/linux/tree/for-4.17/spec20

Javier




Unsafe shutdowns on TOSHIBA THNSF5256GPUK NVME

2018-02-04 Thread Jordan Glover
Hello,

I have a problem with "unsafe shutdowns" on a Lenovo laptop with a TOSHIBA
THNSF5256GPUK NVME disk. It happens every time I power off my machine, and I
can hear a characteristic "click" sound like with rotational disks (maybe
that's something else, but I'm going by the S.M.A.R.T. data, not the sound).

I looked around and saw that people report similar problems with different NVME
disks[1][2], even across different OSes[3][4]. I also found some old patches
which tried to deal with this[5].

I know that my disk was already put on $hitlist due to power saving issues[6].

I tested this with linux kernels 4.14.x - 4.15. My BIOS and disk firmware are
updated.

Is there anything I could do to resolve this issue? Thx for any help.

Jordan

[1] https://bbs.archlinux.org/viewtopic.php?id=230723
[2] https://bugs.alpinelinux.org/issues/5082
[3] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211852
[4] https://www.win-raid.com/t2041f38-SAMSUNG-PRO-NVMe-SSD-unsafe-shutdowns-NVMe-driver-W-x-unexpected-shutdown.html
[5] http://lists.infradead.org/pipermail/linux-nvme/2014-March/000744.html
[6] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/nvme/host/core.c#n1950