Re: [PATCH v13 6/6] PCI/DPC: Do not do recovery for hotplug enabled system

2018-04-12 Thread Keith Busch
On Thu, Apr 12, 2018 at 08:39:54AM -0600, Keith Busch wrote: > On Thu, Apr 12, 2018 at 10:34:37AM -0400, Sinan Kaya wrote: > > On 4/12/2018 10:06 AM, Bjorn Helgaas wrote: > > > > > > I think the scenario you are describing is two systems that are > > > ide

Re: [PATCH v13 6/6] PCI/DPC: Do not do recovery for hotplug enabled system

2018-04-12 Thread Keith Busch
On Thu, Apr 12, 2018 at 10:34:37AM -0400, Sinan Kaya wrote: > On 4/12/2018 10:06 AM, Bjorn Helgaas wrote: > > > > I think the scenario you are describing is two systems that are > > identical except that in the first, the endpoint is below a hotplug > > bridge, while in the second, it's below a no

Re: [PATCH v13 4/6] PCI/DPC: Unify and plumb error handling into DPC

2018-04-09 Thread Keith Busch
On Mon, Apr 09, 2018 at 10:41:52AM -0400, Oza Pawandeep wrote: > +static int find_dpc_dev_iter(struct device *device, void *data) > +{ > + struct pcie_port_service_driver *service_driver; > + struct device **dev; > + > + dev = (struct device **) data; > + > + if (device->bus == &pci

Re: [PATCH v13 5/6] PCI: Unify wait for link active into generic PCI

2018-04-09 Thread Keith Busch
On Mon, Apr 09, 2018 at 10:41:53AM -0400, Oza Pawandeep wrote: > +/** > + * pcie_wait_for_link - Wait for link till it's active/inactive > + * @pdev: Bridge device > + * @active: waiting for active or inactive ? > + * > + * Use this to wait till link becomes active or inactive. > + */ > +bool pcie_
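
The kernel-doc comment quoted above describes a helper that waits until the link reaches the requested state. A minimal user-space sketch of that polling pattern follows, assuming a bounded number of polls; the names wait_for_link, link_active_fn and fake_link are hypothetical and are not taken from the patch, whose body is cut off in this listing.

```c
#include <stdbool.h>

/* Hypothetical callback standing in for reading the link-active status. */
typedef bool (*link_active_fn)(void *ctx);

/*
 * Poll until the link state equals 'active' or max_polls is exhausted.
 * Returns true if the requested state was observed.
 */
static bool wait_for_link(link_active_fn is_active, void *ctx,
			  bool active, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (is_active(ctx) == active)
			return true;
		/* a real implementation would sleep between polls */
	}
	/* one last look after the final poll interval */
	return is_active(ctx) == active;
}

/* Test double: reports inactive until the counter in ctx reaches zero. */
static bool fake_link(void *ctx)
{
	int *polls_left = ctx;

	if (*polls_left > 0) {
		(*polls_left)--;
		return false;
	}
	return true;
}
```

Passing the desired state as a parameter lets one routine serve both the "wait for link up" and "wait for link down" callers, which appears to be the point of unifying the helper.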

Re: [PATCH v13 3/6] PCI/PORTDRV: Implement generic find service

2018-04-09 Thread Keith Busch
On Mon, Apr 09, 2018 at 10:41:51AM -0400, Oza Pawandeep wrote: > This patch implements generic pcie_port_find_service() routine. > > Signed-off-by: Oza Pawandeep Looks good. Reviewed-by: Keith Busch

Re: [PATCH v13 2/6] PCI/AER: Factor out error reporting from AER

2018-04-09 Thread Keith Busch
off-by: Oza Pawandeep Looks fine. Reviewed-by: Keith Busch

Re: [PATCH v13 1/6] PCI/AER: Rename error recovery to generic PCI naming

2018-04-09 Thread Keith Busch
On Mon, Apr 09, 2018 at 10:41:49AM -0400, Oza Pawandeep wrote: > This patch renames error recovery to generic name with pcie prefix > > Signed-off-by: Oza Pawandeep Looks fine. Reviewed-by: Keith Busch

Re: [PATCH] nvme-multipath: implement active-active round-robin path selector

2018-04-04 Thread Keith Busch
On Fri, Mar 30, 2018 at 09:04:46AM +, Eric H. Chang wrote: > We internally call PCIe-retimer as HBA. It's not a real Host Bus Adapter that > translates the interface from PCIe to SATA or SAS. Sorry for the confusion. Please don't call a PCIe retimer an "HBA"! :) While your experiment is setu

Re: [PATCH v3] nvmet: fix nvmet_execute_write_zeroes function

2018-04-02 Thread Keith Busch
Thanks, I've applied the patch with a simpler changelog explaining the bug.

Re: [PATCH v2] nvmet: fix nvmet_execute_write_zeroes function

2018-04-02 Thread Keith Busch
On Mon, Apr 02, 2018 at 11:49:41AM -0300, Rodrigo R. Galvao wrote: > When trying to issue write_zeroes command against TARGET with a 4K block > size, it ends up hitting the following condition at __blkdev_issue_zeroout: > > if ((sector | nr_sects) & bs_mask) > return -EINVAL;

Re: [PATCH v1] PCI/DPC: Rename from pcie-dpc.c to dpc.c.

2018-04-02 Thread Keith Busch
On Sat, Mar 31, 2018 at 05:34:26PM -0500, Bjorn Helgaas wrote: > From: Bjorn Helgaas > > Rename from pcie-dpc.c to dpc.c. The path "drivers/pci/pcie/pcie-dpc.c" > has more occurrences of "pci" than necessary. > > Signed-off-by: Bjorn Helgaas Looks good. Acked-by: Keith Busch

Re: [PATCH] nvmet: fix nvmet_execute_write_zeroes function

2018-04-02 Thread Keith Busch
On Mon, Apr 02, 2018 at 10:47:10AM -0300, Rodrigo Rosatti Galvao wrote: > One thing that I just forgot to explain previously, but I think its > relevant: > > 1. The command is failing with 4k logical block size, but works with 512B > > 2. With the patch, the command is working for both 512B and 4

Re: [PATCH] nvmet: fix nvmet_execute_write_zeroes function

2018-04-02 Thread Keith Busch
On Fri, Mar 30, 2018 at 06:18:50PM -0300, Rodrigo R. Galvao wrote: > sector = le64_to_cpu(write_zeroes->slba) << > (req->ns->blksize_shift - 9); > nr_sector = (((sector_t)le16_to_cpu(write_zeroes->length)) << > - (req->ns->blksize_shift - 9)) + 1; > +
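
The arithmetic behind this thread can be sketched in user space. The removed line shifts the zero-based length and then adds one 512-byte sector; the added line is truncated in this listing, so the "add one block, then shift" form below is an assumption, and the helper names are hypothetical. The mask test mirrors the condition quoted from __blkdev_issue_zeroout elsewhere in the thread.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* block-layer sectors are 512 bytes */

/*
 * Mirrors the quoted alignment test: nonzero when the start sector or
 * the sector count is not a multiple of the logical block size.
 */
static int misaligned(uint64_t sector, uint64_t nr_sects,
		      unsigned int blksize_shift)
{
	uint64_t bs_mask = (1ULL << (blksize_shift - SECTOR_SHIFT)) - 1;

	return ((sector | nr_sects) & bs_mask) != 0;
}

/* Count as in the removed line: shift first, then add one sector. */
static uint64_t nr_sects_shift_then_add(uint16_t length,
					unsigned int blksize_shift)
{
	return ((uint64_t)length << (blksize_shift - SECTOR_SHIFT)) + 1;
}

/* Assumed corrected form: add one block, then shift to sectors. */
static uint64_t nr_sects_add_then_shift(uint16_t length,
					unsigned int blksize_shift)
{
	return ((uint64_t)length + 1) << (blksize_shift - SECTOR_SHIFT);
}
```

With a 512-byte block size the mask is zero, so both forms pass the check; with a 4K block size the mask is 7 and only the second form yields a multiple of 8 sectors. That matches the report that the command fails at 4K but works at 512B.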

Re: [PATCH] nvmet: fix nvmet_execute_write_zeroes function

2018-03-30 Thread Keith Busch
On Fri, Mar 30, 2018 at 06:18:50PM -0300, Rodrigo R. Galvao wrote: > When trying to issue write_zeroes command against TARGET the nr_sector is > being incremented by 1, which ends up hitting the following condition at > __blkdev_issue_zeroout: > > if ((sector | nr_sects) & bs_mask) >

Re: [PATCH] nvme-multipath: implement active-active round-robin path selector

2018-03-28 Thread Keith Busch
On Wed, Mar 28, 2018 at 10:06:46AM +0200, Christoph Hellwig wrote: > For PCIe devices the right policy is not a round robin but to use > the pcie device closer to the node. I did a prototype for that > long ago and the concept can work. Can you look into that and > also make that policy used auto

Re: [PATCH] nvme: don't send keep-alives to the discovery controller

2018-03-28 Thread Keith Busch
Thanks, applied.

Re: [PATCH] nvme: unexport nvme_start_keep_alive

2018-03-28 Thread Keith Busch
Thanks, applied.

Re: [PATCH] nvme: target: fix buffer overflow

2018-03-28 Thread Keith Busch
Thanks, applied.

Re: [PATCH] nvme: use upper_32_bits() instead of bit shift

2018-03-28 Thread Keith Busch
On Wed, Mar 28, 2018 at 03:57:47PM +0200, Arnd Bergmann wrote: > @@ -2233,8 +2233,8 @@ int nvme_get_log_ext(struct nvme_ctrl *ctrl, struct > nvme_ns *ns, > c.get_log_page.lid = log_page; > c.get_log_page.numdl = cpu_to_le16(dwlen & ((1 << 16) - 1)); > c.get_log_page.numdu = cpu_t

Re: [PATCH] nvme: enforce 64bit offset for nvme_get_log_ext fn

2018-03-27 Thread Keith Busch
On Tue, Mar 27, 2018 at 08:00:33PM +0200, Matias Bjørling wrote: > Compiling on 32 bits system produces a warning for the shift width > when shifting 32 bit integer with 64bit integer. > > Make sure that offset always is 64bit, and use macros for retrieving > lower and upper bits of the offset. T
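
A short user-space sketch of the split the patch description refers to. The two macros are modeled on the kernel helpers of the same name; struct log_offset and split_offset are hypothetical names used only for illustration.

```c
#include <stdint.h>

/*
 * Modeled on the kernel helpers. The double 16-bit shift keeps the
 * expression well defined even when the argument is only 32 bits wide,
 * which is what avoids the shift-width warning on 32-bit builds.
 */
#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))
#define lower_32_bits(n) ((uint32_t)((n) & 0xffffffff))

/* Hypothetical container for the two halves of a log page offset. */
struct log_offset {
	uint32_t lo;
	uint32_t hi;
};

/* Split a 64-bit byte offset into its lower and upper dwords. */
static struct log_offset split_offset(uint64_t offset)
{
	struct log_offset o = {
		lower_32_bits(offset),
		upper_32_bits(offset),
	};

	return o;
}
```

Declaring the offset as a 64-bit type at the function boundary means callers on 32-bit systems cannot silently pass a truncated value.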

Re: [PATCH] nvme: make nvme_get_log_ext non-static

2018-03-21 Thread Keith Busch
On Wed, Mar 21, 2018 at 08:27:07PM +0100, Matias Bjørling wrote: > Enable the lightnvm integration to use the nvme_get_log_ext() > function. > > Signed-off-by: Matias Bjørling Thanks, applied to nvme-4.17.

Re: [RFC PATCH] nvme: avoid race-conditions when enabling devices

2018-03-21 Thread Keith Busch
On Wed, Mar 21, 2018 at 11:48:09PM +0800, Ming Lei wrote: > On Wed, Mar 21, 2018 at 01:10:31PM +0100, Marta Rybczynska wrote: > > > On Wed, Mar 21, 2018 at 12:00:49PM +0100, Marta Rybczynska wrote: > > >> NVMe driver uses threads for the work at device reset, including enabling > > >> the PCIe devi

Re: [PATCH 08/12] lightnvm: implement get log report chunk helpers

2018-03-21 Thread Keith Busch
On Wed, Mar 21, 2018 at 03:06:05AM -0700, Matias Bjørling wrote: > > outside of nvme core so that we can use it form lightnvm. > > > > Signed-off-by: Javier González > > --- > > drivers/lightnvm/core.c | 11 +++ > > drivers/nvme/host/core.c | 6 ++-- > > drivers/nvme/host/lightn

Re: [PATCH v2] PCI/DPC: Fix PCI legacy interrupt acknowledgement

2018-03-14 Thread Keith Busch
On Wed, Mar 14, 2018 at 02:52:30PM -0600, Keith Busch wrote: > > Reviewed-by: Keith Busch

Re: [PATCH v2] PCI/DPC: Fix PCI legacy interrupt acknowledgement

2018-03-14 Thread Keith Busch
y INT interrupt. > > With current code we do not acknowledge the interrupt back in dpc_irq() > and we get dpc interrupt storm. > > This patch acknowledges the interrupt in interrupt handler. > > Signed-off-by: Oza Pawandeep Thanks, this looks good to me. Reviewed-by: Keith Busch

Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

2018-03-14 Thread Keith Busch
On Mon, Mar 12, 2018 at 11:47:12PM -0400, Sinan Kaya wrote: > > The spec is recommending code to use "Hotplug Surprise" to differentiate > these two cases we are looking for. > > The use case Keith is looking for is for hotplug support. > The case I and Oza are more interested is for error hand

Re: [PATCH] nvme-pci: disable APST for Samsung NVMe SSD 960 EVO + ASUS PRIME Z370-A

2018-03-14 Thread Keith Busch
Thanks, applied for 4.17.

Re: [PATCH V3] nvme-pci: assign separate irq vectors for adminq and ioq1

2018-03-13 Thread Keith Busch
On Tue, Mar 13, 2018 at 06:45:00PM +0800, Ming Lei wrote: > On Tue, Mar 13, 2018 at 05:58:08PM +0800, Jianchao Wang wrote: > > Currently, adminq and ioq1 share the same irq vector which is set > > affinity to cpu0. If a system allows cpu0 to be offlined, the adminq > > will not be able work any mor

Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

2018-03-12 Thread Keith Busch
On Mon, Mar 12, 2018 at 02:47:30PM -0500, Bjorn Helgaas wrote: > [+cc Alex] > > On Mon, Mar 12, 2018 at 08:25:51AM -0600, Keith Busch wrote: > > On Sun, Mar 11, 2018 at 11:03:58PM -0400, Sinan Kaya wrote: > > > On 3/11/2018 6:03 PM, Bjorn Helgaas wrote: > > > >

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-12 Thread Keith Busch
Hi Jianchao, The patch tests fine on all hardware I had. I'd like to queue this up for the next 4.16-rc. Could you send a v3 with the cleanup changes Andy suggested and a changelog aligned with Ming's insights? Thanks, Keith

Re: [pci PATCH v5 1/4] pci: Add pci_sriov_configure_simple for PFs that don't manage VF resources

2018-03-12 Thread Keith Busch
On Mon, Mar 12, 2018 at 11:09:34AM -0700, Alexander Duyck wrote: > On Mon, Mar 12, 2018 at 10:40 AM, Keith Busch wrote: > > On Mon, Mar 12, 2018 at 10:21:29AM -0700, Alexander Duyck wrote: > >> diff --git a/include/linux/pci.h b/include/linux/pci.h > >> index 024a1b

Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

2018-03-12 Thread Keith Busch
On Mon, Mar 12, 2018 at 01:41:07PM -0400, Sinan Kaya wrote: > I was just writing a reply to you. You acted first :) > > On 3/12/2018 1:33 PM, Keith Busch wrote: > >>> After releasing a slot from DPC, the link is allowed to retrain. If > >>> there > >>

Re: [pci PATCH v5 1/4] pci: Add pci_sriov_configure_simple for PFs that don't manage VF resources

2018-03-12 Thread Keith Busch
On Mon, Mar 12, 2018 at 10:21:29AM -0700, Alexander Duyck wrote: > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 024a1beda008..9cab9d0d51dc 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -1953,6 +1953,7 @@ static inline void pci_mmcfg_late_init(void) { } > int

Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

2018-03-12 Thread Keith Busch
On Mon, Mar 12, 2018 at 09:04:47PM +0530, p...@codeaurora.org wrote: > On 2018-03-12 20:28, Keith Busch wrote: > > I'm not sure I understand. The link is disabled while DPC is triggered, > > so if anything, you'd want to un-enumerate everything below the > > containe

Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

2018-03-12 Thread Keith Busch
On Mon, Mar 12, 2018 at 08:16:38PM +0530, p...@codeaurora.org wrote: > On 2018-03-12 19:55, Keith Busch wrote: > > On Sun, Mar 11, 2018 at 11:03:58PM -0400, Sinan Kaya wrote: > > > On 3/11/2018 6:03 PM, Bjorn Helgaas wrote: > > > > On Wed, Feb 28, 2018 at 10:34:1

Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

2018-03-12 Thread Keith Busch
On Sun, Mar 11, 2018 at 11:03:58PM -0400, Sinan Kaya wrote: > On 3/11/2018 6:03 PM, Bjorn Helgaas wrote: > > On Wed, Feb 28, 2018 at 10:34:11PM +0530, Oza Pawandeep wrote: > > > That difference has been there since the beginning of DPC, so it has > > nothing to do with *this* series EXCEPT for the

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-09 Thread Keith Busch
On Thu, Mar 08, 2018 at 08:42:20AM +0100, Christoph Hellwig wrote: > > So I suspect we'll need to go with a patch like this, just with a way > better changelog. I have to agree this is required for that use case. I'll run some quick tests and propose an alternate changelog. Longer term, the curr

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Keith Busch
On Mon, Mar 05, 2018 at 01:10:53PM -0700, Jason Gunthorpe wrote: > So when reading the above mlx code, we see the first wmb() being used > to ensure that CPU stores to cachable memory are visible to the DMA > triggered by the doorbell ring. IIUC, we don't need a similar barrier for NVMe to ensure

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Keith Busch
On Mon, Mar 05, 2018 at 12:33:29PM +1100, Oliver wrote: > On Thu, Mar 1, 2018 at 10:40 AM, Logan Gunthorpe wrote: > > @@ -429,10 +429,7 @@ static void __nvme_submit_cmd(struct nvme_queue *nvmeq, > > { > > u16 tail = nvmeq->sq_tail; > > > - if (nvmeq->sq_cmds_io) > > -

Re: [PATCH v2 1/1] nvme: implement log page low/high offset and dwords

2018-03-02 Thread Keith Busch
Thanks, applied for 4.17. A side note for those interested, some bug fixing commits introduced in the nvme-4.17 branch were applied to 4.16-rc. I've rebased these on top of linux-block/for-next so we don't have the duplicate commits for the 4.17 merge window. The nvme branch currently may be viewe

Re: [PATCH v2 10/10] nvmet: Optionally use PCI P2P memory

2018-03-01 Thread Keith Busch
On Thu, Mar 01, 2018 at 11:00:51PM +, Stephen Bates wrote: > > P2P is about offloading the memory and PCI subsystem of the host CPU > and this is achieved no matter which p2p_dev is used. Even within a device, memory attributes for its various regions may not be the same. There's a meaningfu

Re: [PATCH 5/5] nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors

2018-03-01 Thread Keith Busch
On Thu, Mar 01, 2018 at 01:52:20AM +0100, Christoph Hellwig wrote: > Looks fine, > > and we should pick this up for 4.16 independent of the rest, which > I might need a little more review time for. > > Reviewed-by: Christoph Hellwig Thanks, queued up for 4.16.

Re: [PATCH V3] nvme-pci: Fixes EEH failure on ppc

2018-03-01 Thread Keith Busch
On Thu, Mar 01, 2018 at 11:12:08AM -0600, Wen Xiong wrote: >Hi Keith, > >It is perfect! I go with it. Thanks, queued up for 4.16.

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-01 Thread Keith Busch
On Thu, Mar 01, 2018 at 11:03:30PM +0800, Ming Lei wrote: > If all CPUs for the 1st IRQ vector of admin queue are offline, then I > guess NVMe can't work any more. Yikes, with respect to admin commands, it appears you're right if your system allows offlining CPU0. > So looks it is a good idea to

Re: [PATCH V2] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-03-01 Thread Keith Busch
On Thu, Mar 01, 2018 at 06:05:53PM +0800, jianchao.wang wrote: > When the adminq is free, ioq0 irq completion path has to invoke nvme_irq > twice, one for itself, > one for adminq completion irq action. Let's be a little more careful on the terminology when referring to spec defined features: the

Re: [PATCH V3] nvme-pci: Fixes EEH failure on ppc

2018-02-28 Thread Keith Busch
On Wed, Feb 28, 2018 at 04:31:37PM -0600, wenxiong wrote: > On 2018-02-15 14:05, wenxi...@linux.vnet.ibm.com wrote: > > From: Wen Xiong > > > > With b2a0eb1a0ac72869c910a79d935a0b049ec78ad9(nvme-pci: Remove watchdog > > timer), EEH recovery stops working on ppc. > > > > After removing whatdog ti

Re: [PATCH v2] nvme-multipath: fix sysfs dangerously created links

2018-02-28 Thread Keith Busch
Thanks, applied.

Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-02-28 Thread Keith Busch
On Wed, Feb 28, 2018 at 11:46:20PM +0800, jianchao.wang wrote: > > the irqbalance may migrate the adminq irq away from cpu0. No, irqbalance can't touch managed IRQs. See irq_can_set_affinity_usr().

Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-02-28 Thread Keith Busch
On Wed, Feb 28, 2018 at 10:53:31AM +0800, jianchao.wang wrote: > On 02/27/2018 11:13 PM, Keith Busch wrote: > > On Tue, Feb 27, 2018 at 04:46:17PM +0800, Jianchao Wang wrote: > >> Currently, adminq and ioq0 share the same irq vector. This is > >> unfair for both amdinq a

Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0

2018-02-27 Thread Keith Busch
On Tue, Feb 27, 2018 at 04:46:17PM +0800, Jianchao Wang wrote: > Currently, adminq and ioq0 share the same irq vector. This is > unfair for both amdinq and ioq0. > - For adminq, its completion irq has to be bound on cpu0. > - For ioq0, when the irq fires for io completion, the adminq irq >act

Re: [PATCH] nvme-multipath: fix sysfs dangerously created links

2018-02-26 Thread Keith Busch
On Mon, Feb 26, 2018 at 05:51:23PM +0900, baeg...@gmail.com wrote: > From: Baegjae Sung > > If multipathing is enabled, each NVMe subsystem creates a head > namespace (e.g., nvme0n1) and multiple private namespaces > (e.g., nvme0c0n1 and nvme0c1n1) in sysfs. When creating links for > private name

Re: [PATCH V2] nvme-pci: set cq_vector to -1 if io queue setup fails

2018-02-26 Thread Keith Busch
On Thu, Feb 15, 2018 at 07:13:41PM +0800, Jianchao Wang wrote: > nvme cq irq is freed based on queue_count. When the sq/cq creation > fails, irq will not be setup. free_irq will warn 'Try to free > already-free irq'. > > To fix it, set the nvmeq->cq_vector to -1, then nvme_suspend_queue > will ign

Re: [PATCH] blk: optimization for classic polling

2018-02-20 Thread Keith Busch
On Tue, Feb 20, 2018 at 02:21:37PM +0100, Peter Zijlstra wrote: > Also, set_current_state(TASK_RUNNING) is dodgy (similarly in > __blk_mq_poll), why do you need that memory barrier? You're right. The subsequent revision that was committed removed the barrier. The commit is here: https://git.kerne

Re: [BUG? NVME Linux-4.15] Dracut loops indefinitely with 4.15

2018-02-15 Thread Keith Busch
On Thu, Feb 15, 2018 at 02:49:56PM +0100, Julien Durillon wrote: > I opened an issue here: > https://github.com/dracutdevs/dracut/issues/373 for dracut. You can > read there how dracuts enters an infinite loop. > > TL;DR: in linux-4.14, trying to find the last "slave" of /dev/dm-0 > ends with a ma

Re: [PATCH RESENT] nvme-pci: suspend queues based on online_queues

2018-02-13 Thread Keith Busch
On Mon, Feb 12, 2018 at 09:05:13PM +0800, Jianchao Wang wrote: > @@ -1315,9 +1315,6 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq) > nvmeq->cq_vector = -1; > spin_unlock_irq(&nvmeq->q_lock); > > - if (!nvmeq->qid && nvmeq->dev->ctrl.admin_q) > - blk_mq_quie

Re: [PATCH 2/3] nvme: fix the deadlock in nvme_update_formats

2018-02-12 Thread Keith Busch
Hi Sagi, This one is fixing a deadlock in namespace detach. It is still not a widely supported operation, but becoming more common. While the other two patches in this series look good for 4.17, I would really recommend this one for 4.16-rc, and add a Cc to linux-stable for 4.15 too. Sound okay?

Re: [PATCH 3/3] nvme: change namespaces_mutext to namespaces_rwsem

2018-02-12 Thread Keith Busch
On Mon, Feb 12, 2018 at 08:47:47PM +0200, Sagi Grimberg wrote: > This looks fine to me, but I really want Keith and/or Christoph to have > a look as well. This looks fine to me as well. Reviewed-by: Keith Busch

Re: [PATCH 1/3] nvme: fix the dangerous reference of namespaces list

2018-02-12 Thread Keith Busch
Looks good. Reviewed-by: Keith Busch

Re: [PATCH 2/3] nvme: fix the deadlock in nvme_update_formats

2018-02-12 Thread Keith Busch
This looks good. Reviewed-by: Keith Busch

Re: [PATCH] nvme-pci: drain the entered requests after ctrl is shutdown

2018-02-12 Thread Keith Busch
On Mon, Feb 12, 2018 at 08:43:58PM +0200, Sagi Grimberg wrote: > > > Currently, we will unquiesce the queues after the controller is > > shutdown to avoid residual requests to be stuck. In fact, we can > > do it more cleanly, just wait freeze and drain the queue in > > nvme_dev_disable and finally

Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-09 Thread Keith Busch
On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote: > > if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case, > nvme_reset_work will hang forever, because no one could complete the entered > requests. Except it's no longer in the "RESETTING" case since you adde

Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread Keith Busch
On Thu, Feb 08, 2018 at 05:56:49PM +0200, Sagi Grimberg wrote: > Given the discussion on this set, you plan to respin again > for 4.16? With the exception of maybe patch 1, this needs more consideration than I'd feel okay with for the 4.16 release.

Re: [PATCH] blk: optimization for classic polling

2018-02-08 Thread Keith Busch
On Sun, May 30, 2083 at 09:51:06AM +0530, Nitesh Shetty wrote: > This removes the dependency on interrupts to wake up task. Set task > state as TASK_RUNNING, if need_resched() returns true, > while polling for IO completion. > Earlier, polling task used to sleep, relying on interrupt to wake it up.

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-08 Thread Keith Busch
On Thu, Feb 08, 2018 at 10:17:00PM +0800, jianchao.wang wrote: > There is a dangerous scenario which caused by nvme_wait_freeze in > nvme_reset_work. > please consider it. > > nvme_reset_work > -> nvme_start_queues > -> nvme_wait_freeze > > if the controller no response, we have to rely on t

Re: [PATCH V2]nvme-pci: Fixes EEH failure on ppc

2018-02-07 Thread Keith Busch
On Wed, Feb 07, 2018 at 02:09:38PM -0600, wenxi...@linux.vnet.ibm.com wrote: > @@ -1189,6 +1183,12 @@ static enum blk_eh_timer_return nvme_timeout(struct > request *req, bool reserved) > struct nvme_command cmd; > u32 csts = readl(dev->bar + NVME_REG_CSTS); > > + /* If PCI error

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-07 Thread Keith Busch
On Wed, Feb 07, 2018 at 10:13:51AM +0800, jianchao.wang wrote: > What's the difference ? Can you please point out. > I have shared my understanding below. > But actually, I don't get the point what's the difference you said. It sounds like you have all the pieces. Just keep this in mind: we don't

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-06 Thread Keith Busch
On Tue, Feb 06, 2018 at 09:46:36AM +0800, jianchao.wang wrote: > Hi Keith > > Thanks for your kindly response. > > On 02/05/2018 11:13 PM, Keith Busch wrote: > > but how many requests are you letting enter to their demise by > > freezing on the wrong side of the res

Re: [PATCH] nvme-pci: Fix incorrect use of CMB size to calculate q_depth

2018-02-06 Thread Keith Busch
On Mon, Feb 05, 2018 at 03:32:23PM -0700, sba...@raithlin.com wrote: > > - if (dev->cmb && (dev->cmbsz & NVME_CMBSZ_SQS)) { > + if (dev->cmb && use_cmb_sqes && (dev->cmbsz & NVME_CMBSZ_SQS)) { Is this a prep patch for something coming later? dev->cmb is already NULL if use_cmb_sqes is fa

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-05 Thread Keith Busch
On Mon, Feb 05, 2018 at 10:26:03AM +0800, jianchao.wang wrote: > > Freezing is not just for shutdown. It's also used so > > blk_mq_update_nr_hw_queues will work if the queue count changes across > > resets. > blk_mq_update_nr_hw_queues will freeze the queue itself. Please refer to. > static void __

Re: [PATCH 1/6] nvme-pci: move clearing host mem behind stopping queues

2018-02-02 Thread Keith Busch
to something like: This patch quiesces new IO prior to disabling device HMB access. A controller using HMB may be relying on it to efficiently complete IO commands. Reviewed-by: Keith Busch > --- > drivers/nvme/host/pci.c | 8 +++- > 1 file changed, 3 insertions(+), 5 deletions

Re: [PATCH 4/6] nvme-pci: break up nvme_timeout and nvme_dev_disable

2018-02-02 Thread Keith Busch
On Fri, Feb 02, 2018 at 03:00:47PM +0800, Jianchao Wang wrote: > Currently, the complicated relationship between nvme_dev_disable > and nvme_timeout has become a devil that will introduce many > circular pattern which may trigger deadlock or IO hang. Let's > enumerate the tangles between them: > -

Re: [PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case

2018-02-02 Thread Keith Busch
On Fri, Feb 02, 2018 at 03:00:45PM +0800, Jianchao Wang wrote: > Currently, request queue will be frozen and quiesced for both reset > and shutdown case. This will trigger ioq requests in RECONNECTING > state which should be avoided to prepare for following patch. > Just freeze request queue for sh

Re: [PATCH] PCI/DPC: Fix INT legacy interrupt in dpc_irq

2018-01-31 Thread Keith Busch
with current code we do not acknowledge the > interrupt and we get dpc interrupt storm. > This patch acknowledges the interrupt in interrupt handler. > > Signed-off-by: Oza Pawandeep Thanks, looks good to me. Reviewed-by: Keith Busch

Re: [PATCH] nvme-pci: use NOWAIT flag for nvme_set_host_mem

2018-01-30 Thread Keith Busch
On Tue, Jan 30, 2018 at 11:41:07AM +0800, jianchao.wang wrote: > Another point that confuses me is that whether nvme_set_host_mem is necessary > in nvme_dev_disable ? > As the comment: > > /* >* If the controller is still alive tell it to stop using the >

Re: [PATCH] nvme-pci: use NOWAIT flag for nvme_set_host_mem

2018-01-29 Thread Keith Busch
On Mon, Jan 29, 2018 at 09:55:41PM +0200, Sagi Grimberg wrote: > > Thanks for the fix. It looks like we still have a problem, though. > > Commands submitted with the "shutdown_lock" held need to be able to make > > forward progress without relying on a completion, but this one could > > block indef

Re: [PATCH] nvme-pci: use NOWAIT flag for nvme_set_host_mem

2018-01-29 Thread Keith Busch
On Mon, Jan 29, 2018 at 11:07:35AM +0800, Jianchao Wang wrote: > nvme_set_host_mem will invoke nvme_alloc_request without NOWAIT > flag, it is unsafe for nvme_dev_disable. The adminq driver tags > may have been used up when the previous outstanding adminq requests > cannot be completed due to some

Re: [PATCH RESENT] nvme-pci: introduce RECONNECTING state to mark initializing procedure

2018-01-25 Thread Keith Busch
. > > Suggested-by: James Smart > Reviewed-by: James Smart > Signed-off-by: Jianchao Wang This looks fine. Thank you for your patience. Reviewed-by: Keith Busch

Re: Report long suspend times of NVMe devices (mostly firmware/device issues)

2018-01-24 Thread Keith Busch
On Wed, Jan 24, 2018 at 11:29:12PM +0100, Paul Menzel wrote: > Am 22.01.2018 um 22:30 schrieb Keith Busch: > > The nvme spec guides toward longer times than that. I don't see the > > point of warning users about things operating within spec. > > I quickly glanced ove

Re: [PATCH] nvme-pci: ensure nvme_timeout complete before initializing procedure

2018-01-22 Thread Keith Busch
On Mon, Jan 22, 2018 at 09:14:23PM +0100, Christoph Hellwig wrote: > > Link: https://lkml.org/lkml/2018/1/19/68 > > Suggested-by: Keith Busch > > Signed-off-by: Keith Busch > > Signed-off-by: Jianchao Wang > > Why does this have a signoff from Keith? Right, I

Re: Report long suspend times of NVMe devices (mostly firmware/device issues)

2018-01-22 Thread Keith Busch
On Mon, Jan 22, 2018 at 10:02:12PM +0100, Paul Menzel wrote: > Dear Linux folks, > > > Benchmarking the ACPI S3 suspend and resume times with `sleepgraph.py > -config config/suspend-callgraph.cfg` [1], shows that the NVMe disk SAMSUNG > MZVKW512HMJP-0 in the TUXEDO Book BU1406 takes between 0

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-19 Thread Keith Busch
On Fri, Jan 19, 2018 at 09:56:48PM +0800, jianchao.wang wrote: > In nvme_dev_disable, the outstanding requests will be requeued finally. > I'm afraid the requests requeued on the q->requeue_list will be blocked until > another requeue > occurs, if we cancel the requeue work before it get scheduled

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-19 Thread Keith Busch
On Fri, Jan 19, 2018 at 05:02:06PM +0800, jianchao.wang wrote: > We should not use blk_sync_queue here, the requeue_work and run_work will be > canceled. > Just flush_work(&q->timeout_work) should be ok. I agree flushing timeout_work is sufficient. All the other work had already better not be run

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-19 Thread Keith Busch
On Fri, Jan 19, 2018 at 04:14:02PM +0800, jianchao.wang wrote: > On 01/19/2018 04:01 PM, Keith Busch wrote: > > The nvme_dev_disable routine makes forward progress without depending on > > timeout handling to complete expired commands. Once controller disabling > > completes,

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-18 Thread Keith Busch
On Thu, Jan 18, 2018 at 06:10:00PM +0800, Jianchao Wang wrote: > Hello > > Please consider the following scenario. > nvme_reset_ctrl > -> set state to RESETTING > -> queue reset_work > (scheduling) > nvme_reset_work > -> nvme_dev_disable > -> quiesce queues > -> nvme_cance

Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing

2018-01-18 Thread Keith Busch
On Fri, Jan 19, 2018 at 01:55:29PM +0800, jianchao.wang wrote: > On 01/19/2018 12:59 PM, Keith Busch wrote: > > On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote: > >> + * - When the ctrl.state is NVME_CTRL_RESETTING, the expired > >> + * request sh

Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing

2018-01-18 Thread Keith Busch
On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote: > + * - When the ctrl.state is NVME_CTRL_RESETTING, the expired > + * request should come from the previous work and we handle > + * it as nvme_cancel_request. > + * - When the ctrl.state is NVME_CTRL_RECONNECTIN

Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event

2018-01-18 Thread Keith Busch
On Thu, Jan 18, 2018 at 11:35:59AM -0500, Sinan Kaya wrote: > On 1/18/2018 12:32 AM, p...@codeaurora.org wrote: > > On 2018-01-18 08:26, Keith Busch wrote: > >> On Wed, Jan 17, 2018 at 08:27:39AM -0800, Sinan Kaya wrote: > >>> On 1/17/2018 5:37 AM, Oza Pawande

Re: [BUG 4.15-rc7] IRQ matrix management errors

2018-01-18 Thread Keith Busch
On Thu, Jan 18, 2018 at 09:10:43AM +0100, Thomas Gleixner wrote: > Can you please provide the output of > > # cat /sys/kernel/debug/irq/irqs/$ONE_I40_IRQ # cat /sys/kernel/debug/irq/irqs/48 handler: handle_edge_irq device: :1a:00.0 status: 0x istate: 0x ddepth: 0 wdep

Re: [PATCH v3 1/2] nvme: add tracepoint for nvme_setup_cmd

2018-01-17 Thread Keith Busch
Looks good. Reviewed-by: Keith Busch

Re: [PATCH v3 2/2] nvme: add tracepoint for nvme_complete_rq

2018-01-17 Thread Keith Busch
Looks good. Reviewed-by: Keith Busch

Re: [PATCH v5 4/4] PCI/DPC: Enumerate the devices after DPC trigger event

2018-01-17 Thread Keith Busch
On Wed, Jan 17, 2018 at 08:27:39AM -0800, Sinan Kaya wrote: > On 1/17/2018 5:37 AM, Oza Pawandeep wrote: > > +static bool dpc_wait_link_active(struct pci_dev *pdev) > > +{ > > I think you can also make this function common instead of making another copy > here. > Of course, this would be another

Re: [BUG 4.15-rc7] IRQ matrix management errors

2018-01-17 Thread Keith Busch
er 200 iterations that used to fail within only a few. I'd say the problem is cured. Thanks! Tested-by: Keith Busch

Re: [BUG 4.15-rc7] IRQ matrix management errors

2018-01-16 Thread Keith Busch
On Wed, Jan 17, 2018 at 08:34:22AM +0100, Thomas Gleixner wrote: > Can you trace the matrix allocations from the very beginning or tell me how > to reproduce. I'd like to figure out why this is happening. Sure, I'll get the irq_matrix events. I reproduce this on a machine with 112 CPUs and 3 NVMe

Re: [BUG 4.15-rc7] IRQ matrix management errors

2018-01-16 Thread Keith Busch
x86_vector_free_irqs(domain, virq, i); > return err; > } > The patch does indeed fix all the warnings and allows device binding to succeed, albeit in a degraded performance mode. Despite that, this is a good fix, and looks applicable to 4.4-stable, so: Tested-by: Keith

Re: [PATCH v2 0/2] add tracepoints for nvme command submission and completion

2018-01-16 Thread Keith Busch
On Tue, Jan 16, 2018 at 03:28:19PM +0100, Johannes Thumshirn wrote: > Add tracepoints for nvme command submission and completion. The tracepoints > are modeled after SCSI's trace_scsi_dispatch_cmd_start() and > trace_scsi_dispatch_cmd_done() tracepoints and fulfil a similar purpose, > namely a fast

Re: [BUG 4.15-rc7] IRQ matrix management errors

2018-01-16 Thread Keith Busch
On Tue, Jan 16, 2018 at 12:20:18PM +0100, Thomas Gleixner wrote: > What we want is s/i + 1/i/ > > That's correct because x86_vector_free_irqs() does: > >for (i = 0; i < nr; i++) > > > So if we fail at the first irq, then the loop will do nothing. Failing on > the se
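
The unwind rule quoted above (free i entries, not i + 1) can be shown with a generic sketch. This is not the x86 vector code; slot_used, alloc_slot, free_slots and alloc_slots are hypothetical stand-ins.

```c
#define NR_SLOTS 8

static int slot_used[NR_SLOTS];

/* Hypothetical allocator: fails at index fail_at, else marks the slot. */
static int alloc_slot(int idx, int fail_at)
{
	if (idx == fail_at)
		return -1;
	slot_used[idx] = 1;
	return 0;
}

/* Releases the first nr slots, like the quoted for (i = 0; i < nr; i++). */
static void free_slots(int nr)
{
	for (int i = 0; i < nr; i++)
		slot_used[i] = 0;
}

/* Returns 0 on success; on failure releases only what was set up. */
static int alloc_slots(int nr, int fail_at)
{
	for (int i = 0; i < nr; i++) {
		if (alloc_slot(i, fail_at)) {
			/* i, not i + 1: slot i was never initialized */
			free_slots(i);
			return -1;
		}
	}
	return 0;
}
```

A failure at index 0 makes the free loop run zero times, and a failure at index 2 releases only slots 0 and 1, so the teardown path never touches an entry that was not set up.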

Re: [BUG 4.15-rc7] IRQ matrix management errors

2018-01-15 Thread Keith Busch
This is all way over my head, but the part that obviously shows something's gone wrong: kworker/u674:3-1421 [028] d... 335.307051: irq_matrix_reserve_managed: bit=56 cpu=0 online=1 avl=86 alloc=116 managed=3 online_maps=112 global_avl=22084, global_rsvd=157, total_alloc=570 kworker/u674:3

[BUG 4.15-rc7] IRQ matrix management errors

2018-01-14 Thread Keith Busch
I hoped to have a better report before the weekend, but I've run out of time and without my machine till next week, so sending what I have and praying someone more in the know will have a better clue. I've a few NVMe drives and occasionally the IRQ teardown and bring-up is failing. Resetting the c

Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting

2018-01-14 Thread Keith Busch
On Mon, Jan 15, 2018 at 10:02:04AM +0800, jianchao.wang wrote: > Hi keith > > Thanks for your kindly review and response. I agree with Sagi's feedback, but I can't take credit for it. :)

Re: ASPM powersupersave change NVMe SSD Samsung 960 PRO capacity to 0 and read-only

2018-01-11 Thread Keith Busch
On Thu, Jan 11, 2018 at 06:50:40PM +0100, Maik Broemme wrote: > I've re-run the test with 4.15rc7.r111.g5f615b97cdea and the following > patches from Keith: > > [PATCH 1/4] PCI/AER: Return approrpiate value when AER is not supported > [PATCH 2/4] PCI/AER: Provide API for getting AER information >
