RE: [PATCH] nvme/pci: remap BAR0 to cover admin CQ doorbell for large stride
Thanks for your suggestion, we will try to make it better and resend the patch soon. -Original Message- From: Christoph Hellwig [mailto:h...@lst.de] Sent: Thursday, May 18, 2017 9:44 PM To: Xu, Yu A <yu.a...@intel.com> Cc: linux-n...@lists.infradead.org; linux-kernel@vger.kernel.org; Busch, Keith <keith.bu...@intel.com>; ax...@fb.com; h...@lst.de; s...@grimberg.me; Zhang, Haozhong <haozhong.zh...@intel.com> Subject: Re: [PATCH] nvme/pci: remap BAR0 to cover admin CQ doorbell for large stride On Thu, May 18, 2017 at 06:35:47AM +0800, Xu Yu wrote: > The existing driver initially maps 8192 bytes of BAR0 which is > intended to cover doorbells of admin SQ and CQ. However, if a large > stride, e.g. 10, is used, the doorbell of admin CQ will be out of 8192 > bytes. Consequently, a page fault will be raised when the admin CQ > doorbell is accessed in nvme_configure_admin_queue(). > > This patch fixes this issue by remapping BAR0 before accessing admin > CQ doorbell if the initial mapping is not enough. > > Signed-off-by: "Xu, Yu A" <yu.a...@intel.com> > --- > drivers/nvme/host/pci.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index > 9d4640a..7c991eb 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -1322,6 +1322,17 @@ static int nvme_configure_admin_queue(struct nvme_dev > *dev) > u32 aqa; > u64 cap = lo_hi_readq(dev->bar + NVME_REG_CAP); > struct nvme_queue *nvmeq; > + struct pci_dev *pdev = to_pci_dev(dev->dev); > + unsigned long size; > + > + size = 4096 + 2 * 4 * dev->db_stride; > + if (size > 8192) { > + iounmap(dev->bar); > + dev->bar = ioremap(pci_resource_start(pdev, 0), size); > + if (!dev->bar) > + return -ENOMEM; > + dev->dbs = dev->bar + 4096; > + } This code duplicates logic in db_bar_size / nvme_setup_io_queues. Please reuse the db_bar_size helper by passing 0 to, and try to figure out if we can factor this whole sequence into a new helper as well. Bonus points for adding constants to nvme.h for the 4096 offset of the first db register, and our magic 8192 threshold.
RE: [PATCH] nvme/pci: remap BAR0 to cover admin CQ doorbell for large stride
Thanks for your suggestion, we will try to make it better and resend the patch soon. -Original Message- From: Christoph Hellwig [mailto:h...@lst.de] Sent: Thursday, May 18, 2017 9:44 PM To: Xu, Yu A Cc: linux-n...@lists.infradead.org; linux-kernel@vger.kernel.org; Busch, Keith ; ax...@fb.com; h...@lst.de; s...@grimberg.me; Zhang, Haozhong Subject: Re: [PATCH] nvme/pci: remap BAR0 to cover admin CQ doorbell for large stride On Thu, May 18, 2017 at 06:35:47AM +0800, Xu Yu wrote: > The existing driver initially maps 8192 bytes of BAR0 which is > intended to cover doorbells of admin SQ and CQ. However, if a large > stride, e.g. 10, is used, the doorbell of admin CQ will be out of 8192 > bytes. Consequently, a page fault will be raised when the admin CQ > doorbell is accessed in nvme_configure_admin_queue(). > > This patch fixes this issue by remapping BAR0 before accessing admin > CQ doorbell if the initial mapping is not enough. > > Signed-off-by: "Xu, Yu A" > --- > drivers/nvme/host/pci.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index > 9d4640a..7c991eb 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -1322,6 +1322,17 @@ static int nvme_configure_admin_queue(struct nvme_dev > *dev) > u32 aqa; > u64 cap = lo_hi_readq(dev->bar + NVME_REG_CAP); > struct nvme_queue *nvmeq; > + struct pci_dev *pdev = to_pci_dev(dev->dev); > + unsigned long size; > + > + size = 4096 + 2 * 4 * dev->db_stride; > + if (size > 8192) { > + iounmap(dev->bar); > + dev->bar = ioremap(pci_resource_start(pdev, 0), size); > + if (!dev->bar) > + return -ENOMEM; > + dev->dbs = dev->bar + 4096; > + } This code duplicates logic in db_bar_size / nvme_setup_io_queues. Please reuse the db_bar_size helper by passing 0 to, and try to figure out if we can factor this whole sequence into a new helper as well. Bonus points for adding constants to nvme.h for the 4096 offset of the first db register, and our magic 8192 threshold.
Re: [PATCH] nvme/pci: remap BAR0 to cover admin CQ doorbell for large stride
On Thu, May 18, 2017 at 06:35:47AM +0800, Xu Yu wrote: > The existing driver initially maps 8192 bytes of BAR0 which is > intended to cover doorbells of admin SQ and CQ. However, if a > large stride, e.g. 10, is used, the doorbell of admin CQ will > be out of 8192 bytes. Consequently, a page fault will be raised > when the admin CQ doorbell is accessed in nvme_configure_admin_queue(). > > This patch fixes this issue by remapping BAR0 before accessing > admin CQ doorbell if the initial mapping is not enough. > > Signed-off-by: "Xu, Yu A"> --- > drivers/nvme/host/pci.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index 9d4640a..7c991eb 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -1322,6 +1322,17 @@ static int nvme_configure_admin_queue(struct nvme_dev > *dev) > u32 aqa; > u64 cap = lo_hi_readq(dev->bar + NVME_REG_CAP); > struct nvme_queue *nvmeq; > + struct pci_dev *pdev = to_pci_dev(dev->dev); > + unsigned long size; > + > + size = 4096 + 2 * 4 * dev->db_stride; > + if (size > 8192) { > + iounmap(dev->bar); > + dev->bar = ioremap(pci_resource_start(pdev, 0), size); > + if (!dev->bar) > + return -ENOMEM; > + dev->dbs = dev->bar + 4096; > + } This code duplicates logic in db_bar_size / nvme_setup_io_queues. Please reuse the db_bar_size helper by passing 0 to, and try to figure out if we can factor this whole sequence into a new helper as well. Bonus points for adding constants to nvme.h for the 4096 offset of the first db register, and our magic 8192 threshold.
Re: [PATCH] nvme/pci: remap BAR0 to cover admin CQ doorbell for large stride
On Thu, May 18, 2017 at 06:35:47AM +0800, Xu Yu wrote: > The existing driver initially maps 8192 bytes of BAR0 which is > intended to cover doorbells of admin SQ and CQ. However, if a > large stride, e.g. 10, is used, the doorbell of admin CQ will > be out of 8192 bytes. Consequently, a page fault will be raised > when the admin CQ doorbell is accessed in nvme_configure_admin_queue(). > > This patch fixes this issue by remapping BAR0 before accessing > admin CQ doorbell if the initial mapping is not enough. > > Signed-off-by: "Xu, Yu A" > --- > drivers/nvme/host/pci.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index 9d4640a..7c991eb 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -1322,6 +1322,17 @@ static int nvme_configure_admin_queue(struct nvme_dev > *dev) > u32 aqa; > u64 cap = lo_hi_readq(dev->bar + NVME_REG_CAP); > struct nvme_queue *nvmeq; > + struct pci_dev *pdev = to_pci_dev(dev->dev); > + unsigned long size; > + > + size = 4096 + 2 * 4 * dev->db_stride; > + if (size > 8192) { > + iounmap(dev->bar); > + dev->bar = ioremap(pci_resource_start(pdev, 0), size); > + if (!dev->bar) > + return -ENOMEM; > + dev->dbs = dev->bar + 4096; > + } This code duplicates logic in db_bar_size / nvme_setup_io_queues. Please reuse the db_bar_size helper by passing 0 to, and try to figure out if we can factor this whole sequence into a new helper as well. Bonus points for adding constants to nvme.h for the 4096 offset of the first db register, and our magic 8192 threshold.