Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power
On 02.10.2015 [10:25:44 -0700], Christoph Hellwig wrote: > Hi Nishanth, > > please expose this value through the generic DMA API instead of adding > architecture specific hacks to drivers. Ok, I'm happy to do that instead -- what I struggled with is that I don't have enough knowledge of the various architectures to provide the right default implementation. It should be sufficient for the default to return PAGE_SHIFT, and on Power just override that to return the IOMMU table's page size? Since the only user will be the NVMe driver currently, that should be fine? Sorry for the less-than-ideal patch! -Nish ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power
We received a bug report recently when DDW (64-bit direct DMA on Power) is not enabled for NVMe devices. In that case, we fall back to 32-bit DMA via the IOMMU, which is always done via 4K TCEs (Translation Control Entries). The NVMe device driver, though, assumes that the DMA alignment for the PRP entries will match the device's page size, and that the DMA aligment matches the kernel's page aligment. On Power, the the IOMMU page size, as mentioned above, can be 4K, while the device can have a page size of 8K, while the kernel has a page size of 64K. This eventually trips the BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple of 4K but not 8K (e.g., 0xF000). In this particular case, and generally, we want to use the IOMMU's page size for the default device page size, rather than the kernel's page size. With this patch, a NVMe device survives our internal hardware exerciser; the kernel BUGs within a few seconds without the patch. Signed-off-by: Nishanth Aravamudandiff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c index 7920c27..969a95e 100644 --- a/drivers/block/nvme-core.c +++ b/drivers/block/nvme-core.c @@ -42,6 +42,7 @@ #include #include #include +#include #define NVME_MINORS(1U << MINORBITS) #define NVME_Q_DEPTH 1024 @@ -1680,6 +1681,11 @@ static int nvme_configure_admin_queue(struct nvme_dev *dev) unsigned page_shift = PAGE_SHIFT; unsigned dev_page_min = NVME_CAP_MPSMIN(cap) + 12; unsigned dev_page_max = NVME_CAP_MPSMAX(cap) + 12; +#ifdef CONFIG_PPC64 + struct iommu_table *tbl = get_iommu_table_base(dev->dev); + if (tbl) + page_shift = IOMMU_PAGE_SHIFT(tbl); +#endif if (page_shift < dev_page_min) { dev_err(dev->dev, ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power
On Fri, Oct 02, 2015 at 10:39:47AM -0700, Nishanth Aravamudan wrote: > Ok, I'm happy to do that instead -- what I struggled with is that I > don't have enough knowledge of the various architectures to provide the > right default implementation. It should be sufficient for the default to > return PAGE_SHIFT, and on Power just override that to return the IOMMU > table's page size? Since the only user will be the NVMe driver > currently, that should be fine? I think that's fine. > Sorry for the less-than-ideal patch! Np, it's a reasonable first attempt that we just need to refine. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power
Hi Nishanth, please expose this value through the generic DMA API instead of adding architecture specific hacks to drivers. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power
Hi Nishanth, [auto build test results on v4.3-rc3 -- if it's inappropriate base, please ignore] config: arm64-allmodconfig (attached as .config) reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm64 All error/warnings (new ones prefixed by >>): >> drivers/block/nvme-core.c:45:23: fatal error: asm/iommu.h: No such file or >> directory #include ^ compilation terminated. vim +45 drivers/block/nvme-core.c 39 #include 40 #include 41 #include 42 #include 43 #include 44 #include > 45 #include 46 47 #define NVME_MINORS (1U << MINORBITS) 48 #define NVME_Q_DEPTH1024 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev