Re: revert scope for 5.8, was Re: dma-pool fixes
On Sat, 1 Aug 2020 at 23:09, Christoph Hellwig wrote: > > On Sat, Aug 01, 2020 at 05:27:04PM +0530, Amit Pundir wrote: > > Hi, I found the problematic memory region. It was a memory > > chunk reserved/removed in the downstream tree but was > > seemingly reserved upstream for different drivers. I failed to > > calculate the length of the total region reserved downstream > > correctly. And there was still a portion of memory left unmarked, > > which I should have marked as reserved in my testing earlier > > today. > > > > Sorry for all the noise and thanks Nicolas, Christoph and David > > for your patience. > > So you'll need to patch the upstream DTS to fix this up? Do you also > need my two fixes? What about the Oneplus phones? Can you send a > mail with a summary of the status? Poco's DTS is not upstreamed yet. I have updated it for this fix and sent out a newer version for review. https://lkml.org/lkml/2020/8/1/184 I didn't need to try your two add-on fixes. I'll give them a spin later today. I'm sure One Plus 6 and 6T will be running into similar problem. I'll check with Caleb and send out a status mail with the summary. Regards, Amit Pundir ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: revert scope for 5.8, was Re: dma-pool fixes
On Sat, 1 Aug 2020 at 23:58, Linus Torvalds wrote: > > On Sat, Aug 1, 2020 at 4:57 AM Amit Pundir wrote: > > > > Hi, I found the problematic memory region. It was a memory > > chunk reserved/removed in the downstream tree but was > > seemingly reserved upstream for different drivers. > > Is this happening with a clean tree, or are there external drivers > involved that trigger the problem? > > Because if it's a clean tree, I guess I need to do an rc8 anyway, just > to get whatever workaround you then added to devicetree and/or some > driver to make it work again. > No, this is not on a clean tree. The phone's device-tree is not upstreamed yet. That is the only change I carry. I have updated the device-tree for this fix and sent out a newer version for review. https://lkml.org/lkml/2020/8/1/184 Regards, Amit Pundir > Or is there a quick fix that I can get today or early tomorrow? We had > some confusion this week due to a nasty include header mess, but > despite that there hasn't been anything else that has come up (so > far), so.. > >Linus ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5] PCI/ACS: Enable PCI_ACS_TB and disable only when needed for ATS
Hi Bjorn, On Tue, Jul 14, 2020 at 1:24 PM Rajat Jain wrote: > > On Tue, Jul 14, 2020 at 1:15 PM Rajat Jain wrote: > > > > The ACS "Translation Blocking" bit blocks the translated addresses from > > the devices. We don't expect such traffic from devices unless ATS is > > enabled on them. A device sending such traffic without ATS enabled, > > indicates malicious intent, and thus should be blocked. > > > > Enable PCI_ACS_TB by default for all devices, and it stays enabled until > > atleast one of the devices downstream wants to enable ATS. It gets > > disabled to enable ATS on a device downstream it, and then gets enabled > > back on once all the downstream devices don't need ATS. > > > > Signed-off-by: Rajat Jain Just checking to see if you got a chance to look at this V5 patch. Thanks & Best Regards, Rajat > > --- > > Note that I'm ignoring the devices that require quirks to enable or > > disable ACS, instead of using the standard way for ACS configuration. > > The reason is that it would require adding yet another quirk table or > > quirk function pointer, that I don't know how to implement for those > > devices, and will neither have the devices to test that code. > > > > v5: Enable TB and disable ATS for all devices on boot. Disable TB later > > only if needed to enable ATS on downstream devices. > > v4: Add braces to avoid warning from kernel robot > > print warning for only external-facing devices. > > v3: print warning if ACS_TB not supported on external-facing/untrusted > > ports. > > Minor code comments fixes. > > v2: Commit log change > > > > drivers/pci/ats.c | 5 > > drivers/pci/pci.c | 57 + > > drivers/pci/pci.h | 2 ++ > > drivers/pci/probe.c | 2 +- > > include/linux/pci.h | 2 ++ > > 5 files changed, 67 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c > > index b761c1f72f67..e2ea9083f30f 100644 > > --- a/drivers/pci/ats.c > > +++ b/drivers/pci/ats.c > > @@ -28,6 +28,9 @@ void pci_ats_init(struct pci_dev *dev) > > return; > > > > dev->ats_cap = pos; > > + > > + dev->ats_enabled = 1; /* To avoid WARN_ON from pci_disable_ats() */ > > + pci_disable_ats(dev); > > } > > > > /** > > @@ -82,6 +85,7 @@ int pci_enable_ats(struct pci_dev *dev, int ps) > > } > > pci_write_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, ctrl); > > > > + pci_disable_acs_trans_blocking(dev); > > dev->ats_enabled = 1; > > return 0; > > } > > @@ -102,6 +106,7 @@ void pci_disable_ats(struct pci_dev *dev) > > ctrl &= ~PCI_ATS_CTRL_ENABLE; > > pci_write_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, ctrl); > > > > + pci_enable_acs_trans_blocking(dev); > > dev->ats_enabled = 0; > > } > > EXPORT_SYMBOL_GPL(pci_disable_ats); > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index 73a862782214..614e3c1e8c56 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -876,6 +876,9 @@ static void pci_std_enable_acs(struct pci_dev *dev) > > /* Upstream Forwarding */ > > ctrl |= (cap & PCI_ACS_UF); > > > > + /* Translation Blocking */ > > + ctrl |= (cap & PCI_ACS_TB); > > + > > pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl); > > } > > > > @@ -904,6 +907,60 @@ static void pci_enable_acs(struct pci_dev *dev) > > pci_disable_acs_redir(dev); > > } > > > > +void pci_disable_acs_trans_blocking(struct pci_dev *pdev) > > +{ > > + u16 cap, ctrl, pos; > > + struct pci_dev *dev; > > + > > + if (!pci_acs_enable) > > + return; > > + > > + for (dev = pdev; dev; dev = pci_upstream_bridge(pdev)) { > > + > > + pos = dev->acs_cap; > > + if (!pos) > > + continue; > > + > > + /* > > +* Disable translation blocking when first downstream > > +* device that needs it (for ATS) wants to enable ATS > > +*/ > > + if (++dev->ats_dependencies == 1) { > > I am a little worried about a potential race condition here. I know > that 2 PCI devices cannot be enumerating at the same time. Do we know > if multiple pci_enable_ats() and pci_disable_ats() function calls can > be simultaneously executing (even for different devices)? If so, we > may need an atomic_t variable for ats_dependencies. > > Thanks, > > Rajat > > > > + pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap); > > + pci_read_config_word(dev, pos + PCI_ACS_CTRL, > > &ctrl); > > + ctrl &= ~(cap & PCI_ACS_TB); > > + pci_write_config_word(dev, pos + PCI_ACS_CTRL, > > ctrl); > > + } > > + } > > +} > > + > > +void pci_enable_acs_trans_blocking(struct pci_dev *pdev) > > +{ > > + u16 cap, ctrl, pos; > > + struct pci_dev *dev; > > + > > +
Re: [PATCH v5] PCI/ACS: Enable PCI_ACS_TB and disable only when needed for ATS
Hi Bjorn, On Tue, Jul 14, 2020 at 1:15 PM Rajat Jain wrote: > The ACS "Translation Blocking" bit blocks the translated addresses from > the devices. We don't expect such traffic from devices unless ATS is > enabled on them. A device sending such traffic without ATS enabled, > indicates malicious intent, and thus should be blocked. > > Enable PCI_ACS_TB by default for all devices, and it stays enabled until > atleast one of the devices downstream wants to enable ATS. It gets > disabled to enable ATS on a device downstream it, and then gets enabled > back on once all the downstream devices don't need ATS. > > Signed-off-by: Rajat Jain > Just checking to see if you got a chance to look at this V5 patch. Thanks & Best Regards, Rajat > --- > Note that I'm ignoring the devices that require quirks to enable or > disable ACS, instead of using the standard way for ACS configuration. > The reason is that it would require adding yet another quirk table or > quirk function pointer, that I don't know how to implement for those > devices, and will neither have the devices to test that code. > > v5: Enable TB and disable ATS for all devices on boot. Disable TB later > only if needed to enable ATS on downstream devices. > v4: Add braces to avoid warning from kernel robot > print warning for only external-facing devices. > v3: print warning if ACS_TB not supported on external-facing/untrusted > ports. > Minor code comments fixes. > v2: Commit log change > > drivers/pci/ats.c | 5 > drivers/pci/pci.c | 57 + > drivers/pci/pci.h | 2 ++ > drivers/pci/probe.c | 2 +- > include/linux/pci.h | 2 ++ > 5 files changed, 67 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c > index b761c1f72f67..e2ea9083f30f 100644 > --- a/drivers/pci/ats.c > +++ b/drivers/pci/ats.c > @@ -28,6 +28,9 @@ void pci_ats_init(struct pci_dev *dev) > return; > > dev->ats_cap = pos; > + > + dev->ats_enabled = 1; /* To avoid WARN_ON from pci_disable_ats() */ > + pci_disable_ats(dev); > } > > /** > @@ -82,6 +85,7 @@ int pci_enable_ats(struct pci_dev *dev, int ps) > } > pci_write_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, ctrl); > > + pci_disable_acs_trans_blocking(dev); > dev->ats_enabled = 1; > return 0; > } > @@ -102,6 +106,7 @@ void pci_disable_ats(struct pci_dev *dev) > ctrl &= ~PCI_ATS_CTRL_ENABLE; > pci_write_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, ctrl); > > + pci_enable_acs_trans_blocking(dev); > dev->ats_enabled = 0; > } > EXPORT_SYMBOL_GPL(pci_disable_ats); > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 73a862782214..614e3c1e8c56 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -876,6 +876,9 @@ static void pci_std_enable_acs(struct pci_dev *dev) > /* Upstream Forwarding */ > ctrl |= (cap & PCI_ACS_UF); > > + /* Translation Blocking */ > + ctrl |= (cap & PCI_ACS_TB); > + > pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl); > } > > @@ -904,6 +907,60 @@ static void pci_enable_acs(struct pci_dev *dev) > pci_disable_acs_redir(dev); > } > > +void pci_disable_acs_trans_blocking(struct pci_dev *pdev) > +{ > + u16 cap, ctrl, pos; > + struct pci_dev *dev; > + > + if (!pci_acs_enable) > + return; > + > + for (dev = pdev; dev; dev = pci_upstream_bridge(pdev)) { > + > + pos = dev->acs_cap; > + if (!pos) > + continue; > + > + /* > +* Disable translation blocking when first downstream > +* device that needs it (for ATS) wants to enable ATS > +*/ > + if (++dev->ats_dependencies == 1) { > + pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap); > + pci_read_config_word(dev, pos + PCI_ACS_CTRL, > &ctrl); > + ctrl &= ~(cap & PCI_ACS_TB); > + pci_write_config_word(dev, pos + PCI_ACS_CTRL, > ctrl); > + } > + } > +} > + > +void pci_enable_acs_trans_blocking(struct pci_dev *pdev) > +{ > + u16 cap, ctrl, pos; > + struct pci_dev *dev; > + > + if (!pci_acs_enable) > + return; > + > + for (dev = pdev; dev; dev = pci_upstream_bridge(pdev)) { > + > + pos = dev->acs_cap; > + if (!pos) > + continue; > + > + /* > +* Enable translation blocking when last downstream device > +* that depends on it (for ATS), doesn't need ATS anymore > +*/ > + if (--dev->ats_dependencies == 0) { > + pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap); > + pci_read_config_word(dev, pos + PCI_ACS_CTRL, > &ctrl); > +
Re: revert scope for 5.8, was Re: dma-pool fixes
On Sat, Aug 1, 2020 at 4:57 AM Amit Pundir wrote: > > Hi, I found the problematic memory region. It was a memory > chunk reserved/removed in the downstream tree but was > seemingly reserved upstream for different drivers. Is this happening with a clean tree, or are there external drivers involved that trigger the problem? Because if it's a clean tree, I guess I need to do an rc8 anyway, just to get whatever workaround you then added to devicetree and/or some driver to make it work again. Or is there a quick fix that I can get today or early tomorrow? We had some confusion this week due to a nasty include header mess, but despite that there hasn't been anything else that has come up (so far), so.. Linus ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: revert scope for 5.8, was Re: dma-pool fixes
On Sat, Aug 01, 2020 at 05:27:04PM +0530, Amit Pundir wrote: > Hi, I found the problematic memory region. It was a memory > chunk reserved/removed in the downstream tree but was > seemingly reserved upstream for different drivers. I failed to > calculate the length of the total region reserved downstream > correctly. And there was still a portion of memory left unmarked, > which I should have marked as reserved in my testing earlier > today. > > Sorry for all the noise and thanks Nicolas, Christoph and David > for your patience. So you'll need to patch the upstream DTS to fix this up? Do you also need my two fixes? What about the Oneplus phones? Can you send a mail with a summary of the status? ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset
Hi Jim, here's some comments after testing your series against RPi4. On Fri, 2020-07-24 at 16:33 -0400, Jim Quinlan wrote: > The new field 'dma_range_map' in struct device is used to facilitate the > use of single or multiple offsets between mapping regions of cpu addrs and > dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only > capable of holding a single uniform offset and had no region bounds > checking. > > The function of_dma_get_range() has been modified so that it takes a single > argument -- the device node -- and returns a map, NULL, or an error code. > The map is an array that holds the information regarding the DMA regions. > Each range entry contains the address offset, the cpu_start address, the > dma_start address, and the size of the region. > > of_dma_configure() is the typical manner to set range offsets but there are > a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel > driver code. These cases now invoke the function > dma_attach_offset_range(dev, cpu_addr, dma_addr, size). > > Signed-off-by: Jim Quinlan > --- [...] > diff --git a/drivers/of/address.c b/drivers/of/address.c > index 8eea3f6e29a4..4b718d199efe 100644 > --- a/drivers/of/address.c > +++ b/drivers/of/address.c > @@ -918,33 +918,33 @@ void __iomem *of_io_request_and_map(struct device_node > *np, int index, > } > EXPORT_SYMBOL(of_io_request_and_map); > > +#ifdef CONFIG_HAS_DMA > /** > - * of_dma_get_range - Get DMA range info > + * of_dma_get_range - Get DMA range info and put it into a map array > * @np: device node to get DMA range info > - * @dma_addr:pointer to store initial DMA address of DMA range > - * @paddr: pointer to store initial CPU address of DMA range > - * @size:pointer to store size of DMA range > + * @map: dma range structure to return > * > * Look in bottom up direction for the first "dma-ranges" property > - * and parse it. > - * dma-ranges format: > + * and parse it. Put the information into a DMA offset map array. > + * > + * dma-ranges format: > * DMA addr (dma_addr) : naddr cells > * CPU addr (phys_addr_t) : pna cells > * size: nsize cells > * > - * It returns -ENODEV if "dma-ranges" property was not found > - * for this device in DT. > + * It returns -ENODEV if "dma-ranges" property was not found for this > + * device in the DT. > */ > -int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 > *size) > +int of_dma_get_range(struct device_node *np, const struct bus_dma_region > **map) > { > struct device_node *node = of_node_get(np); > const __be32 *ranges = NULL; > - int len; > - int ret = 0; > bool found_dma_ranges = false; > struct of_range_parser parser; > struct of_range range; > - u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0; > + struct bus_dma_region *r; > + int len, num_ranges = 0; > + int ret; > > while (node) { > ranges = of_get_property(node, "dma-ranges", &len); > @@ -970,44 +970,35 @@ int of_dma_get_range(struct device_node *np, u64 > *dma_addr, u64 *paddr, u64 *siz > } > > of_dma_range_parser_init(&parser, node); > + for_each_of_range(&parser, &range) > + num_ranges++; > + > + of_dma_range_parser_init(&parser, node); > + > + ret = -ENOMEM; > + r = kcalloc(num_ranges + 1, sizeof(*r), GFP_KERNEL); > + if (!r) > + goto out; > > + /* > + * Record all info in the generic DMA ranges array for struct device. > + */ > + *map = r; > for_each_of_range(&parser, &range) { > pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n", >range.bus_addr, range.cpu_addr, range.size); > - > - if (dma_offset && range.cpu_addr - range.bus_addr != > dma_offset) { > - pr_warn("Can't handle multiple dma-ranges with > different offsets on node(%pOF)\n", node); > - /* Don't error out as we'd break some existing DTs */ > - continue; > - } > - dma_offset = range.cpu_addr - range.bus_addr; > - > - /* Take lower and upper limits */ > - if (range.bus_addr < dma_start) > - dma_start = range.bus_addr; > - if (range.bus_addr + range.size > dma_end) > - dma_end = range.bus_addr + range.size; > - } > - > - if (dma_start >= dma_end) { > - ret = -EINVAL; > - pr_debug("Invalid DMA ranges configuration on node(%pOF)\n", > - node); > - goto out; > + r->cpu_start = range.cpu_addr; > + r->dma_start = range.bus_addr; > + r->size = range.size; > + r->offset = (u64)range.cpu_addr - (u64)range.bus_addr; > + r++; > } > > - *dma_addr = dma_start; > - *size = dma_end - dma_star
Re: revert scope for 5.8, was Re: dma-pool fixes
On Sat, 2020-08-01 at 17:27 +0530, Amit Pundir wrote: [...] > > I'm between a rock and a hard place here. If we simply want to revert > > commits as-is to make sure both the Raspberry Pi 4 and thone phone do > > not regress we'll have to go all the way back and revert the whole SEV > > pool support. I could try to manual revert of the multiple pool > > support, but it is very late for that. > > Hi, I found the problematic memory region. It was a memory > chunk reserved/removed in the downstream tree but was > seemingly reserved upstream for different drivers. I failed to > calculate the length of the total region reserved downstream > correctly. And there was still a portion of memory left unmarked, > which I should have marked as reserved in my testing earlier > today. > > Sorry for all the noise and thanks Nicolas, Christoph and David > for your patience. That's great news, thanks for persevering! Regards, Nicolas ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC PATCH 15/17] iommu/vt-d: Drop uses of pci_read_config_*() return value
The return value of pci_read_config_*() may not indicate a device error. However, the value read by these functions is more likely to indicate this kind of error. This presents two overlapping ways of reporting errors and complicates error checking. It is possible to move to one single way of checking for error if the dependency on the return value of these functions is removed, then it can later be made to return void. Remove all uses of the return value of pci_read_config_*(). Check the actual value read for ~0. In this case, ~0 is an invalid value thus it indicates some kind of error. Suggested-by: Bjorn Helgaas Signed-off-by: Saheed O. Bolarinwa --- drivers/iommu/intel/iommu.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index d759e7234e98..aad3c065e4a0 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -6165,7 +6165,8 @@ static void quirk_calpella_no_shadow_gtt(struct pci_dev *dev) if (risky_device(dev)) return; - if (pci_read_config_word(dev, GGC, &ggc)) + pci_read_config_word(dev, GGC, &ggc); + if (ggc == (u16)~0) return; if (!(ggc & GGC_MEMORY_VT_ENABLED)) { @@ -6218,7 +6219,8 @@ static void __init check_tylersburg_isoch(void) return; } - if (pci_read_config_dword(pdev, 0x188, &vtisochctrl)) { + pci_read_config_dword(pdev, 0x188, &vtisochctrl); + if (vtisochctrl == (uint32_t)~0) { pci_dev_put(pdev); return; } -- 2.18.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 06/15] powerpc: fadamp: simplify fadump_reserve_crash_area()
On 01/08/20 3:48 pm, Mike Rapoport wrote: On Thu, Jul 30, 2020 at 10:15:13PM +1000, Michael Ellerman wrote: Mike Rapoport writes: From: Mike Rapoport fadump_reserve_crash_area() reserves memory from a specified base address till the end of the RAM. Replace iteration through the memblock.memory with a single call to memblock_reserve() with appropriate that will take care of proper memory ^ parameters? reservation. Signed-off-by: Mike Rapoport --- arch/powerpc/kernel/fadump.c | 20 +--- 1 file changed, 1 insertion(+), 19 deletions(-) I think this looks OK to me, but I don't have a setup to test it easily. I've added Hari to Cc who might be able to. But I'll give you an ack in the hope that it works :) Actually, I did some digging in the git log and the traversal was added there on purpose by the commit b71a693d3db3 ("powerpc/fadump: exclude memory holes while reserving memory in second kernel") I was about to comment on the same :) memblock_reserve() was being used until we ran into the issue talked about in the above commit... Presuming this is still reqruired I'm going to drop this patch and will Yeah, it is still required.. simply replace for_each_memblock() with for_each_mem_range() in v2. Sounds right. Acked-by: Michael Ellerman diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c index 78ab9a6ee6ac..2446a61e3c25 100644 --- a/arch/powerpc/kernel/fadump.c +++ b/arch/powerpc/kernel/fadump.c @@ -1658,25 +1658,7 @@ int __init fadump_reserve_mem(void) /* Preserve everything above the base address */ static void __init fadump_reserve_crash_area(u64 base) { - struct memblock_region *reg; - u64 mstart, msize; - - for_each_memblock(memory, reg) { - mstart = reg->base; - msize = reg->size; - - if ((mstart + msize) < base) - continue; - - if (mstart < base) { - msize -= (base - mstart); - mstart = base; - } - - pr_info("Reserving %lluMB of memory at %#016llx for preserving crash data", - (msize >> 20), mstart); - memblock_reserve(mstart, msize); - } + memblock_reserve(base, memblock_end_of_DRAM() - base); } unsigned long __init arch_reserved_kernel_pages(void) -- 2.26.2 Thanks Hari ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC PATCH 00/17] Drop uses of pci_read_config_*() return value
The return value of pci_read_config_*() may not indicate a device error. However, the value read by these functions is more likely to indicate this kind of error. This presents two overlapping ways of reporting errors and complicates error checking. It is possible to move to one single way of checking for error if the dependencies on the return value of these functions are removed, then it can later be made to return void. Remove all uses of the return value of pci_read_config_*(). Check the actual value read for ~0. In this case, ~0 is an invalid value thus it indicates some kind of error. In some cases it madkes sence to make the calling function return void without causing any bug. Future callers can use the value obtained from these functions for validation. This case pertain to cs5536_read() and edac_pci_read_dword() MERGE: There is no dependency. Merge individually Saheed O. Bolarinwa (17): ata: Drop uses of pci_read_config_*() return value atm: Drop uses of pci_read_config_*() return value bcma: Drop uses of pci_read_config_*() return value hwrng: Drop uses of pci_read_config_*() return value dmaengine: ioat: Drop uses of pci_read_config_*() return value edac: Drop uses of pci_read_config_*() return value fpga: altera-cvp: Drop uses of pci_read_config_*() return value gpio: Drop uses of pci_read_config_*() return value drm/i915/vga: Drop uses of pci_read_config_*() return value hwmon: Drop uses of pci_read_config_*() return value intel_th: pci: Drop uses of pci_read_config_*() return value i2c: Drop uses of pci_read_config_*() return value ide: Drop uses of pci_read_config_*() return value IB: Drop uses of pci_read_config_*() return value iommu/vt-d: Drop uses of pci_read_config_*() return value mtd: Drop uses of pci_read_config_*() return value net: Drop uses of pci_read_config_*() return value drivers/ata/pata_cs5536.c | 6 +-- drivers/ata/pata_rz1000.c | 3 +- drivers/atm/eni.c | 3 +- drivers/atm/he.c | 12 +++-- drivers/atm/idt77252.c| 9 ++-- drivers/atm/iphase.c | 46 ++- drivers/atm/lanai.c | 4 +- drivers/atm/nicstar.c | 3 +- drivers/atm/zatm.c| 9 ++-- drivers/bcma/host_pci.c | 6 ++- drivers/char/hw_random/amd-rng.c | 6 +-- drivers/dma/ioat/dma.c| 6 +-- drivers/edac/amd64_edac.c | 8 ++-- drivers/edac/amd8111_edac.c | 16 ++- drivers/edac/amd8131_edac.c | 6 +-- drivers/edac/i82443bxgx_edac.c| 3 +- drivers/edac/sb_edac.c| 12 +++-- drivers/edac/skx_common.c | 18 +--- drivers/fpga/altera-cvp.c | 8 ++-- drivers/gpio/gpio-amd8111.c | 7 ++- drivers/gpio/gpio-rdc321x.c | 21 + drivers/gpu/drm/i915/display/intel_vga.c | 3 +- drivers/hwmon/i5k_amb.c | 12 +++-- drivers/hwmon/vt8231.c| 8 ++-- drivers/hwtracing/intel_th/pci.c | 12 ++--- drivers/i2c/busses/i2c-ali15x3.c | 6 ++- drivers/i2c/busses/i2c-elektor.c | 3 +- drivers/i2c/busses/i2c-nforce2.c | 4 +- drivers/i2c/busses/i2c-sis5595.c | 17 --- drivers/i2c/busses/i2c-sis630.c | 7 +-- drivers/i2c/busses/i2c-viapro.c | 11 +++-- drivers/ide/cs5536.c | 6 +-- drivers/ide/rz1000.c | 3 +- drivers/ide/setup-pci.c | 26 +++ drivers/infiniband/hw/hfi1/pcie.c | 38 +++ drivers/infiniband/hw/mthca/mthca_reset.c | 19 drivers/iommu/intel/iommu.c | 6 ++- drivers/mtd/maps/ichxrom.c| 4 +- drivers/net/can/peak_canfd/peak_pciefd_main.c | 6 ++- drivers/net/can/sja1000/peak_pci.c| 6 ++- drivers/net/ethernet/agere/et131x.c | 11 +++-- .../ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 5 +- .../cavium/liquidio/cn23xx_pf_device.c| 4 +- drivers/net/ethernet/marvell/sky2.c | 5 +- drivers/net/ethernet/mellanox/mlx4/catas.c| 7 +-- drivers/net/ethernet/mellanox/mlx4/reset.c| 10 ++-- .../net/ethernet/myricom/myri10ge/myri10ge.c | 4 +- drivers/net/wan/farsync.c | 5 +- .../broadcom/brcm80211/brcmfmac/pcie.c| 4 +- .../net/wireless/intel/iwlwifi/pcie/trans.c | 15 -- 50 files changed, 270 insertions(+), 209 deletions(-) -- 2.18.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value
On Sat, Aug 01, 2020 at 01:24:29PM +0200, Saheed O. Bolarinwa wrote: > The return value of pci_read_config_*() may not indicate a device error. > However, the value read by these functions is more likely to indicate > this kind of error. This presents two overlapping ways of reporting > errors and complicates error checking. So why isn't the *value check done in the pci_read_config_* functions instead of touching gazillion callers? For example, pci_conf{1,2}_read() could check whether the u32 *value it just read depending on the access method, whether that value is ~0 and return proper PCIBIOS_ error in that case. The check you're replicating if (val32 == (u32)~0) everywhere, instead, is just ugly and tests a naked value ~0 which doesn't mean anything... -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: revert scope for 5.8, was Re: dma-pool fixes
On Sat, 1 Aug 2020 at 14:27, Christoph Hellwig wrote: > > On Sat, Aug 01, 2020 at 01:20:07AM -0700, David Rientjes wrote: > > To follow-up on this, the introduction of the DMA atomic pools in 5.8 > > fixes an issue for any AMD SEV enabled guest that has a driver that > > requires atomic DMA allocations (for us, nvme) because runtime decryption > > of memory allocated through the DMA API may block. This manifests itself > > as "sleeping in invalid context" BUGs for any confidential VM user in > > cloud. > > > > I unfortunately don't have Amit's device to be able to independently debug > > this issue and certainly could not have done a better job at working the > > bug than Nicolas and Christoph have done so far. I'm as baffled by the > > results as anybody else. > > > > I fully understand the no regressions policy. I'd also ask that we > > consider that *all* SEV guests are currently broken if they use nvme or > > any other driver that does atomic DMA allocations. It's an extremely > > serious issue for cloud. If there is *anything* that I can do to make > > forward progress on this issue for 5.8, including some of the workarounds > > above that Amit requested, I'd be very happy to help. Christoph will make > > the right decision for DMA in 5.8, but I simply wanted to state how > > critical working SEV guests are to users. > > I'm between a rock and a hard place here. If we simply want to revert > commits as-is to make sure both the Raspberry Pi 4 and thone phone do > not regress we'll have to go all the way back and revert the whole SEV > pool support. I could try to manual revert of the multiple pool > support, but it is very late for that. Hi, I found the problematic memory region. It was a memory chunk reserved/removed in the downstream tree but was seemingly reserved upstream for different drivers. I failed to calculate the length of the total region reserved downstream correctly. And there was still a portion of memory left unmarked, which I should have marked as reserved in my testing earlier today. Sorry for all the noise and thanks Nicolas, Christoph and David for your patience. Regards, Amit Pundir > > Or maybe Linus has decided to cut a -rc8 which would give us a little > more time. > - ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 06/15] powerpc: fadamp: simplify fadump_reserve_crash_area()
On Thu, Jul 30, 2020 at 10:15:13PM +1000, Michael Ellerman wrote: > Mike Rapoport writes: > > From: Mike Rapoport > > > > fadump_reserve_crash_area() reserves memory from a specified base address > > till the end of the RAM. > > > > Replace iteration through the memblock.memory with a single call to > > memblock_reserve() with appropriate that will take care of proper memory > ^ > parameters? > > reservation. > > > > Signed-off-by: Mike Rapoport > > --- > > arch/powerpc/kernel/fadump.c | 20 +--- > > 1 file changed, 1 insertion(+), 19 deletions(-) > > I think this looks OK to me, but I don't have a setup to test it easily. > I've added Hari to Cc who might be able to. > > But I'll give you an ack in the hope that it works :) Actually, I did some digging in the git log and the traversal was added there on purpose by the commit b71a693d3db3 ("powerpc/fadump: exclude memory holes while reserving memory in second kernel") Presuming this is still reqruired I'm going to drop this patch and will simply replace for_each_memblock() with for_each_mem_range() in v2. > Acked-by: Michael Ellerman > > > > diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c > > index 78ab9a6ee6ac..2446a61e3c25 100644 > > --- a/arch/powerpc/kernel/fadump.c > > +++ b/arch/powerpc/kernel/fadump.c > > @@ -1658,25 +1658,7 @@ int __init fadump_reserve_mem(void) > > /* Preserve everything above the base address */ > > static void __init fadump_reserve_crash_area(u64 base) > > { > > - struct memblock_region *reg; > > - u64 mstart, msize; > > - > > - for_each_memblock(memory, reg) { > > - mstart = reg->base; > > - msize = reg->size; > > - > > - if ((mstart + msize) < base) > > - continue; > > - > > - if (mstart < base) { > > - msize -= (base - mstart); > > - mstart = base; > > - } > > - > > - pr_info("Reserving %lluMB of memory at %#016llx for preserving > > crash data", > > - (msize >> 20), mstart); > > - memblock_reserve(mstart, msize); > > - } > > + memblock_reserve(base, memblock_end_of_DRAM() - base); > > } > > > > unsigned long __init arch_reserved_kernel_pages(void) > > -- > > 2.26.2 -- Sincerely yours, Mike. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
revert scope for 5.8, was Re: dma-pool fixes
On Sat, Aug 01, 2020 at 01:20:07AM -0700, David Rientjes wrote: > To follow-up on this, the introduction of the DMA atomic pools in 5.8 > fixes an issue for any AMD SEV enabled guest that has a driver that > requires atomic DMA allocations (for us, nvme) because runtime decryption > of memory allocated through the DMA API may block. This manifests itself > as "sleeping in invalid context" BUGs for any confidential VM user in > cloud. > > I unfortunately don't have Amit's device to be able to independently debug > this issue and certainly could not have done a better job at working the > bug than Nicolas and Christoph have done so far. I'm as baffled by the > results as anybody else. > > I fully understand the no regressions policy. I'd also ask that we > consider that *all* SEV guests are currently broken if they use nvme or > any other driver that does atomic DMA allocations. It's an extremely > serious issue for cloud. If there is *anything* that I can do to make > forward progress on this issue for 5.8, including some of the workarounds > above that Amit requested, I'd be very happy to help. Christoph will make > the right decision for DMA in 5.8, but I simply wanted to state how > critical working SEV guests are to users. I'm between a rock and a hard place here. If we simply want to revert commits as-is to make sure both the Raspberry Pi 4 and thone phone do not regress we'll have to go all the way back and revert the whole SEV pool support. I could try to manual revert of the multiple pool support, but it is very late for that. Or maybe Linus has decided to cut a -rc8 which would give us a little more time. - ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: dma-pool fixes
On Fri, Jul 31, 2020 at 12:04:28PM -0700, David Rientjes wrote: > On Fri, 31 Jul 2020, Christoph Hellwig wrote: > > > > Hi Nicolas, Christoph, > > > > > > Just out of curiosity, I'm wondering if we can restore the earlier > > > behaviour and make DMA atomic allocation configured thru platform > > > specific device tree instead? > > > > > > Or if you can allow a more hackish approach to restore the earlier > > > logic, using of_machine_is_compatible() just for my device for the > > > time being. Meanwhile I'm checking with other developers running the > > > mainline kernel on sdm845 phones like OnePlus 6/6T, if they see this > > > issue too. > > > > If we don't find a fix for your platform I'm going to send Linus a > > last minute revert this weekend, to stick to the no regressions policy. > > I still hope we can fix the issue for real. > > > > What would be the scope of this potential revert? I've just looked into that and it seems like we need to revert everything pool related back ot "dma-pool: add additional coherent pools to map to gfp mask". ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: dma-pool fixes
On Fri, 31 Jul 2020, David Rientjes wrote: > > > Hi Nicolas, Christoph, > > > > > > Just out of curiosity, I'm wondering if we can restore the earlier > > > behaviour and make DMA atomic allocation configured thru platform > > > specific device tree instead? > > > > > > Or if you can allow a more hackish approach to restore the earlier > > > logic, using of_machine_is_compatible() just for my device for the > > > time being. Meanwhile I'm checking with other developers running the > > > mainline kernel on sdm845 phones like OnePlus 6/6T, if they see this > > > issue too. > > > > If we don't find a fix for your platform I'm going to send Linus a > > last minute revert this weekend, to stick to the no regressions policy. > > I still hope we can fix the issue for real. > > > > What would be the scope of this potential revert? > To follow-up on this, the introduction of the DMA atomic pools in 5.8 fixes an issue for any AMD SEV enabled guest that has a driver that requires atomic DMA allocations (for us, nvme) because runtime decryption of memory allocated through the DMA API may block. This manifests itself as "sleeping in invalid context" BUGs for any confidential VM user in cloud. I unfortunately don't have Amit's device to be able to independently debug this issue and certainly could not have done a better job at working the bug than Nicolas and Christoph have done so far. I'm as baffled by the results as anybody else. I fully understand the no regressions policy. I'd also ask that we consider that *all* SEV guests are currently broken if they use nvme or any other driver that does atomic DMA allocations. It's an extremely serious issue for cloud. If there is *anything* that I can do to make forward progress on this issue for 5.8, including some of the workarounds above that Amit requested, I'd be very happy to help. Christoph will make the right decision for DMA in 5.8, but I simply wanted to state how critical working SEV guests are to users. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3] iommu/arm-smmu-v3: permit users to disable MSI polling
Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending UDP packets in size 32KB. This patch provides a command line option so that users can decide to use MSI polling or not based on their tests. Signed-off-by: Barry Song --- -v3: * rebase on top of linux-next as arm-smmu-v3.c has moved; * provide a command line option drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 7196207be7ea..89d3cb391fef 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -418,6 +418,11 @@ module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); MODULE_PARM_DESC(disable_bypass, "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); +static bool disable_msipolling; +module_param_named(disable_msipolling, disable_msipolling, bool, S_IRUGO); +MODULE_PARM_DESC(disable_msipolling, + "Disable MSI-based polling for CMD_SYNC completion."); + enum pri_resp { PRI_RESP_DENY = 0, PRI_RESP_FAIL = 1, @@ -980,6 +985,13 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) return 0; } +static bool arm_smmu_use_msipolling(struct arm_smmu_device *smmu) +{ + return !disable_msipolling && + smmu->features & ARM_SMMU_FEAT_COHERENCY && + smmu->features & ARM_SMMU_FEAT_MSI; +} + static void arm_smmu_cmdq_build_sync_cmd(u64 *cmd, struct arm_smmu_device *smmu, u32 prod) { @@ -992,8 +1004,7 @@ static void arm_smmu_cmdq_build_sync_cmd(u64 *cmd, struct arm_smmu_device *smmu, * Beware that Hi16xx adds an extra 32 bits of goodness to its MSI * payload, so the write will zero the entire command on that platform. */ - if (smmu->features & ARM_SMMU_FEAT_MSI && - smmu->features & ARM_SMMU_FEAT_COHERENCY) { + if (arm_smmu_use_msipolling(smmu)) { ent.sync.msiaddr = q->base_dma + Q_IDX(&q->llq, prod) * q->ent_dwords * 8; } @@ -1332,8 +1343,7 @@ static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu, static int arm_smmu_cmdq_poll_until_sync(struct arm_smmu_device *smmu, struct arm_smmu_ll_queue *llq) { - if (smmu->features & ARM_SMMU_FEAT_MSI && - smmu->features & ARM_SMMU_FEAT_COHERENCY) + if (arm_smmu_use_msipolling(smmu)) return __arm_smmu_cmdq_poll_until_msi(smmu, llq); return __arm_smmu_cmdq_poll_until_consumed(smmu, llq); -- 2.27.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: dma-pool fixes
On Fri, 31 Jul 2020 at 19:50, Amit Pundir wrote: > > On Fri, 31 Jul 2020 at 19:45, Nicolas Saenz Julienne > wrote: > > > > On Fri, 2020-07-31 at 16:47 +0530, Amit Pundir wrote: > > > On Fri, 31 Jul 2020 at 16:17, Nicolas Saenz Julienne > > > > [...] > > > > > > Ok, so lets see who's doing what and with what constraints: > > > > > > Here is the relevant dmesg log: https://pastebin.ubuntu.com/p/dh3pPnxS2v/ > > > > Sadly nothing out of the ordinary, looks reasonable. > > > > I have an idea, I've been going over the downstream device tree and it seems > > the reserved-memory entries, specially the ones marked with 'no-map' don't > > fully match what we have upstream. On top of that all these reserved areas > > seem > > to fall into ZONE_DMA. > > > > So, what could be happening is that, while allocating pages for the ZONE_DMA > > atomic pool, something in the page allocator is either writing/mapping into > > a > > reserved area triggering some kind of fault. > > > > Amir, could you go over the no-map reserved-memory entries in the downstream > > device-tree, both in 'beryllium-*.dtsi' (I think those are the relevant > > ones) > > and 'sdm845.dtsi'[1], and make sure they match what you are using. If not > > just > > edit them in and see if it helps. If you need any help with that I'll be > > happy > > to give you a hand. > > Thank you for the pointers. I'll try to match my dts' reserved-memory > entries with the downstream dts. I'll let you know how it goes. > I matched my dts's reserved-memory nodes with downstream but it didn't help. Most of the no-map reserved memory regions in the downstream kernel are accompanied with "removed-dma-pool" compatibility, "to indicate a region of memory which is meant to be carved out and not exposed to kernel." [1][2]. Is this something which might be tripping my device off? I tried to cherry-pick removed-dma-pool from msm kernel[3], to see if that makes any difference but I might have missed a few dependencies and my device didn't boot. [1] https://android.googlesource.com/kernel/msm/+/e9171c1c69c31ec2226f0871fb5535b6f2044ef3%5E%21/#F0 [2] https://lore.kernel.org/patchwork/patch/670515/#857952 [3] https://github.com/OnePlusOSS/android_kernel_oneplus_sm8250/commit/a478a8bf78ade799a5626cee45c2b247071b325f > Regards, > Amit Pundir > > > > > Regards, > > Nicolas > > > > [1] You could also extract the device tree from a device running with the > > downstream kernel, whatever is easier for you. > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu