Re: Consult on ARM SMMU debugfs
On Mon, Jan 11, 2021 at 08:01:48PM +, Robin Murphy wrote:
> On 2021-01-07 02:45, chenxiang (M) wrote:
> > Hi Will, Robin or other guys,
> >
> > When debugging SMMU/SVA issues on a huawei ARM64 board, we find that
> > the ARM SMMU driver lacks sufficient debugfs support (such as the
> > value of the STE/CD, which we sometimes need to check). Currently a
> > top-level iommu directory is created in debugfs, but there is nothing
> > specific to the ARM SMMU driver. Do you know whether ARM has any plans
> > to add that?
>
> FWIW I don't think I've ever felt the need to inspect the Stream Table
> on a live system. So far the nature of the STE code has been simple
> enough that it's very hard for any given STE to be *wrong* - either it's
> set up as expected and thus works fine, or it's not initialised at all
> and you get C_BAD_STE, where 99% of the time you then just
> cross-reference the Stream ID against the firmware and find that the
> DT/IORT is wrong.
>
> Similarly I don't think I've ever even *seen* an issue that could be
> attributed to a context descriptor, although I appreciate that as we
> start landing more PASID and SVA support the scope for that starts to
> widen considerably.
>
> Feel free to propose a patch if you believe it would be genuinely useful
> and won't just bit-rot into a maintenance burden, but it's not something
> that's on our roadmap here.

I do think that the IOMMU stuff needs better debugging. I've hit the
WARN_ON() in __arm_lpae_map(), and it's been pretty much undebuggable, so
I've resorted to putting the IOMMU into bypass mode permanently to work
around the issue.

The reason it's undebuggable is that if one puts printk() or trace
statements in the code and boots the platform, you get flooded with those
debugging messages, because every access to the rootfs creates and tears
down a mapping.

It would be nice to be able to inspect the IOMMU page tables and the state
of the IOMMU, rather than having to resort to effectively disabling the
IOMMU.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/4] ARM/omap1: switch to use dma_direct_set_offset for lbus DMA offsets
On Mon, Sep 21, 2020 at 08:47:23AM +0200, Christoph Hellwig wrote:
> On Mon, Sep 21, 2020 at 09:44:18AM +0300, Tony Lindgren wrote:
> > * Janusz Krzysztofik [200919 22:29]:
> > > Hi Tony,
> > >
> > > On Friday, September 18, 2020 7:49:33 A.M. CEST Tony Lindgren wrote:
> > > > * Christoph Hellwig [200917 17:37]:
> > > > > Switch the omap1510 platform ohci device to use
> > > > > dma_direct_set_offset to set the DMA offset instead of using
> > > > > direct hooks into the DMA mapping code and remove the now
> > > > > unused hooks.
> > > >
> > > > Looks nice to me :) I still can't test this for probably a few more
> > > > weeks though, but hopefully Aaro or Janusz (added to Cc) can test it.
> > >
> > > Works for me on Amstrad Delta (tested with a USB ethernet adapter).
> > >
> > > Tested-by: Janusz Krzysztofik
> >
> > Great, good to hear! And thanks for testing it.
> >
> > Christoph, feel free to queue this along with the other patches:
> >
> > Acked-by: Tony Lindgren
> >
> > Or let me know if you want me to pick it up.
>
> I'd prefer to pick it up through the dma-mapping tree, but preferably
> I'd pick the whole series up once Russell has tested footwinder.

I don't think that's going to happen very soon... it seems way too much
effort to pull down the appropriate tree to build and test. Sorry.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH 2/4] ARM/footbridge: switch to use dma_direct_set_offset for lbus DMA offsets
On Thu, Sep 17, 2020 at 07:32:27PM +0200, Christoph Hellwig wrote:
>  static int __init cats_pci_init(void)
>  {
> -	if (machine_is_cats())
> -		pci_common_init(&cats_pci);
> +	if (!machine_is_cats())
> +		return 0;
> +	bus_register_notifier(&pci_bus_type, &footbridge_pci_dma_nb);
> +	pci_common_init(&cats_pci);

I'd prefer these things to retain a positive-logic construct, so:

	if (machine_is_cats()) {
		bus_register_notifier(&pci_bus_type, &footbridge_pci_dma_nb);
		pci_common_init(&cats_pci);
	}

It's the same number of lines. Otherwise, I think it's fine. I'll try to
find some spare time to give it a go on a Netwinder.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH 3/4] ARM/dma-mapping: don't handle NULL devices in dma-direct.h
On Thu, Sep 17, 2020 at 07:32:28PM +0200, Christoph Hellwig wrote:
> The DMA API removed support for not passing in a device a long time
> ago, so remove the NULL checks.

What happens with ISA devices?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE
On Thu, Sep 10, 2020 at 07:40:37AM +0200, Christoph Hellwig wrote:
> The DMA offset notifier can only be used if PHYS_OFFSET is at least
> KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit
> phys_addr_t. Currently the code compiles fine despite that, but a
> pending change to the DMA offset handling would create a compiler
> warning for this case. Add an ifdef to not compile the code except
> for LPAE configs.

However, to make use of the high physical offset, LPAE needs to be
enabled, which ensures that phys_addr_t is 64-bit. I believe that DMA is
non-coherent on this platform unless the high physical address is used.
Or something like that.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: decruft the vmalloc API
On Wed, Apr 08, 2020 at 01:58:58PM +0200, Christoph Hellwig wrote:
> Hi all,
>
> Peter noticed that with some dumb luck you can toast the kernel address
> space with exported vmalloc symbols.
>
> I used this as an opportunity to decruft the vmalloc.c API and make it
> much more systematic. This also removes any chance to create vmalloc
> mappings outside the designated areas or using executable permissions
> from modules. Besides that it removes more than 300 lines of code.

I haven't read all your patches yet. Have you tested it on 32-bit ARM,
where the module area is located _below_ PAGE_OFFSET and outside of the
vmalloc area?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up
Re: [PATCH] iommu: silence iommu group prints
On Wed, Mar 04, 2020 at 12:33:14PM +0200, Laurentiu Tudor wrote: > > > On 04.03.2020 12:07, Russell King - ARM Linux admin wrote: > > On Wed, Mar 04, 2020 at 11:56:53AM +0200, Laurentiu Tudor wrote: > > > From 44ae46501b5379bd0890df87fd3827248626ed03 Mon Sep 17 00:00:00 2001 > > > From: Laurentiu Tudor > > > Date: Tue, 1 Oct 2019 16:22:48 +0300 > > > Subject: [PATCH 1/6] bus: fsl-mc: make mc work with smmu disable bypass on > > > Content-Type: text/plain; charset="us-ascii" > > > > > > Since this commit [1] booting kernel on MC based chips started to > > > fail because this firmware starts before the kernel and as soon as > > > SMMU is probed it starts to trigger contiguous faults. > > > > I think you mean "continuous" here. > > Yes, thanks. > > > > This is a workaround that allows MC firmware to run with SMMU > > > in disable bypass mode. It consists of the following steps: > > > 1. pause the firmware at early boot to get a chance to setup SMMU > > > 2. request direct mapping for MC device > > > 3. resume the firmware > > > The workaround relies on the fact that no state is lost when > > > pausing / resuming firmware, as per the docs. > > > With this patch, platforms with MC firmware can now boot without > > > having to change the default config to set: > > >CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=n > > > > This alone is a definite improvement, and has been needed for a while. > > Please submit this patch with an appropriate Fixes: tag so stable trees > > can pick it up. > > The thing is that probably this workaround will never make it in the kernel > because it questionable to say the least, e.g. see [1]. My plan is to give > this approach [2] a try sometime soon. > > [1] https://patchwork.kernel.org/comment/23149049/ > [2] https://patchwork.kernel.org/cover/11279577/ So, if we want to reduce the iommu noise, we need to completely break the ability to use a mainline kernel on the LX2160A. This doesn't seem practical nor sensible. Someone has to give. 
-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH] iommu: silence iommu group prints
On Wed, Mar 04, 2020 at 11:56:53AM +0200, Laurentiu Tudor wrote:
> From 44ae46501b5379bd0890df87fd3827248626ed03 Mon Sep 17 00:00:00 2001
> From: Laurentiu Tudor
> Date: Tue, 1 Oct 2019 16:22:48 +0300
> Subject: [PATCH 1/6] bus: fsl-mc: make mc work with smmu disable bypass on
> Content-Type: text/plain; charset="us-ascii"
>
> Since this commit [1] booting kernel on MC based chips started to
> fail because this firmware starts before the kernel and as soon as
> SMMU is probed it starts to trigger contiguous faults.

I think you mean "continuous" here.

> This is a workaround that allows MC firmware to run with SMMU
> in disable bypass mode. It consists of the following steps:
> 1. pause the firmware at early boot to get a chance to setup SMMU
> 2. request direct mapping for MC device
> 3. resume the firmware
> The workaround relies on the fact that no state is lost when
> pausing / resuming firmware, as per the docs.
> With this patch, platforms with MC firmware can now boot without
> having to change the default config to set:
>    CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=n

This alone is a definite improvement, and has been needed for a while.
Please submit this patch with an appropriate Fixes: tag so stable trees
can pick it up.

> [1] 954a03be033 ("iommu/arm-smmu: Break insecure users by disabling
>     bypass by default")

Please put this where you're referencing it above - it's fine to wrap the
description of the commit when using it in the body of the commit message.
However, that should _never_ be done when providing a Fixes: tag
(linux-next has a script which will detect and complain about broken
Fixes: tags.)

Thanks.
> > Signed-off-by: Laurentiu Tudor > --- > drivers/bus/fsl-mc/fsl-mc-bus.c | 53 + > 1 file changed, 53 insertions(+) > > diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c > index a0f8854acb3a..683a6401ffe8 100644 > --- a/drivers/bus/fsl-mc/fsl-mc-bus.c > +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c > @@ -18,6 +18,8 @@ > #include > #include > #include > +#include > +#include > > #include "fsl-mc-private.h" > > @@ -889,6 +891,12 @@ static int get_mc_addr_translation_ranges(struct device > *dev, > return 0; > } > > +#define FSL_MC_GCR1 0x0 > +#define GCR1_P1_STOP BIT(31) > + > +static u32 boot_gcr1; > +static void __iomem *fsl_mc_regs; > + > /** > * fsl_mc_bus_probe - callback invoked when the root MC bus is being > * added > @@ -906,6 +914,21 @@ static int fsl_mc_bus_probe(struct platform_device *pdev) > struct mc_version mc_version; > struct resource res; > > + /* > + * The MC firmware requires full access to the whole address space plus > + * it has no way of dealing with any kind of address translation, so > + * request direct mapping for it. > + */ > + error = iommu_request_dm_for_dev(&pdev->dev); > + if (error) > + dev_warn(&pdev->dev, "iommu_request_dm_for_dev() failed: %d\n", > + error); > + > + if (fsl_mc_regs) { > + /* Resume the firmware */ > + writel(boot_gcr1 & ~GCR1_P1_STOP, fsl_mc_regs + FSL_MC_GCR1); > + } > + > mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL); > if (!mc) > return -ENOMEM; > @@ -990,6 +1013,13 @@ static int fsl_mc_bus_remove(struct platform_device > *pdev) > if (!fsl_mc_is_root_dprc(&mc->root_mc_bus_dev->dev)) > return -EINVAL; > > + /* > + * Pause back the firmware so that it doesn't trigger faults by the > + * time SMMU gets brought down. 
> + */ > + writel(boot_gcr1 | GCR1_P1_STOP, fsl_mc_regs + FSL_MC_GCR1); > + iounmap(fsl_mc_regs); > + > fsl_mc_device_remove(mc->root_mc_bus_dev); > > fsl_destroy_mc_io(mc->root_mc_bus_dev->mc_io); > @@ -1018,6 +1048,8 @@ static struct platform_driver fsl_mc_bus_driver = { > static int __init fsl_mc_bus_driver_init(void) > { > int error; > + struct device_node *np; > + struct resource res; > > error = bus_register(&fsl_mc_bus_type); > if (error < 0) { > @@ -1039,9 +1071,30 @@ static int __init fsl_mc_bus_driver_init(void) > if (error < 0) > goto error_cleanup_dprc_driver; > > + np = of_find_matching_node(NULL, fsl_mc_bus_match_table); > + if (np && of_device_is_available(np)) { > + error = of_address_to_resource(np, 1, &res); > + if (error) > + goto error_cleanup_dprc_driver; > + fsl_mc_regs = ioremap(res.start, resource_size(&res)); > + if (!fsl_mc_regs) { > + error = -ENXIO; > + goto error_cleanup_dprc_driver; > + } > + > + boot_gcr1 = readl(fsl_mc_regs + FSL_MC_GCR1); > + /* > + * If found running, pause MC firmware in order to get a chance > + * to setup SMMU for it. > + */ > +
Re: [PATCH] iommu: silence iommu group prints
On Wed, Mar 04, 2020 at 11:42:16AM +0200, Laurentiu Tudor wrote: > On 04.03.2020 11:33, Russell King - ARM Linux admin wrote: > > On Wed, Mar 04, 2020 at 10:56:06AM +0200, Laurentiu Tudor wrote: > > > > > > On 04.03.2020 00:17, Russell King - ARM Linux admin wrote: > > > > On Tue, Mar 03, 2020 at 05:55:05PM +0200, Laurentiu Tudor wrote: > > > > > From c98dc05cdd45ae923654f2427985bd28bcde4bb2 Mon Sep 17 00:00:00 > > > > > 2001 > > > > > From: Laurentiu Tudor > > > > > Date: Thu, 13 Feb 2020 11:59:12 +0200 > > > > > Subject: [PATCH 1/4] bus: fsl-mc: add custom .dma_configure > > > > > implementation > > > > > Content-Type: text/plain; charset="us-ascii" > > > > > > > > > > The devices on this bus are not discovered by way of device tree > > > > > but by queries to the firmware. It makes little sense to trick the > > > > > generic of layer into thinking that these devices are of related so > > > > > that we can get our dma configuration. Instead of doing that, add > > > > > our custom dma configuration implementation. > > > > > > > > Firstly, applying this to v5.5 results in a build failure, due to a > > > > missing linux/iommu.h include. > > > > > > > > Secondly, this on its own appears to make the DPAA2 network interfaces > > > > completely disappear. Looking in /sys/bus/fsl-mc/drivers/*, none of > > > > the DPAA2 drivers are bound to anything, and looking in > > > > /sys/bus/fsl-mc/devices/, there is: > > > > > > > > lrwxrwxrwx 1 root root 0 Mar 3 22:06 dprc.1 -> > > > > ../../../devices/platform/soc/80c00.fsl-mc/dprc.1 > > > > > > > > This is booting with u-boot, so using DT rather than ACPI. 
> > > > > > > > > Signed-off-by: Laurentiu Tudor > > > > > --- > > > > >drivers/bus/fsl-mc/fsl-mc-bus.c | 42 > > > > > - > > > > >1 file changed, 41 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c > > > > > b/drivers/bus/fsl-mc/fsl-mc-bus.c > > > > > index 36eb25f82c8e..3df015eedae4 100644 > > > > > --- a/drivers/bus/fsl-mc/fsl-mc-bus.c > > > > > +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c > > > > > @@ -132,11 +132,51 @@ static int fsl_mc_bus_uevent(struct device > > > > > *dev, struct kobj_uevent_env *env) > > > > >static int fsl_mc_dma_configure(struct device *dev) > > > > >{ > > > > > struct device *dma_dev = dev; > > > > > + struct iommu_fwspec *fwspec; > > > > > + const struct iommu_ops *iommu_ops; > > > > > + struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev); > > > > > + int ret; > > > > > + u32 icid; > > > > > + > > > > > + /* Skip DMA setup for devices that are not DMA masters */ > > > > > + if (dev->type == &fsl_mc_bus_dpmcp_type || > > > > > + dev->type == &fsl_mc_bus_dpbp_type || > > > > > + dev->type == &fsl_mc_bus_dpcon_type || > > > > > + dev->type == &fsl_mc_bus_dpio_type) > > > > > + return 0; > > > > > while (dev_is_fsl_mc(dma_dev)) > > > > > dma_dev = dma_dev->parent; > > > > > - return of_dma_configure(dev, dma_dev->of_node, 0); > > > > > + fwspec = dev_iommu_fwspec_get(dma_dev); > > > > > + if (!fwspec) > > > > > + return -ENODEV; > > > > > > > > The problem appears to be here - fwspec is NULL for dprc.1. > > > > > > Ok, that's because the iommu config is missing from the DT node that's > > > exposing the MC firmware. I've attached a fresh set of patches that > > > include > > > on top the missing config and a workaround that makes MC work over SMMU. > > > Also added the missing #include, thanks for pointing out. > > > Let me know how it goes. > > > > Shouldn't patch 6 should be first in the series, so that patch 1 does > > not cause a regression and bisectability damage? 
> Correct, with one comment: both 5 and 6 should go first. Once iommu-map
> is added in the device tree, the workaround for MC with the
> arm-smmu.disable_bypass boot arg will no longer work.

So, wouldn't it be reasonable to arrange the patch series like that?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH] iommu: silence iommu group prints
On Wed, Mar 04, 2020 at 10:56:06AM +0200, Laurentiu Tudor wrote: > > On 04.03.2020 00:17, Russell King - ARM Linux admin wrote: > > On Tue, Mar 03, 2020 at 05:55:05PM +0200, Laurentiu Tudor wrote: > > > From c98dc05cdd45ae923654f2427985bd28bcde4bb2 Mon Sep 17 00:00:00 2001 > > > From: Laurentiu Tudor > > > Date: Thu, 13 Feb 2020 11:59:12 +0200 > > > Subject: [PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation > > > Content-Type: text/plain; charset="us-ascii" > > > > > > The devices on this bus are not discovered by way of device tree > > > but by queries to the firmware. It makes little sense to trick the > > > generic of layer into thinking that these devices are of related so > > > that we can get our dma configuration. Instead of doing that, add > > > our custom dma configuration implementation. > > > > Firstly, applying this to v5.5 results in a build failure, due to a > > missing linux/iommu.h include. > > > > Secondly, this on its own appears to make the DPAA2 network interfaces > > completely disappear. Looking in /sys/bus/fsl-mc/drivers/*, none of > > the DPAA2 drivers are bound to anything, and looking in > > /sys/bus/fsl-mc/devices/, there is: > > > > lrwxrwxrwx 1 root root 0 Mar 3 22:06 dprc.1 -> > > ../../../devices/platform/soc/80c00.fsl-mc/dprc.1 > > > > This is booting with u-boot, so using DT rather than ACPI. 
> > > > > Signed-off-by: Laurentiu Tudor > > > --- > > > drivers/bus/fsl-mc/fsl-mc-bus.c | 42 - > > > 1 file changed, 41 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c > > > b/drivers/bus/fsl-mc/fsl-mc-bus.c > > > index 36eb25f82c8e..3df015eedae4 100644 > > > --- a/drivers/bus/fsl-mc/fsl-mc-bus.c > > > +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c > > > @@ -132,11 +132,51 @@ static int fsl_mc_bus_uevent(struct device *dev, > > > struct kobj_uevent_env *env) > > > static int fsl_mc_dma_configure(struct device *dev) > > > { > > > struct device *dma_dev = dev; > > > + struct iommu_fwspec *fwspec; > > > + const struct iommu_ops *iommu_ops; > > > + struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev); > > > + int ret; > > > + u32 icid; > > > + > > > + /* Skip DMA setup for devices that are not DMA masters */ > > > + if (dev->type == &fsl_mc_bus_dpmcp_type || > > > + dev->type == &fsl_mc_bus_dpbp_type || > > > + dev->type == &fsl_mc_bus_dpcon_type || > > > + dev->type == &fsl_mc_bus_dpio_type) > > > + return 0; > > > while (dev_is_fsl_mc(dma_dev)) > > > dma_dev = dma_dev->parent; > > > - return of_dma_configure(dev, dma_dev->of_node, 0); > > > + fwspec = dev_iommu_fwspec_get(dma_dev); > > > + if (!fwspec) > > > + return -ENODEV; > > > > The problem appears to be here - fwspec is NULL for dprc.1. > > Ok, that's because the iommu config is missing from the DT node that's > exposing the MC firmware. I've attached a fresh set of patches that include > on top the missing config and a workaround that makes MC work over SMMU. > Also added the missing #include, thanks for pointing out. > Let me know how it goes. Shouldn't patch 6 should be first in the series, so that patch 1 does not cause a regression and bisectability damage? 
-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH] iommu: silence iommu group prints
On Tue, Mar 03, 2020 at 05:55:05PM +0200, Laurentiu Tudor wrote: > From c98dc05cdd45ae923654f2427985bd28bcde4bb2 Mon Sep 17 00:00:00 2001 > From: Laurentiu Tudor > Date: Thu, 13 Feb 2020 11:59:12 +0200 > Subject: [PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation > Content-Type: text/plain; charset="us-ascii" > > The devices on this bus are not discovered by way of device tree > but by queries to the firmware. It makes little sense to trick the > generic of layer into thinking that these devices are of related so > that we can get our dma configuration. Instead of doing that, add > our custom dma configuration implementation. Firstly, applying this to v5.5 results in a build failure, due to a missing linux/iommu.h include. Secondly, this on its own appears to make the DPAA2 network interfaces completely disappear. Looking in /sys/bus/fsl-mc/drivers/*, none of the DPAA2 drivers are bound to anything, and looking in /sys/bus/fsl-mc/devices/, there is: lrwxrwxrwx 1 root root 0 Mar 3 22:06 dprc.1 -> ../../../devices/platform/soc/80c00.fsl-mc/dprc.1 This is booting with u-boot, so using DT rather than ACPI. 
> Signed-off-by: Laurentiu Tudor > --- > drivers/bus/fsl-mc/fsl-mc-bus.c | 42 - > 1 file changed, 41 insertions(+), 1 deletion(-) > > diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c > index 36eb25f82c8e..3df015eedae4 100644 > --- a/drivers/bus/fsl-mc/fsl-mc-bus.c > +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c > @@ -132,11 +132,51 @@ static int fsl_mc_bus_uevent(struct device *dev, struct > kobj_uevent_env *env) > static int fsl_mc_dma_configure(struct device *dev) > { > struct device *dma_dev = dev; > + struct iommu_fwspec *fwspec; > + const struct iommu_ops *iommu_ops; > + struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev); > + int ret; > + u32 icid; > + > + /* Skip DMA setup for devices that are not DMA masters */ > + if (dev->type == &fsl_mc_bus_dpmcp_type || > + dev->type == &fsl_mc_bus_dpbp_type || > + dev->type == &fsl_mc_bus_dpcon_type || > + dev->type == &fsl_mc_bus_dpio_type) > + return 0; > > while (dev_is_fsl_mc(dma_dev)) > dma_dev = dma_dev->parent; > > - return of_dma_configure(dev, dma_dev->of_node, 0); > + fwspec = dev_iommu_fwspec_get(dma_dev); > + if (!fwspec) > + return -ENODEV; The problem appears to be here - fwspec is NULL for dprc.1. 
> + iommu_ops = iommu_ops_from_fwnode(fwspec->iommu_fwnode); > + if (!iommu_ops) > + return -ENODEV; > + > + ret = iommu_fwspec_init(dev, fwspec->iommu_fwnode, iommu_ops); > + if (ret) > + return ret; > + > + icid = mc_dev->icid; > + ret = iommu_fwspec_add_ids(dev, &icid, 1); > + if (ret) { > + iommu_fwspec_free(dev); > + return ret; > + } > + > + if (!device_iommu_mapped(dev)) { > + ret = iommu_probe_device(dev); > + if (ret) { > + iommu_fwspec_free(dev); > + return ret; > + } > + } > + > + arch_setup_dma_ops(dev, 0, *dma_dev->dma_mask + 1, iommu_ops, true); > + > + return 0; > } > > static ssize_t modalias_show(struct device *dev, struct device_attribute > *attr, > -- > 2.17.1 > -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: silence iommu group prints
On Tue, Mar 03, 2020 at 04:18:57PM +0200, Laurentiu Tudor wrote: > > > On 28.02.2020 20:32, Robin Murphy wrote: > > [ +Laurentiu ] > > > > Hi Russell, > > > > Thanks for sharing a log, now I properly understand what's up... further > > comments at the end (for context). > > > > On 28/02/2020 10:06 am, Russell King - ARM Linux admin wrote: > > > On Fri, Feb 28, 2020 at 09:33:40AM +, John Garry wrote: > > > > On 28/02/2020 02:16, Lu Baolu wrote: > > > > > Hi, > > > > > > > > > > On 2020/2/27 19:57, Russell King wrote: > > > > > > On the LX2160A, there are lots (about 160) of IOMMU messages > > > > > > produced > > > > > > during boot; this is excessive. Reduce the severity of these > > > > > > messages > > > > > > to debug level. > > > > > > > > > > > > Signed-off-by: Russell King > > > > > > --- > > > > > > drivers/iommu/iommu.c | 4 ++-- > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > > > > > > index 3ead597e1c57..304281ec623b 100644 > > > > > > --- a/drivers/iommu/iommu.c > > > > > > +++ b/drivers/iommu/iommu.c > > > > > > @@ -741,7 +741,7 @@ int iommu_group_add_device(struct iommu_group > > > > > > *group, struct device *dev) > > > > > > trace_add_device_to_group(group->id, dev); > > > > > > - dev_info(dev, "Adding to iommu group %d\n", group->id); > > > > > > + dev_dbg(dev, "Adding to iommu group %d\n", group->id); > > > > > > > > > > I'm not strongly against this. But to me this message seems > > > > > to be a good > > > > > indicator that a device was probed successfully by the iommu > > > > > subsystem. > > > > > Keeping it in the default kernel message always helps to the kernel > > > > > debugging. > > > > > > > > > > > > > I would tend to agree. > > > > > > Here's the boot messages. 
Notice how many of these "Adding to iommu > > > group" messages there are: > > > > > > [ 0.00] Booting Linux on physical CPU 0x00 [0x410fd083] > > > [ 0.00] Linux version 5.5.0+ (rmk@rmk-PC) (gcc version 4.9.2 > > > (GCC)) #655 SMP PREEMPT Fri Feb 28 09:54:47 GMT 2020 > > > [ 0.00] Machine model: SolidRun LX2160A Clearfog CX > > > [ 0.00] earlycon: pl11 at MMIO32 0x021c (options '') > > > [ 0.00] printk: bootconsole [pl11] enabled > > > [ 0.00] efi: Getting EFI parameters from FDT: > > > [ 0.00] efi: UEFI not found. > > > [ 0.00] cma: Reserved 32 MiB at 0xf9c0 > > > [ 0.00] On node 0 totalpages: 1555968 > > > [ 0.00] DMA zone: 4096 pages used for memmap > > > [ 0.00] DMA zone: 0 pages reserved > > > [ 0.00] DMA zone: 262144 pages, LIFO batch:63 > > > [ 0.00] DMA32 zone: 3832 pages used for memmap > > > [ 0.00] DMA32 zone: 245248 pages, LIFO batch:63 > > > [ 0.00] Normal zone: 16384 pages used for memmap > > > [ 0.00] Normal zone: 1048576 pages, LIFO batch:63 > > > [ 0.00] psci: probing for conduit method from DT. > > > [ 0.00] psci: PSCIv1.1 detected in firmware. > > > [ 0.00] psci: Using standard PSCI v0.2 function IDs > > > [ 0.00] psci: MIGRATE_INFO_TYPE not supported. > > > [ 0.00] psci: SMC Calling Convention v1.1 > > > [ 0.00] percpu: Embedded 31 pages/cpu s88968 r8192 d29816 u126976 > > > [ 0.00] pcpu-alloc: s88968 r8192 d29816 u126976 alloc=31*4096 > > > [ 0.00] pcpu-alloc: [0] 00 [0] 01 [0] 02 [0] 03 [0] 04 [0] 05 > > > [0] 06 [0] 07 > > > [ 0.00] pcpu-alloc: [0] 08 [0] 09 [0] 10 [0] 11 [0] 12 [0] 13 > > > [0] 14 [0] 15 > > > [ 0.00] Detected PIPT I-cache on CPU0 > > > [ 0.00] CPU features: detected: GIC system register CPU interface > > > [ 0.00] CPU features: detected: EL2 vector hardening > > > [ 0.00] Speculative Store Bypass Disable mitigation not required > > > [ 0.00] CPU features: detected: ARM erratum
Re: [PATCH] iommu: silence iommu group prints
On Fri, Feb 28, 2020 at 09:33:40AM +, John Garry wrote: > On 28/02/2020 02:16, Lu Baolu wrote: > > Hi, > > > > On 2020/2/27 19:57, Russell King wrote: > > > On the LX2160A, there are lots (about 160) of IOMMU messages produced > > > during boot; this is excessive. Reduce the severity of these messages > > > to debug level. > > > > > > Signed-off-by: Russell King > > > --- > > > drivers/iommu/iommu.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > > > index 3ead597e1c57..304281ec623b 100644 > > > --- a/drivers/iommu/iommu.c > > > +++ b/drivers/iommu/iommu.c > > > @@ -741,7 +741,7 @@ int iommu_group_add_device(struct iommu_group > > > *group, struct device *dev) > > > trace_add_device_to_group(group->id, dev); > > > - dev_info(dev, "Adding to iommu group %d\n", group->id); > > > + dev_dbg(dev, "Adding to iommu group %d\n", group->id); > > > > I'm not strongly against this. But to me this message seems to be a good > > indicator that a device was probed successfully by the iommu subsystem. > > Keeping it in the default kernel message always helps to the kernel > > debugging. > > > > I would tend to agree. Here's the boot messages. Notice how many of these "Adding to iommu group" messages there are: [0.00] Booting Linux on physical CPU 0x00 [0x410fd083] [0.00] Linux version 5.5.0+ (rmk@rmk-PC) (gcc version 4.9.2 (GCC)) #655 SMP PREEMPT Fri Feb 28 09:54:47 GMT 2020 [0.00] Machine model: SolidRun LX2160A Clearfog CX [0.00] earlycon: pl11 at MMIO32 0x021c (options '') [0.00] printk: bootconsole [pl11] enabled [0.00] efi: Getting EFI parameters from FDT: [0.00] efi: UEFI not found. 
[0.00] cma: Reserved 32 MiB at 0xf9c0 [0.00] On node 0 totalpages: 1555968 [0.00] DMA zone: 4096 pages used for memmap [0.00] DMA zone: 0 pages reserved [0.00] DMA zone: 262144 pages, LIFO batch:63 [0.00] DMA32 zone: 3832 pages used for memmap [0.00] DMA32 zone: 245248 pages, LIFO batch:63 [0.00] Normal zone: 16384 pages used for memmap [0.00] Normal zone: 1048576 pages, LIFO batch:63 [0.00] psci: probing for conduit method from DT. [0.00] psci: PSCIv1.1 detected in firmware. [0.00] psci: Using standard PSCI v0.2 function IDs [0.00] psci: MIGRATE_INFO_TYPE not supported. [0.00] psci: SMC Calling Convention v1.1 [0.00] percpu: Embedded 31 pages/cpu s88968 r8192 d29816 u126976 [0.00] pcpu-alloc: s88968 r8192 d29816 u126976 alloc=31*4096 [0.00] pcpu-alloc: [0] 00 [0] 01 [0] 02 [0] 03 [0] 04 [0] 05 [0] 06 [0] 07 [0.00] pcpu-alloc: [0] 08 [0] 09 [0] 10 [0] 11 [0] 12 [0] 13 [0] 14 [0] 15 [0.00] Detected PIPT I-cache on CPU0 [0.00] CPU features: detected: GIC system register CPU interface [0.00] CPU features: detected: EL2 vector hardening [0.00] Speculative Store Bypass Disable mitigation not required [0.00] CPU features: detected: ARM erratum 1319367 [0.00] Built 1 zonelists, mobility grouping on. Total pages: 1531656 [0.00] Kernel command line: console=ttyAMA0,115200 root=PARTUUID=c7837e2f-02 rootwait earlycon=pl011,mmio32,0x21c ramdisk_size=0 pci=pcie_bus_perf arm_smmu.disable_bypass=0 iommu.passthrough=0 [0.00] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear) [0.00] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear) [0.00] mem auto-init: stack:off, heap alloc:off, heap free:off [0.00] software IO TLB: mapped [mem 0xbbfff000-0xb000] (64MB) [0.00] Memory: 5991500K/6223872K available (10172K kernel code, 1376K rwdata, 3888K rodata, 960K init, 4326K bss, 199604K reserved, 32768K cma-reserved) [0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1 [0.00] rcu: Preemptible hierarchical RCU implementation. 
[0.00] rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=16.
[0.00] Tasks RCU enabled.
[0.00] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=16
[0.00] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[0.00] GICv3: GIC: Using split EOI/Deactivate mode
[0.00] GICv3: 256 SPIs implemented
[0.00] GICv3: 0 Extended SPIs implemented
[0.00] GICv3: Distributor has no Range Selector support
[0.00] GICv3: 16 PPIs implemented
[0.00] GICv3: no VLPI support, no direct LPI support
[0.00] GICv3: CPU0: found redistributor 0 region 0:0x0620
[0.00] ITS [mem 0x0602-0x0603]
[0.00] ITS@0x0602: allocated 65536 Devices @2178d0 (flat, esz 8, psz 64K, shr 0)
[0.00] ITS:
Re: [PATCH] iommu: silence iommu group prints
On Thu, Feb 27, 2020 at 06:19:10PM +, Robin Murphy wrote: > On 27/02/2020 1:48 pm, Russell King - ARM Linux admin wrote: > > On Thu, Feb 27, 2020 at 01:44:56PM +, Robin Murphy wrote: > > > On 27/02/2020 11:57 am, Russell King wrote: > > > > On the LX2160A, there are lots (about 160) of IOMMU messages produced > > > > during boot; this is excessive. Reduce the severity of these messages > > > > to debug level. > > > > > > That's... a lot. Does the system really have that many devices, or is some > > > driver being stupid and repeatedly populating and destroying an entire bus > > > in a probe-deferral dance? > > > > It's all the devices created by the mc-bus for the DPAA2 > > networking support. I don't know the technicalities, just that > > the boot is spammed with these messages. > > Well, the "technicalities" are really just whether the thing before the > colon on each message is unique or not. If you're seeing multiple add and > remove calls pertaining to the same device (or frankly any remove calls at > all during boot) then it smacks of something somewhere wasting time and > resources with unnecessary busywork, which is indicative of either poor > design or an actual bug, either of which would deserve fixing. It's not the same device. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH] iommu: silence iommu group prints
On Thu, Feb 27, 2020 at 01:44:56PM +, Robin Murphy wrote: > On 27/02/2020 11:57 am, Russell King wrote: > > On the LX2160A, there are lots (about 160) of IOMMU messages produced > > during boot; this is excessive. Reduce the severity of these messages > > to debug level. > > That's... a lot. Does the system really have that many devices, or is some > driver being stupid and repeatedly populating and destroying an entire bus > in a probe-deferral dance? It's all the devices created by the mc-bus for the DPAA2 networking support. I don't know the technicalities, just that the boot is spammed with these messages. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH] iommu/arm-smmu: fix the module name for disable_bypass parameter
On Tue, Feb 11, 2020 at 05:36:55PM -0600, Li Yang wrote: > Since commit cd221bd24ff5 ("iommu/arm-smmu: Allow building as a module"), > there is a side effect that the module name is changed from arm-smmu to > arm-smmu-mod. So the kernel parameter for disable_bypass needs to be > changed too. Fix the Kconfig help and error message to the correct > parameter name. Hmm, this seems to be a user-visible change - so those of us who have been booting with "arm-smmu.disable_bypass=0" now need to change that depending on which kernel is being booted - which is not nice, and makes the support side harder on platforms that need this kernel parameter. > > Signed-off-by: Li Yang > --- > drivers/iommu/Kconfig | 2 +- > drivers/iommu/arm-smmu.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig > index d2fade984999..fb54be903c60 100644 > --- a/drivers/iommu/Kconfig > +++ b/drivers/iommu/Kconfig > @@ -415,7 +415,7 @@ config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT > hardcode the bypass disable in the code. > > NOTE: the kernel command line parameter > - 'arm-smmu.disable_bypass' will continue to override this > + 'arm-smmu-mod.disable_bypass' will continue to override this > config. 
> > config ARM_SMMU_V3 > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 16c4b87af42b..2ffe8ff04393 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -512,7 +512,7 @@ static irqreturn_t arm_smmu_global_fault(int irq, void > *dev) > if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) && > (gfsr & ARM_SMMU_sGFSR_USF)) > dev_err(smmu->dev, > - "Blocked unknown Stream ID 0x%hx; boot with > \"arm-smmu.disable_bypass=0\" to allow, but this may have security > implications\n", > + "Blocked unknown Stream ID 0x%hx; boot with > \"arm-smmu-mod.disable_bypass=0\" to allow, but this may have security > implications\n", > (u16)gfsynr1); > else > dev_err(smmu->dev, > -- > 2.17.1 > > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: Report USF more clearly
On Fri, Sep 13, 2019 at 12:48:37PM +0100, Robin Murphy wrote: > Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool > for smoking out inadequate firmware, the failure mode is non-obvious > and can be confusing for end users. Add some special-case reporting of > Unidentified Stream Faults to help clarify this particular symptom. Having encountered this on a board that turned up this week, it may be better to use the hex representation of the stream ID, especially as it seems normal for the stream ID to be made up of implementation defined bitfields. If we want to stick with decimal, maybe masking the stream ID with the number of allowable bits would be a good idea, so that the decimal value remains meaningful should other bits be non-zero? > CC: Douglas Anderson > Signed-off-by: Robin Murphy > --- > drivers/iommu/arm-smmu.c | 5 + > drivers/iommu/arm-smmu.h | 2 ++ > 2 files changed, 7 insertions(+) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index b7cf24402a94..76ac8c180695 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void > *dev) > dev_err_ratelimited(smmu->dev, > "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 > 0x%08x\n", > gfsr, gfsynr0, gfsynr1, gfsynr2); > + if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) && > + (gfsr & sGFSR_USF)) > + dev_err_ratelimited(smmu->dev, > + "Stream ID %hu may not be described by firmware, try > booting with \"arm-smmu.disable_bypass=0\"\n", > + (u16)gfsynr1); > > arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr); > return IRQ_HANDLED; > diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h > index c9c13b5785f2..46f7e161e83e 100644 > --- a/drivers/iommu/arm-smmu.h > +++ b/drivers/iommu/arm-smmu.h > @@ -79,6 +79,8 @@ > #define ID7_MINOR GENMASK(3, 0) > > #define ARM_SMMU_GR0_sGFSR 0x48 > +#define sGFSR_USF BIT(2) I do wonder if this is another instance where 
writing "(1 << 1)" would have resulted in less chance of a mistake being made... wrapping stuff up into macros is not always better! 9.6.15 SMMU_sGFSR, Global Fault Status Register The SMMU_sGFSR bit assignments are: USF, bit[1] Unidentified stream fault. The possible values of this bit are: 0 No Unidentified stream fault. 1 Unidentified stream fault. So this wants to be: #define sGFSR_USF BIT(1) -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH 1/4] vmalloc: lift the arm flag for coherent mappings to common code
On Fri, Aug 30, 2019 at 08:29:21AM +0200, Christoph Hellwig wrote: > The arm architecture had a VM_ARM_DMA_CONSISTENT flag to mark DMA > coherent remapping for a while. Lift this flag to common code so > that we can use it generically. We also check it in the only place > VM_USERMAP is directly checked so that we can entirely replace that > flag as well (although I'm not even sure why we'd want to allow > remapping DMA mappings, but I'd rather not change behavior). Good, because if you did change that behaviour, you'd break almost every ARM framebuffer and cripple ARM audio drivers. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*
On Tue, Aug 06, 2019 at 05:45:03PM +0100, Russell King - ARM Linux admin wrote: > On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote: > > On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote: > > > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote: > > > > > > > > So this boils down to a terminology mismatch. The Arm architecture > > > > doesn't have > > > > anything called "write combine", so in Linux we instead provide what > > > > the Arm > > > > architecture calls "Normal non-cacheable" memory for > > > > pgprot_writecombine(). > > > > Amongst other things, this memory type permits speculation, unaligned > > > > accesses > > > > and merging of writes. I found something in the architecture spec about > > > > non-cachable memory, but it's written in Armglish[1]. > > > > > > > > pgprot_noncached(), on the other hand, provides what the architecture > > > > calls > > > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping > > > > MMIO > > > > (i.e. PCI config space) and therefore forbids speculation, preserves > > > > access > > > > size, requires strict alignment and also forces write responses to come > > > > from > > > > the endpoint. > > > > > > > > I think the naming mismatch is historical, but on arm64 we wanted to > > > > use the > > > > same names as arm32 so that any drivers using these things directly > > > > would get > > > > the same behaviour. > > > > > > That all makes sense, but it totally needs a comment. I'll try to draft > > > one based on this. I've also looked at the arm32 code a bit more, and > > > it seems arm always (?) supported Normal non-cacheable attribute, but > > > Linux only optionally uses it for arm v6+ because of fears of drivers > > > missing barriers. > > > > I think it was also to do with aliasing, but I don't recall all of the > > details. 
> > ARMv6+ is where the architecture significantly changed to introduce > the idea of [Normal, Device, Strongly Ordered] where Normal has the > cache attributes. > > Before that, we had just "uncached/unbuffered, uncached/buffered, > cached/unbuffered, cached/buffered" modes. > > The write buffer (enabled by buffered modes) has no architected > guarantees about how long writes will sit in it, and there is only > the "drain write buffer" instruction to push writes out. > > Up to and including ARMv5, we took the easy approach of just using > the "uncached/unbuffered" mode since that is (a) the safest, and (b) > avoids write buffers that alias when there are multiple different > mappings. > > We could have used a different approach, making all IO writes contain > a "drain write buffer" instruction, and map DMA memory as "buffered", > but as there were no Linux barriers defined to order memory accesses > to DMA memory (so, for example, ring buffers can be updated in the > correct order) back in those days, using the uncached/unbuffered mode > was the sanest and most reliable solution. > > > > > > The other really weird thing is that in arm32 > > pgprot_dmacoherent includes the L_PTE_XN bit, which from my understanding > > is the no-execute bit, but pgprot_writecombine does not. This seems to > > not be very intentional. So minus that the whole DMA_ATTR_WRITE_COMBINE > > seems to be about flagging old arm specific drivers as having the proper > > barriers in places and otherwise is a no-op. > > > > I think it only matters for Armv7 CPUs, but yes, we should probably be > > setting L_PTE_XN for both of these memory types. > > Conventionally, pgprot_writecombine() has only been used to change > the memory type and not the permissions. Since writecombine memory > is still capable of being executed, I don't see any reason to set XN > for it. 
> > If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there > really a reason for writecombine to set XN overriding the user? > > That said, pgprot_writecombine() is mostly used for framebuffers, which > arguably shouldn't be executable anyway - but who'd want to mmap() the > framebuffer with PROT_EXEC? > > > > > > Here is my tentative plan: > > > > > > - respin this patch with a small fix to handle the > > >DMA_ATTR_NON_CONSISTENT (as in ignore it unl
Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*
On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote: > On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote: > > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote: > > > > > > So this boils down to a terminology mismatch. The Arm architecture > > > doesn't have > > > anything called "write combine", so in Linux we instead provide what the > > > Arm > > > architecture calls "Normal non-cacheable" memory for > > > pgprot_writecombine(). > > > Amongst other things, this memory type permits speculation, unaligned > > > accesses > > > and merging of writes. I found something in the architecture spec about > > > non-cachable memory, but it's written in Armglish[1]. > > > > > > pgprot_noncached(), on the other hand, provides what the architecture > > > calls > > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping > > > MMIO > > > (i.e. PCI config space) and therefore forbids speculation, preserves > > > access > > > size, requires strict alignment and also forces write responses to come > > > from > > > the endpoint. > > > > > > I think the naming mismatch is historical, but on arm64 we wanted to use > > > the > > > same names as arm32 so that any drivers using these things directly would > > > get > > > the same behaviour. > > > > That all makes sense, but it totally needs a comment. I'll try to draft > > one based on this. I've also looked at the arm32 code a bit more, and > > it seems arm always (?) supported Normal non-cacheable attribute, but > > Linux only optionally uses it for arm v6+ because of fears of drivers > > missing barriers. > > I think it was also to do with aliasing, but I don't recall all of the > details. ARMv6+ is where the architecture significantly changed to introduce the idea of [Normal, Device, Strongly Ordered] where Normal has the cache attributes. Before that, we had just "uncached/unbuffered, uncached/buffered, cached/unbuffered, cached/buffered" modes. 
The write buffer (enabled by buffered modes) has no architected guarantees about how long writes will sit in it, and there is only the "drain write buffer" instruction to push writes out. Up to and including ARMv5, we took the easy approach of just using the "uncached/unbuffered" mode since that is (a) the safest, and (b) avoids write buffers that alias when there are multiple different mappings. We could have used a different approach, making all IO writes contain a "drain write buffer" instruction, and map DMA memory as "buffered", but as there were no Linux barriers defined to order memory accesses to DMA memory (so, for example, ring buffers can be updated in the correct order) back in those days, using the uncached/unbuffered mode was the sanest and most reliable solution. > > The other really weird thing is that in arm32 > > pgprot_dmacoherent includes the L_PTE_XN bit, which from my understanding > > is the no-execute bit, but pgprot_writecombine does not. This seems to > > not be very intentional. So minus that the whole DMA_ATTR_WRITE_COMBINE > > seems to be about flagging old arm specific drivers as having the proper > > barriers in places and otherwise is a no-op. > > I think it only matters for Armv7 CPUs, but yes, we should probably be > setting L_PTE_XN for both of these memory types. Conventionally, pgprot_writecombine() has only been used to change the memory type and not the permissions. Since writecombine memory is still capable of being executed, I don't see any reason to set XN for it. If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there really a reason for writecombine to set XN overriding the user? That said, pgprot_writecombine() is mostly used for framebuffers, which arguably shouldn't be executable anyway - but who'd want to mmap() the framebuffer with PROT_EXEC? 
> > > Here is my tentative plan: > > > > - respin this patch with a small fix to handle the > >DMA_ATTR_NON_CONSISTENT (as in ignore it unless actually supported), > >but keep the name as-is to avoid churn. This should allow 5.3 > >inclusion and backports > > - remove DMA_ATTR_WRITE_COMBINE support from mips, probably also 5.3 > >material. > > - move all architectures but arm over to just define > >pgprot_dmacoherent, including a comment with the above explanation > >for arm64. > > That would be great, thanks. > > > - make DMA_ATTR_WRITE_COMBINE a no-op and schedule it for removal, > >thus removing the last instances of arch_dma_mmap_pgprot > > All sounds good to me, although I suppose 32-bit Arm platforms without > CONFIG_ARM_DMA_MEM_BUFFERABLE may run into issues if DMA_ATTR_WRITE_COMBINE > disappears. Only one way to find out... Looking at the results of grep, I think only OMAP2+ and Exynos may be affected. However, removing writecombine support from the DMA API is going to have a huge impact for framebuffers on earlier ARMs - that's where we do expect framebuffers to be mapped "uncached/buffered" for perfo
Re: SATA broken with LPAE
On Thu, Jun 27, 2019 at 11:15:30AM +0200, Christoph Hellwig wrote: > On Thu, Jun 27, 2019 at 10:07:53AM +0100, Russell King - ARM Linux admin wrote: > > dmabounce has only ever been used with specific devices that have weird > > setups. Otherwise, we've never expected what you describe on ARM. I > > also don't recognise your assertion about the way the DMA API should > > behave as ever having been documented as a requirement for architectures > > to implement. > > That requirement has basically always been there since at least the > 2.6.x days. The history here is that when 64-bit architectures showed > up they all had iommus, so this wasn't an issue. Next was x86 with > highmem, which added special bounce buffering for block I/O and networking > only. Then ia64 showed up that didn't always have an iommu and swiotlb > was added as a "software IOMMU". At this point we had two bounce buffering > schemes for block and networking, while everything else potentially > DMAing to higher memory relied on swiotlb, which was picked up by > basically every architecture that could have memory not covered by a > 32-bit mask and didn't have an iommu. If it wasn't documented... > Except it seems arm never did > that and has been lucky as people didn't try anything that is not > block or networking on their extended physical address space. Because no one knew that the requirement had been introduced, and we've been reliant on all the code you've been stripping out. You're now causing regressions on 32-bit ARM, so you get to fix it. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: SATA broken with LPAE
On Wed, Jun 26, 2019 at 02:53:25PM +0200, Christoph Hellwig wrote: > Hi Roger, > > it seems the arm dma direct mapping code isn't doing the right thing > here. On other platforms that have > 4G memory we always use swiotlb > for bounce buffering in case a device that can't DMA to all the memory. > > Arm is the odd one out and uses its own dmabounce framework instead, > but it seems like it doesn't get used in this case. We need to make > sure dmabounce (or swiotlb for that matter) is set up if > 32-bit > addressing is supported. I'm not really an arm platform expert, > but some of those on the Cc list are and might chime in on how to > do that. dmabounce has only ever been used with specific devices that have weird setups. Otherwise, we've never expected what you describe on ARM. I also don't recognise your assertion about the way the DMA API should behave as ever having been documented as a requirement for architectures to implement. dmabounce comes with it some serious issues that have been known to cause OOMs with old kernels (which have never been solved), so that really isn't the answer. This kind of breakage that is now being reported is exactly the problem that I thought would happen with your patches. 32-bit ARM is in maintenance only now, there is no longer any appetite nor funding for development work on core code. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] ARM: dma-mapping: allow larger DMA mask than supported
On Tue, May 21, 2019 at 02:47:29PM +0200, Christoph Hellwig wrote: > Since Linux 5.1 we allow drivers to just set the largest DMA mask they > support instead of falling back to smaller ones. This doesn't make sense. "they" is confusing - why would a driver set a DMA mask larger than the driver supports? Or is "they" not referring to the drivers (in which case, what is it referring to?) > When fixing up all the dma ops instances to allow for this behavior > the arm direct mapping code was missed. Fix it up by removing the > sanity check, as all the actual mapping code handles this case just > fine. > > Fixes: 9eb9e96e97b3 ("Documentation/DMA-API-HOWTO: update dma_mask sections") > Signed-off-by: Christoph Hellwig > --- > arch/arm/mm/dma-mapping.c | 20 +--- > 1 file changed, 1 insertion(+), 19 deletions(-) > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index 0a75058c11f3..bdf0d236aaee 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -219,25 +219,7 @@ EXPORT_SYMBOL(arm_coherent_dma_ops); > > static int __dma_supported(struct device *dev, u64 mask, bool warn) > { > - unsigned long max_dma_pfn; > - > - /* > - * If the mask allows for more memory than we can address, > - * and we actually have that much memory, then we must > - * indicate that DMA to this device is not supported. > - */ > - if (sizeof(mask) != sizeof(dma_addr_t) && > - mask > (dma_addr_t)~0 && > - dma_to_pfn(dev, ~0) < max_pfn - 1) { > - if (warn) { > - dev_warn(dev, "Coherent DMA mask %#llx is larger than > dma_addr_t allows\n", > - mask); > - dev_warn(dev, "Driver did not use or check the return > value from dma_set_coherent_mask()?\n"); > - } > - return 0; > - } The point of this check is to trap the case where we have, for example, 8GB of memory, but dma_addr_t is 32-bit. We can allocate in the high 4GB, but we can't represent the address in a dma_addr_t. 
> - > - max_dma_pfn = min(max_pfn, arm_dma_pfn_limit); > + unsigned long max_dma_pfn = min(max_pfn, arm_dma_pfn_limit); > > /* >* Translate the device's DMA mask to a PFN limit. This > -- > 2.20.1 > > -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] dma-mapping: truncate dma masks to what dma_addr_t can hold
On Tue, May 21, 2019 at 02:47:28PM +0200, Christoph Hellwig wrote: > The dma masks in struct device are always 64-bits wide. But for builds > using a 32-bit dma_addr_t we need to ensure we don't store an > unsupportable value. Before Linux 5.0 this was handled at least by > the ARM dma mapping code by never allowing to set a larger dma_mask, > but these days we allow the driver to just set the largest supported > value and never fall back to a smaller one. Ensure this always works > by truncating the value. So how does the driver negotiation for >32bit addresses work if we don't fail for large masks? I'm thinking about all those PCI drivers that need DAC cycles for >32bit addresses, such as e1000, which negotiate via (eg): /* there is a workaround being applied below that limits * 64-bit DMA addresses to 64-bit hardware. There are some * 32-bit adapters that Tx hang when given 64-bit DMA addresses */ pci_using_dac = 0; if ((hw->bus_type == e1000_bus_type_pcix) && !dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) { pci_using_dac = 1; } else { err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)); if (err) { pr_err("No usable DMA config, aborting\n"); goto err_dma; } } and similar. If we blindly trunate the 64-bit to 32-bit, aren't we going to end up with PCI cards using DAC cycles to a host bridge that do not support DAC cycles? > > Fixes: 9eb9e96e97b3 ("Documentation/DMA-API-HOWTO: update dma_mask sections") > Signed-off-by: Christoph Hellwig > --- > kernel/dma/mapping.c | 12 > 1 file changed, 12 insertions(+) > > diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c > index f7afdadb6770..1f628e7ac709 100644 > --- a/kernel/dma/mapping.c > +++ b/kernel/dma/mapping.c > @@ -317,6 +317,12 @@ void arch_dma_set_mask(struct device *dev, u64 mask); > > int dma_set_mask(struct device *dev, u64 mask) > { > + /* > + * Truncate the mask to the actually supported dma_addr_t width to > + * avoid generating unsupportable addresses. 
> + */ > + mask = (dma_addr_t)mask; > + > if (!dev->dma_mask || !dma_supported(dev, mask)) > return -EIO; > > @@ -330,6 +336,12 @@ EXPORT_SYMBOL(dma_set_mask); > #ifndef CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK > int dma_set_coherent_mask(struct device *dev, u64 mask) > { > + /* > + * Truncate the mask to the actually supported dma_addr_t width to > + * avoid generating unsupportable addresses. > + */ > + mask = (dma_addr_t)mask; > + > if (!dma_supported(dev, mask)) > return -EIO; > > -- > 2.20.1 > > -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: remove NULL struct device support in the DMA API
On Thu, Mar 21, 2019 at 03:52:28PM -0700, Christoph Hellwig wrote: > We still have a few drivers which pass a NULL struct device pointer > to DMA API functions, which generally is a bad idea as the API > implementations rely on the device not only for ops selection, but > also the dma mask and various other attributes, and many implementations > have been broken for NULL device support for a while. I think I must be missing something, but... My understanding is that ISA DMA is normally limited to 24 bits of address - indeed, the x86 version only programs 24 bits of DMA address. Looking through this series, it appears that the conversions mean that the DMA mask for ISA becomes the full all-ones DMA mask, which would of course lead to memory corruption if only 24 bits of the address end up being programmed into the hardware. Maybe you could say why you think this series is safe in regard to ISA DMA? > > This series removes the few remaining users that weren't picked up in > the last merge window and then removes core support for this "feature". > > A git tree is also available at: > > git://git.infradead.org/users/hch/misc.git dma-remove-NULL-dev-support > > Gitweb: > > > http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-remove-NULL-dev-support > -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH v5 0/9] Use vm_insert_range
Having discussed with Matthew offlist, I think we've come to the following conclusion - there's a number of drivers that buggily ignore vm_pgoff. So, what I proposed is:

static int __vm_insert_range(struct vm_area_struct *vma, struct page **pages,
			     size_t num, unsigned long offset)
{
	unsigned long count = vma_pages(vma);
	unsigned long uaddr = vma->vm_start;
	unsigned long i;
	int ret;

	/* Fail if the user requested offset is beyond the end of the object */
	if (offset > num)
		return -ENXIO;

	/* Fail if the user requested size exceeds available object size */
	if (count > num - offset)
		return -ENXIO;

	/* Never exceed the number of pages that the user requested */
	for (i = 0; i < count; i++) {
		ret = vm_insert_page(vma, uaddr, pages[offset + i]);
		if (ret < 0)
			return ret;
		uaddr += PAGE_SIZE;
	}

	return 0;
}

/*
 * Maps an object consisting of `num' `pages', catering for the user's
 * requested vm_pgoff
 */
int vm_insert_range(struct vm_area_struct *vma, struct page **pages,
		    size_t num)
{
	return __vm_insert_range(vma, pages, num, vma->vm_pgoff);
}

/*
 * Maps a set of pages, always starting at page[0]
 */
int vm_insert_range_buggy(struct vm_area_struct *vma, struct page **pages,
			  size_t num)
{
	return __vm_insert_range(vma, pages, num, 0);
}

With this, drivers such as iommu/dma-iommu.c can be converted thusly:

int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
{
-	unsigned long uaddr = vma->vm_start;
-	unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	int ret = -ENXIO;
-
-	for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
-		ret = vm_insert_page(vma, uaddr, pages[i]);
-		if (ret)
-			break;
-		uaddr += PAGE_SIZE;
-	}
-	return ret;
+	return vm_insert_range(vma, pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
}

and drivers such as firewire/core-iso.c:

int fw_iso_buffer_map_vma(struct fw_iso_buffer *buffer,
			  struct vm_area_struct *vma)
{
-	unsigned long uaddr;
-	int i, err;
-
-	uaddr = vma->vm_start;
-	for (i = 0; i < buffer->page_count; i++) {
-		err = vm_insert_page(vma, uaddr, buffer->pages[i]);
-		if (err)
-			return err;
-
-		uaddr += PAGE_SIZE;
-	}
-
-	return 0;
+	return vm_insert_range_buggy(vma, buffer->pages, buffer->page_count);
}

and this gives us something to grep for to find these buggy drivers. Now, this may not look exactly equivalent, but if you look at fw_device_op_mmap(), buffer->page_count is basically vma_pages(vma) at this point, which means this should be equivalent. We _could_ then at a later date "fix" these drivers to behave according to the normal vm_pgoff offsetting simply by removing the _buggy suffix on the function name... and if that causes regressions, it gives us an easy way to revert (as long as vm_insert_range_buggy() remains available.) In the case of firewire/core-iso.c, it currently ignores the mmap offset entirely, so making the above suggested change would be tantamount to causing it to return -ENXIO for any non-zero mmap offset. IMHO, this approach is way simpler, and easier to get it correct at each call site, rather than the current approach which seems to be error-prone. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH v5 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range
On Mon, Dec 24, 2018 at 06:55:31PM +0530, Souptick Joarder wrote:
> Convert to use vm_insert_range() to map range of kernel
> memory to user vma.
> 
> Signed-off-by: Souptick Joarder
> Reviewed-by: Matthew Wilcox
> ---
>  drivers/iommu/dma-iommu.c | 13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index d1b0475..de7ffd8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -622,17 +622,10 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
> 
>  int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
>  {
> -	unsigned long uaddr = vma->vm_start;
> -	unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	int ret = -ENXIO;
> +	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> 
> -	for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
> -		ret = vm_insert_page(vma, uaddr, pages[i]);
> -		if (ret)
> -			break;
> -		uaddr += PAGE_SIZE;
> -	}
> -	return ret;
> +	return vm_insert_range(vma, vma->vm_start, pages + vma->vm_pgoff,
> +			       count - vma->vm_pgoff);

This introduces a new bug. I'm not going to continue to point out in minute detail the mistakes you are introducing, as I don't think that is helping you to learn. Look at this closely, and see whether you can spot the mistake.

Specifically, compare the boundary conditions for the final page that is to be inserted, and the value returned by the original version and by your version, for different scenarios.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/9] dma-mapping: move the arm64 noncoherent alloc/free support to common code
On Mon, Nov 05, 2018 at 01:19:26PM +0100, Christoph Hellwig wrote:
> The arm64 codebase to implement coherent dma allocation for architectures
> with non-coherent DMA is a good start for a generic implementation, given
> that it uses the generic remap helpers, provides the atomic pool for
> allocations that can't sleep and still is relatively simple and well
> tested. Move it to kernel/dma and allow architectures to opt into it
> using a config symbol. Architectures just need to provide a new
> arch_dma_prep_coherent helper to writeback and invalidate the caches
> for any memory that gets remapped for uncached access.
> 
> Signed-off-by: Christoph Hellwig
> ---
>  arch/arm64/Kconfig              |   2 +-
>  arch/arm64/mm/dma-mapping.c     | 184 ++--
>  include/linux/dma-mapping.h     |   5 +
>  include/linux/dma-noncoherent.h |   2 +
>  kernel/dma/Kconfig              |   6 ++
>  kernel/dma/remap.c              | 158 ++-
>  6 files changed, 181 insertions(+), 176 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 5d065acb6d10..2e645ea693ea 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -82,7 +82,7 @@ config ARM64
>  	select CRC32
>  	select DCACHE_WORD_ACCESS
>  	select DMA_DIRECT_OPS
> -	select DMA_REMAP
> +	select DMA_DIRECT_REMAP
>  	select EDAC_SUPPORT
>  	select FRAME_POINTER
>  	select GENERIC_ALLOCATOR
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index a3ac26284845..e2e7e5d0f94e 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -33,113 +33,6 @@
> 
>  #include
> 
> -static struct gen_pool *atomic_pool __ro_after_init;
> -
> -#define DEFAULT_DMA_COHERENT_POOL_SIZE	SZ_256K
> -static size_t atomic_pool_size __initdata = DEFAULT_DMA_COHERENT_POOL_SIZE;
> -
> -static int __init early_coherent_pool(char *p)
> -{
> -	atomic_pool_size = memparse(p, &p);
> -	return 0;
> -}
> -early_param("coherent_pool", early_coherent_pool);
> -
> -static void *__alloc_from_pool(size_t size, struct page **ret_page, gfp_t flags)
> -{
> -	unsigned long val;
> -	void *ptr = NULL;
> -
> -	if (!atomic_pool) {
> -		WARN(1, "coherent pool not initialised!\n");
> -		return NULL;
> -	}
> -
> -	val = gen_pool_alloc(atomic_pool, size);
> -	if (val) {
> -		phys_addr_t phys = gen_pool_virt_to_phys(atomic_pool, val);
> -
> -		*ret_page = phys_to_page(phys);
> -		ptr = (void *)val;
> -		memset(ptr, 0, size);
> -	}
> -
> -	return ptr;
> -}
> -
> -static bool __in_atomic_pool(void *start, size_t size)
> -{
> -	return addr_in_gen_pool(atomic_pool, (unsigned long)start, size);
> -}
> -
> -static int __free_from_pool(void *start, size_t size)
> -{
> -	if (!__in_atomic_pool(start, size))
> -		return 0;
> -
> -	gen_pool_free(atomic_pool, (unsigned long)start, size);
> -
> -	return 1;
> -}
> -
> -void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> -		gfp_t flags, unsigned long attrs)
> -{
> -	struct page *page;
> -	void *ptr, *coherent_ptr;
> -	pgprot_t prot = pgprot_writecombine(PAGE_KERNEL);
> -
> -	size = PAGE_ALIGN(size);
> -
> -	if (!gfpflags_allow_blocking(flags)) {
> -		struct page *page = NULL;
> -		void *addr = __alloc_from_pool(size, &page, flags);
> -
> -		if (addr)
> -			*dma_handle = phys_to_dma(dev, page_to_phys(page));
> -
> -		return addr;
> -	}
> -
> -	ptr = dma_direct_alloc_pages(dev, size, dma_handle, flags, attrs);
> -	if (!ptr)
> -		goto no_mem;
> -
> -	/* remove any dirty cache lines on the kernel alias */
> -	__dma_flush_area(ptr, size);
> -
> -	/* create a coherent mapping */
> -	page = virt_to_page(ptr);
> -	coherent_ptr = dma_common_contiguous_remap(page, size, VM_USERMAP,
> -						   prot, __builtin_return_address(0));
> -	if (!coherent_ptr)
> -		goto no_map;
> -
> -	return coherent_ptr;
> -
> -no_map:
> -	dma_direct_free_pages(dev, size, ptr, *dma_handle, attrs);
> -no_mem:
> -	return NULL;
> -}
> -
> -void arch_dma_free(struct device *dev, size_t size, void *vaddr,
> -		dma_addr_t dma_handle, unsigned long attrs)
> -{
> -	if (!__free_from_pool(vaddr, PAGE_ALIGN(size))) {
> -		void *kaddr = phys_to_virt(dma_to_phys(dev, dma_handle));
> -
> -		vunmap(vaddr);
> -		dma_direct_free_pages(dev, size, kaddr, dma_handle, attrs);
> -	}
> -}
> -
> -long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
> -		dma_addr_t dma_addr)
> -{
> -	return __phys_to_pfn(dma_to_phys(dev, dma_addr));
> -}
> -
>  pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
>  		unsigned long att
Re: remove the ->mapping_error method from dma_map_ops V3
On Fri, Nov 30, 2018 at 02:22:08PM +0100, Christoph Hellwig wrote:
> Error reporting for the dma_map_single and dma_map_page operations is
> currently a mess. Both APIs directly return the dma_addr_t to be used for
> the DMA, with a magic error escape that is specific to the instance and
> checked by another method provided. This has a few downsides:
> 
>  - the error check is easily forgotten and a __must_check marker doesn't
>    help as the value always is consumed anyway
>  - the error checking requires another indirect call, which has gotten
>    incredibly expensive
>  - a lot of code is wasted on implementing these methods
> 
> The historical reason for this is that people thought DMA mappings would
> not fail when the API was created, which sounds like a really bad
> assumption in retrospect, and then we tried to cram error handling
> onto it later on.
> 
> There basically are two variants: the error code is 0 because the
> implementation will never return 0 as a valid DMA address, or the error
> code is all-Fs as the implementation won't ever return an address that
> high. The old AMD GART is the only one not falling into these two camps
> as it picks sort of a relative zero relative to where it is mapped.
> 
> The 0 return doesn't work for direct mappings that have ram at address
> zero and a lot of IOMMUs that start allocating bus space from address
> zero, so we can't consolidate on that, but I think we can move everyone
> to all-Fs, which the patch here does. The reason for that is that
> there is only one way to ever get this address: by doing a 1-byte long,
> 1-byte aligned mapping, but all our mappings are not only longer but
> generally aligned, and the mappings have to keep at least the basic
> alignment.
> 
> A git tree is also available here:
> 
>     git://git.infradead.org/users/hch/misc.git dma-mapping-error.3

Hi Chris,

For patches 1, 3 and 23,

Acked-by: Russell King

Thanks.
Re: remove the ->mapping_error method from dma_map_ops V2
On Wed, Nov 28, 2018 at 11:27:17AM -0800, David Miller wrote:
> From: Linus Torvalds
> Date: Wed, 28 Nov 2018 10:00:06 -0800
> 
> > Not all memory is accessible even to the kernel. If you have memory
> > that shows up in the last page of phys_addr_t, you just mark it
> > reserved at boot-time.
> 
> It's not the physical memory at the end that needs to be reserved.
> 
> It's the IOMMU mapping arena.

True, if and only if you have an IOMMU. Where there isn't an IOMMU, we'd have to reserve every page that translates to a bus address in the top 4K of dma_addr_t on any bus in the system - and that means knowing, early in kernel initialisation, about all the buses in the system, so that we can detect and reserve these pages. I don't think that's trivial to do.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: remove the ->mapping_error method from dma_map_ops V2
On Wed, Nov 28, 2018 at 11:19:15AM -0800, Linus Torvalds wrote: > On Wed, Nov 28, 2018, 10:08 Russell King - ARM Linux wrote: > > > > > > > You already cannot do that kmalloc(), exactly because of ERR_PTR(). > > > > I'm very sorry, but I think you are confused. > > > > kmalloc() returns a _virtual_ address, which quite rightly must not be > > in the top 4K of 4GB, exactly due to ERR_PTR(). That is fine. > > > > However, that is a completely different kettle of fish from a physical > > or DMA address - neither of which are virtual addresses. > > > > You cut away the part that showed that I'm not in the least confused. > > Let me just paste it back in here: > > "Which is what we ALREADY do for these exact reasons. If the DMA > mappings means that you'd need to add one more page to that list of > reserved pages, then so be it." We're not talking about just one more page. We're talking about _every_ page that _may_ map to a bus address in the range of 4GB - 4K (a point I've already made earlier in this discussion.) It's not just about physical addresses, it's about bus addresses, and there are buses out there that have a translation between physical address and bus address without IOMMUs and the like. Can we know ahead of time about all these translations? It means moving that information out of various bus drivers into core architecture code (yuck, when you have multiple different platforms) or into DT (which means effectively breaking every damn platform that's using older DT files) - something that we don't have to do today. I remain 100% opposed to your idea. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: remove the ->mapping_error method from dma_map_ops V2
On Wed, Nov 28, 2018 at 06:08:41PM +0000, Russell King - ARM Linux wrote:
> On Wed, Nov 28, 2018 at 10:00:06AM -0800, Linus Torvalds wrote:
> > On Wed, Nov 28, 2018 at 9:45 AM Russell King - ARM Linux wrote:
> > >
> > > > I don't think this is a huge deal, but ERR_PTR() has been hugely
> > > > successful elsewhere. And I'm not hugely convinced about all these
> > > > "any address can be valid" arguments. How the hell do you generate a
> > > > random dma address in the last page that isn't even page-aligned?
> > >
> > > kmalloc() a 64-byte buffer, dma_map_single() that buffer.
> >
> > No.
> >
> > You already cannot do that kmalloc(), exactly because of ERR_PTR().
> 
> I'm very sorry, but I think you are confused.
> 
> kmalloc() returns a _virtual_ address, which quite rightly must not be
> in the top 4K of 4GB, exactly due to ERR_PTR(). That is fine.
> 
> However, that is a completely different kettle of fish from a physical
> or DMA address - neither of which are virtual addresses.
> 
> Now, say we have 1GB of RAM which starts at 0xc0000000 _physical_.
> The kernel is configured with a 2GB/2GB user/kernel split, which means
> all 1GB of RAM is mapped as lowmem from 0x80000000 to 0xbfffffff
> inclusive. This means kmalloc() can return any address in that range.
> 
> ERR_PTR() will work correctly on any of those pointers, meaning that
> none of them will be seen as an error.
> 
> However, map any virtual address in the range of 0xbffff000 to
> 0xbfffffff into DMA space, and the resulting DMA address could well
> be in the range of 0xfffff000 to 0xffffffff - precisely the range
> of addresses that you are advocating to be used for error codes.
> 
> > The whole argument of "every possible piece of memory is DMA'able" is
> > just wrong.
> 
> I'm very sorry, but I do not buy your argument - you are conflating
> virtual addresses which ERR_PTR() deals in with physical and bus
> addresses - and if you persist down this route, you will cause
> regressions.

Here's another case: i.MX6 with 4GB of RAM. Devices are mapped to 0-0x0fffffff physical, RAM is mapped to 0x10000000-0xffffffff physical. The last 256MB of RAM is not accessible, as this is a 32-bit device. DMA addresses are the same as physical addresses.

While the final physical page will be highmem in a normal kernel, and thus will not be available for kmalloc(), that doesn't mean it can't happen. A crashdump kernel loaded high in physical memory (eg, in the last 512MB, and given that last 512MB to play around in) would have the top 512MB as lowmem, and therefore available for kmalloc().

If a page is available in lowmem, it's available for kmalloc(), and we can't say that we will never allocate memory from such a page for DMA. If we do, and we're using an IS_ERR_VALUE() scheme, it _will_ break if that happens, as memory will end up being mapped by the DMA API but dma_mapping_error() will see it as a failure. It won't be an obvious breakage, because it depends on the right conditions happening - a kmalloc() from the top page of physical RAM, and that being passed to dma_map_single(). IOW, it's not something that a quick boot test would find; it's something that is likely to cause failures after a system has been running for a period of time.

There are other situations where there are possibilities - such as:

	dma_map_page(dev, page, offset, size, direction)

If 'page' is a highmem page which happens to be the top page in the 4GB space, and offset is non-zero, and there's a 1:1 mapping between physical address and DMA address, the returned value will be 0xfffff000 + offset - within the "last 4095 values are errors" range.

Networking uses this for fragments - the packet fragment list is a list of pages, offsets and sizes - we have sendpage() that may end up finding that last page, and TCP-sized packets may be generated from it, which would certainly result in non-zero offsets being passed to dma_map_page().

So, whatever way _I_ look at it, I find your proposal to be unsafe and potentially regression-causing, and I *completely* and strongly oppose it in its current form.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: remove the ->mapping_error method from dma_map_ops V2
On Wed, Nov 28, 2018 at 10:00:06AM -0800, Linus Torvalds wrote:
> On Wed, Nov 28, 2018 at 9:45 AM Russell King - ARM Linux wrote:
> >
> > > I don't think this is a huge deal, but ERR_PTR() has been hugely
> > > successful elsewhere. And I'm not hugely convinced about all these
> > > "any address can be valid" arguments. How the hell do you generate a
> > > random dma address in the last page that isn't even page-aligned?
> >
> > kmalloc() a 64-byte buffer, dma_map_single() that buffer.
> 
> No.
> 
> You already cannot do that kmalloc(), exactly because of ERR_PTR().

I'm very sorry, but I think you are confused.

kmalloc() returns a _virtual_ address, which quite rightly must not be in the top 4K of 4GB, exactly due to ERR_PTR(). That is fine.

However, that is a completely different kettle of fish from a physical or DMA address - neither of which are virtual addresses.

Now, say we have 1GB of RAM which starts at 0xc0000000 _physical_. The kernel is configured with a 2GB/2GB user/kernel split, which means all 1GB of RAM is mapped as lowmem from 0x80000000 to 0xbfffffff inclusive. This means kmalloc() can return any address in that range.

ERR_PTR() will work correctly on any of those pointers, meaning that none of them will be seen as an error.

However, map any virtual address in the range of 0xbffff000 to 0xbfffffff into DMA space, and the resulting DMA address could well be in the range of 0xfffff000 to 0xffffffff - precisely the range of addresses that you are advocating to be used for error codes.

> The whole argument of "every possible piece of memory is DMA'able" is
> just wrong.

I'm very sorry, but I do not buy your argument - you are conflating virtual addresses, which ERR_PTR() deals in, with physical and bus addresses - and if you persist down this route, you will cause regressions.
Re: remove the ->mapping_error method from dma_map_ops V2
On Wed, Nov 28, 2018 at 08:47:05AM -0800, Linus Torvalds wrote: > On Tue, Nov 27, 2018 at 11:41 PM Christoph Hellwig wrote: > > > > On Fri, Nov 23, 2018 at 07:55:11AM +0100, Christoph Hellwig wrote: > > > Well, I can tweak the last patch to return -EINVAL from dma_mapping_error > > > instead of the old 1 is as bool true. The callers should all be fine, > > > although I'd have to audit them. Still wouldn't help with being able to > > > return different errors. > > > > Any opinions? I'd really like to make some forward progress on this > > series. > > So I do think that yes, dma_mapping_error() should return an error > code, not 0/1. > > But I was really hoping that the individual drivers themselves could > return error codes. Right now the patch-series has code like this: > > ret = needs_bounce(dev, dma_addr, size); > if (ret < 0) > - return ARM_MAPPING_ERROR; > + return DMA_MAPPING_ERROR; > > which while it all makes sense in the context of this patch-series, I > *really* think it would have been so much nicer to return the error > code 'ret' instead (which in this case is -E2BIG). > > I don't think this is a huge deal, but ERR_PTR() has been hugely > successful elsewhere. And I'm not hugely convinced about all these > "any address can be valid" arguments. How the hell do you generate a > random dma address in the last page that isn't even page-aligned? kmalloc() a 64-byte buffer, dma_map_single() that buffer. If you have RAM that maps to a _bus_ address in the top page of 4GB of a 32-bit bus address, then you lose. Simples. Subsystems like I2C, SPI, USB etc all deal with small kmalloc'd buffers and their drivers make use of DMA. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: remove the ->mapping_error method from dma_map_ops V2
On Fri, Nov 23, 2018 at 02:03:13PM +0100, Joerg Roedel wrote:
> On Fri, Nov 23, 2018 at 11:01:55AM +0000, Russell King - ARM Linux wrote:
> > Yuck. So, if we have a 4GB non-PAE 32-bit system, or a PAE system
> > where we have valid memory across the 4GB boundary and no IOMMU,
> > we have to reserve the top 4K page in the first 4GB of RAM?
> 
> But that is only needed when dma_addr_t is 32bit anyway, no?

You said:

	But we can easily work around that by reserving the top 4k of the
	first 4GB of IOVA address space in the allocator, no? Then these
	values are never returned as valid DMA handles.

To me, your proposal equates to this in code:

	int dma_error(dma_addr_t addr)
	{
		return (addr & ~(dma_addr_t)0xfff) == 0xfffff000 ?
			(s32)addr : 0;
	}

Hence, the size of dma_addr_t would be entirely irrelevant. This makes dma_addr_t values in the range of 0xfffff000 to 0xffffffff special, irrespective of whether dma_addr_t is 32-bit or 64-bit.

If that's not what you meant, then I'm afraid your statement wasn't worded very well - so please can you re-word it to state more precisely what your proposal is?

> > Rather than inventing magic cookies like this, I'd much rather we
> > sanitised the API so that we have functions that return success or
> > an error code, rather than trying to shoe-horn some kind of magic
> > error codes into dma_addr_t and subtly break systems in the process.
> 
> Sure, but it has the obvious downside that we need to touch every driver
> that uses these functions, and there are probably a lot of drivers.

So we have two options:

- change the interface
- subtly break drivers

In any case, I disagree that we need to touch every driver. Today, drivers just do:

	if (dma_mapping_error(dev, addr))

and, because dma_mapping_error() returns a boolean, they only care about the true/falseness. If we're going to start returning error codes, then there's no point returning error codes unless we have some way for drivers to use them.

Yes, the simple way would be to make dma_mapping_error() translate addr to an error code, and that maintains compatibility with existing drivers.

If, instead, we revamped the DMA API, and had *new* mapping functions which returned an error code, then the existing mapping functions could be implemented as part of compatibility rather trivially:

	dma_addr_t dma_map_single(...)
	{
		dma_addr_t addr;
		int error;

		error = dma_map_single_error(..., &addr);
		if (error)
			addr = DMA_MAPPING_ERROR;
		return addr;
	}

which means that if drivers want access to the error code, they use dma_map_single_error(); meanwhile, existing drivers just get on with life as it _currently_ is, with the cookie-based all-ones error code - without introducing any of this potential breakage.

Remember, existing drivers would need modification in _any_ case to make use of a returned error code, so the argument that we'd need to touch every driver doesn't really stand up.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: remove the ->mapping_error method from dma_map_ops V2
On Fri, Nov 23, 2018 at 11:49:18AM +0100, Joerg Roedel wrote: > On Thu, Nov 22, 2018 at 05:52:15PM +, Robin Murphy wrote: > > Unfortunately, with things like the top-down IOVA allocator, and 32-bit > > systems in general, "the top 4095" values may well still be valid addresses > > - we're relying on a 1-byte mapping of the very top byte of memory/IOVA > > space being sufficiently ridiculous that no real code would ever do that, > > but even a 4-byte mapping of the top 4 bytes is within the realms of the > > plausible (I've definitely seen the USB layer make 8-byte mappings from any > > old offset within a page, for example). > > But we can easily work around that by reserving the top 4k of the first > 4GB of IOVA address space in the allocator, no? Then these values are > never returned as valid DMA handles. Yuck. So, if we have a 4GB non-PAE 32-bit system, or a PAE system where we have valid memory across the 4GB boundary and no IOMMU, we have to reserve the top 4K page in the first 4GB of RAM? Linus' IS_DMA_ERR() solution is way more preferable, but unfortunately it still falls short, because it knocks out the top 4K of every DMA capable bus. A DMA capable bus may involve some translation of the address (eg, by simple offsetting) which means that we'd need to potentially knock out multiple pages from the page allocator for each of those pages that correspond to the top 4K of any DMA capable bus. Knowing that at the right time at boot will be fun! It also sounds wasteful to me. Rather than inventing magic cookies like this, I'd much rather we sanitised the API so that we have functions that return success or an error code, rather than trying to shoe-horn some kind of magic error codes into dma_addr_t and subtly break systems in the process. 
Re: remove the ->mapping_error method from dma_map_ops V2
On Thu, Nov 22, 2018 at 09:55:25AM -0800, Linus Torvalds wrote:
> On Thu, Nov 22, 2018 at 9:52 AM Robin Murphy wrote:
> >
> > Unfortunately, with things like the top-down IOVA allocator, and 32-bit
> > systems in general, "the top 4095" values may well still be valid
> > addresses -
> 
> Ugh.
> 
> > The only immediate benefit I can see is that we could distinguish cases
> > like the first which can never possibly succeed, and thus callers could
> > actually give up instead of doing what various subsystems currently do
> > and retry the exact same mapping indefinitely on the apparent assumption
> > that errors must be transient.
> 
> No, the big immediate benefit of allowing "return -EINVAL" etc is
> simply legibility and error avoidance.
> 
> It's basically how pretty much all the rest of the kernel returns
> errors, so not only is it very obvious, it's also what people do
> without even thinking.. So it would be good to work.

An alternative idea would be to migrate away from the dma_map_single() and dma_map_page() interfaces that return a dma_addr_t, and instead have them return an error code or zero on success.

I've thought for some time that our DMA interfaces aren't particularly friendly, especially after we had the stuff with PCI DMA which migrated its way into the DMA API:

	DEFINE_DMA_UNMAP_ADDR
	DEFINE_DMA_UNMAP_LEN
	dma_unmap_*

When you couple that with the requirement that dma_unmap_*() should be called with the same parameters as dma_map_*(), I wonder why we never did:

	struct dma_map_state {
		dma_addr_t dma_addr;
		whatever's needed;
	};

	int dma_map_single(struct device *dev, struct dma_map_state *state,
			   void *cpu, size_t len, enum dma_data_direction dir);
	void dma_unmap_single(struct device *dev, struct dma_map_state *state);

Note the simpler unmap API, which inherently guarantees that the parameters to the map could be carried over to the unmap - without our many driver authors having to think about it.

That also paves the way for dma_map_single() to return an error code or zero. However, I fear that boat sailed long ago - but maybe it's worth thinking along these lines if we want to sanitise the API now?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: remove the ->mapping_error method from dma_map_ops V2
On Thu, Nov 22, 2018 at 08:50:47AM -0800, Linus Torvalds wrote:
> On Thu, Nov 22, 2018 at 6:03 AM Christoph Hellwig wrote:
> >
> > The 0 return doesn't work for direct mappings that have ram at address
> > zero and a lot of IOMMUs that start allocating bus space from address
> > zero, so we can't consolidate on that, but I think we can move everyone
> > to all-Fs, which the patch here does.
> 
> Hmm. Maybe not limit it to just all ones, but actually use the
> (standard, for the kernel) IS_ERR_VALUE()?
> 
> That basically reserves the last 4095 values in an unsigned long for
> error values.
> 
> Then those functions could actually return *what* error they
> encountered, using just plain
> 
> 	return -ENOMEM;
> 
> or whatever?

Linus,

I'm afraid that won't work very well - 32-bit platforms with 64-bit addresses (LPAE) would have dma_addr_t as a 64-bit value, which wouldn't fit into an unsigned long. IS_ERR_VALUE() would cast a 64-bit DMA address down to a 32-bit pointer (effectively masking with 0xffffffff). It would have the effect of making (eg) 0xXXXXXXXX_fffffVVV an error, where XXXXXXXX is any value of the top 32 bits of a 64-bit bus address, and VVV is the error code value. That could be a problem if you hit it in several places throughout your available RAM... we'd have to mark every top page of RAM in a naturally aligned 4GB as unusable, as well as block the top page in naturally aligned 4GB blocks from IOMMUs... and then what about buses that have weird offsets...

So, I don't think the IS_ERR_VALUE() would work very well.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: remove the ->mapping_error method from dma_map_ops V2
On Thu, Nov 22, 2018 at 09:09:47AM -0800, Linus Torvalds wrote: > On Thu, Nov 22, 2018 at 9:07 AM Russell King - ARM Linux > wrote: > > > > I'm afraid that won't work very well - 32 bit platforms with 64-bit > > addresses (LPAE) would have dma_addr_t as a 64-bit value, which > > wouldn't fit into an unsigned long. > > Good point. So we'd have to have a special IS_DMA_ERR() function that > takes a dma_addr_t and checks the same "is it the top 4095 values". No problem with that. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] of/platform: initialise AMBA default DMA masks
On Fri, Aug 31, 2018 at 10:26:14AM +0200, Linus Walleij wrote: > This addresses a v4.19-rc1 regression in the PL111 DRM driver > in drivers/gpu/pl111/* > > The driver uses the CMA KMS helpers and will thus at some > point call down to dma_alloc_attrs() to allocate a chunk > of contigous DMA memory for the framebuffer. > > It appears that in v4.18, it was OK that this (and other > DMA mastering AMBA devices) left dev->coherent_dma_mask > blank (zero). > > In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask) > in dma_alloc_attrs() in include/linux/dma-mapping.h is > triggered. The allocation later fails when get_coherent_dma_mask() > is called from __dma_alloc() and __dma_alloc() returns > NULL: > > drm-clcd-pl111 dev:20: coherent DMA mask is unset > drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] *ERROR* > Failed to set fbdev configuration > > I have not been able to properly bisect down to the actual > committ causing it beacuse the kernel simply fails to build > at to many bisection points, pushing the bisect back to places > like the merge of entire subsystem trees, where things have > likely been resolved while merging them. > > I found that Robin Murphy solved a similar problem in > a5516219b102 ("of/platform: Initialise default DMA masks") > by simply assigning dev.coherent_dma_mask and the > dev.dma_mask to point to the same when creating devices > from the device tree, and introducing the same code into > the code path creating AMBA/PrimeCell devices solved my > problem, graphics now come up. > > The code simply assumes that the device can access all > of the system memory by setting the coherent DMA mask > to 0x when creating a device from the device > tree, which is crude, but seems to be what kernel v4.18 > assumed. > > Cc: Russell King > Cc: Christoph Hellwig > Cc: Robin Murphy > Signed-off-by: Linus Walleij > --- > Russell, Christoph: this is probably not the last patch. > But it has a more accurate description of the problem. 
> I still do not know what change to the kernel made > it start triggering this, whether in the DMA API or > in DRM :( Doing some research, it seems the warning was added in v4.16-rc1, and it is only a warning - it doesn't otherwise change the behaviour, so it's not the actual warning that's the problem. I can't see anything in the DMA API nor DRM which would cause this either, but there are changes in the DT code. However, there are changes to of_dma_configure() which will be called just before the driver's probe function is called - and I guess the culprit is 4d8bde883bfb ("OF: Don't set default coherent DMA mask"). So I guess that's the root cause of the problem, and indeed this change was made in 4.19-rc1. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up According to speedtest.net: 13Mbps down 490kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
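To make the failure mode concrete: below is a rough, self-contained sketch of the situation being described. These are toy structs and helper names (not the real kernel API) modelling the guard that trips in dma_alloc_attrs(): once 4d8bde883bfb stopped the OF code from installing a default coherent mask, a device whose driver never negotiated one is left with zero, and coherent allocations cannot succeed.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Toy stand-in for the two DMA masks carried by struct device. */
struct toy_dev {
	uint64_t coherent_dma_mask;
	uint64_t *dma_mask;
};

/* Models the check guarded by WARN_ON_ONCE() in dma_alloc_attrs():
 * an unset (zero) coherent mask means no coherent allocation can
 * possibly be satisfied. */
static int toy_coherent_alloc_possible(const struct toy_dev *dev)
{
	return dev->coherent_dma_mask != 0;
}
```

With the mask left blank (as for the PL111 after the OF change), the check fails; once someone (bus code or driver) installs a 32-bit mask, it passes.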
Re: [PATCH] of/platform: Initialise AMBA default DMA masks
On Thu, Aug 30, 2018 at 10:45:11AM +0200, Linus Walleij wrote: > On Thu, Aug 30, 2018 at 10:40 AM Russell King - ARM Linux > wrote: > > > Well, as I've no idea what the issue is here, I can't do anything or > > make any suggestions. I wasn't copied on the initial part of the > > thread. > > Sorry about that, it was because the original patch only hit in > drivers/of/*. > > I will send you a copy of the thread. Thanks. From the original patch description: > commit a5516219b102 ("of/platform: Initialise default DMA masks") > sets up the coherent_dma_mask of platform devices created > from the device tree, but fails to do the same for AMBA > (PrimeCell) devices. > > This leads to a regression in kernel v4.19-rc1 triggering the > WARN_ONCE() in kernel/dma/coherent.c, dma_alloc_attrs() > WARN_ON_ONCE(dev && !dev->coherent_dma_mask): This description is very misleading. It makes it sound like commit a5516219b102 caused the problem. Maybe someone would like to explain how that can be the case - this commit just touches the DT platform device code, and thus can not itself cause this regression. Second reason it is misleading is that it claims that there is a regression in 4.19-rc1 with a WARN in kernel/dma/coherent.c. Looking at Linus' tip, kernel/dma/coherent.c does not contain any WARNings except for one to do with dma_reserved_default_memory. Looking at the history of that file in Linus' tree, there is only one commit which touches this file, and that is Christoph's commit which creates it. So, this patch description needs much improvement, and AFAICS there is no regression in Linus' kernel. There may be a regression in -next, but that is not as urgent as if it were in Linus' tree - iow, we have time to fix this properly. Now, from the AMBA perspective, we do not want every AMBA device to appear to be DMA capable - we only want the ones which are to be so. 
From the backtrace, it seems to be a PL111 causing the issue - this device has an AXI bus interface as well as an APB bus interface, so is one of the few that are capable of DMA. Its only restriction is that it is limited to 32 bits of DMA address, and it doesn't care about Linux's distinction between coherent and streaming DMA. So, I'd suggest that we arrange for the DT code to setup the DMA mask for the device only if it has an appropriate property (dma-ranges?), and we don't need to re-introduce the separate mask for coherent DMA. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up According to speedtest.net: 13Mbps down 490kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
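The proposal above - masks only for devices that declare DMA capability - can be sketched as follows. This is a toy model, not kernel code: the has_dma_ranges flag stands in for parsing an actual dma-ranges DT property, and the helper name is made up.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Toy AMBA device: models only the fields relevant here. */
struct toy_amba_dev {
	int has_dma_ranges;		/* stand-in for a dma-ranges DT property */
	uint64_t coherent_dma_mask;
	uint64_t *dma_mask;
};

/* Sketch of the suggestion: only devices that declare DMA capability
 * in the DT get a default 32-bit mask; pure APB slaves are left with
 * no mask at all, so they never look DMA-capable. The single shared
 * mask reflects that AMBA doesn't distinguish coherent/streaming. */
static void toy_amba_dma_configure(struct toy_amba_dev *dev)
{
	if (!dev->has_dma_ranges)
		return;			/* not a bus master: no DMA mask */
	dev->coherent_dma_mask = DMA_BIT_MASK(32);
	dev->dma_mask = &dev->coherent_dma_mask;
}
```

An APB-only peripheral (say a PL011 UART) ends up with no mask; an AXI master such as the PL111 gets the 32-bit default matching its addressing restriction.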
Re: [PATCH] of/platform: Initialise AMBA default DMA masks
On Wed, Aug 29, 2018 at 07:55:21AM +0200, Christoph Hellwig wrote: > On Tue, Aug 28, 2018 at 03:14:14PM +0100, Russell King - ARM Linux wrote: > > But yes, the fundamental fact is that AMBA devices don't have any > > care about the differences between coherent and streaming DMA. The > > distinction that we make in the kernel is purely a software one when > > it comes to these devices. > > > > Most AMBA devices themselves are not DMA capable, as they are only > > connected to the APB (Amba peripheral bus) and they rely on a > > separate DMA engine for their DMA. APB devices should not have DMA > > masks - their DMA capabilities are entirely down to the DMA controller. > > So, the majority of AMBA devices should not have any DMA masks. > > > > Only those connected to a bus that they can master on (eg AXI) should > > have DMA masks - things like the PL08x DMA controllers, PL11x LCD > > controllers, etc. As I've said above, there is no difference between > > streaming and coherent DMA for these devices. > > So for now I plan to apply the patch from Linus to just set a dma > mask, as that gets back the previous behavior where dma did just > work (as it did without a mask). NAK on that at the moment. > But if Linus, you or someone else familiar with amba would like to > add an explicit opt-in into dma support eventually that would be > even better. Well, as I've no idea what the issue is here, I can't do anything or make any suggestions. I wasn't copied on the initial part of the thread. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up According to speedtest.net: 13Mbps down 490kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] of/platform: Initialise AMBA default DMA masks
On Tue, Aug 28, 2018 at 03:25:55PM +0200, Linus Walleij wrote: > On Tue, Aug 28, 2018 at 11:21 AM Christoph Hellwig wrote: > > > > + dev->dev.coherent_dma_mask = DMA_BIT_MASK(32); > > > + if (!dev->dev.dma_mask) > > > + dev->dev.dma_mask = &dev->dev.coherent_dma_mask; > > > > We should never set dma_mask to point to the coherent_dma_mask, > > as that will cause problems with devices that have different > > mask for the two. > > > > How about something like this? > (...) > > + u64 dma_mask; > > We did that before, so this becomes effectively a revert of: > commit 446b2a9380b64b9d7410d86ee8226031e03645cf > > DMA-API: amba: get rid of separate dma_mask > > AMBA Primecell devices always treat streaming and coherent DMA exactly > the same, so there's no point in having the masks separated. > > So there is some history around this. > > There is also some code in drivers/amba/bus.c affected by that > and I need to look over all static amba device allocations if we > reintroduce this. I don't have the rest of this thread to read... But yes, the fundamental fact is that AMBA devices don't have any care about the differences between coherent and streaming DMA. The distinction that we make in the kernel is purely a software one when it comes to these devices. Most AMBA devices themselves are not DMA capable, as they are only connected to the APB (Amba peripheral bus) and they rely on a separate DMA engine for their DMA. APB devices should not have DMA masks - their DMA capabilities are entirely down to the DMA controller. So, the majority of AMBA devices should not have any DMA masks. Only those connected to a bus that they can master on (eg AXI) should have DMA masks - things like the PL08x DMA controllers, PL11x LCD controllers, etc. As I've said above, there is no difference between streaming and coherent DMA for these devices. 
-- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up According to speedtest.net: 13Mbps down 490kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 7/7] OF: Don't set default coherent DMA mask
On Fri, Jul 27, 2018 at 12:36:00PM +0100, Robin Murphy wrote: > That is intentional. Consider a 32-bit device on a bus with an upstream DMA > range of 40 bits (which could easily happen with e.g. PCI). If the firmware > code gives that device 40-bit DMA masks and the driver doesn't change them, > then DMA is not going to work correctly. Therefore the firmware code cannot > safely make {coherent_}dma_mask larger than the default value passed in, and > if the default is 0 then that should be fixed in whatever code created the > device, not worked around later. I think you're missing the point. When OF creates platform devices, they are created without any DMA masks what so ever. It is only during device probe that dma_configure() gets called, which then calls of_dma_configure(), where the DMA mask for any DT declared device is setup. What this means is that your change has broken all existing DT devices that are trying to use DMA, because they now end up with a zero coherent DMA mask and/or streaming DMA mask. Your original idea behind the patch may be a sound one, but since the implementation has caused a regression, the implementation is obviously broken, and that needs fixing or reworking in some way. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up According to speedtest.net: 13Mbps down 490kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()
On Mon, Jul 02, 2018 at 01:53:17PM +0200, Thierry Reding wrote: > On Wed, May 30, 2018 at 04:06:24PM +0200, Thierry Reding wrote: > > From: Thierry Reding > > > > Instead of setting the DMA ops pointer to NULL, set the correct, > > non-IOMMU ops depending on the device's coherency setting. > > > > Signed-off-by: Thierry Reding > > --- > > Changes in v4: > > - new patch to fix existing arm_iommu_detach_device() to do what we need > > > > arch/arm/mm/dma-mapping.c | 12 ++-- > > 1 file changed, 6 insertions(+), 6 deletions(-) > > Christoph, Russell, > > could either of you provide an Acked-by for this? I think it makes the > most sense for Ben to pick this up into the Nouveau tree along with > patch 2/2. Looks fine to me. Acked-by: Russell King Thanks. > > Thierry > > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > > index af27f1c22d93..87a0037574e4 100644 > > --- a/arch/arm/mm/dma-mapping.c > > +++ b/arch/arm/mm/dma-mapping.c > > @@ -1151,6 +1151,11 @@ int arm_dma_supported(struct device *dev, u64 mask) > > return __dma_supported(dev, mask, false); > > } > > > > +static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent) > > +{ > > + return coherent ? 
&arm_coherent_dma_ops : &arm_dma_ops; > > +} > > + > > #ifdef CONFIG_ARM_DMA_USE_IOMMU > > > > static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long > > attrs) > > @@ -2296,7 +2301,7 @@ void arm_iommu_detach_device(struct device *dev) > > iommu_detach_device(mapping->domain, dev); > > kref_put(&mapping->kref, release_iommu_mapping); > > to_dma_iommu_mapping(dev) = NULL; > > - set_dma_ops(dev, NULL); > > + set_dma_ops(dev, arm_get_dma_map_ops(dev->archdata.dma_coherent)); > > > > pr_debug("Detached IOMMU controller from %s device.\n", dev_name(dev)); > > } > > @@ -2357,11 +2362,6 @@ static void arm_teardown_iommu_dma_ops(struct device > > *dev) { } > > > > #endif /* CONFIG_ARM_DMA_USE_IOMMU */ > > > > -static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent) > > -{ > > - return coherent ? &arm_coherent_dma_ops : &arm_dma_ops; > > -} > > - > > void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size, > > const struct iommu_ops *iommu, bool coherent) > > { > > -- > > 2.17.0 > > -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)
On Fri, May 18, 2018 at 01:35:08PM -0700, Vineet Gupta wrote: > On 05/18/2018 10:50 AM, Russell King - ARM Linux wrote: > >On Fri, May 18, 2018 at 10:20:02AM -0700, Vineet Gupta wrote: > >>I never understood the need for this direction. And if memory serves me > >>right, at that time I was seeing twice the amount of cache flushing ! > >It's necessary. Take a moment to think carefully about this: > > > > dma_map_single(, dir) > > > > dma_sync_single_for_cpu(, dir) > > > > dma_sync_single_for_device(, dir) > > > > dma_unmap_single(, dir) > > As an aside, do these imply a state machine of sorts - does a driver needs > to always call map_single first ? Kind-of, but some drivers do omit some of the dma_sync_*() calls. For example, if a buffer is written to, then mapped with TO_DEVICE, and then the CPU wishes to write to it, it's fairly common that a driver omits the dma_sync_single_for_cpu() call. If you think about the cases I gave and what cache operations happen, such a scenario practically turns out to be safe. > My original point of contention/confusion is the specific combinations of > API and direction, specifically for_cpu(TO_DEV) and for_device(TO_CPU) Remember that it is expected that all calls for a mapping use the same direction argument while that mapping exists. In other words, if you call dma_map_single(TO_DEVICE) and then use any of the other functions, the other functions will also use TO_DEVICE. The DMA direction argument describes the direction of the DMA operation being performed on the buffer, not on the individual dma_* operation. What isn't expected at arch level is for drivers to do: dma_map_single(TO_DEVICE) dma_sync_single_for_cpu(FROM_DEVICE) or vice versa. > Semantically what does dma_sync_single_for_cpu(TO_DEV) even imply for a non > dma coherent arch. > > Your tables below have "none" for both, implying it is unlikely to be a real > combination (for ARM and ARC atleast). 
Very little for the cases that I've stated (and as I mentioned above, some drivers do omit the call in that case.) > The other case, actually @dir TO_CPU, independent of for_{cpu, device} > implies driver intends to touch it after the call, so it would invalidate > any stray lines, unconditionally (and not just for speculative prefetch > case). If you don't have a CPU that speculatively prefetches, and you've already had to invalidate the cache lines (to avoid write-backs corrupting DMA'd data) then there's no need for the architecture to do any work at the for_cpu(TO_CPU) case - the CPU shouldn't be touching cache lines that are part of the buffer while it is mapped, which means a non-speculating CPU won't pull in any cache lines without an explicit access. Speculating CPUs are different. The action of the speculation is to try and guess what data the program wants to access ahead of the program flow. That causes the CPU to prefetch data into the cache. The point in the program flow that this happens is not really determinant to the programmer. This means that if you try to read from the DMA buffer after the DMA operation has completed without invalidating the cache between the DMA completing and the CPU reading, you have no guarantee that you're reading the data that the DMA operation wrote. The cache may have loaded itself with data before the DMA operation completed, and the CPU may see that stale data. The difference between non-speculating CPUs and speculating CPUs is that for non-speculating CPUs, caches work according to explicit accesses by the program, and the program is stalled while the data is fetched from external memory. Speculating CPUs try to predict ahead of time what data the program will require in the future, and attempt to load that data into the caches _before_ the program requires it - which means that the program suffers fewer stalls. 
> >In the case of a DMA-incoherent architecture, the operations done at each
> >stage depend on the direction argument:
> >
> >          map         for_cpu      for_device   unmap
> > TO_DEV   writeback   none         writeback    none
> > TO_CPU   invalidate  invalidate*  invalidate   invalidate*
> > BIDIR    writeback   invalidate   writeback    invalidate
> >
> >* - only necessary if the CPU speculatively prefetches.
> >
> >The multiple invalidations for the TO_CPU case handle different
> >conditions that can result in data corruption, and for some CPUs, all
> >four are necessary.
>
> Can you please explain in some more detail, TO_CPU row, why invalidate is
> conditional sometimes.

See above - I hope my explanation above is sufficient.
Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)
On Fri, May 18, 2018 at 07:57:34PM +, Alexey Brodkin wrote: > Hi Russel, That's Russell. > On Fri, 2018-05-18 at 18:50 +0100, Russell King - ARM Linux wrote: > > It's necessary. Take a moment to think carefully about this: > > > > dma_map_single(, dir) > > > > dma_sync_single_for_cpu(, dir) > > > > dma_sync_single_for_device(, dir) > > > > dma_unmap_single(, dir) > > > > In the case of a DMA-incoherent architecture, the operations done at each > > stage depend on the direction argument:
> >
> >          map         for_cpu      for_device   unmap
> > TO_DEV   writeback   none         writeback    none
> > TO_CPU   invalidate  invalidate*  invalidate   invalidate*
> > BIDIR    writeback   invalidate   writeback    invalidate
> >
> > * - only necessary if the CPU speculatively prefetches.
> I think invalidation of DMA buffer is required on for_cpu(TO_CPU) even > if CPU doesn't prefetch - what if we reuse the same buffer for multiple > reads from DMA device? That's fine - for non-coherent DMA, the CPU caches will only end up containing data for that memory if: - the CPU speculatively fetches data from that memory, or - the CPU explicitly touches that memory > > The multiple invalidations for the TO_CPU case handle different > > conditions that can result in data corruption, and for some CPUs, all > > four are necessary. > I would agree that map()/unmap() a quite a special cases and so depending > on direction we need to execute in them either for_cpu() or for_device() > call-backs depending on direction. > > As for invalidation in case of for_device(TO_CPU) I still don't see > a rationale behind it. Would be interesting to see a real example where > we benefit from this. Yes, you could avoid that, but depending how you structure the architecture implementation, it can turn out to be a corner case. 
The above table is precisely how 32-bit ARM is implemented, because the way we implement them is based on who owns the memory - the "map" and "for_device" operations translate internally to a cpu-to-device ownership transition of the buffer. Similar for "unmap" and "to_cpu". It basically avoids having to add additional functions at the lower implementation levels. > > Things get more interesting if the implementation behind the DMA API has > > to copy data between the buffer supplied to the mapping and some DMA > > accessible buffer:
> >
> >          map          for_cpu      for_device   unmap
> > TO_DEV   copy to dma  none         copy to dma  none
> > TO_CPU   none         copy to cpu  none         copy to cpu
> > BIDIR    copy to dma  copy to cpu  copy to dma  copy to cpu
> >
> > So, in both cases, the value of the direction argument defines what you > > need to do in each call. > Interesting enough in your second table (which describes more complicated > case indeed) you set "none" for for_device(TO_CPU) which looks logical > to me. > > So IMHO that's what make sense:
> --->8---
>          map         for_cpu      for_device   unmap
> TO_DEV   writeback   none         writeback    none
> TO_CPU   none        invalidate   none         invalidate*
> BIDIR    writeback   invalidate   writeback    invalidate*
> --->8---
That doesn't make sense for the TO_CPU case. If the caches contain dirty cache lines, and you're DMAing from the device to the system RAM, other system activity can cause the dirty cache lines to be evicted (written back) to memory which the DMA has already overwritten. The result is data corruption. So, you really can't have "none" in the "map" case there. Given that, the "for_cpu" case becomes dependent on whether the CPU speculatively prefetches. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
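The ownership-transition structure described above - "map" and "for_device" as a cpu-to-device transition, "unmap" and "for_cpu" as the reverse - can be sketched as two small functions that together reproduce the cache-op table. This is toy code, not the kernel implementation; the speculates flag stands in for the CPU capability check.

```c
#include <assert.h>

enum dma_dir { TO_DEV, TO_CPU, BIDIR };
enum cache_op { OP_NONE, OP_WRITEBACK, OP_INVALIDATE };

/* cpu -> device ownership transition: used by both map and for_device.
 * Anything the CPU may have written must be written back before the
 * device reads it; for a buffer the device will write (TO_CPU), dirty
 * lines must instead be discarded so a later eviction cannot land on
 * top of the DMA'd data. */
static enum cache_op cpu_to_dev(enum dma_dir dir)
{
	return (dir == TO_CPU) ? OP_INVALIDATE : OP_WRITEBACK;
}

/* device -> cpu ownership transition: used by both unmap and for_cpu. */
static enum cache_op dev_to_cpu(enum dma_dir dir, int speculates)
{
	if (dir == TO_DEV)
		return OP_NONE;		/* device only read the buffer */
	if (dir == TO_CPU && !speculates)
		return OP_NONE;		/* lines were discarded at map time */
	return OP_INVALIDATE;		/* drop speculatively-fetched lines */
}
```

Checking each (stage, direction) pair against the table shows why only two primitives are needed at the lower implementation levels.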
Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)
On Fri, May 18, 2018 at 10:20:02AM -0700, Vineet Gupta wrote: > I never understood the need for this direction. And if memory serves me > right, at that time I was seeing twice the amount of cache flushing ! It's necessary. Take a moment to think carefully about this:

	dma_map_single(, dir)

	dma_sync_single_for_cpu(, dir)

	dma_sync_single_for_device(, dir)

	dma_unmap_single(, dir)

In the case of a DMA-incoherent architecture, the operations done at each stage depend on the direction argument:

	         map         for_cpu      for_device   unmap
	TO_DEV   writeback   none         writeback    none
	TO_CPU   invalidate  invalidate*  invalidate   invalidate*
	BIDIR    writeback   invalidate   writeback    invalidate

	* - only necessary if the CPU speculatively prefetches.

The multiple invalidations for the TO_CPU case handle different conditions that can result in data corruption, and for some CPUs, all four are necessary. This is what is implemented for 32-bit ARM, depending on the CPU capabilities, as we have DMA incoherent devices and we have CPUs that speculatively prefetch data, and so may load data into the caches while DMA is in operation. Things get more interesting if the implementation behind the DMA API has to copy data between the buffer supplied to the mapping and some DMA accessible buffer:

	         map          for_cpu      for_device   unmap
	TO_DEV   copy to dma  none         copy to dma  none
	TO_CPU   none         copy to cpu  none         copy to cpu
	BIDIR    copy to dma  copy to cpu  copy to dma  copy to cpu

So, in both cases, the value of the direction argument defines what you need to do in each call. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
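The second table - where the implementation must copy between the caller's buffer and a DMA-accessible one - is essentially what a bounce-buffer (swiotlb-style) scheme does. A toy model, with plain memcpy standing in for the real machinery and fixed-size buffers for simplicity:

```c
#include <assert.h>
#include <string.h>

enum bdir { B_TO_DEV, B_TO_CPU, B_BIDIR };

/* Toy bounce pair: 'cpu' is the caller's buffer, 'dma' the
 * device-visible copy. */
struct bounce {
	char cpu[16];
	char dma[16];
};

/* map / for_device: TO_DEV and BIDIR copy the CPU's data to the
 * device-visible buffer; TO_CPU has nothing to publish. */
static void bounce_to_device(struct bounce *b, enum bdir dir)
{
	if (dir != B_TO_CPU)
		memcpy(b->dma, b->cpu, sizeof(b->dma));
}

/* unmap / for_cpu: TO_CPU and BIDIR copy the device's data back;
 * TO_DEV has nothing to retrieve. */
static void bounce_to_cpu(struct bounce *b, enum bdir dir)
{
	if (dir != B_TO_DEV)
		memcpy(b->cpu, b->dma, sizeof(b->cpu));
}
```

Walking a buffer through map, a device write, and unmap shows the direction argument selecting exactly the copies in the table.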
Re: Difference between IOVA and bus address when SMMU is enabled
On Sat, May 12, 2018 at 06:25:13PM +0530, valmiki wrote: > Hi All, > > What is the difference between IOVA address and bus address > when SMMU is enabled ? > > Is IOVA address term used only when hypervisor is present ? IOVA = IO virtual address. IOVA is the term normally used to describe the address used on the _device_ side of an IOMMU. For any general setup:

	RAM --- MMU --- DEVICE
	     ^       ^
	 physical   virtual
	 address    address

where "device" can be an IO device or a CPU, the terms still apply. If you have something like this:

	RAM --- PCI bridge --- MMU --- DEVICE
	     ^              ^       ^
	 physical          bus     virtual
	 address         address   address

You could also have (eg, in the case of a system MMU):

	RAM --- MMU --- PCI bridge --- DEVICE
	     ^       ^              ^
	 physical   virtual        bus
	 address    address        address
	            (this can also be
	             considered a bus
	             address!)

In both of the above two cases, the PCI bridge may perform some address translation, meaning that the bus address is different from the address seen on the other side of the bridge. So, the terms used depend exactly on the overall bus topology. In the case of a system MMU, where the system MMU sits between peripheral devices and RAM, then the bus addresses are the same as the _IOVA of the system MMU_. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
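The "RAM - PCI bridge - MMU - DEVICE" case can be made concrete with a toy translation chain. All offsets here are hypothetical - real systems discover bridge windows from firmware, and a real IOMMU uses page tables rather than a fixed offset - but it shows the three distinct address spaces the device's access passes through.

```c
#include <assert.h>
#include <stdint.h>

/* Toy topology: DEVICE -> MMU -> PCI bridge -> RAM.
 * The device issues an IOVA; the IOMMU emits a bus address; the
 * bridge's window offset yields the physical address in RAM. */

/* Hypothetical single fixed mapping standing in for IOMMU page tables. */
static uint64_t iommu_translate(uint64_t iova)
{
	return iova + 0x80000000ULL;
}

/* Hypothetical bridge window: bus address minus a fixed offset. */
static uint64_t bridge_translate(uint64_t bus_addr)
{
	return bus_addr - 0x40000000ULL;
}
```

A single device access then sees three different numbers for the same memory, which is exactly why the terminology matters.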
Re: [PATCH 04/20] arm-nommu: use generic dma_noncoherent_ops
On Fri, May 11, 2018 at 09:59:29AM +0200, Christoph Hellwig wrote: > Switch to the generic noncoherent direct mapping implementation for > the nommu dma map implementation. > > Signed-off-by: Christoph Hellwig > --- > arch/arc/Kconfig| 1 + > arch/arm/Kconfig| 4 + > arch/arm/mm/dma-mapping-nommu.c | 139 +--- > 3 files changed, 23 insertions(+), 121 deletions(-) > > diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig > index 89d47eac18b2..3a492a9aeaad 100644 > --- a/arch/arc/Kconfig > +++ b/arch/arc/Kconfig > @@ -9,6 +9,7 @@ > config ARC > def_bool y > select ARC_TIMERS > + select ARCH_HAS_SYNC_DMA_FOR_DEVICE > select ARCH_HAS_SYNC_DMA_FOR_CPU > select ARCH_HAS_SYNC_DMA_FOR_DEVICE > select ARCH_HAS_SG_CHAIN > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index c43f5bb55ac8..76ddd0064f87 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -12,6 +12,8 @@ config ARM > select ARCH_HAS_PHYS_TO_DMA > select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL > select ARCH_HAS_STRICT_MODULE_RWX if MMU > + select ARCH_HAS_SYNC_DMA_FOR_CPU if !MMU > + select ARCH_HAS_SYNC_DMA_FOR_DEVICE if !MMU > select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST > select ARCH_HAVE_CUSTOM_GPIO_H > select ARCH_HAS_GCOV_PROFILE_ALL > @@ -27,6 +29,8 @@ config ARM > select CPU_PM if (SUSPEND || CPU_IDLE) > select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS > select DMA_DIRECT_OPS if !MMU > + select DMA_NONCOHERENT_OPS if !MMU > + select DMA_NONCOHERENT_MMAP if !MMU > select EDAC_SUPPORT > select EDAC_ATOMIC_SCRUB > select GENERIC_ALLOCATOR > diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c > index f448a0663b10..a74ed6632982 100644 > --- a/arch/arm/mm/dma-mapping-nommu.c > +++ b/arch/arm/mm/dma-mapping-nommu.c > @@ -12,6 +12,7 @@ > #include > #include > #include > +#include > #include > > #include > @@ -26,18 +27,16 @@ > * - MMU/MPU is off > * - cpu is v7m w/o cache support > * - device is coherent > - * otherwise arm_nommu_dma_ops is 
used. > + * otherwise dma_noncoherent_ops is used. > * > - * arm_nommu_dma_ops rely on consistent DMA memory (please, refer to > + * dma_noncoherent_ops rely on consistent DMA memory (please, refer to > * [1] on how to declare such memory). > * > * [1] Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt > */ > > -static void *arm_nommu_dma_alloc(struct device *dev, size_t size, > - dma_addr_t *dma_handle, gfp_t gfp, > - unsigned long attrs) > - > +void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, > + gfp_t gfp, unsigned long attrs) > { > void *ret; > > @@ -65,9 +64,8 @@ static void *arm_nommu_dma_alloc(struct device *dev, size_t > size, > return ret; > } > > -static void arm_nommu_dma_free(struct device *dev, size_t size, > -void *cpu_addr, dma_addr_t dma_addr, > -unsigned long attrs) > +void arch_dma_free(struct device *dev, size_t size, void *cpu_addr, > + dma_addr_t dma_addr, unsigned long attrs) > { > if (attrs & DMA_ATTR_NON_CONSISTENT) { > dma_direct_free(dev, size, cpu_addr, dma_addr, attrs); > @@ -81,9 +79,9 @@ static void arm_nommu_dma_free(struct device *dev, size_t > size, > return; > } > > -static int arm_nommu_dma_mmap(struct device *dev, struct vm_area_struct *vma, > - void *cpu_addr, dma_addr_t dma_addr, size_t size, > - unsigned long attrs) > +int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma, > + void *cpu_addr, dma_addr_t dma_addr, size_t size, > + unsigned long attrs) > { > int ret; > > @@ -93,9 +91,8 @@ static int arm_nommu_dma_mmap(struct device *dev, struct > vm_area_struct *vma, > return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); > } > > - > -static void __dma_page_cpu_to_dev(phys_addr_t paddr, size_t size, > - enum dma_data_direction dir) > +void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr, > + size_t size, enum dma_data_direction dir) Please no. 
There is a lot of history of these (__dma_page_cpu_to_dev etc) functions being abused by out of tree drivers, because they think they know better. This is stopped by making them static and ensuring that drivers have no access to these functions. Please do not re-expose these to the global kernel. While it may make things easier from a cross-architecture point of view, it makes it a lot easier for people to abuse these private APIs. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up
Re: noveau vs arm dma ops
(While there's a rain shower...) On Thu, Apr 26, 2018 at 02:09:42AM -0700, Christoph Hellwig wrote: > synopsis: > > drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c:pdevinfo.dma_mask > = DMA_BIT_MASK(32); > drivers/gpu/drm/bridge/synopsys/dw-hdmi.c: pdevinfo.dma_mask = > DMA_BIT_MASK(32); This is for the AHB audio driver, and is a correct default on two counts: 1. It is historically usual to initialise DMA masks to 32-bit, and leave it to the device drivers to negotiate via the DMA mask functions if they wish to use higher orders of bits. 2. The AHB audio hardware in the HDMI block only supports 32-bit addresses. What I've missed from the AHB audio driver is calling the DMA mask functions... oops. Patch below. > drivers/gpu/drm/bridge/synopsys/dw-hdmi.c: pdevinfo.dma_mask = > DMA_BIT_MASK(32); This is for the I2S audio driver, and I suspect is wrong - I doubt that the I2S sub-device itself does any DMA what so ever. 8<=== From: Russell King Subject: drm: bridge: dw-hdmi: Negotiate dma mask with DMA API DMA drivers are supposed to negotiate the DMA mask with the DMA API, but this was missing from the AHB audio driver. Add the necessary call. 
Signed-off-by: Russell King
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
index cf3f0caf9c63..16c45b6cd6af 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
@@ -539,6 +539,10 @@ static int snd_dw_hdmi_probe(struct platform_device *pdev)
 	unsigned revision;
 	int ret;
 
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+	if (ret)
+		return ret;
+
 	writeb_relaxed(HDMI_IH_MUTE_AHBDMAAUD_STAT0_ALL,
 		       data->base + HDMI_IH_MUTE_AHBDMAAUD_STAT0);
 	revision = readb_relaxed(data->base + HDMI_REVISION_ID);

-- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: noveau vs arm dma ops
On Wed, Apr 25, 2018 at 11:35:13PM +0200, Daniel Vetter wrote: > On arm that doesn't work. The iommu api seems like a good fit, except > the dma-api tends to get in the way a bit (drm/msm apparently has > similar problems like tegra), and if you need contiguous memory > dma_alloc_coherent is the only way to get at contiguous memory. There > was a huge discussion years ago about that, and direct cma access was > shot down because it would have exposed too much of the caching > attribute mangling required (most arm platforms need wc-pages to not > be in the kernel's linear map apparently). I think you completely misunderstand ARM from what you've written above, and this worries me greatly about giving DRM the level of control that is being asked for. Modern ARMs have a PIPT cache or a non-aliasing VIPT cache, and cache attributes are stored in the page tables. These caches are inherently non-aliasing when there are multiple mappings (which is a great step forward compared to the previous aliasing caches.) As the cache attributes are stored in the page tables, this in theory allows different virtual mappings of the same physical memory to have different cache attributes. However, there's a problem, and that's called speculative prefetching. Let's say you have one mapping which is cacheable, and another that is marked as write combining. If a cache line is speculatively prefetched through the cacheable mapping of this memory, and then you read the same physical location through the write combining mapping, it is possible that you could read cached data. So, it is generally accepted that all mappings of any particular physical bit of memory should have the same cache attributes to avoid unpredictable behaviour. This presents a problem with what is generally called "lowmem" where the memory is mapped in kernel virtual space with cacheable attributes. It can also happen with highmem if the memory is kmapped. 
This is why, on ARM, you can't use something like get_free_pages() to grab some pages from the system, pass it to the GPU, map it into userspace as write-combining, etc. It _might_ work for some CPUs, but ARM CPUs vary in how much prefetching they do, and what may work for one particular CPU is in no way guaranteed to work for another ARM CPU. The official line from architecture folk is to assume that the caches infinitely speculate, are of infinite size, and can writeback *dirty* data at any moment.

The way to stop things like speculative prefetches to particular physical memory is to, quite "simply", not have any cacheable mappings of that physical memory anywhere in the system.

Now, cache flushes on ARM tend to be fairly expensive for GPU buffers. If you have, say, an 8MB buffer (for a 1080p frame) and you need to do a cache operation on that buffer, you'll be iterating over it 32 or maybe 64 bytes at a time "just in case" there's a cache line present. Refer back to my previous email, where I detailed the potential need for _two_ flushes, one before the GPU operation and one after, and this becomes _really_ expensive. At that point, you're probably way better off using write-combine memory where you don't need to spend CPU cycles performing cache flushing - potentially across all CPUs in the system if cache operations aren't broadcast.

This isn't a simple matter of "just provide some APIs for cache operations" - there's much more that needs to be understood by all parties here, especially when we have GPU drivers that can be used with quite different CPUs. It may well be that for some combinations of CPUs and workloads, it's better to use write-combine memory without cache flushing, but for other CPUs that tradeoff (for the same workload) could well be different.

Older ARMs get more interesting, because they have aliasing caches.
That means the CPU cache aliases across different virtual space mappings in some way, which complicates (a) the mapping of memory and (b) handling the cache operations on it.

It's too late for me to go into that tonight, and I probably won't be reading mail for the next week and a half, sorry.
Re: noveau vs arm dma ops
On Wed, Apr 25, 2018 at 08:33:12AM -0700, Christoph Hellwig wrote: > On Wed, Apr 25, 2018 at 12:04:29PM +0200, Daniel Vetter wrote: > > - dma api hides the cache flushing requirements from us. GPUs love > > non-snooped access, and worse give userspace control over that. We want > > a strict separation between mapping stuff and flushing stuff. With the > > IOMMU api we mostly have the former, but for the later arch maintainers > > regularly tells they won't allow that. So we have drm_clflush.c. > > The problem is that a cache flushing API entirely separate is hard. That > being said if you look at my generic dma-noncoherent API series it tries > to move that way. So far it is in early stages and apparently rather > buggy unfortunately. And if folk want a cacheable mapping with explicit cache flushing, the cache flushing must not be defined in terms of "this is what CPU seems to need" but from the point of view of a CPU with infinite prefetching, infinite caching and infinite capacity to perform writebacks of dirty cache lines at unexpected moments when the memory is mapped in a cacheable mapping. (The reason for that is you're operating in a non-CPU specific space, so you can't make any guarantees as to how much caching or prefetching will occur by the CPU - different CPUs will do different amounts.) So, for example, the sequence: GPU writes to memory CPU reads from cacheable memory if the memory was previously dirty (iow, CPU has written), you need to flush the dirty cache lines _before_ the GPU writes happen, but you don't know whether the CPU has speculatively prefetched, so you need to flush any prefetched cache lines before reading from the cacheable memory _after_ the GPU has finished writing. 
Also note that "flush" there can be "clean the cache", "clean and invalidate the cache" or "invalidate the cache" as appropriate - some CPUs are able to perform those three operations, and the appropriate one depends not only on where in the above sequence it's being used, but also on what the operations are. So, the above sequence could be:

	CPU invalidates cache for memory (due to possible dirty cache lines)
	GPU writes to memory
	CPU invalidates cache for memory (to get rid of any speculatively prefetched lines)
	CPU reads from cacheable memory

Yes, in the above case, _two_ cache operations are required to ensure correct behaviour. However, if you know for certain that the memory was previously clean, then the first cache operation can be skipped.

What I'm pointing out is there's much more than just "I want to flush the cache" here, which is what DRM currently seems to assume with the code in drm_cache.c.

If we can agree a set of interfaces that allows _proper_ use of these facilities, one which can be used appropriately, then there shouldn't be a problem. The DMA API does that via its ideas about who owns a particular buffer (because of the above problem) and that's something which would need to be carried over to such a cache flushing API (it should be pretty obvious that having a GPU read or write memory while the cache for that memory is being cleaned will lead to unexpected results.)

Also note that things get even more interesting in an SMP environment if cache operations aren't broadcast across the SMP cluster (which means cache operations have to be IPI'd to other CPUs.)

The next issue, which I've brought up before, is the risk that comes with exposing cache flushing to userspace on architectures where it isn't already exposed. As has been shown by Google Project Zero, this risks exposing those architectures to Spectre and Meltdown exploits where they weren't at such a risk before.
(I've pretty much shown here that you _do_ need to control which cache lines get flushed to make these exploits work, and flushing the cache by reading lots of data in lieu of having the ability to explicitly flush bits of cache makes it very difficult or impossible for them to work.)
Re: [PATCH v2 5/5] ARM: Unconditionally enable ARM_DMA_USE_IOMMU
On Wed, Apr 25, 2018 at 12:10:51PM +0200, Thierry Reding wrote: > From: Thierry Reding > > The ARM_DMA_USE_IOMMU Kconfig option has side-effects that drivers can > not opt into but have to explicitly opt out of. This can lead to subtle > bugs that are difficult to track down and not immediately obvious to be > related to this Kconfig option. > > To avoid this confusion, always enable the option to expose any lurking > bugs once and allow any regressions introduced by the DMA/IOMMU code to > be caught more quickly in the future. > > Note that some drivers still use the Kconfig symbol to provide different > code paths depending on what architecture the code runs on (e.g. 32-bit > ARM vs. 64-bit ARM which have different and incompatible implementations > of the DMA/IOMMU integration code), so leave the symbol in place to keep > those drivers working. > > For the long term, it is preferable to transition everyone to the > generic DMA/IOMMU integration code in drivers/iommu/dma-iommu.c. > > Signed-off-by: Thierry Reding > --- > Changes in v2: > - new patch > > arch/arm/Kconfig | 2 +- > arch/arm/include/asm/device.h | 6 -- > arch/arm/mm/dma-mapping.c | 18 -- > drivers/iommu/Kconfig | 7 --- > drivers/media/platform/Kconfig | 1 - > 5 files changed, 1 insertion(+), 33 deletions(-) > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index fa0b190f8a38..3c91de78535a 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -124,7 +124,7 @@ config NEED_SG_DMA_LENGTH > bool > > config ARM_DMA_USE_IOMMU > - bool > + def_bool y > select ARM_HAS_SG_CHAIN > select NEED_SG_DMA_LENGTH This doesn't work - as has recently been discussed with hch, we can't globally enable NEED_SG_DMA_LENGTH on ARM - the ARM architecture pre-dates the addition of the DMA length member in the scatterlist, and not every machine supports the splitting of the DMA length from the non-DMA length member. Hence, this will cause a regression, sorry. 
Re: noveau vs arm dma ops
On Wed, Apr 25, 2018 at 01:54:39AM -0700, Christoph Hellwig wrote:
> [discussion about this patch, which should have been cced to the iommu
> and linux-arm-kernel lists, but wasn't:
> https://www.spinics.net/lists/dri-devel/msg173630.html]
>
> On Wed, Apr 25, 2018 at 09:41:51AM +0200, Thierry Reding wrote:
> > > API from the iommu/dma-mapping code. Drivers have no business poking
> > > into these details.
> >
> > The interfaces that the above patch uses are all EXPORT_SYMBOL_GPL,
> > which is rather misleading if they are not meant to be used by drivers
> > directly.

EXPORT_SYMBOL* means nothing as far as whether a driver should be able to use the symbol or not - it merely means that the symbol is made available to a module. Modules cover much more than just device drivers - we have library modules, filesystem modules, helper modules to name a few non-driver classes of modules.

We also have symbols that are exported as part of the architecture implementation detail of a public interface. For example, the public interface "copy_from_user" is implemented as an inline function (actually several layers of inline functions) eventually calling into arm_copy_from_user(). arm_copy_from_user() is exported, but no driver (in fact, no module) is allowed to make direct reference to arm_copy_from_user() - it'd fail when software PAN is enabled.

The whole idea that "if a symbol is exported, it's fine for a driver to use it" is a complete work of fiction, always has been, and always will be. We've had this with the architecture implementation details of the DMA API before, and with the architecture implementation details of the CPU cache flushing. There's only so much commentary, or __ prefixes, you can add to a symbol before things get ridiculous, and none of it stops people creating this abuse. The only thing that seems to prevent it is to make life hard for folk wanting to use the symbols (eg, hiding the symbol prototype in a private header, etc.)
Never, ever go under the covers of an interface. If the interface doesn't do what you want, _discuss_ it, don't just think "oh, that architecture private facility looks like what I need, I'll use that directly."

If you ever are on the side of trying to maintain those implementation details that are abused in this way, you'll soon understand why this behaviour by driver authors is so annoying, and the maintainability problems it creates.
Re: [PATCH 11/12] swiotlb: move the SWIOTLB config symbol to lib/Kconfig
On Tue, Apr 24, 2018 at 08:55:49AM +0200, Christoph Hellwig wrote:
> On Tue, Apr 24, 2018 at 12:52:05AM +0100, Russell King - ARM Linux wrote:
> > On Mon, Apr 23, 2018 at 07:04:18PM +0200, Christoph Hellwig wrote:
> > > This way we have one central definition of it, and user can select it as
> > > needed. Note that we also add a second ARCH_HAS_SWIOTLB symbol to
> > > indicate the architecture supports swiotlb at all, so that we can still
> > > make the usage optional for a few architectures that want this feature
> > > to be user selectable.
> > >
> > > Signed-off-by: Christoph Hellwig
> >
> > Hmm, this looks like we end up with NEED_SG_DMA_LENGTH=y on ARM by
> > default, which probably isn't a good idea - ARM pre-dates the dma_length
> > parameter in scatterlists, and I don't think all code is guaranteed to
> > do the right thing if this is enabled.
>
> We shouldn't end up with NEED_SG_DMA_LENGTH=y on ARM by default.

Your patch as sent would end up with:

	ARM selects ARCH_HAS_SWIOTLB
	SWIOTLB is defaulted to ARCH_HAS_SWIOTLB
	SWIOTLB selects NEED_SG_DMA_LENGTH

due to:

@@ -106,6 +106,7 @@ config ARM
 	select REFCOUNT_FULL
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
+	select ARCH_HAS_SWIOTLB

and:

+config SWIOTLB
+	bool "SWIOTLB support"
+	default ARCH_HAS_SWIOTLB
+	select NEED_SG_DMA_LENGTH

Therefore, the default state for SWIOTLB and hence NEED_SG_DMA_LENGTH becomes 'y' on ARM, and any defconfig file that does not mention SWIOTLB explicitly ends up with both these enabled.

> It is only select by ARM_DMA_USE_IOMMU before the patch, and it will
> now also be selected by SWIOTLB, which for arm is never used or seleted
> directly by anything but xen-swiotlb.

See above.

> Then again looking at the series there shouldn't be any need to
> even select NEED_SG_DMA_LENGTH for swiotlb, as we'll never merge segments,
> so I'll fix that up.

That would help to avoid any regressions along the lines I've spotted by review.
It does look a bit weird though - patch 10 arranged stuff so that we didn't end up with SWIOTLB always enabled, but this patch reintroduces that with the allowance that the user can disable if so desired.
Re: [PATCH 11/12] swiotlb: move the SWIOTLB config symbol to lib/Kconfig
On Mon, Apr 23, 2018 at 07:04:18PM +0200, Christoph Hellwig wrote:
> This way we have one central definition of it, and user can select it as
> needed. Note that we also add a second ARCH_HAS_SWIOTLB symbol to
> indicate the architecture supports swiotlb at all, so that we can still
> make the usage optional for a few architectures that want this feature
> to be user selectable.
>
> Signed-off-by: Christoph Hellwig

Hmm, this looks like we end up with NEED_SG_DMA_LENGTH=y on ARM by default, which probably isn't a good idea - ARM pre-dates the dma_length parameter in scatterlists, and I don't think all code is guaranteed to do the right thing if this is enabled.

For example, arch/arm/mach-rpc/dma.c doesn't use the dma_length member of struct scatterlist.
Re: [PATCH 25/44] arm: implement ->mapping_error
On Thu, Jun 08, 2017 at 03:25:50PM +0200, Christoph Hellwig wrote:
> +static int dmabounce_mapping_error(struct device *dev, dma_addr_t dma_addr)
> +{
> +	if (dev->archdata.dmabounce)
> +		return 0;

I'm not convinced that we need this check here:

	dev->archdata.dmabounce = device_info;
	set_dma_ops(dev, &dmabounce_ops);

There shouldn't be any chance of dev->archdata.dmabounce being NULL if the dmabounce_ops has been set as the current device DMA ops. So I think that test can be killed.
Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops
On Wed, May 24, 2017 at 02:26:13PM +0300, Laurent Pinchart wrote:
> Again, the patch I propose is the simplest v4.12-rc fix I can think of, short
> of reverting your complete IOMMU probe deferral patch series. Let's focus on
> the v4.12-rc fix, and then discuss how to move forward in v4.13 and beyond.

Except, I don't think it fixes the problem that Sricharan is trying to fix, namely that of DMA ops that have been set up, torn down, and are trying to be set up again. The issue here is that this results in a stale setup, because the dma_ops are left in place from the first iommu setup, but the iommu mapping has been disposed of. Your patch only avoids the problem of tearing down someone else's DMA ops. We need a combination of both patches together.

What needs to happen for Sricharan's problem to be resolved is:

1. move all of __arm_iommu_detach_device() into arm_iommu_detach_device().

2. replace the __arm_iommu_detach_device() call in arm_teardown_iommu_dma_ops() with arm_iommu_detach_device().

as I don't see any need to have a different order in arm_teardown_iommu_dma_ops().
Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops
On Wed, May 24, 2017 at 12:46:51AM +0300, Laurent Pinchart wrote: > Hi Russell, > > On Tuesday 23 May 2017 18:53:19 Russell King - ARM Linux wrote: > > On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote: > > > On 23/05/17 17:25, Russell King - ARM Linux wrote: > > >> So, I've come to apply this patch (since it's finally landed in the > > >> patch system), and I'm not convinced that the commit message is really > > >> up to scratch. > > >> > > >> The current commit message looks like this: > > >> > > >> " ARM: 8674/1: dma-mapping: Reset the device's dma_ops > > >> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(), > > >> dma_ops should be cleared in the teardown path. Otherwise this > > >> causes problem when the probe of device is retried after being > > >> deferred. The device's iommu structures are cleared after > > >> EPROBEDEFER error, but on the next try dma_ops will still be set to > > >> old value, which is not right." > > >> > > >> It is obviously a fix, but a fix for which patch? Looking at the > > >> history, we have "arm: dma-mapping: Don't override dma_ops in > > >> arch_setup_dma_ops()" which I would have guessed is the appropriate > > >> one, but this post-dates your patch (it's very recent, v4.12-rc > > >> recent.) > > >> > > >> So, I need more description about the problem you were seeing when > > >> you first proposed this patch. > > >> > > >> How does leaving the dma_ops in place prior to "arm: dma-mapping: > > >> Don't override dma_ops in arch_setup_dma_ops()" cause problems for > > >> deferred probing? > > >> > > >> What patch is your change trying to fix? In other words, how far > > >> back does this patch need to be backported? > > > > > > In effect, it's fixing a latent inconsistency that's been present since > > > its introduction in 4bb25789ed28. However, that has clearly not proven > > > to be an issue in practice since then. 
With 09515ef5ddad we start > > > actually calling arch_teardown_dma_ops() in a manner that might leave > > > things partially initialised if anyone starts removing and reprobing > > > drivers, but so far that's still from code inspection[1] rather than > > > anyone hitting it. > > > > > > Given that the changes which tickle it are fresh in -rc1 I'd say there's > > > no need to backport this, but at the same time it shouldn't do any > > > damage if you really want to. > > > > Well, looking at this, I'm not convinced that much of it is correct. > > > > 1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds > >the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops() > >rather than arch_teardown_dma_ops(). > > > >This doesn't strike me as being particularly symmetric. > >arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s > >counterpart. > > > > 2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops > >check, and Xen - Xen wants to override the DMA ops if in the > >initial domain. It's not clear (at least to me) whether the recent > >patch adding the dma_ops check took account of this or not. > > > > 3) random places seem to fiddle with the dma_ops - notice that > >arm_iommu_detach_device() sets the dma_ops to NULL. > > > >In fact, I think moving __arm_iommu_detach_device() into > >arm_iommu_detach_device(), calling arm_iommu_detach_device(), > >and getting rid of the explicit set_dma_ops(, NULL) in this > >path would be a good first step. > > > > 4) I think arch_setup_dma_ops() is over-complex. > > > > So, in summary, this code is a mess today, and that means it's not > > obviously correct - which is bad. This needs sorting. 
>
> We've reached the same conclusion independently, but I'll refrain from
> commenting on whether that's a good or bad thing :-)
>
> I don't think this patch should be applied, as it could break Xen (and other
> platforms/drivers that set the DMA operations manually) by resetting DMA
> operations at device remove() time even if they have been set independently
> of arch_setup_dma_ops().

That will only occur if the dma ops have been overridden once the DMA operations have been set up via arch_setup_dma_ops(). What saves it from wholesale NULLing of the DMA operations is the check for a valid dma_iommu_mapping structure in arm_teardown_iommu_dma_ops(). This only exists when arm_setup_iommu_dma_ops() has attached a mapping to the device.
Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops
On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote: > On 23/05/17 17:25, Russell King - ARM Linux wrote: > > So, I've come to apply this patch (since it's finally landed in the patch > > system), and I'm not convinced that the commit message is really up to > > scratch. > > > > The current commit message looks like this: > > > > " ARM: 8674/1: dma-mapping: Reset the device's dma_ops > > > > arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(), > > dma_ops should be cleared in the teardown path. Otherwise this causes > > problem when the probe of device is retried after being deferred. The > > device's iommu structures are cleared after EPROBEDEFER error, but on > > the next try dma_ops will still be set to old value, which is not > > right." > > > > It is obviously a fix, but a fix for which patch? Looking at the > > history, we have "arm: dma-mapping: Don't override dma_ops in > > arch_setup_dma_ops()" which I would have guessed is the appropriate > > one, but this post-dates your patch (it's very recent, v4.12-rc > > recent.) > > > > So, I need more description about the problem you were seeing when > > you first proposed this patch. > > > > How does leaving the dma_ops in place prior to "arm: dma-mapping: > > Don't override dma_ops in arch_setup_dma_ops()" cause problems for > > deferred probing? > > > > What patch is your change trying to fix? In other words, how far > > back does this patch need to be backported? > > In effect, it's fixing a latent inconsistency that's been present since > its introduction in 4bb25789ed28. However, that has clearly not proven > to be an issue in practice since then. With 09515ef5ddad we start > actually calling arch_teardown_dma_ops() in a manner that might leave > things partially initialised if anyone starts removing and reprobing > drivers, but so far that's still from code inspection[1] rather than > anyone hitting it. 
>
> Given that the changes which tickle it are fresh in -rc1 I'd say there's
> no need to backport this, but at the same time it shouldn't do any
> damage if you really want to.

Well, looking at this, I'm not convinced that much of it is correct.

1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops() rather than arch_teardown_dma_ops().

   This doesn't strike me as being particularly symmetric. arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s counterpart.

2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops check, and Xen - Xen wants to override the DMA ops if in the initial domain. It's not clear (at least to me) whether the recent patch adding the dma_ops check took account of this or not.

3) random places seem to fiddle with the dma_ops - notice that arm_iommu_detach_device() sets the dma_ops to NULL.

   In fact, I think moving __arm_iommu_detach_device() into arm_iommu_detach_device(), calling arm_iommu_detach_device(), and getting rid of the explicit set_dma_ops(, NULL) in this path would be a good first step.

4) I think arch_setup_dma_ops() is over-complex.

So, in summary, this code is a mess today, and that means it's not obviously correct - which is bad. This needs sorting.
Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops
On Thu, Oct 27, 2016 at 09:07:23AM +0530, Sricharan wrote: > Hi Robin, > > >-Original Message- > >From: Robin Murphy [mailto:robin.mur...@arm.com] > >Sent: Wednesday, October 26, 2016 8:37 PM > >To: Sricharan R ; will.dea...@arm.com; > >j...@8bytes.org; iommu@lists.linux-foundation.org; linux-arm- > >ker...@lists.infradead.org; linux-arm-...@vger.kernel.org; > >laurent.pinch...@ideasonboard.com; m.szyprow...@samsung.com; > >tf...@chromium.org; srinivas.kandaga...@linaro.org > >Subject: Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops > > > >On 04/10/16 18:03, Sricharan R wrote: > >> The dma_ops for the device is not getting set to NULL in > >> arch_tear_down_dma_ops and this causes an issue when the > >> device's probe gets deferred and retried. So reset the > >> dma_ops to NULL. > > > >Reviewed-by: Robin Murphy > > > > Thanks. > > >This seems like it could stand independently from the rest of the series > >- might be worth rewording the commit message to more general terms, > >i.e. arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops() > >thus clearing the ops set by the latter, and sending it to Russell as a > >fix in its own right. > > Ok, will reword the commit log and push this separately. So, I've come to apply this patch (since it's finally landed in the patch system), and I'm not convinced that the commit message is really up to scratch. The current commit message looks like this: " ARM: 8674/1: dma-mapping: Reset the device's dma_ops arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(), dma_ops should be cleared in the teardown path. Otherwise this causes problem when the probe of device is retried after being deferred. The device's iommu structures are cleared after EPROBEDEFER error, but on the next try dma_ops will still be set to old value, which is not right." It is obviously a fix, but a fix for which patch? 
Looking at the history, we have "arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()" which I would have guessed is the appropriate one, but this post-dates your patch (it's very recent, v4.12-rc recent.) So, I need more description about the problem you were seeing when you first proposed this patch. How does leaving the dma_ops in place prior to "arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()" cause problems for deferred probing? What patch is your change trying to fix? In other words, how far back does this patch need to be backported?
Re: [PATCH v2] arm64/dma-mapping: fix DMA_ATTR_FORCE_CONTIGUOUS mmaping code
On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote: > In this version of the patch I have replaced temporal pages and > iommu_dma_mmap with remap_pfn_range or rather its simplified version > vm_iomap_memory. > Unfortunately I have not find nice helper for sgtable creation, so I > left sg allocation inside _iommu_mmap_attrs. > > As the fixing of DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher > priority I have focused only on it in this patch. As I mentioned elsewhere, I don't believe that fudging around in this way is a proper fix. DMA coherent memory was never, ever, intended to be passed back into the streaming APIs - it was designed that the two would be mutually exclusive. The problem is that we now have DMA coherent allocations being passed through the dma-buf API using this dma_get_sgtable() thing, which is quite broken. I regard dma_get_sgtable() as very broken, and had I realised at the time that this was what it was trying to do, I would have NAK'd it. Rather than bodging around this brokenness, trying to get dma_get_sgtable() to work, I believe we need to address the root cause - which is proper support for passing DMA coherent allocations through dma-buf, which does not involve scatterlists and calling dma_map_sg() on it. That's going to need to be addressed in any case, because of the dma_declare_coherent_memory() issue, where we may not have struct pages backing a coherent allocation. Such a case can come up on ARM64 via DT's "shared-dma-pool" reserved memory stuff. Right now, I have a "fix" for ARM queued which causes dma_get_sgtable() to fail when used with reserved memory, but we have one user who needs this to work. So, dma-buf needs to be fixed for this one way or another, and I don't think that bending the current broken stuff to suit it by trying these vmalloc_to_page() hacks is acceptable. dma_get_sgtable() is fundamentally broken IMHO. 
Re: [PATCHv7 4/6] arm: dma-mapping: add {map, unmap}_resource for iommu ops
On Wed, Jun 01, 2016 at 05:22:27PM +0200, Niklas Söderlund wrote:
> +static dma_addr_t arm_iommu_map_resource(struct device *dev,
> +		phys_addr_t phys_addr, size_t size,
> +		enum dma_data_direction dir, struct dma_attrs *attrs)
> +{
> +	struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +	dma_addr_t dma_addr;
> +	int ret, prot;
> +	phys_addr_t addr = phys_addr & PAGE_MASK;
> +	int offset = phys_addr & ~PAGE_MASK;
> +	int len = PAGE_ALIGN(size + offset);

Shouldn't both of these be unsigned - preferably size_t for len?

> +
> +	dma_addr = __alloc_iova(mapping, size);

Is this really correct? What if size = 4095 and offset = 10? Do we really only need one IOVA page for such a mapping (I count two pages.) Shouldn't this be "len"?

> +	if (dma_addr == DMA_ERROR_CODE)
> +		return dma_addr;
> +
> +	prot = __dma_direction_to_prot(dir) | IOMMU_MMIO;
> +
> +	ret = iommu_map(mapping->domain, dma_addr, addr, len, prot);
> +	if (ret < 0)
> +		goto fail;
> +
> +	return dma_addr + offset;
> +fail:
> +	__free_iova(mapping, dma_addr, size);

Shouldn't this be "len" as well?

> +	return DMA_ERROR_CODE;
> +}
> +
> +/**
> + * arm_iommu_unmap_resource - unmap a device DMA resource
> + * @dev: valid struct device pointer
> + * @dma_handle: DMA address to resource
> + * @size: size of resource to map
> + * @dir: DMA transfer direction
> + */
> +static void arm_iommu_unmap_resource(struct device *dev, dma_addr_t dma_handle,
> +	size_t size, enum dma_data_direction dir,
> +	struct dma_attrs *attrs)
> +{
> +	struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +	dma_addr_t iova = dma_handle & PAGE_MASK;
> +	int offset = dma_handle & ~PAGE_MASK;
> +	int len = PAGE_ALIGN(size + offset);

unsigned/size_t again.

> +
> +	if (!iova)
> +		return;
> +
> +	iommu_unmap(mapping->domain, iova, len);
> +	__free_iova(mapping, iova, len);

Here, you free "len" bytes of iova, which is different from above.
> +} > + > static void arm_iommu_sync_single_for_cpu(struct device *dev, > dma_addr_t handle, size_t size, enum dma_data_direction dir) > { > @@ -1994,6 +2051,9 @@ struct dma_map_ops iommu_ops = { > .unmap_sg = arm_iommu_unmap_sg, > .sync_sg_for_cpu= arm_iommu_sync_sg_for_cpu, > .sync_sg_for_device = arm_iommu_sync_sg_for_device, > + > + .map_resource = arm_iommu_map_resource, > + .unmap_resource = arm_iommu_unmap_resource, > }; > > struct dma_map_ops iommu_coherent_ops = { > @@ -2007,6 +2067,9 @@ struct dma_map_ops iommu_coherent_ops = { > > .map_sg = arm_coherent_iommu_map_sg, > .unmap_sg = arm_coherent_iommu_unmap_sg, > + > + .map_resource = arm_iommu_map_resource, > + .unmap_resource = arm_iommu_unmap_resource, > }; > > /**
Re: [PATCH 2/4] ARM: dma-mapping: Constify attrs passed to internal functions
On Tue, May 24, 2016 at 08:28:08AM +0200, Krzysztof Kozlowski wrote: > Some of the non-exported functions do not modify passed dma_attrs so the > pointer can point to const data. > > Signed-off-by: Krzysztof Kozlowski

Acked-by: Russell King

Thanks.
Re: [PATCH 2/4] scatterlist: add sg_alloc_table_from_buf() helper
On Thu, Mar 31, 2016 at 04:45:57PM +0200, Boris Brezillon wrote: > Hi Russell, > > On Thu, 31 Mar 2016 15:14:13 +0100 > Russell King - ARM Linux wrote: > > > On Thu, Mar 31, 2016 at 02:29:42PM +0200, Boris Brezillon wrote: > > > sg_alloc_table_from_buf() provides an easy solution to create an sg_table > > > from a virtual address pointer. This function takes care of dealing with > > > vmallocated buffers, buffer alignment, or DMA engine limitations (maximum > > > DMA transfer size). > > > > Please note that the DMA API does not take account of coherency of memory > > regions other than non-high/lowmem - there are specific extensions to > > deal with this. > > Ok, you said 'non-high/lowmem', this means vmalloced and kmapped buffers > already fall in this case, right? > > Could you tell me more about those specific extensions? I was slightly confused - the extensions I was thinking of are those listed at the bottom of Documentation/cachetlb.txt, which have nothing to do with DMA. However, it's probably worth reading Documentation/DMA-API-HOWTO.txt to read up on what kinds of memory are considered to be DMA-able in the kernel. > > What this means is that having an API that takes any virtual address > > pointer, converts it to a scatterlist which is then DMA mapped, is > > unsafe. > > Which means some implementations already get this wrong (see > spi_map_buf(), and I'm pretty sure it's not the only one). Quite possible, but that is driver stuff, and driver stuff gets things wrong all the time. :) > > It'll be okay for PIPT and non-aliasing VIPT cache architectures, but > > for other cache architectures this will hide this problem and make > > review harder. > > > > Ok, you lost me. I'll have to do my homework and try to understand what > this means :).

P = physical address
V = virtual address
I = indexed
T = tag

The tag is held in each cache line.
When a location is looked up in the cache, an index is used to locate a set of cache lines and the tag is compared to check which cache line in the set is the correct one (or whether the address even exists in the cache.) How the index and tag are derived varies between cache architectures. PIPT = indexed by physical address, tagged with physical address. Never aliases with itself in the presence of multiple virtual mappings. VIPT = indexed by virtual address, tagged with physical address. If the bits from the virtual address do not overlap the MMU page size, it is also alias free, otherwise aliases can exist, but can be eliminated by "cache colouring" - ensuring that a physical address is always mapped with the same overlapping bits. VIVT = indexed by virtual address, tagged with virtual address. The worst kind of cache, since every different mapping of the same physical address is guaranteed by design to alias with other mappings. There is little cache colouring between different kernel mappings (eg, between lowmem and vmalloc space.) What this means is that, while the DMA API takes care of DMA aliases in the lowmem mappings, an alias-prone VIPT cache will remain incoherent with DMA if it is remapped into vmalloc space, and the mapping happens to have a different cache colour. In other words, this is a data corruption issue. Hence, taking a range of vmalloc() addresses, converting them into a scatterlist, then using the DMA API on the scatterlist _only_ guarantees that the lowmem (and kmap'd highmem mappings) are coherent with DMA. There is no way for the DMA API to know that other mappings exist, and obviously flushing every possible cache line just because a mapping might exist multiplies the expense of the DMA API: not only in terms of time spent running through all the possibilities, which doubles for every aliasing bit of VIPT, but also TLB pressure since you'd have to create a mapping for each alias and tear it back down. 
VIVT is even worse, since no other virtual mapping of the same memory is coherent by design: every mapping would need to be known, and each would need to be individually flushed.
Re: [PATCH 2/4] scatterlist: add sg_alloc_table_from_buf() helper
On Thu, Mar 31, 2016 at 02:29:42PM +0200, Boris Brezillon wrote: > sg_alloc_table_from_buf() provides an easy solution to create an sg_table > from a virtual address pointer. This function takes care of dealing with > vmallocated buffers, buffer alignment, or DMA engine limitations (maximum > DMA transfer size). Please note that the DMA API does not take account of coherency of memory regions other than non-high/lowmem - there are specific extensions to deal with this. What this means is that having an API that takes any virtual address pointer, converts it to a scatterlist which is then DMA mapped, is unsafe. It'll be okay for PIPT and non-aliasing VIPT cache architectures, but for other cache architectures this will hide this problem and make review harder.
Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping
On Wed, Nov 04, 2015 at 06:48:50PM +0900, Tomasz Figa wrote: > There is no requirement, but shouldn't it be desired for the mapping > code to map them as such? Otherwise, how could the IOMMU use case you > described above (address translator for devices which don't have the > capability to address a scatterlist) be handled properly? It's up to the IOMMU code to respect the parameters that the device has supplied to it via the device_dma_parameters. This doesn't currently allow a device to say "I want this scatterlist to be mapped as a contiguous device address", so really if a device has such a requirement, at the moment the device driver _must_ check the dma_map_sg() return value and act accordingly. While it's possible to say "an IOMMU should map as a single contiguous address", what happens when the IOMMU's device address space becomes fragmented? > Is the general conclusion now that dma_map_sg() should not be used to > create IOMMU mappings and we should make a step backwards making all > drivers (or frameworks, such as videobuf2) do that manually? That > would be really backwards, because code not aware of IOMMU existence > at all would have to become aware of it. No. The DMA API has always had the responsibility for managing the IOMMU device, which may well be shared between multiple different devices. However, if the IOMMU is part of a device IP block (such as a GPU) then the decision on whether the DMA API should be used or not is up to the driver author. If it has special management requirements, then it's probably appropriate for the device driver to manage it by itself. For example, a GPU's MMU may need something inserted into the GPU's command stream to flush the MMU TLBs. Such cases are inappropriate to be using the DMA API for IOMMU management.
Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping
On Wed, Nov 04, 2015 at 02:12:03PM +0900, Tomasz Figa wrote: > My understanding of a scatterlist was that it represents a buffer as a > whole, by joining together its physically discontinuous segments. Correct, and it may also be scattered in CPU virtual space as well. > I don't see how single segments (layout of which is completely up to > the allocator; often just single pages) would be usable for hardware > that needs to do some work more serious than just writing a byte > stream continuously to subsequent buffers. In case of such simple > devices you don't even need an IOMMU (for means other than protection > and/or getting over address space limitations). All that's required is that the addresses described in the scatterlist are accessed as an apparently contiguous series of bytes. They don't have to be contiguous in any address view, provided the device access appears to be contiguous. How that is done is really neither here nor there. IOMMUs are normally there as an address translator - for example, the underlying device may not have the capability to address a scatterlist (eg, because it makes effectively random access) and in order to be accessible to the device, it needs to be made contiguous in device address space. Another scenario is that you have more bits of physical address than a device can generate itself for DMA purposes, and you need an IOMMU to create a (possibly scattered) mapping in device address space within the ability of the device to address. The requirements here depend on the device behind the IOMMU. > However, IMHO the most important use case of an IOMMU is to make > buffers, which are contiguous in CPU virtual address space (VA), > contiguous in device's address space (IOVA). No - there is no requirement for CPU virtual contiguous buffers to also be contiguous in the device address space.
Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping
On Wed, Nov 04, 2015 at 02:15:41PM +0900, Tomasz Figa wrote: > On Wed, Nov 4, 2015 at 3:40 AM, Russell King - ARM Linux > wrote: > > On Tue, Nov 03, 2015 at 05:41:24PM +, Robin Murphy wrote: > >> Hi Tomasz, > >> > >> On 02/11/15 13:43, Tomasz Figa wrote: > >> >Agreed. The dma_map_*() API is not guaranteed to return a single > >> >contiguous part of virtual address space for any given SG list. > >> >However it was understood to be able to map buffers contiguously > >> >mappable by the CPU into a single segment and users, > >> >videobuf2-dma-contig in particular, relied on this. > >> > >> I don't follow that - _any_ buffer made of page-sized chunks is going to be > >> mappable contiguously by the CPU; it's clearly impossible for the streaming > >> DMA API itself to offer such a guarantee, because it's entirely orthogonal > >> to the presence or otherwise of an IOMMU. > > > > Tomasz's use of "virtual address space" above in combination with the > > DMA API is really confusing. > > I suppose I must have mistakenly used "virtual address space" somewhere > instead of "IO virtual address space". I'm sorry for causing > confusion. > > The thing being discussed here is mapping of buffers described by > scatterlists into IO virtual address space, i.e. the operation > happening when dma_map_sg() is called for an IOMMU-enabled device. ... and there, it's perfectly legal for an IOMMU to merge all entries in a scatterlist into one mapping - so dma_map_sg() would return 1. What that means is that the scatterlist still contains the original number of entries describing the CPU view of the buffer list, while the DMA device view of the same buffer uses just the first entry. In other words, if you're walking a scatterlist, and doing a mixture of DMA and PIO, you can't assume that if you're at scatterlist entry N for DMA, you can switch to PIO for entry N and you'll write to the same memory.
(I know that there are badly written drivers in the kernel which unfortunately do make this assumption, and if they're used in the presence of an IOMMU, they _will_ be silently data corrupting.)
Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping
On Tue, Nov 03, 2015 at 05:41:24PM +, Robin Murphy wrote: > Hi Tomasz, > > On 02/11/15 13:43, Tomasz Figa wrote: > >Agreed. The dma_map_*() API is not guaranteed to return a single > >contiguous part of virtual address space for any given SG list. > >However it was understood to be able to map buffers contiguously > >mappable by the CPU into a single segment and users, > >videobuf2-dma-contig in particular, relied on this. > > I don't follow that - _any_ buffer made of page-sized chunks is going to be > mappable contiguously by the CPU; it's clearly impossible for the streaming > DMA API itself to offer such a guarantee, because it's entirely orthogonal > to the presence or otherwise of an IOMMU. Tomasz's use of "virtual address space" above in combination with the DMA API is really confusing. dma_map_sg() does *not* construct a CPU view of the passed scatterlist. The only thing dma_map_sg() might do with virtual addresses is to use them as a way to achieve cache coherence for one particular view of that memory, that being the kernel's own lowmem mapping and any kmaps. It doesn't extend to vmalloc() or userspace mappings of the memory. If the scatterlist is converted to an array of struct page pointers, it's possible to map it with vmap(), but it's implementation-defined whether such a mapping will receive cache maintenance as part of the DMA API or not. (If you have PIPT caches, it will; if they're VIPT caches, maybe not.) There is a separate set of calls to deal with the flushing issues for vmap()'d memory in this case - see flush_kernel_vmap_range() and invalidate_kernel_vmap_range().
Re: [RFC PATCH 0/3] iommu: Add range flush operation
On Tue, Sep 29, 2015 at 05:27:12PM +0100, Robin Murphy wrote: > Eh, swings and roundabouts. An argument denoting whether the flush is being > called on the map or unmap path would be fine, Sorry, that statement is wrong. It's not about whether you flush before or after the DMA operation. I'm afraid I'm probably going to tell you how to suck eggs here, because I don't think you quite "get it" with non-dma-coherent modern CPUs. Modern CPUs prefetch data into their caches, and they also randomly write back data from their caches to memory. When performing a DMA operation from device to memory, you need to do two things with CPU caches which aren't coherent:

1. Before starting the DMA operation, you need to walk over the memory to be mapped, ensuring that any dirty cache lines are written back. This is to prevent dirty cache lines overwriting data that has already been DMA'd from the device.

2. After the DMA operation has completed, you need to walk over the memory again, invalidating any cache lines which may have been speculatively loaded from that memory while DMA was running. These cache lines may have been loaded prior to the DMA operation placing the new data into memory.

So, it's not a before-or-after: you always have to perform write-back cache maintenance prior to any DMA operation, and then invalidation cache maintenance after the DMA operation has completed, for any mapping which the DMA may have written to (which means device-to-memory and bidirectional mappings.)
Re: [RFC PATCH 0/3] iommu: Add range flush operation
On Tue, Sep 29, 2015 at 03:20:38PM +0100, Robin Murphy wrote: > A single callback doesn't really generalise well enough: If we wanted to > implement this in the ARM SMMU drivers to optimise the unmap() case [ask > Will how long he spends waiting for a software model to tear down an entire > VFIO domain invalidating one page at a time ;)], then we'd either regress > performance in the map() case with an unnecessary TLB flush, or have to do a > table walk in every flush() call to infer what actually needs doing. And this is the problem of frameworks. They get in the way of doing things efficiently. Fine, we have the DMA ops, and that calls a map_sg() method. What we then need is to have a series of standardised library functions which can be called to perform various actions. Consider this: an IOMMU driver gets the raw scatterlist which the driver passed. The IOMMU driver walks the scatterlist, creating the IOMMU side mapping, and writing the device DMA addresses and DMA lengths to the scatterlist, possibly coalescing some of the entries. It remembers the number of scatterlist entries that the DMA operation now requires. The IOMMU code can setup whatever mappings it wants using whatever sizes it wants to satisfy the requested scatterlist. It then goes on to call the arch backend with the original scatterlist, asking it to _only_ deal with the CPU coherency for the mapping. The arch code walks the scatterlist again, this time dealing with the CPU coherency part. Finally, the IOMMU code returns the number of DMA scatterlist entries. When it comes to tearing it down, it's a similar operation to the above, except reversing those actions. The only issue with this approach is that it opens up some of the cache handling to the entire kernel, and that will be _too_ big a target for idiotic driver writers to think they have permission to directly use those interfaces. 
To solve this, I'd love to be able to have the linker link together certain objects in the kernel build, and then convert some global symbols to be local symbols, thus denying access to functions that driver authors have no business whatsoever touching. > Personally I think it would be nicest to have two separate callbacks, e.g. > .map_sync/.unmap_sync, but at the very least some kind of additional > 'direction' kind of parameter would be necessary. No, not more callbacks - that's the framework thinking, not the library thinking.
Re: [PATCH v2] iommu/io-pgtable-arm: Don't use dma_to_phys()
On Fri, Sep 18, 2015 at 12:04:26PM +0100, Robin Murphy wrote: > Specifically, the problem case for that is when phys_addr_t is 64-bit but > dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary to avoid > a truncation warning when we make the DMA API calls, but we actually need > the opposite in the comparison here - comparing the different types directly > allows integer promotion to kick in appropriately so we don't lose the top > half of the larger address. Otherwise, you'd never spot the difference > between, say, your original page at 0x88c000 and a bounce-buffered copy > that happened to end up mapped to 0xc000. Hmm. Thinking about this, I think we ought to add to arch/arm/mm/Kconfig:

config ARCH_PHYS_ADDR_T_64BIT
	def_bool ARM_LPAE

 config ARCH_DMA_ADDR_T_64BIT
 	bool
+	select ARCH_PHYS_ADDR_T_64BIT

I seem to remember that you're quite right that dma_addr_t <= phys_addr_t: dma_addr_t must never be bigger than phys_addr_t.
Re: [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use
On Tue, Aug 04, 2015 at 03:47:13PM +0100, Robin Murphy wrote: > Hi Laurent, > > [ +RMK, as his patch is indirectly involved here ] > > On 04/08/15 14:16, Laurent Pinchart wrote: > >This is what I believe to be an API abuse. The dma_sync_single_for_device() > >API is meant to pass ownership of a buffer to the device. Unless I'm > >mistaken, > >once that's done the CPU isn't allowed to touch the buffer anymore until > >dma_sync_single_for_cpu() is called to get ownership of the buffer back. That's what I thought up until recently, but it's not strictly true - see Documentation/DMA-API.txt which Robin quoted. > [3]:Yes, there may generally be exceptions to that, but not in the context > of this code. Unless the Renesas IPMMU does something I don't know about? If an IOMMU does write to its page tables, then the only way to handle those is using DMA-coherent memory, either via a coherent mapping or allocated via dma_alloc_coherent(). The streaming DMA API is wholly unsuitable for any mapping where the CPU and the DMA device both want to simultaneously alter the contained data.
Re: [PATCH] iommu/arm-smmu-v2: ThunderX(errata-23399) mis-extends 64bit registers
On Thu, Jul 30, 2015 at 08:54:04PM +, Chalamarla, Tirumalesh wrote: > is some thing like this looks good
>
> +#ifdef CONFIG_64BIT
> +#define smmu_writeq(reg64, addr) writeq_relaxed((reg64), (addr))
> +#else
> +#define smmu_writeq(reg64, addr) \
> +	writel_relaxed(((reg64) >> 32), ((addr) + 4)); \
> +	writel_relaxed((reg64), (addr))

It's missing a #endif. This also suffers from multiple argument evaluation, and it hides that there are two expressions here - which makes future maintenance harder.

#define smmu_writeq(reg64, addr)			\
	do {						\
		u64 __val = (reg64);			\
		volatile void __iomem *__addr = (addr);	\
		writel_relaxed(__val >> 32, __addr + 4);\
		writel_relaxed(__val, __addr);		\
	} while (0)

is longer but is much preferred as it won't suffer side effects from stuff like:

	if (...)
		smmu_writeq(val++, addr);
Re: [PATCH v3 5/6] iommu/mediatek: Add mt8173 IOMMU driver
On Mon, Jul 27, 2015 at 02:23:26PM +0100, Robin Murphy wrote: > On 16/07/15 10:04, Yong Wu wrote: > >This patch adds support for mediatek m4u (MultiMedia Memory Management > >Unit). > > > >Signed-off-by: Yong Wu > [...] > >+static void mtk_iommu_flush_pgtable(void *ptr, size_t size, void *cookie) > >+{ > >+ struct mtk_iommu_domain *domain = cookie; > >+ unsigned long offset = (unsigned long)ptr & ~PAGE_MASK; > >+ > >+ dma_map_page(domain->data->dev, virt_to_page(ptr), offset, > >+size, DMA_TO_DEVICE); > > Nit: this looks like it may as well be dma_map_single. > > It would probably be worth following it with a matching unmap too, just to > avoid any possible leakage bugs (especially if this M4U ever appears in a > SoC supporting RAM above the 32-bit boundary). Why not do the job properly? Take a look at how I implemented the streaming DMA API on Tegra SMMU (patch set sent out earlier today). There's no need for hacks like dma_map_page() (and discarding its return value) or dma_map_page() followed by dma_unmap_page().
Re: [PATCH 00/18] Clean up exposure of arch-internal code
On Mon, Jul 27, 2015 at 04:09:06PM +0200, Thierry Reding wrote: > On Mon, Jul 27, 2015 at 01:28:24PM +0100, Russell King - ARM Linux wrote: > > This series of patches attempts to clean up the use of architecture > > internal functions in drivers, and removes some functions from view > > in the asm/ headers. I'm also considering whether we need to add > > some linker magic to hide symbols when building the built-in.o files. > > > > This was triggered by 3rd party drivers "going under the covers" of > > the DMA API and calling the dmac_*() functions directly, a practice > > which I have always refused to allow. This also breaks module > > building (which is the big hint that they're doing something wrong.) > > > > However, it also came to light that various drivers are using > > __cpuc_* functions directly, which are non-portable, and is another > > instance of "going under the covers" and tinkering with what should > > be kept as architecture implementation details. > > > > This series addresses some of these points. It: > > > > (a) moves dmac_map_area() and dmac_unmap_area() prototypes out of > > asm/cacheflush.h and into arch/arm/mm. > > (b) provide a secure_flush() call for the Qualcomm secure monitor > > code. > > (c) stop tegra smmu driver(s) from using __cpuc_flush_dcache_area, > > something which necessitates additional complexity to deal with > > the ARM vs ARM64 differences. It should be using the DMA API. > > However, the Tegra SMMU driver is in really bad shape, and this > > isn't a simple conversion - it requires a lot of additional > > fixes to bring the code up to scratch. > > Out of curiosity, did you have any hardware to test this on? Nope - it only got build-tested, and it's partly why there's so many incremental small patches. > I've given > it a quick spin and haven't seen anything out of the ordinary. I have a > couple of nitpicks towards the end of the series, but other than that, > very nice work. Good news. > What's the plan for merging this?
Should it all go in through Joerg's > tree so that potentially other driver conversions can go in on top of > this, or would you prefer to take it through the ARM tree? I don't mind the series being split - there's nothing dependent between the individual drivers, so all the Tegra SMMU stuff can go through Joerg's tree if that'll be easiest. > > It leaves the Rockchip IOMMU driver for the time being, but that is also > > something which needs cleaning up in the same way as the Tegra SMMU > > driver. > From a cursory look, MSM, OMAP and Exynos are in the same boat as well. Yes. There is a question here though: Is it a good idea to use normal cacheable memory for the IOMMU page table entries, or would DMA coherent memory be better? If the latter, then we don't need to do any cache maintenance of the tables, and all the hacks can go. If the former, using the streaming DMA API in all IOMMU drivers would be a good idea, like Tegra SMMU now does as a result of these patches.
[PATCH 00/18] Clean up exposure of arch-internal code
This series of patches attempts to clean up the use of architecture internal functions in drivers, and removes some functions from view in the asm/ headers. I'm also considering whether we need to add some linker magic to hide symbols when building the built-in.o files. This was triggered by 3rd party drivers "going under the covers" of the DMA API and calling the dmac_*() functions directly, a practice which I have always refused to allow. This also breaks module building (which is the big hint that they're doing something wrong.) However, it also came to light that various drivers are using __cpuc_* functions directly, which are non-portable, and is another instance of "going under the covers" and tinkering with what should be kept as architecture implementation details. This series addresses some of these points. It: (a) moves dmac_map_area() and dmac_unmap_area() prototypes out of asm/cacheflush.h and into arch/arm/mm. (b) provide a secure_flush() call for the Qualcomm secure monitor code. (c) stop tegra smmu driver(s) from using __cpuc_flush_dcache_area, something which necessitates additional complexity to deal with the ARM vs ARM64 differences. It should be using the DMA API. However, the Tegra SMMU driver is in really bad shape, and this isn't a simple conversion - it requires a lot of additional fixes to bring the code up to scratch. It leaves the Rockchip IOMMU driver for the time being, but that is also something which needs cleaning up in the same way as the Tegra SMMU driver. 
 arch/arm/include/asm/cacheflush.h |  21 ++-
 arch/arm/include/asm/glue-cache.h |   2 -
 arch/arm/mm/dma-mapping.c         |   1 +
 arch/arm/mm/dma.h                 |  32
 drivers/firmware/qcom_scm-32.c    |   4 +-
 drivers/iommu/tegra-smmu.c        | 297 --
 drivers/memory/tegra/tegra114.c   |  17 ---
 drivers/memory/tegra/tegra124.c   |  30
 drivers/memory/tegra/tegra30.c    |  17 ---
 include/soc/tegra/mc.h            |   7 -
 10 files changed, 242 insertions(+), 186 deletions(-)
Re: [RFC PATCH v2 2/3] arm64: add IOMMU dma_ops
On Thu, Mar 05, 2015 at 11:16:28AM +, Robin Murphy wrote: > Hi Laura, > > On 05/03/15 00:19, Laura Abbott wrote: > [...] > >>Consider that the IOMMU's page table walker is a DMA master in its own > >right, and that is the device you're mapping the page tables for. > >Therefore your IOMMU driver needs to have access to the struct device > >of the IOMMU itself to use for the page-table-related mappings. Also, > >be sure to set the IOMMU's DMA mask correctly to prevent SWIOTLB bounce > >buffers being created in the process (which as I've found generally ends > >in disaster). > >> > >>> And normally, we always need do cache maintenance only for some > >>>bytes in the pagetable but not whole a page. Then is there a easy way to > >>>do the cache maintenance? > >> > >>For a noncoherent device, dma_map_single() will end up calling > >__dma_map_area() with the page offset and size of the original request, so > >the updated part gets flushed by VA, and the rest of the page isn't touched > >if it doesn't need to be. On the other hand if the page tables were > >allocated with dma_alloc_coherent() in the first place, then just calling > >dma_sync_single_for_device() for the updated region should suffice. That's wrong. dma_sync_single_*() is not permitted to be called on coherently allocated memory. Where coherent memory needs to be remapped, dma_sync_single_*() will panic the kernel. If it's in coherent memory, all you should need is the appropriate memory barrier to ensure that the DMA agent can see the writes. > >Where exactly would you call the dma_unmap? It seems a bit strange to > >be repeatedly calling dma_map and never calling dma_unmap. I don't see it > >explicitly forbidden in the docs anywhere to do this but it seems like > >it would be violating the implicit handoff of dma_map/dma_unmap. 
> > I think ideally you'd call dma_map_page when you first create the
> > page table, dma_sync_single_for_device on any update, and
> > dma_unmap_page when you tear it down, and you'd also use the
> > appropriate DMA addresses everywhere instead of physical addresses.

No.

	dma_map_page()			ownership changes CPU->DMA
	dma_sync_single_for_cpu()	ownership changes DMA->CPU
	dma_sync_single_for_device()	ownership changes CPU->DMA
	dma_unmap_page()		ownership changes DMA->CPU

It's invalid to miss out the pairing that gives those ownership changes.
Re: [PATCHv7 07/26] driver core: amba: add device binding path 'driver_override'
On Tue, Sep 23, 2014 at 04:46:06PM +0200, Antonios Motakis wrote:
> As already demonstrated with PCI [1] and the platform bus [2], a
> driver_override property in sysfs can be used to bypass the id matching
> of a device to an AMBA driver. This can be used by VFIO to bind to any
> AMBA device requested by the user.
>
> [1] http://lists-archives.com/linux-kernel/28030441-pci-introduce-new-device-binding-path-using-pci_dev-driver_override.html
> [2] https://www.redhat.com/archives/libvir-list/2014-April/msg00382.html
>
> Signed-off-by: Antonios Motakis

I have to ask why this is even needed in the first place. To take the
example in [2], what's wrong with:

	echo fff51000.ethernet > /sys/bus/platform/devices/fff51000.ethernet/driver/unbind
	echo fff51000.ethernet > /sys/bus/platform/drivers/vfio-platform/bind

and similar for AMBA. All we would need to do is to introduce a way of
having a driver accept explicit bind requests. In any case:

> +static ssize_t driver_override_store(struct device *_dev,
> +				     struct device_attribute *attr,
> +				     const char *buf, size_t count)
> +{
> +	struct amba_device *dev = to_amba_device(_dev);
> +	char *driver_override, *old = dev->driver_override, *cp;
> +
> +	if (count > PATH_MAX)
> +		return -EINVAL;
> +
> +	driver_override = kstrndup(buf, count, GFP_KERNEL);
> +	if (!driver_override)
> +		return -ENOMEM;
> +
> +	cp = strchr(driver_override, '\n');
> +	if (cp)
> +		*cp = '\0';

I hope that is not replicated everywhere. This allows up to a page to be
allocated even when the first byte may be a newline, which is wasteful.
How about:

	if (count > PATH_MAX)
		return -EINVAL;

	cp = strnchr(buf, count, '\n');
	if (cp)
		count = cp - buf - 1;

	if (count) {
		driver_override = kstrndup(buf, count, GFP_KERNEL);
		if (!driver_override)
			return -ENOMEM;
	} else {
		driver_override = NULL;
	}

	kfree(dev->driver_override);
	dev->driver_override = driver_override;

Also:

> +static ssize_t driver_override_show(struct device *_dev,
> +				    struct device_attribute *attr, char *buf)
> +{
> +	struct amba_device *dev = to_amba_device(_dev);
> +
> +	return sprintf(buf, "%s\n", dev->driver_override);
> +}

Do we really want to do a NULL pointer dereference here?
Re: [PATCH] iommu/arm-smmu: avoid calling request_irq in atomic context
On Fri, Jul 25, 2014 at 04:26:59PM -0700, Mitchel Humpherys wrote:
> -	irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> -	ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED,
> -			  "arm-smmu-context-fault", domain);
> -	if (IS_ERR_VALUE(ret)) {
> -		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
> -			cfg->irptndx, irq);
> -		cfg->irptndx = INVALID_IRPTNDX;
> -		goto out_free_context;
> -	}
> -
> 	smmu_domain->smmu = smmu;
> 	arm_smmu_init_context_bank(smmu_domain);
> 	return 0;
> -
> -out_free_context:
> -	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> -	return ret;

This returns 'ret' from request_irq.

> +	ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED,
> +			  "arm-smmu-context-fault", domain);
> +	if (IS_ERR_VALUE(ret)) {
> +		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
> +			cfg->irptndx, irq);
> +		cfg->irptndx = INVALID_IRPTNDX;
> +		return -ENODEV;

This returns -ENODEV instead.

> +	}
> +
> 	/* Looks ok, so add the device to the domain */
> -	cfg = find_smmu_master_cfg(smmu_domain->smmu, dev);
> -	if (!cfg)
> +	master_cfg = find_smmu_master_cfg(smmu_domain->smmu, dev);
> +	if (!master_cfg)
> 		return -ENODEV;

If this fails, we exit leaving the interrupt registered. This is a bug
introduced by this change.
Re: [PATCH v3] iommu: shmobile: Enable the driver on all ARM platforms
On Wed, Oct 30, 2013 at 12:20:43PM +0100, Laurent Pinchart wrote:
> Renesas ARM platforms are transitioning from single-platform to
> multi-platform kernels using the new ARCH_SHMOBILE_MULTI. Make the
> driver available on all ARM platforms to enable it on both
> ARCH_SHMOBILE and ARCH_SHMOBILE_MULTI and increase build testing
> coverage.
>
> Don't enable COMPILE_TEST support as the driver doesn't compile on
> non-ARM platforms due to usage of the ARM DMA IOMMU API.

For similar reasons as x86, can we please think about using:

	depends on ARM
	depends on ARCH_SHMOBILE || ARCH_SHMOBILE_MULTI || COMPILE_TEST

This way we don't end up polluting the configuration for non-shmobile
platforms. The same goes for other ARM stuff... the number of options is
getting rather large, and we need to think about keeping that in check
where it's easily possible to do so.
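Spelled out as a full Kconfig stanza, the suggestion might look like the following. This is an illustrative sketch only: the option name, prompt, and `select` lines are assumptions about what the driver's entry would contain, not a quote from the tree.

```
config SHMOBILE_IOMMU
	bool "IOMMU for Renesas IPMMU/IPMMUI"
	default n
	depends on ARM
	depends on ARCH_SHMOBILE || ARCH_SHMOBILE_MULTI || COMPILE_TEST
	select IOMMU_API
	select ARM_DMA_USE_IOMMU
```

The two `depends on` lines keep the prompt hidden on every non-shmobile configuration while still letting allmodconfig-style builds on ARM exercise the driver via COMPILE_TEST.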
Re: [PATCHv7 9/9] ARM: dma-mapping: add support for IOMMU mapper
On Wed, Feb 29, 2012 at 04:04:22PM +0100, Marek Szyprowski wrote:
> +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
> +	void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +	struct dma_attrs *attrs)
> +{
> +	struct arm_vmregion *c;
> +
> +	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
> +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);

What protects this against other insertions/removals from the list?

> +
> +	if (c) {
> +		struct page **pages = c->priv;
> +
> +		unsigned long uaddr = vma->vm_start;
> +		unsigned long usize = vma->vm_end - vma->vm_start;
> +		int i = 0;
> +
> +		do {
> +			int ret;
> +
> +			ret = vm_insert_page(vma, uaddr, pages[i++]);
> +			if (ret) {
> +				pr_err("Remapping memory, error: %d\n", ret);
> +				return ret;
> +			}
> +
> +			uaddr += PAGE_SIZE;
> +			usize -= PAGE_SIZE;
> +		} while (usize > 0);
> +	}
> +	return 0;
> +}
> +
> +/*
> + * free a page as defined by the above mapping.
> + * Must not be called with IRQs disabled.
> + */
> +void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
> +	dma_addr_t handle, struct dma_attrs *attrs)
> +{
> +	struct arm_vmregion *c;
> +	size = PAGE_ALIGN(size);
> +
> +	c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);

What protects this against other insertions/removals from the list?
> +	if (c) {
> +		struct page **pages = c->priv;
> +		__dma_free_remap(cpu_addr, size);
> +		__iommu_remove_mapping(dev, handle, size);
> +		__iommu_free_buffer(dev, pages, size);
> +	}
> +}
> +
> +/*
> + * Map a part of the scatter-gather list into contiguous io address space
> + */
> +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> +	size_t size, dma_addr_t *handle,
> +	enum dma_data_direction dir)
> +{
> +	struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> +	dma_addr_t iova, iova_base;
> +	int ret = 0;
> +	unsigned int count;
> +	struct scatterlist *s;
> +
> +	size = PAGE_ALIGN(size);
> +	*handle = ARM_DMA_ERROR;
> +
> +	iova_base = iova = __alloc_iova(mapping, size);
> +	if (iova == ARM_DMA_ERROR)
> +		return -ENOMEM;
> +
> +	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
> +		phys_addr_t phys = page_to_phys(sg_page(s));
> +		unsigned int len = PAGE_ALIGN(s->offset + s->length);
> +
> +		if (!arch_is_coherent())
> +			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> +
> +		ret = iommu_map(mapping->domain, iova, phys, len, 0);

Dealing with phys addresses in one part and pages + offset + length in a
different part doesn't look like a good idea. Why can't there be some
consistency?
Re: [PATCH 2/2] iommu/omap: fix NULL pointer dereference
On Wed, Feb 22, 2012 at 11:10:38AM +0200, Ohad Ben-Cohen wrote:
> On Wed, Feb 22, 2012 at 10:56 AM, Russell King - ARM Linux wrote:
> > There's something else which needs fixing here - the return code. If
> > omap_find_iovm_area() returns an error code, that needs propagating.
> > Otherwise it might as well just return NULL for all errors.
>
> Seems like it does just return NULL for all errors, so a cleaner fix
> can probably just be s/IS_ERR(area)/!area/.
>
> I'll submit it, but Joerg, if you feel this is becoming a "cleanup",
> feel free to take the first version.

That sounds more like a bug fix than a cleanup to me.
Re: [PATCH 2/2] iommu/omap: fix NULL pointer dereference
On Wed, Feb 22, 2012 at 10:52:52AM +0200, Ohad Ben-Cohen wrote:
> @@ -274,7 +274,7 @@ static ssize_t debug_read_mem(struct file *file, char __user *userbuf,
> 	mutex_lock(&iommu_debug_lock);
>
> 	area = omap_find_iovm_area(dev, (u32)ppos);
> -	if (IS_ERR(area)) {
> +	if (IS_ERR_OR_NULL(area)) {
> 		bytes = -EINVAL;

There's something else which needs fixing here - the return code. If
omap_find_iovm_area() returns an error code, that needs propagating.
Otherwise it might as well just return NULL for all errors.
Re: [PATCH 8/8 RESEND] ARM: dma-mapping: add support for IOMMU mapper
On Mon, Jan 09, 2012 at 04:49:21PM +0100, Marek Szyprowski wrote:
> This patch adds a complete implementation of the DMA-mapping API for
> devices that have IOMMU support. All DMA-mapping calls are supported.
>
> This patch contains some of the code kindly provided by Krishna Reddy
> and Andrzej Pietrasiewicz
>
> Signed-off-by: Marek Szyprowski
> Signed-off-by: Kyungmin Park
>
> ---
>
> Hello,
>
> This is the corrected version of the previous patch from the "[PATCH
> 0/8 v4] ARM: DMA-mapping framework redesign" thread, which can be found
> here: http://www.spinics.net/lists/linux-mm/msg27382.html
>
> The previous version had a very nasty bug which caused memory trashing
> if DMA-mapping managed to allocate pages larger than 4KiB. The problem
> was in the __iommu_alloc_buffer() function, which did not check how
> many pages were left to allocate.

This patch seems to be incomplete. If the standard DMA API is used (the
one which exists in current kernels) and NEED_SG_DMA_LENGTH is enabled,
then where do we set the DMA length in the scatterlist?

> diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
> new file mode 100644
> index 000..6668b41
> --- /dev/null
> +++ b/arch/arm/include/asm/dma-iommu.h
> @@ -0,0 +1,36 @@
> +#ifndef ASMARM_DMA_IOMMU_H
> +#define ASMARM_DMA_IOMMU_H
> +
> +#ifdef __KERNEL__
> +
> +#include
> +#include
> +#include
> +#include
> +
> +#include

I can't see anything in here which needs asm/memory.h - if files which
include this need it, please include it in there so we can see why it's
needed.
> +
> +struct dma_iommu_mapping {
> +	/* iommu specific data */
> +	struct iommu_domain	*domain;
> +
> +	void			*bitmap;
> +	size_t			bits;
> +	unsigned int		order;
> +	dma_addr_t		base;
> +
> +	spinlock_t		lock;
> +	struct kref		kref;
> +};
> +
> +struct dma_iommu_mapping *
> +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
> +			 int order);
> +
> +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping);
> +
> +int arm_iommu_attach_device(struct device *dev,
> +			    struct dma_iommu_mapping *mapping);
> +
> +#endif /* __KERNEL__ */
> +#endif
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 4845c09..2287b01 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -27,6 +27,9 @@
> #include
> #include
>
> +#include

linux/ includes should be grouped together.

> diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h
> index 15e9f04..6bbc402 100644
> --- a/arch/arm/mm/vmregion.h
> +++ b/arch/arm/mm/vmregion.h
> @@ -17,7 +17,7 @@ struct arm_vmregion {
> 	struct list_head	vm_list;
> 	unsigned long		vm_start;
> 	unsigned long		vm_end;
> -	struct page		*vm_pages;
> +	void			*priv;

I want to think about that - I may wish to export the vm_pages via the
new dma-mappings file to provide additional information.