Re: Consult on ARM SMMU debugfs

2021-01-15 Thread Russell King - ARM Linux admin
On Mon, Jan 11, 2021 at 08:01:48PM +, Robin Murphy wrote:
> On 2021-01-07 02:45, chenxiang (M) wrote:
> > Hi Will, Robin, or other guys,
> > 
> > When debugging an SMMU/SVA issue on a Huawei ARM64 board, we found that
> > there is not enough debugfs support for the ARM SMMU driver (such as
> > the values of the STE/CD, which we sometimes need to check). Currently a
> > top-level iommu directory is created in debugfs, but there is nothing
> > specific to the ARM SMMU driver. Do you know whether ARM has plans to
> > add that soon?
> 
> FWIW I don't think I've ever felt the need to inspect the Stream
> Table on a live system. So far the nature of the STE code has been simple
> enough that it's very hard for any given STE to be *wrong* - either it's set
> up as expected and thus works fine, or it's not initialised at all and you
> get C_BAD_STE, where 99% of the time you then just cross-reference the
> Stream ID against the firmware and find that the DT/IORT is wrong.
> 
> Similarly I don't think I've ever even *seen* an issue that could be
> attributed to a context descriptor, although I appreciate that as we start
> landing more PASID and SVA support the scope for that starts to widen
> considerably.
> 
> Feel free to propose a patch if you believe it would be genuinely useful and
> won't just bit-rot into a maintenance burden, but it's not something that's
> on our roadmap here.

I do think that the IOMMU stuff needs better debugging. I've hit the
WARN_ON() in __arm_lpae_map(), and it's been pretty much undebuggable,
so I've resorted to putting the IOMMU into bypass mode permanently to
work around the issue.

The reason it's undebuggable is that if one puts printk() or trace
statements in the code and boots the platform, you get flooded with
those debugging messages, because every access to the rootfs generates
and tears down a mapping.

It would be nice to be able to inspect the IOMMU page tables and state
of the IOMMU, rather than having to resort to effectively disabling
the IOMMU.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/4] ARM/omap1: switch to use dma_direct_set_offset for lbus DMA offsets

2020-09-23 Thread Russell King - ARM Linux admin
On Mon, Sep 21, 2020 at 08:47:23AM +0200, Christoph Hellwig wrote:
> On Mon, Sep 21, 2020 at 09:44:18AM +0300, Tony Lindgren wrote:
> > * Janusz Krzysztofik  [200919 22:29]:
> > > Hi Tony,
> > > 
> > > On Friday, September 18, 2020 7:49:33 A.M. CEST Tony Lindgren wrote:
> > > > * Christoph Hellwig  [200917 17:37]:
> > > > > Switch the omap1510 platform ohci device to use dma_direct_set_offset
> > > > > to set the DMA offset instead of using direct hooks into the DMA
> > > > > mapping code and remove the now unused hooks.
> > > > 
> > > > Looks nice to me :) I still can't test this for probably a few more weeks
> > > > though but hopefully Aaro or Janusz (Added to Cc) can test it.
> > > 
> > > Works for me on Amstrad Delta (tested with a USB ethernet adapter).
> > > 
> > > Tested-by: Janusz Krzysztofik 
> > 
> > Great, good to hear! And thanks for testing it.
> > 
> > Christoph, feel free to queue this along with the other patches:
> > 
> > Acked-by: Tony Lindgren 
> > 
> > Or let me know if you want me to pick it up.
> 
> I'd prefer to pick it up through the dma-mapping tree, but preferably
> I'd pick the whole series up once Russell has tested the Netwinder.

I don't think that's going to happen very soon... seems way too much
effort to pull down the appropriate tree to build and test.  Sorry.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 2/4] ARM/footbridge: switch to use dma_direct_set_offset for lbus DMA offsets

2020-09-17 Thread Russell King - ARM Linux admin
On Thu, Sep 17, 2020 at 07:32:27PM +0200, Christoph Hellwig wrote:
>  static int __init cats_pci_init(void)
>  {
> - if (machine_is_cats())
> - pci_common_init(&cats_pci);
> + if (!machine_is_cats())
> + return 0;
> + bus_register_notifier(&pci_bus_type, &footbridge_pci_dma_nb);
> + pci_common_init(&cats_pci);

I'd prefer these things to retain a positive-logic construct, so:

	if (machine_is_cats()) {
		bus_register_notifier(&pci_bus_type, &footbridge_pci_dma_nb);
		pci_common_init(&cats_pci);
	}

It's the same number of lines.

Otherwise, I think it's fine. I'll try to find some spare time to give
it a go on a Netwinder.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 3/4] ARM/dma-mapping: don't handle NULL devices in dma-direct.h

2020-09-17 Thread Russell King - ARM Linux admin
On Thu, Sep 17, 2020 at 07:32:28PM +0200, Christoph Hellwig wrote:
> The DMA API removed support for not passing in a device a long time
> ago, so remove the NULL checks.

What happens with ISA devices?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE

2020-09-11 Thread Russell King - ARM Linux admin
On Thu, Sep 10, 2020 at 07:40:37AM +0200, Christoph Hellwig wrote:
> The DMA offset notifier can only be used if PHYS_OFFSET is at least
> KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit
> phys_addr_t.  Currently the code compiles fine despite that, a pending
> change to the DMA offset handling would create a compiler warning for
> this case.  Add an ifdef to not compile the code except for LPAE
> configs.

However, to make use of the high physical offset, LPAE needs to be
enabled, which ensures that phys_addr_t is 64-bit.

I believe that DMA is non-coherent on this platform unless the high
physical address is used. Or something like that.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: decruft the vmalloc API

2020-04-08 Thread Russell King - ARM Linux admin
On Wed, Apr 08, 2020 at 01:58:58PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> Peter noticed that with some dumb luck you can toast the kernel address
> space with exported vmalloc symbols.
> 
> I used this as an opportunity to decruft the vmalloc.c API and make it
> much more systematic.  This also removes any chance to create vmalloc
> mappings outside the designated areas or using executable permissions
> from modules.  Besides that it removes more than 300 lines of code.

I haven't read all your patches yet.

Have you tested it on 32-bit ARM, where the module area is located
_below_ PAGE_OFFSET and outside of the vmalloc area?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH] iommu: silence iommu group prints

2020-03-04 Thread Russell King - ARM Linux admin
On Wed, Mar 04, 2020 at 12:33:14PM +0200, Laurentiu Tudor wrote:
> 
> 
> On 04.03.2020 12:07, Russell King - ARM Linux admin wrote:
> > On Wed, Mar 04, 2020 at 11:56:53AM +0200, Laurentiu Tudor wrote:
> > >  From 44ae46501b5379bd0890df87fd3827248626ed03 Mon Sep 17 00:00:00 2001
> > > From: Laurentiu Tudor 
> > > Date: Tue, 1 Oct 2019 16:22:48 +0300
> > > Subject: [PATCH 1/6] bus: fsl-mc: make mc work with smmu disable bypass on
> > > Content-Type: text/plain; charset="us-ascii"
> > > 
> > > Since this commit [1] booting kernel on MC based chips started to
> > > fail because this firmware starts before the kernel and as soon as
> > > SMMU is probed it starts to trigger contiguous faults.
> > 
> > I think you mean "continuous" here.
> 
> Yes, thanks.
> 
> > > This is a workaround that allows MC firmware to run with SMMU
> > > in disable bypass mode. It consists of the following steps:
> > >   1. pause the firmware at early boot to get a chance to setup SMMU
> > >   2. request direct mapping for MC device
> > >   3. resume the firmware
> > > The workaround relies on the fact that no state is lost when
> > > pausing / resuming firmware, as per the docs.
> > > With this patch, platforms with MC firmware can now boot without
> > > having to change the default config to set:
> > >CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=n
> > 
> > This alone is a definite improvement, and has been needed for a while.
> > Please submit this patch with an appropriate Fixes: tag so stable trees
> > can pick it up.
> 
> The thing is that this workaround will probably never make it into the kernel
> because it is questionable, to say the least - e.g. see [1]. My plan is to give
> this approach [2] a try sometime soon.
> 
> [1] https://patchwork.kernel.org/comment/23149049/
> [2] https://patchwork.kernel.org/cover/11279577/

So, if we want to reduce the iommu noise, we need to completely break
the ability to use a mainline kernel on the LX2160A.  This seems
neither practical nor sensible.  Something has to give.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] iommu: silence iommu group prints

2020-03-04 Thread Russell King - ARM Linux admin
On Wed, Mar 04, 2020 at 11:56:53AM +0200, Laurentiu Tudor wrote:
> From 44ae46501b5379bd0890df87fd3827248626ed03 Mon Sep 17 00:00:00 2001
> From: Laurentiu Tudor 
> Date: Tue, 1 Oct 2019 16:22:48 +0300
> Subject: [PATCH 1/6] bus: fsl-mc: make mc work with smmu disable bypass on
> Content-Type: text/plain; charset="us-ascii"
> 
> Since this commit [1] booting kernel on MC based chips started to
> fail because this firmware starts before the kernel and as soon as
> SMMU is probed it starts to trigger contiguous faults.

I think you mean "continuous" here.

> This is a workaround that allows MC firmware to run with SMMU
> in disable bypass mode. It consists of the following steps:
>  1. pause the firmware at early boot to get a chance to setup SMMU
>  2. request direct mapping for MC device
>  3. resume the firmware
> The workaround relies on the fact that no state is lost when
> pausing / resuming firmware, as per the docs.
> With this patch, platforms with MC firmware can now boot without
> having to change the default config to set:
>   CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=n

This alone is a definite improvement, and has been needed for a while.
Please submit this patch with an appropriate Fixes: tag so stable trees
can pick it up.

> [1] 954a03be033 ("iommu/arm-smmu: Break insecure users by disabling bypass by 
> default")

Please put this where you're referencing it above - it's fine to wrap
the description of the commit when using it in the body of the commit
message.  However, it should _never_ be wrapped when providing a Fixes:
tag (linux-next has a script which will detect and complain about
broken Fixes: tags.)
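For reference, the form the tooling expects is a single unwrapped line, using the same abbreviated SHA quoted above:

```
Fixes: 954a03be033 ("iommu/arm-smmu: Break insecure users by disabling bypass by default")
```

The line may exceed the usual 75-column commit-message limit; that is acceptable for Fixes: tags precisely so that scripts can match them.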

Thanks.

> 
> Signed-off-by: Laurentiu Tudor 
> ---
>  drivers/bus/fsl-mc/fsl-mc-bus.c | 53 +
>  1 file changed, 53 insertions(+)
> 
> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
> index a0f8854acb3a..683a6401ffe8 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> @@ -18,6 +18,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include "fsl-mc-private.h"
>  
> @@ -889,6 +891,12 @@ static int get_mc_addr_translation_ranges(struct device 
> *dev,
>   return 0;
>  }
>  
> +#define FSL_MC_GCR1  0x0
> +#define GCR1_P1_STOP BIT(31)
> +
> +static u32 boot_gcr1;
> +static void __iomem *fsl_mc_regs;
> +
>  /**
>   * fsl_mc_bus_probe - callback invoked when the root MC bus is being
>   * added
> @@ -906,6 +914,21 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
>   struct mc_version mc_version;
>   struct resource res;
>  
> + /*
> +  * The MC firmware requires full access to the whole address space plus
> +  * it has no way of dealing with any kind of address translation, so
> +  * request direct mapping for it.
> +  */
> + error = iommu_request_dm_for_dev(&pdev->dev);
> + if (error)
> + dev_warn(&pdev->dev, "iommu_request_dm_for_dev() failed: %d\n",
> +  error);
> +
> + if (fsl_mc_regs) {
> + /* Resume the firmware */
> + writel(boot_gcr1 & ~GCR1_P1_STOP, fsl_mc_regs + FSL_MC_GCR1);
> + }
> +
>   mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
>   if (!mc)
>   return -ENOMEM;
> @@ -990,6 +1013,13 @@ static int fsl_mc_bus_remove(struct platform_device 
> *pdev)
>   if (!fsl_mc_is_root_dprc(&mc->root_mc_bus_dev->dev))
>   return -EINVAL;
>  
> + /*
> +  * Pause back the firmware so that it doesn't trigger faults by the
> +  * time SMMU gets brought down.
> +  */
> + writel(boot_gcr1 | GCR1_P1_STOP, fsl_mc_regs + FSL_MC_GCR1);
> + iounmap(fsl_mc_regs);
> +
>   fsl_mc_device_remove(mc->root_mc_bus_dev);
>  
>   fsl_destroy_mc_io(mc->root_mc_bus_dev->mc_io);
> @@ -1018,6 +1048,8 @@ static struct platform_driver fsl_mc_bus_driver = {
>  static int __init fsl_mc_bus_driver_init(void)
>  {
>   int error;
> + struct device_node *np;
> + struct resource res;
>  
>   error = bus_register(&fsl_mc_bus_type);
>   if (error < 0) {
> @@ -1039,9 +1071,30 @@ static int __init fsl_mc_bus_driver_init(void)
>   if (error < 0)
>   goto error_cleanup_dprc_driver;
>  
> + np = of_find_matching_node(NULL, fsl_mc_bus_match_table);
> + if (np && of_device_is_available(np)) {
> + error = of_address_to_resource(np, 1, &res);
> + if (error)
> + goto error_cleanup_dprc_driver;
> + fsl_mc_regs = ioremap(res.start, resource_size(&res));
> + if (!fsl_mc_regs) {
> + error = -ENXIO;
> + goto error_cleanup_dprc_driver;
> + }
> +
> + boot_gcr1 = readl(fsl_mc_regs + FSL_MC_GCR1);
> + /*
> +  * If found running, pause MC firmware in order to get a chance
> +  * to setup SMMU for it.
> +  */
> +

Re: [PATCH] iommu: silence iommu group prints

2020-03-04 Thread Russell King - ARM Linux admin
On Wed, Mar 04, 2020 at 11:42:16AM +0200, Laurentiu Tudor wrote:
> On 04.03.2020 11:33, Russell King - ARM Linux admin wrote:
> > On Wed, Mar 04, 2020 at 10:56:06AM +0200, Laurentiu Tudor wrote:
> > > 
> > > On 04.03.2020 00:17, Russell King - ARM Linux admin wrote:
> > > > On Tue, Mar 03, 2020 at 05:55:05PM +0200, Laurentiu Tudor wrote:
> > > > >   From c98dc05cdd45ae923654f2427985bd28bcde4bb2 Mon Sep 17 00:00:00 
> > > > > 2001
> > > > > From: Laurentiu Tudor 
> > > > > Date: Thu, 13 Feb 2020 11:59:12 +0200
> > > > > Subject: [PATCH 1/4] bus: fsl-mc: add custom .dma_configure 
> > > > > implementation
> > > > > Content-Type: text/plain; charset="us-ascii"
> > > > > 
> > > > > The devices on this bus are not discovered by way of device tree
> > > > > but by queries to the firmware. It makes little sense to trick the
> > > > > generic OF layer into thinking that these devices are OF related so
> > > > > that we can get our dma configuration. Instead of doing that, add
> > > > > our custom dma configuration implementation.
> > > > 
> > > > Firstly, applying this to v5.5 results in a build failure, due to a
> > > > missing linux/iommu.h include.
> > > > 
> > > > Secondly, this on its own appears to make the DPAA2 network interfaces
> > > > completely disappear.  Looking in /sys/bus/fsl-mc/drivers/*, none of
> > > > the DPAA2 drivers are bound to anything, and looking in
> > > > /sys/bus/fsl-mc/devices/, there is:
> > > > 
> > > > lrwxrwxrwx 1 root root 0 Mar  3 22:06 dprc.1 -> 
> > > > ../../../devices/platform/soc/80c00.fsl-mc/dprc.1
> > > > 
> > > > This is booting with u-boot, so using DT rather than ACPI.
> > > > 
> > > > > Signed-off-by: Laurentiu Tudor 
> > > > > ---
> > > > >drivers/bus/fsl-mc/fsl-mc-bus.c | 42 
> > > > > -
> > > > >1 file changed, 41 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c 
> > > > > b/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > > > index 36eb25f82c8e..3df015eedae4 100644
> > > > > --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > > > +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > > > @@ -132,11 +132,51 @@ static int fsl_mc_bus_uevent(struct device 
> > > > > *dev, struct kobj_uevent_env *env)
> > > > >static int fsl_mc_dma_configure(struct device *dev)
> > > > >{
> > > > >   struct device *dma_dev = dev;
> > > > > + struct iommu_fwspec *fwspec;
> > > > > + const struct iommu_ops *iommu_ops;
> > > > > + struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
> > > > > + int ret;
> > > > > + u32 icid;
> > > > > +
> > > > > + /* Skip DMA setup for devices that are not DMA masters */
> > > > > + if (dev->type == &fsl_mc_bus_dpmcp_type ||
> > > > > + dev->type == &fsl_mc_bus_dpbp_type ||
> > > > > + dev->type == &fsl_mc_bus_dpcon_type ||
> > > > > + dev->type == &fsl_mc_bus_dpio_type)
> > > > > + return 0;
> > > > >   while (dev_is_fsl_mc(dma_dev))
> > > > >   dma_dev = dma_dev->parent;
> > > > > - return of_dma_configure(dev, dma_dev->of_node, 0);
> > > > > + fwspec = dev_iommu_fwspec_get(dma_dev);
> > > > > + if (!fwspec)
> > > > > + return -ENODEV;
> > > > 
> > > > The problem appears to be here - fwspec is NULL for dprc.1.
> > > 
> > > Ok, that's because the iommu config is missing from the DT node that's
> > > exposing the MC firmware. I've attached a fresh set of patches that 
> > > include
> > > on top the missing config and a workaround that makes MC work over SMMU.
> > > Also added the missing #include, thanks for pointing out.
> > > Let me know how it goes.
> > 
> > Shouldn't patch 6 be first in the series, so that patch 1 does
> > not cause a regression and bisectability damage?
> > 
> 
> Correct, with one comment: both 5 and 6 should go first. Once iommu-map is
> added in the device tree the workaround for MC with the
> arm-smmu.disable_bypass boot arg will no longer work.

So, wouldn't it be reasonable to arrange the patch series like that?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] iommu: silence iommu group prints

2020-03-04 Thread Russell King - ARM Linux admin
On Wed, Mar 04, 2020 at 10:56:06AM +0200, Laurentiu Tudor wrote:
> 
> On 04.03.2020 00:17, Russell King - ARM Linux admin wrote:
> > On Tue, Mar 03, 2020 at 05:55:05PM +0200, Laurentiu Tudor wrote:
> > >  From c98dc05cdd45ae923654f2427985bd28bcde4bb2 Mon Sep 17 00:00:00 2001
> > > From: Laurentiu Tudor 
> > > Date: Thu, 13 Feb 2020 11:59:12 +0200
> > > Subject: [PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation
> > > Content-Type: text/plain; charset="us-ascii"
> > > 
> > > The devices on this bus are not discovered by way of device tree
> > > but by queries to the firmware. It makes little sense to trick the
> > > generic OF layer into thinking that these devices are OF related so
> > > that we can get our dma configuration. Instead of doing that, add
> > > our custom dma configuration implementation.
> > 
> > Firstly, applying this to v5.5 results in a build failure, due to a
> > missing linux/iommu.h include.
> > 
> > Secondly, this on its own appears to make the DPAA2 network interfaces
> > completely disappear.  Looking in /sys/bus/fsl-mc/drivers/*, none of
> > the DPAA2 drivers are bound to anything, and looking in
> > /sys/bus/fsl-mc/devices/, there is:
> > 
> > lrwxrwxrwx 1 root root 0 Mar  3 22:06 dprc.1 -> 
> > ../../../devices/platform/soc/80c00.fsl-mc/dprc.1
> > 
> > This is booting with u-boot, so using DT rather than ACPI.
> > 
> > > Signed-off-by: Laurentiu Tudor 
> > > ---
> > >   drivers/bus/fsl-mc/fsl-mc-bus.c | 42 -
> > >   1 file changed, 41 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c 
> > > b/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > index 36eb25f82c8e..3df015eedae4 100644
> > > --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > @@ -132,11 +132,51 @@ static int fsl_mc_bus_uevent(struct device *dev, 
> > > struct kobj_uevent_env *env)
> > >   static int fsl_mc_dma_configure(struct device *dev)
> > >   {
> > >   struct device *dma_dev = dev;
> > > + struct iommu_fwspec *fwspec;
> > > + const struct iommu_ops *iommu_ops;
> > > + struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
> > > + int ret;
> > > + u32 icid;
> > > +
> > > + /* Skip DMA setup for devices that are not DMA masters */
> > > + if (dev->type == &fsl_mc_bus_dpmcp_type ||
> > > + dev->type == &fsl_mc_bus_dpbp_type ||
> > > + dev->type == &fsl_mc_bus_dpcon_type ||
> > > + dev->type == &fsl_mc_bus_dpio_type)
> > > + return 0;
> > >   while (dev_is_fsl_mc(dma_dev))
> > >   dma_dev = dma_dev->parent;
> > > - return of_dma_configure(dev, dma_dev->of_node, 0);
> > > + fwspec = dev_iommu_fwspec_get(dma_dev);
> > > + if (!fwspec)
> > > + return -ENODEV;
> > 
> > The problem appears to be here - fwspec is NULL for dprc.1.
> 
> Ok, that's because the iommu config is missing from the DT node that's
> exposing the MC firmware. I've attached a fresh set of patches that include
> on top the missing config and a workaround that makes MC work over SMMU.
> Also added the missing #include, thanks for pointing out.
> Let me know how it goes.

Shouldn't patch 6 be first in the series, so that patch 1 does
not cause a regression and bisectability damage?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] iommu: silence iommu group prints

2020-03-03 Thread Russell King - ARM Linux admin
On Tue, Mar 03, 2020 at 05:55:05PM +0200, Laurentiu Tudor wrote:
> From c98dc05cdd45ae923654f2427985bd28bcde4bb2 Mon Sep 17 00:00:00 2001
> From: Laurentiu Tudor 
> Date: Thu, 13 Feb 2020 11:59:12 +0200
> Subject: [PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation
> Content-Type: text/plain; charset="us-ascii"
> 
> The devices on this bus are not discovered by way of device tree
> but by queries to the firmware. It makes little sense to trick the
> generic OF layer into thinking that these devices are OF related so
> that we can get our dma configuration. Instead of doing that, add
> our custom dma configuration implementation.

Firstly, applying this to v5.5 results in a build failure, due to a
missing linux/iommu.h include.

Secondly, this on its own appears to make the DPAA2 network interfaces
completely disappear.  Looking in /sys/bus/fsl-mc/drivers/*, none of
the DPAA2 drivers are bound to anything, and looking in
/sys/bus/fsl-mc/devices/, there is:

lrwxrwxrwx 1 root root 0 Mar  3 22:06 dprc.1 -> 
../../../devices/platform/soc/80c00.fsl-mc/dprc.1

This is booting with u-boot, so using DT rather than ACPI.

> Signed-off-by: Laurentiu Tudor 
> ---
>  drivers/bus/fsl-mc/fsl-mc-bus.c | 42 -
>  1 file changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
> index 36eb25f82c8e..3df015eedae4 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> @@ -132,11 +132,51 @@ static int fsl_mc_bus_uevent(struct device *dev, struct 
> kobj_uevent_env *env)
>  static int fsl_mc_dma_configure(struct device *dev)
>  {
>   struct device *dma_dev = dev;
> + struct iommu_fwspec *fwspec;
> + const struct iommu_ops *iommu_ops;
> + struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
> + int ret;
> + u32 icid;
> +
> + /* Skip DMA setup for devices that are not DMA masters */
> + if (dev->type == &fsl_mc_bus_dpmcp_type ||
> + dev->type == &fsl_mc_bus_dpbp_type ||
> + dev->type == &fsl_mc_bus_dpcon_type ||
> + dev->type == &fsl_mc_bus_dpio_type)
> + return 0;
>  
>   while (dev_is_fsl_mc(dma_dev))
>   dma_dev = dma_dev->parent;
>  
> - return of_dma_configure(dev, dma_dev->of_node, 0);
> + fwspec = dev_iommu_fwspec_get(dma_dev);
> + if (!fwspec)
> + return -ENODEV;

The problem appears to be here - fwspec is NULL for dprc.1.

> + iommu_ops = iommu_ops_from_fwnode(fwspec->iommu_fwnode);
> + if (!iommu_ops)
> + return -ENODEV;
> +
> + ret = iommu_fwspec_init(dev, fwspec->iommu_fwnode, iommu_ops);
> + if (ret)
> + return ret;
> +
> + icid = mc_dev->icid;
> + ret = iommu_fwspec_add_ids(dev, &icid, 1);
> + if (ret) {
> + iommu_fwspec_free(dev);
> + return ret;
> + }
> +
> + if (!device_iommu_mapped(dev)) {
> + ret = iommu_probe_device(dev);
> + if (ret) {
> + iommu_fwspec_free(dev);
> + return ret;
> + }
> + }
> +
> + arch_setup_dma_ops(dev, 0, *dma_dev->dma_mask + 1, iommu_ops, true);
> +
> + return 0;
>  }
>  
>  static ssize_t modalias_show(struct device *dev, struct device_attribute 
> *attr,
> -- 
> 2.17.1
> 


-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] iommu: silence iommu group prints

2020-03-03 Thread Russell King - ARM Linux admin
On Tue, Mar 03, 2020 at 04:18:57PM +0200, Laurentiu Tudor wrote:
> 
> 
> On 28.02.2020 20:32, Robin Murphy wrote:
> > [ +Laurentiu ]
> > 
> > Hi Russell,
> > 
> > Thanks for sharing a log, now I properly understand what's up... further
> > comments at the end (for context).
> > 
> > On 28/02/2020 10:06 am, Russell King - ARM Linux admin wrote:
> > > On Fri, Feb 28, 2020 at 09:33:40AM +, John Garry wrote:
> > > > On 28/02/2020 02:16, Lu Baolu wrote:
> > > > > Hi,
> > > > > 
> > > > > On 2020/2/27 19:57, Russell King wrote:
> > > > > > On the LX2160A, there are lots (about 160) of IOMMU messages 
> > > > > > produced
> > > > > > during boot; this is excessive.  Reduce the severity of these 
> > > > > > messages
> > > > > > to debug level.
> > > > > > 
> > > > > > Signed-off-by: Russell King 
> > > > > > ---
> > > > > >    drivers/iommu/iommu.c | 4 ++--
> > > > > >    1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > > > > index 3ead597e1c57..304281ec623b 100644
> > > > > > --- a/drivers/iommu/iommu.c
> > > > > > +++ b/drivers/iommu/iommu.c
> > > > > > @@ -741,7 +741,7 @@ int iommu_group_add_device(struct iommu_group
> > > > > > *group, struct device *dev)
> > > > > >    trace_add_device_to_group(group->id, dev);
> > > > > > -    dev_info(dev, "Adding to iommu group %d\n", group->id);
> > > > > > +    dev_dbg(dev, "Adding to iommu group %d\n", group->id);
> > > > > 
> > > > > I'm not strongly against this. But to me this message seems
> > > > > to be a good
> > > > > indicator that a device was probed successfully by the iommu 
> > > > > subsystem.
> > > > > Keeping it in the default kernel message always helps to the kernel
> > > > > debugging.
> > > > > 
> > > > 
> > > > I would tend to agree.
> > > 
> > > Here's the boot messages.  Notice how many of these "Adding to iommu
> > > group" messages there are:
> > > 
> > > [    0.00] Booting Linux on physical CPU 0x00 [0x410fd083]
> > > [    0.00] Linux version 5.5.0+ (rmk@rmk-PC) (gcc version 4.9.2
> > > (GCC)) #655 SMP PREEMPT Fri Feb 28 09:54:47 GMT 2020
> > > [    0.00] Machine model: SolidRun LX2160A Clearfog CX
> > > [    0.00] earlycon: pl11 at MMIO32 0x021c (options '')
> > > [    0.00] printk: bootconsole [pl11] enabled
> > > [    0.00] efi: Getting EFI parameters from FDT:
> > > [    0.00] efi: UEFI not found.
> > > [    0.00] cma: Reserved 32 MiB at 0xf9c0
> > > [    0.00] On node 0 totalpages: 1555968
> > > [    0.00]   DMA zone: 4096 pages used for memmap
> > > [    0.00]   DMA zone: 0 pages reserved
> > > [    0.00]   DMA zone: 262144 pages, LIFO batch:63
> > > [    0.00]   DMA32 zone: 3832 pages used for memmap
> > > [    0.00]   DMA32 zone: 245248 pages, LIFO batch:63
> > > [    0.00]   Normal zone: 16384 pages used for memmap
> > > [    0.00]   Normal zone: 1048576 pages, LIFO batch:63
> > > [    0.00] psci: probing for conduit method from DT.
> > > [    0.00] psci: PSCIv1.1 detected in firmware.
> > > [    0.00] psci: Using standard PSCI v0.2 function IDs
> > > [    0.00] psci: MIGRATE_INFO_TYPE not supported.
> > > [    0.00] psci: SMC Calling Convention v1.1
> > > [    0.00] percpu: Embedded 31 pages/cpu s88968 r8192 d29816 u126976
> > > [    0.00] pcpu-alloc: s88968 r8192 d29816 u126976 alloc=31*4096
> > > [    0.00] pcpu-alloc: [0] 00 [0] 01 [0] 02 [0] 03 [0] 04 [0] 05
> > > [0] 06 [0] 07
> > > [    0.00] pcpu-alloc: [0] 08 [0] 09 [0] 10 [0] 11 [0] 12 [0] 13
> > > [0] 14 [0] 15
> > > [    0.00] Detected PIPT I-cache on CPU0
> > > [    0.00] CPU features: detected: GIC system register CPU interface
> > > [    0.00] CPU features: detected: EL2 vector hardening
> > > [    0.00] Speculative Store Bypass Disable mitigation not required
> > > [    0.00] CPU features: detected: ARM erratum

Re: [PATCH] iommu: silence iommu group prints

2020-02-28 Thread Russell King - ARM Linux admin
On Fri, Feb 28, 2020 at 09:33:40AM +, John Garry wrote:
> On 28/02/2020 02:16, Lu Baolu wrote:
> > Hi,
> > 
> > On 2020/2/27 19:57, Russell King wrote:
> > > On the LX2160A, there are lots (about 160) of IOMMU messages produced
> > > during boot; this is excessive.  Reduce the severity of these messages
> > > to debug level.
> > > 
> > > Signed-off-by: Russell King 
> > > ---
> > >   drivers/iommu/iommu.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > index 3ead597e1c57..304281ec623b 100644
> > > --- a/drivers/iommu/iommu.c
> > > +++ b/drivers/iommu/iommu.c
> > > @@ -741,7 +741,7 @@ int iommu_group_add_device(struct iommu_group
> > > *group, struct device *dev)
> > >   trace_add_device_to_group(group->id, dev);
> > > -    dev_info(dev, "Adding to iommu group %d\n", group->id);
> > > +    dev_dbg(dev, "Adding to iommu group %d\n", group->id);
> > 
> > I'm not strongly against this. But to me this message seems to be a good
> > indicator that a device was probed successfully by the iommu subsystem.
> > Keeping it in the default kernel message always helps to the kernel
> > debugging.
> > 
> 
> I would tend to agree.

Here's the boot messages.  Notice how many of these "Adding to iommu
group" messages there are:

[0.00] Booting Linux on physical CPU 0x00 [0x410fd083]
[0.00] Linux version 5.5.0+ (rmk@rmk-PC) (gcc version 4.9.2 (GCC)) #655 SMP PREEMPT Fri Feb 28 09:54:47 GMT 2020
[0.00] Machine model: SolidRun LX2160A Clearfog CX
[0.00] earlycon: pl11 at MMIO32 0x021c (options '')
[0.00] printk: bootconsole [pl11] enabled
[0.00] efi: Getting EFI parameters from FDT:
[0.00] efi: UEFI not found.
[0.00] cma: Reserved 32 MiB at 0xf9c0
[0.00] On node 0 totalpages: 1555968
[0.00]   DMA zone: 4096 pages used for memmap
[0.00]   DMA zone: 0 pages reserved
[0.00]   DMA zone: 262144 pages, LIFO batch:63
[0.00]   DMA32 zone: 3832 pages used for memmap
[0.00]   DMA32 zone: 245248 pages, LIFO batch:63
[0.00]   Normal zone: 16384 pages used for memmap
[0.00]   Normal zone: 1048576 pages, LIFO batch:63
[0.00] psci: probing for conduit method from DT.
[0.00] psci: PSCIv1.1 detected in firmware.
[0.00] psci: Using standard PSCI v0.2 function IDs
[0.00] psci: MIGRATE_INFO_TYPE not supported.
[0.00] psci: SMC Calling Convention v1.1
[0.00] percpu: Embedded 31 pages/cpu s88968 r8192 d29816 u126976
[0.00] pcpu-alloc: s88968 r8192 d29816 u126976 alloc=31*4096
[0.00] pcpu-alloc: [0] 00 [0] 01 [0] 02 [0] 03 [0] 04 [0] 05 [0] 06 [0] 07
[0.00] pcpu-alloc: [0] 08 [0] 09 [0] 10 [0] 11 [0] 12 [0] 13 [0] 14 [0] 15
[0.00] Detected PIPT I-cache on CPU0
[0.00] CPU features: detected: GIC system register CPU interface
[0.00] CPU features: detected: EL2 vector hardening
[0.00] Speculative Store Bypass Disable mitigation not required
[0.00] CPU features: detected: ARM erratum 1319367
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 1531656
[0.00] Kernel command line: console=ttyAMA0,115200 root=PARTUUID=c7837e2f-02 rootwait earlycon=pl011,mmio32,0x21c ramdisk_size=0 pci=pcie_bus_perf arm_smmu.disable_bypass=0 iommu.passthrough=0
[0.00] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[0.00] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[0.00] mem auto-init: stack:off, heap alloc:off, heap free:off
[0.00] software IO TLB: mapped [mem 0xbbfff000-0xb000] (64MB)
[0.00] Memory: 5991500K/6223872K available (10172K kernel code, 1376K rwdata, 3888K rodata, 960K init, 4326K bss, 199604K reserved, 32768K cma-reserved)
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
[0.00] rcu: Preemptible hierarchical RCU implementation.
[0.00] rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=16.
[0.00]  Tasks RCU enabled.
[0.00] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=16
[0.00] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[0.00] GICv3: GIC: Using split EOI/Deactivate mode
[0.00] GICv3: 256 SPIs implemented
[0.00] GICv3: 0 Extended SPIs implemented
[0.00] GICv3: Distributor has no Range Selector support
[0.00] GICv3: 16 PPIs implemented
[0.00] GICv3: no VLPI support, no direct LPI support
[0.00] GICv3: CPU0: found redistributor 0 region 0:0x0620
[0.00] ITS [mem 0x0602-0x0603]
[0.00] ITS@0x0602: allocated 65536 Devices @2178d0 (flat, esz 8, psz 64K, shr 0)
[0.00] ITS:

Re: [PATCH] iommu: silence iommu group prints

2020-02-27 Thread Russell King - ARM Linux admin
On Thu, Feb 27, 2020 at 06:19:10PM +, Robin Murphy wrote:
> On 27/02/2020 1:48 pm, Russell King - ARM Linux admin wrote:
> > On Thu, Feb 27, 2020 at 01:44:56PM +, Robin Murphy wrote:
> > > On 27/02/2020 11:57 am, Russell King wrote:
> > > > On the LX2160A, there are lots (about 160) of IOMMU messages produced
> > > > during boot; this is excessive.  Reduce the severity of these messages
> > > > to debug level.
> > > 
> > > That's... a lot. Does the system really have that many devices, or is some
> > > driver being stupid and repeatedly populating and destroying an entire bus
> > > in a probe-deferral dance?
> > 
> > It's all the devices created for the mc-bus for the DPAA2
> > networking support.  I don't know the technicalities, just that
> > the boot is spammed with these messages.
> 
> Well, the "technicalities" are really just whether the thing before the
> colon on each message is unique or not. If you're seeing multiple add and
> remove calls pertaining to the same device (or frankly any remove calls at
> all during boot) then it smacks of something somewhere wasting time and
> resources with unnecessary busywork, which is indicative of either poor
> design or an actual bug, either of which would deserve fixing.

It's not the same device.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: silence iommu group prints

2020-02-27 Thread Russell King - ARM Linux admin
On Thu, Feb 27, 2020 at 01:44:56PM +, Robin Murphy wrote:
> On 27/02/2020 11:57 am, Russell King wrote:
> > On the LX2160A, there are lots (about 160) of IOMMU messages produced
> > during boot; this is excessive.  Reduce the severity of these messages
> > to debug level.
> 
> That's... a lot. Does the system really have that many devices, or is some
> driver being stupid and repeatedly populating and destroying an entire bus
> in a probe-deferral dance?

It's all the devices created for the mc-bus for the DPAA2
networking support.  I don't know the technicalities, just that
the boot is spammed with these messages.



Re: [PATCH] iommu/arm-smmu: fix the module name for disable_bypass parameter

2020-02-11 Thread Russell King - ARM Linux admin
On Tue, Feb 11, 2020 at 05:36:55PM -0600, Li Yang wrote:
> Since commit cd221bd24ff5 ("iommu/arm-smmu: Allow building as a module"),
> there is a side effect that the module name is changed from arm-smmu to
> arm-smmu-mod.  So the kernel parameter for disable_bypass need to be
> changed too.  Fix the Kconfig help and error message to the correct
> parameter name.

Hmm, this seems to be a user-visible change - so those of us who have
been booting with "arm-smmu.disable_bypass=0" now need to change that
depending on which kernel is being booted - which is not nice, and
makes the support side on platforms that need this kernel parameter
harder.

> 
> Signed-off-by: Li Yang 
> ---
>  drivers/iommu/Kconfig| 2 +-
>  drivers/iommu/arm-smmu.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index d2fade984999..fb54be903c60 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -415,7 +415,7 @@ config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
> hardcode the bypass disable in the code.
>  
> NOTE: the kernel command line parameter
> -   'arm-smmu.disable_bypass' will continue to override this
> +   'arm-smmu-mod.disable_bypass' will continue to override this
> config.
>  
>  config ARM_SMMU_V3
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 16c4b87af42b..2ffe8ff04393 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -512,7 +512,7 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>   if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
>   (gfsr & ARM_SMMU_sGFSR_USF))
>   dev_err(smmu->dev,
> - "Blocked unknown Stream ID 0x%hx; boot with \"arm-smmu.disable_bypass=0\" to allow, but this may have security implications\n",
> + "Blocked unknown Stream ID 0x%hx; boot with \"arm-smmu-mod.disable_bypass=0\" to allow, but this may have security implications\n",
>   (u16)gfsynr1);
>   else
>   dev_err(smmu->dev,
> -- 
> 2.17.1
> 
> 



Re: [PATCH] iommu/arm-smmu: Report USF more clearly

2019-09-13 Thread Russell King - ARM Linux admin
On Fri, Sep 13, 2019 at 12:48:37PM +0100, Robin Murphy wrote:
> Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool
> for smoking out inadequate firmware, the failure mode is non-obvious
> and can be confusing for end users. Add some special-case reporting of
> Unidentified Stream Faults to help clarify this particular symptom.

Having encountered this on a board that turned up this week, it may
be better to use the hex representation of the stream ID, especially
as it seems normal for the stream ID to be made up of
implementation-defined bitfields.

If we want to stick with decimal, maybe masking the stream ID with
the number of allowable bits would be a good idea, so that the
decimal value remains meaningful should other bits be non-zero?

> CC: Douglas Anderson 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm-smmu.c | 5 +
>  drivers/iommu/arm-smmu.h | 2 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index b7cf24402a94..76ac8c180695 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>   dev_err_ratelimited(smmu->dev,
>   "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
>   gfsr, gfsynr0, gfsynr1, gfsynr2);
> + if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
> + (gfsr & sGFSR_USF))
> + dev_err_ratelimited(smmu->dev,
> + "Stream ID %hu may not be described by firmware, try booting with \"arm-smmu.disable_bypass=0\"\n",
> + (u16)gfsynr1);
>  
>   arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr);
>   return IRQ_HANDLED;
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index c9c13b5785f2..46f7e161e83e 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -79,6 +79,8 @@
> #define ID7_MINOR            GENMASK(3, 0)
>  
>  #define ARM_SMMU_GR0_sGFSR   0x48
> +#define sGFSR_USF            BIT(2)

I do wonder if this is another instance where writing "(1 << 1)"
would have resulted in less chance of a mistake being made...
wrapping stuff up into macros is not always better!

9.6.15 SMMU_sGFSR, Global Fault Status Register

The SMMU_sGFSR bit assignments are:

USF, bit[1]   Unidentified stream fault. The possible values of this
  bit are:
  0  No Unidentified stream fault.
  1  Unidentified stream fault.

So this wants to be:

#define sGFSR_USF   BIT(1)




Re: [PATCH 1/4] vmalloc: lift the arm flag for coherent mappings to common code

2019-08-30 Thread Russell King - ARM Linux admin
On Fri, Aug 30, 2019 at 08:29:21AM +0200, Christoph Hellwig wrote:
> The arm architecture had a VM_ARM_DMA_CONSISTENT flag to mark DMA
> coherent remapping for a while.  Lift this flag to common code so
> that we can use it generically.  We also check it in the only place
> VM_USERMAP is directly checked so that we can entirely replace that
> flag as well (although I'm not even sure why we'd want to allow
> remapping DMA mappings, but I'd rather not change behavior).

Good, because if you did change that behaviour, you'd break almost
every ARM framebuffer and cripple ARM audio drivers.



Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*

2019-08-06 Thread Russell King - ARM Linux admin
On Tue, Aug 06, 2019 at 05:45:03PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote:
> > On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote:
> > > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote:
> > > > 
> > > > So this boils down to a terminology mismatch. The Arm architecture 
> > > > doesn't have
> > > > anything called "write combine", so in Linux we instead provide what 
> > > > the Arm
> > > > architecture calls "Normal non-cacheable" memory for 
> > > > pgprot_writecombine().
> > > > Amongst other things, this memory type permits speculation, unaligned 
> > > > accesses
> > > > and merging of writes. I found something in the architecture spec about
> > > > non-cachable memory, but it's written in Armglish[1].
> > > > 
> > > > pgprot_noncached(), on the other hand, provides what the architecture 
> > > > calls
> > > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping 
> > > > MMIO
> > > > (i.e. PCI config space) and therefore forbids speculation, preserves 
> > > > access
> > > > size, requires strict alignment and also forces write responses to come 
> > > > from
> > > > the endpoint.
> > > > 
> > > > I think the naming mismatch is historical, but on arm64 we wanted to 
> > > > use the
> > > > same names as arm32 so that any drivers using these things directly 
> > > > would get
> > > > the same behaviour.
> > > 
> > > That all makes sense, but it totally needs a comment.  I'll try to draft
> > > one based on this.  I've also looked at the arm32 code a bit more, and
> > > it seems arm always (?) supported Normal non-cacheable attribute, but
> > > Linux only optionally uses it for arm v6+ because of fears of drivers
> > > missing barriers.
> > 
> > I think it was also to do with aliasing, but I don't recall all of the
> > details.
> 
> ARMv6+ is where the architecture significantly changed to introduce
> the idea of [Normal, Device, Strongly Ordered] where Normal has the
> cache attributes.
> 
> Before that, we had just "uncached/unbuffered, uncached/buffered,
> cached/unbuffered, cached/buffered" modes.
> 
> The write buffer (enabled by buffered modes) has no architected
> guarantees about how long writes will sit in it, and there is only
> the "drain write buffer" instruction to push writes out.
> 
> Up to and including ARMv5, we took the easy approach of just using
> the "uncached/unbuffered" mode since that is (a) the safest, and (b)
> avoids write buffers that alias when there are multiple different
> mappings.
> 
> We could have used a different approach, making all IO writes contain
> a "drain write buffer" instruction, and map DMA memory as "buffered",
> but as there were no Linux barriers defined to order memory accesses
> to DMA memory (so, for example, ring buffers can be updated in the
> correct order) back in those days, using the uncached/unbuffered mode
> was the sanest and most reliable solution.
> 
> > 
> > > The other really weird things is that in arm32
> > > pgprot_dmacoherent includes the L_PTE_XN bit, which from my understanding
> > > is the no-execute bit, but pgprot_writecombine does not.  This seems
> > > not very intentional.  So minus that the whole DMA_ATTR_WRITE_COMBINE
> > > seems to be about flagging old arm specific drivers as having the proper
> > > barriers in places and otherwise is a no-op.
> > 
> > I think it only matters for Armv7 CPUs, but yes, we should probably be
> > setting L_PTE_XN for both of these memory types.
> 
> Conventionally, pgprot_writecombine() has only been used to change
> the memory type and not the permissions.  Since writecombine memory
> is still capable of being executed, I don't see any reason to set XN
> for it.
> 
> If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there
> really a reason for writecombine to set XN overriding the user?
> 
> That said, pgprot_writecombine() is mostly used for framebuffers, which
> arguably shouldn't be executable anyway - but who'd want to mmap() the
> framebuffer with PROT_EXEC?
> 
> > 
> > > Here is my tentative plan:
> > > 
> > >  - respin this patch with a small fix to handle the
> > >DMA_ATTR_NON_CONSISTENT (as in ignore it unl

Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*

2019-08-06 Thread Russell King - ARM Linux admin
On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote:
> On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote:
> > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote:
> > > 
> > > So this boils down to a terminology mismatch. The Arm architecture 
> > > doesn't have
> > > anything called "write combine", so in Linux we instead provide what the 
> > > Arm
> > > architecture calls "Normal non-cacheable" memory for 
> > > pgprot_writecombine().
> > > Amongst other things, this memory type permits speculation, unaligned 
> > > accesses
> > > and merging of writes. I found something in the architecture spec about
> > > non-cachable memory, but it's written in Armglish[1].
> > > 
> > > pgprot_noncached(), on the other hand, provides what the architecture 
> > > calls
> > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping 
> > > MMIO
> > > (i.e. PCI config space) and therefore forbids speculation, preserves 
> > > access
> > > size, requires strict alignment and also forces write responses to come 
> > > from
> > > the endpoint.
> > > 
> > > I think the naming mismatch is historical, but on arm64 we wanted to use 
> > > the
> > > same names as arm32 so that any drivers using these things directly would 
> > > get
> > > the same behaviour.
> > 
> > That all makes sense, but it totally needs a comment.  I'll try to draft
> > one based on this.  I've also looked at the arm32 code a bit more, and
> > it seems arm always (?) supported Normal non-cacheable attribute, but
> > Linux only optionally uses it for arm v6+ because of fears of drivers
> > missing barriers.
> 
> I think it was also to do with aliasing, but I don't recall all of the
> details.

ARMv6+ is where the architecture significantly changed to introduce
the idea of [Normal, Device, Strongly Ordered] where Normal has the
cache attributes.

Before that, we had just "uncached/unbuffered, uncached/buffered,
cached/unbuffered, cached/buffered" modes.

The write buffer (enabled by buffered modes) has no architected
guarantees about how long writes will sit in it, and there is only
the "drain write buffer" instruction to push writes out.

Up to and including ARMv5, we took the easy approach of just using
the "uncached/unbuffered" mode since that is (a) the safest, and (b)
avoids write buffers that alias when there are multiple different
mappings.

We could have used a different approach, making all IO writes contain
a "drain write buffer" instruction, and map DMA memory as "buffered",
but as there were no Linux barriers defined to order memory accesses
to DMA memory (so, for example, ring buffers can be updated in the
correct order) back in those days, using the uncached/unbuffered mode
was the sanest and most reliable solution.

> 
> > The other really weird things is that in arm32
> > pgprot_dmacoherent includes the L_PTE_XN bit, which from my understanding
> > is the no-execute bit, but pgprot_writecombine does not.  This seems
> > not very intentional.  So minus that the whole DMA_ATTR_WRITE_COMBINE
> > seems to be about flagging old arm specific drivers as having the proper
> > barriers in places and otherwise is a no-op.
> 
> I think it only matters for Armv7 CPUs, but yes, we should probably be
> setting L_PTE_XN for both of these memory types.

Conventionally, pgprot_writecombine() has only been used to change
the memory type and not the permissions.  Since writecombine memory
is still capable of being executed, I don't see any reason to set XN
for it.

If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there
really a reason for writecombine to set XN overriding the user?

That said, pgprot_writecombine() is mostly used for framebuffers, which
arguably shouldn't be executable anyway - but who'd want to mmap() the
framebuffer with PROT_EXEC?

> 
> > Here is my tentative plan:
> > 
> >  - respin this patch with a small fix to handle the
> >DMA_ATTR_NON_CONSISTENT (as in ignore it unless actually supported),
> >but keep the name as-is to avoid churn.  This should allow 5.3
> >inclusion and backports
> >  - remove DMA_ATTR_WRITE_COMBINE support from mips, probably also 5.3
> >material.
> >  - move all architectures but arm over to just define
> >pgprot_dmacoherent, including a comment with the above explanation
> >for arm64.
> 
> That would be great, thanks.
> 
> >  - make DMA_ATTR_WRITE_COMBINE a no-op and schedule it for removal,
> >thus removing the last instances of arch_dma_mmap_pgprot
> 
> All sounds good to me, although I suppose 32-bit Arm platforms without
> CONFIG_ARM_DMA_MEM_BUFFERABLE may run into issues if DMA_ATTR_WRITE_COMBINE
> disappears. Only one way to find out...

Looking at the results of grep, I think only OMAP2+ and Exynos may be
affected.

However, removing writecombine support from the DMA API is going to
have a huge impact for framebuffers on earlier ARMs - that's where we
do expect framebuffers to be mapped "uncached/buffered" for perfo

Re: SATA broken with LPAE

2019-06-27 Thread Russell King - ARM Linux admin
On Thu, Jun 27, 2019 at 11:15:30AM +0200, Christoph Hellwig wrote:
> On Thu, Jun 27, 2019 at 10:07:53AM +0100, Russell King - ARM Linux admin 
> wrote:
> > dmabounce has only ever been used with specific devices that have weird
> > setups.  Otherwise, we've never expected what you describe on ARM.  I
> > also don't recognise your assertion about the way the DMA API should
> > behave as ever having been documented as a requirement for architectures
> > to implement.
> 
> That requirement has basically always been there since at least the
> 2.6.x days.  The history here is that when 64-bit architectures showed
> up they all had iommus, so this wasn't an issue.  Next was x86 with
> highmem, which added special bounce buffering for block I/O and networking
> only.  Then ia64 showed up that didn't always have an iommu and swiotlb
> was added as a "software IOMMU".  At this point we had to bounce buffering
> schemes for block and networking, while everything else potentially
> DMAing to higher memory relied on swiotlb, which was picked up by
> basically every architecture that could have memory not covered by a
> 32-bit mask and didn't have an iommu.

If it wasn't documented...

> Except it seems arm never did
> that and has been lucky as people didn't try anything that is not
> block or networking on their extended physical address space.

Because no one knew that the requirement had been introduced, and we've
been reliant on all the code you've been stripping out.

You're now causing regressions on 32-bit ARM, so you get to fix it.



Re: SATA broken with LPAE

2019-06-27 Thread Russell King - ARM Linux admin
On Wed, Jun 26, 2019 at 02:53:25PM +0200, Christoph Hellwig wrote:
> Hi Roger,
> 
> it seems the arm dma direct mapping code isn't doing the right thing
> here.  On other platforms that have > 4G memory we always use swiotlb
> for bounce buffering in case a device that can't DMA to all the memory.
> 
> Arm is the odd one out and uses its own dmabounce framework instead,
> but it seems like it doesn't get used in this case.  We need to make
> sure dmabounce (or swiotlb for that matter) is set up if > 32-bit
> addressing is supported.  I'm not really an arm platform expert,
> but some of those on the Cc list are and might chime in on how to
> do that.

dmabounce has only ever been used with specific devices that have weird
setups.  Otherwise, we've never expected what you describe on ARM.  I
also don't recognise your assertion about the way the DMA API should
behave as ever having been documented as a requirement for architectures
to implement.

dmabounce comes with it some serious issues that have been known to
cause OOMs with old kernels (which have never been solved), so that
really isn't the answer.

This kind of breakage that is now being reported is exactly the problem
that I thought would happen with your patches.

32-bit ARM is in maintenance only now, there is no longer any appetite
nor funding for development work on core code.



Re: [PATCH 2/2] ARM: dma-mapping: allow larger DMA mask than supported

2019-05-21 Thread Russell King - ARM Linux admin
On Tue, May 21, 2019 at 02:47:29PM +0200, Christoph Hellwig wrote:
> Since Linux 5.1 we allow drivers to just set the largest DMA mask they
> support instead of falling back to smaller ones.

This doesn't make sense.  "they" is confusing - why would a driver set
a DMA mask larger than the driver supports?  Or is "they" not
referring to the drivers (in which case, what is it referring to?)

> When fixing up all the dma ops instances to allow for this behavior
> the arm direct mapping code was missed.  Fix it up by removing the
> sanity check, as all the actual mapping code handles this case just
> fine.
> 
> Fixes: 9eb9e96e97b3 ("Documentation/DMA-API-HOWTO: update dma_mask sections")
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm/mm/dma-mapping.c | 20 +---
>  1 file changed, 1 insertion(+), 19 deletions(-)
> 
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 0a75058c11f3..bdf0d236aaee 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -219,25 +219,7 @@ EXPORT_SYMBOL(arm_coherent_dma_ops);
>  
>  static int __dma_supported(struct device *dev, u64 mask, bool warn)
>  {
> - unsigned long max_dma_pfn;
> -
> - /*
> -  * If the mask allows for more memory than we can address,
> -  * and we actually have that much memory, then we must
> -  * indicate that DMA to this device is not supported.
> -  */
> - if (sizeof(mask) != sizeof(dma_addr_t) &&
> - mask > (dma_addr_t)~0 &&
> - dma_to_pfn(dev, ~0) < max_pfn - 1) {
> - if (warn) {
> - dev_warn(dev, "Coherent DMA mask %#llx is larger than dma_addr_t allows\n",
> -  mask);
> - dev_warn(dev, "Driver did not use or check the return value from dma_set_coherent_mask()?\n");
> - }
> - return 0;
> - }

The point of this check is to trap the case where we have, for example,
8GB of memory, but dma_addr_t is 32-bit.  We can allocate in the high
4GB, but we can't represent the address in a dma_addr_t.

> -
> - max_dma_pfn = min(max_pfn, arm_dma_pfn_limit);
> + unsigned long max_dma_pfn = min(max_pfn, arm_dma_pfn_limit);
>  
>   /*
>* Translate the device's DMA mask to a PFN limit.  This
> -- 
> 2.20.1
> 
> 



Re: [PATCH 1/2] dma-mapping: truncate dma masks to what dma_addr_t can hold

2019-05-21 Thread Russell King - ARM Linux admin
On Tue, May 21, 2019 at 02:47:28PM +0200, Christoph Hellwig wrote:
> The dma masks in struct device are always 64-bits wide.  But for builds
> using a 32-bit dma_addr_t we need to ensure we don't store an
> unsupportable value.  Before Linux 5.0 this was handled at least by
> the ARM dma mapping code by never allowing to set a larger dma_mask,
> but these days we allow the driver to just set the largest supported
> value and never fall back to a smaller one.  Ensure this always works
> by truncating the value.

So how does the driver negotiation for >32bit addresses work if we don't
fail for large masks?

I'm thinking about all those PCI drivers that need DAC cycles for >32bit
addresses, such as e1000, which negotiate via (eg):

/* there is a workaround being applied below that limits
 * 64-bit DMA addresses to 64-bit hardware.  There are some
 * 32-bit adapters that Tx hang when given 64-bit DMA addresses
 */
pci_using_dac = 0;
if ((hw->bus_type == e1000_bus_type_pcix) &&
!dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
pci_using_dac = 1;
} else {
err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
if (err) {
pr_err("No usable DMA config, aborting\n");
goto err_dma;
}
}

and similar.  If we blindly truncate the 64-bit mask to 32-bit, aren't
we going to end up with PCI cards using DAC cycles to a host bridge
that does not support DAC cycles?

> 
> Fixes: 9eb9e96e97b3 ("Documentation/DMA-API-HOWTO: update dma_mask sections")
> Signed-off-by: Christoph Hellwig 
> ---
>  kernel/dma/mapping.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index f7afdadb6770..1f628e7ac709 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -317,6 +317,12 @@ void arch_dma_set_mask(struct device *dev, u64 mask);
>  
>  int dma_set_mask(struct device *dev, u64 mask)
>  {
> + /*
> +  * Truncate the mask to the actually supported dma_addr_t width to
> +  * avoid generating unsupportable addresses.
> +  */
> + mask = (dma_addr_t)mask;
> +
>   if (!dev->dma_mask || !dma_supported(dev, mask))
>   return -EIO;
>  
> @@ -330,6 +336,12 @@ EXPORT_SYMBOL(dma_set_mask);
>  #ifndef CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK
>  int dma_set_coherent_mask(struct device *dev, u64 mask)
>  {
> + /*
> +  * Truncate the mask to the actually supported dma_addr_t width to
> +  * avoid generating unsupportable addresses.
> +  */
> + mask = (dma_addr_t)mask;
> +
>   if (!dma_supported(dev, mask))
>   return -EIO;
>  
> -- 
> 2.20.1
> 
> 



Re: remove NULL struct device support in the DMA API

2019-04-03 Thread Russell King - ARM Linux admin
On Thu, Mar 21, 2019 at 03:52:28PM -0700, Christoph Hellwig wrote:
> We still have a few drivers which pass a NULL struct device pointer
> to DMA API functions, which generally is a bad idea as the API
> implementations rely on the device not only for ops selection, but
> also the dma mask and various other attributes, and many implementations
> have been broken for NULL device support for a while.

I think I must be missing something, but...

My understanding is that ISA DMA is normally limited to 24 bits of
address - indeed, the x86 version only programs 24 bits of DMA address.
Looking through this series, it appears that the conversions mean that
the DMA mask for ISA becomes the full all-ones DMA mask, which would
of course lead to memory corruption if only 24 bits of the address end
up being programmed into the hardware.

Maybe you could say why you think this series is safe in regard to ISA
DMA?

> 
> This series removes the few remaning users that weren't picked up in
> the last merge window and then removes core support for this "feature".
> 
> A git tree is also available at:
> 
> git://git.infradead.org/users/hch/misc.git dma-remove-NULL-dev-support
> 
> Gitweb:
> 
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-remove-NULL-dev-support
> 



Re: [PATCH v5 0/9] Use vm_insert_range

2018-12-24 Thread Russell King - ARM Linux
Having discussed with Matthew offlist, I think we've come to the
following conclusion - there's a number of drivers that buggily
ignore vm_pgoff.

So, what I proposed is:

static int __vm_insert_range(struct vm_area_struct *vma, struct page **pages,
 size_t num, unsigned long offset)
{
unsigned long count = vma_pages(vma);
unsigned long uaddr = vma->vm_start;
unsigned long i;
int ret;

/* Fail if the user requested offset is beyond the end of the object */
if (offset > num)
return -ENXIO;

/* Fail if the user requested size exceeds available object size */
if (count > num - offset)
return -ENXIO;

/* Never exceed the number of pages that the user requested */
for (i = 0; i < count; i++) {
ret = vm_insert_page(vma, uaddr, pages[offset + i]);
if (ret < 0)
return ret;
uaddr += PAGE_SIZE;
}

return 0;
}

/*
 * Maps an object consisting of `num' `pages', catering for the user's
 * requested vm_pgoff
 */
int vm_insert_range(struct vm_area_struct *vma, struct page **pages, size_t num)
{
return __vm_insert_range(vma, pages, num, vma->vm_pgoff);
}

/*
 * Maps a set of pages, always starting at page[0]
 */
int vm_insert_range_buggy(struct vm_area_struct *vma, struct page **pages, size_t num)
{
return __vm_insert_range(vma, pages, num, 0);
}

With this, drivers such as iommu/dma-iommu.c can be converted thusly:

 int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
 {
-   unsigned long uaddr = vma->vm_start;
-   unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   int ret = -ENXIO;
-
-   for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
-   ret = vm_insert_page(vma, uaddr, pages[i]);
-   if (ret)
-   break;
-   uaddr += PAGE_SIZE;
-   }
-   return ret;
+   return vm_insert_range(vma, pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
}

and drivers such as firewire/core-iso.c:

 int fw_iso_buffer_map_vma(struct fw_iso_buffer *buffer,
  struct vm_area_struct *vma)
 {
-   unsigned long uaddr;
-   int i, err;
-
-   uaddr = vma->vm_start;
-   for (i = 0; i < buffer->page_count; i++) {
-   err = vm_insert_page(vma, uaddr, buffer->pages[i]);
-   if (err)
-   return err;
-
-   uaddr += PAGE_SIZE;
-   }
-
-   return 0;
+   return vm_insert_range_buggy(vma, buffer->pages, buffer->page_count);
}

and this gives us something to grep for to find these buggy drivers.

Now, this may not look exactly equivalent, but if you look at
fw_device_op_mmap(), buffer->page_count is basically vma_pages(vma)
at this point, which means this should be equivalent.

We _could_ then at a later date "fix" these drivers to behave according
to the normal vm_pgoff offsetting simply by removing the _buggy suffix
on the function name... and if that causes regressions, it gives us an
easy way to revert (as long as vm_insert_range_buggy() remains
available.)

In the case of firewire/core-iso.c, it currently ignores the mmap offset
entirely, so making the above suggested change would be tantamount to
causing it to return -ENXIO for any non-zero mmap offset.

IMHO, this approach is way simpler, and easier to get it correct at
each call site, rather than the current approach which seems to be
error-prone.



Re: [PATCH v5 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range

2018-12-24 Thread Russell King - ARM Linux
On Mon, Dec 24, 2018 at 06:55:31PM +0530, Souptick Joarder wrote:
> Convert to use vm_insert_range() to map range of kernel
> memory to user vma.
> 
> Signed-off-by: Souptick Joarder 
> Reviewed-by: Matthew Wilcox 
> ---
>  drivers/iommu/dma-iommu.c | 13 +++--
>  1 file changed, 3 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index d1b0475..de7ffd8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -622,17 +622,10 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>  
>  int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
>  {
> - unsigned long uaddr = vma->vm_start;
> - unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> - int ret = -ENXIO;
> + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
>  
> - for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
> - ret = vm_insert_page(vma, uaddr, pages[i]);
> - if (ret)
> - break;
> - uaddr += PAGE_SIZE;
> - }
> - return ret;
> + return vm_insert_range(vma, vma->vm_start, pages + vma->vm_pgoff,
> + count - vma->vm_pgoff);

This introduces a new bug.

I'm not going to continue to point out in minute detail the mistakes
you are introducing, as I don't think that is helping you to learn.

Look at this closely, and see whether you can spot the mistake.
Specifically, compare the boundary conditions for the final page
that is to be inserted and the value returned by the original version
and by your version for different scenarios.



Re: [PATCH 4/9] dma-mapping: move the arm64 noncoherent alloc/free support to common code

2018-12-04 Thread Russell King - ARM Linux
On Mon, Nov 05, 2018 at 01:19:26PM +0100, Christoph Hellwig wrote:
> The arm64 codebase to implement coherent dma allocation for architectures
> with non-coherent DMA is a good start for a generic implementation, given
> that it uses the generic remap helpers, provides the atomic pool for
> allocations that can't sleep and still is relatively simple and well
> tested.  Move it to kernel/dma and allow architectures to opt into it
> using a config symbol.  Architectures just need to provide a new
> arch_dma_prep_coherent helper to write back and invalidate the caches
> for any memory that gets remapped for uncached access.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm64/Kconfig  |   2 +-
>  arch/arm64/mm/dma-mapping.c | 184 ++--
>  include/linux/dma-mapping.h |   5 +
>  include/linux/dma-noncoherent.h |   2 +
>  kernel/dma/Kconfig  |   6 ++
>  kernel/dma/remap.c  | 158 ++-
>  6 files changed, 181 insertions(+), 176 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 5d065acb6d10..2e645ea693ea 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -82,7 +82,7 @@ config ARM64
>   select CRC32
>   select DCACHE_WORD_ACCESS
>   select DMA_DIRECT_OPS
> - select DMA_REMAP
> + select DMA_DIRECT_REMAP
>   select EDAC_SUPPORT
>   select FRAME_POINTER
>   select GENERIC_ALLOCATOR
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index a3ac26284845..e2e7e5d0f94e 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -33,113 +33,6 @@
>  
>  #include 
>  
> -static struct gen_pool *atomic_pool __ro_after_init;
> -
> -#define DEFAULT_DMA_COHERENT_POOL_SIZE  SZ_256K
> -static size_t atomic_pool_size __initdata = DEFAULT_DMA_COHERENT_POOL_SIZE;
> -
> -static int __init early_coherent_pool(char *p)
> -{
> - atomic_pool_size = memparse(p, &p);
> - return 0;
> -}
> -early_param("coherent_pool", early_coherent_pool);
> -
> -static void *__alloc_from_pool(size_t size, struct page **ret_page, gfp_t flags)
> -{
> - unsigned long val;
> - void *ptr = NULL;
> -
> - if (!atomic_pool) {
> - WARN(1, "coherent pool not initialised!\n");
> - return NULL;
> - }
> -
> - val = gen_pool_alloc(atomic_pool, size);
> - if (val) {
> - phys_addr_t phys = gen_pool_virt_to_phys(atomic_pool, val);
> -
> - *ret_page = phys_to_page(phys);
> - ptr = (void *)val;
> - memset(ptr, 0, size);
> - }
> -
> - return ptr;
> -}
> -
> -static bool __in_atomic_pool(void *start, size_t size)
> -{
> - return addr_in_gen_pool(atomic_pool, (unsigned long)start, size);
> -}
> -
> -static int __free_from_pool(void *start, size_t size)
> -{
> - if (!__in_atomic_pool(start, size))
> - return 0;
> -
> - gen_pool_free(atomic_pool, (unsigned long)start, size);
> -
> - return 1;
> -}
> -
> -void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> - gfp_t flags, unsigned long attrs)
> -{
> - struct page *page;
> - void *ptr, *coherent_ptr;
> - pgprot_t prot = pgprot_writecombine(PAGE_KERNEL);
> -
> - size = PAGE_ALIGN(size);
> -
> - if (!gfpflags_allow_blocking(flags)) {
> - struct page *page = NULL;
> - void *addr = __alloc_from_pool(size, &page, flags);
> -
> - if (addr)
> - *dma_handle = phys_to_dma(dev, page_to_phys(page));
> -
> - return addr;
> - }
> -
> - ptr = dma_direct_alloc_pages(dev, size, dma_handle, flags, attrs);
> - if (!ptr)
> - goto no_mem;
> -
> - /* remove any dirty cache lines on the kernel alias */
> - __dma_flush_area(ptr, size);
> -
> - /* create a coherent mapping */
> - page = virt_to_page(ptr);
> - coherent_ptr = dma_common_contiguous_remap(page, size, VM_USERMAP,
> -prot, __builtin_return_address(0));
> - if (!coherent_ptr)
> - goto no_map;
> -
> - return coherent_ptr;
> -
> -no_map:
> - dma_direct_free_pages(dev, size, ptr, *dma_handle, attrs);
> -no_mem:
> - return NULL;
> -}
> -
> -void arch_dma_free(struct device *dev, size_t size, void *vaddr,
> - dma_addr_t dma_handle, unsigned long attrs)
> -{
> - if (!__free_from_pool(vaddr, PAGE_ALIGN(size))) {
> - void *kaddr = phys_to_virt(dma_to_phys(dev, dma_handle));
> -
> - vunmap(vaddr);
> - dma_direct_free_pages(dev, size, kaddr, dma_handle, attrs);
> - }
> -}
> -
> -long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
> - dma_addr_t dma_addr)
> -{
> - return __phys_to_pfn(dma_to_phys(dev, dma_addr));
> -}
> -
>  pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
>   unsigned long att

Re: remove the ->mapping_error method from dma_map_ops V3

2018-11-30 Thread Russell King - ARM Linux
On Fri, Nov 30, 2018 at 02:22:08PM +0100, Christoph Hellwig wrote:
> Error reporting for the dma_map_single and dma_map_page operations is
> currently a mess.  Both APIs directly return the dma_addr_t to be used for
> the DMA, with a magic error escape that is specific to the instance and
> checked by another method provided.  This has a few downsides:
> 
>  - the error check is easily forgotten and a __must_check marker doesn't
>help as the value always is consumed anyway
>  - the error checking requires another indirect call, which have gotten
>incredibly expensive
>  - a lot of code is wasted on implementing these methods
> 
> The historical reason for this is that people thought DMA mappings would
> not fail when the API was created, which sounds like a really bad
> assumption in retrospect, and then we tried to cram error handling
> onto it later on.
> 
> There basically are two variants:  the error code is 0 because the
> implementation will never return 0 as a valid DMA address, or the error
> code is all-F as the implementation won't ever return an address that
> high.  The old AMD GART is the only one not falling into these two camps
> as it picks a sort of relative zero based on where it is mapped.
> 
> The 0 return doesn't work for direct mappings that have ram at address
> zero and a lot of IOMMUs that start allocating bus space from address
> zero, so we can't consolidate on that, but I think we can move everyone
> to all-Fs, which the patch here does.  The reason for that is that
> there is only one way to ever get this address: by doing a 1-byte long,
> 1-byte aligned mapping, but all our mappings are not only longer but
> generally aligned, and the mappings have to keep at least the basic
> alignment.
> 
> A git tree is also available here:
> 
> git://git.infradead.org/users/hch/misc.git dma-mapping-error.3

Hi Chris,

For patches 1, 3 and 23,

Acked-by: Russell King 

Thanks.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-28 Thread Russell King - ARM Linux
On Wed, Nov 28, 2018 at 11:27:17AM -0800, David Miller wrote:
> From: Linus Torvalds 
> Date: Wed, 28 Nov 2018 10:00:06 -0800
> 
> > Not all memory is accessible even to the kernel. If you have memory
> > that shows up in the last page of phys_addr_t, you just mark it
> > reserved at boot-time.
> 
> It's not the physical memory at the end that needs to be reserved.
> 
> It's the IOMMU mapping arena.

True, if and only if you have an IOMMU.

Where there isn't an IOMMU, then we'd have to reserve every page
that translates to a bus address in the top 4K of dma_addr_t on any
bus in the system - that means knowing early in the kernel
initialisation about all buses in the system so we can detect and
reserve these pages.

I don't think that's trivial to do.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-28 Thread Russell King - ARM Linux
On Wed, Nov 28, 2018 at 11:19:15AM -0800, Linus Torvalds wrote:
> On Wed, Nov 28, 2018, 10:08 Russell King - ARM Linux  wrote:
> 
> > >
> > > You already cannot do that kmalloc(), exactly because of ERR_PTR().
> >
> > I'm very sorry, but I think you are confused.
> >
> > kmalloc() returns a _virtual_ address, which quite rightly must not be
> > in the top 4K of 4GB, exactly due to ERR_PTR().  That is fine.
> >
> > However, that is a completely different kettle of fish from a physical
> > or DMA address - neither of which are virtual addresses.
> >
> 
> You cut away the part that showed that I'm not in the least confused.
> 
> Let me just paste it back in here:
> 
>   "Which is what we ALREADY do for these exact reasons. If the DMA
> mappings means that you'd need to add one more page to that list of
> reserved pages, then so be it."

We're not talking about just one more page.  We're talking about
_every_ page that _may_ map to a bus address in the range of 4GB - 4K
(a point I've already made earlier in this discussion.)

It's not just about physical addresses, it's about bus addresses,
and there are buses out there that have a translation between physical
address and bus address without IOMMUs and the like.

Can we know ahead of time about all these translations?  It means
moving that information out of various bus drivers into core
architecture code (yuck, when you have multiple different platforms)
or into DT (which means effectively breaking every damn platform
that's using older DT files) - something that we don't have to do
today.

I remain 100% opposed to your idea.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-28 Thread Russell King - ARM Linux
On Wed, Nov 28, 2018 at 06:08:41PM +0000, Russell King - ARM Linux wrote:
> On Wed, Nov 28, 2018 at 10:00:06AM -0800, Linus Torvalds wrote:
> > On Wed, Nov 28, 2018 at 9:45 AM Russell King - ARM Linux
> >  wrote:
> > >
> > > > I don't think this is a huge deal, but ERR_PTR() has been hugely
> > > > successful elsewhere. And I'm not hugely convinced about all these
> > > > "any address can be valid" arguments. How the hell do you generate a
> > > > random dma address in the last page that isn't even page-aligned?
> > >
> > > kmalloc() a 64-byte buffer, dma_map_single() that buffer.
> > 
> > No.
> > 
> > You already cannot do that kmalloc(), exactly because of ERR_PTR().
> 
> I'm very sorry, but I think you are confused.
> 
> kmalloc() returns a _virtual_ address, which quite rightly must not be
> in the top 4K of 4GB, exactly due to ERR_PTR().  That is fine.
> 
> However, that is a completely different kettle of fish from a physical
> or DMA address - neither of which are virtual addresses.
> 
> Now, say we have 1GB of RAM which starts at 0xc0000000 _physical_.
> The kernel is configured with a 2GB/2GB user/kernel split, which means
> all 1GB of RAM is mapped as lowmem from 0x80000000 to 0xbfffffff
> inclusive.  This means kmalloc() can return any address in that range.
> 
> ERR_PTR() will work correctly on any of those pointers, meaning that
> none of them will be seen as an error.
> 
> However, map any virtual address in the range of 0xbffff000 to
> 0xbfffffff into DMA space, and the resulting DMA address could well
> be in the range of 0xfffff000 to 0xffffffff - precisely the range
> of addresses that you are advocating to be used for error codes.
> 
> > The whole argument of "every possible piece of memory is DMA'able" is
> > just wrong.
> 
> I'm very sorry, but I do not buy your argument - you are conflating
> virtual addresses which ERR_PTR() deals in with physical and bus
> addresses - and if you persist down this route, you will cause
> regressions.

Here's another case:

i.MX6 with 4GB of RAM.  Devices are mapped to 0-0x0fffffff physical,
RAM is mapped to 0x10000000-0xffffffff physical.  The last 256MB of
RAM is not accessible as this is a 32-bit device.  DMA addresses are
the same as physical addresses.

While the final physical page will be highmem in a normal kernel,
and thus will not be available for kmalloc(), that doesn't mean it
can't happen.  A crashdump kernel loaded high in physical memory
(eg, last 512MB and given the last 512MB to play around in) would
have the top 512MB as lowmem, and therefore available for kmalloc().

If a page is available in lowmem, it's available for kmalloc(), and
we can't say that we will never allocate memory from such a page for
DMA - if we do and we're using an IS_ERR_VALUE() scheme, it _will_
break if that happens as memory will end up being mapped by the DMA
API but dma_mapping_error() will see it as a failure.

It won't be an obvious breakage, because it depends on the right
conditions happening - a kmalloc() from the top page of physical
RAM and that being passed to dma_map_single().  IOW, it's not something
that a quick boot test would find, it's something that is likely to
cause failures after a system has been running for a period of time.

There are other situations where there are possibilities - such as:

dma_map_page(dev, page, offset, size, direction)

If 'page' is a highmem page which happens to be the top page in the
4GB space, and offset is non-zero, and there's a 1:1 mapping between
physical address and DMA address, the returned value will be
0xfffff000 + offset - within the "last 4095 values are errors"
range.

Networking uses this for fragments - the packet fragment list is
a list of pages, offsets and sizes - we have sendpage() that may
end up finding that last page, and TCP-sized packets may be
generated from it which would certainly result in non-zero offsets
being passed to dma_map_page().

So, whatever way _I_ look at it, I find your proposal to be unsafe
and potentially regression causing, and I *completely* and strongly
oppose it in its current form.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-28 Thread Russell King - ARM Linux
On Wed, Nov 28, 2018 at 10:00:06AM -0800, Linus Torvalds wrote:
> On Wed, Nov 28, 2018 at 9:45 AM Russell King - ARM Linux
>  wrote:
> >
> > > I don't think this is a huge deal, but ERR_PTR() has been hugely
> > > successful elsewhere. And I'm not hugely convinced about all these
> > > "any address can be valid" arguments. How the hell do you generate a
> > > random dma address in the last page that isn't even page-aligned?
> >
> > kmalloc() a 64-byte buffer, dma_map_single() that buffer.
> 
> No.
> 
> You already cannot do that kmalloc(), exactly because of ERR_PTR().

I'm very sorry, but I think you are confused.

kmalloc() returns a _virtual_ address, which quite rightly must not be
in the top 4K of 4GB, exactly due to ERR_PTR().  That is fine.

However, that is a completely different kettle of fish from a physical
or DMA address - neither of which are virtual addresses.

Now, say we have 1GB of RAM which starts at 0xc0000000 _physical_.
The kernel is configured with a 2GB/2GB user/kernel split, which means
all 1GB of RAM is mapped as lowmem from 0x80000000 to 0xbfffffff
inclusive.  This means kmalloc() can return any address in that range.

ERR_PTR() will work correctly on any of those pointers, meaning that
none of them will be seen as an error.

However, map any virtual address in the range of 0xbffff000 to
0xbfffffff into DMA space, and the resulting DMA address could well
be in the range of 0xfffff000 to 0xffffffff - precisely the range
of addresses that you are advocating to be used for error codes.

> The whole argument of "every possible piece of memory is DMA'able" is
> just wrong.

I'm very sorry, but I do not buy your argument - you are conflating
virtual addresses which ERR_PTR() deals in with physical and bus
addresses - and if you persist down this route, you will cause
regressions.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-28 Thread Russell King - ARM Linux
On Wed, Nov 28, 2018 at 08:47:05AM -0800, Linus Torvalds wrote:
> On Tue, Nov 27, 2018 at 11:41 PM Christoph Hellwig  wrote:
> >
> > On Fri, Nov 23, 2018 at 07:55:11AM +0100, Christoph Hellwig wrote:
> > > Well, I can tweak the last patch to return -EINVAL from dma_mapping_error
> > > instead of the old 1 is as bool true.  The callers should all be fine,
> > > although I'd have to audit them.  Still wouldn't help with being able to
> > > return different errors.
> >
> > Any opinions?  I'd really like to make some forward progress on this
> > series.
> 
> So I do think that yes, dma_mapping_error() should return an error
> code, not 0/1.
> 
> But I was really hoping that the individual drivers themselves could
> return error codes. Right now the patch-series has code like this:
> 
>   ret = needs_bounce(dev, dma_addr, size);
>   if (ret < 0)
> - return ARM_MAPPING_ERROR;
> + return DMA_MAPPING_ERROR;
> 
> which while it all makes sense in the context of this patch-series, I
> *really* think it would have been so much nicer to return the error
> code 'ret' instead (which in this case is -E2BIG).
> 
> I don't think this is a huge deal, but ERR_PTR() has been hugely
> successful elsewhere. And I'm not hugely convinced about all these
> "any address can be valid" arguments. How the hell do you generate a
> random dma address in the last page that isn't even page-aligned?

kmalloc() a 64-byte buffer, dma_map_single() that buffer.  If you
have RAM that maps to a _bus_ address in the top page of 4GB of a
32-bit bus address, then you lose.  Simples.

Subsystems like I2C, SPI, USB etc all deal with small kmalloc'd
buffers and their drivers make use of DMA.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-23 Thread Russell King - ARM Linux
On Fri, Nov 23, 2018 at 02:03:13PM +0100, Joerg Roedel wrote:
> On Fri, Nov 23, 2018 at 11:01:55AM +0000, Russell King - ARM Linux wrote:
> > Yuck.  So, if we have a 4GB non-PAE 32-bit system, or a PAE system
> > where we have valid memory across the 4GB boundary and no IOMMU,
> > we have to reserve the top 4K page in the first 4GB of RAM?
> 
> But that is only needed when dma_addr_t is 32bit anyway, no?

You said:

  But we can easily work around that by reserving the top 4k of the first
  4GB of IOVA address space in the allocator, no? Then these values are
  never returned as valid DMA handles.

To me, your proposal equates to this in code:

int dma_error(dma_addr_t addr)
{
return (addr & ~(dma_addr_t)0xfff) == 0xfffff000 ? (s32)addr : 0;
}

Hence, the size of dma_addr_t would be entirely irrelevant.  This
makes dma_addr_t values in the range of 0xfffff000 to 0xffffffff special,
irrespective of whether dma_addr_t is 32-bit or 64-bit.

If that's not what you meant, then I'm afraid your statement wasn't
worded very well - so please can you re-word to state more precisely
what your proposal is?

> > Rather than inventing magic cookies like this, I'd much rather we
> > sanitised the API so that we have functions that return success or
> > an error code, rather than trying to shoe-horn some kind of magic
> > error codes into dma_addr_t and subtly break systems in the process.
> 
> Sure, but is has the obvious downside that we need to touch every driver
> that uses these functions, and that are probably a lot of drivers.

So we have two options:
- change the interface
- subtly break drivers

In any case, I disagree that we need to touch every driver.  Today,
drivers just do:

if (dma_mapping_error(dev, addr))

and, because dma_mapping_error() returns a boolean, they only care about
the true/falseness.  If we're going to start returning error codes, then
there's no point just returning error codes unless we have some way for
drivers to use them.  Yes, the simple way would be to make
dma_mapping_error() translate addr to an error code, and that maintains
compatibility with existing drivers.

If, instead, we revamped the DMA API, and had the *new* mapping functions
which returned an error code, then the existing mapping functions can be
implemented as part of compatibility rather trivially:

dma_addr_t dma_map_single(...)
{
dma_addr_t addr;
int error;

error = dma_map_single_error(..., &addr);
if (error)
addr = DMA_MAPPING_ERROR;
return addr;
}

which means that if drivers want access to the error code, they use
dma_map_single_error(), meanwhile existing drivers just get on with life
as it _currently_ is, with the cookie-based all-ones error code - without
introducing any of this potential breakage.

Remember, existing drivers would need modification in _any_ case to make
use of a returned error code, so the argument that we'd need to touch
every driver doesn't really stand up.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-23 Thread Russell King - ARM Linux
On Fri, Nov 23, 2018 at 11:49:18AM +0100, Joerg Roedel wrote:
> On Thu, Nov 22, 2018 at 05:52:15PM +0000, Robin Murphy wrote:
> > Unfortunately, with things like the top-down IOVA allocator, and 32-bit
> > systems in general, "the top 4095" values may well still be valid addresses
> > - we're relying on a 1-byte mapping of the very top byte of memory/IOVA
> > space being sufficiently ridiculous that no real code would ever do that,
> > but even a 4-byte mapping of the top 4 bytes is within the realms of the
> > plausible (I've definitely seen the USB layer make 8-byte mappings from any
> > old offset within a page, for example).
> 
> But we can easily work around that by reserving the top 4k of the first
> 4GB of IOVA address space in the allocator, no? Then these values are
> never returned as valid DMA handles.

Yuck.  So, if we have a 4GB non-PAE 32-bit system, or a PAE system
where we have valid memory across the 4GB boundary and no IOMMU,
we have to reserve the top 4K page in the first 4GB of RAM?

Linus' IS_DMA_ERR() solution is way more preferable, but unfortunately
it still falls short, because it knocks out the top 4K of every DMA
capable bus.

A DMA capable bus may involve some translation of the address (eg, by
simple offsetting) which means that we'd need to potentially knock out
multiple pages from the page allocator for each of those pages that
correspond to the top 4K of any DMA capable bus.  Knowing that at the
right time at boot will be fun!  It also sounds wasteful to me.

Rather than inventing magic cookies like this, I'd much rather we
sanitised the API so that we have functions that return success or
an error code, rather than trying to shoe-horn some kind of magic
error codes into dma_addr_t and subtly break systems in the process.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-22 Thread Russell King - ARM Linux
On Thu, Nov 22, 2018 at 09:55:25AM -0800, Linus Torvalds wrote:
> On Thu, Nov 22, 2018 at 9:52 AM Robin Murphy  wrote:
> >
> > Unfortunately, with things like the top-down IOVA allocator, and 32-bit
> > systems in general, "the top 4095" values may well still be valid
> > addresses -
> 
> Ugh.
> 
> > The only immediate benefit I can see is that we could distinguish cases
> > like the first which can never possibly succeed, and thus callers could
> > actually give up instead of doing what various subsystems currently do
> > and retry the exact same mapping indefinitely on the apparent assumption
> > that errors must be transient.
> 
> No, the big immediate benefit of allowing "return -EINVAL" etc is
> simply legibility and error avoidance.
> 
> It's basically how pretty much all the rest of the kernel returns
> errors, so not only is it very obvious, it's also what people do
> without even thinking.. So it would be good to work.

An alternative idea would be to migrate away from the
dma_map_single() and dma_map_page() interfaces that return a
dma_addr_t, and instead have them return an error code or zero
on success.

I've thought for some time that our DMA interfaces aren't particularly
friendly, especially after we had the stuff with PCI DMA which migrated
its way into the DMA API:

DEFINE_DMA_UNMAP_ADDR
DEFINE_DMA_UNMAP_LEN
dma_unmap_*

When coupled that with the requirement that dma_unmap_*() should be
called with the same parameters as dma_map_*(), I wonder why we never
did:

struct dma_map_state {
dma_addr_t dma_addr;
whatever's needed;
};

int dma_map_single(struct device *dev, struct dma_map_state *state,
   void *cpu, size_t len, enum dma_data_direction dir);
void dma_unmap_single(struct device *dev, struct dma_map_state *state);

note the simpler unmap API, which inherently guarantees that the
parameters to the map could be carried over to the unmap - without
our many driver authors having to think about it.

That also paves the way for dma_map_single() to return an error code
or zero.

However, I fear that boat sailed long ago - but maybe it's worth
thinking along these lines if we want to sanitise the API now?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-22 Thread Russell King - ARM Linux
On Thu, Nov 22, 2018 at 08:50:47AM -0800, Linus Torvalds wrote:
> On Thu, Nov 22, 2018 at 6:03 AM Christoph Hellwig  wrote:
> >
> > The 0 return doesn't work for direct mappings that have ram at address
> > zero and a lot of IOMMUs that start allocating bus space from address
> > zero, so we can't consolidate on that, but I think we can move everyone
> > to all-Fs, which the patch here does.
> 
> Hmm. Maybe not limit it to just all ones, but actually use the
> (standard, for the kernel) IS_ERR_VALUE()?
> 
> That basically reserves the last 4095 values in an unsigned long for
> error values.
> 
> Then those functions could actually return *what* error they
> encountered, using just plain
> 
> return -ENOMEM;
> 
> or whatever?

Linus,

I'm afraid that won't work very well - 32 bit platforms with 64-bit
addresses (LPAE) would have dma_addr_t as a 64-bit value, which
wouldn't fit into an unsigned long.

IS_ERR_VALUE() would cast a 64-bit DMA address down to a 32-bit
pointer (effectively masking with 0xffffffff).  It would have the
effect of making (eg) 0xXXXXXXXX_fffffVVV an error, where XXXXXXXX
is any value in the top 32 bits of a 64-bit bus address, and VVV is
the error code value.

That could be a problem if you hit it in several places throughout
your available RAM... we'd have to mark every top page of RAM in
a naturally aligned 4GB as unusable, as well as block the top page
in naturally aligned 4GB blocks from IOMMUs... and then what about
buses that have weird offsets...

So, I don't think the IS_ERR_VALUE() would work very well.



Re: remove the ->mapping_error method from dma_map_ops V2

2018-11-22 Thread Russell King - ARM Linux
On Thu, Nov 22, 2018 at 09:09:47AM -0800, Linus Torvalds wrote:
> On Thu, Nov 22, 2018 at 9:07 AM Russell King - ARM Linux
>  wrote:
> >
> > I'm afraid that won't work very well - 32 bit platforms with 64-bit
> > addresses (LPAE) would have dma_addr_t as a 64-bit value, which
> > wouldn't fit into an unsigned long.
> 
> Good point. So we'd have to have a special IS_DMA_ERR() function that
> takes a dma_addr_t and checks the same "is it the top 4095 values".

No problem with that.



Re: [PATCH v2] of/platform: initialise AMBA default DMA masks

2018-08-31 Thread Russell King - ARM Linux
On Fri, Aug 31, 2018 at 10:26:14AM +0200, Linus Walleij wrote:
> This addresses a v4.19-rc1 regression in the PL111 DRM driver
> in drivers/gpu/pl111/*
> 
> The driver uses the CMA KMS helpers and will thus at some
> point call down to dma_alloc_attrs() to allocate a chunk
> of contiguous DMA memory for the framebuffer.
> 
> It appears that in v4.18, it was OK that this (and other
> DMA mastering AMBA devices) left dev->coherent_dma_mask
> blank (zero).
> 
> In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask)
> in dma_alloc_attrs() in include/linux/dma-mapping.h is
> triggered. The allocation later fails when get_coherent_dma_mask()
> is called from __dma_alloc() and __dma_alloc() returns
> NULL:
> 
> drm-clcd-pl111 dev:20: coherent DMA mask is unset
> drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] *ERROR*
>   Failed to set fbdev configuration
> 
> I have not been able to properly bisect down to the actual
> commit causing it because the kernel simply fails to build
> at too many bisection points, pushing the bisect back to places
> like the merge of entire subsystem trees, where things have
> likely been resolved while merging them.
> 
> I found that Robin Murphy solved a similar problem in
> a5516219b102 ("of/platform: Initialise default DMA masks")
> by simply assigning dev.coherent_dma_mask and the
> dev.dma_mask to point to the same when creating devices
> from the device tree, and introducing the same code into
> the code path creating AMBA/PrimeCell devices solved my
> problem, graphics now come up.
> 
> The code simply assumes that the device can access all
> of the system memory by setting the coherent DMA mask
> to 0xffffffff when creating a device from the device
> tree, which is crude, but seems to be what kernel v4.18
> assumed.
> 
> Cc: Russell King 
> Cc: Christoph Hellwig 
> Cc: Robin Murphy 
> Signed-off-by: Linus Walleij 
> ---
> Russell, Christoph: this is probably not the last patch.
> But it has a more accurate description of the problem.
> I still do not know what change to the kernel made
> it start triggering this, whether in the DMA API or
> in DRM :(

Doing some research, it seems the warning was added in v4.16-rc1, and
it is only a warning - it doesn't otherwise change the behaviour, so
it's not the actual warning that's the problem.

I can't see anything in the DMA API nor DRM which would cause this
either, but there are changes in the DT code.

However, there are changes to of_dma_configure() which will be called
just before the driver's probe function is called - and I guess the
culprit is 4d8bde883bfb ("OF: Don't set default coherent DMA mask").
So I guess that's the root cause of the problem, and indeed this
change was made in 4.19-rc1.



Re: [PATCH] of/platform: Initialise AMBA default DMA masks

2018-08-30 Thread Russell King - ARM Linux
On Thu, Aug 30, 2018 at 10:45:11AM +0200, Linus Walleij wrote:
> On Thu, Aug 30, 2018 at 10:40 AM Russell King - ARM Linux
>  wrote:
> 
> > Well, as I've no idea what the issue is here, I can't do anything or
> > make any suggestions.  I wasn't copied on the initial part of the
> > thread.
> 
> Sorry about that, it was because the original patch only hit in
> drivers/of/*.
> 
> I will send you a copy of the thread.

Thanks.

From the original patch description:
> commit a5516219b102 ("of/platform: Initialise default DMA masks")
> sets up the coherent_dma_mask of platform devices created
> from the device tree, but fails to do the same for AMBA
> (PrimeCell) devices.
> 
> This leads to a regression in kernel v4.19-rc1 triggering the
> WARN_ONCE() in kernel/dma/coherent.c, dma_alloc_attrs()
> WARN_ON_ONCE(dev && !dev->coherent_dma_mask):

This description is very misleading.  It makes it sound like commit
a5516219b102 caused the problem.  Maybe someone would like to explain
how that can be the case - this commit just touches the DT platform
device code, and thus can not itself cause this regression.

Second reason it is misleading is that it claims that there is a
regression in 4.19-rc1 with a WARN in kernel/dma/coherent.c.
Looking at Linus' tip, kernel/dma/coherent.c does not contain any
WARNings except for one to do with dma_reserved_default_memory.
Looking at the history of that file in Linus' tree, there is only
one commit which touches this file, and that is Christoph's commit
which creates it.

So, this patch description needs much improvement, and AFAICS there
is no regression in Linus' kernel.  There may be a regression in
-next, but that is not as urgent as if it were in Linus' tree - iow,
we have time to fix this properly.


Now, from the AMBA perspective, we do not want every AMBA device to
appear to be DMA capable - we only want the ones which are to be so.

From the backtrace, it seems to be a PL111 causing the issue - this
device has an AXI bus interface as well as an APB bus interface, so
is one of the few that are capable of DMA.  Its only restriction
is that it is limited to 32 bits of DMA address, and it doesn't care
about Linux's distinction between coherent and streaming DMA.

So, I'd suggest that we arrange for the DT code to setup the DMA mask
for the device only if it has an appropriate property (dma-ranges?),
and we don't need to re-introduce the separate mask for coherent DMA.



Re: [PATCH] of/platform: Initialise AMBA default DMA masks

2018-08-30 Thread Russell King - ARM Linux
On Wed, Aug 29, 2018 at 07:55:21AM +0200, Christoph Hellwig wrote:
> On Tue, Aug 28, 2018 at 03:14:14PM +0100, Russell King - ARM Linux wrote:
> > But yes, the fundamental fact is that AMBA devices don't have any
> > care about the differences between coherent and streaming DMA.  The
> > distinction that we make in the kernel is purely a software one when
> > it comes to these devices.
> > 
> > Most AMBA devices themselves are not DMA capable, as they are only
> > connected to the APB (Amba peripheral bus) and they rely on a
> > separate DMA engine for their DMA.  APB devices should not have DMA
> > masks - their DMA capabilities are entirely down to the DMA controller.
> > So, the majority of AMBA devices should not have any DMA masks.
> > 
> > Only those connected to a bus that they can master on (eg AXI) should
> > have DMA masks - things like the PL08x DMA controllers, PL11x LCD
> > controllers, etc.  As I've said above, there is no difference between
> > streaming and coherent DMA for these devices.
> 
> So for now I plan to apply the patch from Linus to just set a dma
> mask, as that gets back the previous behavior where dma did just
> work (as it did without a mask).

NAK on that at the moment.

> But if Linus, you or someone else familiar with amba would like to
> add an explicit opt-in into dma support eventually that would be
> even better.

Well, as I've no idea what the issue is here, I can't do anything or
make any suggestions.  I wasn't copied on the initial part of the
thread.



Re: [PATCH] of/platform: Initialise AMBA default DMA masks

2018-08-28 Thread Russell King - ARM Linux
On Tue, Aug 28, 2018 at 03:25:55PM +0200, Linus Walleij wrote:
> On Tue, Aug 28, 2018 at 11:21 AM Christoph Hellwig  wrote:
> 
> > > + dev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
> > > + if (!dev->dev.dma_mask)
> > > + dev->dev.dma_mask = &dev->dev.coherent_dma_mask;
> >
> > We should never set dma_mask to point to the coherent_dma_mask,
> > as that will cause problems with devices that have different
> > mask for the two.
> >
> > How about something like this?
> (...)
> > +   u64 dma_mask;
> 
> We did that before, so this becomes effectively a revert of:
> commit 446b2a9380b64b9d7410d86ee8226031e03645cf
> 
> DMA-API: amba: get rid of separate dma_mask
> 
> AMBA Primecell devices always treat streaming and coherent DMA exactly
> the same, so there's no point in having the masks separated.
> 
> So there is some history around this.
> 
> There is also some code in drivers/amba/bus.c affected by that
> and I need to look over all static amba device allocations if we
> reintroduce this.

I don't have the rest of this thread to read...

But yes, the fundamental fact is that AMBA devices don't have any
care about the differences between coherent and streaming DMA.  The
distinction that we make in the kernel is purely a software one when
it comes to these devices.

Most AMBA devices themselves are not DMA capable, as they are only
connected to the APB (Amba peripheral bus) and they rely on a
separate DMA engine for their DMA.  APB devices should not have DMA
masks - their DMA capabilities are entirely down to the DMA controller.
So, the majority of AMBA devices should not have any DMA masks.

Only those connected to a bus that they can master on (eg AXI) should
have DMA masks - things like the PL08x DMA controllers, PL11x LCD
controllers, etc.  As I've said above, there is no difference between
streaming and coherent DMA for these devices.



Re: [PATCH v2 7/7] OF: Don't set default coherent DMA mask

2018-07-27 Thread Russell King - ARM Linux
On Fri, Jul 27, 2018 at 12:36:00PM +0100, Robin Murphy wrote:
> That is intentional. Consider a 32-bit device on a bus with an upstream DMA
> range of 40 bits (which could easily happen with e.g. PCI). If the firmware
> code gives that device 40-bit DMA masks and the driver doesn't change them,
> then DMA is not going to work correctly. Therefore the firmware code cannot
> safely make {coherent_}dma_mask larger than the default value passed in, and
> if the default is 0 then that should be fixed in whatever code created the
> device, not worked around later.

I think you're missing the point.

When OF creates platform devices, they are created without any DMA masks
what so ever.  It is only during device probe that dma_configure() gets
called, which then calls of_dma_configure(), where the DMA mask for any
DT declared device is setup.

What this means is that your change has broken all existing DT devices
that are trying to use DMA, because they now end up with a zero coherent
DMA mask and/or streaming DMA mask.

Your original idea behind the patch may be a sound one, but since the
implementation has caused a regression, the implementation is obviously
broken, and that needs fixing or reworking in some way.



Re: [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()

2018-07-02 Thread Russell King - ARM Linux
On Mon, Jul 02, 2018 at 01:53:17PM +0200, Thierry Reding wrote:
> On Wed, May 30, 2018 at 04:06:24PM +0200, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Instead of setting the DMA ops pointer to NULL, set the correct,
> > non-IOMMU ops depending on the device's coherency setting.
> > 
> > Signed-off-by: Thierry Reding 
> > ---
> > Changes in v4:
> > - new patch to fix existing arm_iommu_detach_device() to do what we need
> > 
> >  arch/arm/mm/dma-mapping.c | 12 ++--
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> Christoph, Russell,
> 
> could either of you provide an Acked-by for this? I think it makes the
> most sense for Ben to pick this up into the Nouveau tree along with
> patch 2/2.

Looks fine to me.

Acked-by: Russell King 

Thanks.

> 
> Thierry
> 
> > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> > index af27f1c22d93..87a0037574e4 100644
> > --- a/arch/arm/mm/dma-mapping.c
> > +++ b/arch/arm/mm/dma-mapping.c
> > @@ -1151,6 +1151,11 @@ int arm_dma_supported(struct device *dev, u64 mask)
> > return __dma_supported(dev, mask, false);
> >  }
> >  
> > +static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
> > +{
> > +   return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
> > +}
> > +
> >  #ifdef CONFIG_ARM_DMA_USE_IOMMU
> >  
> >  static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long attrs)
> > @@ -2296,7 +2301,7 @@ void arm_iommu_detach_device(struct device *dev)
> > iommu_detach_device(mapping->domain, dev);
> > kref_put(&mapping->kref, release_iommu_mapping);
> > to_dma_iommu_mapping(dev) = NULL;
> > -   set_dma_ops(dev, NULL);
> > +   set_dma_ops(dev, arm_get_dma_map_ops(dev->archdata.dma_coherent));
> >  
> > pr_debug("Detached IOMMU controller from %s device.\n", dev_name(dev));
> >  }
> > @@ -2357,11 +2362,6 @@ static void arm_teardown_iommu_dma_ops(struct device *dev) { }
> >  
> >  #endif /* CONFIG_ARM_DMA_USE_IOMMU */
> >  
> > -static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
> > -{
> > -   return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
> > -}
> > -
> >  void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> > const struct iommu_ops *iommu, bool coherent)
> >  {
> > -- 
> > 2.17.0
> > 





Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)

2018-05-18 Thread Russell King - ARM Linux
On Fri, May 18, 2018 at 01:35:08PM -0700, Vineet Gupta wrote:
> On 05/18/2018 10:50 AM, Russell King - ARM Linux wrote:
> >On Fri, May 18, 2018 at 10:20:02AM -0700, Vineet Gupta wrote:
> >>I never understood the need for this direction. And if memory serves me
> >>right, at that time I was seeing twice the amount of cache flushing !
> >It's necessary.  Take a moment to think carefully about this:
> >
> > dma_map_single(, dir)
> >
> > dma_sync_single_for_cpu(, dir)
> >
> > dma_sync_single_for_device(, dir)
> >
> > dma_unmap_single(, dir)
> 
> As an aside, do these imply a state machine of sorts - does a driver need
> to always call map_single() first?

Kind-of, but some drivers do omit some of the dma_sync_*() calls.
For example, if a buffer is written to, then mapped with TO_DEVICE,
and then the CPU wishes to write to it, it's fairly common that a
driver omits the dma_sync_single_for_cpu() call.  If you think about
the cases I gave and what cache operations happen, such a scenario
practically turns out to be safe.

> My original point of contention/confusion is the specific combinations of
> API and direction, specifically for_cpu(TO_DEV) and for_device(TO_CPU)

Remember that it is expected that all calls for a mapping use the
same direction argument while that mapping exists.  In other words,
if you call dma_map_single(TO_DEVICE) and then use any of the other
functions, the other functions will also use TO_DEVICE.  The DMA
direction argument describes the direction of the DMA operation
being performed on the buffer, not on the individual dma_* operation.

What isn't expected at arch level is for drivers to do:

dma_map_single(TO_DEVICE)
dma_sync_single_for_cpu(FROM_DEVICE)

or vice versa.

> Semantically what does dma_sync_single_for_cpu(TO_DEV) even imply for a non
> dma coherent arch.
> 
> Your tables below have "none" for both, implying it is unlikely to be a real
> combination (for ARM and ARC atleast).

Very little for the cases that I've stated (and as I mentioned
above, some drivers do omit the call in that case.)

> The other case, actually @dir TO_CPU, independent of for_{cpu, device} 
> implies driver intends to touch it after the call, so it would invalidate
> any stray lines, unconditionally (and not just for speculative prefetch
> case).

If you don't have a CPU that speculatively prefetches, and you've
already had to invalidate the cache lines (to avoid write-backs
corrupting DMA'd data) then there's no need for the architecture
to do any work at the for_cpu(TO_CPU) case - the CPU shouldn't
be touching cache lines that are part of the buffer while it is
mapped, which means a non-speculating CPU won't pull in any
cache lines without an explicit access.

Speculating CPUs are different.  The action of the speculation is
to try and guess what data the program wants to access ahead of
the program flow.  That causes the CPU to prefetch data into the
cache.  The point in the program flow that this happens is not
really determinable by the programmer.  This means that if you try
to read from the DMA buffer after the DMA operation has completed
without invalidating the cache between the DMA completing and the
CPU reading, you have no guarantee that you're reading the data
that the DMA operation has written.  The cache may have
loaded itself with data before the DMA operation completed, and
the CPU may see that stale data.

The difference between non-speculating CPUs and speculating CPUs
is that for non-speculating CPUs, caches work according to explicit
accesses by the program, and the program is stalled while the data
is fetched from external memory.  Speculating CPUs try to predict
ahead of time what data the program will require in the future,
and attempt to load that data into the caches _before_ the program
requires it - which means that the program suffers fewer stalls.

> >In the case of a DMA-incoherent architecture, the operations done at each
> >stage depend on the direction argument:
> >
> > map for_cpu for_device  unmap
> >TO_DEV   writeback   nonewriteback   none
> >TO_CPU   invalidate  invalidate* invalidate  invalidate*
> >BIDIRwriteback   invalidate  writeback   invalidate
> >
> >* - only necessary if the CPU speculatively prefetches.
> >
> >The multiple invalidations for the TO_CPU case handles different
> >conditions that can result in data corruption, and for some CPUs, all
> >four are necessary.
> 
> Can you please explain in some more detail, TO_CPU row, why invalidate is
> conditional sometimes.

See above - I hope my explanation above is sufficient.


Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)

2018-05-18 Thread Russell King - ARM Linux
On Fri, May 18, 2018 at 07:57:34PM +, Alexey Brodkin wrote:
> Hi Russel,

That's Russell.

> On Fri, 2018-05-18 at 18:50 +0100, Russell King - ARM Linux wrote:
> > It's necessary.  Take a moment to think carefully about this:
> > 
> > dma_map_single(, dir)
> > 
> > dma_sync_single_for_cpu(, dir)
> > 
> > dma_sync_single_for_device(, dir)
> > 
> > dma_unmap_single(, dir)
> > 
> > In the case of a DMA-incoherent architecture, the operations done at each
> > stage depend on the direction argument:
> > 
> > map for_cpu for_device  unmap
> > TO_DEV  writeback   nonewriteback   none
> > TO_CPU  invalidate  invalidate* invalidate  invalidate*
> > BIDIR   writeback   invalidate  writeback   invalidate
> > 
> > * - only necessary if the CPU speculatively prefetches.
> 
> I think invalidation of DMA buffer is required on for_cpu(TO_CPU) even
> if the CPU doesn't prefetch - what if we reuse the same buffer for multiple
> reads from DMA device?

That's fine - for non-coherent DMA, the CPU caches will only end up
containing data for that memory if:
- the CPU speculatively fetches data from that memory, or
- the CPU explicitly touches that memory

> > The multiple invalidations for the TO_CPU case handles different
> > conditions that can result in data corruption, and for some CPUs, all
> > four are necessary.
> 
> I would agree that map()/unmap() are quite special cases and so depending
> on direction we need to execute in them either for_cpu() or for_device()
> call-backs depending on direction.
> 
> As for invalidation in case of for_device(TO_CPU) I still don't see
> a rationale behind it. Would be interesting to see a real example where
> we benefit from this.

Yes, you could avoid that, but depending on how you structure the
architecture implementation, it can turn out to be a corner case.
The above table is precisely how 32-bit ARM is implemented, because
the way we implement them is based on who owns the memory - the "map"
and "for_device" operations translate internally to a cpu-to-device
ownership transition of the buffer.  Similar for "unmap" and "to_cpu".
It basically avoids having to add additional functions at the lower
implementation levels.

> > Things get more interesting if the implementation behind the DMA API has
> > to copy data between the buffer supplied to the mapping and some DMA
> > accessible buffer:
> > 
> > map for_cpu for_device  unmap
> > TO_DEV  copy to dma nonecopy to dma none
> > TO_CPU  nonecopy to cpu nonecopy to cpu
> > BIDIR   copy to dma copy to cpu copy to dma copy to cpu
> > 
> > So, in both cases, the value of the direction argument defines what you
> > need to do in each call.
> 
> Interesting enough in your seond table (which describes more complicated
> case indeed) you set "none" for for_device(TO_CPU) which looks logical
> to me.
> 
> So IMHO that's what make sense:
> >8----------
> map for_cpu for_device  unmap
> TO_DEV  writeback   nonewriteback   none
> TO_CPU  noneinvalidate  noneinvalidate*
> BIDIR   writeback   invalidate  writeback   invalidate*
> >8----------

That doesn't make sense for the TO_CPU case.  If the caches contain
dirty cache lines, and you're DMAing from the device to the system
RAM, other system activity can cause the dirty cache lines to be
evicted (written back) to memory which the DMA has already overwritten.
The result is data corruption.  So, you really can't have "none" in
the "map" case there.

Given that, the "for_cpu" case becomes dependent on whether the CPU
speculatively prefetches.



Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)

2018-05-18 Thread Russell King - ARM Linux
On Fri, May 18, 2018 at 10:20:02AM -0700, Vineet Gupta wrote:
> I never understood the need for this direction. And if memory serves me
> right, at that time I was seeing twice the amount of cache flushing !

It's necessary.  Take a moment to think carefully about this:

dma_map_single(, dir)

dma_sync_single_for_cpu(, dir)

dma_sync_single_for_device(, dir)

dma_unmap_single(, dir)

In the case of a DMA-incoherent architecture, the operations done at each
stage depend on the direction argument:

map for_cpu for_device  unmap
TO_DEV  writeback   nonewriteback   none
TO_CPU  invalidate  invalidate* invalidate  invalidate*
BIDIR   writeback   invalidate  writeback   invalidate

* - only necessary if the CPU speculatively prefetches.

The multiple invalidations for the TO_CPU case handles different
conditions that can result in data corruption, and for some CPUs, all
four are necessary.

This is what is implemented for 32-bit ARM, depending on the CPU
capabilities, as we have DMA incoherent devices and we have CPUs that
speculatively prefetch data, and so may load data into the caches while
DMA is in operation.


Things get more interesting if the implementation behind the DMA API has
to copy data between the buffer supplied to the mapping and some DMA
accessible buffer:

map for_cpu for_device  unmap
TO_DEV  copy to dma nonecopy to dma none
TO_CPU  nonecopy to cpu nonecopy to cpu
BIDIR   copy to dma copy to cpu copy to dma copy to cpu

So, in both cases, the value of the direction argument defines what you
need to do in each call.



Re: Difference between IOVA and bus address when SMMU is enabled

2018-05-14 Thread Russell King - ARM Linux
On Sat, May 12, 2018 at 06:25:13PM +0530, valmiki wrote:
> Hi All,
> 
> What is the difference between IOVA address and bus address
> when SMMU is enabled ?
> 
> Is IOVA address term used only when hypervisor is present ?

IOVA = IO virtual address.  IOVA is the term normally used to describe
the address used on the _device_ side of an IOMMU.

For any general setup:

RAM --- MMU --- DEVICE
     ^       ^
 physical virtual
 address  address

where "device" can be an IO device or a CPU, the terms still apply.

If you have something like this:

RAM --- PCI bridge --- MMU --- DEVICE
     ^              ^       ^
 physical          bus   virtual
 address         address address

You could also have (eg, in the case of a system MMU):

RAM --- MMU --- PCI bridge --- DEVICE
     ^       ^              ^
 physical virtual          bus
 address  address        address
                         (this can also be
                          considered a bus
                          address!)

In both of the above two cases, the PCI bridge may perform some address
translation, meaning that the bus address is different from the address
seen on the other side of the bridge.

So, the terms used depend exactly on the overall bus topology.

In the case of a system MMU, where the system MMU sits between peripheral
devices and RAM, then the bus addresses are the same as the
_IOVA of the system MMU_.



Re: [PATCH 04/20] arm-nommu: use generic dma_noncoherent_ops

2018-05-11 Thread Russell King - ARM Linux
On Fri, May 11, 2018 at 09:59:29AM +0200, Christoph Hellwig wrote:
> Switch to the generic noncoherent direct mapping implementation for
> the nommu dma map implementation.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arc/Kconfig|   1 +
>  arch/arm/Kconfig|   4 +
>  arch/arm/mm/dma-mapping-nommu.c | 139 +---
>  3 files changed, 23 insertions(+), 121 deletions(-)
> 
> diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
> index 89d47eac18b2..3a492a9aeaad 100644
> --- a/arch/arc/Kconfig
> +++ b/arch/arc/Kconfig
> @@ -9,6 +9,7 @@
>  config ARC
>   def_bool y
>   select ARC_TIMERS
> + select ARCH_HAS_SYNC_DMA_FOR_DEVICE
>   select ARCH_HAS_SYNC_DMA_FOR_CPU
>   select ARCH_HAS_SYNC_DMA_FOR_DEVICE
>   select ARCH_HAS_SG_CHAIN
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index c43f5bb55ac8..76ddd0064f87 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -12,6 +12,8 @@ config ARM
>   select ARCH_HAS_PHYS_TO_DMA
>   select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
>   select ARCH_HAS_STRICT_MODULE_RWX if MMU
> + select ARCH_HAS_SYNC_DMA_FOR_CPU if !MMU
> + select ARCH_HAS_SYNC_DMA_FOR_DEVICE if !MMU
>   select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
>   select ARCH_HAVE_CUSTOM_GPIO_H
>   select ARCH_HAS_GCOV_PROFILE_ALL
> @@ -27,6 +29,8 @@ config ARM
>   select CPU_PM if (SUSPEND || CPU_IDLE)
>   select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
>   select DMA_DIRECT_OPS if !MMU
> + select DMA_NONCOHERENT_OPS if !MMU
> + select DMA_NONCOHERENT_MMAP if !MMU
>   select EDAC_SUPPORT
>   select EDAC_ATOMIC_SCRUB
>   select GENERIC_ALLOCATOR
> diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
> index f448a0663b10..a74ed6632982 100644
> --- a/arch/arm/mm/dma-mapping-nommu.c
> +++ b/arch/arm/mm/dma-mapping-nommu.c
> @@ -12,6 +12,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> @@ -26,18 +27,16 @@
>   *   - MMU/MPU is off
>   *   - cpu is v7m w/o cache support
>   *   - device is coherent
> - *  otherwise arm_nommu_dma_ops is used.
> + *  otherwise dma_noncoherent_ops is used.
>   *
> - *  arm_nommu_dma_ops rely on consistent DMA memory (please, refer to
> + *  dma_noncoherent_ops rely on consistent DMA memory (please, refer to
>   *  [1] on how to declare such memory).
>   *
>   *  [1] Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
>   */
>  
> -static void *arm_nommu_dma_alloc(struct device *dev, size_t size,
> -  dma_addr_t *dma_handle, gfp_t gfp,
> -  unsigned long attrs)
> -
> +void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> + gfp_t gfp, unsigned long attrs)
>  {
>   void *ret;
>  
> @@ -65,9 +64,8 @@ static void *arm_nommu_dma_alloc(struct device *dev, size_t size,
>   return ret;
>  }
>  
> -static void arm_nommu_dma_free(struct device *dev, size_t size,
> -void *cpu_addr, dma_addr_t dma_addr,
> -unsigned long attrs)
> +void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
> + dma_addr_t dma_addr, unsigned long attrs)
>  {
>   if (attrs & DMA_ATTR_NON_CONSISTENT) {
>   dma_direct_free(dev, size, cpu_addr, dma_addr, attrs);
> @@ -81,9 +79,9 @@ static void arm_nommu_dma_free(struct device *dev, size_t size,
>   return;
>  }
>  
> -static int arm_nommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> -   void *cpu_addr, dma_addr_t dma_addr, size_t size,
> -   unsigned long attrs)
> +int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> + void *cpu_addr, dma_addr_t dma_addr, size_t size,
> + unsigned long attrs)
>  {
>   int ret;
>  
> @@ -93,9 +91,8 @@ static int arm_nommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
>  }
>  
> -
> -static void __dma_page_cpu_to_dev(phys_addr_t paddr, size_t size,
> -   enum dma_data_direction dir)
> +void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
> + size_t size, enum dma_data_direction dir)

Please no.  There is a lot of history of these (__dma_page_cpu_to_dev etc)
functions being abused by out of tree drivers, because they think they
know better.  This is stopped by making them static and ensuring that
drivers have no access to these functions.

Please do not re-expose these to the global kernel.

While it may make things easier from a cross-architecture point of view,
it makes it a lot easier for people to abuse these private APIs.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

Re: noveau vs arm dma ops

2018-04-26 Thread Russell King - ARM Linux
(While there's a rain shower...)

On Thu, Apr 26, 2018 at 02:09:42AM -0700, Christoph Hellwig wrote:
> synopsis:
> 
> drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c:pdevinfo.dma_mask = DMA_BIT_MASK(32);
> drivers/gpu/drm/bridge/synopsys/dw-hdmi.c:  pdevinfo.dma_mask = DMA_BIT_MASK(32);

This is for the AHB audio driver, and is a correct default on two counts:

1. It is historically usual to initialise DMA masks to 32-bit, and leave
   it to the device drivers to negotiate via the DMA mask functions if
   they wish to use higher orders of bits.

2. The AHB audio hardware in the HDMI block only supports 32-bit addresses.

What I've missed from the AHB audio driver is calling the DMA mask
functions... oops.  Patch below.

> drivers/gpu/drm/bridge/synopsys/dw-hdmi.c:  pdevinfo.dma_mask = DMA_BIT_MASK(32);

This is for the I2S audio driver, and I suspect is wrong - I doubt
that the I2S sub-device itself does any DMA whatsoever.

8<===
From: Russell King 
Subject: drm: bridge: dw-hdmi: Negotiate dma mask with DMA API

DMA drivers are supposed to negotiate the DMA mask with the DMA API,
but this was missing from the AHB audio driver.  Add the necessary
call.

Signed-off-by: Russell King 
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
index cf3f0caf9c63..16c45b6cd6af 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
@@ -539,6 +539,10 @@ static int snd_dw_hdmi_probe(struct platform_device *pdev)
unsigned revision;
int ret;
 
+   ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+   if (ret)
+   return ret;
+
writeb_relaxed(HDMI_IH_MUTE_AHBDMAAUD_STAT0_ALL,
   data->base + HDMI_IH_MUTE_AHBDMAAUD_STAT0);
revision = readb_relaxed(data->base + HDMI_REVISION_ID);


-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: noveau vs arm dma ops

2018-04-25 Thread Russell King - ARM Linux
On Wed, Apr 25, 2018 at 11:35:13PM +0200, Daniel Vetter wrote:
> On arm that doesn't work. The iommu api seems like a good fit, except
> the dma-api tends to get in the way a bit (drm/msm apparently has
> similar problems like tegra), and if you need contiguous memory
> dma_alloc_coherent is the only way to get at contiguous memory. There
> was a huge discussion years ago about that, and direct cma access was
> shot down because it would have exposed too much of the caching
> attribute mangling required (most arm platforms need wc-pages to not
> be in the kernel's linear map apparently).

I think you completely misunderstand ARM from what you've written above,
and this worries me greatly about giving DRM the level of control that
is being asked for.

Modern ARMs have a PIPT cache or a non-aliasing VIPT cache, and cache
attributes are stored in the page tables.  These caches are inherently
non-aliasing when there are multiple mappings (which is a great step
forward compared to the previous aliasing caches.)

As the cache attributes are stored in the page tables, this in theory
allows different virtual mappings of the same physical memory to have
different cache attributes.  However, there's a problem, and that's
called speculative prefetching.

Let's say you have one mapping which is cacheable, and another that is
marked as write combining.  If a cache line is speculatively prefetched
through the cacheable mapping of this memory, and then you read the
same physical location through the write combining mapping, it is
possible that you could read cached data.

So, it is generally accepted that all mappings of any particular
physical bit of memory should have the same cache attributes to avoid
unpredictable behaviour.

This presents a problem with what is generally called "lowmem" where
the memory is mapped in kernel virtual space with cacheable
attributes.  It can also happen with highmem if the memory is
kmapped.

This is why, on ARM, you can't use something like get_free_pages() to
grab some pages from the system, pass it to the GPU, map it into
userspace as write-combining, etc.  It _might_ work for some CPUs,
but ARM CPUs vary in how much prefetching they do, and what may work
for one particular CPU is in no way guaranteed to work for another
ARM CPU.

The official line from architecture folk is to assume that the caches
infinitely speculate, are of infinite size, and can writeback *dirty*
data at any moment.

The way to stop things like speculative prefetches to particular
physical memory is to, quite "simply", not have any cacheable
mappings of that physical memory anywhere in the system.

Now, cache flushes on ARM tend to be fairly expensive for GPU buffers.
If you have, say, an 8MB buffer (for a 1080p frame) and you need to
do a cache operation on that buffer, you'll be iterating over it
32 or maybe 64 bytes at a time "just in case" there's a cache line
present.  Referring to my previous email, where I detailed the
potential need for _two_ flushes (one before the GPU operation and
one after), this becomes _really_ expensive.  At that point, you're
probably way better off using write-combine memory where you don't
need to spend CPU cycles performing cache flushing - potentially
across all CPUs in the system if cache operations aren't broadcasted.

This isn't a simple matter of "just provide some APIs for cache
operations" - there's much more that needs to be understood by
all parties here, especially when we have GPU drivers that can be
used with quite different CPUs.

It may well be that for some combinations of CPUs and workloads, it's
better to use write-combine memory without cache flushing, but for
other CPUs that tradeoff (for the same workload) could well be
different.

Older ARMs get more interesting, because they have aliasing caches.
That means the CPU cache aliases across different virtual space
mappings in some way, which complicates (a) the mapping of memory
and (b) handling the cache operations on it.

It's too late for me to go into that tonight, and I probably won't
be reading mail for the next week and a half, sorry.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: noveau vs arm dma ops

2018-04-25 Thread Russell King - ARM Linux
On Wed, Apr 25, 2018 at 08:33:12AM -0700, Christoph Hellwig wrote:
> On Wed, Apr 25, 2018 at 12:04:29PM +0200, Daniel Vetter wrote:
> > - dma api hides the cache flushing requirements from us. GPUs love
> >   non-snooped access, and worse give userspace control over that. We want
> >   a strict separation between mapping stuff and flushing stuff. With the
> >   IOMMU api we mostly have the former, but for the later arch maintainers
> >   regularly tells they won't allow that. So we have drm_clflush.c.
> 
> The problem is that a cache flushing API entirely separate is hard. That
> being said if you look at my generic dma-noncoherent API series it tries
> to move that way.  So far it is in early stages and apparently rather
> buggy unfortunately.

And if folk want a cacheable mapping with explicit cache flushing, the
cache flushing must not be defined in terms of "this is what CPU seems
to need" but from the point of view of a CPU with infinite prefetching,
infinite caching and infinite capacity to perform writebacks of dirty
cache lines at unexpected moments when the memory is mapped in a
cacheable mapping.

(The reason for that is you're operating in a non-CPU specific space,
so you can't make any guarantees as to how much caching or prefetching
will occur by the CPU - different CPUs will do different amounts.)

So, for example, the sequence:

GPU writes to memory
CPU reads from cacheable memory

if the memory was previously dirty (iow, CPU has written), you need to
flush the dirty cache lines _before_ the GPU writes happen, but you
don't know whether the CPU has speculatively prefetched, so you need
to flush any prefetched cache lines before reading from the cacheable
memory _after_ the GPU has finished writing.

Also note that "flush" there can be "clean the cache", "clean and
invalidate the cache" or "invalidate the cache" as appropriate - some
CPUs are able to perform those three operations, and the appropriate
one depends on not only where in the above sequence it's being used,
but also on what the operations are.

So, the above sequence could be:

CPU invalidates cache for memory
(due to possible dirty cache lines)
GPU writes to memory
CPU invalidates cache for memory
(to get rid of any speculatively prefetched
 lines)
CPU reads from cacheable memory

Yes, in the above case, _two_ cache operations are required to ensure
correct behaviour.  However, if you know for certain that the memory was
previously clean, then the first cache operation can be skipped.

What I'm pointing out is there's much more than just "I want to flush
the cache" here, which is currently what DRM seems to assume at the
moment with the code in drm_cache.c.

If we can agree a set of interfaces that allows _proper_ use of these
facilities, one which can be used appropriately, then there shouldn't
be a problem.  The DMA API does that via its ideas about who owns a
particular buffer (because of the above problem) and that's something
which would need to be carried over to such a cache flushing API (it
should be pretty obvious that having a GPU read or write memory while
the cache for that memory is being cleaned will lead to unexpected
results.)

Also note that things get even more interesting in a SMP environment
if cache operations aren't broadcasted across the SMP cluster (which
means cache operations have to be IPI'd to other CPUs.)

The next issue, which I've brought up before, is the risk that comes
with exposing cache flushing to userspace on architectures where it
isn't already exposed.  As has been shown by Google Project Zero, this
risks exposing those architectures to Spectre and Meltdown exploits
where they weren't at such a risk before.  (I've pretty much shown here
that you _do_ need to control which cache lines get flushed to make
these exploits work, and flushing the cache by reading lots of data in
lieu of having the ability to explicitly flush bits of cache makes it
very difficult to impossible for them to work.)

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH v2 5/5] ARM: Unconditionally enable ARM_DMA_USE_IOMMU

2018-04-25 Thread Russell King - ARM Linux
On Wed, Apr 25, 2018 at 12:10:51PM +0200, Thierry Reding wrote:
> From: Thierry Reding 
> 
> The ARM_DMA_USE_IOMMU Kconfig option has side-effects that drivers can
> not opt into but have to explicitly opt out of. This can lead to subtle
> bugs that are difficult to track down and not immediately obvious to be
> related to this Kconfig option.
> 
> To avoid this confusion, always enable the option to expose any lurking
> bugs once and allow any regressions introduced by the DMA/IOMMU code to
> be caught more quickly in the future.
> 
> Note that some drivers still use the Kconfig symbol to provide different
> code paths depending on what architecture the code runs on (e.g. 32-bit
> ARM vs. 64-bit ARM which have different and incompatible implementations
> of the DMA/IOMMU integration code), so leave the symbol in place to keep
> those drivers working.
> 
> For the long term, it is preferable to transition everyone to the
> generic DMA/IOMMU integration code in drivers/iommu/dma-iommu.c.
> 
> Signed-off-by: Thierry Reding 
> ---
> Changes in v2:
> - new patch
> 
>  arch/arm/Kconfig   |  2 +-
>  arch/arm/include/asm/device.h  |  6 --
>  arch/arm/mm/dma-mapping.c  | 18 --
>  drivers/iommu/Kconfig  |  7 ---
>  drivers/media/platform/Kconfig |  1 -
>  5 files changed, 1 insertion(+), 33 deletions(-)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index fa0b190f8a38..3c91de78535a 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -124,7 +124,7 @@ config NEED_SG_DMA_LENGTH
>   bool
>  
>  config ARM_DMA_USE_IOMMU
> - bool
> + def_bool y
>   select ARM_HAS_SG_CHAIN
>   select NEED_SG_DMA_LENGTH

This doesn't work - as has recently been discussed with hch, we can't
globally enable NEED_SG_DMA_LENGTH on ARM - the ARM architecture
pre-dates the addition of the DMA length member in the scatterlist,
and not every machine supports the splitting of the DMA length from
the non-DMA length member.  Hence, this will cause a regression,
sorry.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: noveau vs arm dma ops

2018-04-25 Thread Russell King - ARM Linux
On Wed, Apr 25, 2018 at 01:54:39AM -0700, Christoph Hellwig wrote:
> [discussion about this patch, which should have been cced to the iommu
>  and linux-arm-kernel lists, but wasn't:
>  https://www.spinics.net/lists/dri-devel/msg173630.html]
> 
> On Wed, Apr 25, 2018 at 09:41:51AM +0200, Thierry Reding wrote:
> > > API from the iommu/dma-mapping code.  Drivers have no business poking
> > > into these details.
> > 
> > The interfaces that the above patch uses are all EXPORT_SYMBOL_GPL,
> > which is rather misleading if they are not meant to be used by drivers
> > directly.

EXPORT_SYMBOL* means nothing as far as whether a driver should
be able to use the symbol or not - it merely means that the symbol is
made available to a module.  Modules cover much more than just device
drivers - we have library modules, filesystem modules, helper modules
to name a few non-driver classes of modules.

We also have symbols that are exported as part of the architecture
implementation detail of a public interface.  For example, the
public interface "copy_from_user" is implemented as an inline
function (actually several layers of inline functions) eventually
calling into arm_copy_from_user().  arm_copy_from_user() is exported,
but no driver (in fact no module) is allowed to make direct reference
to arm_copy_from_user() - it'd fail when software PAN is enabled.

The whole idea that "if a symbol is exported, it's fine for a driver
to use it" is a complete work of fiction, always has been, and always
will be.

We've had this with the architecture implementation details of the
DMA API before, and with the architecture implementation details of
the CPU cache flushing.  There's only so much commentary, or so many __
prefixes, you can add to a symbol before things get ridiculous, and
none of it stops people creating this abuse.  The only thing that
seems to prevent it is to make life hard for folk wanting to use
the symbols (eg, hiding the symbol prototype in a private header,
etc.)

Never, ever go under the covers of an interface.  If the interface
doesn't do what you want, _discuss_ it, don't just think "oh, that
architecture private facility looks like what I need, I'll use that
directly."

If you ever are on the side of trying to maintain those implementation
details that are abused in this way, you'll soon understand why this
behaviour by driver authors is soo annoying, and the maintainability
problems it creates.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH 11/12] swiotlb: move the SWIOTLB config symbol to lib/Kconfig

2018-04-24 Thread Russell King - ARM Linux
On Tue, Apr 24, 2018 at 08:55:49AM +0200, Christoph Hellwig wrote:
> On Tue, Apr 24, 2018 at 12:52:05AM +0100, Russell King - ARM Linux wrote:
> > On Mon, Apr 23, 2018 at 07:04:18PM +0200, Christoph Hellwig wrote:
> > > This way we have one central definition of it, and user can select it as
> > > needed.  Note that we also add a second ARCH_HAS_SWIOTLB symbol to
> > > indicate the architecture supports swiotlb at all, so that we can still
> > > make the usage optional for a few architectures that want this feature
> > > to be user selectable.
> > > 
> > > Signed-off-by: Christoph Hellwig 
> > 
> > Hmm, this looks like we end up with NEED_SG_DMA_LENGTH=y on ARM by
> > default, which probably isn't a good idea - ARM pre-dates the dma_length
> > parameter in scatterlists, and I don't think all code is guaranteed to
> > do the right thing if this is enabled.
> 
> We shouldn't end up with NEED_SG_DMA_LENGTH=y on ARM by default.

Your patch as sent would end up with:

ARM selects ARCH_HAS_SWIOTLB
SWIOTLB is defaulted to ARCH_HAS_SWIOTLB
SWIOTLB selects NEED_SG_DMA_LENGTH

due to:

@@ -106,6 +106,7 @@ config ARM
select REFCOUNT_FULL
select RTC_LIB
select SYS_SUPPORTS_APM_EMULATION
+   select ARCH_HAS_SWIOTLB

and:

+config SWIOTLB
+   bool "SWIOTLB support"
+   default ARCH_HAS_SWIOTLB
+   select NEED_SG_DMA_LENGTH

Therefore, the default state for SWIOTLB and hence NEED_SG_DMA_LENGTH
becomes 'y' on ARM, and any defconfig file that does not mention SWIOTLB
explicitly ends up with both these enabled.
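A Kconfig shape that avoids the default-on chain would look roughly like this (a sketch only, not the fix that was ultimately applied):

```kconfig
# With no 'default' and no 'select NEED_SG_DMA_LENGTH', a defconfig
# that never mentions SWIOTLB leaves both symbols disabled on ARM,
# while architectures that need swiotlb can still enable it.
config SWIOTLB
	bool "SWIOTLB support"
	depends on ARCH_HAS_SWIOTLB
```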

> It is only selected by ARM_DMA_USE_IOMMU before the patch, and it will
> now also be selected by SWIOTLB, which for arm is never used or selected
> directly by anything but xen-swiotlb.

See above.

> Then again looking at the series there shouldn't be any need to
> even select NEED_SG_DMA_LENGTH for swiotlb, as we'll never merge segments,
> so I'll fix that up.

That would help to avoid any regressions along the lines I've spotted
by review.

It does look a bit weird though - patch 10 arranged stuff so that we
didn't end up with SWIOTLB always enabled, but this patch reintroduces
that with the allowance that the user can disable if so desired.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH 11/12] swiotlb: move the SWIOTLB config symbol to lib/Kconfig

2018-04-23 Thread Russell King - ARM Linux
On Mon, Apr 23, 2018 at 07:04:18PM +0200, Christoph Hellwig wrote:
> This way we have one central definition of it, and user can select it as
> needed.  Note that we also add a second ARCH_HAS_SWIOTLB symbol to
> indicate the architecture supports swiotlb at all, so that we can still
> make the usage optional for a few architectures that want this feature
> to be user selectable.
> 
> Signed-off-by: Christoph Hellwig 

Hmm, this looks like we end up with NEED_SG_DMA_LENGTH=y on ARM by
default, which probably isn't a good idea - ARM pre-dates the dma_length
parameter in scatterlists, and I don't think all code is guaranteed to
do the right thing if this is enabled.

For example, arch/arm/mach-rpc/dma.c doesn't use the dma_length
member of struct scatterlist.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH 25/44] arm: implement ->mapping_error

2017-06-08 Thread Russell King - ARM Linux
On Thu, Jun 08, 2017 at 03:25:50PM +0200, Christoph Hellwig wrote:
> +static int dmabounce_mapping_error(struct device *dev, dma_addr_t dma_addr)
> +{
> + if (dev->archdata.dmabounce)
> + return 0;

I'm not convinced that we need this check here:

dev->archdata.dmabounce = device_info;
set_dma_ops(dev, &dmabounce_ops);

There shouldn't be any chance of dev->archdata.dmabounce being NULL if
the dmabounce_ops has been set as the current device DMA ops.  So I
think that test can be killed.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops

2017-05-25 Thread Russell King - ARM Linux
On Wed, May 24, 2017 at 02:26:13PM +0300, Laurent Pinchart wrote:
> Again, the patch I propose is the simplest v4.12-rc fix I can think of, short 
> of reverting your complete IOMMU probe deferral patch series. Let's focus on 
> the v4.12-rc fix, and then discuss how to move forward in v4.13 and beyond.

Except, I don't think it fixes the problem that Sricharan is trying to
fix, namely that of DMA ops that have been setup, torn down, and are
trying to be re-setup again.  The issue here is that results in a stale
setup, because the dma_ops are left in-place from the first iommu setup,
but the iommu mapping has been disposed of.

Your patch only avoids the problem of tearing down someone else's DMA
ops.  We need a combination of both patches together.

What needs to happen for Sricharan's problem to be resolved is:

1. move all of __arm_iommu_detach_device() into arm_iommu_detach_device().
2. replace the __arm_iommu_detach_device() call in arm_teardown_iommu_dma_ops()
   with arm_iommu_detach_device().

as I don't see any need to have a different order in
arm_teardown_iommu_dma_ops().

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops

2017-05-23 Thread Russell King - ARM Linux
On Wed, May 24, 2017 at 12:46:51AM +0300, Laurent Pinchart wrote:
> Hi Russell,
> 
> On Tuesday 23 May 2017 18:53:19 Russell King - ARM Linux wrote:
> > On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote:
> > > On 23/05/17 17:25, Russell King - ARM Linux wrote:
> > >> So, I've come to apply this patch (since it's finally landed in the
> > >> patch system), and I'm not convinced that the commit message is really
> > >> up to scratch.
> > >> 
> > >> The current commit message looks like this:
> > >> 
> > >> "   ARM: 8674/1: dma-mapping: Reset the device's dma_ops
> > >> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(),
> > >> dma_ops should be cleared in the teardown path. Otherwise this
> > >> causes problem when the probe of device is retried after being
> > >> deferred. The device's iommu structures are cleared after
> >> EPROBE_DEFER error, but on the next try dma_ops will still be set to
> > >> old value, which is not right."
> > >> 
> > >> It is obviously a fix, but a fix for which patch?  Looking at the
> > >> history, we have "arm: dma-mapping: Don't override dma_ops in
> > >> arch_setup_dma_ops()" which I would have guessed is the appropriate
> > >> one, but this post-dates your patch (it's very recent, v4.12-rc
> > >> recent.)
> > >> 
> > >> So, I need more description about the problem you were seeing when
> > >> you first proposed this patch.
> > >> 
> > >> How does leaving the dma_ops in place prior to "arm: dma-mapping:
> > >> Don't override dma_ops in arch_setup_dma_ops()" cause problems for
> > >> deferred probing?
> > >> 
> > >> What patch is your change trying to fix?  In other words, how far
> > >> back does this patch need to be backported?
> > > 
> > > In effect, it's fixing a latent inconsistency that's been present since
> > > its introduction in 4bb25789ed28. However, that has clearly not proven
> > > to be an issue in practice since then. With 09515ef5ddad we start
> > > actually calling arch_teardown_dma_ops() in a manner that might leave
> > > things partially initialised if anyone starts removing and reprobing
> > > drivers, but so far that's still from code inspection[1] rather than
> > > anyone hitting it.
> > > 
> > > Given that the changes which tickle it are fresh in -rc1 I'd say there's
> > > no need to backport this, but at the same time it shouldn't do any
> > > damage if you really want to.
> > 
> > Well, looking at this, I'm not convinced that much of it is correct.
> > 
> > 1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds
> >the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops()
> >rather than arch_teardown_dma_ops().
> > 
> >This doesn't strike me as being particularly symmetric.
> >arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s
> >counterpart.
> > 
> > 2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops
> >check, and Xen - Xen wants to override the DMA ops if in the
> >initial domain.  It's not clear (at least to me) whether the recent
> >patch adding the dma_ops check took account of this or not.
> > 
> > 3) random places seem to fiddle with the dma_ops - notice that
> >arm_iommu_detach_device() sets the dma_ops to NULL.
> > 
> >In fact, I think moving __arm_iommu_detach_device() into
> >arm_iommu_detach_device(), calling arm_iommu_detach_device(),
> >and getting rid of the explicit set_dma_ops(, NULL) in this
> >path would be a good first step.
> > 
> > 4) I think arch_setup_dma_ops() is over-complex.
> > 
> > So, in summary, this code is a mess today, and that means it's not
> > obviously correct - which is bad.  This needs sorting.
> 
> We've reached the same conclusion independently, but I'll refrain from 
> commenting on whether that's a good or bad thing :-)
> 
> I don't think this patch should be applied, as it could break Xen (and other 
> platforms/drivers that set the DMA operations manually) by resetting DMA
> operations at device remove() time even if they have been set independently
> of arch_setup_dma_ops().

That will only occur if the dma ops have been overriden once the DMA
operations have been setup via arch_setup_dma_ops.  What saves it from
wholesale NULLing of the DMA operations is the check for a valid
dma_iommu_mapping structure in arm_teardown_iommu_dma_ops().  This only
exists when arm_setup_iommu_dma_ops() has attached a mapping to the
device.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops

2017-05-23 Thread Russell King - ARM Linux
On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote:
> On 23/05/17 17:25, Russell King - ARM Linux wrote:
> > So, I've come to apply this patch (since it's finally landed in the patch
> > system), and I'm not convinced that the commit message is really up to
> > scratch.
> > 
> > The current commit message looks like this:
> > 
> > "   ARM: 8674/1: dma-mapping: Reset the device's dma_ops
> > 
> > arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(),
> > dma_ops should be cleared in the teardown path. Otherwise this causes
> > problem when the probe of device is retried after being deferred. The
> > device's iommu structures are cleared after EPROBE_DEFER error, but on
> > the next try dma_ops will still be set to old value, which is not 
> > right."
> > 
> > It is obviously a fix, but a fix for which patch?  Looking at the
> > history, we have "arm: dma-mapping: Don't override dma_ops in
> > arch_setup_dma_ops()" which I would have guessed is the appropriate
> > one, but this post-dates your patch (it's very recent, v4.12-rc
> > recent.)
> > 
> > So, I need more description about the problem you were seeing when
> > you first proposed this patch.
> > 
> > How does leaving the dma_ops in place prior to "arm: dma-mapping:
> > Don't override dma_ops in arch_setup_dma_ops()" cause problems for
> > deferred probing?
> > 
> > What patch is your change trying to fix?  In other words, how far
> > back does this patch need to be backported?
> 
> In effect, it's fixing a latent inconsistency that's been present since
> its introduction in 4bb25789ed28. However, that has clearly not proven
> to be an issue in practice since then. With 09515ef5ddad we start
> actually calling arch_teardown_dma_ops() in a manner that might leave
> things partially initialised if anyone starts removing and reprobing
> drivers, but so far that's still from code inspection[1] rather than
> anyone hitting it.
> 
> Given that the changes which tickle it are fresh in -rc1 I'd say there's
> no need to backport this, but at the same time it shouldn't do any
> damage if you really want to.

Well, looking at this, I'm not convinced that much of it is correct.

1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds
   the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops()
   rather than arch_teardown_dma_ops().

   This doesn't strike me as being particularly symmetric.
   arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s
   counterpart.

2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops
   check, and Xen - Xen wants to override the DMA ops if in the
   initial domain.  It's not clear (at least to me) whether the recent
   patch adding the dma_ops check took account of this or not.

3) random places seem to fiddle with the dma_ops - notice that
   arm_iommu_detach_device() sets the dma_ops to NULL.

   In fact, I think moving __arm_iommu_detach_device() into
   arm_iommu_detach_device(), calling arm_iommu_detach_device(),
   and getting rid of the explicit set_dma_ops(, NULL) in this
   path would be a good first step.

4) I think arch_setup_dma_ops() is over-complex.

So, in summary, this code is a mess today, and that means it's not
obviously correct - which is bad.  This needs sorting.



Re: [PATCH V3 6/8] arm: dma-mapping: Reset the device's dma_ops

2017-05-23 Thread Russell King - ARM Linux
On Thu, Oct 27, 2016 at 09:07:23AM +0530, Sricharan wrote:
> Hi Robin,
> 
> >On 04/10/16 18:03, Sricharan R wrote:
> >> The dma_ops for the device is not getting set to NULL in
> >> arch_teardown_dma_ops() and this causes an issue when the
> >> device's probe gets deferred and retried. So reset the
> >> dma_ops to NULL.
> >
> >Reviewed-by: Robin Murphy 
> >
> 
>  Thanks.
> 
> >This seems like it could stand independently from the rest of the series
> >- might be worth rewording the commit message to more general terms,
> >i.e. arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
> >thus clearing the ops set by the latter, and sending it to Russell as a
> >fix in its own right.
> 
>   Ok, will reword the commit log and push this separately.

So, I've come to apply this patch (since it's finally landed in the patch
system), and I'm not convinced that the commit message is really up to
scratch.

The current commit message looks like this:

"   ARM: 8674/1: dma-mapping: Reset the device's dma_ops

arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(),
dma_ops should be cleared in the teardown path. Otherwise this causes
problems when the probe of a device is retried after being deferred. The
device's iommu structures are cleared after an EPROBE_DEFER error, but on
the next try dma_ops will still be set to the old value, which is not right."

It is obviously a fix, but a fix for which patch?  Looking at the
history, we have "arm: dma-mapping: Don't override dma_ops in
arch_setup_dma_ops()" which I would have guessed is the appropriate
one, but this post-dates your patch (it's very recent, v4.12-rc
recent.)

So, I need more description about the problem you were seeing when
you first proposed this patch.

How does leaving the dma_ops in place prior to "arm: dma-mapping:
Don't override dma_ops in arch_setup_dma_ops()" cause problems for
deferred probing?

What patch is your change trying to fix?  In other words, how far
back does this patch need to be backported?



Re: [PATCH v2] arm64/dma-mapping: fix DMA_ATTR_FORCE_CONTIGUOUS mmaping code

2017-03-31 Thread Russell King - ARM Linux
On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote:
> In this version of the patch I have replaced temporal pages and
> iommu_dma_mmap with remap_pfn_range or rather its simplified version
> vm_iomap_memory.
> Unfortunately I have not find nice helper for sgtable creation, so I
> left sg allocation inside _iommu_mmap_attrs.
> 
> As the fixing of DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher
> priority I have focused only on it in this patch.

As I mentioned elsewhere, I don't believe that fudging around in this way
is a proper fix.

DMA coherent memory was never, ever, intended to be passed back into the
streaming APIs - it was designed that the two would be mutually exclusive.

The problem is that we now have DMA coherent allocations being passed
through the dma-buf API using this dma_get_sgtable() thing, which is quite
broken.  I regard dma_get_sgtable() as very broken, and had I realised at
the time that this was what it was trying to do, I would have NAK'd it.

Rather than bodging around this brokenness, trying to get dma_get_sgtable()
to work, I believe we need to address the root cause - which is proper
support for passing DMA coherent allocations through dma-buf, which does
not involve scatterlists and calling dma_map_sg() on it.

That's going to need to be addressed in any case, because of the
dma_declare_coherent_memory() issue, where we may not have struct pages
backing a coherent allocation.  Such a case can come up on ARM64 via
DT's "shared-dma-pool" reserved memory stuff.

Right now, I have a "fix" for ARM queued which causes dma_get_sgtable()
to fail when used with reserved memory, but we have one user who needs
this to work.  So, dma-buf needs to be fixed for this one way or another,
and I don't think that bending the current broken stuff to suit it by
trying these vmalloc_to_page() hacks is acceptable.

dma_get_sgtable() is fundamentally broken IMHO.



Re: [PATCHv7 4/6] arm: dma-mapping: add {map, unmap}_resource for iommu ops

2016-06-01 Thread Russell King - ARM Linux
On Wed, Jun 01, 2016 at 05:22:27PM +0200, Niklas Söderlund wrote:
> +static dma_addr_t arm_iommu_map_resource(struct device *dev,
> + phys_addr_t phys_addr, size_t size,
> + enum dma_data_direction dir, struct dma_attrs *attrs)
> +{
> + struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> + dma_addr_t dma_addr;
> + int ret, prot;
> + phys_addr_t addr = phys_addr & PAGE_MASK;
> + int offset = phys_addr & ~PAGE_MASK;
> + int len = PAGE_ALIGN(size + offset);

Shouldn't both of these be unsigned - preferably size_t for len?

> +
> + dma_addr = __alloc_iova(mapping, size);

Is this really correct?  What if size = 4095 and offset = 10?  Do we
really only need one IOVA page for such a mapping (I count two pages.)
Shouldn't this be "len" ?
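[For the record, the arithmetic behind this review comment can be checked in isolation. The sketch below is a hedged userspace model - PAGE_SIZE assumed to be 4096, function name invented - of the offset/length calculation in the patch, not the kernel code itself:]

```c
#include <stddef.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)

/* IOVA space needed to map @size bytes starting at physical address
 * @phys: the offset into the first page must be added before rounding
 * up, which is exactly what "len" in the patch computes. */
static size_t iova_len(unsigned long phys, size_t size)
{
	size_t offset = phys & ~PAGE_MASK;

	return PAGE_ALIGN(size + offset);
}
```

[With a page offset of 10 and size 4095, iova_len() comes to 8192 - two pages - while rounding up "size" alone gives only 4096, so allocating the IOVA with "size" rather than "len" under-allocates by a page.]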

> + if (dma_addr == DMA_ERROR_CODE)
> + return dma_addr;
> +
> + prot = __dma_direction_to_prot(dir) | IOMMU_MMIO;
> +
> + ret = iommu_map(mapping->domain, dma_addr, addr, len, prot);
> + if (ret < 0)
> + goto fail;
> +
> + return dma_addr + offset;
> +fail:
> + __free_iova(mapping, dma_addr, size);

Shouldn't this be "len" as well?

> + return DMA_ERROR_CODE;
> +}
> +
> +/**
> + * arm_iommu_unmap_resource - unmap a device DMA resource
> + * @dev: valid struct device pointer
> + * @dma_handle: DMA address to resource
> + * @size: size of resource to map
> + * @dir: DMA transfer direction
> + */
> +static void arm_iommu_unmap_resource(struct device *dev, dma_addr_t 
> dma_handle,
> + size_t size, enum dma_data_direction dir,
> + struct dma_attrs *attrs)
> +{
> + struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> + dma_addr_t iova = dma_handle & PAGE_MASK;
> + int offset = dma_handle & ~PAGE_MASK;
> + int len = PAGE_ALIGN(size + offset);

unsigned/size_t again.

> +
> + if (!iova)
> + return;
> +
> + iommu_unmap(mapping->domain, iova, len);
> + __free_iova(mapping, iova, len);

Here, you free "len" bytes of iova, which is different from above.

> +}
> +
>  static void arm_iommu_sync_single_for_cpu(struct device *dev,
>   dma_addr_t handle, size_t size, enum dma_data_direction dir)
>  {
> @@ -1994,6 +2051,9 @@ struct dma_map_ops iommu_ops = {
>   .unmap_sg   = arm_iommu_unmap_sg,
>   .sync_sg_for_cpu= arm_iommu_sync_sg_for_cpu,
>   .sync_sg_for_device = arm_iommu_sync_sg_for_device,
> +
> + .map_resource   = arm_iommu_map_resource,
> + .unmap_resource = arm_iommu_unmap_resource,
>  };
>  
>  struct dma_map_ops iommu_coherent_ops = {
> @@ -2007,6 +2067,9 @@ struct dma_map_ops iommu_coherent_ops = {
>  
>   .map_sg = arm_coherent_iommu_map_sg,
>   .unmap_sg   = arm_coherent_iommu_unmap_sg,
> +
> + .map_resource   = arm_iommu_map_resource,
> + .unmap_resource = arm_iommu_unmap_resource,
>  };
>  
>  /**



Re: [PATCH 2/4] ARM: dma-mapping: Constify attrs passed to internal functions

2016-05-24 Thread Russell King - ARM Linux
On Tue, May 24, 2016 at 08:28:08AM +0200, Krzysztof Kozlowski wrote:
> Some of the non-exported functions do not modify passed dma_attrs so the
> pointer can point to const data.
> 
> Signed-off-by: Krzysztof Kozlowski 

Acked-by: Russell King 

Thanks.



Re: [PATCH 2/4] scatterlist: add sg_alloc_table_from_buf() helper

2016-03-31 Thread Russell King - ARM Linux
On Thu, Mar 31, 2016 at 04:45:57PM +0200, Boris Brezillon wrote:
> Hi Russell,
> 
> On Thu, 31 Mar 2016 15:14:13 +0100
> Russell King - ARM Linux  wrote:
> 
> > On Thu, Mar 31, 2016 at 02:29:42PM +0200, Boris Brezillon wrote:
> > > sg_alloc_table_from_buf() provides an easy solution to create an sg_table
> > > from a virtual address pointer. This function takes care of dealing with
> > > vmallocated buffers, buffer alignment, or DMA engine limitations (maximum
> > > DMA transfer size).
> > 
> > Please note that the DMA API does not take account of coherency of memory
> > regions other than non-high/lowmem - there are specific extensions to
> > deal with this.
> 
> Ok, you said 'non-high/lowmem', this means vmalloced and kmapped buffers
> already fall in this case, right?
> 
> Could you tell me more about those specific extensions?

I was slightly confused - the extensions I was thinking of are those
listed at the bottom of Documentation/cachetlb.txt, which have nothing
to do with DMA.

However, it's probably worth reading Documentation/DMA-API-HOWTO.txt
to read up on what kinds of memory are considered to be DMA-able in
the kernel.

> > What this means is that having an API that takes any virtual address
> > pointer, converts it to a scatterlist which is then DMA mapped, is
> > unsafe.
> 
> Which means some implementations already get this wrong (see
> spi_map_buf(), and I'm pretty sure it's not the only one).

Quite possible, but that is driver stuff, and driver stuff gets things
wrong all the time. :)

> > It'll be okay for PIPT and non-aliasing VIPT cache architectures, but
> > for other cache architectures this will hide this problem and make
> > review harder.
> > 
> 
> Ok, you lost me. I'll have to do my homework and try to understand what
> this means :).

P = physical address
V = virtual address
I = indexed
T = tag

The tag is held in each cache line.  When a location is looked up in the
cache, an index is used to locate a set of cache lines and the tag is
compared to check which cache line in the set is the correct one (or
whether the address even exists in the cache.)

How the index and tag are derived varies between cache architectures.

PIPT = indexed by physical address, tagged with physical address.  Never
aliases with itself in the presence of multiple virtual mappings.

VIPT = indexed by virtual address, tagged with physical address.  If the
index bits taken from the virtual address do not extend beyond the MMU
page offset, it is also alias-free; otherwise aliases can exist, but can
be eliminated by "cache colouring" - ensuring that a physical address is
always mapped with the same values in those overlapping bits.

VIVT = indexed by virtual address, tagged with virtual address.  The
worst kind of cache, since every different mapping of the same physical
address is guaranteed by design to alias with other mappings.

There is little cache colouring between different kernel mappings (eg,
between lowmem and vmalloc space.)

What this means is that, while the DMA API takes care of DMA aliases
in the lowmem mappings, an alias-prone VIPT cache will remain incoherent
with DMA if it is remapped into vmalloc space, and the mapping happens
to have a different cache colour.  In other words, this is a data
corruption issue.

Hence, taking a range of vmalloc() addresses, converting them into a
scatterlist, then using the DMA API on the scatterlist _only_ guarantees
that the lowmem (and kmap'd highmem mappings) are coherent with DMA.
There is no way for the DMA API to know that other mappings exist, and
obviously flushing every possible cache line just because a mapping might
exist multiplies the expense of the DMA API: not only in terms of time
spent running through all the possibilities, which doubles for every
aliasing bit of VIPT, but also TLB pressure since you'd have to create
a mapping for each alias and tear it back down.

VIVT is even worse, since there is no other virtual mapping which is
coherent, would need to be known, and each mapping would need to be
individually flushed.
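[To make the VIPT case concrete, the alias condition reduces to a couple of lines of arithmetic. This is an illustrative userspace sketch - the function names are invented, not kernel API: a cache way indexed by virtual address can alias whenever it is larger than an MMU page.]

```c
#include <stdbool.h>
#include <stddef.h>

/* For a VIPT cache, aliases are possible when the set-index bits
 * extend above the page-offset bits, i.e. when the size of one way
 * (total size / associativity) exceeds the MMU page size.  The number
 * of "colours" is way_size / page_size. */
static size_t vipt_colours(size_t cache_size, size_t ways, size_t page_size)
{
	size_t way_size = cache_size / ways;

	return way_size > page_size ? way_size / page_size : 1;
}

static bool vipt_can_alias(size_t cache_size, size_t ways, size_t page_size)
{
	return vipt_colours(cache_size, ways, page_size) > 1;
}
```

[For example, a hypothetical 32KB 2-way VIPT cache with 4KB pages has 16KB ways and thus 4 colours, so the same physical page can land in four different sets depending on its virtual address; a 32KB 8-way cache has 4KB ways and cannot alias.]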



Re: [PATCH 2/4] scatterlist: add sg_alloc_table_from_buf() helper

2016-03-31 Thread Russell King - ARM Linux
On Thu, Mar 31, 2016 at 02:29:42PM +0200, Boris Brezillon wrote:
> sg_alloc_table_from_buf() provides an easy solution to create an sg_table
> from a virtual address pointer. This function takes care of dealing with
> vmallocated buffers, buffer alignment, or DMA engine limitations (maximum
> DMA transfer size).

Please note that the DMA API does not take account of coherency of memory
regions other than non-high/lowmem - there are specific extensions to
deal with this.

What this means is that having an API that takes any virtual address
pointer, converts it to a scatterlist which is then DMA mapped, is
unsafe.

It'll be okay for PIPT and non-aliasing VIPT cache architectures, but
for other cache architectures this will hide this problem and make
review harder.



Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping

2015-11-04 Thread Russell King - ARM Linux
On Wed, Nov 04, 2015 at 06:48:50PM +0900, Tomasz Figa wrote:
> There is no requirement, but shouldn't it be desired for the mapping
> code to map them as such? Otherwise, how could the IOMMU use case you
> described above (address translator for devices which don't have the
> capability to address a scatterlist) be handled properly?

It's up to the IOMMU code to respect the parameters that the device has
supplied to it via the device_dma_parameters.  This doesn't currently
allow a device to say "I want this scatterlist to be mapped as a
contiguous device address", so really if a device has such a requirement,
at the moment the device driver _must_ check the dma_map_sg() return
value and act accordingly.

While it's possible to say "an IOMMU should map as a single contiguous
address" what happens when the IOMMU's device address space becomes
fragmented?

> Is the general conclusion now that dma_map_sg() should not be used to
> create IOMMU mappings and we should make a step backwards making all
> drivers (or frameworks, such as videobuf2) do that manually? That
> would be really backwards, because code not aware of IOMMU existence
> at all would have to become aware of it.

No.  The DMA API has always had the responsibility for managing the
IOMMU device, which may well be shared between multiple different
devices.

However, if the IOMMU is part of a device IP block (such as a GPU)
then the decision on whether the DMA API should be used or not is up
to the driver author.  If it has special management requirements,
then it's probably appropriate for the device driver to manage it by
itself.

For example, a GPUs MMU may need something inserted into the GPUs
command stream to flush the MMU TLBs.  Such cases are inappropriate
to be using the DMA API for IOMMU management.



Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping

2015-11-04 Thread Russell King - ARM Linux
On Wed, Nov 04, 2015 at 02:12:03PM +0900, Tomasz Figa wrote:
> My understanding of a scatterlist was that it represents a buffer as a
> whole, by joining together its physically discontinuous segments.

Correct, and it may be scattered in CPU virtual space as well.

> I don't see how single segments (layout of which is completely up to
> the allocator; often just single pages) would be usable for hardware
> that needs to do some work more serious than just writing a byte
> stream continuously to subsequent buffers. In case of such simple
> devices you don't even need an IOMMU (for means other than protection
> and/or getting over address space limitations).

All that's required is that the addresses described in the scatterlist
are accessed as an apparently contiguous series of bytes.  They don't
have to be contiguous in any address view, provided the device access
appears to be contiguous.  How that is done is really neither here nor
there.

IOMMUs are normally there as an address translator - for example, the
underlying device may not have the capability to address a scatterlist
(eg, because it makes effectively random access) and in order to be
accessible to the device, it needs to be made contiguous in device
address space.

Another scenario is that you have more bits of physical address than
a device can generate itself for DMA purposes, and you need an IOMMU
to create a (possibly scattered) mapping in device address space
within the ability of the device to address.

The requirements here depend on the device behind the IOMMU.

> However, IMHO the most important use case of an IOMMU is to make
> buffers, which are contiguous in CPU virtual address space (VA),
> contiguous in device's address space (IOVA).

No - there is no requirement for CPU virtual contiguous buffers to also
be contiguous in the device address space.



Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping

2015-11-04 Thread Russell King - ARM Linux
On Wed, Nov 04, 2015 at 02:15:41PM +0900, Tomasz Figa wrote:
> On Wed, Nov 4, 2015 at 3:40 AM, Russell King - ARM Linux
>  wrote:
> > On Tue, Nov 03, 2015 at 05:41:24PM +, Robin Murphy wrote:
> >> Hi Tomasz,
> >>
> >> On 02/11/15 13:43, Tomasz Figa wrote:
> >> >Agreed. The dma_map_*() API is not guaranteed to return a single
> >> >contiguous part of virtual address space for any given SG list.
> >> >However it was understood to be able to map buffers contiguously
> >> >mappable by the CPU into a single segment and users,
> >> >videobuf2-dma-contig in particular, relied on this.
> >>
> >> I don't follow that - _any_ buffer made of page-sized chunks is going to be
> >> mappable contiguously by the CPU; it's clearly impossible for the streaming
> >> DMA API itself to offer such a guarantee, because it's entirely orthogonal
> >> to the presence or otherwise of an IOMMU.
> >
> > Tomasz's use of "virtual address space" above in combination with the
> > DMA API is really confusing.
> 
> I suppose I must have mistakenly use "virtual address space" somewhere
> instead of "IO virtual address space". I'm sorry for causing
> confusion.
> 
> The thing being discussed here is mapping of buffers described by
> scatterlists into IO virtual address space, i.e. the operation
> happening when dma_map_sg() is called for an IOMMU-enabled device.

... and there, it's perfectly legal for an IOMMU to merge all entries
in a scatterlist into one mapping - so dma_map_sg() would return 1.

What that means is that the scatterlist describes the CPU view of the
buffer list using the original number of entries, and the DMA device
view of the same buffer using just the first entry.

In other words, if you're walking a scatterlist, and doing a mixture of
DMA and PIO, you can't assume that if you're at scatterlist entry N for
DMA, you can switch to PIO for entry N and you'll write to the same
memory.  (I know that there are badly written drivers in the kernel which
unfortunately do make this assumption, and if they're used in the
presence of an IOMMU, they _will_ silently corrupt data.)
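[A toy model makes the divergence visible - this is not the kernel scatterlist API; the struct layout and names are invented for illustration, loosely mirroring struct scatterlist. Once a pretend-IOMMU merges the list, only the entries up to the returned count carry a valid device view:]

```c
#include <stddef.h>

/* Toy scatterlist entry: the CPU-side description (cpu_addr/length) is
 * per-entry, but after mapping only the first "mapped" entries carry a
 * valid dma_address/dma_length. */
struct toy_sg {
	unsigned long cpu_addr;
	size_t length;
	unsigned long dma_address;
	size_t dma_length;
};

/* Pretend-IOMMU: map @nents physically scattered entries contiguously
 * at @iova_base, merging them into one DMA segment, and return the new
 * segment count - exactly the "dma_map_sg() returns 1" case above. */
static int toy_map_sg(struct toy_sg *sg, int nents, unsigned long iova_base)
{
	size_t total = 0;
	int i;

	for (i = 0; i < nents; i++)
		total += sg[i].length;

	sg[0].dma_address = iova_base;
	sg[0].dma_length = total;
	return 1;	/* device-visible view: one segment */
}
```

[After toy_map_sg() returns 1, a driver must walk only that single DMA segment; scatterlist entry N's cpu_addr no longer corresponds to any particular point in the device's view, which is why mixing DMA and PIO per-entry corrupts data.]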



Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping

2015-11-03 Thread Russell King - ARM Linux
On Tue, Nov 03, 2015 at 05:41:24PM +, Robin Murphy wrote:
> Hi Tomasz,
> 
> On 02/11/15 13:43, Tomasz Figa wrote:
> >Agreed. The dma_map_*() API is not guaranteed to return a single
> >contiguous part of virtual address space for any given SG list.
> >However it was understood to be able to map buffers contiguously
> >mappable by the CPU into a single segment and users,
> >videobuf2-dma-contig in particular, relied on this.
> 
> I don't follow that - _any_ buffer made of page-sized chunks is going to be
> mappable contiguously by the CPU; it's clearly impossible for the streaming
> DMA API itself to offer such a guarantee, because it's entirely orthogonal
> to the presence or otherwise of an IOMMU.

Tomasz's use of "virtual address space" above in combination with the
DMA API is really confusing.

dma_map_sg() does *not* construct a CPU view of the passed scatterlist.
The only thing dma_map_sg() might do with virtual addresses is to use
them as a way to achieve cache coherence for one particular view of
that memory, that being the kernel's own lowmem mapping and any kmaps.
It doesn't extend to vmalloc() or userspace mappings of the memory.

If the scatterlist is converted to an array of struct page pointers,
it's possible to map it with vmap(), but it's implementation-defined
whether such a mapping will receive cache maintenance as part of the
DMA API or not.  (If you have PIPT caches, it will; if they're VIPT
caches, maybe not.)

There is a separate set of calls to deal with the flushing issues for
vmap()'d memory in this case - see flush_kernel_vmap_range() and
invalidate_kernel_vmap_range().



Re: [RFC PATCH 0/3] iommu: Add range flush operation

2015-09-29 Thread Russell King - ARM Linux
On Tue, Sep 29, 2015 at 05:27:12PM +0100, Robin Murphy wrote:
> Eh, swings and roundabouts. An argument denoting whether the flush is being
> called on the map or unmap path would be fine,

Sorry, that statement is wrong.  It's not about whether you flush before
or after the DMA operation.  I'm afraid I'm probably going to tell you
how to suck eggs here, because I don't think you quite "get it" with
non-dma-coherent modern CPUs.

Modern CPUs prefetch data into their caches, and they also randomly write
back data from their caches to memory.  When performing a DMA operation
from device to memory, you need to do two things with CPU caches which
aren't coherent:

1. Before starting the DMA operation, you need to walk over the memory to
   be mapped, ensuring that any dirty cache lines are written back.  This
   is to prevent dirty cache lines overwriting data that has already been
   DMA'd from the device.

2. After the DMA operation has completed, you need to walk over the
   memory again, invalidating any cache lines which may have been
   speculatively loaded from that memory while DMA was running.  These
   cache lines may have been loaded prior to the DMA operation placing
   the new data into memory.

So, it's not a before-or-after: you always have to perform write-back
cache maintenance prior to the DMA operation, and then invalidate cache
maintenance after the DMA operation has completed for any mapping which
the DMA may have written to (which means device-to-memory and
bidirectional mappings.)
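[The two-step discipline can be demonstrated with a deliberately tiny model - one memory word shadowed by one cache line. This is an invented illustration of the ordering rules, not the kernel's cache maintenance code:]

```c
#include <stdbool.h>

/* One memory word shadowed by one cache line: enough to show why
 * non-coherent DMA needs clean-before and invalidate-after. */
struct model {
	int mem;	/* what the device reads/writes */
	int cache;	/* what the CPU reads/writes */
	bool dirty;	/* line holds a CPU write not yet in memory */
	bool valid;	/* line holds a (possibly speculative) copy */
};

/* Device-to-memory DMA bypasses the cache entirely. */
static void dma_write(struct model *m, int val) { m->mem = val; }

/* Random writeback/eviction: the hardware may do this at any time. */
static void evict(struct model *m)
{
	if (m->valid && m->dirty)
		m->mem = m->cache;
	m->valid = false;
	m->dirty = false;
}

/* Step 1, before DMA: clean dirty lines so a later eviction cannot
 * overwrite freshly DMA'd data. */
static void clean(struct model *m)
{
	if (m->valid && m->dirty) {
		m->mem = m->cache;
		m->dirty = false;
	}
}

/* Step 2, after DMA: invalidate so the CPU refetches from memory
 * instead of returning a speculatively loaded stale copy. */
static void invalidate(struct model *m) { m->valid = false; }

/* CPU read: hit a valid line, otherwise fill from memory. */
static int cpu_read(struct model *m)
{
	if (!m->valid) {
		m->cache = m->mem;
		m->valid = true;
	}
	return m->cache;
}
```

[Skip the clean and a later eviction of the dirty line overwrites the DMA'd data (failure mode 1); skip the invalidate and the CPU keeps returning the speculatively loaded stale copy (failure mode 2).]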



Re: [RFC PATCH 0/3] iommu: Add range flush operation

2015-09-29 Thread Russell King - ARM Linux
On Tue, Sep 29, 2015 at 03:20:38PM +0100, Robin Murphy wrote:
> A single callback doesn't really generalise well enough: If we wanted to
> implement this in the ARM SMMU drivers to optimise the unmap() case [ask
> Will how long he spends waiting for a software model to tear down an entire
> VFIO domain invalidating one page at a time ;)], then we'd either regress
> performance in the map() case with an unnecessary TLB flush, or have to do a
> table walk in every flush() call to infer what actually needs doing.

And this is the problem of frameworks.  They get in the way of doing
things efficiently.

Fine, we have the DMA ops, and that calls a map_sg() method.  What we
then need is to have a series of standardised library functions which
can be called to perform various actions.

Consider this: an IOMMU driver gets the raw scatterlist which the
driver passed.  The IOMMU driver walks the scatterlist, creating the
IOMMU side mapping, and writing the device DMA addresses and DMA lengths
to the scatterlist, possibly coalescing some of the entries.  It
remembers the number of scatterlist entries that the DMA operation now
requires.  The IOMMU code can setup whatever mappings it wants using
whatever sizes it wants to satisfy the requested scatterlist.

It then goes on to call the arch backend with the original scatterlist,
asking it to _only_ deal with the CPU coherency for the mapping.  The
arch code walks the scatterlist again, this time dealing with the CPU
coherency part.

Finally, the IOMMU code returns the number of DMA scatterlist entries.

When it comes to tearing it down, it's a similar operation to the above,
except reversing those actions.

The only issue with this approach is that it opens up some of the cache
handling to the entire kernel, and that will be _too_ big a target for
idiotic driver writers to think they have permission to directly use
those interfaces.  To solve this, I'd love to be able to have the linker
link together certain objects in the kernel build, and then convert some
global symbols to be local symbols, thus denying access to functions that
driver authors have no business whatsoever touching.

> Personally I think it would be nicest to have two separate callbacks, e.g.
> .map_sync/.unmap_sync, but at the very least some kind of additional
> 'direction' kind of parameter would be necessary.

No, not more callbacks - that's the framework thinking, not the library
thinking.



Re: [PATCH v2] iommu/io-pgtable-arm: Don't use dma_to_phys()

2015-09-18 Thread Russell King - ARM Linux
On Fri, Sep 18, 2015 at 12:04:26PM +0100, Robin Murphy wrote:
> Specifically, the problem case for that is when phys_addr_t is 64-bit but
> dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary to avoid
> a truncation warning when we make the DMA API calls, but we actually need
> the opposite in the comparison here - comparing the different types directly
> allows integer promotion to kick in appropriately so we don't lose the top
> half of the larger address. Otherwise, you'd never spot the difference
> between, say, your original page at 0x88c000 and a bounce-buffered copy
> that happened to end up mapped to 0xc000.

Hmm.  Thinking about this, I think we ought to add to arch/arm/mm/Kconfig:

 config ARCH_PHYS_ADDR_T_64BIT
def_bool ARM_LPAE
 
 config ARCH_DMA_ADDR_T_64BIT
bool
+   select ARCH_PHYS_ADDR_T_64BIT

I seem to remember that you're quite right: dma_addr_t may be smaller
than phys_addr_t, but it must never be bigger.
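[The truncation hazard Robin describes is easy to reproduce in plain C. The typedefs below are assumptions for illustration - an LPAE-style 64-bit phys_addr_t against a 32-bit dma_addr_t - not the kernel's definitions:]

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* LPAE: 64-bit physical addresses */
typedef uint32_t dma_addr_t;	/* assumed 32-bit for illustration */

/* Comparing with a cast truncates the top half of the physical
 * address, so two distinct pages 4GB apart compare equal... */
static bool same_page_truncated(phys_addr_t phys, dma_addr_t dma)
{
	return (dma_addr_t)phys == dma;
}

/* ...whereas comparing the raw values lets integer promotion widen
 * dma_addr_t to 64 bits, and the difference is caught. */
static bool same_page_promoted(phys_addr_t phys, dma_addr_t dma)
{
	return phys == dma;
}
```

[With phys = 0x1_0000_c000 and dma = 0xc000, the cast form wrongly reports a match while the promoted comparison does not, which is the bounce-buffer confusion described above.]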



Re: [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use

2015-08-04 Thread Russell King - ARM Linux
On Tue, Aug 04, 2015 at 03:47:13PM +0100, Robin Murphy wrote:
> Hi Laurent,
> 
> [ +RMK, as his patch is indirectly involved here ]
> 
> On 04/08/15 14:16, Laurent Pinchart wrote:
> >This is what I believe to be an API abuse. The dma_sync_single_for_device()
> >API is meant to pass ownership of a buffer to the device. Unless I'm 
> >mistaken,
> >once that's done the CPU isn't allowed to touch the buffer anymore until
> >dma_sync_single_for_cpu() is called to get ownership of the buffer back.

That's what I thought up until recently, but it's not strictly true - see
Documentation/DMA-API.txt which Robin quoted.

> [3]:Yes, there may generally be exceptions to that, but not in the context
> of this code. Unless the Renesas IPMMU does something I don't know about?

If an IOMMU does write to its page tables, then the only way to handle
those is using DMA-coherent memory, either via a coherent mapping or
allocated via dma_alloc_coherent().  The streaming DMA API is wholly
unsuitable for any mapping where both the CPU and the DMA device want
to simultaneously alter the contained data.



Re: [PATCH] iommu/arm-smmu-v2: ThunderX(errata-23399) mis-extends 64bit registers

2015-07-31 Thread Russell King - ARM Linux
On Thu, Jul 30, 2015 at 08:54:04PM +, Chalamarla, Tirumalesh wrote:
> does something like this look good:
> 
> +#ifdef CONFIG_64BIT
> +#define smmu_writeq(reg64, addr)   writeq_relaxed((reg64), (addr))
> +#else
> +#define smmu_writeq(reg64, addr)   \
> +   writel_relaxed(((reg64) >> 32), ((addr) + 4));  \
> +   writel_relaxed((reg64), (addr))

It's missing a #endif.

This also suffers from multiple argument evaluation, and it hides that
there are two expressions here - which makes future maintenance harder.

#define smmu_writeq(reg64, addr)\
do {\
u64 __val = (reg64);\
volatile void __iomem *__addr = (addr); \
writel_relaxed(__val >> 32, __addr + 4);\
writel_relaxed(__val, __addr);  \
} while (0)

is longer but is much preferred as it won't suffer side effects from
stuff like:

if (...)
smmu_writeq(val++, addr);



Re: [PATCH v3 5/6] iommu/mediatek: Add mt8173 IOMMU driver

2015-07-27 Thread Russell King - ARM Linux
On Mon, Jul 27, 2015 at 02:23:26PM +0100, Robin Murphy wrote:
> On 16/07/15 10:04, Yong Wu wrote:
> >This patch adds support for mediatek m4u (MultiMedia Memory Management
> >Unit).
> >
> >Signed-off-by: Yong Wu 
> [...]
> >+static void mtk_iommu_flush_pgtable(void *ptr, size_t size, void *cookie)
> >+{
> >+   struct mtk_iommu_domain *domain = cookie;
> >+   unsigned long offset = (unsigned long)ptr & ~PAGE_MASK;
> >+
> >+   dma_map_page(domain->data->dev, virt_to_page(ptr), offset,
> >+size, DMA_TO_DEVICE);
> 
> Nit: this looks like it may as well be dma_map_single.
> 
> It would probably be worth following it with a matching unmap too, just to
> avoid any possible leakage bugs (especially if this M4U ever appears in a
> SoC supporting RAM above the 32-bit boundary).

Why not do the job properly?  Take a look at how I implemented the
streaming DMA API on Tegra SMMU (patch set recently sent out earlier
today).

There's no need for hacks like dma_map_page() (and discarding its
return value) or dma_map_page() followed by dma_unmap_page().



Re: [PATCH 00/18] Clean up exposure of arch-internal code

2015-07-27 Thread Russell King - ARM Linux
On Mon, Jul 27, 2015 at 04:09:06PM +0200, Thierry Reding wrote:
> On Mon, Jul 27, 2015 at 01:28:24PM +0100, Russell King - ARM Linux wrote:
> > This series of patches attempts to clean up the use of architecture
> > internal functions in drivers, and removes some functions from view
> > in the asm/ headers.  I'm also considering whether we need to add
> > some linker magic to hide symbols when building the built-in.o files.
> > 
> > This was triggered by 3rd party drivers "going under the covers" of
> > the DMA API and calling the dmac_*() functions directly, a practice
> > which I have always refused to allow.  This also breaks module
> > building (which is the big hint that they're doing something wrong.)
> > 
> > However, it also came to light that various drivers are using
> > __cpuc_* functions directly, which are non-portable, and is another
> > instance of "going under the covers" and tinkering with what should
> > be kept as architecture implementation details.
> > 
> > This series addresses some of these points.  It:
> > 
> > (a) moves dmac_map_area() and dmac_unmap_area() prototypes out of
> > asm/cacheflush.h and into arch/arm/mm.
> > (b) provide a secure_flush() call for the Qualcomm secure monitor
> > code.
> > (c) stop tegra smmu driver(s) from using __cpuc_flush_dcache_area,
> > something which necessitates additional complexity to deal with
> > the ARM vs ARM64 differences.  It should be using the DMA API.
> > However, the Tegra SMMU driver is in really bad shape, and this
> > isn't a simple conversion - it requires a lot of additional
> > fixes to bring the code up to scratch.
> 
> Out of curiosity, did you have any hardware to test this on?

Nope - it only got build-tested, which is partly why there are so many
small incremental patches.

> I've given
> it a quick spin and haven't seen anything out of the ordinary. I have a
> couple of nitpicks towards the end of the series, but other than that,
> very nice work.

Good news.

> What's the plan for merging this? Should it all go in through Joerg's
> tree so that potentially other driver conversions can go in on top of
> this, or would you prefer to take it through the ARM tree?

I don't mind the series being split - there are no dependencies between
the individual drivers, so all the Tegra SMMU stuff can go through
Joerg's tree if that'll be easiest.

> > It leaves the Rockchip IOMMU driver for the time being, but that is also
> > something which needs cleaning up in the same way as the Tegra SMMU
> > driver.
> 
> From a cursory look, MSM, OMAP and Exynos are in the same boat as well.

Yes.  There is a question here though:

Is it a good idea to use normal cacheable memory for the IOMMU page table
entries, or would DMA coherent memory be better?

If the latter, then we don't need to do any cache maintenance of the
tables, and all the hacks can go.  If the former, using the streaming
DMA API in all IOMMU drivers would be a good idea, like Tegra SMMU now
does as a result of these patches.



[PATCH 00/18] Clean up exposure of arch-internal code

2015-07-27 Thread Russell King - ARM Linux
This series of patches attempts to clean up the use of architecture
internal functions in drivers, and removes some functions from view
in the asm/ headers.  I'm also considering whether we need to add
some linker magic to hide symbols when building the built-in.o files.

This was triggered by 3rd party drivers "going under the covers" of
the DMA API and calling the dmac_*() functions directly, a practice
which I have always refused to allow.  This also breaks module
building (which is the big hint that they're doing something wrong.)

However, it also came to light that various drivers are using
__cpuc_* functions directly, which are non-portable, and is another
instance of "going under the covers" and tinkering with what should
be kept as architecture implementation details.

This series addresses some of these points.  It:

(a) moves dmac_map_area() and dmac_unmap_area() prototypes out of
asm/cacheflush.h and into arch/arm/mm.
(b) provide a secure_flush() call for the Qualcomm secure monitor
code.
(c) stop tegra smmu driver(s) from using __cpuc_flush_dcache_area,
something which necessitates additional complexity to deal with
the ARM vs ARM64 differences.  It should be using the DMA API.
However, the Tegra SMMU driver is in really bad shape, and this
isn't a simple conversion - it requires a lot of additional
fixes to bring the code up to scratch.

It leaves the Rockchip IOMMU driver for the time being, but that is also
something which needs cleaning up in the same way as the Tegra SMMU
driver.

 arch/arm/include/asm/cacheflush.h |  21 ++-
 arch/arm/include/asm/glue-cache.h |   2 -
 arch/arm/mm/dma-mapping.c |   1 +
 arch/arm/mm/dma.h |  32 
 drivers/firmware/qcom_scm-32.c|   4 +-
 drivers/iommu/tegra-smmu.c| 297 --
 drivers/memory/tegra/tegra114.c   |  17 ---
 drivers/memory/tegra/tegra124.c   |  30 
 drivers/memory/tegra/tegra30.c|  17 ---
 include/soc/tegra/mc.h|   7 -
 10 files changed, 242 insertions(+), 186 deletions(-)



Re: [RFC PATCH v2 2/3] arm64: add IOMMU dma_ops

2015-03-09 Thread Russell King - ARM Linux
On Thu, Mar 05, 2015 at 11:16:28AM +, Robin Murphy wrote:
> Hi Laura,
> 
> On 05/03/15 00:19, Laura Abbott wrote:
> [...]
> >>Consider that the IOMMU's page table walker is a DMA master in its own
> >right, and that is the device you're mapping the page tables for.
> >Therefore your IOMMU driver needs to have access to the struct device
> >of the IOMMU itself to use for the page-table-related mappings. Also,
> >be sure to set the IOMMU's DMA mask correctly to prevent SWIOTLB bounce
> >buffers being created in the process (which as I've found generally ends
> >in disaster).
> >>
> >>>  And normally, we always need do cache maintenance only for some
> >>>bytes in the pagetable but not whole a page. Then is there a easy way to
> >>>do the cache maintenance?
> >>
> >>For a noncoherent device, dma_map_single() will end up calling
> >__dma_map_area() with the page offset and size of the original request, so
> >the updated part gets flushed by VA, and the rest of the page isn't touched
> >if it doesn't need to be. On the other hand if the page tables were
> >allocated with dma_alloc_coherent() in the first place, then just calling
> >dma_sync_single_for_device() for the updated region should suffice.

That's wrong.  dma_sync_single_*() is not permitted to be called on
coherently allocated memory.  Where coherent memory needs to be remapped,
dma_sync_single_*() will panic the kernel.

If it's in coherent memory, all you should need is the appropriate
memory barrier to ensure that the DMA agent can see the writes.

> >Where exactly would you call the dma_unmap? It seems a bit strange to
> >be repeatedly calling dma_map and never calling dma_unmap. I don't see it
> >explicitly forbidden in the docs anywhere to do this but it seems like
> >it would be violating the implicit handoff of dma_map/dma_unmap.
> 
> I think ideally you'd call dma_map_page when you first create the page
> table, dma_sync_single_for_device on any update, and dma_unmap_page when you
> tear it down, and you'd also use the appropriate DMA addresses everywhere
> instead of physical addresses.

No.

dma_map_page()  ownership changes CPU->DMA
dma_sync_single_for_cpu()   ownership changes DMA->CPU
dma_sync_single_for_device()ownership changes CPU->DMA
dma_unmap_page()ownership changes DMA->CPU

It's invalid to miss out the pairing that give those ownership changes.



Re: [PATCHv7 07/26] driver core: amba: add device binding path 'driver_override'

2014-09-26 Thread Russell King - ARM Linux
On Tue, Sep 23, 2014 at 04:46:06PM +0200, Antonios Motakis wrote:
> As already demonstrated with PCI [1] and the platform bus [2], a
> driver_override property in sysfs can be used to bypass the id matching
> of a device to a AMBA driver. This can be used by VFIO to bind to any AMBA
> device requested by the user.
> 
> [1] 
> http://lists-archives.com/linux-kernel/28030441-pci-introduce-new-device-binding-path-using-pci_dev-driver_override.html
> [2] https://www.redhat.com/archives/libvir-list/2014-April/msg00382.html
> 
> Signed-off-by: Antonios Motakis 

I have to ask why this is even needed in the first place.  To take the
example in [2], what's wrong with:

echo fff51000.ethernet > /sys/bus/platform/devices/fff51000.ethernet/driver/unbind
echo fff51000.ethernet > /sys/bus/platform/drivers/vfio-platform/bind

and similar for AMBA.

All we would need to do is to introduce a way of having a driver accept
explicit bind requests.

In any case:

> +static ssize_t driver_override_store(struct device *_dev,
> +  struct device_attribute *attr,
> +  const char *buf, size_t count)
> +{
> + struct amba_device *dev = to_amba_device(_dev);
> + char *driver_override, *old = dev->driver_override, *cp;
> +
> + if (count > PATH_MAX)
> + return -EINVAL;
> +
> + driver_override = kstrndup(buf, count, GFP_KERNEL);
> + if (!driver_override)
> + return -ENOMEM;
> +
> + cp = strchr(driver_override, '\n');
> + if (cp)
> + *cp = '\0';

I hope that is not replicated everywhere.  This allows up to a page to be
allocated, even when the first byte may be a newline.  This is wasteful.

How about:

if (count > PATH_MAX)
return -EINVAL;

cp = strnchr(buf, count, '\n');
if (cp)
count = cp - buf;

if (count) {
driver_override = kstrndup(buf, count, GFP_KERNEL);
if (!driver_override)
return -ENOMEM;
} else {
driver_override = NULL;
}

kfree(dev->driver_override);
dev->driver_override = driver_override;

Also:

> +static ssize_t driver_override_show(struct device *_dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct amba_device *dev = to_amba_device(_dev);
> +
> + return sprintf(buf, "%s\n", dev->driver_override);
> +}

Do we really want to do a NULL pointer dereference here?



Re: [PATCH] iommu/arm-smmu: avoid calling request_irq in atomic context

2014-07-26 Thread Russell King - ARM Linux
On Fri, Jul 25, 2014 at 04:26:59PM -0700, Mitchel Humpherys wrote:
> - irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> - ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED,
> -   "arm-smmu-context-fault", domain);
> - if (IS_ERR_VALUE(ret)) {
> - dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
> - cfg->irptndx, irq);
> - cfg->irptndx = INVALID_IRPTNDX;
> - goto out_free_context;
> - }
> -
>   smmu_domain->smmu = smmu;
>   arm_smmu_init_context_bank(smmu_domain);
>   return 0;
> -
> -out_free_context:
> - __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> - return ret;

This returns 'ret' from request_irq.

> + ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED,
> +   "arm-smmu-context-fault", domain);
> + if (IS_ERR_VALUE(ret)) {
> + dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
> + cfg->irptndx, irq);
> + cfg->irptndx = INVALID_IRPTNDX;
> + return -ENODEV;

This returns -ENODEV instead.

> + }
> +
>   /* Looks ok, so add the device to the domain */
> - cfg = find_smmu_master_cfg(smmu_domain->smmu, dev);
> - if (!cfg)
> + master_cfg = find_smmu_master_cfg(smmu_domain->smmu, dev);
> + if (!master_cfg)
>   return -ENODEV;

If this fails, we exit leaving the interrupt registered.  This is a
bug introduced by this change.



Re: [PATCH v3] iommu: shmobile: Enable the driver on all ARM platforms

2013-10-30 Thread Russell King - ARM Linux
On Wed, Oct 30, 2013 at 12:20:43PM +0100, Laurent Pinchart wrote:
> Renesas ARM platforms are transitioning from single-platform to
> multi-platform kernels using the new ARCH_SHMOBILE_MULTI. Make the
> driver available on all ARM platforms to enable it on both ARCH_SHMOBILE
> and ARCH_SHMOBILE_MULTI and increase build testing coverage.
> 
> Don't enable COMPILE_TEST support as the driver doesn't compile on
> non-ARM platforms due to usage of the ARM DMA IOMMU API.

For similar reasons as x86, can we please think about using:

depends on ARM
depends on ARCH_SHMOBILE || ARCH_SHMOBILE_MULTI || COMPILE_TEST

This way we don't end up polluting the configuration for non-shmobile
platforms.  The same goes for other ARM stuff... the number of options
is getting rather large and we need to think about keeping that in
check where it's easily possible to do so.


Re: [PATCHv7 9/9] ARM: dma-mapping: add support for IOMMU mapper

2012-03-06 Thread Russell King - ARM Linux
On Wed, Feb 29, 2012 at 04:04:22PM +0100, Marek Szyprowski wrote:
> +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
> + void *cpu_addr, dma_addr_t dma_addr, size_t size,
> + struct dma_attrs *attrs)
> +{
> + struct arm_vmregion *c;
> +
> + vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
> + c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);

What protects this against other insertions/removals from the list?

> +
> + if (c) {
> + struct page **pages = c->priv;
> +
> + unsigned long uaddr = vma->vm_start;
> + unsigned long usize = vma->vm_end - vma->vm_start;
> + int i = 0;
> +
> + do {
> + int ret;
> +
> + ret = vm_insert_page(vma, uaddr, pages[i++]);
> + if (ret) {
> + pr_err("Remapping memory, error: %d\n", ret);
> + return ret;
> + }
> +
> + uaddr += PAGE_SIZE;
> + usize -= PAGE_SIZE;
> + } while (usize > 0);
> + }
> + return 0;
> +}
> +
> +/*
> + * free a page as defined by the above mapping.
> + * Must not be called with IRQs disabled.
> + */
> +void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
> +   dma_addr_t handle, struct dma_attrs *attrs)
> +{
> + struct arm_vmregion *c;
> + size = PAGE_ALIGN(size);
> +
> + c = arm_vmregion_find(&consistent_head, (unsigned long)cpu_addr);

What protects this against other insertions/removals from the list?

> + if (c) {
> + struct page **pages = c->priv;
> + __dma_free_remap(cpu_addr, size);
> + __iommu_remove_mapping(dev, handle, size);
> + __iommu_free_buffer(dev, pages, size);
> + }
> +}
> +
> +/*
> + * Map a part of the scatter-gather list into contiguous io address space
> + */
> +static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> +   size_t size, dma_addr_t *handle,
> +   enum dma_data_direction dir)
> +{
> + struct dma_iommu_mapping *mapping = dev->archdata.mapping;
> + dma_addr_t iova, iova_base;
> + int ret = 0;
> + unsigned int count;
> + struct scatterlist *s;
> +
> + size = PAGE_ALIGN(size);
> + *handle = ARM_DMA_ERROR;
> +
> + iova_base = iova = __alloc_iova(mapping, size);
> + if (iova == ARM_DMA_ERROR)
> + return -ENOMEM;
> +
> + for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s))
> + {
> + phys_addr_t phys = page_to_phys(sg_page(s));
> + unsigned int len = PAGE_ALIGN(s->offset + s->length);
> +
> + if (!arch_is_coherent())
> + __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
> +
> + ret = iommu_map(mapping->domain, iova, phys, len, 0);

Dealing with phys addresses in one part and page + offset + length
in another doesn't look like a good idea.  Why can't there
be some consistency?



Re: [PATCH 2/2] iommu/omap: fix NULL pointer dereference

2012-02-22 Thread Russell King - ARM Linux
On Wed, Feb 22, 2012 at 11:10:38AM +0200, Ohad Ben-Cohen wrote:
> On Wed, Feb 22, 2012 at 10:56 AM, Russell King - ARM Linux
>  wrote:
> > There's something else which needs fixing here - the return code.  If
> > omap_find_iovm_area() returns an error code that needs propagating.
> > Otherwise it might as well just return NULL for all errors.
> 
> Seems like it does just return NULL for all errors, so a cleaner fix
> can probably just be s/IS_ERR(area)/!area/.
> 
> I'll submit it, but Joerg, if you feel this is becoming a "cleanup",
> feel free to take the first version.

That sounds more like a bug fix than a cleanup to me.


Re: [PATCH 2/2] iommu/omap: fix NULL pointer dereference

2012-02-22 Thread Russell King - ARM Linux
On Wed, Feb 22, 2012 at 10:52:52AM +0200, Ohad Ben-Cohen wrote:
> @@ -274,7 +274,7 @@ static ssize_t debug_read_mem(struct file *file, char __user *userbuf,
>   mutex_lock(&iommu_debug_lock);
>  
>   area = omap_find_iovm_area(dev, (u32)ppos);
> - if (IS_ERR(area)) {
> + if (IS_ERR_OR_NULL(area)) {
>   bytes = -EINVAL;

There's something else which needs fixing here - the return code.  If
omap_find_iovm_area() returns an error code that needs propagating.
Otherwise it might as well just return NULL for all errors.


Re: [PATCH 8/8 RESEND] ARM: dma-mapping: add support for IOMMU mapper

2012-01-25 Thread Russell King - ARM Linux
On Mon, Jan 09, 2012 at 04:49:21PM +0100, Marek Szyprowski wrote:
> This patch add a complete implementation of DMA-mapping API for
> devices that have IOMMU support. All DMA-mapping calls are supported.
> 
> This patch contains some of the code kindly provided by Krishna Reddy
>  and Andrzej Pietrasiewicz 
> 
> Signed-off-by: Marek Szyprowski 
> Signed-off-by: Kyungmin Park 
> 
> ---
> 
> Hello,
> 
> This is the corrected version of the previous patch from the "[PATCH 0/8 v4] ARM: DMA-mapping framework redesign" thread which can be found here:
> http://www.spinics.net/lists/linux-mm/msg27382.html
> 
> Previous version had very nasty bug which causes memory trashing if
> DMA-mapping managed to allocate pages larger than 4KiB. The problem was
> in the __iommu_alloc_buffer() function, which did not check how many
> pages have been left to allocate.

This patch seems to be incomplete.

If the standard DMA API is used (the one which exists in current kernels)
and NEED_SG_DMA_LENGTH is enabled, then where do we set the DMA length
in the scatterlist?

> diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
> new file mode 100644
> index 000..6668b41
> --- /dev/null
> +++ b/arch/arm/include/asm/dma-iommu.h
> @@ -0,0 +1,36 @@
> +#ifndef ASMARM_DMA_IOMMU_H
> +#define ASMARM_DMA_IOMMU_H
> +
> +#ifdef __KERNEL__
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 

I can't see anything in here which needs asm/memory.h - if files which
include this need it, please include it in there so we can see why it's
needed.

> +
> +struct dma_iommu_mapping {
> + /* iommu specific data */
> + struct iommu_domain *domain;
> +
> + void*bitmap;
> + size_t  bits;
> + unsigned intorder;
> + dma_addr_t  base;
> +
> + spinlock_t  lock;
> + struct kref kref;
> +};
> +
> +struct dma_iommu_mapping *
> +arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, size_t size,
> +  int order);
> +
> +void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping);
> +
> +int arm_iommu_attach_device(struct device *dev,
> + struct dma_iommu_mapping *mapping);
> +
> +#endif /* __KERNEL__ */
> +#endif
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 4845c09..2287b01 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -27,6 +27,9 @@
>  #include 
>  #include 
>  
> +#include 

linux/ includes should be grouped together.

> diff --git a/arch/arm/mm/vmregion.h b/arch/arm/mm/vmregion.h
> index 15e9f04..6bbc402 100644
> --- a/arch/arm/mm/vmregion.h
> +++ b/arch/arm/mm/vmregion.h
> @@ -17,7 +17,7 @@ struct arm_vmregion {
>   struct list_headvm_list;
>   unsigned long   vm_start;
>   unsigned long   vm_end;
> - struct page *vm_pages;
> + void*priv;

I want to think about that - I may wish to export the vm_pages via
the new dma-mappings file to provide additional information.