Re: [PATCH v2 4/7] irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()

2019-05-01 Thread Julien Grall

On 30/04/2019 13:34, Auger Eric wrote:
> Hi Julien,

Hi Eric,

Thank you for the review!

> On 4/29/19 4:44 PM, Julien Grall wrote:
>> its_irq_compose_msi_msg() may be called from non-preemptible context.
>> However, on RT, iommu_dma_map_msi_msg() must be called from a
>> preemptible context.
>>
>> A recent change split iommu_dma_map_msi_msg() in two new functions:
>> one that should be called in preemptible context, the other does
>> not have any requirement.
>>
>> The GICv3 ITS driver is reworked to avoid executing preemptible code in
>> non-preemptible context. This can be achieved by preparing the MSI
>> maping when allocating the MSI interrupt.
>
> mapping
>
>> Signed-off-by: Julien Grall 
>>
>> ---
>>  Changes in v2:
>>  - Rework the commit message to use imperative mood
>> ---
>>  drivers/irqchip/irq-gic-v3-its.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 7577755bdcf4..12ddbcfe1b1e 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
>>  	msg->address_hi	= upper_32_bits(addr);
>>  	msg->data	= its_get_event_id(d);
>>
>> -	iommu_dma_map_msi_msg(d->irq, msg);
>> +	iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
>>  }
>>
>>  static int its_irq_set_irqchip_state(struct irq_data *d,
>>
>> @@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>>  {
>>  	msi_alloc_info_t *info = args;
>>  	struct its_device *its_dev = info->scratchpad[0].ptr;
>> +	struct its_node *its = its_dev->its;
>>  	irq_hw_number_t hwirq;
>>  	int err;
>>  	int i;
>>
>> @@ -2574,6 +2575,8 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>>  	if (err)
>>  		return err;
>>
>> +	err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
>
> Test err as in gicv2m driver?

Hmmm yes. Marc, do you want me to respin the patch?

Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 0/3] PCIe Host request to reserve IOVA

2019-05-01 Thread Lorenzo Pieralisi
On Fri, Apr 12, 2019 at 08:43:32AM +0530, Srinath Mannam wrote:
> A few SoCs have the limitation that their PCIe host cannot accept certain
> inbound address ranges. The allowed inbound address ranges are listed in
> the dma-ranges DT property, and these ranges are required to do the IOVA
> mapping. The remaining address ranges have to be reserved in the IOVA
> mapping.
> 
> The PCIe host driver of those SoCs has to list resource entries for the
> allowed address ranges given in the dma-ranges DT property in sorted
> order. This sorted list of resources will be processed to reserve IOVA
> addresses for the inaccessible address holes while initializing the
> IOMMU domain.
> 
> This patch set is based on Linux-5.0-rc2.
> 
> Changes from v3:
>   - Addressed Robin Murphy review comments.
> - pcie-iproc: parse dma-ranges and make sorted resource list.
> - dma-iommu: process list and reserve gaps between entries
> 
> Changes from v2:
>   - Patch set rebased to Linux-5.0-rc2
> 
> Changes from v1:
>   - Addressed Oza review comments.
> 
> Srinath Mannam (3):
>   PCI: Add dma_ranges window list
>   iommu/dma: Reserve IOVA for PCIe inaccessible DMA address
>   PCI: iproc: Add sorted dma ranges resource entries to host bridge
> 
>  drivers/iommu/dma-iommu.c   | 19 
>  drivers/pci/controller/pcie-iproc.c | 44 
> -
>  drivers/pci/probe.c |  3 +++
>  include/linux/pci.h |  1 +
>  4 files changed, 66 insertions(+), 1 deletion(-)

Bjorn, Joerg,

this series should not affect anything in the mainline other than its
consumer (ie patch 3); if that's the case should we consider it for v5.2
and if yes how are we going to merge it ?

Thanks,
Lorenzo


Re: [PATCH v7 05/23] iommu: Introduce cache_invalidate API

2019-05-01 Thread Jean-Philippe Brucker
On 08/04/2019 13:18, Eric Auger wrote:
> +int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
> +struct iommu_cache_invalidate_info *inv_info)
> +{
> + int ret = 0;
> +
> + if (unlikely(!domain->ops->cache_invalidate))
> + return -ENODEV;
> +
> + ret = domain->ops->cache_invalidate(domain, dev, inv_info);
> +
> + return ret;

Nit: you don't really need ret

The UAPI looks good to me, so

Reviewed-by: Jean-Philippe Brucker 


Re: [PATCH v4 3/3] PCI: iproc: Add sorted dma ranges resource entries to host bridge

2019-05-01 Thread Lorenzo Pieralisi
On Fri, Apr 12, 2019 at 08:43:35AM +0530, Srinath Mannam wrote:
> IPROC host has the limitation that it can use only those address ranges
> given by dma-ranges property as inbound address. So that the memory
> address holes in dma-ranges should be reserved to allocate as DMA address.
> 
> Inbound address of host accessed by PCIe devices will not be translated
> before it comes to IOMMU or directly to PE.

What does that mean "directly to PE" ?

IIUC all you want to say is that there is no entity translating
PCI memory transaction addresses before they hit the PCI host
controller's inbound region address decoder.

> But this host has the limitation that accesses to a few address ranges
> are ignored, so the IOVA ranges covering those addresses have to be
> reserved.
> 
> All allowed address ranges are listed in the dma-ranges DT parameter.
> These address ranges are converted to resource entries, listed in
> sorted order, and added to the dma_ranges list of the PCI host bridge
> structure.
> 
> Ex:
> dma-ranges = < \
>   0x4300 0x00 0x8000 0x00 0x8000 0x00 0x8000 \
>   0x4300 0x08 0x 0x08 0x 0x08 0x \
>   0x4300 0x80 0x 0x80 0x 0x40 0x>
> 
> In the above example of dma-ranges, memory address from
> 0x0 - 0x8000,
> 0x1 - 0x8,
> 0x10 - 0x80 and
> 0x100 - 0x.
> are not allowed to use as inbound addresses.
> 
> Signed-off-by: Srinath Mannam 
> Based-on-patch-by: Oza Pawandeep 
> Reviewed-by: Oza Pawandeep 
> ---
>  drivers/pci/controller/pcie-iproc.c | 44 
> -
>  1 file changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pcie-iproc.c 
> b/drivers/pci/controller/pcie-iproc.c
> index c20fd6b..94ba5c0 100644
> --- a/drivers/pci/controller/pcie-iproc.c
> +++ b/drivers/pci/controller/pcie-iproc.c
> @@ -1146,11 +1146,43 @@ static int iproc_pcie_setup_ib(struct iproc_pcie 
> *pcie,
>   return ret;
>  }
>  
> +static int
> +iproc_pcie_add_dma_range(struct device *dev, struct list_head *resources,
> +  struct of_pci_range *range)
> +{
> + struct resource *res;
> + struct resource_entry *entry, *tmp;
> + struct list_head *head = resources;
> +
> + res = devm_kzalloc(dev, sizeof(struct resource), GFP_KERNEL);
> + if (!res)
> + return -ENOMEM;
> +
> + resource_list_for_each_entry(tmp, resources) {
> + if (tmp->res->start < range->cpu_addr)
> + head = >node;
> + }
> +
> + res->start = range->cpu_addr;
> + res->end = res->start + range->size - 1;
> +
> + entry = resource_list_create_entry(res, 0);
> + if (!entry)
> + return -ENOMEM;
> +
> + entry->offset = res->start - range->cpu_addr;
> + resource_list_add(entry, head);
> +
> + return 0;
> +}
> +
>  static int iproc_pcie_map_dma_ranges(struct iproc_pcie *pcie)
>  {
> + struct pci_host_bridge *host = pci_host_bridge_from_priv(pcie);
>   struct of_pci_range range;
>   struct of_pci_range_parser parser;
>   int ret;
> + LIST_HEAD(resources);
>  
>   /* Get the dma-ranges from DT */
>   ret = of_pci_dma_range_parser_init(, pcie->dev->of_node);
> @@ -1158,13 +1190,23 @@ static int iproc_pcie_map_dma_ranges(struct 
> iproc_pcie *pcie)
>   return ret;
>  
>   for_each_of_pci_range(, ) {
> + ret = iproc_pcie_add_dma_range(pcie->dev,
> +,
> +);
> + if (ret)
> + goto out;
>   /* Each range entry corresponds to an inbound mapping region */
>   ret = iproc_pcie_setup_ib(pcie, , IPROC_PCIE_IB_MAP_MEM);
>   if (ret)
> - return ret;
> + goto out;
>   }
>  
> + list_splice_init(, >dma_ranges);
> +
>   return 0;
> +out:
> + pci_free_resource_list();
> + return ret;
>  }
>  
>  static int iproce_pcie_get_msi(struct iproc_pcie *pcie,
> -- 
> 2.7.4
> 


Re: [PATCH v4 0/3] PCIe Host request to reserve IOVA

2019-05-01 Thread Srinath Mannam via iommu
Hi Lorenzo,

Thanks a lot. Please see my reply below.

On Wed, May 1, 2019 at 7:24 PM Lorenzo Pieralisi
 wrote:
>
> On Wed, May 01, 2019 at 02:20:56PM +0100, Robin Murphy wrote:
> > On 2019-05-01 1:55 pm, Bjorn Helgaas wrote:
> > > On Wed, May 01, 2019 at 12:30:38PM +0100, Lorenzo Pieralisi wrote:
> > > > On Fri, Apr 12, 2019 at 08:43:32AM +0530, Srinath Mannam wrote:
> > > > > Few SOCs have limitation that their PCIe host can't allow few inbound
> > > > > address ranges. Allowed inbound address ranges are listed in 
> > > > > dma-ranges
> > > > > DT property and this address ranges are required to do IOVA mapping.
> > > > > Remaining address ranges have to be reserved in IOVA mapping.
> > > > >
> > > > > PCIe Host driver of those SOCs has to list resource entries of allowed
> > > > > address ranges given in dma-ranges DT property in sorted order. This
> > > > > sorted list of resources will be processed and reserve IOVA address 
> > > > > for
> > > > > inaccessible address holes while initializing IOMMU domain.
> > > > >
> > > > > This patch set is based on Linux-5.0-rc2.
> > > > >
> > > > > Changes from v3:
> > > > >- Addressed Robin Murphy review comments.
> > > > >  - pcie-iproc: parse dma-ranges and make sorted resource list.
> > > > >  - dma-iommu: process list and reserve gaps between entries
> > > > >
> > > > > Changes from v2:
> > > > >- Patch set rebased to Linux-5.0-rc2
> > > > >
> > > > > Changes from v1:
> > > > >- Addressed Oza review comments.
> > > > >
> > > > > Srinath Mannam (3):
> > > > >PCI: Add dma_ranges window list
> > > > >iommu/dma: Reserve IOVA for PCIe inaccessible DMA address
> > > > >PCI: iproc: Add sorted dma ranges resource entries to host bridge
> > > > >
> > > > >   drivers/iommu/dma-iommu.c   | 19 
> > > > >   drivers/pci/controller/pcie-iproc.c | 44 
> > > > > -
> > > > >   drivers/pci/probe.c |  3 +++
> > > > >   include/linux/pci.h |  1 +
> > > > >   4 files changed, 66 insertions(+), 1 deletion(-)
> > > >
> > > > Bjorn, Joerg,
> > > >
> > > > this series should not affect anything in the mainline other than its
> > > > consumer (ie patch 3); if that's the case should we consider it for v5.2
> > > > and if yes how are we going to merge it ?
> > >
> > > I acked the first one
> > >
> > > Robin reviewed the second
> > > (https://lore.kernel.org/lkml/e6c812d6-0cad-4cfd-defd-d7ec427a6...@arm.com)
> > > (though I do agree with his comment about DMA_BIT_MASK()), Joerg was OK
> > > with it if Robin was
> > > (https://lore.kernel.org/lkml/20190423145721.gh29...@8bytes.org).
> > >
> > > Eric reviewed the third (and pointed out a typo).
> > >
> > > My Kconfiggery never got fully answered -- it looks to me as though it's
> > > possible to build pcie-iproc without the DMA hole support, and I thought
> > > the whole point of this series was to deal with those holes
> > > (https://lore.kernel.org/lkml/20190418234241.gf126...@google.com).  I 
> > > would
> > > have expected something like making pcie-iproc depend on IOMMU_SUPPORT.
> > > But Srinath didn't respond to that, so maybe it's not an issue and it
> > > should only affect pcie-iproc anyway.
> >
> > Hmm, I'm sure I had at least half-written a reply on that point, but I
> > can't seem to find it now... anyway, the gist is that these inbound
> > windows are generally set up to cover the physical address ranges of DRAM
> > and anything else that devices might need to DMA to. Thus if you're not
> > using an IOMMU, the fact that devices can't access the gaps in between
> > doesn't matter because there won't be anything there anyway; it only
> > needs mitigating if you do use an IOMMU and start giving arbitrary
> > non-physical addresses to the endpoint.
>
> So basically there is no strict IOMMU_SUPPORT dependency.
Yes, without IOMMU_SUPPORT, all inbound addresses will fall inside dma-ranges.
The issue only arises in the IOMMU-enabled case; this patch addresses it by
reserving the non-allowed addresses (the holes in dma-ranges).
>
> > > So bottom line, I'm fine with merging it for v5.2.  Do you want to merge
> > > it, Lorenzo, or ...?
> >
> > This doesn't look like it will conflict with the other DMA ops and MSI
> > mapping changes currently in-flight for iommu-dma, so I have no
> > objection to it going through the PCI tree for 5.2.
>
> I will update the DMA_BIT_MASK() according to your review and fix the
> typo Eric pointed out and push out a branch - we shall see if we can
> include it for v5.2.
I will send new patches with the DMA_BIT_MASK() change and the typo fix,
along with addressing Bjorn's comment on PATCH-1.

Regards,
Srinath.
>
> Thanks,
> Lorenzo


[PATCH 5/7 v2] MIPS: use the generic uncached segment support in dma-direct

2019-05-01 Thread Christoph Hellwig
Stop providing our arch alloc/free hooks and just expose the segment
offset instead.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig  |  1 +
 arch/mips/include/asm/page.h   |  3 ---
 arch/mips/jazz/jazzdma.c   |  6 --
 arch/mips/mm/dma-noncoherent.c | 26 +-
 4 files changed, 10 insertions(+), 26 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 4a5f5b0ee9a9..cde4b490f3c7 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -9,6 +9,7 @@ config MIPS
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UBSAN_SANITIZE_ALL
+   select ARCH_HAS_UNCACHED_SEGMENT
select ARCH_SUPPORTS_UPROBES
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF if 64BIT
diff --git a/arch/mips/include/asm/page.h b/arch/mips/include/asm/page.h
index 6b31c93b5eaa..23e0f1386e04 100644
--- a/arch/mips/include/asm/page.h
+++ b/arch/mips/include/asm/page.h
@@ -258,9 +258,6 @@ extern int __virt_addr_valid(const volatile void *kaddr);
 ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 
-#define UNCAC_ADDR(addr)   (UNCAC_BASE + __pa(addr))
-#define CAC_ADDR(addr) ((unsigned long)__va((addr) - UNCAC_BASE))
-
 #include 
 #include 
 
diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c
index bedb5047aff3..1804dc9d8136 100644
--- a/arch/mips/jazz/jazzdma.c
+++ b/arch/mips/jazz/jazzdma.c
@@ -575,10 +575,6 @@ static void *jazz_dma_alloc(struct device *dev, size_t size,
return NULL;
}
 
-   if (!(attrs & DMA_ATTR_NON_CONSISTENT)) {
-   dma_cache_wback_inv((unsigned long)ret, size);
-   ret = (void *)UNCAC_ADDR(ret);
-   }
return ret;
 }
 
@@ -586,8 +582,6 @@ static void jazz_dma_free(struct device *dev, size_t size, void *vaddr,
dma_addr_t dma_handle, unsigned long attrs)
 {
vdma_free(dma_handle);
-   if (!(attrs & DMA_ATTR_NON_CONSISTENT))
-   vaddr = (void *)CAC_ADDR((unsigned long)vaddr);
dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
 }
 
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index f9549d2fbea3..ed56c6fa7be2 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -44,33 +44,25 @@ static inline bool cpu_needs_post_dma_flush(struct device *dev)
}
 }
 
-void *arch_dma_alloc(struct device *dev, size_t size,
-   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+void arch_dma_prep_coherent(struct page *page, size_t size)
 {
-   void *ret;
-
-   ret = dma_direct_alloc_pages(dev, size, dma_handle, gfp, attrs);
-   if (ret && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
-   dma_cache_wback_inv((unsigned long) ret, size);
-   ret = (void *)UNCAC_ADDR(ret);
-   }
+   dma_cache_wback_inv((unsigned long)page_address(page), size);
+}
 
-   return ret;
+void *uncached_kernel_address(void *addr)
+{
+   return (void *)(__pa(addr) + UNCAC_BASE);
 }
 
-void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
-   dma_addr_t dma_addr, unsigned long attrs)
+void *cached_kernel_address(void *addr)
 {
-   if (!(attrs & DMA_ATTR_NON_CONSISTENT))
-   cpu_addr = (void *)CAC_ADDR((unsigned long)cpu_addr);
-   dma_direct_free_pages(dev, size, cpu_addr, dma_addr, attrs);
+   return __va(addr) - UNCAC_BASE;
 }
 
 long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
dma_addr_t dma_addr)
 {
-   unsigned long addr = CAC_ADDR((unsigned long)cpu_addr);
-   return page_to_pfn(virt_to_page((void *)addr));
+   return page_to_pfn(virt_to_page(cached_kernel_address(cpu_addr)));
 }
 
 pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
-- 
2.20.1



Re: [PATCH v4 0/3] PCIe Host request to reserve IOVA

2019-05-01 Thread Robin Murphy

On 2019-05-01 1:55 pm, Bjorn Helgaas wrote:
> On Wed, May 01, 2019 at 12:30:38PM +0100, Lorenzo Pieralisi wrote:
>> On Fri, Apr 12, 2019 at 08:43:32AM +0530, Srinath Mannam wrote:
>>> Few SOCs have limitation that their PCIe host can't allow few inbound
>>> address ranges. Allowed inbound address ranges are listed in dma-ranges
>>> DT property and this address ranges are required to do IOVA mapping.
>>> Remaining address ranges have to be reserved in IOVA mapping.
>>>
>>> PCIe Host driver of those SOCs has to list resource entries of allowed
>>> address ranges given in dma-ranges DT property in sorted order. This
>>> sorted list of resources will be processed and reserve IOVA address for
>>> inaccessible address holes while initializing IOMMU domain.
>>>
>>> This patch set is based on Linux-5.0-rc2.
>>>
>>> Changes from v3:
>>>   - Addressed Robin Murphy review comments.
>>>     - pcie-iproc: parse dma-ranges and make sorted resource list.
>>>     - dma-iommu: process list and reserve gaps between entries
>>>
>>> Changes from v2:
>>>   - Patch set rebased to Linux-5.0-rc2
>>>
>>> Changes from v1:
>>>   - Addressed Oza review comments.
>>>
>>> Srinath Mannam (3):
>>>   PCI: Add dma_ranges window list
>>>   iommu/dma: Reserve IOVA for PCIe inaccessible DMA address
>>>   PCI: iproc: Add sorted dma ranges resource entries to host bridge
>>>
>>>  drivers/iommu/dma-iommu.c           | 19 
>>>  drivers/pci/controller/pcie-iproc.c | 44 -
>>>  drivers/pci/probe.c                 |  3 +++
>>>  include/linux/pci.h                 |  1 +
>>>  4 files changed, 66 insertions(+), 1 deletion(-)
>>
>> Bjorn, Joerg,
>>
>> this series should not affect anything in the mainline other than its
>> consumer (ie patch 3); if that's the case should we consider it for v5.2
>> and if yes how are we going to merge it ?
>
> I acked the first one
>
> Robin reviewed the second
> (https://lore.kernel.org/lkml/e6c812d6-0cad-4cfd-defd-d7ec427a6...@arm.com)
> (though I do agree with his comment about DMA_BIT_MASK()), Joerg was OK
> with it if Robin was
> (https://lore.kernel.org/lkml/20190423145721.gh29...@8bytes.org).
>
> Eric reviewed the third (and pointed out a typo).
>
> My Kconfiggery never got fully answered -- it looks to me as though it's
> possible to build pcie-iproc without the DMA hole support, and I thought
> the whole point of this series was to deal with those holes
> (https://lore.kernel.org/lkml/20190418234241.gf126...@google.com).  I would
> have expected something like making pcie-iproc depend on IOMMU_SUPPORT.
> But Srinath didn't respond to that, so maybe it's not an issue and it
> should only affect pcie-iproc anyway.

Hmm, I'm sure I had at least half-written a reply on that point, but I
can't seem to find it now... anyway, the gist is that these inbound
windows are generally set up to cover the physical address ranges of
DRAM and anything else that devices might need to DMA to. Thus if you're
not using an IOMMU, the fact that devices can't access the gaps in
between doesn't matter because there won't be anything there anyway; it
only needs mitigating if you do use an IOMMU and start giving arbitrary
non-physical addresses to the endpoint.

> So bottom line, I'm fine with merging it for v5.2.  Do you want to merge
> it, Lorenzo, or ...?

This doesn't look like it will conflict with the other DMA ops and MSI
mapping changes currently in-flight for iommu-dma, so I have no
objection to it going through the PCI tree for 5.2.

Robin.


[PATCH v3 0/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-05-01 Thread Julien Grall
Hi all,

On RT, the function iommu_dma_map_msi_msg() expects to be called from a
preemptible context. However, this is not always the case, resulting in a
splat when CONFIG_DEBUG_ATOMIC_SLEEP is enabled:

[   48.875777] BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:974
[   48.875779] in_atomic(): 1, irqs_disabled(): 128, pid: 2103, name: ip
[   48.875782] INFO: lockdep is turned off.
[   48.875784] irq event stamp: 10684
[   48.875786] hardirqs last  enabled at (10683): [] 
_raw_spin_unlock_irqrestore+0x88/0x90
[   48.875791] hardirqs last disabled at (10684): [] 
_raw_spin_lock_irqsave+0x24/0x68
[   48.875796] softirqs last  enabled at (0): [] 
copy_process.isra.1.part.2+0x8d8/0x1970
[   48.875801] softirqs last disabled at (0): [<>]   
(null)
[   48.875805] Preemption disabled at:
[   48.875805] [] __setup_irq+0xd8/0x6c0
[   48.875811] CPU: 2 PID: 2103 Comm: ip Not tainted 
5.0.3-rt1-7-g42ede9a0fed6 #45
[   48.875815] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno 
Development Platform, BIOS EDK II Jan 23 2017
[   48.875817] Call trace:
[   48.875818]  dump_backtrace+0x0/0x140
[   48.875821]  show_stack+0x14/0x20
[   48.875823]  dump_stack+0xa0/0xd4
[   48.875827]  ___might_sleep+0x16c/0x1f8
[   48.875831]  rt_spin_lock+0x5c/0x70
[   48.875835]  iommu_dma_map_msi_msg+0x5c/0x1d8
[   48.875839]  gicv2m_compose_msi_msg+0x3c/0x48
[   48.875843]  irq_chip_compose_msi_msg+0x40/0x58
[   48.875846]  msi_domain_activate+0x38/0x98
[   48.875849]  __irq_domain_activate_irq+0x58/0xa0
[   48.875852]  irq_domain_activate_irq+0x34/0x58
[   48.875855]  irq_activate+0x28/0x30
[   48.875858]  __setup_irq+0x2b0/0x6c0
[   48.875861]  request_threaded_irq+0xdc/0x188
[   48.875865]  sky2_setup_irq+0x44/0xf8
[   48.875868]  sky2_open+0x1a4/0x240
[   48.875871]  __dev_open+0xd8/0x188
[   48.875874]  __dev_change_flags+0x164/0x1f0
[   48.875877]  dev_change_flags+0x20/0x60
[   48.875879]  do_setlink+0x2a0/0xd30
[   48.875882]  __rtnl_newlink+0x5b4/0x6d8
[   48.875885]  rtnl_newlink+0x50/0x78
[   48.875888]  rtnetlink_rcv_msg+0x178/0x640
[   48.875891]  netlink_rcv_skb+0x58/0x118
[   48.875893]  rtnetlink_rcv+0x14/0x20
[   48.875896]  netlink_unicast+0x188/0x200
[   48.875898]  netlink_sendmsg+0x248/0x3d8
[   48.875900]  sock_sendmsg+0x18/0x40
[   48.875904]  ___sys_sendmsg+0x294/0x2d0
[   48.875908]  __sys_sendmsg+0x68/0xb8
[   48.875911]  __arm64_sys_sendmsg+0x20/0x28
[   48.875914]  el0_svc_common+0x90/0x118
[   48.875918]  el0_svc_handler+0x2c/0x80
[   48.875922]  el0_svc+0x8/0xc

Most of the patches have now been acked (still missing a couple of acks from Joerg).

I was able to test the changes in GICv2m and GICv3 ITS. I don't have
hardware for the other interrupt controllers.

Cheers,

Julien Grall (7):
  genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
  iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts
  irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()
  irqchip/gic-v3-its: Don't map the MSI page in
its_irq_compose_msi_msg()
  irqchip/ls-scfg-msi: Don't map the MSI page in
ls_scfg_msi_compose_msg()
  irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b,
s}i_msg()
  iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

 drivers/iommu/Kconfig |  1 +
 drivers/iommu/dma-iommu.c | 48 +++
 drivers/irqchip/irq-gic-v2m.c |  8 ++-
 drivers/irqchip/irq-gic-v3-its.c  |  7 +-
 drivers/irqchip/irq-gic-v3-mbi.c  | 15 ++--
 drivers/irqchip/irq-ls-scfg-msi.c |  7 +-
 include/linux/dma-iommu.h | 24 ++--
 include/linux/msi.h   | 26 +
 kernel/irq/Kconfig|  3 +++
 9 files changed, 112 insertions(+), 27 deletions(-)

-- 
2.11.0



Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-05-01 Thread Khalid Aziz
On 5/1/19 8:49 AM, Waiman Long wrote:
> On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
> b/Documentation/admin-guide/kernel-parameters.txt
> 
>> index 858b6c0b9a15..9b36da94760e 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -2997,6 +2997,12 @@
>>
>>   nox2apic    [X86-64,APIC] Do not enable x2APIC mode.
>>
>> +    noxpfo    [XPFO] Disable eXclusive Page Frame Ownership (XPFO)
>> +    when CONFIG_XPFO is on. Physical pages mapped into
>> +    user applications will also be mapped in the
>> +    kernel's address space as if CONFIG_XPFO was not
>> +    enabled.
>> +
>>   cpu0_hotplug    [X86] Turn on CPU0 hotplug feature when
>>   CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
>>   Some features depend on CPU0. Known dependencies are:
> 
> Given the big performance impact that XPFO can have, it should be off by
> default when configured. Instead, the xpfo option should be used to
> enable it.

Agreed. I plan to disable it by default in the next version of the
patch. This is likely to end up being a feature for extreme security
conscious folks only, unless I or someone else comes up with further
significant performance boost.

Thanks,
Khalid


Re: [PATCH v4 3/3] PCI: iproc: Add sorted dma ranges resource entries to host bridge

2019-05-01 Thread Srinath Mannam via iommu
Hi Lorenzo,

Please see my reply below.

On Wed, May 1, 2019 at 8:07 PM Lorenzo Pieralisi
 wrote:
>
> On Fri, Apr 12, 2019 at 08:43:35AM +0530, Srinath Mannam wrote:
> > IPROC host has the limitation that it can use only those address ranges
> > given by dma-ranges property as inbound address. So that the memory
> > address holes in dma-ranges should be reserved to allocate as DMA address.
> >
> > Inbound address of host accessed by PCIe devices will not be translated
> > before it comes to IOMMU or directly to PE.
>
> What does that mean "directly to PE" ?
In general, with the IOMMU enabled, an endpoint's inbound address
accesses go to the IOMMU.
If the IOMMU is disabled, they go directly to the PE (processing
element, i.e. the ARM core).
>
> IIUC all you want to say is that there is no entity translating
> PCI memory transactions addresses before they it the PCI host
> controller inbound regions address decoder.
In our SoC we have an entity (inside the PCIe RC) which translates
inbound addresses before they go to the IOMMU or the PE. In other SoCs
this will not be the case; all inbound address accesses will go straight
to the IOMMU or the PE.
Regards,
Srinath.
>
> > But the limitation of this host is, access to few address ranges are
> > ignored. So that IOVA ranges for these address ranges have to be
> > reserved.
> >
> > All allowed address ranges are listed in dma-ranges DT parameter. These
> > address ranges are converted as resource entries and listed in sorted
> > order add added to dma_ranges list of PCI host bridge structure.
> >
> > Ex:
> > dma-ranges = < \
> >   0x4300 0x00 0x8000 0x00 0x8000 0x00 0x8000 \
> >   0x4300 0x08 0x 0x08 0x 0x08 0x \
> >   0x4300 0x80 0x 0x80 0x 0x40 0x>
> >
> > In the above example of dma-ranges, memory address from
> > 0x0 - 0x8000,
> > 0x1 - 0x8,
> > 0x10 - 0x80 and
> > 0x100 - 0x.
> > are not allowed to use as inbound addresses.
> >
> > Signed-off-by: Srinath Mannam 
> > Based-on-patch-by: Oza Pawandeep 
> > Reviewed-by: Oza Pawandeep 
> > ---
> >  drivers/pci/controller/pcie-iproc.c | 44 
> > -
> >  1 file changed, 43 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/controller/pcie-iproc.c 
> > b/drivers/pci/controller/pcie-iproc.c
> > index c20fd6b..94ba5c0 100644
> > --- a/drivers/pci/controller/pcie-iproc.c
> > +++ b/drivers/pci/controller/pcie-iproc.c
> > @@ -1146,11 +1146,43 @@ static int iproc_pcie_setup_ib(struct iproc_pcie 
> > *pcie,
> >   return ret;
> >  }
> >
> > +static int
> > +iproc_pcie_add_dma_range(struct device *dev, struct list_head *resources,
> > +  struct of_pci_range *range)
> > +{
> > + struct resource *res;
> > + struct resource_entry *entry, *tmp;
> > + struct list_head *head = resources;
> > +
> > + res = devm_kzalloc(dev, sizeof(struct resource), GFP_KERNEL);
> > + if (!res)
> > + return -ENOMEM;
> > +
> > + resource_list_for_each_entry(tmp, resources) {
> > + if (tmp->res->start < range->cpu_addr)
> > + head = >node;
> > + }
> > +
> > + res->start = range->cpu_addr;
> > + res->end = res->start + range->size - 1;
> > +
> > + entry = resource_list_create_entry(res, 0);
> > + if (!entry)
> > + return -ENOMEM;
> > +
> > + entry->offset = res->start - range->cpu_addr;
> > + resource_list_add(entry, head);
> > +
> > + return 0;
> > +}
> > +
> >  static int iproc_pcie_map_dma_ranges(struct iproc_pcie *pcie)
> >  {
> > + struct pci_host_bridge *host = pci_host_bridge_from_priv(pcie);
> >   struct of_pci_range range;
> >   struct of_pci_range_parser parser;
> >   int ret;
> > + LIST_HEAD(resources);
> >
> >   /* Get the dma-ranges from DT */
> >   ret = of_pci_dma_range_parser_init(, pcie->dev->of_node);
> > @@ -1158,13 +1190,23 @@ static int iproc_pcie_map_dma_ranges(struct 
> > iproc_pcie *pcie)
> >   return ret;
> >
> >   for_each_of_pci_range(, ) {
> > + ret = iproc_pcie_add_dma_range(pcie->dev,
> > +,
> > +);
> > + if (ret)
> > + goto out;
> >   /* Each range entry corresponds to an inbound mapping region 
> > */
> >   ret = iproc_pcie_setup_ib(pcie, , 
> > IPROC_PCIE_IB_MAP_MEM);
> >   if (ret)
> > - return ret;
> > + goto out;
> >   }
> >
> > + list_splice_init(, >dma_ranges);
> > +
> >   return 0;
> > +out:
> > + pci_free_resource_list();
> > + return ret;
> >  }
> >
> >  static int iproce_pcie_get_msi(struct iproc_pcie *pcie,
> > --
> > 2.7.4
> >


Re: [PATCH v4 0/3] PCIe Host request to reserve IOVA

2019-05-01 Thread Bjorn Helgaas
On Wed, May 01, 2019 at 12:30:38PM +0100, Lorenzo Pieralisi wrote:
> On Fri, Apr 12, 2019 at 08:43:32AM +0530, Srinath Mannam wrote:
> > Few SOCs have limitation that their PCIe host can't allow few inbound
> > address ranges. Allowed inbound address ranges are listed in dma-ranges
> > DT property and this address ranges are required to do IOVA mapping.
> > Remaining address ranges have to be reserved in IOVA mapping.
> > 
> > PCIe Host driver of those SOCs has to list resource entries of allowed
> > address ranges given in dma-ranges DT property in sorted order. This
> > sorted list of resources will be processed and reserve IOVA address for
> > inaccessible address holes while initializing IOMMU domain.
> > 
> > This patch set is based on Linux-5.0-rc2.
> > 
> > Changes from v3:
> >   - Addressed Robin Murphy review comments.
> > - pcie-iproc: parse dma-ranges and make sorted resource list.
> > - dma-iommu: process list and reserve gaps between entries
> > 
> > Changes from v2:
> >   - Patch set rebased to Linux-5.0-rc2
> > 
> > Changes from v1:
> >   - Addressed Oza review comments.
> > 
> > Srinath Mannam (3):
> >   PCI: Add dma_ranges window list
> >   iommu/dma: Reserve IOVA for PCIe inaccessible DMA address
> >   PCI: iproc: Add sorted dma ranges resource entries to host bridge
> > 
> >  drivers/iommu/dma-iommu.c   | 19 
> >  drivers/pci/controller/pcie-iproc.c | 44 
> > -
> >  drivers/pci/probe.c |  3 +++
> >  include/linux/pci.h |  1 +
> >  4 files changed, 66 insertions(+), 1 deletion(-)
> 
> Bjorn, Joerg,
> 
> this series should not affect anything in the mainline other than its
> consumer (ie patch 3); if that's the case should we consider it for v5.2
> and if yes how are we going to merge it ?

I acked the first one

Robin reviewed the second
(https://lore.kernel.org/lkml/e6c812d6-0cad-4cfd-defd-d7ec427a6...@arm.com)
(though I do agree with his comment about DMA_BIT_MASK()), Joerg was OK
with it if Robin was
(https://lore.kernel.org/lkml/20190423145721.gh29...@8bytes.org).

Eric reviewed the third (and pointed out a typo).

My Kconfiggery never got fully answered -- it looks to me as though it's
possible to build pcie-iproc without the DMA hole support, and I thought
the whole point of this series was to deal with those holes
(https://lore.kernel.org/lkml/20190418234241.gf126...@google.com).  I would
have expected something like making pcie-iproc depend on IOMMU_SUPPORT.
But Srinath didn't respond to that, so maybe it's not an issue and it
should only affect pcie-iproc anyway.

So bottom line, I'm fine with merging it for v5.2.  Do you want to merge
it, Lorenzo, or ...?

Bjorn


Re: [PATCH v2 4/7] irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()

2019-05-01 Thread Marc Zyngier
On 01/05/2019 12:14, Julien Grall wrote:
> On 30/04/2019 13:34, Auger Eric wrote:
>> Hi Julien,
> 
> Hi Eric,
> 
> Thank you for the review!
> 
>>
>> On 4/29/19 4:44 PM, Julien Grall wrote:
>>> its_irq_compose_msi_msg() may be called from non-preemptible context.
>>> However, on RT, iommu_dma_map_msi_msg requires to be called from a
>>> preemptible context.
>>>
>>> A recent change split iommu_dma_map_msi_msg() in two new functions:
>>> one that should be called in preemptible context, the other does
>>> not have any requirement.
>>>
>>> The GICv3 ITS driver is reworked to avoid executing preemptible code in
>>> non-preemptible context. This can be achieved by preparing the MSI
>>> maping when allocating the MSI interrupt.
>> mapping
>>>
>>> Signed-off-by: Julien Grall 
>>>
>>> ---
>>>  Changes in v2:
>>>  - Rework the commit message to use imperative mood
>>> ---
>>>   drivers/irqchip/irq-gic-v3-its.c | 5 -
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>>> index 7577755bdcf4..12ddbcfe1b1e 100644
>>> --- a/drivers/irqchip/irq-gic-v3-its.c
>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>> @@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
>>> msg->address_hi = upper_32_bits(addr);
>>> msg->data   = its_get_event_id(d);
>>>   
>>> -   iommu_dma_map_msi_msg(d->irq, msg);
>>> +   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
>>>   }
>>>   
>>>   static int its_irq_set_irqchip_state(struct irq_data *d,
>>> @@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>>>   {
>>> msi_alloc_info_t *info = args;
>>> struct its_device *its_dev = info->scratchpad[0].ptr;
>>> +   struct its_node *its = its_dev->its;
>>> irq_hw_number_t hwirq;
>>> int err;
>>> int i;
>>> @@ -2574,6 +2575,8 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>>> if (err)
>>> return err;
>>>   
>>> +   err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
>> Test err as in gicv2m driver?
> 
> Hmmm yes. Marc, do you want me to respin the patch?

Sure, feel free to if you can. But what I really need is an Ack from
Joerg on the first few patches.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


[PATCH v3 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts

2019-05-01 Thread Julien Grall
On RT, iommu_dma_map_msi_msg() may be called from non-preemptible
context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP as
the function is using spin_lock (they can sleep on RT).

iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT
and update the MSI message with the IOVA.

Only the part to lookup for the MSI page requires to be called in
preemptible context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.

iommu_dma_map_msi_msg() is now split in two functions:
- iommu_dma_prepare_msi(): This function will prepare the mapping
in the IOMMU and store the cookie in the structure msi_desc. This
function should be called in preemptible context.
- iommu_dma_compose_msi_msg(): This function will update the MSI
message with the IOVA when the device is behind an IOMMU.

Signed-off-by: Julien Grall 
Reviewed-by: Robin Murphy 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Update the comment to use kerneldoc format
- Fix typos in the comments
- More use of msi_desc_set_iommu_cookie
- Add Robin's and Eric's reviewed-by

Changes in v2:
- Rework the commit message to use imperative mood
- Use the MSI accessor to get/set the iommu cookie
- Don't use ternary on return
- Select CONFIG_IRQ_MSI_IOMMU
- Pass an msi_desc rather than the irq number
---
 drivers/iommu/Kconfig |  1 +
 drivers/iommu/dma-iommu.c | 46 +-
 include/linux/dma-iommu.h | 25 +
 3 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816..eb1c8cd243f9 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,6 +94,7 @@ config IOMMU_DMA
bool
select IOMMU_API
select IOMMU_IOVA
+   select IRQ_MSI_IOMMU
select NEED_SG_DMA_LENGTH
 
 config FSL_PAMU
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..f847904098f7 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,18 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
return NULL;
 }
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
 {
-   struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
struct iommu_dma_msi_page *msi_page;
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
 
-   if (!domain || !domain->iova_cookie)
-   return;
+   if (!domain || !domain->iova_cookie) {
+   desc->iommu_cookie = NULL;
+   return 0;
+   }
 
cookie = domain->iova_cookie;
 
@@ -911,7 +912,36 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
	spin_unlock_irqrestore(&cookie->msi_lock, flags);
 
-   if (WARN_ON(!msi_page)) {
+   msi_desc_set_iommu_cookie(desc, msi_page);
+
+   if (!msi_page)
+   return -ENOMEM;
+   return 0;
+}
+
+void iommu_dma_compose_msi_msg(struct msi_desc *desc,
+  struct msi_msg *msg)
+{
+   struct device *dev = msi_desc_to_dev(desc);
+   const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   const struct iommu_dma_msi_page *msi_page;
+
+   msi_page = msi_desc_get_iommu_cookie(desc);
+
+   if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+   return;
+
+   msg->address_hi = upper_32_bits(msi_page->iova);
+   msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+   msg->address_lo += lower_32_bits(msi_page->iova);
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+   struct msi_desc *desc = irq_get_msi_desc(irq);
+   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+
+   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
/*
 * We're called from a void callback, so the best we can do is
 * 'fail' by filling the message with obviously bogus values.
@@ -922,8 +952,6 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo = ~0U;
msg->data = ~0U;
} else {
-   msg->address_hi = upper_32_bits(msi_page->iova);
-   msg->address_lo &= cookie_msi_granule(cookie) - 1;
-   msg->address_lo += lower_32_bits(msi_page->iova);
+   iommu_dma_compose_msi_msg(desc, msg);
}
 }
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 

[PATCH v3 7/7] iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

2019-05-01 Thread Julien Grall
A recent change split iommu_dma_map_msi_msg() in two new functions. The
function was still implemented to avoid modifying all the callers at
once.

Now that all the callers have been reworked, iommu_dma_map_msi_msg() can
be removed.

Signed-off-by: Julien Grall 
Reviewed-by: Robin Murphy 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Add Robin's and Eric's reviewed-by

Changes in v2:
- Rework the commit message
---
 drivers/iommu/dma-iommu.c | 20 
 include/linux/dma-iommu.h |  5 -
 2 files changed, 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f847904098f7..13916fefeb27 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -935,23 +935,3 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc,
msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
msg->address_lo += lower_32_bits(msi_page->iova);
 }
-
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-   struct msi_desc *desc = irq_get_msi_desc(irq);
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
-
-   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
-   /*
-* We're called from a void callback, so the best we can do is
-* 'fail' by filling the message with obviously bogus values.
-* Since we got this far due to an IOMMU being present, it's
-* not like the existing address would have worked anyway...
-*/
-   msg->address_hi = ~0U;
-   msg->address_lo = ~0U;
-   msg->data = ~0U;
-   } else {
-   iommu_dma_compose_msi_msg(desc, msg);
-   }
-}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 0b781a98ee73..476e0c54de2d 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -84,7 +84,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr);
 void iommu_dma_compose_msi_msg(struct msi_desc *desc,
   struct msi_msg *msg);
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
 #else
@@ -124,10 +123,6 @@ static inline void iommu_dma_compose_msi_msg(struct msi_desc *desc,
 {
 }
 
-static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-}
-
 static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {
 }
-- 
2.11.0



[PATCH v3 6/7] irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg()

2019-05-01 Thread Julien Grall
The functions mbi_compose_m{b, s}i_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg() requires to be called
from a preemptible context.

A recent patch split iommu_dma_map_msi_msg in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv3 MSI driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the two MSI
mappings when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v3-mbi.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-mbi.c b/drivers/irqchip/irq-gic-v3-mbi.c
index fbfa7ff6deb1..d50f6cdf043c 100644
--- a/drivers/irqchip/irq-gic-v3-mbi.c
+++ b/drivers/irqchip/irq-gic-v3-mbi.c
@@ -84,6 +84,7 @@ static void mbi_free_msi(struct mbi_range *mbi, unsigned int hwirq,
 static int mbi_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct mbi_range *mbi = NULL;
int hwirq, offset, i, err = 0;
 
@@ -104,6 +105,16 @@ static int mbi_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
 
hwirq = mbi->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_CLRSPI_NSR);
+   if (err)
+   return err;
+
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_SETSPI_NSR);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = mbi_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
@@ -142,7 +153,7 @@ static void mbi_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
msg[0].address_lo = lower_32_bits(mbi_phys_base + GICD_SETSPI_NSR);
msg[0].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 #ifdef CONFIG_PCI_MSI
@@ -202,7 +213,7 @@ static void mbi_compose_mbi_msg(struct irq_data *data, struct msi_msg *msg)
msg[1].address_lo = lower_32_bits(mbi_phys_base + GICD_CLRSPI_NSR);
msg[1].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, &msg[1]);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), &msg[1]);
 }
 
 /* Platform-MSI specific irqchip */
-- 
2.11.0



[PATCH v3 1/7] genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

2019-05-01 Thread Julien Grall
When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domain.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemptible
context while the allocation requires to be called from preemptible
context.

A follow-up change will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

A new field is introduced in msi_desc to store an IOMMU cookie. As the
cookie may not be required in some configuration, the field is protected
under a new config CONFIG_IRQ_MSI_IOMMU.

A pair of helpers has also been introduced to access the field.

Signed-off-by: Julien Grall 
Reviewed-by: Robin Murphy 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Add Robin's and Eric's reviewed-by

Changes in v2:
- Update the commit message to use imperative mood
- Protect the field with a new config that will be selected by
IOMMU_DMA later on
- Add a set of helpers to access the new field
---
 include/linux/msi.h | 26 ++
 kernel/irq/Kconfig  |  3 +++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 7e9b81c3b50d..82a308c19222 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -77,6 +77,9 @@ struct msi_desc {
struct device   *dev;
struct msi_msg  msg;
struct irq_affinity_desc*affinity;
+#ifdef CONFIG_IRQ_MSI_IOMMU
+   const void  *iommu_cookie;
+#endif
 
union {
/* PCI MSI/X specific data */
@@ -119,6 +122,29 @@ struct msi_desc {
 #define for_each_msi_entry_safe(desc, tmp, dev)\
list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
 
+#ifdef CONFIG_IRQ_MSI_IOMMU
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return desc->iommu_cookie;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+   desc->iommu_cookie = iommu_cookie;
+}
+#else
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return NULL;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+}
+#endif
+
 #ifdef CONFIG_PCI_MSI
 #define first_pci_msi_entry(pdev)  first_msi_entry(&(pdev)->dev)
 #define for_each_pci_msi_entry(desc, pdev) \
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 5f3e2baefca9..8fee06625c37 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -91,6 +91,9 @@ config GENERIC_MSI_IRQ_DOMAIN
select IRQ_DOMAIN_HIERARCHY
select GENERIC_MSI_IRQ
 
+config IRQ_MSI_IOMMU
+   bool
+
 config HANDLE_DOMAIN_IRQ
bool
 
-- 
2.11.0



[PATCH v3 3/7] irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()

2019-05-01 Thread Julien Grall
gicv2m_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() requires to be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv2m driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Add Eric's reviewed-by

Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v2m.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index f5fe0100f9ff..4359f0583377 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -110,7 +110,7 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
msg->data -= v2m->spi_offset;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 static struct irq_chip gicv2m_irq_chip = {
@@ -167,6 +167,7 @@ static void gicv2m_unalloc_msi(struct v2m_data *v2m, unsigned int hwirq,
 static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct v2m_data *v2m = NULL, *tmp;
int hwirq, offset, i, err = 0;
 
@@ -186,6 +187,11 @@ static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
 
hwirq = v2m->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   v2m->res.start + V2M_MSI_SETSPI_NS);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = gicv2m_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0



[PATCH v3 5/7] irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg()

2019-05-01 Thread Julien Grall
ls_scfg_msi_compose_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() requires to be called from a
preemptible context.

A recent patch split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The Freescale SCFG MSI driver is reworked to avoid executing preemptible
code in non-preemptible context. This can be achieved by preparing the
MSI mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-ls-scfg-msi.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-ls-scfg-msi.c b/drivers/irqchip/irq-ls-scfg-msi.c
index c671b3212010..669d29105772 100644
--- a/drivers/irqchip/irq-ls-scfg-msi.c
+++ b/drivers/irqchip/irq-ls-scfg-msi.c
@@ -100,7 +100,7 @@ static void ls_scfg_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
msg->data |= cpumask_first(mask);
}
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 static int ls_scfg_msi_set_affinity(struct irq_data *irq_data,
@@ -141,6 +141,7 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain *domain,
unsigned int nr_irqs,
void *args)
 {
+   msi_alloc_info_t *info = args;
struct ls_scfg_msi *msi_data = domain->host_data;
int pos, err = 0;
 
@@ -157,6 +158,10 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain *domain,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, msi_data->msiir_addr);
+   if (err)
+   return err;
+
irq_domain_set_info(domain, virq, pos,
_scfg_msi_parent_chip, msi_data,
handle_simple_irq, NULL, NULL);
-- 
2.11.0



[PATCH v3 4/7] irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()

2019-05-01 Thread Julien Grall
its_irq_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg requires to be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv3 ITS driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Fix typo in the commit message
- Check the return of iommu_dma_prepare_msi
- Add Eric's reviewed-by

Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v3-its.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755bdcf4..9cddf336c09d 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
msg->address_hi = upper_32_bits(addr);
msg->data   = its_get_event_id(d);
 
-   iommu_dma_map_msi_msg(d->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
 }
 
 static int its_irq_set_irqchip_state(struct irq_data *d,
@@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
 {
msi_alloc_info_t *info = args;
struct its_device *its_dev = info->scratchpad[0].ptr;
+   struct its_node *its = its_dev->its;
irq_hw_number_t hwirq;
int err;
int i;
@@ -2574,6 +2575,10 @@ static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = its_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0



Re: [PATCH v4 0/3] PCIe Host request to reserve IOVA

2019-05-01 Thread Lorenzo Pieralisi
On Wed, May 01, 2019 at 02:20:56PM +0100, Robin Murphy wrote:
> On 2019-05-01 1:55 pm, Bjorn Helgaas wrote:
> > On Wed, May 01, 2019 at 12:30:38PM +0100, Lorenzo Pieralisi wrote:
> > > On Fri, Apr 12, 2019 at 08:43:32AM +0530, Srinath Mannam wrote:
> > > > Few SOCs have limitation that their PCIe host can't allow few inbound
> > > > address ranges. Allowed inbound address ranges are listed in dma-ranges
> > > > DT property and this address ranges are required to do IOVA mapping.
> > > > Remaining address ranges have to be reserved in IOVA mapping.
> > > > 
> > > > PCIe Host driver of those SOCs has to list resource entries of allowed
> > > > address ranges given in dma-ranges DT property in sorted order. This
> > > > sorted list of resources will be processed and reserve IOVA address for
> > > > inaccessible address holes while initializing IOMMU domain.
> > > > 
> > > > This patch set is based on Linux-5.0-rc2.
> > > > 
> > > > Changes from v3:
> > > >- Addressed Robin Murphy review comments.
> > > >  - pcie-iproc: parse dma-ranges and make sorted resource list.
> > > >  - dma-iommu: process list and reserve gaps between entries
> > > > 
> > > > Changes from v2:
> > > >- Patch set rebased to Linux-5.0-rc2
> > > > 
> > > > Changes from v1:
> > > >- Addressed Oza review comments.
> > > > 
> > > > Srinath Mannam (3):
> > > >PCI: Add dma_ranges window list
> > > >iommu/dma: Reserve IOVA for PCIe inaccessible DMA address
> > > >PCI: iproc: Add sorted dma ranges resource entries to host bridge
> > > > 
> > > >   drivers/iommu/dma-iommu.c   | 19 
> > > >   drivers/pci/controller/pcie-iproc.c | 44 
> > > > -
> > > >   drivers/pci/probe.c |  3 +++
> > > >   include/linux/pci.h |  1 +
> > > >   4 files changed, 66 insertions(+), 1 deletion(-)
> > > 
> > > Bjorn, Joerg,
> > > 
> > > this series should not affect anything in the mainline other than its
> > > consumer (ie patch 3); if that's the case should we consider it for v5.2
> > > and if yes how are we going to merge it ?
> > 
> > I acked the first one
> > 
> > Robin reviewed the second
> > (https://lore.kernel.org/lkml/e6c812d6-0cad-4cfd-defd-d7ec427a6...@arm.com)
> > (though I do agree with his comment about DMA_BIT_MASK()), Joerg was OK
> > with it if Robin was
> > (https://lore.kernel.org/lkml/20190423145721.gh29...@8bytes.org).
> > 
> > Eric reviewed the third (and pointed out a typo).
> > 
> > My Kconfiggery never got fully answered -- it looks to me as though it's
> > possible to build pcie-iproc without the DMA hole support, and I thought
> > the whole point of this series was to deal with those holes
> > (https://lore.kernel.org/lkml/20190418234241.gf126...@google.com).  I would
> > have expected something like making pcie-iproc depend on IOMMU_SUPPORT.
> > But Srinath didn't respond to that, so maybe it's not an issue and it
> > should only affect pcie-iproc anyway.
> 
> Hmm, I'm sure I had at least half-written a reply on that point, but I
> can't seem to find it now... anyway, the gist is that these inbound
> windows are generally set up to cover the physical address ranges of DRAM
> and anything else that devices might need to DMA to. Thus if you're not
> using an IOMMU, the fact that devices can't access the gaps in between
> doesn't matter because there won't be anything there anyway; it only
> needs mitigating if you do use an IOMMU and start giving arbitrary
> non-physical addresses to the endpoint.

So basically there is no strict IOMMU_SUPPORT dependency.

> > So bottom line, I'm fine with merging it for v5.2.  Do you want to merge
> > it, Lorenzo, or ...?
> 
> This doesn't look like it will conflict with the other DMA ops and MSI
> mapping changes currently in-flight for iommu-dma, so I have no
> objection to it going through the PCI tree for 5.2.

I will update the DMA_BIT_MASK() according to your review and fix the
typo Eric pointed out and push out a branch - we shall see if we can
include it for v5.2.

Thanks,
Lorenzo


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-05-01 Thread Waiman Long
On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt

> index 858b6c0b9a15..9b36da94760e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2997,6 +2997,12 @@
>
>  nox2apic    [X86-64,APIC] Do not enable x2APIC mode.
>
> +    noxpfo    [XPFO] Disable eXclusive Page Frame Ownership (XPFO)
> +    when CONFIG_XPFO is on. Physical pages mapped into
> +    user applications will also be mapped in the
> +    kernel's address space as if CONFIG_XPFO was not
> +    enabled.
> +
>  cpu0_hotplug    [X86] Turn on CPU0 hotplug feature when
>  CONFIG_BO OTPARAM_HOTPLUG_CPU0 is off.
>  Some features depend on CPU0. Known dependencies are:

Given the big performance impact that XPFO can have. It should be off by
default when configured. Instead, the xpfo option should be used to
enable it.

Cheers,
Longman


Re: [PATCH v4 0/3] PCIe Host request to reserve IOVA

2019-05-01 Thread Srinath Mannam via iommu
Hi Robin,

Thank you so much for all the information.

Regards,
Srinath.

On Wed, May 1, 2019 at 6:51 PM Robin Murphy  wrote:
>
> On 2019-05-01 1:55 pm, Bjorn Helgaas wrote:
> > On Wed, May 01, 2019 at 12:30:38PM +0100, Lorenzo Pieralisi wrote:
> >> On Fri, Apr 12, 2019 at 08:43:32AM +0530, Srinath Mannam wrote:
> >>> Few SOCs have limitation that their PCIe host can't allow few inbound
> >>> address ranges. Allowed inbound address ranges are listed in dma-ranges
> >>> DT property and this address ranges are required to do IOVA mapping.
> >>> Remaining address ranges have to be reserved in IOVA mapping.
> >>>
> >>> PCIe Host driver of those SOCs has to list resource entries of allowed
> >>> address ranges given in dma-ranges DT property in sorted order. This
> >>> sorted list of resources will be processed and reserve IOVA address for
> >>> inaccessible address holes while initializing IOMMU domain.
> >>>
> >>> This patch set is based on Linux-5.0-rc2.
> >>>
> >>> Changes from v3:
> >>>- Addressed Robin Murphy review comments.
> >>>  - pcie-iproc: parse dma-ranges and make sorted resource list.
> >>>  - dma-iommu: process list and reserve gaps between entries
> >>>
> >>> Changes from v2:
> >>>- Patch set rebased to Linux-5.0-rc2
> >>>
> >>> Changes from v1:
> >>>- Addressed Oza review comments.
> >>>
> >>> Srinath Mannam (3):
> >>>PCI: Add dma_ranges window list
> >>>iommu/dma: Reserve IOVA for PCIe inaccessible DMA address
> >>>PCI: iproc: Add sorted dma ranges resource entries to host bridge
> >>>
> >>>   drivers/iommu/dma-iommu.c   | 19 
> >>>   drivers/pci/controller/pcie-iproc.c | 44 
> >>> -
> >>>   drivers/pci/probe.c |  3 +++
> >>>   include/linux/pci.h |  1 +
> >>>   4 files changed, 66 insertions(+), 1 deletion(-)
> >>
> >> Bjorn, Joerg,
> >>
> >> this series should not affect anything in the mainline other than its
> >> consumer (ie patch 3); if that's the case should we consider it for v5.2
> >> and if yes how are we going to merge it ?
> >
> > I acked the first one
> >
> > Robin reviewed the second
> > (https://lore.kernel.org/lkml/e6c812d6-0cad-4cfd-defd-d7ec427a6...@arm.com)
> > (though I do agree with his comment about DMA_BIT_MASK()), Joerg was OK
> > with it if Robin was
> > (https://lore.kernel.org/lkml/20190423145721.gh29...@8bytes.org).
> >
> > Eric reviewed the third (and pointed out a typo).
> >
> > My Kconfiggery never got fully answered -- it looks to me as though it's
> > possible to build pcie-iproc without the DMA hole support, and I thought
> > the whole point of this series was to deal with those holes
> > (https://lore.kernel.org/lkml/20190418234241.gf126...@google.com).  I would
> > have expected something like making pcie-iproc depend on IOMMU_SUPPORT.
> > But Srinath didn't respond to that, so maybe it's not an issue and it
> > should only affect pcie-iproc anyway.
>
> Hmm, I'm sure I had at least half-written a reply on that point, but I
> can't seem to find it now... anyway, the gist is that these inbound
> windows are generally set up to cover the physical address ranges of
> DRAM and anything else that devices might need to DMA to. Thus if you're
> not using an IOMMU, the fact that devices can't access the gaps in
> between doesn't matter because there won't be anything there anyway; it
> only needs mitigating if you do use an IOMMU and start giving arbitrary
> non-physical addresses to the endpoint.
>
> > So bottom line, I'm fine with merging it for v5.2.  Do you want to merge
> > it, Lorenzo, or ...?
>
> This doesn't look like it will conflict with the other DMA ops and MSI
> mapping changes currently in-flight for iommu-dma, so I have no
> objection to it going through the PCI tree for 5.2.
>
> Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 4/7] dma-direct: provide generic support for uncached kernel segments

2019-05-01 Thread Paul Burton
Hi Christoph,

On Tue, Apr 30, 2019 at 07:00:29AM -0400, Christoph Hellwig wrote:
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 2c2772e9702a..d15a535c3e67 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -164,6 +164,13 @@ void *dma_direct_alloc_pages(struct device *dev, size_t 
> size,
>   }
>  
>   ret = page_address(page);
> +
> + if (IS_ENABLED(CONFIG_ARCH_HAS_UNCACHED_SEGMENT) &&
> + !dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
> + arch_dma_prep_coherent(page, size);
> + ret = uncached_kernel_address(ret);
> + }
> +
>   if (force_dma_unencrypted()) {
>   set_memory_decrypted((unsigned long)ret, 1 << get_order(size));
>   *dma_handle = __phys_to_dma(dev, page_to_phys(page));
> @@ -171,6 +178,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t 
> size,
>   *dma_handle = phys_to_dma(dev, page_to_phys(page));
>   }
>   memset(ret, 0, size);
> +
>   return ret;
>  }

I'm not so sure about this part though.

On MIPS we currently don't clear the allocated memory with memset. Is
doing that really necessary?

If it is necessary then as-is this code will clear the allocated memory
using uncached writes which will be pretty slow. It would be much more
efficient to perform the memset before arch_dma_prep_coherent() & before
converting ret to an uncached address.
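The ordering suggested above can be sketched in plain C. This is a hypothetical stand-in, not the real kernel code: `arch_dma_prep_coherent_stub()` models the cache writeback `arch_dma_prep_coherent()` would do, and `uncached_alias()` models `uncached_kernel_address()`, which on real hardware returns a different virtual address for the same physical pages.

```c
#include <stdlib.h>
#include <string.h>

static void arch_dma_prep_coherent_stub(void *addr, size_t size)
{
	(void)addr;
	(void)size;	/* pretend: writeback+invalidate [addr, addr+size) */
}

static void *uncached_alias(void *addr)
{
	return addr;	/* identity mapping in this sketch */
}

void *alloc_zeroed_uncached(size_t size)
{
	void *ret = malloc(size);

	if (!ret)
		return NULL;
	memset(ret, 0, size);			/* fast, cached writes */
	arch_dma_prep_coherent_stub(ret, size);	/* push the zeros to memory */
	return uncached_alias(ret);		/* only now switch to uncached */
}
```

The point is simply that the memset and the cache maintenance happen through the cached mapping, before the pointer is converted to its uncached alias.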

Thanks,
Paul


Re: [PATCH v2 06/19] drivers core: Add I/O ASID allocator

2019-05-01 Thread Jean-Philippe Brucker
On 30/04/2019 21:24, Jacob Pan wrote:
> On Thu, 25 Apr 2019 11:41:05 +0100
> Jean-Philippe Brucker  wrote:
> 
>> On 25/04/2019 11:17, Auger Eric wrote:
 +/**
 + * ioasid_alloc - Allocate an IOASID
 + * @set: the IOASID set
 + * @min: the minimum ID (inclusive)
 + * @max: the maximum ID (exclusive)
 + * @private: data private to the caller
 + *
 + * Allocate an ID between @min and @max (or %0 and %INT_MAX).
 Return the  
>>> I would remove "(or %0 and %INT_MAX)".  
>>
>> Agreed, those where the default values of idr, but the xarray doesn't
>> define a default max value. By the way, I do think squashing patches 6
>> and 7 would be better (keeping my SOB but you can change the author).
>>
> I will squash 6 and 7 in v3. I will just add my SOB but keep the
> author if that is OK.

Sure, that works

Thanks,
Jean


Re: [PATCH 4/7] dma-direct: provide generic support for uncached kernel segments

2019-05-01 Thread Paul Burton
Hi Christoph,

On Wed, May 01, 2019 at 07:29:12PM +0200, Christoph Hellwig wrote:
> On Wed, May 01, 2019 at 05:18:59PM +, Paul Burton wrote:
> > I'm not so sure about this part though.
> > 
> > On MIPS we currently don't clear the allocated memory with memset. Is
> > doing that really necessary?
> 
> We are clearing it on mips, it is inside dma_direct_alloc_pages.

Ah, of course, I clearly require more caffeine :)

> > If it is necessary then as-is this code will clear the allocated memory
> > using uncached writes which will be pretty slow. It would be much more
> > efficient to perform the memset before arch_dma_prep_coherent() & before
> > converting ret to an uncached address.
> 
> Yes, we could do that.

Great; using cached writes would match the existing MIPS behavior.

Thanks,
Paul


Re: [PATCH 4/7] dma-direct: provide generic support for uncached kernel segments

2019-05-01 Thread Christoph Hellwig
On Wed, May 01, 2019 at 05:40:34PM +, Paul Burton wrote:
> > > If it is necessary then as-is this code will clear the allocated memory
> > > using uncached writes which will be pretty slow. It would be much more
> > > efficient to perform the memset before arch_dma_prep_coherent() & before
> > > converting ret to an uncached address.
> > 
> > Yes, we could do that.
> 
> Great; using cached writes would match the existing MIPS behavior.

Can you test the stack with the two updated patches and ack them if
they are fine?  That would allow getting at least the infrastructure
and mips in for this merge window.


[PATCH v5 3/3] PCI: iproc: Add sorted dma ranges resource entries to host bridge

2019-05-01 Thread Srinath Mannam via iommu
The IPROC host has the limitation that it can use only those address
ranges given by the dma-ranges property as inbound addresses, so the
memory address holes in dma-ranges must be reserved rather than
allocated as DMA addresses.

Inbound addresses of the host accessed by PCIe devices are not
translated before they reach the IOMMU, or the PE directly. The
limitation of this host is that accesses to a few address ranges are
ignored, so the IOVA ranges covering those addresses have to be
reserved.

All allowed address ranges are listed in the dma-ranges DT property.
These address ranges are converted to resource entries, listed in
sorted order, and added to the dma_ranges list of the PCI host bridge
structure.

Ex:
dma-ranges = < \
  0x4300 0x00 0x8000 0x00 0x8000 0x00 0x8000 \
  0x4300 0x08 0x 0x08 0x 0x08 0x \
  0x4300 0x80 0x 0x80 0x 0x40 0x>

In the above example of dma-ranges, memory addresses from
0x0 - 0x8000,
0x1 - 0x8,
0x10 - 0x80 and
0x100 - 0x
are not allowed to be used as inbound addresses.

Signed-off-by: Srinath Mannam 
Based-on-patch-by: Oza Pawandeep 
Reviewed-by: Oza Pawandeep 
Reviewed-by: Eric Auger 
---
 drivers/pci/controller/pcie-iproc.c | 44 -
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pcie-iproc.c 
b/drivers/pci/controller/pcie-iproc.c
index c20fd6b..94ba5c0 100644
--- a/drivers/pci/controller/pcie-iproc.c
+++ b/drivers/pci/controller/pcie-iproc.c
@@ -1146,11 +1146,43 @@ static int iproc_pcie_setup_ib(struct iproc_pcie *pcie,
return ret;
 }
 
+static int
+iproc_pcie_add_dma_range(struct device *dev, struct list_head *resources,
+struct of_pci_range *range)
+{
+   struct resource *res;
+   struct resource_entry *entry, *tmp;
+   struct list_head *head = resources;
+
+   res = devm_kzalloc(dev, sizeof(struct resource), GFP_KERNEL);
+   if (!res)
+   return -ENOMEM;
+
+   resource_list_for_each_entry(tmp, resources) {
+   if (tmp->res->start < range->cpu_addr)
+   head = &tmp->node;
+   }
+
+   res->start = range->cpu_addr;
+   res->end = res->start + range->size - 1;
+
+   entry = resource_list_create_entry(res, 0);
+   if (!entry)
+   return -ENOMEM;
+
+   entry->offset = res->start - range->cpu_addr;
+   resource_list_add(entry, head);
+
+   return 0;
+}
+
 static int iproc_pcie_map_dma_ranges(struct iproc_pcie *pcie)
 {
+   struct pci_host_bridge *host = pci_host_bridge_from_priv(pcie);
struct of_pci_range range;
struct of_pci_range_parser parser;
int ret;
+   LIST_HEAD(resources);
 
/* Get the dma-ranges from DT */
ret = of_pci_dma_range_parser_init(&parser, pcie->dev->of_node);
@@ -1158,13 +1190,23 @@ static int iproc_pcie_map_dma_ranges(struct iproc_pcie 
*pcie)
return ret;
 
for_each_of_pci_range(&parser, &range) {
+   ret = iproc_pcie_add_dma_range(pcie->dev,
+  &resources,
+  &range);
+   if (ret)
+   goto out;
/* Each range entry corresponds to an inbound mapping region */
ret = iproc_pcie_setup_ib(pcie, &range, IPROC_PCIE_IB_MAP_MEM);
if (ret)
-   return ret;
+   goto out;
}
 
+   list_splice_init(&resources, &host->dma_ranges);
+
return 0;
+out:
+   pci_free_resource_list(&resources);
+   return ret;
 }
 
 static int iproce_pcie_get_msi(struct iproc_pcie *pcie,
-- 
2.7.4
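For reference, the sorted-insert walk that iproc_pcie_add_dma_range() performs can be sketched independently of the PCI structures. `struct range` and `range_insert_sorted()` below are hypothetical stand-ins for `struct resource_entry` and the resource_list helpers, using a singly linked list for brevity:

```c
#include <stddef.h>

struct range {
	unsigned long long start, end;
	struct range *next;
};

/* Keep the list ordered by start address: advance past every entry
 * that starts below the new range, then link the new entry there. */
void range_insert_sorted(struct range **head, struct range *new_r)
{
	struct range **pos = head;

	while (*pos && (*pos)->start < new_r->start)
		pos = &(*pos)->next;

	new_r->next = *pos;
	*pos = new_r;
}
```

Keeping the list sorted at insertion time is what lets the IOMMU side later walk the entries once and reserve the holes between consecutive ranges.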



[PATCH v5 1/3] PCI: Add dma_ranges window list

2019-05-01 Thread Srinath Mannam via iommu
Add a dma_ranges field to the PCI host bridge structure to hold the
list of resource entries for the memory regions, in sorted order, given
through the dma-ranges DT property.

While initializing the IOMMU domain of the PCI EPs connected to that
host bridge, this list of resources will be processed and IOVAs for the
address holes will be reserved.

Signed-off-by: Srinath Mannam 
Based-on-patch-by: Oza Pawandeep 
Reviewed-by: Oza Pawandeep 
Acked-by: Bjorn Helgaas 
---
 drivers/pci/probe.c | 3 +++
 include/linux/pci.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 7e12d01..72563c1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -595,6 +595,7 @@ struct pci_host_bridge *pci_alloc_host_bridge(size_t priv)
return NULL;
 
INIT_LIST_HEAD(&bridge->windows);
+   INIT_LIST_HEAD(&bridge->dma_ranges);
bridge->dev.release = pci_release_host_bridge_dev;
 
/*
@@ -623,6 +624,7 @@ struct pci_host_bridge *devm_pci_alloc_host_bridge(struct 
device *dev,
return NULL;
 
INIT_LIST_HEAD(&bridge->windows);
+   INIT_LIST_HEAD(&bridge->dma_ranges);
bridge->dev.release = devm_pci_release_host_bridge_dev;
 
return bridge;
@@ -632,6 +634,7 @@ EXPORT_SYMBOL(devm_pci_alloc_host_bridge);
 void pci_free_host_bridge(struct pci_host_bridge *bridge)
 {
pci_free_resource_list(&bridge->windows);
+   pci_free_resource_list(&bridge->dma_ranges);
 
kfree(bridge);
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7744821..bba0a29 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -490,6 +490,7 @@ struct pci_host_bridge {
void*sysdata;
int busnr;
struct list_head windows;   /* resource_entry */
+   struct list_head dma_ranges;/* dma ranges resource list */
u8 (*swizzle_irq)(struct pci_dev *, u8 *); /* Platform IRQ swizzler */
int (*map_irq)(const struct pci_dev *, u8, u8);
void (*release_fn)(struct pci_host_bridge *);
-- 
2.7.4



[PATCH v5 0/3] PCIe Host request to reserve IOVA

2019-05-01 Thread Srinath Mannam via iommu
A few SoCs have the limitation that their PCIe host cannot accept some
inbound address ranges. The allowed inbound address ranges are listed
in the dma-ranges DT property, and only these ranges may be used for
IOVA mapping; the remaining address ranges have to be reserved in the
IOVA space.

The PCIe host driver of those SoCs has to list the resource entries of
the allowed address ranges given in the dma-ranges DT property in
sorted order. This sorted list of resources will be processed, and
IOVAs will be reserved for the inaccessible address holes while
initializing the IOMMU domain.

This patch set is based on Linux-5.1-rc3.

Changes from v4:
  - Addressed Bjorn, Robin Murphy and Auger Eric review comments.
- Commit message modification.
- Change DMA_BIT_MASK to "~(dma_addr_t)0".

Changes from v3:
  - Addressed Robin Murphy review comments.
- pcie-iproc: parse dma-ranges and make sorted resource list.
- dma-iommu: process list and reserve gaps between entries

Changes from v2:
  - Patch set rebased to Linux-5.0-rc2

Changes from v1:
  - Addressed Oza review comments.

Srinath Mannam (3):
  PCI: Add dma_ranges window list
  iommu/dma: Reserve IOVA for PCIe inaccessible DMA address
  PCI: iproc: Add sorted dma ranges resource entries to host bridge

 drivers/iommu/dma-iommu.c   | 19 
 drivers/pci/controller/pcie-iproc.c | 44 -
 drivers/pci/probe.c |  3 +++
 include/linux/pci.h |  1 +
 4 files changed, 66 insertions(+), 1 deletion(-)

-- 
2.7.4



[PATCH v5 2/3] iommu/dma: Reserve IOVA for PCIe inaccessible DMA address

2019-05-01 Thread Srinath Mannam via iommu
The dma_ranges field of the PCI host bridge structure holds resource
entries, in sorted order, for the address ranges given through the
dma-ranges DT property. This list describes the accessible DMA address
ranges, so it is processed here to reserve IOVAs for the inaccessible
address holes between the entries.

This method is similar to the way PCI I/O resource address ranges are
reserved in the IOMMU for each EP connected to the host bridge.

Signed-off-by: Srinath Mannam 
Based-on-patch-by: Oza Pawandeep 
Reviewed-by: Oza Pawandeep 
Acked-by: Robin Murphy 
---
 drivers/iommu/dma-iommu.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe6..da94844 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -212,6 +212,7 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
struct resource_entry *window;
unsigned long lo, hi;
+   phys_addr_t start = 0, end;
 
resource_list_for_each_entry(window, &bridge->windows) {
if (resource_type(window->res) != IORESOURCE_MEM)
@@ -221,6 +222,24 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
hi = iova_pfn(iovad, window->res->end - window->offset);
reserve_iova(iovad, lo, hi);
}
+
+   /* Get reserved DMA windows from host bridge */
+   resource_list_for_each_entry(window, &bridge->dma_ranges) {
+   end = window->res->start - window->offset;
+resv_iova:
+   if (end - start) {
+   lo = iova_pfn(iovad, start);
+   hi = iova_pfn(iovad, end);
+   reserve_iova(iovad, lo, hi);
+   }
+   start = window->res->end - window->offset + 1;
+   /* If window is last entry */
+   if (window->node.next == &bridge->dma_ranges &&
+   end != ~(dma_addr_t)0) {
+   end = ~(dma_addr_t)0;
+   goto resv_iova;
+   }
+   }
 }
 
 static int iova_reserve_iommu_regions(struct device *dev,
-- 
2.7.4
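The gap walk in iova_reserve_pci_windows() above can be sketched as plain C, decoupled from the IOVA allocator. `reserve_gaps()` and `struct gap` below are hypothetical; reserve_iova() is modeled by simply recording each hole, and `top` stands in for `~(dma_addr_t)0`:

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_GAPS 8

struct gap { uint64_t lo, hi; };

/* Given allowed DMA windows in sorted order, collect every hole
 * between them, plus the tail up to the top of the address space. */
size_t reserve_gaps(const struct gap *win, size_t n,
		    uint64_t top, struct gap *out)
{
	uint64_t start = 0;
	size_t ngaps = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (win[i].lo > start)	/* hole before this window */
			out[ngaps++] = (struct gap){ start, win[i].lo - 1 };
		start = win[i].hi + 1;
	}
	if (start <= top)		/* hole after the last window */
		out[ngaps++] = (struct gap){ start, top };
	return ngaps;
}
```

This mirrors the patch's structure: the running `start` is advanced past each window, and the final `goto resv_iova` in the patch corresponds to the extra tail gap reserved after the last entry.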



Re: [PATCH 5/7 v2] MIPS: use the generic uncached segment support in dma-direct

2019-05-01 Thread Paul Burton
Hi Christoph,

On Wed, May 01, 2019 at 03:13:39PM +0200, Christoph Hellwig wrote:
> Stop providing our arch alloc/free hooks and just expose the segment
> offset instead.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/mips/Kconfig  |  1 +
>  arch/mips/include/asm/page.h   |  3 ---
>  arch/mips/jazz/jazzdma.c   |  6 --
>  arch/mips/mm/dma-noncoherent.c | 26 +-
>  4 files changed, 10 insertions(+), 26 deletions(-)

This one looks good to me now, for patches 1 & 5:

  Acked-by: Paul Burton 

Thanks,
Paul


Re: [PATCH v2 1/9] soc/fsl/qman: fixup liodns only on ppc targets

2019-05-01 Thread Li Yang
On Sat, Apr 27, 2019 at 2:14 AM  wrote:
>
> From: Laurentiu Tudor 
>
> ARM SoCs use an SMMU, so the liodn fixup done in the qman driver no
> longer makes sense; it also breaks the ICID settings inherited
> from u-boot. Do the fixups only for PPC targets.
>
> Signed-off-by: Laurentiu Tudor 

Applied for next.  Thanks.

Leo
> ---
>  drivers/soc/fsl/qbman/qman_ccsr.c | 2 +-
>  drivers/soc/fsl/qbman/qman_priv.h | 9 -
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/soc/fsl/qbman/qman_ccsr.c 
> b/drivers/soc/fsl/qbman/qman_ccsr.c
> index 109b38de3176..a6bb43007d03 100644
> --- a/drivers/soc/fsl/qbman/qman_ccsr.c
> +++ b/drivers/soc/fsl/qbman/qman_ccsr.c
> @@ -596,7 +596,7 @@ static int qman_init_ccsr(struct device *dev)
>  }
>
>  #define LIO_CFG_LIODN_MASK 0x0fff
> -void qman_liodn_fixup(u16 channel)
> +void __qman_liodn_fixup(u16 channel)
>  {
> static int done;
> static u32 liodn_offset;
> diff --git a/drivers/soc/fsl/qbman/qman_priv.h 
> b/drivers/soc/fsl/qbman/qman_priv.h
> index 75a8f905f8f7..04515718cfd9 100644
> --- a/drivers/soc/fsl/qbman/qman_priv.h
> +++ b/drivers/soc/fsl/qbman/qman_priv.h
> @@ -193,7 +193,14 @@ extern struct gen_pool *qm_cgralloc; /* CGR ID allocator 
> */
>  u32 qm_get_pools_sdqcr(void);
>
>  int qman_wq_alloc(void);
> -void qman_liodn_fixup(u16 channel);
> +#ifdef CONFIG_FSL_PAMU
> +#define qman_liodn_fixup __qman_liodn_fixup
> +#else
> +static inline void qman_liodn_fixup(u16 channel)
> +{
> +}
> +#endif
> +void __qman_liodn_fixup(u16 channel);
>  void qman_set_sdest(u16 channel, unsigned int cpu_idx);
>
>  struct qman_portal *qman_create_affine_portal(
> --
> 2.17.1
>
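The compile-out pattern the patch uses can be sketched on its own. `HAVE_PAMU` below is a hypothetical stand-in for CONFIG_FSL_PAMU: with it undefined, qman_liodn_fixup() collapses to an empty inline and the real __qman_liodn_fixup() is never invoked, so callers need no #ifdefs of their own.

```c
static int fixup_calls;

/* Real implementation, only reachable when the IOMMU is configured. */
static void __qman_liodn_fixup(unsigned short channel)
{
	(void)channel;
	fixup_calls++;		/* real code would rewrite the LIODN */
}

#ifdef HAVE_PAMU
#define qman_liodn_fixup __qman_liodn_fixup
#else
static inline void qman_liodn_fixup(unsigned short channel)
{
	(void)channel;		/* no-op: nothing to fix up without PAMU */
}
#endif

int get_fixup_calls(void)
{
	/* reference the real function so the sketch compiles warning-free
	 * even when HAVE_PAMU is not defined */
	void (*unused)(unsigned short) = __qman_liodn_fixup;

	(void)unused;
	return fixup_calls;
}
```

The macro alias keeps the call sites identical in both configurations; only the definition behind the name changes.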


Re: [PATCH v2 2/9] soc/fsl/qbman_portals: add APIs to retrieve the probing status

2019-05-01 Thread Li Yang
On Sat, Apr 27, 2019 at 2:14 AM  wrote:
>
> From: Laurentiu Tudor 
>
> Add a couple of new APIs to check the probing status of the required
> cpu bound qman and bman portals:
>  'int bman_portals_probed()' and 'int qman_portals_probed()'.
> They return the following values.
>  *  1 if qman/bman portals were all probed correctly
>  *  0 if qman/bman portals were not yet probed
>  * -1 if probing of qman/bman portals failed
> Portals are considered successfully probed if no error occurred during
> the probing of any of the portals and if enough portals were probed
> to have one available for each cpu.
> The error handling paths were slightly rearranged in order to fit this
> new functionality without being too intrusive.
> Drivers that use qman/bman portal driver services are required to use
> these APIs before calling any functions exported by these drivers or
> otherwise they will crash the kernel.
> First user will be the dpaa1 ethernet driver, coming in a subsequent
> patch.
>
> Signed-off-by: Laurentiu Tudor 

Applied for next.  Thanks.

Leo

> ---
>  drivers/soc/fsl/qbman/bman_portal.c | 20 
>  drivers/soc/fsl/qbman/qman_portal.c | 21 +
>  include/soc/fsl/bman.h  |  8 
>  include/soc/fsl/qman.h  |  9 +
>  4 files changed, 50 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/soc/fsl/qbman/bman_portal.c 
> b/drivers/soc/fsl/qbman/bman_portal.c
> index 2c95cf59f3e7..cf4f10d6f590 100644
> --- a/drivers/soc/fsl/qbman/bman_portal.c
> +++ b/drivers/soc/fsl/qbman/bman_portal.c
> @@ -32,6 +32,7 @@
>
>  static struct bman_portal *affine_bportals[NR_CPUS];
>  static struct cpumask portal_cpus;
> +static int __bman_portals_probed;
>  /* protect bman global registers and global data shared among portals */
>  static DEFINE_SPINLOCK(bman_lock);
>
> @@ -87,6 +88,12 @@ static int bman_online_cpu(unsigned int cpu)
> return 0;
>  }
>
> +int bman_portals_probed(void)
> +{
> +   return __bman_portals_probed;
> +}
> +EXPORT_SYMBOL_GPL(bman_portals_probed);
> +
>  static int bman_portal_probe(struct platform_device *pdev)
>  {
> struct device *dev = &pdev->dev;
> @@ -104,8 +111,10 @@ static int bman_portal_probe(struct platform_device 
> *pdev)
> }
>
> pcfg = devm_kmalloc(dev, sizeof(*pcfg), GFP_KERNEL);
> -   if (!pcfg)
> +   if (!pcfg) {
> +   __bman_portals_probed = -1;
> return -ENOMEM;
> +   }
>
> pcfg->dev = dev;
>
> @@ -113,14 +122,14 @@ static int bman_portal_probe(struct platform_device 
> *pdev)
>  DPAA_PORTAL_CE);
> if (!addr_phys[0]) {
> dev_err(dev, "Can't get %pOF property 'reg::CE'\n", node);
> -   return -ENXIO;
> +   goto err_ioremap1;
> }
>
> addr_phys[1] = platform_get_resource(pdev, IORESOURCE_MEM,
>  DPAA_PORTAL_CI);
> if (!addr_phys[1]) {
> dev_err(dev, "Can't get %pOF property 'reg::CI'\n", node);
> -   return -ENXIO;
> +   goto err_ioremap1;
> }
>
> pcfg->cpu = -1;
> @@ -128,7 +137,7 @@ static int bman_portal_probe(struct platform_device *pdev)
> irq = platform_get_irq(pdev, 0);
> if (irq <= 0) {
> dev_err(dev, "Can't get %pOF IRQ'\n", node);
> -   return -ENXIO;
> +   goto err_ioremap1;
> }
> pcfg->irq = irq;
>
> @@ -150,6 +159,7 @@ static int bman_portal_probe(struct platform_device *pdev)
> spin_lock(&bman_lock);
> cpu = cpumask_next_zero(-1, &portal_cpus);
> if (cpu >= nr_cpu_ids) {
> +   __bman_portals_probed = 1;
> /* unassigned portal, skip init */
> spin_unlock(&bman_lock);
> return 0;
> @@ -175,6 +185,8 @@ static int bman_portal_probe(struct platform_device *pdev)
>  err_ioremap2:
> memunmap(pcfg->addr_virt_ce);
>  err_ioremap1:
> +__bman_portals_probed = -1;
> +
> return -ENXIO;
>  }
>
> diff --git a/drivers/soc/fsl/qbman/qman_portal.c 
> b/drivers/soc/fsl/qbman/qman_portal.c
> index 661c9b234d32..e2186b681d87 100644
> --- a/drivers/soc/fsl/qbman/qman_portal.c
> +++ b/drivers/soc/fsl/qbman/qman_portal.c
> @@ -38,6 +38,7 @@ EXPORT_SYMBOL(qman_dma_portal);
>  #define CONFIG_FSL_DPA_PIRQ_FAST  1
>
>  static struct cpumask portal_cpus;
> +static int __qman_portals_probed;
>  /* protect qman global registers and global data shared among portals */
>  static DEFINE_SPINLOCK(qman_lock);
>
> @@ -220,6 +221,12 @@ static int qman_online_cpu(unsigned int cpu)
> return 0;
>  }
>
> +int qman_portals_probed(void)
> +{
> +   return __qman_portals_probed;
> +}
> +EXPORT_SYMBOL_GPL(qman_portals_probed);
> +
>  static int qman_portal_probe(struct platform_device *pdev)
>  {
> struct device *dev = &pdev->dev;
> @@ -238,8 +245,10 @@ static int 
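The tri-state status described in the patch above can be sketched from the consumer side. `check_portals()` is a hypothetical helper; the integer statuses model the documented return values of qman_portals_probed()/bman_portals_probed() (1 = all probed, 0 = not yet, -1 = failed):

```c
enum probe_action { PROBE_PROCEED, PROBE_DEFER, PROBE_FAIL };

enum probe_action check_portals(int qman_status, int bman_status)
{
	if (qman_status < 0 || bman_status < 0)
		return PROBE_FAIL;	/* give up: portal probing failed */
	if (qman_status == 0 || bman_status == 0)
		return PROBE_DEFER;	/* retry later (-EPROBE_DEFER) */
	return PROBE_PROCEED;		/* safe to use qman/bman services */
}
```

A consumer such as the dpaa1 ethernet driver would defer its own probe until both portal drivers report success, rather than calling into them and crashing.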

Re: [PATCH 4/7] dma-direct: provide generic support for uncached kernel segments

2019-05-01 Thread Paul Burton
Hi Christoph,

On Wed, May 01, 2019 at 07:49:05PM +0200, Christoph Hellwig wrote:
> On Wed, May 01, 2019 at 05:40:34PM +, Paul Burton wrote:
> > > > If it is necessary then as-is this code will clear the allocated memory
> > > > using uncached writes which will be pretty slow. It would be much more
> > > > efficient to perform the memset before arch_dma_prep_coherent() & before
> > > > converting ret to an uncached address.
> > > 
> > > Yes, we could do that.
> > 
> > Great; using cached writes would match the existing MIPS behavior.
> 
> Can you test the stack with the two updated patches and ack them if
> they are fine?  That would allow getting at least the infrastructure
> and mips in for this merge window.

Did you send a v2 of this patch?

If so it hasn't showed up in my inbox, nor on the linux-mips archive on
lore.kernel.org.

Thanks,
Paul


[PATCH 1/2] iommu/vt-d: Set intel_iommu_gfx_mapped correctly

2019-05-01 Thread Lu Baolu
The intel_iommu_gfx_mapped flag is exported by the Intel
IOMMU driver to indicate whether an IOMMU is used for the
graphics device. In a virtualized IOMMU environment (e.g.
QEMU), an include-all IOMMU is used for the graphics device.
This flag is found to be clear even when the IOMMU is used.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Reported-by: Zhenyu Wang 
Fixes: c0771df8d5297 ("intel-iommu: Export a flag indicating that the IOMMU is 
used for iGFX.")
Suggested-by: Kevin Tian 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index e0c0febc6fa5..00ad00193883 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4068,9 +4068,7 @@ static void __init init_no_remapping_devices(void)
 
/* This IOMMU has *only* gfx devices. Either bypass it or
   set the gfx_mapped flag, as appropriate */
-   if (dmar_map_gfx) {
-   intel_iommu_gfx_mapped = 1;
-   } else {
+   if (!dmar_map_gfx) {
drhd->ignored = 1;
for_each_active_dev_scope(drhd->devices,
  drhd->devices_cnt, i, dev)
@@ -4909,6 +4907,9 @@ int __init intel_iommu_init(void)
goto out_free_reserved_range;
}
 
+   if (dmar_map_gfx)
+   intel_iommu_gfx_mapped = 1;
+
init_no_remapping_devices();
 
ret = init_dmars();
-- 
2.17.1



[PATCH 2/2] iommu/vt-d: Make kernel parameter igfx_off work with vIOMMU

2019-05-01 Thread Lu Baolu
The kernel parameter igfx_off is used to disable DMA
remapping for the Intel integrated graphics device. It
was designed for bare-metal cases where a dedicated IOMMU
is used for graphics. This doesn't apply to the virtual
IOMMU case where an include-all IOMMU is used. Make the
kernel parameter work with the virtual IOMMU as well.

Cc: Ashok Raj 
Cc: Jacob Pan 
Suggested-by: Kevin Tian 
Fixes: c0771df8d5297 ("intel-iommu: Export a flag indicating that the IOMMU is 
used for iGFX.")
Signed-off-by: Lu Baolu 
Tested-by: Zhenyu Wang 
---
 drivers/iommu/intel-iommu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 00ad00193883..e078b13ce3d8 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3415,9 +3415,12 @@ static int __init init_dmars(void)
iommu_identity_mapping |= IDENTMAP_ALL;
 
 #ifdef CONFIG_INTEL_IOMMU_BROKEN_GFX_WA
-   iommu_identity_mapping |= IDENTMAP_GFX;
+   dmar_map_gfx = 0;
 #endif
 
+   if (!dmar_map_gfx)
+   iommu_identity_mapping |= IDENTMAP_GFX;
+
check_tylersburg_isoch();
 
if (iommu_identity_mapping) {
-- 
2.17.1



[PATCH 0/2] iommu/vt-d: Small fixes for 5.2-rc1

2019-05-01 Thread Lu Baolu
Hi Joerg,

This includes two small fixes for the virtual IOMMU running
in a QEMU environment. On bare metal, we always have a
dedicated IOMMU for the Intel integrated graphics device,
and some aspects of the driver were designed accordingly.
Unfortunately, in the QEMU environment the virtual IOMMU has
only a single include-all IOMMU engine, and as a result some
interfaces don't work as expected anymore.

Best regards,
Lu Baolu

Lu Baolu (2):
  iommu/vt-d: Set intel_iommu_gfx_mapped correctly
  iommu/vt-d: Make kernel parameter igfx_off work with vIOMMU

 drivers/iommu/intel-iommu.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

-- 
2.17.1



Re: [PATCH v3 02/10] swiotlb: Factor out slot allocation and free

2019-05-01 Thread Lu Baolu

Hi Robin,

On 4/30/19 5:53 PM, Robin Murphy wrote:

On 30/04/2019 03:02, Lu Baolu wrote:

Hi Robin,

On 4/29/19 7:06 PM, Robin Murphy wrote:

On 29/04/2019 06:10, Lu Baolu wrote:

Hi Christoph,

On 4/26/19 11:04 PM, Christoph Hellwig wrote:

On Thu, Apr 25, 2019 at 10:07:19AM +0800, Lu Baolu wrote:

This is not VT-d specific. It's just how generic IOMMU works.

Normally, IOMMU works in paging mode. So if a driver issues DMA with
IOVA  0x0123, IOMMU can remap it with a physical address 
0x0123.

But we should never expect IOMMU to remap 0x0123 with physical
address of 0x. That's the reason why I said that IOMMU 
will not

work there.


Well, with the iommu it doesn't happen.  With swiotlb it obviosuly
can happen, so drivers are fine with it.  Why would that suddenly
become an issue when swiotlb is called from the iommu code?



I would say IOMMU is DMA remapping, not DMA engine. :-)


I'm not sure I really follow the issue here - if we're copying the 
buffer to the bounce page(s) there's no conceptual difference from 
copying it to SWIOTLB slot(s), so there should be no need to worry 
about the original in-page offset.


 From the reply up-thread I guess you're trying to include an 
optimisation to only copy the head and tail of the buffer if it spans 
multiple pages, and directly map the ones in the middle, but AFAICS 
that's going to tie you to also using strict mode for TLB 
maintenance, which may not be a win overall depending on the balance 
between invalidation bandwidth vs. memcpy bandwidth. At least if we 
use standard SWIOTLB logic to always copy the whole thing, we should 
be able to release the bounce pages via the flush queue to allow 
'safe' lazy unmaps.




With respect, even we use the standard SWIOTLB logic, we need to use
the strict mode for TLB maintenance.

Say, some swiotlb slots are used by an untrusted device for bounce page
purposes. When the device driver unmaps the IOVA, the slots are freed but
the mapping is still cached in the IOTLB, hence the untrusted device is
still able to access the slots. Then the slots are allocated to other

devices. This makes it possible for the untrusted device to access
the data buffer of other devices.


Sure, that's indeed how it would work right now - however since the 
bounce pages will be freed and reused by the DMA API layer itself (at 
the same level as the IOVAs) I see no technical reason why we couldn't 
investigate deferred freeing as a future optimisation.


Yes, agreed.

Best regards,
Lu Baolu