[PATCH v1] iommu/vt-d: Switch to bitmap_zalloc()

2019-03-04 Thread Andy Shevchenko
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that, it returns a pointer of bitmap type instead of an opaque void *.
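
For readers unfamiliar with the helpers, a minimal sketch of the two idioms
side by side (the wrapper names are illustrative, not from this patch):

    #include <linux/bitmap.h>
    #include <linux/slab.h>

    /* Old idiom: the size is computed by hand and the result is an
     * opaque void *. */
    static unsigned long *ir_bitmap_old(unsigned int nbits)
    {
            return kcalloc(BITS_TO_LONGS(nbits), sizeof(long), GFP_ATOMIC);
    }

    /* New idiom: the intent is explicit and the result has bitmap type.
     * Free with bitmap_free(), which pairs with bitmap_zalloc(). */
    static unsigned long *ir_bitmap_new(unsigned int nbits)
    {
            return bitmap_zalloc(nbits, GFP_ATOMIC);
    }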

Signed-off-by: Andy Shevchenko 
---
 drivers/iommu/intel_irq_remapping.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 2d74641b7f7b..634d8f059019 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -548,8 +548,7 @@ static int intel_setup_irq_remapping(struct intel_iommu *iommu)
goto out_free_table;
}
 
-   bitmap = kcalloc(BITS_TO_LONGS(INTR_REMAP_TABLE_ENTRIES),
-sizeof(long), GFP_ATOMIC);
+   bitmap = bitmap_zalloc(INTR_REMAP_TABLE_ENTRIES, GFP_ATOMIC);
if (bitmap == NULL) {
pr_err("IR%d: failed to allocate bitmap\n", iommu->seq_id);
goto out_free_pages;
@@ -616,7 +615,7 @@ static int intel_setup_irq_remapping(struct intel_iommu *iommu)
return 0;
 
 out_free_bitmap:
-   kfree(bitmap);
+   bitmap_free(bitmap);
 out_free_pages:
__free_pages(pages, INTR_REMAP_PAGE_ORDER);
 out_free_table:
@@ -640,7 +639,7 @@ static void intel_teardown_irq_remapping(struct intel_iommu *iommu)
}
free_pages((unsigned long)iommu->ir_table->base,
   INTR_REMAP_PAGE_ORDER);
-   kfree(iommu->ir_table->bitmap);
+   bitmap_free(iommu->ir_table->bitmap);
kfree(iommu->ir_table);
iommu->ir_table = NULL;
}
-- 
2.20.1



Re: [PATCH 0/2][RFC] kernel/dma: Support platform-specific default CMA sizes

2019-03-04 Thread Robin Murphy

Hi Thomas,

On 04/03/2019 11:52, Thomas Zimmermann wrote:

This is an RFC patch set that adds support for configuring a default CMA
size, depending on the platform.

At SUSE, our ARM installer image runs on a variety of platforms. One of
these, the Raspberry Pi 3, requires CMA to be allocated in order to provide
graphics output. Currently this can be done with the kernel's cma command-
line parameter, or with a compile-time setting. The cma parameter requires
user intervention during boot. Using a compile-time constant affects all
platforms. So both options are undesirable.

This patch set adds a data structure and look-up code to select a default
CMA size, depending on the device's model name. The kernel's cma parameter
or a compile-time setting takes precedence over the per-device default.
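
A minimal sketch of what such a look-up could look like, assuming
of_machine_is_compatible() as the matching primitive (the table contents
and names here are illustrative, not taken from the patches):

    #include <linux/kernel.h>
    #include <linux/of.h>
    #include <linux/sizes.h>

    struct cma_default_entry {
            const char *compatible; /* machine compatible string */
            phys_addr_t size;       /* default CMA reservation */
    };

    static const struct cma_default_entry cma_defaults[] = {
            { "raspberrypi,3-model-b", SZ_64M },
    };

    static phys_addr_t cma_default_size(phys_addr_t fallback)
    {
            unsigned int i;

            for (i = 0; i < ARRAY_SIZE(cma_defaults); i++)
                    if (of_machine_is_compatible(cma_defaults[i].compatible))
                            return cma_defaults[i].size;

            /* The cma= parameter or compile-time default still wins */
            return fallback;
    }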


Is there some issue with specifying a board-specific default CMA region 
via the existing DT binding[1]? I assume there are no ACPI concerns in 
this particular case.


Robin.

[1] Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt


For the Raspberry Pi 3, the CMA allocator now reserves 64 MiB (out of
1 GiB) by default. We found that this is enough memory to run a graphics
installer or a simple desktop environment under X11.

Thomas Zimmermann (2):
   kernel/dma: Support device-specific CMA defaults
   kernel/dma: Default to 64 MiB of contiguous memory on Raspberry Pi 3

  kernel/dma/contiguous.c | 62 +
  1 file changed, 62 insertions(+)




Re: [RFC PATCH v1 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2

2019-03-04 Thread Jean-Philippe Brucker
Hi Jordan,

On 01/03/2019 19:38, Jordan Crouse wrote:
> Support the new auxiliary domain API for arm-smmuv2 to initialize and
> support multiple pagetables for an SMMU device. Since the smmu-v2 hardware
> doesn't have any built-in support for switching the pagetable base, it is
> left as an exercise to the caller to actually use the pagetable; aux
> domains in the IOMMU driver are only concerned with creating and managing
> the pagetable memory.
> 
> Following is a pseudo code example of how a domain can be created
> 
>  /* Check to see if aux domains are supported */
>  if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
>iommu = iommu_domain_alloc(...);
> 
>if (iommu_aux_attach_device(domain, dev))
>return FAIL;
> 
>   /* Save the base address of the pagetable for use by the driver */
>   iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
>  }

> After this 'domain' can be used like any other iommu domain to map and
> unmap iova addresses in the pagetable. The driver/hardware can be used
> to switch the pagetable according to its own specific implementation.
> 
> Signed-off-by: Jordan Crouse 

[...]
> +static bool arm_smmu_dev_has_feat(struct device *dev,
> + enum iommu_dev_features feat)
> +{
> + /*
> +  * FIXME: Should we do some hardware checking here, like to be sure this
> +  * is a stage 1 and such?
> +  */
> +
> + /* Always allow aux domains */
> + if (feat == IOMMU_DEV_FEAT_AUX)
> + return true;

If possible, we should only return true when the SMMU and GPU are able to
coordinate and switch contexts. Can the feature be identified through an
ID reg or a compatible string?

If we plug a PCIe card with PASID behind a SMMUv2 'classic', and its
driver attempts to enable AUXD support, then this should return false.

> +
> + return false;
> +}
> +
> +/* FIXME: Add stubs for dev_enable_feat and dev_disable_feat? */

Ideally yes. Although SMMUv2 support for aux domains will likely only be
used by the MSM driver, using the same model in all IOMMU drivers would
ease moving things to common code later.
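
Something like the following minimal stubs would match that model (a
sketch only, not taken from the patch):

    static int arm_smmu_dev_enable_feat(struct device *dev,
                                        enum iommu_dev_features feat)
    {
            /* Nothing to switch on in hardware for v2 aux domains */
            return feat == IOMMU_DEV_FEAT_AUX ? 0 : -EINVAL;
    }

    static int arm_smmu_dev_disable_feat(struct device *dev,
                                         enum iommu_dev_features feat)
    {
            return feat == IOMMU_DEV_FEAT_AUX ? 0 : -EINVAL;
    }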

> +
> +/* Set up a new aux domain and create a new pagetable with the same
> + * characteristics as the master
> + */
> +static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
> + struct device *dev)
> +{
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> + struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +
> + smmu_domain->is_aux = true;

The API allows attaching the same domain to one device using
aux_attach_dev() and to another using attach_dev(). For SMMUv3 we'll reject
this, since normal and aux domains are different things (one has PASID
tables, the other doesn't). Is this supported by SMMUv2? Otherwise some
sanity check here might be necessary.

> +
> + /* No power is needed because aux domain doesn't touch the hardware */
> + return arm_smmu_init_domain_context(domain, smmu);
> +}
> +
>  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device 
> *dev)
>  {
>   int ret;
> @@ -1342,6 +1414,8 @@ static int arm_smmu_attach_dev(struct iommu_domain 
> *domain, struct device *dev)
>   return -ENXIO;
>   }
>  
> + /* FIXME: Reject unmanaged domains since those should be aux? */

No, unmanaged domains are also used by VFIO and a couple other drivers
that want to setup IOMMU mappings themselves.

Thanks,
Jean

> +
>   /*
>* FIXME: The arch/arm DMA API code tries to attach devices to its own
>* domains between of_xlate() and add_device() - we have no way to cope
> @@ -1388,7 +1462,13 @@ arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, 
> unsigned long iova)
>  {
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> - struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> + struct arm_smmu_cb *cb;
> +
> + /* quick escape for domains that don't have split pagetables enabled */
> + if (!smmu_domain->pgtbl_ops[1])
> + return smmu_domain->pgtbl_ops[0];
> +
> + cb = &smmu_domain->smmu->cbs[cfg->cbndx];
>  
>   if (iova & cb->split_table_mask)
>   return smmu_domain->pgtbl_ops[1];
> @@ -1700,6 +1780,11 @@ static int arm_smmu_domain_get_attr(struct 
> iommu_domain *domain,
>   !!(smmu_domain->attributes &
>  (1 << DOMAIN_ATTR_SPLIT_TABLES));
>   return 0;
> + case DOMAIN_ATTR_PTBASE:
> + if (!smmu_domain->is_aux)
> + return -ENODEV;
> + *((u64 *)data) = smmu_domain->ttbr0;
> + return 0;
>   default:
>   return -ENODEV;
>   }
> @@ -1810,7 +1895,9 @@ static struct iommu_ops arm_smmu_ops = {
>   .capable= 

Re: [PATCH v2] iommu: io-pgtable: Add ARM Mali midgard MMU page table format

2019-03-04 Thread Robin Murphy

On 04/03/2019 15:32, Rob Herring wrote:

On Fri, Mar 1, 2019 at 10:28 AM Robin Murphy  wrote:


On 27/02/2019 23:22, Rob Herring wrote:

ARM Mali midgard GPU page tables are similar to standard 64-bit stage 1
page tables, but have a few differences. Add a new format type to
represent the format. The input address size is 48-bits and the output
address size is 40-bits (and possibly less?). Note that the later
bifrost GPUs follow the standard 64-bit stage 1 format.

The differences in the format compared to 64-bit stage 1 format are:

The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.

The access flags are not read-only and unprivileged, but read and write.
This is similar to stage 2 entries, but the memory attributes field matches
stage 1 being an index.

The nG bit is not set by the vendor driver. This one didn't seem to matter,
but we'll keep it aligned to the vendor driver.


[...]


+ cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
+ iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
+ if (iop)
+ cfg->arm_lpae_s1_cfg.tcr = 0;


The general design intent is that we return ready-to-go register values
here (much like mmu_get_as_setup() in mali_kbase, it seems), so I think
it's worth adding an arm_mali_cfg to the io_pgtable_cfg union and
transforming the VMSA output into final transtab/transcfg/memattr format
at this point, rather than callers needing to care (e.g. some of those
AS_TRANSTAB_LPAE_* bits look reminiscent of the walk attributes we set
up for a VMSA TCR).


I agree with returning ready-to-go values, but I'm not so sure the
bits are the same. Bits 0-1 are enable/mode bits which are pretty
specific to mali. Bit 2 is 'read inner' for whatever that means.
Perhaps it is read allocate cacheability, but that's a bit different
than RGN bits. Bit 4 is 'share outer'. Maybe it's the same as SH0 if
bit 3 is included, but I have no evidence of that. So I don't think
there's really anything to transform. We just set the bits needed. So
here's what I have in mind:


Right, my Friday-afternoon wording perhaps wasn't the best - by 
"transform" I didn't mean to imply trying to reinterpret the default 
settings we configure for a VMSA TCR, merely applying some 
similarly-appropriate defaults to make a Mali TRANSTAB out of the VMSA TTBR.



iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
if (iop) {
   u64 mair, ttbr;

   /* Copy values as union fields overlap */
   mair = cfg->arm_lpae_s1_cfg.mair[0];
   ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];

   cfg->arm_mali_lpae_cfg.mair = mair;
   cfg->arm_mali_lpae_cfg.ttbr = ttbr;
   cfg->arm_mali_lpae_cfg.ttbr |= ARM_MALI_LPAE_TTBR_READ_INNER |
 ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
}


...pretty much exactly like that (although I'd still prefer to use the 
Mali register names for clarity, and presumably you'll still explicitly 
initialise TRANSCFG too in the real patch).


Robin.


[PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

2019-03-04 Thread James Sewart via iommu
Allowing an IOMMU_DOMAIN_DMA type IOMMU domain to be allocated permits the
default_domain of an iommu_group to be set. This delegates device-domain
relationships to the generic IOMMU code.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 113 +++-
 1 file changed, 84 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..71cd6bbfec05 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -309,6 +309,14 @@ static int hw_pass_through = 1;
 /* si_domain contains mulitple devices */
 #define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1)
 
+/* Domain managed externally, don't cleanup if it isn't attached
+ * to any devices. */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2)
+
+/* Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device */
+#define DOMAIN_FLAG_INITIALISED (1 << 4)
+
 #define for_each_domain_iommu(idx, domain) \
for (idx = 0; idx < g_num_of_iommus; idx++) \
if (domain->iommu_refcnt[idx])
@@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
DOMAIN_FLAG_STATIC_IDENTITY);
 }
 
+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
 static inline int domain_pfn_supported(struct dmar_domain *domain,
   unsigned long pfn)
 {
@@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 
__dmar_remove_one_dev_info(info);
 
-   if (!domain_type_is_vm_or_si(domain)) {
+   if (!domain_managed_externally(domain)) {
/*
 * The domain_exit() function  can't be called under
 * device_domain_lock, as it takes this lock itself.
@@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
return 0;
 }
@@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
 
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
-static int __init si_domain_init(int hw)
+static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
 {
int nid, ret = 0;
 
-   si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
-   if (!si_domain)
-   return -EFAULT;
-
if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
domain_exit(si_domain);
return -EFAULT;
@@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
check_tylersburg_isoch();
 
if (iommu_identity_mapping) {
-   ret = si_domain_init(hw_pass_through);
-   if (ret)
+   si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
+   if (!si_domain) {
+   ret = -EFAULT;
goto free_iommu;
+   }
+   ret = si_domain_init(si_domain, hw_pass_through);
+   if (ret) {
+   domain_exit(si_domain);
+   goto free_iommu;
+   }
}
 
 
@@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,
return 0;
 
dmar_remove_one_dev_info(domain, dev);
-   if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
+   if (!domain_managed_externally(domain) && list_empty(&domain->devices))
domain_exit(domain);
 
return 0;
@@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
return 0;
 }
@@ -5028,28 +5051,54 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 {
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
+   int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
 
-   if (type != IOMMU_DOMAIN_UNMANAGED)
-   return NULL;
-
-   dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
-   if (!dmar_domain) {
-   pr_err("Can't allocate dmar_domain\n");
-   return NULL;
-   }
-   if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
-

[PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

2019-03-04 Thread James Sewart via iommu
The generic IOMMU code will allocate and attach a dma ops domain to each
device that comes online, replacing any lazily allocated domain. This
removes RMRR application at iommu init time, as we won't have a domain
attached to any device at that point; iommu.c will do this after
attaching a device, using create_direct_mappings.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 202 ++--
 1 file changed, 8 insertions(+), 194 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 71cd6bbfec05..282257e2628d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
return domain;
 }
 
-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
-   *(u16 *)opaque = alias;
-   return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
-   struct device_domain_info *info = NULL;
-   struct dmar_domain *domain = NULL;
-   struct intel_iommu *iommu;
-   u16 dma_alias;
-   unsigned long flags;
-   u8 bus, devfn;
-
-   iommu = device_to_iommu(dev, &bus, &devfn);
-   if (!iommu)
-   return NULL;
-
-   if (dev_is_pci(dev)) {
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-   spin_lock_irqsave(&device_domain_lock, flags);
-   info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
- PCI_BUS_NUM(dma_alias),
- dma_alias & 0xff);
-   if (info) {
-   iommu = info->iommu;
-   domain = info->domain;
-   }
-   spin_unlock_irqrestore(&device_domain_lock, flags);
-
-   /* DMA alias already has a domain, use it */
-   if (info)
-   goto out;
-   }
-
-   /* Allocate and initialize new domain for the device */
-   domain = alloc_domain(0);
-   if (!domain)
-   return NULL;
-   if (domain_init(domain, iommu, gaw)) {
-   domain_exit(domain);
-   return NULL;
-   }
-
-out:
-
-   return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
- struct dmar_domain *domain)
-{
-   struct intel_iommu *iommu;
-   struct dmar_domain *tmp;
-   u16 req_id, dma_alias;
-   u8 bus, devfn;
-
-   iommu = device_to_iommu(dev, &bus, &devfn);
-   if (!iommu)
-   return NULL;
-
-   req_id = ((u16)bus << 8) | devfn;
-
-   if (dev_is_pci(dev)) {
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-   /* register PCI DMA alias device */
-   if (req_id != dma_alias) {
-   tmp = dmar_insert_one_dev_info(iommu, 
PCI_BUS_NUM(dma_alias),
-   dma_alias & 0xff, NULL, domain);
-
-   if (!tmp || tmp != domain)
-   return tmp;
-   }
-   }
-
-   tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
-   if (!tmp || tmp != domain)
-   return tmp;
-
-   return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-   struct dmar_domain *domain, *tmp;
-
-   domain = find_domain(dev);
-   if (domain)
-   goto out;
-
-   domain = find_or_alloc_domain(dev, gaw);
-   if (!domain)
-   goto out;
-
-   tmp = set_domain_for_dev(dev, domain);
-   if (!tmp || domain != tmp) {
-   domain_exit(domain);
-   domain = tmp;
-   }
-
-out:
-
-   return domain;
-}
-
 static int iommu_domain_identity_map(struct dmar_domain *domain,
 unsigned long long start,
 unsigned long long end)
@@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
struct dmar_domain *domain;
int ret;
 
-   domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
+   domain = find_domain(dev);
if (!domain)
return -ENOMEM;
 
@@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
 static int __init init_dmars(void)
 {
struct dmar_drhd_unit *drhd;
-   struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
-   struct device *dev;
struct intel_iommu *iommu;
-   int i, ret;
+   int ret;
 
/*
 * for each drhd
@@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
goto free_iommu;
}
}
-   /*
-

Re: [PATCH RFC 1/1] iommu: set the default iommu-dma mode as non-strict

2019-03-04 Thread Robin Murphy

On 02/03/2019 06:12, Leizhen (ThunderTown) wrote:



On 2019/3/1 19:07, Jean-Philippe Brucker wrote:

Hi Leizhen,

On 01/03/2019 04:44, Leizhen (ThunderTown) wrote:



On 2019/2/26 20:36, Hanjun Guo wrote:

Hi Jean,

On 2019/1/31 22:55, Jean-Philippe Brucker wrote:

Hi,

On 31/01/2019 13:52, Zhen Lei wrote:

Currently, many peripherals are faster than before. For example, the top
speed of an older NIC is 10Gb/s, and now it's more than 25Gb/s. But
with iommu page-table mapping enabled, it's hard to reach the top speed
in strict mode, because of frequent map and unmap operations. In order
to keep abreast of the times, I think it's better to set non-strict as
the default.
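
For reference, the relaxation is already available per boot via the
existing command-line knob, e.g.:

    iommu.strict=0

so the proposal here is only about flipping the built-in default.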


Most users won't be aware of this relaxation and will have their system
vulnerable to e.g. thunderbolt hotplug. See for example 4.3 Deferred
Invalidation in
http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2018/MSC/MSC-2018-21.pdf

Hi Jean,

In fact, we discussed the vulnerability of deferred invalidation before
upstreaming the non-strict patches. The attacks may be possible because of
an untrusted device or a mistake in the device driver. And we limited VFIO
to still use strict mode.
As mentioned in the pdf, limiting memory freed with deferred invalidation
to be reused only by the same device can mitigate the vulnerability. But
it's too hard to implement now.
A compromise may be that we only apply non-strict to: (1) dma_free_coherent,
because the memory is controlled by the common DMA module, so we can make
the memory be freed after the global invalidation in the timer handler.
(2) Provide some new APIs related to iommu_unmap_page/sg that defer
invalidation, and have candidate device drivers adopt these APIs if they
want to improve performance. (3) Make sure that only trusted devices and
trusted drivers can apply (1) and (2). For example, the driver must be
built into the kernel Image.


Do we have a notion of untrusted kernel drivers? A userspace driver

It seems impossible to have such a driver. Modules insmod'ed by root
users should be guaranteed by those users themselves.


(VFIO) is untrusted, ok. But a malicious driver loaded into the kernel
address space would have much easier ways to corrupt the system than to
exploit lazy mode...

Yes, so we have no need to consider untrusted drivers.



For (3), I agree that we should at least disallow lazy mode if
pci_dev->untrusted is set. At the moment it means that we require the
strictest IOMMU configuration for external-facing PCI ports, but it can
be extended to blacklist other vulnerable devices or locations.

I plan to add an attribute file for each device, especially for hotplug
devices, and let the root users decide which mode should be used, strict
or non-strict, because they should know whether the hot-plugged device is
trusted or not.


Aside from the problem that without massive implementation changes 
strict/non-strict is at best a per-domain property, not a per-device 
one, I can't see this being particularly practical - surely the whole 
point of a malicious endpoint is that it's going to pretend to be some 
common device for which a 'trusted' kernel driver already exists? If 
you've chosen to trust *any* external device, I think you may as well 
have just set non-strict globally anyway. The effort involved in trying 
to implement super-fine-grained control seems hard to justify.


Robin.



If you do (3) then maybe we don't need (1) and (2), which require a
tonne of work in the DMA and IOMMU layers (but would certainly be nice
to see, since it would also help handle ATS invalidation timeouts)

Thanks,
Jean


So some high-end trusted devices would use non-strict mode, and others
would keep using strict mode. Drivers that want to use non-strict mode
should change to use the new APIs themselves.




Why not keep the policy of secure by default, as we do for
iommu.passthrough? And maybe add something similar to
CONFIG_IOMMU_DEFAULT_PASSTHROUGH? It's easy enough for experts to pass a
command-line argument or change the default config.
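
A hypothetical Kconfig entry in that style (illustrative only; the symbol
name is made up):

    config IOMMU_DEFAULT_LAZY
            bool "Defer IOMMU TLB invalidation on unmap by default"
            depends on IOMMU_API
            help
              Make iommu-dma default to deferred ("non-strict") TLB
              invalidation, equivalent to booting with iommu.strict=0.
              This trades DMA isolation guarantees for unmap performance.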


Sorry for the late reply, it was Chinese New Year, and we had a long
discussion internally. We are fine with adding a Kconfig option, but we
are not sure OS vendors will set it to default y.

OS vendors seem not happy to pass a command-line argument; to be honest,
this is our motivation to enable non-strict as the default. Hope OS
vendors can see this email thread and give some input here.

Thanks
Hanjun














[PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

2019-03-04 Thread James Sewart via iommu
Hey,

This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c. 
This avoids the use of find_or_alloc_domain, whose domain assignment is 
inconsistent with the iommu grouping as determined by pci_device_group.

Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the 
iommu_ops api, allowing the default_domain of an iommu group to be set in 
iommu.c. This domain will be attached to every device that is brought up 
with an iommu group, and the device's reserved regions will be mapped using 
regions returned by get_resv_regions.

In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be 
associated with so we defer full initialisation until 
intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will 
try to map a device's reserved regions before attaching the domain, which 
would cause issues if the domain is not fully initialised. This is 
addressed in patch 1 by moving the mapping to after attaching.

Patch 2 implements function apply_resv_region, used in 
iommu_group_create_direct_mappings to mark the reserved regions as 
non-mappable for the dma_map_ops api.

Patch 4 removes the domain lazy allocation logic. Before this patch the 
lazy allocation logic would not be used as any domain allocated using 
these paths would be replaced when attaching the group default domain. 
Default domain allocation has been tested with and without this patch on 
4.19.

Cheers,
James.

James Sewart (4):
  iommu: Move iommu_group_create_direct_mappings to after device_attach
  iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
  iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
  iommu/vt-d: Remove lazy allocation of domains

 drivers/iommu/intel-iommu.c | 329 
 drivers/iommu/iommu.c   |   4 +-
 2 files changed, 108 insertions(+), 225 deletions(-)

-- 
2.17.1


Re: [PATCH v2] iommu: io-pgtable: Add ARM Mali midgard MMU page table format

2019-03-04 Thread Rob Herring
On Fri, Mar 1, 2019 at 10:28 AM Robin Murphy  wrote:
>
> On 27/02/2019 23:22, Rob Herring wrote:
> > ARM Mali midgard GPU page tables are similar to standard 64-bit stage 1
> > page tables, but have a few differences. Add a new format type to
> > represent the format. The input address size is 48-bits and the output
> > address size is 40-bits (and possibly less?). Note that the later
> > bifrost GPUs follow the standard 64-bit stage 1 format.
> >
> > The differences in the format compared to 64-bit stage 1 format are:
> >
> > The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.
> >
> > The access flags are not read-only and unprivileged, but read and write.
> > This is similar to stage 2 entries, but the memory attributes field matches
> > stage 1 being an index.
> >
> > The nG bit is not set by the vendor driver. This one didn't seem to matter,
> > but we'll keep it aligned to the vendor driver.

[...]

> > + cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
> > + iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
> > + if (iop)
> > + cfg->arm_lpae_s1_cfg.tcr = 0;
>
> The general design intent is that we return ready-to-go register values
> here (much like mmu_get_as_setup() in mali_kbase, it seems), so I think
> it's worth adding an arm_mali_cfg to the io_pgtable_cfg union and
> transforming the VMSA output into final transtab/transcfg/memattr format
> >> at this point, rather than callers needing to care (e.g. some of those
> AS_TRANSTAB_LPAE_* bits look reminiscent of the walk attributes we set
> up for a VMSA TCR).

I agree with returning ready-to-go values, but I'm not so sure the
bits are the same. Bits 0-1 are enable/mode bits which are pretty
specific to mali. Bit 2 is 'read inner' for whatever that means.
Perhaps it is read allocate cacheability, but that's a bit different
than RGN bits. Bit 4 is 'share outer'. Maybe it's the same as SH0 if
bit 3 is included, but I have no evidence of that. So I don't think
there's really anything to transform. We just set the bits needed. So
here's what I have in mind:

iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
if (iop) {
  u64 mair, ttbr;

  /* Copy values as union fields overlap */
  mair = cfg->arm_lpae_s1_cfg.mair[0];
  ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];

  cfg->arm_mali_lpae_cfg.mair = mair;
  cfg->arm_mali_lpae_cfg.ttbr = ttbr;
  cfg->arm_mali_lpae_cfg.ttbr |= ARM_MALI_LPAE_TTBR_READ_INNER |
ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
}

Rob


[PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach

2019-03-04 Thread James Sewart via iommu
If an IOMMU driver needs to know which IOMMU a domain is associated
with in order to initialise that domain, then it will do so in
device_attach, before which it is not safe to call iommu_ops.

Signed-off-by: James Sewart 
---
 drivers/iommu/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db334341..1c6ffbb2d20e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 
dev->iommu_group = group;
 
-   iommu_group_create_direct_mappings(group, dev);
-
	mutex_lock(&group->mutex);
	list_add_tail(&device->list, &group->devices);
if (group->domain)
@@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
if (ret)
goto err_put_group;
 
+   iommu_group_create_direct_mappings(group, dev);
+
/* Notify any listeners about change to group. */
	blocking_notifier_call_chain(&group->notifier,
 IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-- 
2.17.1





[PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges

2019-03-04 Thread James Sewart via iommu
This is used by iommu.c before creating identity mappings for reserved
ranges, to ensure dma-map-ops won't ever remap these addresses.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 78188bf7e90d..8e0a4e2ff77f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device *dev,
}
 }
 
+static void intel_iommu_apply_resv_region(struct device *dev,
+   struct iommu_domain *domain,
+   struct iommu_resv_region *region)
+{
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+   unsigned long start, end;
+
+   start = IOVA_PFN(region->start);
+   end   = IOVA_PFN(region->start + region->length - 1);
+
+   WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
 {
@@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = {
.remove_device  = intel_iommu_remove_device,
.get_resv_regions   = intel_iommu_get_resv_regions,
.put_resv_regions   = intel_iommu_put_resv_regions,
+   .apply_resv_region  = intel_iommu_apply_resv_region,
.device_group   = pci_device_group,
.pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
 };
-- 
2.17.1



Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

2019-03-04 Thread Lu Baolu

Hi,

On 3/4/19 11:46 PM, James Sewart wrote:

Allowing an IOMMU_DOMAIN_DMA type IOMMU domain to be allocated permits the
default_domain of an iommu_group to be set. This delegates device-domain
relationships to the generic IOMMU code.

Signed-off-by: James Sewart 
---
  drivers/iommu/intel-iommu.c | 113 +++-
  1 file changed, 84 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..71cd6bbfec05 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -309,6 +309,14 @@ static int hw_pass_through = 1;
  /* si_domain contains mulitple devices */
  #define DOMAIN_FLAG_STATIC_IDENTITY   (1 << 1)
  
+/* Domain managed externally, don't cleanup if it isn't attached

+ * to any devices. */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2)
+
+/* Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device */
+#define DOMAIN_FLAG_INITIALISED (1 << 4)


Why do you skip bit 3?


+
  #define for_each_domain_iommu(idx, domain)\
for (idx = 0; idx < g_num_of_iommus; idx++)  \
if (domain->iommu_refcnt[idx])
@@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct 
dmar_domain *domain)
DOMAIN_FLAG_STATIC_IDENTITY);
  }
  
+static inline int domain_managed_externally(struct dmar_domain *domain)

+{
+   return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
  static inline int domain_pfn_supported(struct dmar_domain *domain,
   unsigned long pfn)
  {
@@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
  
  		__dmar_remove_one_dev_info(info);
  
-		if (!domain_type_is_vm_or_si(domain)) {

+   if (!domain_managed_externally(domain)) {
/*
 * The domain_exit() function  can't be called under
 * device_domain_lock, as it takes this lock itself.
@@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct 
intel_iommu *iommu,
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
return 0;
  }
@@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
  
  static int md_domain_init(struct dmar_domain *domain, int guest_width);
  
-static int __init si_domain_init(int hw)

+static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
  {
int nid, ret = 0;
  
-	si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);

-   if (!si_domain)
-   return -EFAULT;
-
if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
domain_exit(si_domain);
return -EFAULT;
@@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
check_tylersburg_isoch();
  
  	if (iommu_identity_mapping) {

-   ret = si_domain_init(hw_pass_through);
-   if (ret)
+   si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);


Where will this si_domain be used? Do we still need to keep the global
si_domain?


+   if (!si_domain) {
+   ret = -EFAULT;
goto free_iommu;
+   }
+   ret = si_domain_init(si_domain, hw_pass_through);
+   if (ret) {
+   domain_exit(si_domain);
+   goto free_iommu;
+   }
}
  
  
@@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,

return 0;
  
  	dmar_remove_one_dev_info(domain, dev);

-   if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
+   if (!domain_managed_externally(domain) && list_empty(&domain->devices))
domain_exit(domain);
  
  	return 0;

@@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int 
guest_width)
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
return 0;
  }
@@ -5028,28 +5051,54 @@ static struct iommu_domain 
*intel_iommu_domain_alloc(unsigned type)
  {
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
+   int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
  
-	if (type != IOMMU_DOMAIN_UNMANAGED)

-   return NULL;
-
-   dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
-   if (!dmar_domain) {
-   pr_err("Can't allocate 

Re: [PATCH] Revert "swiotlb: remove SWIOTLB_MAP_ERROR"

2019-03-04 Thread Robin Murphy

Hi Arnd,

On 2019-03-04 7:59 pm, Arnd Bergmann wrote:

This reverts commit b907e20508d0 ("swiotlb: remove SWIOTLB_MAP_ERROR"), which
introduced an overflow warning in configurations that have a larger
dma_addr_t than phys_addr_t:

In file included from include/linux/dma-direct.h:5,
  from kernel/dma/swiotlb.c:23:
kernel/dma/swiotlb.c: In function 'swiotlb_tbl_map_single':
include/linux/dma-mapping.h:136:28: error: conversion from 'long long unsigned 
int' to 'phys_addr_t' {aka 'unsigned int'} changes value from 
'18446744073709551615' to '4294967295' [-Werror=overflow]
  #define DMA_MAPPING_ERROR  (~(dma_addr_t)0)
 ^
kernel/dma/swiotlb.c:544:9: note: in expansion of macro 'DMA_MAPPING_ERROR'
   return DMA_MAPPING_ERROR;

The configuration that caused this is on 32-bit ARM, where the DMA address
space depends on the enabled hardware platforms, while the physical
address space depends on the type of MMU chosen (classic vs LPAE).


Are these real platforms, or random configs? Realistically I don't see a 
great deal of need to support DMA_ADDR_T_64BIT for non-LPAE. 
Particularly in this case since AFAIK the only selector of SWIOTLB on 
Arm is Xen, and that by definition is never going to be useful on 
non-LPAE hardware.


Fair enough that we still don't want even randconfigs generating 
warnings, though. As long as this change doesn't let SWIOTLB_MAP_ERROR 
leak out to logic expecting DMA_MAPPING_ERROR - which does appear to be 
the case - and is also still OK for the opposite weirdness of 
PHYS_ADDR_T_64BIT && !DMA_ADDR_T_64BIT, then I think it's reasonable.


Robin.


I tried a couple of alternative approaches, but the previous version
seems as good as any other, so I went back to that.

Fixes: b907e20508d0 ("swiotlb: remove SWIOTLB_MAP_ERROR")
Signed-off-by: Arnd Bergmann 
---
  drivers/xen/swiotlb-xen.c | 4 ++--
  include/linux/swiotlb.h   | 3 +++
  kernel/dma/swiotlb.c  | 4 ++--
  3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 877baf2a94f4..57a98279bf4f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -406,7 +406,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, 
struct page *page,
  
  	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,

 attrs);
-   if (map == DMA_MAPPING_ERROR)
+   if (map == SWIOTLB_MAP_ERROR)
return DMA_MAPPING_ERROR;
  
  	dev_addr = xen_phys_to_bus(map);

@@ -557,7 +557,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct 
scatterlist *sgl,
 sg_phys(sg),
 sg->length,
 dir, attrs);
-   if (map == DMA_MAPPING_ERROR) {
+   if (map == SWIOTLB_MAP_ERROR) {
dev_warn(hwdev, "swiotlb buffer is full\n");
/* Don't panic here, we expect map_sg users
   to do proper error handling. */
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 361f62bb4a8e..a65a36551f58 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -44,6 +44,9 @@ enum dma_sync_target {
SYNC_FOR_DEVICE = 1,
  };
  
+/* define the last possible byte of physical address space as a mapping error */

+#define SWIOTLB_MAP_ERROR (~(phys_addr_t)0x0)
+
  extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
  dma_addr_t tbl_dma_addr,
  phys_addr_t phys, size_t size,
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 12059b78b631..922880b84387 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -541,7 +541,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
	spin_unlock_irqrestore(&io_tlb_lock, flags);
if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
size);
-   return DMA_MAPPING_ERROR;
+   return SWIOTLB_MAP_ERROR;
  found:
io_tlb_used += nslots;
	spin_unlock_irqrestore(&io_tlb_lock, flags);
@@ -659,7 +659,7 @@ bool swiotlb_map(struct device *dev, phys_addr_t *phys, 
dma_addr_t *dma_addr,
/* Oh well, have to allocate and map a bounce buffer. */
*phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
*phys, size, dir, attrs);
-   if (*phys == DMA_MAPPING_ERROR)
+   if (*phys == SWIOTLB_MAP_ERROR)
return false;
  
  	/* Ensure that the address returned is DMA'ble */




Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

2019-03-04 Thread Lu Baolu

Hi,

It's hard for me to understand why we remove the RMRR-related
code in this patch.

And, now we have two places holding a domain for a device: the group and
dev->info. We could consider optimizing the use of the per-device iommu
arch data. This should be later work anyway.

More comments inline.

On 3/4/19 11:47 PM, James Sewart wrote:

The generic IOMMU code will allocate and attach a dma ops domain to each
device that comes online, replacing any lazily allocated domain. This
removes RMRR application at iommu init time, as we won't have a domain
attached to any device at that point; iommu.c will do this after
attaching a device, using create_direct_mappings.

Signed-off-by: James Sewart 
---
  drivers/iommu/intel-iommu.c | 202 ++--
  1 file changed, 8 insertions(+), 194 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 71cd6bbfec05..282257e2628d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2595,118 +2595,6 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
return domain;
  }
  
-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)

-{
-   *(u16 *)opaque = alias;
-   return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
-   struct device_domain_info *info = NULL;
-   struct dmar_domain *domain = NULL;
-   struct intel_iommu *iommu;
-   u16 dma_alias;
-   unsigned long flags;
-   u8 bus, devfn;
-
-   iommu = device_to_iommu(dev, &bus, &devfn);
-   if (!iommu)
-   return NULL;
-
-   if (dev_is_pci(dev)) {
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-   spin_lock_irqsave(&device_domain_lock, flags);
-   info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
- PCI_BUS_NUM(dma_alias),
- dma_alias & 0xff);
-   if (info) {
-   iommu = info->iommu;
-   domain = info->domain;
-   }
-   spin_unlock_irqrestore(&device_domain_lock, flags);
-
-   /* DMA alias already has a domain, use it */
-   if (info)
-   goto out;
-   }
-
-   /* Allocate and initialize new domain for the device */
-   domain = alloc_domain(0);
-   if (!domain)
-   return NULL;
-   if (domain_init(domain, iommu, gaw)) {
-   domain_exit(domain);
-   return NULL;
-   }
-
-out:
-
-   return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
- struct dmar_domain *domain)
-{
-   struct intel_iommu *iommu;
-   struct dmar_domain *tmp;
-   u16 req_id, dma_alias;
-   u8 bus, devfn;
-
-   iommu = device_to_iommu(dev, &bus, &devfn);
-   if (!iommu)
-   return NULL;
-
-   req_id = ((u16)bus << 8) | devfn;
-
-   if (dev_is_pci(dev)) {
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-   /* register PCI DMA alias device */
-   if (req_id != dma_alias) {
-   tmp = dmar_insert_one_dev_info(iommu, 
PCI_BUS_NUM(dma_alias),
-   dma_alias & 0xff, NULL, domain);
-
-   if (!tmp || tmp != domain)
-   return tmp;
-   }
-   }
-
-   tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
-   if (!tmp || tmp != domain)
-   return tmp;
-
-   return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-   struct dmar_domain *domain, *tmp;
-
-   domain = find_domain(dev);
-   if (domain)
-   goto out;
-
-   domain = find_or_alloc_domain(dev, gaw);
-   if (!domain)
-   goto out;
-
-   tmp = set_domain_for_dev(dev, domain);
-   if (!tmp || domain != tmp) {
-   domain_exit(domain);
-   domain = tmp;
-   }
-
-out:
-
-   return domain;
-}
-
  static int iommu_domain_identity_map(struct dmar_domain *domain,
 unsigned long long start,
 unsigned long long end)
@@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
struct dmar_domain *domain;
int ret;
  
-	domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);

+   domain = find_domain(dev);
if (!domain)
return -ENOMEM;
  
@@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)

  static int __init init_dmars(void)
  {
struct dmar_drhd_unit *drhd;
-   struct 

Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

2019-03-04 Thread Lu Baolu

Hi James,

Very glad to see this. Thank you!

On 3/4/19 11:41 PM, James Sewart wrote:

Hey,

This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
This avoids the use of find_or_alloc_domain, whose domain assignment is
inconsistent with the iommu grouping as determined by pci_device_group.


Is this a bug fix or an improvement? What real issue will it cause
if we go without this patch set?

Best regards,
Lu Baolu



Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the
iommu_ops api, allowing the default_domain of an iommu group to be set in
iommu.c. This domain will be attached to every device that is brought up
with an iommu group, and the device's reserved regions will be mapped using
regions returned by get_resv_regions.

In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be
associated with so we defer full initialisation until
intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will
try to map a device's reserved regions before attaching the domain, which
would cause issues if the domain is not fully initialised. This is
addressed in patch 1 by moving the mapping to after attaching.

Patch 2 implements function apply_resv_region, used in
iommu_group_create_direct_mappings to mark the reserved regions as
non-mappable for the dma_map_ops api.

Patch 4 removes the domain lazy allocation logic. Before this patch the
lazy allocation logic would not be used as any domain allocated using
these paths would be replaced when attaching the group default domain.
Default domain allocation has been tested with and without this patch on
4.19.

Cheers,
James.

James Sewart (4):
   iommu: Move iommu_group_create_direct_mappings to after device_attach
   iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
   iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
   iommu/vt-d: Remove lazy allocation of domains

  drivers/iommu/intel-iommu.c | 329 
  drivers/iommu/iommu.c   |   4 +-
  2 files changed, 108 insertions(+), 225 deletions(-)



Re: [PATCH v2] iommu: io-pgtable: Add ARM Mali midgard MMU page table format

2019-03-04 Thread Rob Herring
On Mon, Mar 4, 2019 at 11:31 AM Robin Murphy  wrote:
>
> On 04/03/2019 15:32, Rob Herring wrote:
> > On Fri, Mar 1, 2019 at 10:28 AM Robin Murphy  wrote:
> >>
> >> On 27/02/2019 23:22, Rob Herring wrote:
> >>> ARM Mali midgard GPU page tables are similar to standard 64-bit stage 1
> >>> page tables, but have a few differences. Add a new format type to
> >>> represent the format. The input address size is 48-bits and the output
> >>> address size is 40-bits (and possibly less?). Note that the later
> >>> bifrost GPUs follow the standard 64-bit stage 1 format.
> >>>
> >>> The differences in the format compared to 64-bit stage 1 format are:
> >>>
> >>> The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.
> >>>
> >>> The access flags are not read-only and unprivileged, but read and write.
> >>> This is similar to stage 2 entries, but the memory attributes field 
> >>> matches
> >>> stage 1 being an index.
> >>>
> >>> The nG bit is not set by the vendor driver. This one didn't seem to 
> >>> matter,
> >>> but we'll keep it aligned to the vendor driver.
> >
> > [...]
> >
> >>> + cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
> >>> + iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
> >>> + if (iop)
> >>> + cfg->arm_lpae_s1_cfg.tcr = 0;
> >>
> >> The general design intent is that we return ready-to-go register values
> >> here (much like mmu_get_as_setup() in mali_kbase, it seems), so I think
> >> it's worth adding an arm_mali_cfg to the io_pgtable_cfg union and
> >> transforming the VMSA output into final transtab/transcfg/memattr format
> >> at this point, rather than callers needing to care (e.g. some of those
> >> AS_TRANSTAB_LPAE_* bits look reminiscent of the walk attributes we set
> >> up for a VMSA TCR).
> >
> > I agree with returning ready-to-go values, but I'm not so sure the
> > bits are the same. Bits 0-1 are enable/mode bits which are pretty
> > specific to mali. Bit 2 is 'read inner' for whatever that means.
> > Perhaps it is read allocate cacheability, but that's a bit different
> > than RGN bits. Bit 4 is 'share outer'. Maybe it's the same as SH0 if
> > bit 3 is included, but I have no evidence of that. So I don't think
> > there's really anything to transform. We just set the bits needed. So
> > here's what I have in mind:
>
> Right, my Friday-afternoon wording perhaps wasn't the best - by
> "transform" I didn't mean to imply trying to reinterpret the default
> settings we configure for a VMSA TCR, merely applying some
> similarly-appropriate defaults to make a Mali TRANSTAB out of the VMSA TTBR.
>
> > iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
> > if (iop) {
> >u64 mair, ttbr;
> >
> >/* Copy values as union fields overlap */
> >mair = cfg->arm_lpae_s1_cfg.mair[0];
> >ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
> >
> >cfg->arm_mali_lpae_cfg.mair = mair;
> >cfg->arm_mali_lpae_cfg.ttbr = ttbr;
> >cfg->arm_mali_lpae_cfg.ttbr |= ARM_MALI_LPAE_TTBR_READ_INNER |
> >  ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
> > }
>
> ...pretty much exactly like that (although I'd still prefer to use the
> Mali register names for clarity, and presumably you'll still explicitly
> initialise TRANSCFG too in the real patch).

No, TRANSCFG is only on Bifrost. While the page table format seems to
be standard stage 1 64-bit, the registers are still different from
anything else. So I guess we'll need yet another format definition
when we get there. Also, we may still have to massage the register
values outside of this code. It's not going to know the
'system_coherency' value the kbase driver uses (and I'm not sure how
we want to express that upstream either).

Rob


[PATCH] Revert "swiotlb: remove SWIOTLB_MAP_ERROR"

2019-03-04 Thread Arnd Bergmann
This reverts commit b907e20508d0 ("swiotlb: remove SWIOTLB_MAP_ERROR"), which
introduced an overflow warning in configurations that have a larger
dma_addr_t than phys_addr_t:

In file included from include/linux/dma-direct.h:5,
 from kernel/dma/swiotlb.c:23:
kernel/dma/swiotlb.c: In function 'swiotlb_tbl_map_single':
include/linux/dma-mapping.h:136:28: error: conversion from 'long long unsigned 
int' to 'phys_addr_t' {aka 'unsigned int'} changes value from 
'18446744073709551615' to '4294967295' [-Werror=overflow]
 #define DMA_MAPPING_ERROR  (~(dma_addr_t)0)
^
kernel/dma/swiotlb.c:544:9: note: in expansion of macro 'DMA_MAPPING_ERROR'
  return DMA_MAPPING_ERROR;

The configuration that caused this is on 32-bit ARM, where the DMA address
space depends on the enabled hardware platforms, while the physical
address space depends on the type of MMU chosen (classic vs LPAE).

I tried a couple of alternative approaches, but the previous version
seems as good as any other, so I went back to that.

Fixes: b907e20508d0 ("swiotlb: remove SWIOTLB_MAP_ERROR")
Signed-off-by: Arnd Bergmann 
---
 drivers/xen/swiotlb-xen.c | 4 ++--
 include/linux/swiotlb.h   | 3 +++
 kernel/dma/swiotlb.c  | 4 ++--
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 877baf2a94f4..57a98279bf4f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -406,7 +406,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 
map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
 attrs);
-   if (map == DMA_MAPPING_ERROR)
+   if (map == SWIOTLB_MAP_ERROR)
return DMA_MAPPING_ERROR;
 
dev_addr = xen_phys_to_bus(map);
@@ -557,7 +557,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
 sg_phys(sg),
 sg->length,
 dir, attrs);
-   if (map == DMA_MAPPING_ERROR) {
+   if (map == SWIOTLB_MAP_ERROR) {
dev_warn(hwdev, "swiotlb buffer is full\n");
/* Don't panic here, we expect map_sg users
   to do proper error handling. */
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 361f62bb4a8e..a65a36551f58 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -44,6 +44,9 @@ enum dma_sync_target {
SYNC_FOR_DEVICE = 1,
 };
 
+/* define the last possible byte of physical address space as a mapping error */
+#define SWIOTLB_MAP_ERROR (~(phys_addr_t)0x0)
+
 extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
  dma_addr_t tbl_dma_addr,
  phys_addr_t phys, size_t size,
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 12059b78b631..922880b84387 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -541,7 +541,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
	spin_unlock_irqrestore(&io_tlb_lock, flags);
if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
size);
-   return DMA_MAPPING_ERROR;
+   return SWIOTLB_MAP_ERROR;
 found:
io_tlb_used += nslots;
	spin_unlock_irqrestore(&io_tlb_lock, flags);
@@ -659,7 +659,7 @@ bool swiotlb_map(struct device *dev, phys_addr_t *phys, dma_addr_t *dma_addr,
/* Oh well, have to allocate and map a bounce buffer. */
*phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
*phys, size, dir, attrs);
-   if (*phys == DMA_MAPPING_ERROR)
+   if (*phys == SWIOTLB_MAP_ERROR)
return false;
 
/* Ensure that the address returned is DMA'ble */
-- 
2.20.0



Re: [PATCH] Revert "swiotlb: remove SWIOTLB_MAP_ERROR"

2019-03-04 Thread Konrad Rzeszutek Wilk
On Mon, Mar 04, 2019 at 08:59:03PM +0100, Arnd Bergmann wrote:
> This reverts commit b907e20508d0 ("swiotlb: remove SWIOTLB_MAP_ERROR"), which
> introduced an overflow warning in configurations that have a larger
> dma_addr_t than phys_addr_t:
> 
> In file included from include/linux/dma-direct.h:5,
>  from kernel/dma/swiotlb.c:23:
> kernel/dma/swiotlb.c: In function 'swiotlb_tbl_map_single':
> include/linux/dma-mapping.h:136:28: error: conversion from 'long long 
> unsigned int' to 'phys_addr_t' {aka 'unsigned int'} changes value from 
> '18446744073709551615' to '4294967295' [-Werror=overflow]
>  #define DMA_MAPPING_ERROR  (~(dma_addr_t)0)
> ^
> kernel/dma/swiotlb.c:544:9: note: in expansion of macro 'DMA_MAPPING_ERROR'
>   return DMA_MAPPING_ERROR;
> 
> The configuration that caused this is on 32-bit ARM, where the DMA address
> space depends on the enabled hardware platforms, while the physical
> address space depends on the type of MMU chosen (classic vs LPAE).
> 
> I tried a couple of alternative approaches, but the previous version
> seems as good as any other, so I went back to that.

That is really a bummer.

What about making the phys_addr_t and dma_addr_t have the same
width with some magic #ifdef hackery?
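
Presumably something along these lines (a sketch of the idea only, not a
tested proposal):

    #include <linux/types.h>

    /*
     * Do the error comparison in a type at least as wide as both
     * phys_addr_t and dma_addr_t, so that (~(swiotlb_addr_t)0) never
     * gets truncated on its way into a narrower type.
     */
    #if defined(CONFIG_ARCH_DMA_ADDR_T_64BIT) && !defined(CONFIG_PHYS_ADDR_T_64BIT)
    typedef u64 swiotlb_addr_t;
    #else
    typedef phys_addr_t swiotlb_addr_t;
    #endif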

> 
> Fixes: b907e20508d0 ("swiotlb: remove SWIOTLB_MAP_ERROR")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/xen/swiotlb-xen.c | 4 ++--
>  include/linux/swiotlb.h   | 3 +++
>  kernel/dma/swiotlb.c  | 4 ++--
>  3 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 877baf2a94f4..57a98279bf4f 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -406,7 +406,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device 
> *dev, struct page *page,
>  
>   map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
>attrs);
> - if (map == DMA_MAPPING_ERROR)
> + if (map == SWIOTLB_MAP_ERROR)
>   return DMA_MAPPING_ERROR;
>  
>   dev_addr = xen_phys_to_bus(map);
> @@ -557,7 +557,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct 
> scatterlist *sgl,
>sg_phys(sg),
>sg->length,
>dir, attrs);
> - if (map == DMA_MAPPING_ERROR) {
> + if (map == SWIOTLB_MAP_ERROR) {
>   dev_warn(hwdev, "swiotlb buffer is full\n");
>   /* Don't panic here, we expect map_sg users
>  to do proper error handling. */
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 361f62bb4a8e..a65a36551f58 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -44,6 +44,9 @@ enum dma_sync_target {
>   SYNC_FOR_DEVICE = 1,
>  };
>  
> +/* define the last possible byte of physical address space as a mapping 
> error */
> +#define SWIOTLB_MAP_ERROR (~(phys_addr_t)0x0)
> +
>  extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
> dma_addr_t tbl_dma_addr,
> phys_addr_t phys, size_t size,
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 12059b78b631..922880b84387 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -541,7 +541,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>   spin_unlock_irqrestore(&io_tlb_lock, flags);
>   if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
>   dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
> size);
> - return DMA_MAPPING_ERROR;
> + return SWIOTLB_MAP_ERROR;
>  found:
>   io_tlb_used += nslots;
>   spin_unlock_irqrestore(&io_tlb_lock, flags);
> @@ -659,7 +659,7 @@ bool swiotlb_map(struct device *dev, phys_addr_t *phys, 
> dma_addr_t *dma_addr,
>   /* Oh well, have to allocate and map a bounce buffer. */
>   *phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
>   *phys, size, dir, attrs);
> - if (*phys == DMA_MAPPING_ERROR)
> + if (*phys == SWIOTLB_MAP_ERROR)
>   return false;
>  
>   /* Ensure that the address returned is DMA'ble */
> -- 
> 2.20.0
> 