Re: [PATCH v3 4/7] PCI: OF: Initialize dev->fwnode appropriately

2018-10-12 Thread Bjorn Helgaas
On Fri, Oct 12, 2018 at 03:59:14PM +0100, Jean-Philippe Brucker wrote:
> For PCI devices that have an OF node, set the fwnode as well. This way
> drivers that rely on fwnode don't need the special case described by
> commit f94277af03ea ("of/platform: Initialise dev->fwnode appropriately").
> 
> Signed-off-by: Jean-Philippe Brucker 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/of.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 2f5015bdb256..8026417fab38 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -21,12 +21,15 @@ void pci_set_of_node(struct pci_dev *dev)
>   return;
>   dev->dev.of_node = of_pci_find_child_device(dev->bus->dev.of_node,
>   dev->devfn);
> + if (dev->dev.of_node)
> + dev->dev.fwnode = &dev->dev.of_node->fwnode;
>  }
>  
>  void pci_release_of_node(struct pci_dev *dev)
>  {
>   of_node_put(dev->dev.of_node);
>   dev->dev.of_node = NULL;
> + dev->dev.fwnode = NULL;
>  }
>  
>  void pci_set_bus_of_node(struct pci_bus *bus)
> @@ -35,12 +38,16 @@ void pci_set_bus_of_node(struct pci_bus *bus)
>   bus->dev.of_node = pcibios_get_phb_of_node(bus);
>   else
>   bus->dev.of_node = of_node_get(bus->self->dev.of_node);
> +
> + if (bus->dev.of_node)
> + bus->dev.fwnode = &bus->dev.of_node->fwnode;
>  }
>  
>  void pci_release_bus_of_node(struct pci_bus *bus)
>  {
>   of_node_put(bus->dev.of_node);
>   bus->dev.of_node = NULL;
> + bus->dev.fwnode = NULL;
>  }
>  
>  struct device_node * __weak pcibios_get_phb_of_node(struct pci_bus *bus)
> -- 
> 2.19.1
> 


Re: [PATCH v3 3/7] PCI: OF: Allow endpoints to bypass the iommu

2018-10-12 Thread Bjorn Helgaas
s/iommu/IOMMU/ in subject

On Fri, Oct 12, 2018 at 03:59:13PM +0100, Jean-Philippe Brucker wrote:
> Using the iommu-map binding, endpoints in a given PCI domain can be
> managed by different IOMMUs. Some virtual machines may allow a subset of
> endpoints to bypass the IOMMU. In some case the IOMMU itself is presented

s/case/cases/

> as a PCI endpoint (e.g. AMD IOMMU and virtio-iommu). Currently, when a
> PCI root complex has an iommu-map property, the driver requires all
> endpoints to be described by the property. Allow the iommu-map property to
> have gaps.

I'm not an IOMMU or virtio expert, so it's not obvious to me why it is
safe to allow devices to bypass the IOMMU.  Does this mean a typo in
iommu-map could inadvertently allow devices to bypass it?  Should we
indicate something in dmesg (and/or sysfs) about devices that bypass
it?
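
For illustration, here is a sketch of how the bypass fallthrough added by this
patch could note the situation in dmesg (the pr_info() and its wording are
hypothetical, not part of the patch; the variables are those used in
of_pci_map_rid()):

	/* Hedged sketch: log endpoints that end up bypassing translation */
	if (id_out) {
		pr_info("%pOF: no %s translation for rid 0x%x, bypassing\n",
			np, map_name, rid);
		*id_out = rid;
	}
	return 0;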

> Relaxing of_pci_map_rid also allows the msi-map property to have gaps,

s/of_pci_map_rid/of_pci_map_rid()/

> which is invalid since MSIs always reach an MSI controller. Thankfully
> Linux will error out later, when attempting to find an MSI domain for the
> device.

Not clear to me what "error out" means here.  In a userspace program,
I would infer that the program exits with an error message, but I
doubt you mean that Linux exits.

> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/pci/of.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 1836b8ddf292..2f5015bdb256 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -451,9 +451,10 @@ int of_pci_map_rid(struct device_node *np, u32 rid,
>   return 0;
>   }
>  
> - pr_err("%pOF: Invalid %s translation - no match for rid 0x%x on %pOF\n",
> - np, map_name, rid, target && *target ? *target : NULL);
> - return -EFAULT;
> + /* Bypasses translation */
> + if (id_out)
> + *id_out = rid;
> + return 0;
>  }
>  
>  #if IS_ENABLED(CONFIG_OF_IRQ)
> -- 
> 2.19.1
> 


Re: [PATCH v3 0/7] Add virtio-iommu driver

2018-10-12 Thread Jean-Philippe Brucker
On 12/10/2018 18:00, Michael S. Tsirkin wrote:
> This all looks good to me. Minor nits:
> - I think DEBUG mode is best just removed for now
> - Slightly wrong patch splitup causing a misaligned structure
>   in uapi until all patches are applied.

Thanks a lot for the review, I'll fix these up and send a new version.

> You should Cc Bjorn on the pci change - I'd like to see his ack on it
> being merged through my tree.

Argh, I don't know how I missed him. However, patches 1-4 are device tree
changes and need acks from Rob or Mark (on Cc).

> And pls Cc the virtio-dev list on any virtio uapi changes.
> 
> At a feature level I have some ideas for more features we
> could add, but for now I think I'll put this version in -next
> while you iron out the above wrinkles. Hope you can make the
> merge window.
Thanks, I also have some work lined up for hardware acceleration and
shared address spaces.

Jean


Re: [PATCH v3 5/7] iommu: Add virtio-iommu driver

2018-10-12 Thread Jean-Philippe Brucker
On 12/10/2018 17:35, Michael S. Tsirkin wrote:
>> +list_del(&req->list);
>> +kfree(req);
> 
> So with DEBUG set, this will actually free memory that device still
> DMA's into. Hardly pretty. I think you want to mark device broken,
> queue the request and then wait for device to be reset.

Ok, let's remove DEBUG for the moment.

>> +static int viommu_probe(struct virtio_device *vdev)
>> +{
>> +struct device *parent_dev = vdev->dev.parent;
>> +struct viommu_dev *viommu = NULL;
>> +struct device *dev = &vdev->dev;
>> +u64 input_start = 0;
>> +u64 input_end = -1UL;
>> +int ret;
>> +
>> +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
>> +return -ENODEV;
> 
> I'm a bit confused about what will happen if this device
> happens to be behind an iommu itself.
>
> If we can't handle that, should we clear PLATFORM_IOMMU
> e.g. like the balloon does?

I think the DMA API can handle this device doing DMA through another
IOMMU. I haven't tested this case because it is very unusual (IOMMUs
themselves generally access the physical address space) but I don't see
anything preventing it.

What we can't handle is a device performing DMA through two
daisy-chained IOMMUs, but clearing PLATFORM_IOMMU on the first one
wouldn't make things work in that case; we'd need some core changes.

>> +struct virtio_iommu_config {
>> +/* Supported page sizes */
>> +__u64   page_size_mask;
>> +/* Supported IOVA range */
>> +struct virtio_iommu_range {
> 
> I'd rather we moved the definition outside even though gcc allows it -
> some old userspace compilers might not.
> 
>> +__u64   start;
>> +__u64   end;
>> +} input_range;
>> +/* Max domain ID size */
>> +__u8domain_bits;
> 
> Let's add explicit padding here as well?

Ok
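
For reference, a minimal sketch of what the reworked layout could look like
(the top-level range struct and the 3-byte padding follow the two comments
above; the exact field arrangement is an assumption for illustration, not the
final uapi):

struct virtio_iommu_range {
	__u64					start;
	__u64					end;
};

struct virtio_iommu_config {
	/* Supported page sizes */
	__u64					page_size_mask;
	/* Supported IOVA range */
	struct virtio_iommu_range		input_range;
	/* Max domain ID size */
	__u8					domain_bits;
	/* Explicit padding so the layout is the same for all compilers */
	__u8					padding[3];
};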

Thanks,
Jean


Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops

2018-10-12 Thread Catalin Marinas
On Fri, Oct 12, 2018 at 04:40:49PM +0200, Christoph Hellwig wrote:
> On Fri, Oct 12, 2018 at 02:01:00PM +0100, Robin Murphy wrote:
> > On 08/10/18 09:02, Christoph Hellwig wrote:
> >> Now that the generic swiotlb code supports non-coherent DMA we can switch
> >> to it for arm64.  For that we need to refactor the existing
> >> alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
> >> and implement the standard arch_sync_dma_for_{device,cpu} hooks for
> >> cache maintenance in the streaming dma hooks, which also implies
> >> using the generic dma_coherent flag in struct device.
> >>
> >> Note that we need to keep the old is_device_dma_coherent function around
> >> for now, so that the shared arm/arm64 Xen code keeps working.
> >
> > OK, so when I said last night that it boot-tested OK, that much was true, 
> > but then I shut the board down as I left and got a megasplosion of bad page 
> > state BUGs, e.g.:
> 
> I think this is because I am passing the wrong address to
> dma_direct_free_pages.  Please try this patch on top:
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 3c75d69b54e7..4f0f92856c4c 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -126,10 +126,12 @@ void *arch_dma_alloc(struct device *dev, size_t size, 
> dma_addr_t *dma_handle,
>  void arch_dma_free(struct device *dev, size_t size, void *vaddr,
>   dma_addr_t dma_handle, unsigned long attrs)
>  {
> - if (__free_from_pool(vaddr, PAGE_ALIGN(size)))
> - return;
> - vunmap(vaddr);
> - dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
> + if (!__free_from_pool(vaddr, PAGE_ALIGN(size))) {
> + void *kaddr = phys_to_virt(dma_to_phys(dev, dma_handle));
> +
> + vunmap(vaddr);
> + dma_direct_free_pages(dev, size, kaddr, dma_handle, attrs);
> + }
>  }
>  
>  long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,

With this fix:

Acked-by: Catalin Marinas 


Re: [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations

2018-10-12 Thread Catalin Marinas
On Mon, Oct 08, 2018 at 10:02:44AM +0200, Christoph Hellwig wrote:
> All architectures that support swiotlb also have a zone that backs up
> these less than full addressing allocations (usually ZONE_DMA32).
> 
> Because of that it is rather pointless to fall back to the global swiotlb
> buffer if the normal dma direct allocation failed - the only thing this
> will do is to eat up bounce buffers that would be more useful to serve
> streaming mappings.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm64/mm/dma-mapping.c |   6 +--
>  include/linux/swiotlb.h |   5 --
>  kernel/dma/swiotlb.c| 105 +---
>  3 files changed, 5 insertions(+), 111 deletions(-)

Acked-by: Catalin Marinas 


Re: [PATCH 04/10] swiotlb: remove the overflow buffer

2018-10-12 Thread Catalin Marinas
On Mon, Oct 08, 2018 at 10:02:40AM +0200, Christoph Hellwig wrote:
> Like all other dma mapping drivers just return an error code instead
> of an actual memory buffer.  The reason for the overflow buffer was
> that at the time swiotlb was invented there was no way to check for
> dma mapping errors, but this has long been fixed.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm64/mm/dma-mapping.c   |  2 +-
>  arch/powerpc/kernel/dma-swiotlb.c |  4 +--
>  include/linux/dma-direct.h|  2 ++
>  include/linux/swiotlb.h   |  3 --
>  kernel/dma/direct.c   |  2 --
>  kernel/dma/swiotlb.c  | 59 ++-
>  6 files changed, 8 insertions(+), 64 deletions(-)

I went through the patches and they look fine to me (with the vunmap
fix for the last patch). For the arm64 bits:

Acked-by: Catalin Marinas 


Re: [PATCH v3 0/7] Add virtio-iommu driver

2018-10-12 Thread Michael S. Tsirkin
On Fri, Oct 12, 2018 at 03:59:10PM +0100, Jean-Philippe Brucker wrote:
> Implement the virtio-iommu driver, following specification v0.8 [1].
> Changes since v2 [2]:
> 
> * Patches 2-4 allow virtio-iommu to use the PCI transport, since QEMU
>   would like to phase out the MMIO transport. This produces a complex
>   topology where the programming interface of the IOMMU could appear
>   lower than the endpoints that it translates. It's not unheard of (e.g.
>   AMD IOMMU), and the guest easily copes with this.
>   
>   The "Firmware description" section of the specification has been
>   updated with all combinations of PCI, MMIO and DT, ACPI.
> 
> * Fix structures layout, they don't need the "packed" attribute anymore.
> 
> * While we're at it, add domain parameter to DETACH request, and leave
>   some padding. This way the next version, that adds PASID support,
>   won't have to introduce a "DETACH2" request to stay backward
>   compatible.
> 
> * Require virtio device 1.0+. Remove legacy transport notes from the
>   specification.
> 
> * Request timeout is now only enabled with DEBUG.
> 
> * The patch for VFIO Kconfig (previously patch 5/5) is in next.
> 
> You can find Linux driver and kvmtool device on branches
> virtio-iommu/v0.8 [3] (currently based on 4.19-rc7 but rebasing onto
> next only produced a trivial conflict). Branch virtio-iommu/devel
> contains a few patches that I'd like to send once the base is upstream:
> 
> * virtio-iommu as a module. It got *much* nicer after Rob's probe
>   deferral rework, but I still have a bug to fix when re-loading the
>   virtio-iommu module.
> 
> * ACPI support requires a minor IORT spec update (reservation of node
>   ID). I think it should be easier to obtain once the device and drivers
>   are upstream.
> 
> [1] Virtio-iommu specification v0.8, diff from v0.7, and sources
> git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.8
> http://jpbrucker.net/virtio-iommu/spec/v0.8/virtio-iommu-v0.8.pdf
> 
> http://jpbrucker.net/virtio-iommu/spec/diffs/virtio-iommu-pdf-diff-v0.7-v0.8.pdf
> 
> [2] [PATCH v2 0/5] Add virtio-iommu driver
> https://www.spinics.net/lists/kvm/msg170655.html
> 
> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.8
> git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.8
> 
> Jean-Philippe Brucker (7):
>   dt-bindings: virtio-mmio: Add IOMMU description
>   dt-bindings: virtio: Add virtio-pci-iommu node
>   PCI: OF: allow endpoints to bypass the iommu
>   PCI: OF: Initialize dev->fwnode appropriately
>   iommu: Add virtio-iommu driver
>   iommu/virtio: Add probe request
>   iommu/virtio: Add event queue
> 
>  .../devicetree/bindings/virtio/iommu.txt  |   66 +
>  .../devicetree/bindings/virtio/mmio.txt   |   30 +
>  MAINTAINERS   |7 +
>  drivers/iommu/Kconfig |   11 +
>  drivers/iommu/Makefile|1 +
>  drivers/iommu/virtio-iommu.c  | 1171 +
>  drivers/pci/of.c  |   14 +-
>  include/uapi/linux/virtio_ids.h   |1 +
>  include/uapi/linux/virtio_iommu.h |  159 +++
>  9 files changed, 1457 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
>  create mode 100644 drivers/iommu/virtio-iommu.c
>  create mode 100644 include/uapi/linux/virtio_iommu.h

This all looks good to me. Minor nits:
- I think DEBUG mode is best just removed for now
- Slightly wrong patch splitup causing a misaligned structure
  in uapi until all patches are applied.

You should Cc Bjorn on the pci change - I'd like to see his ack on it
being merged through my tree.

And pls Cc the virtio-dev list on any virtio uapi changes.

At a feature level I have some ideas for more features we
could add, but for now I think I'll put this version in -next
while you iron out the above wrinkles. Hope you can make the
merge window.

> -- 
> 2.19.1


Re: [PATCH v3 6/7] iommu/virtio: Add probe request

2018-10-12 Thread Michael S. Tsirkin
On Fri, Oct 12, 2018 at 03:59:16PM +0100, Jean-Philippe Brucker wrote:
> When the device offers the probe feature, send a probe request for each
> device managed by the IOMMU. Extract RESV_MEM information. When we
> encounter an MSI doorbell region, set it up as an IOMMU_RESV_MSI region.
> This will tell other subsystems that there is no need to map the MSI
> doorbell in the virtio-iommu, because MSIs bypass it.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/iommu/virtio-iommu.c  | 147 --
>  include/uapi/linux/virtio_iommu.h |  39 
>  2 files changed, 180 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 9fb38cd3b727..8eaf66770469 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -56,6 +56,7 @@ struct viommu_dev {
>   struct iommu_domain_geometrygeometry;
>   u64 pgsize_bitmap;
>   u8  domain_bits;
> + u32 probe_size;
>  };
>  
>  struct viommu_mapping {
> @@ -77,8 +78,10 @@ struct viommu_domain {
>  };
>  
>  struct viommu_endpoint {
> + struct device   *dev;
>   struct viommu_dev   *viommu;
>   struct viommu_domain*vdomain;
> + struct list_headresv_regions;
>  };
>  
>  struct viommu_request {
> @@ -129,6 +132,9 @@ static off_t viommu_get_req_offset(struct viommu_dev 
> *viommu,
>  {
>   size_t tail_size = sizeof(struct virtio_iommu_req_tail);
>  
> + if (req->type == VIRTIO_IOMMU_T_PROBE)
> + return len - viommu->probe_size - tail_size;
> +
>   return len - tail_size;
>  }
>  
> @@ -414,6 +420,101 @@ static int viommu_replay_mappings(struct viommu_domain 
> *vdomain)
>   return ret;
>  }
>  
> +static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
> +struct virtio_iommu_probe_resv_mem *mem,
> +size_t len)
> +{
> + struct iommu_resv_region *region = NULL;
> + unsigned long prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> +
> + u64 start = le64_to_cpu(mem->start);
> + u64 end = le64_to_cpu(mem->end);
> + size_t size = end - start + 1;
> +
> + if (len < sizeof(*mem))
> + return -EINVAL;


Maybe validate that size, end and start all make sense
and e.g. fit within phys_addr_t and size_t?
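
A minimal sketch of such a check (the helper name and error codes are
illustrative, not a proposal for the exact driver code):

static int viommu_check_resv_range(u64 start, u64 end)
{
	u64 size = end - start + 1;

	/* Reject inverted ranges and ranges that don't fit the host types */
	if (start > end)
		return -EINVAL;
	if (start != (phys_addr_t)start || size != (size_t)size)
		return -EOVERFLOW;
	return 0;
}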

> +
> + switch (mem->subtype) {
> + default:
> + dev_warn(vdev->dev, "unknown resv mem subtype 0x%x\n",
> +  mem->subtype);
> + /* Fall-through */
> + case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
> + region = iommu_alloc_resv_region(start, size, 0,
> +  IOMMU_RESV_RESERVED);
> + break;
> + case VIRTIO_IOMMU_RESV_MEM_T_MSI:
> + region = iommu_alloc_resv_region(start, size, prot,
> +  IOMMU_RESV_MSI);
> + break;
> + }
> +
> + list_add(&vdev->resv_regions, &region->list);
> + return 0;
> +}
> +
> +static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device 
> *dev)
> +{
> + int ret;
> + u16 type, len;
> + size_t cur = 0;
> + size_t probe_len;
> + struct virtio_iommu_req_probe *probe;
> + struct virtio_iommu_probe_property *prop;
> + struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> + struct viommu_endpoint *vdev = fwspec->iommu_priv;
> +
> + if (!fwspec->num_ids)
> + return -EINVAL;
> +
> + probe_len = sizeof(*probe) + viommu->probe_size +
> + sizeof(struct virtio_iommu_req_tail);
> + probe = kzalloc(probe_len, GFP_KERNEL);
> + if (!probe)
> + return -ENOMEM;
> +
> + probe->head.type = VIRTIO_IOMMU_T_PROBE;
> + /*
> +  * For now, assume that properties of an endpoint that outputs multiple
> +  * IDs are consistent. Only probe the first one.
> +  */
> + probe->endpoint = cpu_to_le32(fwspec->ids[0]);
> +
> + ret = viommu_send_req_sync(viommu, probe, probe_len);
> + if (ret)
> + goto out_free;
> +
> + prop = (void *)probe->properties;
> + type = le16_to_cpu(prop->type) & VIRTIO_IOMMU_PROBE_T_MASK;
> +
> + while (type != VIRTIO_IOMMU_PROBE_T_NONE &&
> +cur < viommu->probe_size) {
> + len = le16_to_cpu(prop->length) + sizeof(*prop);
> +
> + switch (type) {
> + case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
> + ret = viommu_add_resv_mem(vdev, (void *)prop, len);
> + break;
> + default:
> + dev_err(dev, "unknown viommu prop 0x%x\n", type);
> + }
> +
> + if (ret)
> + dev_err(dev, "failed to parse viommu prop 0x%x\n", 
> type);
> +
> + cur += len;
> + if (cur >= 

Re: [PATCH v3 5/7] iommu: Add virtio-iommu driver

2018-10-12 Thread Michael S. Tsirkin
On Fri, Oct 12, 2018 at 03:59:15PM +0100, Jean-Philippe Brucker wrote:
> The virtio IOMMU is a para-virtualized device that allows sending IOMMU
> requests such as map/unmap over the virtio transport without emulating page
> tables. This implementation handles ATTACH, DETACH, MAP and UNMAP
> requests.
> 
> The bulk of the code transforms calls coming from the IOMMU API into
> corresponding virtio requests. Mappings are kept in an interval tree
> instead of page tables.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  MAINTAINERS   |   7 +
>  drivers/iommu/Kconfig |  11 +
>  drivers/iommu/Makefile|   1 +
>  drivers/iommu/virtio-iommu.c  | 938 ++
>  include/uapi/linux/virtio_ids.h   |   1 +
>  include/uapi/linux/virtio_iommu.h | 101 
>  6 files changed, 1059 insertions(+)
>  create mode 100644 drivers/iommu/virtio-iommu.c
>  create mode 100644 include/uapi/linux/virtio_iommu.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 48a65c3a4189..f02fa65f47e2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -15599,6 +15599,13 @@ S:   Maintained
>  F:   drivers/virtio/virtio_input.c
>  F:   include/uapi/linux/virtio_input.h
>  
> +VIRTIO IOMMU DRIVER
> +M:   Jean-Philippe Brucker 
> +L:   virtualizat...@lists.linux-foundation.org
> +S:   Maintained
> +F:   drivers/iommu/virtio-iommu.c
> +F:   include/uapi/linux/virtio_iommu.h
> +
>  VIRTUAL BOX GUEST DEVICE DRIVER
>  M:   Hans de Goede 
>  M:   Arnd Bergmann 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index c60395b7470f..2dc016dc2b92 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -414,4 +414,15 @@ config QCOM_IOMMU
>   help
> Support for IOMMU on certain Qualcomm SoCs.
>  
> +config VIRTIO_IOMMU
> + bool "Virtio IOMMU driver"
> + depends on VIRTIO=y
> + select IOMMU_API
> + select INTERVAL_TREE
> + select ARM_DMA_USE_IOMMU if ARM
> + help
> +   Para-virtualised IOMMU driver with virtio.
> +
> +   Say Y here if you intend to run this kernel as a guest.
> +
>  endif # IOMMU_SUPPORT
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index ab5eba6edf82..4cd643408e49 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -31,3 +31,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
>  obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
> +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> new file mode 100644
> index ..9fb38cd3b727
> --- /dev/null
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -0,0 +1,938 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Virtio driver for the paravirtualized IOMMU
> + *
> + * Copyright (C) 2018 Arm Limited
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#define MSI_IOVA_BASE		0x8000000
> +#define MSI_IOVA_LENGTH		0x100000
> +
> +#define VIOMMU_REQUEST_VQ0
> +#define VIOMMU_NR_VQS1
> +
> +/*
> + * During development, it is convenient to time out rather than wait
> + * indefinitely in atomic context when a device misbehaves and a request 
> doesn't
> + * return. In production however, some requests shouldn't return until they 
> are
> + * successful.
> + */
> +#ifdef DEBUG
> +#define VIOMMU_REQUEST_TIMEOUT	10000 /* 10s */
> +#endif
> +
> +struct viommu_dev {
> + struct iommu_device iommu;
> + struct device   *dev;
> + struct virtio_device*vdev;
> +
> + struct ida  domain_ids;
> +
> + struct virtqueue*vqs[VIOMMU_NR_VQS];
> + spinlock_t  request_lock;
> + struct list_headrequests;
> +
> + /* Device configuration */
> + struct iommu_domain_geometrygeometry;
> + u64 pgsize_bitmap;
> + u8  domain_bits;
> +};
> +
> +struct viommu_mapping {
> + phys_addr_t paddr;
> + struct interval_tree_node   iova;
> + u32 flags;
> +};
> +
> +struct viommu_domain {
> + struct iommu_domain domain;
> + struct viommu_dev   *viommu;
> + struct mutexmutex;
> + unsigned intid;
> +
> + spinlock_t  mappings_lock;
> + struct rb_root_cached   mappings;
> +
> + unsigned long   nr_endpoints;
> +};
> +
> +struct viommu_endpoint {
> + struct viommu_dev   *viommu;

[PATCH v3 7/7] iommu/virtio: Add event queue

2018-10-12 Thread Jean-Philippe Brucker
The event queue offers a way for the device to report access faults from
endpoints. It is implemented on virtqueue #1. Whenever the host needs to
signal a fault, it fills one of the buffers offered by the guest and
interrupts it.
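
As a rough illustration of the guest side of this contract, here is a sketch
of how the event buffers might be posted at probe time so that the device
always has somewhere to write a fault record (the helper name and allocation
details are assumptions, not necessarily the patch's exact code):

static int viommu_fill_evtq(struct viommu_dev *viommu)
{
	int i, ret;
	struct scatterlist sg[1];
	struct viommu_event *evts;
	struct virtqueue *vq = viommu->vqs[VIOMMU_EVENT_VQ];
	size_t nr_evts = vq->num_free;

	viommu->evts = evts = devm_kmalloc_array(viommu->dev, nr_evts,
						 sizeof(*evts), GFP_KERNEL);
	if (!evts)
		return -ENOMEM;

	for (i = 0; i < nr_evts; i++) {
		/* One buffer per free descriptor in the event virtqueue */
		sg_init_one(sg, &evts[i], sizeof(*evts));
		ret = virtqueue_add_inbuf(vq, sg, 1, &evts[i], GFP_KERNEL);
		if (ret)
			return ret;
	}

	return 0;
}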

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c  | 116 +++---
 include/uapi/linux/virtio_iommu.h |  19 +
 2 files changed, 126 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 8eaf66770469..0fe8ed2a290c 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -29,7 +29,8 @@
 #define MSI_IOVA_LENGTH		0x100000
 
 #define VIOMMU_REQUEST_VQ  0
-#define VIOMMU_NR_VQS  1
+#define VIOMMU_EVENT_VQ1
+#define VIOMMU_NR_VQS  2
 
 /*
  * During development, it is convenient to time out rather than wait
@@ -51,6 +52,7 @@ struct viommu_dev {
struct virtqueue*vqs[VIOMMU_NR_VQS];
spinlock_t  request_lock;
struct list_headrequests;
+   void*evts;
 
/* Device configuration */
struct iommu_domain_geometrygeometry;
@@ -92,6 +94,15 @@ struct viommu_request {
charbuf[];
 };
 
+#define VIOMMU_FAULT_RESV_MASK		0xffffff00
+
+struct viommu_event {
+   union {
+   u32 head;
+   struct virtio_iommu_fault fault;
+   };
+};
+
 #define to_viommu_domain(domain)   \
container_of(domain, struct viommu_domain, domain)
 
@@ -515,6 +526,69 @@ static int viommu_probe_endpoint(struct viommu_dev 
*viommu, struct device *dev)
return ret;
 }
 
+static int viommu_fault_handler(struct viommu_dev *viommu,
+   struct virtio_iommu_fault *fault)
+{
+   char *reason_str;
+
+   u8 reason   = fault->reason;
+   u32 flags   = le32_to_cpu(fault->flags);
+   u32 endpoint= le32_to_cpu(fault->endpoint);
+   u64 address = le64_to_cpu(fault->address);
+
+   switch (reason) {
+   case VIRTIO_IOMMU_FAULT_R_DOMAIN:
+   reason_str = "domain";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_MAPPING:
+   reason_str = "page";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_UNKNOWN:
+   default:
+   reason_str = "unknown";
+   break;
+   }
+
+   /* TODO: find EP by ID and report_iommu_fault */
+   if (flags & VIRTIO_IOMMU_FAULT_F_ADDRESS)
+   dev_err_ratelimited(viommu->dev, "%s fault from EP %u at %#llx [%s%s%s]\n",
+   reason_str, endpoint, address,
+   flags & VIRTIO_IOMMU_FAULT_F_READ ? "R" : "",
+   flags & VIRTIO_IOMMU_FAULT_F_WRITE ? "W" : "",
+   flags & VIRTIO_IOMMU_FAULT_F_EXEC ? "X" : "");
+   else
+   dev_err_ratelimited(viommu->dev, "%s fault from EP %u\n",
+   reason_str, endpoint);
+   return 0;
+}
+
+static void viommu_event_handler(struct virtqueue *vq)
+{
+   int ret;
+   unsigned int len;
+   struct scatterlist sg[1];
+   struct viommu_event *evt;
+   struct viommu_dev *viommu = vq->vdev->priv;
+
+   while ((evt = virtqueue_get_buf(vq, &len)) != NULL) {
+   if (len > sizeof(*evt)) {
+   dev_err(viommu->dev,
+   "invalid event buffer (len %u != %zu)\n",
+   len, sizeof(*evt));
+   } else if (!(evt->head & VIOMMU_FAULT_RESV_MASK)) {
+   viommu_fault_handler(viommu, &evt->fault);
+   }
+
+   sg_init_one(sg, evt, sizeof(*evt));
+   ret = virtqueue_add_inbuf(vq, sg, 1, evt, GFP_ATOMIC);
+   if (ret)
+   dev_err(viommu->dev, "could not add event buffer\n");
+   }
+
+   if (!virtqueue_kick(vq))
+   dev_err(viommu->dev, "kick failed\n");
+}
+
 /* IOMMU API */
 
 static struct iommu_domain *viommu_domain_alloc(unsigned type)
@@ -899,16 +973,35 @@ static struct iommu_ops viommu_ops = {
 static int viommu_init_vqs(struct viommu_dev *viommu)
 {
struct virtio_device *vdev = dev_to_virtio(viommu->dev);
-   const char *name = "request";
-   void *ret;
+   const char *names[] = { "request", "event" };
+   vq_callback_t *callbacks[] = {
+   NULL, /* No async requests */
+   viommu_event_handler,
+   };
 
-   ret = virtio_find_single_vq(vdev, NULL, name);
-   if (IS_ERR(ret)) {
-   dev_err(viommu->dev, "cannot find VQ\n");
-   return PTR_ERR(ret);
-   }
+   return virtio_find_vqs(vdev, VIOMMU_NR_VQS, viommu->vqs, callbacks,
+  

[PATCH v3 6/7] iommu/virtio: Add probe request

2018-10-12 Thread Jean-Philippe Brucker
When the device offers the probe feature, send a probe request for each
device managed by the IOMMU. Extract RESV_MEM information. When we
encounter an MSI doorbell region, set it up as an IOMMU_RESV_MSI region.
This will tell other subsystems that there is no need to map the MSI
doorbell in the virtio-iommu, because MSIs bypass it.
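
For context, the regions extracted here are typically handed back to the IOMMU
core through the get_resv_regions() callback. A sketch of what that consumer
could look like, assuming the viommu_endpoint fields added by this patch and
the MSI_IOVA_* constants from the driver (the fallback to a software MSI
window is an assumption for illustration):

static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
{
	struct iommu_resv_region *entry, *new_entry, *msi = NULL;
	struct viommu_endpoint *vdev = dev->iommu_fwspec->iommu_priv;
	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;

	list_for_each_entry(entry, &vdev->resv_regions, list) {
		if (entry->type == IOMMU_RESV_MSI)
			msi = entry;

		/* Hand a copy of each probed region to the IOMMU core */
		new_entry = kmemdup(entry, sizeof(*entry), GFP_KERNEL);
		if (!new_entry)
			return;
		list_add_tail(&new_entry->list, head);
	}

	/* No MSI doorbell described by the device: reserve a software one */
	if (!msi) {
		msi = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
					      prot, IOMMU_RESV_SW_MSI);
		if (!msi)
			return;
		list_add_tail(&msi->list, head);
	}
}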

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c  | 147 --
 include/uapi/linux/virtio_iommu.h |  39 
 2 files changed, 180 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 9fb38cd3b727..8eaf66770469 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -56,6 +56,7 @@ struct viommu_dev {
struct iommu_domain_geometrygeometry;
u64 pgsize_bitmap;
u8  domain_bits;
+   u32 probe_size;
 };
 
 struct viommu_mapping {
@@ -77,8 +78,10 @@ struct viommu_domain {
 };
 
 struct viommu_endpoint {
+   struct device   *dev;
struct viommu_dev   *viommu;
struct viommu_domain*vdomain;
+   struct list_headresv_regions;
 };
 
 struct viommu_request {
@@ -129,6 +132,9 @@ static off_t viommu_get_req_offset(struct viommu_dev 
*viommu,
 {
size_t tail_size = sizeof(struct virtio_iommu_req_tail);
 
+   if (req->type == VIRTIO_IOMMU_T_PROBE)
+   return len - viommu->probe_size - tail_size;
+
return len - tail_size;
 }
 
@@ -414,6 +420,101 @@ static int viommu_replay_mappings(struct viommu_domain 
*vdomain)
return ret;
 }
 
+static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
+  struct virtio_iommu_probe_resv_mem *mem,
+  size_t len)
+{
+   struct iommu_resv_region *region = NULL;
+   unsigned long prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+   u64 start = le64_to_cpu(mem->start);
+   u64 end = le64_to_cpu(mem->end);
+   size_t size = end - start + 1;
+
+   if (len < sizeof(*mem))
+   return -EINVAL;
+
+   switch (mem->subtype) {
+   default:
+   dev_warn(vdev->dev, "unknown resv mem subtype 0x%x\n",
+mem->subtype);
+   /* Fall-through */
+   case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
+   region = iommu_alloc_resv_region(start, size, 0,
+IOMMU_RESV_RESERVED);
+   break;
+   case VIRTIO_IOMMU_RESV_MEM_T_MSI:
+   region = iommu_alloc_resv_region(start, size, prot,
+IOMMU_RESV_MSI);
+   break;
+   }
+
+   list_add(&vdev->resv_regions, &region->list);
+   return 0;
+}
+
+static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
+{
+   int ret;
+   u16 type, len;
+   size_t cur = 0;
+   size_t probe_len;
+   struct virtio_iommu_req_probe *probe;
+   struct virtio_iommu_probe_property *prop;
+   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+   struct viommu_endpoint *vdev = fwspec->iommu_priv;
+
+   if (!fwspec->num_ids)
+   return -EINVAL;
+
+   probe_len = sizeof(*probe) + viommu->probe_size +
+   sizeof(struct virtio_iommu_req_tail);
+   probe = kzalloc(probe_len, GFP_KERNEL);
+   if (!probe)
+   return -ENOMEM;
+
+   probe->head.type = VIRTIO_IOMMU_T_PROBE;
+   /*
+* For now, assume that properties of an endpoint that outputs multiple
+* IDs are consistent. Only probe the first one.
+*/
+   probe->endpoint = cpu_to_le32(fwspec->ids[0]);
+
+   ret = viommu_send_req_sync(viommu, probe, probe_len);
+   if (ret)
+   goto out_free;
+
+   prop = (void *)probe->properties;
+   type = le16_to_cpu(prop->type) & VIRTIO_IOMMU_PROBE_T_MASK;
+
+   while (type != VIRTIO_IOMMU_PROBE_T_NONE &&
+  cur < viommu->probe_size) {
+   len = le16_to_cpu(prop->length) + sizeof(*prop);
+
+   switch (type) {
+   case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
+   ret = viommu_add_resv_mem(vdev, (void *)prop, len);
+   break;
+   default:
+   dev_err(dev, "unknown viommu prop 0x%x\n", type);
+   }
+
+   if (ret)
+   dev_err(dev, "failed to parse viommu prop 0x%x\n", 
type);
+
+   cur += len;
+   if (cur >= viommu->probe_size)
+   break;
+
+   prop = (void *)probe->properties + cur;
+   type = le16_to_cpu(prop->type) & VIRTIO_IOMMU_PROBE_T_MASK;
+   }
+
+out_free:
+   kfree(probe);
+   return ret;
+}
+
 /* IOMMU API */
 
 static struct iommu_domain 

[PATCH v3 5/7] iommu: Add virtio-iommu driver

2018-10-12 Thread Jean-Philippe Brucker
The virtio IOMMU is a para-virtualized device that allows sending IOMMU
requests such as map/unmap over the virtio transport without emulating page
tables. This implementation handles ATTACH, DETACH, MAP and UNMAP
requests.

The bulk of the code transforms calls coming from the IOMMU API into
corresponding virtio requests. Mappings are kept in an interval tree
instead of page tables.
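
To make the interval-tree point concrete, here is a sketch of how a MAP
request could be recorded on the guest side, using the viommu_mapping and
viommu_domain structures quoted below (the helper name is illustrative):

static int viommu_add_mapping(struct viommu_domain *vdomain, unsigned long iova,
			      phys_addr_t paddr, size_t size, u32 flags)
{
	unsigned long irqflags;
	struct viommu_mapping *mapping;

	mapping = kzalloc(sizeof(*mapping), GFP_ATOMIC);
	if (!mapping)
		return -ENOMEM;

	mapping->paddr		= paddr;
	mapping->iova.start	= iova;
	mapping->iova.last	= iova + size - 1;
	mapping->flags		= flags;

	/* One interval-tree node per mapping, instead of page tables */
	spin_lock_irqsave(&vdomain->mappings_lock, irqflags);
	interval_tree_insert(&mapping->iova, &vdomain->mappings);
	spin_unlock_irqrestore(&vdomain->mappings_lock, irqflags);

	return 0;
}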

Signed-off-by: Jean-Philippe Brucker 
---
 MAINTAINERS   |   7 +
 drivers/iommu/Kconfig |  11 +
 drivers/iommu/Makefile|   1 +
 drivers/iommu/virtio-iommu.c  | 938 ++
 include/uapi/linux/virtio_ids.h   |   1 +
 include/uapi/linux/virtio_iommu.h | 101 
 6 files changed, 1059 insertions(+)
 create mode 100644 drivers/iommu/virtio-iommu.c
 create mode 100644 include/uapi/linux/virtio_iommu.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 48a65c3a4189..f02fa65f47e2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15599,6 +15599,13 @@ S: Maintained
 F: drivers/virtio/virtio_input.c
 F: include/uapi/linux/virtio_input.h
 
+VIRTIO IOMMU DRIVER
+M: Jean-Philippe Brucker 
+L: virtualizat...@lists.linux-foundation.org
+S: Maintained
+F: drivers/iommu/virtio-iommu.c
+F: include/uapi/linux/virtio_iommu.h
+
 VIRTUAL BOX GUEST DEVICE DRIVER
 M: Hans de Goede 
 M: Arnd Bergmann 
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index c60395b7470f..2dc016dc2b92 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -414,4 +414,15 @@ config QCOM_IOMMU
help
  Support for IOMMU on certain Qualcomm SoCs.
 
+config VIRTIO_IOMMU
+   bool "Virtio IOMMU driver"
+   depends on VIRTIO=y
+   select IOMMU_API
+   select INTERVAL_TREE
+   select ARM_DMA_USE_IOMMU if ARM
+   help
+ Para-virtualised IOMMU driver with virtio.
+
+ Say Y here if you intend to run this kernel as a guest.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index ab5eba6edf82..4cd643408e49 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -31,3 +31,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
+obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
new file mode 100644
index ..9fb38cd3b727
--- /dev/null
+++ b/drivers/iommu/virtio-iommu.c
@@ -0,0 +1,938 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Virtio driver for the paravirtualized IOMMU
+ *
+ * Copyright (C) 2018 Arm Limited
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define MSI_IOVA_BASE			0x8000000
+#define MSI_IOVA_LENGTH			0x100000
+
+#define VIOMMU_REQUEST_VQ  0
+#define VIOMMU_NR_VQS  1
+
+/*
+ * During development, it is convenient to time out rather than wait
+ * indefinitely in atomic context when a device misbehaves and a request 
doesn't
+ * return. In production however, some requests shouldn't return until they are
+ * successful.
+ */
+#ifdef DEBUG
+#define VIOMMU_REQUEST_TIMEOUT		10000 /* 10s */
+#endif
+
+struct viommu_dev {
+   struct iommu_device iommu;
+   struct device   *dev;
+   struct virtio_device*vdev;
+
+   struct ida  domain_ids;
+
+   struct virtqueue*vqs[VIOMMU_NR_VQS];
+   spinlock_t  request_lock;
+   struct list_headrequests;
+
+   /* Device configuration */
+   struct iommu_domain_geometrygeometry;
+   u64 pgsize_bitmap;
+   u8  domain_bits;
+};
+
+struct viommu_mapping {
+   phys_addr_t paddr;
+   struct interval_tree_node   iova;
+   u32 flags;
+};
+
+struct viommu_domain {
+   struct iommu_domain domain;
+   struct viommu_dev   *viommu;
+   struct mutexmutex;
+   unsigned intid;
+
+   spinlock_t  mappings_lock;
+   struct rb_root_cached   mappings;
+
+   unsigned long   nr_endpoints;
+};
+
+struct viommu_endpoint {
+   struct viommu_dev   *viommu;
+   struct viommu_domain*vdomain;
+};
+
+struct viommu_request {
+   struct list_headlist;
+   void*writeback;
+   unsigned intwrite_offset;
+   unsigned intlen;
+   char 

[PATCH v3 4/7] PCI: OF: Initialize dev->fwnode appropriately

2018-10-12 Thread Jean-Philippe Brucker
For PCI devices that have an OF node, set the fwnode as well. This way
drivers that rely on fwnode don't need the special case described by
commit f94277af03ea ("of/platform: Initialise dev->fwnode appropriately").

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/pci/of.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 2f5015bdb256..8026417fab38 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -21,12 +21,15 @@ void pci_set_of_node(struct pci_dev *dev)
return;
dev->dev.of_node = of_pci_find_child_device(dev->bus->dev.of_node,
dev->devfn);
+   if (dev->dev.of_node)
+   dev->dev.fwnode = &dev->dev.of_node->fwnode;
 }
 
 void pci_release_of_node(struct pci_dev *dev)
 {
of_node_put(dev->dev.of_node);
dev->dev.of_node = NULL;
+   dev->dev.fwnode = NULL;
 }
 
 void pci_set_bus_of_node(struct pci_bus *bus)
@@ -35,12 +38,16 @@ void pci_set_bus_of_node(struct pci_bus *bus)
bus->dev.of_node = pcibios_get_phb_of_node(bus);
else
bus->dev.of_node = of_node_get(bus->self->dev.of_node);
+
+   if (bus->dev.of_node)
+   bus->dev.fwnode = &bus->dev.of_node->fwnode;
 }
 
 void pci_release_bus_of_node(struct pci_bus *bus)
 {
of_node_put(bus->dev.of_node);
bus->dev.of_node = NULL;
+   bus->dev.fwnode = NULL;
 }
 
 struct device_node * __weak pcibios_get_phb_of_node(struct pci_bus *bus)
-- 
2.19.1



[PATCH v3 3/7] PCI: OF: Allow endpoints to bypass the iommu

2018-10-12 Thread Jean-Philippe Brucker
Using the iommu-map binding, endpoints in a given PCI domain can be
managed by different IOMMUs. Some virtual machines may allow a subset of
endpoints to bypass the IOMMU. In some case the IOMMU itself is presented
as a PCI endpoint (e.g. AMD IOMMU and virtio-iommu). Currently, when a
PCI root complex has an iommu-map property, the driver requires all
endpoints to be described by the property. Allow the iommu-map property to
have gaps.

Relaxing of_pci_map_rid also allows the msi-map property to have gaps,
which is invalid since MSIs always reach an MSI controller. Thankfully
Linux will error out later, when attempting to find an MSI domain for the
device.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/pci/of.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 1836b8ddf292..2f5015bdb256 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -451,9 +451,10 @@ int of_pci_map_rid(struct device_node *np, u32 rid,
return 0;
}
 
-   pr_err("%pOF: Invalid %s translation - no match for rid 0x%x on %pOF\n",
-   np, map_name, rid, target && *target ? *target : NULL);
-   return -EFAULT;
+   /* Bypasses translation */
+   if (id_out)
+   *id_out = rid;
+   return 0;
 }
 
 #if IS_ENABLED(CONFIG_OF_IRQ)
-- 
2.19.1



[PATCH v3 1/7] dt-bindings: virtio-mmio: Add IOMMU description

2018-10-12 Thread Jean-Philippe Brucker
The nature of a virtio-mmio node is discovered by the virtio driver at
probe time. However the DMA relation between devices must be described
statically. When a virtio-mmio node is a virtio-iommu device, it needs an
"#iommu-cells" property as specified by bindings/iommu/iommu.txt.

Otherwise, the virtio-mmio device may perform DMA through an IOMMU, which
requires an "iommus" property. Describe these requirements in the
device-tree bindings documentation.

Signed-off-by: Jean-Philippe Brucker 
---
 .../devicetree/bindings/virtio/mmio.txt   | 30 +++
 1 file changed, 30 insertions(+)

diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt 
b/Documentation/devicetree/bindings/virtio/mmio.txt
index 5069c1b8e193..748595473b36 100644
--- a/Documentation/devicetree/bindings/virtio/mmio.txt
+++ b/Documentation/devicetree/bindings/virtio/mmio.txt
@@ -8,10 +8,40 @@ Required properties:
 - reg: control registers base address and size including configuration 
space
 - interrupts:  interrupt generated by the device
 
+Required properties for virtio-iommu:
+
+- #iommu-cells:When the node corresponds to a virtio-iommu device, it 
is
+   linked to DMA masters using the "iommus" or "iommu-map"
+   properties [1][2]. #iommu-cells specifies the size of the
+   "iommus" property. For virtio-iommu #iommu-cells must be
+   1, each cell describing a single endpoint ID.
+
+Optional properties:
+
+- iommus:  If the device accesses memory through an IOMMU, it should
+   have an "iommus" property [1]. Since virtio-iommu itself
+   does not access memory through an IOMMU, the "virtio,mmio"
+   node cannot have both an "#iommu-cells" and an "iommus"
+   property.
+
 Example:
 
virtio_block@3000 {
compatible = "virtio,mmio";
reg = <0x3000 0x100>;
interrupts = <41>;
+
+   /* Device has endpoint ID 23 */
+   iommus = <&viommu 23>
}
+
+   viommu: virtio_iommu@3100 {
+   compatible = "virtio,mmio";
+   reg = <0x3100 0x100>;
+   interrupts = <42>;
+
+   #iommu-cells = <1>
+   }
+
+[1] Documentation/devicetree/bindings/iommu/iommu.txt
+[2] Documentation/devicetree/bindings/pci/pci-iommu.txt
-- 
2.19.1



[PATCH v3 2/7] dt-bindings: virtio: Add virtio-pci-iommu node

2018-10-12 Thread Jean-Philippe Brucker
Some systems implement virtio-iommu as a PCI endpoint. The operating
system needs to discover the relationship between IOMMU and masters long
before the PCI endpoint gets probed. Add a PCI child node to describe the
virtio-iommu device.

The virtio-pci-iommu is conceptually split between a PCI programming
interface and a translation component on the parent bus. The latter
doesn't have a node in the device tree. The virtio-pci-iommu node
describes both, by linking the PCI endpoint to "iommus" property of DMA
master nodes and to "iommu-map" properties of bus nodes.

Signed-off-by: Jean-Philippe Brucker 
---
 .../devicetree/bindings/virtio/iommu.txt  | 66 +++
 1 file changed, 66 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt

diff --git a/Documentation/devicetree/bindings/virtio/iommu.txt 
b/Documentation/devicetree/bindings/virtio/iommu.txt
new file mode 100644
index ..2407fea0651c
--- /dev/null
+++ b/Documentation/devicetree/bindings/virtio/iommu.txt
@@ -0,0 +1,66 @@
+* virtio IOMMU PCI device
+
+When virtio-iommu uses the PCI transport, its programming interface is
+discovered dynamically by the PCI probing infrastructure. However the
+device tree statically describes the relation between IOMMU and DMA
+masters. Therefore, the PCI root complex that hosts the virtio-iommu
+contains a child node representing the IOMMU device explicitly.
+
+Required properties:
+
+- compatible:  Should be "virtio,pci-iommu"
+- reg: PCI address of the IOMMU. As defined in the PCI Bus
+   Binding reference [1], the reg property is a five-cell
+   address encoded as (phys.hi phys.mid phys.lo size.hi
+   size.lo). phys.hi should contain the device's BDF as
+   0b  dfff . The other cells
+   should be zero.
+- #iommu-cells:Each platform DMA master managed by the IOMMU is 
assigned
+   an endpoint ID, described by the "iommus" property [2].
+   For virtio-iommu, #iommu-cells must be 1.
+
+Notes:
+
+- DMA from the IOMMU device isn't managed by another IOMMU. Therefore the
+  virtio-iommu node doesn't have an "iommus" property, and is omitted from
+  the iommu-map property of the root complex.
+
+Example:
+
+pcie@1000 {
+   compatible = "pci-host-ecam-generic";
+   ...
+
+   /* The IOMMU programming interface uses slot 00:01.0 */
+   iommu0: iommu@0008 {
+   compatible = "virtio,pci-iommu";
+   reg = <0x0800 0 0 0 0>;
+   #iommu-cells = <1>;
+   };
+
+   /*
+* The IOMMU manages all functions in this PCI domain except
+* itself. Omit BDF 00:01.0.
+*/
+   iommu-map = <0x0 &iommu0 0x0 0x8>
+   <0x9 &iommu0 0x9 0xfff7>;
+};
+
+pcie@2000 {
+   compatible = "pci-host-ecam-generic";
+   ...
+   /*
+* The IOMMU also manages all functions from this domain,
+* with endpoint IDs 0x1 - 0x1
+*/
+   iommu-map = <0x0 &iommu0 0x1 0x1>;
+};
+
+ethernet@fe001000 {
+   ...
+   /* The IOMMU manages this platform device with endpoint ID 0x2 */
+   iommus = <&iommu0 0x2>;
+};
+
+[1] Documentation/devicetree/bindings/pci/pci.txt
+[2] Documentation/devicetree/bindings/iommu/iommu.txt
-- 
2.19.1



[PATCH v3 0/7] Add virtio-iommu driver

2018-10-12 Thread Jean-Philippe Brucker
Implement the virtio-iommu driver, following specification v0.8 [1].
Changes since v2 [2]:

* Patches 2-4 allow virtio-iommu to use the PCI transport, since QEMU
  would like to phase out the MMIO transport. This produces a complex
  topology where the programming interface of the IOMMU could appear
  lower than the endpoints that it translates. It's not unheard of (e.g.
  AMD IOMMU), and the guest easily copes with this.
  
  The "Firmware description" section of the specification has been
  updated with all combinations of PCI, MMIO and DT, ACPI.

* Fix structures layout, they don't need the "packed" attribute anymore.

* While we're at it, add domain parameter to DETACH request, and leave
  some padding. This way the next version, that adds PASID support,
  won't have to introduce a "DETACH2" request to stay backward
  compatible.

* Require virtio device 1.0+. Remove legacy transport notes from the
  specification.

* Request timeout is now only enabled with DEBUG.

* The patch for VFIO Kconfig (previously patch 5/5) is in next.

You can find Linux driver and kvmtool device on branches
virtio-iommu/v0.8 [3] (currently based on 4.19-rc7 but rebasing onto
next only produced a trivial conflict). Branch virtio-iommu/devel
contains a few patches that I'd like to send once the base is upstream:

* virtio-iommu as a module. It got *much* nicer after Rob's probe
  deferral rework, but I still have a bug to fix when re-loading the
  virtio-iommu module.

* ACPI support requires a minor IORT spec update (reservation of node
  ID). I think it should be easier to obtain once the device and drivers
  are upstream.

[1] Virtio-iommu specification v0.8, diff from v0.7, and sources
git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.8
http://jpbrucker.net/virtio-iommu/spec/v0.8/virtio-iommu-v0.8.pdf

http://jpbrucker.net/virtio-iommu/spec/diffs/virtio-iommu-pdf-diff-v0.7-v0.8.pdf

[2] [PATCH v2 0/5] Add virtio-iommu driver
https://www.spinics.net/lists/kvm/msg170655.html

[3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.8
git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.8

Jean-Philippe Brucker (7):
  dt-bindings: virtio-mmio: Add IOMMU description
  dt-bindings: virtio: Add virtio-pci-iommu node
  PCI: OF: allow endpoints to bypass the iommu
  PCI: OF: Initialize dev->fwnode appropriately
  iommu: Add virtio-iommu driver
  iommu/virtio: Add probe request
  iommu/virtio: Add event queue

 .../devicetree/bindings/virtio/iommu.txt  |   66 +
 .../devicetree/bindings/virtio/mmio.txt   |   30 +
 MAINTAINERS   |7 +
 drivers/iommu/Kconfig |   11 +
 drivers/iommu/Makefile|1 +
 drivers/iommu/virtio-iommu.c  | 1171 +
 drivers/pci/of.c  |   14 +-
 include/uapi/linux/virtio_ids.h   |1 +
 include/uapi/linux/virtio_iommu.h |  159 +++
 9 files changed, 1457 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
 create mode 100644 drivers/iommu/virtio-iommu.c
 create mode 100644 include/uapi/linux/virtio_iommu.h

-- 
2.19.1



Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops

2018-10-12 Thread Christoph Hellwig
On Fri, Oct 12, 2018 at 02:01:00PM +0100, Robin Murphy wrote:
> On 08/10/18 09:02, Christoph Hellwig wrote:
>> Now that the generic swiotlb code supports non-coherent DMA we can switch
>> to it for arm64.  For that we need to refactor the existing
>> alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
>> and implement the standard arch_sync_dma_for_{device,cpu} hooks for
> >> cache maintenance in the streaming dma hooks, which also implies
>> using the generic dma_coherent flag in struct device.
>>
>> Note that we need to keep the old is_device_dma_coherent function around
>> for now, so that the shared arm/arm64 Xen code keeps working.
>
> OK, so when I said last night that it boot-tested OK, that much was true, 
> but then I shut the board down as I left and got a megasplosion of bad page 
> state BUGs, e.g.:

I think this is because I am passing the wrong address to
dma_direct_free_pages.  Please try this patch on top:

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 3c75d69b54e7..4f0f92856c4c 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -126,10 +126,12 @@ void *arch_dma_alloc(struct device *dev, size_t size, 
dma_addr_t *dma_handle,
 void arch_dma_free(struct device *dev, size_t size, void *vaddr,
dma_addr_t dma_handle, unsigned long attrs)
 {
-   if (__free_from_pool(vaddr, PAGE_ALIGN(size)))
-   return;
-   vunmap(vaddr);
-   dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
+   if (!__free_from_pool(vaddr, PAGE_ALIGN(size))) {
+   void *kaddr = phys_to_virt(dma_to_phys(dev, dma_handle));
+
+   vunmap(vaddr);
+   dma_direct_free_pages(dev, size, kaddr, dma_handle, attrs);
+   }
 }
 
 long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,


Re: [RFC PATCH v3 10/10] iommu/sva: Add support for private PASIDs

2018-10-12 Thread Jordan Crouse
On Thu, Sep 20, 2018 at 06:00:46PM +0100, Jean-Philippe Brucker wrote:
> Provide an API for allocating PASIDs and populating them manually. To ease
> cleanup and factor allocation code, reuse the io_mm structure for private
> PASID. Private io_mm has a NULL mm_struct pointer, and cannot be bound to
> multiple devices. The mm_alloc() IOMMU op must now check if the mm
> argument is NULL, in which case it should allocate io_pgtables instead of
> binding to an mm.
> 
> Signed-off-by: Jordan Crouse 
> Signed-off-by: Jean-Philippe Brucker 
> ---
> Sadly this probably won't be the final thing. The API in this patch is
> used like this:
> 
> iommu_sva_alloc_pasid(dev, _mm) -> PASID
> iommu_sva_map(io_mm, ...)
> iommu_sva_unmap(io_mm, ...)
> iommu_sva_free_pasid(dev, io_mm)
> 
> The proposed API for auxiliary domains is in an early stage but might
> replace this patch and could be used like this:
> 
> iommu_enable_aux_domain(dev)
> d = iommu_domain_alloc()
> iommu_attach_aux(dev, d)
> iommu_aux_id(d) -> PASID
> iommu_map(d, ...)
> iommu_unmap(d, ...)
> iommu_detach_aux(dev, d)
> iommu_domain_free(d)
> 
> The advantage being that the driver doesn't have to use a special
> version of map/unmap/etc.

Hi Jean-Phillippe -

Have you thought about this any more? I want to send out a
refresh for the per-context pagetables for arm-smmu so if we want to change
the underlying assumptions this would be a great time.

For my part I'm okay with either model. In fact the second one is closer
to the original implementation that I sent out so I have a clear development
path in mind for either option depending on what the community decides.

Thanks,
Jordan



-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops

2018-10-12 Thread Robin Murphy

On 08/10/18 09:02, Christoph Hellwig wrote:

Now that the generic swiotlb code supports non-coherent DMA we can switch
to it for arm64.  For that we need to refactor the existing
alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
and implement the standard arch_sync_dma_for_{device,cpu} hooks for
cache maintenance in the streaming dma hooks, which also implies
using the generic dma_coherent flag in struct device.

Note that we need to keep the old is_device_dma_coherent function around
for now, so that the shared arm/arm64 Xen code keeps working.


OK, so when I said last night that it boot-tested OK, that much was 
true, but then I shut the board down as I left and got a megasplosion of 
bad page state BUGs, e.g.:


[ 3312.848182] BUG: Bad page state in process llvmpipe-5  pfn:8a78e
[ 3312.854138] page:7e29e380 count:2 mapcount:1 
mapping:800034019268 index:0xd3

[ 3312.862170] flags: 0x10028(uptodate|lru|mappedtodisk)
[ 3312.867176] raw: 00010028 7edc5508 7ec7bb48 
800034019268
[ 3312.874883] raw: 00d3  0002 
800036b91000

[ 3312.882608] page dumped because: page still charged to cgroup
[ 3312.888352] page->mem_cgroup:800036b91000
[ 3312.892714] bad because of flags: 0x20(lru)
[ 3312.896959] Modules linked in:
[ 3312.900042] CPU: 0 PID: 283 Comm: llvmpipe-5 Tainted: GB 
   4.19.0-rc3+ #801
[ 3312.908136] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Jul 10 2018

[ 3312.918810] Call trace:
[ 3312.921236]  dump_backtrace+0x0/0x1f0
[ 3312.924860]  show_stack+0x14/0x20
[ 3312.928142]  dump_stack+0x9c/0xbc
[ 3312.931422]  bad_page+0xe8/0x150
[ 3312.934615]  free_pages_check_bad+0x9c/0xa8
[ 3312.938754]  __free_pages_ok+0x27c/0x288
[ 3312.942635]  __free_pages+0x30/0x48
[ 3312.946086]  free_pages.part.22+0x1c/0x28
[ 3312.950053]  free_pages+0x14/0x20
[ 3312.95]  dma_direct_free_pages+0x5c/0x68
[ 3312.957560]  arch_dma_free+0x5c/0x70
[ 3312.961095]  dma_direct_free+0x20/0x28
[ 3312.964805]  drm_gem_cma_free_object+0xb0/0x150
[ 3312.969290]  drm_gem_object_free+0x1c/0x58
[ 3312.973343]  drm_gem_object_put_unlocked+0x70/0x98
[ 3312.978084]  drm_gem_object_handle_put_unlocked+0x64/0xc0
[ 3312.983426]  drm_gem_object_release_handle+0x54/0x90
[ 3312.988339]  idr_for_each+0x70/0x130
[ 3312.991876]  drm_gem_release+0x28/0x40
[ 3312.995584]  drm_file_free.part.0+0x214/0x288
[ 3312.999895]  drm_release+0x94/0x118
[ 3313.003347]  __fput+0x8c/0x1c0
[ 3313.006368]  fput+0xc/0x18
[ 3313.009390]  task_work_run+0x98/0xb8
[ 3313.012927]  do_exit+0x2b8/0x970
[ 3313.016119]  do_group_exit+0x38/0xa0
[ 3313.019656]  get_signal+0xfc/0x4b0
[ 3313.023021]  do_signal+0x1a0/0x2a8
[ 3313.026386]  do_notify_resume+0xe8/0x128
[ 3313.030267]  work_pending+0x8/0x10
...
[ 3313.972527] BUG: Bad page state in process llvmpipe-5  pfn:8a794
[ 3313.978605] page:7e29e500 count:0 mapcount:-128 
mapping: index:0x1

[ 3313.986845] flags: 0x0()
[ 3313.989393] raw:  7e185908 7e933b08 

[ 3313.997116] raw: 0001 0002 ff7f 


[ 3314.004824] page dumped because: nonzero mapcount
[ 3314.009523] Modules linked in:
[ 3314.012623] CPU: 0 PID: 283 Comm: llvmpipe-5 Tainted: GB 
   4.19.0-rc3+ #801
[ 3314.020718] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Jul 10 2018

[ 3314.031390] Call trace:
[ 3314.033810]  dump_backtrace+0x0/0x1f0
[ 3314.037434]  show_stack+0x14/0x20
[ 3314.040713]  dump_stack+0x9c/0xbc
[ 3314.043993]  bad_page+0xe8/0x150
[ 3314.047186]  free_pages_check_bad+0x70/0xa8
[ 3314.051325]  __free_pages_ok+0x27c/0x288
[ 3314.055205]  __free_pages+0x30/0x48
[ 3314.058656]  free_pages.part.22+0x1c/0x28
[ 3314.062623]  free_pages+0x14/0x20
[ 3314.065902]  dma_direct_free_pages+0x5c/0x68
[ 3314.070127]  arch_dma_free+0x5c/0x70
[ 3314.073663]  dma_direct_free+0x20/0x28
[ 3314.077371]  drm_gem_cma_free_object+0xb0/0x150
[ 3314.081854]  drm_gem_object_free+0x1c/0x58
[ 3314.085906]  drm_gem_object_put_unlocked+0x70/0x98
[ 3314.090647]  drm_gem_object_handle_put_unlocked+0x64/0xc0
[ 3314.095990]  drm_gem_object_release_handle+0x54/0x90
[ 3314.100902]  idr_for_each+0x70/0x130
[ 3314.104439]  drm_gem_release+0x28/0x40
[ 3314.108148]  drm_file_free.part.0+0x214/0x288
[ 3314.112458]  drm_release+0x94/0x118
[ 3314.115909]  __fput+0x8c/0x1c0
[ 3314.118930]  fput+0xc/0x18
[ 3314.121950]  task_work_run+0x98/0xb8
[ 3314.125486]  do_exit+0x2b8/0x970
[ 3314.128678]  do_group_exit+0x38/0xa0
[ 3314.132214]  get_signal+0xfc/0x4b0
[ 3314.135579]  do_signal+0x1a0/0x2a8
[ 3314.138945]  do_notify_resume+0xe8/0x128
[ 3314.142825]  work_pending+0x8/0x10
...

and hundreds more with various different reasons.

AFAICS there's something about coherent allocations causing page 
corruptions (it doesn't seem triggerable without the 

Re: [PATCH v4 1/2] dt-bindings: arm-smmu: Add binding doc for Qcom smmu-500

2018-10-12 Thread Vivek Gautam




On 10/12/2018 3:46 AM, Rob Herring wrote:

On Thu, 11 Oct 2018 15:19:29 +0530, Vivek Gautam wrote:

Qcom's implementation of arm,mmu-500 works well with the current
arm-smmu driver implementation. Adding an SoC-specific compatible
along with arm,mmu-500 makes the bindings future-safe.

Signed-off-by: Vivek Gautam 
---

Changes since v3:
  - Refined language more to state things directly for the bindings
description.

  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 4 
  1 file changed, 4 insertions(+)


Reviewed-by: Rob Herring 


Thank you Rob.

Best regards
Vivek