date:20181115

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Mika Westerberg

On Thu, Nov 15, 2018 at 09:00:54PM +0200, Mika Westerberg wrote:
> On Thu, Nov 15, 2018 at 05:46:08PM +, Lorenzo Pieralisi wrote:
> > Do you really need to parse it if the dev->is_thunderbolt check is enough ?
> 
> Yes, we need to parse it one way or another. dev->is_thunderbolt is
> based on heuristics which do not apply anymore when the thing gets
> integrated in the SoC.
> 
> The _DSD is there already (on existing systems) and is being used by
> Windows so I don't understand why we cannot take advantage of it? Every
> new system with Thunderbolt ports will have it.

Just to clarify a bit. We can use is_thunderbolt in place of is_external
if we don't want to deal with other possible "external" devices right now.

However, we still need to parse the _DSD and based on that fill the
is_thunderbolt for these devices (same way we do for is_external in this
series). So basically we just get rid of the is_external flag and use
is_thunderbolt instead.
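
To illustrate, here is a minimal userspace sketch of filling the flag from firmware data instead of from heuristics. Everything in it is an assumption made for illustration: the struct layouts, the function names and the property name "external-port" are hypothetical and are not taken from the patch or from any real _DSD.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one firmware (_DSD-style) property of a port. */
struct fw_property {
	const char *name;
	unsigned int value;
};

/* Hypothetical stand-in for the device flags discussed in this thread. */
struct sketch_pci_dev {
	const struct fw_property *props;
	size_t nr_props;
	bool is_thunderbolt;
};

/* Placeholder property name, NOT the one used by the patch. */
#define SKETCH_EXTERNAL_PROP "external-port"

/* Return true if the firmware description marks the port as external. */
bool sketch_fw_says_external(const struct sketch_pci_dev *dev)
{
	for (size_t i = 0; i < dev->nr_props; i++) {
		if (strcmp(dev->props[i].name, SKETCH_EXTERNAL_PROP) == 0)
			return dev->props[i].value != 0;
	}
	return false;
}

/* Set is_thunderbolt from firmware data; leave it alone otherwise. */
void sketch_set_flags(struct sketch_pci_dev *dev)
{
	if (sketch_fw_says_external(dev))
		dev->is_thunderbolt = true;
}
```

A device without the property keeps whatever value the existing heuristics gave it, which matches the idea of reusing one flag rather than adding a second one.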
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/9] mm: Introduce new vm_insert_range API

2018-11-15 Thread Matthew Wilcox

On Fri, Nov 16, 2018 at 11:00:30AM +0530, Souptick Joarder wrote:
> On Thu, Nov 15, 2018 at 11:44 PM Randy Dunlap wrote:
> > On 11/15/18 7:45 AM, Souptick Joarder wrote:
> > What is the opposite of vm_insert_range() or even of vm_insert_page()?
> > or is there no need for that?
> 
> There is no opposite function for vm_insert_range() / vm_insert_page().
> My understanding is that, on any error, the mmap handler returns the
> error to the user process and user space decides the next action. So the
> next time the mmap handler is invoked, it will map from the beginning.
> Correct me if I am wrong.

The opposite function, I suppose, is unmap_region().

> > s/no./number/
> 
> I didn't get it ??

This is a 'sed' expression.  's' is the 'substitute' command; the /
is a separator, 'no.' is what you wrote, and 'number' is what Randy
is recommending instead.
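
As a rough illustration of what that expression does to the kernel-doc line, here is a small C sketch using the POSIX regex functions. The function name sketch_substitute is hypothetical, and the replacement is copied literally, so sed's "&" and "\1" are not handled. Note that in a regular expression '.' matches any character, so 'no.' also matches the start of "nope"; 'no\.' would be the stricter pattern.

```c
#include <regex.h>
#include <stdio.h>

/*
 * Replace the first match of a POSIX basic regular expression in `input`,
 * roughly what sed's s/pattern/replacement/ does for one line.
 * Returns 1 if a substitution was made, 0 if nothing matched, -1 on error
 * (bad pattern or output buffer too small).
 */
int sketch_substitute(const char *pattern, const char *replacement,
		      const char *input, char *out, size_t out_size)
{
	regex_t re;
	regmatch_t match;
	int n;

	if (regcomp(&re, pattern, 0) != 0)
		return -1;

	int rc = regexec(&re, input, 1, &match, 0);
	regfree(&re);

	if (rc != 0) {
		/* Nothing matched: copy the input through unchanged. */
		n = snprintf(out, out_size, "%s", input);
		return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
	}

	/* Text before the match, then the replacement, then the rest. */
	n = snprintf(out, out_size, "%.*s%s%s",
		     (int)match.rm_so, input, replacement,
		     input + match.rm_eo);
	return (n < 0 || (size_t)n >= out_size) ? -1 : 1;
}
```

Called with the pattern "no.", the replacement "number" and the input "no. of pages need to insert", it writes "number of pages need to insert" into the output buffer.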

> > > + for (i = 0; i < page_count; i++) {
> > > + ret = vm_insert_page(vma, uaddr, pages[i]);
> > > + if (ret < 0)
> > > + return ret;
> >
> > For a non-trivial value of page_count:
> > Is it a problem if vm_insert_page() succeeds for several pages
> > and then fails?
> 
> No, it will be treated as a total failure and the mmap handler will
> return the error to user space.

I think what Randy means is "What happens to the inserted pages?" and
the answer is that mmap_region() jumps to the 'unmap_and_free_vma'
label, which is an accurate name.
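
A userspace sketch of that sequence, for illustration only. It models a VMA as an array of slots; sketch_insert_page, sketch_insert_range, sketch_unmap_region and sketch_mmap are hypothetical stand-ins for the kernel functions named in this thread, not their real implementations.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PAGES 8

/* Hypothetical stand-in for a VMA: tracks which slots hold a page. */
struct sketch_vma {
	bool mapped[SKETCH_MAX_PAGES];
	size_t fail_at;	/* slot at which insertion fails; out of range = never */
};

/* Stand-in for vm_insert_page(): 0 on success, -1 on failure. */
int sketch_insert_page(struct sketch_vma *vma, size_t slot)
{
	if (slot >= SKETCH_MAX_PAGES || slot == vma->fail_at)
		return -1;
	vma->mapped[slot] = true;
	return 0;
}

/* Same loop shape as the proposed vm_insert_range(): stop at first error. */
int sketch_insert_range(struct sketch_vma *vma, size_t page_count)
{
	for (size_t i = 0; i < page_count; i++) {
		int ret = sketch_insert_page(vma, i);

		if (ret < 0)
			return ret;	/* pages inserted so far stay in place */
	}
	return 0;
}

/* Stand-in for the caller's error path: tear down the whole region. */
void sketch_unmap_region(struct sketch_vma *vma)
{
	for (size_t i = 0; i < SKETCH_MAX_PAGES; i++)
		vma->mapped[i] = false;
}

/* Stand-in for the mmap path: on failure, unmap everything, report error. */
int sketch_mmap(struct sketch_vma *vma, size_t page_count)
{
	int ret = sketch_insert_range(vma, page_count);

	if (ret < 0)
		sketch_unmap_region(vma);
	return ret;
}
```

The point of the sketch is that the range helper itself does not undo a partial insert; the cleanup happens one level up, in the caller's error path.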

Thanks for looking, Randy.

Re: [PATCH v4 7/7] iommu/virtio: Add event queue

2018-11-15 Thread Auger Eric

Hi Jean,

On 11/15/18 5:52 PM, Jean-Philippe Brucker wrote:
> The event queue offers a way for the device to report access faults from
> endpoints. It is implemented on virtqueue #1. Whenever the host needs to
> signal a fault, it fills one of the buffers offered by the guest and
> interrupts it.
> 
> Signed-off-by: Jean-Philippe Brucker 

> ---
>  drivers/iommu/virtio-iommu.c  | 116 +++---
>  include/uapi/linux/virtio_iommu.h |  19 +
>  2 files changed, 126 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index c547ebd79c43..81c6b72e9c43 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -29,7 +29,8 @@
>  #define MSI_IOVA_LENGTH  0x10
>  
>  #define VIOMMU_REQUEST_VQ 0
> -#define VIOMMU_NR_VQS 1
> +#define VIOMMU_EVENT_VQ  1
> +#define VIOMMU_NR_VQS 2
>  
>  struct viommu_dev {
>   struct iommu_device iommu;
> @@ -41,6 +42,7 @@ struct viommu_dev {
>   struct virtqueue *vqs[VIOMMU_NR_VQS];
>   spinlock_t  request_lock;
>   struct list_head requests;
> + void *evts;
>  
>   /* Device configuration */
>   struct iommu_domain_geometry geometry;
> @@ -82,6 +84,15 @@ struct viommu_request {
>   char buf[];
>  };
>  
> +#define VIOMMU_FAULT_RESV_MASK   0xff00
> +
> +struct viommu_event {
> + union {
> + u32 head;
> + struct virtio_iommu_fault fault;
> + };
> +};
> +
>  #define to_viommu_domain(domain) \
>   container_of(domain, struct viommu_domain, domain)
>  
> @@ -504,6 +515,69 @@ static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
>   return ret;
>  }
>  
> +static int viommu_fault_handler(struct viommu_dev *viommu,
> + struct virtio_iommu_fault *fault)
> +{
> + char *reason_str;
> +
> + u8 reason   = fault->reason;
> + u32 flags   = le32_to_cpu(fault->flags);
> + u32 endpoint = le32_to_cpu(fault->endpoint);
> + u64 address = le64_to_cpu(fault->address);
> +
> + switch (reason) {
> + case VIRTIO_IOMMU_FAULT_R_DOMAIN:
> + reason_str = "domain";
> + break;
> + case VIRTIO_IOMMU_FAULT_R_MAPPING:
> + reason_str = "page";
> + break;
> + case VIRTIO_IOMMU_FAULT_R_UNKNOWN:
> + default:
> + reason_str = "unknown";
> + break;
> + }
> +
> + /* TODO: find EP by ID and report_iommu_fault */
> + if (flags & VIRTIO_IOMMU_FAULT_F_ADDRESS)
> + dev_err_ratelimited(viommu->dev, "%s fault from EP %u at %#llx [%s%s%s]\n",
> + reason_str, endpoint, address,
> + flags & VIRTIO_IOMMU_FAULT_F_READ ? "R" : "",
> + flags & VIRTIO_IOMMU_FAULT_F_WRITE ? "W" : "",
> + flags & VIRTIO_IOMMU_FAULT_F_EXEC ? "X" : "");
> + else
> + dev_err_ratelimited(viommu->dev, "%s fault from EP %u\n",
> + reason_str, endpoint);
> + return 0;
> +}
> +
> +static void viommu_event_handler(struct virtqueue *vq)
> +{
> + int ret;
> + unsigned int len;
> + struct scatterlist sg[1];
> + struct viommu_event *evt;
> + struct viommu_dev *viommu = vq->vdev->priv;
> +
> + while ((evt = virtqueue_get_buf(vq, &len)) != NULL) {
> + if (len > sizeof(*evt)) {
> + dev_err(viommu->dev,
> + "invalid event buffer (len %u != %zu)\n",
> + len, sizeof(*evt));
> + } else if (!(evt->head & VIOMMU_FAULT_RESV_MASK)) {
> + viommu_fault_handler(viommu, &evt->fault);
> + }
> +
> + sg_init_one(sg, evt, sizeof(*evt));
> + ret = virtqueue_add_inbuf(vq, sg, 1, evt, GFP_ATOMIC);
> + if (ret)
> + dev_err(viommu->dev, "could not add event buffer\n");
> + }
> +
> + if (!virtqueue_kick(vq))
> + dev_err(viommu->dev, "kick failed\n");
There are other occurrences of virtqueue_kick where you don't check the
returned value.
> +}
> +
>  /* IOMMU API */
>  
>  static struct iommu_domain *viommu_domain_alloc(unsigned type)
> @@ -887,16 +961,35 @@ static struct iommu_ops viommu_ops = {
>  static int viommu_init_vqs(struct viommu_dev *viommu)
>  {
>   struct virtio_device *vdev = dev_to_virtio(viommu->dev);
> - const char *name = "request";
> - void *ret;
> + const char *names[] = { "request", "event" };
> + vq_callback_t *callbacks[] = {
> + NULL, /* No async requests */
> + viommu_event_handler,

Re: [PATCH v4 5/7] iommu: Add virtio-iommu driver

2018-11-15 Thread Auger Eric

Hi Jean,

On 11/15/18 5:52 PM, Jean-Philippe Brucker wrote:
> The virtio IOMMU is a para-virtualized device, allowing to send IOMMU
> requests such as map/unmap over virtio transport without emulating page
> tables. This implementation handles ATTACH, DETACH, MAP and UNMAP
> requests.
> 
> The bulk of the code transforms calls coming from the IOMMU API into
> corresponding virtio requests. Mappings are kept in an interval tree
> instead of page tables.

A few comments/questions below.
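
For readers following the commit message above, here is a minimal userspace sketch of tracking mappings as address ranges rather than as page tables. It uses a plain array with a linear scan in place of the kernel's interval tree, and every name in it (sketch_domain, sketch_map, sketch_iova_to_phys) is hypothetical, not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping record: IOVA range [start, last] -> physical base. */
struct sketch_mapping {
	uint64_t start;
	uint64_t last;	/* inclusive */
	uint64_t paddr;
};

#define SKETCH_MAX_MAPPINGS 16

struct sketch_domain {
	struct sketch_mapping maps[SKETCH_MAX_MAPPINGS];
	size_t nr;
};

/* Record a mapping; reject empty ranges, wraparound, overlaps, full table. */
int sketch_map(struct sketch_domain *d, uint64_t iova, uint64_t paddr,
	       uint64_t size)
{
	if (size == 0 || iova + size - 1 < iova ||
	    d->nr == SKETCH_MAX_MAPPINGS)
		return -1;

	uint64_t last = iova + size - 1;

	for (size_t i = 0; i < d->nr; i++) {
		if (iova <= d->maps[i].last && d->maps[i].start <= last)
			return -1;	/* overlaps an existing range */
	}
	d->maps[d->nr++] = (struct sketch_mapping){ iova, last, paddr };
	return 0;
}

/* Translate an IOVA by finding the range that contains it. */
bool sketch_iova_to_phys(const struct sketch_domain *d, uint64_t iova,
			 uint64_t *paddr)
{
	for (size_t i = 0; i < d->nr; i++) {
		const struct sketch_mapping *m = &d->maps[i];

		if (iova >= m->start && iova <= m->last) {
			*paddr = m->paddr + (iova - m->start);
			return true;
		}
	}
	return false;
}
```

An interval tree answers the same "which range contains this address" question without scanning every entry; the array here only keeps the sketch short.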
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  MAINTAINERS   |   7 +
>  drivers/iommu/Kconfig |  11 +
>  drivers/iommu/Makefile|   1 +
>  drivers/iommu/virtio-iommu.c  | 918 ++
>  include/uapi/linux/virtio_ids.h   |   1 +
>  include/uapi/linux/virtio_iommu.h | 104 
>  6 files changed, 1042 insertions(+)
>  create mode 100644 drivers/iommu/virtio-iommu.c
>  create mode 100644 include/uapi/linux/virtio_iommu.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0abecc528dac..0c7bdce57719 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -15948,6 +15948,13 @@ S:   Maintained
>  F:   drivers/virtio/virtio_input.c
>  F:   include/uapi/linux/virtio_input.h
>  
> +VIRTIO IOMMU DRIVER
> +M:   Jean-Philippe Brucker 
> +L:   virtualizat...@lists.linux-foundation.org
> +S:   Maintained
> +F:   drivers/iommu/virtio-iommu.c
> +F:   include/uapi/linux/virtio_iommu.h
> +
>  VIRTUAL BOX GUEST DEVICE DRIVER
>  M:   Hans de Goede 
>  M:   Arnd Bergmann 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index d9a25715650e..efdeaaeee0e0 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -435,4 +435,15 @@ config QCOM_IOMMU
>   help
> Support for IOMMU on certain Qualcomm SoCs.
>  
> +config VIRTIO_IOMMU
> + bool "Virtio IOMMU driver"
> + depends on VIRTIO=y
> + select IOMMU_API
> + select INTERVAL_TREE
> + select ARM_DMA_USE_IOMMU if ARM
> + help
> +   Para-virtualised IOMMU driver with virtio.
> +
> +   Say Y here if you intend to run this kernel as a guest.
> +
>  endif # IOMMU_SUPPORT
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index a158a68c8ea8..48d831a39281 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -32,3 +32,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
>  obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
> +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> new file mode 100644
> index ..2a9cb6285a1e
> --- /dev/null
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -0,0 +1,918 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Virtio driver for the paravirtualized IOMMU
> + *
> + * Copyright (C) 2018 Arm Limited
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#define MSI_IOVA_BASE0x800
> +#define MSI_IOVA_LENGTH  0x10
> +
> +#define VIOMMU_REQUEST_VQ 0
> +#define VIOMMU_NR_VQS 1
> +
> +struct viommu_dev {
> + struct iommu_device iommu;
> + struct device   *dev;
> + struct virtio_device *vdev;
> +
> + struct ida  domain_ids;
> +
> + struct virtqueue *vqs[VIOMMU_NR_VQS];
> + spinlock_t  request_lock;
> + struct list_head requests;
> +
> + /* Device configuration */
> + struct iommu_domain_geometry geometry;
> + u64 pgsize_bitmap;
> + u8  domain_bits;
> +};
> +
> +struct viommu_mapping {
> + phys_addr_t paddr;
> + struct interval_tree_node   iova;
> + u32 flags;
> +};
> +
> +struct viommu_domain {
> + struct iommu_domain domain;
> + struct viommu_dev   *viommu;
> + struct mutex mutex;
The same naming and comment as in the SMMU driver may help here:
struct mutex init_mutex; /* Protects viommu pointer */
> + unsigned int id;
> +
> + spinlock_t  mappings_lock;
> + struct rb_root_cached   mappings;
> +
> + unsigned long   nr_endpoints;
> +};
> +
> +struct viommu_endpoint {
> + struct viommu_dev   *viommu;
> + struct viommu_domain *vdomain;
> +};
> +
> +struct viommu_request {
> + struct list_head list;
> + void *writeback;
> + unsigned int

Re: [PATCH 1/9] mm: Introduce new vm_insert_range API

2018-11-15 Thread Souptick Joarder

On Thu, Nov 15, 2018 at 11:44 PM Randy Dunlap wrote:
>
> On 11/15/18 7:45 AM, Souptick Joarder wrote:
> > Previously drivers have their own way of mapping range of
> > kernel pages/memory into user vma and this was done by
> > invoking vm_insert_page() within a loop.
> >
> > As this pattern is common across different drivers, it can
> > be generalized by creating a new function and use it across
> > the drivers.
> >
> > vm_insert_range is the new API which will be used to map a
> > range of kernel memory/pages to user vma.
> >
> > Signed-off-by: Souptick Joarder 
> > Reviewed-by: Matthew Wilcox 
> > ---
> >  include/linux/mm_types.h |  3 +++
> >  mm/memory.c  | 28 
> >  mm/nommu.c   |  7 +++
> >  3 files changed, 38 insertions(+)
>
> Hi,
>
> What is the opposite of vm_insert_range() or even of vm_insert_page()?
> or is there no need for that?

There is no opposite function for vm_insert_range() / vm_insert_page().
My understanding is that, on any error, the mmap handler returns the
error to the user process and user space decides the next action. So the
next time the mmap handler is invoked, it will map from the beginning.
Correct me if I am wrong.
>
>
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 15c417e..da904ed 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1478,6 +1478,34 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
> >  }
> >
> >  /**
> > + * vm_insert_range - insert range of kernel pages into user vma
> > + * @vma: user vma to map to
> > + * @addr: target user address of this page
> > + * @pages: pointer to array of source kernel pages
> > + * @page_count: no. of pages need to insert into user vma
>
> s/no./number/

I didn't get it ??
>
> > + *
> > + * This allows drivers to insert range of kernel pages they've allocated
> > + * into a user vma. This is a generic function which drivers can use
> > + * rather than using their own way of mapping range of kernel pages into
> > + * user vma.
> > + */
> > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > + struct page **pages, unsigned long page_count)
> > +{
> > + unsigned long uaddr = addr;
> > + int ret = 0, i;
> > +
> > + for (i = 0; i < page_count; i++) {
> > + ret = vm_insert_page(vma, uaddr, pages[i]);
> > + if (ret < 0)
> > + return ret;
>
> For a non-trivial value of page_count:
> Is it a problem if vm_insert_page() succeeds for several pages
> and then fails?

No, it will be treated as a total failure and the mmap handler will
return the error to user space.
>
> > + uaddr += PAGE_SIZE;
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +/**
> >   * vm_insert_page - insert single page into user vma
> >   * @vma: user vma to map to
> >   * @addr: target user address of this page
>
>
> thanks.
> --
> ~Randy

Re: [PATCH 1/2] csky, h8300, riscv: remove leftovers

2018-11-15 Thread Palmer Dabbelt


On Fri, 09 Nov 2018 01:00:07 PST (-0800), Christoph Hellwig wrote:

There has been no  for a long time, which also means
there is no point in using it from asm-generic.

Signed-off-by: Christoph Hellwig 
---
 arch/csky/include/asm/Kbuild  | 1 -
 arch/h8300/include/asm/Kbuild | 1 -
 arch/riscv/include/asm/Kbuild | 1 -
 3 files changed, 3 deletions(-)

diff --git a/arch/csky/include/asm/Kbuild b/arch/csky/include/asm/Kbuild
index 2a0abe8f2a35..7c48a123300d 100644
--- a/arch/csky/include/asm/Kbuild
+++ b/arch/csky/include/asm/Kbuild
@@ -34,7 +34,6 @@ generic-y += pci.h
 generic-y += percpu.h
 generic-y += preempt.h
 generic-y += qrwlock.h
-generic-y += scatterlist.h
 generic-y += sections.h
 generic-y += serial.h
 generic-y += shm.h
diff --git a/arch/h8300/include/asm/Kbuild b/arch/h8300/include/asm/Kbuild
index a5d0b2991f47..32f0c8952147 100644
--- a/arch/h8300/include/asm/Kbuild
+++ b/arch/h8300/include/asm/Kbuild
@@ -36,7 +36,6 @@ generic-y += parport.h
 generic-y += percpu.h
 generic-y += pgalloc.h
 generic-y += preempt.h
-generic-y += scatterlist.h
 generic-y += sections.h
 generic-y += serial.h
 generic-y += sizes.h
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 6a646d9ea780..011cc7c7b3ad 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -37,7 +37,6 @@ generic-y += poll.h
 generic-y += posix_types.h
 generic-y += preempt.h
 generic-y += resource.h
-generic-y += scatterlist.h
 generic-y += sections.h
 generic-y += sembuf.h
 generic-y += serial.h


Reviewed-by: Palmer Dabbelt 

Re: [PATCH v4 6/8] vfio/mdev: Add iommu place holders in mdev_device

2018-11-15 Thread Lu Baolu


Hi,

On 11/16/18 5:31 AM, Kirti Wankhede wrote:



On 11/7/2018 7:18 AM, Lu Baolu wrote:

Hi Alex,

On 11/7/18 7:53 AM, Alex Williamson wrote:

On Mon,  5 Nov 2018 15:34:06 +0800
Lu Baolu  wrote:


A parent device might create different types of mediated
devices. For example, a mediated device could be created
by the parent device with full isolation and protection
provided by the IOMMU. One usage case could be found on
Intel platforms where a mediated device is an assignable
subset of a PCI, the DMA requests on behalf of it are all
tagged with a PASID. Since IOMMU supports PASID-granular
translations (scalable mode in vt-d 3.0), this mediated
device could be individually protected and isolated by an
IOMMU.

This patch adds two new members in struct mdev_device:
* iommu_device
    - This, if set, indicates that the mediated device could
  be fully isolated and protected by IOMMU via attaching
  an iommu domain to this device. If empty, it indicates
  using vendor defined isolation.

* iommu_domain
    - This is a place holder for an iommu domain. A domain
  could be stored here for later use once it has been
  attached to the iommu_device of this mdev.

Below helpers are added to set and get above iommu device
and iommu domain pointers.

* mdev_set/get_iommu_device(dev, iommu_device)
    - Set or get the iommu device which represents this mdev
  in IOMMU's device scope. Drivers don't need to set the
  iommu device if it uses vendor defined isolation.

* mdev_set/get_iommu_domain(domain)
    - An iommu domain which has been attached to the iommu
  device in order to protect and isolate the mediated
  device will be kept in the mdev data structure and
  could be retrieved later.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Liu Yi L 
Suggested-by: Kevin Tian 
Suggested-by: Alex Williamson 
Signed-off-by: Lu Baolu 
---
   drivers/vfio/mdev/mdev_core.c    | 36 
   drivers/vfio/mdev/mdev_private.h |  2 ++
   include/linux/mdev.h | 23 
   3 files changed, 61 insertions(+)

diff --git a/drivers/vfio/mdev/mdev_core.c
b/drivers/vfio/mdev/mdev_core.c
index 0212f0ee8aea..5119809225c5 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -390,6 +390,42 @@ int mdev_device_remove(struct device *dev, bool
force_remove)
   return 0;
   }
   +int mdev_set_iommu_device(struct device *dev, struct device
*iommu_device)
+{
+    struct mdev_device *mdev = to_mdev_device(dev);
+
+    mdev->iommu_device = iommu_device;
+
+    return 0;
+}
+EXPORT_SYMBOL(mdev_set_iommu_device);
+
+struct device *mdev_get_iommu_device(struct device *dev)
+{
+    struct mdev_device *mdev = to_mdev_device(dev);
+
+    return mdev->iommu_device;
+}
+EXPORT_SYMBOL(mdev_get_iommu_device);
+
+int mdev_set_iommu_domain(struct device *dev, void *domain)
+{
+    struct mdev_device *mdev = to_mdev_device(dev);
+
+    mdev->iommu_domain = domain;
+
+    return 0;
+}
+EXPORT_SYMBOL(mdev_set_iommu_domain);
+
+void *mdev_get_iommu_domain(struct device *dev)
+{
+    struct mdev_device *mdev = to_mdev_device(dev);
+
+    return mdev->iommu_domain;
+}
+EXPORT_SYMBOL(mdev_get_iommu_domain);
+
   static int __init mdev_init(void)
   {
   return mdev_bus_register();
diff --git a/drivers/vfio/mdev/mdev_private.h
b/drivers/vfio/mdev/mdev_private.h
index b5819b7d7ef7..c01518068e84 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -34,6 +34,8 @@ struct mdev_device {
   struct list_head next;
   struct kobject *type_kobj;
   bool active;
+    struct device *iommu_device;
+    void *iommu_domain;
   };
     #define to_mdev_device(dev)    container_of(dev, struct
mdev_device, dev)
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index b6e048e1045f..c46777d3e568 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -14,6 +14,29 @@
   #define MDEV_H
     struct mdev_device;
+struct iommu_domain;
+
+/*
+ * Called by the parent device driver to set the PCI device which
represents


s/PCI //

There is no requirement or expectation that the device is PCI.



Fair enough.


+ * this mdev in iommu protection scope. By default, the iommu device
is NULL,
+ * that indicates using vendor defined isolation.
+ *
+ * @dev: the mediated device that iommu will isolate.
+ * @iommu_device: a pci device which represents the iommu for @dev.
+ *
+ * Return 0 for success, otherwise negative error value.
+ */
+int mdev_set_iommu_device(struct device *dev, struct device
*iommu_device);
+
+struct device *mdev_get_iommu_device(struct device *dev);
+
+/*
+ * Called by vfio iommu modules to save the iommu domain after a
domain being
+ * attached to the mediated device.
+ */
+int mdev_set_iommu_domain(struct device *dev, void *domain);
+
+void *mdev_get_iommu_domain(struct device *dev);


I can't say I really understand the purpose of this, the cover letter
indicates this is a placeholder,

Re: [PATCH v4 6/8] vfio/mdev: Add iommu place holders in mdev_device

2018-11-15 Thread Kirti Wankhede



On 11/7/2018 7:18 AM, Lu Baolu wrote:
> Hi Alex,
> 
> On 11/7/18 7:53 AM, Alex Williamson wrote:
>> On Mon,  5 Nov 2018 15:34:06 +0800
>> Lu Baolu  wrote:
>>
>>> A parent device might create different types of mediated
>>> devices. For example, a mediated device could be created
>>> by the parent device with full isolation and protection
>>> provided by the IOMMU. One usage case could be found on
>>> Intel platforms where a mediated device is an assignable
>>> subset of a PCI, the DMA requests on behalf of it are all
>>> tagged with a PASID. Since IOMMU supports PASID-granular
>>> translations (scalable mode in vt-d 3.0), this mediated
>>> device could be individually protected and isolated by an
>>> IOMMU.
>>>
>>> This patch adds two new members in struct mdev_device:
>>> * iommu_device
>>>    - This, if set, indicates that the mediated device could
>>>  be fully isolated and protected by IOMMU via attaching
>>>  an iommu domain to this device. If empty, it indicates
>>>  using vendor defined isolation.
>>>
>>> * iommu_domain
>>>    - This is a place holder for an iommu domain. A domain
>>>  could be stored here for later use once it has been
>>>  attached to the iommu_device of this mdev.
>>>
>>> Below helpers are added to set and get above iommu device
>>> and iommu domain pointers.
>>>
>>> * mdev_set/get_iommu_device(dev, iommu_device)
>>>    - Set or get the iommu device which represents this mdev
>>>  in IOMMU's device scope. Drivers don't need to set the
>>>  iommu device if it uses vendor defined isolation.
>>>
>>> * mdev_set/get_iommu_domain(domain)
>>>    - An iommu domain which has been attached to the iommu
>>>  device in order to protect and isolate the mediated
>>>  device will be kept in the mdev data structure and
>>>  could be retrieved later.
>>>
>>> Cc: Ashok Raj 
>>> Cc: Jacob Pan 
>>> Cc: Kevin Tian 
>>> Cc: Liu Yi L 
>>> Suggested-by: Kevin Tian 
>>> Suggested-by: Alex Williamson 
>>> Signed-off-by: Lu Baolu 
>>> ---
>>>   drivers/vfio/mdev/mdev_core.c    | 36 
>>>   drivers/vfio/mdev/mdev_private.h |  2 ++
>>>   include/linux/mdev.h | 23 
>>>   3 files changed, 61 insertions(+)
>>>
>>> diff --git a/drivers/vfio/mdev/mdev_core.c
>>> b/drivers/vfio/mdev/mdev_core.c
>>> index 0212f0ee8aea..5119809225c5 100644
>>> --- a/drivers/vfio/mdev/mdev_core.c
>>> +++ b/drivers/vfio/mdev/mdev_core.c
>>> @@ -390,6 +390,42 @@ int mdev_device_remove(struct device *dev, bool
>>> force_remove)
>>>   return 0;
>>>   }
>>>   +int mdev_set_iommu_device(struct device *dev, struct device
>>> *iommu_device)
>>> +{
>>> +    struct mdev_device *mdev = to_mdev_device(dev);
>>> +
>>> +    mdev->iommu_device = iommu_device;
>>> +
>>> +    return 0;
>>> +}
>>> +EXPORT_SYMBOL(mdev_set_iommu_device);
>>> +
>>> +struct device *mdev_get_iommu_device(struct device *dev)
>>> +{
>>> +    struct mdev_device *mdev = to_mdev_device(dev);
>>> +
>>> +    return mdev->iommu_device;
>>> +}
>>> +EXPORT_SYMBOL(mdev_get_iommu_device);
>>> +
>>> +int mdev_set_iommu_domain(struct device *dev, void *domain)
>>> +{
>>> +    struct mdev_device *mdev = to_mdev_device(dev);
>>> +
>>> +    mdev->iommu_domain = domain;
>>> +
>>> +    return 0;
>>> +}
>>> +EXPORT_SYMBOL(mdev_set_iommu_domain);
>>> +
>>> +void *mdev_get_iommu_domain(struct device *dev)
>>> +{
>>> +    struct mdev_device *mdev = to_mdev_device(dev);
>>> +
>>> +    return mdev->iommu_domain;
>>> +}
>>> +EXPORT_SYMBOL(mdev_get_iommu_domain);
>>> +
>>>   static int __init mdev_init(void)
>>>   {
>>>   return mdev_bus_register();
>>> diff --git a/drivers/vfio/mdev/mdev_private.h
>>> b/drivers/vfio/mdev/mdev_private.h
>>> index b5819b7d7ef7..c01518068e84 100644
>>> --- a/drivers/vfio/mdev/mdev_private.h
>>> +++ b/drivers/vfio/mdev/mdev_private.h
>>> @@ -34,6 +34,8 @@ struct mdev_device {
>>>   struct list_head next;
>>>   struct kobject *type_kobj;
>>>   bool active;
>>> +    struct device *iommu_device;
>>> +    void *iommu_domain;
>>>   };
>>>     #define to_mdev_device(dev)    container_of(dev, struct
>>> mdev_device, dev)
>>> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
>>> index b6e048e1045f..c46777d3e568 100644
>>> --- a/include/linux/mdev.h
>>> +++ b/include/linux/mdev.h
>>> @@ -14,6 +14,29 @@
>>>   #define MDEV_H
>>>     struct mdev_device;
>>> +struct iommu_domain;
>>> +
>>> +/*
>>> + * Called by the parent device driver to set the PCI device which
>>> represents
>>
>> s/PCI //
>>
>> There is no requirement or expectation that the device is PCI.
>>
> 
> Fair enough.
> 
>>> + * this mdev in iommu protection scope. By default, the iommu device
>>> is NULL,
>>> + * that indicates using vendor defined isolation.
>>> + *
>>> + * @dev: the mediated device that iommu will isolate.
>>> + * @iommu_device: a pci device which represents the iommu for @dev.
>>> + *
>>> + * Return 0 for success, otherwise negative error value.
>>> + */
>>> +int

Re: move the arm arch_dma_alloc implementation to common code

2018-11-15 Thread Robin Murphy


On 2018-11-15 11:50 am, Will Deacon wrote:

On Fri, Nov 09, 2018 at 08:52:38AM +0100, Christoph Hellwig wrote:

Can I get a quick review from the arm64 folks?  I think it should
be fine there as it basically is a code move, but an additional pair
or two of eyes always helps to weed out bugs.


I reviewed the arm64 parts, but it would be ideal if Robin could have a look
as well.


Yup, from a quick skim the general shape of the whole series looks 
pleasing, but I've been holding off going through it in detail until 
I've figured out what's up with the last thing I thought I'd reviewed 
exhaustively...


Either way I'll make some time for a proper look next week once I'm back.

Robin.

Re: move the arm arch_dma_alloc implementation to common code

2018-11-15 Thread Will Deacon

On Fri, Nov 09, 2018 at 08:52:38AM +0100, Christoph Hellwig wrote:
> Can I get a quick review from the arm64 folks?  I think it should
> be fine there as it basically is a code move, but an additional pair
> or two of eyes always helps to weed out bugs.

I reviewed the arm64 parts, but it would be ideal if Robin could have a look
as well.

Will

Re: [PATCH 4/9] dma-mapping: move the arm64 ncoherent alloc/free support to common code

2018-11-15 Thread Will Deacon

Hi Christoph,

Minor nit: typo in the subject "ncoherent".

On Mon, Nov 05, 2018 at 01:19:26PM +0100, Christoph Hellwig wrote:
> The arm64 codebase to implement coherent dma allocation for architectures
> with non-coherent DMA is a good start for a generic implementation, given
> that it uses the generic remap helpers, provides the atomic pool for
> allocations that can't sleep and still is relatively simple and well
> tested.  Move it to kernel/dma and allow architectures to opt into it
> using a config symbol.  Architectures just need to provide a new
> arch_dma_prep_coherent helper to write back and invalidate the caches
> for any memory that gets remapped for uncached access.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm64/Kconfig  |   2 +-
>  arch/arm64/mm/dma-mapping.c | 184 ++--
>  include/linux/dma-mapping.h |   5 +
>  include/linux/dma-noncoherent.h |   2 +
>  kernel/dma/Kconfig  |   6 ++
>  kernel/dma/remap.c  | 158 ++-
>  6 files changed, 181 insertions(+), 176 deletions(-)

I'm currently at LPC, so I've not been able to test this, but I've been
through the changes this morning and they look fine to me, so:

Reviewed-by: Will Deacon 

Hopefully we'll get the fallout from the previous changes addressed next
week.

Cheers,

Will

RE: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Mario.Limonciello




> -Original Message-
> From: Mika Westerberg 
> Sent: Thursday, November 15, 2018 1:01 PM
> To: Lorenzo Pieralisi
> Cc: Lukas Wunner; iommu@lists.linux-foundation.org; Joerg Roedel; David
> Woodhouse; Lu Baolu; Ashok Raj; Bjorn Helgaas; Rafael J. Wysocki; Jacob jun Pan;
> Andreas Noever; Michael Jamet; Yehezkel Bernat; Christian Kellner; Limonciello,
> Mario; Anthony Wong; linux-a...@vger.kernel.org; linux-...@vger.kernel.org;
> linux-ker...@vger.kernel.org
> Subject: Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices
> 
> 
> [EXTERNAL EMAIL]
> 
> On Thu, Nov 15, 2018 at 05:46:08PM +, Lorenzo Pieralisi wrote:
> > Do you really need to parse it if the dev->is_thunderbolt check is enough ?
> 
> Yes, we need to parse it one way or another. dev->is_thunderbolt is
> based on heuristics which do not apply anymore when the thing gets
> integrated in the SoC.
> 
> The _DSD is there already (on existing systems) and is being used by
> Windows so I don't understand why we cannot take advantage of it? Every
> new system with Thunderbolt ports will have it.

Furthermore, it's entirely in the BIOS writers' best interest to do this
correctly, as it applies proper policy on Windows as well.

I wouldn't be surprised if WHCK failed if it was done wrong.

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Mika Westerberg

On Thu, Nov 15, 2018 at 08:27:41PM +0100, Lukas Wunner wrote:
> On Thu, Nov 15, 2018 at 09:10:26PM +0200, Mika Westerberg wrote:
> > I was thinking we could cover all these with is_external filling them
> > based on the _DSD or some other means in the kernel.
> > 
> > We would then treat all such devices as "untrusted" by default.
> 
> Tinfoil hat on, even internal devices could be malicious.
> What's the downside of enabling the feature for everything?

Mostly performance, I think. That's the main reason we put all
non-external devices into passthrough IOMMU mode.
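
As a sketch of that policy, assuming a per-device is_external flag as proposed in this series (the enum and function names below are hypothetical, not kernel API):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two IOMMU modes discussed here. */
enum sketch_domain_type {
	SKETCH_DOMAIN_PASSTHROUGH,	/* DMA bypasses translation: no isolation */
	SKETCH_DOMAIN_TRANSLATED,	/* DMA goes through IOMMU mappings */
};

struct sketch_dev {
	bool is_external;
};

/* External devices get full translation; everything else is passed through. */
enum sketch_domain_type sketch_default_domain(const struct sketch_dev *dev)
{
	return dev->is_external ? SKETCH_DOMAIN_TRANSLATED
				: SKETCH_DOMAIN_PASSTHROUGH;
}
```

Enabling the feature for everything would amount to returning the translated mode unconditionally, which is where the performance cost comes in.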

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Lukas Wunner

On Thu, Nov 15, 2018 at 09:10:26PM +0200, Mika Westerberg wrote:
> I was thinking we could cover all these with is_external filling them
> based on the _DSD or some other means in the kernel.
> 
> We would then treat all such devices as "untrusted" by default.

Tinfoil hat on, even internal devices could be malicious.
What's the downside of enabling the feature for everything?

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Mika Westerberg

On Thu, Nov 15, 2018 at 07:58:13PM +0200, Yehezkel Bernat wrote:
> From what I know, there are more devices that suffer from similar security
> issues like Thunderbolt, e.g. FireWire [1].
> My assumption is that the same protection may be applied to such devices too,
> even if currently it sounds like vendors care mostly about Thunderbolt
> (probably because it removes the need for user approval for device
> connection; it becomes a simple plug-and-play experience).

FireWire is kind of different but there are connectors such as
ExpressCard and NVMe (over U.2 connector) which carry PCIe and are
relatively easy to access without need for a screwdriver. AFAIK some
eGPUs are also using some other proprietary (non-TBT) connector that
carries PCIe.

I was thinking we could cover all of these with is_external, setting it
based on the _DSD or some other means in the kernel.

We would then treat all such devices as "untrusted" by default.

> Thus, I don't think binding it with dev->is_thunderbolt is the correct
> thing to do.

One option that I suggested already is that we keep both and mark all
is_thunderbolt devices as is_external as well. But I guess this is up to
Bjorn and Rafael to decide :-)

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Mika Westerberg

On Thu, Nov 15, 2018 at 05:46:08PM +0000, Lorenzo Pieralisi wrote:
> Do you really need to parse it if the dev->is_thunderbolt check is enough ?

Yes, we need to parse it one way or another. dev->is_thunderbolt is
based on heuristics which do not apply anymore when the thing gets
integrated in the SoC.

The _DSD is there already (on existing systems) and is being used by
Windows so I don't understand why we cannot take advantage of it? Every
new system with Thunderbolt ports will have it.

Re: [PATCH 1/9] mm: Introduce new vm_insert_range API

2018-11-15 Thread Randy Dunlap

On 11/15/18 7:45 AM, Souptick Joarder wrote:
> Previously drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
> 
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
> 
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
> 
> Signed-off-by: Souptick Joarder 
> Reviewed-by: Matthew Wilcox 
> ---
>  include/linux/mm_types.h |  3 +++
>  mm/memory.c  | 28 
>  mm/nommu.c   |  7 +++
>  3 files changed, 38 insertions(+)

Hi,

What is the opposite of vm_insert_range() or even of vm_insert_page()?
Or is there no need for one?


> diff --git a/mm/memory.c b/mm/memory.c
> index 15c417e..da904ed 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1478,6 +1478,34 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>  
>  /**
> + * vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @addr: target user address of this page
> + * @pages: pointer to array of source kernel pages
> + * @page_count: no. of pages need to insert into user vma

s/no./number/

> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma. This is a generic function which drivers can use
> + * rather than using their own way of mapping range of kernel pages into
> + * user vma.
> + */
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> + struct page **pages, unsigned long page_count)
> +{
> + unsigned long uaddr = addr;
> + int ret = 0, i;
> +
> + for (i = 0; i < page_count; i++) {
> + ret = vm_insert_page(vma, uaddr, pages[i]);
> + if (ret < 0)
> + return ret;

For a non-trivial value of page_count:
Is it a problem if vm_insert_page() succeeds for several pages
and then fails?

> + uaddr += PAGE_SIZE;
> + }
> +
> + return ret;
> +}
> +
> +/**
>   * vm_insert_page - insert single page into user vma
>   * @vma: user vma to map to
>   * @addr: target user address of this page


thanks.
-- 
~Randy

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Yehezkel Bernat

On Thu, Nov 15, 2018 at 7:46 PM Lorenzo Pieralisi wrote:
>
> On Thu, Nov 15, 2018 at 02:16:27PM +0200, Mika Westerberg wrote:
> > On Thu, Nov 15, 2018 at 01:07:36PM +0100, Lukas Wunner wrote:
> > > On Thu, Nov 15, 2018 at 01:37:37PM +0200, Mika Westerberg wrote:
> > > > On Thu, Nov 15, 2018 at 11:13:56AM +, Lorenzo Pieralisi wrote:
> > > > > I have strong objections to the way these bindings have been forced
> > > > > upon everybody; if that's the way *generic* ACPI bindings are
> > > > > specified I wonder why there still exists an ACPI specification and
> > > > > related working group.
> > > > >
> > > > > I personally (but that's Bjorn and Rafael choice) think that this is
> > > > > not a change that belongs in PCI core, ACPI bindings are ill-defined
> > > > > and device tree bindings are non-existing.
> > > >
> > > > Any idea where should I put it then? These systems are already out there
> > > > and we need to support them one way or another.
> > >
> > > I suppose those are all Thunderbolt, so could be handled by the
> > > existing ->is_thunderbolt bit?
> > >
> > > It was said in this thread that ->is_external is more generic in
> > > that it could also be used on PCIe slots, however that use case
> > > doesn't appear to lend itself to the "plug in while laptop owner
> > > is getting coffee" attack.  To access PCIe slots on a server you
> > > normally need access to a data center.  On a desktop, you usually
> > > have to open the case, by which time the coffee may already have
> > > been fetched.  So frankly the binding seems a bit over-engineered
> > > to me and yet another thing that BIOS writers may get wrong.
> >
> > I would not say it should include PCIe slots but there are other cables
> > that carry PCIe and I was thinking we could make it to support those as
> > well.
> >
> > I have no problem using is_thunderbolt here, though if we don't want to
> > support non-Thunderbolt external devices this way.
> >
> > However, the question here is more that where I should put the _DSD
> > parsing code if it is not suitable to be placed inside PCI/ACPI core as
> > I've done in this patch? ;-)
>
> Do you really need to parse it if the dev->is_thunderbolt check is enough ?

From what I know, there are more devices that suffer from security issues
similar to Thunderbolt's, e.g. FireWire [1].
My assumption is that the same protection may be applied to such devices too,
even if currently it sounds like vendors care mostly about Thunderbolt (probably
because it removes the need for user approval for device connection; it becomes
a simple plug-and-play experience).

Thus, I don't think binding it with dev->is_thunderbolt is the correct
thing to do.

[1] https://github.com/carmaa/inception

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Lorenzo Pieralisi

On Thu, Nov 15, 2018 at 02:16:27PM +0200, Mika Westerberg wrote:
> On Thu, Nov 15, 2018 at 01:07:36PM +0100, Lukas Wunner wrote:
> > On Thu, Nov 15, 2018 at 01:37:37PM +0200, Mika Westerberg wrote:
> > > On Thu, Nov 15, 2018 at 11:13:56AM +, Lorenzo Pieralisi wrote:
> > > > I have strong objections to the way these bindings have been forced upon
> > > > everybody; if that's the way *generic* ACPI bindings are specified I
> > > > wonder why there still exists an ACPI specification and related working
> > > > group.
> > > > 
> > > > I personally (but that's Bjorn and Rafael choice) think that this is
> > > > not a change that belongs in PCI core, ACPI bindings are ill-defined
> > > > and device tree bindings are non-existing.
> > > 
> > > Any idea where should I put it then? These systems are already out there
> > > and we need to support them one way or another.
> > 
> > I suppose those are all Thunderbolt, so could be handled by the
> > existing ->is_thunderbolt bit?
> > 
> > It was said in this thread that ->is_external is more generic in
> > that it could also be used on PCIe slots, however that use case
> > doesn't appear to lend itself to the "plug in while laptop owner
> > is getting coffee" attack.  To access PCIe slots on a server you
> > normally need access to a data center.  On a desktop, you usually
> > have to open the case, by which time the coffee may already have
> > been fetched.  So frankly the binding seems a bit over-engineered
> > to me and yet another thing that BIOS writers may get wrong.
> 
> I would not say it should include PCIe slots but there are other cables
> that carry PCIe and I was thinking we could make it to support those as
> well.
> 
> I have no problem using is_thunderbolt here, though if we don't want to
> support non-Thunderbolt external devices this way.
> 
> However, the question here is more that where I should put the _DSD
> parsing code if it is not suitable to be placed inside PCI/ACPI core as
> I've done in this patch? ;-)

Do you really need to parse it if the dev->is_thunderbolt check is enough?

Thanks,
Lorenzo

[PATCH v4 4/7] PCI: OF: Initialize dev->fwnode appropriately

2018-11-15 Thread Jean-Philippe Brucker

For PCI devices that have an OF node, set the fwnode as well. This way
drivers that rely on fwnode don't need the special case described by
commit f94277af03ea ("of/platform: Initialise dev->fwnode appropriately").

Acked-by: Bjorn Helgaas 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/pci/of.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 4c4217d0c3f1..c272ecfcd038 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -21,12 +21,15 @@ void pci_set_of_node(struct pci_dev *dev)
return;
dev->dev.of_node = of_pci_find_child_device(dev->bus->dev.of_node,
dev->devfn);
+   if (dev->dev.of_node)
+   dev->dev.fwnode = &dev->dev.of_node->fwnode;
 }
 
 void pci_release_of_node(struct pci_dev *dev)
 {
of_node_put(dev->dev.of_node);
dev->dev.of_node = NULL;
+   dev->dev.fwnode = NULL;
 }
 
 void pci_set_bus_of_node(struct pci_bus *bus)
@@ -35,12 +38,16 @@ void pci_set_bus_of_node(struct pci_bus *bus)
bus->dev.of_node = pcibios_get_phb_of_node(bus);
else
bus->dev.of_node = of_node_get(bus->self->dev.of_node);
+
+   if (bus->dev.of_node)
+   bus->dev.fwnode = &bus->dev.of_node->fwnode;
 }
 
 void pci_release_bus_of_node(struct pci_bus *bus)
 {
of_node_put(bus->dev.of_node);
bus->dev.of_node = NULL;
+   bus->dev.fwnode = NULL;
 }
 
 struct device_node * __weak pcibios_get_phb_of_node(struct pci_bus *bus)
-- 
2.19.1


[PATCH v4 3/7] of: Allow the iommu-map property to omit untranslated devices

2018-11-15 Thread Jean-Philippe Brucker

In PCI root complex nodes, the iommu-map property describes the IOMMU that
translates each endpoint. On some platforms, the IOMMU itself is presented
as a PCI endpoint (e.g. AMD IOMMU and virtio-iommu). This isn't supported
by the current OF driver, which expects all endpoints to have an IOMMU.
Allow the iommu-map property to have gaps.

Relaxing of_map_rid() also allows the msi-map property to have gaps, which
is invalid since MSIs always reach an MSI controller. In that case
pci_msi_setup_msi_irqs() will return an error when attempting to find the
device's MSI domain.
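
As a rough illustration of the relaxed lookup, here is a userspace sketch
(not the kernel's of_map_rid(); the struct and function names are made up
for this example). Each entry models one iommu-map tuple, and a requester
ID that falls into a gap is passed through unchanged instead of being
treated as an error:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of one iommu-map entry: requester IDs in
 * [rid_base, rid_base + length) map to out_base + (rid - rid_base).
 */
struct rid_map_entry {
	uint32_t rid_base;
	uint32_t out_base;
	uint32_t length;
};

/*
 * Returns 1 if an entry translated the RID, 0 if the RID fell into a gap.
 * In the gap case *id_out is set to the untranslated RID, mirroring the
 * "bypasses translation" behaviour this patch introduces.
 */
static int map_rid(const struct rid_map_entry *map, size_t n,
		   uint32_t rid, uint32_t *id_out)
{
	for (size_t i = 0; i < n; i++) {
		if (rid >= map[i].rid_base &&
		    rid - map[i].rid_base < map[i].length) {
			*id_out = rid - map[i].rid_base + map[i].out_base;
			return 1;
		}
	}

	/* No match: the device is not translated by any IOMMU */
	*id_out = rid;
	return 0;
}
```

With a map made of the two entries { 0x0, 0x0, 0x8 } and { 0x9, 0x9, 0xfff7 },
RID 0x8 (the IOMMU itself at 00:01.0) hits the gap and comes back unchanged.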

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/of/base.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 09692c9b32a7..99f6bfa9b898 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -2237,8 +2237,12 @@ int of_map_rid(struct device_node *np, u32 rid,
return 0;
}
 
-   pr_err("%pOF: Invalid %s translation - no match for rid 0x%x on %pOF\n",
-   np, map_name, rid, target && *target ? *target : NULL);
-   return -EFAULT;
+   pr_info("%pOF: no %s translation for rid 0x%x on %pOF\n", np, map_name,
+   rid, target && *target ? *target : NULL);
+
+   /* Bypasses translation */
+   if (id_out)
+   *id_out = rid;
+   return 0;
 }
 EXPORT_SYMBOL_GPL(of_map_rid);
-- 
2.19.1


[PATCH v4 6/7] iommu/virtio: Add probe request

2018-11-15 Thread Jean-Philippe Brucker

When the device offers the probe feature, send a probe request for each
device managed by the IOMMU. Extract RESV_MEM information. When we
encounter an MSI doorbell region, set it up as an IOMMU_RESV_MSI region.
This will tell other subsystems that there is no need to map the MSI
doorbell in the virtio-iommu, because MSIs bypass it.
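
For readers unfamiliar with the reply layout, the following is a simplified
userspace sketch of walking such a property list. It assumes each property
starts with a little-endian 16-bit type and 16-bit length, as the parsing
loop in this patch does. The constant values and the function names are
assumptions made for the example, and the type mask applied by the driver is
left out:

```c
#include <stddef.h>
#include <stdint.h>

#define PROP_T_NONE      0	/* assumed value: end of property list */
#define PROP_T_RESV_MEM  1	/* assumed value: reserved memory region */
#define PROP_HEAD_SIZE   4	/* 16-bit type followed by 16-bit length */

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk the property list in buf and count RESV_MEM properties.
 * Returns the count, or -1 if a property claims more payload than
 * the buffer holds.
 */
static int count_resv_mem(const uint8_t *buf, size_t size)
{
	size_t cur = 0;
	int count = 0;

	while (cur + PROP_HEAD_SIZE <= size) {
		uint16_t type = get_le16(buf + cur);
		uint16_t len = get_le16(buf + cur + 2);

		if (type == PROP_T_NONE)
			break;
		if (len > size - cur - PROP_HEAD_SIZE)
			return -1;
		if (type == PROP_T_RESV_MEM)
			count++;

		/* Skip the header and the payload to reach the next property */
		cur += PROP_HEAD_SIZE + (size_t)len;
	}

	return count;
}
```

The bounds check before advancing is the part worth keeping in any real
parser, since the length field comes from the device.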

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c  | 156 --
 include/uapi/linux/virtio_iommu.h |  38 
 2 files changed, 188 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 2a9cb6285a1e..c547ebd79c43 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -46,6 +46,7 @@ struct viommu_dev {
struct iommu_domain_geometry geometry;
u64 pgsize_bitmap;
u8  domain_bits;
+   u32 probe_size;
 };
 
 struct viommu_mapping {
@@ -67,8 +68,10 @@ struct viommu_domain {
 };
 
 struct viommu_endpoint {
+   struct device   *dev;
struct viommu_dev   *viommu;
struct viommu_domain *vdomain;
+   struct list_head resv_regions;
 };
 
 struct viommu_request {
@@ -119,6 +122,9 @@ static off_t viommu_get_req_offset(struct viommu_dev *viommu,
 {
size_t tail_size = sizeof(struct virtio_iommu_req_tail);
 
+   if (req->type == VIRTIO_IOMMU_T_PROBE)
+   return len - viommu->probe_size - tail_size;
+
return len - tail_size;
 }
 
@@ -394,6 +400,110 @@ static int viommu_replay_mappings(struct viommu_domain *vdomain)
return ret;
 }
 
+static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
+  struct virtio_iommu_probe_resv_mem *mem,
+  size_t len)
+{
+   size_t size;
+   u64 start64, end64;
+   phys_addr_t start, end;
+   struct iommu_resv_region *region = NULL;
+   unsigned long prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+   start = start64 = le64_to_cpu(mem->start);
+   end = end64 = le64_to_cpu(mem->end);
+   size = end64 - start64 + 1;
+
+   /* Catch any overflow, including the unlikely end64 - start64 + 1 = 0 */
+   if (start != start64 || end != end64 || size < end64 - start64)
+   return -EOVERFLOW;
+
+   if (len < sizeof(*mem))
+   return -EINVAL;
+
+   switch (mem->subtype) {
+   default:
+   dev_warn(vdev->dev, "unknown resv mem subtype 0x%x\n",
+mem->subtype);
+   /* Fall-through */
+   case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
+   region = iommu_alloc_resv_region(start, size, 0,
+IOMMU_RESV_RESERVED);
+   break;
+   case VIRTIO_IOMMU_RESV_MEM_T_MSI:
+   region = iommu_alloc_resv_region(start, size, prot,
+IOMMU_RESV_MSI);
+   break;
+   }
+   if (!region)
+   return -ENOMEM;
+
+   list_add(&vdev->resv_regions, &region->list);
+   return 0;
+}
+
+static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
+{
+   int ret;
+   u16 type, len;
+   size_t cur = 0;
+   size_t probe_len;
+   struct virtio_iommu_req_probe *probe;
+   struct virtio_iommu_probe_property *prop;
+   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+   struct viommu_endpoint *vdev = fwspec->iommu_priv;
+
+   if (!fwspec->num_ids)
+   return -EINVAL;
+
+   probe_len = sizeof(*probe) + viommu->probe_size +
+   sizeof(struct virtio_iommu_req_tail);
+   probe = kzalloc(probe_len, GFP_KERNEL);
+   if (!probe)
+   return -ENOMEM;
+
+   probe->head.type = VIRTIO_IOMMU_T_PROBE;
+   /*
+* For now, assume that properties of an endpoint that outputs multiple
+* IDs are consistent. Only probe the first one.
+*/
+   probe->endpoint = cpu_to_le32(fwspec->ids[0]);
+
+   ret = viommu_send_req_sync(viommu, probe, probe_len);
+   if (ret)
+   goto out_free;
+
+   prop = (void *)probe->properties;
+   type = le16_to_cpu(prop->type) & VIRTIO_IOMMU_PROBE_T_MASK;
+
+   while (type != VIRTIO_IOMMU_PROBE_T_NONE &&
+  cur < viommu->probe_size) {
+   len = le16_to_cpu(prop->length) + sizeof(*prop);
+
+   switch (type) {
+   case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
+   ret = viommu_add_resv_mem(vdev, (void *)prop, len);
+   break;
+   default:
+   dev_err(dev, "unknown viommu prop 0x%x\n", type);
+   }
+
+   if (ret)
+   dev_err(dev, "failed to parse viommu prop 0x%x\n", type);
+
+   cur +=

[PATCH v4 7/7] iommu/virtio: Add event queue

2018-11-15 Thread Jean-Philippe Brucker

The event queue offers a way for the device to report access faults from
endpoints. It is implemented on virtqueue #1. Whenever the host needs to
signal a fault, it fills one of the buffers offered by the guest and
interrupts it.
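
To show what a decoded fault report might look like, here is a small
userspace sketch that formats a fault the way the handler in this patch
prints it. The flag values and all names below are assumptions for the
example, not definitions taken from the uapi header:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed flag layout, for illustration only */
#define FAULT_F_READ     (1u << 0)
#define FAULT_F_WRITE    (1u << 1)
#define FAULT_F_EXEC     (1u << 2)
#define FAULT_F_ADDRESS  (1u << 8)	/* the address field is valid */

/*
 * Format a one-line fault description into out. The access type is only
 * printed when the fault carries a valid address.
 */
static int format_fault(char *out, size_t size, const char *reason,
			uint32_t endpoint, uint32_t flags, uint64_t address)
{
	if (flags & FAULT_F_ADDRESS)
		return snprintf(out, size,
				"%s fault from EP %u at %#llx [%s%s%s]",
				reason, (unsigned int)endpoint,
				(unsigned long long)address,
				flags & FAULT_F_READ ? "R" : "",
				flags & FAULT_F_WRITE ? "W" : "",
				flags & FAULT_F_EXEC ? "X" : "");

	return snprintf(out, size, "%s fault from EP %u",
			reason, (unsigned int)endpoint);
}
```

A read/write fault at 0x1000 from endpoint 5 with reason "page" would be
rendered as "page fault from EP 5 at 0x1000 [RW]".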

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/virtio-iommu.c  | 116 +++---
 include/uapi/linux/virtio_iommu.h |  19 +
 2 files changed, 126 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index c547ebd79c43..81c6b72e9c43 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -29,7 +29,8 @@
 #define MSI_IOVA_LENGTH        0x100000
 
 #define VIOMMU_REQUEST_VQ      0
-#define VIOMMU_NR_VQS          1
+#define VIOMMU_EVENT_VQ        1
+#define VIOMMU_NR_VQS          2
 
 struct viommu_dev {
struct iommu_device iommu;
@@ -41,6 +42,7 @@ struct viommu_dev {
struct virtqueue *vqs[VIOMMU_NR_VQS];
spinlock_t  request_lock;
struct list_head requests;
+   void *evts;
 
/* Device configuration */
struct iommu_domain_geometry geometry;
@@ -82,6 +84,15 @@ struct viommu_request {
char buf[];
 };
 
+#define VIOMMU_FAULT_RESV_MASK 0xffffff00
+
+struct viommu_event {
+   union {
+   u32 head;
+   struct virtio_iommu_fault fault;
+   };
+};
+
 #define to_viommu_domain(domain)   \
container_of(domain, struct viommu_domain, domain)
 
@@ -504,6 +515,69 @@ static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
return ret;
 }
 
+static int viommu_fault_handler(struct viommu_dev *viommu,
+   struct virtio_iommu_fault *fault)
+{
+   char *reason_str;
+
+   u8 reason   = fault->reason;
+   u32 flags   = le32_to_cpu(fault->flags);
+   u32 endpoint= le32_to_cpu(fault->endpoint);
+   u64 address = le64_to_cpu(fault->address);
+
+   switch (reason) {
+   case VIRTIO_IOMMU_FAULT_R_DOMAIN:
+   reason_str = "domain";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_MAPPING:
+   reason_str = "page";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_UNKNOWN:
+   default:
+   reason_str = "unknown";
+   break;
+   }
+
+   /* TODO: find EP by ID and report_iommu_fault */
+   if (flags & VIRTIO_IOMMU_FAULT_F_ADDRESS)
+   dev_err_ratelimited(viommu->dev, "%s fault from EP %u at %#llx [%s%s%s]\n",
+   reason_str, endpoint, address,
+   flags & VIRTIO_IOMMU_FAULT_F_READ ? "R" : "",
+   flags & VIRTIO_IOMMU_FAULT_F_WRITE ? "W" : "",
+   flags & VIRTIO_IOMMU_FAULT_F_EXEC ? "X" : "");
+   else
+   dev_err_ratelimited(viommu->dev, "%s fault from EP %u\n",
+   reason_str, endpoint);
+   return 0;
+}
+
+static void viommu_event_handler(struct virtqueue *vq)
+{
+   int ret;
+   unsigned int len;
+   struct scatterlist sg[1];
+   struct viommu_event *evt;
+   struct viommu_dev *viommu = vq->vdev->priv;
+
+   while ((evt = virtqueue_get_buf(vq, &len)) != NULL) {
+   if (len > sizeof(*evt)) {
+   dev_err(viommu->dev,
+   "invalid event buffer (len %u != %zu)\n",
+   len, sizeof(*evt));
+   } else if (!(evt->head & VIOMMU_FAULT_RESV_MASK)) {
+   viommu_fault_handler(viommu, &evt->fault);
+   }
+
+   sg_init_one(sg, evt, sizeof(*evt));
+   ret = virtqueue_add_inbuf(vq, sg, 1, evt, GFP_ATOMIC);
+   if (ret)
+   dev_err(viommu->dev, "could not add event buffer\n");
+   }
+
+   if (!virtqueue_kick(vq))
+   dev_err(viommu->dev, "kick failed\n");
+}
+
 /* IOMMU API */
 
 static struct iommu_domain *viommu_domain_alloc(unsigned type)
@@ -887,16 +961,35 @@ static struct iommu_ops viommu_ops = {
 static int viommu_init_vqs(struct viommu_dev *viommu)
 {
struct virtio_device *vdev = dev_to_virtio(viommu->dev);
-   const char *name = "request";
-   void *ret;
+   const char *names[] = { "request", "event" };
+   vq_callback_t *callbacks[] = {
+   NULL, /* No async requests */
+   viommu_event_handler,
+   };
 
-   ret = virtio_find_single_vq(vdev, NULL, name);
-   if (IS_ERR(ret)) {
-   dev_err(viommu->dev, "cannot find VQ\n");
-   return PTR_ERR(ret);
-   }
+   return virtio_find_vqs(vdev, VIOMMU_NR_VQS, viommu->vqs, callbacks,
+

[PATCH v4 5/7] iommu: Add virtio-iommu driver

2018-11-15 Thread Jean-Philippe Brucker

The virtio IOMMU is a para-virtualized device that allows IOMMU requests
such as map/unmap to be sent over the virtio transport without emulating
page tables. This implementation handles ATTACH, DETACH, MAP and UNMAP
requests.

The bulk of the code transforms calls coming from the IOMMU API into
corresponding virtio requests. Mappings are kept in an interval tree
instead of page tables.
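
To make the "interval tree instead of page tables" point concrete, here is a
minimal userspace sketch of translating an IOVA from a set of recorded
ranges. It uses a flat array and a linear scan as a stand-in for the
interval tree, and the struct and function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened stand-in for one interval-tree node */
struct mapping {
	uint64_t iova_start;	/* first IOVA covered, inclusive */
	uint64_t iova_last;	/* last IOVA covered, inclusive */
	uint64_t paddr;		/* physical address backing iova_start */
};

/*
 * Translate an IOVA by searching the recorded mappings. An interval tree
 * answers the same query in O(log n); the linear scan keeps this short.
 * Returns 0 and stores the result in *paddr, or -1 if the IOVA is unmapped.
 */
static int iova_to_phys(const struct mapping *maps, size_t n,
			uint64_t iova, uint64_t *paddr)
{
	for (size_t i = 0; i < n; i++) {
		if (iova >= maps[i].iova_start && iova <= maps[i].iova_last) {
			*paddr = maps[i].paddr + (iova - maps[i].iova_start);
			return 0;
		}
	}

	return -1;
}
```

The guest only needs range lookups like this one, because the actual
translation tables live on the host side of the virtio transport.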

Signed-off-by: Jean-Philippe Brucker 
---
 MAINTAINERS   |   7 +
 drivers/iommu/Kconfig |  11 +
 drivers/iommu/Makefile|   1 +
 drivers/iommu/virtio-iommu.c  | 918 ++
 include/uapi/linux/virtio_ids.h   |   1 +
 include/uapi/linux/virtio_iommu.h | 104 
 6 files changed, 1042 insertions(+)
 create mode 100644 drivers/iommu/virtio-iommu.c
 create mode 100644 include/uapi/linux/virtio_iommu.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 0abecc528dac..0c7bdce57719 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15948,6 +15948,13 @@ S: Maintained
 F: drivers/virtio/virtio_input.c
 F: include/uapi/linux/virtio_input.h
 
+VIRTIO IOMMU DRIVER
+M: Jean-Philippe Brucker 
+L: virtualizat...@lists.linux-foundation.org
+S: Maintained
+F: drivers/iommu/virtio-iommu.c
+F: include/uapi/linux/virtio_iommu.h
+
 VIRTUAL BOX GUEST DEVICE DRIVER
 M: Hans de Goede 
 M: Arnd Bergmann 
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d9a25715650e..efdeaaeee0e0 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -435,4 +435,15 @@ config QCOM_IOMMU
help
  Support for IOMMU on certain Qualcomm SoCs.
 
+config VIRTIO_IOMMU
+   bool "Virtio IOMMU driver"
+   depends on VIRTIO=y
+   select IOMMU_API
+   select INTERVAL_TREE
+   select ARM_DMA_USE_IOMMU if ARM
+   help
+ Para-virtualised IOMMU driver with virtio.
+
+ Say Y here if you intend to run this kernel as a guest.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index a158a68c8ea8..48d831a39281 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -32,3 +32,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
+obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
new file mode 100644
index ..2a9cb6285a1e
--- /dev/null
+++ b/drivers/iommu/virtio-iommu.c
@@ -0,0 +1,918 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Virtio driver for the paravirtualized IOMMU
+ *
+ * Copyright (C) 2018 Arm Limited
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define MSI_IOVA_BASE          0x8000000
+#define MSI_IOVA_LENGTH        0x100000
+
+#define VIOMMU_REQUEST_VQ  0
+#define VIOMMU_NR_VQS  1
+
+struct viommu_dev {
+   struct iommu_device             iommu;
+   struct device                   *dev;
+   struct virtio_device            *vdev;
+
+   struct ida                      domain_ids;
+
+   struct virtqueue                *vqs[VIOMMU_NR_VQS];
+   spinlock_t                      request_lock;
+   struct list_head                requests;
+
+   /* Device configuration */
+   struct iommu_domain_geometry    geometry;
+   u64                             pgsize_bitmap;
+   u8                              domain_bits;
+};
+
+struct viommu_mapping {
+   phys_addr_t                     paddr;
+   struct interval_tree_node       iova;
+   u32                             flags;
+};
+
+struct viommu_domain {
+   struct iommu_domain             domain;
+   struct viommu_dev               *viommu;
+   struct mutex                    mutex;
+   unsigned int                    id;
+
+   spinlock_t                      mappings_lock;
+   struct rb_root_cached           mappings;
+
+   unsigned long                   nr_endpoints;
+};
+
+struct viommu_endpoint {
+   struct viommu_dev               *viommu;
+   struct viommu_domain            *vdomain;
+};
+
+struct viommu_request {
+   struct list_head                list;
+   void                            *writeback;
+   unsigned int                    write_offset;
+   unsigned int                    len;
+   char                            buf[];
+};
+
+#define to_viommu_domain(domain)   \
+   container_of(domain, struct viommu_domain, domain)
+
+static int viommu_get_req_errno(void *buf, size_t len)
+{
+   struct virtio_iommu_req_tail *tail = buf + len - sizeof(*tail);
+
+   switch (tail->status) {
+   case VIRTIO_IOMMU_S_OK:
+   return

[PATCH v4 2/7] dt-bindings: virtio: Add virtio-pci-iommu node

2018-11-15 Thread Jean-Philippe Brucker

Some systems implement virtio-iommu as a PCI endpoint. The operating
system needs to discover the relationship between IOMMU and masters long
before the PCI endpoint gets probed. Add a PCI child node to describe the
virtio-iommu device.

The virtio-pci-iommu is conceptually split between a PCI programming
interface and a translation component on the parent bus. The latter
doesn't have a node in the device tree. The virtio-pci-iommu node
describes both, by linking the PCI endpoint to the "iommus" property of DMA
master nodes and to the "iommu-map" properties of bus nodes.
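
As a worked example of the two encodings the binding relies on, the sketch
below computes the phys.hi cell of "reg" and the requester ID used in
"iommu-map" from a bus/device/function triple. The bit positions follow the
usual PCI conventions and the function names are made up for the example:

```c
#include <stdint.h>

/*
 * phys.hi cell of a PCI "reg" property for a configuration-space address:
 * bus in bits 23:16, device in bits 15:11, function in bits 10:8.
 */
static uint32_t pci_reg_phys_hi(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return ((uint32_t)bus << 16) |
	       ((uint32_t)(dev & 0x1f) << 11) |
	       ((uint32_t)(fn & 0x7) << 8);
}

/*
 * Requester ID as matched by "iommu-map": bus in bits 15:8, device in
 * bits 7:3, function in bits 2:0.
 */
static uint32_t pci_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return ((uint32_t)bus << 8) |
	       ((uint32_t)(dev & 0x1f) << 3) |
	       (uint32_t)(fn & 0x7);
}
```

For the IOMMU at 00:01.0 this gives a phys.hi of 0x800 and a requester ID of
0x8, which is the ID the example iommu-map in this patch leaves out.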

Reviewed-by: Rob Herring 
Reviewed-by: Eric Auger 
Signed-off-by: Jean-Philippe Brucker 
---
 .../devicetree/bindings/virtio/iommu.txt  | 66 +++
 1 file changed, 66 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt

diff --git a/Documentation/devicetree/bindings/virtio/iommu.txt b/Documentation/devicetree/bindings/virtio/iommu.txt
new file mode 100644
index ..2407fea0651c
--- /dev/null
+++ b/Documentation/devicetree/bindings/virtio/iommu.txt
@@ -0,0 +1,66 @@
+* virtio IOMMU PCI device
+
+When virtio-iommu uses the PCI transport, its programming interface is
+discovered dynamically by the PCI probing infrastructure. However the
+device tree statically describes the relation between IOMMU and DMA
+masters. Therefore, the PCI root complex that hosts the virtio-iommu
+contains a child node representing the IOMMU device explicitly.
+
+Required properties:
+
+- compatible:  Should be "virtio,pci-iommu"
+- reg: PCI address of the IOMMU. As defined in the PCI Bus
+   Binding reference [1], the reg property is a five-cell
+   address encoded as (phys.hi phys.mid phys.lo size.hi
+   size.lo). phys.hi should contain the device's BDF as
+   0b00000000 bbbbbbbb dddddfff 00000000. The other cells
+   should be zero.
+- #iommu-cells:   Each platform DMA master managed by the IOMMU is assigned
+   an endpoint ID, described by the "iommus" property [2].
+   For virtio-iommu, #iommu-cells must be 1.
+
+Notes:
+
+- DMA from the IOMMU device isn't managed by another IOMMU. Therefore the
+  virtio-iommu node doesn't have an "iommus" property, and is omitted from
+  the iommu-map property of the root complex.
+
+Example:
+
+pcie@10000000 {
+   compatible = "pci-host-ecam-generic";
+   ...
+
+   /* The IOMMU programming interface uses slot 00:01.0 */
+   iommu0: iommu@0008 {
+   compatible = "virtio,pci-iommu";
+   reg = <0x00000800 0 0 0 0>;
+   #iommu-cells = <1>;
+   };
+
+   /*
+* The IOMMU manages all functions in this PCI domain except
+* itself. Omit BDF 00:01.0.
+*/
+   iommu-map = <0x0 &iommu0 0x0 0x8>
+   <0x9 &iommu0 0x9 0xfff7>;
+};
+
+pcie@20000000 {
+   compatible = "pci-host-ecam-generic";
+   ...
+   /*
+* The IOMMU also manages all functions from this domain,
+* with endpoint IDs 0x10000 - 0x1ffff
+*/
+   iommu-map = <0x0 &iommu0 0x10000 0x10000>;
+};
+
+ethernet@fe001000 {
+   ...
+   /* The IOMMU manages this platform device with endpoint ID 0x20000 */
+   iommus = <&iommu0 0x20000>;
+};
+
+[1] Documentation/devicetree/bindings/pci/pci.txt
+[2] Documentation/devicetree/bindings/iommu/iommu.txt
-- 
2.19.1


[PATCH v4 0/7] Add virtio-iommu driver

2018-11-15 Thread Jean-Philippe Brucker

Implement the virtio-iommu driver, following specification v0.8 [1].

Changes since v3 [2]:
* Rebase onto v4.20-rc2. Patch 3 now touches drivers/of/base.c instead
  of drivers/pci/of.c, since the map_rid() function has moved.
* Removed the request timeout, which depended on DEBUG.
* Other small fixes addressing comments on v3.

You can find the Linux driver and the kvmtool device on my
virtio-iommu/v0.8.1 branches [3]. You can also test it with the latest
version of Eric's QEMU device [4].

[1] Virtio-iommu specification v0.8, sources and pdf
git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.8
http://jpbrucker.net/virtio-iommu/spec/v0.8/virtio-iommu-v0.8.pdf

[2] [PATCH v3 0/7] Add virtio-iommu driver
https://www.spinics.net/lists/linux-pci/msg77110.html

[3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.8.1
git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.8.1

[4] [RFC v8 00/18] VIRTIO-IOMMU device
https://www.mail-archive.com/qemu-devel@nongnu.org/msg572637.html

Jean-Philippe Brucker (7):
  dt-bindings: virtio-mmio: Add IOMMU description
  dt-bindings: virtio: Add virtio-pci-iommu node
  of: Allow the iommu-map property to omit untranslated devices
  PCI: OF: Initialize dev->fwnode appropriately
  iommu: Add virtio-iommu driver
  iommu/virtio: Add probe request
  iommu/virtio: Add event queue

 .../devicetree/bindings/virtio/iommu.txt  |   66 +
 .../devicetree/bindings/virtio/mmio.txt   |   30 +
 MAINTAINERS   |7 +
 drivers/iommu/Kconfig |   11 +
 drivers/iommu/Makefile|1 +
 drivers/iommu/virtio-iommu.c  | 1160 +
 drivers/of/base.c |   10 +-
 drivers/pci/of.c  |7 +
 include/uapi/linux/virtio_ids.h   |1 +
 include/uapi/linux/virtio_iommu.h |  161 +++
 10 files changed, 1451 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
 create mode 100644 drivers/iommu/virtio-iommu.c
 create mode 100644 include/uapi/linux/virtio_iommu.h

-- 
2.19.1


[PATCH v4 1/7] dt-bindings: virtio-mmio: Add IOMMU description

2018-11-15 Thread Jean-Philippe Brucker

The nature of a virtio-mmio node is discovered by the virtio driver at
probe time. However the DMA relation between devices must be described
statically. When a virtio-mmio node is a virtio-iommu device, it needs an
"#iommu-cells" property as specified by bindings/iommu/iommu.txt.

Otherwise, the virtio-mmio device may perform DMA through an IOMMU, which
requires an "iommus" property. Describe these requirements in the
device-tree bindings documentation.

Reviewed-by: Rob Herring 
Reviewed-by: Eric Auger 
Signed-off-by: Jean-Philippe Brucker 
---
 .../devicetree/bindings/virtio/mmio.txt   | 30 +++
 1 file changed, 30 insertions(+)

diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
index 5069c1b8e193..21af30fbb81f 100644
--- a/Documentation/devicetree/bindings/virtio/mmio.txt
+++ b/Documentation/devicetree/bindings/virtio/mmio.txt
@@ -8,10 +8,40 @@ Required properties:
 - reg: control registers base address and size including configuration space
 - interrupts:  interrupt generated by the device
 
+Required properties for virtio-iommu:
+
+- #iommu-cells:   When the node corresponds to a virtio-iommu device, it is
+   linked to DMA masters using the "iommus" or "iommu-map"
+   properties [1][2]. #iommu-cells specifies the size of the
+   "iommus" property. For virtio-iommu #iommu-cells must be
+   1, each cell describing a single endpoint ID.
+
+Optional properties:
+
+- iommus:  If the device accesses memory through an IOMMU, it should
+   have an "iommus" property [1]. Since virtio-iommu itself
+   does not access memory through an IOMMU, the "virtio,mmio"
+   node cannot have both an "#iommu-cells" and an "iommus"
+   property.
+
 Example:
 
virtio_block@3000 {
compatible = "virtio,mmio";
reg = <0x3000 0x100>;
interrupts = <41>;
+
+   /* Device has endpoint ID 23 */
+   iommus = <&viommu 23>
}
+
+   viommu: iommu@3100 {
+   compatible = "virtio,mmio";
+   reg = <0x3100 0x100>;
+   interrupts = <42>;
+
+   #iommu-cells = <1>
+   }
+
+[1] Documentation/devicetree/bindings/iommu/iommu.txt
+[2] Documentation/devicetree/bindings/pci/pci-iommu.txt
-- 
2.19.1


Re: [PATCH v3 6/7] iommu/virtio: Add probe request

2018-11-15 Thread Jean-Philippe Brucker

Fixed all of these, thanks

Jean
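
The review quoted below asks for the result of iommu_alloc_resv_region() to be tested before it is added to the list. As a rough illustration of that pattern only (not the change that was actually made for the next version), here is a userspace sketch. Every sketch_* name is hypothetical, and the simulate_failure parameter exists only so the failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum sketch_resv_type { SKETCH_RESV_RESERVED, SKETCH_RESV_MSI };

struct sketch_region {
	uint64_t start;
	uint64_t length;
	enum sketch_resv_type type;
	struct sketch_region *next;
};

/* Stand-in for an allocator such as iommu_alloc_resv_region(): may return NULL. */
static struct sketch_region *sketch_alloc_region(uint64_t start, uint64_t length,
						 enum sketch_resv_type type,
						 int simulate_failure)
{
	struct sketch_region *region;

	if (simulate_failure)
		return NULL;
	region = calloc(1, sizeof(*region));
	if (!region)
		return NULL;
	region->start = start;
	region->length = length;
	region->type = type;
	return region;
}

/* Add the inclusive range [start, end] to the list, checking the allocation. */
static int sketch_add_resv_mem(struct sketch_region **head, uint64_t start,
			       uint64_t end, enum sketch_resv_type type,
			       int simulate_failure)
{
	struct sketch_region *region;

	assert(head != NULL);
	if (end < start)
		return -EINVAL;

	region = sketch_alloc_region(start, end - start + 1, type,
				     simulate_failure);
	if (!region)
		return -ENOMEM;	/* the test the review asks for */

	/* Only a valid region is ever linked into the list. */
	region->next = *head;
	*head = region;
	return 0;
}
```

On allocation failure the function returns before touching the list, so a NULL pointer is never dereferenced while linking the region.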

On 15/11/2018 13:20, Auger Eric wrote:
> Hi Jean,
> On 10/12/18 4:59 PM, Jean-Philippe Brucker wrote:
>> When the device offers the probe feature, send a probe request for each
>> device managed by the IOMMU. Extract RESV_MEM information. When we
>> encounter a MSI doorbell region, set it up as a IOMMU_RESV_MSI region.
>> This will tell other subsystems that there is no need to map the MSI
>> doorbell in the virtio-iommu, because MSIs bypass it.
>>
>> Signed-off-by: Jean-Philippe Brucker 
>> ---
>>  drivers/iommu/virtio-iommu.c  | 147 --
>>  include/uapi/linux/virtio_iommu.h |  39 
>>  2 files changed, 180 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
>> index 9fb38cd3b727..8eaf66770469 100644
>> --- a/drivers/iommu/virtio-iommu.c
>> +++ b/drivers/iommu/virtio-iommu.c
>> @@ -56,6 +56,7 @@ struct viommu_dev {
>>  struct iommu_domain_geometry geometry;
>>  u64 pgsize_bitmap;
>>  u8  domain_bits;
>> +u32 probe_size;
>>  };
>>  
>>  struct viommu_mapping {
>> @@ -77,8 +78,10 @@ struct viommu_domain {
>>  };
>>  
>>  struct viommu_endpoint {
>> +struct device   *dev;
>>  struct viommu_dev   *viommu;
>>  struct viommu_domain*vdomain;
>> +struct list_head resv_regions;
>>  };
>>  
>>  struct viommu_request {
>> @@ -129,6 +132,9 @@ static off_t viommu_get_req_offset(struct viommu_dev 
>> *viommu,
>>  {
>>  size_t tail_size = sizeof(struct virtio_iommu_req_tail);
>>  
>> +if (req->type == VIRTIO_IOMMU_T_PROBE)
>> +return len - viommu->probe_size - tail_size;
>> +
>>  return len - tail_size;
>>  }
>>  
>> @@ -414,6 +420,101 @@ static int viommu_replay_mappings(struct viommu_domain 
>> *vdomain)
>>  return ret;
>>  }
>>  
>> +static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
>> +   struct virtio_iommu_probe_resv_mem *mem,
>> +   size_t len)
>> +{
>> +struct iommu_resv_region *region = NULL;
>> +unsigned long prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>> +
> nit: extra void line
>> +u64 start = le64_to_cpu(mem->start);
>> +u64 end = le64_to_cpu(mem->end);
>> +size_t size = end - start + 1;
>> +
>> +if (len < sizeof(*mem))
>> +return -EINVAL;
>> +
>> +switch (mem->subtype) {
>> +default:
>> +dev_warn(vdev->dev, "unknown resv mem subtype 0x%x\n",
>> + mem->subtype);
>> +/* Fall-through */
>> +case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
>> +region = iommu_alloc_resv_region(start, size, 0,
>> + IOMMU_RESV_RESERVED);
> need to test region
>> +break;
>> +case VIRTIO_IOMMU_RESV_MEM_T_MSI:
>> +region = iommu_alloc_resv_region(start, size, prot,
>> + IOMMU_RESV_MSI);
> same
>> +break;
>> +}
>> +
>> +list_add(&vdev->resv_regions, &region->list);
>> +return 0;
>> +}
>> +
>> +static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device 
>> *dev)
>> +{
>> +int ret;
>> +u16 type, len;
>> +size_t cur = 0;
>> +size_t probe_len;
>> +struct virtio_iommu_req_probe *probe;
>> +struct virtio_iommu_probe_property *prop;
>> +struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>> +struct viommu_endpoint *vdev = fwspec->iommu_priv;
>> +
>> +if (!fwspec->num_ids)
>> +return -EINVAL;
>> +
>> +probe_len = sizeof(*probe) + viommu->probe_size +
>> +sizeof(struct virtio_iommu_req_tail);
>> +probe = kzalloc(probe_len, GFP_KERNEL);
>> +if (!probe)
>> +return -ENOMEM;
>> +
>> +probe->head.type = VIRTIO_IOMMU_T_PROBE;
>> +/*
>> + * For now, assume that properties of an endpoint that outputs multiple
>> + * IDs are consistent. Only probe the first one.
>> + */
>> +probe->endpoint = cpu_to_le32(fwspec->ids[0]);
>> +
>> +ret = viommu_send_req_sync(viommu, probe, probe_len);
>> +if (ret)
>> +goto out_free;
>> +
>> +prop = (void *)probe->properties;
>> +type = le16_to_cpu(prop->type) & VIRTIO_IOMMU_PROBE_T_MASK;
>> +
>> +while (type != VIRTIO_IOMMU_PROBE_T_NONE &&
>> +   cur < viommu->probe_size) {
>> +len = le16_to_cpu(prop->length) + sizeof(*prop);
>> +
>> +switch (type) {
>> +case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
>> +ret = viommu_add_resv_mem(vdev, (void *)prop, len);
>> +break;
>> +default:
>> +dev_err(dev, "unknown viommu prop 0x%x\n", type);
>> +}
>> +
>> +if (ret)
>> +dev_err(dev, "failed to parse viommu prop 0x%x\n", 
>>

Re: [PATCH 7/7] vfio/type1: Remove map_try_harder() code path

2018-11-15 Thread Joerg Roedel

Hi Alex,

On Fri, Nov 09, 2018 at 09:23:29AM -0700, Alex Williamson wrote:
> Cool, glad to see this finally fixed.  My "should be fixed soon"
> comment turned out to be a little optimistic with the fix finally
> coming 5 years later.  We could of course keep this code as it really
> doesn't harm anything, but I'm in favor trying to remove it if we think
> it's dead now.

Yeah, it took a while to fix that :) And we can easily revert this patch
if it turns out someone else is also relying on the workaround.

> In order to expedite into one pull:
> 
> Acked-by: Alex Williamson 

Thanks a lot. I've queued these patches into my tree now.


Regards,

Joerg

[PATCH 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range

2018-11-15 Thread Souptick Joarder

Convert to use vm_insert_range() to map a range of kernel
memory to a user vma.

Signed-off-by: Souptick Joarder 
Reviewed-by: Matthew Wilcox 
---
 drivers/iommu/dma-iommu.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d1b0475..69c66b1 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -622,17 +622,9 @@ struct page **iommu_dma_alloc(struct device *dev, size_t 
size, gfp_t gfp,
 
 int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct 
*vma)
 {
-   unsigned long uaddr = vma->vm_start;
-   unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   int ret = -ENXIO;
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 
-   for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
-   ret = vm_insert_page(vma, uaddr, pages[i]);
-   if (ret)
-   break;
-   uaddr += PAGE_SIZE;
-   }
-   return ret;
+   return vm_insert_range(vma, vma->vm_start, pages, count);
 }
 
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
-- 
1.9.1
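
For readers comparing the two versions of iommu_dma_mmap() in the diff above, this userspace sketch reproduces the page-count arithmetic and counts how many pages the removed loop would have inserted. It assumes 4 KiB pages. All sketch_* names are hypothetical; the real PAGE_ALIGN() and PAGE_SHIFT come from kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel's values are architecture dependent. */
#define SKETCH_PAGE_SHIFT 12UL
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_ALIGN(x) \
	(((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Number of pages needed to cover size bytes, as in the patch. */
static unsigned long sketch_page_count(size_t size)
{
	/* Guard against overflow when rounding up. */
	assert(size <= (size_t)-1 - (SKETCH_PAGE_SIZE - 1));
	return SKETCH_PAGE_ALIGN((unsigned long)size) >> SKETCH_PAGE_SHIFT;
}

/*
 * Pages the removed loop would insert: it starts at index pgoff and stops
 * at count or when the vma (vma_pages long) is full, whichever comes first.
 * Insertion errors are ignored in this model.
 */
static unsigned long sketch_old_loop_pages(unsigned long count,
					   unsigned long pgoff,
					   unsigned long vma_pages)
{
	unsigned long i, mapped = 0;

	for (i = pgoff; i < count && mapped < vma_pages; i++)
		mapped++;
	return mapped;
}
```

With count = 4, vm_pgoff = 1 and an 8-page vma, the removed loop inserts 3 pages, whereas the new call passes count (4) and vma->vm_start straight to vm_insert_range(). This message does not discuss whether that difference matters to callers.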


[PATCH 1/9] mm: Introduce new vm_insert_range API

2018-11-15 Thread Souptick Joarder

Previously, drivers had their own way of mapping a range of
kernel pages/memory into a user vma, done by invoking
vm_insert_page() within a loop.

As this pattern is common across different drivers, it can
be generalized by creating a new function and using it across
the drivers.

vm_insert_range() is the new API which will be used to map a
range of kernel memory/pages to a user vma.

Signed-off-by: Souptick Joarder 
Reviewed-by: Matthew Wilcox 
---
 include/linux/mm_types.h |  3 +++
 mm/memory.c  | 28 
 mm/nommu.c   |  7 +++
 3 files changed, 38 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5ed8f62..15ae24f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -523,6 +523,9 @@ extern void tlb_gather_mmu(struct mmu_gather *tlb, struct 
mm_struct *mm,
 extern void tlb_finish_mmu(struct mmu_gather *tlb,
unsigned long start, unsigned long end);
 
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+   struct page **pages, unsigned long page_count);
+
 static inline void init_tlb_flush_pending(struct mm_struct *mm)
 {
        atomic_set(&mm->tlb_flush_pending, 0);
diff --git a/mm/memory.c b/mm/memory.c
index 15c417e..da904ed 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1478,6 +1478,34 @@ static int insert_page(struct vm_area_struct *vma, 
unsigned long addr,
 }
 
 /**
+ * vm_insert_range - insert range of kernel pages into user vma
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @pages: pointer to array of source kernel pages
+ * @page_count: no. of pages need to insert into user vma
+ *
+ * This allows drivers to insert range of kernel pages they've allocated
+ * into a user vma. This is a generic function which drivers can use
+ * rather than using their own way of mapping range of kernel pages into
+ * user vma.
+ */
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+   struct page **pages, unsigned long page_count)
+{
+   unsigned long uaddr = addr;
+   int ret = 0, i;
+
+   for (i = 0; i < page_count; i++) {
+   ret = vm_insert_page(vma, uaddr, pages[i]);
+   if (ret < 0)
+   return ret;
+   uaddr += PAGE_SIZE;
+   }
+
+   return ret;
+}
+
+/**
  * vm_insert_page - insert single page into user vma
  * @vma: user vma to map to
  * @addr: target user address of this page
diff --git a/mm/nommu.c b/mm/nommu.c
index 749276b..d6ef5c7 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned 
long addr,
 }
 EXPORT_SYMBOL(vm_insert_page);
 
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+   struct page **pages, unsigned long page_count)
+{
+   return -EINVAL;
+}
+EXPORT_SYMBOL(vm_insert_range);
+
 /*
  *  sys_brk() for the most part doesn't need the global kernel
  *  lock, except when an application is doing something nasty
-- 
1.9.1
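
The behaviour of the loop in vm_insert_range() can be tried out in userspace with a mock in place of vm_insert_page(). The sketch below is only a model. The sketch_* names, the fail_at knob and the 4 KiB page size are assumptions, and the index is unsigned here, whereas the patch declares it as int.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for a vma; it only counts inserted pages. */
struct sketch_vma {
	unsigned long start, end;	/* user address range [start, end) */
	unsigned long mapped;		/* pages inserted so far */
	long fail_at;			/* insert number that fails, or -1 */
};

/* Mock of vm_insert_page(): rejects addresses outside the vma. */
static int sketch_insert_page(struct sketch_vma *vma, unsigned long addr,
			      const void *page)
{
	if (addr < vma->start || addr >= vma->end)
		return -EFAULT;
	if (page == NULL)
		return -EINVAL;
	if (vma->fail_at >= 0 && (unsigned long)vma->fail_at == vma->mapped)
		return -ENOMEM;
	vma->mapped++;
	return 0;
}

/* Same control flow as the patch: stop at the first error, no unwinding. */
static int sketch_insert_range(struct sketch_vma *vma, unsigned long addr,
			       void **pages, unsigned long page_count)
{
	unsigned long uaddr = addr;
	unsigned long i;
	int ret = 0;

	assert(vma != NULL && pages != NULL);
	for (i = 0; i < page_count; i++) {
		ret = sketch_insert_page(vma, uaddr, pages[i]);
		if (ret < 0)
			return ret;
		uaddr += SKETCH_PAGE_SIZE;
	}
	return ret;
}
```

In this model, a failure part-way through leaves the earlier pages inserted. That matches the structure of the loop in the patch, which returns on the first error without undoing previous insertions.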


[PATCH 0/9] Use vm_insert_range

2018-11-15 Thread Souptick Joarder

Previously, drivers had their own way of mapping a range of
kernel pages/memory into a user vma, done by invoking
vm_insert_page() within a loop.

As this pattern is common across different drivers, it can
be generalized by creating a new function and using it across
the drivers.

vm_insert_range() is the new API which will be used to map a
range of kernel memory/pages to a user vma.

All the applicable places are converted to use the new
vm_insert_range() in this patch series.

Souptick Joarder (9):
  mm: Introduce new vm_insert_range API
  arch/arm/mm/dma-mapping.c: Convert to use vm_insert_range
  drivers/firewire/core-iso.c: Convert to use vm_insert_range
  drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range
  drm/xen/xen_drm_front_gem.c: Convert to use vm_insert_range
  iommu/dma-iommu.c: Convert to use vm_insert_range
  videobuf2/videobuf2-dma-sg.c: Convert to use vm_insert_range
  xen/gntdev.c: Convert to use vm_insert_range
  xen/privcmd-buf.c: Convert to use vm_insert_range

 arch/arm/mm/dma-mapping.c | 21 ++---
 drivers/firewire/core-iso.c   | 15 ++--
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c   | 20 ++--
 drivers/gpu/drm/xen/xen_drm_front_gem.c   | 20 +---
 drivers/iommu/dma-iommu.c | 12 ++
 drivers/media/common/videobuf2/videobuf2-dma-sg.c | 23 ++-
 drivers/xen/gntdev.c  | 11 -
 drivers/xen/privcmd-buf.c |  8 ++-
 include/linux/mm_types.h  |  3 +++
 mm/memory.c   | 28 +++
 mm/nommu.c|  7 ++
 11 files changed, 70 insertions(+), 98 deletions(-)

-- 
1.9.1


Re: [PATCH v3 6/7] iommu/virtio: Add probe request

2018-11-15 Thread Auger Eric

Hi Jean,
On 10/12/18 4:59 PM, Jean-Philippe Brucker wrote:
> When the device offers the probe feature, send a probe request for each
> device managed by the IOMMU. Extract RESV_MEM information. When we
> encounter a MSI doorbell region, set it up as a IOMMU_RESV_MSI region.
> This will tell other subsystems that there is no need to map the MSI
> doorbell in the virtio-iommu, because MSIs bypass it.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/iommu/virtio-iommu.c  | 147 --
>  include/uapi/linux/virtio_iommu.h |  39 
>  2 files changed, 180 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 9fb38cd3b727..8eaf66770469 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -56,6 +56,7 @@ struct viommu_dev {
>   struct iommu_domain_geometry geometry;
>   u64 pgsize_bitmap;
>   u8  domain_bits;
> + u32 probe_size;
>  };
>  
>  struct viommu_mapping {
> @@ -77,8 +78,10 @@ struct viommu_domain {
>  };
>  
>  struct viommu_endpoint {
> + struct device   *dev;
>   struct viommu_dev   *viommu;
>   struct viommu_domain*vdomain;
> + struct list_head resv_regions;
>  };
>  
>  struct viommu_request {
> @@ -129,6 +132,9 @@ static off_t viommu_get_req_offset(struct viommu_dev 
> *viommu,
>  {
>   size_t tail_size = sizeof(struct virtio_iommu_req_tail);
>  
> + if (req->type == VIRTIO_IOMMU_T_PROBE)
> + return len - viommu->probe_size - tail_size;
> +
>   return len - tail_size;
>  }
>  
> @@ -414,6 +420,101 @@ static int viommu_replay_mappings(struct viommu_domain 
> *vdomain)
>   return ret;
>  }
>  
> +static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
> +struct virtio_iommu_probe_resv_mem *mem,
> +size_t len)
> +{
> + struct iommu_resv_region *region = NULL;
> + unsigned long prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> +
nit: extra void line
> + u64 start = le64_to_cpu(mem->start);
> + u64 end = le64_to_cpu(mem->end);
> + size_t size = end - start + 1;
> +
> + if (len < sizeof(*mem))
> + return -EINVAL;
> +
> + switch (mem->subtype) {
> + default:
> + dev_warn(vdev->dev, "unknown resv mem subtype 0x%x\n",
> +  mem->subtype);
> + /* Fall-through */
> + case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
> + region = iommu_alloc_resv_region(start, size, 0,
> +  IOMMU_RESV_RESERVED);
need to test region
> + break;
> + case VIRTIO_IOMMU_RESV_MEM_T_MSI:
> + region = iommu_alloc_resv_region(start, size, prot,
> +  IOMMU_RESV_MSI);
same
> + break;
> + }
> +
> + list_add(&vdev->resv_regions, &region->list);
> + return 0;
> +}
> +
> +static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device 
> *dev)
> +{
> + int ret;
> + u16 type, len;
> + size_t cur = 0;
> + size_t probe_len;
> + struct virtio_iommu_req_probe *probe;
> + struct virtio_iommu_probe_property *prop;
> + struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> + struct viommu_endpoint *vdev = fwspec->iommu_priv;
> +
> + if (!fwspec->num_ids)
> + return -EINVAL;
> +
> + probe_len = sizeof(*probe) + viommu->probe_size +
> + sizeof(struct virtio_iommu_req_tail);
> + probe = kzalloc(probe_len, GFP_KERNEL);
> + if (!probe)
> + return -ENOMEM;
> +
> + probe->head.type = VIRTIO_IOMMU_T_PROBE;
> + /*
> +  * For now, assume that properties of an endpoint that outputs multiple
> +  * IDs are consistent. Only probe the first one.
> +  */
> + probe->endpoint = cpu_to_le32(fwspec->ids[0]);
> +
> + ret = viommu_send_req_sync(viommu, probe, probe_len);
> + if (ret)
> + goto out_free;
> +
> + prop = (void *)probe->properties;
> + type = le16_to_cpu(prop->type) & VIRTIO_IOMMU_PROBE_T_MASK;
> +
> + while (type != VIRTIO_IOMMU_PROBE_T_NONE &&
> +cur < viommu->probe_size) {
> + len = le16_to_cpu(prop->length) + sizeof(*prop);
> +
> + switch (type) {
> + case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
> + ret = viommu_add_resv_mem(vdev, (void *)prop, len);
> + break;
> + default:
> + dev_err(dev, "unknown viommu prop 0x%x\n", type);
> + }
> +
> + if (ret)
> + dev_err(dev, "failed to parse viommu prop 0x%x\n", 
> type);
> +
> + cur += len;
> + if (cur >= viommu->probe_size)
> + break;
> +
> + prop =
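
The loop quoted above walks a buffer of properties, each a 16-bit little-endian type and length followed by length bytes of value. The self-contained sketch below shows such a walk with explicit bounds checks. The constants and sketch_* names are hypothetical, and the checks are stricter than the quoted loop; they are offered as an illustration, not as the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; see the virtio-iommu headers. */
#define SKETCH_PROP_T_NONE  0
#define SKETCH_PROP_T_MASK  0x0fff
#define SKETCH_PROP_HDR_LEN 4	/* two 16-bit fields: type, length */

/* Read a little-endian 16-bit value regardless of host endianness. */
static uint16_t sketch_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Count the properties in buf. Stops at a NONE property or when no further
 * header fits. Returns -1 if a property claims more bytes than remain.
 */
static int sketch_count_props(const uint8_t *buf, size_t size)
{
	size_t cur = 0;
	int count = 0;

	assert(buf != NULL);
	while (size - cur >= SKETCH_PROP_HDR_LEN) {
		uint16_t type = sketch_le16(buf + cur) & SKETCH_PROP_T_MASK;
		size_t len = (size_t)sketch_le16(buf + cur + 2) +
			     SKETCH_PROP_HDR_LEN;

		if (type == SKETCH_PROP_T_NONE)
			break;
		if (len > size - cur)
			return -1;	/* value would run past the buffer */

		count++;
		cur += len;	/* header plus value, as in the quoted loop */
	}
	return count;
}
```

Checking len against the remaining space before advancing keeps cur from moving past the end of the buffer, so a length field supplied by the device cannot cause a read beyond the buffer.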

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Mika Westerberg

On Thu, Nov 15, 2018 at 01:07:36PM +0100, Lukas Wunner wrote:
> On Thu, Nov 15, 2018 at 01:37:37PM +0200, Mika Westerberg wrote:
> > On Thu, Nov 15, 2018 at 11:13:56AM +0000, Lorenzo Pieralisi wrote:
> > > I have strong objections to the way these bindings have been forced upon
> > > everybody; if that's the way *generic* ACPI bindings are specified I
> > > wonder why there still exists an ACPI specification and related working
> > > group.
> > > 
> > > I personally (but that's Bjorn and Rafael choice) think that this is
> > > not a change that belongs in PCI core, ACPI bindings are ill-defined
> > > and device tree bindings are non-existing.
> > 
> > Any idea where should I put it then? These systems are already out there
> > and we need to support them one way or another.
> 
> I suppose those are all Thunderbolt, so could be handled by the
> existing ->is_thunderbolt bit?
> 
> It was said in this thread that ->is_external is more generic in
> that it could also be used on PCIe slots, however that use case
> doesn't appear to lend itself to the "plug in while laptop owner
> is getting coffee" attack.  To access PCIe slots on a server you
> normally need access to a data center.  On a desktop, you usually
> have to open the case, by which time the coffee may already have
> been fetched.  So frankly the binding seems a bit over-engineered
> to me and yet another thing that BIOS writers may get wrong.

I would not say it should include PCIe slots, but there are other cables
that carry PCIe and I was thinking we could make it support those as
well.

I have no problem using is_thunderbolt here, though, if we don't want to
support non-Thunderbolt external devices this way.

However, the question here is more about where I should put the _DSD
parsing code if it is not suitable to be placed inside PCI/ACPI core as
I've done in this patch? ;-)

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Lukas Wunner

On Thu, Nov 15, 2018 at 01:37:37PM +0200, Mika Westerberg wrote:
> On Thu, Nov 15, 2018 at 11:13:56AM +0000, Lorenzo Pieralisi wrote:
> > I have strong objections to the way these bindings have been forced upon
> > everybody; if that's the way *generic* ACPI bindings are specified I
> > wonder why there still exists an ACPI specification and related working
> > group.
> > 
> > I personally (but that's Bjorn and Rafael choice) think that this is
> > not a change that belongs in PCI core, ACPI bindings are ill-defined
> > and device tree bindings are non-existing.
> 
> Any idea where should I put it then? These systems are already out there
> and we need to support them one way or another.

I suppose those are all Thunderbolt, so could be handled by the
existing ->is_thunderbolt bit?

It was said in this thread that ->is_external is more generic in
that it could also be used on PCIe slots, however that use case
doesn't appear to lend itself to the "plug in while laptop owner
is getting coffee" attack.  To access PCIe slots on a server you
normally need access to a data center.  On a desktop, you usually
have to open the case, by which time the coffee may already have
been fetched.  So frankly the binding seems a bit over-engineered
to me and yet another thing that BIOS writers may get wrong.

Well, just my 2 cents anyway.

Lukas

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Mika Westerberg

On Thu, Nov 15, 2018 at 11:13:56AM +0000, Lorenzo Pieralisi wrote:
> I have strong objections to the way these bindings have been forced upon
> everybody; if that's the way *generic* ACPI bindings are specified I
> wonder why there still exists an ACPI specification and related working
> group.
> 
> I personally (but that's Bjorn and Rafael choice) think that this is
> not a change that belongs in PCI core, ACPI bindings are ill-defined
> and device tree bindings are non-existing.

Any idea where should I put it then? These systems are already out there
and we need to support them one way or another.

> At the very least Microsoft should be asked to publish and discuss
> these bindings within the ACPI and UEFI forums.

These bindings are public, see here:

https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports

However, they are not part of the ACPI spec as you say.

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Lorenzo Pieralisi

On Thu, Nov 15, 2018 at 12:22:39PM +0200, Mika Westerberg wrote:
> On Tue, Nov 13, 2018 at 11:45:36AM +0000, Lorenzo Pieralisi wrote:
> > On Tue, Nov 13, 2018 at 01:27:00PM +0200, Mika Westerberg wrote:
> > 
> > [...]
> > 
> > > > To be frank the concept (and Microsoft _DSD bindings) seems a bit vague
> > > > and not thoroughly defined and I would question its detection at
> > > > PCI/ACPI core level, I would hope this can be clarified at ACPI
> > > > specification level, at least.
> > > 
> > > I guess that is the way they envision to use _DSD. Instead of having
> > > single UUID that covers all properties (like what we have with device
> > > properties) they have one UUID per property "class". I certainly hope we
> > > don't need to keep extending prp_guids[] array each time they invent
> > > another "class" of properties.
> > 
> > It is even worse than that. This is a unilateral/obscure change that
> > won't be part of ACPI specifications (I guess it was easier to add a
> > UUID than add this to the ACPI specifications through the AWSG) but it
> > is still supposed to be applicable to ACPI PCI bindings on any
> > platforms/arches; this way of adding bindings does not work and it
> > has to be rectified.
> 
> I agree.
> 
> For the existing property "classes" such as the one here I don't think
> we can do anything. There are systems already with these included in
> their ACPI tables.
> 
> I wonder if you have any objections regarding this patch?

I have strong objections to the way these bindings have been forced upon
everybody; if that's the way *generic* ACPI bindings are specified I
wonder why there still exists an ACPI specification and related working
group.

I personally (but that's Bjorn and Rafael choice) think that this is
not a change that belongs in PCI core, ACPI bindings are ill-defined
and device tree bindings are non-existing.

At the very least Microsoft should be asked to publish and discuss
these bindings within the ACPI and UEFI forums.

Lorenzo

Re: [PATCH 1/4] PCI / ACPI: Identify external PCI devices

2018-11-15 Thread Mika Westerberg

On Tue, Nov 13, 2018 at 11:45:36AM +0000, Lorenzo Pieralisi wrote:
> On Tue, Nov 13, 2018 at 01:27:00PM +0200, Mika Westerberg wrote:
> 
> [...]
> 
> > > To be frank the concept (and Microsoft _DSD bindings) seems a bit vague
> > > and not thoroughly defined and I would question its detection at
> > > PCI/ACPI core level, I would hope this can be clarified at ACPI
> > > specification level, at least.
> > 
> > I guess that is the way they envision to use _DSD. Instead of having
> > single UUID that covers all properties (like what we have with device
> > properties) they have one UUID per property "class". I certainly hope we
> > don't need to keep extending prp_guids[] array each time they invent
> > another "class" of properties.
> 
> It is even worse than that. This is a unilateral/obscure change that
> won't be part of ACPI specifications (I guess it was easier to add a
> UUID than add this to the ACPI specifications through the AWSG) but it
> is still supposed to be applicable to ACPI PCI bindings on any
> platforms/arches; this way of adding bindings does not work and it
> has to be rectified.

I agree.

For the existing property "classes" such as the one here I don't think
we can do anything. There are systems already with these included in
their ACPI tables.

I wonder if you have any objections regarding this patch?

Re: [PATCH v3 2/7] dt-bindings: virtio: Add virtio-pci-iommu node

2018-11-15 Thread Auger Eric

Hi Jean,

On 10/12/18 4:59 PM, Jean-Philippe Brucker wrote:
> Some systems implement virtio-iommu as a PCI endpoint. The operating
> systems needs to discover the relationship between IOMMU and masters long
s/systems/system
> before the PCI endpoint gets probed. Add a PCI child node to describe the
> virtio-iommu device.
> 
> The virtio-pci-iommu is conceptually split between a PCI programming
> interface and a translation component on the parent bus. The latter
> doesn't have a node in the device tree. The virtio-pci-iommu node
> describes both, by linking the PCI endpoint to "iommus" property of DMA
> master nodes and to "iommu-map" properties of bus nodes.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  .../devicetree/bindings/virtio/iommu.txt  | 66 +++
>  1 file changed, 66 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
> 
> diff --git a/Documentation/devicetree/bindings/virtio/iommu.txt 
> b/Documentation/devicetree/bindings/virtio/iommu.txt
> new file mode 100644
> index 000000000000..2407fea0651c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/virtio/iommu.txt
> @@ -0,0 +1,66 @@
> +* virtio IOMMU PCI device
> +
> +When virtio-iommu uses the PCI transport, its programming interface is
> +discovered dynamically by the PCI probing infrastructure. However the
> +device tree statically describes the relation between IOMMU and DMA
> +masters. Therefore, the PCI root complex that hosts the virtio-iommu
> +contains a child node representing the IOMMU device explicitly.
> +
> +Required properties:
> +
> +- compatible:Should be "virtio,pci-iommu"
> +- reg:   PCI address of the IOMMU. As defined in the PCI Bus
> + Binding reference [1], the reg property is a five-cell
> + address encoded as (phys.hi phys.mid phys.lo size.hi
> + size.lo). phys.hi should contain the device's BDF as
> + 0b00000000 bbbbbbbb dddddfff 00000000. The other cells
> + should be zero.
> +- #iommu-cells:  Each platform DMA master managed by the IOMMU is 
> assigned
> + an endpoint ID, described by the "iommus" property [2].
> + For virtio-iommu, #iommu-cells must be 1.
> +
> +Notes:
> +
> +- DMA from the IOMMU device isn't managed by another IOMMU. Therefore the
> +  virtio-iommu node doesn't have an "iommus" property, and is omitted from
> +  the iommu-map property of the root complex.
> +
> +Example:
> +
> +pcie@10000000 {
> + compatible = "pci-host-ecam-generic";
> + ...
> +
> + /* The IOMMU programming interface uses slot 00:01.0 */
> + iommu0: iommu@0008 {
> + compatible = "virtio,pci-iommu";
> + reg = <0x00000800 0 0 0 0>;
> + #iommu-cells = <1>;
> + };
> +
> + /*
> +  * The IOMMU manages all functions in this PCI domain except
> +  * itself. Omit BDF 00:01.0.
> +  */
> + iommu-map = <0x0 &iommu0 0x0 0x8>
> + <0x9 &iommu0 0x9 0xfff7>;
> +};
> +
> +pcie@20000000 {
> + compatible = "pci-host-ecam-generic";
> + ...
> + /*
> +  * The IOMMU also manages all functions from this domain,
> +  * with endpoint IDs 0x10000 - 0x1ffff
> +  */
> + iommu-map = <0x0 &iommu0 0x10000 0x10000>;
> +};
> +
> +ethernet@fe001000 {
> + ...
> + /* The IOMMU manages this platform device with endpoint ID 0x20000 */
> + iommus = <&iommu0 0x20000>;
> +};
> +
> +[1] Documentation/devicetree/bindings/pci/pci.txt
> +[2] Documentation/devicetree/bindings/iommu/iommu.txt
Reviewed-by: Eric Auger 

Thanks

Eric
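
The first root complex in the quoted example leaves BDF 00:01.0 out of its iommu-map. The sketch below shows how a requester ID could be looked up in such a map and how a BDF is placed in the phys.hi cell of reg. It follows the (rid-base, iommu, iommu-base, length) layout of the generic iommu-map binding with the phandle cell left out. The sketch_* names are hypothetical and this is not the kernel's parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One iommu-map entry without the phandle: (rid-base, iommu-base, length). */
struct sketch_map_entry {
	uint32_t rid_base;
	uint32_t out_base;
	uint32_t length;
};

/* Requester ID of a PCI function: bus[15:8], device[7:3], function[2:0]. */
static uint32_t sketch_rid(unsigned int bus, unsigned int dev, unsigned int fn)
{
	assert(bus < 256 && dev < 32 && fn < 8);
	return (bus << 8) | (dev << 3) | fn;
}

/* phys.hi cell of "reg" with only the BDF set: the RID shifted left by 8. */
static uint32_t sketch_reg_phys_hi(unsigned int bus, unsigned int dev,
				   unsigned int fn)
{
	return sketch_rid(bus, dev, fn) << 8;
}

/* Translate rid to an endpoint ID. Returns 0 on success, -1 if unmapped. */
static int sketch_map_rid(const struct sketch_map_entry *map, size_t n,
			  uint32_t rid, uint32_t *endpoint)
{
	size_t i;

	assert(map != NULL && endpoint != NULL);
	for (i = 0; i < n; i++) {
		if (rid >= map[i].rid_base &&
		    rid - map[i].rid_base < map[i].length) {
			*endpoint = map[i].out_base + (rid - map[i].rid_base);
			return 0;
		}
	}
	return -1;
}
```

With the two entries (0x0, 0x0, 0x8) and (0x9, 0x9, 0xfff7), RID 0x8 (function 00:01.0, the IOMMU itself) falls in the gap and is left untranslated, while every other RID maps to an endpoint ID equal to itself.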

> 

Re: [PATCH v3 1/7] dt-bindings: virtio-mmio: Add IOMMU description

2018-11-15 Thread Auger Eric

Hi Jean,

On 10/12/18 4:59 PM, Jean-Philippe Brucker wrote:
> The nature of a virtio-mmio node is discovered by the virtio driver at
> probe time. However the DMA relation between devices must be described
> statically. When a virtio-mmio node is a virtio-iommu device, it needs an
> "#iommu-cells" property as specified by bindings/iommu/iommu.txt.
> 
> Otherwise, the virtio-mmio device may perform DMA through an IOMMU, which
> requires an "iommus" property. Describe these requirements in the
> device-tree bindings documentation.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  .../devicetree/bindings/virtio/mmio.txt   | 30 +++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> index 5069c1b8e193..748595473b36 100644
> --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> @@ -8,10 +8,40 @@ Required properties:
>  - reg: control registers base address and size including configuration space
>  - interrupts:interrupt generated by the device
>  
> +Required properties for virtio-iommu:
> +
> +- #iommu-cells: When the node corresponds to a virtio-iommu device, it is
> + linked to DMA masters using the "iommus" or "iommu-map"
> + properties [1][2]. #iommu-cells specifies the size of the
> + "iommus" property. For virtio-iommu #iommu-cells must be
> + 1, each cell describing a single endpoint ID.
> +
> +Optional properties:
> +
> +- iommus:If the device accesses memory through an IOMMU, it should
> + have an "iommus" property [1]. Since virtio-iommu itself
> + does not access memory through an IOMMU, the "virtio,mmio"
> + node cannot have both an "#iommu-cells" and an "iommus"
> + property.
> +
>  Example:
>  
>   virtio_block@3000 {
>   compatible = "virtio,mmio";
>   reg = <0x3000 0x100>;
>   interrupts = <41>;
> +
> + /* Device has endpoint ID 23 */
> + iommus = <&viommu 23>
>   }
> +
> + viommu: virtio_iommu@3100 {
> + compatible = "virtio,mmio";
> + reg = <0x3100 0x100>;
> + interrupts = <42>;
> +
> + #iommu-cells = <1>
> + }
> +
> +[1] Documentation/devicetree/bindings/iommu/iommu.txt
> +[2] Documentation/devicetree/bindings/pci/pci-iommu.txt
> 
Reviewed-by: Eric Auger 

Thanks

Eric
