Re: [PATCH 3/5] swiotlb: Add alloc and free APIs

2020-04-29 Thread kbuild test robot
Hi Srivatsa,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on vhost/linux-next]
[also build test ERROR on xen-tip/linux-next linus/master v5.7-rc3 next-20200429]
[cannot apply to swiotlb/linux-next]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Srivatsa-Vaddagiri/virtio-on-Type-1-hypervisor/20200429-032334
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: i386-randconfig-b002-20200429 (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   In file included from drivers/gpu/drm/i915/i915_scatterlist.h:12:0,
                    from drivers/gpu/drm/i915/i915_scatterlist.c:7:
   include/linux/swiotlb.h: In function 'swiotlb_alloc':
>> include/linux/swiotlb.h:231:9: error: 'DMA_MAPPING_ERROR' undeclared (first use in this function); did you mean 'APM_NO_ERROR'?
     return DMA_MAPPING_ERROR;
            ^
            APM_NO_ERROR
   include/linux/swiotlb.h:231:9: note: each undeclared identifier is reported only once for each function it appears in

vim +231 include/linux/swiotlb.h

   226  
   227  static inline phys_addr_t swiotlb_alloc(struct swiotlb_pool *pool,
   228  size_t alloc_size, unsigned long tbl_dma_addr,
   229  unsigned long mask)
   230  {
 > 231  return DMA_MAPPING_ERROR;
   232  }
   233  
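
One plausible reading of the error above is that the stub uses DMA_MAPPING_ERROR in a translation unit that never pulled in the header defining it (in mainline, as far as I know, include/linux/dma-mapping.h). The following is a minimal userspace sketch of that sentinel and of how callers test for it. The typedef, the simplified stub signature (pool argument dropped, dma_addr_t return type) and the helper names are assumptions for illustration, not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's dma_addr_t; the real width depends on the config. */
typedef uint64_t dma_addr_t;

/* Mirrors the mainline definition: an all-ones address means "mapping failed". */
#define DMA_MAPPING_ERROR (~(dma_addr_t)0)

/* Hypothetical, simplified stub for the "no bounce pool" case. */
static inline dma_addr_t swiotlb_alloc_stub(size_t alloc_size,
					    unsigned long tbl_dma_addr,
					    unsigned long mask)
{
	(void)alloc_size;
	(void)tbl_dma_addr;
	(void)mask;
	return DMA_MAPPING_ERROR;	/* nothing can be allocated */
}

/* Callers compare against the sentinel, in the spirit of dma_mapping_error(). */
static inline int mapping_failed(dma_addr_t addr)
{
	return addr == DMA_MAPPING_ERROR;
}
```

With the define visible before the stub, the stub compiles regardless of which file includes the header first.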

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 0/1] Add uvirtio for testing

2020-04-29 Thread lepton
On Wed, Apr 29, 2020 at 4:58 AM Gerd Hoffmann  wrote:
>
> > 3) Need to be verbose on how the vring processing work in the commit log of
> > patch 1
>
> Even better, a file documenting the interface somewhere in
> Documentation/
I put a uvirtio-vga.c under samples/uvirtio and hope it can serve as
the documentation for now, since that is currently the only tested use
case. Maybe add a document later if this really gets more use cases?
Thanks.
>
> take care,
>   Gerd
>


Re: [PATCH 0/1] Add uvirtio for testing

2020-04-29 Thread lepton
On Wed, Apr 29, 2020 at 2:57 AM Jason Wang  wrote:
>
>
> On 2020/4/29 上午4:47, Lepton Wu wrote:
> > This is a way to create virtio based devices from user space. This is the
> > background for this patch:
> >
> > We have some images that work fine under qemu, and we'd like to also run
> > the same images on Google Cloud. Currently Google Cloud doesn't support
> > virtio-vga. I had a patch to create a virtio-vga from the kernel directly:
> > https://www.spinics.net/lists/dri-devel/msg248573.html
> >
> > Then I got feedback from Gerd that maybe it's better to change that to
> > something like uvirtio. Since I really don't have other use cases for now,
> > I just implemented the minimal stuff which works for my use case.
>
>
> Interesting, several questions:
>
> 1) Are you aware of virtio vhost-user driver done by UM guys?
> (arch/um/drivers/virtio_uml.c) The memory part is tricky but overall
> both of you have similar target.
Thanks for reminding me, I was not aware of it. The use case looks a
little different: they are trying to create virtio devices for user mode
linux, which communicate with the "host" side. My driver doesn't depend
on any HOST/VMM side stuff. Basically we can use it to create a virtio
device from the guest itself, or even create a virtio device on bare metal.
> 2) Patch 1 said it's userspace virtio driver, which I think it is
> actually "userspace virtio device"
Updated in version 2 of this patch.
> 3) Need to be verbose on how the vring processing work in the commit log
> of patch 1
Updated.
> 4) I'm curious which testing you want to accomplish through this new
> transport, I guess you want to test a specific virtio driver?
Here is the whole story: we want to test our custom linux image. In
the past, we just tested our custom linux image with qemu (and virtio
vga), and ran the qemu session on Google Cloud. As you can see, this is
nested virtualization and performance suffers. What's more, we have
another vm inside our custom linux image. So we want to remove the qemu
layer and just run our custom linux image on Google Cloud directly. Then
we need some kind of VGA, so a "dummy" virtio vga looks like a good fit.
> 5) You mentioned that you may want to develop communication between
> kernel and userspace, any more details on that?
Currently, we don't have such a use case. But maybe others can extend
uvirtio further for this. For example, user space could use read/write
to actually exchange data with the kernel. That could then be used to
simulate more complex use cases.
>
> Thanks
>
>
> >
> > Lepton Wu (1):
> >virtio: Add uvirtio driver
> >
> >   drivers/virtio/Kconfig|   8 +
> >   drivers/virtio/Makefile   |   1 +
> >   drivers/virtio/uvirtio.c  | 405 ++
> >   include/linux/uvirtio.h   |   8 +
> >   include/uapi/linux/uvirtio.h  |  69 ++
> >   samples/uvirtio/Makefile  |   9 +
> >   samples/uvirtio/uvirtio-vga.c |  63 ++
> >   7 files changed, 563 insertions(+)
> >   create mode 100644 drivers/virtio/uvirtio.c
> >   create mode 100644 include/linux/uvirtio.h
> >   create mode 100644 include/uapi/linux/uvirtio.h
> >   create mode 100644 samples/uvirtio/Makefile
> >   create mode 100644 samples/uvirtio/uvirtio-vga.c
> >
>

[PATCH v2] virtio: Add uvirtio driver

2020-04-29 Thread Lepton Wu
This is for testing purposes: it creates virtio devices from user space.
uvirtio-vga.c shows how to create a virtio-vga device.

For a "simple" device like virtio 2d VGA, we actually only need to handle
one virtio request, which returns the resolution of the VGA. We provide a
UV_DEV_EXPECT ioctl which sets the expected virtio request and the prepared
reply. This eliminates user/kernel communication. We just handle this in
the notify callback of the virtqueue.

Check samples/uvirtio/uvirtio-vga.c for example.

Currently we don't have a use case which requires user/kernel communication so
read/write api hasn't been implemented.
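
A rough userspace sketch of the expect/reply idea described above: selected byte ranges of an incoming request are compared against prepared data, and on a match the prepared reply bytes are copied into the reply buffer. The struct layout, MAX_RULES, and the function names are illustrative assumptions, not the uapi added by this patch.

```c
#include <stddef.h>
#include <string.h>

#define MAX_RULES 4	/* hypothetical limit, not UVIRTIO_MAX_RULES */

/* One rule: 'len' bytes at offset 'off' of the buffer are significant. */
struct rule {
	size_t off;
	size_t len;
};

struct block {
	struct rule rules[MAX_RULES];	/* a rule with len == 0 ends the list */
};

/*
 * Return 1 if every rule range in 'buf' equals the corresponding bytes
 * packed back to back in 'data', 0 on mismatch, -1 if a rule is out of
 * bounds for a buffer of 'buf_len' bytes.
 */
static int request_matches(const unsigned char *buf, size_t buf_len,
			   const struct block *blk, const unsigned char *data)
{
	size_t consumed = 0;

	for (int i = 0; i < MAX_RULES && blk->rules[i].len; i++) {
		const struct rule *r = &blk->rules[i];

		if (r->off > buf_len || r->len > buf_len - r->off)
			return -1;
		if (memcmp(buf + r->off, data + consumed, r->len) != 0)
			return 0;
		consumed += r->len;
	}
	return 1;
}

/* Copy the prepared reply bytes from 'data' into the rule ranges of 'buf'. */
static int fill_reply(unsigned char *buf, size_t buf_len,
		      const struct block *blk, const unsigned char *data)
{
	size_t consumed = 0;

	for (int i = 0; i < MAX_RULES && blk->rules[i].len; i++) {
		const struct rule *r = &blk->rules[i];

		if (r->off > buf_len || r->len > buf_len - r->off)
			return -1;
		memcpy(buf + r->off, data + consumed, r->len);
		consumed += r->len;
	}
	return 0;
}
```

Writing the bounds check as `len > buf_len - off` avoids the wrap-around that a plain `off + len > buf_len` comparison permits for very large offsets.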

Signed-off-by: Lepton Wu 
---
v2:
  * Fix style issues found by checkpatch.pl
  * Update comments and commit log
---
 drivers/virtio/Kconfig|  11 +
 drivers/virtio/Makefile   |   1 +
 drivers/virtio/uvirtio.c  | 399 ++
 include/linux/uvirtio.h   |  16 ++
 include/uapi/linux/uvirtio.h  |  50 +
 samples/uvirtio/Makefile  |   9 +
 samples/uvirtio/uvirtio-vga.c |  72 ++
 7 files changed, 558 insertions(+)
 create mode 100644 drivers/virtio/uvirtio.c
 create mode 100644 include/linux/uvirtio.h
 create mode 100644 include/uapi/linux/uvirtio.h
 create mode 100644 samples/uvirtio/Makefile
 create mode 100644 samples/uvirtio/uvirtio-vga.c

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 69a32dfc318a..4686df49cac5 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -109,4 +109,15 @@ config VIRTIO_MMIO_CMDLINE_DEVICES
 
 If unsure, say 'N'.
 
+config UVIRTIO
+   tristate "UVirtio driver"
+   select VIRTIO
+   help
+This driver supports creating virtio devices from userspace.
+
+This can be used to create virtio devices from user space without
+ support from the VMM. Check samples/uvirtio for examples.
+
+If unsure, say 'N'.
+
 endif # VIRTIO_MENU
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 29a1386ecc03..558b2f890e8c 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -7,3 +7,4 @@ virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
 obj-$(CONFIG_VIRTIO_VDPA) += virtio_vdpa.o
+obj-$(CONFIG_UVIRTIO) += uvirtio.o
diff --git a/drivers/virtio/uvirtio.c b/drivers/virtio/uvirtio.c
new file mode 100644
index ..64cc9140de7a
--- /dev/null
+++ b/drivers/virtio/uvirtio.c
@@ -0,0 +1,399 @@
+// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+/*
+ *  User level device support for virtio subsystem
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define UVIRTIO_MAX_EXPECT_DATA  (1UL << 20)
+
+struct uvirtio_device {
+   struct virtio_device vdev;
+   struct mutex mutex;
+   enum uvirtio_state state;
+   unsigned char virtio_status;
+   struct uvirtio_setup setup;
+   struct uvirtio_expect expect;
+   char *expect_data;
+};
+
+static struct miscdevice uvirtio_misc;
+
+static struct bus_type uvirtio_bus = {
+   .name = "",
+};
+
+static u64 uvirtio_get_features(struct virtio_device *dev)
+{
+   struct uvirtio_device *udev = container_of(dev, struct uvirtio_device,
+  vdev);
+   return udev->setup.features;
+}
+
+static int uvirtio_finalize_features(struct virtio_device *vdev)
+{
+   return 0;
+}
+
+static void uvirtio_get(struct virtio_device *dev, unsigned int offset,
+   void *buf, unsigned int len)
+{
+   struct uvirtio_device *udev = container_of(dev, struct uvirtio_device,
+  vdev);
+   if (WARN_ON(offset + len > udev->setup.config_len))
+   return;
+   memcpy(buf, (char *)udev->setup.config_addr + offset, len);
+}
+
+static u8 uvirtio_get_status(struct virtio_device *dev)
+{
+   struct uvirtio_device *udev = container_of(dev, struct uvirtio_device,
+  vdev);
+   return udev->virtio_status;
+}
+
+static void uvirtio_set_status(struct virtio_device *dev, u8 status)
+{
+   struct uvirtio_device *udev = container_of(dev, struct uvirtio_device,
+  vdev);
+   if (WARN_ON(!status))
+   return;
+   udev->virtio_status = status;
+}
+
+static int find_match(int write, char *buf, unsigned int len,
+ struct uvirtio_block *block, char *data)
+{
+   int i;
+   int off = 0;
+
+   for (i = 0; i < UVIRTIO_MAX_RULES; ++i) {
+   if (!block->rules[i].len)
+   break;
+   if (block->rules[i].off + block->rules[i].len > len)
+   return -1;
+   if (write) {
+   memcpy(buf + block->rules[i].off,
+  data + 

Re: [PATCH 5/5] virtio: Add bounce DMA ops

2020-04-29 Thread kbuild test robot
Hi Srivatsa,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on vhost/linux-next]
[also build test WARNING on xen-tip/linux-next linus/master v5.7-rc3 next-20200428]
[cannot apply to swiotlb/linux-next]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Srivatsa-Vaddagiri/virtio-on-Type-1-hypervisor/20200429-032334
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
reproduce:
# apt-get install sparse
# sparse version: v0.6.1-191-gc51a0382-dirty
make ARCH=x86_64 allmodconfig
make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 


sparse warnings: (new ones prefixed by >>)

>> drivers/virtio/virtio_bounce.c:22:21: sparse: sparse: symbol 'virtio_pool' was not declared. Should it be static?
>> drivers/virtio/virtio_bounce.c:79:8: sparse: sparse: symbol 'virtio_max_mapping_size' was not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[RFC PATCH] virtio: virtio_pool can be static

2020-04-29 Thread kbuild test robot


Signed-off-by: kbuild test robot 
---
 virtio_bounce.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_bounce.c b/drivers/virtio/virtio_bounce.c
index 3de8e0eb71e48..5a68d48667c42 100644
--- a/drivers/virtio/virtio_bounce.c
+++ b/drivers/virtio/virtio_bounce.c
@@ -19,7 +19,7 @@
 static phys_addr_t bounce_buf_paddr;
 static void *bounce_buf_vaddr;
 static size_t bounce_buf_size;
-struct swiotlb_pool *virtio_pool;
+static struct swiotlb_pool *virtio_pool;
 
 #define VIRTIO_MAX_BOUNCE_SIZE (16*4096)
 
@@ -76,7 +76,7 @@ static void virtio_unmap_page(struct device *dev, dma_addr_t dev_addr,
size, dir, attrs);
 }
 
-size_t virtio_max_mapping_size(struct device *dev)
+static size_t virtio_max_mapping_size(struct device *dev)
 {
return VIRTIO_MAX_BOUNCE_SIZE;
 }


[PATCH v3] virtio-blk: handle block_device_operations callbacks after hot unplug

2020-04-29 Thread Stefan Hajnoczi
A userspace process holding a file descriptor to a virtio_blk device can
still invoke block_device_operations after hot unplug.  This leads to a
use-after-free accessing vblk->vdev in virtblk_getgeo() when
ioctl(HDIO_GETGEO) is invoked:

  BUG: unable to handle kernel NULL pointer dereference at 0090
  IP: [] virtio_check_driver_offered_feature+0x10/0x90 [virtio]
  PGD 80003a92f067 PUD 3a930067 PMD 0
  Oops:  [#1] SMP
  CPU: 0 PID: 1310 Comm: hdio-getgeo Tainted: G   OE     3.10.0-1062.el7.x86_64 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  task: 9be5fbfb8000 ti: 9be5fa89 task.ti: 9be5fa89
  RIP: 0010:[]  [] virtio_check_driver_offered_feature+0x10/0x90 [virtio]
  RSP: 0018:9be5fa893dc8  EFLAGS: 00010246
  RAX: 9be5fc3f3400 RBX: 9be5fa893e30 RCX: 
  RDX:  RSI: 0004 RDI: 9be5fbc10b40
  RBP: 9be5fa893dc8 R08: 0301 R09: 0301
  R10:  R11:  R12: 9be5fdc24680
  R13: 9be5fbc10b40 R14: 9be5fbc10480 R15: 
  FS:  7f1bfb968740() GS:9be5ffc0() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2: 0090 CR3: 3a894000 CR4: 00360ff0
  DR0:  DR1:  DR2: 
  DR3:  DR6: fffe0ff0 DR7: 0400
  Call Trace:
   [] virtblk_getgeo+0x47/0x110 [virtio_blk]
   [] ? handle_mm_fault+0x39d/0x9b0
   [] blkdev_ioctl+0x1f5/0xa20
   [] block_ioctl+0x41/0x50
   [] do_vfs_ioctl+0x3a0/0x5a0
   [] SyS_ioctl+0xa1/0xc0

A related problem is that virtblk_remove() leaks the vd_index_ida index
when something still holds a reference to vblk->disk during hot unplug.
This causes virtio-blk device names to be lost (vda, vdb, etc).

Fix these issues by protecting vblk->vdev with a mutex and reference
counting vblk so the vd_index_ida index can be removed in all cases.
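
The locking and lifetime scheme described above can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions, not the driver code: `struct blk`, the `blk_*` function names and the `blk_freed` counter are hypothetical, a pthread mutex stands in for the kernel mutex, and a C11 atomic stands in for refcount_t.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct virtio_blk. */
struct blk {
	pthread_mutex_t vdev_mutex;	/* protects vdev */
	void *vdev;			/* NULL once the device is removed */
	atomic_int refs;		/* probe holds one, each open holds one */
};

static int blk_freed;	/* counts frees so the lifetime can be observed */

static struct blk *blk_probe(void *vdev)
{
	struct blk *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	pthread_mutex_init(&b->vdev_mutex, NULL);
	b->vdev = vdev;
	atomic_init(&b->refs, 1);
	return b;
}

/* Drop one reference; the last one frees the object. */
static void blk_put(struct blk *b)
{
	if (atomic_fetch_sub(&b->refs, 1) == 1) {
		pthread_mutex_destroy(&b->vdev_mutex);
		free(b);
		blk_freed++;
	}
}

/* open(): only succeeds, and only takes a reference, while vdev is alive. */
static int blk_open(struct blk *b)
{
	int ret = 0;

	pthread_mutex_lock(&b->vdev_mutex);
	if (b->vdev)
		atomic_fetch_add(&b->refs, 1);
	else
		ret = -ENXIO;
	pthread_mutex_unlock(&b->vdev_mutex);
	return ret;
}

static void blk_release(struct blk *b)
{
	blk_put(b);
}

/* Any operation that needs vdev checks it under the mutex first. */
static int blk_getgeo(struct blk *b)
{
	int ret = 0;

	pthread_mutex_lock(&b->vdev_mutex);
	if (!b->vdev)
		ret = -ENXIO;
	/* else: b->vdev may be used safely here */
	pthread_mutex_unlock(&b->vdev_mutex);
	return ret;
}

/* Hot unplug: clear vdev under the mutex, then drop the probe reference. */
static void blk_remove(struct blk *b)
{
	pthread_mutex_lock(&b->vdev_mutex);
	b->vdev = NULL;
	pthread_mutex_unlock(&b->vdev_mutex);
	blk_put(b);
}
```

In this model an open file descriptor keeps the object alive after removal, operations on it fail with -ENXIO instead of touching freed memory, and the final release is what frees the object.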

Fixes: 48e4043d4529523cbc7fa8dd745bd8e2c45ce1d3 ("virtio: add virtio disk geometry feature")
Reported-by: Lance Digby 
Signed-off-by: Stefan Hajnoczi 
---
 drivers/block/virtio_blk.c | 87 ++
 1 file changed, 79 insertions(+), 8 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 93468b7c6701..6f7f277495f4 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -33,6 +33,16 @@ struct virtio_blk_vq {
 } cacheline_aligned_in_smp;
 
 struct virtio_blk {
+   /*
+* vdev may be accessed without taking this mutex in blk-mq and
+* virtqueue code paths because virtblk_remove() stops them before vdev
+* is freed.
+*
+* Everything else must hold this mutex when accessing vdev and must
+* handle the case where vdev is NULL after virtblk_remove() has been
+* called.
+*/
+   struct mutex vdev_mutex;
struct virtio_device *vdev;
 
/* The disk structure for the kernel. */
@@ -44,6 +54,13 @@ struct virtio_blk {
/* Process context for config space updates */
struct work_struct config_work;
 
+   /*
+* Tracks references from block_device_operations open/release and
+* virtio_driver probe/remove so this object can be freed once no
+* longer in use.
+*/
+   refcount_t refs;
+
/* What host tells us, plus 2 for header & tailer. */
unsigned int sg_elems;
 
@@ -295,10 +312,55 @@ static int virtblk_get_id(struct gendisk *disk, char *id_str)
return err;
 }
 
+static void virtblk_get(struct virtio_blk *vblk)
+{
+   refcount_inc(&vblk->refs);
+}
+
+static void virtblk_put(struct virtio_blk *vblk)
+{
+   if (refcount_dec_and_test(&vblk->refs)) {
+   ida_simple_remove(&vd_index_ida, vblk->index);
+   mutex_destroy(&vblk->vdev_mutex);
+   kfree(vblk);
+   }
+}
+
+static int virtblk_open(struct block_device *bd, fmode_t mode)
+{
+   struct virtio_blk *vblk = bd->bd_disk->private_data;
+   int ret = 0;
+
+   mutex_lock(&vblk->vdev_mutex);
+
+   if (vblk->vdev)
+   virtblk_get(vblk);
+   else
+   ret = -ENXIO;
+
+   mutex_unlock(&vblk->vdev_mutex);
+   return ret;
+}
+
+static void virtblk_release(struct gendisk *disk, fmode_t mode)
+{
+   struct virtio_blk *vblk = disk->private_data;
+
+   virtblk_put(vblk);
+}
+
 /* We provide getgeo only to please some old bootloader/partitioning tools */
 static int virtblk_getgeo(struct block_device *bd, struct hd_geometry *geo)
 {
struct virtio_blk *vblk = bd->bd_disk->private_data;
+   int ret = 0;
+
+   mutex_lock(&vblk->vdev_mutex);
+
+   if (!vblk->vdev) {
+   ret = -ENXIO;
+   goto out;
+   }
 
/* see if the host passed in geometry config */
if 

[PATCH v1 2/3] mm/memory_hotplug: Introduce MHP_DRIVER_MANAGED

2020-04-29 Thread David Hildenbrand
Some paravirtualized devices that add memory via add_memory() and
friends (esp. virtio-mem) don't want to create entries in
/sys/firmware/memmap/ - primarily to hinder kexec from adding this
memory to the boot memmap of the kexec kernel.

In fact, such memory is never exposed via the firmware (e.g., e820), but
only via the device, so exposing this memory via /sys/firmware/memmap/ is
wrong:
 "kexec needs the raw firmware-provided memory map to setup the
  parameter segment of the kernel that should be booted with
  kexec. Also, the raw memory map is useful for debugging. For
  that reason, /sys/firmware/memmap is an interface that provides
  the raw memory map to userspace." [1]

We want to let user space know that memory which is always detected,
added, and managed via a (device) driver - like memory managed by
virtio-mem - is special. It cannot be used for placing kexec segments,
and the (device) driver is responsible for re-adding that (possibly
shrunk/grown/defragmented) memory after a reboot/kexec. It should, e.g.,
not be added to a fixed-up firmware memmap. However, it should be dumped
by kdump.

Also, such memory could behave differently than an ordinary DIMM - e.g.,
memory managed by virtio-mem can have holes inside added memory resource,
which should not be touched, especially for writing.

Let's expose that memory as "System RAM (driver managed)" e.g., via
/proc/iomem.

We don't have to worry about firmware_map_remove() on the removal path.
If there is no entry, it will simply return with -EINVAL.
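
The two effects of the flag described above can be summarized in a small userspace sketch. The flag value is taken from this patch; the helper names `mhp_resource_name()` and `mhp_wants_firmware_map()` are hypothetical and only model the decisions, they do not exist in the kernel.

```c
#include <stdbool.h>

#define MHP_DRIVER_MANAGED 1	/* value as defined in this patch */

/* Name under which the added range would show up in /proc/iomem. */
static const char *mhp_resource_name(unsigned long flags)
{
	return (flags & MHP_DRIVER_MANAGED) ? "System RAM (driver managed)"
					    : "System RAM";
}

/* Whether a /sys/firmware/memmap entry would be created for the range. */
static bool mhp_wants_firmware_map(unsigned long flags)
{
	return !(flags & MHP_DRIVER_MANAGED);
}
```

Callers that pass 0 keep the existing behavior in both respects, which is why patch 1/3 can convert all existing users mechanically.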

[1] https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-memmap

Cc: Andrew Morton 
Cc: Michal Hocko 
Cc: Pankaj Gupta 
Cc: Wei Yang 
Cc: Baoquan He 
Cc: Eric Biederman 
Signed-off-by: David Hildenbrand 
---
 include/linux/memory_hotplug.h |  8 
 mm/memory_hotplug.c| 20 
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index bf0e3edb8688..cc538584b39e 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -68,6 +68,14 @@ struct mhp_params {
pgprot_t pgprot;
 };
 
+/* Flags used for add_memory() and friends. */
+
+/*
+ * Don't create entries in /sys/firmware/memmap/ and expose memory as
+ * "System RAM (driver managed)" in e.g., /proc/iomem
+ */
+#define MHP_DRIVER_MANAGED 1
+
 /*
  * Zone resizing functions
  *
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index ebdf6541d074..cfa0721280aa 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -98,11 +98,11 @@ void mem_hotplug_done(void)
 u64 max_mem_size = U64_MAX;
 
 /* add this memory to iomem resource */
-static struct resource *register_memory_resource(u64 start, u64 size)
+static struct resource *register_memory_resource(u64 start, u64 size,
+const char *resource_name)
 {
struct resource *res;
unsigned long flags =  IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-   char *resource_name = "System RAM";
 
/*
 * Make sure value parsed from 'mem=' only restricts memory adding
@@ -1058,7 +1058,8 @@ int __ref add_memory_resource(int nid, struct resource *res,
BUG_ON(ret);
 
/* create new memmap entry */
-   firmware_map_add_hotplug(start, start + size, "System RAM");
+   if (!(flags & MHP_DRIVER_MANAGED))
+   firmware_map_add_hotplug(start, start + size, "System RAM");
 
/* device_online() will take the lock when calling online_pages() */
mem_hotplug_done();
@@ -1081,10 +1082,21 @@ int __ref add_memory_resource(int nid, struct resource *res,
 /* requires device_hotplug_lock, see add_memory_resource() */
 int __ref __add_memory(int nid, u64 start, u64 size, unsigned long flags)
 {
+   const char *resource_name = "System RAM";
struct resource *res;
int ret;
 
-   res = register_memory_resource(start, size);
+   /*
+* Indicate that memory managed by a driver is special. It's always
+* detected and added via a driver, should not be given to the kexec
+* kernel for booting when manually crafting the firmware memmap, and
+* no kexec segments should be placed on it. However, kdump should
+* dump this memory.
+*/
+   if (flags & MHP_DRIVER_MANAGED)
+   resource_name = "System RAM (driver managed)";
+
+   res = register_memory_resource(start, size, resource_name);
if (IS_ERR(res))
return PTR_ERR(res);
 
-- 
2.25.3



[PATCH v1 1/3] mm/memory_hotplug: Prepare passing flags to add_memory() and friends

2020-04-29 Thread David Hildenbrand
We soon want to pass flags - prepare for that.

This patch is based on a similar patch by Oscar Salvador:

https://lkml.kernel.org/r/20190625075227.15193-3-osalva...@suse.de

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Greg Kroah-Hartman 
Cc: Dan Williams 
Cc: Vishal Verma 
Cc: Dave Jiang 
Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Cc: Stephen Hemminger 
Cc: Wei Liu 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Andrew Morton 
Cc: Thomas Gleixner 
Cc: Pingfan Liu 
Cc: Leonardo Bras 
Cc: Nathan Lynch 
Cc: Oscar Salvador 
Cc: Michal Hocko 
Cc: Baoquan He 
Cc: Wei Yang 
Cc: Pankaj Gupta 
Cc: Eric Biederman 
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux-nvd...@lists.01.org
Cc: linux-hyp...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-de...@lists.xenproject.org
Signed-off-by: David Hildenbrand 
---
 arch/powerpc/platforms/powernv/memtrace.c   |  2 +-
 arch/powerpc/platforms/pseries/hotplug-memory.c |  2 +-
 drivers/acpi/acpi_memhotplug.c  |  2 +-
 drivers/base/memory.c   |  2 +-
 drivers/dax/kmem.c  |  2 +-
 drivers/hv/hv_balloon.c |  2 +-
 drivers/s390/char/sclp_cmd.c|  2 +-
 drivers/virtio/virtio_mem.c |  2 +-
 drivers/xen/balloon.c   |  2 +-
 include/linux/memory_hotplug.h  |  7 ---
 mm/memory_hotplug.c | 11 ++-
 11 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 13b369d2cc45..a7475d18c671 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -224,7 +224,7 @@ static int memtrace_online(void)
ent->mem = 0;
}
 
-   if (add_memory(ent->nid, ent->start, ent->size)) {
+   if (add_memory(ent->nid, ent->start, ent->size, 0)) {
pr_err("Failed to add trace memory to node %d\n",
ent->nid);
ret += 1;
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 5ace2f9a277e..ae44eba46ca0 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -646,7 +646,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
block_sz = memory_block_size_bytes();
 
/* Add the memory */
-   rc = __add_memory(lmb->nid, lmb->base_addr, block_sz);
+   rc = __add_memory(lmb->nid, lmb->base_addr, block_sz, 0);
if (rc) {
invalidate_lmb_associativity_index(lmb);
return rc;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index e294f44a7850..d91b3584d4b2 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -207,7 +207,7 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
if (node < 0)
node = memory_add_physaddr_to_nid(info->start_addr);
 
-   result = __add_memory(node, info->start_addr, info->length);
+   result = __add_memory(node, info->start_addr, info->length, 0);
 
/*
 * If the memory block has been used by the kernel, add_memory()
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 2b09b68b9f78..c0ef7d9e310a 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -432,7 +432,7 @@ static ssize_t probe_store(struct device *dev, struct device_attribute *attr,
 
nid = memory_add_physaddr_to_nid(phys_addr);
ret = __add_memory(nid, phys_addr,
-  MIN_MEMORY_BLOCK_SIZE * sections_per_block);
+  MIN_MEMORY_BLOCK_SIZE * sections_per_block, 0);
 
if (ret)
goto out;
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 3d0a7e702c94..e159184e0ba0 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -65,7 +65,7 @@ int dev_dax_kmem_probe(struct device *dev)
new_res->flags = IORESOURCE_SYSTEM_RAM;
new_res->name = dev_name(dev);
 
-   rc = add_memory(numa_node, new_res->start, resource_size(new_res));
+   rc = add_memory(numa_node, new_res->start, resource_size(new_res), 0);
if (rc) {
release_resource(new_res);
kfree(new_res);
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 32e3bc0aa665..0194bed1a573 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ 

[PATCH v1 0/3] mm/memory_hotplug: Make virtio-mem play nicely with kexec-tools

2020-04-29 Thread David Hildenbrand
This series is based on [1]:
[PATCH v2 00/10] virtio-mem: paravirtualized memory
That will hopefully get picked up soon, rebased to -next.

The following patches were reverted from -next [2]:
[PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use
As discussed in that thread, they should be reverted from -next already.

In theory, if people agree, we could take the first two patches via the
-mm tree now and the last (virtio-mem) patch via MST's tree once picking up
virtio-mem. No strong feelings.


Memory added by virtio-mem is special and might contain logical holes,
especially after memory unplug, but also when adding memory in
sub-section size. While memory in these holes can usually be read, that
memory should not be touched. virtio-mem managed device memory is never
exposed via any firmware memmap (esp., e820). The device driver will
request to plug memory from the hypervisor and add it to Linux.

On a cold start, all memory is unplugged, and the guest driver will first
request to plug memory from the hypervisor, to then add it to Linux. After
a reboot, all memory will get unplugged (except in rare, special cases). In
case the device driver comes up and detects that some memory is still
plugged after a reboot, it will manually request to unplug all memory from
the hypervisor first - to then request to plug memory from the hypervisor
and add to Linux. This is essentially a defragmentation step, where all
logical holes are removed.

As the device driver is responsible for detecting, adding and managing that
memory, also kexec should treat it like that. It is special. We need a way
to teach kexec-tools to not add that memory to the fixed-up firmware
memmap, to not place kexec images onto this memory, but still allow kdump
to dump it. Add a flag to tell memory hotplug code to
not create /sys/firmware/memmap entries and to indicate it via
"System RAM (driver managed)" in /proc/iomem.

Before this series, kexec_file_load() already did the right thing (for
virtio-mem) by not adding that memory to the fixed-up firmware memmap and
letting the device driver handle it. With this series, also kexec_load() -
which relies on user space to provide a fixed up firmware memmap - does
the right thing with virtio-mem memory.

When the virtio-mem device driver(s) come up, they will request to unplug
all memory from the hypervisor first (esp. defragment), to then request to
plug consecutive memory ranges from the hypervisor, and add them to Linux
- just like on a reboot where we still have memory plugged.

[1] https://lore.kernel.org/r/20200311171422.10484-1-da...@redhat.com/
[2] https://lore.kernel.org/r/20200326180730.4754-1-james.mo...@arm.com

David Hildenbrand (3):
  mm/memory_hotplug: Prepare passing flags to add_memory() and friends
  mm/memory_hotplug: Introduce MHP_DRIVER_MANAGED
  virtio-mem: Add memory with MHP_DRIVER_MANAGED

 arch/powerpc/platforms/powernv/memtrace.c |  2 +-
 .../platforms/pseries/hotplug-memory.c|  2 +-
 drivers/acpi/acpi_memhotplug.c|  2 +-
 drivers/base/memory.c |  2 +-
 drivers/dax/kmem.c|  2 +-
 drivers/hv/hv_balloon.c   |  2 +-
 drivers/s390/char/sclp_cmd.c  |  2 +-
 drivers/virtio/virtio_mem.c   |  3 +-
 drivers/xen/balloon.c |  2 +-
 include/linux/memory_hotplug.h| 15 +++--
 mm/memory_hotplug.c   | 31 +--
 11 files changed, 44 insertions(+), 21 deletions(-)

-- 
2.25.3



[PATCH v1 3/3] virtio-mem: Add memory with MHP_DRIVER_MANAGED

2020-04-29 Thread David Hildenbrand
We don't want /sys/firmware/memmap entries and we want to indicate
our memory as "System RAM (driver managed)" in /proc/iomem. This is
especially relevant for kexec-tools, which have to be updated to
support dumping virtio-mem memory after this patch. Expected behavior in
kexec-tools:
- Don't use this memory when creating a fixed-up firmware memmap. Works
  now out of the box on x86-64.
- Don't use this memory for placing kexec segments. Works now out of the
  box on x86-64.
- Consider "System RAM (driver managed)" when creating the elfcorehdr
  for kdump. This memory has to be dumped. Needs update of kexec-tools.
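
The expected kexec-tools behavior listed above can be modeled as a per-entry policy keyed on the /proc/iomem name. This is a hedged sketch only: `struct iomem_policy` and `classify_iomem_name()` are hypothetical names and do not reflect how kexec-tools is actually structured.

```c
#include <stdbool.h>
#include <string.h>

/* What a kexec-style tool might do with one /proc/iomem entry. */
struct iomem_policy {
	bool in_memmap;		/* add to the fixed-up firmware memmap */
	bool kexec_segments;	/* allowed as a target for kexec segments */
	bool in_elfcorehdr;	/* described in the kdump elfcorehdr */
};

static struct iomem_policy classify_iomem_name(const char *name)
{
	struct iomem_policy p = { false, false, false };

	if (strcmp(name, "System RAM") == 0) {
		p.in_memmap = true;
		p.kexec_segments = true;
		p.in_elfcorehdr = true;
	} else if (strcmp(name, "System RAM (driver managed)") == 0) {
		/* Owned by a driver: never boot from it, but still dump it. */
		p.in_elfcorehdr = true;
	}
	return p;
}
```

In this model the exact string comparison is what keeps the new name out of the first two categories; a prefix match on "System RAM" would treat driver-managed ranges as ordinary RAM.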

With this patch on x86-64:

/proc/iomem:
-0fff : Reserved
1000-0009fbff : System RAM
[...]
fffc- : Reserved
1-13fff : System RAM
14000-147ff : System RAM (driver managed)
34000-347ff : System RAM (driver managed)
34800-34fff : System RAM (driver managed)
[..]
328000-32 : PCI Bus :00

/sys/firmware/memmap:
-0009fc00 (System RAM)
0009fc00-000a (Reserved)
000f-0010 (Reserved)
0010-bffe (System RAM)
bffe-c000 (Reserved)
feffc000-ff00 (Reserved)
fffc-0001 (Reserved)
0001-00014000 (System RAM)

Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Michal Hocko 
Cc: Eric Biederman 
Signed-off-by: David Hildenbrand 
---
 drivers/virtio/virtio_mem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 3101cbf9e59d..6f658d1aeac4 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -421,7 +421,8 @@ static int virtio_mem_mb_add(struct virtio_mem *vm, unsigned long mb_id)
nid = memory_add_physaddr_to_nid(addr);
 
    dev_dbg(&vm->vdev->dev, "adding memory block: %lu\n", mb_id);
-   return add_memory(nid, addr, memory_block_size_bytes(), 0);
+   return add_memory(nid, addr, memory_block_size_bytes(),
+ MHP_DRIVER_MANAGED);
 }
 
 /*
-- 
2.25.3



Re: [PATCH] vhost: fix default for vhost_iotlb

2020-04-29 Thread Michael S. Tsirkin
On Wed, Apr 29, 2020 at 04:23:04PM +0200, Arnd Bergmann wrote:
> During randconfig build testing, I ran into a configuration that has
> CONFIG_VHOST=m, CONFIG_VHOST_IOTLB=m and CONFIG_VHOST_RING=y, which
> leaves the iotlb implementation out of vhost_ring and in turn
> leads to a link failure of the vdpa_sim module:
> 
> ERROR: modpost: "vringh_set_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
> ERROR: modpost: "vringh_init_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
> ERROR: modpost: "vringh_iov_push_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
> ERROR: modpost: "vringh_iov_pull_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
> ERROR: modpost: "vringh_complete_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
> ERROR: modpost: "vringh_getdesc_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
> 
> Work around it by setting the default for VHOST_IOTLB to avoid this
> configuration.
> 
> Fixes: e6faeaa12841 ("vhost: drop vring dependency on iotlb")
> Signed-off-by: Arnd Bergmann 
> ---
> I fixed this a while ago locally but never got around to sending the
> fix. If the problem has been addressed differently in the meantime,
> please ignore this one.


So I ended up not sending e6faeaa12841 upstream because of this problem.
But hey, that's a nice idea!
I'll queue something like this for the next release.

> ---
>  drivers/vhost/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> index 2c75d164b827..ee5f85761024 100644
> --- a/drivers/vhost/Kconfig
> +++ b/drivers/vhost/Kconfig
> @@ -1,6 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  config VHOST_IOTLB
>   tristate
> + default y if VHOST=m && VHOST_RING=y
>   help
> Generic IOTLB implementation for vhost and vringh.
> This option is selected by any driver which needs to support
> -- 
> 2.26.0



[PATCH] vhost: fix default for vhost_iotlb

2020-04-29 Thread Arnd Bergmann
During randconfig build testing, I ran into a configuration that has
CONFIG_VHOST=m, CONFIG_VHOST_IOTLB=m and CONFIG_VHOST_RING=y, which
leaves the iotlb implementation out of vhost_ring and in turn
leads to a link failure of the vdpa_sim module:

ERROR: modpost: "vringh_set_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
ERROR: modpost: "vringh_init_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
ERROR: modpost: "vringh_iov_push_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
ERROR: modpost: "vringh_iov_pull_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
ERROR: modpost: "vringh_complete_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!
ERROR: modpost: "vringh_getdesc_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] undefined!

Work around it by setting the default for VHOST_IOTLB to avoid this
configuration.
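
A small userspace model of why this combination breaks, assuming the iotlb
helpers in vringh are only compiled when VHOST_IOTLB is reachable from the
code that uses it (in the sense of the kernel's IS_REACHABLE() macro). The
enum and both functions are hypothetical stand-ins, not Kconfig or kernel
code.

```c
#include <assert.h>

/* Kconfig tristate values: n (off), m (module), y (built-in). */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/*
 * Built-in code can only call into built-in code; modular code can call
 * into built-in or modular code. The consumer must be enabled.
 */
static int is_reachable(enum tristate provider, enum tristate consumer)
{
	assert(consumer != TRI_N);
	if (provider == TRI_Y)
		return 1;
	return provider == TRI_M && consumer == TRI_M;
}

/*
 * Effect of "default y if VHOST=m && VHOST_RING=y": the value reached
 * through select alone is raised to y in the problematic configuration.
 */
static enum tristate iotlb_with_default(enum tristate selected,
					enum tristate vhost,
					enum tristate vhost_ring)
{
	if (vhost == TRI_M && vhost_ring == TRI_Y)
		return TRI_Y;
	return selected;
}
```

With VHOST_IOTLB=m and VHOST_RING=y, is_reachable() is 0 in this model, which
corresponds to the helpers being left out; with the default applied the
provider becomes y and the built-in consumer can reach it.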

Fixes: e6faeaa12841 ("vhost: drop vring dependency on iotlb")
Signed-off-by: Arnd Bergmann 
---
I fixed this a while ago locally but never got around to sending the
fix. If the problem has been addressed differently in the meantime,
please ignore this one.
---
 drivers/vhost/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 2c75d164b827..ee5f85761024 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VHOST_IOTLB
tristate
+   default y if VHOST=m && VHOST_RING=y
help
  Generic IOTLB implementation for vhost and vringh.
  This option is selected by any driver which needs to support
-- 
2.26.0



[PATCH v3 33/34] iommu: Move more initialization to __iommu_probe_device()

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Move the calls to dev_iommu_get() and try_module_get() into
__iommu_probe_device(), so that the callers don't have to do it on
their own.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 47 +--
 1 file changed, 18 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7f99e5ae432c..48a95f7d7999 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -194,9 +194,19 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
struct iommu_group *group;
int ret;
 
+   if (!dev_iommu_get(dev))
+   return -ENOMEM;
+
+   if (!try_module_get(ops->owner)) {
+   ret = -EINVAL;
+   goto err_free;
+   }
+
iommu_dev = ops->probe_device(dev);
-   if (IS_ERR(iommu_dev))
-   return PTR_ERR(iommu_dev);
+   if (IS_ERR(iommu_dev)) {
+   ret = PTR_ERR(iommu_dev);
+   goto out_module_put;
+   }
 
dev->iommu->iommu_dev = iommu_dev;
 
@@ -217,6 +227,12 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
 out_release:
ops->release_device(dev);
 
+out_module_put:
+   module_put(ops->owner);
+
+err_free:
+   dev_iommu_free(dev);
+
return ret;
 }
 
@@ -226,14 +242,6 @@ int iommu_probe_device(struct device *dev)
struct iommu_group *group;
int ret;
 
-   if (!dev_iommu_get(dev))
-   return -ENOMEM;
-
-   if (!try_module_get(ops->owner)) {
-   ret = -EINVAL;
-   goto err_out;
-   }
-
ret = __iommu_probe_device(dev, NULL);
if (ret)
goto err_out;
@@ -1532,14 +1540,10 @@ struct iommu_domain *iommu_group_default_domain(struct iommu_group *group)
 
 static int probe_iommu_group(struct device *dev, void *data)
 {
-   const struct iommu_ops *ops = dev->bus->iommu_ops;
struct list_head *group_list = data;
struct iommu_group *group;
int ret;
 
-   if (!dev_iommu_get(dev))
-   return -ENOMEM;
-
/* Device is probed already if in a group */
group = iommu_group_get(dev);
if (group) {
@@ -1547,22 +1551,7 @@ static int probe_iommu_group(struct device *dev, void *data)
return 0;
}
 
-   if (!try_module_get(ops->owner)) {
-   ret = -EINVAL;
-   goto err_free_dev_iommu;
-   }
-
ret = __iommu_probe_device(dev, group_list);
-   if (ret)
-   goto err_module_put;
-
-   return 0;
-
-err_module_put:
-   module_put(ops->owner);
-err_free_dev_iommu:
-   dev_iommu_free(dev);
-
if (ret == -ENODEV)
ret = 0;
 
-- 
2.17.1



[PATCH v3 08/34] iommu: Move default domain allocation to iommu_probe_device()

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Well, not really. The call to iommu_alloc_default_domain() in
iommu_group_get_for_dev() has to stay around as long as there are
IOMMU drivers using the add/remove_device() call-backs instead of
probe/release_device().

Those drivers expect that iommu_group_get_for_dev() returns the device
attached to a group and the group set up with a default domain (and
the device attached to the group's current domain).

But when all drivers are converted this compatibility mess can be
removed.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 102 +-
 1 file changed, 71 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 6cfe7799dc8c..7a385c18e1a5 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -79,6 +79,16 @@ static bool iommu_cmd_line_dma_api(void)
return !!(iommu_cmd_line & IOMMU_CMD_LINE_DMA_API);
 }
 
+static int iommu_alloc_default_domain(struct device *dev);
+static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
+unsigned type);
+static int __iommu_attach_device(struct iommu_domain *domain,
+struct device *dev);
+static int __iommu_attach_group(struct iommu_domain *domain,
+   struct iommu_group *group);
+static void __iommu_detach_group(struct iommu_domain *domain,
+struct iommu_group *group);
+
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
 struct iommu_group_attribute iommu_group_attr_##_name =\
__ATTR(_name, _mode, _show, _store)
@@ -221,10 +231,29 @@ int iommu_probe_device(struct device *dev)
goto err_free_dev_param;
}
 
-   if (ops->probe_device)
+   if (ops->probe_device) {
+   struct iommu_group *group;
+
ret = __iommu_probe_device(dev);
-   else
+
+   /*
+* Try to allocate a default domain - needs support from the
+* IOMMU driver. There are still some drivers which don't
+* support default domains, so the return value is not yet
+* checked.
+*/
+   if (!ret)
+   iommu_alloc_default_domain(dev);
+
+   group = iommu_group_get(dev);
+   if (group && group->default_domain) {
+   ret = __iommu_attach_device(group->default_domain, dev);
+   iommu_group_put(group);
+   }
+
+   } else {
ret = ops->add_device(dev);
+   }
 
if (ret)
goto err_module_put;
@@ -268,15 +297,6 @@ void iommu_release_device(struct device *dev)
dev_iommu_free(dev);
 }
 
-static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
-unsigned type);
-static int __iommu_attach_device(struct iommu_domain *domain,
-struct device *dev);
-static int __iommu_attach_group(struct iommu_domain *domain,
-   struct iommu_group *group);
-static void __iommu_detach_group(struct iommu_domain *domain,
-struct iommu_group *group);
-
 static int __init iommu_set_def_domain_type(char *str)
 {
bool pt;
@@ -1423,25 +1443,18 @@ static int iommu_get_def_domain_type(struct device *dev)
return (type == 0) ? iommu_def_domain_type : type;
 }
 
-static int iommu_alloc_default_domain(struct device *dev,
- struct iommu_group *group)
+static int iommu_group_alloc_default_domain(struct bus_type *bus,
+   struct iommu_group *group,
+   unsigned int type)
 {
struct iommu_domain *dom;
-   unsigned int type;
-
-   if (group->default_domain)
-   return 0;
 
-   type = iommu_get_def_domain_type(dev);
-
-   dom = __iommu_domain_alloc(dev->bus, type);
+   dom = __iommu_domain_alloc(bus, type);
if (!dom && type != IOMMU_DOMAIN_DMA) {
-   dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
-   if (dom) {
-   dev_warn(dev,
-"failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
-type);
-   }
+   dom = __iommu_domain_alloc(bus, IOMMU_DOMAIN_DMA);
+   if (dom)
+   pr_warn("Failed to allocate default IOMMU domain of type %u for group %s - Falling back to IOMMU_DOMAIN_DMA",
+   type, group->name);
}
 
if (!dom)
@@ -1461,6 +1474,23 @@ static int iommu_alloc_default_domain(struct device *dev,
return 0;
 }
 

[PATCH v3 26/34] iommu/tegra: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the Tegra IOMMU drivers to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/tegra-gart.c | 24 ++--
 drivers/iommu/tegra-smmu.c | 31 ---
 2 files changed, 14 insertions(+), 41 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index db6559e8336f..5fbdff6ff41a 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -243,28 +243,16 @@ static bool gart_iommu_capable(enum iommu_cap cap)
return false;
 }
 
-static int gart_iommu_add_device(struct device *dev)
+static struct iommu_device *gart_iommu_probe_device(struct device *dev)
 {
-   struct iommu_group *group;
-
if (!dev_iommu_fwspec_get(dev))
-   return -ENODEV;
-
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
-
-   iommu_group_put(group);
+   return ERR_PTR(-ENODEV);
 
-   iommu_device_link(&gart_handle->iommu, dev);
-
-   return 0;
+   return &gart_handle->iommu;
 }
 
-static void gart_iommu_remove_device(struct device *dev)
+static void gart_iommu_release_device(struct device *dev)
 {
-   iommu_group_remove_device(dev);
-   iommu_device_unlink(&gart_handle->iommu, dev);
 }
 
 static int gart_iommu_of_xlate(struct device *dev,
@@ -290,8 +278,8 @@ static const struct iommu_ops gart_iommu_ops = {
.domain_free= gart_iommu_domain_free,
.attach_dev = gart_iommu_attach_dev,
.detach_dev = gart_iommu_detach_dev,
-   .add_device = gart_iommu_add_device,
-   .remove_device  = gart_iommu_remove_device,
+   .probe_device   = gart_iommu_probe_device,
+   .release_device = gart_iommu_release_device,
.device_group   = generic_device_group,
.map= gart_iommu_map,
.unmap  = gart_iommu_unmap,
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 63a147b623e6..7426b7666e2b 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -757,11 +757,10 @@ static int tegra_smmu_configure(struct tegra_smmu *smmu, struct device *dev,
return 0;
 }
 
-static int tegra_smmu_add_device(struct device *dev)
+static struct iommu_device *tegra_smmu_probe_device(struct device *dev)
 {
struct device_node *np = dev->of_node;
struct tegra_smmu *smmu = NULL;
-   struct iommu_group *group;
struct of_phandle_args args;
unsigned int index = 0;
int err;
@@ -774,7 +773,7 @@ static int tegra_smmu_add_device(struct device *dev)
of_node_put(args.np);
 
if (err < 0)
-   return err;
+   return ERR_PTR(err);
 
/*
 * Only a single IOMMU master interface is currently
@@ -783,8 +782,6 @@ static int tegra_smmu_add_device(struct device *dev)
 */
dev->archdata.iommu = smmu;
 
-   iommu_device_link(&smmu->iommu, dev);
-
break;
}
 
@@ -793,26 +790,14 @@ static int tegra_smmu_add_device(struct device *dev)
}
 
if (!smmu)
-   return -ENODEV;
-
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
-
-   iommu_group_put(group);
+   return ERR_PTR(-ENODEV);
 
-   return 0;
+   return &smmu->iommu;
 }
 
-static void tegra_smmu_remove_device(struct device *dev)
+static void tegra_smmu_release_device(struct device *dev)
 {
-   struct tegra_smmu *smmu = dev->archdata.iommu;
-
-   if (smmu)
-   iommu_device_unlink(&smmu->iommu, dev);
-
dev->archdata.iommu = NULL;
-   iommu_group_remove_device(dev);
 }
 
 static const struct tegra_smmu_group_soc *
@@ -895,8 +880,8 @@ static const struct iommu_ops tegra_smmu_ops = {
.domain_free = tegra_smmu_domain_free,
.attach_dev = tegra_smmu_attach_dev,
.detach_dev = tegra_smmu_detach_dev,
-   .add_device = tegra_smmu_add_device,
-   .remove_device = tegra_smmu_remove_device,
+   .probe_device = tegra_smmu_probe_device,
+   .release_device = tegra_smmu_release_device,
.device_group = tegra_smmu_device_group,
.map = tegra_smmu_map,
.unmap = tegra_smmu_unmap,
@@ -1015,7 +1000,7 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
 * value. However the IOMMU registration process will attempt to add
 * all devices to the IOMMU when bus_set_iommu() is called. In order
 * not to rely on global variables to track the IOMMU instance, we
-* set it here so that it can be looked up from the .add_device()
+* set it here so that it can be looked up from the 

[PATCH v3 30/34] iommu/exynos: Use first SYSMMU in controllers list for IOMMU core

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

On Exynos platforms there can be more than one SYSMMU (IOMMU) for one
DMA master device. Since the IOMMU core code expects only one hardware
IOMMU, use the first SYSMMU in the list.
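
A minimal userspace sketch of taking the first entry of an intrusive list,
assuming simplified re-implementations of the kernel's list_head,
container_of() and list_first_entry(); the trailing-underscore macros and the
fake_* types are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified circular doubly-linked list node. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Recover the enclosing structure from a pointer to its list member. */
#define container_of_(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry_(head, type, member) \
	container_of_((head)->next, type, member)

struct fake_sysmmu {
	int id;
	struct list_node owner_node;
};

struct fake_owner {
	struct list_node controllers;
};

/* Only valid on a non-empty list, as the comment in the patch relies on. */
static struct fake_sysmmu *first_sysmmu(struct fake_owner *owner)
{
	assert(owner->controllers.next != &owner->controllers);
	return list_first_entry_(&owner->controllers,
				 struct fake_sysmmu, owner_node);
}
```

On an empty list the macro would hand back a pointer computed from the list
head itself, which is why the "at least one entry" guarantee matters.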

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/exynos-iommu.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 186ff5cc975c..09cdd163560a 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1261,6 +1261,11 @@ static int exynos_iommu_add_device(struct device *dev)
}
iommu_group_put(group);
 
+   /* There is always at least one entry, see exynos_iommu_of_xlate() */
+   data = list_first_entry(&owner->controllers,
+   struct sysmmu_drvdata, owner_node);
+   iommu_device_link(&data->iommu, dev);
+
return 0;
 }
 
@@ -1286,6 +1291,11 @@ static void exynos_iommu_remove_device(struct device *dev)
 
list_for_each_entry(data, &owner->controllers, owner_node)
device_link_del(data->link);
+
+   /* There is always at least one entry, see exynos_iommu_of_xlate() */
+   data = list_first_entry(&owner->controllers,
+   struct sysmmu_drvdata, owner_node);
+   iommu_device_unlink(&data->iommu, dev);
 }
 
 static int exynos_iommu_of_xlate(struct device *dev,
-- 
2.17.1



[PATCH v3 34/34] iommu: Unexport iommu_group_get_for_dev()

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

The function is now only used in IOMMU core code and shouldn't be used
outside of it anyway, so remove the export for it.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 4 ++--
 include/linux/iommu.h | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 48a95f7d7999..a9e5618cde80 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -91,6 +91,7 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 struct iommu_group *group);
 static int iommu_create_device_direct_mappings(struct iommu_group *group,
   struct device *dev);
+static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
 struct iommu_group_attribute iommu_group_attr_##_name =\
@@ -1500,7 +1501,7 @@ static int iommu_alloc_default_domain(struct device *dev)
  * to the returned IOMMU group, which will already include the provided
  * device.  The reference should be released with iommu_group_put().
  */
-struct iommu_group *iommu_group_get_for_dev(struct device *dev)
+static struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
struct iommu_group *group;
@@ -1531,7 +1532,6 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 
return ERR_PTR(ret);
 }
-EXPORT_SYMBOL(iommu_group_get_for_dev);
 
 struct iommu_domain *iommu_group_default_domain(struct iommu_group *group)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index dd076366383f..7cfd2dddb49d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -527,7 +527,6 @@ extern int iommu_page_response(struct device *dev,
   struct iommu_page_response *msg);
 
 extern int iommu_group_id(struct iommu_group *group);
-extern struct iommu_group *iommu_group_get_for_dev(struct device *dev);
 extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
 
 extern int iommu_domain_get_attr(struct iommu_domain *domain, enum iommu_attr,
-- 
2.17.1



[PATCH v3 31/34] iommu/exynos: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the Exynos IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/exynos-iommu.c | 26 ++
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 09cdd163560a..60c8a56e4a3f 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1235,19 +1235,13 @@ static phys_addr_t exynos_iommu_iova_to_phys(struct iommu_domain *iommu_domain,
return phys;
 }
 
-static int exynos_iommu_add_device(struct device *dev)
+static struct iommu_device *exynos_iommu_probe_device(struct device *dev)
 {
struct exynos_iommu_owner *owner = dev->archdata.iommu;
struct sysmmu_drvdata *data;
-   struct iommu_group *group;
 
if (!has_sysmmu(dev))
-   return -ENODEV;
-
-   group = iommu_group_get_for_dev(dev);
-
-   if (IS_ERR(group))
-   return PTR_ERR(group);
+   return ERR_PTR(-ENODEV);
 
list_for_each_entry(data, &owner->controllers, owner_node) {
/*
@@ -1259,17 +1253,15 @@ static int exynos_iommu_add_device(struct device *dev)
 DL_FLAG_STATELESS |
 DL_FLAG_PM_RUNTIME);
}
-   iommu_group_put(group);
 
/* There is always at least one entry, see exynos_iommu_of_xlate() */
data = list_first_entry(&owner->controllers,
struct sysmmu_drvdata, owner_node);
-   iommu_device_link(&data->iommu, dev);
 
-   return 0;
+   return &data->iommu;
 }
 
-static void exynos_iommu_remove_device(struct device *dev)
+static void exynos_iommu_release_device(struct device *dev)
 {
struct exynos_iommu_owner *owner = dev->archdata.iommu;
struct sysmmu_drvdata *data;
@@ -1287,15 +1279,9 @@ static void exynos_iommu_remove_device(struct device *dev)
iommu_group_put(group);
}
}
-   iommu_group_remove_device(dev);
 
list_for_each_entry(data, &owner->controllers, owner_node)
device_link_del(data->link);
-
-   /* There is always at least one entry, see exynos_iommu_of_xlate() */
-   data = list_first_entry(&owner->controllers,
-   struct sysmmu_drvdata, owner_node);
-   iommu_device_unlink(&data->iommu, dev);
 }
 
 static int exynos_iommu_of_xlate(struct device *dev,
@@ -1341,8 +1327,8 @@ static const struct iommu_ops exynos_iommu_ops = {
.unmap = exynos_iommu_unmap,
.iova_to_phys = exynos_iommu_iova_to_phys,
.device_group = generic_device_group,
-   .add_device = exynos_iommu_add_device,
-   .remove_device = exynos_iommu_remove_device,
+   .probe_device = exynos_iommu_probe_device,
+   .release_device = exynos_iommu_release_device,
.pgsize_bitmap = SECT_SIZE | LPAGE_SIZE | SPAGE_SIZE,
.of_xlate = exynos_iommu_of_xlate,
 };
-- 
2.17.1



[PATCH v3 17/34] iommu/arm-smmu: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the arm-smmu and arm-smmu-v3 drivers to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code does the
group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/arm-smmu-v3.c | 38 ++--
 drivers/iommu/arm-smmu.c| 39 ++---
 2 files changed, 25 insertions(+), 52 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 82508730feb7..42e1ee7e5197 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2914,27 +2914,26 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
 
 static struct iommu_ops arm_smmu_ops;
 
-static int arm_smmu_add_device(struct device *dev)
+static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 {
int i, ret;
struct arm_smmu_device *smmu;
struct arm_smmu_master *master;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-   struct iommu_group *group;
 
if (!fwspec || fwspec->ops != &arm_smmu_ops)
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
 
if (WARN_ON_ONCE(dev_iommu_priv_get(dev)))
-   return -EBUSY;
+   return ERR_PTR(-EBUSY);
 
smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode);
if (!smmu)
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
 
master = kzalloc(sizeof(*master), GFP_KERNEL);
if (!master)
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
 
master->dev = dev;
master->smmu = smmu;
@@ -2975,30 +2974,15 @@ static int arm_smmu_add_device(struct device *dev)
master->ssid_bits = min_t(u8, master->ssid_bits,
  CTXDESC_LINEAR_CDMAX);
 
-   ret = iommu_device_link(&smmu->iommu, dev);
-   if (ret)
-   goto err_disable_pasid;
+   return &smmu->iommu;
 
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group)) {
-   ret = PTR_ERR(group);
-   goto err_unlink;
-   }
-
-   iommu_group_put(group);
-   return 0;
-
-err_unlink:
-   iommu_device_unlink(&smmu->iommu, dev);
-err_disable_pasid:
-   arm_smmu_disable_pasid(master);
 err_free_master:
kfree(master);
dev_iommu_priv_set(dev, NULL);
-   return ret;
+   return ERR_PTR(ret);
 }
 
-static void arm_smmu_remove_device(struct device *dev)
+static void arm_smmu_release_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_master *master;
@@ -3010,8 +2994,6 @@ static void arm_smmu_remove_device(struct device *dev)
master = dev_iommu_priv_get(dev);
smmu = master->smmu;
arm_smmu_detach_dev(master);
-   iommu_group_remove_device(dev);
-   iommu_device_unlink(&smmu->iommu, dev);
arm_smmu_disable_pasid(master);
kfree(master);
iommu_fwspec_free(dev);
@@ -3138,8 +3120,8 @@ static struct iommu_ops arm_smmu_ops = {
.flush_iotlb_all= arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
.iova_to_phys   = arm_smmu_iova_to_phys,
-   .add_device = arm_smmu_add_device,
-   .remove_device  = arm_smmu_remove_device,
+   .probe_device   = arm_smmu_probe_device,
+   .release_device = arm_smmu_release_device,
.device_group   = arm_smmu_device_group,
.domain_get_attr= arm_smmu_domain_get_attr,
.domain_set_attr= arm_smmu_domain_set_attr,
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index a6a5796e9c41..e622f4e33379 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -220,7 +220,7 @@ static int arm_smmu_register_legacy_master(struct device *dev,
  * With the legacy DT binding in play, we have no guarantees about
  * probe order, but then we're also not doing default domains, so we can
  * delay setting bus ops until we're sure every possible SMMU is ready,
- * and that way ensure that no add_device() calls get missed.
+ * and that way ensure that no probe_device() calls get missed.
  */
 static int arm_smmu_legacy_bus_init(void)
 {
@@ -1062,7 +1062,6 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
struct arm_smmu_device *smmu = cfg->smmu;
struct arm_smmu_smr *smrs = smmu->smrs;
-   struct iommu_group *group;
int i, idx, ret;
 
mutex_lock(&smmu->stream_map_mutex);
@@ -1090,18 +1089,9 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
cfg->smendx[i] = (s16)idx;
}
 
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group)) {
-   ret = PTR_ERR(group);
-   goto out_err;
-   }
-   

[PATCH v3 15/34] iommu/amd: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the AMD IOMMU Driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c | 71 ---
 1 file changed, 22 insertions(+), 49 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 0b4b4faa876d..c30367413683 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -343,21 +343,9 @@ static bool check_device(struct device *dev)
return true;
 }
 
-static void init_iommu_group(struct device *dev)
-{
-   struct iommu_group *group;
-
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return;
-
-   iommu_group_put(group);
-}
-
 static int iommu_init_device(struct device *dev)
 {
struct iommu_dev_data *dev_data;
-   struct amd_iommu *iommu;
int devid;
 
if (dev->archdata.iommu)
@@ -367,8 +355,6 @@ static int iommu_init_device(struct device *dev)
if (devid < 0)
return devid;
 
-   iommu = amd_iommu_rlookup_table[devid];
-
dev_data = find_dev_data(devid);
if (!dev_data)
return -ENOMEM;
@@ -391,8 +377,6 @@ static int iommu_init_device(struct device *dev)
 
dev->archdata.iommu = dev_data;
 
-   iommu_device_link(&iommu->iommu, dev);
-
return 0;
 }
 
@@ -410,7 +394,7 @@ static void iommu_ignore_device(struct device *dev)
setup_aliases(dev);
 }
 
-static void iommu_uninit_device(struct device *dev)
+static void amd_iommu_uninit_device(struct device *dev)
 {
struct iommu_dev_data *dev_data;
struct amd_iommu *iommu;
@@ -429,13 +413,6 @@ static void iommu_uninit_device(struct device *dev)
if (dev_data->domain)
detach_device(dev);
 
-   iommu_device_unlink(&iommu->iommu, dev);
-
-   iommu_group_remove_device(dev);
-
-   /* Remove dma-ops */
-   dev->dma_ops = NULL;
-
/*
 * We keep dev_data around for unplugged devices and reuse it when the
 * device is re-plugged - not doing so would introduce a ton of races.
@@ -2152,55 +2129,50 @@ static void detach_device(struct device *dev)
spin_unlock_irqrestore(&domain->lock, flags);
 }
 
-static int amd_iommu_add_device(struct device *dev)
+static struct iommu_device *amd_iommu_probe_device(struct device *dev)
 {
-   struct iommu_dev_data *dev_data;
-   struct iommu_domain *domain;
+   struct iommu_device *iommu_dev;
struct amd_iommu *iommu;
int ret, devid;
 
-   if (get_dev_data(dev))
-   return 0;
-
if (!check_device(dev))
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
 
devid = get_device_id(dev);
if (devid < 0)
-   return devid;
+   return ERR_PTR(devid);
 
iommu = amd_iommu_rlookup_table[devid];
 
+   if (get_dev_data(dev))
+   return &iommu->iommu;
+
ret = iommu_init_device(dev);
if (ret) {
if (ret != -ENOTSUPP)
dev_err(dev, "Failed to initialize - trying to proceed anyway\n");
-
+   iommu_dev = ERR_PTR(ret);
iommu_ignore_device(dev);
-   dev->dma_ops = NULL;
-   goto out;
+   } else {
+   iommu_dev = &iommu->iommu;
}
-   init_iommu_group(dev);
 
-   dev_data = get_dev_data(dev);
+   iommu_completion_wait(iommu);
 
-   BUG_ON(!dev_data);
+   return iommu_dev;
+}
 
-   if (dev_data->iommu_v2)
-   iommu_request_dm_for_dev(dev);
+static void amd_iommu_probe_finalize(struct device *dev)
+{
+   struct iommu_domain *domain;
 
/* Domains are initialized for this device - have a look what we ended up with */
domain = iommu_get_domain_for_dev(dev);
if (domain->type == IOMMU_DOMAIN_DMA)
iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, 0);
-
-out:
-   iommu_completion_wait(iommu);
-
-   return 0;
 }
 
-static void amd_iommu_remove_device(struct device *dev)
+static void amd_iommu_release_device(struct device *dev)
 {
struct amd_iommu *iommu;
int devid;
@@ -2214,7 +2186,7 @@ static void amd_iommu_remove_device(struct device *dev)
 
iommu = amd_iommu_rlookup_table[devid];
 
-   iommu_uninit_device(dev);
+   amd_iommu_uninit_device(dev);
iommu_completion_wait(iommu);
 }
 
@@ -2687,8 +2659,9 @@ const struct iommu_ops amd_iommu_ops = {
.map = amd_iommu_map,
.unmap = amd_iommu_unmap,
.iova_to_phys = amd_iommu_iova_to_phys,
-   .add_device = amd_iommu_add_device,
-   .remove_device = amd_iommu_remove_device,
+   .probe_device = amd_iommu_probe_device,
+   .release_device = amd_iommu_release_device,
+   .probe_finalize = amd_iommu_probe_finalize,
.device_group = 

[PATCH v3 32/34] iommu: Remove add_device()/remove_device() code-paths

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

All drivers are converted to use the probe/release_device()
call-backs, so the add_device/remove_device() pointers are unused and
the code using them can be removed.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 158 ++
 include/linux/iommu.h |   4 --
 2 files changed, 38 insertions(+), 124 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 397fd4fd0c32..7f99e5ae432c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -220,12 +220,20 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
return ret;
 }
 
-static int __iommu_probe_device_helper(struct device *dev)
+int iommu_probe_device(struct device *dev)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
struct iommu_group *group;
int ret;
 
+   if (!dev_iommu_get(dev))
+   return -ENOMEM;
+
+   if (!try_module_get(ops->owner)) {
+   ret = -EINVAL;
+   goto err_out;
+   }
+
ret = __iommu_probe_device(dev, NULL);
if (ret)
goto err_out;
@@ -259,75 +267,23 @@ static int __iommu_probe_device_helper(struct device *dev)
 
 err_release:
iommu_release_device(dev);
+
 err_out:
return ret;
 
 }
 
-int iommu_probe_device(struct device *dev)
+void iommu_release_device(struct device *dev)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
-   struct iommu_group *group;
-   int ret;
-
-   WARN_ON(dev->iommu_group);
-
-   if (!ops)
-   return -EINVAL;
-
-   if (!dev_iommu_get(dev))
-   return -ENOMEM;
-
-   if (!try_module_get(ops->owner)) {
-   ret = -EINVAL;
-   goto err_free_dev_param;
-   }
-
-   if (ops->probe_device)
-   return __iommu_probe_device_helper(dev);
-
-   ret = ops->add_device(dev);
-   if (ret)
-   goto err_module_put;
 
-   group = iommu_group_get(dev);
-   iommu_create_device_direct_mappings(group, dev);
-   iommu_group_put(group);
-
-   if (ops->probe_finalize)
-   ops->probe_finalize(dev);
-
-   return 0;
-
-err_module_put:
-   module_put(ops->owner);
-err_free_dev_param:
-   dev_iommu_free(dev);
-   return ret;
-}
-
-static void __iommu_release_device(struct device *dev)
-{
-   const struct iommu_ops *ops = dev->bus->iommu_ops;
+   if (!dev->iommu)
+   return;
 
iommu_device_unlink(dev->iommu->iommu_dev, dev);
-
iommu_group_remove_device(dev);
 
ops->release_device(dev);
-}
-
-void iommu_release_device(struct device *dev)
-{
-   const struct iommu_ops *ops = dev->bus->iommu_ops;
-
-   if (!dev->iommu)
-   return;
-
-   if (ops->release_device)
-   __iommu_release_device(dev);
-   else if (dev->iommu_group)
-   ops->remove_device(dev);
 
module_put(ops->owner);
dev_iommu_free(dev);
@@ -1560,23 +1516,6 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
if (ret)
goto out_put_group;
 
-   /*
-* Try to allocate a default domain - needs support from the
-* IOMMU driver. There are still some drivers which don't support
-* default domains, so the return value is not yet checked. Only
-* allocate the domain here when the driver still has the
-* add_device/remove_device call-backs implemented.
-*/
-   if (!ops->probe_device) {
-   iommu_alloc_default_domain(dev);
-
-   if (group->default_domain)
-   ret = __iommu_attach_device(group->default_domain, dev);
-
-   if (ret)
-   goto out_put_group;
-   }
-
return group;
 
 out_put_group:
@@ -1591,21 +1530,6 @@ struct iommu_domain *iommu_group_default_domain(struct iommu_group *group)
return group->default_domain;
 }
 
-static int add_iommu_group(struct device *dev, void *data)
-{
-   int ret = iommu_probe_device(dev);
-
-   /*
-* We ignore -ENODEV errors for now, as they just mean that the
-* device is not translated by an IOMMU. We still care about
-* other errors and fail to initialize when they happen.
-*/
-   if (ret == -ENODEV)
-   ret = 0;
-
-   return ret;
-}
-
 static int probe_iommu_group(struct device *dev, void *data)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
@@ -1793,47 +1717,41 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group)
 
 int bus_iommu_probe(struct bus_type *bus)
 {
-   const struct iommu_ops *ops = bus->iommu_ops;
+   struct iommu_group *group, *next;
+   LIST_HEAD(group_list);
int ret;
 
-   if (ops->probe_device) {
-   struct iommu_group *group, *next;
-  

[PATCH v3 06/34] iommu/amd: Return -ENODEV in add_device when device is not handled by IOMMU

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

When check_device() fails on the device, it is not handled by the
IOMMU and amd_iommu_add_device() needs to return -ENODEV.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 504f2db75eda..3e0d27f7622e 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2157,9 +2157,12 @@ static int amd_iommu_add_device(struct device *dev)
struct amd_iommu *iommu;
int ret, devid;
 
-   if (!check_device(dev) || get_dev_data(dev))
+   if (get_dev_data(dev))
return 0;
 
+   if (!check_device(dev))
+   return -ENODEV;
+
devid = get_device_id(dev);
if (devid < 0)
return devid;
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
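
To illustrate the return convention from the patch above, here is a
minimal user-space sketch. It is not the driver code: struct demo_device
and both functions are hypothetical stand-ins. It assumes the caller
walking the bus treats -ENODEV as "device is not translated by an IOMMU"
rather than as a failure, which is what the core's add_iommu_group()
helper did at the time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device model; the real driver looks these up via devid. */
struct demo_device {
	int behind_iommu;	/* would check_device() succeed? */
	int has_dev_data;	/* has the device been set up before? */
};

/* Mirrors the order of checks after the patch. */
static int demo_add_device(struct demo_device *dev)
{
	assert(dev != NULL);

	if (dev->has_dev_data)
		return 0;		/* already handled, nothing to do */

	if (!dev->behind_iommu)
		return -ENODEV;		/* not handled by the IOMMU */

	dev->has_dev_data = 1;
	return 0;
}

/* Bus-walk caller: -ENODEV is ignored, any other error is passed on. */
static int demo_add_iommu_group(struct demo_device *dev)
{
	int ret = demo_add_device(dev);

	return ret == -ENODEV ? 0 : ret;
}
```

With this split, a device that is not behind the IOMMU reports -ENODEV
to the core instead of silently returning success, while the bus walk
still continues past it.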


[PATCH v3 16/34] iommu/vt-d: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the Intel IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/intel-iommu.c | 67 -
 1 file changed, 6 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b9f905a55dda..b906727f5b85 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5781,78 +5781,27 @@ static bool intel_iommu_capable(enum iommu_cap cap)
return false;
 }
 
-static int intel_iommu_add_device(struct device *dev)
+static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 {
-   struct dmar_domain *dmar_domain;
-   struct iommu_domain *domain;
struct intel_iommu *iommu;
-   struct iommu_group *group;
u8 bus, devfn;
-   int ret;
 
iommu = device_to_iommu(dev, &bus, &devfn);
if (!iommu)
-   return -ENODEV;
-
-   iommu_device_link(&iommu->iommu, dev);
+   return ERR_PTR(-ENODEV);
 
if (translation_pre_enabled(iommu))
dev->archdata.iommu = DEFER_DEVICE_DOMAIN_INFO;
 
-   group = iommu_group_get_for_dev(dev);
-
-   if (IS_ERR(group)) {
-   ret = PTR_ERR(group);
-   goto unlink;
-   }
-
-   iommu_group_put(group);
-
-   domain = iommu_get_domain_for_dev(dev);
-   dmar_domain = to_dmar_domain(domain);
-   if (domain->type == IOMMU_DOMAIN_DMA) {
-   if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
-   ret = iommu_request_dm_for_dev(dev);
-   if (ret) {
-   dmar_remove_one_dev_info(dev);
-   dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-   domain_add_dev_info(si_domain, dev);
-   dev_info(dev,
-"Device uses a private identity domain.\n");
-   }
-   }
-   } else {
-   if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
-   ret = iommu_request_dma_domain_for_dev(dev);
-   if (ret) {
-   dmar_remove_one_dev_info(dev);
-   dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-   if (!get_private_domain_for_dev(dev)) {
-   dev_warn(dev,
-"Failed to get a private domain.\n");
-   ret = -ENOMEM;
-   goto unlink;
-   }
-
-   dev_info(dev,
-"Device uses a private dma domain.\n");
-   }
-   }
-   }
-
if (device_needs_bounce(dev)) {
dev_info(dev, "Use Intel IOMMU bounce page dma_ops\n");
set_dma_ops(dev, &bounce_dma_ops);
}
 
-   return 0;
-
-unlink:
-   iommu_device_unlink(&iommu->iommu, dev);
-   return ret;
+   return &iommu->iommu;
 }
 
-static void intel_iommu_remove_device(struct device *dev)
+static void intel_iommu_release_device(struct device *dev)
 {
struct intel_iommu *iommu;
u8 bus, devfn;
@@ -5863,10 +5812,6 @@ static void intel_iommu_remove_device(struct device *dev)
 
dmar_remove_one_dev_info(dev);
 
-   iommu_group_remove_device(dev);
-
-   iommu_device_unlink(&iommu->iommu, dev);
-
if (device_needs_bounce(dev))
set_dma_ops(dev, NULL);
 }
@@ -6198,8 +6143,8 @@ const struct iommu_ops intel_iommu_ops = {
.map= intel_iommu_map,
.unmap  = intel_iommu_unmap,
.iova_to_phys   = intel_iommu_iova_to_phys,
-   .add_device = intel_iommu_add_device,
-   .remove_device  = intel_iommu_remove_device,
+   .probe_device   = intel_iommu_probe_device,
+   .release_device = intel_iommu_release_device,
.get_resv_regions   = intel_iommu_get_resv_regions,
.put_resv_regions   = generic_iommu_put_resv_regions,
.apply_resv_region  = intel_iommu_apply_resv_region,
-- 
2.17.1
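
The converted call-back above returns a struct iommu_device pointer and
encodes failures with ERR_PTR(). As a rough user-space sketch of that
convention: the helpers below imitate the kernel's <linux/err.h>, and
struct device, struct iommu_device and the demo_* functions are
simplified, hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space imitation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for error codes. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, simplified types for illustration only. */
struct iommu_device { int id; };
struct device { struct iommu_device *hw; /* NULL: no IOMMU for this device */ };

/* Driver side: hand back the IOMMU instance or an encoded errno. */
static struct iommu_device *demo_probe_device(struct device *dev)
{
	assert(dev != NULL);

	if (!dev->hw)
		return ERR_PTR(-ENODEV);

	return dev->hw;
}

/* Core side: decode the result into a plain int return value. */
static int demo_core_probe(struct device *dev)
{
	struct iommu_device *iommu_dev = demo_probe_device(dev);

	if (IS_ERR(iommu_dev))
		return (int)PTR_ERR(iommu_dev);

	return 0;
}
```

Returning the iommu_device lets the core do the sysfs linking itself,
which is why the driver no longer calls iommu_device_link().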



[PATCH v3 25/34] iommu/rockchip: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the Rockchip IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/rockchip-iommu.c | 26 +++---
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index b33cdd5aad81..d25c2486ca07 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -1054,40 +1054,28 @@ static void rk_iommu_domain_free(struct iommu_domain *domain)
kfree(rk_domain);
 }
 
-static int rk_iommu_add_device(struct device *dev)
+static struct iommu_device *rk_iommu_probe_device(struct device *dev)
 {
-   struct iommu_group *group;
-   struct rk_iommu *iommu;
struct rk_iommudata *data;
+   struct rk_iommu *iommu;
 
data = dev->archdata.iommu;
if (!data)
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
 
iommu = rk_iommu_from_dev(dev);
 
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
-   iommu_group_put(group);
-
-   iommu_device_link(&iommu->iommu, dev);
data->link = device_link_add(dev, iommu->dev,
 DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);
 
-   return 0;
+   return &iommu->iommu;
 }
 
-static void rk_iommu_remove_device(struct device *dev)
+static void rk_iommu_release_device(struct device *dev)
 {
-   struct rk_iommu *iommu;
struct rk_iommudata *data = dev->archdata.iommu;
 
-   iommu = rk_iommu_from_dev(dev);
-
device_link_del(data->link);
-   iommu_device_unlink(&iommu->iommu, dev);
-   iommu_group_remove_device(dev);
 }
 
 static struct iommu_group *rk_iommu_device_group(struct device *dev)
@@ -1126,8 +1114,8 @@ static const struct iommu_ops rk_iommu_ops = {
.detach_dev = rk_iommu_detach_device,
.map = rk_iommu_map,
.unmap = rk_iommu_unmap,
-   .add_device = rk_iommu_add_device,
-   .remove_device = rk_iommu_remove_device,
+   .probe_device = rk_iommu_probe_device,
+   .release_device = rk_iommu_release_device,
.iova_to_phys = rk_iommu_iova_to_phys,
.device_group = rk_iommu_device_group,
.pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP,
-- 
2.17.1



[PATCH v3 18/34] iommu/pamu: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the PAMU IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/fsl_pamu_domain.c | 22 +-
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 06828e2698d5..928d37771ece 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -1016,25 +1016,13 @@ static struct iommu_group *fsl_pamu_device_group(struct device *dev)
return group;
 }
 
-static int fsl_pamu_add_device(struct device *dev)
+static struct iommu_device *fsl_pamu_probe_device(struct device *dev)
 {
-   struct iommu_group *group;
-
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
-
-   iommu_group_put(group);
-
-   iommu_device_link(&pamu_iommu, dev);
-
-   return 0;
+   return &pamu_iommu;
 }
 
-static void fsl_pamu_remove_device(struct device *dev)
+static void fsl_pamu_release_device(struct device *dev)
 {
-   iommu_device_unlink(&pamu_iommu, dev);
-   iommu_group_remove_device(dev);
 }
 
 static const struct iommu_ops fsl_pamu_ops = {
@@ -1048,8 +1036,8 @@ static const struct iommu_ops fsl_pamu_ops = {
.iova_to_phys   = fsl_pamu_iova_to_phys,
.domain_set_attr = fsl_pamu_set_domain_attr,
.domain_get_attr = fsl_pamu_get_domain_attr,
-   .add_device = fsl_pamu_add_device,
-   .remove_device  = fsl_pamu_remove_device,
+   .probe_device   = fsl_pamu_probe_device,
+   .release_device = fsl_pamu_release_device,
.device_group   = fsl_pamu_device_group,
 };
 
-- 
2.17.1



[PATCH v3 13/34] iommu: Export bus_iommu_probe() and make it safe for re-probing

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Add a check to the bus_iommu_probe() call-path to make sure it ignores
devices which have already been successfully probed. Then export the
bus_iommu_probe() function so it can be used by IOMMU drivers.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 10 +-
 include/linux/iommu.h |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 834a45da0ed0..397fd4fd0c32 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1610,11 +1610,19 @@ static int probe_iommu_group(struct device *dev, void *data)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
struct list_head *group_list = data;
+   struct iommu_group *group;
int ret;
 
if (!dev_iommu_get(dev))
return -ENOMEM;
 
+   /* Device is probed already if in a group */
+   group = iommu_group_get(dev);
+   if (group) {
+   iommu_group_put(group);
+   return 0;
+   }
+
if (!try_module_get(ops->owner)) {
ret = -EINVAL;
goto err_free_dev_iommu;
@@ -1783,7 +1791,7 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group)
  iommu_do_create_direct_mappings);
 }
 
-static int bus_iommu_probe(struct bus_type *bus)
+int bus_iommu_probe(struct bus_type *bus)
 {
const struct iommu_ops *ops = bus->iommu_ops;
int ret;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 30170d191e5e..fea1622408ad 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -445,6 +445,7 @@ static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
 #define IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER  6 /* Post Driver unbind */
 
 extern int bus_set_iommu(struct bus_type *bus, const struct iommu_ops *ops);
+extern int bus_iommu_probe(struct bus_type *bus);
 extern bool iommu_present(struct bus_type *bus);
 extern bool iommu_capable(struct bus_type *bus, enum iommu_cap cap);
 extern struct iommu_domain *iommu_domain_alloc(struct bus_type *bus);
-- 
2.17.1



[PATCH v3 29/34] iommu/omap: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the OMAP IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/omap-iommu.c | 49 ++
 1 file changed, 13 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index ecc9d0829a91..6699fe6d9e06 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1640,15 +1640,13 @@ static phys_addr_t omap_iommu_iova_to_phys(struct iommu_domain *domain,
return ret;
 }
 
-static int omap_iommu_add_device(struct device *dev)
+static struct iommu_device *omap_iommu_probe_device(struct device *dev)
 {
struct omap_iommu_arch_data *arch_data, *tmp;
+   struct platform_device *pdev;
struct omap_iommu *oiommu;
-   struct iommu_group *group;
struct device_node *np;
-   struct platform_device *pdev;
int num_iommus, i;
-   int ret;
 
/*
 * Allocate the archdata iommu structure for DT-based devices.
@@ -1657,7 +1655,7 @@ static int omap_iommu_add_device(struct device *dev)
 * IOMMU users.
 */
if (!dev->of_node)
-   return 0;
+   return ERR_PTR(-ENODEV);
 
/*
 * retrieve the count of IOMMU nodes using phandle size as element size
@@ -1670,27 +1668,27 @@ static int omap_iommu_add_device(struct device *dev)
 
arch_data = kcalloc(num_iommus + 1, sizeof(*arch_data), GFP_KERNEL);
if (!arch_data)
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
 
for (i = 0, tmp = arch_data; i < num_iommus; i++, tmp++) {
np = of_parse_phandle(dev->of_node, "iommus", i);
if (!np) {
kfree(arch_data);
-   return -EINVAL;
+   return ERR_PTR(-EINVAL);
}
 
pdev = of_find_device_by_node(np);
if (!pdev) {
of_node_put(np);
kfree(arch_data);
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
}
 
oiommu = platform_get_drvdata(pdev);
if (!oiommu) {
of_node_put(np);
kfree(arch_data);
-   return -EINVAL;
+   return ERR_PTR(-EINVAL);
}
 
tmp->iommu_dev = oiommu;
@@ -1699,46 +1697,25 @@ static int omap_iommu_add_device(struct device *dev)
of_node_put(np);
}
 
+   dev->archdata.iommu = arch_data;
+
/*
 * use the first IOMMU alone for the sysfs device linking.
 * TODO: Evaluate if a single iommu_group needs to be
 * maintained for both IOMMUs
 */
oiommu = arch_data->iommu_dev;
-   ret = iommu_device_link(&oiommu->iommu, dev);
-   if (ret) {
-   kfree(arch_data);
-   return ret;
-   }
-
-   dev->archdata.iommu = arch_data;
-
-   /*
-* IOMMU group initialization calls into omap_iommu_device_group, which
-* needs a valid dev->archdata.iommu pointer
-*/
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group)) {
-   iommu_device_unlink(&oiommu->iommu, dev);
-   dev->archdata.iommu = NULL;
-   kfree(arch_data);
-   return PTR_ERR(group);
-   }
-   iommu_group_put(group);
 
-   return 0;
+   return &oiommu->iommu;
 }
 
-static void omap_iommu_remove_device(struct device *dev)
+static void omap_iommu_release_device(struct device *dev)
 {
struct omap_iommu_arch_data *arch_data = dev->archdata.iommu;
 
if (!dev->of_node || !arch_data)
return;
 
-   iommu_device_unlink(&arch_data->iommu_dev->iommu, dev);
-   iommu_group_remove_device(dev);
-
dev->archdata.iommu = NULL;
kfree(arch_data);
 
@@ -1763,8 +1740,8 @@ static const struct iommu_ops omap_iommu_ops = {
.map= omap_iommu_map,
.unmap  = omap_iommu_unmap,
.iova_to_phys   = omap_iommu_iova_to_phys,
-   .add_device = omap_iommu_add_device,
-   .remove_device  = omap_iommu_remove_device,
+   .probe_device   = omap_iommu_probe_device,
+   .release_device = omap_iommu_release_device,
.device_group   = omap_iommu_device_group,
.pgsize_bitmap  = OMAP_IOMMU_PGSIZES,
 };
-- 
2.17.1



[PATCH v3 28/34] iommu/omap: Remove orphan_dev tracking

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Remove the tracking of devices which could not be probed because
their IOMMU is not probed yet. Replace it with a call to
bus_iommu_probe() when a new IOMMU is probed.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/omap-iommu.c | 54 +++---
 1 file changed, 4 insertions(+), 50 deletions(-)

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 887fefcb03b4..ecc9d0829a91 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -35,15 +35,6 @@
 
 static const struct iommu_ops omap_iommu_ops;
 
-struct orphan_dev {
-   struct device *dev;
-   struct list_head node;
-};
-
-static LIST_HEAD(orphan_dev_list);
-
-static DEFINE_SPINLOCK(orphan_lock);
-
 #define to_iommu(dev)  ((struct omap_iommu *)dev_get_drvdata(dev))
 
 /* bitmap of the page sizes currently supported */
@@ -62,8 +53,6 @@ static DEFINE_SPINLOCK(orphan_lock);
 static struct platform_driver omap_iommu_driver;
 static struct kmem_cache *iopte_cachep;
 
-static int _omap_iommu_add_device(struct device *dev);
-
 /**
  * to_omap_domain - Get struct omap_iommu_domain from generic iommu_domain
  * @dom:   generic iommu domain handle
@@ -1177,7 +1166,6 @@ static int omap_iommu_probe(struct platform_device *pdev)
struct omap_iommu *obj;
struct resource *res;
struct device_node *of = pdev->dev.of_node;
-   struct orphan_dev *orphan_dev, *tmp;
 
if (!of) {
pr_err("%s: only DT-based devices are supported\n", __func__);
@@ -1260,13 +1248,8 @@ static int omap_iommu_probe(struct platform_device *pdev)
 
dev_info(&pdev->dev, "%s registered\n", obj->name);
 
-   list_for_each_entry_safe(orphan_dev, tmp, &orphan_dev_list, node) {
-   err = _omap_iommu_add_device(orphan_dev->dev);
-   if (!err) {
-   list_del(&orphan_dev->node);
-   kfree(orphan_dev);
-   }
-   }
+   /* Re-probe bus to probe device attached to this IOMMU */
+   bus_iommu_probe(&platform_bus_type);
 
return 0;
 
@@ -1657,7 +1640,7 @@ static phys_addr_t omap_iommu_iova_to_phys(struct iommu_domain *domain,
return ret;
 }
 
-static int _omap_iommu_add_device(struct device *dev)
+static int omap_iommu_add_device(struct device *dev)
 {
struct omap_iommu_arch_data *arch_data, *tmp;
struct omap_iommu *oiommu;
@@ -1666,8 +1649,6 @@ static int _omap_iommu_add_device(struct device *dev)
struct platform_device *pdev;
int num_iommus, i;
int ret;
-   struct orphan_dev *orphan_dev;
-   unsigned long flags;
 
/*
 * Allocate the archdata iommu structure for DT-based devices.
@@ -1702,23 +1683,7 @@ static int _omap_iommu_add_device(struct device *dev)
if (!pdev) {
of_node_put(np);
kfree(arch_data);
-   spin_lock_irqsave(&orphan_lock, flags);
-   list_for_each_entry(orphan_dev, &orphan_dev_list,
-   node) {
-   if (orphan_dev->dev == dev)
-   break;
-   }
-   spin_unlock_irqrestore(&orphan_lock, flags);
-
-   if (orphan_dev && orphan_dev->dev == dev)
-   return -EPROBE_DEFER;
-
-   orphan_dev = kzalloc(sizeof(*orphan_dev), GFP_KERNEL);
-   orphan_dev->dev = dev;
-   spin_lock_irqsave(&orphan_lock, flags);
-   list_add(&orphan_dev->node, &orphan_dev_list);
-   spin_unlock_irqrestore(&orphan_lock, flags);
-   return -EPROBE_DEFER;
+   return -ENODEV;
}
 
oiommu = platform_get_drvdata(pdev);
@@ -1764,17 +1729,6 @@ static int _omap_iommu_add_device(struct device *dev)
return 0;
 }
 
-static int omap_iommu_add_device(struct device *dev)
-{
-   int ret;
-
-   ret = _omap_iommu_add_device(dev);
-   if (ret == -EPROBE_DEFER)
-   return 0;
-
-   return ret;
-}
-
 static void omap_iommu_remove_device(struct device *dev)
 {
struct omap_iommu_arch_data *arch_data = dev->archdata.iommu;
-- 
2.17.1



[PATCH v3 09/34] iommu: Keep a list of allocated groups in __iommu_probe_device()

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

This is needed to defer default_domain allocation for new IOMMU groups
until all devices have been added to the group.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7a385c18e1a5..18eb3623bd00 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -44,6 +44,7 @@ struct iommu_group {
int id;
struct iommu_domain *default_domain;
struct iommu_domain *domain;
+   struct list_head entry;
 };
 
 struct group_device {
@@ -184,7 +185,7 @@ static void dev_iommu_free(struct device *dev)
dev->iommu = NULL;
 }
 
-static int __iommu_probe_device(struct device *dev)
+static int __iommu_probe_device(struct device *dev, struct list_head *group_list)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
struct iommu_device *iommu_dev;
@@ -204,6 +205,9 @@ static int __iommu_probe_device(struct device *dev)
}
iommu_group_put(group);
 
+   if (group_list && !group->default_domain && list_empty(&group->entry))
+   list_add_tail(&group->entry, group_list);
+
iommu_device_link(iommu_dev, dev);
 
return 0;
@@ -234,7 +238,7 @@ int iommu_probe_device(struct device *dev)
if (ops->probe_device) {
struct iommu_group *group;
 
-   ret = __iommu_probe_device(dev);
+   ret = __iommu_probe_device(dev, NULL);
 
/*
 * Try to allocate a default domain - needs support from the
@@ -567,6 +571,7 @@ struct iommu_group *iommu_group_alloc(void)
group->kobj.kset = iommu_group_kset;
mutex_init(&group->mutex);
INIT_LIST_HEAD(&group->devices);
+   INIT_LIST_HEAD(&group->entry);
BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
-- 
2.17.1



[PATCH v3 07/34] iommu: Add probe_device() and release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Add call-backs to 'struct iommu_ops' as an alternative to the
add_device() and remove_device() call-backs, which will be removed when
all drivers are converted.

The new call-backs will not set up IOMMU groups and domains anymore,
so also add a probe_finalize() call-back where the IOMMU driver can do
per-device setup work which requires the device to be set up with a
group and a domain.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 63 ++-
 include/linux/iommu.h |  9 +++
 2 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 5877abd9b693..6cfe7799dc8c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -174,6 +174,36 @@ static void dev_iommu_free(struct device *dev)
dev->iommu = NULL;
 }
 
+static int __iommu_probe_device(struct device *dev)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+   struct iommu_device *iommu_dev;
+   struct iommu_group *group;
+   int ret;
+
+   iommu_dev = ops->probe_device(dev);
+   if (IS_ERR(iommu_dev))
+   return PTR_ERR(iommu_dev);
+
+   dev->iommu->iommu_dev = iommu_dev;
+
+   group = iommu_group_get_for_dev(dev);
+   if (IS_ERR(group)) {
+   ret = PTR_ERR(group);
+   goto out_release;
+   }
+   iommu_group_put(group);
+
+   iommu_device_link(iommu_dev, dev);
+
+   return 0;
+
+out_release:
+   ops->release_device(dev);
+
+   return ret;
+}
+
 int iommu_probe_device(struct device *dev)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
@@ -191,10 +221,17 @@ int iommu_probe_device(struct device *dev)
goto err_free_dev_param;
}
 
-   ret = ops->add_device(dev);
+   if (ops->probe_device)
+   ret = __iommu_probe_device(dev);
+   else
+   ret = ops->add_device(dev);
+
if (ret)
goto err_module_put;
 
+   if (ops->probe_finalize)
+   ops->probe_finalize(dev);
+
return 0;
 
 err_module_put:
@@ -204,17 +241,31 @@ int iommu_probe_device(struct device *dev)
return ret;
 }
 
+static void __iommu_release_device(struct device *dev)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+   iommu_device_unlink(dev->iommu->iommu_dev, dev);
+
+   iommu_group_remove_device(dev);
+
+   ops->release_device(dev);
+}
+
 void iommu_release_device(struct device *dev)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
 
-   if (dev->iommu_group)
+   if (!dev->iommu)
+   return;
+
+   if (ops->release_device)
+   __iommu_release_device(dev);
+   else if (dev->iommu_group)
ops->remove_device(dev);
 
-   if (dev->iommu) {
-   module_put(ops->owner);
-   dev_iommu_free(dev);
-   }
+   module_put(ops->owner);
+   dev_iommu_free(dev);
 }
 
 static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 1f027b07e499..30170d191e5e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -225,6 +225,10 @@ struct iommu_iotlb_gather {
  * @iova_to_phys: translate iova to physical address
  * @add_device: add device to iommu grouping
  * @remove_device: remove device from iommu grouping
+ * @probe_device: Add device to iommu driver handling
+ * @release_device: Remove device from iommu driver handling
+ * @probe_finalize: Do final setup work after the device is added to an IOMMU
+ *  group and attached to the group's domain
  * @device_group: find iommu group for a particular device
  * @domain_get_attr: Query domain attributes
  * @domain_set_attr: Change domain attributes
@@ -275,6 +279,9 @@ struct iommu_ops {
phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
int (*add_device)(struct device *dev);
void (*remove_device)(struct device *dev);
+   struct iommu_device *(*probe_device)(struct device *dev);
+   void (*release_device)(struct device *dev);
+   void (*probe_finalize)(struct device *dev);
struct iommu_group *(*device_group)(struct device *dev);
int (*domain_get_attr)(struct iommu_domain *domain,
   enum iommu_attr attr, void *data);
@@ -375,6 +382,7 @@ struct iommu_fault_param {
  *
  * @fault_param: IOMMU detected device fault reporting data
  * @fwspec: IOMMU fwspec data
+ * @iommu_dev:  IOMMU device this device is linked to
  * @priv:   IOMMU Driver private data
  *
  * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
@@ -384,6 +392,7 @@ struct dev_iommu {
struct mutex lock;
struct iommu_fault_param*fault_param;
struct iommu_fwspec *fwspec;
+   struct 

[PATCH v3 02/34] iommu: Add def_domain_type() callback in iommu_ops

2020-04-29 Thread Joerg Roedel
From: Sai Praneeth Prakhya 

Some devices are required to use a specific type (identity or dma)
of default domain when they are used with a vendor iommu. When the
system level default domain type is different from it, the vendor
iommu driver has to request a new default domain with
iommu_request_dma_domain_for_dev() or iommu_request_dm_for_dev()
in the add_dev() callback. Unfortunately, these two helpers only
work when the group hasn't been assigned to any other devices,
hence, some vendor iommu drivers have to use a private domain if
they fail to request a new default one.

This adds a def_domain_type() callback to the iommu_ops, so that
the iommu generic layer is aware of any special default domain
requirement of a device.

Signed-off-by: Sai Praneeth Prakhya 
Signed-off-by: Lu Baolu 
[ jroe...@suse.de: Added iommu_get_def_domain_type() function and use
   it to allocate the default domain ]
Co-developed-by: Joerg Roedel 
Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 20 +---
 include/linux/iommu.h |  6 ++
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index bfe011760ed1..5877abd9b693 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1361,21 +1361,35 @@ struct iommu_group *fsl_mc_device_group(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(fsl_mc_device_group);
 
+static int iommu_get_def_domain_type(struct device *dev)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+   unsigned int type = 0;
+
+   if (ops->def_domain_type)
+   type = ops->def_domain_type(dev);
+
+   return (type == 0) ? iommu_def_domain_type : type;
+}
+
 static int iommu_alloc_default_domain(struct device *dev,
  struct iommu_group *group)
 {
struct iommu_domain *dom;
+   unsigned int type;
 
if (group->default_domain)
return 0;
 
-   dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
-   if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
+   type = iommu_get_def_domain_type(dev);
+
+   dom = __iommu_domain_alloc(dev->bus, type);
+   if (!dom && type != IOMMU_DOMAIN_DMA) {
dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
if (dom) {
dev_warn(dev,
 "failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
-iommu_def_domain_type);
+type);
}
}
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7ef8b0bda695..1f027b07e499 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -248,6 +248,10 @@ struct iommu_iotlb_gather {
  * @cache_invalidate: invalidate translation caches
  * @sva_bind_gpasid: bind guest pasid and mm
  * @sva_unbind_gpasid: unbind guest pasid and mm
+ * @def_domain_type: device default domain type, return value:
+ * - IOMMU_DOMAIN_IDENTITY: must use an identity domain
+ * - IOMMU_DOMAIN_DMA: must use a dma domain
+ * - 0: use the default setting
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  * @owner: Driver module providing these ops
  */
@@ -318,6 +322,8 @@ struct iommu_ops {
 
int (*sva_unbind_gpasid)(struct device *dev, int pasid);
 
+   int (*def_domain_type)(struct device *dev);
+
unsigned long pgsize_bitmap;
struct module *owner;
 };
-- 
2.17.1



[PATCH v3 05/34] iommu/amd: Remove dma_mask check from check_device()

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

The check was only needed for the DMA-API implementation in the AMD
IOMMU driver, which no longer exists.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 73b4f84cf449..504f2db75eda 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -326,7 +326,7 @@ static bool check_device(struct device *dev)
 {
int devid;
 
-   if (!dev || !dev->dma_mask)
+   if (!dev)
return false;
 
devid = get_device_id(dev);
-- 
2.17.1



[PATCH v3 21/34] iommu/msm: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the MSM IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/msm_iommu.c | 34 +++---
 1 file changed, 7 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 94a6df1bddd6..10cd4db0710a 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -388,43 +388,23 @@ static struct msm_iommu_dev *find_iommu_for_dev(struct device *dev)
return ret;
 }
 
-static int msm_iommu_add_device(struct device *dev)
+static struct iommu_device *msm_iommu_probe_device(struct device *dev)
 {
struct msm_iommu_dev *iommu;
-   struct iommu_group *group;
unsigned long flags;
 
spin_lock_irqsave(&msm_iommu_lock, flags);
iommu = find_iommu_for_dev(dev);
spin_unlock_irqrestore(&msm_iommu_lock, flags);
 
-   if (iommu)
-   iommu_device_link(&iommu->iommu, dev);
-   else
-   return -ENODEV;
-
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
-
-   iommu_group_put(group);
+   if (!iommu)
+   return ERR_PTR(-ENODEV);
 
-   return 0;
+   return &iommu->iommu;
 }
 
-static void msm_iommu_remove_device(struct device *dev)
+static void msm_iommu_release_device(struct device *dev)
 {
-   struct msm_iommu_dev *iommu;
-   unsigned long flags;
-
-   spin_lock_irqsave(&msm_iommu_lock, flags);
-   iommu = find_iommu_for_dev(dev);
-   spin_unlock_irqrestore(&msm_iommu_lock, flags);
-
-   if (iommu)
-   iommu_device_unlink(&iommu->iommu, dev);
-
-   iommu_group_remove_device(dev);
 }
 
 static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -708,8 +688,8 @@ static struct iommu_ops msm_iommu_ops = {
 */
.iotlb_sync = NULL,
.iova_to_phys = msm_iommu_iova_to_phys,
-   .add_device = msm_iommu_add_device,
-   .remove_device = msm_iommu_remove_device,
+   .probe_device = msm_iommu_probe_device,
+   .release_device = msm_iommu_release_device,
.device_group = generic_device_group,
.pgsize_bitmap = MSM_IOMMU_PGSIZES,
.of_xlate = qcom_iommu_of_xlate,
-- 
2.17.1
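
The converted call-back above returns either a pointer to the driver's iommu_device or an error encoded in the pointer. A minimal user-space sketch of that convention follows. The helpers err_ptr(), ptr_err() and is_err() are simplified stand-ins written for this example (the kernel's real helpers are ERR_PTR(), PTR_ERR() and IS_ERR()), and hw_table, struct hw_iommu and the iommu_id field are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

int is_err(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct iommu_device {
	const char *name;
};

/* Hypothetical hardware instance embedding the core-visible handle. */
struct hw_iommu {
	int id;
	struct iommu_device iommu;
};

/* Hypothetical client device that names the IOMMU it sits behind. */
struct device {
	int iommu_id;
};

struct hw_iommu hw_table[] = {
	{ 1, { "iommu-a" } },
	{ 2, { "iommu-b" } },
};

struct hw_iommu *find_iommu_for_dev(struct device *dev)
{
	size_t i;

	for (i = 0; i < sizeof(hw_table) / sizeof(hw_table[0]); i++)
		if (hw_table[i].id == dev->iommu_id)
			return &hw_table[i];

	return NULL;
}

/* Shape of a probe_device() call-back: handle on success, error otherwise. */
struct iommu_device *probe_device(struct device *dev)
{
	struct hw_iommu *iommu = find_iommu_for_dev(dev);

	if (!iommu)
		return err_ptr(-ENODEV);

	return &iommu->iommu;
}
```

A caller checks the result with is_err() before dereferencing it and recovers the error code with ptr_err(); the driver itself no longer needs to touch groups or sysfs links.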



[PATCH v3 19/34] iommu/s390: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the S390 IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/s390-iommu.c | 22 ++
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 1137f3ddcb85..610f0828f22d 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -166,21 +166,14 @@ static void s390_iommu_detach_device(struct iommu_domain *domain,
}
 }
 
-static int s390_iommu_add_device(struct device *dev)
+static struct iommu_device *s390_iommu_probe_device(struct device *dev)
 {
-   struct iommu_group *group = iommu_group_get_for_dev(dev);
struct zpci_dev *zdev = to_pci_dev(dev)->sysdata;
 
-   if (IS_ERR(group))
-   return PTR_ERR(group);
-
-   iommu_group_put(group);
-   iommu_device_link(&zdev->iommu_dev, dev);
-
-   return 0;
+   return &zdev->iommu_dev;
 }
 
-static void s390_iommu_remove_device(struct device *dev)
+static void s390_iommu_release_device(struct device *dev)
 {
struct zpci_dev *zdev = to_pci_dev(dev)->sysdata;
struct iommu_domain *domain;
@@ -191,7 +184,7 @@ static void s390_iommu_remove_device(struct device *dev)
 * to vfio-pci and completing the VFIO_SET_IOMMU ioctl (which triggers
 * the attach_dev), removing the device via
 * "echo 1 > /sys/bus/pci/devices/.../remove" won't trigger detach_dev,
-* only remove_device will be called via the BUS_NOTIFY_REMOVED_DEVICE
+* only release_device will be called via the BUS_NOTIFY_REMOVED_DEVICE
 * notifier.
 *
 * So let's call detach_dev from here if it hasn't been called before.
@@ -201,9 +194,6 @@ static void s390_iommu_remove_device(struct device *dev)
if (domain)
s390_iommu_detach_device(domain, dev);
}
-
-   iommu_device_unlink(&zdev->iommu_dev, dev);
-   iommu_group_remove_device(dev);
 }
 
 static int s390_iommu_update_trans(struct s390_domain *s390_domain,
@@ -373,8 +363,8 @@ static const struct iommu_ops s390_iommu_ops = {
.map = s390_iommu_map,
.unmap = s390_iommu_unmap,
.iova_to_phys = s390_iommu_iova_to_phys,
-   .add_device = s390_iommu_add_device,
-   .remove_device = s390_iommu_remove_device,
+   .probe_device = s390_iommu_probe_device,
+   .release_device = s390_iommu_release_device,
.device_group = generic_device_group,
.pgsize_bitmap = S390_IOMMU_PGSIZES,
 };
-- 
2.17.1



[PATCH v3 23/34] iommu/mediatek-v1 Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the Mediatek-v1 IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/mtk_iommu_v1.c | 50 +++-
 1 file changed, 20 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index a31be05601c9..7bdd74c7cb9f 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -416,14 +416,12 @@ static int mtk_iommu_create_mapping(struct device *dev,
return 0;
 }
 
-static int mtk_iommu_add_device(struct device *dev)
+static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-   struct dma_iommu_mapping *mtk_mapping;
struct of_phandle_args iommu_spec;
struct of_phandle_iterator it;
struct mtk_iommu_data *data;
-   struct iommu_group *group;
int err;
 
of_for_each_phandle(&it, err, dev->of_node, "iommus",
@@ -442,35 +440,28 @@ static int mtk_iommu_add_device(struct device *dev)
}
 
if (!fwspec || fwspec->ops != &mtk_iommu_ops)
-   return -ENODEV; /* Not a iommu client device */
+   return ERR_PTR(-ENODEV); /* Not a iommu client device */
 
-   /*
-* This is a short-term bodge because the ARM DMA code doesn't
-* understand multi-device groups, but we have to call into it
-* successfully (and not just rely on a normal IOMMU API attach
-* here) in order to set the correct DMA API ops on @dev.
-*/
-   group = iommu_group_alloc();
-   if (IS_ERR(group))
-   return PTR_ERR(group);
+   data = dev_iommu_priv_get(dev);
 
-   err = iommu_group_add_device(group, dev);
-   iommu_group_put(group);
-   if (err)
-   return err;
+   return &data->iommu;
+}
 
-   data = dev_iommu_priv_get(dev);
+static void mtk_iommu_probe_finalize(struct device *dev)
+{
+   struct dma_iommu_mapping *mtk_mapping;
+   struct mtk_iommu_data *data;
+   int err;
+
+   data= dev_iommu_priv_get(dev);
mtk_mapping = data->dev->archdata.iommu;
-   err = arm_iommu_attach_device(dev, mtk_mapping);
-   if (err) {
-   iommu_group_remove_device(dev);
-   return err;
-   }
 
-   return iommu_device_link(&data->iommu, dev);
+   err = arm_iommu_attach_device(dev, mtk_mapping);
+   if (err)
+   dev_err(dev, "Can't create IOMMU mapping - DMA-OPS will not work\n");
 }
 
-static void mtk_iommu_remove_device(struct device *dev)
+static void mtk_iommu_release_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
@@ -479,9 +470,6 @@ static void mtk_iommu_remove_device(struct device *dev)
return;
 
data = dev_iommu_priv_get(dev);
-   iommu_device_unlink(&data->iommu, dev);
-
-   iommu_group_remove_device(dev);
iommu_fwspec_free(dev);
 }
 
@@ -534,8 +522,10 @@ static const struct iommu_ops mtk_iommu_ops = {
.map= mtk_iommu_map,
.unmap  = mtk_iommu_unmap,
.iova_to_phys   = mtk_iommu_iova_to_phys,
-   .add_device = mtk_iommu_add_device,
-   .remove_device  = mtk_iommu_remove_device,
+   .probe_device   = mtk_iommu_probe_device,
+   .probe_finalize = mtk_iommu_probe_finalize,
+   .release_device = mtk_iommu_release_device,
+   .device_group   = generic_device_group,
.pgsize_bitmap  = ~0UL << MT2701_IOMMU_PAGE_SHIFT,
 };
 
-- 
2.17.1
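
The split into probe_device() and probe_finalize() in the patch above moves the ARM mapping attach to a point after the core has finished its own setup. The sketch below models that ordering in user space under stated assumptions: core_probe(), the trace buffer and the attach_fails flag are hypothetical, and the "group" and "attach" steps are reduced to trace entries. It also shows that a failure inside the finalize step is only reported, because the call-back returns void.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Records the order in which the steps run, e.g. "probe;group;". */
char trace[128];

void note(const char *step)
{
	strcat(trace, step);
	strcat(trace, ";");
}

/* Hypothetical device: the flag makes the late attach step fail. */
struct device {
	int attach_fails;
};

struct ops {
	int  (*probe_device)(struct device *dev);
	void (*probe_finalize)(struct device *dev);	/* optional */
};

int drv_probe_device(struct device *dev)
{
	(void)dev;
	note("probe");
	return 0;
}

void drv_probe_finalize(struct device *dev)
{
	note("finalize");
	if (dev->attach_fails)
		fprintf(stderr, "can't create IOMMU mapping, DMA ops will not work\n");
}

/* Heavily simplified stand-in for the core-side probe path. */
int core_probe(const struct ops *ops, struct device *dev)
{
	int ret;

	assert(ops != NULL && ops->probe_device != NULL);

	ret = ops->probe_device(dev);
	if (ret)
		return ret;

	note("group");		/* core: group lookup and sysfs link */
	note("attach");		/* core: attach to the default domain */

	if (ops->probe_finalize)
		ops->probe_finalize(dev);

	return 0;
}
```

With trace cleared beforehand, a successful core_probe() leaves the steps in the order probe, group, attach, finalize, even when the finalize step reports a problem.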



[PATCH v3 24/34] iommu/qcom: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the QCOM IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/qcom_iommu.c | 24 +++-
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 0e2a96467767..054e476ebd49 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -524,14 +524,13 @@ static bool qcom_iommu_capable(enum iommu_cap cap)
}
 }
 
-static int qcom_iommu_add_device(struct device *dev)
+static struct iommu_device *qcom_iommu_probe_device(struct device *dev)
 {
struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
-   struct iommu_group *group;
struct device_link *link;
 
if (!qcom_iommu)
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
 
/*
 * Establish the link between iommu and master, so that the
@@ -542,28 +541,19 @@ static int qcom_iommu_add_device(struct device *dev)
if (!link) {
dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
dev_name(qcom_iommu->dev), dev_name(dev));
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
}
 
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
-
-   iommu_group_put(group);
-   iommu_device_link(&qcom_iommu->iommu, dev);
-
-   return 0;
+   return &qcom_iommu->iommu;
 }
 
-static void qcom_iommu_remove_device(struct device *dev)
+static void qcom_iommu_release_device(struct device *dev)
 {
struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
 
if (!qcom_iommu)
return;
 
-   iommu_device_unlink(&qcom_iommu->iommu, dev);
-   iommu_group_remove_device(dev);
iommu_fwspec_free(dev);
 }
 
@@ -619,8 +609,8 @@ static const struct iommu_ops qcom_iommu_ops = {
.flush_iotlb_all = qcom_iommu_flush_iotlb_all,
.iotlb_sync = qcom_iommu_iotlb_sync,
.iova_to_phys   = qcom_iommu_iova_to_phys,
-   .add_device = qcom_iommu_add_device,
-   .remove_device  = qcom_iommu_remove_device,
+   .probe_device   = qcom_iommu_probe_device,
+   .release_device = qcom_iommu_release_device,
.device_group   = generic_device_group,
.of_xlate   = qcom_iommu_of_xlate,
.pgsize_bitmap  = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
-- 
2.17.1



[PATCH v3 22/34] iommu/mediatek: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the Mediatek IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/mtk_iommu.c | 24 ++--
 1 file changed, 6 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 5f4d6df59cf6..2be96f1cdbd2 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -441,38 +441,26 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
return pa;
 }
 
-static int mtk_iommu_add_device(struct device *dev)
+static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
-   struct iommu_group *group;
 
if (!fwspec || fwspec->ops != &mtk_iommu_ops)
-   return -ENODEV; /* Not a iommu client device */
+   return ERR_PTR(-ENODEV); /* Not a iommu client device */
 
data = dev_iommu_priv_get(dev);
-   iommu_device_link(&data->iommu, dev);
 
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
-
-   iommu_group_put(group);
-   return 0;
+   return &data->iommu;
 }
 
-static void mtk_iommu_remove_device(struct device *dev)
+static void mtk_iommu_release_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-   struct mtk_iommu_data *data;
 
if (!fwspec || fwspec->ops != &mtk_iommu_ops)
return;
 
-   data = dev_iommu_priv_get(dev);
-   iommu_device_unlink(&data->iommu, dev);
-
-   iommu_group_remove_device(dev);
iommu_fwspec_free(dev);
 }
 
@@ -526,8 +514,8 @@ static const struct iommu_ops mtk_iommu_ops = {
.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
.iotlb_sync = mtk_iommu_iotlb_sync,
.iova_to_phys   = mtk_iommu_iova_to_phys,
-   .add_device = mtk_iommu_add_device,
-   .remove_device  = mtk_iommu_remove_device,
+   .probe_device   = mtk_iommu_probe_device,
+   .release_device = mtk_iommu_release_device,
.device_group   = mtk_iommu_device_group,
.of_xlate   = mtk_iommu_of_xlate,
.pgsize_bitmap  = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
-- 
2.17.1



[PATCH v3 04/34] iommu/vt-d: Wire up iommu_ops->def_domain_type

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

The Intel VT-d driver already has a matching function to determine the
default domain type for a device. Wire it up in intel_iommu_ops.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/intel-iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index ef0a5246700e..b9f905a55dda 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -6209,6 +6209,7 @@ const struct iommu_ops intel_iommu_ops = {
.dev_enable_feat= intel_iommu_dev_enable_feat,
.dev_disable_feat   = intel_iommu_dev_disable_feat,
.is_attach_deferred = intel_iommu_is_attach_deferred,
+   .def_domain_type= device_def_domain_type,
.pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
 };
 
-- 
2.17.1



[PATCH v3 00/34] iommu: Move iommu_group setup to IOMMU core code

2020-04-29 Thread Joerg Roedel
Hi,

here is the third version of this patch-set. Older versions can be found
here:

v1: https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/
(Has some more introductory text)

v2: https://lore.kernel.org/lkml/20200414131542.25608-1-j...@8bytes.org/

Changes v2 -> v3:

* Rebased to v5.7-rc3

* Added a missing iommu_group_put() as reported by Lu Baolu.

* Added a patch to consolidate more initialization work in
  __iommu_probe_device(), fixing a bug where no 'struct
  device_iommu' was allocated in the hotplug path.

There is also a git-branch available with these patches applied:


https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v3

Please review. If there are no objections I plan to put these patches
into the IOMMU tree early next week.

Thanks,

Joerg

Joerg Roedel (33):
  iommu: Move default domain allocation to separate function
  iommu/amd: Implement iommu_ops->def_domain_type call-back
  iommu/vt-d: Wire up iommu_ops->def_domain_type
  iommu/amd: Remove dma_mask check from check_device()
  iommu/amd: Return -ENODEV in add_device when device is not handled by
IOMMU
  iommu: Add probe_device() and release_device() call-backs
  iommu: Move default domain allocation to iommu_probe_device()
  iommu: Keep a list of allocated groups in __iommu_probe_device()
  iommu: Move new probe_device path to separate function
  iommu: Split off default domain allocation from group assignment
  iommu: Move iommu_group_create_direct_mappings() out of
iommu_group_add_device()
  iommu: Export bus_iommu_probe() and make is safe for re-probing
  iommu/amd: Remove dev_data->passthrough
  iommu/amd: Convert to probe/release_device() call-backs
  iommu/vt-d: Convert to probe/release_device() call-backs
  iommu/arm-smmu: Convert to probe/release_device() call-backs
  iommu/pamu: Convert to probe/release_device() call-backs
  iommu/s390: Convert to probe/release_device() call-backs
  iommu/virtio: Convert to probe/release_device() call-backs
  iommu/msm: Convert to probe/release_device() call-backs
  iommu/mediatek: Convert to probe/release_device() call-backs
  iommu/mediatek-v1 Convert to probe/release_device() call-backs
  iommu/qcom: Convert to probe/release_device() call-backs
  iommu/rockchip: Convert to probe/release_device() call-backs
  iommu/tegra: Convert to probe/release_device() call-backs
  iommu/renesas: Convert to probe/release_device() call-backs
  iommu/omap: Remove orphan_dev tracking
  iommu/omap: Convert to probe/release_device() call-backs
  iommu/exynos: Use first SYSMMU in controllers list for IOMMU core
  iommu/exynos: Convert to probe/release_device() call-backs
  iommu: Remove add_device()/remove_device() code-paths
  iommu: Move more initialization to __iommu_probe_device()
  iommu: Unexport iommu_group_get_for_dev()

Sai Praneeth Prakhya (1):
  iommu: Add def_domain_type() callback in iommu_ops

 drivers/iommu/amd_iommu.c   |  97 
 drivers/iommu/amd_iommu_types.h |   1 -
 drivers/iommu/arm-smmu-v3.c |  38 +---
 drivers/iommu/arm-smmu.c|  39 ++--
 drivers/iommu/exynos-iommu.c|  24 +-
 drivers/iommu/fsl_pamu_domain.c |  22 +-
 drivers/iommu/intel-iommu.c |  68 +-
 drivers/iommu/iommu.c   | 387 +---
 drivers/iommu/ipmmu-vmsa.c  |  60 ++---
 drivers/iommu/msm_iommu.c   |  34 +--
 drivers/iommu/mtk_iommu.c   |  24 +-
 drivers/iommu/mtk_iommu_v1.c|  50 ++---
 drivers/iommu/omap-iommu.c  |  99 ++--
 drivers/iommu/qcom_iommu.c  |  24 +-
 drivers/iommu/rockchip-iommu.c  |  26 +--
 drivers/iommu/s390-iommu.c  |  22 +-
 drivers/iommu/tegra-gart.c  |  24 +-
 drivers/iommu/tegra-smmu.c  |  31 +--
 drivers/iommu/virtio-iommu.c|  41 +---
 include/linux/iommu.h   |  21 +-
 20 files changed, 531 insertions(+), 601 deletions(-)

-- 
2.17.1



[PATCH v3 11/34] iommu: Split off default domain allocation from group assignment

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

When a bus is initialized with iommu-ops, all devices on the bus are
scanned and iommu-groups are allocated for them, and each group will
also get a default domain allocated.

Until now this happened as soon as the group was created and the first
device added to it. When other devices with different default domain
requirements were added to the group later on, the default domain was
re-allocated, if possible.

This resulted in some back and forth and unnecessary allocations, so
change the flow to defer default domain allocation until all devices
have been added to their respective IOMMU groups.

The default domains are allocated for newly allocated groups after
every device on the bus has been handled and probed by the IOMMU driver.
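
Once all devices are in their groups, the per-group decision that probe_get_default_domain_type() makes in the diff below can be modelled in user space. This is only a sketch under assumptions: the TYPE_* values, the required_type field and resolve_group_type() are hypothetical, and the driver call-back is replaced by a plain struct member. The conflict handling follows the diff: the first device with a requirement sets the type, and a later device with a different requirement resets the group to the default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative values; 0 means "no requirement", as in the patch. */
enum { TYPE_NONE = 0, TYPE_IDENTITY = 1, TYPE_DMA = 2 };

/* Hypothetical device: required_type replaces the driver call-back. */
struct device {
	const char *name;
	int required_type;
};

/* Mirrors struct __group_domain_type from the diff. */
struct group_domain_type {
	const struct device *dev;
	int type;
};

/* Per-device step, following probe_get_default_domain_type(). */
void collect_type(const struct device *dev, struct group_domain_type *gtype)
{
	int type = dev->required_type;

	if (!type)
		return;

	if (gtype->type && gtype->type != type) {
		fprintf(stderr, "%s and %s disagree, using default\n",
			dev->name, gtype->dev->name);
		gtype->type = TYPE_NONE;
	}

	/* Only the first device with a requirement is recorded. */
	if (!gtype->dev) {
		gtype->dev  = dev;
		gtype->type = type;
	}
}

/* Walk all devices of one group, then fall back to the system default. */
int resolve_group_type(const struct device *devs, size_t n, int system_default)
{
	struct group_domain_type gtype = { NULL, TYPE_NONE };
	size_t i;

	for (i = 0; i < n; i++)
		collect_type(&devs[i], &gtype);

	return gtype.type ? gtype.type : system_default;
}
```

Because the recorded device stays set after a conflict, later devices in the same group cannot re-establish a specific type, so a conflicting group always ends up with the default.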

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 154 +-
 1 file changed, 151 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8be047a4808f..7de0e29db333 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -199,7 +199,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
dev->iommu->iommu_dev = iommu_dev;
 
group = iommu_group_get_for_dev(dev);
-   if (!IS_ERR(group)) {
+   if (IS_ERR(group)) {
ret = PTR_ERR(group);
goto out_release;
}
@@ -1599,6 +1599,37 @@ static int add_iommu_group(struct device *dev, void *data)
return ret;
 }
 
+static int probe_iommu_group(struct device *dev, void *data)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+   struct list_head *group_list = data;
+   int ret;
+
+   if (!dev_iommu_get(dev))
+   return -ENOMEM;
+
+   if (!try_module_get(ops->owner)) {
+   ret = -EINVAL;
+   goto err_free_dev_iommu;
+   }
+
+   ret = __iommu_probe_device(dev, group_list);
+   if (ret)
+   goto err_module_put;
+
+   return 0;
+
+err_module_put:
+   module_put(ops->owner);
+err_free_dev_iommu:
+   dev_iommu_free(dev);
+
+   if (ret == -ENODEV)
+   ret = 0;
+
+   return ret;
+}
+
 static int remove_iommu_group(struct device *dev, void *data)
 {
iommu_release_device(dev);
@@ -1658,10 +1689,127 @@ static int iommu_bus_notifier(struct notifier_block *nb,
return 0;
 }
 
+struct __group_domain_type {
+   struct device *dev;
+   unsigned int type;
+};
+
+static int probe_get_default_domain_type(struct device *dev, void *data)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+   struct __group_domain_type *gtype = data;
+   unsigned int type = 0;
+
+   if (ops->def_domain_type)
+   type = ops->def_domain_type(dev);
+
+   if (type) {
+   if (gtype->type && gtype->type != type) {
+   dev_warn(dev, "Device needs domain type %s, but device %s in the same iommu group requires type %s - using default\n",
+iommu_domain_type_str(type),
+dev_name(gtype->dev),
+iommu_domain_type_str(gtype->type));
+   gtype->type = 0;
+   }
+
+   if (!gtype->dev) {
+   gtype->dev  = dev;
+   gtype->type = type;
+   }
+   }
+
+   return 0;
+}
+
+static void probe_alloc_default_domain(struct bus_type *bus,
+  struct iommu_group *group)
+{
+   struct __group_domain_type gtype;
+
+   memset(&gtype, 0, sizeof(gtype));
+
+   /* Ask for default domain requirements of all devices in the group */
+   __iommu_group_for_each_dev(group, &gtype,
+  probe_get_default_domain_type);
+
+   if (!gtype.type)
+   gtype.type = iommu_def_domain_type;
+
+   iommu_group_alloc_default_domain(bus, group, gtype.type);
+}
+
+static int iommu_group_do_dma_attach(struct device *dev, void *data)
+{
+   struct iommu_domain *domain = data;
+   const struct iommu_ops *ops;
+   int ret;
+
+   ret = __iommu_attach_device(domain, dev);
+
+   ops = domain->ops;
+
+   if (ret == 0 && ops->probe_finalize)
+   ops->probe_finalize(dev);
+
+   return ret;
+}
+
+static int __iommu_group_dma_attach(struct iommu_group *group)
+{
+   return __iommu_group_for_each_dev(group, group->default_domain,
+ iommu_group_do_dma_attach);
+}
+
+static int bus_iommu_probe(struct bus_type *bus)
+{
+   const struct iommu_ops *ops = bus->iommu_ops;
+   int ret;
+
+   if (ops->probe_device) {
+   struct iommu_group *group, *next;
+   LIST_HEAD(group_list);
+
+   /*
+* This code-path does not allocate the default domain when
+   

[PATCH v3 01/34] iommu: Move default domain allocation to separate function

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Move the code out of iommu_group_get_for_dev() into a separate
function.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 74 ++-
 1 file changed, 45 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2b471419e26c..bfe011760ed1 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1361,6 +1361,41 @@ struct iommu_group *fsl_mc_device_group(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(fsl_mc_device_group);
 
+static int iommu_alloc_default_domain(struct device *dev,
+ struct iommu_group *group)
+{
+   struct iommu_domain *dom;
+
+   if (group->default_domain)
+   return 0;
+
+   dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
+   if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
+   dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
+   if (dom) {
+   dev_warn(dev,
+"failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
+iommu_def_domain_type);
+   }
+   }
+
+   if (!dom)
+   return -ENOMEM;
+
+   group->default_domain = dom;
+   if (!group->domain)
+   group->domain = dom;
+
+   if (!iommu_dma_strict) {
+   int attr = 1;
+   iommu_domain_set_attr(dom,
+ DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+ &attr);
+   }
+
+   return 0;
+}
+
 /**
  * iommu_group_get_for_dev - Find or create the IOMMU group for a device
  * @dev: target device
@@ -1393,40 +1428,21 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 
/*
 * Try to allocate a default domain - needs support from the
-* IOMMU driver.
+* IOMMU driver. There are still some drivers which don't support
+* default domains, so the return value is not yet checked.
 */
-   if (!group->default_domain) {
-   struct iommu_domain *dom;
-
-   dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
-   if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
-   dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
-   if (dom) {
-   dev_warn(dev,
-"failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
-iommu_def_domain_type);
-   }
-   }
-
-   group->default_domain = dom;
-   if (!group->domain)
-   group->domain = dom;
-
-   if (dom && !iommu_dma_strict) {
-   int attr = 1;
-   iommu_domain_set_attr(dom,
- DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
- &attr);
-   }
-   }
+   iommu_alloc_default_domain(dev, group);
 
ret = iommu_group_add_device(group, dev);
-   if (ret) {
-   iommu_group_put(group);
-   return ERR_PTR(ret);
-   }
+   if (ret)
+   goto out_put_group;
 
return group;
+
+out_put_group:
+   iommu_group_put(group);
+
+   return ERR_PTR(ret);
 }
 EXPORT_SYMBOL(iommu_group_get_for_dev);
 
-- 
2.17.1
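
The allocation policy that the new iommu_alloc_default_domain() carries over can be sketched in user space as follows. This is an illustration under assumptions, not the kernel code: domain_alloc(), the supported_types bitmask, the static pool and the TYPE_* values are hypothetical. It shows the three outcomes visible in the patch: an existing default domain is kept, a failed allocation of the preferred type falls back to a DMA domain, and -ENOMEM is returned when both attempts fail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { TYPE_IDENTITY = 1, TYPE_DMA = 2 };	/* illustrative values */

struct domain {
	int type;
};

struct group {
	struct domain *default_domain;
	struct domain *domain;
};

/* Hypothetical allocator state: bitmask of types the "hardware" supports. */
unsigned int supported_types = 1u << TYPE_DMA;
struct domain pool[8];
size_t pool_used;

struct domain *domain_alloc(int type)
{
	if (!(supported_types & (1u << type)) || pool_used == 8)
		return NULL;

	pool[pool_used].type = type;
	return &pool[pool_used++];
}

/* Same control flow as in the patch, minus the flush-queue attribute. */
int alloc_default_domain(struct group *group, int preferred)
{
	struct domain *dom;

	assert(group != NULL);

	if (group->default_domain)
		return 0;

	dom = domain_alloc(preferred);
	if (!dom && preferred != TYPE_DMA)
		dom = domain_alloc(TYPE_DMA);	/* fall back to a DMA domain */

	if (!dom)
		return -ENOMEM;

	group->default_domain = dom;
	if (!group->domain)
		group->domain = dom;

	return 0;
}
```

The explicit -ENOMEM return is the visible difference from the inlined version being removed, where a failed allocation was silently stored as a NULL default domain.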



[PATCH v3 27/34] iommu/renesas: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the Renesas IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/ipmmu-vmsa.c | 60 +-
 1 file changed, 20 insertions(+), 40 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 310cf09feea3..fb7e702dee23 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -805,24 +805,8 @@ static int ipmmu_of_xlate(struct device *dev,
 static int ipmmu_init_arm_mapping(struct device *dev)
 {
struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
-   struct iommu_group *group;
int ret;
 
-   /* Create a device group and add the device to it. */
-   group = iommu_group_alloc();
-   if (IS_ERR(group)) {
-   dev_err(dev, "Failed to allocate IOMMU group\n");
-   return PTR_ERR(group);
-   }
-
-   ret = iommu_group_add_device(group, dev);
-   iommu_group_put(group);
-
-   if (ret < 0) {
-   dev_err(dev, "Failed to add device to IPMMU group\n");
-   return ret;
-   }
-
/*
 * Create the ARM mapping, used by the ARM DMA mapping core to allocate
 * VAs. This will allocate a corresponding IOMMU domain.
@@ -856,48 +840,39 @@ static int ipmmu_init_arm_mapping(struct device *dev)
return 0;
 
 error:
-   iommu_group_remove_device(dev);
if (mmu->mapping)
arm_iommu_release_mapping(mmu->mapping);
 
return ret;
 }
 
-static int ipmmu_add_device(struct device *dev)
+static struct iommu_device *ipmmu_probe_device(struct device *dev)
 {
struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
-   struct iommu_group *group;
-   int ret;
 
/*
 * Only let through devices that have been verified in xlate()
 */
if (!mmu)
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
 
-   if (IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_IOMMU_DMA)) {
-   ret = ipmmu_init_arm_mapping(dev);
-   if (ret)
-   return ret;
-   } else {
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
+   return &mmu->iommu;
+}
 
-   iommu_group_put(group);
-   }
+static void ipmmu_probe_finalize(struct device *dev)
+{
+   int ret = 0;
 
-   iommu_device_link(&mmu->iommu, dev);
-   return 0;
+   if (IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_IOMMU_DMA))
+   ret = ipmmu_init_arm_mapping(dev);
+
+   if (ret)
+   dev_err(dev, "Can't create IOMMU mapping - DMA-OPS will not work\n");
 }
 
-static void ipmmu_remove_device(struct device *dev)
+static void ipmmu_release_device(struct device *dev)
 {
-   struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
-
-   iommu_device_unlink(&mmu->iommu, dev);
arm_iommu_detach_device(dev);
-   iommu_group_remove_device(dev);
 }
 
 static struct iommu_group *ipmmu_find_group(struct device *dev)
@@ -925,9 +900,14 @@ static const struct iommu_ops ipmmu_ops = {
.flush_iotlb_all = ipmmu_flush_iotlb_all,
.iotlb_sync = ipmmu_iotlb_sync,
.iova_to_phys = ipmmu_iova_to_phys,
-   .add_device = ipmmu_add_device,
-   .remove_device = ipmmu_remove_device,
+   .probe_device = ipmmu_probe_device,
+   .release_device = ipmmu_release_device,
+   .probe_finalize = ipmmu_probe_finalize,
+#if defined(CONFIG_ARM) && !defined(CONFIG_IOMMU_DMA)
+   .device_group = generic_device_group,
+#else
.device_group = ipmmu_find_group,
+#endif
.pgsize_bitmap = SZ_1G | SZ_2M | SZ_4K,
.of_xlate = ipmmu_of_xlate,
 };
-- 
2.17.1



[PATCH v3 03/34] iommu/amd: Implement iommu_ops->def_domain_type call-back

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Implement the new def_domain_type call-back for the AMD IOMMU driver.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 20cce366e951..73b4f84cf449 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2661,6 +2661,20 @@ static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
amd_iommu_flush_iotlb_all(domain);
 }
 
+static int amd_iommu_def_domain_type(struct device *dev)
+{
+   struct iommu_dev_data *dev_data;
+
+   dev_data = get_dev_data(dev);
+   if (!dev_data)
+   return 0;
+
+   if (dev_data->iommu_v2)
+   return IOMMU_DOMAIN_IDENTITY;
+
+   return 0;
+}
+
 const struct iommu_ops amd_iommu_ops = {
.capable = amd_iommu_capable,
.domain_alloc = amd_iommu_domain_alloc,
@@ -2680,6 +2694,7 @@ const struct iommu_ops amd_iommu_ops = {
.pgsize_bitmap  = AMD_IOMMU_PGSIZES,
.flush_iotlb_all = amd_iommu_flush_iotlb_all,
.iotlb_sync = amd_iommu_iotlb_sync,
+   .def_domain_type = amd_iommu_def_domain_type,
 };
 
 /*
-- 
2.17.1



[PATCH v3 10/34] iommu: Move new probe_device path to separate function

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

This makes it easier to remove the old code-path when all drivers are
converted. As a side effect, it also fixes the error cleanup
path.
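
The cleanup behaviour referred to here follows the usual goto-unwinding pattern. A minimal user-space sketch, assuming hypothetical step_probe(), step_attach() and step_release() functions in place of the kernel calls, shows the intent: a failure before anything was set up returns directly, while a failure after a successful probe releases the device first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device model: flags force a step to fail, a counter records cleanup. */
struct device {
	int probe_fails;
	int attach_fails;
	int released;	/* how often the release step ran */
};

int step_probe(struct device *dev)
{
	return dev->probe_fails ? -ENODEV : 0;
}

int step_attach(struct device *dev)
{
	return dev->attach_fails ? -EBUSY : 0;
}

void step_release(struct device *dev)
{
	dev->released++;
}

int probe_helper(struct device *dev)
{
	int ret;

	assert(dev != NULL);

	ret = step_probe(dev);
	if (ret)
		goto err_out;		/* nothing to undo yet */

	ret = step_attach(dev);
	if (ret)
		goto err_release;	/* undo the successful probe */

	return 0;

err_release:
	step_release(dev);
err_out:
	return ret;
}
```

Each label undoes exactly the steps that succeeded before the failure, so adding a further step later only needs one more label above the existing ones.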

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 69 ---
 1 file changed, 46 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 18eb3623bd00..8be047a4808f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -218,12 +218,55 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
return ret;
 }
 
+static int __iommu_probe_device_helper(struct device *dev)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+   struct iommu_group *group;
+   int ret;
+
+   ret = __iommu_probe_device(dev, NULL);
+   if (ret)
+   goto err_out;
+
+   /*
+* Try to allocate a default domain - needs support from the
+* IOMMU driver. There are still some drivers which don't
+* support default domains, so the return value is not yet
+* checked.
+*/
+   iommu_alloc_default_domain(dev);
+
+   group = iommu_group_get(dev);
+   if (!group)
+   goto err_release;
+
+   if (group->default_domain)
+   ret = __iommu_attach_device(group->default_domain, dev);
+
+   iommu_group_put(group);
+
+   if (ret)
+   goto err_release;
+
+   if (ops->probe_finalize)
+   ops->probe_finalize(dev);
+
+   return 0;
+
+err_release:
+   iommu_release_device(dev);
+err_out:
+   return ret;
+
+}
+
 int iommu_probe_device(struct device *dev)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
int ret;
 
WARN_ON(dev->iommu_group);
+
if (!ops)
return -EINVAL;
 
@@ -235,30 +278,10 @@ int iommu_probe_device(struct device *dev)
goto err_free_dev_param;
}
 
-   if (ops->probe_device) {
-   struct iommu_group *group;
-
-   ret = __iommu_probe_device(dev, NULL);
-
-   /*
-* Try to allocate a default domain - needs support from the
-* IOMMU driver. There are still some drivers which don't
-* support default domains, so the return value is not yet
-* checked.
-*/
-   if (!ret)
-   iommu_alloc_default_domain(dev);
-
-   group = iommu_group_get(dev);
-   if (group && group->default_domain) {
-   ret = __iommu_attach_device(group->default_domain, dev);
-   iommu_group_put(group);
-   }
-
-   } else {
-   ret = ops->add_device(dev);
-   }
+   if (ops->probe_device)
+   return __iommu_probe_device_helper(dev);
 
+   ret = ops->add_device(dev);
if (ret)
goto err_module_put;
 
-- 
2.17.1
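
The helper added by this patch uses the usual goto-based unwinding: each
label undoes the steps that had succeeded before the failure. A minimal
userspace sketch of that pattern follows; struct fake_state and its
injected error codes are made up for illustration and do not model the
real IOMMU core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: error codes to inject plus what has been set up. */
struct fake_state {
	int fail_probe;		/* error returned by step 1, 0 for success */
	int fail_attach;	/* error returned by step 2, 0 for success */
	bool probed;
	bool finalized;
};

static int probe_helper(struct fake_state *s)
{
	int ret;

	assert(s != NULL);

	ret = s->fail_probe;		/* step 1: probe */
	if (ret)
		goto err_out;		/* nothing to undo yet */
	s->probed = true;

	ret = s->fail_attach;		/* step 2: attach */
	if (ret)
		goto err_release;	/* step 1 must be undone */

	s->finalized = true;		/* step 3: finalize */
	return 0;

err_release:
	s->probed = false;		/* undo step 1 */
err_out:
	return ret;
}
```

A failure in step 1 returns without touching anything, and a failure in
step 2 releases what step 1 set up before returning the error.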



[PATCH v3 20/34] iommu/virtio: Convert to probe/release_device() call-backs

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Convert the VirtIO IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/virtio-iommu.c | 41 +---
 1 file changed, 10 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index d5cac4f46ca5..bda300c2a438 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -865,24 +865,23 @@ static struct viommu_dev *viommu_get_by_fwnode(struct fwnode_handle *fwnode)
return dev ? dev_to_virtio(dev)->priv : NULL;
 }
 
-static int viommu_add_device(struct device *dev)
+static struct iommu_device *viommu_probe_device(struct device *dev)
 {
int ret;
-   struct iommu_group *group;
struct viommu_endpoint *vdev;
struct viommu_dev *viommu = NULL;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 
if (!fwspec || fwspec->ops != &viommu_ops)
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
 
viommu = viommu_get_by_fwnode(fwspec->iommu_fwnode);
if (!viommu)
-   return -ENODEV;
+   return ERR_PTR(-ENODEV);
 
vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
if (!vdev)
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
 
vdev->dev = dev;
vdev->viommu = viommu;
@@ -896,45 +895,25 @@ static int viommu_add_device(struct device *dev)
goto err_free_dev;
}
 
-   ret = iommu_device_link(&viommu->iommu, dev);
-   if (ret)
-   goto err_free_dev;
+   return &viommu->iommu;
 
-   /*
-* Last step creates a default domain and attaches to it. Everything
-* must be ready.
-*/
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group)) {
-   ret = PTR_ERR(group);
-   goto err_unlink_dev;
-   }
-
-   iommu_group_put(group);
-
-   return PTR_ERR_OR_ZERO(group);
-
-err_unlink_dev:
-   iommu_device_unlink(&viommu->iommu, dev);
 err_free_dev:
generic_iommu_put_resv_regions(dev, &vdev->resv_regions);
kfree(vdev);
 
-   return ret;
+   return ERR_PTR(ret);
 }
 
-static void viommu_remove_device(struct device *dev)
+static void viommu_release_device(struct device *dev)
 {
-   struct viommu_endpoint *vdev;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct viommu_endpoint *vdev;
 
if (!fwspec || fwspec->ops != &viommu_ops)
return;
 
vdev = dev_iommu_priv_get(dev);
 
-   iommu_group_remove_device(dev);
-   iommu_device_unlink(&vdev->viommu->iommu, dev);
generic_iommu_put_resv_regions(dev, &vdev->resv_regions);
kfree(vdev);
 }
@@ -960,8 +939,8 @@ static struct iommu_ops viommu_ops = {
.unmap  = viommu_unmap,
.iova_to_phys   = viommu_iova_to_phys,
.iotlb_sync = viommu_iotlb_sync,
-   .add_device = viommu_add_device,
-   .remove_device  = viommu_remove_device,
+   .probe_device   = viommu_probe_device,
+   .release_device = viommu_release_device,
.device_group   = viommu_device_group,
.get_resv_regions   = viommu_get_resv_regions,
.put_resv_regions   = generic_iommu_put_resv_regions,
-- 
2.17.1
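
The conversion above changes the call-back from returning an int to
returning a pointer, with errors encoded in the pointer value. A minimal
userspace sketch of that encoding is shown below. The lower-case helpers
err_ptr(), ptr_err() and is_err() are re-implementations written for this
sketch (the kernel's own versions live in include/linux/err.h), and
struct fake_iommu and fake_probe_device() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* The top 4095 address values are reserved for encoded error codes. */
#define SKETCH_MAX_ERRNO 4095

static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct fake_iommu {
	int id;
};

static struct fake_iommu the_iommu = { 42 };

/* Returns a valid object on success or an encoded errno on failure. */
static struct fake_iommu *fake_probe_device(int have_fwspec)
{
	if (!have_fwspec)
		return err_ptr(-ENODEV);
	return &the_iommu;
}
```

A caller checks the result with is_err() first and only dereferences the
pointer when that check fails; ptr_err() recovers the negative errno.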



[PATCH v3 14/34] iommu/amd: Remove dev_data->passthrough

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

Make use of generic IOMMU infrastructure to gather the same information
carried in dev_data->passthrough and remove the struct member.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c   | 10 +-
 drivers/iommu/amd_iommu_types.h |  1 -
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 3e0d27f7622e..0b4b4faa876d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2047,8 +2047,8 @@ static int pdev_iommuv2_enable(struct pci_dev *pdev)
 static int attach_device(struct device *dev,
 struct protection_domain *domain)
 {
-   struct pci_dev *pdev;
struct iommu_dev_data *dev_data;
+   struct pci_dev *pdev;
unsigned long flags;
int ret;
 
@@ -2067,8 +2067,10 @@ static int attach_device(struct device *dev,
 
pdev = to_pci_dev(dev);
if (domain->flags & PD_IOMMUV2_MASK) {
+   struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
+
ret = -EINVAL;
-   if (!dev_data->passthrough)
+   if (def_domain->type != IOMMU_DOMAIN_IDENTITY)
goto out;
 
if (dev_data->iommu_v2) {
@@ -2189,9 +2191,7 @@ static int amd_iommu_add_device(struct device *dev)
 
/* Domains are initialized for this device - have a look what we ended up with */
domain = iommu_get_domain_for_dev(dev);
-   if (domain->type == IOMMU_DOMAIN_IDENTITY)
-   dev_data->passthrough = true;
-   else if (domain->type == IOMMU_DOMAIN_DMA)
+   if (domain->type == IOMMU_DOMAIN_DMA)
iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, 0);
 
 out:
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index ca8c4522045b..d0d7b6a0c3d8 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -640,7 +640,6 @@ struct iommu_dev_data {
struct pci_dev *pdev;
u16 devid;/* PCI Device ID */
bool iommu_v2;/* Device can make use of IOMMUv2 */
-   bool passthrough; /* Device is identity mapped */
struct {
bool enabled;
int qdep;
-- 
2.17.1



[PATCH v3 12/34] iommu: Move iommu_group_create_direct_mappings() out of iommu_group_add_device()

2020-04-29 Thread Joerg Roedel
From: Joerg Roedel 

After the previous changes the iommu group may not have a default
domain when iommu_group_add_device() is called. With no default domain
iommu_group_create_direct_mappings() will do nothing and no direct
mappings will be created.

Rename iommu_group_create_direct_mappings() to
iommu_create_device_direct_mappings() to better reflect that the
function creates direct mappings only for one device and not for all
devices in the group. Then move the call to the places where a default
domain actually exists.

Tested-by: Marek Szyprowski 
Acked-by: Marek Szyprowski 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 35 ++-
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7de0e29db333..834a45da0ed0 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -89,6 +89,8 @@ static int __iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
 static void __iommu_detach_group(struct iommu_domain *domain,
 struct iommu_group *group);
+static int iommu_create_device_direct_mappings(struct iommu_group *group,
+  struct device *dev);
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
 struct iommu_group_attribute iommu_group_attr_##_name =\
@@ -243,6 +245,8 @@ static int __iommu_probe_device_helper(struct device *dev)
if (group->default_domain)
ret = __iommu_attach_device(group->default_domain, dev);
 
+   iommu_create_device_direct_mappings(group, dev);
+
iommu_group_put(group);
 
if (ret)
@@ -263,6 +267,7 @@ static int __iommu_probe_device_helper(struct device *dev)
 int iommu_probe_device(struct device *dev)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
+   struct iommu_group *group;
int ret;
 
WARN_ON(dev->iommu_group);
@@ -285,6 +290,10 @@ int iommu_probe_device(struct device *dev)
if (ret)
goto err_module_put;
 
+   group = iommu_group_get(dev);
+   iommu_create_device_direct_mappings(group, dev);
+   iommu_group_put(group);
+
if (ops->probe_finalize)
ops->probe_finalize(dev);
 
@@ -736,8 +745,8 @@ int iommu_group_set_name(struct iommu_group *group, const char *name)
 }
 EXPORT_SYMBOL_GPL(iommu_group_set_name);
 
-static int iommu_group_create_direct_mappings(struct iommu_group *group,
- struct device *dev)
+static int iommu_create_device_direct_mappings(struct iommu_group *group,
+  struct device *dev)
 {
struct iommu_domain *domain = group->default_domain;
struct iommu_resv_region *entry;
@@ -841,8 +850,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 
dev->iommu_group = group;
 
-   iommu_group_create_direct_mappings(group, dev);
-
mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
if (group->domain)
@@ -1736,6 +1743,7 @@ static void probe_alloc_default_domain(struct bus_type *bus,
gtype.type = iommu_def_domain_type;
 
iommu_group_alloc_default_domain(bus, group, gtype.type);
+
 }
 
 static int iommu_group_do_dma_attach(struct device *dev, void *data)
@@ -1760,6 +1768,21 @@ static int __iommu_group_dma_attach(struct iommu_group *group)
  iommu_group_do_dma_attach);
 }
 
+static int iommu_do_create_direct_mappings(struct device *dev, void *data)
+{
+   struct iommu_group *group = data;
+
+   iommu_create_device_direct_mappings(group, dev);
+
+   return 0;
+}
+
+static int iommu_group_create_direct_mappings(struct iommu_group *group)
+{
+   return __iommu_group_for_each_dev(group, group,
+ iommu_do_create_direct_mappings);
+}
+
 static int bus_iommu_probe(struct bus_type *bus)
 {
const struct iommu_ops *ops = bus->iommu_ops;
@@ -1792,6 +1815,8 @@ static int bus_iommu_probe(struct bus_type *bus)
continue;
}
 
+   iommu_group_create_direct_mappings(group);
+
ret = __iommu_group_dma_attach(group);
 
mutex_unlock(&group->mutex);
@@ -2632,7 +2657,7 @@ request_default_domain_for_dev(struct device *dev, unsigned long type)
iommu_domain_free(group->default_domain);
group->default_domain = domain;
 
-   iommu_group_create_direct_mappings(group, dev);
+   iommu_create_device_direct_mappings(group, dev);
 
dev_info(dev, "Using iommu %s mapping\n",
 type == IOMMU_DOMAIN_DMA ? "dma" : "direct");
-- 
2.17.1
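
The patch splits the work into a per-device function and a group-wide
wrapper that walks every device through a call-back. A minimal userspace
sketch of that structure follows, assuming a simple singly linked device
list; struct fake_dev, struct fake_group and all function names are
hypothetical and only mirror the shape of the kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct fake_dev {
	int id;
	int mapped;			/* set once direct mappings exist */
	struct fake_dev *next;
};

struct fake_group {
	struct fake_dev *devices;
	int has_default_domain;
};

/* Call fn for each device; stop at the first non-zero return value. */
static int group_for_each_dev(struct fake_group *group, void *data,
			      int (*fn)(struct fake_dev *, void *))
{
	struct fake_dev *dev;
	int ret = 0;

	assert(group != NULL && fn != NULL);
	for (dev = group->devices; dev; dev = dev->next) {
		ret = fn(dev, data);
		if (ret)
			break;
	}
	return ret;
}

/* Per-device step: does nothing while the group has no default domain. */
static int create_device_direct_mappings(struct fake_group *group,
					 struct fake_dev *dev)
{
	if (!group->has_default_domain)
		return 0;
	dev->mapped = 1;
	return 0;
}

static int do_create_direct_mappings(struct fake_dev *dev, void *data)
{
	struct fake_group *group = data;

	create_device_direct_mappings(group, dev);
	return 0;
}

/* Group-wide wrapper, to be called once a default domain exists. */
static int group_create_direct_mappings(struct fake_group *group)
{
	return group_for_each_dev(group, group, do_create_direct_mappings);
}
```

Calling the wrapper before the default domain exists leaves every device
unmapped, which is the situation the commit message describes; calling it
afterwards marks all devices in the group.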


Re: [PATCH 0/1] Add uvirtio for testing

2020-04-29 Thread Gerd Hoffmann
> 3) Need to be verbose on how the vring processing work in the commit log of
> patch 1

Even better a file documenting the interface somewhere in
Documentation/

take care,
  Gerd



Re: [PATCH] drm/qxl: qxl_release use after free

2020-04-29 Thread Gerd Hoffmann
On Wed, Apr 29, 2020 at 12:01:24PM +0300, Vasily Averin wrote:
> qxl_release should not be accessed after qxl_push_*_ring_release() calls:
> userspace driver can process submitted command quickly, move qxl_release
> into release_ring, generate interrupt and trigger garbage collector.
> 
> It can lead to crashes in qxl driver or trigger memory corruption
> in some kmalloc-192 slab object
> 
> Gerd Hoffmann proposes to swap the qxl_release_fence_buffer_objects() +
> qxl_push_{cursor,command}_ring_release() calls to close that race window.
> 
> cc: sta...@vger.kernel.org
> Fixes: f64122c1f6ad ("drm: add new QXL driver. (v1.4)")
> Signed-off-by: Vasily Averin 

Pushed to drm-misc-fixes.

thanks,
  Gerd



Re: [virtio-dev] Re: [PATCH 5/5] virtio: Add bounce DMA ops

2020-04-29 Thread Jan Kiszka

On 29.04.20 12:45, Michael S. Tsirkin wrote:
> On Wed, Apr 29, 2020 at 12:26:43PM +0200, Jan Kiszka wrote:
> > On 29.04.20 12:20, Michael S. Tsirkin wrote:
> > > On Wed, Apr 29, 2020 at 03:39:53PM +0530, Srivatsa Vaddagiri wrote:
> > > > That would still not work I think where swiotlb is used for pass-thr devices
> > > > (when private memory is fine) as well as virtio devices (when shared memory is
> > > > required).
> > >
> > > So that is a separate question. When there are multiple untrusted
> > > devices, at the moment it looks like a single bounce buffer is used.
> > >
> > > Which to me seems like a security problem, I think we should protect
> > > untrusted devices from each other.
> >
> > Definitely. That's the model we have for ivshmem-virtio as well.
> >
> > Jan
>
> Want to try implementing that?

The desire is definitely there, currently "just" not the time.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux


Re: [PATCH 5/5] virtio: Add bounce DMA ops

2020-04-29 Thread Michael S. Tsirkin
On Wed, Apr 29, 2020 at 12:26:43PM +0200, Jan Kiszka wrote:
> On 29.04.20 12:20, Michael S. Tsirkin wrote:
> > On Wed, Apr 29, 2020 at 03:39:53PM +0530, Srivatsa Vaddagiri wrote:
> > > That would still not work I think where swiotlb is used for pass-thr devices
> > > (when private memory is fine) as well as virtio devices (when shared memory is
> > > required).
> > 
> > So that is a separate question. When there are multiple untrusted
> > devices, at the moment it looks like a single bounce buffer is used.
> > 
> > Which to me seems like a security problem, I think we should protect
> > untrusted devices from each other.
> > 
> 
> Definitely. That's the model we have for ivshmem-virtio as well.
> 
> Jan

Want to try implementing that?

> -- 
> Siemens AG, Corporate Technology, CT RDA IOT SES-DE
> Corporate Competence Center Embedded Linux



Re: [PATCH 5/5] virtio: Add bounce DMA ops

2020-04-29 Thread Jan Kiszka

On 29.04.20 12:20, Michael S. Tsirkin wrote:
> On Wed, Apr 29, 2020 at 03:39:53PM +0530, Srivatsa Vaddagiri wrote:
> > That would still not work I think where swiotlb is used for pass-thr devices
> > (when private memory is fine) as well as virtio devices (when shared memory is
> > required).
>
> So that is a separate question. When there are multiple untrusted
> devices, at the moment it looks like a single bounce buffer is used.
>
> Which to me seems like a security problem, I think we should protect
> untrusted devices from each other.

Definitely. That's the model we have for ivshmem-virtio as well.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux


Re: [PATCH 5/5] virtio: Add bounce DMA ops

2020-04-29 Thread Michael S. Tsirkin
On Wed, Apr 29, 2020 at 03:39:53PM +0530, Srivatsa Vaddagiri wrote:
> That would still not work I think where swiotlb is used for pass-thr devices
> (when private memory is fine) as well as virtio devices (when shared memory is
> required).

So that is a separate question. When there are multiple untrusted
devices, at the moment it looks like a single bounce buffer is used.

Which to me seems like a security problem, I think we should protect
untrusted devices from each other.





> -- 
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation
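
To make the point about protecting untrusted devices from each other more
concrete, here is a toy userspace sketch of one bounce pool per device
instead of a single shared buffer. It is not the swiotlb implementation:
struct bounce_pool, bounce_map() and bounce_owns() are hypothetical names,
and the pool is a plain bump allocator with no unmap path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_SIZE 256

/* One pool per device: a device only ever sees addresses inside its pool. */
struct bounce_pool {
	unsigned char buf[POOL_SIZE];
	size_t used;
};

/* Copy len bytes into the pool; return the bounce slot or NULL if full. */
static void *bounce_map(struct bounce_pool *pool, const void *src, size_t len)
{
	void *slot;

	assert(pool != NULL && src != NULL);
	if (len > POOL_SIZE - pool->used)
		return NULL;
	slot = pool->buf + pool->used;
	memcpy(slot, src, len);
	pool->used += len;
	return slot;
}

/* True if p points into this pool's buffer. */
static bool bounce_owns(const struct bounce_pool *pool, const void *p)
{
	uintptr_t start = (uintptr_t)pool->buf;
	uintptr_t addr = (uintptr_t)p;

	return addr >= start && addr < start + POOL_SIZE;
}
```

With one pool per device, data bounced for device A never lands in the
memory that device B is allowed to reach, which is the isolation property
being asked for above.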



Re: [PATCH v3 03/75] KVM: SVM: Use __packed shorthand

2020-04-29 Thread Borislav Petkov
On Tue, Apr 28, 2020 at 05:16:13PM +0200, Joerg Roedel wrote:
> From: Borislav Petkov 
> 
> I guess we can do that ontop.

The proper commit message was:

"... to make it more readable.

No functional changes."

:-)

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH 0/1] Add uvirtio for testing

2020-04-29 Thread Jason Wang


On 2020/4/29 4:47 AM, Lepton Wu wrote:
> This is a way to create virtio based devices from user space. This is the
> background for this patch:
>
> We have some images that work fine under qemu; we'd like to also run the same
> images on Google Cloud. Currently Google Cloud doesn't support virtio-vga. I
> had a patch to create a virtio-vga from kernel directly:
> https://www.spinics.net/lists/dri-devel/msg248573.html
>
> Then I got feedback from Gerd that maybe it's better to change that to something
> like uvirtio. Since I really don't have other use cases for now, I just
> implemented the minimal stuff which works for my use case.

Interesting, several questions:

1) Are you aware of the virtio vhost-user driver done by the UM guys
   (arch/um/drivers/virtio_uml.c)? The memory part is tricky, but overall
   both of you have a similar target.
2) Patch 1 says it's a userspace virtio driver, but I think it is
   actually a "userspace virtio device".
3) The commit log of patch 1 needs to be more verbose about how the vring
   processing works.
4) I'm curious which testing you want to accomplish through this new
   transport. I guess you want to test a specific virtio driver?
5) You mentioned that you may want to develop communication between
   kernel and userspace. Any more details on that?

Thanks

> Lepton Wu (1):
>   virtio: Add uvirtio driver
>
>  drivers/virtio/Kconfig|   8 +
>  drivers/virtio/Makefile   |   1 +
>  drivers/virtio/uvirtio.c  | 405 ++
>  include/linux/uvirtio.h   |   8 +
>  include/uapi/linux/uvirtio.h  |  69 ++
>  samples/uvirtio/Makefile  |   9 +
>  samples/uvirtio/uvirtio-vga.c |  63 ++
>  7 files changed, 563 insertions(+)
>  create mode 100644 drivers/virtio/uvirtio.c
>  create mode 100644 include/linux/uvirtio.h
>  create mode 100644 include/uapi/linux/uvirtio.h
>  create mode 100644 samples/uvirtio/Makefile
>  create mode 100644 samples/uvirtio/uvirtio-vga.c




Re: [PATCH 5/5] virtio: Add bounce DMA ops

2020-04-29 Thread Michael S. Tsirkin
On Wed, Apr 29, 2020 at 03:14:10PM +0530, Srivatsa Vaddagiri wrote:
> * Michael S. Tsirkin  [2020-04-29 02:50:41]:
> 
> > So it seems that with modern Linux, all one needs
> > to do on x86 is mark the device as untrusted.
> > It's already possible to do this with ACPI and with OF - would that be
> > sufficient for achieving what this patchset is trying to do?
> 
> In my case, it's not sufficient to just mark virtio device untrusted and thus
> activate the use of swiotlb. All of the secondary VM memory, including those
> allocated by swiotlb driver, is private to it.

So why not make the bounce buffer memory shared then?

> An additional piece of memory is
> available to secondary VM which is shared between VMs and which is where I need
> swiotlb driver to do its work.
> 
> -- 
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH net-next 0/3] vsock: support network namespace

2020-04-29 Thread Jason Wang


On 2020/4/29 12:00 AM, Stefano Garzarella wrote:
> On Tue, Apr 28, 2020 at 04:13:22PM +0800, Jason Wang wrote:
> > On 2020/4/27 10:25 PM, Stefano Garzarella wrote:
> > > Hi David, Michael, Stefan,
> > > I'm restarting to work on this topic since Kata guys are interested to
> > > have that, especially on the guest side.
> > >
> > > While working on the v2 I had few doubts, and I'd like to have your
> > > suggestions:
> > >
> > >    1. netns assigned to the device inside the guest
> > >
> > >  Currently I assigned this device to 'init_net'. Maybe it is better
> > >  if we allow the user to decide which netns assign to the device
> > >  or to disable this new feature to have the same behavior as before
> > >  (host reachable from any netns).
> > >  I think we can handle this in the vsock core and not in the single
> > >  transports.
> > >
> > >  The simplest way that I found, is to add a new
> > >  IOCTL_VM_SOCKETS_ASSIGN_G2H_NETNS to /dev/vsock to enable the feature
> > >  and assign the device to the same netns of the process that do the
> > >  ioctl(), but I'm not sure it is clean enough.
> > >
> > >  Maybe it is better to add new rtnetlink messages, but I'm not sure if
> > >  it is feasible since we don't have a netdev device.
> > >
> > >  What do you suggest?
> >
> > As we've discussed, it should be a netdev probably in either guest or host
> > side. And it would be much simpler if we want to implement namespace then.
> > No new API is needed.
>
> Thanks Jason!
>
> It would be cool, but I don't have much experience on netdev.
> Do you see any particular obstacles?

I don't see any, but if there are, we can try to find a solution or ask
netdev experts for help. I did hear from somebody in the past who was
interested in having a netdev.

> I'll take a look to understand how to do it, surely in the guest would
> be very useful to have the vsock device as a netdev and maybe also in the host.

Yes, it's worth a try. Then we will have a unified management
interface and we will benefit from it in the future.

Starting from the guest is a good idea; it should be less complicated than
the host.

Thanks

> Stefano




Re: [PATCH 1/1] drm/qxl: add mutex_lock/mutex_unlock to ensure the order in which resources are rele

2020-04-29 Thread Gerd Hoffmann
  Hi,

> > The only way I see for this to happen is that the guest is preempted
> > between qxl_push_{cursor,command}_ring_release() and
> > qxl_release_fence_buffer_objects() calls.  The host can complete the qxl
> > command then, signal the guest, and the IRQ handler calls
> > qxl_release_free_list() before qxl_release_fence_buffer_objects() runs.
> 
> We think the same: qxl_release was freed by garbage collector before
> original thread had called qxl_release_fence_buffer_objects().

Ok, nice, I think we can consider the issue being analyzed then ;)

> > Looking through the code I think it should be safe to simply swap the
> > qxl_release_fence_buffer_objects() +
> > qxl_push_{cursor,command}_ring_release() calls to close that race
> > window.  Can you try that and see if it fixes the bug for you?
> 
> I'm going to prepare and test such patch but I have one question here:
> qxl_push_*_ring_release can be called with  interruptible=true and fail
> How to correctly handle this case? Is the hunk below correct from your POV?

Oh, right, the error code path will be quite different, checking ...

> --- a/drivers/gpu/drm/qxl/qxl_ioctl.c
> +++ b/drivers/gpu/drm/qxl/qxl_ioctl.c
> @@ -261,12 +261,8 @@ static int qxl_process_single_command(struct qxl_device *qdev,
> apply_surf_reloc(qdev, &reloc_info[i]);
> }
>  
> +   qxl_release_fence_buffer_objects(release);
> ret = qxl_push_command_ring_release(qdev, release, cmd->type, true);
> -   if (ret)
> -   qxl_release_backoff_reserve_list(release);   
> -   else
> -   qxl_release_fence_buffer_objects(release);
> -
>  out_free_bos:
>  out_free_release:
if (ret)
qxl_release_free(qdev, release);

[ code context added ]

qxl_release_free() checks whether a release is fenced and, if so, signals
the fence so that nothing waits for the signal forever.  So, yes,
I think qxl_release_free() should clean up the release properly in any
case and the patch chunk should be correct.

take care,
  Gerd
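
The ordering rule discussed in this thread (finish every access to the
release before handing it to the ring, because the other side may free it
at once) can be shown with a small single-threaded userspace sketch. This
is only a simulation under that assumption: struct fake_release,
struct fake_ring, ring_push() and submit() are hypothetical and have no
relation to the real qxl data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct fake_release {
	bool fenced;
};

struct fake_ring {
	bool saw_fenced;	/* state the consumer observed on reclaim */
	int reclaimed;		/* number of releases freed by the consumer */
};

/* Simulated hand-off: the consumer completes at once and frees rel. */
static void ring_push(struct fake_ring *ring, struct fake_release *rel)
{
	ring->saw_fenced = rel->fenced;
	ring->reclaimed++;
	free(rel);
}

/* Safe ordering: fence first, push last, never touch rel after the push. */
static void submit(struct fake_ring *ring)
{
	struct fake_release *rel = calloc(1, sizeof(*rel));

	assert(rel != NULL);
	rel->fenced = true;	/* last access by the submitter */
	ring_push(ring, rel);	/* ownership is gone from here on */
}
```

Swapping the two statements at the end of submit() would write to memory
that ring_push() has already freed, which corresponds to the race window
described above.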



Re: [PATCH 5/5] virtio: Add bounce DMA ops

2020-04-29 Thread Michael S. Tsirkin
On Wed, Apr 29, 2020 at 01:42:13PM +0800, Lu Baolu wrote:
> On 2020/4/29 12:57, Michael S. Tsirkin wrote:
> > On Wed, Apr 29, 2020 at 10:22:32AM +0800, Lu Baolu wrote:
> > > On 2020/4/29 4:41, Michael S. Tsirkin wrote:
> > > > On Tue, Apr 28, 2020 at 11:19:52PM +0530, Srivatsa Vaddagiri wrote:
> > > > > * Michael S. Tsirkin  [2020-04-28 12:17:57]:
> > > > > 
> > > > > > > Okay, but how is all this virtio specific?  For example, why not allow
> > > > > > separate swiotlbs for any type of device?
> > > > > > For example, this might make sense if a given device is from a
> > > > > > different, less trusted vendor.
> > > > > > Is swiotlb commonly used for multiple devices that may be on different trust
> > > > > boundaries (and not behind a hardware iommu)?
> > > > Even a hardware iommu does not imply a 100% security from malicious
> > > > hardware. First lots of people use iommu=pt for performance reasons.
> > > > Second even without pt, unmaps are often batched, and sub-page buffers
> > > > might be used for DMA, so we are not 100% protected at all times.
> > > > 
> > > 
> > > For untrusted devices, IOMMU is forced on even iommu=pt is used;
> > 
> > I think you are talking about untrusted *drivers* like with VFIO.
> 
> No. I am talking about untrusted devices like thunderbolt peripherals.
> We always trust drivers hosted in kernel and the DMA APIs are designed
> for them, right?
> 
> Please refer to this series.
> 
> https://lkml.org/lkml/2019/9/6/39
> 
> Best regards,
> baolu

Oh, thanks for that! I didn't realize Linux is doing this.

So it seems that with modern Linux, all one needs
to do on x86 is mark the device as untrusted.
It's already possible to do this with ACPI and with OF - would that be
sufficient for achieving what this patchset is trying to do?

Adding more ways to mark a device as untrusted, and adding
support for more platforms to use bounce buffers
sounds like a reasonable thing to do.

> > 
> > On the other hand, I am talking about things like thunderbolt
> > peripherals being less trusted than on-board ones.
> 
> 
> 
> > 
> > Or possibly even using swiotlb for specific use-cases where
> > speed is less of an issue.
> > 
> > E.g. my wifi is pretty slow anyway, and that card is exposed to
> > malicious actors all the time, put just that behind swiotlb
> > for security, and leave my graphics card with pt since
> > I'm trusting it with secrets anyway.
> > 
> > 
> > > and
> > > iotlb flush is in strict mode (no batched flushes); ATS is also not
> > > allowed. Swiotlb is used to protect sub-page buffers since IOMMU can
> > > only apply page granularity protection. Swiotlb is now used for devices
> > > from different trust zone.
> > > 
> > > Best regards,
> > > baolu
> > 
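
The remark that an IOMMU only protects at page granularity suggests a
simple test for when a buffer has to go through a bounce buffer: whenever
it does not cover whole pages. A minimal sketch of such a check follows,
assuming a 4096-byte page; needs_bounce() and SKETCH_PAGE_SIZE are names
made up for this example and are not taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the sketch; real systems may differ. */
#define SKETCH_PAGE_SIZE 4096u

/* True if [addr, addr + len) does not start and end on page boundaries. */
static bool needs_bounce(uintptr_t addr, size_t len)
{
	assert(len > 0);
	/* Low bits set in the start or the length mean a partial page. */
	return ((addr | len) & (SKETCH_PAGE_SIZE - 1)) != 0;
}
```

A page-aligned, page-sized buffer can be mapped directly, while a 512-byte
buffer inside a page would expose the rest of that page to the device and
is therefore a candidate for bouncing.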
