QEMU v5.2.0: object_new_with_type: assertion failed: (type != NULL)
Hi All,

I am using QEMU version v5.2.0 and getting the error below when running on arm64:

$ qemu-system-aarch64 --version
ERROR:../qom/object.c:711:object_new_with_type: assertion failed: (type != NULL)
Bail out! ERROR:../qom/object.c:711:object_new_with_type: assertion failed: (type != NULL)

Running GDB gives the information below. Is this a known issue?

(gdb) run
Starting program: /root/qemu-system-aarch64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0xf703ed90 (LWP 5212)]
** ERROR:../qom/object.c:711:object_new_with_type: assertion failed: (type != NULL)
Bail out! ERROR:../qom/object.c:711:object_new_with_type: assertion failed: (type != NULL)

Thread 1 "qemu-system-aar" received signal SIGABRT, Aborted.
0xf73e380c in raise () from /lib64/libc.so.6
(gdb) bt
#0  0xf73e380c in raise () from /lib64/libc.so.6
#1  0xf73d0bcc in abort () from /lib64/libc.so.6
#2  0xf7b7c058 in g_assertion_message () from /lib64/libglib-2.0.so.0
#3  0xf7b7c0b8 in g_assertion_message_expr () from /lib64/libglib-2.0.so.0
#4  0xab2b880c in object_new_with_type (type=<optimized out>) at ../qom/object.c:711
#5  0xab2b8870 in object_new (typename=typename@entry=0xab50a668 "container") at ../qom/object.c:744
#6  0xab2b9ec4 in object_get_root () at ../qom/object.c:1674
#7  0xab2a4c0c in get_chardevs_root () at ../chardev/char.c:50
#8  0xaaf9f2b4 in chardev_machine_done_hook (notifier=<optimized out>, unused=<optimized out>) at ../chardev/chardev-sysemu.c:45
#9  0xab3d4ae8 in module_call_init (type=type@entry=MODULE_INIT_QOM) at ../util/module.c:106
#10 0xab235000 in qemu_init (argc=1, argv=0xfcb8, envp=<optimized out>) at ../softmmu/vl.c:2916
#11 0xaad9e084 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:49
(gdb)

Thanks
-Bharat
RE: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe request
Hi Eric, > -Original Message- > From: Auger Eric > Sent: Tuesday, May 12, 2020 8:39 AM > To: Bharat Bhushan ; eric.auger@gmail.com; > qemu-devel@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > m...@redhat.com; jean-phili...@linaro.org; pet...@redhat.com; > arm...@redhat.com; pbonz...@redhat.com > Subject: Re: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe > request > > Hi Bharat, > On 5/12/20 5:03 AM, Bharat Bhushan wrote: > > Hi Eric, > > > >> -Original Message- > >> From: Auger Eric > >> Sent: Monday, May 11, 2020 2:19 PM > >> To: Bharat Bhushan ; eric.auger@gmail.com; > >> qemu-devel@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > >> m...@redhat.com; jean-phili...@linaro.org; pet...@redhat.com; > >> arm...@redhat.com; pbonz...@redhat.com > >> Subject: Re: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM > >> probe request > >> > >> Hi Bharat, > >> > >> On 5/11/20 10:42 AM, Bharat Bhushan wrote: > >>> Hi Eric, > >>> > >>>> -Original Message- > >>>> From: Auger Eric > >>>> Sent: Monday, May 11, 2020 12:26 PM > >>>> To: Bharat Bhushan ; > >>>> eric.auger@gmail.com; qemu-devel@nongnu.org; > >>>> qemu-...@nongnu.org; peter.mayd...@linaro.org; m...@redhat.com; > >>>> jean-phili...@linaro.org; pet...@redhat.com; arm...@redhat.com; > >>>> pbonz...@redhat.com > >>>> Subject: Re: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM > >>>> probe request > >>>> > >>>> Hi Bharat, > >>>> On 5/11/20 8:38 AM, Bharat Bhushan wrote: > >>>>> Hi Eric, > >>>>> > >>>>>> -Original Message- > >>>>>> From: Eric Auger > >>>>>> Sent: Friday, May 8, 2020 11:01 PM > >>>>>> To: eric.auger@gmail.com; eric.au...@redhat.com; > >>>>>> qemu-devel@nongnu.org; qemu-...@nongnu.org; > >>>>>> peter.mayd...@linaro.org; m...@redhat.com; jean- > >>>>>> phili...@linaro.org; Bharat Bhushan ; > >>>>>> pet...@redhat.com; arm...@redhat.com; pbonz...@redhat.com > >>>>>> Subject: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM > >>>>>> probe request > >>>>>> > 
>>>>>> External Email > >>>>>> > >>>>>> - > >>>>>> -- > >>>>>> -- > >>>>>> - This patch implements the PROBE request. At the moment, only > >>>>>> THE RESV_MEM property is handled. The first goal is to report > >>>>>> iommu wide reserved regions such as the MSI regions set by the > >>>>>> machine code. On > >>>>>> x86 this will be the IOAPIC MSI region, > >>>>>> [0xFEE0 - 0xFEEF], on ARM this may be the ITS doorbell. > >>>>>> > >>>>>> In the future we may introduce per device reserved regions. > >>>>>> This will be useful when protecting host assigned devices which > >>>>>> may expose their own reserved regions > >>>>>> > >>>>>> Signed-off-by: Eric Auger > >>>>>> > >>>>>> --- > >>>>>> > >>>>>> v1 -> v2: > >>>>>> - move the unlock back to the same place > >>>>>> - remove the push label and factorize the code after the out > >>>>>> label > >>>>>> - fix a bunch of cpu_to_leX according to the latest spec revision > >>>>>> - do not remove sizeof(last) from free space > >>>>>> - check the ep exists > >>>>>> --- > >>>>>> include/hw/virtio/virtio-iommu.h | 2 + > >>>>>> hw/virtio/virtio-iommu.c | 94 ++-- > >>>>>> hw/virtio/trace-events | 1 + > >>>>>> 3 files changed, 93 insertions(+), 4 deletions(-) > >>>>>> > >>>>>> diff --git a/include/hw/virtio/virtio-iommu.h > >>>>>> b/include/hw/virtio/virtio-iommu.h > >>>>>> index e653004d7c..49eb105cd8 100644 > >>>>>> --- a/include/hw/
RE: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe request
Hi Eric, > -Original Message- > From: Auger Eric > Sent: Monday, May 11, 2020 2:19 PM > To: Bharat Bhushan ; eric.auger@gmail.com; > qemu-devel@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > m...@redhat.com; jean-phili...@linaro.org; pet...@redhat.com; > arm...@redhat.com; pbonz...@redhat.com > Subject: Re: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe > request > > Hi Bharat, > > On 5/11/20 10:42 AM, Bharat Bhushan wrote: > > Hi Eric, > > > >> -Original Message- > >> From: Auger Eric > >> Sent: Monday, May 11, 2020 12:26 PM > >> To: Bharat Bhushan ; eric.auger@gmail.com; > >> qemu-devel@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > >> m...@redhat.com; jean-phili...@linaro.org; pet...@redhat.com; > >> arm...@redhat.com; pbonz...@redhat.com > >> Subject: Re: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM > >> probe request > >> > >> Hi Bharat, > >> On 5/11/20 8:38 AM, Bharat Bhushan wrote: > >>> Hi Eric, > >>> > >>>> -Original Message- > >>>> From: Eric Auger > >>>> Sent: Friday, May 8, 2020 11:01 PM > >>>> To: eric.auger@gmail.com; eric.au...@redhat.com; > >>>> qemu-devel@nongnu.org; qemu-...@nongnu.org; > >>>> peter.mayd...@linaro.org; m...@redhat.com; jean- > >>>> phili...@linaro.org; Bharat Bhushan ; > >>>> pet...@redhat.com; arm...@redhat.com; pbonz...@redhat.com > >>>> Subject: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM > >>>> probe request > >>>> > >>>> External Email > >>>> > >>>> --- > >>>> -- > >>>> - This patch implements the PROBE request. At the moment, only THE > >>>> RESV_MEM property is handled. The first goal is to report iommu > >>>> wide reserved regions such as the MSI regions set by the machine > >>>> code. On > >>>> x86 this will be the IOAPIC MSI region, > >>>> [0xFEE0 - 0xFEEF], on ARM this may be the ITS doorbell. > >>>> > >>>> In the future we may introduce per device reserved regions. 
> >>>> This will be useful when protecting host assigned devices which may > >>>> expose their own reserved regions > >>>> > >>>> Signed-off-by: Eric Auger > >>>> > >>>> --- > >>>> > >>>> v1 -> v2: > >>>> - move the unlock back to the same place > >>>> - remove the push label and factorize the code after the out label > >>>> - fix a bunch of cpu_to_leX according to the latest spec revision > >>>> - do not remove sizeof(last) from free space > >>>> - check the ep exists > >>>> --- > >>>> include/hw/virtio/virtio-iommu.h | 2 + > >>>> hw/virtio/virtio-iommu.c | 94 ++-- > >>>> hw/virtio/trace-events | 1 + > >>>> 3 files changed, 93 insertions(+), 4 deletions(-) > >>>> > >>>> diff --git a/include/hw/virtio/virtio-iommu.h > >>>> b/include/hw/virtio/virtio-iommu.h > >>>> index e653004d7c..49eb105cd8 100644 > >>>> --- a/include/hw/virtio/virtio-iommu.h > >>>> +++ b/include/hw/virtio/virtio-iommu.h > >>>> @@ -53,6 +53,8 @@ typedef struct VirtIOIOMMU { > >>>> GHashTable *as_by_busptr; > >>>> IOMMUPciBus *iommu_pcibus_by_bus_num[PCI_BUS_MAX]; > >>>> PCIBus *primary_bus; > >>>> +ReservedRegion *reserved_regions; > >>>> +uint32_t nb_reserved_regions; > >>>> GTree *domains; > >>>> QemuMutex mutex; > >>>> GTree *endpoints; > >>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > >>>> index > >>>> 22ba8848c2..35d772e021 100644 > >>>> --- a/hw/virtio/virtio-iommu.c > >>>> +++ b/hw/virtio/virtio-iommu.c > >>>> @@ -38,6 +38,7 @@ > >>>> > >>>> /* Max size */ > >>>> #define VIOMMU_DEFAULT_QUEUE_SIZE 256 > >>>> +#define VIOMMU_PROBE_SIZE 512 > >>>> > >>>> typedef struct VirtIOIOMMUDomain { > >>>> uint32_t id; > >>>> @@
RE: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe request
Hi Eric, > -Original Message- > From: Auger Eric > Sent: Monday, May 11, 2020 12:26 PM > To: Bharat Bhushan ; eric.auger@gmail.com; > qemu-devel@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > m...@redhat.com; jean-phili...@linaro.org; pet...@redhat.com; > arm...@redhat.com; pbonz...@redhat.com > Subject: Re: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe > request > > Hi Bharat, > On 5/11/20 8:38 AM, Bharat Bhushan wrote: > > Hi Eric, > > > >> -Original Message- > >> From: Eric Auger > >> Sent: Friday, May 8, 2020 11:01 PM > >> To: eric.auger@gmail.com; eric.au...@redhat.com; > >> qemu-devel@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > >> m...@redhat.com; jean- phili...@linaro.org; Bharat Bhushan > >> ; pet...@redhat.com; arm...@redhat.com; > >> pbonz...@redhat.com > >> Subject: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe > >> request > >> > >> External Email > >> > >> - > >> - This patch implements the PROBE request. At the moment, only THE > >> RESV_MEM property is handled. The first goal is to report iommu wide > >> reserved regions such as the MSI regions set by the machine code. On > >> x86 this will be the IOAPIC MSI region, > >> [0xFEE0 - 0xFEEF], on ARM this may be the ITS doorbell. > >> > >> In the future we may introduce per device reserved regions. 
> >> This will be useful when protecting host assigned devices which may > >> expose their own reserved regions > >> > >> Signed-off-by: Eric Auger > >> > >> --- > >> > >> v1 -> v2: > >> - move the unlock back to the same place > >> - remove the push label and factorize the code after the out label > >> - fix a bunch of cpu_to_leX according to the latest spec revision > >> - do not remove sizeof(last) from free space > >> - check the ep exists > >> --- > >> include/hw/virtio/virtio-iommu.h | 2 + > >> hw/virtio/virtio-iommu.c | 94 ++-- > >> hw/virtio/trace-events | 1 + > >> 3 files changed, 93 insertions(+), 4 deletions(-) > >> > >> diff --git a/include/hw/virtio/virtio-iommu.h > >> b/include/hw/virtio/virtio-iommu.h > >> index e653004d7c..49eb105cd8 100644 > >> --- a/include/hw/virtio/virtio-iommu.h > >> +++ b/include/hw/virtio/virtio-iommu.h > >> @@ -53,6 +53,8 @@ typedef struct VirtIOIOMMU { > >> GHashTable *as_by_busptr; > >> IOMMUPciBus *iommu_pcibus_by_bus_num[PCI_BUS_MAX]; > >> PCIBus *primary_bus; > >> +ReservedRegion *reserved_regions; > >> +uint32_t nb_reserved_regions; > >> GTree *domains; > >> QemuMutex mutex; > >> GTree *endpoints; > >> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > >> index > >> 22ba8848c2..35d772e021 100644 > >> --- a/hw/virtio/virtio-iommu.c > >> +++ b/hw/virtio/virtio-iommu.c > >> @@ -38,6 +38,7 @@ > >> > >> /* Max size */ > >> #define VIOMMU_DEFAULT_QUEUE_SIZE 256 > >> +#define VIOMMU_PROBE_SIZE 512 > >> > >> typedef struct VirtIOIOMMUDomain { > >> uint32_t id; > >> @@ -378,6 +379,65 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s, > >> return ret; > >> } > >> > >> +static ssize_t virtio_iommu_fill_resv_mem_prop(VirtIOIOMMU *s, uint32_t > >> ep, > >> + uint8_t *buf, size_t > >> +free) { > >> +struct virtio_iommu_probe_resv_mem prop = {}; > >> +size_t size = sizeof(prop), length = size - sizeof(prop.head), total; > >> +int i; > >> + > >> +total = size * s->nb_reserved_regions; > >> + > >> +if (total > free) { > 
>> +return -ENOSPC; > >> +} > >> + > >> +for (i = 0; i < s->nb_reserved_regions; i++) { > >> +prop.head.type = cpu_to_le16(VIRTIO_IOMMU_PROBE_T_RESV_MEM); > >> +prop.head.length = cpu_to_le16(length); > >> +prop.subtype = s->reserved_regions[i].type; > >> +prop.start = cpu_to_le64(s->reserved_regions[i].low); > >> +prop.end = cpu_to_le64(s->reserved_regions[i].high); > >> + > >> +memcpy(buf, , size); > &g
RE: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe request
Hi Eric, > -Original Message- > From: Eric Auger > Sent: Friday, May 8, 2020 11:01 PM > To: eric.auger@gmail.com; eric.au...@redhat.com; qemu-devel@nongnu.org; > qemu-...@nongnu.org; peter.mayd...@linaro.org; m...@redhat.com; jean- > phili...@linaro.org; Bharat Bhushan ; > pet...@redhat.com; arm...@redhat.com; pbonz...@redhat.com > Subject: [EXT] [PATCH v2 2/5] virtio-iommu: Implement RESV_MEM probe request > > External Email > > -- > This patch implements the PROBE request. At the moment, only THE RESV_MEM > property is handled. The first goal is to report iommu wide reserved regions > such as > the MSI regions set by the machine code. On x86 this will be the IOAPIC MSI > region, > [0xFEE0 - 0xFEEF], on ARM this may be the ITS doorbell. > > In the future we may introduce per device reserved regions. > This will be useful when protecting host assigned devices which may expose > their > own reserved regions > > Signed-off-by: Eric Auger > > --- > > v1 -> v2: > - move the unlock back to the same place > - remove the push label and factorize the code after the out label > - fix a bunch of cpu_to_leX according to the latest spec revision > - do not remove sizeof(last) from free space > - check the ep exists > --- > include/hw/virtio/virtio-iommu.h | 2 + > hw/virtio/virtio-iommu.c | 94 ++-- > hw/virtio/trace-events | 1 + > 3 files changed, 93 insertions(+), 4 deletions(-) > > diff --git a/include/hw/virtio/virtio-iommu.h > b/include/hw/virtio/virtio-iommu.h > index e653004d7c..49eb105cd8 100644 > --- a/include/hw/virtio/virtio-iommu.h > +++ b/include/hw/virtio/virtio-iommu.h > @@ -53,6 +53,8 @@ typedef struct VirtIOIOMMU { > GHashTable *as_by_busptr; > IOMMUPciBus *iommu_pcibus_by_bus_num[PCI_BUS_MAX]; > PCIBus *primary_bus; > +ReservedRegion *reserved_regions; > +uint32_t nb_reserved_regions; > GTree *domains; > QemuMutex mutex; > GTree *endpoints; > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index > 22ba8848c2..35d772e021 100644 > --- 
a/hw/virtio/virtio-iommu.c > +++ b/hw/virtio/virtio-iommu.c > @@ -38,6 +38,7 @@ > > /* Max size */ > #define VIOMMU_DEFAULT_QUEUE_SIZE 256 > +#define VIOMMU_PROBE_SIZE 512 > > typedef struct VirtIOIOMMUDomain { > uint32_t id; > @@ -378,6 +379,65 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s, > return ret; > } > > +static ssize_t virtio_iommu_fill_resv_mem_prop(VirtIOIOMMU *s, uint32_t ep, > + uint8_t *buf, size_t > +free) { > +struct virtio_iommu_probe_resv_mem prop = {}; > +size_t size = sizeof(prop), length = size - sizeof(prop.head), total; > +int i; > + > +total = size * s->nb_reserved_regions; > + > +if (total > free) { > +return -ENOSPC; > +} > + > +for (i = 0; i < s->nb_reserved_regions; i++) { > +prop.head.type = cpu_to_le16(VIRTIO_IOMMU_PROBE_T_RESV_MEM); > +prop.head.length = cpu_to_le16(length); > +prop.subtype = s->reserved_regions[i].type; > +prop.start = cpu_to_le64(s->reserved_regions[i].low); > +prop.end = cpu_to_le64(s->reserved_regions[i].high); > + > +memcpy(buf, , size); > + > +trace_virtio_iommu_fill_resv_property(ep, prop.subtype, > + prop.start, prop.end); > +buf += size; > +} > +return total; > +} > + > +/** > + * virtio_iommu_probe - Fill the probe request buffer with > + * the properties the device is able to return and add a NONE > + * property at the end. > + */ > +static int virtio_iommu_probe(VirtIOIOMMU *s, > + struct virtio_iommu_req_probe *req, > + uint8_t *buf) { > +uint32_t ep_id = le32_to_cpu(req->endpoint); > +size_t free = VIOMMU_PROBE_SIZE; > +ssize_t count; > + > +if (!virtio_iommu_mr(s, ep_id)) { > +return VIRTIO_IOMMU_S_NOENT; > +} > + > +count = virtio_iommu_fill_resv_mem_prop(s, ep_id, buf, free); > +if (count < 0) { > +return VIRTIO_IOMMU_S_INVAL; > +} > +buf += count; > +free -= count; > + > +/* Fill the rest with zeroes */ > +memset(buf, 0, free); No need to fill with zero here as "buf" is set to zero on allocation, no? 
Thanks
-Bharat

> +
> +    return VIRTIO_IOMMU_S_OK;
> +}
> +
> static int virtio_iommu_iov_to_req(struct iovec *iov,
Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
Hi Eric, On Tue, May 5, 2020 at 3:16 PM Bharat Bhushan wrote: > > hi Eric, > > On Tue, May 5, 2020 at 3:00 PM Auger Eric wrote: > > > > Hi Bharat, > > > > On 5/5/20 11:25 AM, Bharat Bhushan wrote: > > > Hi Eric, > > > > > > On Fri, Apr 24, 2020 at 7:47 PM Auger Eric wrote: > > >> > > >> Hi Bharat, > > >> > > >> On 4/2/20 11:01 AM, Bharat Bhushan wrote: > > >>> Hi Eric/Alex, > > >>> > > >>>> -Original Message- > > >>>> From: Alex Williamson > > >>>> Sent: Thursday, March 26, 2020 11:23 PM > > >>>> To: Auger Eric > > >>>> Cc: Bharat Bhushan ; peter.mayd...@linaro.org; > > >>>> pet...@redhat.com; eric.auger@gmail.com; kevin.t...@intel.com; > > >>>> m...@redhat.com; Tomasz Nowicki [C] ; > > >>>> drjo...@redhat.com; linuc.dec...@gmail.com; qemu-devel@nongnu.org; > > >>>> qemu- > > >>>> a...@nongnu.org; bharatb.li...@gmail.com; jean-phili...@linaro.org; > > >>>> yang.zh...@intel.com; David Gibson > > >>>> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print > > >>>> on mmio > > >>>> region translation by viommu > > >>>> > > >>>> External Email > > >>>> > > >>>> -- > > >>>> On Thu, 26 Mar 2020 18:35:48 +0100 > > >>>> Auger Eric wrote: > > >>>> > > >>>>> Hi Alex, > > >>>>> > > >>>>> On 3/24/20 12:08 AM, Alex Williamson wrote: > > >>>>>> [Cc +dwg who originated this warning] > > >>>>>> > > >>>>>> On Mon, 23 Mar 2020 14:16:09 +0530 > > >>>>>> Bharat Bhushan wrote: > > >>>>>> > > >>>>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU. > > >>>>>>> As such address_space_translate() returns the MSI controller MMIO > > >>>>>>> region and we get an "iommu map to non memory area" > > >>>>>>> message. Let's remove this latter. 
> > >>>>>>> > > >>>>>>> Signed-off-by: Eric Auger > > >>>>>>> Signed-off-by: Bharat Bhushan > > >>>>>>> --- > > >>>>>>> hw/vfio/common.c | 2 -- > > >>>>>>> 1 file changed, 2 deletions(-) > > >>>>>>> > > >>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index > > >>>>>>> 5ca11488d6..c586edf47a 100644 > > >>>>>>> --- a/hw/vfio/common.c > > >>>>>>> +++ b/hw/vfio/common.c > > >>>>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, > > >>>> void **vaddr, > > >>>>>>> , , writable, > > >>>>>>> MEMTXATTRS_UNSPECIFIED); > > >>>>>>> if (!memory_region_is_ram(mr)) { > > >>>>>>> -error_report("iommu map to non memory area %"HWADDR_PRIx"", > > >>>>>>> - xlat); > > >>>>>>> return false; > > >>>>>>> } > > >>>>>>> > > >>>>>> > > >>>>>> I'm a bit confused here, I think we need more justification beyond > > >>>>>> "we hit this warning and we don't want to because it's ok in this > > >>>>>> one special case, therefore remove it". I assume the special case > > >>>>>> is that the device MSI address is managed via the SET_IRQS ioctl and > > >>>>>> therefore we won't actually get DMAs to this range. > > >>>>> Yes exactly. The guest creates a mapping between one giova and this > > >>>>> gpa (corresponding to the MSI controller doorbell) because MSIs are > > >>>>> mapped on ARM. But practically the physical device is programmed with > > >>>>> an host chosen iova that maps onto the physical MSI controller's > > >>>>> doorb
Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
hi Eric, On Tue, May 5, 2020 at 3:00 PM Auger Eric wrote: > > Hi Bharat, > > On 5/5/20 11:25 AM, Bharat Bhushan wrote: > > Hi Eric, > > > > On Fri, Apr 24, 2020 at 7:47 PM Auger Eric wrote: > >> > >> Hi Bharat, > >> > >> On 4/2/20 11:01 AM, Bharat Bhushan wrote: > >>> Hi Eric/Alex, > >>> > >>>> -Original Message- > >>>> From: Alex Williamson > >>>> Sent: Thursday, March 26, 2020 11:23 PM > >>>> To: Auger Eric > >>>> Cc: Bharat Bhushan ; peter.mayd...@linaro.org; > >>>> pet...@redhat.com; eric.auger@gmail.com; kevin.t...@intel.com; > >>>> m...@redhat.com; Tomasz Nowicki [C] ; > >>>> drjo...@redhat.com; linuc.dec...@gmail.com; qemu-devel@nongnu.org; qemu- > >>>> a...@nongnu.org; bharatb.li...@gmail.com; jean-phili...@linaro.org; > >>>> yang.zh...@intel.com; David Gibson > >>>> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on > >>>> mmio > >>>> region translation by viommu > >>>> > >>>> External Email > >>>> > >>>> -- > >>>> On Thu, 26 Mar 2020 18:35:48 +0100 > >>>> Auger Eric wrote: > >>>> > >>>>> Hi Alex, > >>>>> > >>>>> On 3/24/20 12:08 AM, Alex Williamson wrote: > >>>>>> [Cc +dwg who originated this warning] > >>>>>> > >>>>>> On Mon, 23 Mar 2020 14:16:09 +0530 > >>>>>> Bharat Bhushan wrote: > >>>>>> > >>>>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU. > >>>>>>> As such address_space_translate() returns the MSI controller MMIO > >>>>>>> region and we get an "iommu map to non memory area" > >>>>>>> message. Let's remove this latter. 
> >>>>>>> > >>>>>>> Signed-off-by: Eric Auger > >>>>>>> Signed-off-by: Bharat Bhushan > >>>>>>> --- > >>>>>>> hw/vfio/common.c | 2 -- > >>>>>>> 1 file changed, 2 deletions(-) > >>>>>>> > >>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index > >>>>>>> 5ca11488d6..c586edf47a 100644 > >>>>>>> --- a/hw/vfio/common.c > >>>>>>> +++ b/hw/vfio/common.c > >>>>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, > >>>> void **vaddr, > >>>>>>> , , writable, > >>>>>>> MEMTXATTRS_UNSPECIFIED); > >>>>>>> if (!memory_region_is_ram(mr)) { > >>>>>>> -error_report("iommu map to non memory area %"HWADDR_PRIx"", > >>>>>>> - xlat); > >>>>>>> return false; > >>>>>>> } > >>>>>>> > >>>>>> > >>>>>> I'm a bit confused here, I think we need more justification beyond > >>>>>> "we hit this warning and we don't want to because it's ok in this > >>>>>> one special case, therefore remove it". I assume the special case > >>>>>> is that the device MSI address is managed via the SET_IRQS ioctl and > >>>>>> therefore we won't actually get DMAs to this range. > >>>>> Yes exactly. The guest creates a mapping between one giova and this > >>>>> gpa (corresponding to the MSI controller doorbell) because MSIs are > >>>>> mapped on ARM. But practically the physical device is programmed with > >>>>> an host chosen iova that maps onto the physical MSI controller's > >>>>> doorbell. so the device never performs DMA accesses to this range. > >>>>> > >>>>> But I imagine the case that > >>>>>> was in mind when adding this warning was general peer-to-peer > >>>>>> between and assigned and emulated device. > >>>>> yes makes sense. > >>>>> > >>>>> Maybe there's an argument to be made > >>>>>> that such a p2p mapping might
Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
Hi Eric, On Fri, Apr 24, 2020 at 7:47 PM Auger Eric wrote: > > Hi Bharat, > > On 4/2/20 11:01 AM, Bharat Bhushan wrote: > > Hi Eric/Alex, > > > >> -Original Message- > >> From: Alex Williamson > >> Sent: Thursday, March 26, 2020 11:23 PM > >> To: Auger Eric > >> Cc: Bharat Bhushan ; peter.mayd...@linaro.org; > >> pet...@redhat.com; eric.auger@gmail.com; kevin.t...@intel.com; > >> m...@redhat.com; Tomasz Nowicki [C] ; > >> drjo...@redhat.com; linuc.dec...@gmail.com; qemu-devel@nongnu.org; qemu- > >> a...@nongnu.org; bharatb.li...@gmail.com; jean-phili...@linaro.org; > >> yang.zh...@intel.com; David Gibson > >> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on > >> mmio > >> region translation by viommu > >> > >> External Email > >> > >> -- > >> On Thu, 26 Mar 2020 18:35:48 +0100 > >> Auger Eric wrote: > >> > >>> Hi Alex, > >>> > >>> On 3/24/20 12:08 AM, Alex Williamson wrote: > >>>> [Cc +dwg who originated this warning] > >>>> > >>>> On Mon, 23 Mar 2020 14:16:09 +0530 > >>>> Bharat Bhushan wrote: > >>>> > >>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU. > >>>>> As such address_space_translate() returns the MSI controller MMIO > >>>>> region and we get an "iommu map to non memory area" > >>>>> message. Let's remove this latter. 
> >>>>> > >>>>> Signed-off-by: Eric Auger > >>>>> Signed-off-by: Bharat Bhushan > >>>>> --- > >>>>> hw/vfio/common.c | 2 -- > >>>>> 1 file changed, 2 deletions(-) > >>>>> > >>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index > >>>>> 5ca11488d6..c586edf47a 100644 > >>>>> --- a/hw/vfio/common.c > >>>>> +++ b/hw/vfio/common.c > >>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, > >> void **vaddr, > >>>>> , , writable, > >>>>> MEMTXATTRS_UNSPECIFIED); > >>>>> if (!memory_region_is_ram(mr)) { > >>>>> -error_report("iommu map to non memory area %"HWADDR_PRIx"", > >>>>> - xlat); > >>>>> return false; > >>>>> } > >>>>> > >>>> > >>>> I'm a bit confused here, I think we need more justification beyond > >>>> "we hit this warning and we don't want to because it's ok in this > >>>> one special case, therefore remove it". I assume the special case > >>>> is that the device MSI address is managed via the SET_IRQS ioctl and > >>>> therefore we won't actually get DMAs to this range. > >>> Yes exactly. The guest creates a mapping between one giova and this > >>> gpa (corresponding to the MSI controller doorbell) because MSIs are > >>> mapped on ARM. But practically the physical device is programmed with > >>> an host chosen iova that maps onto the physical MSI controller's > >>> doorbell. so the device never performs DMA accesses to this range. > >>> > >>> But I imagine the case that > >>>> was in mind when adding this warning was general peer-to-peer > >>>> between and assigned and emulated device. > >>> yes makes sense. > >>> > >>> Maybe there's an argument to be made > >>>> that such a p2p mapping might also be used in a non-vIOMMU case. We > >>>> skip creating those mappings and drivers continue to work, maybe > >>>> because nobody attempts to do p2p DMA with the types of devices we > >>>> emulate, maybe because p2p DMA is not absolutely reliable on bare > >>>> metal and drivers test it before using it. 
> >>> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
> >>> iommu_dma_get_msi_page).
> >>> One idea could be to pass that flag through the IOMMU Notifier
> >>> mechanism into the iotlb->perm. Eventually when we get this in
> >>> vfio_get_vaddr() we would not print the warning. Could that make sense?
> >>
> >> Yeah, if we can identify a valid case that doesn't need a warning, that's
> >> fine by me.
> >> Thanks,
> >
> > Let me know if I understood the proposal correctly:
> >
> > virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with
> > VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
> > In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO)
> > in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check
> > the same flag and will not print the warning.
> >
> > Is the above correct?
>
> Yes that's what I had in mind.

In that case the virtio-iommu driver in the guest should not make the map
(VIRTIO_IOMMU_T_MAP) call, as it knows of nothing to be mapped.

Stay Safe

Thanks
-Bharat

> > Thanks
> > Eric
> >
> > Thanks
> > -Bharat
> >
> >> Alex
Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
On Fri, Apr 24, 2020 at 7:22 PM Auger Eric wrote: > > Hi Bharat, > On 4/23/20 6:09 PM, Jean-Philippe Brucker wrote: > > Hi Bharat, > > > > A few more things found while rebasing > > > > On Mon, Mar 23, 2020 at 02:16:16PM +0530, Bharat Bhushan wrote: > >> This patch implements the PROBE request. Currently supported > >> page size mask per endpoint is returned. Also append a NONE > >> property in the end. > >> > >> Signed-off-by: Bharat Bhushan > >> Signed-off-by: Eric Auger > >> --- > >> include/standard-headers/linux/virtio_iommu.h | 6 + > >> hw/virtio/virtio-iommu.c | 161 +- > >> hw/virtio/trace-events| 2 + > >> 3 files changed, 166 insertions(+), 3 deletions(-) > >> > >> diff --git a/include/standard-headers/linux/virtio_iommu.h > >> b/include/standard-headers/linux/virtio_iommu.h > >> index b9443b83a1..8a0d47b907 100644 > >> --- a/include/standard-headers/linux/virtio_iommu.h > >> +++ b/include/standard-headers/linux/virtio_iommu.h > >> @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap { > >> > >> #define VIRTIO_IOMMU_PROBE_T_NONE 0 > >> #define VIRTIO_IOMMU_PROBE_T_RESV_MEM 1 > >> +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK 2 > >> > >> #define VIRTIO_IOMMU_PROBE_T_MASK 0xfff > >> > >> @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem { > >> uint64_tend; > >> }; > >> > >> +struct virtio_iommu_probe_pgsize_mask { > >> +struct virtio_iommu_probe_property head; > >> +uint64_tpgsize_bitmap; > >> +}; > >> + > >> struct virtio_iommu_req_probe { > >> struct virtio_iommu_req_headhead; > >> uint32_tendpoint; > >> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > >> index 747e3cf1da..63fbacdcdc 100644 > >> --- a/hw/virtio/virtio-iommu.c > >> +++ b/hw/virtio/virtio-iommu.c > >> @@ -38,6 +38,10 @@ > >> > >> /* Max size */ > >> #define VIOMMU_DEFAULT_QUEUE_SIZE 256 > >> +#define VIOMMU_PROBE_SIZE 512 > >> + > >> +#define SUPPORTED_PROBE_PROPERTIES (\ > >> +1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK) > >> > >> typedef struct VirtIOIOMMUDomain { > >> 
uint32_t id; > >> @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping { > >> uint32_t flags; > >> } VirtIOIOMMUMapping; > >> > >> +typedef struct VirtIOIOMMUPropBuffer { > >> +VirtIOIOMMUEndpoint *endpoint; > >> +size_t filled; > >> +uint8_t *start; > >> +bool error; > > > > It doesn't seem like bufstate->error gets used anywhere > maybe rebase your work on > [PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request > which tests it. This was the staring point for me, As of now i moved away from "error" from above struct. > > Also in > [Qemu-devel] [PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request > I changed the implementation to keep it simpler. > > Thanks > > Eric > > > >> +} VirtIOIOMMUPropBuffer; > >> + > >> static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev) > >> { > >> return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn); > >> @@ -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s, > >> return ret; > >> } > >> > >> +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer *bufstate) > >> +{ > >> +struct virtio_iommu_probe_property *prop; > >> + > >> +prop = (struct virtio_iommu_probe_property *) > >> +(bufstate->start + bufstate->filled); > >> +prop->type = 0; > >> +prop->length = 0; > >> +bufstate->filled += sizeof(*prop); > >> +trace_virtio_iommu_fill_none_property(bufstate->endpoint->id); > >> +return 0; > >> +} > >> + > >> +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer > >> *bufstate) > >> +{ > >> +struct virtio_iommu_probe_pgsize_mask *page_size_mask; > >> +size_t prop_size = sizeof(*page_size_mask); > >> +VirtIOIOMMUEndpoint *ep = bufstate->endpoint; > >> +VirtIOIOMM
RE: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
Hi Eric/Alex, > -Original Message- > From: Alex Williamson > Sent: Thursday, March 26, 2020 11:23 PM > To: Auger Eric > Cc: Bharat Bhushan ; peter.mayd...@linaro.org; > pet...@redhat.com; eric.auger@gmail.com; kevin.t...@intel.com; > m...@redhat.com; Tomasz Nowicki [C] ; > drjo...@redhat.com; linuc.dec...@gmail.com; qemu-devel@nongnu.org; qemu- > a...@nongnu.org; bharatb.li...@gmail.com; jean-phili...@linaro.org; > yang.zh...@intel.com; David Gibson > Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio > region translation by viommu > > External Email > > -- > On Thu, 26 Mar 2020 18:35:48 +0100 > Auger Eric wrote: > > > Hi Alex, > > > > On 3/24/20 12:08 AM, Alex Williamson wrote: > > > [Cc +dwg who originated this warning] > > > > > > On Mon, 23 Mar 2020 14:16:09 +0530 > > > Bharat Bhushan wrote: > > > > > >> On ARM, the MSI doorbell is translated by the virtual IOMMU. > > >> As such address_space_translate() returns the MSI controller MMIO > > >> region and we get an "iommu map to non memory area" > > >> message. Let's remove this latter. > > >> > > >> Signed-off-by: Eric Auger > > >> Signed-off-by: Bharat Bhushan > > >> --- > > >> hw/vfio/common.c | 2 -- > > >> 1 file changed, 2 deletions(-) > > >> > > >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index > > >> 5ca11488d6..c586edf47a 100644 > > >> --- a/hw/vfio/common.c > > >> +++ b/hw/vfio/common.c > > >> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, > void **vaddr, > > >> , , writable, > > >> MEMTXATTRS_UNSPECIFIED); > > >> if (!memory_region_is_ram(mr)) { > > >> -error_report("iommu map to non memory area %"HWADDR_PRIx"", > > >> - xlat); > > >> return false; > > >> } > > >> > > > > > > I'm a bit confused here, I think we need more justification beyond > > > "we hit this warning and we don't want to because it's ok in this > > > one special case, therefore remove it". 
I assume the special case > > > is that the device MSI address is managed via the SET_IRQS ioctl and > > > therefore we won't actually get DMAs to this range. > > Yes exactly. The guest creates a mapping between one giova and this > > gpa (corresponding to the MSI controller doorbell) because MSIs are > > mapped on ARM. But practically the physical device is programmed with > > an host chosen iova that maps onto the physical MSI controller's > > doorbell. so the device never performs DMA accesses to this range. > > > > But I imagine the case that > > > was in mind when adding this warning was general peer-to-peer > > > between and assigned and emulated device. > > yes makes sense. > > > > Maybe there's an argument to be made > > > that such a p2p mapping might also be used in a non-vIOMMU case. We > > > skip creating those mappings and drivers continue to work, maybe > > > because nobody attempts to do p2p DMA with the types of devices we > > > emulate, maybe because p2p DMA is not absolutely reliable on bare > > > metal and drivers test it before using it. > > MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c > > iommu_dma_get_msi_page). > > One idea could be to pass that flag through the IOMMU Notifier > > mechanism into the iotlb->perm. Eventually when we get this in > > vfio_get_vaddr() we would not print the warning. Could that make sense? > > Yeah, if we can identify a valid case that doesn't need a warning, that's > fine by me. > Thanks, Let me know if I understood the proposal correctly: virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping. In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning. Is above correct? Thanks -Bharat > > Alex
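The proposal discussed above (the guest maps the MSI doorbell with VIRTIO_IOMMU_MAP_F_MMIO, QEMU propagates a corresponding flag into iotlb->perm, and vfio_get_vaddr() suppresses the warning when it is set) can be sketched with a small self-contained model. The IOMMU_MMIO permission bit and the helper below are illustrative names for the idea being discussed, not the actual QEMU API:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative permission bits modeled on IOMMUAccessFlags; IOMMU_MMIO is
 * the hypothetical extra flag discussed in the thread, not an upstream name. */
enum {
    IOMMU_NONE = 0,
    IOMMU_RO   = 1 << 0,
    IOMMU_WO   = 1 << 1,
    IOMMU_RW   = IOMMU_RO | IOMMU_WO,
    IOMMU_MMIO = 1 << 2,  /* mapping targets device MMIO (e.g. MSI doorbell) */
};

/* Mirror of the vfio_get_vaddr() decision: a translation that does not hit
 * RAM cannot be handed to the host IOMMU, but we only warn when the guest
 * did not explicitly mark the mapping as MMIO. */
static bool vfio_get_vaddr_sketch(int perm, bool target_is_ram, bool *warned)
{
    *warned = false;
    if (!target_is_ram) {
        if (!(perm & IOMMU_MMIO)) {
            *warned = true;   /* "iommu map to non memory area" */
        }
        return false;         /* skip the DMA mapping either way */
    }
    return true;
}
```

Either way the non-RAM mapping is skipped; the flag only decides whether the diagnostic is emitted.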
RE: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
Hi Alex, Eric, > -Original Message- > From: Alex Williamson > Sent: Thursday, March 26, 2020 11:23 PM > To: Auger Eric > Cc: Bharat Bhushan ; peter.mayd...@linaro.org; > pet...@redhat.com; eric.auger@gmail.com; kevin.t...@intel.com; > m...@redhat.com; Tomasz Nowicki [C] ; > drjo...@redhat.com; linuc.dec...@gmail.com; qemu-devel@nongnu.org; qemu- > a...@nongnu.org; bharatb.li...@gmail.com; jean-phili...@linaro.org; > yang.zh...@intel.com; David Gibson > Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio > region translation by viommu > > External Email > > -- > On Thu, 26 Mar 2020 18:35:48 +0100 > Auger Eric wrote: > > > Hi Alex, > > > > On 3/24/20 12:08 AM, Alex Williamson wrote: > > > [Cc +dwg who originated this warning] > > > > > > On Mon, 23 Mar 2020 14:16:09 +0530 > > > Bharat Bhushan wrote: > > > > > >> On ARM, the MSI doorbell is translated by the virtual IOMMU. > > >> As such address_space_translate() returns the MSI controller MMIO > > >> region and we get an "iommu map to non memory area" > > >> message. Let's remove this latter. > > >> > > >> Signed-off-by: Eric Auger > > >> Signed-off-by: Bharat Bhushan > > >> --- > > >> hw/vfio/common.c | 2 -- > > >> 1 file changed, 2 deletions(-) > > >> > > >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index > > >> 5ca11488d6..c586edf47a 100644 > > >> --- a/hw/vfio/common.c > > >> +++ b/hw/vfio/common.c > > >> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, > void **vaddr, > > >> , , writable, > > >> MEMTXATTRS_UNSPECIFIED); > > >> if (!memory_region_is_ram(mr)) { > > >> -error_report("iommu map to non memory area %"HWADDR_PRIx"", > > >> - xlat); > > >> return false; > > >> } > > >> > > > > > > I'm a bit confused here, I think we need more justification beyond > > > "we hit this warning and we don't want to because it's ok in this > > > one special case, therefore remove it". 
I assume the special case > > > is that the device MSI address is managed via the SET_IRQS ioctl and > > > therefore we won't actually get DMAs to this range. > > Yes exactly. The guest creates a mapping between one giova and this > > gpa (corresponding to the MSI controller doorbell) because MSIs are > > mapped on ARM. But practically the physical device is programmed with > > an host chosen iova that maps onto the physical MSI controller's > > doorbell. so the device never performs DMA accesses to this range. > > > > But I imagine the case that > > > was in mind when adding this warning was general peer-to-peer > > > between and assigned and emulated device. > > yes makes sense. > > > > Maybe there's an argument to be made > > > that such a p2p mapping might also be used in a non-vIOMMU case. We > > > skip creating those mappings and drivers continue to work, maybe > > > because nobody attempts to do p2p DMA with the types of devices we > > > emulate, maybe because p2p DMA is not absolutely reliable on bare > > > metal and drivers test it before using it. > > MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c > > iommu_dma_get_msi_page). > > One idea could be to pass that flag through the IOMMU Notifier > > mechanism into the iotlb->perm. Eventually when we get this in > > vfio_get_vaddr() we would not print the warning. Could that make sense? > > Yeah, if we can identify a valid case that doesn't need a warning, that's > fine by me. > Thanks, Will change as per above suggestion by Eric. Thanks -Bharat > > Alex
RE: [EXT] Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
Hi Eric, > -Original Message- > From: Auger Eric > Sent: Thursday, March 26, 2020 9:18 PM > To: Bharat Bhushan ; peter.mayd...@linaro.org; > pet...@redhat.com; eric.auger@gmail.com; alex.william...@redhat.com; > kevin.t...@intel.com; m...@redhat.com; Tomasz Nowicki [C] > ; drjo...@redhat.com; linuc.dec...@gmail.com; qemu- > de...@nongnu.org; qemu-...@nongnu.org; bharatb.li...@gmail.com; jean- > phili...@linaro.org; yang.zh...@intel.com > Subject: [EXT] Re: [PATCH v9 8/9] virtio-iommu: Implement probe request > > External Email > > -- > Hi Bharat > > On 3/23/20 9:46 AM, Bharat Bhushan wrote: > > This patch implements the PROBE request. Currently supported page size > > mask per endpoint is returned. Also append a NONE property in the end. > > > > Signed-off-by: Bharat Bhushan > > Signed-off-by: Eric Auger > > --- > > include/standard-headers/linux/virtio_iommu.h | 6 + > Changes to virtio_iommu.h should be in a separate patch you should use > ./scripts/update-linux-headers.sh See for instance: > ddda37483d linux-headers: update > until the uapi updates are not upstream you can link to your kernel branch and > mention this is a temporary linux header update or partial if you just want > to pick > up the iommu.h changes. yes, I am sorry. 
> > > hw/virtio/virtio-iommu.c | 161 +- > > hw/virtio/trace-events| 2 + > > 3 files changed, 166 insertions(+), 3 deletions(-) > > > > diff --git a/include/standard-headers/linux/virtio_iommu.h > > b/include/standard-headers/linux/virtio_iommu.h > > index b9443b83a1..8a0d47b907 100644 > > --- a/include/standard-headers/linux/virtio_iommu.h > > +++ b/include/standard-headers/linux/virtio_iommu.h > > @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap { > > > > #define VIRTIO_IOMMU_PROBE_T_NONE 0 > > #define VIRTIO_IOMMU_PROBE_T_RESV_MEM 1 > > +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK2 > > > > #define VIRTIO_IOMMU_PROBE_T_MASK 0xfff > > > > @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem { > > uint64_tend; > > }; > > > > +struct virtio_iommu_probe_pgsize_mask { > > + struct virtio_iommu_probe_property head; > > + uint64_tpgsize_bitmap; > > +}; > > + > > struct virtio_iommu_req_probe { > > struct virtio_iommu_req_headhead; > > uint32_tendpoint; > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index > > 747e3cf1da..63fbacdcdc 100644 > > --- a/hw/virtio/virtio-iommu.c > > +++ b/hw/virtio/virtio-iommu.c > > @@ -38,6 +38,10 @@ > > > > /* Max size */ > > #define VIOMMU_DEFAULT_QUEUE_SIZE 256 > > +#define VIOMMU_PROBE_SIZE 512 > > + > > +#define SUPPORTED_PROBE_PROPERTIES (\ > > +1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK) > > > > typedef struct VirtIOIOMMUDomain { > > uint32_t id; > > @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping { > > uint32_t flags; > > } VirtIOIOMMUMapping; > > > > +typedef struct VirtIOIOMMUPropBuffer { > > +VirtIOIOMMUEndpoint *endpoint; > > +size_t filled; > > +uint8_t *start; > > +bool error; > > +} VirtIOIOMMUPropBuffer; > > + > > static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev) { > > return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn); @@ > > -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s, > > return ret; > > } > > > > +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer 
> > +*bufstate) { > > +struct virtio_iommu_probe_property *prop; > > + > > +prop = (struct virtio_iommu_probe_property *) > > +(bufstate->start + bufstate->filled); > > +prop->type = 0; > > +prop->length = 0; > > +bufstate->filled += sizeof(*prop); > > +trace_virtio_iommu_fill_none_property(bufstate->endpoint->id); > > +return 0; > > +} > > + > > +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer > > +*bufstate) { > > +struct virtio_iommu_probe_pgsize_mask *page_size_mask; > > +size_t prop_size = sizeof(*page_size_mask); > > +VirtIOIOMMUEndpoint *ep = bufstate->endpoint; > > +VirtIOIOMMU *s = ep->viommu; > > +IOM
RE: [EXT] Re: [PATCH v9 2/9] memory: Add interface to set iommu page size mask
Hi Eric, > -Original Message- > From: Auger Eric > Sent: Thursday, March 26, 2020 9:36 PM > To: Bharat Bhushan ; peter.mayd...@linaro.org; > pet...@redhat.com; eric.auger@gmail.com; alex.william...@redhat.com; > kevin.t...@intel.com; m...@redhat.com; Tomasz Nowicki [C] > ; drjo...@redhat.com; linuc.dec...@gmail.com; qemu- > de...@nongnu.org; qemu-...@nongnu.org; bharatb.li...@gmail.com; jean- > phili...@linaro.org; yang.zh...@intel.com > Subject: [EXT] Re: [PATCH v9 2/9] memory: Add interface to set iommu page size > mask > > External Email > > ------ > Hi Bharat, > On 3/23/20 9:46 AM, Bharat Bhushan wrote: > > Allow to set page size mask to be supported by iommu. > by iommu memory region. I mean this is not global to the IOMMU. Yes. > > This is required to expose page size mask compatible with host with > > virtio-iommu. > > > > Signed-off-by: Bharat Bhushan > > --- > > include/exec/memory.h | 20 > > memory.c | 10 ++ > > 2 files changed, 30 insertions(+) > > > > diff --git a/include/exec/memory.h b/include/exec/memory.h index > > e85b7de99a..063c424854 100644 > > --- a/include/exec/memory.h > > +++ b/include/exec/memory.h > > @@ -355,6 +355,16 @@ typedef struct IOMMUMemoryRegionClass { > > * @iommu: the IOMMUMemoryRegion > > */ > > int (*num_indexes)(IOMMUMemoryRegion *iommu); > > + > > +/* > > + * Set supported IOMMU page size > > + * > > + * Optional method: if this is supported then set page size that > > + * can be supported by IOMMU. This is called to set supported page > > + * size as per host Linux. > What about: If supported, allows to restrict the page size mask that can be > supported with a given IOMMU memory region. For example, this allows to > propagate host physical IOMMU page size mask limitations to the virtual IOMMU > (vfio assignment with virtual iommu). 
Much better > > + */ > > + void (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu, > > + uint64_t page_size_mask); > > } IOMMUMemoryRegionClass; > > > > typedef struct CoalescedMemoryRange CoalescedMemoryRange; @@ -1363,6 > > +1373,16 @@ int > memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr, > > */ > > int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr); > > > > +/** > > + * memory_region_iommu_set_page_size_mask: set the supported pages > > + * size by iommu. > supported page sizes for a given IOMMU memory region > > + * > > + * @iommu_mr: the memory region > IOMMU memory region > > + * @page_size_mask: supported page size mask */ void > > +memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion > *iommu_mr, > > +uint64_t page_size_mask); > > + > > /** > > * memory_region_name: get a memory region's name > > * > > diff --git a/memory.c b/memory.c > > index aeaa8dcc9e..14c8783084 100644 > > --- a/memory.c > > +++ b/memory.c > > @@ -1833,6 +1833,16 @@ static int > memory_region_update_iommu_notify_flags(IOMMUMemoryRegion > *iommu_mr, > > return ret; > > } > > > > +void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion > *iommu_mr, > > +uint64_t page_size_mask) > > +{ > > +IOMMUMemoryRegionClass *imrc = > > +IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr); > > + > > +if (imrc->iommu_set_page_size_mask) { > > +imrc->iommu_set_page_size_mask(iommu_mr, page_size_mask); > Shouldn't it return an int in case the setting cannot be applied? iommu_set_page_size_mask() is setting page-size-mask for endpoint. Below function from code static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr, uint64_t page_size_mask) { IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr); sdev->page_size_mask = page_size_mask; } Do you see any reason it cannot be applied, am I missing something? 
Thanks -Bharat > > +} > > +} > > + > > int memory_region_register_iommu_notifier(MemoryRegion *mr, > >IOMMUNotifier *n, Error > > **errp) { > > > Thanks > Eric
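For context, the effect of propagating a host IOMMU page size mask down to an endpoint reduces to a bitwise intersection of supported-granule masks. A minimal sketch with illustrative names (the real code simply stores the mask in IOMMUDevice::page_size_mask; the default-mask value below is an assumption for illustration):

```c
#include <stdint.h>

/* Illustrative default virtio-iommu mask: every granule from 4KiB up. */
#define VIOMMU_DEFAULT_PGSIZE_MASK (~(uint64_t)0xfff)

/* Restrict the advertised mask to what the physical IOMMU supports.
 * A zero result means the two sides share no common page size, which a
 * caller could treat as an error -- the thread above discusses whether
 * the setter should return an int for exactly this reason. */
static uint64_t restrict_pgsize_mask(uint64_t cur, uint64_t host_mask)
{
    return cur & host_mask;
}

/* Smallest usable page size = lowest set bit of the mask. */
static uint64_t min_page_size(uint64_t mask)
{
    return mask & -mask;
}
```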
RE: [EXT] Re: [PATCH v9 4/9] virtio-iommu: set supported page size mask
Hi Eric, > -Original Message- > From: Auger Eric > Sent: Thursday, March 26, 2020 9:22 PM > To: Bharat Bhushan ; peter.mayd...@linaro.org; > pet...@redhat.com; eric.auger@gmail.com; alex.william...@redhat.com; > kevin.t...@intel.com; m...@redhat.com; Tomasz Nowicki [C] > ; drjo...@redhat.com; linuc.dec...@gmail.com; qemu- > de...@nongnu.org; qemu-...@nongnu.org; bharatb.li...@gmail.com; jean- > phili...@linaro.org; yang.zh...@intel.com > Subject: [EXT] Re: [PATCH v9 4/9] virtio-iommu: set supported page size mask > > External Email > > ------ > Hi Bharat, > > On 3/23/20 9:46 AM, Bharat Bhushan wrote: > > Add optional interface to set page size mask. > > Currently this is set global configuration and not per endpoint. > This allows to override the page size mask per end-point? This patch adds per endpoint page-size-mask configuration in addition to global page-size-mask. endpoint page-size-mask will override global page-size-mask configuration for that endpoint. Thanks -Bharat > > > > Signed-off-by: Bharat Bhushan > > --- > > include/hw/virtio/virtio-iommu.h | 1 + > > hw/virtio/virtio-iommu.c | 9 + > > 2 files changed, 10 insertions(+) > > > > diff --git a/include/hw/virtio/virtio-iommu.h > > b/include/hw/virtio/virtio-iommu.h > > index 6f67f1020a..4efa09610a 100644 > > --- a/include/hw/virtio/virtio-iommu.h > > +++ b/include/hw/virtio/virtio-iommu.h > > @@ -35,6 +35,7 @@ typedef struct IOMMUDevice { > > void *viommu; > > PCIBus *bus; > > int devfn; > > +uint64_t page_size_mask; > > IOMMUMemoryRegion iommu_mr; > > AddressSpace as; > > } IOMMUDevice; > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index > > 4cee8083bc..a28818202c 100644 > > --- a/hw/virtio/virtio-iommu.c > > +++ b/hw/virtio/virtio-iommu.c > > @@ -650,6 +650,14 @@ static gint int_cmp(gconstpointer a, gconstpointer b, > gpointer user_data) > > return (ua > ub) - (ua < ub); > > } > > > > +static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr, > > +uint64_t 
page_size_mask) > > +{ > > +IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr); > > + > > +sdev->page_size_mask = page_size_mask; } > > + > > static void virtio_iommu_device_realize(DeviceState *dev, Error > > **errp) { > > VirtIODevice *vdev = VIRTIO_DEVICE(dev); @@ -865,6 +873,7 @@ > > static void virtio_iommu_memory_region_class_init(ObjectClass *klass, > > IOMMUMemoryRegionClass *imrc = > IOMMU_MEMORY_REGION_CLASS(klass); > > > > imrc->translate = virtio_iommu_translate; > > +imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask; > > } > > > > static const TypeInfo virtio_iommu_info = { > > > Thanks > > Eric
[PATCH v9 8/9] virtio-iommu: Implement probe request
This patch implements the PROBE request. Currently supported page size mask per endpoint is returned. Also append a NONE property in the end.

Signed-off-by: Bharat Bhushan
Signed-off-by: Eric Auger

---
 include/standard-headers/linux/virtio_iommu.h |   6 +
 hw/virtio/virtio-iommu.c                      | 161 +-
 hw/virtio/trace-events                        |   2 +
 3 files changed, 166 insertions(+), 3 deletions(-)

diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
index b9443b83a1..8a0d47b907 100644
--- a/include/standard-headers/linux/virtio_iommu.h
+++ b/include/standard-headers/linux/virtio_iommu.h
@@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
 
 #define VIRTIO_IOMMU_PROBE_T_NONE      0
 #define VIRTIO_IOMMU_PROBE_T_RESV_MEM  1
+#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK 2
 
 #define VIRTIO_IOMMU_PROBE_T_MASK      0xfff
 
@@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
 	uint64_t	end;
 };
 
+struct virtio_iommu_probe_pgsize_mask {
+	struct virtio_iommu_probe_property	head;
+	uint64_t	pgsize_bitmap;
+};
+
 struct virtio_iommu_req_probe {
 	struct virtio_iommu_req_head	head;
 	uint32_t	endpoint;
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 747e3cf1da..63fbacdcdc 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -38,6 +38,10 @@
 
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
+#define VIOMMU_PROBE_SIZE 512
+
+#define SUPPORTED_PROBE_PROPERTIES (\
+    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
 
 typedef struct VirtIOIOMMUDomain {
     uint32_t id;
@@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
     uint32_t flags;
 } VirtIOIOMMUMapping;
 
+typedef struct VirtIOIOMMUPropBuffer {
+    VirtIOIOMMUEndpoint *endpoint;
+    size_t filled;
+    uint8_t *start;
+    bool error;
+} VirtIOIOMMUPropBuffer;
+
 static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
@@ -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
     return ret;
 }
 
+static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer *bufstate)
+{
+    struct virtio_iommu_probe_property *prop;
+
+    prop = (struct virtio_iommu_probe_property *)
+                (bufstate->start + bufstate->filled);
+    prop->type = 0;
+    prop->length = 0;
+    bufstate->filled += sizeof(*prop);
+    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
+    return 0;
+}
+
+static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer *bufstate)
+{
+    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
+    size_t prop_size = sizeof(*page_size_mask);
+    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
+    VirtIOIOMMU *s = ep->viommu;
+    IOMMUDevice *sdev;
+
+    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
+        bufstate->error = true;
+        /* get the traversal stopped by returning true */
+        return true;
+    }
+
+    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
+                     (bufstate->start + bufstate->filled);
+
+    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
+    page_size_mask->head.length = prop_size;
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        if (ep->id == sdev->devfn) {
+            page_size_mask->pgsize_bitmap = sdev->page_size_mask;
+        }
+    }
+    bufstate->filled += sizeof(*page_size_mask);
+    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
+                                                 page_size_mask->pgsize_bitmap,
+                                                 bufstate->filled);
+    return false;
+}
+
+/* Fill the properties[] buffer with properties of type @type */
+static int virtio_iommu_fill_property(int type,
+                                      VirtIOIOMMUPropBuffer *bufstate)
+{
+    int ret = -ENOSPC;
+
+    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
+            >= VIOMMU_PROBE_SIZE) {
+        /* no space left for the header */
+        bufstate->error = true;
+        goto out;
+    }
+
+    switch (type) {
+    case VIRTIO_IOMMU_PROBE_T_NONE:
+        ret = virtio_iommu_fill_none_prop(bufstate);
+        break;
+    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
+    {
+        ret = virtio_iommu_fill_page_size_mask(bufstate);
+        break;
+    }
+    default:
+        ret = -ENOENT;
+        break;
+    }
+out:
+    if (ret) {
+        error_report("%s property of type=%d could not be filled (%d),"
                     " remaining size = 0x%lx",
+                     __func__, type, ret, bufstate->filled);
+    }
+    return ret;
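The probe handler in the patch above appends fixed-size type/length properties into a 512-byte buffer, refusing to write past the end. A self-contained sketch of that bounds-checked fill, with simplified struct layouts (the exact field widths and type value are modeled on the patch, but this is an illustration, not the virtio-iommu wire format):

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PROBE_BUF_SIZE 512   /* mirrors VIOMMU_PROBE_SIZE in the patch */

/* Simplified property header, modeled on virtio_iommu_probe_property. */
struct prop_head {
    uint16_t type;
    uint16_t length;
};

struct pgsize_prop {
    struct prop_head head;
    uint64_t pgsize_bitmap;
};

/* Append one fixed-size property; fail (return -1) instead of overflowing,
 * which is the bounds check the patch performs before each fill. */
static int fill_pgsize_prop(uint8_t *buf, size_t *filled, uint64_t mask)
{
    struct pgsize_prop p = { { 2 /* PAGE_SIZE_MASK */, sizeof(p) }, mask };

    if (*filled + sizeof(p) > PROBE_BUF_SIZE) {
        return -1;   /* no space left: stop the traversal */
    }
    memcpy(buf + *filled, &p, sizeof(p));
    *filled += sizeof(p);
    return 0;
}
```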
[PATCH v9 9/9] virtio-iommu: add iommu notifier memory-region
Finally, add a notify_flag_changed() callback on the IOMMU memory region, so that the device is inserted in (or removed from) the virtio-iommu notifier list when an IOMMU notifier is registered or unregistered.

Signed-off-by: Bharat Bhushan

---
 hw/virtio/virtio-iommu.c | 22 ++
 hw/virtio/trace-events   |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 63fbacdcdc..413792b626 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -966,6 +966,27 @@ unlock:
     qemu_mutex_unlock(&s->mutex);
 }
 
+static int virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
+                                            IOMMUNotifierFlag old,
+                                            IOMMUNotifierFlag new,
+                                            Error **errp)
+{
+    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
+        QLIST_INSERT_HEAD(&s->notifiers_list, sdev, next);
+        return 0;
+    }
+
+    if (new == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
+        QLIST_REMOVE(sdev, next);
+    }
+    return 0;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1187,6 +1208,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->translate = virtio_iommu_translate;
     imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
     imrc->replay = virtio_iommu_replay;
+    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
 }
 
 static const TypeInfo virtio_iommu_info = {
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index b0a6e4bda3..6b7495ac3d 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -78,3 +78,5 @@ virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "m
 virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
 virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
 virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask, size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
+virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
+virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
-- 
2.17.1
[PATCH v9 6/9] virtio-iommu: Call iommu notifier for attach/detach
IOMMU notifiers are called when a device is attached to or detached from an address space. This is needed for VFIO.

Signed-off-by: Bharat Bhushan

---
 hw/virtio/virtio-iommu.c | 49
 1 file changed, 49 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index bd464d4fb3..88849aa7b9 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
     uint32_t id;
     VirtIOIOMMUDomain *domain;
     QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
+    VirtIOIOMMU *viommu;
 } VirtIOIOMMUEndpoint;
 
 typedef struct VirtIOIOMMUInterval {
@@ -155,11 +156,48 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
     memory_region_notify_iommu(mr, 0, entry);
 }
 
+static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
+                                           gpointer data)
+{
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    return false;
+}
+
+static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
+                                         gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+
+    return false;
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
+    VirtIOIOMMU *s = ep->viommu;
+    VirtIOIOMMUDomain *domain = ep->domain;
+    IOMMUDevice *sdev;
+
     if (!ep->domain) {
         return;
     }
+
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        if (ep->id == sdev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
+                           &sdev->iommu_mr);
+        }
+    }
+
     QLIST_REMOVE(ep, next);
     ep->domain = NULL;
 }
@@ -178,6 +216,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
     }
     ep = g_malloc0(sizeof(*ep));
     ep->id = ep_id;
+    ep->viommu = s;
     trace_virtio_iommu_get_endpoint(ep_id);
     g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
     return ep;
@@ -274,6 +313,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     uint32_t ep_id = le32_to_cpu(req->endpoint);
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
 
     trace_virtio_iommu_attach(domain_id, ep_id);
 
@@ -299,6 +339,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
 
     ep->domain = domain;
 
+    /* Replay domain mappings on the associated memory region */
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        if (ep_id == sdev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
+                           &sdev->iommu_mr);
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
@@ -872,6 +920,7 @@ static gboolean reconstruct_endpoints(gpointer key, gpointer value,
 
     QLIST_FOREACH(iter, &d->endpoint_list, next) {
         iter->domain = d;
+        iter->viommu = s;
         g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
     }
     return false; /* continue the domain traversal */
-- 
2.17.1
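The attach path in this patch replays every mapping already present in the domain to the endpoint's memory region, so VFIO can populate the host IOMMU with mappings created before the device was attached. A minimal stand-alone model of that replay-on-attach loop (the notifier callback here just accumulates a byte count, standing in for memory_region_notify_iommu()):

```c
#include <stdint.h>

/* Minimal stand-in for the domain's interval-tree entries. */
struct mapping {
    uint64_t iova, pa, size;
};

static void count_bytes(uint64_t iova, uint64_t pa, uint64_t size, void *opaque)
{
    (void)iova; (void)pa;
    *(uint64_t *)opaque += size;   /* pretend-notifier: just count bytes */
}

/* On attach, walk every existing mapping and fire the map notifier for
 * each one; returns the total number of bytes replayed. */
static uint64_t replay_on_attach(const struct mapping *maps, int n,
                                 void (*notify_map)(uint64_t, uint64_t,
                                                    uint64_t, void *),
                                 void *opaque)
{
    uint64_t total = 0;
    for (int i = 0; i < n; i++) {
        notify_map(maps[i].iova, maps[i].pa, maps[i].size, opaque);
        total += maps[i].size;
    }
    return total;
}
```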
[PATCH v9 7/9] virtio-iommu: add iommu replay
The default replay does not work with virtio-iommu, so this patch provides virtio-iommu's own replay functionality.

Signed-off-by: Bharat Bhushan

---
 hw/virtio/virtio-iommu.c | 44
 hw/virtio/trace-events   |  1 +
 2 files changed, 45 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 88849aa7b9..747e3cf1da 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -770,6 +770,49 @@ static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
     sdev->page_size_mask = page_size_mask;
 }
 
+static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    trace_virtio_iommu_remap(interval->low, mapping->phys_addr,
+                             interval->high - interval->low + 1);
+    /* unmap previous entry and map again */
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+    return false;
+}
+
+static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    uint32_t sid;
+    VirtIOIOMMUEndpoint *ep;
+
+    sid = virtio_iommu_get_bdf(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+
+    if (!s->endpoints) {
+        goto unlock;
+    }
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep || !ep->domain) {
+        goto unlock;
+    }
+
+    g_tree_foreach(ep->domain->mappings, virtio_iommu_remap, mr);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -988,6 +1031,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
 
     imrc->translate = virtio_iommu_translate;
     imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
+    imrc->replay = virtio_iommu_replay;
 }
 
 static const TypeInfo virtio_iommu_info = {
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index d94a1cd8a3..8bae651191 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -75,3 +75,4 @@ virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid)
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
+virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
-- 
2.17.1
[PATCH v9 4/9] virtio-iommu: set supported page size mask
Add an optional interface to set the page size mask. Currently this is set as a global configuration and not per endpoint.

Signed-off-by: Bharat Bhushan

---
 include/hw/virtio/virtio-iommu.h | 1 +
 hw/virtio/virtio-iommu.c         | 9 +
 2 files changed, 10 insertions(+)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 6f67f1020a..4efa09610a 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -35,6 +35,7 @@ typedef struct IOMMUDevice {
     void *viommu;
     PCIBus *bus;
     int devfn;
+    uint64_t page_size_mask;
     IOMMUMemoryRegion iommu_mr;
     AddressSpace as;
 } IOMMUDevice;
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 4cee8083bc..a28818202c 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -650,6 +650,14 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     return (ua > ub) - (ua < ub);
 }
 
+static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
+                                            uint64_t page_size_mask)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+
+    sdev->page_size_mask = page_size_mask;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -865,6 +873,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.17.1
[PATCH v9 5/9] virtio-iommu: Add iommu notifier for map/unmap
This patch extends the VIRTIO_IOMMU_T_MAP/UNMAP request handling to notify registered IOMMU notifiers, which in turn call the VFIO notifier to map/unmap the region in the host IOMMU.

Signed-off-by: Bharat Bhushan
Signed-off-by: Eric Auger

---
 include/hw/virtio/virtio-iommu.h |  2 +
 hw/virtio/virtio-iommu.c         | 67 +++-
 hw/virtio/trace-events           |  2 +
 3 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 4efa09610a..e53586df70 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -38,6 +38,7 @@ typedef struct IOMMUDevice {
     uint64_t page_size_mask;
     IOMMUMemoryRegion iommu_mr;
     AddressSpace as;
+    QLIST_ENTRY(IOMMUDevice) next;
 } IOMMUDevice;
 
 typedef struct IOMMUPciBus {
@@ -57,6 +58,7 @@ typedef struct VirtIOIOMMU {
     GTree *domains;
     QemuMutex mutex;
     GTree *endpoints;
+    QLIST_HEAD(, IOMMUDevice) notifiers_list;
 } VirtIOIOMMU;
 
 #endif
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index a28818202c..bd464d4fb3 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }
 
+static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
+                                    hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
+    entry.perm = IOMMU_RW;
+    entry.translated_addr = paddr;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
+static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
+                                      hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
+    entry.perm = IOMMU_NONE;
+    entry.translated_addr = 0;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
     if (!ep->domain) {
@@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     uint64_t virt_start = le64_to_cpu(req->virt_start);
     uint64_t virt_end = le64_to_cpu(req->virt_end);
     uint32_t flags = le32_to_cpu(req->flags);
+    hwaddr size = virt_end - virt_start + 1;
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUInterval *interval;
     VirtIOIOMMUMapping *mapping;
+    VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
 
     if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) {
         return VIRTIO_IOMMU_S_INVAL;
@@ -339,9 +374,38 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
 
     g_tree_insert(domain->mappings, interval, mapping);
 
+    /* All devices in an address-space share mapping */
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == sdev->devfn) {
+                virtio_iommu_notify_map(&sdev->iommu_mr,
+                                        virt_start, phys_start, size);
+            }
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
+static void virtio_iommu_remove_mapping(VirtIOIOMMU *s,
+                                        VirtIOIOMMUDomain *domain,
+                                        VirtIOIOMMUInterval *interval)
+{
+    VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
+
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == sdev->devfn) {
+                virtio_iommu_notify_unmap(&sdev->iommu_mr,
+                                          interval->low,
+                                          interval->high - interval->low + 1);
+            }
+        }
+    }
+    g_tree_remove(domain->mappings, (gpointer)(interval));
+}
+
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
                               struct virtio_iommu_req_unmap *req)
 {
@@ -368,7 +432,7 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         uint64_t current_high = iter_key->high;
 
         if (interval.low <= current_low && interval.high >= current_high) {
-            g_tree_remove(domain->mappings, iter_key);
+            virtio_iommu_remove_mapping(s, domain, iter_key);
             trace_virtio_iommu_unmap_done(domain_id, current_low, current_high);
         } else {
             ret = VIRTIO_IOMMU_S_RANGE;
@@ -663,6 +727,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
 
+    QLIST_INIT(&s->notifiers_list);
     virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU, sizeo
[PATCH v9 0/9] virtio-iommu: VFIO integration
This patch series integrates VFIO with virtio-iommu. This is only
applicable for PCI pass-through with virtio-iommu.

This series is available at:
https://github.com/bharat-bhushan-devel/qemu.git virtio-iommu-vfio-integration-v8

This is tested by assigning more than one PCI device to the virtual machine.

v8->v9:
 - Have page size mask per endpoint
 - Add PROBE interface, return page size mask

v7->v8:
 - Set page size mask as per host; this fixes an issue with 64K host/guest
 - Device list taken from IOMMUDevice directly; removed VirtioIOMMUNotifierNode
 - Add missing ep->viommu init on post-load

v6->v7:
 - Corrected email address

v5->v6:
 - Rebase to v16 version from Eric
 - Tested with upstream Linux
 - Added a patch from Eric/myself on removing the mmio-region error print in vfio

v4->v5:
 - Rebase to v9 version from Eric
 - PCIe device hotplug fix
 - Added patch 1/5 from Eric's previous series (Eric somehow dropped it in the last version)
 - Patch "Translate the MSI doorbell in kvm_arch_fixup_msi_route" already integrated with vsmmuv3

v3->v4:
 - Rebase to v4 version from Eric
 - Fixes from Eric with DPDK in VM
 - Logical division in multiple patches

v2->v3:
 - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device", which is based on top of v2.10-rc0
 - Fixed issue with two PCI devices
 - Addressed review comments

v1->v2:
 - Added trace events
 - Removed vSMMUv3 link in patch description

Bharat Bhushan (9):
  hw/vfio/common: Remove error print on mmio region translation by viommu
  memory: Add interface to set iommu page size mask
  vfio: set iommu page size as per host supported page size
  virtio-iommu: set supported page size mask
  virtio-iommu: Add iommu notifier for map/unmap
  virtio-iommu: Call iommu notifier for attach/detach
  virtio-iommu: add iommu replay
  virtio-iommu: Implement probe request
  virtio-iommu: add iommu notifier memory-region

 include/exec/memory.h                         |  20 +
 include/hw/virtio/virtio-iommu.h              |   3 +
 include/standard-headers/linux/virtio_iommu.h |   6 +
 hw/vfio/common.c                              |   5 +-
 hw/virtio/virtio-iommu.c                      | 352 +-
 memory.c                                      |  10 +
 hw/virtio/trace-events                        |   7 +
 7 files changed, 397 insertions(+), 6 deletions(-)

--
2.17.1
[PATCH v9 3/9] vfio: set iommu page size as per host supported page size
Set the IOMMU supported page size mask to the same value as the host
Linux supported page size mask.

Signed-off-by: Bharat Bhushan
---
 hw/vfio/common.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index c586edf47a..6ea50d696f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -635,6 +635,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
                                             int128_get64(llend),
                                             iommu_idx);

+        memory_region_iommu_set_page_size_mask(giommu->iommu,
+                                               container->pgsizes);
+
         ret = memory_region_register_iommu_notifier(section->mr, &giommu->n,
                                                     &err);
         if (ret) {
--
2.17.1
[PATCH v9 2/9] memory: Add interface to set iommu page size mask
Allow setting the page size mask supported by an IOMMU. This is
required to expose a page size mask compatible with the host through
virtio-iommu.

Signed-off-by: Bharat Bhushan
---
 include/exec/memory.h | 20 ++++++++++++++++++++
 memory.c              | 10 ++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index e85b7de99a..063c424854 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -355,6 +355,16 @@ typedef struct IOMMUMemoryRegionClass {
      * @iommu: the IOMMUMemoryRegion
      */
     int (*num_indexes)(IOMMUMemoryRegion *iommu);
+
+    /*
+     * Set supported IOMMU page size
+     *
+     * Optional method: if this is supported then set page size that
+     * can be supported by IOMMU. This is called to set supported page
+     * size as per host Linux.
+     */
+    void (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
+                                     uint64_t page_size_mask);
 } IOMMUMemoryRegionClass;

 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
@@ -1363,6 +1373,16 @@ int memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
  */
 int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr);

+/**
+ * memory_region_iommu_set_page_size_mask: set the page sizes
+ * supported by the iommu.
+ *
+ * @iommu_mr: the memory region
+ * @page_size_mask: supported page size mask
+ */
+void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
+                                            uint64_t page_size_mask);
+
 /**
  * memory_region_name: get a memory region's name
  *
diff --git a/memory.c b/memory.c
index aeaa8dcc9e..14c8783084 100644
--- a/memory.c
+++ b/memory.c
@@ -1833,6 +1833,16 @@ static int memory_region_update_iommu_notify_flags(IOMMUMemoryRegion *iommu_mr,
     return ret;
 }

+void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
+                                            uint64_t page_size_mask)
+{
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
+
+    if (imrc->iommu_set_page_size_mask) {
+        imrc->iommu_set_page_size_mask(iommu_mr, page_size_mask);
+    }
+}
+
 int memory_region_register_iommu_notifier(MemoryRegion *mr,
                                           IOMMUNotifier *n, Error **errp)
 {
--
2.17.1
[PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
On ARM, the MSI doorbell is translated by the virtual IOMMU. As such,
address_space_translate() returns the MSI controller MMIO region and
we get an "iommu map to non memory area" message. Let's remove the
latter.

Signed-off-by: Eric Auger
Signed-off-by: Bharat Bhushan
---
 hw/vfio/common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5ca11488d6..c586edf47a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
                                  &xlat, &len, writable,
                                  MEMTXATTRS_UNSPECIFIED);
     if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
         return false;
     }
--
2.17.1
Re: [PATCH v8 4/8] virtio-iommu: set supported page size mask
Hi Eric/Jean, On Wed, Mar 18, 2020 at 8:05 PM Bharat Bhushan wrote: > > Hi Eric, > > On Wed, Mar 18, 2020 at 4:58 PM Auger Eric wrote: > > > > Hi Bharat, > > > > On 3/18/20 11:11 AM, Bharat Bhushan wrote: > > > Add optional interface to set page size mask. > > > Currently this is set global configuration and not > > > per endpoint. > > > > > > Signed-off-by: Bharat Bhushan > > > --- > > > v7->v8: > > > - new patch > > > > > > hw/virtio/virtio-iommu.c | 10 ++ > > > 1 file changed, 10 insertions(+) > > > > > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > > > index 4cee8083bc..c00a55348d 100644 > > > --- a/hw/virtio/virtio-iommu.c > > > +++ b/hw/virtio/virtio-iommu.c > > > @@ -650,6 +650,15 @@ static gint int_cmp(gconstpointer a, gconstpointer > > > b, gpointer user_data) > > > return (ua > ub) - (ua < ub); > > > } > > > > > > +static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr, > > > +uint64_t page_size_mask) > > > +{ > > > +IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr); > > > +VirtIOIOMMU *s = sdev->viommu; > > > + > > > +s->config.page_size_mask = page_size_mask; > > The problem is page_size_mask is global to the VIRTIO-IOMMU. > > > > - Can't different VFIO containers impose different/inconsistent settings? > > - VFIO devices can be hotplugged. > > This is possible if we different iommu's, which we support. correct? > > > So we may start with some default > > page_size_mask which is latter overriden by a host imposed one. Assume > > you first launch the VM with a virtio NIC. This uses 64K. Then you > > hotplug a VFIO device behind a physical IOMMU which only supports 4K > > pages. Isn't it a valid scenario? > > So we need to expose page_size_mask per endpoint? Just sent Linux RFC patch to use page-size-mask per endpoint. QEMU changes are also ready, will share soon. 
Thanks -Bharat > > Thanks > -Bharat > > > > > Thanks > > > > Eric > > > > > +} > > > + > > > static void virtio_iommu_device_realize(DeviceState *dev, Error **errp) > > > { > > > VirtIODevice *vdev = VIRTIO_DEVICE(dev); > > > @@ -865,6 +874,7 @@ static void > > > virtio_iommu_memory_region_class_init(ObjectClass *klass, > > > IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass); > > > > > > imrc->translate = virtio_iommu_translate; > > > +imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask; > > > } > > > > > > static const TypeInfo virtio_iommu_info = { > > > > >
Re: [PATCH v8 4/8] virtio-iommu: set supported page size mask
Hi Eric, On Wed, Mar 18, 2020 at 4:58 PM Auger Eric wrote: > > Hi Bharat, > > On 3/18/20 11:11 AM, Bharat Bhushan wrote: > > Add optional interface to set page size mask. > > Currently this is set global configuration and not > > per endpoint. > > > > Signed-off-by: Bharat Bhushan > > --- > > v7->v8: > > - new patch > > > > hw/virtio/virtio-iommu.c | 10 ++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > > index 4cee8083bc..c00a55348d 100644 > > --- a/hw/virtio/virtio-iommu.c > > +++ b/hw/virtio/virtio-iommu.c > > @@ -650,6 +650,15 @@ static gint int_cmp(gconstpointer a, gconstpointer b, > > gpointer user_data) > > return (ua > ub) - (ua < ub); > > } > > > > +static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr, > > +uint64_t page_size_mask) > > +{ > > +IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr); > > +VirtIOIOMMU *s = sdev->viommu; > > + > > +s->config.page_size_mask = page_size_mask; > The problem is page_size_mask is global to the VIRTIO-IOMMU. > > - Can't different VFIO containers impose different/inconsistent settings? > - VFIO devices can be hotplugged. This is possible if we different iommu's, which we support. correct? > So we may start with some default > page_size_mask which is latter overriden by a host imposed one. Assume > you first launch the VM with a virtio NIC. This uses 64K. Then you > hotplug a VFIO device behind a physical IOMMU which only supports 4K > pages. Isn't it a valid scenario? So we need to expose page_size_mask per endpoint? 
Thanks -Bharat > > Thanks > > Eric > > > +} > > + > > static void virtio_iommu_device_realize(DeviceState *dev, Error **errp) > > { > > VirtIODevice *vdev = VIRTIO_DEVICE(dev); > > @@ -865,6 +874,7 @@ static void > > virtio_iommu_memory_region_class_init(ObjectClass *klass, > > IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass); > > > > imrc->translate = virtio_iommu_translate; > > +imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask; > > } > > > > static const TypeInfo virtio_iommu_info = { > > >
RE: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
> -Original Message- > From: Jean-Philippe Brucker > Sent: Wednesday, March 18, 2020 4:48 PM > To: Bharat Bhushan > Cc: Auger Eric ; Peter Maydell > ; kevin.t...@intel.com; Tomasz Nowicki [C] > ; m...@redhat.com; drjo...@redhat.com; > pet...@redhat.com; qemu-devel@nongnu.org; alex.william...@redhat.com; > qemu-...@nongnu.org; Bharat Bhushan ; > linuc.dec...@gmail.com; eric.auger@gmail.com > Subject: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for > attach/detach > > External Email > > ------ > On Wed, Mar 18, 2020 at 03:47:44PM +0530, Bharat Bhushan wrote: > > Hi Jean, > > > > On Tue, Mar 17, 2020 at 9:29 PM Jean-Philippe Brucker > > wrote: > > > > > > On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote: > > > > Hi Jean, > > > > > > > > On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker > > > > wrote: > > > > > > > > > > On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote: > > > > > > Hi Jean, > > > > > > > > > > > > On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker > > > > > > wrote: > > > > > > > > > > > > > > Hi Bharat, > > > > > > > > > > > > > > Could you Cc me on your next posting? Unfortunately I don't > > > > > > > have much hardware for testing this at the moment, but I > > > > > > > might be able to help a little on the review. > > > > > > > > > > > > > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote: > > > > > > > > > >>> First issue is: your guest can use 4K page and your > > > > > > > > > >>> host can use 64KB pages. In that case VFIO_DMA_MAP > > > > > > > > > >>> will fail with -EINVAL. We must devise a way to pass the > > > > > > > > > >>> host > settings to the VIRTIO-IOMMU device. > > > > > > > > > >>> > > > > > > > > > >>> Even with 64KB pages, it did not work for me. I have > > > > > > > > > >>> obviously not the storm of VFIO_DMA_MAP failures but > > > > > > > > > >>> I have some, most probably due to some wrong notifications > somewhere. I will try to investigate on my side. 
> > > > > > > > > >>> > > > > > > > > > >>> Did you test with VFIO on your side? > > > > > > > > > >> > > > > > > > > > >> I did not tried with different page sizes, only tested > > > > > > > > > >> with 4K page > size. > > > > > > > > > >> > > > > > > > > > >> Yes it works, I tested with two n/w device assigned > > > > > > > > > >> to VM, both interfaces works > > > > > > > > > >> > > > > > > > > > >> First I will try with 64k page size. > > > > > > > > > > > > > > > > > > > > 64K page size does not work for me as well, > > > > > > > > > > > > > > > > > > > > I think we are not passing correct page_size_mask here > > > > > > > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( > > > > > > > > > > which is > > > > > > > > > > 0xf000)) > > > > > > > > > I guess you mean with guest using 4K and host using 64K. > > > > > > > > > > > > > > > > > > > > We need to set this correctly as per host page size, > > > > > > > > > > correct? > > > > > > > > > Yes that's correct. We need to put in place a control > > > > > > > > > path to retrieve the page settings on host through VFIO to > > > > > > > > > inform the > virtio-iommu device. > > > > > > > > > > > > > > > > > > Besides this issue, did you try with 64kB on host and guest? > > > > > > > > > > > > > > > > I tried Followings > > > > > > > > -
Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
Hi Jean, On Tue, Mar 17, 2020 at 9:29 PM Jean-Philippe Brucker wrote: > > On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote: > > Hi Jean, > > > > On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker > > wrote: > > > > > > On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote: > > > > Hi Jean, > > > > > > > > On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker > > > > wrote: > > > > > > > > > > Hi Bharat, > > > > > > > > > > Could you Cc me on your next posting? Unfortunately I don't have much > > > > > hardware for testing this at the moment, but I might be able to help a > > > > > little on the review. > > > > > > > > > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote: > > > > > > > >>> First issue is: your guest can use 4K page and your host can > > > > > > > >>> use 64KB > > > > > > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We > > > > > > > >>> must devise > > > > > > > >>> a way to pass the host settings to the VIRTIO-IOMMU device. > > > > > > > >>> > > > > > > > >>> Even with 64KB pages, it did not work for me. I have > > > > > > > >>> obviously not the > > > > > > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably > > > > > > > >>> due to > > > > > > > >>> some wrong notifications somewhere. I will try to investigate > > > > > > > >>> on my side. > > > > > > > >>> > > > > > > > >>> Did you test with VFIO on your side? > > > > > > > >> > > > > > > > >> I did not tried with different page sizes, only tested with 4K > > > > > > > >> page size. > > > > > > > >> > > > > > > > >> Yes it works, I tested with two n/w device assigned to VM, > > > > > > > >> both interfaces works > > > > > > > >> > > > > > > > >> First I will try with 64k page size. 
> > > > > > > > > > > > > > > > 64K page size does not work for me as well, > > > > > > > > > > > > > > > > I think we are not passing correct page_size_mask here > > > > > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is > > > > > > > > 0xf000)) > > > > > > > I guess you mean with guest using 4K and host using 64K. > > > > > > > > > > > > > > > > We need to set this correctly as per host page size, correct? > > > > > > > Yes that's correct. We need to put in place a control path to > > > > > > > retrieve > > > > > > > the page settings on host through VFIO to inform the virtio-iommu > > > > > > > device. > > > > > > > > > > > > > > Besides this issue, did you try with 64kB on host and guest? > > > > > > > > > > > > I tried Followings > > > > > > - 4k host and 4k guest - it works with v7 version > > > > > > - 64k host and 64k guest - it does not work with v7 > > > > > > hard-coded config.page_size_mask to 0x and it > > > > > > works > > > > > > > > > > You might get this from the iova_pgsize bitmap returned by > > > > > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so > > > > > there > > > > > is the usual problem of aggregating consistent properties, but I'm > > > > > guessing using the host page size as a granule here is safe enough. > > > > > > > > > > If it is a problem, we can add a PROBE property for page size mask, > > > > > allowing to define per-endpoint page masks. I have kernel patches > > > > > somewhere to do just that. > > > > > > > > I do not see we need page size mask per endpoint. > > > > > > > > While I am trying to understand what "page-size-mask" guest will work > > > > with > > &g
[PATCH v8 8/8] virtio-iommu: add iommu notifier memory-region
Finally, add the notify_flag_changed() callback for the virtio-iommu
memory region, so that a device is added to or removed from the
notifier list when its IOMMU notifier flags change.

Signed-off-by: Bharat Bhushan
---
 hw/virtio/virtio-iommu.c | 22 ++++++++++++++++++++++
 hw/virtio/trace-events   |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index b68644f7c3..515c965e3c 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -814,6 +814,27 @@ unlock:
     qemu_mutex_unlock(&s->mutex);
 }

+static int virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
+                                            IOMMUNotifierFlag old,
+                                            IOMMUNotifierFlag new,
+                                            Error **errp)
+{
+    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
+        QLIST_INSERT_HEAD(&s->notifiers_list, sdev, next);
+        return 0;
+    }
+
+    if (new == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
+        QLIST_REMOVE(sdev, next);
+    }
+    return 0;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1033,6 +1054,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->translate = virtio_iommu_translate;
     imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
     imrc->replay = virtio_iommu_replay;
+    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
 }

 static const TypeInfo virtio_iommu_info = {
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 8bae651191..a486adcf6d 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr)
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
 virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
+virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
+virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
--
2.17.1
[PATCH v8 6/8] virtio-iommu: Call iommu notifier for attach/detach
IOMMU notifiers are called when a device is attached to or detached
from an address space. This is needed for VFIO.

Signed-off-by: Bharat Bhushan
---
 hw/virtio/virtio-iommu.c | 49 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 623b477b9c..4d522a636a 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
     uint32_t id;
     VirtIOIOMMUDomain *domain;
     QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
+    VirtIOIOMMU *viommu;
 } VirtIOIOMMUEndpoint;

 typedef struct VirtIOIOMMUInterval {
@@ -155,11 +156,48 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
     memory_region_notify_iommu(mr, 0, entry);
 }

+static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
+                                           gpointer data)
+{
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    return false;
+}
+
+static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
+                                         gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+
+    return false;
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
+    VirtIOIOMMU *s = ep->viommu;
+    VirtIOIOMMUDomain *domain = ep->domain;
+    IOMMUDevice *sdev;
+
     if (!ep->domain) {
         return;
     }
+
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        if (ep->id == sdev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
+                           &sdev->iommu_mr);
+        }
+    }
+
     QLIST_REMOVE(ep, next);
     ep->domain = NULL;
 }
@@ -178,6 +216,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
     }
     ep = g_malloc0(sizeof(*ep));
     ep->id = ep_id;
+    ep->viommu = s;
     trace_virtio_iommu_get_endpoint(ep_id);
     g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
     return ep;
@@ -274,6 +313,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     uint32_t ep_id = le32_to_cpu(req->endpoint);
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;

     trace_virtio_iommu_attach(domain_id, ep_id);

@@ -299,6 +339,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,

     ep->domain = domain;

+    /* Replay domain mappings on the associated memory region */
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        if (ep_id == sdev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
+                           &sdev->iommu_mr);
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }

@@ -873,6 +921,7 @@ static gboolean reconstruct_endpoints(gpointer key, gpointer value,

     QLIST_FOREACH(iter, &d->endpoint_list, next) {
         iter->domain = d;
+        iter->viommu = s;
         g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
     }
     return false; /* continue the domain traversal */
--
2.17.1
[PATCH v8 7/8] virtio-iommu: add iommu replay
Default replay does not work with virtio-iommu, so this patch provides
the virtio-iommu replay functionality.

Signed-off-by: Bharat Bhushan
---
 hw/virtio/virtio-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++++
 hw/virtio/trace-events   |  1 +
 2 files changed, 45 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 4d522a636a..b68644f7c3 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -771,6 +771,49 @@ static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
     s->config.page_size_mask = page_size_mask;
 }

+static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    trace_virtio_iommu_remap(interval->low, mapping->phys_addr,
+                             interval->high - interval->low + 1);
+    /* unmap previous entry and map again */
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+    return false;
+}
+
+static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    uint32_t sid;
+    VirtIOIOMMUEndpoint *ep;
+
+    sid = virtio_iommu_get_bdf(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+
+    if (!s->endpoints) {
+        goto unlock;
+    }
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep || !ep->domain) {
+        goto unlock;
+    }
+
+    g_tree_foreach(ep->domain->mappings, virtio_iommu_remap, mr);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -989,6 +1032,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,

     imrc->translate = virtio_iommu_translate;
     imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
+    imrc->replay = virtio_iommu_replay;
 }

 static const TypeInfo virtio_iommu_info = {
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index d94a1cd8a3..8bae651191 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -75,3 +75,4 @@ virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid)
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
+virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
--
2.17.1
[PATCH v8 5/8] virtio-iommu: Add iommu notifier for map/unmap
This patch extends the VIRTIO_IOMMU_T_MAP/UNMAP requests to notify
registered IOMMU notifiers, which in turn call the VFIO notifier to
map/unmap the region in the IOMMU.

Signed-off-by: Bharat Bhushan
Signed-off-by: Eric Auger
---
 include/hw/virtio/virtio-iommu.h |  2 +
 hw/virtio/virtio-iommu.c         | 67 +++-
 hw/virtio/trace-events           |  2 +
 3 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 6f67f1020a..65ad3bf4ee 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -37,6 +37,7 @@ typedef struct IOMMUDevice {
     int devfn;
     IOMMUMemoryRegion iommu_mr;
     AddressSpace as;
+    QLIST_ENTRY(IOMMUDevice) next;
 } IOMMUDevice;

 typedef struct IOMMUPciBus {
@@ -56,6 +57,7 @@ typedef struct VirtIOIOMMU {
     GTree *domains;
     QemuMutex mutex;
     GTree *endpoints;
+    QLIST_HEAD(, IOMMUDevice) notifiers_list;
 } VirtIOIOMMU;

 #endif
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index c00a55348d..623b477b9c 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }

+static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
+                                    hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
+    entry.perm = IOMMU_RW;
+    entry.translated_addr = paddr;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
+static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
+                                      hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
+    entry.perm = IOMMU_NONE;
+    entry.translated_addr = 0;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
     if (!ep->domain) {
@@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     uint64_t virt_start = le64_to_cpu(req->virt_start);
     uint64_t virt_end = le64_to_cpu(req->virt_end);
     uint32_t flags = le32_to_cpu(req->flags);
+    hwaddr size = virt_end - virt_start + 1;
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUInterval *interval;
     VirtIOIOMMUMapping *mapping;
+    VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;

     if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) {
         return VIRTIO_IOMMU_S_INVAL;
@@ -339,9 +374,38 @@ static int virtio_iommu_map(VirtIOIOMMU *s,

     g_tree_insert(domain->mappings, interval, mapping);

+    /* All devices in an address-space share mapping */
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == sdev->devfn) {
+                virtio_iommu_notify_map(&sdev->iommu_mr,
+                                        virt_start, phys_start, size);
+            }
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }

+static void virtio_iommu_remove_mapping(VirtIOIOMMU *s,
+                                        VirtIOIOMMUDomain *domain,
+                                        VirtIOIOMMUInterval *interval)
+{
+    VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
+
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == sdev->devfn) {
+                virtio_iommu_notify_unmap(&sdev->iommu_mr,
+                                          interval->low,
+                                          interval->high - interval->low + 1);
+            }
+        }
+    }
+    g_tree_remove(domain->mappings, (gpointer)(interval));
+}
+
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
                               struct virtio_iommu_req_unmap *req)
 {
@@ -368,7 +432,7 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         uint64_t current_high = iter_key->high;

         if (interval.low <= current_low && interval.high >= current_high) {
-            g_tree_remove(domain->mappings, iter_key);
+            virtio_iommu_remove_mapping(s, domain, iter_key);
             trace_virtio_iommu_unmap_done(domain_id, current_low, current_high);
         } else {
             ret = VIRTIO_IOMMU_S_RANGE;
@@ -664,6 +728,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOIOMMU *s = VIRTIO_IOMMU(dev);

+    QLIST_INIT(&s->notifiers_list);
     virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU, sizeof(struct vi
[PATCH v8 2/8] memory: Add interface to set iommu page size mask
Allow setting the page size mask supported by an IOMMU. This is
required to expose a page size mask compatible with the host through
virtio-iommu.

Signed-off-by: Bharat Bhushan
---
v7->v8:
 - new patch

 include/exec/memory.h | 20 ++++++++++++++++++++
 memory.c              | 10 ++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index e85b7de99a..063c424854 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -355,6 +355,16 @@ typedef struct IOMMUMemoryRegionClass {
      * @iommu: the IOMMUMemoryRegion
      */
     int (*num_indexes)(IOMMUMemoryRegion *iommu);
+
+    /*
+     * Set supported IOMMU page size
+     *
+     * Optional method: if this is supported then set page size that
+     * can be supported by IOMMU. This is called to set supported page
+     * size as per host Linux.
+     */
+    void (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
+                                     uint64_t page_size_mask);
 } IOMMUMemoryRegionClass;

 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
@@ -1363,6 +1373,16 @@ int memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
  */
 int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr);

+/**
+ * memory_region_iommu_set_page_size_mask: set the page sizes
+ * supported by the iommu.
+ *
+ * @iommu_mr: the memory region
+ * @page_size_mask: supported page size mask
+ */
+void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
+                                            uint64_t page_size_mask);
+
 /**
  * memory_region_name: get a memory region's name
  *
diff --git a/memory.c b/memory.c
index aeaa8dcc9e..14c8783084 100644
--- a/memory.c
+++ b/memory.c
@@ -1833,6 +1833,16 @@ static int memory_region_update_iommu_notify_flags(IOMMUMemoryRegion *iommu_mr,
     return ret;
 }

+void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
+                                            uint64_t page_size_mask)
+{
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
+
+    if (imrc->iommu_set_page_size_mask) {
+        imrc->iommu_set_page_size_mask(iommu_mr, page_size_mask);
+    }
+}
+
 int memory_region_register_iommu_notifier(MemoryRegion *mr,
                                           IOMMUNotifier *n, Error **errp)
 {
--
2.17.1
[PATCH v8 3/8] vfio: set iommu page size as per host supported page size
Set the IOMMU supported page size mask to the same page size mask the host
Linux supports.

Signed-off-by: Bharat Bhushan
---
v7->v8:
 - new patch

 hw/vfio/common.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index c586edf47a..6ea50d696f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -635,6 +635,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
                             int128_get64(llend),
                             iommu_idx);
 
+        memory_region_iommu_set_page_size_mask(giommu->iommu,
+                                               container->pgsizes);
+
         ret = memory_region_register_iommu_notifier(section->mr, &giommu->n,
                                                     &err);
         if (ret) {
-- 
2.17.1
[PATCH v8 0/8] virtio-iommu: VFIO integration
This patch series integrates VFIO with virtio-iommu. This is applicable only
to PCI pass-through with virtio-iommu.

This series is available at:
https://github.com/bharat-bhushan-devel/qemu.git virtio-iommu-vfio-integration-v8

This is tested by assigning more than one PCI device to the virtual machine.

This series is based on:
 - virtio-iommu device emulation by Eric Auger.
   [v16,00/10] VIRTIO-IOMMU device
   https://github.com/eauger/qemu/tree/v4.2-virtio-iommu-v16
 - Linux 5.6.0-rc4

v7->v8:
 - Set page size mask as per host; this fixes an issue with 64K host/guest
 - Device list from IOMMUDevice directly, removed VirtioIOMMUNotifierNode
 - Add missing ep->viommu init on post-load

v6->v7:
 - corrected email-address

v5->v6:
 - Rebase to v16 version from Eric
 - Tested with upstream Linux
 - Added a patch from Eric/myself on removing the mmio-region error print in vfio

v4->v5:
 - Rebase to v9 version from Eric
 - PCIe device hotplug fix
 - Added patch 1/5 from Eric's previous series (Eric somehow dropped it in the
   last version)
 - Patch "Translate the MSI doorbell in kvm_arch_fixup_msi_route" already
   integrated with vsmmu3

v3->v4:
 - Rebase to v4 version from Eric
 - Fixes from Eric with DPDK in VM
 - Logical division in multiple patches

v2->v3:
 - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device",
   which is based on top of v2.10-rc0
 - Fixed issue with two PCI devices
 - Addressed review comments

v1->v2:
 - Added trace events
 - Removed vSMMU3 link in patch description

Bharat Bhushan (8):
  hw/vfio/common: Remove error print on mmio region translation by viommu
  memory: Add interface to set iommu page size mask
  vfio: set iommu page size as per host supported page size
  virtio-iommu: set supported page size mask
  virtio-iommu: Add iommu notifier for map/unmap
  virtio-iommu: Call iommu notifier for attach/detach
  virtio-iommu: add iommu replay
  virtio-iommu: add iommu notifier memory-region

 include/exec/memory.h            |  20 +++
 include/hw/virtio/virtio-iommu.h |   2 +
 hw/vfio/common.c                 |   5 +-
 hw/virtio/virtio-iommu.c         | 192 ++++++++++++++++++++++++++++++-
 memory.c                         |  10 ++
 hw/virtio/trace-events           |   5 +
 6 files changed, 231 insertions(+), 3 deletions(-)

-- 
2.17.1
[PATCH v8 1/8] hw/vfio/common: Remove error print on mmio region translation by viommu
On ARM, the MSI doorbell is translated by the virtual IOMMU. As such,
address_space_translate() returns the MSI controller MMIO region and we get
an "iommu map to non memory area" message. Let's remove the latter.

Signed-off-by: Eric Auger
Signed-off-by: Bharat Bhushan
---
 hw/vfio/common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5ca11488d6..c586edf47a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
                                  &xlat, &len, writable,
                                  MEMTXATTRS_UNSPECIFIED);
     if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
         return false;
     }
-- 
2.17.1
[PATCH v8 4/8] virtio-iommu: set supported page size mask
Add the optional interface to set the page size mask. Currently this sets
the global configuration; it is not per-endpoint.

Signed-off-by: Bharat Bhushan
---
v7->v8:
 - new patch

 hw/virtio/virtio-iommu.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 4cee8083bc..c00a55348d 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -650,6 +650,15 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     return (ua > ub) - (ua < ub);
 }
 
+static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
+                                            uint64_t page_size_mask)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+
+    s->config.page_size_mask = page_size_mask;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -865,6 +874,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.17.1
Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
Hi Jean, On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker wrote: > > On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote: > > Hi Jean, > > > > On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker > > wrote: > > > > > > Hi Bharat, > > > > > > Could you Cc me on your next posting? Unfortunately I don't have much > > > hardware for testing this at the moment, but I might be able to help a > > > little on the review. > > > > > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote: > > > > > >>> First issue is: your guest can use 4K page and your host can use > > > > > >>> 64KB > > > > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must > > > > > >>> devise > > > > > >>> a way to pass the host settings to the VIRTIO-IOMMU device. > > > > > >>> > > > > > >>> Even with 64KB pages, it did not work for me. I have obviously > > > > > >>> not the > > > > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due > > > > > >>> to > > > > > >>> some wrong notifications somewhere. I will try to investigate on > > > > > >>> my side. > > > > > >>> > > > > > >>> Did you test with VFIO on your side? > > > > > >> > > > > > >> I did not tried with different page sizes, only tested with 4K > > > > > >> page size. > > > > > >> > > > > > >> Yes it works, I tested with two n/w device assigned to VM, both > > > > > >> interfaces works > > > > > >> > > > > > >> First I will try with 64k page size. > > > > > > > > > > > > 64K page size does not work for me as well, > > > > > > > > > > > > I think we are not passing correct page_size_mask here > > > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is > > > > > > 0xf000)) > > > > > I guess you mean with guest using 4K and host using 64K. > > > > > > > > > > > > We need to set this correctly as per host page size, correct? > > > > > Yes that's correct. 
We need to put in place a control path to retrieve > > > > > the page settings on host through VFIO to inform the virtio-iommu > > > > > device. > > > > > > > > > > Besides this issue, did you try with 64kB on host and guest? > > > > > > > > I tried Followings > > > > - 4k host and 4k guest - it works with v7 version > > > > - 64k host and 64k guest - it does not work with v7 > > > > hard-coded config.page_size_mask to 0x and it works > > > > > > You might get this from the iova_pgsize bitmap returned by > > > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there > > > is the usual problem of aggregating consistent properties, but I'm > > > guessing using the host page size as a granule here is safe enough. > > > > > > If it is a problem, we can add a PROBE property for page size mask, > > > allowing to define per-endpoint page masks. I have kernel patches > > > somewhere to do just that. > > > > I do not see we need page size mask per endpoint. > > > > While I am trying to understand what "page-size-mask" guest will work with > > > > - 4K page size host and 4k page size guest > > config.page_size_mask = 0x000 will work > > > > - 64K page size host and 64k page size guest > > config.page_size_mask = 0xfff will work > > > > - 64K page size host and 4k page size guest > >1) config.page_size_mask = 0x000 will also not work as > > VFIO in host expect iova and size to be aligned to 64k (PAGE_SIZE in > > host) > >2) config.page_size_mask = 0xfff will not work, iova > > initialization (in guest) expect minimum page-size supported by h/w to > > be equal to 4k (PAGE_SIZE in guest) > >Should we look to relax this in iova allocation code? > > Oh right, that's not great. Maybe the BUG_ON() can be removed, I'll ask on > the list. yes, the BUG_ON in iova_init. I tried with removing same and it worked, but not analyzed side effects. > > In the meantime, 64k granule is the right value to advertise to the guest > in this case. > Did you try 64k guest 4k host? 
no, will try. Thanks -Bharat > > Thanks, > Jean
Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
Hi Jean, On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker wrote: > > Hi Bharat, > > Could you Cc me on your next posting? Unfortunately I don't have much > hardware for testing this at the moment, but I might be able to help a > little on the review. > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote: > > > >>> First issue is: your guest can use 4K page and your host can use 64KB > > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must > > > >>> devise > > > >>> a way to pass the host settings to the VIRTIO-IOMMU device. > > > >>> > > > >>> Even with 64KB pages, it did not work for me. I have obviously not the > > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due to > > > >>> some wrong notifications somewhere. I will try to investigate on my > > > >>> side. > > > >>> > > > >>> Did you test with VFIO on your side? > > > >> > > > >> I did not tried with different page sizes, only tested with 4K page > > > >> size. > > > >> > > > >> Yes it works, I tested with two n/w device assigned to VM, both > > > >> interfaces works > > > >> > > > >> First I will try with 64k page size. > > > > > > > > 64K page size does not work for me as well, > > > > > > > > I think we are not passing correct page_size_mask here > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is > > > > 0xf000)) > > > I guess you mean with guest using 4K and host using 64K. > > > > > > > > We need to set this correctly as per host page size, correct? > > > Yes that's correct. We need to put in place a control path to retrieve > > > the page settings on host through VFIO to inform the virtio-iommu device. > > > > > > Besides this issue, did you try with 64kB on host and guest? 
> > > > I tried Followings > > - 4k host and 4k guest - it works with v7 version > > - 64k host and 64k guest - it does not work with v7 > > hard-coded config.page_size_mask to 0x and it works > > You might get this from the iova_pgsize bitmap returned by > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there > is the usual problem of aggregating consistent properties, but I'm > guessing using the host page size as a granule here is safe enough. > > If it is a problem, we can add a PROBE property for page size mask, > allowing to define per-endpoint page masks. I have kernel patches > somewhere to do just that. I do not see we need page size mask per endpoint. While I am trying to understand what "page-size-mask" guest will work with - 4K page size host and 4k page size guest config.page_size_mask = 0x000 will work - 64K page size host and 64k page size guest config.page_size_mask = 0xfff will work - 64K page size host and 4k page size guest 1) config.page_size_mask = 0x000 will also not work as VFIO in host expect iova and size to be aligned to 64k (PAGE_SIZE in host) 2) config.page_size_mask = 0xfff will not work, iova initialization (in guest) expect minimum page-size supported by h/w to be equal to 4k (PAGE_SIZE in guest) Should we look to relax this in iova allocation code? Thanks -Bharat > > Thanks, > Jean
Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
Hi Eric, On Mon, Mar 16, 2020 at 1:15 PM Bharat Bhushan wrote: > > Hi Eric, > > On Mon, Mar 16, 2020 at 1:02 PM Auger Eric wrote: > > > > Hi Bharat, > > > > On 3/16/20 7:41 AM, Bharat Bhushan wrote: > > > Hi Eric, > > > > > > On Fri, Mar 13, 2020 at 8:11 PM Auger Eric wrote: > > >> > > >> Hi Bharat > > >> > > >> On 3/13/20 8:48 AM, Bharat Bhushan wrote: > > >>> iommu-notifier are called when a device is attached > > >> IOMMU notifiers > > >>> or detached to as address-space. > > >>> This is needed for VFIO. > > >> and vhost for detach > > >>> > > >>> Signed-off-by: Bharat Bhushan > > >>> --- > > >>> hw/virtio/virtio-iommu.c | 47 > > >>> 1 file changed, 47 insertions(+) > > >>> > > >>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > > >>> index e51344a53e..2006f72901 100644 > > >>> --- a/hw/virtio/virtio-iommu.c > > >>> +++ b/hw/virtio/virtio-iommu.c > > >>> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint { > > >>> uint32_t id; > > >>> VirtIOIOMMUDomain *domain; > > >>> QLIST_ENTRY(VirtIOIOMMUEndpoint) next; > > >>> +VirtIOIOMMU *viommu; > > >> This needs specal care on post-load. When migrating the EPs, only the id > > >> is migrated. On post-load you need to set viommu as it is done for > > >> domain. migration is allowed with vhost. > > > > > > ok, I have not tried vhost/migration. Below change set viommu when > > > reconstructing endpoint. > > > > > > Yes I think this should be OK. > > > > By the end I did the series a try with vhost/vfio. with vhost it works > > (not with recent kernel though, but the issue may be related to kernel). > > With VFIO however it does not for me. > > > > First issue is: your guest can use 4K page and your host can use 64KB > > pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise > > a way to pass the host settings to the VIRTIO-IOMMU device. > > > > Even with 64KB pages, it did not work for me. 
I have obviously not the > > storm of VFIO_DMA_MAP failures but I have some, most probably due to > > some wrong notifications somewhere. I will try to investigate on my side. > > > > Did you test with VFIO on your side? > > I did not tried with different page sizes, only tested with 4K page size. > > Yes it works, I tested with two n/w device assigned to VM, both interfaces > works > > First I will try with 64k page size. 64K page size does not work for me as well, I think we are not passing correct page_size_mask here (config.page_size_mask is set to TARGET_PAGE_MASK ( which is 0xf000)) We need to set this correctly as per host page size, correct? Thanks -Bharat > > Thanks > -Bharat > > > > > Thanks > > > > Eric > > > > > > @@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer > > > key, gpointer value, > > > > > > QLIST_FOREACH(iter, >endpoint_list, next) { > > > iter->domain = d; > > > + iter->viommu = s; > > > g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter); > > > } > > > return false; /* continue the domain traversal */ > > > > > >>> } VirtIOIOMMUEndpoint; > > >>> > > >>> typedef struct VirtIOIOMMUInterval { > > >>> @@ -155,8 +156,44 @@ static void > > >>> virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova, > > >>> memory_region_notify_iommu(mr, 0, entry); > > >>> } > > >>> > > >>> +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer > > >>> value, > > >>> + gpointer data) > > >>> +{ > > >>> +VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key; > > >>> +IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data; > > >>> + > > >>> +virtio_iommu_notify_unmap(mr, interval->low, > > >>> + interval->high - interval->low + 1); > > >>> + > > >>> +return false; > > >>> +} > > >>> + > > >>> +static gboolean virtio_iom
Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
Hi Eric, On Mon, Mar 16, 2020 at 2:35 PM Auger Eric wrote: > > Hi Bharat, > > On 3/16/20 9:58 AM, Bharat Bhushan wrote: > > Hi Eric, > > > > On Mon, Mar 16, 2020 at 1:15 PM Bharat Bhushan > > wrote: > >> > >> Hi Eric, > >> > >> On Mon, Mar 16, 2020 at 1:02 PM Auger Eric wrote: > >>> > >>> Hi Bharat, > >>> > >>> On 3/16/20 7:41 AM, Bharat Bhushan wrote: > >>>> Hi Eric, > >>>> > >>>> On Fri, Mar 13, 2020 at 8:11 PM Auger Eric wrote: > >>>>> > >>>>> Hi Bharat > >>>>> > >>>>> On 3/13/20 8:48 AM, Bharat Bhushan wrote: > >>>>>> iommu-notifier are called when a device is attached > >>>>> IOMMU notifiers > >>>>>> or detached to as address-space. > >>>>>> This is needed for VFIO. > >>>>> and vhost for detach > >>>>>> > >>>>>> Signed-off-by: Bharat Bhushan > >>>>>> --- > >>>>>> hw/virtio/virtio-iommu.c | 47 > >>>>>> 1 file changed, 47 insertions(+) > >>>>>> > >>>>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > >>>>>> index e51344a53e..2006f72901 100644 > >>>>>> --- a/hw/virtio/virtio-iommu.c > >>>>>> +++ b/hw/virtio/virtio-iommu.c > >>>>>> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint { > >>>>>> uint32_t id; > >>>>>> VirtIOIOMMUDomain *domain; > >>>>>> QLIST_ENTRY(VirtIOIOMMUEndpoint) next; > >>>>>> +VirtIOIOMMU *viommu; > >>>>> This needs specal care on post-load. When migrating the EPs, only the id > >>>>> is migrated. On post-load you need to set viommu as it is done for > >>>>> domain. migration is allowed with vhost. > >>>> > >>>> ok, I have not tried vhost/migration. Below change set viommu when > >>>> reconstructing endpoint. > >>> > >>> > >>> Yes I think this should be OK. > >>> > >>> By the end I did the series a try with vhost/vfio. with vhost it works > >>> (not with recent kernel though, but the issue may be related to kernel). > >>> With VFIO however it does not for me. > >>> > >>> First issue is: your guest can use 4K page and your host can use 64KB > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. 
We must devise > >>> a way to pass the host settings to the VIRTIO-IOMMU device. > >>> > >>> Even with 64KB pages, it did not work for me. I have obviously not the > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due to > >>> some wrong notifications somewhere. I will try to investigate on my side. > >>> > >>> Did you test with VFIO on your side? > >> > >> I did not tried with different page sizes, only tested with 4K page size. > >> > >> Yes it works, I tested with two n/w device assigned to VM, both interfaces > >> works > >> > >> First I will try with 64k page size. > > > > 64K page size does not work for me as well, > > > > I think we are not passing correct page_size_mask here > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is > > 0xf000)) > I guess you mean with guest using 4K and host using 64K. > > > > We need to set this correctly as per host page size, correct? > Yes that's correct. We need to put in place a control path to retrieve > the page settings on host through VFIO to inform the virtio-iommu device. > > Besides this issue, did you try with 64kB on host and guest? I tried Followings - 4k host and 4k guest - it works with v7 version - 64k host and 64k guest - it does not work with v7 hard-coded config.page_size_mask to 0x and it works Thanks -Bharat > > Thanks > > Eric > > > > Thanks > > -Bharat > > > >> > >> Thanks > >> -Bharat > >> > >>> > >>> Thanks > >>> > >>> Eric > >>>> > >>>> @@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer > >>>> key, gpointer value, > >>>> > >>>> QLIST_FOREACH(iter, >endpoint
Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
Hi Eric, On Mon, Mar 16, 2020 at 1:02 PM Auger Eric wrote: > > Hi Bharat, > > On 3/16/20 7:41 AM, Bharat Bhushan wrote: > > Hi Eric, > > > > On Fri, Mar 13, 2020 at 8:11 PM Auger Eric wrote: > >> > >> Hi Bharat > >> > >> On 3/13/20 8:48 AM, Bharat Bhushan wrote: > >>> iommu-notifier are called when a device is attached > >> IOMMU notifiers > >>> or detached to as address-space. > >>> This is needed for VFIO. > >> and vhost for detach > >>> > >>> Signed-off-by: Bharat Bhushan > >>> --- > >>> hw/virtio/virtio-iommu.c | 47 > >>> 1 file changed, 47 insertions(+) > >>> > >>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > >>> index e51344a53e..2006f72901 100644 > >>> --- a/hw/virtio/virtio-iommu.c > >>> +++ b/hw/virtio/virtio-iommu.c > >>> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint { > >>> uint32_t id; > >>> VirtIOIOMMUDomain *domain; > >>> QLIST_ENTRY(VirtIOIOMMUEndpoint) next; > >>> +VirtIOIOMMU *viommu; > >> This needs specal care on post-load. When migrating the EPs, only the id > >> is migrated. On post-load you need to set viommu as it is done for > >> domain. migration is allowed with vhost. > > > > ok, I have not tried vhost/migration. Below change set viommu when > > reconstructing endpoint. > > > Yes I think this should be OK. > > By the end I did the series a try with vhost/vfio. with vhost it works > (not with recent kernel though, but the issue may be related to kernel). > With VFIO however it does not for me. > > First issue is: your guest can use 4K page and your host can use 64KB > pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise > a way to pass the host settings to the VIRTIO-IOMMU device. > > Even with 64KB pages, it did not work for me. I have obviously not the > storm of VFIO_DMA_MAP failures but I have some, most probably due to > some wrong notifications somewhere. I will try to investigate on my side. > > Did you test with VFIO on your side? 
I did not tried with different page sizes, only tested with 4K page size. Yes it works, I tested with two n/w device assigned to VM, both interfaces works First I will try with 64k page size. Thanks -Bharat > > Thanks > > Eric > > > > @@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer > > key, gpointer value, > > > > QLIST_FOREACH(iter, >endpoint_list, next) { > > iter->domain = d; > > + iter->viommu = s; > > g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter); > > } > > return false; /* continue the domain traversal */ > > > >>> } VirtIOIOMMUEndpoint; > >>> > >>> typedef struct VirtIOIOMMUInterval { > >>> @@ -155,8 +156,44 @@ static void > >>> virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova, > >>> memory_region_notify_iommu(mr, 0, entry); > >>> } > >>> > >>> +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value, > >>> + gpointer data) > >>> +{ > >>> +VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key; > >>> +IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data; > >>> + > >>> +virtio_iommu_notify_unmap(mr, interval->low, > >>> + interval->high - interval->low + 1); > >>> + > >>> +return false; > >>> +} > >>> + > >>> +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value, > >>> + gpointer data) > >>> +{ > >>> +VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value; > >>> +VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key; > >>> +IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data; > >>> + > >>> +virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr, > >>> +interval->high - interval->low + 1); > >>> + > >>> +return false; > >>> +} > >>> + > >>> static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint > >>> *ep) > >>> { > >>> +VirtioIOMMUNotifierNo
Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
Hi Eric, On Fri, Mar 13, 2020 at 8:11 PM Auger Eric wrote: > > Hi Bharat > > On 3/13/20 8:48 AM, Bharat Bhushan wrote: > > iommu-notifier are called when a device is attached > IOMMU notifiers > > or detached to as address-space. > > This is needed for VFIO. > and vhost for detach > > > > Signed-off-by: Bharat Bhushan > > --- > > hw/virtio/virtio-iommu.c | 47 > > 1 file changed, 47 insertions(+) > > > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > > index e51344a53e..2006f72901 100644 > > --- a/hw/virtio/virtio-iommu.c > > +++ b/hw/virtio/virtio-iommu.c > > @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint { > > uint32_t id; > > VirtIOIOMMUDomain *domain; > > QLIST_ENTRY(VirtIOIOMMUEndpoint) next; > > +VirtIOIOMMU *viommu; > This needs specal care on post-load. When migrating the EPs, only the id > is migrated. On post-load you need to set viommu as it is done for > domain. migration is allowed with vhost. ok, I have not tried vhost/migration. Below change set viommu when reconstructing endpoint. 
@@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer key, gpointer value, QLIST_FOREACH(iter, >endpoint_list, next) { iter->domain = d; + iter->viommu = s; g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter); } return false; /* continue the domain traversal */ > > } VirtIOIOMMUEndpoint; > > > > typedef struct VirtIOIOMMUInterval { > > @@ -155,8 +156,44 @@ static void > > virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova, > > memory_region_notify_iommu(mr, 0, entry); > > } > > > > +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value, > > + gpointer data) > > +{ > > +VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key; > > +IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data; > > + > > +virtio_iommu_notify_unmap(mr, interval->low, > > + interval->high - interval->low + 1); > > + > > +return false; > > +} > > + > > +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value, > > + gpointer data) > > +{ > > +VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value; > > +VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key; > > +IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data; > > + > > +virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr, > > +interval->high - interval->low + 1); > > + > > +return false; > > +} > > + > > static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint > > *ep) > > { > > +VirtioIOMMUNotifierNode *node; > > +VirtIOIOMMU *s = ep->viommu; > > +VirtIOIOMMUDomain *domain = ep->domain; > > + > > +QLIST_FOREACH(node, >notifiers_list, next) { > > +if (ep->id == node->iommu_dev->devfn) { > > +g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap, > > + >iommu_dev->iommu_mr); > I understand this should fo the job for domain removal did not get the comment, are you saying we should do this on domain removal? 
> > +} > > +} > > + > > if (!ep->domain) { > > return; > > } > > @@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint > > *virtio_iommu_get_endpoint(VirtIOIOMMU *s, > > } > > ep = g_malloc0(sizeof(*ep)); > > ep->id = ep_id; > > +ep->viommu = s; > > trace_virtio_iommu_get_endpoint(ep_id); > > g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep); > > return ep; > > @@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s, > > { > > uint32_t domain_id = le32_to_cpu(req->domain); > > uint32_t ep_id = le32_to_cpu(req->endpoint); > > +VirtioIOMMUNotifierNode *node; > > VirtIOIOMMUDomain *domain; > > VirtIOIOMMUEndpoint *ep; > > > > @@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s, > > > > ep->domain = domain; > > > > +/* Replay existing address space mappings on the associated memory > > region */ > maybe use the "domain" terminology here. ok, Thanks -Bharat > > +QLIST_FOREACH(node, >notifiers_list, next) { > > +if (ep_id == node->iommu_dev->devfn) { > > +g_tree_foreach(domain->mappings, virtio_iommu_mapping_map, > > + >iommu_dev->iommu_mr); > > +} > > +} > > + > > return VIRTIO_IOMMU_S_OK; > > } > > > > > Thanks > > Eric >
Re: [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap
Hi Eric, On Fri, Mar 13, 2020 at 7:55 PM Auger Eric wrote: > > Hi Bharat, > On 3/13/20 8:48 AM, Bharat Bhushan wrote: > > This patch extends VIRTIO_IOMMU_T_MAP/UNMAP request to > > notify registered iommu-notifier. Which will call vfio > s/iommu-notifier/iommu-notifiers > > notifier to map/unmap region in iommu. > can be any notifier (vhost/vfio). > > > > Signed-off-by: Bharat Bhushan > > Signed-off-by: Eric Auger > > --- > > hw/virtio/trace-events | 2 + > > hw/virtio/virtio-iommu.c | 66 +++- > > include/hw/virtio/virtio-iommu.h | 6 +++ > > 3 files changed, 73 insertions(+), 1 deletion(-) > > > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events > > index e83500bee9..d94a1cd8a3 100644 > > --- a/hw/virtio/trace-events > > +++ b/hw/virtio/trace-events > > @@ -73,3 +73,5 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc > > domain=%d" > > virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d" > > virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, > > uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d" > > virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t > > endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address > > =0x%"PRIx64 > > +virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, > > uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64 > > +virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t > > map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64 > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > > index 4cee8083bc..e51344a53e 100644 > > --- a/hw/virtio/virtio-iommu.c > > +++ b/hw/virtio/virtio-iommu.c > > @@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, > > gconstpointer b, gpointer user_data) > > } > > } > > > > +static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova, > > +hwaddr paddr, hwaddr size) > > +{ > > +IOMMUTLBEntry entry; > > + > > +entry.target_as = _space_memory; > > 
+entry.addr_mask = size - 1; > > + > > +entry.iova = iova; > > +trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size); > > +entry.perm = IOMMU_RW; > > +entry.translated_addr = paddr; > > + > > +memory_region_notify_iommu(mr, 0, entry); > > +} > > + > > +static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova, > > + hwaddr size) > > +{ > > +IOMMUTLBEntry entry; > > + > > +entry.target_as = _space_memory; > > +entry.addr_mask = size - 1; > > + > > +entry.iova = iova; > > +trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size); > > +entry.perm = IOMMU_NONE; > > +entry.translated_addr = 0; > > + > > +memory_region_notify_iommu(mr, 0, entry); > > +} > > + > > static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint > > *ep) > > { > > if (!ep->domain) { > > @@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s, > > uint64_t virt_start = le64_to_cpu(req->virt_start); > > uint64_t virt_end = le64_to_cpu(req->virt_end); > > uint32_t flags = le32_to_cpu(req->flags); > > +hwaddr size = virt_end - virt_start + 1; > > +VirtioIOMMUNotifierNode *node; > > VirtIOIOMMUDomain *domain; > > VirtIOIOMMUInterval *interval; > > VirtIOIOMMUMapping *mapping; > > +VirtIOIOMMUEndpoint *ep; > > > > if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) { > > return VIRTIO_IOMMU_S_INVAL; > > @@ -339,9 +374,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s, > > > > g_tree_insert(domain->mappings, interval, mapping); > > > > +/* All devices in an address-space share mapping */ > > +QLIST_FOREACH(node, >notifiers_list, next) { > > +QLIST_FOREACH(ep, >endpoint_list, next) { > > +if (ep->id == node->iommu_dev->devfn) { > > +virtio_iommu_notify_map(>iommu_dev->iommu_mr, > > +virt_start, phys_start, size); > > +} > > +} > > +} > > + > > return VIRTIO_IOMMU_S_OK; > > } > > &g
[PATCH v7 4/5] virtio-iommu: add iommu replay
The default replay does not work with virtio-iommu, so this patch provides
virtio-iommu replay functionality.

Signed-off-by: Bharat Bhushan
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index d94a1cd8a3..8bae651191 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -75,3 +75,4 @@ virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid)
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
+virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 2006f72901..bcc9895b76 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -760,6 +760,49 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     return (ua > ub) - (ua < ub);
 }
 
+static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    trace_virtio_iommu_remap(interval->low, mapping->phys_addr,
+                             interval->high - interval->low + 1);
+    /* unmap previous entry and map again */
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+    return false;
+}
+
+static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    uint32_t sid;
+    VirtIOIOMMUEndpoint *ep;
+
+    sid = virtio_iommu_get_bdf(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+
+    if (!s->endpoints) {
+        goto unlock;
+    }
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep || !ep->domain) {
+        goto unlock;
+    }
+
+    g_tree_foreach(ep->domain->mappings, virtio_iommu_remap, mr);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -976,6 +1019,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->replay = virtio_iommu_replay;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.17.1
[PATCH v7 5/5] virtio-iommu: add iommu notifier memory-region
Finally, add notify_flag_changed() to track IOMMU notifier flag changes on the memory region.

Signed-off-by: Bharat Bhushan
---
 hw/virtio/trace-events   |  2 ++
 hw/virtio/virtio-iommu.c | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 8bae651191..a486adcf6d 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uin
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
 virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
+virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
+virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index bcc9895b76..7744410f72 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -803,6 +803,37 @@ unlock:
     qemu_mutex_unlock(&s->mutex);
 }
 
+static int virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
+                                            IOMMUNotifierFlag old,
+                                            IOMMUNotifierFlag new,
+                                            Error **errp)
+{
+    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    VirtioIOMMUNotifierNode *node = NULL;
+    VirtioIOMMUNotifierNode *next_node = NULL;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
+        node = g_malloc0(sizeof(*node));
+        node->iommu_dev = sdev;
+        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
+        return 0;
+    }
+
+    /* update notifier node with new flags */
+    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+        if (node->iommu_dev == sdev) {
+            if (new == IOMMU_NOTIFIER_NONE) {
+                trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
+                QLIST_REMOVE(node, next);
+                g_free(node);
+            }
+        }
+    }
+    return 0;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1020,6 +1051,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
 
     imrc->translate = virtio_iommu_translate;
     imrc->replay = virtio_iommu_replay;
+    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.17.1
[PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap
This patch extends the VIRTIO_IOMMU_T_MAP/UNMAP request handling to notify registered iommu-notifiers, which in turn call the VFIO notifier to map/unmap the region in the IOMMU.

Signed-off-by: Bharat Bhushan
Signed-off-by: Eric Auger
---
 hw/virtio/trace-events           |  2 +
 hw/virtio/virtio-iommu.c         | 66 +++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-iommu.h |  6 +++
 3 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index e83500bee9..d94a1cd8a3 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -73,3 +73,5 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
+virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
+virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 4cee8083bc..e51344a53e 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }
 
+static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
+                                    hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
+    entry.perm = IOMMU_RW;
+    entry.translated_addr = paddr;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
+static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
+                                      hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
+    entry.perm = IOMMU_NONE;
+    entry.translated_addr = 0;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
     if (!ep->domain) {
@@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     uint64_t virt_start = le64_to_cpu(req->virt_start);
     uint64_t virt_end = le64_to_cpu(req->virt_end);
     uint32_t flags = le32_to_cpu(req->flags);
+    hwaddr size = virt_end - virt_start + 1;
+    VirtioIOMMUNotifierNode *node;
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUInterval *interval;
     VirtIOIOMMUMapping *mapping;
+    VirtIOIOMMUEndpoint *ep;
 
     if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) {
         return VIRTIO_IOMMU_S_INVAL;
@@ -339,9 +374,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
 
     g_tree_insert(domain->mappings, interval, mapping);
 
+    /* All devices in an address-space share mapping */
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == node->iommu_dev->devfn) {
+                virtio_iommu_notify_map(&node->iommu_dev->iommu_mr,
+                                        virt_start, phys_start, size);
+            }
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
+static void virtio_iommu_remove_mapping(VirtIOIOMMU *s, VirtIOIOMMUDomain *domain,
+                                        VirtIOIOMMUInterval *interval)
+{
+    VirtioIOMMUNotifierNode *node;
+    VirtIOIOMMUEndpoint *ep;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == node->iommu_dev->devfn) {
+                virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr,
+                                          interval->low,
+                                          interval->high - interval->low + 1);
+            }
+        }
+    }
+    g_tree_remove(domain->mappings, (gpointer)(interval));
+}
+
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
                               struct virtio_iommu_req_unmap *req)
 {
@@ -368,7 +431,7 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         uint64_t current_high = iter_key->high;
 
         if (interval.low <= current_low && interval.high >= current_high) {
-            g_tree_remove(domain->mappings, iter_key);
+            virtio_iommu_remove_mapping(s, domain, iter_key);
             trace_virtio_iommu_unmap_done(domain_id, current_
[PATCH v7 0/5] virtio-iommu: VFIO integration
This patch series integrates VFIO with virtio-iommu. This is only applicable for PCI pass-through with virtio-iommu.

This series is available at:
https://github.com/bharat-bhushan-devel/qemu.git virtio-iommu-vfio-integration-v7

This is tested with assigning more than one PCI device to the virtual machine.

This series is based on:
 - virtio-iommu device emulation by Eric Auger
   [v16,00/10] VIRTIO-IOMMU device
   https://github.com/eauger/qemu/tree/v4.2-virtio-iommu-v16
 - Linux 5.6.0-rc4

v6->v7:
 - Corrected email address

v5->v6:
 - Rebase to v16 version from Eric
 - Tested with upstream Linux
 - Added a patch from Eric/myself removing the mmio-region error print in vfio

v4->v5:
 - Rebase to v9 version from Eric
 - PCIe device hotplug fix
 - Added patch 1/5 from Eric's previous series (Eric somehow dropped it in the last version)
 - Patch "Translate the MSI doorbell in kvm_arch_fixup_msi_route" already integrated with vsmmu3

v3->v4:
 - Rebase to v4 version from Eric
 - Fixes from Eric with DPDK in VM
 - Logical division into multiple patches

v2->v3:
 - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device", which is based on top of v2.10-rc0
 - Fixed issue with two PCI devices
 - Addressed review comments

v1->v2:
 - Added trace events
 - Removed vSMMU3 link in patch description

Bharat Bhushan (5):
  hw/vfio/common: Remove error print on mmio region translation by viommu
  virtio-iommu: Add iommu notifier for map/unmap
  virtio-iommu: Call iommu notifier for attach/detach
  virtio-iommu: add iommu replay
  virtio-iommu: add iommu notifier memory-region

 hw/vfio/common.c                 |   2 -
 hw/virtio/trace-events           |   5 +
 hw/virtio/virtio-iommu.c         | 189 ++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-iommu.h |   6 +
 4 files changed, 199 insertions(+), 3 deletions(-)

-- 
2.17.1
[PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
IOMMU notifiers are called when a device is attached to or detached from an address-space. This is needed for VFIO.

Signed-off-by: Bharat Bhushan
---
 hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index e51344a53e..2006f72901 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
     uint32_t id;
     VirtIOIOMMUDomain *domain;
     QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
+    VirtIOIOMMU *viommu;
 } VirtIOIOMMUEndpoint;
 
 typedef struct VirtIOIOMMUInterval {
@@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
     memory_region_notify_iommu(mr, 0, entry);
 }
 
+static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
+                                           gpointer data)
+{
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    return false;
+}
+
+static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
+                                         gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+
+    return false;
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
+    VirtioIOMMUNotifierNode *node;
+    VirtIOIOMMU *s = ep->viommu;
+    VirtIOIOMMUDomain *domain = ep->domain;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        if (ep->id == node->iommu_dev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
+                           &node->iommu_dev->iommu_mr);
+        }
+    }
+
     if (!ep->domain) {
         return;
     }
@@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
     }
     ep = g_malloc0(sizeof(*ep));
     ep->id = ep_id;
+    ep->viommu = s;
     trace_virtio_iommu_get_endpoint(ep_id);
     g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
     return ep;
@@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
 {
     uint32_t domain_id = le32_to_cpu(req->domain);
     uint32_t ep_id = le32_to_cpu(req->endpoint);
+    VirtioIOMMUNotifierNode *node;
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUEndpoint *ep;
@@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
 
     ep->domain = domain;
 
+    /* Replay existing address space mappings on the associated memory region */
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        if (ep_id == node->iommu_dev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
+                           &node->iommu_dev->iommu_mr);
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
-- 
2.17.1
[PATCH v6 1/5] hw/vfio/common: Remove error print on mmio region translation by viommu
On ARM, the MSI doorbell is translated by the virtual IOMMU. As such, address_space_translate() returns the MSI controller MMIO region and we get an "iommu map to non memory area" message. Let's remove the latter.

Signed-off-by: Eric Auger
Signed-off-by: Bharat Bhushan
---
 hw/vfio/common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5ca11488d6..c586edf47a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
                                  &xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
 
     if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
         return false;
     }
-- 
2.17.1
Re: [Qemu-devel] [PATCH RFC v5 0/5] virtio-iommu: VFIO integration
Hi Eric,

On Fri, Feb 28, 2020 at 3:06 PM Auger Eric wrote:
> Hi Bharat,
>
> On 11/27/18 7:52 AM, Bharat Bhushan wrote:
> > This patch series integrates VFIO with virtio-iommu. This is
> > tested with assigning 2 pci devices to Virtual Machine.
> >
> > This version is mainly about rebasing on the v9 version of the
> > virtio-iommu device framework from Eric Auger.
> >
> > This patch series allows PCI pass-through using virtio-iommu.
> >
> > This series is based on:
> > - virtio-iommu kernel driver by Jean-Philippe Brucker
> >   [PATCH v5 0/7] Add virtio-iommu driver
> >   git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9
> >
> > - virtio-iommu device emulation by Eric Auger
> >   [RFC,v9,00/17] VIRTIO-IOMMU device
> >   https://github.com/eauger/qemu/tree/v3.1.0-rc2-virtio-iommu-v0.9
>
> Now we have the driver and the base qemu device upstream we may resume
> this activity to complete the VFIO integration. Do you intend to
> respin? Otherwise let me know if you want me to help.

Yes Eric, I am planning to respin the changes. Can you please point to
the latest changes (QEMU/Linux both)?

Thanks
-Bharat

> Thanks
>
> Eric
>
> > v4->v5:
> >  - Rebase to v9 version from Eric
> >  - PCIe device hotplug fix
> >  - Added Patch 1/5 from Eric's previous series (Eric somehow dropped
> >    it in the last version)
> >  - Patch "Translate the MSI doorbell in kvm_arch_fixup_msi_route"
> >    already integrated with vsmmu3
> >
> > v3->v4:
> >  - Rebase to v4 version from Eric
> >  - Fixes from Eric with DPDK in VM
> >  - Logical division in multiple patches
> >
> > v2->v3:
> >  - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device",
> >    which is based on top of v2.10-rc0
> >  - Fixed issue with two PCI devices
> >  - Addressed review comments
> >
> > v1->v2:
> >  - Added trace events
> >  - removed vSMMU3 link in patch description
> >
> > Bharat Bhushan (4):
> >   virtio-iommu: Add iommu notifier for iommu-map/unmap
> >   virtio-iommu: Call iommu notifier on attach/detach
> >   virtio-iommu: add virtio-iommu replay
> >   virtio-iommu: handle IOMMU Notifier flag changes
> >
> > Eric Auger (1):
> >   hw/vfio/common: Do not print error when viommu translates into an
> >     mmio region
> >
> >  hw/vfio/common.c                 |   2 -
> >  hw/virtio/trace-events           |   5 +
> >  hw/virtio/virtio-iommu.c         | 190 ++++++++++++++++++++++++++++++-
> >  include/hw/virtio/virtio-iommu.h |   6 +
> >  4 files changed, 198 insertions(+), 5 deletions(-)

-- 
-Bharat
Re: [Qemu-devel] [RFC v9 00/17] VIRTIO-IOMMU device
Hi Eric,

> -----Original Message-----
> From: Eric Auger
> Sent: Thursday, November 22, 2018 10:45 PM
> To: eric.auger.pro@gmail.com; eric.au...@redhat.com;
> qemu-devel@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org;
> m...@redhat.com; jean-philippe.bruc...@arm.com
> Cc: kevin.t...@intel.com; t...@semihalf.com; Bharat Bhushan
> ; pet...@redhat.com
> Subject: [RFC v9 00/17] VIRTIO-IOMMU device
>
> This series rebases the virtio-iommu device on qemu 3.1.0-rc2 and
> implements the v0.8(.1) virtio-iommu spec [1]. The PCI proxy for the
> virtio-iommu device is now available and needs to be instantiated from
> the command line using "-device virtio-iommu-pci".
> The iommu machvirt option is not used anymore to instantiate the
> virtio-iommu.
>
> At the moment only the virtio-iommu-device is functional in the ARM
> virt machine. Indeed, besides its instantiation, links between the PCIe
> end points and the IOMMU must be described. This is achieved by DT or
> ACPI description (IORT). This description currently is only done in
> ARM virt.
>
> Best Regards
>
> Eric
>
> This series can be found at:
> https://github.com/eauger/qemu/tree/v3.1.0-rc2-virtio-iommu-v0.9
>
> References:
> [1] [PATCH v3 0/7] Add virtio-iommu driver
>
> [2] guest branch featuring the virtio-iommu driver v0.8.1 + ACPI
> integration, not yet officially released by Jean:
> https://github.com/eauger/linux/tree/virtio-iommu-v0.8.1
>
> Testing:
> - tested with guest using virtio-net-pci
>   (,vhost=off,iommu_platform,disable-modern=off,disable-legacy=on)
>   and virtio-blk-pci
> - VFIO/VHOST integration is not part of this series
> - When using the virtio-blk-pci, some EDK2 FW versions feature
>   unmapped transactions and in that case the guest fails to boot.

I have tested this series with virtio and VFIO both.

Tested-by: Bharat Bhushan

Thanks
-Bharat

> History:
>
> v8 -> v9:
> - virtio-iommu-pci device needs to be instantiated from the command
>   line (RID is not imposed anymore).
> - tail structure properly initialized
>
> v7 -> v8:
> - virtio-iommu-pci added
> - virt instantiation modified
> - DT and ACPI modified to exclude the iommu RID from the mapping
> - VIRTIO_IOMMU_F_BYPASS, VIRTIO_F_VERSION_1 features exposed
>
> v6 -> v7:
> - rebase on qemu 3.0.0-rc3
> - minor update against v0.7
> - fix issue with EP not on pci.0 and ACPI probing
> - change the instantiation method
>
> v5 -> v6:
> - minor update against v0.6 spec
> - fix g_hash_table_lookup in virtio_iommu_find_add_as
> - replace some error_reports by qemu_log_mask(LOG_GUEST_ERROR, ...)
>
> v4 -> v5:
> - event queue and fault reporting
> - we now return the IOAPIC MSI region if the virtio-iommu is
>   instantiated in a PC machine.
> - we bypass transactions on MSI HW region and fault on reserved ones.
> - We support ACPI boot with mach-virt (based on IORT proposal) > - We moved to the new driver naming conventions > - simplified mach-virt instantiation > - worked around the disappearing of pci_find_primary_bus > - in virtio_iommu_translate, check the dev->as is not NULL > - initialize as->device_list in virtio_iommu_get_as > - initialize bufstate.error to false in virtio_iommu_probe > > v3 -> v4: > - probe request support although no reserved region is returned at > the moment > - unmap semantics less strict, as specified in v0.4 > - device registration, attach/detach revisited > - split into smaller patches to ease review > - propose a way to inform the IOMMU mr about the page_size_mask > of underlying HW IOMMU, if any > - remove warning associated with the translation of the MSI doorbell > > v2 -> v3: > - rebase on top of 2.10-rc0 and especially > [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion > - add mutex init > - fix as->mappings deletion using g_tree_ref/unref > - when a dev is attached whereas it is already attached to > another address space, first detach it > - fix some error values > - page_sizes = TARGET_PAGE_MASK; > - I haven't changed the unmap() semantics yet, waiting for the > ne
Re: [Qemu-devel] [RFC v9 15/17] hw/arm/virt: Add the virtio-iommu device tree mappings
> -Original Message- > From: Eric Auger > Sent: Thursday, November 22, 2018 10:46 PM > To: eric.auger@gmail.com; eric.au...@redhat.com; qemu- > de...@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > m...@redhat.com; jean-philippe.bruc...@arm.com > Cc: kevin.t...@intel.com; t...@semihalf.com; Bharat Bhushan > ; pet...@redhat.com > Subject: [RFC v9 15/17] hw/arm/virt: Add the virtio-iommu device tree > mappings > > Adds the "virtio,pci-iommu" node in the host bridge node and the RID > mapping, excluding the IOMMU RID. > > Signed-off-by: Eric Auger Reviewed-by: Bharat Bhushan > > --- > > v8 -> v9: > - disable msi-bypass property > - addition of the subnode is handled is the hotplug handler > and IOMMU RID is notimposed anymore > > v6 -> v7: > - align to the smmu instantiation code > > v4 -> v5: > - VirtMachineClass no_iommu added in this patch > - Use object_resolve_path_type > --- > hw/arm/virt.c | 57 +-- > include/hw/arm/virt.h | 2 ++ > 2 files changed, 52 insertions(+), 7 deletions(-) > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index a2b8d8f7c2..b2bbb0ef49 > 100644 > --- a/hw/arm/virt.c > +++ b/hw/arm/virt.c > @@ -29,6 +29,7 @@ > */ > > #include "qemu/osdep.h" > +#include "monitor/qdev.h" > #include "qapi/error.h" > #include "hw/sysbus.h" > #include "hw/arm/arm.h" > @@ -49,6 +50,7 @@ > #include "qemu/bitops.h" > #include "qemu/error-report.h" > #include "hw/pci-host/gpex.h" > +#include "hw/virtio/virtio-pci.h" > #include "hw/arm/sysbus-fdt.h" > #include "hw/platform-bus.h" > #include "hw/arm/fdt.h" > @@ -59,6 +61,7 @@ > #include "qapi/visitor.h" > #include "standard-headers/linux/input.h" > #include "hw/arm/smmuv3.h" > +#include "hw/virtio/virtio-iommu.h" > > #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \ > static void virt_##major##_##minor##_class_init(ObjectClass *oc, \ @@ - > 1085,6 +1088,33 @@ static void create_smmu(const VirtMachineState *vms, > qemu_irq *pic, > g_free(node); > } > > +static void 
create_virtio_iommu(VirtMachineState *vms, Error **errp) { > +const char compat[] = "virtio,pci-iommu"; > +uint16_t bdf = vms->virtio_iommu_bdf; > +char *node; > + > +vms->iommu_phandle = qemu_fdt_alloc_phandle(vms->fdt); > + > +node = g_strdup_printf("%s/virtio_iommu@%d", vms- > >pciehb_nodename, bdf); > +qemu_fdt_add_subnode(vms->fdt, node); > +qemu_fdt_setprop(vms->fdt, node, "compatible", compat, > sizeof(compat)); > +qemu_fdt_setprop_sized_cells(vms->fdt, node, "reg", > + 1, bdf << 8 /* phys.hi */, > + 1, 0/* phys.mid */, > + 1, 0/* phys.lo */, > + 1, 0/* size.hi */, > + 1, 0/* size.low */); > + > +qemu_fdt_setprop_cell(vms->fdt, node, "#iommu-cells", 1); > +qemu_fdt_setprop_cell(vms->fdt, node, "phandle", vms- > >iommu_phandle); > +g_free(node); > + > +qemu_fdt_setprop_cells(vms->fdt, vms->pciehb_nodename, "iommu- > map", > + 0x0, vms->iommu_phandle, 0x0, bdf, > + bdf + 1, vms->iommu_phandle, bdf + 1, 0x > +- bdf); } > + > static void create_pcie(VirtMachineState *vms, qemu_irq *pic) { > hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base; @@ - > 1162,7 +1192,7 @@ static void create_pcie(VirtMachineState *vms, > qemu_irq *pic) > } > } > > -nodename = g_strdup_printf("/pcie@%" PRIx64, base); > +nodename = vms->pciehb_nodename = g_strdup_printf("/pcie@%" > PRIx64, > + base); > qemu_fdt_add_subnode(vms->fdt, nodename); > qemu_fdt_setprop_string(vms->fdt, nodename, > "compatible", "pci-host-ecam-generic"); @@ > -1205,13 > +1235,17 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic) > if (vms->iommu) { > vms->iommu_phandle = qemu_fdt_alloc_phandle(vms->fdt); > > -create_smmu(vms, pic, pci->bus); > +switch (vms->iommu) { > +case VIRT_IOMMU_SMMUV3: > +create_smmu(vms, pic, pci->bus); > +qemu_fdt_setprop_cells(vms->f
[Qemu-devel] [PATCH RFC v5 4/5] virtio-iommu: add virtio-iommu replay
For virtio-iommu, on replay first unmap any previous iommu mapping and
then map in the IOMMU as per the guest IOMMU mappings. Also, if a
virtual IOMMU does not provide its own replay, the default
memory_region_iommu_replay() calls "imrc->translate()", while the
virtio-iommu translate() expects the device to be registered before it
is called. So having a virtio-iommu replay lets us take no action if
the device is not yet probed/attached.

Signed-off-by: Bharat Bhushan

---
v4->v5:
 - Rebase to v9 version from Eric (no change)

 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 420b1e471b..f29a027258 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -74,3 +74,4 @@ virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address=0x%"PRIx64
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%"PRIx64" size=0x%"PRIx64""
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64""
+virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%"PRIx64" size=0x%"PRIx64""
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 7e8149e719..c9d8b3aa4c 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1015,6 +1015,43 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     return (ua > ub) - (ua < ub);
 }
 
+static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
+{
+    viommu_mapping *mapping = (viommu_mapping *) value;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    trace_virtio_iommu_remap(mapping->virt_addr, mapping->phys_addr,
+                             mapping->size);
+    /* unmap previous entry and map again */
+    virtio_iommu_notify_unmap(mr, mapping->virt_addr, mapping->size);
+
+    virtio_iommu_notify_map(mr, mapping->virt_addr, mapping->phys_addr,
+                            mapping->size);
+    return false;
+}
+
+static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    uint32_t sid;
+    viommu_endpoint *ep;
+
+    sid = virtio_iommu_get_sid(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep || !ep->domain) {
+        goto unlock;
+    }
+
+    g_tree_foreach(ep->domain->mappings, virtio_iommu_remap, mr);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1129,6 +1166,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->replay = virtio_iommu_replay;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.19.1
[Qemu-devel] [PATCH RFC v5 5/5] virtio-iommu: handle IOMMU Notifier flag changes
Finally handle the IOMMU Notifier flag changes for the iommu-memory region.

Signed-off-by: Bharat Bhushan

---
v4->v5:
 - Rebase to v9 version from Eric (no change)

 hw/virtio/trace-events   |  2 ++
 hw/virtio/virtio-iommu.c | 31 +++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index f29a027258..8c1d77b0c2 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -75,3 +75,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr)
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%"PRIx64" size=0x%"PRIx64""
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64""
 virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%"PRIx64" size=0x%"PRIx64""
+virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
+virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index c9d8b3aa4c..adc37ddf1b 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1052,6 +1052,36 @@ unlock:
     qemu_mutex_unlock(&s->mutex);
 }
 
+static void virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
+                                             IOMMUNotifierFlag old,
+                                             IOMMUNotifierFlag new)
+{
+    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    VirtioIOMMUNotifierNode *node = NULL;
+    VirtioIOMMUNotifierNode *next_node = NULL;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
+        node = g_malloc0(sizeof(*node));
+        node->iommu_dev = sdev;
+        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
+        return;
+    }
+
+    /* update notifier node with new flags */
+    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+        if (node->iommu_dev == sdev) {
+            if (new == IOMMU_NOTIFIER_NONE) {
+                trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
+                QLIST_REMOVE(node, next);
+                g_free(node);
+            }
+            return;
+        }
+    }
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1167,6 +1197,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
 
     imrc->translate = virtio_iommu_translate;
     imrc->replay = virtio_iommu_replay;
+    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.19.1
[Qemu-devel] [PATCH RFC v5 3/5] virtio-iommu: Call iommu notifier on attach/detach
This patch extends the ATTACH/DETACH command handling to call the IOMMU
notifiers to map/unmap the memory region in the IOMMU using VFIO. It
replays existing address space mappings on the attach command and
removes existing address space mappings on the detach command.

Signed-off-by: Bharat Bhushan
Signed-off-by: Eric Auger

---
v4->v5:
 - Rebase to v9 version from Eric
 - PCIe device hotplug fix

 hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 613a77521d..7e8149e719 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -131,8 +131,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
     memory_region_notify_iommu(mr, 0, entry);
 }
 
+static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
+                                           gpointer data)
+{
+    viommu_mapping *mapping = (viommu_mapping *) value;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_unmap(mr, mapping->virt_addr, mapping->size);
+
+    return false;
+}
+
+static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
+                                         gpointer data)
+{
+    viommu_mapping *mapping = (viommu_mapping *) value;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_map(mr, mapping->virt_addr, mapping->phys_addr,
+                            mapping->size);
+
+    return false;
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(viommu_endpoint *ep)
 {
+    VirtioIOMMUNotifierNode *node;
+    VirtIOIOMMU *s = ep->viommu;
+    viommu_domain *domain = ep->domain;
+    uint32_t sid;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        sid = virtio_iommu_get_sid(node->iommu_dev);
+        if (ep->id == sid) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
+                           &node->iommu_dev->iommu_mr);
+        }
+    }
+
     QLIST_REMOVE(ep, next);
     ep->domain = NULL;
 }
@@ -280,8 +316,10 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
 {
     uint32_t domain_id = le32_to_cpu(req->domain);
     uint32_t ep_id = le32_to_cpu(req->endpoint);
+    VirtioIOMMUNotifierNode *node;
     viommu_domain *domain;
     viommu_endpoint *ep;
+    uint32_t sid;
 
     trace_virtio_iommu_attach(domain_id, ep_id);
 
@@ -300,6 +338,15 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     ep->domain = domain;
     g_tree_ref(domain->mappings);
 
+    /* replay existing address space mappings on the associated mr */
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        sid = virtio_iommu_get_sid(node->iommu_dev);
+        if (ep->id == sid) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
+                           &node->iommu_dev->iommu_mr);
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
-- 
2.19.1
[Qemu-devel] [PATCH RFC v5 0/5] virtio-iommu: VFIO integration
This patch series integrates VFIO with virtio-iommu. This is tested by
assigning 2 PCI devices to a Virtual Machine. This version is mainly
about rebasing on the v9 version of the virtio-iommu device framework
from Eric Auger. This patch series allows PCI pass-through using
virtio-iommu.

This series is based on:
 - virtio-iommu kernel driver by Jean-Philippe Brucker
   [PATCH v5 0/7] Add virtio-iommu driver
   git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9
 - virtio-iommu device emulation by Eric Auger
   [RFC,v9,00/17] VIRTIO-IOMMU device
   https://github.com/eauger/qemu/tree/v3.1.0-rc2-virtio-iommu-v0.9

v4->v5:
 - Rebase to v9 version from Eric
 - PCIe device hotplug fix
 - Added Patch 1/5 from Eric's previous series (Eric somehow dropped it
   in the last version)
 - Patch "Translate the MSI doorbell in kvm_arch_fixup_msi_route"
   already integrated with vSMMUv3

v3->v4:
 - Rebase to v4 version from Eric
 - Fixes from Eric with DPDK in VM
 - Logical division in multiple patches

v2->v3:
 - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device",
   which is based on top of v2.10-rc0
 - Fixed issue with two PCI devices
 - Addressed review comments

v1->v2:
 - Added trace events
 - Removed vSMMUv3 link in patch description

Bharat Bhushan (4):
  virtio-iommu: Add iommu notifier for iommu-map/unmap
  virtio-iommu: Call iommu notifier on attach/detach
  virtio-iommu: add virtio-iommu replay
  virtio-iommu: handle IOMMU Notifier flag changes

Eric Auger (1):
  hw/vfio/common: Do not print error when viommu translates into an
    mmio region

 hw/vfio/common.c                 |   2 -
 hw/virtio/trace-events           |   5 +
 hw/virtio/virtio-iommu.c         | 190 ++++++++++++++++++++++++++-
 include/hw/virtio/virtio-iommu.h |   6 +
 4 files changed, 198 insertions(+), 5 deletions(-)

-- 
2.19.1
[Qemu-devel] [PATCH RFC v5 2/5] virtio-iommu: Add iommu notifier for iommu-map/unmap
This patch extends the VIRTIO_IOMMU_T_MAP/UNMAP request handling to
notify registered IOMMU notifiers. These notifiers map/unmap the
requested region in the IOMMU using VFIO.

Signed-off-by: Bharat Bhushan

---
v4->v5:
 - Rebase to v9 version from Eric
 - PCIe device hotplug fix

 hw/virtio/trace-events           |  2 +
 hw/virtio/virtio-iommu.c         | 74 ++++++++++++++++++++++++++++++--
 include/hw/virtio/virtio-iommu.h |  6 +++
 3 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 053a07b3fc..420b1e471b 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -72,3 +72,5 @@ virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid)
 virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t start, uint64_t end, uint32_t flags, size_t filled) "dev=%d, subtype=%d start=0x%"PRIx64" end=0x%"PRIx64" flags=%d filled=0x%lx"
 virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address=0x%"PRIx64
+virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%"PRIx64" size=0x%"PRIx64""
+virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64""
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 2ec01f3b9e..613a77521d 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -99,6 +99,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }
 
+static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
+                                    hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+    entry.iova = iova;
+    entry.perm = IOMMU_RW;
+    entry.translated_addr = paddr;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
+static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
+                                      hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+    entry.iova = iova;
+    entry.perm = IOMMU_NONE;
+    entry.translated_addr = 0;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(viommu_endpoint *ep)
 {
     QLIST_REMOVE(ep, next);
@@ -301,9 +333,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     uint64_t virt_start = le64_to_cpu(req->virt_start);
     uint64_t virt_end = le64_to_cpu(req->virt_end);
     uint32_t flags = le32_to_cpu(req->flags);
+    VirtioIOMMUNotifierNode *node;
+    viommu_endpoint *ep;
     viommu_domain *domain;
     viommu_interval *interval;
     viommu_mapping *mapping;
+    uint32_t sid;
 
     interval = g_malloc0(sizeof(*interval));
 
@@ -331,9 +366,40 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
 
     g_tree_insert(domain->mappings, interval, mapping);
 
+    /* All devices in an address-space share mapping */
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            sid = virtio_iommu_get_sid(node->iommu_dev);
+            if (ep->id == sid) {
+                virtio_iommu_notify_map(&node->iommu_dev->iommu_mr,
+                                        virt_start, phys_start, mapping->size);
+            }
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
+static void virtio_iommu_remove_mapping(VirtIOIOMMU *s, viommu_domain *domain,
+                                        viommu_interval *interval)
+{
+    VirtioIOMMUNotifierNode *node;
+    viommu_endpoint *ep;
+    uint32_t sid;
+
+    g_tree_remove(domain->mappings, (gpointer)(interval));
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            sid = virtio_iommu_get_sid(node->iommu_dev);
+            if (ep->id == sid) {
+                virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr,
+                                          interval->low,
+                                          interval->high - interval->low + 1);
+            }
+        }
+    }
+}
+
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
                               struct virtio_iommu_req_unmap *req)
 {
@@ -366,18 +432,18 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         current.high = high;
 
         if (low == interval.low && size >= mapping->size) {
-            g_tree_remo
[Qemu-devel] [PATCH RFC v5 1/5] hw/vfio/common: Do not print error when viommu translates into an mmio region
From: Eric Auger

On ARM, the MSI doorbell is translated by the virtual IOMMU. As such
address_space_translate() returns the MSI controller MMIO region and we
get an "iommu map to non memory area" message. Let's remove the latter.

Signed-off-by: Eric Auger
Signed-off-by: Bharat Bhushan

---
v5:
 - Added this patch from Eric's previous series (Eric somehow dropped it
   in the last version and it is needed for VFIO).

 hw/vfio/common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7c185e5a2e..fc40543121 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -328,8 +328,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
                                  &xlat, &len, writable,
                                  MEMTXATTRS_UNSPECIFIED);
     if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
         return false;
     }
-- 
2.19.1
Re: [Qemu-devel] [RFC v9 06/17] virtio-iommu: Endpoint and domains structs and helpers
Hi Eric, > -Original Message- > From: Auger Eric > Sent: Friday, November 23, 2018 1:23 PM > To: Bharat Bhushan ; > eric.auger@gmail.com; qemu-devel@nongnu.org; qemu- > a...@nongnu.org; peter.mayd...@linaro.org; m...@redhat.com; jean- > philippe.bruc...@arm.com > Cc: t...@semihalf.com; kevin.t...@intel.com; pet...@redhat.com > Subject: Re: [Qemu-devel] [RFC v9 06/17] virtio-iommu: Endpoint and > domains structs and helpers > > Hi Bharat, > > On 11/23/18 7:38 AM, Bharat Bhushan wrote: > > Hi Eric, > > > >> -Original Message- > >> From: Eric Auger > >> Sent: Thursday, November 22, 2018 10:45 PM > >> To: eric.auger@gmail.com; eric.au...@redhat.com; qemu- > >> de...@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > >> m...@redhat.com; jean-philippe.bruc...@arm.com > >> Cc: kevin.t...@intel.com; t...@semihalf.com; Bharat Bhushan > >> ; pet...@redhat.com > >> Subject: [RFC v9 06/17] virtio-iommu: Endpoint and domains structs > >> and helpers > >> > >> This patch introduce domain and endpoint internal datatypes. Both are > >> stored in RB trees. The domain owns a list of endpoints attached to it. > >> > >> Helpers to get/put end points and domains are introduced. > >> get() helpers will become static in subsequent patches. > >> > >> Signed-off-by: Eric Auger > >> > >> --- > >> > >> v6 -> v7: > >> - on virtio_iommu_find_add_as the bus number computation may > >> not be finalized yet so we cannot register the EPs at that time. > >> Hence, let's remove the get_endpoint and also do not use the > >> bus number for building the memory region name string (only > >> used for debug though). > > > > Endpoint registration from virtio_iommu_find_add_as to PROBE request. > > It is mentioned that " the bus number computation may not be finalized ". > Can you please give some more information. > > I am asking this because from vfio perspective translate/replay will be > called much before the PROBE request and endpoint needed to be > registered by that time. 
> When from virtio_iommu_find_add() gets called, there are cases where the > BDF of the device is not yet computed, typically if the EP is plugged on a > secondary bus. That's why I postponed the registration. Do you have idea > When you would need the registration to happen? We want the endpoint registeration before replay/translate() is called for both virtio/vfio and I am trying to understand when we should register the endpoint. I am looking at amd iommu, there pci_setup_iommu() provides the callback function which is called with "devfn" from pci_device_iommu_address_space(). Are you saying that devfn provided by pci_device_iommu_address_space() can be invalid? Thanks -Bharat > > Thanks > > Eric > > > > > > Thanks > > -Bharat > > > >> > >> v4 -> v5: > >> - initialize as->endpoint_list > >> > >> v3 -> v4: > >> - new separate patch > >> --- > >> hw/virtio/trace-events | 4 ++ > >> hw/virtio/virtio-iommu.c | 125 > >> ++- > >> 2 files changed, 128 insertions(+), 1 deletion(-) > >> > >> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index > >> 9270b0463e..4b15086872 100644 > >> --- a/hw/virtio/trace-events > >> +++ b/hw/virtio/trace-events > >> @@ -61,3 +61,7 @@ virtio_iommu_map(uint32_t domain_id, uint64_t > >> virt_start, uint64_t virt_end, uin virtio_iommu_unmap(uint32_t > >> domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d > virt_start=0x%"PRIx64" > >> virt_end=0x%"PRIx64 virtio_iommu_translate(const char *name, > >> uint32_t rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" > flag=%d" > >> virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s" > >> +virtio_iommu_get_endpoint(uint32_t ep_id) "Alloc endpoint=%d" > >> +virtio_iommu_put_endpoint(uint32_t ep_id) "Free endpoint=%d" > >> +virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d" > >> +virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d" > >> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > >> index > >> dead062baf..1b9c3ba416 100644 > >> 
--- a/hw/virtio/virtio-iommu.c > >> +++ b/hw/virtio/virtio-iommu.c > >> @@ -33,20 +33,124 @@ > >> #inc
Re: [Qemu-devel] [RFC v9 06/17] virtio-iommu: Endpoint and domains structs and helpers
Hi Eric, > -Original Message- > From: Eric Auger > Sent: Thursday, November 22, 2018 10:45 PM > To: eric.auger@gmail.com; eric.au...@redhat.com; qemu- > de...@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > m...@redhat.com; jean-philippe.bruc...@arm.com > Cc: kevin.t...@intel.com; t...@semihalf.com; Bharat Bhushan > ; pet...@redhat.com > Subject: [RFC v9 06/17] virtio-iommu: Endpoint and domains structs and > helpers > > This patch introduce domain and endpoint internal datatypes. Both are > stored in RB trees. The domain owns a list of endpoints attached to it. > > Helpers to get/put end points and domains are introduced. > get() helpers will become static in subsequent patches. > > Signed-off-by: Eric Auger > > --- > > v6 -> v7: > - on virtio_iommu_find_add_as the bus number computation may > not be finalized yet so we cannot register the EPs at that time. > Hence, let's remove the get_endpoint and also do not use the > bus number for building the memory region name string (only > used for debug though). Endpoint registration from virtio_iommu_find_add_as to PROBE request. It is mentioned that " the bus number computation may not be finalized ". Can you please give some more information. I am asking this because from vfio perspective translate/replay will be called much before the PROBE request and endpoint needed to be registered by that time. 
Thanks -Bharat > > v4 -> v5: > - initialize as->endpoint_list > > v3 -> v4: > - new separate patch > --- > hw/virtio/trace-events | 4 ++ > hw/virtio/virtio-iommu.c | 125 > ++- > 2 files changed, 128 insertions(+), 1 deletion(-) > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index > 9270b0463e..4b15086872 100644 > --- a/hw/virtio/trace-events > +++ b/hw/virtio/trace-events > @@ -61,3 +61,7 @@ virtio_iommu_map(uint32_t domain_id, uint64_t > virt_start, uint64_t virt_end, uin virtio_iommu_unmap(uint32_t domain_id, > uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" > virt_end=0x%"PRIx64 virtio_iommu_translate(const char *name, uint32_t > rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d" > virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s" > +virtio_iommu_get_endpoint(uint32_t ep_id) "Alloc endpoint=%d" > +virtio_iommu_put_endpoint(uint32_t ep_id) "Free endpoint=%d" > +virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d" > +virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d" > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index > dead062baf..1b9c3ba416 100644 > --- a/hw/virtio/virtio-iommu.c > +++ b/hw/virtio/virtio-iommu.c > @@ -33,20 +33,124 @@ > #include "hw/virtio/virtio-bus.h" > #include "hw/virtio/virtio-access.h" > #include "hw/virtio/virtio-iommu.h" > +#include "hw/pci/pci_bus.h" > +#include "hw/pci/pci.h" > > /* Max size */ > #define VIOMMU_DEFAULT_QUEUE_SIZE 256 > > +typedef struct viommu_domain { > +uint32_t id; > +GTree *mappings; > +QLIST_HEAD(, viommu_endpoint) endpoint_list; } viommu_domain; > + > +typedef struct viommu_endpoint { > +uint32_t id; > +viommu_domain *domain; > +QLIST_ENTRY(viommu_endpoint) next; > +VirtIOIOMMU *viommu; > +} viommu_endpoint; > + > +typedef struct viommu_interval { > +uint64_t low; > +uint64_t high; > +} viommu_interval; > + > static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev) { > return 
PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn); } > > +static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer > +user_data) { > +viommu_interval *inta = (viommu_interval *)a; > +viommu_interval *intb = (viommu_interval *)b; > + > +if (inta->high <= intb->low) { > +return -1; > +} else if (intb->high <= inta->low) { > +return 1; > +} else { > +return 0; > +} > +} > + > +static void > virtio_iommu_detach_endpoint_from_domain(viommu_endpoint > +*ep) { > +QLIST_REMOVE(ep, next); > +ep->domain = NULL; > +} > + > +viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s, > uint32_t > +ep_id); viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s, > +uint32_t ep_id) { > +viommu_endpoint *ep; > + > +ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(ep_id)); > +if (ep) { > +return ep; > +} > +ep = g_malloc0(sizeof(*ep)); > +ep->id = ep_i
Re: [Qemu-devel] [RFC v8 15/18] hw/arm/virt: Add virtio-iommu to the virt board
Hi Eric, > -Original Message- > From: Eric Auger > Sent: Friday, November 9, 2018 5:00 PM > To: eric.auger@gmail.com; eric.au...@redhat.com; qemu- > de...@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org; > m...@redhat.com; jean-philippe.bruc...@arm.com > Cc: kevin.t...@intel.com; t...@semihalf.com; Bharat Bhushan > ; pet...@redhat.com > Subject: [RFC v8 15/18] hw/arm/virt: Add virtio-iommu to the virt board > > Both the virtio-iommu device and its dedicated mmio transport get > instantiated when requested. > > Signed-off-by: Eric Auger > > --- > > v6 -> v7: > - align to the smmu instantiation code > > v4 -> v5: > - VirtMachineClass no_iommu added in this patch > - Use object_resolve_path_type > --- > hw/arm/virt.c | 48 > +--- > 1 file changed, 45 insertions(+), 3 deletions(-) > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index a2b8d8f7c2..f2994c4359 > 100644 > --- a/hw/arm/virt.c > +++ b/hw/arm/virt.c > @@ -29,6 +29,7 @@ > */ > > #include "qemu/osdep.h" > +#include "monitor/qdev.h" > #include "qapi/error.h" > #include "hw/sysbus.h" > #include "hw/arm/arm.h" > @@ -49,6 +50,7 @@ > #include "qemu/bitops.h" > #include "qemu/error-report.h" > #include "hw/pci-host/gpex.h" > +#include "hw/virtio/virtio-pci.h" > #include "hw/arm/sysbus-fdt.h" > #include "hw/platform-bus.h" > #include "hw/arm/fdt.h" > @@ -59,6 +61,7 @@ > #include "qapi/visitor.h" > #include "standard-headers/linux/input.h" > #include "hw/arm/smmuv3.h" > +#include "hw/virtio/virtio-iommu.h" > > #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \ > static void virt_##major##_##minor##_class_init(ObjectClass *oc, \ @@ - > 1085,6 +1088,33 @@ static void create_smmu(const VirtMachineState *vms, > qemu_irq *pic, > g_free(node); > } > > +static void create_virtio_iommu(VirtMachineState *vms, > +const char *pciehb_nodename, PCIBus > +*bus) { > +const char compat[] = "virtio,pci-iommu"; > +uint16_t bdf = 0x8; /* 00:01.0 */ Why we hardcoded "bdf = 8" ? 
When adding VFIO support I see the virtio-iommu PCI device has bdf = 0x10. Thanks -Bharat > +DeviceState *dev; > +char *node; > + > +dev = qdev_create(BUS(bus), TYPE_VIRTIO_IOMMU_PCI); > +object_property_set_bool(OBJECT(dev), true, "realized", > + &error_fatal); > + > +node = g_strdup_printf("%s/virtio_iommu@%d", pciehb_nodename, bdf); > +qemu_fdt_add_subnode(vms->fdt, node); > +qemu_fdt_setprop(vms->fdt, node, "compatible", compat, sizeof(compat)); > +qemu_fdt_setprop_sized_cells(vms->fdt, node, "reg", > + 1, bdf << 8 /* phys.hi */, > + 1, 0 /* phys.mid */, > + 1, 0 /* phys.lo */, > + 1, 0 /* size.hi */, > + 1, 0 /* size.low */); > + > +qemu_fdt_setprop_cell(vms->fdt, node, "#iommu-cells", 1); > +qemu_fdt_setprop_cell(vms->fdt, node, "phandle", vms->iommu_phandle); > +g_free(node); > +} > + > + > static void create_pcie(VirtMachineState *vms, qemu_irq *pic) { > hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base; @@ -1205,10 +1235,22 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic) > if (vms->iommu) { > vms->iommu_phandle = qemu_fdt_alloc_phandle(vms->fdt); > > -create_smmu(vms, pic, pci->bus); > +switch (vms->iommu) { > +case VIRT_IOMMU_SMMUV3: > +create_smmu(vms, pic, pci->bus); > +qemu_fdt_setprop_cells(vms->fdt, nodename, "iommu-map", > + 0x0, vms->iommu_phandle, 0x0, 0x1); > +break; > +case VIRT_IOMMU_VIRTIO: > +create_virtio_iommu(vms, nodename, pci->bus); > +qemu_fdt_setprop_cells(vms->fdt, nodename, "iommu-map", > + 0x0, vms->iommu_phandle, 0x0, 0x8, > + 0x9, vms->iommu_phandle, 0x9, 0xfff7); > +break; > +default: > +g_assert_not_reached(); > +} > > -qemu_fdt_setprop_cells(vms->fdt, nodename, "iommu-map", > - 0x0, vms->iommu_phandle, 0x0, 0x1); > } > > g_free(nodename); > -- > 2.17.2
Re: [Qemu-devel] [Qemu-arm] [PATCH v4 0/5] virtio-iommu: VFIO integration
Hi Alex, Eric, > -----Original Message----- > From: Qemu-devel [mailto:qemu-devel-bounces+bharat.bhushan=nxp@nongnu.org] On Behalf Of Bharat Bhushan > Sent: Friday, October 06, 2017 9:16 AM > To: Auger Eric <eric.au...@redhat.com>; Linu Cherian > <linuc.dec...@gmail.com> > Cc: peter.mayd...@linaro.org; kevin.t...@intel.com; m...@redhat.com; > marc.zyng...@arm.com; t...@semihalf.com; will.dea...@arm.com; > drjo...@redhat.com; qemu-devel@nongnu.org; > alex.william...@redhat.com; qemu-...@nongnu.org; > linu.cher...@cavium.com; eric.auger@gmail.com; > robin.mur...@arm.com; christoffer.d...@linaro.org; > bharatb.ya...@gmail.com > Subject: Re: [Qemu-devel] [Qemu-arm] [PATCH v4 0/5] virtio-iommu: VFIO > integration > > > > > >> Thanks > > >> > > >> Eric > > >>> > > >>> However you should be allowed to map 1 sg element of 5 pages and > > >>> then notify the host about this event I think. Still looking at the > > >>> code... > > >>> > > >>> I still can't reproduce the issue at the moment. What kind of > > >>> device are you assigning? > > >>> > > >>> Thanks > > >>> > > >>> Eric > > >>>> > > >>>> At least vfio_get_vaddr called from vfio_iommu_map_notify in Qemu > > >>>> expects the map size to be a power of 2. > > > > > > Actually I missed the most important here ;-) > > >>>> > > >>>> if (len & iotlb->addr_mask) { > > > This check looks suspicious to me. In our case the len is not > > > modified by the previous translation and it fails, I don't see why. > > > It should be valid to be able to notify 5 granules. > > > > So after discussion with Alex, it looks like the way we notify the host > > currently is wrong. We set the addr_mask to the mapping/unmapping size > > -1 whereas this should be a page mask instead (granule size or block size?). > > So if the guest maps 5 x 4kB pages we should send 5 notifications for > > each page and not a single one. It is unclear to me if we can notify > > with hugepage/block page size mask. Peter may confirm/infirm this.
in > > vsmmuv3 code I notify by granule or block size. My understanding is that the host provides its supported page sizes (page_size_mask), and the size of each notification to the host should be a supported page size or a multiple of supported page sizes. So if the guest maps 20K in a single request, and the supported page size is 4K, we can still send one 20K request. I am not sure why a multiple of a supported page size cannot be provided in one notification to the host. Thanks -Bharat > > > > Bharat, please can you add this to your TODO list? > > > > Linu, thanks a lot for the time you spent debugging this issue. > > Curiously on my side, it is really seldom hit but it is ... > > Thanks Linu and Eric, I added this to my todo list. > I am still not able to reproduce the issue. I tried with e1000 and will now > try with an ixgbe device. May I know which device can be used to reproduce this > issue? > > Thanks > -Bharat > > > > > Thanks! > > > > Eric > > > > > > Thanks > > > > > > Eric > > >>>> error_report("iommu has granularity incompatible with target > AS"); > > >>>> return false; > > >>>> } > > >>>> > > >>>> Just trying to understand how this is not hitting in your case. > > >>>> > > >>>> > > >>> > > >> > > >
Re: [Qemu-devel] [Qemu-arm] [PATCH v4 0/5] virtio-iommu: VFIO integration
> >> Thanks > >> > >> Eric > >>> > >>> However you should be allowed to map 1 sg element of 5 pages and > >>> then notify the host about this event I think. Still looking at the > >>> code... > >>> > >>> I still can't reproduce the issue at the moment. What kind of device > >>> are you assigning? > >>> > >>> Thanks > >>> > >>> Eric > > At least vfio_get_vaddr called from vfio_iommu_map_notify in Qemu > expects the map size to be a power of 2. > > > > Actually I missed the most important here ;-) > > if (len & iotlb->addr_mask) { > > This check looks suspicious to me. In our case the len is not > > modified by the previous translation and it fails, I don't see why. It > > should be valid to be able to notify 5 granules. > > So after discussion with Alex, it looks like the way we notify the host currently is > wrong. We set the addr_mask to the mapping/unmapping size > -1 whereas this should be a page mask instead (granule size or block size?). > So if the guest maps 5 x 4kB pages we should send 5 notifications for each > page and not a single one. It is unclear to me if we can notify with > hugepage/block page size mask. Peter may confirm/infirm this. in vsmmuv3 > code I notify by granule or block size. > > Bharat, please can you add this to your TODO list? > > Linu, thanks a lot for the time you spent debugging this issue. > Curiously on my side, it is really seldom hit but it is ... Thanks Linu and Eric, I added this to my todo list. I am still not able to reproduce the issue; I tried with e1000 and will now try with an ixgbe device. May I know which device can be used to reproduce this issue? Thanks -Bharat > > Thanks! > > Eric > > > > Thanks > > > > Eric > error_report("iommu has granularity incompatible with target > AS"); > return false; > } > > Just trying to understand how this is not hitting in your case. > > > >>> > >> > >
Re: [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request
> -Original Message- > From: Tomasz Nowicki [mailto:tnowi...@caviumnetworks.com] > Sent: Wednesday, September 27, 2017 4:23 PM > To: Eric Auger <eric.au...@redhat.com>; eric.auger@gmail.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org; jean- > philippe.bruc...@arm.com > Cc: will.dea...@arm.com; kevin.t...@intel.com; marc.zyng...@arm.com; > christoffer.d...@linaro.org; drjo...@redhat.com; w...@redhat.com; Bharat > Bhushan <bharat.bhus...@nxp.com>; pet...@redhat.com; > linuc.dec...@gmail.com > Subject: Re: [RFC v4 10/16] virtio-iommu: Implement probe request > > Hi Eric, > > On 19.09.2017 09:46, Eric Auger wrote: > > This patch implements the PROBE request. At the moment, no reserved > > regions are returned. > > > > At the moment reserved regions are stored per device. > > > > Signed-off-by: Eric Auger <eric.au...@redhat.com> > > > > --- > > > > [...] > > > + > > +static int virtio_iommu_fill_property(int devid, int type, > > + viommu_property_buffer > > +*bufstate) { > > +int ret = -ENOSPC; > > + > > +if (bufstate->filled + 4 >= VIOMMU_PROBE_SIZE) { > > +bufstate->error = true; > > +goto out; > > +} > > + > > +switch (type) { > > +case VIRTIO_IOMMU_PROBE_T_NONE: > > +ret = virtio_iommu_fill_none_prop(bufstate); > > +break; > > +case VIRTIO_IOMMU_PROBE_T_RESV_MEM: > > +{ > > +viommu_dev *dev = bufstate->dev; > > + > > +g_tree_foreach(dev->reserved_regions, > > + virtio_iommu_fill_resv_mem_prop, > > + bufstate); > > +if (!bufstate->error) { > > +ret = 0; > > +} > > +break; > > +} > > +default: > > +ret = -ENOENT; > > +break; > > +} > > +out: > > +if (ret) { > > +error_report("%s property of type=%d could not be filled (%d)," > > + " remaining size = 0x%lx", > > + __func__, type, ret, bufstate->filled); > > +} > > +return ret; > > +} > > + > > +static int virtio_iommu_probe(VirtIOIOMMU *s, > > + struct virtio_iommu_req_probe *req, > > + uint8_t *buf) { > > +uint32_t devid = 
le32_to_cpu(req->device); > > +int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type; > > +viommu_property_buffer bufstate; > > +viommu_dev *dev; > > +int ret; > > + > > +dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid)); > > +if (!dev) { > > +return -EINVAL; > > +} > > + > > +bufstate.start = buf; > > +bufstate.filled = 0; > > +bufstate.dev = dev; > > bufstate.error is not initialized, which may cause a false alarm in > virtio_iommu_fill_property() I observed the below print: "qemu-system-aarch64: virtio_iommu_fill_property property of type=2 could not be filled (-28), remaining size = 0x0 " When I initialized bufstate.error = 0, the error went away. Thanks -Bharat > > > + > > +while ((type = ctz32(prop_types)) != 32) { > > +ret = virtio_iommu_fill_property(devid, 1 << type, &bufstate); > > +if (ret) { > > +break; > > +} > > +prop_types &= ~(1 << type); > > +} > > +virtio_iommu_fill_property(devid, VIRTIO_IOMMU_PROBE_T_NONE, > > + &bufstate); > > + > > +return VIRTIO_IOMMU_S_OK; > > +} > > + > > #define get_payload_size(req) (\ > > sizeof((req)) - sizeof(struct virtio_iommu_req_tail)) > > > > @@ -433,6 +567,24 @@ static int > virtio_iommu_handle_unmap(VirtIOIOMMU *s, > > return virtio_iommu_unmap(s, &req); > > } > > Thanks, > Tomasz
Re: [Qemu-devel] [Qemu-arm] [PATCH v4 0/5] virtio-iommu: VFIO integration
Hi, > -----Original Message----- > From: Linu Cherian [mailto:linuc.dec...@gmail.com] > Sent: Wednesday, September 27, 2017 1:11 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com> > Cc: eric.au...@redhat.com; eric.auger@gmail.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org; kevin.t...@intel.com; > marc.zyng...@arm.com; t...@semihalf.com; will.dea...@arm.com; > drjo...@redhat.com; robin.mur...@arm.com; christoffer.d...@linaro.org; > bharatb.ya...@gmail.com > Subject: Re: [Qemu-arm] [PATCH v4 0/5] virtio-iommu: VFIO integration > > Hi, > > On Wed Sep 27, 2017 at 12:03:15PM +0530, Bharat Bhushan wrote: > > This patch series integrates VFIO/VHOST with virtio-iommu. > > > > This version is mainly about rebasing on the v4 version of the virtio-iommu > > device framework from Eric Auger and addressing review comments. > > > > This patch series allows PCI pass-through using virtio-iommu. > > > > This series is based on: > > - virtio-iommu kernel driver by Jean-Philippe Brucker > > [1] [RFC] virtio-iommu version 0.4 > > git://linux-arm.org/virtio-iommu.git branch viommu/v0.4 > > > > - virtio-iommu device emulation by Eric Auger. > >[RFC v4 00/16] VIRTIO-IOMMU device > >https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4 > > > > Changes are available at : https://github.com/bharaty/qemu.git > > virtio-iommu-vfio-integration-v4 > > > > # With the above sources, was trying to test the vfio-pci device assigned to > guest using Qemu. > # Both guest and host kernels are configured with 4k as page size. > # relevant qemu command snippet, > -device virtio-iommu-device -device virtio-blk-device,drive=hd0 \ > -net none -device vfio-pci,host=xxx > > > On guest booting, observed multiple messages as below, > > qemu-system-aarch64: iommu has granularity incompatible with target AS > > # On adding necessary prints, 0x5000 is len, 0x4fff is address mask > and the code expects the address mask to be 0xfff.
I have not seen these errors; I am also using 4K page size on both host and guest. Can you share the complete qemu command and log? Thanks -Bharat > > if (len & iotlb->addr_mask) { > error_report > > # vfio_dma_map is failing due to this error. > > Any pointers ? > > > > v3->v4: > > - Rebase to v4 version from Eric > > - Fixes from Eric with DPDK in VM > > - Logical division in multiple patches > > > > v2->v3: > > - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device" > >Which is based on top of v2.10-rc0 that > > - Fixed issue with two PCI devices > > - Addressed review comments > > > > v1->v2: > > - Added trace events > > - removed vSMMU3 link in patch description > > > > Bharat Bhushan (5): > > target/arm/kvm: Translate the MSI doorbell in > kvm_arch_fixup_msi_route > > virtio-iommu: Add iommu notifier for map/unmap > > virtio-iommu: Call iommu notifier for attach/detach > > virtio-iommu: add iommu replay > > virtio-iommu: add iommu notifier memory-region > > > > hw/virtio/trace-events | 5 ++ > > hw/virtio/virtio-iommu.c | 181 > ++- > > include/hw/virtio/virtio-iommu.h | 6 ++ > > target/arm/kvm.c | 27 ++ > > target/arm/trace-events | 3 + > > 5 files changed, 219 insertions(+), 3 deletions(-) > > > > -- > > 1.9.3 > > > > > > -- > Linu cherian
Re: [Qemu-devel] [PATCH v4 0/5] virtio-iommu: VFIO integration
Hi Peter, > -----Original Message----- > From: Peter Xu [mailto:pet...@redhat.com] > Sent: Wednesday, September 27, 2017 12:32 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com> > Cc: eric.au...@redhat.com; eric.auger@gmail.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org; w...@redhat.com; > kevin.t...@intel.com; marc.zyng...@arm.com; t...@semihalf.com; > will.dea...@arm.com; drjo...@redhat.com; robin.mur...@arm.com; > christoffer.d...@linaro.org; bharatb.ya...@gmail.com > Subject: Re: [PATCH v4 0/5] virtio-iommu: VFIO integration > > On Wed, Sep 27, 2017 at 06:46:18AM +0000, Bharat Bhushan wrote: > > Hi Peter, > > Hi, Bharat! > > > > > While testing vfio with virtio-iommu I observed one issue: when the virtio-iommu > device exists but the guest kernel does not have the virtio-iommu driver (not > enabled in config), IOMMU faults are reported on the host. > > > > This is because no mapping is created in the IOMMU, not even the default > guest-physical to real-physical one. Looking at vfio_listener_region_add(), it > does not create an initial mapping in the IOMMU and relies on the guest to create > mappings. Is this something known, or am I missing something? > > For VT-d, the trick is played using a dynamic IOMMU memory region. > Please refer to commit 558e0024a428 ("intel_iommu: allow dynamic switch of > IOMMU region") for more information. > > The whole idea is that the IOMMU region will only be enabled if the > guest enables it explicitly for the device. Otherwise (for your case, when the > guest driver is not loaded at all), the IOMMU region is by default off, and > the default GPA region will be used to build up the mapping (just like when > we don't have a vIOMMU at all). Thanks, Thanks, I will analyze and see how we can use this for virtio-iommu. Regards -Bharat > > -- > Peter Xu
Re: [Qemu-devel] [PATCH v4 0/5] virtio-iommu: VFIO integration
Hi Peter, While testing vfio with virtio-iommu I observed one issue: when the virtio-iommu device exists but the guest kernel does not have the virtio-iommu driver (not enabled in config), IOMMU faults are reported on the host. This is because no mapping is created in the IOMMU, not even the default guest-physical to real-physical one. Looking at vfio_listener_region_add(), it does not create an initial mapping in the IOMMU and relies on the guest to create mappings. Is this something known, or am I missing something? Thanks -Bharat > -----Original Message----- > From: Bharat Bhushan [mailto:bharat.bhus...@nxp.com] > Sent: Wednesday, September 27, 2017 12:03 PM > To: eric.au...@redhat.com; eric.auger@gmail.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org > Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > robin.mur...@arm.com; christoffer.d...@linaro.org; > bharatb.ya...@gmail.com; Bharat Bhushan <bharat.bhus...@nxp.com> > Subject: [PATCH v4 0/5] virtio-iommu: VFIO integration > > This patch series integrates VFIO/VHOST with virtio-iommu. > > This version is mainly about rebasing on the v4 version of the virtio-iommu device > framework from Eric Auger and addressing review comments. > > This patch series allows PCI pass-through using virtio-iommu. > > This series is based on: > - virtio-iommu kernel driver by Jean-Philippe Brucker > [1] [RFC] virtio-iommu version 0.4 > git://linux-arm.org/virtio-iommu.git branch viommu/v0.4 > > - virtio-iommu device emulation by Eric Auger.
>[RFC v4 00/16] VIRTIO-IOMMU device >https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4 > > Changes are available at : https://github.com/bharaty/qemu.git virtio- > iommu-vfio-integration-v4 > > v3->v4: > - Rebase to v4 version from Eric > - Fixes from Eric with DPDK in VM > - Logical division in multiple patches > > v2->v3: > - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device" >Which is based on top of v2.10-rc0 that > - Fixed issue with two PCI devices > - Addressed review comments > > v1->v2: > - Added trace events > - removed vSMMU3 link in patch description > > Bharat Bhushan (5): > target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route > virtio-iommu: Add iommu notifier for map/unmap > virtio-iommu: Call iommu notifier for attach/detach > virtio-iommu: add iommu replay > virtio-iommu: add iommu notifier memory-region > > hw/virtio/trace-events | 5 ++ > hw/virtio/virtio-iommu.c | 181 > ++- > include/hw/virtio/virtio-iommu.h | 6 ++ > target/arm/kvm.c | 27 ++ > target/arm/trace-events | 3 + > 5 files changed, 219 insertions(+), 3 deletions(-) > > -- > 1.9.3
[Qemu-devel] [PATCH v4 4/5] virtio-iommu: add iommu replay
Default replay does not work with virtio-iommu, so this patch provides virtio-iommu replay functionality. Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com> --- v3->v4: - Replay functionality moved in separate patch - No other changes hw/virtio/trace-events | 1 + hw/virtio/virtio-iommu.c | 38 ++ 2 files changed, 39 insertions(+) diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index 251b595..840d54f 100644 --- a/hw/virtio/trace-events +++ b/hw/virtio/trace-events @@ -51,3 +51,4 @@ virtio_iommu_fill_none_property(uint32_t devid) "devid=%d" virtio_iommu_set_page_size_mask(const char *iommu_mr, uint64_t mask) "mr=%s page_size_mask=0x%"PRIx64 virtio_iommu_notify_map(const char *name, hwaddr iova, hwaddr paddr, hwaddr map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" virtio_iommu_notify_unmap(const char *name, hwaddr iova, hwaddr map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64"" +virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index ff91bce..d4d34cf 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -906,6 +906,43 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data) return (ua > ub) - (ua < ub); } +static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data) +{ +viommu_mapping *mapping = (viommu_mapping *) value; +IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data; + +trace_virtio_iommu_remap(mapping->virt_addr, mapping->phys_addr, + mapping->size); +/* unmap previous entry and map again */ +virtio_iommu_notify_unmap(mr, mapping->virt_addr, mapping->size); + +virtio_iommu_notify_map(mr, mapping->virt_addr, mapping->phys_addr, +mapping->size); +return false; +} + +static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n) +{ +IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr); +VirtIOIOMMU *s = sdev->viommu; +uint32_t sid;
+viommu_dev *dev; + +sid = virtio_iommu_get_sid(sdev); + +qemu_mutex_lock(&s->mutex); + +dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(sid)); +if (!dev || !dev->as) { +goto unlock; +} + +g_tree_foreach(dev->as->mappings, virtio_iommu_remap, mr); + +unlock: +qemu_mutex_unlock(&s->mutex); +} + static void virtio_iommu_device_realize(DeviceState *dev, Error **errp) { VirtIODevice *vdev = VIRTIO_DEVICE(dev); @@ -1003,6 +1040,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass, imrc->translate = virtio_iommu_translate; imrc->set_page_size_mask = virtio_iommu_set_page_size_mask; +imrc->replay = virtio_iommu_replay; } static const TypeInfo virtio_iommu_info = { -- 1.9.3
[Qemu-devel] [PATCH v4 5/5] virtio-iommu: add iommu notifier memory-region
Finally add notify_flag_changed() as the memory-region IOMMU notifier flag change callback. Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com> --- v3->v4: - notify_flag_changed functionality moved in separate patch - No other changes hw/virtio/trace-events | 2 ++ hw/virtio/virtio-iommu.c | 31 +++ 2 files changed, 33 insertions(+) diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index 840d54f..a9de0d4 100644 --- a/hw/virtio/trace-events +++ b/hw/virtio/trace-events @@ -52,3 +52,5 @@ virtio_iommu_set_page_size_mask(const char *iommu_mr, uint64_t mask) "mr=%s page virtio_iommu_notify_map(const char *name, hwaddr iova, hwaddr paddr, hwaddr map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" virtio_iommu_notify_unmap(const char *name, hwaddr iova, hwaddr map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64"" virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" +virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s" +virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s" diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index d4d34cf..a9b0d72 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -764,6 +764,36 @@ push: } } +static void virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr, + IOMMUNotifierFlag old, + IOMMUNotifierFlag new) +{ +IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr); +VirtIOIOMMU *s = sdev->viommu; +VirtioIOMMUNotifierNode *node = NULL; +VirtioIOMMUNotifierNode *next_node = NULL; + +if (old == IOMMU_NOTIFIER_NONE) { +trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name); +node = g_malloc0(sizeof(*node)); +node->iommu_dev = sdev; +QLIST_INSERT_HEAD(&s->notifiers_list, node, next); +return; +} + +/* update notifier node with new flags */ +QLIST_FOREACH_SAFE(node, &s->notifiers_list,
next, next_node) { +if (node->iommu_dev == sdev) { +if (new == IOMMU_NOTIFIER_NONE) { +trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name); +QLIST_REMOVE(node, next); +g_free(node); +} +return; +} +} +} + static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr, IOMMUAccessFlags flag) { @@ -1041,6 +1071,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass, imrc->translate = virtio_iommu_translate; imrc->set_page_size_mask = virtio_iommu_set_page_size_mask; imrc->replay = virtio_iommu_replay; +imrc->notify_flag_changed = virtio_iommu_notify_flag_changed; } static const TypeInfo virtio_iommu_info = { -- 1.9.3
[Qemu-devel] [PATCH v4 1/5] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
Translate the MSI address if the device is behind a virtio-iommu. This logic is similar to the vSMMUv3/Intel IOMMU emulation. This RFC patch does not handle the case where both vsmmuv3 and virtio-iommu are available. Signed-off-by: Eric Auger <eric.au...@redhat.com> Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com> --- v3->v4 - No changes target/arm/kvm.c| 27 +++ target/arm/trace-events | 3 +++ 2 files changed, 30 insertions(+) diff --git a/target/arm/kvm.c b/target/arm/kvm.c index 211a7bf..895a630 100644 --- a/target/arm/kvm.c +++ b/target/arm/kvm.c @@ -21,7 +21,11 @@ #include "kvm_arm.h" #include "cpu.h" #include "internals.h" +#include "trace.h" #include "hw/arm/arm.h" +#include "hw/pci/pci.h" +#include "hw/pci/msi.h" +#include "hw/virtio/virtio-iommu.h" #include "exec/memattrs.h" #include "exec/address-spaces.h" #include "hw/boards.h" @@ -666,6 +670,29 @@ int kvm_arm_vgic_probe(void) int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route, uint64_t address, uint32_t data, PCIDevice *dev) { +AddressSpace *as = pci_device_iommu_address_space(dev); +IOMMUTLBEntry entry; +IOMMUDevice *sdev; +IOMMUMemoryRegionClass *imrc; + +if (as == &address_space_memory) { +return 0; +} + +/* MSI doorbell address is translated by an IOMMU */ +sdev = container_of(as, IOMMUDevice, as); + +imrc = memory_region_get_iommu_class_nocheck(&sdev->iommu_mr); + +entry = imrc->translate(&sdev->iommu_mr, address, IOMMU_WO); + +route->u.msi.address_lo = entry.translated_addr; +route->u.msi.address_hi = entry.translated_addr >> 32; + +trace_kvm_arm_fixup_msi_route(address, sdev->devfn, + sdev->iommu_mr.parent_obj.name, + entry.translated_addr); + return 0; } diff --git a/target/arm/trace-events b/target/arm/trace-events index 9e37131..8b3c220 100644 --- a/target/arm/trace-events +++ b/target/arm/trace-events @@ -8,3 +8,6 @@ arm_gt_tval_write(int timer, uint64_t value) "gt_tval_write: timer %d value 0x%" arm_gt_ctl_write(int timer, uint64_t value) "gt_ctl_write: timer %d value 0x%" PRIx64 arm_gt_imask_toggle(int
timer, int irqstate) "gt_ctl_write: timer %d IMASK toggle, new irqstate %d" arm_gt_cntvoff_write(uint64_t value) "gt_cntvoff_write: value 0x%" PRIx64 + +# target/arm/kvm.c +kvm_arm_fixup_msi_route(uint64_t iova, uint32_t devid, const char *name, uint64_t gpa) "MSI addr = 0x%"PRIx64" is translated for devfn=%d through %s into 0x%"PRIx64 -- 1.9.3
[Qemu-devel] [PATCH v4 3/5] virtio-iommu: Call iommu notifier for attach/detach
IOMMU notifiers are called when a device is attached to or detached from an address space. This is needed for VFIO. Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com> Signed-off-by: Eric Auger <eric.au...@redhat.com> --- v3->v4: Following fixes by Eric - Return "false" from virtio_iommu_mapping_unmap/map() - Calling virtio_iommu_notify_unmap/map() for all devices in the AS hw/virtio/virtio-iommu.c | 43 +++ 1 file changed, 43 insertions(+) diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index 085e972..ff91bce 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -127,8 +127,42 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova, memory_region_notify_iommu(mr, entry); } +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value, + gpointer data) +{ +viommu_mapping *mapping = (viommu_mapping *) value; +IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data; + +virtio_iommu_notify_unmap(mr, mapping->virt_addr, mapping->size); + +return false; +} + +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value, + gpointer data) +{ +viommu_mapping *mapping = (viommu_mapping *) value; +IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data; + +virtio_iommu_notify_map(mr, mapping->virt_addr, mapping->phys_addr, +mapping->size); + +return false; +} + static void virtio_iommu_detach_dev_from_as(viommu_dev *dev) { +VirtioIOMMUNotifierNode *node; +VirtIOIOMMU *s = dev->viommu; +viommu_as *as = dev->as; + +QLIST_FOREACH(node, &s->notifiers_list, next) { +if (dev->id == node->iommu_dev->devfn) { +g_tree_foreach(as->mappings, virtio_iommu_mapping_unmap, + &node->iommu_dev->iommu_mr); +} +} + QLIST_REMOVE(dev, next); dev->as = NULL; } @@ -260,6 +294,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s, uint32_t asid = le32_to_cpu(req->address_space); uint32_t devid = le32_to_cpu(req->device); uint32_t reserved = le32_to_cpu(req->reserved); +VirtioIOMMUNotifierNode *node; viommu_as *as; viommu_dev *dev; @@ -284,6 +319,14 @@
static int virtio_iommu_attach(VirtIOIOMMU *s, dev->as = as; g_tree_ref(as->mappings); +/* replay existing address space mappings on the associated mr */ +QLIST_FOREACH(node, &s->notifiers_list, next) { +if (devid == node->iommu_dev->devfn) { +g_tree_foreach(as->mappings, virtio_iommu_mapping_map, + &node->iommu_dev->iommu_mr); +} +} + return VIRTIO_IOMMU_S_OK; } -- 1.9.3
[Qemu-devel] [PATCH v4 0/5] virtio-iommu: VFIO integration
This patch series integrates VFIO/VHOST with virtio-iommu. This version is mainly about rebasing on the v4 version of the virtio-iommu device framework from Eric Auger and addressing review comments. This patch series allows PCI pass-through using virtio-iommu. This series is based on: - virtio-iommu kernel driver by Jean-Philippe Brucker [1] [RFC] virtio-iommu version 0.4 git://linux-arm.org/virtio-iommu.git branch viommu/v0.4 - virtio-iommu device emulation by Eric Auger. [RFC v4 00/16] VIRTIO-IOMMU device https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4 Changes are available at : https://github.com/bharaty/qemu.git virtio-iommu-vfio-integration-v4 v3->v4: - Rebase to v4 version from Eric - Fixes from Eric with DPDK in VM - Logical division in multiple patches v2->v3: - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device" Which is based on top of v2.10-rc0 that - Fixed issue with two PCI devices - Addressed review comments v1->v2: - Added trace events - removed vSMMU3 link in patch description Bharat Bhushan (5): target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route virtio-iommu: Add iommu notifier for map/unmap virtio-iommu: Call iommu notifier for attach/detach virtio-iommu: add iommu replay virtio-iommu: add iommu notifier memory-region hw/virtio/trace-events | 5 ++ hw/virtio/virtio-iommu.c | 181 ++- include/hw/virtio/virtio-iommu.h | 6 ++ target/arm/kvm.c | 27 ++ target/arm/trace-events | 3 + 5 files changed, 219 insertions(+), 3 deletions(-) -- 1.9.3
[Qemu-devel] [PATCH v4 2/5] virtio-iommu: Add iommu notifier for map/unmap
This patch extends the VIRTIO_IOMMU_T_MAP/UNMAP request handling to notify registered IOMMU notifiers. This is needed for VFIO support. Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com> Signed-off-by: Eric Auger <eric.au...@redhat.com> --- v3->v4: Following fixes by Eric - Calling virtio_iommu_notify_map() for all devices in the AS - virtio_iommu_notify_unmap() moved to a function; this is needed per changes in the base framework (v4) hw/virtio/trace-events | 2 ++ hw/virtio/virtio-iommu.c | 69 ++-- include/hw/virtio/virtio-iommu.h | 6 ++ 3 files changed, 74 insertions(+), 3 deletions(-) diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index 2793604..251b595 100644 --- a/hw/virtio/trace-events +++ b/hw/virtio/trace-events @@ -49,3 +49,5 @@ virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t addr, uint64_t size, uint32_t flags, size_t filled) "dev= %d, subtype=%d addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d filled=0x%lx" virtio_iommu_fill_none_property(uint32_t devid) "devid=%d" virtio_iommu_set_page_size_mask(const char *iommu_mr, uint64_t mask) "mr=%s page_size_mask=0x%"PRIx64 +virtio_iommu_notify_map(const char *name, hwaddr iova, hwaddr paddr, hwaddr map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" +virtio_iommu_notify_unmap(const char *name, hwaddr iova, hwaddr map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64"" diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index 1873b9a..085e972 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -95,6 +95,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data) } } +static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova, +hwaddr paddr, hwaddr size) +{ +IOMMUTLBEntry entry; + +entry.target_as = &address_space_memory; +entry.addr_mask = size - 1; + +entry.iova = iova; +trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
+entry.perm = IOMMU_RW; +entry.translated_addr = paddr; + +memory_region_notify_iommu(mr, entry); +} + +static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova, + hwaddr size) +{ +IOMMUTLBEntry entry; + +entry.target_as = _space_memory; +entry.addr_mask = size - 1; + +entry.iova = iova; +trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size); +entry.perm = IOMMU_NONE; +entry.translated_addr = 0; + +memory_region_notify_iommu(mr, entry); +} + static void virtio_iommu_detach_dev_from_as(viommu_dev *dev) { QLIST_REMOVE(dev, next); @@ -291,6 +323,8 @@ static int virtio_iommu_map(VirtIOIOMMU *s, viommu_as *as; viommu_interval *interval; viommu_mapping *mapping; +VirtioIOMMUNotifierNode *node; +viommu_dev *dev; interval = g_malloc0(sizeof(*interval)); @@ -318,9 +352,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s, g_tree_insert(as->mappings, interval, mapping); +/* All devices in an address-space share mapping */ +QLIST_FOREACH(node, >notifiers_list, next) { +QLIST_FOREACH(dev, >device_list, next) { +if (dev->id == node->iommu_dev->devfn) { +virtio_iommu_notify_map(>iommu_dev->iommu_mr, +virt_addr, phys_addr, size); +} +} +} + return VIRTIO_IOMMU_S_OK; } +static void virtio_iommu_remove_mapping(VirtIOIOMMU *s, viommu_as *as, +viommu_interval *interval) +{ +VirtioIOMMUNotifierNode *node; +viommu_dev *dev; + +g_tree_remove(as->mappings, (gpointer)(interval)); +QLIST_FOREACH(node, >notifiers_list, next) { +QLIST_FOREACH(dev, >device_list, next) { +if (dev->id == node->iommu_dev->devfn) { +virtio_iommu_notify_unmap(>iommu_dev->iommu_mr, + interval->low, + interval->high - interval->low + 1); +} +} +} +} + static int virtio_iommu_unmap(VirtIOIOMMU *s, struct virtio_iommu_req_unmap *req) { @@ -352,18 +414,18 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s, current.high = high; if (low == interval.low && size >= mapping->size) { -g_tree_remove(as->mappings, (gpointer)()); +virtio_iommu_remove_mapping(s, as, ); interval.low = high + 1; 
trace_virtio_iommu_unmap_left_interval(current.low, current.high, interval.
Re: [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu
> -Original Message- > From: Auger Eric [mailto:eric.au...@redhat.com] > Sent: Monday, September 18, 2017 1:18 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com>; > eric.auger@gmail.com; peter.mayd...@linaro.org; > alex.william...@redhat.com; m...@redhat.com; qemu-...@nongnu.org; > qemu-devel@nongnu.org > Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > robin.mur...@arm.com; christoffer.d...@linaro.org > Subject: Re: [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with > virtio-iommu > > Hi Bharat, > > On 21/08/2017 12:48, Bharat Bhushan wrote: > > This RFC patch allows virtio-iommu protection for PCI > > device-passthrough. > > > > MSI region is mapped by current version of virtio-iommu driver. > > This uses VFIO extension of map/unmap notification when an area of > > memory is mappedi/unmapped in emulated iommu device. > > > > This series is tested with 2 PCI devices to virtual machine using > > dma-ops and DPDK in VM is not yet tested. > > > > Also with this series we observe below prints for MSI region mapping > > > >"qemu-system-aarch64: iommu map to non memory area 0" > > > >This print comes when vfio/map-notifier is called for MSI region. > > > > vfio map/unmap notification is called for given device > >This assumes that devid passed in virtio_iommu_attach is same as devfn > >This assumption is based on 1:1 mapping of requested-id with device-id > >in QEMU. 
> > > > Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com> > > --- > > v2->v3: > > - Addressed review comments: > > - virtio-iommu_map_region function is split in two functions > > virtio_iommu_notify_map/virtio_iommu_notify_unmap > > - use size received from driver and do not split in 4K pages > > > > - map/unmap notification is called for given device/as > >This relies on devid passed in virtio_iommu_attach is same as devfn > >This is assumed as iommu-map maps 1:1 requested-id to device-id in > QEMU > >Looking for comment about this assumtion. > > > > - Keeping track devices in address-space > > > > - Verified with 2 PCI endpoints > > - some code cleanup > > > > hw/virtio/trace-events | 5 ++ > > hw/virtio/virtio-iommu.c | 163 > +++ > > include/hw/virtio/virtio-iommu.h | 6 ++ > > 3 files changed, 174 insertions(+) > > > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index > > 8db3d91..7e9663f 100644 > > --- a/hw/virtio/trace-events > > +++ b/hw/virtio/trace-events > > @@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low, > > uint64_t high, uint64_t next_low, > virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t > next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], > new interval=[0x%"PRIx64",0x%"PRIx64"]" > > virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap > inc [0x%"PRIx64",0x%"PRIx64"]" > > virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, > uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d" > > +virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu > notifier node for memory region %s" > > +virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu > notifier node for memory region %s" > > +virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) > "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > > +virtio_iommu_notify_map(hwaddr iova, hwaddr paddr, hwaddr > map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > > 
+virtio_iommu_notify_unmap(hwaddr iova, hwaddr paddr, hwaddr > map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > useless paddr > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index > > 9217587..9eae050 100644 > > --- a/hw/virtio/virtio-iommu.c > > +++ b/hw/virtio/virtio-iommu.c > > @@ -55,11 +55,13 @@ typedef struct viommu_interval { typedef struct > > viommu_dev { > > uint32_t id; > > viommu_as *as; > > +QLIST_ENTRY(viommu_dev) next; > > } viommu_dev; > > > > struct viommu_as { > > uint32_t id; > > GTree *mappings; > > +QLIS
Re: [Qemu-devel] [RFC v4 06/16] virtio-iommu: Register attached devices
> -Original Message- > From: Eric Auger [mailto:eric.au...@redhat.com] > Sent: Tuesday, September 19, 2017 1:17 PM > To: eric.auger@gmail.com; eric.au...@redhat.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org; jean- > philippe.bruc...@arm.com > Cc: will.dea...@arm.com; kevin.t...@intel.com; marc.zyng...@arm.com; > christoffer.d...@linaro.org; drjo...@redhat.com; w...@redhat.com; > t...@semihalf.com; Bharat Bhushan <bharat.bhus...@nxp.com>; > pet...@redhat.com; linuc.dec...@gmail.com > Subject: [RFC v4 06/16] virtio-iommu: Register attached devices > > This patch introduce address space and device internal > datatypes. Both are stored in RB trees. The address space > owns a list of devices attached to it. > > It is assumed the devid corresponds to the PCI BDF. > > Signed-off-by: Eric Auger <eric.au...@redhat.com> > > --- > v3 -> v4: > - new separate patch > --- > hw/virtio/trace-events | 4 ++ > hw/virtio/virtio-iommu.c | 120 > +++ > 2 files changed, 124 insertions(+) > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events > index bc65356..74b92d3 100644 > --- a/hw/virtio/trace-events > +++ b/hw/virtio/trace-events > @@ -38,3 +38,7 @@ virtio_iommu_map(uint32_t as, uint64_t phys_addr, > uint64_t virt_addr, uint64_t s > virtio_iommu_unmap(uint32_t as, uint64_t virt_addr, uint64_t size) "as= %d > virt_addr=0x%"PRIx64" size=0x%"PRIx64 > virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int > flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d" > virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s" > +virtio_iommu_get_dev(uint32_t devid) "Alloc devid=%d" > +virtio_iommu_put_dev(uint32_t devid) "Free devid=%d" > +virtio_iommu_get_as(uint32_t asid) "Alloc asid=%d" > +virtio_iommu_put_as(uint32_t asid) "Free asid=%d" > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > index f4cb76f..41a4bbc 100644 > --- a/hw/virtio/virtio-iommu.c > +++ b/hw/virtio/virtio-iommu.c 
> @@ -32,15 +32,116 @@ > #include "hw/virtio/virtio-bus.h" > #include "hw/virtio/virtio-access.h" > #include "hw/virtio/virtio-iommu.h" > +#include "hw/pci/pci_bus.h" > +#include "hw/pci/pci.h" > > /* Max size */ > #define VIOMMU_DEFAULT_QUEUE_SIZE 256 > > +typedef struct viommu_as { > +uint32_t id; > +GTree *mappings; > +QLIST_HEAD(, viommu_dev) device_list; > +} viommu_as; > + > +typedef struct viommu_dev { > +uint32_t id; > +viommu_as *as; > +QLIST_ENTRY(viommu_dev) next; > +VirtIOIOMMU *viommu; > +} viommu_dev; > + > +typedef struct viommu_interval { > +uint64_t low; > +uint64_t high; > +} viommu_interval; > + > static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev) > { > return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn); > } > > +static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer > user_data) > +{ > +viommu_interval *inta = (viommu_interval *)a; > +viommu_interval *intb = (viommu_interval *)b; > + > +if (inta->high <= intb->low) { > +return -1; > +} else if (intb->high <= inta->low) { > +return 1; > +} else { > +return 0; > +} > +} > + > +static void virtio_iommu_detach_dev_from_as(viommu_dev *dev) > +{ > +QLIST_REMOVE(dev, next); > +dev->as = NULL; > +} > + > +static viommu_dev *virtio_iommu_get_dev(VirtIOIOMMU *s, uint32_t > devid) > +{ > +viommu_dev *dev; > + > +dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid)); > +if (dev) { > +return dev; > +} > +dev = g_malloc0(sizeof(*dev)); > +dev->id = devid; > +dev->viommu = s; > +trace_virtio_iommu_get_dev(devid); > +g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev); > +return dev; > +} > + > +static void virtio_iommu_put_dev(gpointer data) > +{ > +viommu_dev *dev = (viommu_dev *)data; > + > +if (dev->as) { > +virtio_iommu_detach_dev_from_as(dev); > +g_tree_unref(dev->as->mappings); > +} > + > +trace_virtio_iommu_put_dev(dev->id); > +g_free(dev); > +} > + > +viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, uint32_t asid); > +viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, 
uint32_t asid) > +{ > +viommu_as *as; > + > +as = g_tree_lookup(s->
Re: [Qemu-devel] [RFC v4 09/16] virtio-iommu: Implement translate
> -Original Message- > From: Eric Auger [mailto:eric.au...@redhat.com] > Sent: Tuesday, September 19, 2017 1:17 PM > To: eric.auger@gmail.com; eric.au...@redhat.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org; jean- > philippe.bruc...@arm.com > Cc: will.dea...@arm.com; kevin.t...@intel.com; marc.zyng...@arm.com; > christoffer.d...@linaro.org; drjo...@redhat.com; w...@redhat.com; > t...@semihalf.com; Bharat Bhushan <bharat.bhus...@nxp.com>; > pet...@redhat.com; linuc.dec...@gmail.com > Subject: [RFC v4 09/16] virtio-iommu: Implement translate > > This patch implements the translate callback > > Signed-off-by: Eric Auger <eric.au...@redhat.com> > --- > hw/virtio/trace-events | 1 + > hw/virtio/virtio-iommu.c | 39 > +-- > 2 files changed, 38 insertions(+), 2 deletions(-) > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index > da298c1..9010fbd 100644 > --- a/hw/virtio/trace-events > +++ b/hw/virtio/trace-events > @@ -45,3 +45,4 @@ virtio_iommu_put_as(uint32_t asid) "Free asid=%d" > virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t > next_low, uint64_t next_high) "Unmap left [0x%"PRIx64",0x%"PRIx64"], > new interval=[0x%"PRIx64",0x%"PRIx64"]" > virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t > next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], > new interval=[0x%"PRIx64",0x%"PRIx64"]" > virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc > [0x%"PRIx64",0x%"PRIx64"]" > +virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, > uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d" > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index > 6f1a7d1..db46a91 100644 > --- a/hw/virtio/virtio-iommu.c > +++ b/hw/virtio/virtio-iommu.c > @@ -496,19 +496,54 @@ static IOMMUTLBEntry > virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr, > IOMMUAccessFlags flag) { > IOMMUDevice 
*sdev = container_of(mr, IOMMUDevice, iommu_mr);
> +VirtIOIOMMU *s = sdev->viommu;
> uint32_t sid;
> +viommu_dev *dev;
> +viommu_mapping *mapping;
> +viommu_interval interval;
> +
> +interval.low = addr;
> +interval.high = addr + 1;
>
> IOMMUTLBEntry entry = {
> .target_as = &address_space_memory,
> .iova = addr,
> .translated_addr = addr,
> -.addr_mask = ~(hwaddr)0,
> -.perm = IOMMU_NONE,
> +.addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
> +.perm = flag,
> };
>
> sid = virtio_iommu_get_sid(sdev);
>
> trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
> +qemu_mutex_lock(&s->mutex);
> +
> +dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(sid));
> +if (!dev) {
> +/* device cannot be attached to another as */
> +printf("%s sid=%d is not known!!\n", __func__, sid);
> +goto unlock;
> +}
> +
> +mapping = g_tree_lookup(dev->as->mappings, (gpointer)(&interval));

Should check for !dev->as before accessing this.

Thanks
-Bharat

> +if (!mapping) {
> +printf("%s no mapping for 0x%"PRIx64" for sid=%d\n", __func__,
> + addr, sid);
> +goto unlock;
> +}
> +
> +if (((flag & IOMMU_RO) && !(mapping->flags & VIRTIO_IOMMU_MAP_F_READ)) ||
> +((flag & IOMMU_WO) && !(mapping->flags & VIRTIO_IOMMU_MAP_F_WRITE))) {
> +error_report("Permission error on 0x%"PRIx64"(%d): allowed=%d",
> + addr, flag, mapping->flags);
> +entry.perm = IOMMU_NONE;
> +goto unlock;
> +}
> +entry.translated_addr = addr - mapping->virt_addr + mapping->phys_addr,
> +trace_virtio_iommu_translate_out(addr, entry.translated_addr, sid);
> +
> +unlock:
> +qemu_mutex_unlock(&s->mutex);
> return entry;
> }
>
> --
> 2.5.5
Re: [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration
Hi Eric, > -Original Message- > From: Auger Eric [mailto:eric.au...@redhat.com] > Sent: Wednesday, August 23, 2017 10:12 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com>; > eric.auger@gmail.com; peter.mayd...@linaro.org; > alex.william...@redhat.com; m...@redhat.com; qemu-...@nongnu.org; > qemu-devel@nongnu.org > Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > robin.mur...@arm.com; christoffer.d...@linaro.org > Subject: Re: [PATCH v3 0/2] virtio-iommu: VFIO integration > > Hi Bharat, > > On 21/08/2017 12:48, Bharat Bhushan wrote: > > This V3 version is mainly about rebasing on v3 version on Virtio-iommu > > device framework from Eric Augur and addresing review comments. > s/Augur/Auger ;-) I am sorry, > > > > This patch series allows PCI pass-through using virtio-iommu. > > > > This series is based on: > > - virtio-iommu specification written by Jean-Philippe Brucker > >[RFC 0/3] virtio-iommu: a paravirtualized IOMMU, > > > > - virtio-iommu driver by Jean-Philippe Brucker > >[RFC PATCH linux] iommu: Add virtio-iommu driver > > > > - virtio-iommu device emulation by Eric Augur. > >[RFC v3 0/8] VIRTIO-IOMMU device > > > > PCI device pass-through and virtio-net-pci is tested with these > > changes using dma-ops > > I confirm it works fine now with 2 assigned VFs. > > However at the moment DPDK testpmd using those 2 VFs does not work for > me: > 1: > [/home/augere/UPSTREAM/dpdk/install/bin/testpmd(rte_dump_stack+0x2 > 4) > [0x4a8a78]] > > I haven't investigated yet... I have not run DPDK before, I am compiling right now and run. 
Thanks -Bharat > > Thanks > > Eric > > > > This patch series does not implement RESV_MEM changes proposal by > Jean-Philippe "https://lists.gnu.org/archive/html/qemu-devel/2017- > 07/msg01796.html" > > > > v2->v3: > > - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device" > >Which is based on top of v2.10-rc0 that > > - Fixed issue with two PCI devices > > - Addressed review comments > > > > v1->v2: > > - Added trace events > > - removed vSMMU3 link in patch description > > > > Bharat Bhushan (2): > > target/arm/kvm: Translate the MSI doorbell in > kvm_arch_fixup_msi_route > > virtio-iommu: vfio integration with virtio-iommu > > > > hw/virtio/trace-events | 5 ++ > > hw/virtio/virtio-iommu.c | 163 > +++ > > include/hw/virtio/virtio-iommu.h | 6 ++ > > target/arm/kvm.c | 27 +++ > > target/arm/trace-events | 3 + > > 5 files changed, 204 insertions(+) > >
Re: [Qemu-devel] [RFC v2 PATCH 2/2] virtio-iommu: vfio integration with virtio-iommu
Hi Eric, > -Original Message- > From: Auger Eric [mailto:eric.au...@redhat.com] > Sent: Thursday, August 17, 2017 9:03 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com>; > eric.auger@gmail.com; peter.mayd...@linaro.org; > alex.william...@redhat.com; m...@redhat.com; qemu-...@nongnu.org; > qemu-devel@nongnu.org > Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > robin.mur...@arm.com; christoffer.d...@linaro.org > Subject: Re: [Qemu-devel] [RFC v2 PATCH 2/2] virtio-iommu: vfio integration > with virtio-iommu > > Hi Bharat, > > On 14/07/2017 09:25, Bharat Bhushan wrote: > > This patch allows virtio-iommu protection for PCI device-passthrough. > > > > MSI region is mapped by current version of virtio-iommu driver. > > This MSI region mapping in not getting pushed on hw iommu > > vfio_get_vaddr() allows only ram-region. > Why is it an issue. As far as I understand this is not needed actually as the > guest MSI doorbell is not used by the host. > This RFC patch needed > > to be improved. 
> > > > Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com> > > --- > > v1-v2: > > - Added trace events > > > > hw/virtio/trace-events | 5 ++ > > hw/virtio/virtio-iommu.c | 133 > +++ > > include/hw/virtio/virtio-iommu.h | 6 ++ > > 3 files changed, 144 insertions(+) > > > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index > > 9196b63..3a3968b 100644 > > --- a/hw/virtio/trace-events > > +++ b/hw/virtio/trace-events > > @@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low, > > uint64_t high, uint64_t next_low, > virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t > next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], > new interval=[0x%"PRIx64",0x%"PRIx64"]" > > virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap > inc [0x%"PRIx64",0x%"PRIx64"]" > > virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, > uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d" > > +virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu > notifier node for memory region %s" > > +virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu > notifier node for memory region %s" > > +virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) > "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > > +virtio_iommu_map_region(hwaddr iova, hwaddr paddr, hwaddr > map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > > +virtio_iommu_unmap_region(hwaddr iova, hwaddr paddr, hwaddr > map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index > > cd188fc..61f33cb 100644 > > --- a/hw/virtio/virtio-iommu.c > > +++ b/hw/virtio/virtio-iommu.c > > @@ -129,6 +129,48 @@ static gint interval_cmp(gconstpointer a, > gconstpointer b, gpointer user_data) > > } > > } > > > > +static void virtio_iommu_map_region(VirtIOIOMMU *s, hwaddr iova, > hwaddr paddr, > > +hwaddr size, int map) > bool map? 
> > the function name is a bit misleading to me and does not really explain what > the function does. It "notifies" so why not using something like > virtio_iommu_map_notify and virtio_iommu_unmap_notify. I tend to think > having separate proto is cleaner and more standard. > > Binding should happen on a specific IOMMUmemoryRegion (see next > comment). > > > +{ > > +VirtioIOMMUNotifierNode *node; > > +IOMMUTLBEntry entry; > > +uint64_t map_size = (1 << 12); > TODO: handle something else than 4K page. > > +int npages; > > +int i; > > + > > +npages = size / map_size; > > +entry.target_as = _space_memory; > > +entry.addr_mask = map_size - 1; > > + > > +for (i = 0; i < npages; i++) { > Although I understand we currently fail checking the consistency between > pIOMMU and vIOMMU page sizes, this will be very slow for guest DPDK use > case where hugepages are used. > > Why not directly using the full size? vfio_iommu_map_notify will report > errors if vfio_dma_map/unmap() fail. > > +entry.iova = iova + (i *
[Qemu-devel] [PATCH v3 1/2] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
Translate the MSI address if the device is behind virtio-iommu. This logic is similar to the vSMMUv3/Intel iommu emulation. This RFC patch does not handle the case where both vsmmuv3 and virtio-iommu are available.

Signed-off-by: Eric Auger <eric.au...@redhat.com>
Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com>
---
v2->v3:
 - Rebased on top of 2.10-rc0 and especially
   [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
v1->v2:
 - Added trace events
 - removed vSMMU3 link in patch description

 target/arm/kvm.c        | 27 +++
 target/arm/trace-events |  3 +++
 2 files changed, 30 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7c17f0d..0219c9d 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -21,7 +21,11 @@
 #include "kvm_arm.h"
 #include "cpu.h"
 #include "internals.h"
+#include "trace.h"
 #include "hw/arm/arm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "exec/memattrs.h"
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
@@ -662,6 +666,29 @@ int kvm_arm_vgic_probe(void)
 int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
+    AddressSpace *as = pci_device_iommu_address_space(dev);
+    IOMMUTLBEntry entry;
+    IOMMUDevice *sdev;
+    IOMMUMemoryRegionClass *imrc;
+
+    if (as == &address_space_memory) {
+        return 0;
+    }
+
+    /* MSI doorbell address is translated by an IOMMU */
+    sdev = container_of(as, IOMMUDevice, as);
+
+    imrc = memory_region_get_iommu_class_nocheck(&sdev->iommu_mr);
+
+    entry = imrc->translate(&sdev->iommu_mr, address, IOMMU_WO);
+
+    route->u.msi.address_lo = entry.translated_addr;
+    route->u.msi.address_hi = entry.translated_addr >> 32;
+
+    trace_kvm_arm_fixup_msi_route(address, sdev->devfn,
+                                  sdev->iommu_mr.parent_obj.name,
+                                  entry.translated_addr);
+
     return 0;
 }

diff --git a/target/arm/trace-events b/target/arm/trace-events
index e21c84f..eff2822 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -8,3 +8,6 @@ arm_gt_tval_write(int timer, uint64_t value) "gt_tval_write: timer %d value %" P
 arm_gt_ctl_write(int timer, uint64_t value) "gt_ctl_write: timer %d value %" PRIx64
 arm_gt_imask_toggle(int timer, int irqstate) "gt_ctl_write: timer %d IMASK toggle, new irqstate %d"
 arm_gt_cntvoff_write(uint64_t value) "gt_cntvoff_write: value %" PRIx64
+
+# target/arm/kvm.c
+kvm_arm_fixup_msi_route(uint64_t iova, uint32_t devid, const char *name, uint64_t gpa) "MSI addr = 0x%"PRIx64" is translated for devfn=%d through %s into 0x%"PRIx64
--
1.9.3
[Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration
This V3 version is mainly about rebasing on the v3 version of the virtio-iommu device framework from Eric Auger and addressing review comments.

This patch series allows PCI pass-through using virtio-iommu.

This series is based on:
- virtio-iommu specification written by Jean-Philippe Brucker
  [RFC 0/3] virtio-iommu: a paravirtualized IOMMU
- virtio-iommu driver by Jean-Philippe Brucker
  [RFC PATCH linux] iommu: Add virtio-iommu driver
- virtio-iommu device emulation by Eric Auger
  [RFC v3 0/8] VIRTIO-IOMMU device

PCI device pass-through and virtio-net-pci are tested with these changes using dma-ops.

This patch series does not implement the RESV_MEM changes proposed by Jean-Philippe: "https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01796.html"

v2->v3:
- This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device", which is based on top of v2.10-rc0
- Fixed issue with two PCI devices
- Addressed review comments

v1->v2:
- Added trace events
- removed vSMMU3 link in patch description

Bharat Bhushan (2):
  target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  virtio-iommu: vfio integration with virtio-iommu

 hw/virtio/trace-events           |   5 ++
 hw/virtio/virtio-iommu.c         | 163 +++
 include/hw/virtio/virtio-iommu.h |   6 ++
 target/arm/kvm.c                 |  27 +++
 target/arm/trace-events          |   3 +
 5 files changed, 204 insertions(+)

--
1.9.3
[Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu
This RFC patch allows virtio-iommu protection for PCI device-passthrough.

The MSI region is mapped by the current version of the virtio-iommu driver. This uses the VFIO extension of map/unmap notification when an area of memory is mapped/unmapped in the emulated iommu device.

This series is tested with 2 PCI devices assigned to a virtual machine using dma-ops; DPDK in the VM is not yet tested.

Also with this series we observe the below print for MSI region mapping:

   "qemu-system-aarch64: iommu map to non memory area 0"

This print comes when the vfio map-notifier is called for the MSI region.

The vfio map/unmap notification is called for the given device. This assumes that the devid passed in virtio_iommu_attach is the same as devfn. This assumption is based on the 1:1 mapping of requested-id with device-id in QEMU.

Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com>
---
v2->v3:
- Addressed review comments:
  - virtio_iommu_map_region function is split in two functions
    virtio_iommu_notify_map/virtio_iommu_notify_unmap
  - use size received from driver and do not split in 4K pages

- map/unmap notification is called for the given device/as
  This relies on the devid passed in virtio_iommu_attach being the same as devfn.
  This is assumed as iommu-map maps requested-id 1:1 to device-id in QEMU.
  Looking for comments about this assumption.
- Keeping track devices in address-space - Verified with 2 PCI endpoints - some code cleanup hw/virtio/trace-events | 5 ++ hw/virtio/virtio-iommu.c | 163 +++ include/hw/virtio/virtio-iommu.h | 6 ++ 3 files changed, 174 insertions(+) diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index 8db3d91..7e9663f 100644 --- a/hw/virtio/trace-events +++ b/hw/virtio/trace-events @@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low, virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]" virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]" virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d" +virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s" +virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s" +virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" +virtio_iommu_notify_map(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" +virtio_iommu_notify_unmap(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index 9217587..9eae050 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -55,11 +55,13 @@ typedef struct viommu_interval { typedef struct viommu_dev { uint32_t id; viommu_as *as; +QLIST_ENTRY(viommu_dev) next; } viommu_dev; struct viommu_as { uint32_t id; GTree *mappings; +QLIST_HEAD(, viommu_dev) device_list; }; static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev) @@ -133,12 +135,70 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer 
user_data)
     }
 }

+static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
+                                    hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_map(iova, paddr, size);
+    entry.perm = IOMMU_RW;
+    entry.translated_addr = paddr;
+
+    memory_region_notify_iommu(mr, entry);
+}
+
+static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
+                                      hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_unmap(iova, paddr, size);
+    entry.perm = IOMMU_NONE;
+    entry.translated_addr = 0;
+
+    memory_region_notify_iommu(mr, entry);
+}
+
+static gboolean virtio_iommu_maping_unmap(gpointer key, gpointer value,
+                                          gpointer data)
+{
+    viommu_mapping *mapping = (viommu_mapping *) value;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0, mapping->size);
+
+    return true;
+}
+
+static void virtio_iommu_de
Re: [Qemu-devel] [RFC v2 PATCH 2/2] virtio-iommu: vfio integration with virtio-iommu
> -Original Message- > From: Auger Eric [mailto:eric.au...@redhat.com] > Sent: Thursday, August 17, 2017 9:03 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com>; > eric.auger@gmail.com; peter.mayd...@linaro.org; > alex.william...@redhat.com; m...@redhat.com; qemu-...@nongnu.org; > qemu-devel@nongnu.org > Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > robin.mur...@arm.com; christoffer.d...@linaro.org > Subject: Re: [Qemu-devel] [RFC v2 PATCH 2/2] virtio-iommu: vfio integration > with virtio-iommu > > Hi Bharat, > > On 14/07/2017 09:25, Bharat Bhushan wrote: > > This patch allows virtio-iommu protection for PCI device-passthrough. > > > > MSI region is mapped by current version of virtio-iommu driver. > > This MSI region mapping in not getting pushed on hw iommu > > vfio_get_vaddr() allows only ram-region. > Why is it an issue. As far as I understand this is not needed actually as the > guest MSI doorbell is not used by the host. > This RFC patch needed > > to be improved. 
> > > > Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com> > > --- > > v1-v2: > > - Added trace events > > > > hw/virtio/trace-events | 5 ++ > > hw/virtio/virtio-iommu.c | 133 > +++ > > include/hw/virtio/virtio-iommu.h | 6 ++ > > 3 files changed, 144 insertions(+) > > > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index > > 9196b63..3a3968b 100644 > > --- a/hw/virtio/trace-events > > +++ b/hw/virtio/trace-events > > @@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low, > > uint64_t high, uint64_t next_low, > virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t > next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], > new interval=[0x%"PRIx64",0x%"PRIx64"]" > > virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap > inc [0x%"PRIx64",0x%"PRIx64"]" > > virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, > uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d" > > +virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu > notifier node for memory region %s" > > +virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu > notifier node for memory region %s" > > +virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) > "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > > +virtio_iommu_map_region(hwaddr iova, hwaddr paddr, hwaddr > map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > > +virtio_iommu_unmap_region(hwaddr iova, hwaddr paddr, hwaddr > map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64"" > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index > > cd188fc..61f33cb 100644 > > --- a/hw/virtio/virtio-iommu.c > > +++ b/hw/virtio/virtio-iommu.c > > @@ -129,6 +129,48 @@ static gint interval_cmp(gconstpointer a, > gconstpointer b, gpointer user_data) > > } > > } > > > > +static void virtio_iommu_map_region(VirtIOIOMMU *s, hwaddr iova, > hwaddr paddr, > > +hwaddr size, int map) > bool map? 
> > the function name is a bit misleading to me and does not really explain what > the function does. It "notifies" so why not using something like > virtio_iommu_map_notify and virtio_iommu_unmap_notify. I tend to think > having separate proto is cleaner and more standard. > > Binding should happen on a specific IOMMUmemoryRegion (see next > comment). > > > +{ > > +VirtioIOMMUNotifierNode *node; > > +IOMMUTLBEntry entry; > > +uint64_t map_size = (1 << 12); > TODO: handle something else than 4K page. > > +int npages; > > +int i; > > + > > +npages = size / map_size; > > +entry.target_as = _space_memory; > > +entry.addr_mask = map_size - 1; > > + > > +for (i = 0; i < npages; i++) { > Although I understand we currently fail checking the consistency between > pIOMMU and vIOMMU page sizes, this will be very slow for guest DPDK use > case where hugepages are used. > > Why not directly using the full size? vfio_iommu_map_notify will report > errors if vfio_dma_map/unmap() fail. Yes, just for understanding, VFIO/IOMMU will map at page
Re: [Qemu-devel] [RFC v2 6/8] virtio-iommu: Implement the translation and commands
Hi Eric, > -Original Message- > From: Auger Eric [mailto:eric.au...@redhat.com] > Sent: Monday, July 31, 2017 6:38 PM > To: Peter Xu <pet...@redhat.com>; Bharat Bhushan > <bharat.bhus...@nxp.com> > Cc: w...@redhat.com; peter.mayd...@linaro.org; kevin.t...@intel.com; > drjo...@redhat.com; m...@redhat.com; jean-philippe.bruc...@arm.com; > t...@semihalf.com; will.dea...@arm.com; qemu-devel@nongnu.org; > alex.william...@redhat.com; qemu-...@nongnu.org; > marc.zyng...@arm.com; robin.mur...@arm.com; > christoffer.d...@linaro.org; eric.auger@gmail.com > Subject: Re: [Qemu-devel] [RFC v2 6/8] virtio-iommu: Implement the > translation and commands > > Hi Peter, Bharat, > > On 17/07/2017 03:28, Peter Xu wrote: > > On Fri, Jul 14, 2017 at 06:40:34AM +, Bharat Bhushan wrote: > >> Hi Peter, > >> > >>> -Original Message- > >>> From: Peter Xu [mailto:pet...@redhat.com] > >>> Sent: Friday, July 14, 2017 7:48 AM > >>> To: Eric Auger <eric.au...@redhat.com> > >>> Cc: eric.auger@gmail.com; peter.mayd...@linaro.org; > >>> alex.william...@redhat.com; m...@redhat.com; qemu- > a...@nongnu.org; > >>> qemu-devel@nongnu.org; jean-philippe.bruc...@arm.com; > >>> w...@redhat.com; kevin.t...@intel.com; Bharat Bhushan > >>> <bharat.bhus...@nxp.com>; marc.zyng...@arm.com; > t...@semihalf.com; > >>> will.dea...@arm.com; drjo...@redhat.com; robin.mur...@arm.com; > >>> christoffer.d...@linaro.org > >>> Subject: Re: [Qemu-devel] [RFC v2 6/8] virtio-iommu: Implement the > >>> translation and commands > >>> > >>> On Wed, Jun 07, 2017 at 06:01:25PM +0200, Eric Auger wrote: > >>>> This patch adds the actual implementation for the translation > >>>> routine and the virtio-iommu commands. > >>>> > >>>> Signed-off-by: Eric Auger <eric.au...@redhat.com> > >>> > >>> [...] 
> >>> > >>>> static int virtio_iommu_attach(VirtIOIOMMU *s, > >>>> struct virtio_iommu_req_attach > >>>> *req) @@ -95,10 +135,34 @@ static int > virtio_iommu_attach(VirtIOIOMMU *s, > >>>> uint32_t asid = le32_to_cpu(req->address_space); > >>>> uint32_t devid = le32_to_cpu(req->device); > >>>> uint32_t reserved = le32_to_cpu(req->reserved); > >>>> +viommu_as *as; > >>>> +viommu_dev *dev; > >>>> > >>>> trace_virtio_iommu_attach(asid, devid, reserved); > >>>> > >>>> -return VIRTIO_IOMMU_S_UNSUPP; > >>>> +dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid)); > >>>> +if (dev) { > >>>> +return -1; > >>>> +} > >>>> + > >>>> +as = g_tree_lookup(s->address_spaces, > GUINT_TO_POINTER(asid)); > >>>> +if (!as) { > >>>> +as = g_malloc0(sizeof(*as)); > >>>> +as->id = asid; > >>>> +as->mappings = > g_tree_new_full((GCompareDataFunc)interval_cmp, > >>>> + NULL, NULL, > >>>> (GDestroyNotify)g_free); > >>>> +g_tree_insert(s->address_spaces, GUINT_TO_POINTER(asid), as); > >>>> +trace_virtio_iommu_new_asid(asid); > >>>> +} > >>>> + > >>>> +dev = g_malloc0(sizeof(*dev)); > >>>> +dev->as = as; > >>>> +dev->id = devid; > >>>> +as->nr_devices++; > >>>> +trace_virtio_iommu_new_devid(devid); > >>>> +g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev); > >>> > >>> Here do we need to record something like a refcount for address space? > >>> Since... > >> > >> We are using "nr_devices" as number of devices attached to an > >> address-space > >> > >>> > >>>> + > >>>> +return VIRTIO_IOMMU_S_OK; > >>>> } > >>>> > >>>> static int virtio_iommu_detach(VirtIOIOMMU *s, @@ -106,10 +170,13 > >>>> @@ static int virtio_iommu_detach(VirtIOIOMMU *s, { > >>>> uint32_t devid = le32_to_cpu(req->device); > >>>> uint32_t reserved = le32_to_cpu(req->reserved); > >>>> +int ret; > >>>> > >>>&
Re: [Qemu-devel] [RFC v3 6/8] virtio-iommu: Implement the translation and commands
Hi Eric, > -Original Message- > From: Eric Auger [mailto:eric.au...@redhat.com] > Sent: Tuesday, August 01, 2017 3:03 PM > To: eric.auger@gmail.com; eric.au...@redhat.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org; jean- > philippe.bruc...@arm.com > Cc: will.dea...@arm.com; kevin.t...@intel.com; marc.zyng...@arm.com; > christoffer.d...@linaro.org; drjo...@redhat.com; w...@redhat.com; > t...@semihalf.com; Bharat Bhushan <bharat.bhus...@nxp.com>; > pet...@redhat.com > Subject: [RFC v3 6/8] virtio-iommu: Implement the translation and > commands > > This patch adds the actual implementation for the translation routine > and the virtio-iommu commands. > > Signed-off-by: Eric Auger <eric.au...@redhat.com> > > --- > v2 -> v3: > - init the mutex > - return VIRTIO_IOMMU_S_INVAL is reserved field is not null on > attach/detach commands > - on attach, when the device is already attached to an address space, > detach it first instead of returning an error > - remove nr_devices and use g_tree_ref/unref to destroy the as->mappings > on last device detach > - map/unmap: return NOENT instead of INVAL error if as does not exist > - remove flags argument from attach/detach trace functions > > v1 -> v2: > - fix compilation issue reported by autobuild system > --- > hw/virtio/trace-events | 10 +- > hw/virtio/virtio-iommu.c | 232 > +-- > 2 files changed, 232 insertions(+), 10 deletions(-) > > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events > index 341dbdf..8db3d91 100644 > --- a/hw/virtio/trace-events > +++ b/hw/virtio/trace-events > @@ -28,8 +28,14 @@ virtio_balloon_to_target(uint64_t target, uint32_t > num_pages) "balloon target: % > > # hw/virtio/virtio-iommu.c > # > -virtio_iommu_attach(uint32_t as, uint32_t dev, uint32_t flags) "as=%d > dev=%d flags=%d" > -virtio_iommu_detach(uint32_t dev, uint32_t flags) "dev=%d flags=%d" > +virtio_iommu_attach(uint32_t as, uint32_t dev) "as=%d dev=%d" > 
+virtio_iommu_detach(uint32_t dev) "dev=%d" > virtio_iommu_map(uint32_t as, uint64_t phys_addr, uint64_t virt_addr, > uint64_t size, uint32_t flags) "as= %d phys_addr=0x%"PRIx64" > virt_addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d" > virtio_iommu_unmap(uint32_t as, uint64_t virt_addr, uint64_t size, uint32_t > reserved) "as= %d virt_addr=0x%"PRIx64" size=0x%"PRIx64" reserved=%d" > virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int > flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d" > +virtio_iommu_new_asid(uint32_t asid) "Allocate a new asid=%d" > +virtio_iommu_new_devid(uint32_t devid) "Allocate a new devid=%d" > +virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t > next_low, uint64_t next_high) "Unmap left [0x%"PRIx64",0x%"PRIx64"], > new interval=[0x%"PRIx64",0x%"PRIx64"]" > +virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t > next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], > new interval=[0x%"PRIx64",0x%"PRIx64"]" > +virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc > [0x%"PRIx64",0x%"PRIx64"]" > +virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, > uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d" > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > index e663d9e..9217587 100644 > --- a/hw/virtio/virtio-iommu.c > +++ b/hw/virtio/virtio-iommu.c > @@ -32,10 +32,36 @@ > #include "hw/virtio/virtio-bus.h" > #include "hw/virtio/virtio-access.h" > #include "hw/virtio/virtio-iommu.h" > +#include "hw/pci/pci_bus.h" > +#include "hw/pci/pci.h" > > /* Max size */ > #define VIOMMU_DEFAULT_QUEUE_SIZE 256 > > +typedef struct viommu_as viommu_as; > + > +typedef struct viommu_mapping { > +uint64_t virt_addr; > +uint64_t phys_addr; > +uint64_t size; > +uint32_t flags; > +} viommu_mapping; > + > +typedef struct viommu_interval { > +uint64_t low; > +uint64_t high; > +} viommu_interval; > + > +typedef struct viommu_dev { > +uint32_t id; > 
+viommu_as *as; > +} viommu_dev; > + > +struct viommu_as { > +uint32_t id; > +GTree *mappings; > +}; > + > static inline uint16_t virtio_
[Qemu-devel] [RFC v2 PATCH 2/2] virtio-iommu: vfio integration with virtio-iommu
This patch allows virtio-iommu protection for PCI device-passthrough.

The MSI region is mapped by the current version of the virtio-iommu driver. This MSI region mapping is not getting pushed to the hw iommu because vfio_get_vaddr() allows only a ram-region. This RFC patch needs to be improved.

Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com>
---
v1->v2:
- Added trace events

 hw/virtio/trace-events           |   5 ++
 hw/virtio/virtio-iommu.c         | 133 +++
 include/hw/virtio/virtio-iommu.h |   6 ++
 3 files changed, 144 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 9196b63..3a3968b 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low,
 virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
+virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
+virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
+virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
+virtio_iommu_map_region(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
+virtio_iommu_unmap_region(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index cd188fc..61f33cb 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -129,6 +129,48 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }

+static void virtio_iommu_map_region(VirtIOIOMMU *s, hwaddr iova, hwaddr paddr,
+                                    hwaddr size, int map)
+{
+    VirtioIOMMUNotifierNode *node;
+    IOMMUTLBEntry entry;
+    uint64_t map_size = (1 << 12);
+    int npages;
+    int i;
+
+    npages = size / map_size;
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = map_size - 1;
+
+    for (i = 0; i < npages; i++) {
+        entry.iova = iova + (i * map_size);
+        if (map) {
+            trace_virtio_iommu_map_region(iova, paddr, map_size);
+            entry.perm = IOMMU_RW;
+            entry.translated_addr = paddr + (i * map_size);
+        } else {
+            trace_virtio_iommu_unmap_region(iova, paddr, map_size);
+            entry.perm = IOMMU_NONE;
+            entry.translated_addr = 0;
+        }
+
+        QLIST_FOREACH(node, &s->notifiers_list, next) {
+            memory_region_notify_iommu(&node->iommu_dev->iommu_mr, entry);
+        }
+    }
+}
+
+static gboolean virtio_iommu_unmap_single(gpointer key, gpointer value,
+                                          gpointer data)
+{
+    viommu_mapping *mapping = (viommu_mapping *) value;
+    VirtIOIOMMU *s = (VirtIOIOMMU *) data;
+
+    virtio_iommu_map_region(s, mapping->virt_addr, 0, mapping->size, 0);
+
+    return true;
+}
+
 static int virtio_iommu_attach(VirtIOIOMMU *s,
                                struct virtio_iommu_req_attach *req)
 {
@@ -170,10 +212,26 @@ static int virtio_iommu_detach(VirtIOIOMMU *s,
 {
     uint32_t devid = le32_to_cpu(req->device);
     uint32_t reserved = le32_to_cpu(req->reserved);
+    viommu_dev *dev;
     int ret;

     trace_virtio_iommu_detach(devid, reserved);

+    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
+    if (!dev || !dev->as) {
+        return -EINVAL;
+    }
+
+    dev->as->nr_devices--;
+
+    /* Unmap all if this is last device detached */
+    if (dev->as->nr_devices == 0) {
+        g_tree_foreach(dev->as->mappings, virtio_iommu_unmap_single, s);
+
+        g_tree_remove(s->address_spaces, GUINT_TO_POINTER(dev->as->id));
+        g_tree_destroy(dev->as->mappings);
+    }
+
     ret = g_tree_remove(s->devices, GUINT_TO_POINTER(devid));

     return ret ? VIRTIO_IOMMU_S_OK : VIRTIO_IOMMU_S_INVAL;
@@ -217,6 +275,7 @@ static int virtio_iommu_map(VirtIOIOMMU *s,

     g_tree_insert(as->mappings, interval, mapping);

+    virtio_iommu_map_region(s, virt_addr, phys_addr, size, 1);
     return VIRTIO_IOMMU_S_OK;
 }

@@ -267,7 +326,9 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         } else {
[Qemu-devel] [RFC v2 PATCH 1/2] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
Translate the MSI address if the device is behind a virtio-iommu. This logic is similar to the vSMMUv3/Intel iommu emulation. This RFC patch does not handle the case where both vsmmuv3 and virtio-iommu are available.

Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com>
---
v1->v2:
- Added trace events
- Removed vSMMUv3 link in patch description

 target/arm/kvm.c        | 25 +
 target/arm/trace-events |  3 +++
 2 files changed, 28 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 4555468..5a28956 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -21,7 +21,11 @@
 #include "kvm_arm.h"
 #include "cpu.h"
 #include "internals.h"
+#include "trace.h"
 #include "hw/arm/arm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "exec/memattrs.h"
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
@@ -611,6 +615,27 @@ int kvm_arm_vgic_probe(void)
 int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
+    AddressSpace *as = pci_device_iommu_address_space(dev);
+    IOMMUTLBEntry entry;
+    IOMMUDevice *sdev;
+    VirtIOIOMMU *s;
+
+    if (as == &address_space_memory) {
+        return 0;
+    }
+
+    /* MSI doorbell address is translated by an IOMMU */
+    sdev = container_of(as, IOMMUDevice, as);
+    s = sdev->viommu;
+
+    entry = s->iommu_ops.translate(&sdev->iommu_mr, address, IOMMU_WO);
+
+    route->u.msi.address_lo = entry.translated_addr;
+    route->u.msi.address_hi = entry.translated_addr >> 32;
+
+    trace_kvm_arm_fixup_msi_route(address, sdev->devfn, sdev->iommu_mr.name,
+                                  entry.translated_addr);
+
     return 0;
 }

diff --git a/target/arm/trace-events b/target/arm/trace-events
index e21c84f..eff2822 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -8,3 +8,6 @@ arm_gt_tval_write(int timer, uint64_t value) "gt_tval_write: timer %d value %" P
 arm_gt_ctl_write(int timer, uint64_t value) "gt_ctl_write: timer %d value %" PRIx64
 arm_gt_imask_toggle(int timer, int irqstate) "gt_ctl_write: timer %d IMASK toggle, new irqstate %d"
 arm_gt_cntvoff_write(uint64_t value) "gt_cntvoff_write: value %" PRIx64
+
+# target/arm/kvm.c
+kvm_arm_fixup_msi_route(uint64_t iova, uint32_t devid, const char *name, uint64_t gpa) "MSI addr = 0x%"PRIx64" is translated for devfn=%d through %s into 0x%"PRIx64
--
1.9.3
[Qemu-devel] [RFC v2 PATCH 0/2] VFIO integration
This patch series allows PCI pass-through using virtio-iommu.

This series is based on:
- the virtio-iommu specification written by Jean-Philippe Brucker
  [RFC 0/3] virtio-iommu: a paravirtualized IOMMU
- the virtio-iommu driver by Jean-Philippe Brucker
  [RFC PATCH linux] iommu: Add virtio-iommu driver
- the virtio-iommu device emulation by Eric Auger
  [RFC v2 0/8] VIRTIO-IOMMU device

PCI device pass-through and virtio-net-pci are tested with these changes using dma-ops.

This patch series does not implement the RESV_MEM changes proposed by Jean-Philippe:
"https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01796.html"

v1->v2:
- Added trace events
- Removed vSMMUv3 link in patch description

Bharat Bhushan (2):
  target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  virtio-iommu: vfio integration with virtio-iommu

 hw/virtio/trace-events           |   5 ++
 hw/virtio/virtio-iommu.c         | 133 +++
 include/hw/virtio/virtio-iommu.h |   6 ++
 target/arm/kvm.c                 |  25 ++++
 target/arm/trace-events          |   3 +
 5 files changed, 172 insertions(+)

--
1.9.3
Re: [Qemu-devel] [RFC v2 6/8] virtio-iommu: Implement the translation and commands
Hi Peter, > -Original Message- > From: Peter Xu [mailto:pet...@redhat.com] > Sent: Friday, July 14, 2017 7:48 AM > To: Eric Auger <eric.au...@redhat.com> > Cc: eric.auger@gmail.com; peter.mayd...@linaro.org; > alex.william...@redhat.com; m...@redhat.com; qemu-...@nongnu.org; > qemu-devel@nongnu.org; jean-philippe.bruc...@arm.com; > w...@redhat.com; kevin.t...@intel.com; Bharat Bhushan > <bharat.bhus...@nxp.com>; marc.zyng...@arm.com; t...@semihalf.com; > will.dea...@arm.com; drjo...@redhat.com; robin.mur...@arm.com; > christoffer.d...@linaro.org > Subject: Re: [Qemu-devel] [RFC v2 6/8] virtio-iommu: Implement the > translation and commands > > On Wed, Jun 07, 2017 at 06:01:25PM +0200, Eric Auger wrote: > > This patch adds the actual implementation for the translation routine > > and the virtio-iommu commands. > > > > Signed-off-by: Eric Auger <eric.au...@redhat.com> > > [...] > > > static int virtio_iommu_attach(VirtIOIOMMU *s, > > struct virtio_iommu_req_attach *req) > > @@ -95,10 +135,34 @@ static int virtio_iommu_attach(VirtIOIOMMU *s, > > uint32_t asid = le32_to_cpu(req->address_space); > > uint32_t devid = le32_to_cpu(req->device); > > uint32_t reserved = le32_to_cpu(req->reserved); > > +viommu_as *as; > > +viommu_dev *dev; > > > > trace_virtio_iommu_attach(asid, devid, reserved); > > > > -return VIRTIO_IOMMU_S_UNSUPP; > > +dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid)); > > +if (dev) { > > +return -1; > > +} > > + > > +as = g_tree_lookup(s->address_spaces, GUINT_TO_POINTER(asid)); > > +if (!as) { > > +as = g_malloc0(sizeof(*as)); > > +as->id = asid; > > +as->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp, > > + NULL, NULL, > > (GDestroyNotify)g_free); > > +g_tree_insert(s->address_spaces, GUINT_TO_POINTER(asid), as); > > +trace_virtio_iommu_new_asid(asid); > > +} > > + > > +dev = g_malloc0(sizeof(*dev)); > > +dev->as = as; > > +dev->id = devid; > > +as->nr_devices++; > > +trace_virtio_iommu_new_devid(devid); > > +g_tree_insert(s->devices, 
GUINT_TO_POINTER(devid), dev); > > Here do we need to record something like a refcount for address space? > Since... We are using "nr_devices" as number of devices attached to an address-space > > > + > > +return VIRTIO_IOMMU_S_OK; > > } > > > > static int virtio_iommu_detach(VirtIOIOMMU *s, @@ -106,10 +170,13 @@ > > static int virtio_iommu_detach(VirtIOIOMMU *s, { > > uint32_t devid = le32_to_cpu(req->device); > > uint32_t reserved = le32_to_cpu(req->reserved); > > +int ret; > > > > trace_virtio_iommu_detach(devid, reserved); > > > > -return VIRTIO_IOMMU_S_UNSUPP; > > +ret = g_tree_remove(s->devices, GUINT_TO_POINTER(devid)); > > ... here when detach, imho we should check the refcount: if there is no > device using specific address space, we should release the address space as > well. > > Otherwise there would have no way to destroy an address space? Here if nr_devices == 0 then release the address space, is that ok? This is how I implemented as part of VFIO integration over this patch series. "[RFC PATCH 2/2] virtio-iommu: vfio integration with virtio-iommu" Thanks -Bharat > > > + > > +return ret ? VIRTIO_IOMMU_S_OK : VIRTIO_IOMMU_S_INVAL; > > } > > [...] > > > static int virtio_iommu_unmap(VirtIOIOMMU *s, @@ -133,10 +227,64 @@ > > static int virtio_iommu_unmap(VirtIOIOMMU *s, > > uint64_t virt_addr = le64_to_cpu(req->virt_addr); > > uint64_t size = le64_to_cpu(req->size); > > uint32_t flags = le32_to_cpu(req->flags); > > +viommu_mapping *mapping; > > +viommu_interval interval; > > +viommu_as *as; > > > > trace_virtio_iommu_unmap(asid, virt_addr, size, flags); > > > > -return VIRTIO_IOMMU_S_UNSUPP; > > +as = g_tree_lookup(s->address_spaces, GUINT_TO_POINTER(asid)); > > +if (!as) { > > +error_report("%s: no as", __func__); > > +return VIRTIO_IOMMU_S_INVAL; > > +} > > +interval.low = virt_addr; > > +interval.high = virt_addr + size - 1; > > + > > +mapping = g_tree_lookup(as->mappings, (gpointer)); > > + > > +while (mapping) { >
[Qemu-devel] [RFC PATCH 1/2] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
Fix up the MSI address if it is translated via a virtual iommu. This code is based on http://patchwork.ozlabs.org/patch/785951/

Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com>
---
 target/arm/kvm.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 4555468..eff7e8f 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -22,6 +22,9 @@
 #include "cpu.h"
 #include "internals.h"
 #include "hw/arm/arm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "exec/memattrs.h"
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
@@ -611,6 +614,24 @@ int kvm_arm_vgic_probe(void)
 int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
+    AddressSpace *as = pci_device_iommu_address_space(dev);
+    IOMMUTLBEntry entry;
+    IOMMUDevice *sdev;
+    VirtIOIOMMU *s;
+
+    if (as == &address_space_memory) {
+        return 0;
+    }
+
+    /* MSI doorbell address is translated by an IOMMU */
+    sdev = container_of(as, IOMMUDevice, as);
+    s = sdev->viommu;
+
+    entry = s->iommu_ops.translate(&sdev->iommu_mr, address, IOMMU_WO);
+
+    route->u.msi.address_lo = entry.translated_addr;
+    route->u.msi.address_hi = entry.translated_addr >> 32;
+
     return 0;
 }
--
1.9.3
[Qemu-devel] [RFC PATCH 2/2] virtio-iommu: vfio integration with virtio-iommu
This patch allows virtio-iommu protection for PCI device-passthrough.

The MSI region is mapped by the current version of the virtio-iommu driver. This MSI region mapping is not getting pushed to the hw iommu because vfio_get_vaddr() allows only a ram-region. This RFC patch needs to be improved.

Signed-off-by: Bharat Bhushan <bharat.bhus...@nxp.com>
---
 hw/virtio/virtio-iommu.c         | 127 +++
 include/hw/virtio/virtio-iommu.h |   6 ++
 2 files changed, 133 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index cd188fc..08d5a2f 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -129,6 +129,46 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }

+static void virtio_iommu_map_region(VirtIOIOMMU *s, hwaddr iova, hwaddr paddr,
+                                    hwaddr size, int map)
+{
+    VirtioIOMMUNotifierNode *node;
+    IOMMUTLBEntry entry;
+    uint64_t map_size = (1 << 12);
+    int npages;
+    int i;
+
+    npages = size / map_size;
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = map_size - 1;
+
+    for (i = 0; i < npages; i++) {
+        entry.iova = iova + (i * map_size);
+        if (map) {
+            entry.perm = IOMMU_RW;
+            entry.translated_addr = paddr + (i * map_size);
+        } else {
+            entry.perm = IOMMU_NONE;
+            entry.translated_addr = 0;
+        }
+
+        QLIST_FOREACH(node, &s->notifiers_list, next) {
+            memory_region_notify_iommu(&node->iommu_dev->iommu_mr, entry);
+        }
+    }
+}
+
+static gboolean virtio_iommu_unmap_single(gpointer key, gpointer value,
+                                          gpointer data)
+{
+    viommu_mapping *mapping = (viommu_mapping *) value;
+    VirtIOIOMMU *s = (VirtIOIOMMU *) data;
+
+    virtio_iommu_map_region(s, mapping->virt_addr, 0, mapping->size, 0);
+
+    return true;
+}
+
 static int virtio_iommu_attach(VirtIOIOMMU *s,
                                struct virtio_iommu_req_attach *req)
 {
@@ -170,10 +210,26 @@ static int virtio_iommu_detach(VirtIOIOMMU *s,
 {
     uint32_t devid = le32_to_cpu(req->device);
     uint32_t reserved = le32_to_cpu(req->reserved);
+    viommu_dev *dev;
     int ret;

     trace_virtio_iommu_detach(devid, reserved);

+    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
+    if (!dev || !dev->as) {
+        return -EINVAL;
+    }
+
+    dev->as->nr_devices--;
+
+    /* Unmap all if this is last device detached */
+    if (dev->as->nr_devices == 0) {
+        g_tree_foreach(dev->as->mappings, virtio_iommu_unmap_single, s);
+
+        g_tree_remove(s->address_spaces, GUINT_TO_POINTER(dev->as->id));
+        g_tree_destroy(dev->as->mappings);
+    }
+
     ret = g_tree_remove(s->devices, GUINT_TO_POINTER(devid));

     return ret ? VIRTIO_IOMMU_S_OK : VIRTIO_IOMMU_S_INVAL;
@@ -217,6 +273,7 @@ static int virtio_iommu_map(VirtIOIOMMU *s,

     g_tree_insert(as->mappings, interval, mapping);

+    virtio_iommu_map_region(s, virt_addr, phys_addr, size, 1);
     return VIRTIO_IOMMU_S_OK;
 }

@@ -267,7 +324,9 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         } else {
             break;
         }
+
         if (interval.low >= interval.high) {
+            virtio_iommu_map_region(s, virt_addr, 0, size, 0);
             return VIRTIO_IOMMU_S_OK;
         } else {
             mapping = g_tree_lookup(as->mappings, (gpointer)&interval);
@@ -410,6 +469,35 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
     }
 }

+static void virtio_iommu_notify_flag_changed(MemoryRegion *iommu,
+                                             IOMMUNotifierFlag old,
+                                             IOMMUNotifierFlag new)
+{
+    IOMMUDevice *sdev = container_of(iommu, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    VirtioIOMMUNotifierNode *node = NULL;
+    VirtioIOMMUNotifierNode *next_node = NULL;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        node = g_malloc0(sizeof(*node));
+        node->iommu_dev = sdev;
+        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
+        return;
+    }
+
+    /* update notifier node with new flags */
+    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+        if (node->iommu_dev == sdev) {
+            if (new == IOMMU_NOTIFIER_NONE) {
+                QLIST_REMOVE(node, next);
+                g_free(node);
+            }
+            return;
+        }
+    }
+}
+
 static IOMMUTLBEntry virtio_iommu_translate(MemoryRegion *mr, hwaddr addr,
                                             IOMMUAccessFlags flag)
 {
@@ -523,11 +611,48 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     return (ua > ub) - (ua < ub);
 }

+static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
+{
+    vio
[Qemu-devel] [RFC PATCH 0/2] VFIO integration
This patch series allows PCI pass-through using virtio-iommu.

This series is based on:
- the virtio-iommu specification written by Jean-Philippe Brucker
  [RFC 0/3] virtio-iommu: a paravirtualized IOMMU
- the virtio-iommu driver by Jean-Philippe Brucker
  [RFC PATCH linux] iommu: Add virtio-iommu driver
- the virtio-iommu device emulation by Eric Auger
  [RFC v2 0/8] VIRTIO-IOMMU device

PCI device pass-through and virtio-net-pci are tested with these changes using dma-ops.

This patch series does not implement the RESV_MEM changes proposed by Jean-Philippe:
"https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01796.html"

Bharat Bhushan (2):
  target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  virtio-iommu: vfio integration with virtio-iommu

 hw/virtio/virtio-iommu.c         | 127 +++
 include/hw/virtio/virtio-iommu.h |   6 ++
 target/arm/kvm.c                 |  21 +++
 3 files changed, 154 insertions(+)

--
1.9.3
Re: [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device
> -Original Message- > From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] > Sent: Wednesday, July 12, 2017 4:28 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com>; Auger Eric > <eric.au...@redhat.com>; eric.auger@gmail.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org > Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > robin.mur...@arm.com; christoffer.d...@linaro.org > Subject: Re: [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device > > On 12/07/17 11:27, Bharat Bhushan wrote: > > > > > >> -Original Message- > >> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] > >> Sent: Wednesday, July 12, 2017 3:48 PM > >> To: Bharat Bhushan <bharat.bhus...@nxp.com>; Auger Eric > >> <eric.au...@redhat.com>; eric.auger@gmail.com; > >> peter.mayd...@linaro.org; alex.william...@redhat.com; > m...@redhat.com; > >> qemu-...@nongnu.org; qemu-devel@nongnu.org > >> Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > >> t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > >> robin.mur...@arm.com; christoffer.d...@linaro.org > >> Subject: Re: [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device > >> > >> On 12/07/17 04:50, Bharat Bhushan wrote: > >> [...] > >>>> The size of the virtio_iommu_req_probe structure is variable, and > >> depends > >>>> what fields the device implements. So the device initially computes > >>>> the > >> size it > >>>> needs to fill virtio_iommu_req_probe, describes it in probe_size, > >>>> and the driver allocates that many bytes for > >>>> virtio_iommu_req_probe.content[] > >>>> > >>>>>> * When device offers VIRTIO_IOMMU_F_PROBE, the driver should > >> send > >>>> an > >>>>>> VIRTIO_IOMMU_T_PROBE request for each new endpoint. 
> >>>>>> * The driver allocates a device-writeable buffer of probe_size > >>>>>> (plus > >>>>>> framing) and sends it as a VIRTIO_IOMMU_T_PROBE request. > >>>>>> * The device fills the buffer with various information. > >>>>>> > >>>>>> struct virtio_iommu_req_probe { > >>>>>>/* device-readable */ > >>>>>>struct virtio_iommu_req_head head; > >>>>>>le32 device; > >>>>>>le32 flags; > >>>>>> > >>>>>>/* maybe also le32 content_size, but it must be equal to > >>>>>> probe_size */ > >>>>> > >>>>> Can you please describe why we need to pass size of "probe_size" > >>>>> in > >> probe > >>>> request? > >>>> > >>>> We don't. I don't think we should add this 'content_size' field > >>>> unless there > >> is > >>>> a compelling reason to do so. > >>>> > >>>>>> > >>>>>>/* device-writeable */ > >>>>>>u8 content[]; > >>>>> > >>>>> I assume content_size above is the size of array "content[]" and > >>>>> max > >> value > >>>> can be equal to probe_size advertised by device? > >>>> > >>>> probe_size is exactly the size of array content[]. The driver must > >>>> allocate a buffer of this size (plus the space needed for head, device, > flags and tail). > >>>> > >>>> Then the device is free to leave parts of content[] empty. Field > >>>> 'type' 0 will > >> be > >>>> reserved and mark the end of the array. > >>>> > >>>>>>struct virtio_iommu_req_tail tail; }; > >>>>>> > >>>>>> I'm still struggling with the content and layout of the probe > >>>>>> request, and would appreciate any feedback. To be easily > >>>>>> extended, I think it should contain a list of fields of variable size: > >>>>>> > >>>>>>|0 15|16 31|32 N| > >>>>>>| type |length | values | > >>&
Re: [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device
> -Original Message- > From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] > Sent: Wednesday, July 12, 2017 3:48 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com>; Auger Eric > <eric.au...@redhat.com>; eric.auger@gmail.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org > Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > robin.mur...@arm.com; christoffer.d...@linaro.org > Subject: Re: [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device > > On 12/07/17 04:50, Bharat Bhushan wrote: > [...] > >> The size of the virtio_iommu_req_probe structure is variable, and > depends > >> what fields the device implements. So the device initially computes the > size it > >> needs to fill virtio_iommu_req_probe, describes it in probe_size, and the > >> driver allocates that many bytes for virtio_iommu_req_probe.content[] > >> > >>>> * When device offers VIRTIO_IOMMU_F_PROBE, the driver should > send > >> an > >>>> VIRTIO_IOMMU_T_PROBE request for each new endpoint. > >>>> * The driver allocates a device-writeable buffer of probe_size (plus > >>>> framing) and sends it as a VIRTIO_IOMMU_T_PROBE request. > >>>> * The device fills the buffer with various information. > >>>> > >>>> struct virtio_iommu_req_probe { > >>>> /* device-readable */ > >>>> struct virtio_iommu_req_head head; > >>>> le32 device; > >>>> le32 flags; > >>>> > >>>> /* maybe also le32 content_size, but it must be equal to > >>>> probe_size */ > >>> > >>> Can you please describe why we need to pass size of "probe_size" in > probe > >> request? > >> > >> We don't. I don't think we should add this 'content_size' field unless > >> there > is > >> a compelling reason to do so. 
> >> > >>>> > >>>> /* device-writeable */ > >>>> u8 content[]; > >>> > >>> I assume content_size above is the size of array "content[]" and max > value > >> can be equal to probe_size advertised by device? > >> > >> probe_size is exactly the size of array content[]. The driver must > >> allocate a > >> buffer of this size (plus the space needed for head, device, flags and > >> tail). > >> > >> Then the device is free to leave parts of content[] empty. Field 'type' 0 > >> will > be > >> reserved and mark the end of the array. > >> > >>>> struct virtio_iommu_req_tail tail; > >>>> }; > >>>> > >>>> I'm still struggling with the content and layout of the probe > >>>> request, and would appreciate any feedback. To be easily extended, I > >>>> think it should contain a list of fields of variable size: > >>>> > >>>> |0 15|16 31|32 N| > >>>> | type |length | values | > >>>> > >>>> 'length' might be made optional if it can be deduced from type, but > >>>> might make driver-side parsing more robust. > >>>> > >>>> The probe could either be done for each endpoint, or for each address > >>>> space. I much prefer endpoint because it is the smallest granularity. > >>>> The driver can then decide what endpoints to put together in the same > >>>> address space based on their individual capabilities. The > >>>> specification would described how each endpoint property is combined > >>>> when endpoints are put in the same address space. For example, take > >>>> the minimum of all PASID size, the maximum of all page granularities, > >>>> combine doorbell addresses, etc. > >>>> > >>>> If we did the probe on address spaces instead, the driver would have > >>>> to re-send a probe request each time a new endpoint is attached to an > >>>> existing address space, to see if it is still capable of page table > >>>> handover or if the driver just combined a VFIO and an emulated > >>>> endpoint by accident. > >>>> > >>>> *** > >>>> > >>>> Using this framework, the device can declare