RE: [PATCH] intel_iommu: Use the latest fault reasons defined by spec

2024-05-24 Thread Duan, Zhenzhong


>-Original Message-
>From: Jason Wang 
>Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons defined by
>spec
>
>On Tue, May 21, 2024 at 6:25 PM Duan, Zhenzhong
> wrote:
>>
>>
>>
>> >-Original Message-
>> >From: Jason Wang 
>> >Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons defined by
>> >spec
>> >
>> >On Mon, May 20, 2024 at 12:15 PM Liu, Yi L  wrote:
>> >>
>> >> > From: Duan, Zhenzhong 
>> >> > Sent: Monday, May 20, 2024 11:41 AM
>> >> >
>> >> >
>> >> >
>> >> > >-Original Message-
>> >> > >From: Jason Wang 
>> >> > >Sent: Monday, May 20, 2024 8:44 AM
>> >> > >To: Duan, Zhenzhong 
>> >> > >Cc: qemu-devel@nongnu.org; Liu, Yi L ; Peng,
>Chao
>> >P
>> >> > >; Yu Zhang ;
>> >Michael
>> >> > >S. Tsirkin ; Paolo Bonzini
>;
>> >> > >Richard Henderson ; Eduardo
>Habkost
>> >> > >; Marcel Apfelbaum
>> >
>> >> > >Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons
>defined
>> >by
>> >> > >spec
>> >> > >
>> >> > >On Fri, May 17, 2024 at 6:26 PM Zhenzhong Duan
>> >> > > wrote:
>> >> > >>
>> >> > >> From: Yu Zhang 
>> >> > >>
>> >> > >> Currently we use only VTD_FR_PASID_TABLE_INV as fault reason.
>> >> > >> Update with more detailed fault reasons listed in VT-d spec 7.2.3.
>> >> > >>
>> >> > >> Signed-off-by: Yu Zhang 
>> >> > >> Signed-off-by: Zhenzhong Duan 
>> >> > >> ---
>> >> > >
>> >> > >I wonder if this could be noticed by the guest or not. If yes should
>> >> > >we consider starting to add thing like version to vtd emulation code?
>> >> >
>> >> > Kernel only dumps the reason like below:
>> >> >
>> >> > DMAR: [DMA Write NO_PASID] Request device [20:00.0] fault addr
>> >0x123460
>> >> > [fault reason 0x71] SM: Present bit in first-level paging entry is clear
>> >>
>> >> Yes, guest kernel would notice it as the fault would be injected to vm.
>> >>
>> >> > Maybe bump 1.0 -> 1.1?
>> >> > My understanding version number is only informational and is far
>from
>> >> > accurate to mark if a feature supported. Driver should check cap/ecap
>> >> > bits instead.
>> >>
>> >> Should the version ID here be aligned with VT-d spec?
>> >
>> >Probably, this might be something that could be noticed by the
>> >management to migration compatibility.
>>
>> Could you elaborate what we need to do for migration compatibility?
>> I see version is already exported so libvirt can query it, see:
>>
>> DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),
>
>It is the Qemu command line parameters not the version of the vmstate.
>
>For example -device intel-iommu,version=3.0
>
>Qemu then knows it should behave as 3.0.

So you want to bump vtd_vmstate.version?

In fact, this series changes the intel_iommu property from
x-scalable-mode=["on"|"off"]
to x-scalable-mode=["legacy"|"modern"|"off"].

My understanding is that the management app should use the same QEMU command
line on source and destination, so compatibility is already guaranteed even if
we don't bump vtd_vmstate.version.

Thanks
Zhenzhong


RE: [PATCH rfcv2 17/17] tests/qtest: Add intel-iommu test

2024-05-23 Thread Duan, Zhenzhong


>-Original Message-
>From: Thomas Huth 
>Subject: Re: [PATCH rfcv2 17/17] tests/qtest: Add intel-iommu test
>
>On 22/05/2024 08.23, Zhenzhong Duan wrote:
>> Add the framework to test the intel-iommu device.
>>
>> Currently only tested cap/ecap bits correctness in scalable
>> modern mode. Also tested cap/ecap bits consistency before
>> and after system reset.
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   MAINTAINERS|  1 +
>>   tests/qtest/intel-iommu-test.c | 63
>++
>>   tests/qtest/meson.build|  1 +
>>   3 files changed, 65 insertions(+)
>>   create mode 100644 tests/qtest/intel-iommu-test.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 5dab60bd04..f1ef6128c8 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -3656,6 +3656,7 @@ S: Supported
>>   F: hw/i386/intel_iommu.c
>>   F: hw/i386/intel_iommu_internal.h
>>   F: include/hw/i386/intel_iommu.h
>> +F: tests/qtest/intel-iommu-test.c
>>
>>   AMD-Vi Emulation
>>   S: Orphan
>> diff --git a/tests/qtest/intel-iommu-test.c b/tests/qtest/intel-iommu-test.c
>> new file mode 100644
>> index 00..e1273bce14
>> --- /dev/null
>> +++ b/tests/qtest/intel-iommu-test.c
>> @@ -0,0 +1,63 @@
>> +/*
>> + * QTest testcase for intel-iommu
>> + *
>> + * Copyright (c) 2024 Intel, Inc.
>> + *
>> + * Author: Zhenzhong Duan 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "libqtest-single.h"
>
>It's a little bit nicer to write new tests without libqtest-single.h (e.g.
>in case you ever add migration tests later, you must not use anything that
>uses a global state), so I'd recommend to use "qts = qtest_init(...)"
>instead of qtest_start(...) and then to use the functions with the "qtest_"
>prefix instead of the other functions from libqtest-single.h ... but it's
>only a recommendation, up to you whether you want to respin your patch
>with
>it or not.

Got it, I'll fix it in the next version.

>
>Anyway:
>Acked-by: Thomas Huth 
>
>Do you want me to pick this up through the qtest tree, or shall this go
>through some x86-related tree instead?

This patch depends on other functional patches in this series,
so it is probably better for it to go through an x86-related tree with the others.

Thanks
Zhenzhong


RE: [PATCH rfcv2 00/17] intel_iommu: Enable stage-1 translation for emulated device

2024-05-23 Thread Duan, Zhenzhong


>-Original Message-
>From: Jason Wang 
>Subject: Re: [PATCH rfcv2 00/17] intel_iommu: Enable stage-1 translation
>for emulated device
>
>On Wed, May 22, 2024 at 2:25 PM Zhenzhong Duan
> wrote:
>>
>> Hi,
>>
>> Per Jason Wang's suggestion, iommufd nesting series[1] is split into
>> "Enable stage-1 translation for emulated device" series and
>> "Enable stage-1 translation for passthrough device" series.
>>
>> This series enables stage-1 translation support for emulated device
>> in intel iommu which we called "modern" mode.
>
>Btw, I think we never merge RFC patches so I guess this series could
>be sent as formal one for the next version.

Got it, will do.

Thanks
Zhenzhong


RE: [PATCH 7/7] vfio/{ap, ccw}: Use warn_report_err() for IRQ notifier registration errors

2024-05-23 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: [PATCH 7/7] vfio/{ap,ccw}: Use warn_report_err() for IRQ notifier
>registration errors
>
>vfio_ccw_register_irq_notifier() and vfio_ap_register_irq_notifier()
>errors are currently reported using error_report_err(). Since they are
>not considered as failing conditions, using warn_report_err() is more
>appropriate.
>
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>---
> hw/vfio/ap.c  | 2 +-
> hw/vfio/ccw.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
>index
>c12531a7886a2fe87598be0861fba5923bd2c206..0c4354e3e70169ec072e1
>6da0919936647d1d351 100644
>--- a/hw/vfio/ap.c
>+++ b/hw/vfio/ap.c
>@@ -172,7 +172,7 @@ static void vfio_ap_realize(DeviceState *dev, Error
>**errp)
>  * Report this error, but do not make it a failing condition.
>  * Lack of this IRQ in the host does not prevent normal operation.
>  */
>-error_report_err(err);
>+warn_report_err(err);
> }
>
> return;
>diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
>index
>36f2677a448c5e31523dcc3de7d973ec70e4a13c..1f8e1272c7555cd0a77048
>1d1ae92988f6e2e62e 100644
>--- a/hw/vfio/ccw.c
>+++ b/hw/vfio/ccw.c
>@@ -616,7 +616,7 @@ static void vfio_ccw_realize(DeviceState *dev, Error
>**errp)
>  * Report this error, but do not make it a failing condition.
>  * Lack of this IRQ in the host does not prevent normal operation.
>  */
>-error_report_err(err);
>+warn_report_err(err);
> }
>
> return;
>--
>2.45.1
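
The distinction the patch above relies on (a mandatory step reports through the caller's errp, an optional step only warns through a local Error) can be sketched in standalone C. This is a rough illustration under stated assumptions, not QEMU code: the Error struct and every demo_* name are hypothetical stand-ins for the real error API, and the stand-in for warn_report_err() only mimics "print with a warning prefix, then free".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct Error {
    char msg[128];
} Error;

/* Stand-in for error_setg(): store a message if the caller wants one. */
void demo_error_set(Error **errp, const char *msg)
{
    if (!errp) {
        return;                     /* caller chose to ignore errors */
    }
    assert(*errp == NULL);          /* an error must not be set twice */
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

/* Stand-in for warn_report_err(): print as a warning, then free. */
void demo_warn_report_err(Error *err)
{
    fprintf(stderr, "warning: %s\n", err->msg);
    free(err);
}

/* Hypothetical notifier registration that fails when the IRQ is missing. */
bool demo_register_irq_notifier(bool irq_available, Error **errp)
{
    if (!irq_available) {
        demo_error_set(errp, "IRQ not available on host");
        return false;
    }
    return true;
}

/* Returns true on success; a missing optional IRQ only produces a warning. */
bool demo_realize(bool io_irq, bool req_irq, Error **errp)
{
    Error *err = NULL;              /* local: only for the non-fatal case */

    if (!demo_register_irq_notifier(io_irq, errp)) {
        return false;               /* mandatory: error goes to the caller */
    }
    if (!demo_register_irq_notifier(req_irq, &err)) {
        demo_warn_report_err(err);  /* optional: warn and carry on */
    }
    return true;
}
```

Calling demo_realize(true, false, &err) succeeds and leaves err untouched, while demo_realize(false, true, &err) fails and hands the error to the caller.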



RE: [PATCH 5/7] vfio/ccw: Use the 'Error **errp' argument of vfio_ccw_realize()

2024-05-23 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: [PATCH 5/7] vfio/ccw: Use the 'Error **errp' argument of
>vfio_ccw_realize()
>
>The local error variable is kept for vfio_ccw_register_irq_notifier()
>because it is not considered as a failing condition. We will change
>how error reporting is done in following changes.
>
>Remove the error_propagate() call.
>
>Cc: Zhenzhong Duan 
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>---
> hw/vfio/ccw.c | 12 +---
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
>diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
>index
>9a8e052711fe2f7c067c52808b2af30d0ebfee0c..a468fa2342b97e0ee36bd5f
>b8443025cc90a0453 100644
>--- a/hw/vfio/ccw.c
>+++ b/hw/vfio/ccw.c
>@@ -582,8 +582,8 @@ static void vfio_ccw_realize(DeviceState *dev, Error
>**errp)
>
> /* Call the class init function for subchannel. */
> if (cdc->realize) {
>-if (!cdc->realize(cdev, vcdev->vdev.sysfsdev, &err)) {
>-goto out_err_propagate;
>+if (!cdc->realize(cdev, vcdev->vdev.sysfsdev, errp)) {
>+return;
> }
> }
>
>@@ -596,17 +596,17 @@ static void vfio_ccw_realize(DeviceState *dev,
>Error **errp)
> goto out_attach_dev_err;
> }
>
>-if (!vfio_ccw_get_region(vcdev, &err)) {
>+if (!vfio_ccw_get_region(vcdev, errp)) {
> goto out_region_err;
> }
>
>-if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX,
>&err)) {
>+if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX,
>errp)) {
> goto out_io_notifier_err;
> }
>
> if (vcdev->crw_region) {
> if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_CRW_IRQ_INDEX,
>-&err)) {
>+errp)) {
> goto out_irq_notifier_err;
> }
> }
>@@ -634,8 +634,6 @@ out_attach_dev_err:
> if (cdc->unrealize) {
> cdc->unrealize(cdev);
> }
>-out_err_propagate:
>-error_propagate(errp, err);
> }
>
> static void vfio_ccw_unrealize(DeviceState *dev)
>--
>2.45.1



RE: [PATCH 4/7] s390x/css: Make S390CCWDeviceClass::realize return bool

2024-05-23 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: [PATCH 4/7] s390x/css: Make S390CCWDeviceClass::realize return
>bool
>
>Since the realize() handler of S390CCWDeviceClass takes an 'Error **'
>argument, best practices suggest to return a bool. See the qapi/error.h
>Rules section. While at it, modify the call in vfio_ccw_realize().
>
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>---
> include/hw/s390x/s390-ccw.h | 2 +-
> hw/s390x/s390-ccw.c | 7 ---
> hw/vfio/ccw.c   | 3 +--
> 3 files changed, 6 insertions(+), 6 deletions(-)
>
>diff --git a/include/hw/s390x/s390-ccw.h b/include/hw/s390x/s390-ccw.h
>index
>2c807ee3a1ae8d85460fe65be8a62c64f212fe4b..2e0a70998132070996d6b
>0d083b8ddba5b9b87dc 100644
>--- a/include/hw/s390x/s390-ccw.h
>+++ b/include/hw/s390x/s390-ccw.h
>@@ -31,7 +31,7 @@ struct S390CCWDevice {
>
> struct S390CCWDeviceClass {
> CCWDeviceClass parent_class;
>-void (*realize)(S390CCWDevice *dev, char *sysfsdev, Error **errp);
>+bool (*realize)(S390CCWDevice *dev, char *sysfsdev, Error **errp);
> void (*unrealize)(S390CCWDevice *dev);
> IOInstEnding (*handle_request) (SubchDev *sch);
> int (*handle_halt) (SubchDev *sch);
>diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
>index
>b3d14c61d732880a651edcf28a040ca723cb9f5b..3c0975055089c3629dd76
>ce2e1484a4ef66d8d41 100644
>--- a/hw/s390x/s390-ccw.c
>+++ b/hw/s390x/s390-ccw.c
>@@ -108,7 +108,7 @@ static bool s390_ccw_get_dev_info(S390CCWDevice
>*cdev,
> return true;
> }
>
>-static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error
>**errp)
>+static bool s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error
>**errp)
> {
> CcwDevice *ccw_dev = CCW_DEVICE(cdev);
> CCWDeviceClass *ck = CCW_DEVICE_GET_CLASS(ccw_dev);
>@@ -117,7 +117,7 @@ static void s390_ccw_realize(S390CCWDevice *cdev,
>char *sysfsdev, Error **errp)
> int ret;
>
> if (!s390_ccw_get_dev_info(cdev, sysfsdev, errp)) {
>-return;
>+return false;
> }
>
> sch = css_create_sch(ccw_dev->devno, errp);
>@@ -142,7 +142,7 @@ static void s390_ccw_realize(S390CCWDevice *cdev,
>char *sysfsdev, Error **errp)
>
> css_generate_sch_crws(sch->cssid, sch->ssid, sch->schid,
>   parent->hotplugged, 1);
>-return;
>+return true;
>
> out_err:
> css_subch_assign(sch->cssid, sch->ssid, sch->schid, sch->devno, NULL);
>@@ -150,6 +150,7 @@ out_err:
> g_free(sch);
> out_mdevid_free:
> g_free(cdev->mdevid);
>+return false;
> }
>
> static void s390_ccw_unrealize(S390CCWDevice *cdev)
>diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
>index
>2600e62e37238779800dc2b3a0bd315d7633017b..9a8e052711fe2f7c067c
>52808b2af30d0ebfee0c 100644
>--- a/hw/vfio/ccw.c
>+++ b/hw/vfio/ccw.c
>@@ -582,8 +582,7 @@ static void vfio_ccw_realize(DeviceState *dev, Error
>**errp)
>
> /* Call the class init function for subchannel. */
> if (cdc->realize) {
>-cdc->realize(cdev, vcdev->vdev.sysfsdev, &err);
>-if (err) {
>+if (!cdc->realize(cdev, vcdev->vdev.sysfsdev, &err)) {
> goto out_err_propagate;
> }
> }
>--
>2.45.1
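
The convention these patches apply (a function that takes an 'Error **' argument returns a bool, so callers test the return value instead of inspecting a local error) can be sketched in standalone C. This is only a sketch: the Error struct and the demo_* functions are hypothetical stand-ins, not the real qapi/error.h API or the real s390 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct Error {
    char msg[128];
} Error;

/* Stand-in for error_setg(): store a message if the caller wants one. */
void demo_error_set(Error **errp, const char *msg)
{
    if (!errp) {
        return;                 /* caller chose to ignore errors */
    }
    assert(*errp == NULL);      /* an error must not be set twice */
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

void demo_error_free(Error *err)
{
    free(err);
}

/* The return value alone tells the caller whether the call failed. */
bool demo_get_dev_info(const char *sysfsdev, Error **errp)
{
    if (!sysfsdev) {
        demo_error_set(errp, "No host device provided");
        return false;
    }
    return true;
}

/* No local Error and no propagate step: errp is passed straight through. */
bool demo_realize(const char *sysfsdev, Error **errp)
{
    if (!demo_get_dev_info(sysfsdev, errp)) {
        return false;
    }
    return true;
}
```

A side effect of this shape is that a caller may pass NULL for errp and still learn about failure from the return value, which the older "void plus check err" style could not offer.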



RE: [PATCH 3/7] hw/s390x/ccw: Remove local Error variable from s390_ccw_realize()

2024-05-23 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: [PATCH 3/7] hw/s390x/ccw: Remove local Error variable from
>s390_ccw_realize()
>
>Use the 'Error **errp' argument of s390_ccw_realize() instead and
>remove the error_propagate() call.
>
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>---
> hw/s390x/s390-ccw.c | 13 +
> 1 file changed, 5 insertions(+), 8 deletions(-)
>
>diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
>index
>4b8ede701df90949720262b6fc1b65f4e505e34d..b3d14c61d732880a651ed
>cf28a040ca723cb9f5b 100644
>--- a/hw/s390x/s390-ccw.c
>+++ b/hw/s390x/s390-ccw.c
>@@ -115,13 +115,12 @@ static void s390_ccw_realize(S390CCWDevice
>*cdev, char *sysfsdev, Error **errp)
> DeviceState *parent = DEVICE(ccw_dev);
> SubchDev *sch;
> int ret;
>-Error *err = NULL;
>
>-if (!s390_ccw_get_dev_info(cdev, sysfsdev, &err)) {
>-goto out_err_propagate;
>+if (!s390_ccw_get_dev_info(cdev, sysfsdev, errp)) {
>+return;
> }
>
>-sch = css_create_sch(ccw_dev->devno, &err);
>+sch = css_create_sch(ccw_dev->devno, errp);
> if (!sch) {
> goto out_mdevid_free;
> }
>@@ -132,12 +131,12 @@ static void s390_ccw_realize(S390CCWDevice
>*cdev, char *sysfsdev, Error **errp)
> ccw_dev->sch = sch;
> ret = css_sch_build_schib(sch, &cdev->hostid);
> if (ret) {
>-error_setg_errno(&err, -ret, "%s: Failed to build initial schib",
>+error_setg_errno(errp, -ret, "%s: Failed to build initial schib",
>  __func__);
> goto out_err;
> }
>
>-if (!ck->realize(ccw_dev, &err)) {
>+if (!ck->realize(ccw_dev, errp)) {
> goto out_err;
> }
>
>@@ -151,8 +150,6 @@ out_err:
> g_free(sch);
> out_mdevid_free:
> g_free(cdev->mdevid);
>-out_err_propagate:
>-error_propagate(errp, err);
> }
>
> static void s390_ccw_unrealize(S390CCWDevice *cdev)
>--
>2.45.1



RE: [PATCH 2/7] s390x/css: Make CCWDeviceClass::realize return bool

2024-05-23 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: [PATCH 2/7] s390x/css: Make CCWDeviceClass::realize return bool
>
>Since the realize() handler of CCWDeviceClass takes an 'Error **'
>argument, best practices suggest to return a bool. See the qapi/error.h
>Rules section. While at it, modify the call in s390_ccw_realize().
>
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>---
> hw/s390x/ccw-device.h | 2 +-
> hw/s390x/ccw-device.c | 3 ++-
> hw/s390x/s390-ccw.c   | 3 +--
> 3 files changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/hw/s390x/ccw-device.h b/hw/s390x/ccw-device.h
>index
>6dff95225df11c63f9b66975019026b215c8c448..5feeb0ee7a268b8709043b
>5bbc56b06e707a448d 100644
>--- a/hw/s390x/ccw-device.h
>+++ b/hw/s390x/ccw-device.h
>@@ -36,7 +36,7 @@ extern const VMStateDescription vmstate_ccw_dev;
> struct CCWDeviceClass {
> DeviceClass parent_class;
> void (*unplug)(HotplugHandler *, DeviceState *, Error **);
>-void (*realize)(CcwDevice *, Error **);
>+bool (*realize)(CcwDevice *, Error **);
> void (*refill_ids)(CcwDevice *);
> };
>
>diff --git a/hw/s390x/ccw-device.c b/hw/s390x/ccw-device.c
>index
>fb8c1acc64d5002c861a4913f292d8346dbef192..a7d682e5af9ce90e7e2fad8
>c24b30e39328c7cf4 100644
>--- a/hw/s390x/ccw-device.c
>+++ b/hw/s390x/ccw-device.c
>@@ -31,9 +31,10 @@ static void ccw_device_refill_ids(CcwDevice *dev)
> dev->subch_id.valid = true;
> }
>
>-static void ccw_device_realize(CcwDevice *dev, Error **errp)
>+static bool ccw_device_realize(CcwDevice *dev, Error **errp)
> {
> ccw_device_refill_ids(dev);
>+return true;
> }
>
> static Property ccw_device_properties[] = {
>diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
>index
>a06e91dfb318e3500324851488c56806fa46c08d..4b8ede701df9094972026
>2b6fc1b65f4e505e34d 100644
>--- a/hw/s390x/s390-ccw.c
>+++ b/hw/s390x/s390-ccw.c
>@@ -137,8 +137,7 @@ static void s390_ccw_realize(S390CCWDevice *cdev,
>char *sysfsdev, Error **errp)
> goto out_err;
> }
>
>-ck->realize(ccw_dev, &err);
>-if (err) {
>+if (!ck->realize(ccw_dev, &err)) {
> goto out_err;
> }
>
>--
>2.45.1



RE: [PATCH 1/7] hw/s390x/ccw: Make s390_ccw_get_dev_info() return a bool

2024-05-23 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: [PATCH 1/7] hw/s390x/ccw: Make s390_ccw_get_dev_info() return
>a bool
>
>Since s390_ccw_get_dev_info() takes an 'Error **' argument, best
>practices suggest to return a bool. See the qapi/error.h Rules
>section. While at it, modify the call in s390_ccw_realize().
>
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>---
> hw/s390x/s390-ccw.c | 12 ++--
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
>diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
>index
>5261e66724f1cc3157b9413b0d5fdf5289c92503..a06e91dfb318e35003248
>51488c56806fa46c08d 100644
>--- a/hw/s390x/s390-ccw.c
>+++ b/hw/s390x/s390-ccw.c
>@@ -71,7 +71,7 @@ IOInstEnding s390_ccw_store(SubchDev *sch)
> return ret;
> }
>
>-static void s390_ccw_get_dev_info(S390CCWDevice *cdev,
>+static bool s390_ccw_get_dev_info(S390CCWDevice *cdev,
>   char *sysfsdev,
>   Error **errp)
> {
>@@ -84,12 +84,12 @@ static void s390_ccw_get_dev_info(S390CCWDevice
>*cdev,
> error_setg(errp, "No host device provided");
> error_append_hint(errp,
>   "Use -device vfio-ccw,sysfsdev=PATH_TO_DEVICE\n");
>-return;
>+return false;
> }
>
> if (!realpath(sysfsdev, dev_path)) {
> error_setg_errno(errp, errno, "Host device '%s' not found", sysfsdev);
>-return;
>+return false;
> }
>
> cdev->mdevid = g_path_get_basename(dev_path);
>@@ -98,13 +98,14 @@ static void s390_ccw_get_dev_info(S390CCWDevice
>*cdev,
> tmp = g_path_get_basename(tmp_dir);
> if (sscanf(tmp, "%2x.%1x.%4x", &cssid, &ssid, &devid) != 3) {
> error_setg_errno(errp, errno, "Failed to read %s", tmp);
>-return;
>+return false;
> }
>
> cdev->hostid.cssid = cssid;
> cdev->hostid.ssid = ssid;
> cdev->hostid.devid = devid;
> cdev->hostid.valid = true;
>+return true;
> }
>
> static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error
>**errp)
>@@ -116,8 +117,7 @@ static void s390_ccw_realize(S390CCWDevice *cdev,
>char *sysfsdev, Error **errp)
> int ret;
> Error *err = NULL;
>
>-s390_ccw_get_dev_info(cdev, sysfsdev, &err);
>-if (err) {
>+if (!s390_ccw_get_dev_info(cdev, sysfsdev, &err)) {
> goto out_err_propagate;
> }
>
>--
>2.45.1



RE: [PATCH v2 20/20] vfio/ccw: Fix the missed unrealize() call in error path

2024-05-22 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Sent: Wednesday, May 22, 2024 3:52 PM
>To: Duan, Zhenzhong ; qemu-
>de...@nongnu.org
>Cc: alex.william...@redhat.com; eric.au...@redhat.com; Peng, Chao P
>; Eric Farman ; Matthew
>Rosato ; Thomas Huth ;
>open list:vfio-ccw 
>Subject: Re: [PATCH v2 20/20] vfio/ccw: Fix the missed unrealize() call in
>error path
>
>On 5/22/24 06:40, Zhenzhong Duan wrote:
>> When getting the name fails, we should call unrealize() so that
>> vfio_ccw_realize() is self-contained.
>>
>> Fixes: 909a6254eda ("vfio/ccw: Make vfio cdev pre-openable by passing a
>file handle")
>> Signed-off-by: Zhenzhong Duan 
>
>If the realize handler fails, the unrealize handler should be called.
>See device_set_realized(). We should be fine without IMO.

Do you mean that when vfio_ccw_realize() fails, vfio_ccw_unrealize() will be called?
I looked into device_set_realized() but didn't see where vfio_ccw_unrealize() is
called.
Am I misunderstanding?

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>
>> ---
>>   hw/vfio/ccw.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
>> index 168c9e5973..161704cd7b 100644
>> --- a/hw/vfio/ccw.c
>> +++ b/hw/vfio/ccw.c
>> @@ -589,7 +589,7 @@ static void vfio_ccw_realize(DeviceState *dev,
>Error **errp)
>>   }
>>
>>   if (!vfio_device_get_name(vbasedev, errp)) {
>> -return;
>> +goto out_unrealize;
>>   }
>>
>>   if (!vfio_attach_device(cdev->mdevid, vbasedev,
>> @@ -633,6 +633,7 @@ out_region_err:
>>   vfio_detach_device(vbasedev);
>>   out_attach_dev_err:
>>   g_free(vbasedev->name);
>> +out_unrealize:
>>   if (cdc->unrealize) {
>>   cdc->unrealize(cdev);
>>   }
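
Whether the core device code undoes a failed realize is the open question in this thread. Independent of that, the unwind order the patch proposes can be sketched in standalone C. All demo_* names and counters below are hypothetical and exist only to make the cleanup observable; none of this is the real vfio-ccw code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state that makes the cleanup observable. */
int demo_class_realized;
int demo_name_set;
int demo_attached;

bool demo_class_realize(void)
{
    demo_class_realized++;
    return true;
}

void demo_class_unrealize(void)
{
    demo_class_realized--;
}

/* Each failure jumps to the label that undoes exactly the steps done so far. */
bool demo_realize(bool name_ok, bool attach_ok)
{
    if (!demo_class_realize()) {
        return false;               /* nothing to undo yet */
    }
    if (!name_ok) {
        goto out_unrealize;         /* a bare return here would skip the undo */
    }
    demo_name_set = 1;

    if (!attach_ok) {
        goto out_attach_dev_err;
    }
    demo_attached++;
    return true;

out_attach_dev_err:
    demo_name_set = 0;              /* release the name */
out_unrealize:
    demo_class_unrealize();         /* undo the class-level realize */
    return false;
}
```

With this layout every failure path leaves the counters as they were before the call, which is the sense in which the realize function is self-contained.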



RE: [PATCH v2 19/20] vfio/ccw: Drop local @err in vfio_ccw_realize()

2024-05-22 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 19/20] vfio/ccw: Drop local @err in vfio_ccw_realize()
>
>On 5/22/24 06:40, Zhenzhong Duan wrote:
>> Use @errp to fetch error information directly and drop the local
>> variable @err.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   hw/vfio/ccw.c | 21 ++---
>>   1 file changed, 10 insertions(+), 11 deletions(-)
>>
>> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
>> index 2600e62e37..168c9e5973 100644
>> --- a/hw/vfio/ccw.c
>> +++ b/hw/vfio/ccw.c
>> @@ -574,17 +574,17 @@ static void
>vfio_ccw_put_region(VFIOCCWDevice *vcdev)
>>
>>   static void vfio_ccw_realize(DeviceState *dev, Error **errp)
>>   {
>> +ERRP_GUARD();
>>   S390CCWDevice *cdev = S390_CCW_DEVICE(dev);
>>   VFIOCCWDevice *vcdev = VFIO_CCW(cdev);
>>   S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(cdev);
>>   VFIODevice *vbasedev = &vcdev->vdev;
>> -Error *err = NULL;
>>
>>   /* Call the class init function for subchannel. */
>>   if (cdc->realize) {
>> -cdc->realize(cdev, vcdev->vdev.sysfsdev, &err);
>> -if (err) {
>> -goto out_err_propagate;
>> +cdc->realize(cdev, vcdev->vdev.sysfsdev, errp);
>> +if (*errp) {
>> +return;
>
>We should change the realize() return value to bool also. this is more
>work and it should be addressed in its own patchset I think and ...
>
>>   }
>>   }
>>
>> @@ -597,27 +597,28 @@ static void vfio_ccw_realize(DeviceState *dev,
>Error **errp)
>>   goto out_attach_dev_err;
>>   }
>>
>> -if (!vfio_ccw_get_region(vcdev, &err)) {
>> +if (!vfio_ccw_get_region(vcdev, errp)) {
>>   goto out_region_err;
>>   }
>>
>> -if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX,
>&err)) {
>> +if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX,
>errp)) {
>>   goto out_io_notifier_err;
>>   }
>>
>>   if (vcdev->crw_region) {
>>   if (!vfio_ccw_register_irq_notifier(vcdev,
>VFIO_CCW_CRW_IRQ_INDEX,
>> -&err)) {
>> +errp)) {
>>   goto out_irq_notifier_err;
>>   }
>>   }
>>
>> -if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_REQ_IRQ_INDEX,
>&err)) {
>> +if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_REQ_IRQ_INDEX,
>errp)) {
>>   /*
>>* Report this error, but do not make it a failing condition.
>>* Lack of this IRQ in the host does not prevent normal operation.
>>*/
>> -error_report_err(err);
>> +error_report_err(*errp);
>
>This should use a local Error variable and be a warn_report_err instead.

Yes.

>
>Let's address these changes in another series. I can take care of it
>later if no one does.

OK, I'll leave it to you.

Thanks
Zhenzhong


RE: [PATCH 02/16] vfio/display: Make vfio_display_*() return bool

2024-05-21 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH 02/16] vfio/display: Make vfio_display_*() return bool
>
>On 5/15/24 10:20, Zhenzhong Duan wrote:
>> This is to follow the coding standard in qapi/error.h to return bool
>> for bool-valued functions.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>
>
>Reviewed-by: Cédric Le Goater 
>
>One comment below,
>
>> ---
>>   hw/vfio/pci.h |  2 +-
>>   hw/vfio/display.c | 20 ++--
>>   hw/vfio/pci.c |  3 +--
>>   3 files changed, 12 insertions(+), 13 deletions(-)
>>
>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>> index 92cd62d115..a5ac9efd4b 100644
>> --- a/hw/vfio/pci.h
>> +++ b/hw/vfio/pci.h
>> @@ -232,7 +232,7 @@ int vfio_pci_igd_opregion_init(VFIOPCIDevice
>*vdev,
>>  Error **errp);
>>
>>   void vfio_display_reset(VFIOPCIDevice *vdev);
>> -int vfio_display_probe(VFIOPCIDevice *vdev, Error **errp);
>> +bool vfio_display_probe(VFIOPCIDevice *vdev, Error **errp);
>>   void vfio_display_finalize(VFIOPCIDevice *vdev);
>>
>>   extern const VMStateDescription vfio_display_vmstate;
>> diff --git a/hw/vfio/display.c b/hw/vfio/display.c
>> index 57c5ae0b2a..b562f4be74 100644
>> --- a/hw/vfio/display.c
>> +++ b/hw/vfio/display.c
>> @@ -346,11 +346,11 @@ static const GraphicHwOps
>vfio_display_dmabuf_ops = {
>>   .ui_info= vfio_display_edid_ui_info,
>>   };
>>
>> -static int vfio_display_dmabuf_init(VFIOPCIDevice *vdev, Error **errp)
>> +static bool vfio_display_dmabuf_init(VFIOPCIDevice *vdev, Error **errp)
>>   {
>>   if (!display_opengl) {
>>   error_setg(errp, "vfio-display-dmabuf: opengl not available");
>> -return -1;
>> +return false;
>>   }
>>
>>   vdev->dpy = g_new0(VFIODisplay, 1);
>> @@ -360,11 +360,11 @@ static int
>vfio_display_dmabuf_init(VFIOPCIDevice *vdev, Error **errp)
>>   if (vdev->enable_ramfb) {
>>   vdev->dpy->ramfb = ramfb_setup(errp);
>>   if (!vdev->dpy->ramfb) {
>> -return -EINVAL;
>> +return false;
>>   }
>>   }
>>   vfio_display_edid_init(vdev);
>
>vfio_display_edid_init() can fail for many reasons and does it silently.
>It would be good to report the error in a future patch.

Yes, that deserves a fix. I will address it in a future patch.

Thanks
Zhenzhong



RE: [PATCH] intel_iommu: Use the latest fault reasons defined by spec

2024-05-21 Thread Duan, Zhenzhong


>-Original Message-
>From: Jason Wang 
>Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons defined by
>spec
>
>On Mon, May 20, 2024 at 12:15 PM Liu, Yi L  wrote:
>>
>> > From: Duan, Zhenzhong 
>> > Sent: Monday, May 20, 2024 11:41 AM
>> >
>> >
>> >
>> > >-Original Message-
>> > >From: Jason Wang 
>> > >Sent: Monday, May 20, 2024 8:44 AM
>> > >To: Duan, Zhenzhong 
>> > >Cc: qemu-devel@nongnu.org; Liu, Yi L ; Peng, Chao
>P
>> > >; Yu Zhang ;
>Michael
>> > >S. Tsirkin ; Paolo Bonzini ;
>> > >Richard Henderson ; Eduardo Habkost
>> > >; Marcel Apfelbaum
>
>> > >Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons defined
>by
>> > >spec
>> > >
>> > >On Fri, May 17, 2024 at 6:26 PM Zhenzhong Duan
>> > > wrote:
>> > >>
>> > >> From: Yu Zhang 
>> > >>
>> > >> Currently we use only VTD_FR_PASID_TABLE_INV as fault reason.
>> > >> Update with more detailed fault reasons listed in VT-d spec 7.2.3.
>> > >>
>> > >> Signed-off-by: Yu Zhang 
>> > >> Signed-off-by: Zhenzhong Duan 
>> > >> ---
>> > >
>> > >I wonder if this could be noticed by the guest or not. If yes should
>> > >we consider starting to add thing like version to vtd emulation code?
>> >
>> > Kernel only dumps the reason like below:
>> >
>> > DMAR: [DMA Write NO_PASID] Request device [20:00.0] fault addr
>0x123460
>> > [fault reason 0x71] SM: Present bit in first-level paging entry is clear
>>
>> Yes, guest kernel would notice it as the fault would be injected to vm.
>>
>> > Maybe bump 1.0 -> 1.1?
>> > My understanding version number is only informational and is far from
>> > accurate to mark if a feature supported. Driver should check cap/ecap
>> > bits instead.
>>
>> Should the version ID here be aligned with VT-d spec?
>
>Probably, this might be something that could be noticed by the
>management to migration compatibility.

Could you elaborate what we need to do for migration compatibility?
I see version is already exported so libvirt can query it, see:

DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),

Thanks
Zhenzhong

>
>> If yes, it should
>> be 3.0 as the scalable mode was introduced in spec 3.0. And the fault
>> code was redefined together with the introduction of this translation
>> mode. Below is a snippet from the change log of the VT-d spec.
>>
>> June 2018 3.0
>> • Removed all text related to Extended-Mode.
>> • Added support for scalable-mode translation for DMA Remapping, that
>enables PASID-granular first-level, second-level, nested and pass-through
>translation functions.
>> • Widen invalidation queue descriptors and page request queue
>descriptors from 128 bits
>> to 256 bits and redefined page-request and page-response descriptors.
>> • Listed all fault conditions in a unified table and described DMA
>Remapping hardware
>> behavior under each condition. Assigned new code for each fault condition
>in scalable-mode operation.
>> • Added support for Accessed/Dirty (A/D) bits in second-level translation.
>> • Added support for submitting commands and receiving response from
>virtual DMA
>> Remapping hardware.
>> • Added a table on snooping behavior and memory type of hardware
>access to various
>> remapping structures as appendix.
>> • Move Page Request Overflow (PRO) fault reporting from Fault Status
>register
>> (FSTS_REG) to Page Request Status register (PRS_REG).
>>
>> Regards.
>> Yi Liu
>
>Thanks



RE: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry

2024-05-20 Thread Duan, Zhenzhong


>-Original Message-
>From: CLEMENT MATHIEU--DRIF 
>Subject: Re: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>creating an instance of IOMMUTLBEntry
>
>
>On 17/05/2024 12:40, Duan, Zhenzhong wrote:
>>
>>
>>> -Original Message-
>>> From: CLEMENT MATHIEU--DRIF 
>>> Subject: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>>> creating an instance of IOMMUTLBEntry
>>>
>>> Signed-off-by: Clément Mathieu--Drif d...@eviden.com>
>>> ---
>>> hw/i386/intel_iommu.c | 7 +++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>> index 53f17d66c0..c4ebd4569e 100644
>>> --- a/hw/i386/intel_iommu.c
>>> +++ b/hw/i386/intel_iommu.c
>>> @@ -2299,6 +2299,7 @@ out:
>>>  entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) &
>>> page_mask;
>>>  entry->addr_mask = ~page_mask;
>>>  entry->perm = access_flags;
>>> +entry->pasid = pasid;
>> For PCI_NO_PASID, do we want to assign PCI_NO_PASID or rid2pasid?
>we have the following statement a few lines above :
>if (rid2pasid) {
>     pasid = VTD_CE_GET_RID2PASID();
>}
>
>so we store rid2pasid if the feature is enabled.
>
>But maybe we should store PCI_NO_PASID because the rest of the world is
>not supposed to be aware of what we are doing with rid2pasid.
>
>Does it look good to you?

Yes, that makes sense.

>>
>> Thanks
>> Zhenzhong
>>
>>>  return true;
>>>
>>> error:
>>> @@ -2307,6 +2308,7 @@ error:
>>>  entry->translated_addr = 0;
>>>  entry->addr_mask = 0;
>>>  entry->perm = IOMMU_NONE;
>>> +entry->pasid = PCI_NO_PASID;
>>>  return false;
>>> }
>>>
>>> @@ -3497,6 +3499,7 @@ static void
>>> vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
>>>  event.entry.target_as = &address_space_memory;
>>>  event.entry.iova = notifier->start;
>>>  event.entry.perm = IOMMU_NONE;
>>> +event.entry.pasid = pasid;
>>>  event.entry.addr_mask = notifier->end - notifier->start;
>>>  event.entry.translated_addr = 0;
>>>
>>> @@ -3678,6 +3681,7 @@ static void
>>> vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
>>>  event.entry.target_as = &address_space_memory;
>>>  event.entry.iova = addr;
>>>  event.entry.perm = IOMMU_NONE;
>>> +event.entry.pasid = pasid;
>>>  event.entry.addr_mask = size - 1;
>>>  event.entry.translated_addr = 0;
>>>
>>> @@ -4335,6 +4339,7 @@ static void
>>> do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
>>>  event.entry.iova = addr;
>>>  event.entry.perm = IOMMU_NONE;
>>>  event.entry.translated_addr = 0;
>>> +event.entry.pasid = vtd_dev_as->pasid;
>>>  memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
>>> }
>>>
>>> @@ -4911,6 +4916,7 @@ static IOMMUTLBEntry
>>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>>  IOMMUTLBEntry iotlb = {
>>>  /* We'll fill in the rest later. */
>>>  .target_as = &address_space_memory,
>>> +.pasid = vtd_as->pasid,
>>>  };
>>>  bool success;
>>>
>>> @@ -4923,6 +4929,7 @@ static IOMMUTLBEntry
>>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>>  iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
>>>  iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
>>>  iotlb.perm = IOMMU_RW;
>>> +iotlb.pasid = PCI_NO_PASID;
>>>  success = true;
>>>  }
>>>
>>> --
>>> 2.44.0


RE: [PATCH] intel_iommu: Use the latest fault reasons defined by spec

2024-05-20 Thread Duan, Zhenzhong


>-----Original Message-----
>From: Liu, Yi L 
>Subject: RE: [PATCH] intel_iommu: Use the latest fault reasons defined by
>spec
>
>> From: Duan, Zhenzhong 
>> Sent: Monday, May 20, 2024 11:41 AM
>>
>>
>>
>> >-----Original Message-----
>> >From: Jason Wang 
>> >Sent: Monday, May 20, 2024 8:44 AM
>> >To: Duan, Zhenzhong 
>> >Cc: qemu-devel@nongnu.org; Liu, Yi L ; Peng, Chao P
>> >; Yu Zhang ;
>Michael
>> >S. Tsirkin ; Paolo Bonzini ;
>> >Richard Henderson ; Eduardo Habkost
>> >; Marcel Apfelbaum
>
>> >Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons defined by
>> >spec
>> >
>> >On Fri, May 17, 2024 at 6:26 PM Zhenzhong Duan
>> > wrote:
>> >>
>> >> From: Yu Zhang 
>> >>
>> >> Currently we use only VTD_FR_PASID_TABLE_INV as fault reason.
>> >> Update with more detailed fault reasons listed in VT-d spec 7.2.3.
>> >>
>> >> Signed-off-by: Yu Zhang 
>> >> Signed-off-by: Zhenzhong Duan 
>> >> ---
>> >
>> >I wonder if this could be noticed by the guest or not. If yes should
>> >we consider starting to add something like a version to the vtd emulation code?
>>
>> Kernel only dumps the reason like below:
>>
>> DMAR: [DMA Write NO_PASID] Request device [20:00.0] fault addr
>0x123460
>> [fault reason 0x71] SM: Present bit in first-level paging entry is clear
>
>Yes, guest kernel would notice it as the fault would be injected to vm.
>
>> Maybe bump 1.0 -> 1.1?
>> My understanding version number is only informational and is far from
>> accurate to mark if a feature supported. Driver should check cap/ecap
>> bits instead.
>
>Should the version ID here be aligned with VT-d spec? If yes, it should
>be 3.0 as the scalable mode was introduced in spec 3.0. And the fault
>code was redefined together with the introduction of this translation
>mode. Below is a snippet from the change log of the VT-d spec.

OK, then 3.0 is a better choice. Will update the version.
As for Jason's question: although more fault reasons are added,
the reason numbers are still backward compatible,
so there is no need to define reasons per version.

Thanks
Zhenzhong
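
As a rough illustration of why per-version reason sets are not needed, here is a minimal sketch of a decoder. The table and the function name fault_reason_text() are hypothetical; the codes and descriptions are taken from the enum in the patch under review. The old code 0x58 stays defined, and the new codes only add detail around it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup table; codes copied from the patch's VTDFaultReason. */
struct fr_desc {
    uint8_t code;
    const char *text;
};

static const struct fr_desc fr_table[] = {
    { 0x50, "PASID directory entry access failure" },
    { 0x51, "Present (P) field of PASID directory entry is 0" },
    { 0x58, "PASID table entry access failure" },
    { 0x59, "Present (P) field of PASID table entry is 0" },
    { 0x5b, "Invalid PASID table entry" },
};

/* Return a description for a fault reason, or a generic fallback string. */
static const char *fault_reason_text(uint8_t code)
{
    for (size_t i = 0; i < sizeof(fr_table) / sizeof(fr_table[0]); i++) {
        if (fr_table[i].code == code) {
            return fr_table[i].text;
        }
    }
    return "unknown fault reason";
}
```

A consumer that does not know one of the newer codes simply falls through to the generic string, which is the sense in which the numbering stays compatible.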

>
>June 2018 3.0
>• Removed all text related to Extended-Mode.
>• Added support for scalable-mode translation for DMA Remapping, that
>  enables PASID-granular first-level, second-level, nested and pass-through
>  translation functions.
>• Widen invalidation queue descriptors and page request queue descriptors
>  from 128 bits to 256 bits and redefined page-request and page-response
>  descriptors.
>• Listed all fault conditions in a unified table and described DMA Remapping
>  hardware behavior under each condition. Assigned new code for each fault
>  condition in scalable-mode operation.
>• Added support for Accessed/Dirty (A/D) bits in second-level translation.
>• Added support for submitting commands and receiving response from
>  virtual DMA Remapping hardware.
>• Added a table on snooping behavior and memory type of hardware access
>  to various remapping structures as appendix.
>• Move Page Request Overflow (PRO) fault reporting from Fault Status
>  register (FSTS_REG) to Page Request Status register (PRS_REG).
>
>Regards.
>Yi Liu


RE: [PATCH] intel_iommu: Use the latest fault reasons defined by spec

2024-05-19 Thread Duan, Zhenzhong


>-----Original Message-----
>From: Jason Wang 
>Sent: Monday, May 20, 2024 8:44 AM
>To: Duan, Zhenzhong 
>Cc: qemu-devel@nongnu.org; Liu, Yi L ; Peng, Chao P
>; Yu Zhang ; Michael
>S. Tsirkin ; Paolo Bonzini ;
>Richard Henderson ; Eduardo Habkost
>; Marcel Apfelbaum 
>Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons defined by
>spec
>
>On Fri, May 17, 2024 at 6:26 PM Zhenzhong Duan
> wrote:
>>
>> From: Yu Zhang 
>>
>> Currently we use only VTD_FR_PASID_TABLE_INV as fault reason.
>> Update with more detailed fault reasons listed in VT-d spec 7.2.3.
>>
>> Signed-off-by: Yu Zhang 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>
>I wonder if this could be noticed by the guest or not. If yes should
>we consider starting to add something like a version to the vtd emulation code?

Kernel only dumps the reason like below:

DMAR: [DMA Write NO_PASID] Request device [20:00.0] fault addr 0x123460 
[fault reason 0x71] SM: Present bit in first-level paging entry is clear

Maybe bump 1.0 -> 1.1?
My understanding is that the version number is only informational and is far
from accurate for marking whether a feature is supported. The driver should
check cap/ecap bits instead.

Thanks
Zhenzhong
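
A minimal sketch of the difference, under stated assumptions: the helper names are hypothetical, the version layout (major in bits 7:4, minor in bits 3:0) follows the VT-d version register, and SKETCH_ECAP_SMTS assumes scalable-mode support is reported in ECAP bit 43. A driver would gate a feature on the capability bit and use the version only for reporting.

```c
#include <stdbool.h>
#include <stdint.h>

/* Version register: bits 7:4 hold the major number, bits 3:0 the minor. */
static unsigned vtd_ver_major(uint32_t ver)
{
    return (ver >> 4) & 0xf;
}

static unsigned vtd_ver_minor(uint32_t ver)
{
    return ver & 0xf;
}

/* Assumption: scalable-mode translation support is ECAP bit 43 (SMTS). */
#define SKETCH_ECAP_SMTS (1ULL << 43)

/* Feature detection looks at the capability bit, never at the version. */
static bool supports_scalable_mode(uint64_t ecap)
{
    return (ecap & SKETCH_ECAP_SMTS) != 0;
}
```

With this layout, 1.0 is 0x10, 1.1 is 0x11 and 3.0 is 0x30.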



RE: [PATCH] intel_iommu: Use the latest fault reasons defined by spec

2024-05-19 Thread Duan, Zhenzhong


>-----Original Message-----
>From: Liu, Yi L 
>Subject: RE: [PATCH] intel_iommu: Use the latest fault reasons defined by
>spec
>
>> From: CLEMENT MATHIEU--DRIF 
>> Sent: Friday, May 17, 2024 9:13 PM
>>
>> Hi Zhenzhong
>>
>> On 17/05/2024 12:23, Zhenzhong Duan wrote:
>> >
>> > From: Yu Zhang 
>> >
>> > Currently we use only VTD_FR_PASID_TABLE_INV as fault reason.
>> > Update with more detailed fault reasons listed in VT-d spec 7.2.3.
>> >
>> > Signed-off-by: Yu Zhang 
>> > Signed-off-by: Zhenzhong Duan 
>> > ---
>> >   hw/i386/intel_iommu_internal.h |  8 +++-
>> >   hw/i386/intel_iommu.c  | 25 -
>> >   2 files changed, 23 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/hw/i386/intel_iommu_internal.h
>b/hw/i386/intel_iommu_internal.h
>> > index f8cf99bddf..666e2cf2ce 100644
>> > --- a/hw/i386/intel_iommu_internal.h
>> > +++ b/hw/i386/intel_iommu_internal.h
>> > @@ -311,7 +311,13 @@ typedef enum VTDFaultReason {
>> > * request while disabled */
>> >   VTD_FR_IR_SID_ERR = 0x26,   /* Invalid Source-ID */
>> >
>> > -VTD_FR_PASID_TABLE_INV = 0x58,  /*Invalid PASID table entry */
>> > +/* PASID directory entry access failure */
>> > +VTD_FR_PASID_DIR_ACCESS_ERR = 0x50,
>> > +/* The Present(P) field of pasid directory entry is 0 */
>> > +VTD_FR_PASID_DIR_ENTRY_P = 0x51,
>> > +VTD_FR_PASID_TABLE_ACCESS_ERR = 0x58, /* PASID table entry
>access failure */
>> > +VTD_FR_PASID_ENTRY_P = 0x59, /* The Present(P) field of pasidt-
>entry is 0 */
>> s/pasidt/pasid
>
>Per spec, it is the PASID table entry. So Zhenzhong may need to use the same
>wording as the line below, e.g. PASID Table entry.

Yes, will fix.

Thanks
Zhenzhong

>
>Regards,
>Yi Liu
>
>> > +VTD_FR_PASID_TABLE_ENTRY_INV = 0x5b,  /*Invalid PASID table
>entry */
>> >
>> >   /* Output address in the interrupt address range for scalable mode
>*/
>> >   VTD_FR_SM_INTERRUPT_ADDR = 0x87,
>> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> > index cc8e59674e..0951ebb71d 100644
>> > --- a/hw/i386/intel_iommu.c
>> > +++ b/hw/i386/intel_iommu.c
>> > @@ -771,7 +771,7 @@ static int
>vtd_get_pdire_from_pdir_table(dma_addr_t
>> pasid_dir_base,
>> >   addr = pasid_dir_base + index * entry_size;
>> >   if (dma_memory_read(&address_space_memory, addr,
>> >   pdire, entry_size, MEMTXATTRS_UNSPECIFIED)) {
>> > -return -VTD_FR_PASID_TABLE_INV;
>> > +return -VTD_FR_PASID_DIR_ACCESS_ERR;
>> >   }
>> >
>> >   pdire->val = le64_to_cpu(pdire->val);
>> > @@ -789,6 +789,7 @@ static int
>vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState
>> *s,
>> > dma_addr_t addr,
>> > VTDPASIDEntry *pe)
>> >   {
>> > +uint8_t pgtt;
>> >   uint32_t index;
>> >   dma_addr_t entry_size;
>> >   X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
>> > @@ -798,7 +799,7 @@ static int
>vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState
>> *s,
>> >   addr = addr + index * entry_size;
>> >   if (dma_memory_read(&address_space_memory, addr,
>> >   pe, entry_size, MEMTXATTRS_UNSPECIFIED)) {
>> > -return -VTD_FR_PASID_TABLE_INV;
>> > +return -VTD_FR_PASID_TABLE_ACCESS_ERR;
>> >   }
>> >   for (size_t i = 0; i < ARRAY_SIZE(pe->val); i++) {
>> >   pe->val[i] = le64_to_cpu(pe->val[i]);
>> > @@ -806,11 +807,13 @@ static int
>vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState
>> *s,
>> >
>> >   /* Do translation type check */
>> >   if (!vtd_pe_type_check(x86_iommu, pe)) {
>> > -return -VTD_FR_PASID_TABLE_INV;
>> > +return -VTD_FR_PASID_TABLE_ENTRY_INV;
>> >   }
>> >
>> > -if (!vtd_is_level_supported(s, VTD_PE_GET_LEVEL(pe))) {
>> > -return -VTD_FR_PASID_TABLE_INV;
>> > +pgtt = VTD_PE_GET_TYPE(pe);
>> > +if (pgtt == VTD_SM_PASID_ENTRY_SLT &&
>> > +!vtd_is_level_supported(s, VTD_PE_GET_LEVEL(pe))) {
>> > +return -VTD_FR_PASID_TABLE_ENTRY_INV;
>> >   }
>> >
>> >   return 0;
>> > @@ -851,7 +854,7 @@ static int
>vtd_get_pe_from_pasid_table(IntelIOMMUState *s,
>> >   }
>> >
>> >   if (!vtd_pdire_present(&pdire)) {
>> > -return -VTD_FR_PASID_TABLE_INV;
>> > +return -VTD_FR_PASID_DIR_ENTRY_P;
>> >   }
>> >
>> >   ret = vtd_get_pe_from_pdire(s, pasid, &pdire, pe);
>> > @@ -860,7 +863,7 @@ static int
>vtd_get_pe_from_pasid_table(IntelIOMMUState *s,
>> >   }
>> >
>> >   if (!vtd_pe_present(pe)) {
>> > -return -VTD_FR_PASID_TABLE_INV;
>> > +return -VTD_FR_PASID_ENTRY_P;
>> >   }
>> >
>> >   return 0;
>> > @@ -913,7 +916,7 @@ static int
>vtd_ce_get_pasid_fpd(IntelIOMMUState *s,
>> >   }
>> >
>> >   if (!vtd_pdire_present()) 

RE: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM

2024-05-17 Thread Duan, Zhenzhong


>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe
>devices that support SVM
>
>As the SVM-capable devices will need to cache translations, we provide
>a first implementation.
>
>This cache uses a two-level design based on hash tables.
>The first level is indexed by a PASID and the second by a virtual address.
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> tests/unit/meson.build |   1 +
> tests/unit/test-atc.c  | 502
>+
> util/atc.c | 211 +
> util/atc.h | 117 ++
> util/meson.build   |   1 +
> 5 files changed, 832 insertions(+)
> create mode 100644 tests/unit/test-atc.c
> create mode 100644 util/atc.c
> create mode 100644 util/atc.h

Maybe the unit test can be split from the functional change?

>
>diff --git a/tests/unit/meson.build b/tests/unit/meson.build
>index 228a21d03c..5c9a6fe9f4 100644
>--- a/tests/unit/meson.build
>+++ b/tests/unit/meson.build
>@@ -52,6 +52,7 @@ tests = {
>   'test-interval-tree': [],
>   'test-xs-node': [qom],
>   'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-
>dmabuf.c'],
>+  'test-atc': []
> }
>
> if have_system or have_tools
>diff --git a/tests/unit/test-atc.c b/tests/unit/test-atc.c
>new file mode 100644
>index 00..60fa60924a
>--- /dev/null
>+++ b/tests/unit/test-atc.c
>@@ -0,0 +1,502 @@
>+/*
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+
>+ * This program is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>+ * GNU General Public License for more details.
>+
>+ * You should have received a copy of the GNU General Public License along
>+ * with this program; if not, see .
>+ */
>+
>+#include "util/atc.h"
>+
>+static inline bool tlb_entry_equal(IOMMUTLBEntry *e1, IOMMUTLBEntry
>*e2)
>+{
>+if (!e1 || !e2) {
>+return !e1 && !e2;
>+}
>+return e1->iova == e2->iova &&
>+e1->addr_mask == e2->addr_mask &&
>+e1->pasid == e2->pasid &&
>+e1->perm == e2->perm &&
>+e1->target_as == e2->target_as &&
>+e1->translated_addr == e2->translated_addr;
>+}
>+
>+static void assert_lookup_equals(ATC *atc, IOMMUTLBEntry *target,
>+ uint32_t pasid, hwaddr iova)
>+{
>+IOMMUTLBEntry *result;
>+result = atc_lookup(atc, pasid, iova);
>+g_assert(tlb_entry_equal(result, target));
>+}
>+
>+static void check_creation(uint64_t page_size, uint8_t address_width,
>+   uint8_t levels, uint8_t level_offset,
>+   bool should_work) {
>+ATC *atc = atc_new(page_size, address_width);
>+if (atc) {
>+if (atc->levels != levels || atc->level_offset != level_offset) {
>+g_assert(false); /* ATC created but invalid configuration : fail 
>*/
>+}
>+atc_destroy(atc);
>+g_assert(should_work);
>+} else {
>+g_assert(!should_work);
>+}
>+}
>+
>+static void test_creation_parameters(void)
>+{
>+check_creation(8, 39, 3, 9, false);
>+check_creation(4095, 39, 3, 9, false);
>+check_creation(4097, 39, 3, 9, false);
>+check_creation(8192, 48, 0, 0, false);
>+
>+check_creation(4096, 38, 0, 0, false);
>+check_creation(4096, 39, 3, 9, true);
>+check_creation(4096, 40, 0, 0, false);
>+check_creation(4096, 47, 0, 0, false);
>+check_creation(4096, 48, 4, 9, true);
>+check_creation(4096, 49, 0, 0, false);
>+check_creation(4096, 56, 0, 0, false);
>+check_creation(4096, 57, 5, 9, true);
>+check_creation(4096, 58, 0, 0, false);
>+
>+check_creation(16384, 35, 0, 0, false);
>+check_creation(16384, 36, 2, 11, true);
>+check_creation(16384, 37, 0, 0, false);
>+check_creation(16384, 46, 0, 0, false);
>+check_creation(16384, 47, 3, 11, true);
>+check_creation(16384, 48, 0, 0, false);
>+check_creation(16384, 57, 0, 0, false);
>+check_creation(16384, 58, 4, 11, true);
>+check_creation(16384, 59, 0, 0, false);
>+}
>+
>+static void test_single_entry(void)
>+{
>+IOMMUTLBEntry entry = {
>+.iova = 0x123456789000ULL,
>+.addr_mask = 0xfffULL,
>+.pasid = 5,
>+.perm = IOMMU_RW,
>+.translated_addr = 0xdeadbeefULL,
>+};
>+
>+ATC *atc = atc_new(4096, 48);
>+g_assert(atc);
>+
>+assert_lookup_equals(atc, NULL, entry.pasid,
>+ entry.iova + (entry.addr_mask / 2));
>+
>+atc_create_address_space_cache(atc, entry.pasid);
>+g_assert(atc_update(atc, &entry) == 0);
>+
>+
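
For readers following the quoted test_creation_parameters() cases, here is a sketch of one rule that reproduces every expected result in that table. It is inferred from the test expectations only, not taken from util/atc.c. The names atc_geometry and atc_geometry_from() are hypothetical, and the 8-byte entry size is an assumption. The real atc_new() may apply further limits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type mirroring the two fields the test inspects. */
struct atc_geometry {
    uint8_t levels;
    uint8_t level_offset;
};

/*
 * Assumed rule: the page size must be a power of two, each level indexes
 * log2(page_size) - 3 bits (8-byte entries), and the address bits above the
 * page offset must split evenly into whole levels.
 */
static bool atc_geometry_from(uint64_t page_size, uint8_t address_width,
                              struct atc_geometry *out)
{
    unsigned page_bits = 0;
    unsigned level_offset;
    unsigned index_bits;

    if (page_size == 0 || (page_size & (page_size - 1)) != 0) {
        return false;                 /* not a power of two */
    }
    while ((page_size >> page_bits) != 1) {
        page_bits++;                  /* page_bits = log2(page_size) */
    }
    if (page_bits <= 3 || address_width <= page_bits) {
        return false;                 /* no room for an index level */
    }
    level_offset = page_bits - 3;
    index_bits = address_width - page_bits;
    if (index_bits % level_offset != 0) {
        return false;                 /* width does not split into levels */
    }
    out->levels = (uint8_t)(index_bits / level_offset);
    out->level_offset = (uint8_t)level_offset;
    return true;
}
```

For example, 4096-byte pages with a 48-bit width give (48 - 12) / 9 = 4 levels, and 16384-byte pages with a 36-bit width give (36 - 14) / 11 = 2 levels, matching the quoted cases.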

RE: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry

2024-05-17 Thread Duan, Zhenzhong


>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>creating an instance of IOMMUTLBEntry
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> hw/i386/intel_iommu.c | 7 +++
> 1 file changed, 7 insertions(+)
>
>diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>index 53f17d66c0..c4ebd4569e 100644
>--- a/hw/i386/intel_iommu.c
>+++ b/hw/i386/intel_iommu.c
>@@ -2299,6 +2299,7 @@ out:
> entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) &
>page_mask;
> entry->addr_mask = ~page_mask;
> entry->perm = access_flags;
>+entry->pasid = pasid;

For PCI_NO_PASID, do we want to assign PCI_NO_PASID or rid2pasid?

Thanks
Zhenzhong

> return true;
>
> error:
>@@ -2307,6 +2308,7 @@ error:
> entry->translated_addr = 0;
> entry->addr_mask = 0;
> entry->perm = IOMMU_NONE;
>+entry->pasid = PCI_NO_PASID;
> return false;
> }
>
>@@ -3497,6 +3499,7 @@ static void
>vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
> event.entry.target_as = &address_space_memory;
> event.entry.iova = notifier->start;
> event.entry.perm = IOMMU_NONE;
>+event.entry.pasid = pasid;
> event.entry.addr_mask = notifier->end - notifier->start;
> event.entry.translated_addr = 0;
>
>@@ -3678,6 +3681,7 @@ static void
>vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
> event.entry.target_as = &address_space_memory;
> event.entry.iova = addr;
> event.entry.perm = IOMMU_NONE;
>+event.entry.pasid = pasid;
> event.entry.addr_mask = size - 1;
> event.entry.translated_addr = 0;
>
>@@ -4335,6 +4339,7 @@ static void
>do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
> event.entry.iova = addr;
> event.entry.perm = IOMMU_NONE;
> event.entry.translated_addr = 0;
>+event.entry.pasid = vtd_dev_as->pasid;
> memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
> }
>
>@@ -4911,6 +4916,7 @@ static IOMMUTLBEntry
>vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
> IOMMUTLBEntry iotlb = {
> /* We'll fill in the rest later. */
> .target_as = &address_space_memory,
>+.pasid = vtd_as->pasid,
> };
> bool success;
>
>@@ -4923,6 +4929,7 @@ static IOMMUTLBEntry
>vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
> iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
> iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
> iotlb.perm = IOMMU_RW;
>+iotlb.pasid = PCI_NO_PASID;
> success = true;
> }
>
>--
>2.44.0


RE: [PATCH 00/16] VFIO: misc cleanups part2

2024-05-17 Thread Duan, Zhenzhong
Hi Cédric,

>-----Original Message-----
>From: Cédric Le Goater 
>Sent: Friday, May 17, 2024 12:48 AM
>To: Duan, Zhenzhong ; qemu-
>de...@nongnu.org
>Cc: alex.william...@redhat.com; eric.au...@redhat.com; Peng, Chao P
>
>Subject: Re: [PATCH 00/16] VFIO: misc cleanups part2
>
>Hello Zhenzhong,
>
>On 5/15/24 10:20, Zhenzhong Duan wrote:
>> Hi
>>
>> This is the last round of cleanup series to change functions in hw/vfio/
>> to return bool when the error is passed through errp parameter.
>>
>> The first round is at https://lists.gnu.org/archive/html/qemu-devel/2024-
>05/msg01147.html
>>
>> I see Cédric is also working on some migration cleanup,
>> so I didn't touch migration.c, but all other files in hw/vfio/ are cleaned up now.
>>
>> Patch1 is a fix patch, all others are cleanup patches.
>>
>> Test done on x86 platform:
>> vfio device hotplug/unplug with different backend
>> reboot
>>
>> This series is rebased to https://github.com/legoater/qemu/tree/vfio-next
>
>I queued part 1 in vfio-next with other changes. part 2 is in vfio-9.1
>for now and should reach vfio-next after reviews next week.
>
>Then, we have to work on your v5 [1] which should have all my attention
>again after the next vfio PR. You, Joao and Eric have followups series
>that need a resync on top of v5, possibly others [2] and [3], not sent
>AFAICT. Anyhow, we will need inputs from these people and IOMMU
>stakeholders/maintainers.

Thanks for sharing the plan.

+Joao, Eric, Michael, Jason, Nicolin, Clement for their awareness.

On my side, I have rebased the nesting series on top of v5 [1];
the newest patches at
https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting_rfcv2/
are under internal review, FYI.

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>[1] [PATCH v5 00/19] Add a host IOMMU device abstraction to check with
>vIOMMU
> https://lore.kernel.org/qemu-devel/20240507092043.1172717-1-
>zhenzhong.d...@intel.com/
>
>[2] [PATCH ats_vtd v2 00/25] ATS support for VT-d
> https://lore.kernel.org/all/20240515071057.33990-1-clement.mathieu--
>d...@eviden.com/
>
>[3] Add Tegra241 (Grace) CMDQV Support
> https://lore.kernel.org/all/cover.1712978212.git.nicol...@nvidia.com/
> https://github.com/nicolinc/qemu/commits/wip/iommufd_vcmdq/
>
>
>
>>
>> Thanks
>> Zhenzhong
>>
>> Zhenzhong Duan (16):
>>vfio/display: Fix error path in call site of ramfb_setup()
>>vfio/display: Make vfio_display_*() return bool
>>vfio/helpers: Use g_autofree in hw/vfio/helpers.c
>>vfio/helpers: Make vfio_set_irq_signaling() return bool
>>vfio/helpers: Make vfio_device_get_name() return bool
>>vfio/platform: Make vfio_populate_device() and vfio_base_device_init()
>>  return bool
>>vfio/ccw: Make vfio_ccw_get_region() return a bool
>>vfio/pci: Make vfio_intx_enable_kvm() return a bool
>>vfio/pci: Make vfio_pci_relocate_msix() and vfio_msix_early_setup()
>>  return a bool
>>vfio/pci: Make vfio_populate_device() return a bool
>>vfio/pci: Make vfio_intx_enable() return bool
>>vfio/pci: Make vfio_populate_vga() return bool
>>vfio/pci: Make capability related functions return bool
>>vfio/pci: Use g_autofree for vfio_region_info pointer
>>vfio/pci-quirks: Make vfio_pci_igd_opregion_init() return bool
>>vfio/pci-quirks: Make vfio_add_*_cap() return bool
>>
>>   hw/vfio/pci.h |  12 +-
>>   include/hw/vfio/vfio-common.h |   6 +-
>>   hw/vfio/ap.c  |  10 +-
>>   hw/vfio/ccw.c |  25 ++--
>>   hw/vfio/display.c |  22 ++--
>>   hw/vfio/helpers.c |  33 ++---
>>   hw/vfio/igd.c |   5 +-
>>   hw/vfio/pci-quirks.c  |  50 
>>   hw/vfio/pci.c | 227 --
>>   hw/vfio/platform.c|  61 -
>>   10 files changed, 213 insertions(+), 238 deletions(-)
>>



RE: [PATCH ats_vtd v1 03/24] intel_iommu: check if the input address is canonical

2024-05-16 Thread Duan, Zhenzhong


>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF 
>Subject: Re: [PATCH ats_vtd v1 03/24] intel_iommu: check if the input
>address is canonical
>
>Hi zhenzhong,
>
>On 14/05/2024 09:34, Duan, Zhenzhong wrote:
>>
>> Hi Clement,
>>
>>> -----Original Message-----
>>> From: CLEMENT MATHIEU--DRIF 
>>> Subject: [PATCH ats_vtd v1 03/24] intel_iommu: check if the input
>address
>>> is canonical
>>>
>>> First stage translation must fail if the address to translate is
>>> not canonical.
>>>
>>> Signed-off-by: Clément Mathieu--Drif d...@eviden.com>
>>> ---
>>> hw/i386/intel_iommu.c  | 22 ++
>>> hw/i386/intel_iommu_internal.h |  2 ++
>>> 2 files changed, 24 insertions(+)
>>>
>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>> index 80cdf37870..240ecb8f72 100644
>>> --- a/hw/i386/intel_iommu.c
>>> +++ b/hw/i386/intel_iommu.c
>>> @@ -1912,6 +1912,7 @@ static const bool vtd_qualified_faults[] = {
>>>  [VTD_FR_PASID_ENTRY_P] = true,
>>>  [VTD_FR_PASID_TABLE_ENTRY_INV] = true,
>>>  [VTD_FR_SM_INTERRUPT_ADDR] = true,
>>> +[VTD_FR_FS_NON_CANONICAL] = true,
>>>  [VTD_FR_MAX] = false,
>>> };
>>>
>>> @@ -2023,6 +2024,21 @@ static inline uint64_t
>>> vtd_get_flpte_addr(uint64_t flpte, uint8_t aw)
>>>  return flpte & VTD_FL_PT_BASE_ADDR_MASK(aw);
>>> }
>>>
>>> +/* Return true if IOVA is canonical, otherwise false. */
>>> +static bool vtd_iova_fl_check_canonical(IntelIOMMUState *s,
>>> +uint64_t iova, VTDContextEntry *ce,
>>> +uint8_t aw, uint32_t pasid)
>>> +{
>>> +uint64_t iova_limit = vtd_iova_limit(s, ce, aw, pasid);
>> According to spec:
>>
>> "Input-address in the request subjected to first-stage translation is not
>> canonical (i.e., address bits 63:N are not same value as address bits
>> [N-1], where N is 48 bits with 4-level paging and 57 bits with 5-level
>> paging)."
>>
>> So it does not look correct to use the aw field in the PASID entry to
>> calculate iova_limit.
>> Aw can be a value configured by the guest, and it is used for the stage-2
>> table. See spec:
>>
>> "This field is treated as Reserved(0) for implementations not supporting
>> Second-stage Translation (SSTS=0 in the Extended Capability Register).
>> This field indicates the adjusted guest-address-width (AGAW) to be used by
>> hardware for second-stage translation through paging structures referenced
>> through the SSPTPTR field.
>> • The following encodings are defined for this field:
>> • 001b: 39-bit AGAW (3-level page table)
>> • 010b: 48-bit AGAW (4-level page table)
>> • 011b: 57-bit AGAW (5-level page table)
>> • 000b,100b-111b: Reserved
>> When not treated as Reserved(0), hardware ignores this field for
>> first-stage-only (PGTT=001b) and pass-through (PGTT=100b) translations."
>>
>> Thanks
>> Zhenzhong
>>
>Not sure I understand.
>Are you talking about the aw field of Scalable-Mode PASID Table Entry?
Yes.

>The aw parameter is set to s->aw_bits in vtd_do_iommu_translate so I
>think it's safe to use it for canonical address check.
>Maybe we can just use s->aw_bits directly from
>vtd_iova_fl_check_canonical to avoid any mistake?
The AGAW in the PASID entry can be different from s->aw_bits.
Yes, I think using s->aw_bits directly is safe.

Thanks
Zhenzhong
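
For reference, the spec wording quoted above (bits 63:N must equal bit N-1) can be sketched as below. This is a standalone illustration, not a proposed replacement for vtd_iova_fl_check_canonical(). The function name iova_is_canonical() and its n_bits parameter are hypothetical, with n_bits standing for N (48 with 4-level paging, 57 with 5-level paging).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * An address is canonical for width N when bits 63:N all equal bit N-1,
 * i.e. bits 63:(N-1) are either all zeros or all ones.
 */
static bool iova_is_canonical(uint64_t iova, unsigned n_bits)
{
    assert(n_bits >= 1 && n_bits < 64);

    uint64_t upper = iova >> (n_bits - 1);           /* bits 63:(N-1) */
    uint64_t all_ones = UINT64_MAX >> (n_bits - 1);  /* same field, all set */

    return upper == 0 || upper == all_ones;
}
```

Taking N as an explicit argument keeps the check independent of where the width comes from, which is the point under discussion.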

>>> +uint64_t upper_bits_mask = ~(iova_limit - 1);
>>> +uint64_t upper_bits = iova & upper_bits_mask;
>>> +bool msb = ((iova & (iova_limit >> 1)) != 0);
>>> +return !(
>>> + (!msb && (upper_bits != 0)) ||
>>> + (msb && (upper_bits != upper_bits_mask))
>>> +);
>>> +}
>>> +
>>> /*
>>>   * Given the @iova, get relevant @flptep. @flpte_level will be the last
>level
>>>   * of the translation, can be used for deciding the size of large page.
>>> @@ -2038,6 +2054,12 @@ static int
>vtd_iova_to_flpte(IntelIOMMUState *s,
>>> VTDContextEntry *ce,
>>>  uint32_t offset;
>>>  uint64_t flpte;
>>>
>>> +if (!vtd_iova_fl_check_canonical(s, iova, ce, aw_bits, pasid)) {
>>> +e

RE: [PATCH ats_vtd v1 14/24] pci: add IOMMU operations to get address spaces and memory regions with PASID

2024-05-14 Thread Duan, Zhenzhong


>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v1 14/24] pci: add IOMMU operations to get
>address spaces and memory regions with PASID
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> hw/pci/pci.c | 20 
> include/hw/pci/pci.h | 34 ++
> 2 files changed, 54 insertions(+)
>
>diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>index e5f72f9f1d..9ed788c95d 100644
>--- a/hw/pci/pci.c
>+++ b/hw/pci/pci.c
>@@ -2747,6 +2747,26 @@ AddressSpace
>*pci_device_iommu_address_space(PCIDevice *dev)
> return &address_space_memory;
> }
>
>+AddressSpace *pci_device_iommu_address_space_pasid(PCIDevice *dev,
>+   uint32_t pasid)
>+{
>+PCIBus *bus;
>+PCIBus *iommu_bus;
>+int devfn;
>+
>+if (!dev->is_master || !pcie_pasid_enabled(dev) || pasid ==
>PCI_NO_PASID) {
>+return NULL;
>+}
>+
>+pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
>+if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops &&

The bypass and iommu_ops checks are already done implicitly in
pci_device_get_iommu_bus_devfn(). Just do "if (iommu_bus && ...".

Thanks
Zhenzhong

>+iommu_bus->iommu_ops->get_address_space_pasid) {
>+return iommu_bus->iommu_ops->get_address_space_pasid(bus,
>+iommu_bus->iommu_opaque, devfn, pasid);
>+}
>+return NULL;
>+}
>+
> int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice
>*hiod,
> Error **errp)
> {
>diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>index 849e391813..0c532c563c 100644
>--- a/include/hw/pci/pci.h
>+++ b/include/hw/pci/pci.h
>@@ -385,6 +385,38 @@ typedef struct PCIIOMMUOps {
>  * @devfn: device and function number
>  */
> AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>devfn);
>+/**
>+ * @get_address_space_pasid: same as get_address_space but returns
>an
>+ * address space with the requested PASID
>+ *
>+ * This callback is required for PASID-based operations
>+ *
>+ * @bus: the #PCIBus being accessed.
>+ *
>+ * @opaque: the data passed to pci_setup_iommu().
>+ *
>+ * @devfn: device and function number
>+ *
>+ * @pasid: the pasid associated with the requested memory region
>+ */
>+AddressSpace * (*get_address_space_pasid)(PCIBus *bus, void *opaque,
>+  int devfn, uint32_t pasid);
>+/**
>+ * @get_memory_region_pasid: get the iommu memory region for a
>given
>+ * device and pasid
>+ *
>+ * @bus: the #PCIBus being accessed.
>+ *
>+ * @opaque: the data passed to pci_setup_iommu().
>+ *
>+ * @devfn: device and function number
>+ *
>+ * @pasid: the pasid associated with the requested memory region
>+ */
>+IOMMUMemoryRegion * (*get_memory_region_pasid)(PCIBus *bus,
>+   void *opaque,
>+   int devfn,
>+   uint32_t pasid);
> /**
>  * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
>  *
>@@ -420,6 +452,8 @@ typedef struct PCIIOMMUOps {
> } PCIIOMMUOps;
>
> AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>+AddressSpace *pci_device_iommu_address_space_pasid(PCIDevice *dev,
>+   uint32_t pasid);
> int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice
>*hiod,
> Error **errp);
> void pci_device_unset_iommu_device(PCIDevice *dev);
>--
>2.44.0


RE: [PATCH ats_vtd v1 09/24] pcie: helper functions to check if PASID and ATS are enabled

2024-05-14 Thread Duan, Zhenzhong


>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v1 09/24] pcie: helper functions to check if PASID
>and ATS are enabled
>
>ats_enabled and pasid_enabled check whether the capabilities are
>present or not. If so, we read the configuration space to get
>the status of the feature (enabled or not).

s/present/enabled/
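
To spell out the present/enabled distinction, here is a minimal standalone sketch over a raw config-space buffer. It does not use the QEMU PCIDevice API. The names cfg_get_word() and pasid_enabled() and the SK_ constants are hypothetical; the control register offset (0x06) and enable bit (0x01) are assumed to match the PASID capability layout in the PCIe spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PASID capability layout: control register at +0x06, bit 0 = enable. */
#define SK_PASID_CTRL        0x06
#define SK_PASID_CTRL_ENABLE 0x01

/* Read a little-endian 16-bit word from config space. */
static uint16_t cfg_get_word(const uint8_t *cfg, uint16_t off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/*
 * pasid_cap is the capability offset, or 0 when the device has none.
 * "Present" only means pasid_cap != 0; "enabled" also needs the control bit.
 */
static bool pasid_enabled(const uint8_t *cfg, uint16_t pasid_cap)
{
    if (pasid_cap == 0) {
        return false;   /* capability not present */
    }
    return (cfg_get_word(cfg, pasid_cap + SK_PASID_CTRL) &
            SK_PASID_CTRL_ENABLE) != 0;
}
```

A device with the capability but a cleared enable bit returns false here, so the helper reports the enabled state rather than mere presence.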

>
>Signed-off-by: Clément Mathieu--Drif 
>---
> hw/pci/pcie.c | 18 ++
> include/hw/pci/pcie.h |  3 +++
> 2 files changed, 21 insertions(+)
>
>diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
>index c8e9d4c0f7..2a638a9c3f 100644
>--- a/hw/pci/pcie.c
>+++ b/hw/pci/pcie.c
>@@ -1201,3 +1201,21 @@ void pcie_pasid_init(PCIDevice *dev, uint16_t
>offset, uint8_t pasid_width,
>
> dev->exp.pasid_cap = offset;
> }
>+
>+bool pcie_pasid_enabled(const PCIDevice *dev)
>+{
>+if (!pci_is_express(dev) || !dev->exp.pasid_cap) {
>+return false;
>+}
>+return (pci_get_word(dev->config + dev->exp.pasid_cap +
>PCI_PASID_CTRL) &
>+PCI_PASID_CTRL_ENABLE) != 0;
>+}
>+
>+bool pcie_ats_enabled(const PCIDevice *dev)
>+{
>+if (!pci_is_express(dev) || !dev->exp.ats_cap) {
>+return false;
>+}
>+return (pci_get_word(dev->config + dev->exp.ats_cap + PCI_ATS_CTRL) &
>+PCI_ATS_CTRL_ENABLE) != 0;
>+}
>diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
>index c59627d556..8c222f09da 100644
>--- a/include/hw/pci/pcie.h
>+++ b/include/hw/pci/pcie.h
>@@ -151,4 +151,7 @@ void
>pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
>
> void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
>  bool exec_perm, bool priv_mod);
>+
>+bool pcie_pasid_enabled(const PCIDevice *dev);
>+bool pcie_ats_enabled(const PCIDevice *dev);
> #endif /* QEMU_PCIE_H */
>--
>2.44.0


RE: [PATCH ats_vtd v1 08/24] pcie: add helper to declare PASID capability for a pcie device

2024-05-14 Thread Duan, Zhenzhong


>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v1 08/24] pcie: add helper to declare PASID
>capability for a pcie device
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> hw/pci/pcie.c  | 24 
> include/hw/pci/pcie.h  |  6 +-
> include/hw/pci/pcie_regs.h |  3 +++
> 3 files changed, 32 insertions(+), 1 deletion(-)
>
>diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
>index 4b2f0805c6..c8e9d4c0f7 100644
>--- a/hw/pci/pcie.c
>+++ b/hw/pci/pcie.c
>@@ -1177,3 +1177,27 @@ void pcie_acs_reset(PCIDevice *dev)
> pci_set_word(dev->config + dev->exp.acs_cap + PCI_ACS_CTRL, 0);
> }
> }
>+
>+/* PASID */
>+void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
>+ bool exec_perm, bool priv_mod)
>+{
>+assert(pasid_width <= PCI_EXT_CAP_PASID_MAX_WIDTH);
>+static const uint16_t control_reg_rw_mask = 0x07;
>+uint16_t capability_reg = pasid_width;
>+
>+pcie_add_capability(dev, PCI_EXT_CAP_ID_PASID, PCI_PASID_VER, offset,
>+PCI_EXT_CAP_PASID_SIZEOF);
>+
>+capability_reg <<= PCI_EXT_CAP_PASID_SIZEOF;

I don't understand why PCI_EXT_CAP_PASID_SIZEOF is used for the shift.
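
What I would expect instead is a dedicated shift for the width field, roughly as sketched below. The SK_ names and pasid_cap_reg() are hypothetical. The field positions (Execute Permission in bit 1, Privileged Mode in bit 2, Max PASID Width in bits 12:8) are assumed from the PASID capability register layout. If the size constant happens to equal 8, the result is numerically the same, but the intent is clearer with a separate definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed PASID Capability register fields. */
#define SK_PASID_CAP_EXEC        0x02  /* Execute Permission Supported */
#define SK_PASID_CAP_PRIV        0x04  /* Privileged Mode Supported */
#define SK_PASID_CAP_WIDTH_SHIFT 8     /* Max PASID Width, bits 12:8 */
#define SK_PASID_MAX_WIDTH       20

/* Build the 16-bit capability register value from its three fields. */
static uint16_t pasid_cap_reg(uint8_t pasid_width, bool exec_perm,
                              bool priv_mod)
{
    assert(pasid_width <= SK_PASID_MAX_WIDTH);

    uint16_t reg = (uint16_t)(pasid_width << SK_PASID_CAP_WIDTH_SHIFT);

    reg |= exec_perm ? SK_PASID_CAP_EXEC : 0;
    reg |= priv_mod ? SK_PASID_CAP_PRIV : 0;
    return reg;
}
```

For a 20-bit width with execute permission only, this yields 0x1402.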

>+capability_reg |= exec_perm ? PCI_PASID_CAP_EXEC : 0;
>+capability_reg |= priv_mod  ? PCI_PASID_CAP_PRIV : 0;
>+pci_set_word(dev->config + offset + PCI_PASID_CAP, capability_reg);
>+
>+/* Everything is disabled by default */
>+pci_set_word(dev->config + offset + PCI_PASID_CTRL, 0);
>+
>+pci_set_word(dev->wmask + offset + PCI_PASID_CTRL,
>control_reg_rw_mask);
>+
>+dev->exp.pasid_cap = offset;
>+}
>diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
>index 11f5a91bbb..c59627d556 100644
>--- a/include/hw/pci/pcie.h
>+++ b/include/hw/pci/pcie.h
>@@ -69,8 +69,9 @@ struct PCIExpressDevice {
> uint16_t aer_cap;
> PCIEAERLog aer_log;
>
>-/* Offset of ATS capability in config space */
>+/* Offset of ATS and PASID capabilities in config space */
> uint16_t ats_cap;
>+uint16_t pasid_cap;
>
> /* ACS */
> uint16_t acs_cap;
>@@ -147,4 +148,7 @@ void pcie_cap_slot_unplug_cb(HotplugHandler
>*hotplug_dev, DeviceState *dev,
>  Error **errp);
> void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
>  DeviceState *dev, Error **errp);
>+
>+void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
>+ bool exec_perm, bool priv_mod);
> #endif /* QEMU_PCIE_H */
>diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
>index 9d3b6868dc..0a86598f80 100644
>--- a/include/hw/pci/pcie_regs.h
>+++ b/include/hw/pci/pcie_regs.h
>@@ -86,6 +86,9 @@ typedef enum PCIExpLinkWidth {
> #define PCI_ARI_VER 1
> #define PCI_ARI_SIZEOF  8
>
>+/* PASID */
>+#define PCI_PASID_VER   1
>+#define PCI_EXT_CAP_PASID_MAX_WIDTH 20
> /* AER */
> #define PCI_ERR_VER 2
> #define PCI_ERR_SIZEOF  0x48
>--
>2.44.0


RE: [PATCH ats_vtd v1 07/24] memory: add permissions in IOMMUAccessFlags

2024-05-14 Thread Duan, Zhenzhong


>-Original Message-
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v1 07/24] memory: add permissions in
>IOMMUAccessFlags
>
>This will be necessary for devices implementing ATS.
>We also define a new macro IOMMU_ACCESS_FLAG_FULL in addition to
>IOMMU_ACCESS_FLAG to support more access flags.
>IOMMU_ACCESS_FLAG is kept for convenience and backward compatibility.
>
>Here are the flags added (defined by the PCIe 5 specification) :
>- Execute Requested
>- Privileged Mode Requested
>- Global
>- Untranslated Only
>
>IOMMU_ACCESS_FLAG sets the additional flags to 0
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> include/exec/memory.h | 33 ++---
> 1 file changed, 26 insertions(+), 7 deletions(-)
>
>diff --git a/include/exec/memory.h b/include/exec/memory.h
>index 8626a355b3..304504de02 100644
>--- a/include/exec/memory.h
>+++ b/include/exec/memory.h
>@@ -110,22 +110,41 @@ struct MemoryRegionSection {
>
> typedef struct IOMMUTLBEntry IOMMUTLBEntry;
>
>-/* See address_space_translate: bit 0 is read, bit 1 is write.  */
>+/*
>+ * See address_space_translate:
>+ *  - bit 0 : read
>+ *  - bit 1 : write
>+ *  - bit 2 : exec
>+ *  - bit 3 : priv
>+ *  - bit 4 : global
>+ *  - bit 5 : untranslated only
>+ */
> typedef enum {
> IOMMU_NONE = 0,
> IOMMU_RO   = 1,
> IOMMU_WO   = 2,
> IOMMU_RW   = 3,
>+IOMMU_EXEC = 4,
>+IOMMU_PRIV = 8,
>+IOMMU_GLOBAL = 16,
>+IOMMU_UNTRANSLATED_ONLY = 32,
> } IOMMUAccessFlags;
>
>-#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ?
>IOMMU_WO : 0))
>+#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | \
>+((w) ? IOMMU_WO : 0))
>+#define IOMMU_ACCESS_FLAG_FULL(r, w, x, p, g, uo) \
>+(IOMMU_ACCESS_FLAG(r, w) | \
>+((x) ? IOMMU_EXEC : 0) | \
>+((p) ? IOMMU_PRIV : 0) | \
>+((g) ? IOMMU_GLOBAL : 0) | \
>+((uo) ? IOMMU_UNTRANSLATED_ONLY : 0))
>
> struct IOMMUTLBEntry {
>-AddressSpace *target_as;
>-hwaddr   iova;
>-hwaddr   translated_addr;
>-hwaddr   addr_mask;  /* 0xfff = 4k translation */
>-IOMMUAccessFlags perm;
>+AddressSpace *target_as;
>+hwaddr  iova;
>+hwaddr  translated_addr;
>+hwaddr  addr_mask;  /* 0xfff = 4k translation */
>+IOMMUAccessFlags perm;
> };

Any reason for this change?

Thanks
Zhenzhong
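
As an illustration of the flag layout proposed in the quoted patch, here is a small standalone sketch. The enum and both macros are copied from the patch; demo_perm_has() is a hypothetical helper added only for this example and is not part of QEMU.

```c
#include <stdbool.h>

/* Copied from the quoted patch: one bit per permission/attribute. */
typedef enum {
    IOMMU_NONE = 0,
    IOMMU_RO   = 1,
    IOMMU_WO   = 2,
    IOMMU_RW   = 3,
    IOMMU_EXEC = 4,
    IOMMU_PRIV = 8,
    IOMMU_GLOBAL = 16,
    IOMMU_UNTRANSLATED_ONLY = 32,
} IOMMUAccessFlags;

#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | \
                                 ((w) ? IOMMU_WO : 0))
#define IOMMU_ACCESS_FLAG_FULL(r, w, x, p, g, uo) \
    (IOMMU_ACCESS_FLAG(r, w) | \
     ((x) ? IOMMU_EXEC : 0) | \
     ((p) ? IOMMU_PRIV : 0) | \
     ((g) ? IOMMU_GLOBAL : 0) | \
     ((uo) ? IOMMU_UNTRANSLATED_ONLY : 0))

/* Hypothetical helper: true if every bit in 'wanted' is set in 'perm'. */
static bool demo_perm_has(int perm, int wanted)
{
    return (perm & wanted) == wanted;
}
```

Because IOMMU_ACCESS_FLAG leaves the four new bits at 0, existing callers that only pass read/write keep producing the same values as before.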


RE: [PATCH ats_vtd v1 04/24] intel_iommu: set accessed and dirty bits during first stage translation

2024-05-14 Thread Duan, Zhenzhong


>-Original Message-
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v1 04/24] intel_iommu: set accessed and dirty bits
>during first stage translation
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> hw/i386/intel_iommu.c  | 26 ++
> hw/i386/intel_iommu_internal.h |  3 +++
> 2 files changed, 29 insertions(+)
>
>diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>index 240ecb8f72..cad70e0d05 100644
>--- a/hw/i386/intel_iommu.c
>+++ b/hw/i386/intel_iommu.c
>@@ -1913,6 +1913,7 @@ static const bool vtd_qualified_faults[] = {
> [VTD_FR_PASID_TABLE_ENTRY_INV] = true,
> [VTD_FR_SM_INTERRUPT_ADDR] = true,
> [VTD_FR_FS_NON_CANONICAL] = true,
>+[VTD_FR_FS_BIT_UPDATE_FAILED] = true,
> [VTD_FR_MAX] = false,
> };
>
>@@ -2039,6 +2040,20 @@ static bool
>vtd_iova_fl_check_canonical(IntelIOMMUState *s,
> );
> }
>
>+static MemTxResult vtd_set_flag_in_pte(dma_addr_t base_addr, uint32_t
>index,
>+   uint64_t pte, uint64_t flag)
>+{
>+if (pte & flag) {
>+return MEMTX_OK;
>+}
>+pte |= flag;
>+pte = cpu_to_le64(pte);
>+return dma_memory_write(&address_space_memory,
>+base_addr + index * sizeof(pte),
>+&pte, sizeof(pte),
>+MEMTXATTRS_UNSPECIFIED);
>+}
>+
> /*
>  * Given the @iova, get relevant @flptep. @flpte_level will be the last level
>  * of the translation, can be used for deciding the size of large page.
>@@ -2080,11 +2095,22 @@ static int vtd_iova_to_flpte(IntelIOMMUState
>*s, VTDContextEntry *ce,
>
> *reads = true;
> *writes = (*writes) && (flpte & VTD_FL_RW_MASK);
>+
>+if (vtd_set_flag_in_pte(addr, offset, flpte, VTD_FL_PTE_A)
>+!= MEMTX_OK) {
>+return -VTD_FR_FS_BIT_UPDATE_FAILED;
>+}
>+
> if (is_write && !(flpte & VTD_FL_RW_MASK)) {
> return -VTD_FR_WRITE;
> }

Maybe it would be better to set the accessed bit here, after the write permission check?
Speculatively setting the accessed bit is allowed but not necessary.

Thanks
Zhenzhong
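
A minimal sketch of the ordering suggested above, assuming the accessed bit is only written once the permission check has passed. The bit values (0x20, 0x40, bit 1 for R/W) are taken from the quoted patch; the DEMO_* names and demo_pte_after_access() are hypothetical and only model the in-memory PTE update, not the DMA write itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FL_RW_MASK (1ULL << 1) /* write permission */
#define DEMO_FL_PTE_A   0x20ULL     /* accessed */
#define DEMO_FL_PTE_D   0x40ULL     /* dirty */

/*
 * Return the PTE value after one step of the walk.
 * A write to a read-only entry faults, so the entry is returned untouched.
 */
static uint64_t demo_pte_after_access(uint64_t pte, bool is_write, bool is_leaf)
{
    if (is_write && !(pte & DEMO_FL_RW_MASK)) {
        return pte; /* fault path: no A/D update */
    }
    pte |= DEMO_FL_PTE_A; /* set only after the permission check */
    if (is_write && is_leaf) {
        pte |= DEMO_FL_PTE_D; /* dirty is set on the last-level entry only */
    }
    return pte;
}
```

With this ordering a faulting write leaves the paging entry unmodified, which avoids one guest memory write on the error path.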

>
> if (vtd_is_last_flpte(flpte, level)) {
>+if (is_write &&
>+(vtd_set_flag_in_pte(addr, offset, flpte, VTD_FL_PTE_D) !=
>+MEMTX_OK)) {
>+return -VTD_FR_FS_BIT_UPDATE_FAILED;
>+}
> *flptep = flpte;
> *flpte_level = level;
> return 0;
>diff --git a/hw/i386/intel_iommu_internal.h
>b/hw/i386/intel_iommu_internal.h
>index e9448291a4..14879d3a58 100644
>--- a/hw/i386/intel_iommu_internal.h
>+++ b/hw/i386/intel_iommu_internal.h
>@@ -328,6 +328,7 @@ typedef enum VTDFaultReason {
>
> /* Output address in the interrupt address range for scalable mode */
> VTD_FR_SM_INTERRUPT_ADDR = 0x87,
>+VTD_FR_FS_BIT_UPDATE_FAILED = 0x91, /* SFS.10 */
> VTD_FR_MAX, /* Guard */
> } VTDFaultReason;
>
>@@ -649,6 +650,8 @@ typedef struct VTDPIOTLBInvInfo {
> /* First Level Paging Structure */
> #define VTD_FL_PT_LEVEL 1
> #define VTD_FL_PT_ENTRY_NR  512
>+#define VTD_FL_PTE_A 0x20
>+#define VTD_FL_PTE_D 0x40
>
> /* Masks for First Level Paging Entry */
> #define VTD_FL_RW_MASK  (1ULL << 1)
>--
>2.44.0


RE: [PATCH ats_vtd v1 06/24] intel_iommu: do not consider wait_desc as an invalid descriptor

2024-05-14 Thread Duan, Zhenzhong


>-Original Message-
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v1 06/24] intel_iommu: do not consider wait_desc
>as an invalid descriptor
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> hw/i386/intel_iommu.c | 5 +
> 1 file changed, 5 insertions(+)
>
>diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>index 85a7ebac67..c475a354a0 100644
>--- a/hw/i386/intel_iommu.c
>+++ b/hw/i386/intel_iommu.c
>@@ -3365,6 +3365,11 @@ static bool
>vtd_process_wait_desc(IntelIOMMUState *s, VTDInvDesc *inv_desc)
> } else if (inv_desc->lo & VTD_INV_DESC_WAIT_IF) {
> /* Interrupt flag */
> vtd_generate_completion_event(s);
>+} else if (inv_desc->lo & VTD_INV_DESC_WAIT_FN) {
>+/*
>+ * SW = 0, IF = 0, FN = 1
>+ * Nothing to do as we process the events sequentially
>+ */
> } else {
> error_report_once("%s: invalid wait desc: hi=%"PRIx64", lo=%"PRIx64
>   " (unknown type)", __func__, inv_desc->hi,

LGTM.

Thanks
Zhenzhong 


RE: [PATCH ats_vtd v1 03/24] intel_iommu: check if the input address is canonical

2024-05-14 Thread Duan, Zhenzhong
Hi Clement,

>-Original Message-
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v1 03/24] intel_iommu: check if the input address
>is canonical
>
>First stage translation must fail if the address to translate is
>not canonical.
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> hw/i386/intel_iommu.c  | 22 ++
> hw/i386/intel_iommu_internal.h |  2 ++
> 2 files changed, 24 insertions(+)
>
>diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>index 80cdf37870..240ecb8f72 100644
>--- a/hw/i386/intel_iommu.c
>+++ b/hw/i386/intel_iommu.c
>@@ -1912,6 +1912,7 @@ static const bool vtd_qualified_faults[] = {
> [VTD_FR_PASID_ENTRY_P] = true,
> [VTD_FR_PASID_TABLE_ENTRY_INV] = true,
> [VTD_FR_SM_INTERRUPT_ADDR] = true,
>+[VTD_FR_FS_NON_CANONICAL] = true,
> [VTD_FR_MAX] = false,
> };
>
>@@ -2023,6 +2024,21 @@ static inline uint64_t
>vtd_get_flpte_addr(uint64_t flpte, uint8_t aw)
> return flpte & VTD_FL_PT_BASE_ADDR_MASK(aw);
> }
>
>+/* Return true if IOVA is canonical, otherwise false. */
>+static bool vtd_iova_fl_check_canonical(IntelIOMMUState *s,
>+uint64_t iova, VTDContextEntry *ce,
>+uint8_t aw, uint32_t pasid)
>+{
>+uint64_t iova_limit = vtd_iova_limit(s, ce, aw, pasid);

According to spec:

"Input-address in the request subjected to first-stage translation is not
canonical (i.e., address bits 63:N are not same value as address bits [N-1],
where N is 48 bits with 4-level paging and 57 bits with 5-level paging)."

So it doesn't look correct to use the aw field in the PASID entry to calculate iova_limit.
AW can be a value configured by the guest, and it is used for the stage-2 table. See spec:

"This field is treated as Reserved(0) for implementations not supporting
Second-stage Translation (SSTS=0 in the Extended Capability Register).
This field indicates the adjusted guest-address-width (AGAW) to be used by
hardware for second-stage translation through paging structures referenced
through the SSPTPTR field.
• The following encodings are defined for this field:
• 001b: 39-bit AGAW (3-level page table)
• 010b: 48-bit AGAW (4-level page table)
• 011b: 57-bit AGAW (5-level page table)
• 000b,100b-111b: Reserved
When not treated as Reserved(0), hardware ignores this field for
first-stage-only (PGTT=001b) and pass-through (PGTT=100b) translations."

Thanks
Zhenzhong
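
To make the rule from the first spec quote concrete, here is a minimal sketch of a canonical check that depends only on N (48 for 4-level paging, 57 for 5-level paging) and not on any address width taken from the PASID entry. demo_iova_is_canonical() is a hypothetical name, and how N is derived from the first-stage paging mode is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An address is canonical when bits 63:N all equal bit N-1,
 * i.e. bits 63:(N-1) are either all zeros or all ones.
 * 'n' is expected to be 48 or 57 (any value in 1..64 is safe here).
 */
static bool demo_iova_is_canonical(uint64_t iova, unsigned n)
{
    uint64_t upper_mask = ~((1ULL << (n - 1)) - 1); /* bits 63:(n-1) */
    uint64_t upper = iova & upper_mask;

    return upper == 0 || upper == upper_mask;
}
```

For example, 0x0000800000000000 has only bit 47 set: it is rejected with N = 48 but accepted with N = 57.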


>+uint64_t upper_bits_mask = ~(iova_limit - 1);
>+uint64_t upper_bits = iova & upper_bits_mask;
>+bool msb = ((iova & (iova_limit >> 1)) != 0);
>+return !(
>+ (!msb && (upper_bits != 0)) ||
>+ (msb && (upper_bits != upper_bits_mask))
>+);
>+}
>+
> /*
>  * Given the @iova, get relevant @flptep. @flpte_level will be the last level
>  * of the translation, can be used for deciding the size of large page.
>@@ -2038,6 +2054,12 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s,
>VTDContextEntry *ce,
> uint32_t offset;
> uint64_t flpte;
>
>+if (!vtd_iova_fl_check_canonical(s, iova, ce, aw_bits, pasid)) {
>+error_report_once("%s: detected non canonical IOVA (iova=0x%"
>PRIx64 ","
>+  "pasid=0x%" PRIx32 ")", __func__, iova, pasid);
>+return -VTD_FR_FS_NON_CANONICAL;
>+}
>+
> while (true) {
> offset = vtd_iova_fl_level_offset(iova, level);
> flpte = vtd_get_flpte(addr, offset);
>diff --git a/hw/i386/intel_iommu_internal.h
>b/hw/i386/intel_iommu_internal.h
>index 901691afb9..e9448291a4 100644
>--- a/hw/i386/intel_iommu_internal.h
>+++ b/hw/i386/intel_iommu_internal.h
>@@ -324,6 +324,8 @@ typedef enum VTDFaultReason {
> VTD_FR_PASID_ENTRY_P = 0x59, /* The Present(P) field of pasidt-entry is
>0 */
> VTD_FR_PASID_TABLE_ENTRY_INV = 0x5b,  /*Invalid PASID table entry */
>
>+VTD_FR_FS_NON_CANONICAL = 0x80, /* SNG.1 : Address for FS not
>canonical.*/
>+
> /* Output address in the interrupt address range for scalable mode */
> VTD_FR_SM_INTERRUPT_ADDR = 0x87,
> VTD_FR_MAX, /* Guard */
>--
>2.44.0


RE: [PATCH intel_iommu 0/7] FLTS for VT-d

2024-05-13 Thread Duan, Zhenzhong
Hi Clement,

I'll study the series and try to give comments this week.

Thanks
Zhenzhong

>-Original Message-
>From: CLEMENT MATHIEU--DRIF 
>Subject: Re: [PATCH intel_iommu 0/7] FLTS for VT-d
>
>Hi Zhenzhong
>
>Have you had time to review the ATS series rebased on your FLTS patches?
>
>Thanks
> >cmd
>
>
>On 06/05/2024 03:38, Duan, Zhenzhong wrote:
>>
>>
>> Hi Clement,
>>
>> Sorry for late response, just back from vacation.
>> I saw your rebased version and thanks for your work.
>> I'll schedule a timeslot to review them.
>>
>> Thanks
>> Zhenzhong
>>
>>> -Original Message-
>>> From: CLEMENT MATHIEU--DRIF 
>>> Subject: Re: [PATCH intel_iommu 0/7] FLTS for VT-d
>>>
>>> Hi Zhenzhong,
>>>
>>> I will rebase,
>>>
>>> thanks
>>>
>>> On 01/05/2024 14:40, Duan, Zhenzhong wrote:
>>>>
>>>> Ah, this is a duplicate effort on stage-1 translation.
>>>>
>>>> Hi Clement,
>>>>
>>>> We had ever sent a rfcv1 series "intel_iommu: Enable stage-1
>translation"
>>>> for both emulated and passthrough device, link:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2024-
>01/msg02740.html
>>>> which now evolves to rfcv2, link:
>>>>
>>>
>https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting
>>> _rfcv2/
>>>> It had addressed recent community comments, also the comments in
>old
>>> history series:
>>>
>https://patchwork.kernel.org/project/kvm/cover/20210302203827.437645
>>> -1-yi.l@intel.com/
>>>> Would you mind rebasing your remaining part, i.e., ATS, PRI emulation,
>etc
>>> on to our rfcv2?
>>>> Thanks
>>>> Zhenzhong
>>>>
>>>>> -Original Message-
>>>>> From: Cédric Le Goater 
>>>>> Subject: Re: [PATCH intel_iommu 0/7] FLTS for VT-d
>>>>>
>>>>> Hello,
>>>>>
>>>>> Adding a few people in Cc: who are familiar with the Intel IOMMU.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> C.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 4/22/24 17:52, CLEMENT MATHIEU--DRIF wrote:
>>>>>> This series is the first of a list that add support for SVM in the Intel
>>> IOMMU.
>>>>>> Here, we implement support for first-stage translation in VT-d.
>>>>>> The PASID-based IOTLB invalidation is also added in this series as it is
>a
>>>>>> requirement of FLTS.
>>>>>>
>>>>>> The last patch introduces the 'flts' option to enable the feature from
>>>>>> the command line.
>>>>>> Once enabled, several drivers of the Linux kernel use this feature.
>>>>>>
>>>>>> This work is based on the VT-d specification version 4.1 (March 2023)
>>>>>>
>>>>>> Here is a link to a GitHub repository where you can find the following
>>>>> elements :
>>>>>>- Qemu with all the patches for SVM
>>>>>>- ATS
>>>>>>- PRI
>>>>>>- PASID based IOTLB invalidation
>>>>>>- Device IOTLB invalidations
>>>>>>- First-stage translations
>>>>>>- Requests with already translated addresses
>>>>>>- A demo device
>>>>>>- A simple driver for the demo device
>>>>>>- A userspace program (for testing and demonstration purposes)
>>>>>>
>>>>>> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
>>>>>>
>>>>>> Clément Mathieu--Drif (7):
>>>>>>  intel_iommu: fix FRCD construction macro.
>>>>>>  intel_iommu: rename slpte to pte before adding FLTS
>>>>>>  intel_iommu: make types match
>>>>>>  intel_iommu: add support for first-stage translation
>>>>>>  intel_iommu: extract device IOTLB invalidation logic
>>>>>>  intel_iommu: add PASID-based IOTLB invalidation
>>>>>>  intel_iommu: add a CLI option to enable FLTS
>>>>>>
>>>>>> hw/i386/intel_iommu.c  | 655
>++-
>>> -
>>>>> -
>>>>>> hw/i386/intel_iommu_internal.h | 114 --
>>>>>> include/hw/i386/intel_iommu.h  |   3 +-
>>>>>> 3 files changed, 609 insertions(+), 163 deletions(-)
>>>>>>


RE: [PATCH v2 00/11] VFIO: misc cleanups

2024-05-13 Thread Duan, Zhenzhong
Hi All,

While looking into more functions that take 'Error **', I saw that many
are in the "int testfunc(..., Error **errp)" format, and I was a bit confused.

The qapi/error.h suggests:

* - Whenever practical, also return a value that indicates success /
 *   failure.  This can make the error checking more concise, and can
 *   avoid useless error object creation and destruction.  Note that
 *   we still have many functions returning void.  We recommend
 *   • bool-valued functions return true on success / false on failure,
 *   • pointer-valued functions return non-null / null pointer, and
 *   • integer-valued functions return non-negative / negative.

There are some functions like:

int testfunc(..., Error **errp)
{
    if (succeed) {
        return 0;
    } else {
        return -EINVAL;
    }
}

Does testfunc() follow the 'integer-valued functions' rule above, or should it
be changed to a 'bool-valued function'?

Is there a clear rule for when to change 'int testfunc(..., Error **errp)'
to 'bool testfunc(..., Error **errp)'?

Thanks
Zhenzhong
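
To illustrate the two shapes being compared, here is a minimal standalone sketch. DemoError is a hypothetical stand-in for QEMU's Error (the real type and error_setg() are not used here), and demo_int_func()/demo_bool_func() are made-up names. My assumption, not something stated in qapi/error.h, is that the int form is mainly worth keeping when callers actually consume the specific errno value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct DemoError {
    const char *msg;
} DemoError;

static DemoError demo_err_invalid = { "invalid argument" };

/* Integer-valued: non-negative on success, negative errno on failure. */
static int demo_int_func(int arg, DemoError **errp)
{
    if (arg < 0) {
        if (errp) {
            *errp = &demo_err_invalid; /* report details through errp */
        }
        return -EINVAL;
    }
    return 0;
}

/* Bool-valued: true on success, false on failure; details only in errp. */
static bool demo_bool_func(int arg, DemoError **errp)
{
    return demo_int_func(arg, errp) >= 0;
}
```

Both functions report the failure reason through errp. The bool form only drops the errno, so a caller can write "if (!demo_bool_func(arg, &err))" without having to know the sign convention.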

>-Original Message-
>From: Duan, Zhenzhong 
>
>Subject: [PATCH v2 00/11] VFIO: misc cleanups
>
>Hi
>
>This is a cleanup series to change functions in hw/vfio/ to return bool
>when the error is passed through errp parameter, also some cleanup
>with g_autofree.
>
>See discussion at https://lists.gnu.org/archive/html/qemu-devel/2024-
>04/msg04782.html
>
>This series processed below files:
>hw/vfio/container.c
>hw/vfio/iommufd.c
>hw/vfio/cpr.c
>backends/iommufd.c
>
>So above files are clean now, there are still other files need processing
>in hw/vfio.
>
>Test done on x86 platform:
>vfio device hotplug/unplug with different backend
>reboot
>
>Thanks
>Zhenzhong
>
>Changelog:
>v2:
>- split out g_autofree code as a patch (Cédric)
>- add processing for more files
>
>Zhenzhong Duan (11):
>  vfio/pci: Use g_autofree in vfio_realize
>  vfio/pci: Use g_autofree in iommufd_cdev_get_info_iova_range()
>  vfio: Make VFIOIOMMUClass::attach_device() and its wrapper return bool
>  vfio: Make VFIOIOMMUClass::setup() return bool
>  vfio: Make VFIOIOMMUClass::add_window() and its wrapper return bool
>  vfio/container: Make vfio_connect_container() return bool
>  vfio/container: Make vfio_set_iommu() return bool
>  vfio/container: Make vfio_get_device() return bool
>  vfio/iommufd: Make iommufd_cdev_*() return bool
>  vfio/cpr: Make vfio_cpr_register_container() return bool
>  backends/iommufd: Make iommufd_backend_*() return bool
>
> include/hw/vfio/vfio-common.h |   6 +-
> include/hw/vfio/vfio-container-base.h |  18 ++---
> include/sysemu/iommufd.h  |   6 +-
> backends/iommufd.c|  29 +++
> hw/vfio/ap.c  |   6 +-
> hw/vfio/ccw.c |   6 +-
> hw/vfio/common.c  |   6 +-
> hw/vfio/container-base.c  |   8 +-
> hw/vfio/container.c   |  81 +--
> hw/vfio/cpr.c |   4 +-
> hw/vfio/iommufd.c | 109 +++---
> hw/vfio/pci.c |  12 ++-
> hw/vfio/platform.c|   7 +-
> hw/vfio/spapr.c   |  28 +++
> backends/trace-events |   4 +-
> 15 files changed, 147 insertions(+), 183 deletions(-)
>
>--
>2.34.1



RE: [PATCH v5 01/19] backends: Introduce HostIOMMUDevice abstract

2024-05-13 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v5 01/19] backends: Introduce HostIOMMUDevice
>abstract
>
>Hello Zhenzhong,
>
>On 5/8/24 11:03, Zhenzhong Duan wrote:
>> Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
>>
>> Introduce .realize() to initialize HostIOMMUDevice further after
>> instance init.
>>
>> Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
>> for VFIO, and VDPA in the future.
>
>This looks like a way to work around some other problem, like
>avoiding exposing Linux definitions on windows build.

Yes, I used this macro in patch 19 to fix the build failure on Windows.
I also need to change HostIOMMUDeviceCaps::type to uint32_t.

>
>Thanks,
>
>C.
>
>
>
>
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   MAINTAINERS|  2 ++
>>   include/sysemu/host_iommu_device.h | 51
>++
>>   backends/host_iommu_device.c   | 30 ++
>>   backends/Kconfig   |  5 +++
>>   backends/meson.build   |  1 +
>>   5 files changed, 89 insertions(+)
>>   create mode 100644 include/sysemu/host_iommu_device.h
>>   create mode 100644 backends/host_iommu_device.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 84391777db..5dab60bd04 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -2191,6 +2191,8 @@ M: Zhenzhong Duan
>
>>   S: Supported
>>   F: backends/iommufd.c
>>   F: include/sysemu/iommufd.h
>> +F: backends/host_iommu_device.c
>> +F: include/sysemu/host_iommu_device.h
>>   F: include/qemu/chardev_open.h
>>   F: util/chardev_open.c
>>   F: docs/devel/vfio-iommufd.rst
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> new file mode 100644
>> index 00..2b58a94d62
>> --- /dev/null
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -0,0 +1,51 @@
>> +/*
>> + * Host IOMMU device abstract declaration
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + *
>> + * Authors: Zhenzhong Duan 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef HOST_IOMMU_DEVICE_H
>> +#define HOST_IOMMU_DEVICE_H
>> +
>> +#include "qom/object.h"
>> +#include "qapi/error.h"
>> +
>> +#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>> +OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>HOST_IOMMU_DEVICE)
>> +
>> +struct HostIOMMUDevice {
>> +Object parent_obj;
>> +};
>> +
>> +/**
>> + * struct HostIOMMUDeviceClass - The base class for all host IOMMU
>devices.
>> + *
>> + * Different type of host devices (e.g., VFIO or VDPA device) or devices
>> + * with different backend (e.g., VFIO legacy container or IOMMUFD
>backend)
>> + * can have different sub-classes.
>> + */
>> +struct HostIOMMUDeviceClass {
>> +ObjectClass parent_class;
>> +
>> +/**
>> + * @realize: initialize host IOMMU device instance further.
>> + *
>> + * Mandatory callback.
>> + *
>> + * @hiod: pointer to a host IOMMU device instance.
>> + *
>> + * @opaque: pointer to agent device of this host IOMMU device,
>> + *  i.e., for VFIO, pointer to VFIODevice
>> + *
>> + * @errp: pass an Error out when realize fails.
>> + *
>> + * Returns: true on success, false on failure.
>> + */
>> +bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
>> +};
>> +#endif
>> diff --git a/backends/host_iommu_device.c
>b/backends/host_iommu_device.c
>> new file mode 100644
>> index 00..41f2fdce20
>> --- /dev/null
>> +++ b/backends/host_iommu_device.c
>> @@ -0,0 +1,30 @@
>> +/*
>> + * Host IOMMU device abstract
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + *
>> + * Authors: Zhenzhong Duan 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "sysemu/host_iommu_device.h"
>> +
>> +OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
>> +host_iommu_device,
>> +HOST_IOMMU_DEVICE,
>> +OBJECT)
>> +
>> +static void host_iommu_device_class_init(ObjectClass *oc, void *data)
>> +{
>> +}
>> +
>> +static void host_iommu_device_init(Object *obj)
>> +{
>> +}
>> +
>> +static void host_iommu_device_finalize(Object *obj)
>> +{
>> +}
>> diff --git a/backends/Kconfig b/backends/Kconfig
>> index 2cb23f62fa..34ab29e994 100644
>> --- a/backends/Kconfig
>> +++ b/backends/Kconfig
>> @@ -3,3 +3,8 @@ source tpm/Kconfig
>>   config IOMMUFD
>>   bool
>>   depends on VFIO
>> +
>> +config HOST_IOMMU_DEVICE
>> +bool
>> +default y
>> +depends on VFIO
>> diff --git a/backends/meson.build b/backends/meson.build
>> index 8b2b111497..2e975d641e 100644
>> --- a/backends/meson.build
>> +++ b/backends/meson.build
>> @@ -25,6 +25,7 @@ if 

RE: [PATCH] vfio: container: Fix missing allocation of VFIOSpaprContainer

2024-05-09 Thread Duan, Zhenzhong


>-Original Message-
>From: Shivaprasad G Bhat 
>Subject: [PATCH] vfio: container: Fix missing allocation of
>VFIOSpaprContainer
>
>The commit 6ad359ec29 "(vfio/spapr: Move prereg_listener into
>spapr container)" began to use the newly introduced VFIOSpaprContainer
>structure.
>
>After several refactors, today the container_of(container,
>VFIOSpaprContainer, ABC) is used when VFIOSpaprContainer is actually
>not allocated. On PPC64 systems, this dereference is leading to corruption
>showing up as glibc malloc assertion during guest start when using vfio.
>
>Patch adds the missing allocation while also making the structure movement
>to vfio common header file.
>
>Fixes: 6ad359ec29 "(vfio/spapr: Move prereg_listener into spapr container)"
>Signed-off-by: Shivaprasad G Bhat 

Reviewed-by: Zhenzhong Duan 

An alternative is to introduce VFIOIOMMUClass::create or
VFIOIOMMUClass::get_container_size, but that needs some refactoring
of vfio_connect_container().

Thanks
Zhenzhong
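
A rough sketch of the get_container_size idea, assuming a per-backend hook that tells the common code how much to allocate so that container_of() on the embedded base struct always stays inside the allocation. Every Demo* name below is hypothetical; this is not the existing VFIOIOMMUClass API, only an illustration of the layout problem the patch fixes.

```c
#include <stddef.h>
#include <stdlib.h>

#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct DemoContainer {
    int fd;
} DemoContainer;

/* Backend-specific container embedding the base one as first member. */
typedef struct DemoSpaprContainer {
    DemoContainer container;
    int prereg_state;
} DemoSpaprContainer;

/* Hypothetical per-backend hook reporting the full allocation size. */
typedef struct DemoIOMMUClass {
    size_t (*get_container_size)(void);
} DemoIOMMUClass;

static size_t demo_spapr_container_size(void)
{
    return sizeof(DemoSpaprContainer);
}

/* Common code allocates whatever the backend asks for. */
static DemoContainer *demo_container_new(const DemoIOMMUClass *klass)
{
    /* The base struct is the first member, so the pointers coincide. */
    return calloc(1, klass->get_container_size());
}

/* Returns 1 if the backend fields are reachable and usable. */
static int demo_roundtrip(void)
{
    DemoIOMMUClass spapr = { demo_spapr_container_size };
    DemoContainer *c = demo_container_new(&spapr);
    DemoSpaprContainer *sc;
    int ok;

    if (!c) {
        return 0;
    }
    sc = demo_container_of(c, DemoSpaprContainer, container);
    sc->prereg_state = 42; /* safe only because the full struct was allocated */
    ok = ((void *)sc == (void *)c) && sc->prereg_state == 42;
    free(c);
    return ok;
}
```

If the common code allocated only sizeof(DemoContainer), the write to prereg_state would land past the end of the allocation, which matches the heap corruption described in the commit message.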

>---
> hw/vfio/container.c   |6 --
> hw/vfio/spapr.c   |6 --
> include/hw/vfio/vfio-common.h |6 ++
> 3 files changed, 10 insertions(+), 8 deletions(-)
>
>diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>index 77bdec276e..ecaf5786d9 100644
>--- a/hw/vfio/container.c
>+++ b/hw/vfio/container.c
>@@ -539,6 +539,7 @@ static int vfio_connect_container(VFIOGroup *group,
>AddressSpace *as,
> {
> VFIOContainer *container;
> VFIOContainerBase *bcontainer;
>+VFIOSpaprContainer *scontainer;
> int ret, fd;
> VFIOAddressSpace *space;
>
>@@ -611,7 +612,8 @@ static int vfio_connect_container(VFIOGroup *group,
>AddressSpace *as,
> goto close_fd_exit;
> }
>
>-container = g_malloc0(sizeof(*container));
>+scontainer = g_malloc0(sizeof(*scontainer));
>+container = &scontainer->container;
> container->fd = fd;
> bcontainer = &container->bcontainer;
>
>@@ -675,7 +677,7 @@ unregister_container_exit:
> vfio_cpr_unregister_container(bcontainer);
>
> free_container_exit:
>-g_free(container);
>+g_free(scontainer);
>
> close_fd_exit:
> close(fd);
>diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
>index 0d949bb728..78d218b7e7 100644
>--- a/hw/vfio/spapr.c
>+++ b/hw/vfio/spapr.c
>@@ -24,12 +24,6 @@
> #include "qapi/error.h"
> #include "trace.h"
>
>-typedef struct VFIOSpaprContainer {
>-VFIOContainer container;
>-MemoryListener prereg_listener;
>-QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>-} VFIOSpaprContainer;
>-
> static bool vfio_prereg_listener_skipped_section(MemoryRegionSection
>*section)
> {
> if (memory_region_is_iommu(section->mr)) {
>diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>index b9da6c08ef..010fa68ac6 100644
>--- a/include/hw/vfio/vfio-common.h
>+++ b/include/hw/vfio/vfio-common.h
>@@ -82,6 +82,12 @@ typedef struct VFIOContainer {
> QLIST_HEAD(, VFIOGroup) group_list;
> } VFIOContainer;
>
>+typedef struct VFIOSpaprContainer {
>+VFIOContainer container;
>+MemoryListener prereg_listener;
>+QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>+} VFIOSpaprContainer;
>+
> typedef struct VFIOHostDMAWindow {
> hwaddr min_iova;
> hwaddr max_iova;
>



RE: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU

2024-05-08 Thread Duan, Zhenzhong



>-Original Message-
>From: Jason Gunthorpe 
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On Tue, May 07, 2024 at 02:24:30AM +0000, Duan, Zhenzhong wrote:
>> >On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
>> >
>> >> I'm not clear how useful multiple iommufd instances support are.
>> >> One possible benefit is for security? It may bring a slightly fine-grained
>> >> isolation in kernel.
>> >
>> >No. I don't think there is any usecase, it is only harmful.
>>
>> OK, so we need to limit QEMU to only one iommufd instance.
>
>I don't know about limit, but you don't need to do extra stuff to make
>it work.
>
>The main issue will be to get all the viommu instances to share the
>same iommufd IOAS for the guest physical mapping. Otherwise each
>viommu should be largely unaware of the others sharing (or not) a
>iommufd.

I see.

>
>If you can structure things properly it probably doesn't need a hard
>limit, it will just work worse.

OK, thanks for clarifying.
The extra code to support multiple instances in intel_iommu is trivial,
so I'd like to keep this flexibility for the user, just like with cdev. Users can
easily configure the QEMU command line to use a single IOMMUFD instance whenever they want.

Thanks
Zhenzhong



RE: [PATCH v4 19/19] intel_iommu: Check compatibility with host IOMMU capabilities

2024-05-08 Thread Duan, Zhenzhong
Hi Clement,

See inline.

From: CLEMENT MATHIEU--DRIF 
Sent: Tuesday, May 7, 2024 7:40 PM
To: Duan, Zhenzhong ; qemu-devel@nongnu.org
Cc: alex.william...@redhat.com; c...@redhat.com; eric.au...@redhat.com; 
m...@redhat.com; pet...@redhat.com; jasow...@redhat.com; j...@nvidia.com; 
nicol...@nvidia.com; joao.m.mart...@oracle.com; Tian, Kevin 
; Liu, Yi L ; Peng, Chao P 
; Marcel Apfelbaum ; Paolo 
Bonzini ; Richard Henderson 
; Eduardo Habkost 
Subject: Re: [PATCH v4 19/19] intel_iommu: Check compatibility with host IOMMU 
capabilities


Hi Zhenzhong,
On 07/05/2024 11:20, Zhenzhong Duan wrote:






If check fails, host device (either VFIO or VDPA device) is not
compatible with current vIOMMU config and should not be passed to
guest.

Only aw_bits is checked for now, we don't care other capabilities
before scalable modern mode is introduced.

Signed-off-by: Yi Liu <yi.l@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com>
---
 hw/i386/intel_iommu.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 747c988bc4..146fde23fc 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -35,6 +35,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/dma.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/host_iommu_device.h"
 #include "hw/i386/apic_internal.h"
 #include "kvm/kvm_i386.h"
 #include "migration/vmstate.h"
@@ -3819,6 +3820,25 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
 return vtd_dev_as;
 }

+static bool vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice *vtd_hdev,
+   Error **errp)
+{
+HostIOMMUDevice *hiod = vtd_hdev->dev;

Why not pass the hiod pointer as a parameter directly? Maybe you have
something in mind for a future patch?

[Zhenzhong] Yes, I have the change below in commit
https://github.com/yiliu1765/qemu/commit/7a8219b040b44efe7a828e130bdf5ccc2dddb4d0

ret = host_iommu_device_get_cap(hiod, HOST_IOMMU_DEVICE_CAP_ERRATA, errp);
if (ret < 0) {
    return false;
}
vtd_hdev->errata = ret;

Thanks
Zhenzhong

It would allow us to allocate the VTDHostIOMMUDevice later in
vtd_dev_set_iommu_device.



+int ret;
+
+/* Common checks */
+ret = host_iommu_device_get_cap(hiod, HOST_IOMMU_DEVICE_CAP_AW_BITS, errp);
+if (ret < 0) {
+return false;
+}
+if (s->aw_bits > ret) {
+error_setg(errp, "aw-bits %d > host aw-bits %d", s->aw_bits, ret);
+return false;
+}
+
+return true;
+}
+
 static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
  HostIOMMUDevice *hiod, Error **errp)
 {
@@ -3848,6 +3868,12 @@ static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
 vtd_hdev->iommu_state = s;
 vtd_hdev->dev = hiod;

+if (!vtd_check_hdev(s, vtd_hdev, errp)) {
+g_free(vtd_hdev);
+vtd_iommu_unlock(s);
+return false;
+}
+
 new_key = g_malloc(sizeof(*new_key));
 new_key->bus = bus;
 new_key->devfn = devfn;
--
2.34.1




RE: [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()

2024-05-08 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 15/19] hw/pci: Introduce
>pci_device_[set|unset]_iommu_device()
>
>On 5/7/24 09:48, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v3 15/19] hw/pci: Introduce
>>> pci_device_[set|unset]_iommu_device()
>>>
>>> Hello Zhenzhong,
>>>
>>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>>> From: Yi Liu 
>>>>
>>>> pci_device_[set|unset]_iommu_device() call
>>> pci_device_get_iommu_bus_devfn()
>>>> to get iommu_bus->iommu_ops and call [set|unset]_iommu_device
>>> callback to
>>>> set/unset HostIOMMUDevice for a given PCI device.
>>>>
>>>> Signed-off-by: Yi Liu 
>>>> Signed-off-by: Yi Sun 
>>>> Signed-off-by: Nicolin Chen 
>>>> Signed-off-by: Zhenzhong Duan 
>>>> ---
>>>>include/hw/pci/pci.h | 38
>>> +-
>>>>hw/pci/pci.c | 27 +++
>>>>2 files changed, 64 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>>>> index eaa3fc99d8..849e391813 100644
>>>> --- a/include/hw/pci/pci.h
>>>> +++ b/include/hw/pci/pci.h
>>>> @@ -3,6 +3,7 @@
>>>>
>>>>#include "exec/memory.h"
>>>>#include "sysemu/dma.h"
>>>> +#include "sysemu/host_iommu_device.h"
>>>
>>> This include directive pulls a Linux header file 
>>> which doesn't exist on all platforms, such as windows and it breaks
>>> compile. So,
>>>
>>>>
>>>>/* PCI includes legacy ISA access.  */
>>>>#include "hw/isa/isa.h"
>>>> @@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
>>>> *
>>>> * @devfn: device and function number
>>>> */
>>>> -   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>>> devfn);
>>>> +AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque,
>int
>>> devfn);
>>>> +/**
>>>> + * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
>>>> + *
>>>> + * Optional callback, if not implemented in vIOMMU, then vIOMMU
>>> can't
>>>> + * retrieve host information from the associated HostIOMMUDevice.
>>>> + *
>>>> + * @bus: the #PCIBus of the PCI device.
>>>> + *
>>>> + * @opaque: the data passed to pci_setup_iommu().
>>>> + *
>>>> + * @devfn: device and function number of the PCI device.
>>>> + *
>>>> + * @dev: the data structure representing host IOMMU device.
>>>> + *
>>>> + * @errp: pass an Error out only when return false
>>>> + *
>>>> + * Returns: 0 if HostIOMMUDevice is attached, or else <0 with errp
>set.
>>>> + */
>>>> +int (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
>>>> +HostIOMMUDevice *dev, Error **errp);
>>>> +/**
>>>> + * @unset_iommu_device: detach a HostIOMMUDevice from a
>>> vIOMMU
>>>> + *
>>>> + * Optional callback.
>>>> + *
>>>> + * @bus: the #PCIBus of the PCI device.
>>>> + *
>>>> + * @opaque: the data passed to pci_setup_iommu().
>>>> + *
>>>> + * @devfn: device and function number of the PCI device.
>>>> + */
>>>> +void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
>>>>} PCIIOMMUOps;
>>>>
>>>>AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>>>> +int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice
>>> *hiod,
>>>> +Error **errp);
>>>
>>> please include a forward declaration for HostIOMMUDevice instead.
>>
>> Got it, will do.
>> Maybe using iommu_hw_info_type in
>include/sysemu/host_iommu_device.h
>> isn't a good idea from start.
>
>It is not indeed since some included files are used on the Windows build.

The Windows build passes after the change below, which drops iommu_hw_info.
I'll run more build tests and then send v5; sorry for the trouble.

RE: [PATCH v4 04/19] vfio/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO device

2024-05-08 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v4 04/19] vfio/iommufd: Introduce
>TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO device
>
>On 5/7/24 11:20, Zhenzhong Duan wrote:
>> TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO represents a host IOMMU
>device under
>> VFIO iommufd backend. It will be created during VFIO device attaching
>> and passed to vIOMMU.
>>
>> It will have its own .realize() implementation.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h | 3 +++
>>   hw/vfio/iommufd.c | 5 -
>>   2 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index 05a199ce65..affb73f209 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -32,6 +32,7 @@
>>   #include "sysemu/sysemu.h"
>>   #include "hw/vfio/vfio-container-base.h"
>>   #include "sysemu/host_iommu_device.h"
>> +#include "sysemu/iommufd.h"
>
>I don't think you need this include.

Yes, maybe not right now. The include is there because
TYPE_HOST_IOMMU_DEVICE_IOMMUFD is needed:

#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
    TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"

which could be replaced with:

#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
    TYPE_HOST_IOMMU_DEVICE "-iommufd-vfio"


But the main user is the nesting series.
The structure below references HostIOMMUDeviceIOMMUFD, which is defined in
sysemu/iommufd.h:

struct HostIOMMUDeviceIOMMUFDVFIO {
HostIOMMUDeviceIOMMUFD parent_obj;

VFIODevice *vdev;
};

Thanks
Zhenzhong
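
To make the two points above concrete, here is a minimal standalone sketch. It assumes TYPE_HOST_IOMMU_DEVICE_IOMMUFD expands to TYPE_HOST_IOMMU_DEVICE "-iommufd" (inferred from the two macro variants above, not checked against the tree), and the Demo* structures are simplified stand-ins, not the real QOM types. It shows that both macro variants yield the same type name, while embedding the parent by value still needs the parent's full definition.

```c
#include <stddef.h>
#include <string.h>

/* Base type name as in the quoted patch. */
#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
/* Assumption: the IOMMUFD type name is the base name plus "-iommufd". */
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"

/* Variant A: built on the IOMMUFD type name (needs the header defining it). */
#define DEMO_TYPE_VARIANT_A TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"
/* Variant B: built on the base type name only. */
#define DEMO_TYPE_VARIANT_B TYPE_HOST_IOMMU_DEVICE "-iommufd-vfio"

/* Hypothetical, simplified stand-ins for the QOM structures. */
typedef struct DemoHostIOMMUDevice {
    int dummy;
} DemoHostIOMMUDevice;

typedef struct DemoHostIOMMUDeviceIOMMUFD {
    DemoHostIOMMUDevice parent_obj;
    int devid;
} DemoHostIOMMUDeviceIOMMUFD;

typedef struct DemoHostIOMMUDeviceIOMMUFDVFIO {
    /* Embedded by value, so the full parent definition must be visible. */
    DemoHostIOMMUDeviceIOMMUFD parent_obj;
    void *vdev;
} DemoHostIOMMUDeviceIOMMUFDVFIO;

/* Adjacent string literals are concatenated at compile time. */
int demo_type_names_match(void)
{
    return strcmp(DEMO_TYPE_VARIANT_A, DEMO_TYPE_VARIANT_B) == 0;
}

/* The parent sits at offset 0, which is what makes up-casts cheap. */
size_t demo_parent_offset(void)
{
    return offsetof(DemoHostIOMMUDeviceIOMMUFDVFIO, parent_obj);
}
```

So the macro alone would not force the include, but the structure in the nesting series would.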


RE: [PATCH v4 01/19] backends: Introduce HostIOMMUDevice abstract

2024-05-07 Thread Duan, Zhenzhong
Hi Philippe,

>-Original Message-
>From: Philippe Mathieu-Daudé 
>Subject: Re: [PATCH v4 01/19] backends: Introduce HostIOMMUDevice
>abstract
>
>Hi Zhenzhong,
>
>On 7/5/24 11:20, Zhenzhong Duan wrote:
>> Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
>>
>> Introduce .realize() to initialize HostIOMMUDevice further after
>> instance init.
>>
>> Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
>> for VFIO, and VDPA in the future.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   MAINTAINERS|  2 ++
>>   include/sysemu/host_iommu_device.h | 51
>++
>>   backends/host_iommu_device.c   | 30 ++
>>   backends/Kconfig   |  5 +++
>>   backends/meson.build   |  1 +
>>   5 files changed, 89 insertions(+)
>>   create mode 100644 include/sysemu/host_iommu_device.h
>>   create mode 100644 backends/host_iommu_device.c
>
>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> new file mode 100644
>> index 00..2b58a94d62
>> --- /dev/null
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -0,0 +1,51 @@
>> +/*
>> + * Host IOMMU device abstract declaration
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + *
>> + * Authors: Zhenzhong Duan 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef HOST_IOMMU_DEVICE_H
>> +#define HOST_IOMMU_DEVICE_H
>> +
>> +#include "qom/object.h"
>> +#include "qapi/error.h"
>> +
>> +#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>> +OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>HOST_IOMMU_DEVICE)
>> +
>> +struct HostIOMMUDevice {
>> +Object parent_obj;
>> +};
>> +
>> +/**
>> + * struct HostIOMMUDeviceClass - The base class for all host IOMMU
>devices.
>> + *
>> + * Different type of host devices (e.g., VFIO or VDPA device) or devices
>> + * with different backend (e.g., VFIO legacy container or IOMMUFD
>backend)
>> + * can have different sub-classes.
>> + */
>> +struct HostIOMMUDeviceClass {
>> +ObjectClass parent_class;
>> +
>> +/**
>> + * @realize: initialize host IOMMU device instance further.
>> + *
>> + * Mandatory callback.
>> + *
>> + * @hiod: pointer to a host IOMMU device instance.
>> + *
>> + * @opaque: pointer to agent device of this host IOMMU device,
>> + *  i.e., for VFIO, pointer to VFIODevice
>> + *
>> + * @errp: pass an Error out when realize fails.
>> + *
>> + * Returns: true on success, false on failure.
>> + */
>> +bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
>> +};
>> +#endif
>> diff --git a/backends/host_iommu_device.c
>b/backends/host_iommu_device.c
>> new file mode 100644
>> index 00..41f2fdce20
>> --- /dev/null
>> +++ b/backends/host_iommu_device.c
>> @@ -0,0 +1,30 @@
>> +/*
>> + * Host IOMMU device abstract
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + *
>> + * Authors: Zhenzhong Duan 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "sysemu/host_iommu_device.h"
>> +
>> +OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
>> +host_iommu_device,
>> +HOST_IOMMU_DEVICE,
>> +OBJECT)
>> +
>> +static void host_iommu_device_class_init(ObjectClass *oc, void *data)
>> +{
>> +}
>> +
>> +static void host_iommu_device_init(Object *obj)
>> +{
>> +}
>> +
>> +static void host_iommu_device_finalize(Object *obj)
>> +{
>> +}
>
>All these stubs call for a QOM interface design instead
>of inheritance IMHO. See INTERFACE_CHECK in "qom/object.h".
>But maybe I misunderstood this series :)

Thanks for your suggestion. I guess you mean introducing an interface
that contains .realize() and .get_cap(), and letting HostIOMMUDevice or its
sub-classes implement that interface.

.realize() and .get_cap() are not common handlers; they are only used by the
host IOMMU device QOM type, and the same would be true of such an interface.
So it looks like over-design. We can let HostIOMMUDevice or its sub-classes
define and implement those handlers directly.

Thanks
Zhenzhong
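
For illustration only, the "handlers defined directly in the class" approach can be sketched in plain C as below. This does not use QOM; every Demo* name is hypothetical, and the realize signature is simplified (no Error ** argument).

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoHiod DemoHiod;

/* Class structure carrying the handler directly, without a separate interface. */
typedef struct DemoHiodClass {
    const char *name;
    bool (*realize)(DemoHiod *hiod, void *opaque);
} DemoHiodClass;

struct DemoHiod {
    const DemoHiodClass *klass;
    int aw_bits;
};

/* A sub-class style implementation: opaque points to the address width. */
static bool demo_vfio_realize(DemoHiod *hiod, void *opaque)
{
    if (!opaque) {
        return false;
    }
    hiod->aw_bits = *(int *)opaque;
    return true;
}

static const DemoHiodClass demo_vfio_class = {
    .name = "demo-hiod-vfio",
    .realize = demo_vfio_realize,
};

/* Wrapper that dispatches through the class, failing if no handler is set. */
bool demo_hiod_realize(DemoHiod *hiod, void *opaque)
{
    if (!hiod->klass || !hiod->klass->realize) {
        return false;
    }
    return hiod->klass->realize(hiod, opaque);
}

/* Convenience helper: returns the realized width, or -1 on failure. */
int demo_realize_with_width(int width)
{
    DemoHiod hiod = { .klass = &demo_vfio_class, .aw_bits = 0 };

    if (!demo_hiod_realize(&hiod, &width)) {
        return -1;
    }
    return hiod.aw_bits;
}
```

An interface would add one more level of indirection for the same two handlers, which is the over-design concern mentioned above.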


RE: [PATCH v3 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()

2024-05-07 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 15/19] hw/pci: Introduce
>pci_device_[set|unset]_iommu_device()
>
>Hello Zhenzhong,
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> From: Yi Liu 
>>
>> pci_device_[set|unset]_iommu_device() call
>pci_device_get_iommu_bus_devfn()
>> to get iommu_bus->iommu_ops and call [set|unset]_iommu_device
>callback to
>> set/unset HostIOMMUDevice for a given PCI device.
>>
>> Signed-off-by: Yi Liu 
>> Signed-off-by: Yi Sun 
>> Signed-off-by: Nicolin Chen 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/pci/pci.h | 38
>+-
>>   hw/pci/pci.c | 27 +++
>>   2 files changed, 64 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>> index eaa3fc99d8..849e391813 100644
>> --- a/include/hw/pci/pci.h
>> +++ b/include/hw/pci/pci.h
>> @@ -3,6 +3,7 @@
>>
>>   #include "exec/memory.h"
>>   #include "sysemu/dma.h"
>> +#include "sysemu/host_iommu_device.h"
>
>This include directive pulls in a Linux header file, <linux/iommufd.h>,
>which doesn't exist on all platforms, such as Windows, and it breaks the
>build. So,
>
>>
>>   /* PCI includes legacy ISA access.  */
>>   #include "hw/isa/isa.h"
>> @@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
>>*
>>* @devfn: device and function number
>>*/
>> -   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>devfn);
>> +AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>devfn);
>> +/**
>> + * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
>> + *
>> + * Optional callback, if not implemented in vIOMMU, then vIOMMU
>can't
>> + * retrieve host information from the associated HostIOMMUDevice.
>> + *
>> + * @bus: the #PCIBus of the PCI device.
>> + *
>> + * @opaque: the data passed to pci_setup_iommu().
>> + *
>> + * @devfn: device and function number of the PCI device.
>> + *
>> + * @dev: the data structure representing host IOMMU device.
>> + *
>> + * @errp: pass an Error out only when return false
>> + *
>> + * Returns: 0 if HostIOMMUDevice is attached, or else <0 with errp set.
>> + */
>> +int (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
>> +HostIOMMUDevice *dev, Error **errp);
>> +/**
>> + * @unset_iommu_device: detach a HostIOMMUDevice from a
>vIOMMU
>> + *
>> + * Optional callback.
>> + *
>> + * @bus: the #PCIBus of the PCI device.
>> + *
>> + * @opaque: the data passed to pci_setup_iommu().
>> + *
>> + * @devfn: device and function number of the PCI device.
>> + */
>> +void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
>>   } PCIIOMMUOps;
>>
>>   AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>> +int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice
>*hiod,
>> +Error **errp);
>
>please include a forward declaration for HostIOMMUDevice instead.

Got it, will do.
Maybe using iommu_hw_info_type in include/sysemu/host_iommu_device.h
wasn't a good idea from the start.

Thanks
Zhenzhong
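
A rough sketch of the forward-declaration approach suggested above, as standalone C. The Demo* names are hypothetical, and treating a missing optional callback as success is an assumption for this sketch, not necessarily what the final patch does.

```c
#include <stddef.h>

/*
 * Forward declaration: enough to declare pointers to the type, without
 * including the header that defines it (and whatever that header pulls in).
 */
typedef struct HostIOMMUDevice HostIOMMUDevice;

/* Hypothetical, reduced ops table with optional callbacks. */
typedef struct DemoIOMMUOps {
    int (*set_iommu_device)(void *opaque, int devfn, HostIOMMUDevice *dev);
    void (*unset_iommu_device)(void *opaque, int devfn);
} DemoIOMMUOps;

/* Wrapper: call the callback only if the vIOMMU implements it. */
int demo_set_iommu_device(const DemoIOMMUOps *ops, void *opaque, int devfn,
                          HostIOMMUDevice *dev)
{
    if (ops && ops->set_iommu_device) {
        return ops->set_iommu_device(opaque, devfn, dev);
    }
    return 0; /* assumption: nothing to do counts as success */
}

/* Example callback that only records which devfn it was called for. */
static int demo_record_devfn(void *opaque, int devfn, HostIOMMUDevice *dev)
{
    (void)dev; /* never dereferenced, so the incomplete type is fine */
    *(int *)opaque = devfn;
    return 0;
}
```

Only code that dereferences HostIOMMUDevice needs the full definition, so pci.h itself can stay free of the Linux-only include.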



RE: [PATCH v2 03/11] vfio: Make VFIOIOMMUClass::attach_device() and its wrapper return bool

2024-05-07 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 03/11] vfio: Make VFIOIOMMUClass::attach_device()
>and its wrapper return bool
>
>On 5/7/24 08:42, Zhenzhong Duan wrote:
>> Make VFIOIOMMUClass::attach_device() and its wrapper function
>> vfio_attach_device() return bool.
>>
>> This is to follow the coding standard to return bool if 'Error **'
>> is used to pass error.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h |  4 ++--
>>   include/hw/vfio/vfio-container-base.h |  4 ++--
>>   hw/vfio/ap.c  |  6 ++
>>   hw/vfio/ccw.c |  6 ++
>>   hw/vfio/common.c  |  4 ++--
>>   hw/vfio/container.c   | 14 +++---
>>   hw/vfio/iommufd.c | 11 +--
>>   hw/vfio/pci.c |  5 ++---
>>   hw/vfio/platform.c|  7 +++
>>   9 files changed, 27 insertions(+), 34 deletions(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index b9da6c08ef..a7b6fc8f46 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -198,8 +198,8 @@ void vfio_region_exit(VFIORegion *region);
>>   void vfio_region_finalize(VFIORegion *region);
>>   void vfio_reset_handler(void *opaque);
>>   struct vfio_device_info *vfio_get_device_info(int fd);
>> -int vfio_attach_device(char *name, VFIODevice *vbasedev,
>> -   AddressSpace *as, Error **errp);
>> +bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>> +AddressSpace *as, Error **errp);
>>   void vfio_detach_device(VFIODevice *vbasedev);
>>
>>   int vfio_kvm_device_add_fd(int fd, Error **errp);
>> diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-
>container-base.h
>> index 3582d5f97a..c839cfd9cb 100644
>> --- a/include/hw/vfio/vfio-container-base.h
>> +++ b/include/hw/vfio/vfio-container-base.h
>> @@ -118,8 +118,8 @@ struct VFIOIOMMUClass {
>>   int (*dma_unmap)(const VFIOContainerBase *bcontainer,
>>hwaddr iova, ram_addr_t size,
>>IOMMUTLBEntry *iotlb);
>> -int (*attach_device)(const char *name, VFIODevice *vbasedev,
>> - AddressSpace *as, Error **errp);
>> +bool (*attach_device)(const char *name, VFIODevice *vbasedev,
>> +  AddressSpace *as, Error **errp);
>>   void (*detach_device)(VFIODevice *vbasedev);
>>   /* migration feature */
>>   int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
>> diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
>> index 7c4caa5938..d50600b702 100644
>> --- a/hw/vfio/ap.c
>> +++ b/hw/vfio/ap.c
>> @@ -156,7 +156,6 @@ static void
>vfio_ap_unregister_irq_notifier(VFIOAPDevice *vapdev,
>>   static void vfio_ap_realize(DeviceState *dev, Error **errp)
>>   {
>>   ERRP_GUARD();
>> -int ret;
>>   Error *err = NULL;
>>   VFIOAPDevice *vapdev = VFIO_AP_DEVICE(dev);
>>   VFIODevice *vbasedev = &vapdev->vdev;
>> @@ -165,9 +164,8 @@ static void vfio_ap_realize(DeviceState *dev, Error
>**errp)
>>   return;
>>   }
>>
>> -ret = vfio_attach_device(vbasedev->name, vbasedev,
>> - &address_space_memory, errp);
>> -if (ret) {
>> +if (!vfio_attach_device(vbasedev->name, vbasedev,
>> +&address_space_memory, errp)) {
>>   goto error;
>>   }
>>
>> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
>> index 90e4a53437..782bd4bed7 100644
>> --- a/hw/vfio/ccw.c
>> +++ b/hw/vfio/ccw.c
>> @@ -580,7 +580,6 @@ static void vfio_ccw_realize(DeviceState *dev,
>Error **errp)
>>   S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(cdev);
>>   VFIODevice *vbasedev = &vcdev->vdev;
>>   Error *err = NULL;
>> -int ret;
>>
>>   /* Call the class init function for subchannel. */
>>   if (cdc->realize) {
>> @@ -594,9 +593,8 @@ static void vfio_ccw_realize(DeviceState *dev,
>Error **errp)
>>   return;
>>   }
>>
>> -ret = vfio_attach_device(cdev->mdevid, vbasedev,
>> - &address_space_memory, errp);
>> -if (ret) {
>> +if (!vfio_attach_device(cdev->mdevid, vbasedev,
>> +&address_space_memory, errp)) {
>>   goto out_attach_dev_err;
>>   }
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 8f9cbdc026..890d30910e 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1492,8 +1492,8 @@ retry:
>>   return info;
>>   }
>>
>> -int vfio_attach_device(char *name, VFIODevice *vbasedev,
>> -   AddressSpace *as, Error **errp)
>> +bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>> +AddressSpace *as, Error **errp)
>>   {
>>   const VFIOIOMMUClass *ops =
>>
>VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>
>This is still 

RE: [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps

2024-05-07 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 05/19] backends/host_iommu_device: Introduce
>HostIOMMUDeviceCaps
>
>Hello Zhenzhong,
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
>> Different platform IOMMU can support different elements.
>>
>> Currently only two elements, type and aw_bits, type hints the host
>> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
>> host IOMMU address width.
>>
>> Introduce .check_cap() handler to check if
>HOST_IOMMU_DEVICE_CAP_XXX
>> is supported.
>>
>> Introduce a HostIOMMUDevice API host_iommu_device_check_cap()
>which
>> is a wrapper of .check_cap().
>>
>> Introduce a HostIOMMUDevice API
>host_iommu_device_check_cap_common()
>> to check common capabilities of different host platform IOMMUs.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/sysemu/host_iommu_device.h | 44
>++
>>   backends/host_iommu_device.c   | 29 
>>   2 files changed, 73 insertions(+)
>>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> index 2b58a94d62..12b6afb463 100644
>> --- a/include/sysemu/host_iommu_device.h
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -14,12 +14,27 @@
>>
>>   #include "qom/object.h"
>>   #include "qapi/error.h"
>> +#include "linux/iommufd.h"
>
>
>Please use instead :
>
>#include <linux/iommufd.h>

Got it.

Thanks
Zhenzhong


RE: [PATCH 1/3] vfio: Make VFIOIOMMUClass::attach_device() and its wrapper return bool

2024-05-07 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH 1/3] vfio: Make VFIOIOMMUClass::attach_device() and
>its wrapper return bool
>
>On 5/7/24 04:09, Duan, Zhenzhong wrote:
>>
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH 1/3] vfio: Make VFIOIOMMUClass::attach_device()
>and
>>> its wrapper return bool
>>>
>>> On 5/6/24 10:33, Zhenzhong Duan wrote:
>>>> Make VFIOIOMMUClass::attach_device() and its wrapper function
>>>> vfio_attach_device() return bool.
>>>>
>>>> This is to follow the coding standard to return bool if 'Error **'
>>>> is used to pass error.
>>>>
>>>> Suggested-by: Cédric Le Goater 
>>>> Signed-off-by: Zhenzhong Duan 
>>>> ---
>>>>include/hw/vfio/vfio-common.h |  4 ++--
>>>>include/hw/vfio/vfio-container-base.h |  4 ++--
>>>>hw/vfio/ap.c  |  6 ++
>>>>hw/vfio/ccw.c |  6 ++
>>>>hw/vfio/common.c  |  4 ++--
>>>>hw/vfio/container.c   | 14 +++---
>>>>hw/vfio/iommufd.c | 11 +--
>>>>hw/vfio/pci.c |  8 +++-
>>>>hw/vfio/platform.c|  7 +++
>>>>9 files changed, 28 insertions(+), 36 deletions(-)
>>>>
>>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>>> common.h
>>>> index b9da6c08ef..a7b6fc8f46 100644
>>>> --- a/include/hw/vfio/vfio-common.h
>>>> +++ b/include/hw/vfio/vfio-common.h
>>>> @@ -198,8 +198,8 @@ void vfio_region_exit(VFIORegion *region);
>>>>void vfio_region_finalize(VFIORegion *region);
>>>>void vfio_reset_handler(void *opaque);
>>>>struct vfio_device_info *vfio_get_device_info(int fd);
>>>> -int vfio_attach_device(char *name, VFIODevice *vbasedev,
>>>> -   AddressSpace *as, Error **errp);
>>>> +bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>>>> +AddressSpace *as, Error **errp);
>>>>void vfio_detach_device(VFIODevice *vbasedev);
>>>>
>>>>int vfio_kvm_device_add_fd(int fd, Error **errp);
>>>> diff --git a/include/hw/vfio/vfio-container-base.h
>b/include/hw/vfio/vfio-
>>> container-base.h
>>>> index 3582d5f97a..c839cfd9cb 100644
>>>> --- a/include/hw/vfio/vfio-container-base.h
>>>> +++ b/include/hw/vfio/vfio-container-base.h
>>>> @@ -118,8 +118,8 @@ struct VFIOIOMMUClass {
>>>>int (*dma_unmap)(const VFIOContainerBase *bcontainer,
>>>> hwaddr iova, ram_addr_t size,
>>>> IOMMUTLBEntry *iotlb);
>>>> -int (*attach_device)(const char *name, VFIODevice *vbasedev,
>>>> - AddressSpace *as, Error **errp);
>>>> +bool (*attach_device)(const char *name, VFIODevice *vbasedev,
>>>> +  AddressSpace *as, Error **errp);
>>>>void (*detach_device)(VFIODevice *vbasedev);
>>>>/* migration feature */
>>>>int (*set_dirty_page_tracking)(const VFIOContainerBase
>*bcontainer,
>>>> diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
>>>> index 7c4caa5938..d50600b702 100644
>>>> --- a/hw/vfio/ap.c
>>>> +++ b/hw/vfio/ap.c
>>>> @@ -156,7 +156,6 @@ static void
>>> vfio_ap_unregister_irq_notifier(VFIOAPDevice *vapdev,
>>>>static void vfio_ap_realize(DeviceState *dev, Error **errp)
>>>>{
>>>>ERRP_GUARD();
>>>> -int ret;
>>>>Error *err = NULL;
>>>>VFIOAPDevice *vapdev = VFIO_AP_DEVICE(dev);
>>>>VFIODevice *vbasedev = &vapdev->vdev;
>>>> @@ -165,9 +164,8 @@ static void vfio_ap_realize(DeviceState *dev,
>Error
>>> **errp)
>>>>return;
>>>>}
>>>>
>>>> -ret = vfio_attach_device(vbasedev->name, vbasedev,
>>>> - &address_space_memory, errp);
>>>> -if (ret) {
>>>> +if (!vfio_attach_device(vbasedev->name, vbasedev,
>>>> +&address_space_memory, errp)) {
>>>>goto error;
>>>>}
>>>>
>>>> 
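
The convention the quoted commit message refers to (return bool when an 'Error **' argument carries the error) can be sketched in standalone C as follows. DemoError and both functions are hypothetical stand-ins, not the QEMU Error API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for an error object. */
typedef struct DemoError {
    const char *msg;
} DemoError;

static DemoError demo_err_no_name = { "device name is missing" };

/*
 * Returns true on success. On failure returns false and, if errp is
 * non-NULL, stores the error there. The caller only needs to test the
 * boolean; no separate 'ret' variable is required.
 */
bool demo_attach_device(const char *name, DemoError **errp)
{
    if (!name || !*name) {
        if (errp) {
            *errp = &demo_err_no_name;
        }
        return false;
    }
    return true;
}

/* Caller in the style of the converted realize functions. */
const char *demo_realize(const char *name)
{
    DemoError *err = NULL;

    if (!demo_attach_device(name, &err)) {
        return err ? err->msg : "unknown error";
    }
    return NULL; /* NULL means no error */
}
```

With this shape, "if (!attach(...)) goto error;" reads the same at every call site, which is what the conversion in ap.c and ccw.c relies on.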

RE: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU

2024-05-06 Thread Duan, Zhenzhong



>-Original Message-
>From: Jason Gunthorpe 
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On Mon, May 06, 2024 at 02:30:47AM +0000, Duan, Zhenzhong wrote:
>
>> I'm not clear how useful multiple iommufd instances support are.
>> One possible benefit is for security? It may bring a slightly fine-grained
>> isolation in kernel.
>
>No. I don't think there is any usecase, it is only harmful.

OK, so we need to limit QEMU to only one iommufd instance.

In the cdev series, we support a mix of legacy and iommufd backends, as well as
multiple iommufd backend instances, for flexibility.
We need to decide whether to apply this limitation only to the nesting series or
globally (including cdev).
May I ask what harm multiple instances could cause?

Thanks
Zhenzhong



RE: [PATCH 1/3] vfio: Make VFIOIOMMUClass::attach_device() and its wrapper return bool

2024-05-06 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH 1/3] vfio: Make VFIOIOMMUClass::attach_device() and
>its wrapper return bool
>
>On 5/6/24 10:33, Zhenzhong Duan wrote:
>> Make VFIOIOMMUClass::attach_device() and its wrapper function
>> vfio_attach_device() return bool.
>>
>> This is to follow the coding standard to return bool if 'Error **'
>> is used to pass error.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h |  4 ++--
>>   include/hw/vfio/vfio-container-base.h |  4 ++--
>>   hw/vfio/ap.c  |  6 ++
>>   hw/vfio/ccw.c |  6 ++
>>   hw/vfio/common.c  |  4 ++--
>>   hw/vfio/container.c   | 14 +++---
>>   hw/vfio/iommufd.c | 11 +--
>>   hw/vfio/pci.c |  8 +++-
>>   hw/vfio/platform.c|  7 +++
>>   9 files changed, 28 insertions(+), 36 deletions(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index b9da6c08ef..a7b6fc8f46 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -198,8 +198,8 @@ void vfio_region_exit(VFIORegion *region);
>>   void vfio_region_finalize(VFIORegion *region);
>>   void vfio_reset_handler(void *opaque);
>>   struct vfio_device_info *vfio_get_device_info(int fd);
>> -int vfio_attach_device(char *name, VFIODevice *vbasedev,
>> -   AddressSpace *as, Error **errp);
>> +bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>> +AddressSpace *as, Error **errp);
>>   void vfio_detach_device(VFIODevice *vbasedev);
>>
>>   int vfio_kvm_device_add_fd(int fd, Error **errp);
>> diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-
>container-base.h
>> index 3582d5f97a..c839cfd9cb 100644
>> --- a/include/hw/vfio/vfio-container-base.h
>> +++ b/include/hw/vfio/vfio-container-base.h
>> @@ -118,8 +118,8 @@ struct VFIOIOMMUClass {
>>   int (*dma_unmap)(const VFIOContainerBase *bcontainer,
>>hwaddr iova, ram_addr_t size,
>>IOMMUTLBEntry *iotlb);
>> -int (*attach_device)(const char *name, VFIODevice *vbasedev,
>> - AddressSpace *as, Error **errp);
>> +bool (*attach_device)(const char *name, VFIODevice *vbasedev,
>> +  AddressSpace *as, Error **errp);
>>   void (*detach_device)(VFIODevice *vbasedev);
>>   /* migration feature */
>>   int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
>> diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
>> index 7c4caa5938..d50600b702 100644
>> --- a/hw/vfio/ap.c
>> +++ b/hw/vfio/ap.c
>> @@ -156,7 +156,6 @@ static void
>vfio_ap_unregister_irq_notifier(VFIOAPDevice *vapdev,
>>   static void vfio_ap_realize(DeviceState *dev, Error **errp)
>>   {
>>   ERRP_GUARD();
>> -int ret;
>>   Error *err = NULL;
>>   VFIOAPDevice *vapdev = VFIO_AP_DEVICE(dev);
>>   VFIODevice *vbasedev = &vapdev->vdev;
>> @@ -165,9 +164,8 @@ static void vfio_ap_realize(DeviceState *dev, Error
>**errp)
>>   return;
>>   }
>>
>> -ret = vfio_attach_device(vbasedev->name, vbasedev,
>> - &address_space_memory, errp);
>> -if (ret) {
>> +if (!vfio_attach_device(vbasedev->name, vbasedev,
>> +&address_space_memory, errp)) {
>>   goto error;
>>   }
>>
>> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
>> index 90e4a53437..782bd4bed7 100644
>> --- a/hw/vfio/ccw.c
>> +++ b/hw/vfio/ccw.c
>> @@ -580,7 +580,6 @@ static void vfio_ccw_realize(DeviceState *dev,
>Error **errp)
>>   S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(cdev);
>>   VFIODevice *vbasedev = &vcdev->vdev;
>>   Error *err = NULL;
>> -int ret;
>>
>>   /* Call the class init function for subchannel. */
>>   if (cdc->realize) {
>> @@ -594,9 +593,8 @@ static void vfio_ccw_realize(DeviceState *dev,
>Error **errp)
>>   return;
>>   }
>>
>> -ret = vfio_attach_device(cdev->mdevid, vbasedev,
>> - &address_space_memory, errp);
>> -if (ret) {
>> +if (!vfio_attach_device(cdev->mdevid, vbasedev,
>> +&address_space_memory, errp)) {
>>   goto out_attach_dev_err;
>>   }
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 8f9cbdc026..890d30910e 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1492,8 +1492,8 @@ retry:
>>   return info;
>>   }
>>
>> -int vfio_attach_device(char *name, VFIODevice *vbasedev,
>> -   AddressSpace *as, Error **errp)
>> +bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>> +AddressSpace *as, Error **errp)
>>   {
>>   const VFIOIOMMUClass *ops =
>>
>VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>
>
>I think 

RE: [PATCH v3 06/19] range: Introduce range_get_last_bit()

2024-05-06 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
>
>On 4/30/24 11:58, Duan, Zhenzhong wrote:
>>
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
>>>
>>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>>> This helper gets the highest 1 bit position of the upper bound.
>>>>
>>>> If the range is empty or upper bound is zero, -1 is returned.
>>>>
>>>> Suggested-by: Cédric Le Goater 
>>>> Signed-off-by: Zhenzhong Duan 
>>>> ---
>>>>include/qemu/range.h | 11 +++
>>>>1 file changed, 11 insertions(+)
>>>>
>>>> diff --git a/include/qemu/range.h b/include/qemu/range.h
>>>> index 205e1da76d..8e05bc1d9f 100644
>>>> --- a/include/qemu/range.h
>>>> +++ b/include/qemu/range.h
>>>> @@ -20,6 +20,8 @@
>>>>#ifndef QEMU_RANGE_H
>>>>#define QEMU_RANGE_H
>>>>
>>>> +#include "qemu/bitops.h"
>>>> +
>>>>/*
>>>> * Operations on 64 bit address ranges.
>>>> * Notes:
>>>> @@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1,
>>> uint64_t len1,
>>>>return !(last2 < first1 || last1 < first2);
>>>>}
>>>>
>>>> +/* Get highest non-zero bit position of a range */
>>>> +static inline int range_get_last_bit(Range *range)
>>>> +{
>>>> +if (range_is_empty(range) || !range->upb) {
>>>> +return -1;
>>>> +}
>>>> +return find_last_bit(&range->upb, sizeof(range->upb));
>>>
>>> This breaks builds on 32-bit host systems.
>>
>> Oh, I missed 32bit build. Thanks, will fix.
>
>This should provide the same result ?
>
> return 63 - clz64(range->upb);

Yes, I tried both 32-bit and 64-bit builds and it works. I will use it; thanks for the suggestion.

BRs.
Zhenzhong
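
A minimal standalone sketch of the clz-based variant discussed above. DemoRange and its emptiness check are simplified assumptions rather than the real Range from qemu/range.h, and __builtin_clzll (a GCC/Clang builtin) stands in for clz64().

```c
#include <stdint.h>

/* Hypothetical, simplified range: inclusive bounds, empty when lob > upb. */
typedef struct DemoRange {
    uint64_t lob;
    uint64_t upb;
} DemoRange;

static inline int demo_range_is_empty(const DemoRange *range)
{
    return range->lob > range->upb;
}

/*
 * Position of the highest set bit of the upper bound, or -1 if the range
 * is empty or the upper bound is zero. The count-leading-zeros form works
 * on the full 64-bit value regardless of the host word size.
 */
int demo_range_get_last_bit(const DemoRange *range)
{
    if (demo_range_is_empty(range) || !range->upb) {
        return -1;
    }
    /* upb is non-zero here, so the builtin's result is well defined. */
    return 63 - __builtin_clzll(range->upb);
}
```

As far as I understand, find_last_bit() walks an array of unsigned long, which is only 32 bits wide on 32-bit hosts, so passing it a pointer to a uint64_t is what broke the build there.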


RE: [PATCH v3 00/19] Add a host IOMMU device abstraction to check with vIOMMU

2024-05-05 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Sent: Friday, May 3, 2024 10:04 PM
>To: Duan, Zhenzhong ; qemu-
>de...@nongnu.org
>Cc: alex.william...@redhat.com; eric.au...@redhat.com; m...@redhat.com;
>pet...@redhat.com; jasow...@redhat.com; j...@nvidia.com;
>nicol...@nvidia.com; joao.m.mart...@oracle.com; Tian, Kevin
>; Liu, Yi L ; Peng, Chao P
>
>Subject: Re: [PATCH v3 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Hi,
>>
>> The most important change in this version is introducing a common
>> HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new
>interface
>> between vIOMMU and HostIOMMUDevice.
>>
>> HostIOMMUDeviceClass::realize() is introduced to initialize
>> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
>>
>> HostIOMMUDeviceClass::check_cap() is introduced to query host IOMMU
>> device capabilities.
>>
>> After the change, part2 is only 3 patches, so merge it with part1 to be
>> a single prerequisite series, same for changelog. If anyone doesn't like
>> that, I can split again.
>>
>> The class tree is as below:
>>
>>                                  HostIOMMUDevice
>>                                         | .caps
>>                                         | .realize()
>>                                         | .check_cap()
>>                                         |
>>             .-----------------------------------------------------.
>>             |                           |                         |
>> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>>             | .vdev                     | {.vdev}                 | .iommufd
>>                                                                   | .devid
>>                                                                   | [.ioas_id]
>>                                                                   | [.attach_hwpt()]
>>                                                                   | [.detach_hwpt()]
>>                                                                   |
>>                                                      .-------------------------.
>>                                                      |                         |
>>                                         HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>>                                                      | .vdev                   | {.vdev}
>>
>> * The attributes in [] will be implemented in nesting series.
>> * The classes in {} will be implemented in future.
>> * .vdev in different class points to different agent device,
>> * i.e., for VFIO it points to VFIODevice.
>>
>> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
>> PATCH5-11: Introduce HostIOMMUDeviceCaps, implement .realize()
>and .check_cap() handler
>> PATCH12-16: Create HostIOMMUDevice instance and pass to vIOMMU
>> PATCH17-19: Implement compatibility check between host IOMMU and
>vIOMMU(intel_iommu)
>>
>> Qemu code can be found at:
>>
>https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_pre
>q_v3
>>
>> Besides the compatibility check in this series, in nesting series, this
>> host IOMMU device is extended for much wider usage. For anyone
>interested
>> on the nesting series, here is the link:
>>
>https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfc
>v2
>
>
>v4 should be a good candidate, we will need feedback from the vIOMMU
>maintainers though.
>
>However, have you considered another/complementary approach which
>would be to create an host IOMMU (iommufd) backend object and a
>vIOMMU
>device object together for each vfio-pci device being plugged in the
>machine ?

I did consider a single iommufd instance for QEMU, but finally chose
to support multiple iommufd instances, for the reasons below:

I was treating iommufd as a backend of the VFIO device, not a backend of the
vIOMMU, so there is an iommufd property linking each VFIO device to an iommufd
instance. We support multiple iommufd instances in the nesting series just as
we do in the cdev series, for example:

-device intel-iommu,caching-mode=on,dma-drain=on,device-iotlb=on,x-scalable-mode=modern
-object iommufd,id=iommufd0
-device vfio-pci,host=6f:01.0,id=vfio0,bus=root0,iommufd=iommufd0
-object iommufd,id=iommufd1
-device vfio-pci,host=6f:01.1,id=vfio1,bus=root1,iommufd=iommufd1
-device vfio-pci,host=6f:01.2,id=vfio2,bus=root2

Adding an iommufd property to the vIOMMU would limit the whole of QEMU to a
single iommufd instance; it would also be confusing if there is a VFIO device
with the legacy backend as well.

I'm not sure how useful support for multiple iommufd instances is.
One poss

RE: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler

2024-05-05 Thread Duan, Zhenzhong
>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::check_cap() handler
>
>> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int
>cap,
> Error **errp)
>> +{
>> +switch (cap) {
>> +case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
>> +return 1;
>
> I don't understand this value.

 1 means this host iommu device is attached to IOMMUFD backend,
 or else 0 if attached to legacy backend.
>>>
>>> Hmm, this looks hacky to me and it is not used anywhere in the patchset.
>>> Let's reconsider when there is actually a use for it. Until then, please
>>> drop. My feeling is that a new HostIOMMUDeviceClass handler/attribute
>>> should be introduced instead.
>>
>> Got it, will drop it in this series.
>>
>> Is "return 1" directly the concern on your side?
>
>I don't know yet why the implementation would need to know if the host
>IOMMU device is of type IOMMUFD. If that's the case, there are alternative
>ways, like using OBJECT_CHECK( ..., TYPE_HOST_IOMMU_DEVICE_IOMMUFD) or
>a class attribute defined at build time, but that's a bit the same. Let's
>see when the need arises.

Got it. Let's revisit it in the nesting series; I will drop it for now.

Thanks
Zhenzhong


RE: [PATCH intel_iommu 0/7] FLTS for VT-d

2024-05-05 Thread Duan, Zhenzhong
Hi Clement,

Sorry for the late response, I'm just back from vacation.
I saw your rebased version, thanks for your work.
I'll schedule a timeslot to review it.

Thanks
Zhenzhong

>-Original Message-
>From: CLEMENT MATHIEU--DRIF 
>Subject: Re: [PATCH intel_iommu 0/7] FLTS for VT-d
>
>Hi Zhenzhong,
>
>I will rebase,
>
>thanks
>
>On 01/05/2024 14:40, Duan, Zhenzhong wrote:
>>
>>
>> Ah, this is a duplicate effort on stage-1 translation.
>>
>> Hi Clement,
>>
>> We previously sent an rfcv1 series, "intel_iommu: Enable stage-1 translation",
>> covering both emulated and passthrough devices:
>> https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg02740.html
>> It has since evolved into rfcv2:
>> https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting_rfcv2/
>>
>> It addresses recent community comments as well as comments on the older
>> series:
>> https://patchwork.kernel.org/project/kvm/cover/20210302203827.437645-1-yi.l@intel.com/
>>
>> Would you mind rebasing your remaining parts (ATS, PRI emulation, etc.)
>> onto our rfcv2?
>>
>> Thanks
>> Zhenzhong
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH intel_iommu 0/7] FLTS for VT-d
>>>
>>> Hello,
>>>
>>> Adding a few people in Cc: who are familiar with the Intel IOMMU.
>>>
>>> Thanks,
>>>
>>> C.
>>>
>>>
>>>
>>>
>>> On 4/22/24 17:52, CLEMENT MATHIEU--DRIF wrote:
>>>> This series is the first of a list that add support for SVM in the Intel
>IOMMU.
>>>>
>>>> Here, we implement support for first-stage translation in VT-d.
>>>> The PASID-based IOTLB invalidation is also added in this series as it is a
>>>> requirement of FLTS.
>>>>
>>>> The last patch introduces the 'flts' option to enable the feature from
>>>> the command line.
>>>> Once enabled, several drivers of the Linux kernel use this feature.
>>>>
>>>> This work is based on the VT-d specification version 4.1 (March 2023)
>>>>
>>>> Here is a link to a GitHub repository where you can find the following
>>> elements :
>>>>   - Qemu with all the patches for SVM
>>>>   - ATS
>>>>   - PRI
>>>>   - PASID based IOTLB invalidation
>>>>   - Device IOTLB invalidations
>>>>   - First-stage translations
>>>>   - Requests with already translated addresses
>>>>   - A demo device
>>>>   - A simple driver for the demo device
>>>>   - A userspace program (for testing and demonstration purposes)
>>>>
>>>> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
>>>>
>>>> Clément Mathieu--Drif (7):
>>>> intel_iommu: fix FRCD construction macro.
>>>> intel_iommu: rename slpte to pte before adding FLTS
>>>> intel_iommu: make types match
>>>> intel_iommu: add support for first-stage translation
>>>> intel_iommu: extract device IOTLB invalidation logic
>>>> intel_iommu: add PASID-based IOTLB invalidation
>>>> intel_iommu: add a CLI option to enable FLTS
>>>>
>>>>   hw/i386/intel_iommu.c          | 655 ++---
>>>>   hw/i386/intel_iommu_internal.h | 114 --
>>>>   include/hw/i386/intel_iommu.h  |   3 +-
>>>>   3 files changed, 609 insertions(+), 163 deletions(-)
>>>>


RE: [PATCH intel_iommu 0/7] FLTS for VT-d

2024-05-01 Thread Duan, Zhenzhong
Ah, this is a duplicate effort on stage-1 translation.

Hi Clement,

We previously sent an rfcv1 series, "intel_iommu: Enable stage-1 translation",
covering both emulated and passthrough devices:
https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg02740.html
It has since evolved into rfcv2:
https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting_rfcv2/

It addresses recent community comments as well as comments on the older
series:
https://patchwork.kernel.org/project/kvm/cover/20210302203827.437645-1-yi.l@intel.com/

Would you mind rebasing your remaining parts (ATS, PRI emulation, etc.)
onto our rfcv2?

Thanks
Zhenzhong

>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH intel_iommu 0/7] FLTS for VT-d
>
>Hello,
>
>Adding a few people in Cc: who are familiar with the Intel IOMMU.
>
>Thanks,
>
>C.
>
>
>
>
>On 4/22/24 17:52, CLEMENT MATHIEU--DRIF wrote:
>> This series is the first of a list that add support for SVM in the Intel 
>> IOMMU.
>>
>> Here, we implement support for first-stage translation in VT-d.
>> The PASID-based IOTLB invalidation is also added in this series as it is a
>> requirement of FLTS.
>>
>> The last patch introduces the 'flts' option to enable the feature from
>> the command line.
>> Once enabled, several drivers of the Linux kernel use this feature.
>>
>> This work is based on the VT-d specification version 4.1 (March 2023)
>>
>> Here is a link to a GitHub repository where you can find the following
>elements :
>>  - Qemu with all the patches for SVM
>>  - ATS
>>  - PRI
>>  - PASID based IOTLB invalidation
>>  - Device IOTLB invalidations
>>  - First-stage translations
>>  - Requests with already translated addresses
>>  - A demo device
>>  - A simple driver for the demo device
>>  - A userspace program (for testing and demonstration purposes)
>>
>> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
>>
>> Clément Mathieu--Drif (7):
>>intel_iommu: fix FRCD construction macro.
>>intel_iommu: rename slpte to pte before adding FLTS
>>intel_iommu: make types match
>>intel_iommu: add support for first-stage translation
>>intel_iommu: extract device IOTLB invalidation logic
>>intel_iommu: add PASID-based IOTLB invalidation
>>intel_iommu: add a CLI option to enable FLTS
>>
>>   hw/i386/intel_iommu.c          | 655 ++---
>>   hw/i386/intel_iommu_internal.h | 114 --
>>   include/hw/i386/intel_iommu.h  |   3 +-
>>   3 files changed, 609 insertions(+), 163 deletions(-)
>>



RE: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler

2024-05-01 Thread Duan, Zhenzhong



>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::check_cap() handler
>
>On 4/30/24 12:06, Duan, Zhenzhong wrote:
>>
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>>> HostIOMMUDeviceClass::check_cap() handler
>>>
>>> On 4/29/24 08:50, Zhenzhong Duan wrote:
>>>> Suggested-by: Cédric Le Goater 
>>>> Signed-off-by: Zhenzhong Duan 
>>>> ---
>>>>backends/iommufd.c | 18 ++
>>>>1 file changed, 18 insertions(+)
>>>>
>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>> index d61209788a..28faec528e 100644
>>>> --- a/backends/iommufd.c
>>>> +++ b/backends/iommufd.c
>>>> @@ -233,6 +233,23 @@ int
>>> iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t
>devid,
>>>>return ret;
>>>>}
>>>>
>>>> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int cap,
>>> Error **errp)
>>>> +{
>>>> +switch (cap) {
>>>> +case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
>>>> +return 1;
>>>
>>> I don't understand this value.
>>
>> 1 means this host iommu device is attached to IOMMUFD backend,
>> or else 0 if attached to legacy backend.
>
>Hmm, this looks hacky to me and it is not used anywhere in the patchset.
>Let's reconsider when there is actually a use for it. Until then, please
>drop. My feeling is that a new HostIOMMUDeviceClass handler/attribute
>should be introduced instead.

Got it, will drop it in this series.

Is the bare "return 1" the concern on your side? If so, what about adding a new
element, be_type, that is initialized in realize(), as below:

--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -28,6 +28,9 @@
  * @fs1gp: first stage(a.k.a, Stage-1) 1GB huge page support.
  */
 typedef struct HostIOMMUDeviceCaps {
+#define HOST_IOMMU_DEVICE_CAP_BACKEND_LEGACY    0
+#define HOST_IOMMU_DEVICE_CAP_BACKEND_IOMMUFD   1
+uint32_t be_type;
 enum iommu_hw_info_type type;
 uint8_t aw_bits;
 bool nesting;
@@ -91,7 +94,7 @@ struct HostIOMMUDeviceClass {
 /*
  * Host IOMMU device capability list.
  */
-#define HOST_IOMMU_DEVICE_CAP_IOMMUFD   0
+#define HOST_IOMMU_DEVICE_CAP_BACKEND_TYPE  0
 #define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    1
 #define HOST_IOMMU_DEVICE_CAP_AW_BITS   2
 #define HOST_IOMMU_DEVICE_CAP_NESTING   3

This looks a bit simpler than adding another handler.
Or do you have other concerns?

Thanks
Zhenzhong 

>
>
>Thanks,
>
>C.
>
>
>
>> Strictly speaking, HOST_IOMMU_DEVICE_CAP_IOMMUFD is not a
>> hardware capability, I'm trying to put all(sw/hw) in CAPs checking
>> framework just like KVM<->qemu CAPs does.
>>
>> Thanks
>> Zhenzhong
>>
>>>
>>>
>>> Thanks,
>>>
>>> C.
>>>
>>>
>>>> +default:
>>>> +return host_iommu_device_check_cap_common(hiod, cap, errp);
>>>> +}
>>>> +}
>>>> +
>>>> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
>>>> +{
>>>> +HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>>>> +
>>>> +hioc->check_cap = hiod_iommufd_check_cap;
>>>> +};
>>>> +
>>>>static const TypeInfo types[] = {
>>>>{
>>>>.name = TYPE_IOMMUFD_BACKEND,
>>>> @@ -251,6 +268,7 @@ static const TypeInfo types[] = {
>>>>.parent = TYPE_HOST_IOMMU_DEVICE,
>>>>.instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
>>>>.class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
>>>> +.class_init = hiod_iommufd_class_init,
>>>>.abstract = true,
>>>>}
>>>>};
>>




RE: [PATCH v3 13/19] vfio: Create host IOMMU device instance

2024-04-30 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 13/19] vfio: Create host IOMMU device instance
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Create host IOMMU device instance in vfio_attach_device() and call
>> .realize() to initialize it further.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h |  1 +
>>   hw/vfio/common.c  | 18 +-
>>   2 files changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 0943add3bc..b204b93a55 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -126,6 +126,7 @@ typedef struct VFIODevice {
>>   OnOffAuto pre_copy_dirty_page_tracking;
>>   bool dirty_pages_supported;
>>   bool dirty_tracking;
>> +HostIOMMUDevice *hiod;
>>   int devid;
>>   IOMMUFDBackend *iommufd;
>>   } VFIODevice;
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 8f9cbdc026..0be8b70ebd 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1497,6 +1497,8 @@ int vfio_attach_device(char *name, VFIODevice
>*vbasedev,
>>   {
>>   const VFIOIOMMUClass *ops =
>>
>VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>> +HostIOMMUDevice *hiod;
>> +int ret;
>>
>>   if (vbasedev->iommufd) {
>>   ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
>> @@ -1504,7 +1506,20 @@ int vfio_attach_device(char *name,
>VFIODevice *vbasedev,
>>
>>   assert(ops);
>>
>> -return ops->attach_device(name, vbasedev, as, errp);
>> +ret = ops->attach_device(name, vbasedev, as, errp);
>> +if (ret < 0) {
>> +return ret;
>
>
>hmm, I wonder if we should change the return value of vfio_attach_device()
>to be a bool.

I see. The same applies to VFIOIOMMUClass::setup and VFIOIOMMUClass::add_window.
I can add cleanup patches to fix them if you have no other plan.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>
>> +}
>> +
>> +hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
>> +if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev,
>errp)) {
>> +object_unref(hiod);
>> +ops->detach_device(vbasedev);
>> +return -EINVAL;
>> +}
>> +vbasedev->hiod = hiod;
>> +
>> +return 0;
>>   }
>>
>>   void vfio_detach_device(VFIODevice *vbasedev)
>> @@ -1512,5 +1527,6 @@ void vfio_detach_device(VFIODevice *vbasedev)
>>   if (!vbasedev->bcontainer) {
>>   return;
>>   }
>> +object_unref(vbasedev->hiod);
>>   vbasedev->bcontainer->ops->detach_device(vbasedev);
>>   }



RE: [PATCH v3 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info()

2024-04-30 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 08/19] backends/iommufd: Introduce helper
>function iommufd_backend_get_device_info()
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Introduce a helper function iommufd_backend_get_device_info() to get
>> host IOMMU related information through iommufd uAPI.
>>
>> Signed-off-by: Yi Liu 
>> Signed-off-by: Yi Sun 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/sysemu/iommufd.h |  4 
>>   backends/iommufd.c   | 24 +++-
>>   2 files changed, 27 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
>> index 6a9fb0007a..e9593637a3 100644
>> --- a/include/sysemu/iommufd.h
>> +++ b/include/sysemu/iommufd.h
>> @@ -17,6 +17,7 @@
>>   #include "qom/object.h"
>>   #include "exec/hwaddr.h"
>>   #include "exec/cpu-common.h"
>> +#include 
>>   #include "sysemu/host_iommu_device.h"
>>
>>   #define TYPE_IOMMUFD_BACKEND "iommufd"
>> @@ -47,6 +48,9 @@ int iommufd_backend_map_dma(IOMMUFDBackend
>*be, uint32_t ioas_id, hwaddr iova,
>>   ram_addr_t size, void *vaddr, bool readonly);
>>   int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t
>ioas_id,
>> hwaddr iova, ram_addr_t size);
>> +int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t
>devid,
>> +enum iommu_hw_info_type *type,
>> +void *data, uint32_t len, Error **errp);
>>
>>   #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>TYPE_HOST_IOMMU_DEVICE "-iommufd"
>>   OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD,
>HostIOMMUDeviceIOMMUFDClass,
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index 19e46194a2..d61209788a 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -19,7 +19,6 @@
>>   #include "monitor/monitor.h"
>>   #include "trace.h"
>>   #include 
>> -#include 
>>
>>   static void iommufd_backend_init(Object *obj)
>>   {
>> @@ -211,6 +210,29 @@ int
>iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
>>   return ret;
>>   }
>>
>> +int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t
>devid,
>> +enum iommu_hw_info_type *type,
>> +void *data, uint32_t len, Error **errp)
>
>When taking an 'Error **' argument, routines preferably return a bool.

Got it, will fix.
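
For reference, a minimal sketch of that convention. The Error type and
demo_get_device_info() below are hypothetical stand-ins for illustration,
not the real QEMU API: the routine returns true on success, and on failure
it returns false after reporting through errp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's Error object, for illustration only. */
typedef struct Error {
    const char *msg;
} Error;

static Error demo_hw_info_error = { "Failed to get hardware info" };

/*
 * Hypothetical query routine. Returns true on success and stores the
 * result in *type; returns false on failure and reports through errp.
 */
static bool demo_get_device_info(int fd, uint32_t *type, Error **errp)
{
    if (fd < 0) {
        if (errp) {
            *errp = &demo_hw_info_error;
        }
        return false;
    }
    *type = 1; /* placeholder value standing in for the reported type */
    return true;
}
```

With this shape the caller tests the return value directly and only looks
at the Error object on the failure path.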

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>
>
>> +{
>> +struct iommu_hw_info info = {
>> +.size = sizeof(info),
>> +.dev_id = devid,
>> +.data_len = len,
>> +.data_uptr = (uintptr_t)data,
>> +};
>> +int ret;
>> +
>> +ret = ioctl(be->fd, IOMMU_GET_HW_INFO, &info);
>> +if (ret) {
>> +error_setg_errno(errp, errno, "Failed to get hardware info");
>> +} else {
>> +g_assert(type);
>> +*type = info.out_data_type;
>> +}
>> +
>> +return ret;
>> +}
>> +
>>   static const TypeInfo types[] = {
>>   {
>>   .name = TYPE_IOMMUFD_BACKEND,



RE: [PATCH v3 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::check_cap() handler

2024-04-30 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::check_cap() handler
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   backends/iommufd.c | 18 ++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index d61209788a..28faec528e 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -233,6 +233,23 @@ int
>iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>   return ret;
>>   }
>>
>> +static int hiod_iommufd_check_cap(HostIOMMUDevice *hiod, int cap,
>Error **errp)
>> +{
>> +switch (cap) {
>> +case HOST_IOMMU_DEVICE_CAP_IOMMUFD:
>> +return 1;
>
>I don't understand this value.

1 means this host IOMMU device is attached to the IOMMUFD backend;
0 means it is attached to the legacy backend.

Strictly speaking, HOST_IOMMU_DEVICE_CAP_IOMMUFD is not a hardware
capability. I'm trying to put all capabilities (software and hardware) into
one CAP-checking framework, just as the KVM<->QEMU CAPs do.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>> +default:
>> +return host_iommu_device_check_cap_common(hiod, cap, errp);
>> +}
>> +}
>> +
>> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
>> +{
>> +HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> +hioc->check_cap = hiod_iommufd_check_cap;
>> +};
>> +
>>   static const TypeInfo types[] = {
>>   {
>>   .name = TYPE_IOMMUFD_BACKEND,
>> @@ -251,6 +268,7 @@ static const TypeInfo types[] = {
>>   .parent = TYPE_HOST_IOMMU_DEVICE,
>>   .instance_size = sizeof(HostIOMMUDeviceIOMMUFD),
>>   .class_size = sizeof(HostIOMMUDeviceIOMMUFDClass),
>> +.class_init = hiod_iommufd_class_init,
>>   .abstract = true,
>>   }
>>   };



RE: [PATCH v3 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler

2024-04-30 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 07/19] vfio/container: Implement
>HostIOMMUDeviceClass::realize() handler
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> Utilize range_get_last_bit() to get host IOMMU address width and
>> package it in HostIOMMUDeviceCaps for query with .check_cap().
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   hw/vfio/container.c | 29 +
>>   1 file changed, 29 insertions(+)
>>
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 3b6826996a..863eec3943 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1143,6 +1143,34 @@ static void
>vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>>   vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>>   };
>>
>> +static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void
>*opaque,
>> + Error **errp)
>> +{
>> +VFIODevice *vdev = opaque;
>> +/* iova_ranges is a sorted list */
>> +GList *l = g_list_last(vdev->bcontainer->iova_ranges);
>> +
>> +/* There is no VFIO uAPI to query host platform IOMMU type */
>> +hiod->caps.type = IOMMU_HW_INFO_TYPE_NONE;
>> +HOST_IOMMU_DEVICE_IOMMUFD_VFIO(hiod)->vdev = vdev;
>
>cast uses the wrong type and I am not sure the ->vdev is useful.

Good catch, will remove vdev as you suggested.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>> +
>> +if (l) {
>> +Range *range = l->data;
>> +hiod->caps.aw_bits = range_get_last_bit(range) + 1;
>> +} else {
>> +hiod->caps.aw_bits = 0xff;
>> +}
>> +
>> +return true;
>> +}
>> +
>> +static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
>> +{
>> +HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> +hioc->realize = hiod_legacy_vfio_realize;
>> +};
>> +
>>   static const TypeInfo types[] = {
>>   {
>>   .name = TYPE_VFIO_IOMMU_LEGACY,
>> @@ -1152,6 +1180,7 @@ static const TypeInfo types[] = {
>>   .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
>>   .parent = TYPE_HOST_IOMMU_DEVICE,
>>   .instance_size = sizeof(HostIOMMUDeviceLegacyVFIO),
>> +.class_init = hiod_legacy_vfio_class_init,
>>   }
>>   };
>>



RE: [PATCH v3 06/19] range: Introduce range_get_last_bit()

2024-04-30 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 06/19] range: Introduce range_get_last_bit()
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> This helper gets the position of the highest set bit of the upper bound.
>>
>> If the range is empty or upper bound is zero, -1 is returned.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/qemu/range.h | 11 +++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/include/qemu/range.h b/include/qemu/range.h
>> index 205e1da76d..8e05bc1d9f 100644
>> --- a/include/qemu/range.h
>> +++ b/include/qemu/range.h
>> @@ -20,6 +20,8 @@
>>   #ifndef QEMU_RANGE_H
>>   #define QEMU_RANGE_H
>>
>> +#include "qemu/bitops.h"
>> +
>>   /*
>>* Operations on 64 bit address ranges.
>>* Notes:
>> @@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1,
>uint64_t len1,
>>   return !(last2 < first1 || last1 < first2);
>>   }
>>
>> +/* Get highest non-zero bit position of a range */
>> +static inline int range_get_last_bit(Range *range)
>> +{
>> +if (range_is_empty(range) || !range->upb) {
>> +return -1;
>> +}
>> +return find_last_bit(&range->upb, sizeof(range->upb));
>
>This breaks builds on 32-bit host systems.

Oh, I missed the 32-bit build. Thanks, will fix.
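
One 32-bit-safe way to get the highest set bit of a 64-bit bound, as a
sketch only. The helper name highest_set_bit64() is made up here, and the
actual fix may take a different approach. The point is to operate on the
uint64_t value directly instead of treating it as an array of unsigned long,
which is only 32 bits wide on 32-bit hosts.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the index of the highest set bit of v
 * (0 for the least significant bit), or -1 if v is zero.
 */
static int highest_set_bit64(uint64_t v)
{
    int pos = -1;

    /* Shift right until nothing is left, counting the shifts. */
    while (v) {
        v >>= 1;
        pos++;
    }
    return pos;
}
```

An address width derived this way would be highest_set_bit64(upb) + 1,
matching how aw_bits is computed from the last IOVA range in this series.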

Thanks
zhenzhong

>
>
>Thanks,
>
>C.
>
>
>> +}
>> +
>>   /*
>>* Return -1 if @a < @b, 1 @a > @b, and 0 if they touch or overlap.
>>* Both @a and @b must not be empty.



RE: [PATCH v3 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps

2024-04-30 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 05/19] backends/host_iommu_device: Introduce
>HostIOMMUDeviceCaps
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
>> Different platform IOMMU can support different elements.
>>
>> Currently only two elements, type and aw_bits, type hints the host
>> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
>> host IOMMU address width.
>>
>> Introduce .check_cap() handler to check if
>HOST_IOMMU_DEVICE_CAP_XXX
>> is supported.
>>
>> Introduce a HostIOMMUDevice API host_iommu_device_check_cap()
>which
>> is a wrapper of .check_cap().
>>
>> Introduce a HostIOMMUDevice API host_iommu_device_check_cap_common()
>> to check common capabilities of different host platform IOMMUs.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/sysemu/host_iommu_device.h | 44
>++
>>   backends/host_iommu_device.c   | 29 
>>   2 files changed, 73 insertions(+)
>>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> index 2b58a94d62..12b6afb463 100644
>> --- a/include/sysemu/host_iommu_device.h
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -14,12 +14,27 @@
>>
>>   #include "qom/object.h"
>>   #include "qapi/error.h"
>> +#include "linux/iommufd.h"
>> +
>> +/**
>> + * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
>> + *
>> + * @type: host platform IOMMU type.
>> + *
>> + * @aw_bits: host IOMMU address width. 0xff if no limitation.
>> + */
>> +typedef struct HostIOMMUDeviceCaps {
>> +enum iommu_hw_info_type type;
>> +uint8_t aw_bits;
>> +} HostIOMMUDeviceCaps;
>>
>>   #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>>   OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>HOST_IOMMU_DEVICE)
>>
>>   struct HostIOMMUDevice {
>>   Object parent_obj;
>> +
>> +HostIOMMUDeviceCaps caps;
>>   };
>>
>>   /**
>> @@ -47,5 +62,34 @@ struct HostIOMMUDeviceClass {
>>* Returns: true on success, false on failure.
>>*/
>>   bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
>> +/**
>> + * @check_cap: check if a host IOMMU device capability is supported.
>> + *
>> + * Optional callback, if not implemented, hint not supporting query
>> + * of @cap.
>> + *
>> + * @hiod: pointer to a host IOMMU device instance.
>> + *
>> + * @cap: capability to check.
>> + *
>> + * @errp: pass an Error out when fails to query capability.
>> + *
>> + * Returns: <0 on failure, 0 if a @cap is unsupported, or else
>> + * 1 or some positive value for some special @cap,
>> + * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
>> + */
>> +int (*check_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
>>   };
>> +
>> +/*
>> + * Host IOMMU device capability list.
>> + */
>> +#define HOST_IOMMU_DEVICE_CAP_IOMMUFD   0
>> +#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    1
>> +#define HOST_IOMMU_DEVICE_CAP_AW_BITS   2
>> +
>> +
>> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap,
>Error **errp);
>> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod,
>int cap,
>> +   Error **errp);
>>   #endif
>> diff --git a/backends/host_iommu_device.c
>b/backends/host_iommu_device.c
>> index 41f2fdce20..b97d008cc7 100644
>> --- a/backends/host_iommu_device.c
>> +++ b/backends/host_iommu_device.c
>> @@ -28,3 +28,32 @@ static void host_iommu_device_init(Object *obj)
>>   static void host_iommu_device_finalize(Object *obj)
>>   {
>>   }
>> +
>> +/* Wrapper of HostIOMMUDeviceClass:check_cap */
>> +int host_iommu_device_check_cap(HostIOMMUDevice *hiod, int cap,
>Error **errp)
>
>Since we have an 'Error **errp', we could return a bool instead,
>unless this is a 'get_cap' routine ?

Maybe it is better to name it host_iommu_device_get_cap()?
Not all results are boolean; some are integers, e.g., aw_bits.
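
A rough sketch of what such a 'get_cap' style routine could look like. The
names DemoCaps, DEMO_CAP_* and demo_get_cap() are hypothetical and only
loosely mirror the patch: the return value is the capability value itself
when it is non-negative, and a negative errno when the capability is unknown.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical capability IDs, for illustration only. */
enum {
    DEMO_CAP_IOMMU_TYPE = 1,
    DEMO_CAP_AW_BITS = 2,
};

/* Hypothetical capability storage, loosely modelled on the caps struct. */
typedef struct DemoCaps {
    uint32_t type;
    uint8_t aw_bits;
} DemoCaps;

/*
 * Return the value of the requested capability (>= 0) on success,
 * or -EINVAL when the capability is not known.
 */
static int demo_get_cap(const DemoCaps *caps, int cap)
{
    switch (cap) {
    case DEMO_CAP_IOMMU_TYPE:
        return (int)caps->type;
    case DEMO_CAP_AW_BITS:
        return caps->aw_bits;
    default:
        return -EINVAL;
    }
}
```

An integer return keeps a single entry point for both yes/no capabilities
and numeric ones such as the address width.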

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>> +{
>> +HostIOMMUDeviceClass *hiodc =
>HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>> +if (!hiodc->check_cap) {
>> +error_setg(errp, ".check_cap() not implemented");
>> +return -EINVAL;
>> +}
>> +
>> +return hiodc->check_cap(hiod, cap, errp);
>> +}
>> +
>> +/* Implement check on common IOMMU capabilities */
>> +int host_iommu_device_check_cap_common(HostIOMMUDevice *hiod,
>int cap,
>> +   Error **errp)
>> +{
>> +HostIOMMUDeviceCaps *caps = &hiod->caps;
>> +
>> +switch (cap) {
>> +case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
>> +return caps->type;
>> +case HOST_IOMMU_DEVICE_CAP_AW_BITS:
>> +return caps->aw_bits;
>> +default:
>> +error_setg(errp, "Not support query cap %x", cap);
>> +return -EINVAL;
>> +}
>> +}



RE: [PATCH v3 04/19] vfio/iommufd: Introduce HostIOMMUDeviceIOMMUFDVFIO device

2024-04-30 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 04/19] vfio/iommufd: Introduce
>HostIOMMUDeviceIOMMUFDVFIO device
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> HostIOMMUDeviceIOMMUFDVFIO represents a host IOMMU device under
>VFIO
>> iommufd backend. It will be created during VFIO device attaching and
>> passed to vIOMMU.
>>
>> It includes a link to VFIODevice so that we can do VFIO device
>> specific operations, i.e., [at/de]taching hwpt, etc.
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h | 13 +
>>   hw/vfio/iommufd.c |  6 +-
>>   2 files changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index aa3abe0a18..0943add3bc 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -32,6 +32,7 @@
>>   #include "sysemu/sysemu.h"
>>   #include "hw/vfio/vfio-container-base.h"
>>   #include "sysemu/host_iommu_device.h"
>> +#include "sysemu/iommufd.h"
>>
>>   #define VFIO_MSG_PREFIX "vfio %s: "
>>
>> @@ -159,6 +160,18 @@ struct HostIOMMUDeviceLegacyVFIO {
>>   VFIODevice *vdev;
>>   };
>>
>> +#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
>> +TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"
>> +OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDeviceIOMMUFDVFIO,
>> +   HOST_IOMMU_DEVICE_IOMMUFD_VFIO)
>> +
>> +/* Abstraction of host IOMMU device with VFIO IOMMUFD backend */
>> +struct HostIOMMUDeviceIOMMUFDVFIO {
>> +HostIOMMUDeviceIOMMUFD parent;
>> +
>> +VFIODevice *vdev;
>
>Seems useless today.

Yes, it is unused before the nesting series; I will add it there instead.

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>
>> +};
>> +
>>   typedef struct VFIODMABuf {
>>   QemuDmaBuf buf;
>>   uint32_t pos_x, pos_y, pos_updates;
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index 8827ffe636..997f4ac43e 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -639,7 +639,11 @@ static const TypeInfo types[] = {
>>   .name = TYPE_VFIO_IOMMU_IOMMUFD,
>>   .parent = TYPE_VFIO_IOMMU,
>>   .class_init = vfio_iommu_iommufd_class_init,
>> -},
>> +}, {
>> +.name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
>> +.parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
>> +.instance_size = sizeof(HostIOMMUDeviceIOMMUFDVFIO),
>> +}
>>   };
>>
>>   DEFINE_TYPES(types)



RE: [PATCH v3 02/19] vfio/container: Introduce HostIOMMUDeviceLegacyVFIO device

2024-04-30 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v3 02/19] vfio/container: Introduce
>HostIOMMUDeviceLegacyVFIO device
>
>On 4/29/24 08:50, Zhenzhong Duan wrote:
>> HostIOMMUDeviceLegacyVFIO represents a host IOMMU device under
>VFIO
>> legacy container backend.
>>
>> It includes a link to VFIODevice.
>
>I don't see any use of this attribute. May be introduce later when needed.

Indeed, will remove.

After that, 'struct HostIOMMUDeviceLegacyVFIO' is the same as
struct HostIOMMUDevice.

I'm not sure whether it's preferable to remove 'struct HostIOMMUDeviceLegacyVFIO'
and use HostIOMMUDevice directly, something like:

OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDevice,
HOST_IOMMU_DEVICE_LEGACY_VFIO)

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>>
>> Suggested-by: Eric Auger 
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h | 12 
>>   hw/vfio/container.c   |  6 +-
>>   2 files changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index b9da6c08ef..aa3abe0a18 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -31,6 +31,7 @@
>>   #endif
>>   #include "sysemu/sysemu.h"
>>   #include "hw/vfio/vfio-container-base.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>>   #define VFIO_MSG_PREFIX "vfio %s: "
>>
>> @@ -147,6 +148,17 @@ typedef struct VFIOGroup {
>>   bool ram_block_discard_allowed;
>>   } VFIOGroup;
>>
>> +#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO
>TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
>> +OBJECT_DECLARE_SIMPLE_TYPE(HostIOMMUDeviceLegacyVFIO,
>> +   HOST_IOMMU_DEVICE_LEGACY_VFIO)
>> +
>> +/* Abstract of host IOMMU device with VFIO legacy container backend */
>> +struct HostIOMMUDeviceLegacyVFIO {
>> +HostIOMMUDevice parent_obj;
>> +
>> +VFIODevice *vdev;
>> +};
>> +
>>   typedef struct VFIODMABuf {
>>   QemuDmaBuf buf;
>>   uint32_t pos_x, pos_y, pos_updates;
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 77bdec276e..3b6826996a 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1148,7 +1148,11 @@ static const TypeInfo types[] = {
>>   .name = TYPE_VFIO_IOMMU_LEGACY,
>>   .parent = TYPE_VFIO_IOMMU,
>>   .class_init = vfio_iommu_legacy_class_init,
>> -},
>> +}, {
>> +.name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
>> +.parent = TYPE_HOST_IOMMU_DEVICE,
>> +.instance_size = sizeof(HostIOMMUDeviceLegacyVFIO),
>> +}
>>   };
>>
>>   DEFINE_TYPES(types)



RE: [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap

2024-04-25 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>compatibility check with host IOMMU cap/ecap
>
>On 4/25/24 10:46, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>> compatibility check with host IOMMU cap/ecap
>>>
>>> Hello Zhenzhong,
>>>
>>> On 4/18/24 10:42, Duan, Zhenzhong wrote:
>>>> Hi Cédric,
>>>>
>>>>> -Original Message-
>>>>> From: Cédric Le Goater 
>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>> compatibility check with host IOMMU cap/ecap
>>>>>
>>>>> Hello Zhenzhong
>>>>>
>>>>> On 4/17/24 11:24, Duan, Zhenzhong wrote:
>>>>>>
>>>>>>
>>>>>>> -Original Message-
>>>>>>> From: Cédric Le Goater 
>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>>>> compatibility check with host IOMMU cap/ecap
>>>>>>>
>>>>>>> On 4/17/24 06:21, Duan, Zhenzhong wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> -Original Message-
>>>>>>>>> From: Cédric Le Goater 
>>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>>>>>> compatibility check with host IOMMU cap/ecap
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> On 4/16/24 09:09, Duan, Zhenzhong wrote:
>>>>>>>>>> Hi Cédric,
>>>>>>>>>>
>>>>>>>>>>> -Original Message-
>>>>>>>>>>> From: Cédric Le Goater 
>>>>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to
>do
>>>>>>>>>>> compatibility check with host IOMMU cap/ecap
>>>>>>>>>>>
>>>>>>>>>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
>>>>>>>>>>>> From: Yi Liu 
>>>>>>>>>>>>
>>>>>>>>>>>> If check fails, the host side device(either vfio or vdpa device)
>>> should
>>>>>>> not
>>>>>>>>>>>> be passed to guest.
>>>>>>>>>>>>
>>>>>>>>>>>> Implementation details for different backends will be in
>>> following
>>>>>>>>> patches.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Yi Liu 
>>>>>>>>>>>> Signed-off-by: Yi Sun 
>>>>>>>>>>>> Signed-off-by: Zhenzhong Duan 
>>>>>>>>>>>> ---
>>>>>>>>>>>>hw/i386/intel_iommu.c | 35
>>>>>>>>>>> +++
>>>>>>>>>>>>1 file changed, 35 insertions(+)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>>>>>>>>>> index 4f84e2e801..a49b587c73 100644
>>>>>>>>>>>> --- a/hw/i386/intel_iommu.c
>>>>>>>>>>>> +++ b/hw/i386/intel_iommu.c
>>>>>>>>>>>> @@ -35,6 +35,7 @@
>>>>>>>>>>>>#include "sysemu/kvm.h"
>>>>>>>>>>>>#include "sysemu/dma.h"
>>>>>>>>>>>>#include "sysemu/sysemu.h"
>>>>>>>>>>>> +#include "sysemu/iommufd.h"
>>>>>>>>>>>>#include "hw/i386/apic_internal.h"
>>>>>>>>>>>>#include "kvm/kvm_i386.h"
>>>>>>>>>>>>#include "migration/vmstate.h"
>>>>>>>>>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
>>>>>>>>>>> *vtd_find_add_as(Inte

RE: [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap

2024-04-25 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>compatibility check with host IOMMU cap/ecap
>
>Hello Zhenzhong,
>
>On 4/18/24 10:42, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>> compatibility check with host IOMMU cap/ecap
>>>
>>> Hello Zhenzhong
>>>
>>> On 4/17/24 11:24, Duan, Zhenzhong wrote:
>>>>
>>>>
>>>>> -Original Message-
>>>>> From: Cédric Le Goater 
>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>> compatibility check with host IOMMU cap/ecap
>>>>>
>>>>> On 4/17/24 06:21, Duan, Zhenzhong wrote:
>>>>>>
>>>>>>
>>>>>>> -Original Message-
>>>>>>> From: Cédric Le Goater 
>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>>>> compatibility check with host IOMMU cap/ecap
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> On 4/16/24 09:09, Duan, Zhenzhong wrote:
>>>>>>>> Hi Cédric,
>>>>>>>>
>>>>>>>>> -Original Message-
>>>>>>>>> From: Cédric Le Goater 
>>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>>>>>> compatibility check with host IOMMU cap/ecap
>>>>>>>>>
>>>>>>>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
>>>>>>>>>> From: Yi Liu 
>>>>>>>>>>
>>>>>>>>>> If check fails, the host side device(either vfio or vdpa device)
>should
>>>>> not
>>>>>>>>>> be passed to guest.
>>>>>>>>>>
>>>>>>>>>> Implementation details for different backends will be in
>following
>>>>>>> patches.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Yi Liu 
>>>>>>>>>> Signed-off-by: Yi Sun 
>>>>>>>>>> Signed-off-by: Zhenzhong Duan 
>>>>>>>>>> ---
>>>>>>>>>>   hw/i386/intel_iommu.c | 35
>>>>>>>>> +++
>>>>>>>>>>   1 file changed, 35 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>>>>>>>> index 4f84e2e801..a49b587c73 100644
>>>>>>>>>> --- a/hw/i386/intel_iommu.c
>>>>>>>>>> +++ b/hw/i386/intel_iommu.c
>>>>>>>>>> @@ -35,6 +35,7 @@
>>>>>>>>>>   #include "sysemu/kvm.h"
>>>>>>>>>>   #include "sysemu/dma.h"
>>>>>>>>>>   #include "sysemu/sysemu.h"
>>>>>>>>>> +#include "sysemu/iommufd.h"
>>>>>>>>>>   #include "hw/i386/apic_internal.h"
>>>>>>>>>>   #include "kvm/kvm_i386.h"
>>>>>>>>>>   #include "migration/vmstate.h"
>>>>>>>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
>>>>>>>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>>>>>>>>>   return vtd_dev_as;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>>>>>>>>>> + HostIOMMUDevice *hiod,
>>>>>>>>>> + Error **errp)
>>>>>>>>>> +{
>>>>>>>>>> +return 0;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>>>>>>>>>> +  HostIOMMUDevice *hiod,
>>>>>>&

RE: [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap

2024-04-19 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>compatibility check with host IOMMU cap/ecap
>
>Hello Zhenzhong,
>
>On 4/18/24 10:42, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>> compatibility check with host IOMMU cap/ecap
>>>
>>> Hello Zhenzhong
>>>
>>> On 4/17/24 11:24, Duan, Zhenzhong wrote:
>>>>
>>>>
>>>>> -Original Message-
>>>>> From: Cédric Le Goater 
>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>> compatibility check with host IOMMU cap/ecap
>>>>>
>>>>> On 4/17/24 06:21, Duan, Zhenzhong wrote:
>>>>>>
>>>>>>
>>>>>>> -Original Message-
>>>>>>> From: Cédric Le Goater 
>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>>>> compatibility check with host IOMMU cap/ecap
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> On 4/16/24 09:09, Duan, Zhenzhong wrote:
>>>>>>>> Hi Cédric,
>>>>>>>>
>>>>>>>>> -Original Message-
>>>>>>>>> From: Cédric Le Goater 
>>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>>>>>> compatibility check with host IOMMU cap/ecap
>>>>>>>>>
>>>>>>>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
>>>>>>>>>> From: Yi Liu 
>>>>>>>>>>
>>>>>>>>>> If check fails, the host side device(either vfio or vdpa device)
>should
>>>>> not
>>>>>>>>>> be passed to guest.
>>>>>>>>>>
>>>>>>>>>> Implementation details for different backends will be in
>following
>>>>>>> patches.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Yi Liu 
>>>>>>>>>> Signed-off-by: Yi Sun 
>>>>>>>>>> Signed-off-by: Zhenzhong Duan 
>>>>>>>>>> ---
>>>>>>>>>>   hw/i386/intel_iommu.c | 35
>>>>>>>>> +++
>>>>>>>>>>   1 file changed, 35 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>>>>>>>> index 4f84e2e801..a49b587c73 100644
>>>>>>>>>> --- a/hw/i386/intel_iommu.c
>>>>>>>>>> +++ b/hw/i386/intel_iommu.c
>>>>>>>>>> @@ -35,6 +35,7 @@
>>>>>>>>>>   #include "sysemu/kvm.h"
>>>>>>>>>>   #include "sysemu/dma.h"
>>>>>>>>>>   #include "sysemu/sysemu.h"
>>>>>>>>>> +#include "sysemu/iommufd.h"
>>>>>>>>>>   #include "hw/i386/apic_internal.h"
>>>>>>>>>>   #include "kvm/kvm_i386.h"
>>>>>>>>>>   #include "migration/vmstate.h"
>>>>>>>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
>>>>>>>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>>>>>>>>>   return vtd_dev_as;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>>>>>>>>>> + HostIOMMUDevice *hiod,
>>>>>>>>>> + Error **errp)
>>>>>>>>>> +{
>>>>>>>>>> +return 0;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>>>>>>>>>> +  HostIOMMUDevice *hiod,
>>>>>>>>>> +   

RE: [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap

2024-04-18 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>compatibility check with host IOMMU cap/ecap
>
>Hello Zhenzhong
>
>On 4/17/24 11:24, Duan, Zhenzhong wrote:
>>
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>> compatibility check with host IOMMU cap/ecap
>>>
>>> On 4/17/24 06:21, Duan, Zhenzhong wrote:
>>>>
>>>>
>>>>> -Original Message-
>>>>> From: Cédric Le Goater 
>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>> compatibility check with host IOMMU cap/ecap
>>>>>
>>>>> Hello,
>>>>>
>>>>> On 4/16/24 09:09, Duan, Zhenzhong wrote:
>>>>>> Hi Cédric,
>>>>>>
>>>>>>> -Original Message-
>>>>>>> From: Cédric Le Goater 
>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>>>> compatibility check with host IOMMU cap/ecap
>>>>>>>
>>>>>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
>>>>>>>> From: Yi Liu 
>>>>>>>>
>>>>>>>> If check fails, the host side device(either vfio or vdpa device) should
>>> not
>>>>>>>> be passed to guest.
>>>>>>>>
>>>>>>>> Implementation details for different backends will be in following
>>>>> patches.
>>>>>>>>
>>>>>>>> Signed-off-by: Yi Liu 
>>>>>>>> Signed-off-by: Yi Sun 
>>>>>>>> Signed-off-by: Zhenzhong Duan 
>>>>>>>> ---
>>>>>>>>  hw/i386/intel_iommu.c | 35
>>>>>>> +++
>>>>>>>>  1 file changed, 35 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>>>>>> index 4f84e2e801..a49b587c73 100644
>>>>>>>> --- a/hw/i386/intel_iommu.c
>>>>>>>> +++ b/hw/i386/intel_iommu.c
>>>>>>>> @@ -35,6 +35,7 @@
>>>>>>>>  #include "sysemu/kvm.h"
>>>>>>>>  #include "sysemu/dma.h"
>>>>>>>>  #include "sysemu/sysemu.h"
>>>>>>>> +#include "sysemu/iommufd.h"
>>>>>>>>  #include "hw/i386/apic_internal.h"
>>>>>>>>  #include "kvm/kvm_i386.h"
>>>>>>>>  #include "migration/vmstate.h"
>>>>>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
>>>>>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>>>>>>>  return vtd_dev_as;
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>>>>>>>> + HostIOMMUDevice *hiod,
>>>>>>>> + Error **errp)
>>>>>>>> +{
>>>>>>>> +return 0;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>>>>>>>> +  HostIOMMUDevice *hiod,
>>>>>>>> +  Error **errp)
>>>>>>>> +{
>>>>>>>> +return 0;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static int vtd_check_hdev(IntelIOMMUState *s,
>>>>> VTDHostIOMMUDevice
>>>>>>> *vtd_hdev,
>>>>>>>> +  Error **errp)
>>>>>>>> +{
>>>>>>>> +HostIOMMUDevice *hiod = vtd_hdev->dev;
>>>>>>>> +
>>>>>>>> +if (object_dynamic_cast(OBJECT(hiod), TYPE_HIOD_IOMMUFD))
>{
>>>>>>>> +return vtd_check_iommufd_hdev(s, hiod, errp);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +return vtd_check_legacy_

RE: [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap

2024-04-17 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>compatibility check with host IOMMU cap/ecap
>
>On 4/17/24 06:21, Duan, Zhenzhong wrote:
>>
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>> compatibility check with host IOMMU cap/ecap
>>>
>>> Hello,
>>>
>>> On 4/16/24 09:09, Duan, Zhenzhong wrote:
>>>> Hi Cédric,
>>>>
>>>>> -Original Message-
>>>>> From: Cédric Le Goater 
>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>>>> compatibility check with host IOMMU cap/ecap
>>>>>
>>>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
>>>>>> From: Yi Liu 
>>>>>>
>>>>>> If check fails, the host side device(either vfio or vdpa device) should
>not
>>>>>> be passed to guest.
>>>>>>
>>>>>> Implementation details for different backends will be in following
>>> patches.
>>>>>>
>>>>>> Signed-off-by: Yi Liu 
>>>>>> Signed-off-by: Yi Sun 
>>>>>> Signed-off-by: Zhenzhong Duan 
>>>>>> ---
>>>>>> hw/i386/intel_iommu.c | 35
>>>>> +++
>>>>>> 1 file changed, 35 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>>>> index 4f84e2e801..a49b587c73 100644
>>>>>> --- a/hw/i386/intel_iommu.c
>>>>>> +++ b/hw/i386/intel_iommu.c
>>>>>> @@ -35,6 +35,7 @@
>>>>>> #include "sysemu/kvm.h"
>>>>>> #include "sysemu/dma.h"
>>>>>> #include "sysemu/sysemu.h"
>>>>>> +#include "sysemu/iommufd.h"
>>>>>> #include "hw/i386/apic_internal.h"
>>>>>> #include "kvm/kvm_i386.h"
>>>>>> #include "migration/vmstate.h"
>>>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
>>>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>>>>> return vtd_dev_as;
>>>>>> }
>>>>>>
>>>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>>>>>> + HostIOMMUDevice *hiod,
>>>>>> + Error **errp)
>>>>>> +{
>>>>>> +return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>>>>>> +  HostIOMMUDevice *hiod,
>>>>>> +  Error **errp)
>>>>>> +{
>>>>>> +return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int vtd_check_hdev(IntelIOMMUState *s,
>>> VTDHostIOMMUDevice
>>>>> *vtd_hdev,
>>>>>> +  Error **errp)
>>>>>> +{
>>>>>> +HostIOMMUDevice *hiod = vtd_hdev->dev;
>>>>>> +
>>>>>> +if (object_dynamic_cast(OBJECT(hiod), TYPE_HIOD_IOMMUFD)) {
>>>>>> +return vtd_check_iommufd_hdev(s, hiod, errp);
>>>>>> +}
>>>>>> +
>>>>>> +return vtd_check_legacy_hdev(s, hiod, errp);
>>>>>> +}
>>>>>
>>>>>
>>>>> I think we should be using the .get_host_iommu_info() class handler
>>>>> instead. Can we refactor the code slightly to avoid this check on
>>>>> the type ?
>>>>
>>>> There is some difficulty in avoiding this check, the behavior of
>>> vtd_check_legacy_hdev
>>>> and vtd_check_iommufd_hdev are different especially after nesting
>>> support introduced.
>>>> vtd_check_iommufd_hdev() has much wider check over cap/ecap bits
>>> besides aw_bits.
>>>
>>> I think it is important to fully separate the vIOMMU model from the
>>> host IOMMU backing device. Could we introduce a new
>>> HostIOMMUDeviceClass
>>> handler .check_hdev() handler, which would call .get_host_iommu_info() ?
>>
>> Understood, besi

RE: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device

2024-04-16 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device
>
>Hello,
>
>On 4/16/24 05:41, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device
>>>
>>> On 4/8/24 10:12, Zhenzhong Duan wrote:
>>>> HIODLegacyVFIO represents a host IOMMU device under VFIO legacy
>>>> container backend.
>>>>
>>>> It includes a link to VFIODevice.
>>>>
>>>> Suggested-by: Eric Auger 
>>>> Suggested-by: Cédric Le Goater 
>>>> Signed-off-by: Zhenzhong Duan 
>>>> ---
>>>>include/hw/vfio/vfio-common.h | 11 +++
>>>>hw/vfio/container.c   | 11 ++-
>>>>2 files changed, 21 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>>> common.h
>>>> index b9da6c08ef..f30772f534 100644
>>>> --- a/include/hw/vfio/vfio-common.h
>>>> +++ b/include/hw/vfio/vfio-common.h
>>>> @@ -31,6 +31,7 @@
>>>>#endif
>>>>#include "sysemu/sysemu.h"
>>>>#include "hw/vfio/vfio-container-base.h"
>>>> +#include "sysemu/host_iommu_device.h"
>>>>
>>>>#define VFIO_MSG_PREFIX "vfio %s: "
>>>>
>>>> @@ -147,6 +148,16 @@ typedef struct VFIOGroup {
>>>>bool ram_block_discard_allowed;
>>>>} VFIOGroup;
>>>>
>>>> +#define TYPE_HIOD_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-
>legacy-
>>> vfio"
>>>
>>> I would prefer to keep the prefix TYPE_HOST_IOMMU_DEVICE.
>>
>> Will do.
>>
>>>
>>>> +OBJECT_DECLARE_SIMPLE_TYPE(HIODLegacyVFIO, HIOD_LEGACY_VFIO)
>>>> +
>>>> +/* Abstraction of VFIO legacy host IOMMU device */
>>>> +struct HIODLegacyVFIO {
>>>
>>> same here
>>
>> Should I do the same for all the HostIOMMUDevice and
>HostIOMMUDeviceClass sub-structures?
>
>I would for type names. The main reason is for naming consistency, which is
>useful for grep and code analysis.

Got it.

>
>>
>> The reason I used 'HIOD' abbreviation is some function names become
>extremely long
>> and exceed 80 characters. E.g.:
>>
>> @@ -1148,9 +1148,9 @@ static void
>vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>>   vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>>   };
>>
>> -static int hiod_legacy_vfio_get_host_iommu_info(HostIOMMUDevice
>*hiod,
>> -void *data, uint32_t len,
>> -Error **errp)
>> +static int
>host_iommu_device_legacy_vfio_get_host_iommu_info(HostIOMMUDevice
>*hiod,
>> + void *data, uint32_t len,
>> + Error **errp)
>>   {
>>   VFIODevice *vbasedev = HIOD_LEGACY_VFIO(hiod)->vdev;
>>   /* iova_ranges is a sorted list */
>> @@ -1173,7 +1173,7 @@ static void
>hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
>>   {
>>   HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>>
>> -hioc->get_host_iommu_info = hiod_legacy_vfio_get_host_iommu_info;
>> +hioc->get_host_iommu_info =
>host_iommu_device_legacy_vfio_get_host_iommu_info;
>>   };
>>
>> I didn't find other way to make it meet the 80 chars limitation. Any
>suggestions on this?
>
>Try :
>
>@@ -1177,7 +1177,8 @@ static void hiod_legacy_vfio_class_init(
>  {
>  HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>
>-hioc->get_host_iommu_info = hiod_legacy_vfio_get_host_iommu_info;
>+hioc->get_host_iommu_info =
>+host_iommu_device_legacy_vfio_get_host_iommu_info;
>  };
>
>  static const TypeInfo types[] = {
>
>That said, I agree that 'host_iommu_device_legacy_vfio' routine prefix
>could be shortened to 'hiod_legacy_vfio'.

Got it.

Thanks
Zhenzhong
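
For readers following the naming point above, here is a minimal sketch of how a derived type name can keep the TYPE_HOST_IOMMU_DEVICE prefix through string-literal concatenation. The literal value "host-iommu-device" and the helper has_host_iommu_device_prefix() are assumptions made for illustration only; they are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Assumed value, standing in for the real QOM base type name. */
#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"

/* Adjacent string literals are concatenated at compile time. */
#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO \
    TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"

/* Hypothetical helper: does a type name carry the common prefix? */
static int has_host_iommu_device_prefix(const char *type_name)
{
    return strncmp(type_name, TYPE_HOST_IOMMU_DEVICE,
                   strlen(TYPE_HOST_IOMMU_DEVICE)) == 0;
}
```

Keeping the full prefix in type names means a single grep for the base name finds every derived type, while routine names can still use the shorter 'hiod_' prefix to stay within 80 columns.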



RE: [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap

2024-04-16 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>compatibility check with host IOMMU cap/ecap
>
>Hello,
>
>On 4/16/24 09:09, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>>> compatibility check with host IOMMU cap/ecap
>>>
>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
>>>> From: Yi Liu 
>>>>
>>>> If check fails, the host side device(either vfio or vdpa device) should not
>>>> be passed to guest.
>>>>
>>>> Implementation details for different backends will be in following
>patches.
>>>>
>>>> Signed-off-by: Yi Liu 
>>>> Signed-off-by: Yi Sun 
>>>> Signed-off-by: Zhenzhong Duan 
>>>> ---
>>>>hw/i386/intel_iommu.c | 35
>>> +++
>>>>1 file changed, 35 insertions(+)
>>>>
>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>> index 4f84e2e801..a49b587c73 100644
>>>> --- a/hw/i386/intel_iommu.c
>>>> +++ b/hw/i386/intel_iommu.c
>>>> @@ -35,6 +35,7 @@
>>>>#include "sysemu/kvm.h"
>>>>#include "sysemu/dma.h"
>>>>#include "sysemu/sysemu.h"
>>>> +#include "sysemu/iommufd.h"
>>>>#include "hw/i386/apic_internal.h"
>>>>#include "kvm/kvm_i386.h"
>>>>#include "migration/vmstate.h"
>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>>>return vtd_dev_as;
>>>>}
>>>>
>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>>>> + HostIOMMUDevice *hiod,
>>>> + Error **errp)
>>>> +{
>>>> +return 0;
>>>> +}
>>>> +
>>>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>>>> +  HostIOMMUDevice *hiod,
>>>> +  Error **errp)
>>>> +{
>>>> +return 0;
>>>> +}
>>>> +
>>>> +static int vtd_check_hdev(IntelIOMMUState *s,
>VTDHostIOMMUDevice
>>> *vtd_hdev,
>>>> +  Error **errp)
>>>> +{
>>>> +HostIOMMUDevice *hiod = vtd_hdev->dev;
>>>> +
>>>> +if (object_dynamic_cast(OBJECT(hiod), TYPE_HIOD_IOMMUFD)) {
>>>> +return vtd_check_iommufd_hdev(s, hiod, errp);
>>>> +}
>>>> +
>>>> +return vtd_check_legacy_hdev(s, hiod, errp);
>>>> +}
>>>
>>>
>>> I think we should be using the .get_host_iommu_info() class handler
>>> instead. Can we refactor the code slightly to avoid this check on
>>> the type ?
>>
>> There is some difficulty in avoiding this check, the behavior of
>vtd_check_legacy_hdev
>> and vtd_check_iommufd_hdev are different especially after nesting
>support introduced.
>> vtd_check_iommufd_hdev() has much wider check over cap/ecap bits
>besides aw_bits.
>
>I think it is important to fully separate the vIOMMU model from the
>host IOMMU backing device. Could we introduce a new
>HostIOMMUDeviceClass
>handler .check_hdev() handler, which would call .get_host_iommu_info() ?

Understood. Besides the new .check_hdev() handler, I think we also need a new
interface class TYPE_IOMMU_CHECK_HDEV which has two handlers,
check_[legacy|iommufd]_hdev(), and different vIOMMUs have different
implementations of them.

Then the legacy and iommufd host devices have different implementations of
.check_hdev(), each calling into one of the two interface handlers.

Let me know if I misunderstand any of your points.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>> That is the reason I have two functions to do different things.
>> See:
>>
>https://github.com/yiliu1765/qemu/blob/zhenzhong/iommufd_nesting_rfc
>v2/hw/i386/intel_iommu.c#L5472
>>
>> Meanwhile in vtd_check_legacy_hdev(), when legacy VFIO device attaches
>to modern vIOMMU,
>> this is unsupported and error out early, it will not
>call .get_host_iommu_info().
>> I mean we don't need to unconditionally call .get_host_iommu_info() in
>some cases.
>>
>> Thanks
>> Zhenzhong
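
The .check_hdev() class handler idea discussed above could look roughly like the following. This is only a sketch under stated assumptions: HostDev, HostDevClass, the aw_bits/caps fields and the comparison rules are hypothetical stand-ins, not QEMU's real types or the checks the series implements. It shows how the vIOMMU side can dispatch through a class handler instead of testing the object type.

```c
#include <assert.h>
#include <stdint.h>

/* All names here are hypothetical stand-ins, not QEMU's real types. */
typedef struct HostDev HostDev;

typedef struct HostDevClass {
    /* Reports host address width and capability bits; 0 on success. */
    int (*get_host_iommu_info)(const HostDev *dev, uint8_t *aw_bits,
                               uint64_t *caps);
    /* Backend-specific compatibility check requested by the vIOMMU. */
    int (*check_hdev)(const HostDev *dev, uint8_t want_aw_bits,
                      uint64_t want_caps);
} HostDevClass;

struct HostDev {
    const HostDevClass *klass;
    uint8_t aw_bits;
    uint64_t caps;
};

static int dev_get_info(const HostDev *dev, uint8_t *aw_bits, uint64_t *caps)
{
    *aw_bits = dev->aw_bits;
    *caps = dev->caps;
    return 0;
}

/* Legacy backend: only the address width is compared (assumed rule). */
static int legacy_check_hdev(const HostDev *dev, uint8_t want_aw_bits,
                             uint64_t want_caps)
{
    uint8_t aw_bits;
    uint64_t caps;

    (void)want_caps;
    if (dev->klass->get_host_iommu_info(dev, &aw_bits, &caps)) {
        return -1;
    }
    return want_aw_bits <= aw_bits ? 0 : -1;
}

/* iommufd backend: wider check, capability bits are compared as well. */
static int iommufd_check_hdev(const HostDev *dev, uint8_t want_aw_bits,
                              uint64_t want_caps)
{
    uint8_t aw_bits;
    uint64_t caps;

    if (dev->klass->get_host_iommu_info(dev, &aw_bits, &caps)) {
        return -1;
    }
    if (want_aw_bits > aw_bits) {
        return -1;
    }
    return (want_caps & ~caps) == 0 ? 0 : -1;
}

static const HostDevClass legacy_class = { dev_get_info, legacy_check_hdev };
static const HostDevClass iommufd_class = { dev_get_info, iommufd_check_hdev };

/* vIOMMU side: no type check, just call the class handler. */
static int viommu_check_hdev(const HostDev *dev, uint8_t aw_bits,
                             uint64_t caps)
{
    return dev->klass->check_hdev(dev, aw_bits, caps);
}
```

The open question in the thread remains: the actual comparison rules are vIOMMU-specific, which is why an additional interface on the vIOMMU side is proposed in the reply.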



RE: [PATCH v2 3/5] intel_iommu: Add a framework to do compatibility check with host IOMMU cap/ecap

2024-04-16 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
>compatibility check with host IOMMU cap/ecap
>
>On 4/8/24 10:44, Zhenzhong Duan wrote:
>> From: Yi Liu 
>>
>> If check fails, the host side device(either vfio or vdpa device) should not
>> be passed to guest.
>>
>> Implementation details for different backends will be in following patches.
>>
>> Signed-off-by: Yi Liu 
>> Signed-off-by: Yi Sun 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   hw/i386/intel_iommu.c | 35
>+++
>>   1 file changed, 35 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 4f84e2e801..a49b587c73 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -35,6 +35,7 @@
>>   #include "sysemu/kvm.h"
>>   #include "sysemu/dma.h"
>>   #include "sysemu/sysemu.h"
>> +#include "sysemu/iommufd.h"
>>   #include "hw/i386/apic_internal.h"
>>   #include "kvm/kvm_i386.h"
>>   #include "migration/vmstate.h"
>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
>*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>   return vtd_dev_as;
>>   }
>>
>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>> + HostIOMMUDevice *hiod,
>> + Error **errp)
>> +{
>> +return 0;
>> +}
>> +
>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>> +  HostIOMMUDevice *hiod,
>> +  Error **errp)
>> +{
>> +return 0;
>> +}
>> +
>> +static int vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice
>*vtd_hdev,
>> +  Error **errp)
>> +{
>> +HostIOMMUDevice *hiod = vtd_hdev->dev;
>> +
>> +if (object_dynamic_cast(OBJECT(hiod), TYPE_HIOD_IOMMUFD)) {
>> +return vtd_check_iommufd_hdev(s, hiod, errp);
>> +}
>> +
>> +return vtd_check_legacy_hdev(s, hiod, errp);
>> +}
>
>
>I think we should be using the .get_host_iommu_info() class handler
>instead. Can we refactor the code slightly to avoid this check on
>the type ?

There is some difficulty in avoiding this check: the behavior of
vtd_check_legacy_hdev() and vtd_check_iommufd_hdev() differs, especially
after nesting support is introduced.
vtd_check_iommufd_hdev() checks a much wider set of cap/ecap bits besides
aw_bits.
That is the reason I have two functions to do different things.
See:
https://github.com/yiliu1765/qemu/blob/zhenzhong/iommufd_nesting_rfcv2/hw/i386/intel_iommu.c#L5472

Meanwhile, in vtd_check_legacy_hdev(), attaching a legacy VFIO device to a
modern vIOMMU is unsupported, so the function errors out early and does not
call .get_host_iommu_info().
I mean we don't need to call .get_host_iommu_info() unconditionally; in some
cases it is not needed.

Thanks
Zhenzhong
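
As an illustration of the early-exit argument above, a minimal sketch follows. Everything in it is assumed for demonstration: fake_get_host_iommu_info() stands in for the real class handler, the call counter exists only to make the behavior observable, and the aw_bits comparison is a guess at what a legacy check would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static int info_calls; /* counts host queries, for demonstration only */

/* Hypothetical stand-in for the .get_host_iommu_info() handler. */
static int fake_get_host_iommu_info(uint8_t *aw_bits)
{
    info_calls++;
    *aw_bits = 48;
    return 0;
}

/* Legacy-device check: bail out before querying the host if unsupported. */
static int check_legacy_hdev(bool viommu_modern, uint8_t viommu_aw_bits)
{
    uint8_t host_aw_bits;

    if (viommu_modern) {
        /* Unsupported combination: no need to query the host at all. */
        return -1;
    }
    if (fake_get_host_iommu_info(&host_aw_bits)) {
        return -1;
    }
    return viommu_aw_bits <= host_aw_bits ? 0 : -1;
}
```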


RE: [PATCH v2 08/10] vfio: Create host IOMMU device instance

2024-04-16 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 08/10] vfio: Create host IOMMU device instance
>
>On 4/8/24 10:12, Zhenzhong Duan wrote:
>> Create host IOMMU device instance and initialize it based on backend.
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h | 1 +
>>   hw/vfio/container.c   | 5 +
>>   hw/vfio/iommufd.c | 8 
>>   3 files changed, 14 insertions(+)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index d382b12ec1..4fbba85018 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -126,6 +126,7 @@ typedef struct VFIODevice {
>>   OnOffAuto pre_copy_dirty_page_tracking;
>>   bool dirty_pages_supported;
>>   bool dirty_tracking;
>> +HostIOMMUDevice *hiod;
>>   int devid;
>>   IOMMUFDBackend *iommufd;
>>   } VFIODevice;
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index ba0ad4a41b..fc0c027501 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -915,6 +915,7 @@ static int vfio_legacy_attach_device(const char
>*name, VFIODevice *vbasedev,
>>   VFIODevice *vbasedev_iter;
>>   VFIOGroup *group;
>>   VFIOContainerBase *bcontainer;
>> +HIODLegacyVFIO *hiod_vfio;
>
>s/hiod_vfio/hiod/ please. Same below.

Will do.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>
>>   int ret;
>>
>>   if (groupid < 0) {
>> @@ -945,6 +946,9 @@ static int vfio_legacy_attach_device(const char
>*name, VFIODevice *vbasedev,
>>   vbasedev->bcontainer = bcontainer;
>>   QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev,
>container_next);
>>   QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next);
>> +hiod_vfio =
>HIOD_LEGACY_VFIO(object_new(TYPE_HIOD_LEGACY_VFIO));
>> +hiod_vfio->vdev = vbasedev;
>> +vbasedev->hiod = HOST_IOMMU_DEVICE(hiod_vfio);
>>
>>   return ret;
>>   }
>> @@ -959,6 +963,7 @@ static void vfio_legacy_detach_device(VFIODevice
>*vbasedev)
>>   trace_vfio_detach_device(vbasedev->name, group->groupid);
>>   vfio_put_base_device(vbasedev);
>>   vfio_put_group(group);
>> +object_unref(vbasedev->hiod);
>>   }
>>
>>   static int vfio_legacy_pci_hot_reset(VFIODevice *vbasedev, bool single)
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index 115b9f8e7f..b6d058339b 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -308,6 +308,7 @@ static int iommufd_cdev_attach(const char *name,
>VFIODevice *vbasedev,
>>   VFIOIOMMUFDContainer *container;
>>   VFIOAddressSpace *space;
>>   struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
>> +HIODIOMMUFDVFIO *hiod_vfio;
>>   int ret, devfd;
>>   uint32_t ioas_id;
>>   Error *err = NULL;
>> @@ -431,6 +432,12 @@ found_container:
>>   QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev,
>container_next);
>>   QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next);
>>
>> +hiod_vfio =
>HIOD_IOMMUFD_VFIO(object_new(TYPE_HIOD_IOMMUFD_VFIO));
>> +hiod_iommufd_init(HIOD_IOMMUFD(hiod_vfio), vbasedev->iommufd,
>> +  vbasedev->devid);
>> +hiod_vfio->vdev = vbasedev;
>> +vbasedev->hiod = HOST_IOMMU_DEVICE(hiod_vfio);
>> +
>>   trace_iommufd_cdev_device_info(vbasedev->name, devfd, vbasedev-
>>num_irqs,
>>  vbasedev->num_regions, vbasedev->flags);
>>   return 0;
>> @@ -468,6 +475,7 @@ static void iommufd_cdev_detach(VFIODevice
>*vbasedev)
>>   iommufd_cdev_detach_container(vbasedev, container);
>>   iommufd_cdev_container_destroy(container);
>>   vfio_put_address_space(space);
>> +object_unref(vbasedev->hiod);
>>
>>   iommufd_cdev_unbind_and_disconnect(vbasedev);
>>   close(vbasedev->fd);
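
The attach/detach pairing in the patch above (object_new() on attach, object_unref() on detach) can be sketched with a tiny reference-counted stand-in. Hiod, Dev, hiod_new() and hiod_unref() are hypothetical names for illustration and do not model QOM itself; clearing the pointer on detach is a choice of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for HostIOMMUDevice and VFIODevice. */
typedef struct Hiod {
    int refcount;
    void *vdev;   /* back-link to the owning device */
} Hiod;

typedef struct Dev {
    Hiod *hiod;
} Dev;

static int live_objects; /* tracks allocations, for demonstration only */

static Hiod *hiod_new(void)
{
    Hiod *h = calloc(1, sizeof(*h));

    if (!h) {
        abort();
    }
    h->refcount = 1;
    live_objects++;
    return h;
}

static void hiod_unref(Hiod *h)
{
    if (h && --h->refcount == 0) {
        live_objects--;
        free(h);
    }
}

/* Attach: create the host IOMMU device and link it both ways. */
static void dev_attach(Dev *d)
{
    d->hiod = hiod_new();
    d->hiod->vdev = d;
}

/* Detach: drop the reference taken at attach time. */
static void dev_detach(Dev *d)
{
    hiod_unref(d->hiod);
    d->hiod = NULL;
}
```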



RE: [PATCH v2 09/10] hw/pci: Introduce pci_device_set/unset_iommu_device()

2024-04-16 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 09/10] hw/pci: Introduce
>pci_device_set/unset_iommu_device()
>
>On 4/8/24 10:12, Zhenzhong Duan wrote:
>> From: Yi Liu 
>>
>> This adds pci_device_set/unset_iommu_device() to set/unset
>> HostIOMMUDevice for a given PCI device. Caller of set
>> should fail if set operation fails.
>>
>> Extract out pci_device_get_iommu_bus_devfn() to facilitate
>
>I would separate this change in a prereq patch.

Will do.

Thanks
Zhenzhong



RE: [PATCH v2 07/10] backends/iommufd: Implement get_host_iommu_info() callback

2024-04-16 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 07/10] backends/iommufd: Implement
>get_host_iommu_info() callback
>
>On 4/8/24 10:12, Zhenzhong Duan wrote:
>> It calls iommufd_backend_get_device_info() to get host IOMMU
>> related information.
>>
>> Define a common structure HIOD_IOMMUFD_INFO to describe the info
>> returned from kernel. Currently only vtd, but easy to add arm smmu
>> when kernel supports.
>
>I think you can merge the previous patch and this one.

Sure.

>
>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/sysemu/iommufd.h |  7 +++
>>   backends/iommufd.c   | 17 +
>>   2 files changed, 24 insertions(+)
>>
>> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
>> index fa1a866237..44ec1335b2 100644
>> --- a/include/sysemu/iommufd.h
>> +++ b/include/sysemu/iommufd.h
>
>I just noticed that include/sysemu/iommufd.h lacks a header.  Could you fix
>that please ?

Sure. I presume you mean the copyright header. Correct me if you mean something else.

>
>> @@ -39,6 +39,13 @@ int
>iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>   enum iommu_hw_info_type *type,
>>   void *data, uint32_t len, Error **errp);
>>
>> +typedef struct HIOD_IOMMUFD_INFO {
>
>Please use CamelCase names.

Sure.

Thanks
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>> +enum iommu_hw_info_type type;
>> +union {
>> +struct iommu_hw_info_vtd vtd;
>> +} data;
>> +} HIOD_IOMMUFD_INFO;
>> +
>>   #define TYPE_HIOD_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
>>   OBJECT_DECLARE_TYPE(HIODIOMMUFD, HIODIOMMUFDClass,
>HIOD_IOMMUFD)
>>
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index 559affa9ec..1e9c469e65 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -240,8 +240,25 @@ void hiod_iommufd_init(HIODIOMMUFD *idev,
>IOMMUFDBackend *iommufd,
>>   idev->devid = devid;
>>   }
>>
>> +static int hiod_iommufd_get_host_iommu_info(HostIOMMUDevice
>*hiod,
>> +void *data, uint32_t len,
>> +Error **errp)
>> +{
>> +HIODIOMMUFD *idev = HIOD_IOMMUFD(hiod);
>> +HIOD_IOMMUFD_INFO *info = data;
>> +
>> +assert(sizeof(HIOD_IOMMUFD_INFO) <= len);
>> +
>> +return iommufd_backend_get_device_info(idev->iommufd, idev->devid,
>> +   &info->type, &info->data,
>> +   sizeof(info->data), errp);
>> +}
>> +
>>   static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
>>   {
>> +HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> +hiodc->get_host_iommu_info = hiod_iommufd_get_host_iommu_info;
>>   }
>>
>>   static const TypeInfo types[] = {
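
To make the review comments above concrete, here is a sketch of the info structure with a CamelCase name and a length check in the getter. The name HostIOMMUDeviceIOMMUFDInfo, the HwInfoType enum, the VtdInfo layout and fake_backend_get_device_info() are all assumptions standing in for the kernel uAPI types and the real backend helper; the sketch returns an error where the patch uses assert().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel uAPI types. */
typedef enum HwInfoType { HW_INFO_NONE = 0, HW_INFO_VTD = 1 } HwInfoType;

typedef struct VtdInfo {
    uint64_t cap_reg;
    uint64_t ecap_reg;
} VtdInfo;

/* CamelCase name, as requested in review (the exact name is assumed). */
typedef struct HostIOMMUDeviceIOMMUFDInfo {
    HwInfoType type;
    union {
        VtdInfo vtd;
    } data;
} HostIOMMUDeviceIOMMUFDInfo;

/* Hypothetical backend query; returns fixed data instead of an ioctl. */
static int fake_backend_get_device_info(HwInfoType *type, void *data,
                                        uint32_t len)
{
    VtdInfo v = { .cap_reg = 0x1234, .ecap_reg = 0x5678 };

    if (len < sizeof(v)) {
        return -1;
    }
    memcpy(data, &v, sizeof(v));
    *type = HW_INFO_VTD;
    return 0;
}

/* Getter: reject a caller buffer too small for the common structure. */
static int get_host_iommu_info(void *data, uint32_t len)
{
    HostIOMMUDeviceIOMMUFDInfo *info = data;

    if (len < sizeof(*info)) {
        return -1;
    }
    return fake_backend_get_device_info(&info->type, &info->data,
                                        sizeof(info->data));
}
```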



RE: [PATCH v2 06/10] backends/iommufd: Introduce helper function iommufd_backend_get_device_info()

2024-04-16 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 06/10] backends/iommufd: Introduce helper
>function iommufd_backend_get_device_info()
>
>On 4/8/24 10:12, Zhenzhong Duan wrote:
>> Introduce a helper function iommufd_backend_get_device_info() to get
>> host IOMMU related information through iommufd uAPI.
>>
>> Signed-off-by: Yi Liu 
>> Signed-off-by: Yi Sun 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/sysemu/iommufd.h |  4 
>>   backends/iommufd.c   | 23 ++-
>>   2 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
>> index 71c53cbb45..fa1a866237 100644
>> --- a/include/sysemu/iommufd.h
>> +++ b/include/sysemu/iommufd.h
>> @@ -4,6 +4,7 @@
>>   #include "qom/object.h"
>>   #include "exec/hwaddr.h"
>>   #include "exec/cpu-common.h"
>> +#include <linux/iommufd.h>
>>   #include "sysemu/host_iommu_device.h"
>>
>>   #define TYPE_IOMMUFD_BACKEND "iommufd"
>> @@ -34,6 +35,9 @@ int iommufd_backend_map_dma(IOMMUFDBackend
>*be, uint32_t ioas_id, hwaddr iova,
>>   ram_addr_t size, void *vaddr, bool readonly);
>>   int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t
>ioas_id,
>> hwaddr iova, ram_addr_t size);
>> +int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t
>devid,
>> +enum iommu_hw_info_type *type,
>> +void *data, uint32_t len, Error **errp);
>>
>>   #define TYPE_HIOD_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
>>   OBJECT_DECLARE_TYPE(HIODIOMMUFD, HIODIOMMUFDClass,
>HIOD_IOMMUFD)
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index ef8b3a808b..559affa9ec 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -20,7 +20,6 @@
>>   #include "monitor/monitor.h"
>>   #include "trace.h"
>>   #include 
>> -#include 
>>
>>   static void iommufd_backend_init(Object *obj)
>>   {
>> @@ -212,6 +211,28 @@ int
>iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
>>   return ret;
>>   }
>>
>> +int iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t
>devid,
>> +enum iommu_hw_info_type *type,
>> +void *data, uint32_t len, Error **errp)
>> +{
>> +struct iommu_hw_info info = {
>> +.size = sizeof(info),
>> +.dev_id = devid,
>> +.data_len = len,
>> +.data_uptr = (uintptr_t)data,
>> +};
>> +int ret;
>> +
>> +ret = ioctl(be->fd, IOMMU_GET_HW_INFO, &info);
>> +if (ret) {
>> +error_setg_errno(errp, errno, "Failed to get hardware info");
>> +} else {
>> +*type = info.out_data_type;
>
>type should not be NULL.

Yes, will add g_assert(type);

Thanks
Zhenzhong



RE: [PATCH v2 05/10] vfio: Implement get_host_iommu_info() callback

2024-04-15 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 05/10] vfio: Implement get_host_iommu_info()
>callback
>
>On 4/8/24 10:12, Zhenzhong Duan wrote:
>> Utilize iova_ranges to calculate host IOMMU address width and
>> package it in HIOD_LEGACY_INFO for vIOMMU usage.
>>
>> HIOD_LEGACY_INFO will be used by both VFIO and VDPA so declare
>> it in host_iommu_device.h.
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/sysemu/host_iommu_device.h | 10 ++
>>   hw/vfio/container.c| 24 
>>   2 files changed, 34 insertions(+)
>>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> index 22ccbe3a5d..beb8be8231 100644
>> --- a/include/sysemu/host_iommu_device.h
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -16,4 +16,14 @@ struct HostIOMMUDeviceClass {
>>   int (*get_host_iommu_info)(HostIOMMUDevice *hiod, void *data,
>uint32_t len,
>>  Error **errp);
>>   };
>> +
>> +/*
>> + * Define the format of host IOMMU related info that current VFIO
>> + * or VDPA can provide to vIOMMU.
>> + *
>> + * @aw_bits: Host IOMMU address width. 0xff if no limitation.
>> + */
>> +typedef struct HIOD_LEGACY_INFO {
>
>Please use CamelCase names.

Sure.

>
>> +uint8_t aw_bits;
>> +} HIOD_LEGACY_INFO;
>>   #endif
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 44018ef085..ba0ad4a41b 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1143,8 +1143,32 @@ static void
>vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>>   vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>>   };
>>
>> +static int hiod_legacy_vfio_get_host_iommu_info(HostIOMMUDevice
>*hiod,
>> +void *data, uint32_t len,
>> +Error **errp)
>> +{
>> +VFIODevice *vbasedev = HIOD_LEGACY_VFIO(hiod)->vdev;
>> +/* iova_ranges is a sorted list */
>> +GList *l = g_list_last(vbasedev->bcontainer->iova_ranges);
>> +HIOD_LEGACY_INFO *info = data;
>> +
>> +assert(sizeof(HIOD_LEGACY_INFO) <= len);
>> +
>> +if (l) {
>> +Range *range = l->data;
>> +info->aw_bits = find_last_bit(&range->upb, BITS_PER_LONG) + 1;
>
>There is a comment in range.h saying:
>
> /*
>  * Do not access members directly, use the functions!
>
>Please introduce a new helper.

Sure, thanks for pointing that out.

BRs.
Zhenzhong

>
>
>Thanks,
>
>C.
>
>
>
>> +} else {
>> +info->aw_bits = 0xff;
>> +}
>> +
>> +return 0;
>> +}
>> +
>>   static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
>>   {
>> +HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> +hioc->get_host_iommu_info =
>hiod_legacy_vfio_get_host_iommu_info;
>>   };
>>
>>   static const TypeInfo types[] = {
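
For reference, the accessor-based helper requested in the review could look roughly like the sketch below. It is only an illustration: the `Range` stand-in, `range_upb()` and the name `range_upb_width()` are assumptions made to keep the example self-contained, not the helper that will actually go into range.h. For a non-zero upper bound on a 64-bit host it yields the same value as `find_last_bit(&range->upb, BITS_PER_LONG) + 1`, without the caller touching the member directly.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for QEMU's Range (assumption); the real struct is in range.h. */
typedef struct Range {
    uint64_t lob;   /* inclusive lower bound */
    uint64_t upb;   /* inclusive upper bound */
} Range;

/* Accessor, so callers never read the member directly. */
static inline uint64_t range_upb(const Range *range)
{
    assert(range);
    return range->upb;
}

/*
 * Hypothetical helper: number of address bits needed to cover the
 * upper bound, i.e. the index of its highest set bit plus one.
 * Returns 0 when the upper bound is 0.
 */
static inline unsigned range_upb_width(const Range *range)
{
    uint64_t upb = range_upb(range);
    unsigned bits = 0;

    while (upb) {
        bits++;
        upb >>= 1;
    }
    return bits;
}
```

With such a helper the callback would only need something like `info->aw_bits = range_upb_width(range);` for the last entry of the sorted iova_ranges list.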



RE: [PATCH v2 03/10] backends/iommufd: Introduce abstract HIODIOMMUFD device

2024-04-15 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 03/10] backends/iommufd: Introduce abstract
>HIODIOMMUFD device
>
>On 4/8/24 10:12, Zhenzhong Duan wrote:
>> HIODIOMMUFD represents a host IOMMU device under iommufd backend.
>>
>> Currently it includes only public iommufd handle and device id.
>> which could be used to get hw IOMMU information.
>>
>> When nested translation is supported in future, vIOMMU is going
>> to have iommufd related operations like attaching/detaching hwpt,
>> So IOMMUFDDevice interface will be further extended at that time.
>>
>> VFIO and VDPA device have different way of attaching/detaching hwpt.
>> So HIODIOMMUFD is still an abstract class which will be inherited by
>> VFIO and VDPA device.
>>
>> Introduce a helper hiod_iommufd_init() to initialize HIODIOMMUFD
>> device.
>>
>> Suggested-by: Cédric Le Goater 
>> Originally-by: Yi Liu 
>> Signed-off-by: Yi Sun 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/sysemu/iommufd.h | 22 +++
>>   backends/iommufd.c   | 47 ++--
>>   2 files changed, 53 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
>> index 9af27ebd6c..71c53cbb45 100644
>> --- a/include/sysemu/iommufd.h
>> +++ b/include/sysemu/iommufd.h
>> @@ -4,6 +4,7 @@
>>   #include "qom/object.h"
>>   #include "exec/hwaddr.h"
>>   #include "exec/cpu-common.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>>   #define TYPE_IOMMUFD_BACKEND "iommufd"
>>   OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass,
>IOMMUFD_BACKEND)
>> @@ -33,4 +34,25 @@ int
>iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id,
>hwaddr iova,
>>   ram_addr_t size, void *vaddr, bool readonly);
>>   int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t
>ioas_id,
>> hwaddr iova, ram_addr_t size);
>> +
>> +#define TYPE_HIOD_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
>
>Please keep TYPE_HOST_IOMMU_DEVICE

Sure.

>
>> +OBJECT_DECLARE_TYPE(HIODIOMMUFD, HIODIOMMUFDClass,
>HIOD_IOMMUFD)
>> +
>> +struct HIODIOMMUFD {
>> +/*< private >*/
>> +HostIOMMUDevice parent;
>> +void *opaque;
>> +
>> +/*< public >*/
>> +IOMMUFDBackend *iommufd;
>> +uint32_t devid;
>> +};
>> +
>> +struct HIODIOMMUFDClass {
>> +/*< private >*/
>> +HostIOMMUDeviceClass parent_class;
>> +};
>
>This new class doesn't seem useful. Do you have plans for handlers ?

Yes. In the nesting series
https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting_rfcv2/
this commit
https://github.com/yiliu1765/qemu/commit/581fc900aa296988eaa48abee6d68d3670faf8c9
implements the [at|de]tach_hwpt handlers.

That is why I added an extra layer, the abstract HIODIOMMUFDClass.

>
>> +
>> +void hiod_iommufd_init(HIODIOMMUFD *idev, IOMMUFDBackend
>*iommufd,
>> +   uint32_t devid);
>>   #endif
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index 62a79fa6b0..ef8b3a808b 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -212,23 +212,38 @@ int
>iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
>>   return ret;
>>   }
>>
>> -static const TypeInfo iommufd_backend_info = {
>> -.name = TYPE_IOMMUFD_BACKEND,
>> -.parent = TYPE_OBJECT,
>> -.instance_size = sizeof(IOMMUFDBackend),
>> -.instance_init = iommufd_backend_init,
>> -.instance_finalize = iommufd_backend_finalize,
>> -.class_size = sizeof(IOMMUFDBackendClass),
>> -.class_init = iommufd_backend_class_init,
>> -.interfaces = (InterfaceInfo[]) {
>> -{ TYPE_USER_CREATABLE },
>> -{ }
>> -}
>> -};
>> +void hiod_iommufd_init(HIODIOMMUFD *idev, IOMMUFDBackend
>*iommufd,
>> +   uint32_t devid)
>> +{
>> +idev->iommufd = iommufd;
>> +idev->devid = devid;
>> +}
>
>This routine doesn't seem useful. I wonder if we shouldn't introduce
>properties. I'm not sure this is useful either.

This routine is called in patch 8 to initialize iommufd, devid and (in the
future nesting series) ioas.
I didn't choose properties because HIODIOMMUFD is not user creatable, so a
property feels a bit heavy here. But I'm fine with using one if you prefer.

Thanks
Zhenzhong

>
>
>> -static void register_types(void)
>> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
>>   {
>> -type_register_static(&iommufd_backend_info);
>>   }
>>
>> -type_init(register_types);
>> +static const TypeInfo types[] = {
>> +{
>> +.name = TYPE_IOMMUFD_BACKEND,
>> +.parent = TYPE_OBJECT,
>> +.instance_size = sizeof(IOMMUFDBackend),
>> +.instance_init = iommufd_backend_init,
>> +.instance_finalize = iommufd_backend_finalize,
>> +.class_size = sizeof(IOMMUFDBackendClass),
>> +.class_init = iommufd_backend_class_init,
>> +.interfaces = (InterfaceInfo[]) {
>> +{ TYPE_USER_CREATABLE },
>> +{ }
>> +}
>> +}, {
>> +

RE: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device

2024-04-15 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device
>
>On 4/8/24 10:12, Zhenzhong Duan wrote:
>> HIODLegacyVFIO represents a host IOMMU device under VFIO legacy
>> container backend.
>>
>> It includes a link to VFIODevice.
>>
>> Suggested-by: Eric Auger 
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h | 11 +++
>>   hw/vfio/container.c   | 11 ++-
>>   2 files changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index b9da6c08ef..f30772f534 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -31,6 +31,7 @@
>>   #endif
>>   #include "sysemu/sysemu.h"
>>   #include "hw/vfio/vfio-container-base.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>>   #define VFIO_MSG_PREFIX "vfio %s: "
>>
>> @@ -147,6 +148,16 @@ typedef struct VFIOGroup {
>>   bool ram_block_discard_allowed;
>>   } VFIOGroup;
>>
>> +#define TYPE_HIOD_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-
>vfio"
>
>I would prefer to keep the prefix TYPE_HOST_IOMMU_DEVICE.

Will do.

>
>> +OBJECT_DECLARE_SIMPLE_TYPE(HIODLegacyVFIO, HIOD_LEGACY_VFIO)
>> +
>> +/* Abstraction of VFIO legacy host IOMMU device */
>> +struct HIODLegacyVFIO {
>
>same here

Should I do the same for all the HostIOMMUDevice and HostIOMMUDeviceClass
sub-structures?

The reason I used the 'HIOD' abbreviation is that some function names become
extremely long and exceed 80 characters. E.g.:

@@ -1148,9 +1148,9 @@ static void vfio_iommu_legacy_class_init(ObjectClass 
*klass, void *data)
 vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
 };

-static int hiod_legacy_vfio_get_host_iommu_info(HostIOMMUDevice *hiod,
-void *data, uint32_t len,
-Error **errp)
+static int host_iommu_device_legacy_vfio_get_host_iommu_info(HostIOMMUDevice 
*hiod,
+ void *data, 
uint32_t len,
+ Error **errp)
 {
 VFIODevice *vbasedev = HIOD_LEGACY_VFIO(hiod)->vdev;
 /* iova_ranges is a sorted list */
@@ -1173,7 +1173,7 @@ static void hiod_legacy_vfio_class_init(ObjectClass *oc, 
void *data)
 {
 HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);

-hioc->get_host_iommu_info = hiod_legacy_vfio_get_host_iommu_info;
+hioc->get_host_iommu_info = 
host_iommu_device_legacy_vfio_get_host_iommu_info;
 };

I couldn't find another way to stay within the 80-character limit. Any
suggestions on this?

>
>> +/*< private >*/
>> +HostIOMMUDevice parent;
>> +VFIODevice *vdev;
>
>It seems to me that the back pointer should be on the container instead.
>Looks more correct conceptually.

Yes, that makes sense for legacy VFIO, as iova_ranges, pgsizes, etc. are all
saved in bcontainer.

>
>
>> +};
>> +
>>   typedef struct VFIODMABuf {
>>   QemuDmaBuf buf;
>>   uint32_t pos_x, pos_y, pos_updates;
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 77bdec276e..44018ef085 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1143,12 +1143,21 @@ static void
>vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>>   vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>>   };
>>
>> +static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
>> +{
>> +};
>
>It is preferable to introduce routines only when they are actually useful.
>Please drop the .class_init definition.

Sure.

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>> +
>>   static const TypeInfo types[] = {
>>   {
>>   .name = TYPE_VFIO_IOMMU_LEGACY,
>>   .parent = TYPE_VFIO_IOMMU,
>>   .class_init = vfio_iommu_legacy_class_init,
>> -},
>> +}, {
>> +.name = TYPE_HIOD_LEGACY_VFIO,
>> +.parent = TYPE_HOST_IOMMU_DEVICE,
>> +.instance_size = sizeof(HIODLegacyVFIO),
>> +.class_init = hiod_legacy_vfio_class_init,
>> +}
>>   };
>>
>>   DEFINE_TYPES(types)



RE: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device

2024-04-15 Thread Duan, Zhenzhong


>-Original Message-
>From: Philippe Mathieu-Daudé 
>Subject: Re: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device
>
>On 15/4/24 12:10, Duan, Zhenzhong wrote:
>> Hi Philippe,
>>
>>> -Original Message-
>>> From: Philippe Mathieu-Daudé 
>>> Sent: Monday, April 15, 2024 5:20 PM
>>> To: Duan, Zhenzhong ; qemu-
>>> de...@nongnu.org
>>> Cc: alex.william...@redhat.com; c...@redhat.com;
>eric.au...@redhat.com;
>>> pet...@redhat.com; jasow...@redhat.com; m...@redhat.com;
>>> j...@nvidia.com; nicol...@nvidia.com; joao.m.mart...@oracle.com; Tian,
>>> Kevin ; Liu, Yi L ; Peng, Chao P
>>> 
>>> Subject: Re: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device
>>>
>>> On 8/4/24 10:12, Zhenzhong Duan wrote:
>>>> HIODLegacyVFIO represents a host IOMMU device under VFIO legacy
>>>> container backend.
>>>>
>>>> It includes a link to VFIODevice.
>>>>
>>>> Suggested-by: Eric Auger 
>>>> Suggested-by: Cédric Le Goater 
>>>> Signed-off-by: Zhenzhong Duan 
>>>> ---
>>>>include/hw/vfio/vfio-common.h | 11 +++
>>>>hw/vfio/container.c   | 11 ++-
>>>>2 files changed, 21 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>>> common.h
>>>> index b9da6c08ef..f30772f534 100644
>>>> --- a/include/hw/vfio/vfio-common.h
>>>> +++ b/include/hw/vfio/vfio-common.h
>>>> @@ -31,6 +31,7 @@
>>>>#endif
>>>>#include "sysemu/sysemu.h"
>>>>#include "hw/vfio/vfio-container-base.h"
>>>> +#include "sysemu/host_iommu_device.h"
>>>>
>>>>#define VFIO_MSG_PREFIX "vfio %s: "
>>>>
>>>> @@ -147,6 +148,16 @@ typedef struct VFIOGroup {
>>>>bool ram_block_discard_allowed;
>>>>} VFIOGroup;
>>>>
>>>> +#define TYPE_HIOD_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-
>legacy-
>>> vfio"
>>>> +OBJECT_DECLARE_SIMPLE_TYPE(HIODLegacyVFIO, HIOD_LEGACY_VFIO)
>>>> +
>>>> +/* Abstraction of VFIO legacy host IOMMU device */
>>>> +struct HIODLegacyVFIO {
>>>> +/*< private >*/
>>>
>>> Please drop this comment.
>>
>> Will do. But may I ask the rules when to use that comment and when not?
>
>Sure, see
>https://www.qemu.org/docs/master/devel/style.html#qemu-object-model-
>declarations

Learned, thanks Philippe.

BRs.
Zhenzhong


RE: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device

2024-04-15 Thread Duan, Zhenzhong
Hi Philippe,

>-Original Message-
>From: Philippe Mathieu-Daudé 
>Sent: Monday, April 15, 2024 5:20 PM
>To: Duan, Zhenzhong ; qemu-
>de...@nongnu.org
>Cc: alex.william...@redhat.com; c...@redhat.com; eric.au...@redhat.com;
>pet...@redhat.com; jasow...@redhat.com; m...@redhat.com;
>j...@nvidia.com; nicol...@nvidia.com; joao.m.mart...@oracle.com; Tian,
>Kevin ; Liu, Yi L ; Peng, Chao P
>
>Subject: Re: [PATCH v2 02/10] vfio: Introduce HIODLegacyVFIO device
>
>On 8/4/24 10:12, Zhenzhong Duan wrote:
>> HIODLegacyVFIO represents a host IOMMU device under VFIO legacy
>> container backend.
>>
>> It includes a link to VFIODevice.
>>
>> Suggested-by: Eric Auger 
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/hw/vfio/vfio-common.h | 11 +++
>>   hw/vfio/container.c   | 11 ++-
>>   2 files changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index b9da6c08ef..f30772f534 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -31,6 +31,7 @@
>>   #endif
>>   #include "sysemu/sysemu.h"
>>   #include "hw/vfio/vfio-container-base.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>>   #define VFIO_MSG_PREFIX "vfio %s: "
>>
>> @@ -147,6 +148,16 @@ typedef struct VFIOGroup {
>>   bool ram_block_discard_allowed;
>>   } VFIOGroup;
>>
>> +#define TYPE_HIOD_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-
>vfio"
>> +OBJECT_DECLARE_SIMPLE_TYPE(HIODLegacyVFIO, HIOD_LEGACY_VFIO)
>> +
>> +/* Abstraction of VFIO legacy host IOMMU device */
>> +struct HIODLegacyVFIO {
>> +/*< private >*/
>
>Please drop this comment.

Will do. But may I ask what the rule is for when to use that comment and when
not? I see some QOM types use it to mark private vs. public, for example:

struct AccelState {
/*< private >*/
Object parent_obj;
};

typedef struct AccelClass {
/*< private >*/
ObjectClass parent_class;
/*< public >*/

>
>> +HostIOMMUDevice parent;
>
>Please name 'parent_obj'.

Will do.

Thanks
Zhenzhong

>
>> +VFIODevice *vdev;
>> +};



RE: [PATCH v2 01/10] backends: Introduce abstract HostIOMMUDevice

2024-04-15 Thread Duan, Zhenzhong


>-Original Message-
>From: Philippe Mathieu-Daudé 
>Subject: Re: [PATCH v2 01/10] backends: Introduce abstract
>HostIOMMUDevice
>
>Hi Zhenzhong,
>
>On 8/4/24 10:12, Zhenzhong Duan wrote:
>> Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
>>
>> get_host_iommu_info() is used to get host IOMMU info, different
>> backends can have different implementations and result format.
>>
>> Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
>> for VFIO, and VDPA in the future.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   MAINTAINERS|  2 ++
>>   include/sysemu/host_iommu_device.h | 19 +++
>>   backends/host_iommu_device.c   | 19 +++
>>   backends/Kconfig   |  5 +
>>   backends/meson.build   |  1 +
>>   5 files changed, 46 insertions(+)
>>   create mode 100644 include/sysemu/host_iommu_device.h
>>   create mode 100644 backends/host_iommu_device.c
>
>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> new file mode 100644
>> index 00..22ccbe3a5d
>> --- /dev/null
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -0,0 +1,19 @@
>> +#ifndef HOST_IOMMU_DEVICE_H
>> +#define HOST_IOMMU_DEVICE_H
>> +
>> +#include "qom/object.h"
>> +
>> +#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>> +OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>HOST_IOMMU_DEVICE)
>> +
>> +struct HostIOMMUDevice {
>> +Object parent;
>> +};
>> +
>> +struct HostIOMMUDeviceClass {
>> +ObjectClass parent_class;
>> +
>> +int (*get_host_iommu_info)(HostIOMMUDevice *hiod, void *data,
>uint32_t len,
>> +   Error **errp);
>
>Please document this new method (in particular return value and @data).
>
>Since @len is sizeof(data), can we use the size_t type?

Sure, will do.

Thanks
Zhenzhong


RE: [PATCH v2 01/10] backends: Introduce abstract HostIOMMUDevice

2024-04-15 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v2 01/10] backends: Introduce abstract
>HostIOMMUDevice
>
>On 4/8/24 10:12, Zhenzhong Duan wrote:
>> Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
>>
>> get_host_iommu_info() is used to get host IOMMU info, different
>> backends can have different implementations and result format.
>>
>> Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
>> for VFIO, and VDPA in the future.
>>
>> Suggested-by: Cédric Le Goater 
>> Signed-off-by: Zhenzhong Duan 
>
>LGTM,
>
>> ---
>>   MAINTAINERS|  2 ++
>>   include/sysemu/host_iommu_device.h | 19 +++
>>   backends/host_iommu_device.c   | 19 +++
>>   backends/Kconfig   |  5 +
>>   backends/meson.build   |  1 +
>>   5 files changed, 46 insertions(+)
>>   create mode 100644 include/sysemu/host_iommu_device.h
>>   create mode 100644 backends/host_iommu_device.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index e71183eef9..22f71cbe02 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -2202,6 +2202,8 @@ M: Zhenzhong Duan
>
>>   S: Supported
>>   F: backends/iommufd.c
>>   F: include/sysemu/iommufd.h
>> +F: backends/host_iommu_device.c
>> +F: include/sysemu/host_iommu_device.h
>>   F: include/qemu/chardev_open.h
>>   F: util/chardev_open.c
>>   F: docs/devel/vfio-iommufd.rst
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> new file mode 100644
>> index 00..22ccbe3a5d
>> --- /dev/null
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -0,0 +1,19 @@
>> +#ifndef HOST_IOMMU_DEVICE_H
>> +#define HOST_IOMMU_DEVICE_H
>> +
>> +#include "qom/object.h"
>> +
>> +#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>> +OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass,
>HOST_IOMMU_DEVICE)
>> +
>> +struct HostIOMMUDevice {
>> +Object parent;
>> +};
>> +
>> +struct HostIOMMUDeviceClass {
>> +ObjectClass parent_class;
>
>Could you please document the struct and its handlers ? This is more for
>the future reader to understand the VFIO concepts than for the generated
>docs. Anyhow, it could be useful for the docs also. Overall, the QEMU VFIO
>subsystem suffers from a lack of documentation and we should try to
>improve that in the next cycle.

Sure, will document the struct and its handlers in v3.

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>
>> +int (*get_host_iommu_info)(HostIOMMUDevice *hiod, void *data,
>uint32_t len,
>> +   Error **errp);
>> +};
>> +#endif
>> diff --git a/backends/host_iommu_device.c
>b/backends/host_iommu_device.c
>> new file mode 100644
>> index 00..6cb6007d8c
>> --- /dev/null
>> +++ b/backends/host_iommu_device.c
>> @@ -0,0 +1,19 @@
>> +#include "qemu/osdep.h"
>> +#include "sysemu/host_iommu_device.h"
>> +
>> +OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
>> +host_iommu_device,
>> +HOST_IOMMU_DEVICE,
>> +OBJECT)
>> +
>> +static void host_iommu_device_class_init(ObjectClass *oc, void *data)
>> +{
>> +}
>> +
>> +static void host_iommu_device_init(Object *obj)
>> +{
>> +}
>> +
>> +static void host_iommu_device_finalize(Object *obj)
>> +{
>> +}
>> diff --git a/backends/Kconfig b/backends/Kconfig
>> index 2cb23f62fa..34ab29e994 100644
>> --- a/backends/Kconfig
>> +++ b/backends/Kconfig
>> @@ -3,3 +3,8 @@ source tpm/Kconfig
>>   config IOMMUFD
>>   bool
>>   depends on VFIO
>> +
>> +config HOST_IOMMU_DEVICE
>> +bool
>> +default y
>> +depends on VFIO
>> diff --git a/backends/meson.build b/backends/meson.build
>> index 8b2b111497..2e975d641e 100644
>> --- a/backends/meson.build
>> +++ b/backends/meson.build
>> @@ -25,6 +25,7 @@ if have_vhost_user
>>   endif
>>   system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-
>vhost.c'))
>>   system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
>> +system_ss.add(when: 'CONFIG_HOST_IOMMU_DEVICE', if_true:
>files('host_iommu_device.c'))
>>   if have_vhost_user_crypto
>> system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true:
>files('cryptodev-vhost-user.c'))
>>   endif



RE: [PATCH v2 03/10] backends/iommufd: Introduce abstract HIODIOMMUFD device

2024-04-08 Thread Duan, Zhenzhong
Hi All,

>-Original Message-
>From: Duan, Zhenzhong 
>Subject: [PATCH v2 03/10] backends/iommufd: Introduce abstract
>HIODIOMMUFD device
>
>HIODIOMMUFD represents a host IOMMU device under iommufd backend.
>
>Currently it includes only public iommufd handle and device id.
>which could be used to get hw IOMMU information.
>
>When nested translation is supported in future, vIOMMU is going
>to have iommufd related operations like attaching/detaching hwpt,
>So IOMMUFDDevice interface will be further extended at that time.
>
>VFIO and VDPA device have different way of attaching/detaching hwpt.
>So HIODIOMMUFD is still an abstract class which will be inherited by
>VFIO and VDPA device.
>
>Introduce a helper hiod_iommufd_init() to initialize HIODIOMMUFD
>device.
>
>Suggested-by: Cédric Le Goater 
>Originally-by: Yi Liu 
>Signed-off-by: Yi Sun 
>Signed-off-by: Zhenzhong Duan 
>---
> include/sysemu/iommufd.h | 22 +++
> backends/iommufd.c   | 47 ++--
> 2 files changed, 53 insertions(+), 16 deletions(-)
>
>diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
>index 9af27ebd6c..71c53cbb45 100644
>--- a/include/sysemu/iommufd.h
>+++ b/include/sysemu/iommufd.h
>@@ -4,6 +4,7 @@
> #include "qom/object.h"
> #include "exec/hwaddr.h"
> #include "exec/cpu-common.h"
>+#include "sysemu/host_iommu_device.h"
>
> #define TYPE_IOMMUFD_BACKEND "iommufd"
> OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass,
>IOMMUFD_BACKEND)
>@@ -33,4 +34,25 @@ int iommufd_backend_map_dma(IOMMUFDBackend
>*be, uint32_t ioas_id, hwaddr iova,
> ram_addr_t size, void *vaddr, bool readonly);
> int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t
>ioas_id,
>   hwaddr iova, ram_addr_t size);
>+
>+#define TYPE_HIOD_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
>+OBJECT_DECLARE_TYPE(HIODIOMMUFD, HIODIOMMUFDClass,
>HIOD_IOMMUFD)
>+
>+struct HIODIOMMUFD {
>+/*< private >*/
>+HostIOMMUDevice parent;
>+void *opaque;

Please ignore the line "void *opaque;" above. It is unused and I forgot to
remove it. Sorry for the noise.

Thanks
Zhenzhong


RE: [PATCH v1 01/11] Introduce a common abstract struct HostIOMMUDevice

2024-03-31 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v1 01/11] Introduce a common abstract struct
>HostIOMMUDevice
>
>Hello Zhenzhong,
>
>On 3/28/24 04:06, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Subject: Re: [PATCH v1 01/11] Introduce a common abstract struct
>>> HostIOMMUDevice
>>>
>>> Hello Zhenzhong,
>>>
>>> On 3/19/24 12:58, Duan, Zhenzhong wrote:
>>>> Hi Cédric,
>>>>
>>>>> -Original Message-
>>>>> From: Cédric Le Goater 
>>>>> Sent: Tuesday, March 19, 2024 4:17 PM
>>>>> To: Duan, Zhenzhong ; qemu-
>>>>> de...@nongnu.org
>>>>> Cc: alex.william...@redhat.com; eric.au...@redhat.com;
>>>>> pet...@redhat.com; jasow...@redhat.com; m...@redhat.com;
>>>>> j...@nvidia.com; nicol...@nvidia.com; joao.m.mart...@oracle.com;
>Tian,
>>>>> Kevin ; Liu, Yi L ; Sun, Yi Y
>>>>> ; Peng, Chao P 
>>>>> Subject: Re: [PATCH v1 01/11] Introduce a common abstract struct
>>>>> HostIOMMUDevice
>>>>>
>>>>> Hello Zhenzhong,
>>>>>
>>>>> On 2/28/24 04:58, Zhenzhong Duan wrote:
>>>>>> HostIOMMUDevice will be inherited by two sub classes,
>>>>>> legacy and iommufd currently.
>>>>>>
>>>>>> Introduce a helper function host_iommu_base_device_init to
>initialize it.
>>>>>>
>>>>>> Suggested-by: Eric Auger 
>>>>>> Signed-off-by: Zhenzhong Duan 
>>>>>> ---
>>>>>> include/sysemu/host_iommu_device.h | 22
>>> ++
>>>>>> 1 file changed, 22 insertions(+)
>>>>>> create mode 100644 include/sysemu/host_iommu_device.h
>>>>>>
>>>>>> diff --git a/include/sysemu/host_iommu_device.h
>>>>> b/include/sysemu/host_iommu_device.h
>>>>>> new file mode 100644
>>>>>> index 00..fe80ab25fb
>>>>>> --- /dev/null
>>>>>> +++ b/include/sysemu/host_iommu_device.h
>>>>>> @@ -0,0 +1,22 @@
>>>>>> +#ifndef HOST_IOMMU_DEVICE_H
>>>>>> +#define HOST_IOMMU_DEVICE_H
>>>>>> +
>>>>>> +typedef enum HostIOMMUDevice_Type {
>>>>>> +HID_LEGACY,
>>>>>> +HID_IOMMUFD,
>>>>>> +HID_MAX,
>>>>>> +} HostIOMMUDevice_Type;
>>>>>> +
>>>>>> +typedef struct HostIOMMUDevice {
>>>>>> +HostIOMMUDevice_Type type;
>>>>>
>>>>> A type field is not a good sign and that's where QOM is useful.
>>>>
>>>> Yes, agree.
>>>> I didn't choose QOM because in iommufd-cdev series, VFIOContainer
>>> chooses not using QOM model.
>>>> See the discussion:
>>> https://lore.kernel.org/all/YmuFv2s5TPuw7K%2Fu@yekko/
>>>> I thought HostIOMMUDevice need to follow same rule.
>>>>
>>>> But after further digging into this, I think it may be ok to use QOM
>model
>>> as long as we don't expose
>>>> HostIOMMUDevice in qapi/qom.json and not use USER_CREATABLE
>>> interface. Your thoughts?
>>>
>>> yes. Can we change a bit this series to use QOM ? something like :
>>>
>>>  typedef struct HostIOMMUDevice {
>>>  Object parent;
>>>  } HostIOMMUDevice;
>>>
>>>  #define TYPE_HOST_IOMMU "host.iommu"
>>>  OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUClass,
>>> HOST_IOMMU)
>>>
>>>  struct HostIOMMUClass {
>>>  ObjectClass parent_class;
>>>
>>>  int (*get_type)(HostIOMMUDevice *hiod, uint64_t *type, Error
>**errp);
>>>  int (*get_cap)(HostIOMMUDevice *hiod, uint64_t *cap, Error
>**errp);
>>>  };
>>>
>>> Inherited objects would be TYPE_HOST_IOMMU_IOMMUFD and
>>> TYPE_HOST_IOMMU_LEGACY.
>>> Each class implementing the handlers or not (legacy mode).
>>
>> Understood, thanks for your guide.
>>
>>>
>>> The class handlers are introduced for the intel-iommu helper
>>> vtd_check_hdev()
>>> in order to avoid using iommufd routines directly. HostIOMMUDevice is
>>> supposed
>>> to abstract the Host IOMMU device, so we need to abstract also all the
>>> interfaces to this object.
>>
>> I'd like to have a minimal adjustment to class handlers. Just let me know if
>you have strong
>> preference.
>>
>> Cap/ecap is intel_iommu specific, I'd like to make it a bit generic also for
>arm smmu usage,
>> and merge get_type and get_cap into one function as they both call
>ioctl(IOMMU_GET_HW_INFO),
>> something like:
>> get_info(HostIOMMUDevice *hiod, enum iommu_hw_info_type *type,
>void **data, void **len,  Error **errp);
>
>OK. Let's see how it goes. Having more users of this new object Host
>IOMMU device is important to get a better feeling of the interface.
>As of today, it doesn't have not much value. The iommufd object could
>be QOM linked to the vIOMMU when available and we could get the bind
>devid in some other ways I suppose. Anyhow, please keep it simple and
>let's explore.

Got it, thanks Cédric!

BRs.
Zhenzhong


RE: [PATCH v1 3/6] intel_iommu: Add a framework to check and sync host IOMMU cap/ecap

2024-03-28 Thread Duan, Zhenzhong
Hi Michael,

>-Original Message-
>From: Michael S. Tsirkin 
>Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and
>sync host IOMMU cap/ecap
>
>On Mon, Mar 18, 2024 at 02:20:50PM +0100, Eric Auger wrote:
>> Hi Michael,
>>
>> On 3/13/24 12:17, Michael S. Tsirkin wrote:
>> > On Wed, Mar 13, 2024 at 07:54:11AM +, Duan, Zhenzhong wrote:
>> >>
>> >>> -Original Message-
>> >>> From: Michael S. Tsirkin 
>> >>> Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check
>and
>> >>> sync host IOMMU cap/ecap
>> >>>
>> >>> On Wed, Mar 13, 2024 at 02:52:39AM +, Duan, Zhenzhong wrote:
>> >>>> Hi Michael,
>> >>>>
>> >>>>> -Original Message-
>> >>>>> From: Michael S. Tsirkin 
>> >>>>> Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to
>check and
>> >>>>> sync host IOMMU cap/ecap
>> >>>>>
>> >>>>> On Wed, Feb 28, 2024 at 05:44:29PM +0800, Zhenzhong Duan
>wrote:
>> >>>>>> From: Yi Liu 
>> >>>>>>
>> >>>>>> Add a framework to check and synchronize host IOMMU cap/ecap
>with
>> >>>>>> vIOMMU cap/ecap.
>> >>>>>>
>> >>>>>> The sequence will be:
>> >>>>>>
>> >>>>>> vtd_cap_init() initializes iommu->cap/ecap.
>> >>>>>> vtd_check_hdev() update iommu->cap/ecap based on host
>cap/ecap.
>> >>>>>> iommu->cap_frozen set when machine create done, iommu-
>>cap/ecap
>> >>>>> become readonly.
>> >>>>>> Implementation details for different backends will be in following
>> >>> patches.
>> >>>>>> Signed-off-by: Yi Liu 
>> >>>>>> Signed-off-by: Yi Sun 
>> >>>>>> Signed-off-by: Zhenzhong Duan 
>> >>>>>> ---
>> >>>>>>  include/hw/i386/intel_iommu.h |  1 +
>> >>>>>>  hw/i386/intel_iommu.c | 50
>> >>>>> ++-
>> >>>>>>  2 files changed, 50 insertions(+), 1 deletion(-)
>> >>>>>>
>> >>>>>> diff --git a/include/hw/i386/intel_iommu.h
>> >>>>> b/include/hw/i386/intel_iommu.h
>> >>>>>> index bbc7b96add..c71a133820 100644
>> >>>>>> --- a/include/hw/i386/intel_iommu.h
>> >>>>>> +++ b/include/hw/i386/intel_iommu.h
>> >>>>>> @@ -283,6 +283,7 @@ struct IntelIOMMUState {
>> >>>>>>
>> >>>>>>  uint64_t cap;   /* The value of capability reg */
>> >>>>>>  uint64_t ecap;  /* The value of extended capability reg */
>> >>>>>> +bool cap_frozen;/* cap/ecap become read-only after frozen */
>> >>>>>>  uint32_t context_cache_gen; /* Should be in [1,MAX] */
>> >>>>>>  GHashTable *iotlb;  /* IOTLB */
>> >>>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> >>>>>> index ffa1ad6429..a9f9dfd6a7 100644
>> >>>>>> --- a/hw/i386/intel_iommu.c
>> >>>>>> +++ b/hw/i386/intel_iommu.c
>> >>>>>> @@ -35,6 +35,8 @@
>> >>>>>>  #include "sysemu/kvm.h"
>> >>>>>>  #include "sysemu/dma.h"
>> >>>>>>  #include "sysemu/sysemu.h"
>> >>>>>> +#include "hw/vfio/vfio-common.h"
>> >>>>>> +#include "sysemu/iommufd.h"
>> >>>>>>  #include "hw/i386/apic_internal.h"
>> >>>>>>  #include "kvm/kvm_i386.h"
>> >>>>>>  #include "migration/vmstate.h"
>> >>>>>> @@ -3819,6 +3821,38 @@ VTDAddressSpace
>> >>>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>> >>>>>>  return vtd_dev_as;
>> >>>>>>  }
>> >>>>>>
>> >>>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>> 

RE: [PATCH v1 01/11] Introduce a common abstract struct HostIOMMUDevice

2024-03-27 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Subject: Re: [PATCH v1 01/11] Introduce a common abstract struct
>HostIOMMUDevice
>
>Hello Zhenzhong,
>
>On 3/19/24 12:58, Duan, Zhenzhong wrote:
>> Hi Cédric,
>>
>>> -Original Message-
>>> From: Cédric Le Goater 
>>> Sent: Tuesday, March 19, 2024 4:17 PM
>>> To: Duan, Zhenzhong ; qemu-
>>> de...@nongnu.org
>>> Cc: alex.william...@redhat.com; eric.au...@redhat.com;
>>> pet...@redhat.com; jasow...@redhat.com; m...@redhat.com;
>>> j...@nvidia.com; nicol...@nvidia.com; joao.m.mart...@oracle.com; Tian,
>>> Kevin ; Liu, Yi L ; Sun, Yi Y
>>> ; Peng, Chao P 
>>> Subject: Re: [PATCH v1 01/11] Introduce a common abstract struct
>>> HostIOMMUDevice
>>>
>>> Hello Zhenzhong,
>>>
>>> On 2/28/24 04:58, Zhenzhong Duan wrote:
>>>> HostIOMMUDevice will be inherited by two sub classes,
>>>> legacy and iommufd currently.
>>>>
>>>> Introduce a helper function host_iommu_base_device_init to initialize it.
>>>>
>>>> Suggested-by: Eric Auger 
>>>> Signed-off-by: Zhenzhong Duan 
>>>> ---
>>>>include/sysemu/host_iommu_device.h | 22
>++
>>>>1 file changed, 22 insertions(+)
>>>>create mode 100644 include/sysemu/host_iommu_device.h
>>>>
>>>> diff --git a/include/sysemu/host_iommu_device.h
>>> b/include/sysemu/host_iommu_device.h
>>>> new file mode 100644
>>>> index 00..fe80ab25fb
>>>> --- /dev/null
>>>> +++ b/include/sysemu/host_iommu_device.h
>>>> @@ -0,0 +1,22 @@
>>>> +#ifndef HOST_IOMMU_DEVICE_H
>>>> +#define HOST_IOMMU_DEVICE_H
>>>> +
>>>> +typedef enum HostIOMMUDevice_Type {
>>>> +HID_LEGACY,
>>>> +HID_IOMMUFD,
>>>> +HID_MAX,
>>>> +} HostIOMMUDevice_Type;
>>>> +
>>>> +typedef struct HostIOMMUDevice {
>>>> +HostIOMMUDevice_Type type;
>>>
>>> A type field is not a good sign and that's where QOM is useful.
>>
>> Yes, agree.
>> I didn't choose QOM because in iommufd-cdev series, VFIOContainer
>chooses not using QOM model.
>> See the discussion:
>https://lore.kernel.org/all/YmuFv2s5TPuw7K%2Fu@yekko/
>> I thought HostIOMMUDevice need to follow same rule.
>>
>> But after further digging into this, I think it may be ok to use QOM model
>as long as we don't expose
>> HostIOMMUDevice in qapi/qom.json and not use USER_CREATABLE
>interface. Your thoughts?
>
>yes. Can we change a bit this series to use QOM ? something like :
>
> typedef struct HostIOMMUDevice {
> Object parent;
> } HostIOMMUDevice;
>
> #define TYPE_HOST_IOMMU "host.iommu"
> OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUClass,
>HOST_IOMMU)
>
> struct HostIOMMUClass {
> ObjectClass parent_class;
>
> int (*get_type)(HostIOMMUDevice *hiod, uint64_t *type, Error **errp);
> int (*get_cap)(HostIOMMUDevice *hiod, uint64_t *cap, Error **errp);
> };
>
>Inherited objects would be TYPE_HOST_IOMMU_IOMMUFD and
>TYPE_HOST_IOMMU_LEGACY.
>Each class implementing the handlers or not (legacy mode).

Understood, thanks for your guide.

>
>The class handlers are introduced for the intel-iommu helper
>vtd_check_hdev()
>in order to avoid using iommufd routines directly. HostIOMMUDevice is
>supposed
>to abstract the Host IOMMU device, so we need to abstract also all the
>interfaces to this object.

I'd like to make a minimal adjustment to the class handlers. Just let me know
if you have a strong preference.

Cap/ecap is intel_iommu specific. I'd like to make it a bit more generic so it
also works for arm smmu, and merge get_type and get_cap into one function, as
they both call ioctl(IOMMU_GET_HW_INFO), something like:
get_info(HostIOMMUDevice *hiod, enum iommu_hw_info_type *type, void **data,
         void **len, Error **errp);

and let the iommu emulator extract the content of *data. For intel_iommu, it's:

struct iommu_hw_info_vtd {
__u32 flags;
__u32 __reserved;
__aligned_u64 cap_reg;
__aligned_u64 ecap_reg;
};
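
A minimal sketch of how a vIOMMU could consume such a merged handler, assuming
stand-in types rather than the real uAPI headers. The names hw_info_type,
hw_info_vtd, get_info_fn, fake_vtd_get_info and vtd_read_host_caps are
hypothetical, and the length is passed as size_t * here, which is an assumption
and differs from the signature above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uAPI definitions; not the real headers. */
enum hw_info_type { HW_INFO_NONE = 0, HW_INFO_INTEL_VTD = 1 };

struct hw_info_vtd {
    uint32_t flags;
    uint32_t reserved;
    uint64_t cap_reg;
    uint64_t ecap_reg;
};

/* Hypothetical merged handler: one call reports the type and raw payload. */
typedef int (*get_info_fn)(void *hiod, enum hw_info_type *type,
                           const void **data, size_t *len);

/* Fake backend used only to exercise the consumer below. */
static int fake_vtd_get_info(void *hiod, enum hw_info_type *type,
                             const void **data, size_t *len)
{
    static const struct hw_info_vtd info = {
        .cap_reg = 0x1234, .ecap_reg = 0x5678,
    };

    (void)hiod;
    *type = HW_INFO_INTEL_VTD;
    *data = &info;
    *len = sizeof(info);
    return 0;
}

/* vIOMMU side: validate type and length before reading cap/ecap. */
static int vtd_read_host_caps(get_info_fn get_info, void *hiod,
                              uint64_t *cap, uint64_t *ecap)
{
    enum hw_info_type type;
    const void *data;
    size_t len;
    const struct hw_info_vtd *vtd;

    if (get_info(hiod, &type, &data, &len) != 0) {
        return -1;
    }
    if (type != HW_INFO_INTEL_VTD || len < sizeof(*vtd)) {
        return -1;
    }
    vtd = data;
    *cap = vtd->cap_reg;
    *ecap = vtd->ecap_reg;
    return 0;
}
```

The point of the sketch is only that the emulator, not the backend, interprets
the payload, so another vIOMMU could reuse the same handler with its own type.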

>
>The .host_iommu_device_create() handler could be merged
>in .attach_device()
>possibly. Anyhow, please use now object_new() and object_unref() instead.
>host_iommu_base_device_init() is useless IMHO.

Good idea, will do.

>
>>
>>>
>>> Is vtd_check_hdev() the only use of this field ?
>>
>> Currently yes. virtio-iommu may have similar usage.
>>
>>> If so, can we simplify with a QOM interface in any way ?
>>
>> QOM interface is a set of callbacks, guess you mean QOM class,
>> saying HostIOMMUDevice class, IOMMULegacyDevice class and
>IOMMUFDDevice class?
>
>See above proposal. it should work fine.
>
>Also, I think it is better to use a IOMMUFDBackend* parameter for
>iommufd_device_get_info() to be consistent with the other routines.

Sure, then I'd like to also rename it to iommufd_backend_get_device_info().

Thanks
Zhenzhong

>
>Then It would interesting to see how this applies to Eric's series.
>
>Thanks,
>
>C.
>
>



RE: [PATCH v1 01/11] Introduce a common abstract struct HostIOMMUDevice

2024-03-19 Thread Duan, Zhenzhong
Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Sent: Tuesday, March 19, 2024 4:17 PM
>To: Duan, Zhenzhong ; qemu-
>de...@nongnu.org
>Cc: alex.william...@redhat.com; eric.au...@redhat.com;
>pet...@redhat.com; jasow...@redhat.com; m...@redhat.com;
>j...@nvidia.com; nicol...@nvidia.com; joao.m.mart...@oracle.com; Tian,
>Kevin ; Liu, Yi L ; Sun, Yi Y
>; Peng, Chao P 
>Subject: Re: [PATCH v1 01/11] Introduce a common abstract struct
>HostIOMMUDevice
>
>Hello Zhenzhong,
>
>On 2/28/24 04:58, Zhenzhong Duan wrote:
>> HostIOMMUDevice will be inherited by two sub classes,
>> legacy and iommufd currently.
>>
>> Introduce a helper function host_iommu_base_device_init to initialize it.
>>
>> Suggested-by: Eric Auger 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>   include/sysemu/host_iommu_device.h | 22 ++
>>   1 file changed, 22 insertions(+)
>>   create mode 100644 include/sysemu/host_iommu_device.h
>>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> new file mode 100644
>> index 00..fe80ab25fb
>> --- /dev/null
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -0,0 +1,22 @@
>> +#ifndef HOST_IOMMU_DEVICE_H
>> +#define HOST_IOMMU_DEVICE_H
>> +
>> +typedef enum HostIOMMUDevice_Type {
>> +HID_LEGACY,
>> +HID_IOMMUFD,
>> +HID_MAX,
>> +} HostIOMMUDevice_Type;
>> +
>> +typedef struct HostIOMMUDevice {
>> +HostIOMMUDevice_Type type;
>
>A type field is not a good sign and that's where QOM is useful.

Yes, agreed.
I didn't choose QOM because in the iommufd-cdev series, VFIOContainer chose not
to use the QOM model.
See the discussion: https://lore.kernel.org/all/YmuFv2s5TPuw7K%2Fu@yekko/
I thought HostIOMMUDevice needed to follow the same rule.

But after digging into this further, I think it may be OK to use the QOM model
as long as we don't expose HostIOMMUDevice in qapi/qom.json and don't use the
USER_CREATABLE interface. Your thoughts?

>
>Is vtd_check_hdev() the only use of this field ?

Currently yes. virtio-iommu may have similar usage.

> If so, can we simplify with a QOM interface in any way ?

A QOM interface is a set of callbacks; I guess you mean QOM classes,
i.e. a HostIOMMUDevice class, an IOMMULegacyDevice class and an IOMMUFDDevice class?

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>
>
>
>> +size_t size;
>> +} HostIOMMUDevice;
>> +
>> +static inline void host_iommu_base_device_init(HostIOMMUDevice *dev,
>> +   HostIOMMUDevice_Type type,
>> +   size_t size)
>> +{
>> +dev->type = type;
>> +dev->size = size;
>> +}
>> +#endif



RE: [PATCH v1 11/11] backends/iommufd: Introduce helper function iommufd_device_get_info()

2024-03-19 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 11/11] backends/iommufd: Introduce helper
>function iommufd_device_get_info()
>
>Hi Joao,
>
>On 3/18/24 16:09, Joao Martins wrote:
>> On 18/03/2024 07:54, Eric Auger wrote:
>>> Hi Zhenzhong,
>>>
>>> On 2/28/24 04:59, Zhenzhong Duan wrote:
>>>> Introduce a helper function iommufd_device_get_info() to get
>>>> host IOMMU related information through iommufd uAPI.
>>> Looks strange to have this patch in this series. I Would rather put it
>>> in your second series alongs with its user.
>>>
>> The reason it was here was to use this helper for this patch:
>>
>> https://lore.kernel.org/qemu-devel/20240212135643.5858-2-
>joao.m.mart...@oracle.com/
>>
>> Instead of me having my own alternate helper.
>>
>> Though at the same time, Zhenzhong will also make use of it in his second
>series.
>OK I understand now. Maybe with extra comment in the coverletter then

Will add.

Thanks
Zhenzhong


RE: [PATCH v1 00/11] Add a host IOMMU device abstraction

2024-03-19 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 00/11] Add a host IOMMU device abstraction
>
>
>
>On 2/28/24 04:58, Zhenzhong Duan wrote:
>> Hi,
>>
>> Based on Joao's suggestion, the iommufd nesting prerequisite series [1]
>> is further splitted to host IOMMU device abstract part and vIOMMU
>> check/sync part. This series implements the 1st part.
>>
>> This split also faciliates the dirty tracking series [2] and virtio-iommu
>> series [3] to depend on 1st part.
>>
>> PATCH1-3: Introduce HostIOMMUDevice and two sub class
>> PATCH4: Define HostIOMMUDevice handle in VFIODevice
>> PATCH5-8: Introdcue host_iommu_device_create callback to allocate and
>intialize HostIOMMUDevice
>Introduce, here and below

Good catch, will fix.

Thanks
Zhenzhong

>
>Eric
>> PATCH9-10: Introdcue set/unset_iommu_device to pass
>HostIOMMUDevice to vIOMMU
>> PATCH11: a helper to get host IOMMU info
>>
>> Because it's becoming clear on community's suggestion, I'd like to remove
>> rfc tag from this version.
>>
>> Qemu code can be found at:
>>
>https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_pre
>q_part1_v1
>>
>> [1] https://lore.kernel.org/qemu-devel/20240201072818.327930-1-
>zhenzhong.d...@intel.com/
>> [2] https://lore.kernel.org/qemu-devel/20240212135643.5858-1-
>joao.m.mart...@oracle.com/
>> [3] https://lore.kernel.org/qemu-devel/20240117080414.316890-1-
>eric.au...@redhat.com/
>>
>> Thanks
>> Zhenzhong
>>
>> Changelog:
>> v1:
>> - use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
>> - change host_iommu_device_init to host_iommu_device_create
>> - allocate HostIOMMUDevice in host_iommu_device_create callback
>>   and set the VFIODevice base_hdev handle (Eric)
>> - refine pci_device_set/unset_iommu_device doc (Eric)
>> - use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice
>(Eric)
>>
>> rfcv2:
>> - introduce common abstract HostIOMMUDevice and sub struct for
>different BEs (Eric, Cédric)
>> - remove iommufd_device.[ch] (Cédric)
>> - remove duplicate iommufd/devid define from VFIODevice (Eric)
>> - drop the p in aliased_pbus and aliased_pdevfn (Eric)
>> - assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn
>(Cédric, Eric)
>> - use errp in iommufd_device_get_info (Eric)
>> - split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
>> - move VTDHostIOMMUDevice declaration to intel_iommu_internal.h
>(Cédric)
>> - make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1'
>(Cédric)
>> - block migration if vIOMMU cap/ecap updated based on host IOMMU
>cap/ecap
>> - add R-B
>>
>>
>> Yi Liu (1):
>>   hw/pci: Introduce pci_device_set/unset_iommu_device()
>>
>> Zhenzhong Duan (10):
>>   Introduce a common abstract struct HostIOMMUDevice
>>   backends/iommufd: Introduce IOMMUFDDevice
>>   vfio: Introduce IOMMULegacyDevice
>>   vfio: Add HostIOMMUDevice handle into VFIODevice
>>   vfio: Introduce host_iommu_device_create callback
>>   vfio/container: Implement host_iommu_device_create callback in legacy
>> mode
>>   vfio/iommufd: Implement host_iommu_device_create callback in
>iommufd
>> mode
>>   vfio/pci: Allocate and initialize HostIOMMUDevice after attachment
>>   vfio: Pass HostIOMMUDevice to vIOMMU
>>   backends/iommufd: Introduce helper function iommufd_device_get_info()
>>
>>  include/hw/pci/pci.h  | 38 +++-
>>  include/hw/vfio/vfio-common.h |  8 
>>  include/hw/vfio/vfio-container-base.h |  1 +
>>  include/sysemu/host_iommu_device.h| 22 ++
>>  include/sysemu/iommufd.h  | 19 
>>  backends/iommufd.c| 32 +-
>>  hw/pci/pci.c  | 62 +--
>>  hw/vfio/common.c  |  8 
>>  hw/vfio/container.c   |  9 
>>  hw/vfio/iommufd.c | 10 +
>>  hw/vfio/pci.c | 24 ---
>>  11 files changed, 223 insertions(+), 10 deletions(-)
>>  create mode 100644 include/sysemu/host_iommu_device.h
>>



RE: [PATCH v1 09/11] hw/pci: Introduce pci_device_set/unset_iommu_device()

2024-03-19 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 09/11] hw/pci: Introduce
>pci_device_set/unset_iommu_device()
>
>Hi Zhenzhong,
>
>On 2/28/24 04:58, Zhenzhong Duan wrote:
>> From: Yi Liu 
>>
>> This adds pci_device_set/unset_iommu_device() to set/unset
>> HostIOMMUDevice for a given PCIe device. Caller of set
>> should fail if set operation fails.
>>
>> Extract out pci_device_get_iommu_bus_devfn() to facilitate
>> implementation of pci_device_set/unset_iommu_device().
>>
>> Signed-off-by: Yi Liu 
>> Signed-off-by: Yi Sun 
>> Signed-off-by: Nicolin Chen 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  include/hw/pci/pci.h | 38 ++-
>>  hw/pci/pci.c | 62
>+---
>>  2 files changed, 96 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>> index fa6313aabc..8fe6f746d7 100644
>> --- a/include/hw/pci/pci.h
>> +++ b/include/hw/pci/pci.h
>> @@ -3,6 +3,7 @@
>>
>>  #include "exec/memory.h"
>>  #include "sysemu/dma.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>>  /* PCI includes legacy ISA access.  */
>>  #include "hw/isa/isa.h"
>> @@ -384,10 +385,45 @@ typedef struct PCIIOMMUOps {
>>   *
>>   * @devfn: device and function number
>>   */
>> -   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>devfn);
>> +AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int
>devfn);
>> +/**
>> + * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
>> + *
>> + * Optional callback, if not implemented in vIOMMU, then vIOMMU
>can't
>> + * retrieve host information from the associated HostIOMMUDevice.
>> + *
>> + * Return true if HostIOMMUDevice is attached, or else return false
>> + * with errp set.
>> + *
>> + * @bus: the #PCIBus of the PCI device.
>> + *
>> + * @opaque: the data passed to pci_setup_iommu().
>> + *
>> + * @devfn: device and function number of the PCI device.
>> + *
>> + * @dev: the data structure representing host IOMMU device.
>@errp is missing

Will add.

>> + *
>> + */
>> +int (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
>> +HostIOMMUDevice *dev, Error **errp);
>> +/**
>> + * @unset_iommu_device: detach a HostIOMMUDevice from a
>vIOMMU
>> + *
>> + * Optional callback.
>> + *
>> + * @bus: the #PCIBus of the PCI device.
>> + *
>> + * @opaque: the data passed to pci_setup_iommu().
>> + *
>> + * @devfn: device and function number of the PCI device.
>> + */
>> +void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
>>  } PCIIOMMUOps;
>>
>>  AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>> +int pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice
>*base_dev,
>> +Error **errp);
>> +void pci_device_unset_iommu_device(PCIDevice *dev);
>>
>>  /**
>>   * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index 76080af580..8078307963 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -2672,11 +2672,14 @@ static void
>pci_device_class_base_init(ObjectClass *klass, void *data)
>>  }
>>  }
>>
>I would write some comments describing the output params and also
>explicitly saying some are optional

Sure.

>> -AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>> +static void pci_device_get_iommu_bus_devfn(PCIDevice *dev,
>> +   PCIBus **aliased_bus,
>> +   PCIBus **piommu_bus,
>
>piommu_bus is not an optional parameter. I would put it before aliased_bus.

Good suggestion, will do.
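
A minimal sketch of the reordered helper, assuming a simplified stand-in for
PCIBus. Bus, has_iommu_ops and get_iommu_bus_devfn are hypothetical names used
only for illustration; the mandatory output comes first and the optional
outputs are NULL-checked:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for PCIBus; only what the sketch needs. */
typedef struct Bus {
    int has_iommu_ops;
} Bus;

/*
 * @iommu_bus: mandatory output, set to NULL when no IOMMU applies.
 * @aliased_bus: optional output, may be NULL.
 * @aliased_devfn: optional output, may be NULL.
 */
static void get_iommu_bus_devfn(Bus *bus, int devfn, Bus **iommu_bus,
                                Bus **aliased_bus, int *aliased_devfn)
{
    *iommu_bus = bus->has_iommu_ops ? bus : NULL;

    if (aliased_bus) {
        *aliased_bus = bus;
    }
    if (aliased_devfn) {
        *aliased_devfn = devfn;
    }
}
```

A caller that only needs the IOMMU bus can then pass NULL for both optional
outputs.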

>
>> +   int *aliased_devfn)
>>  {
>>  PCIBus *bus = pci_get_bus(dev);
>>  PCIBus *iommu_bus = bus;
>> -uint8_t devfn = dev->devfn;
>> +int devfn = dev->devfn;
>>
>>  while (iommu_bus && !iommu_bus->iommu_ops && iommu_bus-
>>parent_dev) {
>>  PCIBus *parent_bus = pci_get_bus(iommu_bus->parent_dev);
>> @@ -2717,13 +2720,66 @@ AddressSpace
>*pci_device_iommu_address_space(PCIDevice *dev)
>>
>>  iommu_bus = parent_bus;
>>  }
>> -if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
>> +
>> +assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
>> +assert(iommu_bus);
>> +
>> +if (pci_bus_bypass_iommu(bus) || !iommu_bus->iommu_ops) {
>> +iommu_bus = NULL;
>> +}
>> +
>> +*piommu_bus = iommu_bus;
>> +
>> +if (aliased_bus) {
>> +*aliased_bus = bus;
>> +}
>> +
>> +if (aliased_devfn) {
>> +*aliased_devfn = devfn;
>> +}
>> +}
>> +
>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>> +{
>> +PCIBus *bus;
>> +PCIBus *iommu_bus;
>> +int devfn;
>> +
>> +pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
>> +if (iommu_bus) {
>>  

RE: [PATCH v1 05/11] vfio: Introduce host_iommu_device_create callback

2024-03-18 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 05/11] vfio: Introduce host_iommu_device_create
>callback
>
>
>
>On 2/28/24 04:58, Zhenzhong Duan wrote:
>> Introduce host_iommu_device_create callback and a wrapper for it.
>>
>> This callback is used to allocate a host iommu device instance and
>> initialize it based on type.
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  include/hw/vfio/vfio-common.h | 1 +
>>  include/hw/vfio/vfio-container-base.h | 1 +
>>  hw/vfio/common.c  | 8 
>>  3 files changed, 10 insertions(+)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index b6676c9f79..9fefea4b89 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -208,6 +208,7 @@ struct vfio_device_info *vfio_get_device_info(int
>fd);
>>  int vfio_attach_device(char *name, VFIODevice *vbasedev,
>> AddressSpace *as, Error **errp);
>>  void vfio_detach_device(VFIODevice *vbasedev);
>> +void host_iommu_device_create(VFIODevice *vbasedev);
>>
>>  int vfio_kvm_device_add_fd(int fd, Error **errp);
>>  int vfio_kvm_device_del_fd(int fd, Error **errp);
>> diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-
>container-base.h
>> index b2813b0c11..dc003f6eb2 100644
>> --- a/include/hw/vfio/vfio-container-base.h
>> +++ b/include/hw/vfio/vfio-container-base.h
>> @@ -120,6 +120,7 @@ struct VFIOIOMMUClass {
>>  int (*attach_device)(const char *name, VFIODevice *vbasedev,
>>   AddressSpace *as, Error **errp);
>>  void (*detach_device)(VFIODevice *vbasedev);
>> +void (*host_iommu_device_create)(VFIODevice *vbasedev);
>Maybe return an int instead. It is common the allocation can fail and
>the deallocation cannot. While at it I would also pass an errp in case
>it fails

Currently the host_iommu_device_create implementation only calls g_malloc0,
which never fails, so I returned void.

I'm fine with returning an int. It would look like below, taking iommufd as an example:

--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -651,7 +651,7 @@ static IOMMUFDDeviceOps vfio_iommufd_device_ops = {
 .detach_hwpt = vfio_iommufd_device_detach_hwpt,
 };

-static void vfio_cdev_host_iommu_device_create(VFIODevice *vbasedev)
+static int vfio_cdev_host_iommu_device_create(VFIODevice *vbasedev, Error **errp)
 {
 IOMMUFDDevice *idev = g_malloc0(sizeof(IOMMUFDDevice));
 VFIOIOMMUFDContainer *container = container_of(vbasedev->bcontainer,
@@ -661,6 +661,8 @@ static void vfio_cdev_host_iommu_device_create(VFIODevice *vbasedev)

 iommufd_device_init(idev, vbasedev->iommufd, vbasedev->devid,
 container->ioas_id, vbasedev, &vfio_iommufd_device_ops);
+
+return 0;
 }
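
A minimal sketch of what the common wrapper could look like once the callback
returns an int, assuming a simplified stand-in for QEMU's Error type. Error,
Dev, dev_create_ok and host_dev_create are hypothetical names, and reporting a
missing callback through errp instead of assert() is a design variation, not
what the patch does:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for QEMU's Error; the real API is richer. */
typedef struct Error {
    const char *msg;
} Error;

typedef struct Dev {
    void *hiod;                                    /* handle set on success */
    int (*create)(struct Dev *dev, Error **errp);  /* may be NULL */
} Dev;

/* Example backend callback: allocate the handle, report failure via errp. */
static int dev_create_ok(Dev *dev, Error **errp)
{
    static Error oom = { "allocation failed" };

    dev->hiod = calloc(1, 16);
    if (!dev->hiod) {
        if (errp) {
            *errp = &oom;
        }
        return -1;
    }
    return 0;
}

/* Wrapper: returns 0 on success, negative on failure with *errp set. */
static int host_dev_create(Dev *dev, Error **errp)
{
    static Error no_cb = { "create callback not implemented" };

    if (!dev->create) {
        if (errp) {
            *errp = &no_cb;
        }
        return -1;
    }
    return dev->create(dev, errp);
}
```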

Thanks
Zhenzhong

>
>Eric
>>  /* migration feature */
>>  int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
>> bool start);
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 059bfdc07a..41e9031c59 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1521,3 +1521,11 @@ void vfio_detach_device(VFIODevice
>*vbasedev)
>>  }
>>  vbasedev->bcontainer->ops->detach_device(vbasedev);
>>  }
>> +
>> +void host_iommu_device_create(VFIODevice *vbasedev)
>> +{
>> +const VFIOIOMMUClass *ops = vbasedev->bcontainer->ops;
>> +
>> +assert(ops->host_iommu_device_create);
>> +ops->host_iommu_device_create(vbasedev);
>> +}



RE: [PATCH v1 01/11] Introduce a common abstract struct HostIOMMUDevice

2024-03-18 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 01/11] Introduce a common abstract struct
>HostIOMMUDevice
>
>Hi Zhenzhong,
>On 2/28/24 04:58, Zhenzhong Duan wrote:
>> HostIOMMUDevice will be inherited by two sub classes,
>> legacy and iommufd currently.
>As this patch introduces the object, you describe what the object is
>meant for and used for. Maybe reuse text from the cover letter

Sure, will do.

Thanks
Zhenzhong

>
>Thanks
>
>Eric
>>
>> Introduce a helper function host_iommu_base_device_init to initialize it.
>>
>> Suggested-by: Eric Auger 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  include/sysemu/host_iommu_device.h | 22 ++
>>  1 file changed, 22 insertions(+)
>>  create mode 100644 include/sysemu/host_iommu_device.h
>>
>> diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>> new file mode 100644
>> index 00..fe80ab25fb
>> --- /dev/null
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -0,0 +1,22 @@
>> +#ifndef HOST_IOMMU_DEVICE_H
>> +#define HOST_IOMMU_DEVICE_H
>> +
>> +typedef enum HostIOMMUDevice_Type {
>> +HID_LEGACY,
>> +HID_IOMMUFD,
>> +HID_MAX,
>> +} HostIOMMUDevice_Type;
>> +
>> +typedef struct HostIOMMUDevice {
>> +HostIOMMUDevice_Type type;
>> +size_t size;
>> +} HostIOMMUDevice;
>> +
>> +static inline void host_iommu_base_device_init(HostIOMMUDevice *dev,
>> +   HostIOMMUDevice_Type type,
>> +   size_t size)
>> +{
>> +dev->type = type;
>> +dev->size = size;
>> +}
>> +#endif



RE: [PATCH v1 08/11] vfio/pci: Allocate and initialize HostIOMMUDevice after attachment

2024-03-18 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 08/11] vfio/pci: Allocate and initialize
>HostIOMMUDevice after attachment
>
>
>
>On 2/28/24 04:58, Zhenzhong Duan wrote:
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  hw/vfio/pci.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 4fa387f043..6cc7de5d10 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3006,6 +3006,9 @@ static void vfio_realize(PCIDevice *pdev, Error
>**errp)
>>  goto error;
>>  }
>>
>> +/* Allocate and initialize HostIOMMUDevice after attachment succeed
>*/
>after successful attachment?
>> +host_iommu_device_create(vbasedev);
>> +
>you shall free on error: as well

I free it in vfio_instance_finalize().
Vfio-pci's design is special: it doesn't free all allocated resources in
realize's error path. They are freed in _finalize() instead, e.g.
vdev->emulated_config_bits, vdev->rom, devices and group resources
(vfio_detach_device). I'm following the same approach. I'm also fine with
freeing it as you suggested, something like below:

--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3246,6 +3246,7 @@ out_teardown:
 vfio_teardown_msi(vdev);
 vfio_bars_exit(vdev);
 error:
+g_free(vdev->vbasedev.base_hdev);
 error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->name);
 }

@@ -3288,6 +3289,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 vfio_bars_exit(vdev);
 vfio_migration_exit(vbasedev);
 pci_device_unset_iommu_device(pdev);
+g_free(vdev->vbasedev.base_hdev);
 }
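
If the handle can be released from more than one cleanup stage, a minimal
sketch of keeping that safe is to clear the pointer after freeing it. This
assumes plain malloc/free instead of GLib, and Dev, dev_free_hdev, dev_realize
and dev_finalize are hypothetical stand-ins for the vfio-pci code:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Dev {
    void *base_hdev;
} Dev;

/* Free the handle and clear it so a later cleanup stage is a no-op. */
static void dev_free_hdev(Dev *dev)
{
    free(dev->base_hdev);   /* free(NULL) is defined to do nothing */
    dev->base_hdev = NULL;
}

/* Stand-in for realize(): allocate, then fail or succeed as requested. */
static int dev_realize(Dev *dev, int fail_late)
{
    dev->base_hdev = calloc(1, 16);
    if (!dev->base_hdev) {
        return -1;
    }
    if (fail_late) {
        dev_free_hdev(dev);     /* error path releases the handle */
        return -1;
    }
    return 0;
}

/* Stand-in for finalize(): safe even if the error path already freed. */
static void dev_finalize(Dev *dev)
{
    dev_free_hdev(dev);
}
```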

>
>Eric
>>  vfio_populate_device(vdev, &err);
>>  if (err) {
>>  error_propagate(errp, err);
>> @@ -3244,6 +3247,7 @@ static void vfio_instance_finalize(Object *obj)
>>
>>  vfio_display_finalize(vdev);
>>  vfio_bars_finalize(vdev);
>> +g_free(vdev->vbasedev.base_hdev);

I free it here.

Thanks
Zhenzhong

>>  g_free(vdev->emulated_config_bits);
>>  g_free(vdev->rom);
>>  /*



RE: [PATCH v1 05/11] vfio: Introduce host_iommu_device_create callback

2024-03-18 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 05/11] vfio: Introduce host_iommu_device_create
>callback
>
>
>
>On 3/18/24 14:52, Eric Auger wrote:
>> Hi ZHenzhong,
>>
>> On 2/28/24 04:58, Zhenzhong Duan wrote:
>>> Introduce host_iommu_device_create callback and a wrapper for it.
>>>
>>> This callback is used to allocate a host iommu device instance and
>>> initialize it based on type.
>>>
>>> Signed-off-by: Zhenzhong Duan 
>>> ---
>>>  include/hw/vfio/vfio-common.h | 1 +
>>>  include/hw/vfio/vfio-container-base.h | 1 +
>>>  hw/vfio/common.c  | 8 
>>>  3 files changed, 10 insertions(+)
>>>
>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>>> index b6676c9f79..9fefea4b89 100644
>>> --- a/include/hw/vfio/vfio-common.h
>>> +++ b/include/hw/vfio/vfio-common.h
>>> @@ -208,6 +208,7 @@ struct vfio_device_info *vfio_get_device_info(int
>fd);
>>>  int vfio_attach_device(char *name, VFIODevice *vbasedev,
>>> AddressSpace *as, Error **errp);
>>>  void vfio_detach_device(VFIODevice *vbasedev);
>>> +void host_iommu_device_create(VFIODevice *vbasedev);
>>>
>>>  int vfio_kvm_device_add_fd(int fd, Error **errp);
>>>  int vfio_kvm_device_del_fd(int fd, Error **errp);
>>> diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-
>container-base.h
>>> index b2813b0c11..dc003f6eb2 100644
>>> --- a/include/hw/vfio/vfio-container-base.h
>>> +++ b/include/hw/vfio/vfio-container-base.h
>>> @@ -120,6 +120,7 @@ struct VFIOIOMMUClass {
>>>  int (*attach_device)(const char *name, VFIODevice *vbasedev,
>>>   AddressSpace *as, Error **errp);
>>>  void (*detach_device)(VFIODevice *vbasedev);
>>> +void (*host_iommu_device_create)(VFIODevice *vbasedev);
>>>  /* migration feature */
>>>  int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
>>> bool start);
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index 059bfdc07a..41e9031c59 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -1521,3 +1521,11 @@ void vfio_detach_device(VFIODevice
>*vbasedev)
>>>  }
>>>  vbasedev->bcontainer->ops->detach_device(vbasedev);
>>>  }
>>> +
>>> +void host_iommu_device_create(VFIODevice *vbasedev)
>>> +{
>>> +const VFIOIOMMUClass *ops = vbasedev->bcontainer->ops;
>>> +
>>> +assert(ops->host_iommu_device_create);
>> at this stage ops actual implementation do not exist yet so this will
>> break the bisection
>
>Sorry it is OK at the function only is called in
>[PATCH v1 08/11] vfio/pci: Allocate and initialize HostIOMMUDevice after
>attachment
>
>Sorry for the noise

Ah, send too quickly. No problem.

Thanks
Zhenzhong


RE: [PATCH v1 05/11] vfio: Introduce host_iommu_device_create callback

2024-03-18 Thread Duan, Zhenzhong
Hi Eric,

>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 05/11] vfio: Introduce host_iommu_device_create
>callback
>
>Hi ZHenzhong,
>
>On 2/28/24 04:58, Zhenzhong Duan wrote:
>> Introduce host_iommu_device_create callback and a wrapper for it.
>>
>> This callback is used to allocate a host iommu device instance and
>> initialize it based on type.
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  include/hw/vfio/vfio-common.h | 1 +
>>  include/hw/vfio/vfio-container-base.h | 1 +
>>  hw/vfio/common.c  | 8 
>>  3 files changed, 10 insertions(+)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index b6676c9f79..9fefea4b89 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -208,6 +208,7 @@ struct vfio_device_info *vfio_get_device_info(int
>fd);
>>  int vfio_attach_device(char *name, VFIODevice *vbasedev,
>> AddressSpace *as, Error **errp);
>>  void vfio_detach_device(VFIODevice *vbasedev);
>> +void host_iommu_device_create(VFIODevice *vbasedev);
>>
>>  int vfio_kvm_device_add_fd(int fd, Error **errp);
>>  int vfio_kvm_device_del_fd(int fd, Error **errp);
>> diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-
>container-base.h
>> index b2813b0c11..dc003f6eb2 100644
>> --- a/include/hw/vfio/vfio-container-base.h
>> +++ b/include/hw/vfio/vfio-container-base.h
>> @@ -120,6 +120,7 @@ struct VFIOIOMMUClass {
>>  int (*attach_device)(const char *name, VFIODevice *vbasedev,
>>   AddressSpace *as, Error **errp);
>>  void (*detach_device)(VFIODevice *vbasedev);
>> +void (*host_iommu_device_create)(VFIODevice *vbasedev);
>>  /* migration feature */
>>  int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
>> bool start);
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 059bfdc07a..41e9031c59 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1521,3 +1521,11 @@ void vfio_detach_device(VFIODevice
>*vbasedev)
>>  }
>>  vbasedev->bcontainer->ops->detach_device(vbasedev);
>>  }
>> +
>> +void host_iommu_device_create(VFIODevice *vbasedev)
>> +{
>> +const VFIOIOMMUClass *ops = vbasedev->bcontainer->ops;
>> +
>> +assert(ops->host_iommu_device_create);
>at this stage ops actual implementation do not exist yet so this will
>break the bisection

This patch only introduces host_iommu_device_create, but nothing calls
into it yet. Patches 6-7 implement the callback for the different backends and
patch 8 calls host_iommu_device_create(), so I think the order is OK.
Let me know if I missed your point.

Thanks
Zhenzhong

>
>Eric
>> +ops->host_iommu_device_create(vbasedev);
>> +}



RE: [PATCH v1 04/11] vfio: Add HostIOMMUDevice handle into VFIODevice

2024-03-18 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: Re: [PATCH v1 04/11] vfio: Add HostIOMMUDevice handle into
>VFIODevice
>
>
>
>On 2/28/24 04:58, Zhenzhong Duan wrote:
>> This handle points to either IOMMULegacyDevice or IOMMUFDDevice
>variant,
>> neither both.
>I would reword into:
>store an handle to the HostIOMMUDevice the VFIODevice is associated with
>. Its actual nature depends on the backend in use (VFIO or IOMMUFD).

More clear, thanks Eric, will use it.

Zhenzhong

>
>Thanks
>
>Eric
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  include/hw/vfio/vfio-common.h | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
>common.h
>> index 8bfb9cbe94..b6676c9f79 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -130,6 +130,7 @@ typedef struct VFIODevice {
>>  OnOffAuto pre_copy_dirty_page_tracking;
>>  bool dirty_pages_supported;
>>  bool dirty_tracking;
>> +HostIOMMUDevice *base_hdev;
>>  int devid;
>>  IOMMUFDBackend *iommufd;
>>  } VFIODevice;



RE: [PATCH 2/2] qom/object_interfaces: Remove local_err in user_creatable_add_type

2024-03-15 Thread Duan, Zhenzhong



>-Original Message-
>From: Liu, Zhao1 
>Subject: Re: [PATCH 2/2] qom/object_interfaces: Remove local_err in
>user_creatable_add_type
>
>On Thu, Feb 29, 2024 at 11:37:39AM +0800, Zhenzhong Duan wrote:
>> Date: Thu, 29 Feb 2024 11:37:39 +0800
>> From: Zhenzhong Duan 
>> Subject: [PATCH 2/2] qom/object_interfaces: Remove local_err in
>>  user_creatable_add_type
>> X-Mailer: git-send-email 2.34.1
>>
>> In user_creatable_add_type, there is mixed usage of ERRP_GUARD and
>> local_err. This makes error_abort not taking effect in those callee
>> functions with local_err passed.
>>
>> Now that we already has ERRP_GUARD, remove local_err and use *errp
>> instead.
>>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  qom/object_interfaces.c | 12 +---
>>  1 file changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c
>> index 255a7bf659..165cd433e7 100644
>> --- a/qom/object_interfaces.c
>> +++ b/qom/object_interfaces.c
>> @@ -81,7 +81,6 @@ Object *user_creatable_add_type(const char *type,
>const char *id,
>>  ERRP_GUARD();
>>  Object *obj;
>>  ObjectClass *klass;
>> -Error *local_err = NULL;
>>
>>  if (id != NULL && !id_wellformed(id)) {
>>  error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "id", "an
>identifier");
>> @@ -109,20 +108,20 @@ Object *user_creatable_add_type(const char
>*type, const char *id,
>>
>>  assert(qdict);
>>  obj = object_new(type);
>> -object_set_properties_from_qdict(obj, qdict, v, &local_err);
>> -if (local_err) {
>> +object_set_properties_from_qdict(obj, qdict, v, errp);
>
>It's better to make object_set_properties_from_qdict return something (e.g.,
>a boolean). Maybe an extra cleanup?

OK, will do.

>
>> +if (*errp) {
>>  goto out;
>>  }
>>
>>  if (id != NULL) {
>>  object_property_try_add_child(object_get_objects_root(),
>> -  id, obj, &local_err);
>> -if (local_err) {
>> +  id, obj, errp);
>> +if (*errp) {
>>  goto out;
>>  }
>>  }
>
>Here we could check whether the returned ObjectProperty* is NULL instead
>of dereferencing errp.

Indeed, that's better, will do.

Thanks
Zhenzhong
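
A minimal sketch of what both review comments amount to: the callee reports failure through its return value, so the caller never has to dereference errp. The Error type and the set_error(), set_property() and create_object() helpers below are simplified stand-ins invented for this example, not QEMU's real error API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for an error object; not the real QEMU Error. */
typedef struct Error {
    const char *msg;
} Error;

static Error bad_value = { "invalid property value" };

/* Store the error only if the caller asked for it. */
static void set_error(Error **errp, Error *err)
{
    if (errp) {
        *errp = err;
    }
}

/* Hypothetical setter: true on success, false with *errp set on failure. */
static bool set_property(int value, Error **errp)
{
    if (value < 0) {
        set_error(errp, &bad_value);
        return false;
    }
    return true;
}

/* The caller tests the return value, so it also works when errp is NULL. */
static int create_object(int value, Error **errp)
{
    if (!set_property(value, errp)) {
        return -1;
    }
    return 0;
}
```

With this shape the "if (*errp)" checks in the patch would become plain return-value checks.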



RE: [PATCH 1/2] qom/object_interfaces: Remove unnecessary local_err check

2024-03-15 Thread Duan, Zhenzhong



>-Original Message-
>From: Liu, Zhao1 
>Subject: Re: [PATCH 1/2] qom/object_interfaces: Remove unnecessary
>local_err check
>
>On Thu, Feb 29, 2024 at 11:37:38AM +0800, Zhenzhong Duan wrote:
>> Date: Thu, 29 Feb 2024 11:37:38 +0800
>> From: Zhenzhong Duan 
>> Subject: [PATCH 1/2] qom/object_interfaces: Remove unnecessary
>local_err
>>  check
>> X-Mailer: git-send-email 2.34.1
>>
>> In the error return path, local_err is always set, no need to check it.
>
>The original error handling code indicates "local_err is always set",
>and error_propagate() can handle the case that local_err is NULL.

Will do.

>
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  qom/object_interfaces.c | 10 --
>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c
>> index e0833c8bfe..255a7bf659 100644
>> --- a/qom/object_interfaces.c
>> +++ b/qom/object_interfaces.c
>> @@ -128,13 +128,11 @@ Object *user_creatable_add_type(const char
>*type, const char *id,
>>  }
>>  goto out;
>>  }
>> -out:
>> -if (local_err) {
>> -error_propagate(errp, local_err);
>> -object_unref(obj);
>> -return NULL;
>> -}
>>  return obj;
>> +out:
>
>Maybe rename this to "err:"? Since now it's just used to handle error,
>and "goto err" seems more clear.

Good suggestion, will do.

Thanks
Zhenzhong
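
For illustration, the resulting control flow (success returns early, the label is reached only on failure) looks roughly like the sketch below. Object, object_create(), object_release() and make_object() are hypothetical helpers written for this example; the live_objects counter exists only so the cleanup can be observed.

```c
#include <assert.h>
#include <stdlib.h>

static int live_objects;    /* counts objects not yet released */

typedef struct Object {
    int id;
} Object;

static Object *object_create(void)
{
    Object *obj = malloc(sizeof(*obj));

    if (obj) {
        live_objects++;
    }
    return obj;
}

static void object_release(Object *obj)
{
    if (obj) {
        free(obj);
        live_objects--;
    }
}

/* Success path returns directly; "err" is used for failures only. */
static Object *make_object(int id)
{
    Object *obj = object_create();

    if (!obj) {
        return NULL;
    }
    if (id < 0) {
        goto err;
    }
    obj->id = id;
    return obj;

err:
    object_release(obj);
    return NULL;
}
```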

>
>> +error_propagate(errp, local_err);
>> +object_unref(obj);
>> +return NULL;
>>  }
>>
>>  void user_creatable_add_qapi(ObjectOptions *options, Error **errp)
>> --
>> 2.34.1
>>
>
>Otherwise,
>
>Reviewed-by: Zhao Liu 
>




RE: [PATCH v2] vfio/iommufd: Fix memory leak

2024-03-14 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Subject: [PATCH v2] vfio/iommufd: Fix memory leak
>
>Coverity reported a memory leak on variable 'contents' in routine
>iommufd_cdev_getfd(). Use g_autofree variables to simplify the exit
>path and get rid of g_free() calls.
>
>Cc: Eric Auger 
>Cc: Yi Liu 
>Fixes: CID 1540007
>Fixes: 5ee3dc7af785 ("vfio/iommufd: Implement the iommufd backend")
>Suggested-by: Zhenzhong Duan 
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong
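
As a rough illustration of why g_autofree removes the need for the out_free_* labels: it relies on the compiler's cleanup attribute (GCC and Clang), which runs a function when the variable goes out of scope. The sketch below does not use GLib; AUTOFREE, auto_free(), parse_devt() and free_count are names made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int free_count;    /* counts releases so the behaviour is observable */

/* Cleanup handler: receives a pointer to the annotated variable. */
static void auto_free(void *p)
{
    void *mem = *(void **)p;

    if (mem) {
        free(mem);
        free_count++;
    }
}

#define AUTOFREE __attribute__((cleanup(auto_free)))

/* Parse "major:minor"; every return path releases the copy automatically. */
static int parse_devt(const char *text, int *major, int *minor)
{
    AUTOFREE char *contents = malloc(strlen(text) + 1);

    if (!contents) {
        return -1;
    }
    strcpy(contents, text);

    if (sscanf(contents, "%d:%d", major, minor) != 2) {
        return -1;    /* no explicit free needed here */
    }
    return 0;
}
```

Both the success path and the sscanf failure path free the buffer, which is the leak the Coverity report was about.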

>---
> hw/vfio/iommufd.c | 19 ---
> 1 file changed, 8 insertions(+), 11 deletions(-)
>
>diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>index
>a75a785e90c64cdcc4d10c88d217801b3f536cdb..b9c7efb3ef11e49e189103
>ae6fb9011a631b60da 100644
>--- a/hw/vfio/iommufd.c
>+++ b/hw/vfio/iommufd.c
>@@ -118,10 +118,12 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
> {
> ERRP_GUARD();
> long int ret = -ENOTTY;
>-char *path, *vfio_dev_path = NULL, *vfio_path = NULL;
>+g_autofree char *path = NULL;
>+g_autofree char *vfio_dev_path = NULL;
>+g_autofree char *vfio_path = NULL;
> DIR *dir = NULL;
> struct dirent *dent;
>-gchar *contents;
>+g_autofree gchar *contents = NULL;
> gsize length;
> int major, minor;
> dev_t vfio_devt;
>@@ -130,7 +132,7 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
> dir = opendir(path);
> if (!dir) {
> error_setg_errno(errp, errno, "couldn't open directory %s", path);
>-goto out_free_path;
>+goto out;
> }
>
> while ((dent = readdir(dir))) {
>@@ -147,14 +149,13 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
>
> if (!g_file_get_contents(vfio_dev_path, &contents, &length, NULL)) {
> error_setg(errp, "failed to load \"%s\"", vfio_dev_path);
>-goto out_free_dev_path;
>+goto out_close_dir;
> }
>
> if (sscanf(contents, "%d:%d", &major, &minor) != 2) {
> error_setg(errp, "failed to get major:minor for \"%s\"", 
> vfio_dev_path);
>-goto out_free_dev_path;
>+goto out_close_dir;
> }
>-g_free(contents);
> vfio_devt = makedev(major, minor);
>
> vfio_path = g_strdup_printf("/dev/vfio/devices/%s", dent->d_name);
>@@ -164,17 +165,13 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
> }
>
> trace_iommufd_cdev_getfd(vfio_path, ret);
>-g_free(vfio_path);
>
>-out_free_dev_path:
>-g_free(vfio_dev_path);
> out_close_dir:
> closedir(dir);
>-out_free_path:
>+out:
> if (*errp) {
> error_prepend(errp, VFIO_MSG_PREFIX, path);
> }
>-g_free(path);
>
> return ret;
> }
>--
>2.44.0



RE: [PATCH v1 3/6] intel_iommu: Add a framework to check and sync host IOMMU cap/ecap

2024-03-13 Thread Duan, Zhenzhong



>-Original Message-
>From: Michael S. Tsirkin 
>Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and
>sync host IOMMU cap/ecap
>
>On Wed, Mar 13, 2024 at 07:54:11AM +0000, Duan, Zhenzhong wrote:
>>
>>
>> >-Original Message-
>> >From: Michael S. Tsirkin 
>> >Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and
>> >sync host IOMMU cap/ecap
>> >
>> >On Wed, Mar 13, 2024 at 02:52:39AM +0000, Duan, Zhenzhong wrote:
>> >> Hi Michael,
>> >>
>> >> >-Original Message-
>> >> >From: Michael S. Tsirkin 
>> >> >Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check
>and
>> >> >sync host IOMMU cap/ecap
>> >> >
>> >> >On Wed, Feb 28, 2024 at 05:44:29PM +0800, Zhenzhong Duan wrote:
>> >> >> From: Yi Liu 
>> >> >>
>> >> >> Add a framework to check and synchronize host IOMMU cap/ecap
>with
>> >> >> vIOMMU cap/ecap.
>> >> >>
>> >> >> The sequence will be:
>> >> >>
>> >> >> vtd_cap_init() initializes iommu->cap/ecap.
>> >> >> vtd_check_hdev() update iommu->cap/ecap based on host cap/ecap.
>> >> >> iommu->cap_frozen set when machine create done, iommu-
>>cap/ecap
>> >> >become readonly.
>> >> >>
>> >> >> Implementation details for different backends will be in following
>> >patches.
>> >> >>
>> >> >> Signed-off-by: Yi Liu 
>> >> >> Signed-off-by: Yi Sun 
>> >> >> Signed-off-by: Zhenzhong Duan 
>> >> >> ---
>> >> >>  include/hw/i386/intel_iommu.h |  1 +
>> >> >>  hw/i386/intel_iommu.c | 50
>> >> >++-
>> >> >>  2 files changed, 50 insertions(+), 1 deletion(-)
>> >> >>
>> >> >> diff --git a/include/hw/i386/intel_iommu.h
>> >> >b/include/hw/i386/intel_iommu.h
>> >> >> index bbc7b96add..c71a133820 100644
>> >> >> --- a/include/hw/i386/intel_iommu.h
>> >> >> +++ b/include/hw/i386/intel_iommu.h
>> >> >> @@ -283,6 +283,7 @@ struct IntelIOMMUState {
>> >> >>
>> >> >>  uint64_t cap;   /* The value of capability reg */
>> >> >>  uint64_t ecap;  /* The value of extended 
>> >> >> capability reg
>*/
>> >> >> +bool cap_frozen;/* cap/ecap become read-only after
>> >frozen */
>> >> >>
>> >> >>  uint32_t context_cache_gen; /* Should be in [1,MAX] */
>> >> >>  GHashTable *iotlb;  /* IOTLB */
>> >> >> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> >> >> index ffa1ad6429..a9f9dfd6a7 100644
>> >> >> --- a/hw/i386/intel_iommu.c
>> >> >> +++ b/hw/i386/intel_iommu.c
>> >> >> @@ -35,6 +35,8 @@
>> >> >>  #include "sysemu/kvm.h"
>> >> >>  #include "sysemu/dma.h"
>> >> >>  #include "sysemu/sysemu.h"
>> >> >> +#include "hw/vfio/vfio-common.h"
>> >> >> +#include "sysemu/iommufd.h"
>> >> >>  #include "hw/i386/apic_internal.h"
>> >> >>  #include "kvm/kvm_i386.h"
>> >> >>  #include "migration/vmstate.h"
>> >> >> @@ -3819,6 +3821,38 @@ VTDAddressSpace
>> >> >*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>> >> >>  return vtd_dev_as;
>> >> >>  }
>> >> >>
>> >> >> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>> >> >> + IOMMULegacyDevice *ldev,
>> >> >> + Error **errp)
>> >> >> +{
>> >> >> +return 0;
>> >> >> +}
>> >> >> +
>> >> >> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>> >> >> +  IOMMUFDDevice *idev,
>> >> >> +  Error **errp)
>> >> >> +{
>> >> >> +return 0;
>> >> >> +}

RE: [PATCH] vfio/iommufd: Fix memory leak

2024-03-13 Thread Duan, Zhenzhong


>-Original Message-
>From: Cédric Le Goater 
>Sent: Thursday, March 14, 2024 5:06 AM
>To: qemu-devel@nongnu.org
>Cc: Alex Williamson ; Cédric Le Goater
>; Eric Auger ; Liu, Yi L
>; Duan, Zhenzhong 
>Subject: [PATCH] vfio/iommufd: Fix memory leak
>
>Make sure variable contents is freed if scanf fails.
>
>Cc: Eric Auger 
>Cc: Yi Liu 
>Cc: Zhenzhong Duan 
>Fixes: CID 1540007
>Fixes: 5ee3dc7af785 ("vfio/iommufd: Implement the iommufd backend")
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Unrelated to this patch: I see there are four g_free() calls. I'm not sure
whether it is worth cleaning them up with g_autofree.

Thanks
Zhenzhong

>---
> hw/vfio/iommufd.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
>diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>index
>a75a785e90c64cdcc4d10c88d217801b3f536cdb..cd549e0ee8573e75772c5
>1cc96153762a6bc8550 100644
>--- a/hw/vfio/iommufd.c
>+++ b/hw/vfio/iommufd.c
>@@ -152,9 +152,8 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
>
> if (sscanf(contents, "%d:%d", &major, &minor) != 2) {
> error_setg(errp, "failed to get major:minor for \"%s\"", 
> vfio_dev_path);
>-goto out_free_dev_path;
>+goto out_free_contents;
> }
>-g_free(contents);
> vfio_devt = makedev(major, minor);
>
> vfio_path = g_strdup_printf("/dev/vfio/devices/%s", dent->d_name);
>@@ -166,6 +165,8 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
> trace_iommufd_cdev_getfd(vfio_path, ret);
> g_free(vfio_path);
>
>+out_free_contents:
>+g_free(contents);
> out_free_dev_path:
> g_free(vfio_dev_path);
> out_close_dir:
>--
>2.44.0



RE: [PATCH v1 3/6] intel_iommu: Add a framework to check and sync host IOMMU cap/ecap

2024-03-13 Thread Duan, Zhenzhong



>-Original Message-
>From: Michael S. Tsirkin 
>Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and
>sync host IOMMU cap/ecap
>
>On Wed, Mar 13, 2024 at 02:52:39AM +0000, Duan, Zhenzhong wrote:
>> Hi Michael,
>>
>> >-Original Message-
>> >From: Michael S. Tsirkin 
>> >Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and
>> >sync host IOMMU cap/ecap
>> >
>> >On Wed, Feb 28, 2024 at 05:44:29PM +0800, Zhenzhong Duan wrote:
>> >> From: Yi Liu 
>> >>
>> >> Add a framework to check and synchronize host IOMMU cap/ecap with
>> >> vIOMMU cap/ecap.
>> >>
>> >> The sequence will be:
>> >>
>> >> vtd_cap_init() initializes iommu->cap/ecap.
>> >> vtd_check_hdev() update iommu->cap/ecap based on host cap/ecap.
>> >> iommu->cap_frozen set when machine create done, iommu->cap/ecap
>> >become readonly.
>> >>
>> >> Implementation details for different backends will be in following
>patches.
>> >>
>> >> Signed-off-by: Yi Liu 
>> >> Signed-off-by: Yi Sun 
>> >> Signed-off-by: Zhenzhong Duan 
>> >> ---
>> >>  include/hw/i386/intel_iommu.h |  1 +
>> >>  hw/i386/intel_iommu.c | 50
>> >++-
>> >>  2 files changed, 50 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/include/hw/i386/intel_iommu.h
>> >b/include/hw/i386/intel_iommu.h
>> >> index bbc7b96add..c71a133820 100644
>> >> --- a/include/hw/i386/intel_iommu.h
>> >> +++ b/include/hw/i386/intel_iommu.h
>> >> @@ -283,6 +283,7 @@ struct IntelIOMMUState {
>> >>
>> >>  uint64_t cap;   /* The value of capability reg */
>> >>  uint64_t ecap;  /* The value of extended capability 
>> >> reg */
>> >> +bool cap_frozen;/* cap/ecap become read-only after
>frozen */
>> >>
>> >>  uint32_t context_cache_gen; /* Should be in [1,MAX] */
>> >>  GHashTable *iotlb;  /* IOTLB */
>> >> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> >> index ffa1ad6429..a9f9dfd6a7 100644
>> >> --- a/hw/i386/intel_iommu.c
>> >> +++ b/hw/i386/intel_iommu.c
>> >> @@ -35,6 +35,8 @@
>> >>  #include "sysemu/kvm.h"
>> >>  #include "sysemu/dma.h"
>> >>  #include "sysemu/sysemu.h"
>> >> +#include "hw/vfio/vfio-common.h"
>> >> +#include "sysemu/iommufd.h"
>> >>  #include "hw/i386/apic_internal.h"
>> >>  #include "kvm/kvm_i386.h"
>> >>  #include "migration/vmstate.h"
>> >> @@ -3819,6 +3821,38 @@ VTDAddressSpace
>> >*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>> >>  return vtd_dev_as;
>> >>  }
>> >>
>> >> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>> >> + IOMMULegacyDevice *ldev,
>> >> + Error **errp)
>> >> +{
>> >> +return 0;
>> >> +}
>> >> +
>> >> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>> >> +  IOMMUFDDevice *idev,
>> >> +  Error **errp)
>> >> +{
>> >> +return 0;
>> >> +}
>> >> +
>> >> +static int vtd_check_hdev(IntelIOMMUState *s,
>VTDHostIOMMUDevice
>> >*vtd_hdev,
>> >> +  Error **errp)
>> >> +{
>> >> +HostIOMMUDevice *base_dev = vtd_hdev->dev;
>> >> +IOMMUFDDevice *idev;
>> >> +
>> >> +if (base_dev->type == HID_LEGACY) {
>> >> +IOMMULegacyDevice *ldev = container_of(base_dev,
>> >> +   IOMMULegacyDevice, base);
>> >> +
>> >> +return vtd_check_legacy_hdev(s, ldev, errp);
>> >> +}
>> >> +
>> >> +idev = container_of(base_dev, IOMMUFDDevice, base);
>> >> +
>> >> +return vtd_check_iommufd_hdev(s, idev, errp);
>> >> +}
>> >> +
>> >>  static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>> >devfn,
>> >> 

RE: [PATCH v1 3/6] intel_iommu: Add a framework to check and sync host IOMMU cap/ecap

2024-03-12 Thread Duan, Zhenzhong
Hi Michael,

>-Original Message-
>From: Michael S. Tsirkin 
>Subject: Re: [PATCH v1 3/6] intel_iommu: Add a framework to check and
>sync host IOMMU cap/ecap
>
>On Wed, Feb 28, 2024 at 05:44:29PM +0800, Zhenzhong Duan wrote:
>> From: Yi Liu 
>>
>> Add a framework to check and synchronize host IOMMU cap/ecap with
>> vIOMMU cap/ecap.
>>
>> The sequence will be:
>>
>> vtd_cap_init() initializes iommu->cap/ecap.
>> vtd_check_hdev() update iommu->cap/ecap based on host cap/ecap.
>> iommu->cap_frozen set when machine create done, iommu->cap/ecap
>become readonly.
>>
>> Implementation details for different backends will be in following patches.
>>
>> Signed-off-by: Yi Liu 
>> Signed-off-by: Yi Sun 
>> Signed-off-by: Zhenzhong Duan 
>> ---
>>  include/hw/i386/intel_iommu.h |  1 +
>>  hw/i386/intel_iommu.c | 50
>++-
>>  2 files changed, 50 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/i386/intel_iommu.h
>b/include/hw/i386/intel_iommu.h
>> index bbc7b96add..c71a133820 100644
>> --- a/include/hw/i386/intel_iommu.h
>> +++ b/include/hw/i386/intel_iommu.h
>> @@ -283,6 +283,7 @@ struct IntelIOMMUState {
>>
>>  uint64_t cap;   /* The value of capability reg */
>>  uint64_t ecap;  /* The value of extended capability reg 
>> */
>> +bool cap_frozen;/* cap/ecap become read-only after 
>> frozen */
>>
>>  uint32_t context_cache_gen; /* Should be in [1,MAX] */
>>  GHashTable *iotlb;  /* IOTLB */
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index ffa1ad6429..a9f9dfd6a7 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -35,6 +35,8 @@
>>  #include "sysemu/kvm.h"
>>  #include "sysemu/dma.h"
>>  #include "sysemu/sysemu.h"
>> +#include "hw/vfio/vfio-common.h"
>> +#include "sysemu/iommufd.h"
>>  #include "hw/i386/apic_internal.h"
>>  #include "kvm/kvm_i386.h"
>>  #include "migration/vmstate.h"
>> @@ -3819,6 +3821,38 @@ VTDAddressSpace
>*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>  return vtd_dev_as;
>>  }
>>
>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
>> + IOMMULegacyDevice *ldev,
>> + Error **errp)
>> +{
>> +return 0;
>> +}
>> +
>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
>> +  IOMMUFDDevice *idev,
>> +  Error **errp)
>> +{
>> +return 0;
>> +}
>> +
>> +static int vtd_check_hdev(IntelIOMMUState *s, VTDHostIOMMUDevice
>*vtd_hdev,
>> +  Error **errp)
>> +{
>> +HostIOMMUDevice *base_dev = vtd_hdev->dev;
>> +IOMMUFDDevice *idev;
>> +
>> +if (base_dev->type == HID_LEGACY) {
>> +IOMMULegacyDevice *ldev = container_of(base_dev,
>> +   IOMMULegacyDevice, base);
>> +
>> +return vtd_check_legacy_hdev(s, ldev, errp);
>> +}
>> +
>> +idev = container_of(base_dev, IOMMUFDDevice, base);
>> +
>> +return vtd_check_iommufd_hdev(s, idev, errp);
>> +}
>> +
>>  static int vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>devfn,
>>  HostIOMMUDevice *base_dev, Error **errp)
>>  {
>> @@ -3829,6 +3863,7 @@ static int vtd_dev_set_iommu_device(PCIBus
>*bus, void *opaque, int devfn,
>>  .devfn = devfn,
>>  };
>>  struct vtd_as_key *new_key;
>> +int ret;
>>
>>  assert(base_dev);
>>
>> @@ -3848,6 +3883,13 @@ static int vtd_dev_set_iommu_device(PCIBus
>*bus, void *opaque, int devfn,
>>  vtd_hdev->iommu_state = s;
>>  vtd_hdev->dev = base_dev;
>>
>> +ret = vtd_check_hdev(s, vtd_hdev, errp);
>> +if (ret) {
>> +g_free(vtd_hdev);
>> +vtd_iommu_unlock(s);
>> +return ret;
>> +}
>> +
>>  new_key = g_malloc(sizeof(*new_key));
>>  new_key->bus = bus;
>>  new_key->devfn = devfn;
>
>
>Okay. So when VFIO device is created, it will call vtd_dev_set_iommu_device
>and that in turn will update caps.
>
>
>
>
>> @@ -4083,7 +4125,9 @@ static void vtd_init(IntelIOMMUState *s)
>>  s->iq_dw = false;
>>  s->next_frcd_reg = 0;
>>
>> -vtd_cap_init(s);
>> +if (!s->cap_frozen) {
>> +vtd_cap_init(s);
>> +}
>>
>
>If it's frozen it's because VFIO was added after machine done.
>And then what? I think caps are just wrong?

I don't quite get your question about the caps being wrong, but let me try
to explain:

When a hot-plugged vfio device's host IOMMU cap isn't compatible with the
vIOMMU's, hotplug should fail. Currently there is no check for this, so
hotplug is allowed to succeed and problems only show up later. E.g., if the
vIOMMU's MGAW > the host IOMMU's MGAW, the guest can set up IOVA mappings
beyond the IOVA range the host supports, and then DMA will fail.

In fact, before this series, cap is not affected by VFIO, so the effect is
the same as being frozen after machine done.
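
To make the MGAW example concrete, a compatibility check could look roughly like the sketch below. It assumes the VT-d capability register layout where MGAW sits in bits 21:16 and is encoded as (width - 1); cap_mgaw_bits(), cap_with_mgaw() and host_cap_compatible() are hypothetical helpers for illustration, not functions from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: MGAW in bits 21:16 of CAP_REG, encoded as width - 1. */
#define CAP_MGAW_SHIFT 16
#define CAP_MGAW_MASK  0x3fULL

/* Decode the maximum guest address width, in bits, from a cap value. */
static unsigned cap_mgaw_bits(uint64_t cap)
{
    return (unsigned)((cap >> CAP_MGAW_SHIFT) & CAP_MGAW_MASK) + 1;
}

/* Build a cap value carrying only the MGAW field (for the example). */
static uint64_t cap_with_mgaw(unsigned bits)
{
    return ((uint64_t)(bits - 1) & CAP_MGAW_MASK) << CAP_MGAW_SHIFT;
}

/* The vIOMMU must not advertise a wider address space than the host. */
static bool host_cap_compatible(uint64_t vcap, uint64_t host_cap)
{
    return cap_mgaw_bits(vcap) <= cap_mgaw_bits(host_cap);
}
```

A hotplug path would reject the device when such a check returns false, instead of letting DMA fail later.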

>
>
>I think the way to approach this is just by specifying 

RE: [PATCH v8 7/9] hw/i386/q35: Set virtio-iommu aw-bits default value to 39

2024-03-07 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: [PATCH v8 7/9] hw/i386/q35: Set virtio-iommu aw-bits default
>value to 39
>
>Currently the default input range can extend to 64 bits. On x86,
>when the virtio-iommu protects vfio devices, the physical iommu
>may support only 39 bits. Let's set the default to 39, as done
>for the intel-iommu.
>
>We use hw_compat_8_2 to handle the compatibility for machines
>before 9.0 which used to have a virtio-iommu default input range
>of 64 bits.
>
>Of course if aw-bits is set from the command line, the default
>is overridden.
>
>Signed-off-by: Eric Auger 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>
>---
>
>v6 -> v7:
>- use static pc_q35_compat_defaults
>- remove spurious header addition
>- s/32/UINT32_MAX in the qtest
>
>v5 -> v6:
>- split pc/arm settings
>
>v3 -> v4:
>- update the qos test to relax the check on the max input IOVA
>
>v2 -> v3:
>- collected Zhenzhong's R-b
>- use &error_abort instead of NULL error handle
>  on object_property_get_uint() call (Cédric)
>- use VTD_HOST_AW_39BIT (Cédric)
>
>v1 -> v2:
>- set aw-bits to 48b on ARM
>- use hw_compat_8_2 to handle the compat for older machines
>  which used 64b as a default
>---
> hw/core/machine.c   | 1 +
> hw/i386/pc_q35.c| 9 +
> tests/qtest/virtio-iommu-test.c | 2 +-
> 3 files changed, 11 insertions(+), 1 deletion(-)
>
>diff --git a/hw/core/machine.c b/hw/core/machine.c
>index 6bd09d4592..4b89172d1c 100644
>--- a/hw/core/machine.c
>+++ b/hw/core/machine.c
>@@ -35,6 +35,7 @@
>
> GlobalProperty hw_compat_8_2[] = {
> { TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" },
>+{ TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" },
> };
> const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2);
>
>diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>index 45a4102e75..1e7464d39a 100644
>--- a/hw/i386/pc_q35.c
>+++ b/hw/i386/pc_q35.c
>@@ -45,6 +45,7 @@
> #include "hw/i386/pc.h"
> #include "hw/i386/amd_iommu.h"
> #include "hw/i386/intel_iommu.h"
>+#include "hw/virtio/virtio-iommu.h"
> #include "hw/display/ramfb.h"
> #include "hw/ide/pci.h"
> #include "hw/ide/ahci-pci.h"
>@@ -63,6 +64,12 @@
> /* ICH9 AHCI has 6 ports */
> #define MAX_SATA_PORTS 6
>
>+static GlobalProperty pc_q35_compat_defaults[] = {
>+{ TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "39" },
>+};
>+static const size_t pc_q35_compat_defaults_len =
>+G_N_ELEMENTS(pc_q35_compat_defaults);
>+
> struct ehci_companions {
> const char *name;
> int func;
>@@ -356,6 +363,8 @@ static void pc_q35_machine_options(MachineClass
>*m)
> machine_class_allow_dynamic_sysbus_dev(m,
>TYPE_INTEL_IOMMU_DEVICE);
> machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
> machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
>+compat_props_add(m->compat_props,
>+ pc_q35_compat_defaults, pc_q35_compat_defaults_len);
> }
>
> static void pc_q35_9_0_machine_options(MachineClass *m)
>diff --git a/tests/qtest/virtio-iommu-test.c b/tests/qtest/virtio-iommu-test.c
>index 068e7a9e6c..afb225971d 100644
>--- a/tests/qtest/virtio-iommu-test.c
>+++ b/tests/qtest/virtio-iommu-test.c
>@@ -34,7 +34,7 @@ static void pci_config(void *obj, void *data,
>QGuestAllocator *t_alloc)
> uint8_t bypass = qvirtio_config_readb(dev, 36);
>
> g_assert_cmpint(input_range_start, ==, 0);
>-g_assert_cmphex(input_range_end, ==, UINT64_MAX);
>+g_assert_cmphex(input_range_end, >=, UINT32_MAX);
> g_assert_cmpint(domain_range_start, ==, 0);
> g_assert_cmpint(domain_range_end, ==, UINT32_MAX);
> g_assert_cmpint(bypass, ==, 1);
>--
>2.41.0



RE: [PATCH v6 8/9] hw/arm/virt: Set virtio-iommu aw-bits default value to 48

2024-03-05 Thread Duan, Zhenzhong



>-Original Message-
>From: Eric Auger 
>Subject: [PATCH v6 8/9] hw/arm/virt: Set virtio-iommu aw-bits default value
>to 48
>
>On ARM we set 48b as a default (matching SMMUv3 SMMU_IDR5.VAX == 0).
>
>hw_compat_8_2 is used to handle the compatibility for machine types
>before 9.0 (default was 64 bits).
>
>Signed-off-by: Eric Auger 
>---
> hw/arm/virt.c | 17 +
> 1 file changed, 17 insertions(+)
>
>diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>index 0af1943697..dcfb25369b 100644
>--- a/hw/arm/virt.c
>+++ b/hw/arm/virt.c
>@@ -85,11 +85,28 @@
> #include "hw/char/pl011.h"
> #include "qemu/guest-random.h"
>
>+GlobalProperty arm_virt_compat[] = {
>+{ TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "48" },
>+};
>+const size_t arm_virt_compat_len = G_N_ELEMENTS(arm_virt_compat);

This can be static, otherwise,

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>+
>+/*
>+ * This cannot be called from the virt_machine_class_init() because
>+ * TYPE_VIRT_MACHINE is abstract and mc->compat_props
>g_ptr_array_new()
>+ * only is called on virt non abstract class init.
>+ */
>+static void arm_virt_compat_set(MachineClass *mc)
>+{
>+compat_props_add(mc->compat_props, arm_virt_compat,
>+ arm_virt_compat_len);
>+}
>+
> #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
> static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> void *data) \
> { \
> MachineClass *mc = MACHINE_CLASS(oc); \
>+arm_virt_compat_set(mc); \
> virt_machine_##major##_##minor##_options(mc); \
> mc->desc = "QEMU " # major "." # minor " ARM Virtual Machine"; \
> if (latest) { \
>--
>2.41.0




RE: [PATCH v6 7/9] hw/i386/q35: Set virtio-iommu aw-bits default value to 39

2024-03-05 Thread Duan, Zhenzhong
Hi Eric,

>-Original Message-
>From: Eric Auger 
>Subject: [PATCH v6 7/9] hw/i386/q35: Set virtio-iommu aw-bits default
>value to 39
>
>Currently the default input range can extend to 64 bits. On x86,
>when the virtio-iommu protects vfio devices, the physical iommu
>may support only 39 bits. Let's set the default to 39, as done
>for the intel-iommu.
>
>We use hw_compat_8_2 to handle the compatibility for machines
>before 9.0 which used to have a virtio-iommu default input range
>of 64 bits.
>
>Of course if aw-bits is set from the command line, the default
>is overridden.
>
>Signed-off-by: Eric Auger 
>
>---
>
>v5 -> v6:
>- split pc/arm settings
>
>v3 -> v4:
>- update the qos test to relax the check on the max input IOVA
>
>v2 -> v3:
>- collected Zhenzhong's R-b
>- use &error_abort instead of NULL error handle
>  on object_property_get_uint() call (Cédric)
>- use VTD_HOST_AW_39BIT (Cédric)
>
>v1 -> v2:
>- set aw-bits to 48b on ARM
>- use hw_compat_8_2 to handle the compat for older machines
>  which used 64b as a default
>---
> include/hw/i386/pc.h| 3 +++
> hw/core/machine.c   | 1 +
> hw/i386/pc.c| 6 ++
> hw/i386/pc_q35.c| 2 ++
> tests/qtest/virtio-iommu-test.c | 2 +-
> 5 files changed, 13 insertions(+), 1 deletion(-)
>
>diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
>index 5065590281..b3229f98de 100644
>--- a/include/hw/i386/pc.h
>+++ b/include/hw/i386/pc.h
>@@ -198,6 +198,9 @@ void pc_system_parse_ovmf_flash(uint8_t
>*flash_ptr, size_t flash_size);
> /* sgx.c */
> void pc_machine_init_sgx_epc(PCMachineState *pcms);
>
>+extern GlobalProperty pc_compat_defaults[];
>+extern const size_t pc_compat_defaults_len;

If we only want to support q35 and not i440fx, would it be better to add a
_q35 suffix, move this into pc_q35.c and make it static?

>+
> extern GlobalProperty pc_compat_8_2[];
> extern const size_t pc_compat_8_2_len;
>
>diff --git a/hw/core/machine.c b/hw/core/machine.c
>index 6bd09d4592..4b89172d1c 100644
>--- a/hw/core/machine.c
>+++ b/hw/core/machine.c
>@@ -35,6 +35,7 @@
>
> GlobalProperty hw_compat_8_2[] = {
> { TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" },
>+{ TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" },
> };
> const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2);
>
>diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>index f5ff970acf..9024483356 100644
>--- a/hw/i386/pc.c
>+++ b/hw/i386/pc.c
>@@ -59,6 +59,7 @@
> #include "hw/i386/kvm/xen_evtchn.h"
> #include "hw/i386/kvm/xen_gnttab.h"
> #include "hw/i386/kvm/xen_xenstore.h"
>+#include "hw/i386/intel_iommu.h"

Can this be removed?

> #include "hw/mem/memory-device.h"
> #include "e820_memory_layout.h"
> #include "trace.h"
>@@ -78,6 +79,11 @@
> { "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version "
>v, },\
> { "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
>
>+GlobalProperty pc_compat_defaults[] =  {
>+{ TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "39" },
>+};
>+const size_t pc_compat_defaults_len =
>G_N_ELEMENTS(pc_compat_defaults);
>+
> GlobalProperty pc_compat_8_2[] = {};
> const size_t pc_compat_8_2_len = G_N_ELEMENTS(pc_compat_8_2);
>
>diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>index 45a4102e75..32421a0a5f 100644
>--- a/hw/i386/pc_q35.c
>+++ b/hw/i386/pc_q35.c
>@@ -356,6 +356,8 @@ static void pc_q35_machine_options(MachineClass
>*m)
> machine_class_allow_dynamic_sysbus_dev(m,
>TYPE_INTEL_IOMMU_DEVICE);
> machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
> machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
>+compat_props_add(m->compat_props,
>+ pc_compat_defaults, pc_compat_defaults_len);
> }
>
> static void pc_q35_9_0_machine_options(MachineClass *m)
>diff --git a/tests/qtest/virtio-iommu-test.c b/tests/qtest/virtio-iommu-test.c
>index 068e7a9e6c..0f36381acb 100644
>--- a/tests/qtest/virtio-iommu-test.c
>+++ b/tests/qtest/virtio-iommu-test.c
>@@ -34,7 +34,7 @@ static void pci_config(void *obj, void *data,
>QGuestAllocator *t_alloc)
> uint8_t bypass = qvirtio_config_readb(dev, 36);
>
> g_assert_cmpint(input_range_start, ==, 0);
>-g_assert_cmphex(input_range_end, ==, UINT64_MAX);
>+g_assert_cmphex(input_range_end, >=, 32);

UINT32_MAX?

Thanks
Zhenzhong
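
To spell out why 32 is too weak a bound: assuming the device derives the end of its input range from aw-bits as 2^aw_bits - 1 (an assumption about the device side, not something shown in this patch), a 39-bit default gives 0x7fffffffff. The input_range_end() helper below is hypothetical and only mirrors that arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Highest input address for a given width; 64 bits saturates at UINT64_MAX. */
static uint64_t input_range_end(unsigned aw_bits)
{
    return aw_bits >= 64 ? UINT64_MAX : (UINT64_C(1) << aw_bits) - 1;
}
```

A check against 32 would also accept a 6-bit range (end 63), while a check against UINT32_MAX only accepts ranges of at least 32 bits.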

> g_assert_cmpint(domain_range_start, ==, 0);
> g_assert_cmpint(domain_range_end, ==, UINT32_MAX);
> g_assert_cmpint(bypass, ==, 1);
>--
>2.41.0


