Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset

2023-04-05 Thread Eric Auger
Hi Jason,

On 4/5/23 13:41, Jason Gunthorpe wrote:
> On Tue, Apr 04, 2023 at 05:59:01PM +0200, Eric Auger wrote:
>
>>> but the hot reset shall fail as the group is not owned by the user.
>> sure it shall but I fail to understand if the reset fails or the device
>> plug is somehow delayed until the reset completes.
> It is just racy today - vfio_pci_dev_set_resettable() doesn't hold any
> locks across the pci_walk_bus() check to prevent hot plug in while it is
> working on the reset.

OK thanks

Eric
>
> Jason
>



Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset

2023-04-05 Thread Jason Gunthorpe
On Tue, Apr 04, 2023 at 05:59:01PM +0200, Eric Auger wrote:

> > but the hot reset shall fail as the group is not owned by the user.
> 
> sure it shall but I fail to understand if the reset fails or the device
> plug is somehow delayed until the reset completes.

It is just racy today - vfio_pci_dev_set_resettable() doesn't hold any
locks across the pci_walk_bus() check to prevent hot plug in while it is
working on the reset.

Jason


Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset

2023-04-04 Thread Eric Auger



On 4/4/23 17:29, Liu, Yi L wrote:
>> From: Eric Auger 
>> Sent: Tuesday, April 4, 2023 11:19 PM
>>
>> Hi Yi,
>>
>> On 4/4/23 16:37, Liu, Yi L wrote:
>>> Hi Eric,
>>>
 From: Eric Auger 
 Sent: Tuesday, April 4, 2023 10:00 PM

 Hi YI,

 On 4/1/23 16:44, Yi Liu wrote:
> If the affected device is not opened by any user, it's safe to reset it
> given it's not in use.
>
> Reviewed-by: Kevin Tian 
> Reviewed-by: Jason Gunthorpe 
> Tested-by: Yanting Jiang 
> Signed-off-by: Yi Liu 
> ---
>  drivers/vfio/pci/vfio_pci_core.c | 14 +++---
>  include/uapi/linux/vfio.h|  8 
>  2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c 
> b/drivers/vfio/pci/vfio_pci_core.c
> index 65bbef562268..5d745c9abf05 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct
 vfio_device_set *dev_set,
>   list_for_each_entry(cur_vma, _set->device_list, vdev.dev_set_list) {
>   /*
> -  * Test whether all the affected devices are contained by the
> -  * set of groups provided by the user.
> +  * Test whether all the affected devices can be reset by the
> +  * user.
> +  *
> +  * Resetting an unused device (not opened) is safe, because
> +  * dev_set->lock is held in hot reset path so this device
> +  * cannot race being opened by another user simultaneously.
> +  *
> +  * Otherwise all opened devices in the dev_set must be
> +  * contained by the set of groups provided by the user.
>*/
> - if (!vfio_dev_in_groups(cur_vma, groups)) {
> + if (cur_vma->vdev.open_count &&
> + !vfio_dev_in_groups(cur_vma, groups)) {
>   ret = -EINVAL;
>   goto err_undo;
>   }
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 0552e8dcf0cb..f96e5689cffc 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info {
>   * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
>   *   struct vfio_pci_hot_reset)
>   *
> + * Userspace requests hot reset for the devices it uses.  Due to the
> + * underlying topology, multiple devices can be affected in the reset
 by the reset
> + * while some might be opened by another user.  To avoid interference
 s/interference/hot reset failure?
>>> I don’t think user can really avoid hot reset failure since there may
>>> be new devices plugged into the affected slot. Even user has opened
>> I don't know the legacy wrt that issue but this sounds a serious issue,
>> meaning the reset of an assigned device could impact another device
>> belonging to another group not not owned by the user?
> but the hot reset shall fail as the group is not owned by the user.

sure it shall but I fail to understand if the reset fails or the device
plug is somehow delayed until the reset completes.
>
>>> all the groups/devices reported by VFIO_DEVICE_GET_PCI_HOT_RESET_INFO,
>>> the hot reset can fail if new device is plugged in and has not been
>>> bound to vfio or opened by another user during the window of
>>> _INFO and HOT_RESET.
>> with respect to the latter isn't the dev_set lock held during the hot
>> reset and sufficient to prevent any new opening to occur?
> yes. new open needs to acquire the dev_set lock. So when hot reset
> acquires the dev_set lock, then no new open can occur. 
>
> Regards,
> Yi Liu
>
>>> maybe the whole statement should be as below:
>>>
>>> To avoid interference, the hot reset can only be conducted when all
>>> the affected devices are either opened by the calling user or not
>>> opened yet at the moment of the hot reset attempt.
>> OK
>>
>> Eric
> + * the calling user must ensure all affected devices, if opened, are
> + * owned by itself.
> + *
> + * The ownership is proved by an array of group fds.
> + *
>   * Return: 0 on success, -errno on failure.
>   */
>  struct vfio_pci_hot_reset {
>>> Regards,
>>> Yi Liu



Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset

2023-04-04 Thread Liu, Yi L
> From: Eric Auger 
> Sent: Tuesday, April 4, 2023 11:19 PM
> 
> Hi Yi,
> 
> On 4/4/23 16:37, Liu, Yi L wrote:
> > Hi Eric,
> >
> >> From: Eric Auger 
> >> Sent: Tuesday, April 4, 2023 10:00 PM
> >>
> >> Hi YI,
> >>
> >> On 4/1/23 16:44, Yi Liu wrote:
> >>> If the affected device is not opened by any user, it's safe to reset it
> >>> given it's not in use.
> >>>
> >>> Reviewed-by: Kevin Tian 
> >>> Reviewed-by: Jason Gunthorpe 
> >>> Tested-by: Yanting Jiang 
> >>> Signed-off-by: Yi Liu 
> >>> ---
> >>>  drivers/vfio/pci/vfio_pci_core.c | 14 +++---
> >>>  include/uapi/linux/vfio.h|  8 
> >>>  2 files changed, 19 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/vfio/pci/vfio_pci_core.c 
> >>> b/drivers/vfio/pci/vfio_pci_core.c
> >>> index 65bbef562268..5d745c9abf05 100644
> >>> --- a/drivers/vfio/pci/vfio_pci_core.c
> >>> +++ b/drivers/vfio/pci/vfio_pci_core.c
> >>> @@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct
> >> vfio_device_set *dev_set,
> >>>   list_for_each_entry(cur_vma, _set->device_list, vdev.dev_set_list) {
> >>>   /*
> >>> -  * Test whether all the affected devices are contained by the
> >>> -  * set of groups provided by the user.
> >>> +  * Test whether all the affected devices can be reset by the
> >>> +  * user.
> >>> +  *
> >>> +  * Resetting an unused device (not opened) is safe, because
> >>> +  * dev_set->lock is held in hot reset path so this device
> >>> +  * cannot race being opened by another user simultaneously.
> >>> +  *
> >>> +  * Otherwise all opened devices in the dev_set must be
> >>> +  * contained by the set of groups provided by the user.
> >>>*/
> >>> - if (!vfio_dev_in_groups(cur_vma, groups)) {
> >>> + if (cur_vma->vdev.open_count &&
> >>> + !vfio_dev_in_groups(cur_vma, groups)) {
> >>>   ret = -EINVAL;
> >>>   goto err_undo;
> >>>   }
> >>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >>> index 0552e8dcf0cb..f96e5689cffc 100644
> >>> --- a/include/uapi/linux/vfio.h
> >>> +++ b/include/uapi/linux/vfio.h
> >>> @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info {
> >>>   * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
> >>>   *   struct vfio_pci_hot_reset)
> >>>   *
> >>> + * Userspace requests hot reset for the devices it uses.  Due to the
> >>> + * underlying topology, multiple devices can be affected in the reset
> >> by the reset
> >>> + * while some might be opened by another user.  To avoid interference
> >> s/interference/hot reset failure?
> > I don’t think user can really avoid hot reset failure since there may
> > be new devices plugged into the affected slot. Even user has opened
> I don't know the legacy wrt that issue but this sounds a serious issue,
> meaning the reset of an assigned device could impact another device
> belonging to another group not not owned by the user?

but the hot reset shall fail as the group is not owned by the user.

> > all the groups/devices reported by VFIO_DEVICE_GET_PCI_HOT_RESET_INFO,
> > the hot reset can fail if new device is plugged in and has not been
> > bound to vfio or opened by another user during the window of
> > _INFO and HOT_RESET.
> with respect to the latter isn't the dev_set lock held during the hot
> reset and sufficient to prevent any new opening to occur?

yes. new open needs to acquire the dev_set lock. So when hot reset
acquires the dev_set lock, then no new open can occur. 

Regards,
Yi Liu

> >
> > maybe the whole statement should be as below:
> >
> > To avoid interference, the hot reset can only be conducted when all
> > the affected devices are either opened by the calling user or not
> > opened yet at the moment of the hot reset attempt.
> 
> OK
> 
> Eric
> >
> >>> + * the calling user must ensure all affected devices, if opened, are
> >>> + * owned by itself.
> >>> + *
> >>> + * The ownership is proved by an array of group fds.
> >>> + *
> >>>   * Return: 0 on success, -errno on failure.
> >>>   */
> >>>  struct vfio_pci_hot_reset {
> > Regards,
> > Yi Liu



Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset

2023-04-04 Thread Eric Auger
Hi Yi,

On 4/4/23 16:37, Liu, Yi L wrote:
> Hi Eric,
>
>> From: Eric Auger 
>> Sent: Tuesday, April 4, 2023 10:00 PM
>>
>> Hi YI,
>>
>> On 4/1/23 16:44, Yi Liu wrote:
>>> If the affected device is not opened by any user, it's safe to reset it
>>> given it's not in use.
>>>
>>> Reviewed-by: Kevin Tian 
>>> Reviewed-by: Jason Gunthorpe 
>>> Tested-by: Yanting Jiang 
>>> Signed-off-by: Yi Liu 
>>> ---
>>>  drivers/vfio/pci/vfio_pci_core.c | 14 +++---
>>>  include/uapi/linux/vfio.h|  8 
>>>  2 files changed, 19 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/vfio/pci/vfio_pci_core.c 
>>> b/drivers/vfio/pci/vfio_pci_core.c
>>> index 65bbef562268..5d745c9abf05 100644
>>> --- a/drivers/vfio/pci/vfio_pci_core.c
>>> +++ b/drivers/vfio/pci/vfio_pci_core.c
>>> @@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct
>> vfio_device_set *dev_set,
>>> list_for_each_entry(cur_vma, _set->device_list, vdev.dev_set_list) {
>>> /*
>>> -* Test whether all the affected devices are contained by the
>>> -* set of groups provided by the user.
>>> +* Test whether all the affected devices can be reset by the
>>> +* user.
>>> +*
>>> +* Resetting an unused device (not opened) is safe, because
>>> +* dev_set->lock is held in hot reset path so this device
>>> +* cannot race being opened by another user simultaneously.
>>> +*
>>> +* Otherwise all opened devices in the dev_set must be
>>> +* contained by the set of groups provided by the user.
>>>  */
>>> -   if (!vfio_dev_in_groups(cur_vma, groups)) {
>>> +   if (cur_vma->vdev.open_count &&
>>> +   !vfio_dev_in_groups(cur_vma, groups)) {
>>> ret = -EINVAL;
>>> goto err_undo;
>>> }
>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>>> index 0552e8dcf0cb..f96e5689cffc 100644
>>> --- a/include/uapi/linux/vfio.h
>>> +++ b/include/uapi/linux/vfio.h
>>> @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info {
>>>   * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
>>>   * struct vfio_pci_hot_reset)
>>>   *
>>> + * Userspace requests hot reset for the devices it uses.  Due to the
>>> + * underlying topology, multiple devices can be affected in the reset
>> by the reset
>>> + * while some might be opened by another user.  To avoid interference
>> s/interference/hot reset failure?
> I don’t think user can really avoid hot reset failure since there may
> be new devices plugged into the affected slot. Even user has opened
I don't know the legacy wrt that issue but this sounds a serious issue,
meaning the reset of an assigned device could impact another device
belonging to another group not not owned by the user?
> all the groups/devices reported by VFIO_DEVICE_GET_PCI_HOT_RESET_INFO,
> the hot reset can fail if new device is plugged in and has not been
> bound to vfio or opened by another user during the window of
> _INFO and HOT_RESET.
with respect to the latter isn't the dev_set lock held during the hot
reset and sufficient to prevent any new opening to occur?
>
> maybe the whole statement should be as below:
>
> To avoid interference, the hot reset can only be conducted when all
> the affected devices are either opened by the calling user or not
> opened yet at the moment of the hot reset attempt.

OK

Eric
>
>>> + * the calling user must ensure all affected devices, if opened, are
>>> + * owned by itself.
>>> + *
>>> + * The ownership is proved by an array of group fds.
>>> + *
>>>   * Return: 0 on success, -errno on failure.
>>>   */
>>>  struct vfio_pci_hot_reset {
> Regards,
> Yi Liu



Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset

2023-04-04 Thread Liu, Yi L
Hi Eric,

> From: Eric Auger 
> Sent: Tuesday, April 4, 2023 10:00 PM
> 
> Hi YI,
> 
> On 4/1/23 16:44, Yi Liu wrote:
> > If the affected device is not opened by any user, it's safe to reset it
> > given it's not in use.
> >
> > Reviewed-by: Kevin Tian 
> > Reviewed-by: Jason Gunthorpe 
> > Tested-by: Yanting Jiang 
> > Signed-off-by: Yi Liu 
> > ---
> >  drivers/vfio/pci/vfio_pci_core.c | 14 +++---
> >  include/uapi/linux/vfio.h|  8 
> >  2 files changed, 19 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_core.c 
> > b/drivers/vfio/pci/vfio_pci_core.c
> > index 65bbef562268..5d745c9abf05 100644
> > --- a/drivers/vfio/pci/vfio_pci_core.c
> > +++ b/drivers/vfio/pci/vfio_pci_core.c
> > @@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct
> vfio_device_set *dev_set,
> >
> > list_for_each_entry(cur_vma, _set->device_list, vdev.dev_set_list) {
> > /*
> > -* Test whether all the affected devices are contained by the
> > -* set of groups provided by the user.
> > +* Test whether all the affected devices can be reset by the
> > +* user.
> > +*
> > +* Resetting an unused device (not opened) is safe, because
> > +* dev_set->lock is held in hot reset path so this device
> > +* cannot race being opened by another user simultaneously.
> > +*
> > +* Otherwise all opened devices in the dev_set must be
> > +* contained by the set of groups provided by the user.
> >  */
> > -   if (!vfio_dev_in_groups(cur_vma, groups)) {
> > +   if (cur_vma->vdev.open_count &&
> > +   !vfio_dev_in_groups(cur_vma, groups)) {
> > ret = -EINVAL;
> > goto err_undo;
> > }
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 0552e8dcf0cb..f96e5689cffc 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info {
> >   * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
> >   * struct vfio_pci_hot_reset)
> >   *
> > + * Userspace requests hot reset for the devices it uses.  Due to the
> > + * underlying topology, multiple devices can be affected in the reset
> by the reset
> > + * while some might be opened by another user.  To avoid interference
> s/interference/hot reset failure?

I don’t think user can really avoid hot reset failure since there may
be new devices plugged into the affected slot. Even user has opened
all the groups/devices reported by VFIO_DEVICE_GET_PCI_HOT_RESET_INFO,
the hot reset can fail if new device is plugged in and has not been
bound to vfio or opened by another user during the window of
_INFO and HOT_RESET.

maybe the whole statement should be as below:

To avoid interference, the hot reset can only be conducted when all
the affected devices are either opened by the calling user or not
opened yet at the moment of the hot reset attempt.

> > + * the calling user must ensure all affected devices, if opened, are
> > + * owned by itself.
> > + *
> > + * The ownership is proved by an array of group fds.
> > + *
> >   * Return: 0 on success, -errno on failure.
> >   */
> >  struct vfio_pci_hot_reset {

Regards,
Yi Liu


Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset

2023-04-04 Thread Eric Auger
Hi YI,

On 4/1/23 16:44, Yi Liu wrote:
> If the affected device is not opened by any user, it's safe to reset it
> given it's not in use.
>
> Reviewed-by: Kevin Tian 
> Reviewed-by: Jason Gunthorpe 
> Tested-by: Yanting Jiang 
> Signed-off-by: Yi Liu 
> ---
>  drivers/vfio/pci/vfio_pci_core.c | 14 +++---
>  include/uapi/linux/vfio.h|  8 
>  2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c 
> b/drivers/vfio/pci/vfio_pci_core.c
> index 65bbef562268..5d745c9abf05 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct 
> vfio_device_set *dev_set,
>  
>   list_for_each_entry(cur_vma, _set->device_list, vdev.dev_set_list) {
>   /*
> -  * Test whether all the affected devices are contained by the
> -  * set of groups provided by the user.
> +  * Test whether all the affected devices can be reset by the
> +  * user.
> +  *
> +  * Resetting an unused device (not opened) is safe, because
> +  * dev_set->lock is held in hot reset path so this device
> +  * cannot race being opened by another user simultaneously.
> +  *
> +  * Otherwise all opened devices in the dev_set must be
> +  * contained by the set of groups provided by the user.
>*/
> - if (!vfio_dev_in_groups(cur_vma, groups)) {
> + if (cur_vma->vdev.open_count &&
> + !vfio_dev_in_groups(cur_vma, groups)) {
>   ret = -EINVAL;
>   goto err_undo;
>   }
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 0552e8dcf0cb..f96e5689cffc 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info {
>   * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
>   *   struct vfio_pci_hot_reset)
>   *
> + * Userspace requests hot reset for the devices it uses.  Due to the
> + * underlying topology, multiple devices can be affected in the reset
by the reset
> + * while some might be opened by another user.  To avoid interference
s/interference/hot reset failure?
> + * the calling user must ensure all affected devices, if opened, are
> + * owned by itself.
> + *
> + * The ownership is proved by an array of group fds.
> + *
>   * Return: 0 on success, -errno on failure.
>   */
>  struct vfio_pci_hot_reset {
Thanks

Eric



[Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset

2023-04-01 Thread Yi Liu
If the affected device is not opened by any user, it's safe to reset it
given it's not in use.

Reviewed-by: Kevin Tian 
Reviewed-by: Jason Gunthorpe 
Tested-by: Yanting Jiang 
Signed-off-by: Yi Liu 
---
 drivers/vfio/pci/vfio_pci_core.c | 14 +++---
 include/uapi/linux/vfio.h|  8 
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 65bbef562268..5d745c9abf05 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct 
vfio_device_set *dev_set,
 
list_for_each_entry(cur_vma, _set->device_list, vdev.dev_set_list) {
/*
-* Test whether all the affected devices are contained by the
-* set of groups provided by the user.
+* Test whether all the affected devices can be reset by the
+* user.
+*
+* Resetting an unused device (not opened) is safe, because
+* dev_set->lock is held in hot reset path so this device
+* cannot race being opened by another user simultaneously.
+*
+* Otherwise all opened devices in the dev_set must be
+* contained by the set of groups provided by the user.
 */
-   if (!vfio_dev_in_groups(cur_vma, groups)) {
+   if (cur_vma->vdev.open_count &&
+   !vfio_dev_in_groups(cur_vma, groups)) {
ret = -EINVAL;
goto err_undo;
}
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0552e8dcf0cb..f96e5689cffc 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info {
  * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
  * struct vfio_pci_hot_reset)
  *
+ * Userspace requests hot reset for the devices it uses.  Due to the
+ * underlying topology, multiple devices can be affected in the reset
+ * while some might be opened by another user.  To avoid interference
+ * the calling user must ensure all affected devices, if opened, are
+ * owned by itself.
+ *
+ * The ownership is proved by an array of group fds.
+ *
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_pci_hot_reset {
-- 
2.34.1