RE: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances

2024-02-13 Thread Minwoo Im
> -Original Message-
> From: qemu-block-bounces+minwoo.im.dev=gmail@nongnu.org  bounces+minwoo.im.dev=gmail@nongnu.org> On Behalf Of Akihiko Odaki
> Sent: Monday, February 12, 2024 7:21 PM
> To: Philippe Mathieu-Daudé ; Michael S. Tsirkin
> ; Marcel Apfelbaum ; Alex
> Williamson ; Cédric Le Goater ;
> Paolo Bonzini ; Daniel P. Berrangé ;
> Eduardo Habkost ; Sriram Yagnaraman
> ; Jason Wang ; Keith Busch
> ; Klaus Jensen 
> Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org; Akihiko Odaki
> 
> Subject: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances
>
> Disable SR-IOV VF devices by reusing code to power down PCI devices
> instead of removing them when the guest requests to disable VFs. This
> allows to realize devices and report VF realization errors at PF
> realization time.
>
> Signed-off-by: Akihiko Odaki 

Hello Akihiko,

I think this patch fixes the issue reported in [1].  The latest master branch
also causes an object-related assertion error when we enable VF(s) and disable
through sysfs over and over again (at least twice).  But this issue is also
fixed with your patch.

**
ERROR:../qom/object.c:753:object_finalize: assertion failed: (obj->parent == 
NULL)
Bail out! ERROR:../qom/object.c:753:object_finalize: assertion failed: 
(obj->parent == NULL)

Klaus,

If this patchset is applied, I think [1] can be dropped.  What do you think?

Thanks,

[1] 
https://lore.kernel.org/qemu-devel/20240109022953epcms2p54550dcfc9f831a515206513ae98e7511@epcms2p5/




Re: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances

2024-02-13 Thread Akihiko Odaki

On 2024/02/13 17:51, Minwoo Im wrote:

-Original Message-
From: qemu-block-bounces+minwoo.im.dev=gmail@nongnu.org  On Behalf Of Akihiko Odaki
Sent: Monday, February 12, 2024 7:21 PM
To: Philippe Mathieu-Daudé ; Michael S. Tsirkin
; Marcel Apfelbaum ; Alex
Williamson ; Cédric Le Goater ;
Paolo Bonzini ; Daniel P. Berrangé ;
Eduardo Habkost ; Sriram Yagnaraman
; Jason Wang ; Keith Busch
; Klaus Jensen 
Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org; Akihiko Odaki

Subject: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances

Disable SR-IOV VF devices by reusing code to power down PCI devices
instead of removing them when the guest requests to disable VFs. This
allows to realize devices and report VF realization errors at PF
realization time.

Signed-off-by: Akihiko Odaki 


Hello Akihiko,

I think this patch fixes the issue reported in [1].  The latest master branch
also causes an object-related assertion error when we enable VF(s) and disable
through sysfs over and over again (at least twice).  But this issue is also
fixed with your patch.

**
ERROR:../qom/object.c:753:object_finalize: assertion failed: (obj->parent == 
NULL)
Bail out! ERROR:../qom/object.c:753:object_finalize: assertion failed: 
(obj->parent == NULL)


I looked into this and found it's not fixed with my patch though the 
symptom may be gone.


The problem is that object_ref() is not called when copying subsys. An 
obvious fix is to add an object_ref() call, but I think it's too hacky 
and error-prone. It's better to enumerate nvme_props and call 
object_property_get_qobject() and object_property_set_qobject() for each 
property.


Regards,
Akihiko Odaki



Klaus,

If this patchset is applied, I think [1] can be dropped.  What do you think?

Thanks,

[1] 
https://lore.kernel.org/qemu-devel/20240109022953epcms2p54550dcfc9f831a515206513ae98e7511@epcms2p5/





Re: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances

2024-02-13 Thread Akihiko Odaki

On 2024/02/13 17:51, Minwoo Im wrote:

-Original Message-
From: qemu-block-bounces+minwoo.im.dev=gmail@nongnu.org  On Behalf Of Akihiko Odaki
Sent: Monday, February 12, 2024 7:21 PM
To: Philippe Mathieu-Daudé ; Michael S. Tsirkin
; Marcel Apfelbaum ; Alex
Williamson ; Cédric Le Goater ;
Paolo Bonzini ; Daniel P. Berrangé ;
Eduardo Habkost ; Sriram Yagnaraman
; Jason Wang ; Keith Busch
; Klaus Jensen 
Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org; Akihiko Odaki

Subject: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances

Disable SR-IOV VF devices by reusing code to power down PCI devices
instead of removing them when the guest requests to disable VFs. This
allows to realize devices and report VF realization errors at PF
realization time.

Signed-off-by: Akihiko Odaki 


Hello Akihiko,

I think this patch fixes the issue reported in [1].  The latest master branch
also causes an object-related assertion error when we enable VF(s) and disable
through sysfs over and over again (at least twice).  But this issue is also
fixed with your patch.

**
ERROR:../qom/object.c:753:object_finalize: assertion failed: (obj->parent == 
NULL)
Bail out! ERROR:../qom/object.c:753:object_finalize: assertion failed: 
(obj->parent == NULL)


I'll note that in the next version.



Klaus,

If this patchset is applied, I think [1] can be dropped.  What do you think?


[1] should be kept. This patchset fixes use-after-free but double free 
[1] fixes still occurs.


Regards,
Akihiko Odaki



Thanks,

[1] 
https://lore.kernel.org/qemu-devel/20240109022953epcms2p54550dcfc9f831a515206513ae98e7511@epcms2p5/






Re: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances

2024-02-13 Thread Akihiko Odaki

On 2024/02/13 20:01, Michael S. Tsirkin wrote:

On Mon, Feb 12, 2024 at 07:20:34PM +0900, Akihiko Odaki wrote:

Disable SR-IOV VF devices by reusing code to power down PCI devices
instead of removing them when the guest requests to disable VFs. This
allows to realize devices and report VF realization errors at PF
realization time.

Signed-off-by: Akihiko Odaki 


It is simpler for sure, but I am worried that all of these
unused VFs will consume lots of resources even if never
enabled. Thoughts?


My rationale behind this change is that the resources should be 
allocated when the PF is realized to ensure the resources are available 
when the guest requests to enable VFs.


When it is necessary to allocate resources dynamically, the conventional 
hotplug mechanism should be used instead since the SR-IOV interface is 
not designed to report resource allocation errors.




Re: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances

2024-02-13 Thread Michael S. Tsirkin
On Mon, Feb 12, 2024 at 07:20:34PM +0900, Akihiko Odaki wrote:
> Disable SR-IOV VF devices by reusing code to power down PCI devices
> instead of removing them when the guest requests to disable VFs. This
> allows to realize devices and report VF realization errors at PF
> realization time.
> 
> Signed-off-by: Akihiko Odaki 

It is simpler for sure, but I am worried that all of these
unused VFs will consume lots of resources even if never
enabled. Thoughts?


> ---
>  docs/pcie_sriov.txt |   8 ++--
>  include/hw/pci/pci.h|   2 +-
>  include/hw/pci/pci_device.h |   2 +-
>  include/hw/pci/pcie_sriov.h |   6 +--
>  hw/net/igb.c|  13 --
>  hw/nvme/ctrl.c  |  24 +++
>  hw/pci/pci.c|  18 
>  hw/pci/pci_host.c   |   4 +-
>  hw/pci/pcie.c   |   4 +-
>  hw/pci/pcie_sriov.c | 100 
> 
>  10 files changed, 97 insertions(+), 84 deletions(-)
> 
> diff --git a/docs/pcie_sriov.txt b/docs/pcie_sriov.txt
> index a47aad0bfab0..ab2142807f79 100644
> --- a/docs/pcie_sriov.txt
> +++ b/docs/pcie_sriov.txt
> @@ -52,9 +52,11 @@ setting up a BAR for a VF.
>...
>  
>/* Add and initialize the SR/IOV capability */
> -  pcie_sriov_pf_init(d, 0x200, "your_virtual_dev",
> -   vf_devid, initial_vfs, total_vfs,
> -   fun_offset, stride);
> +  if (!pcie_sriov_pf_init(d, 0x200, "your_virtual_dev",
> +  vf_devid, initial_vfs, total_vfs,
> +  fun_offset, stride, errp)) {
> + return;
> +  }
>  
>/* Set up individual VF BARs (parameters as for normal BARs) */
>pcie_sriov_pf_init_vf_bar( ... )
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index fa6313aabc43..fae83b9b723c 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -643,6 +643,6 @@ static inline void pci_irq_pulse(PCIDevice *pci_dev)
>  }
>  
>  MSIMessage pci_get_msi_message(PCIDevice *dev, int vector);
> -void pci_set_power(PCIDevice *pci_dev, bool state);
> +void pci_set_enabled(PCIDevice *pci_dev, bool state);
>  
>  #endif
> diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
> index 7564e9536dbd..153e13eaef99 100644
> --- a/include/hw/pci/pci_device.h
> +++ b/include/hw/pci/pci_device.h
> @@ -56,7 +56,7 @@ typedef struct PCIReqIDCache PCIReqIDCache;
>  struct PCIDevice {
>  DeviceState qdev;
>  bool partially_hotplugged;
> -bool has_power;
> +bool is_enabled;
>  
>  /* PCI config space */
>  uint8_t *config;
> diff --git a/include/hw/pci/pcie_sriov.h b/include/hw/pci/pcie_sriov.h
> index 095fb0c9edf9..d9a39daccac4 100644
> --- a/include/hw/pci/pcie_sriov.h
> +++ b/include/hw/pci/pcie_sriov.h
> @@ -18,7 +18,6 @@
>  struct PCIESriovPF {
>  uint16_t num_vfs;   /* Number of virtual functions created */
>  uint8_t vf_bar_type[PCI_NUM_REGIONS];   /* Store type for each VF bar */
> -const char *vfname; /* Reference to the device type used for the VFs */
>  PCIDevice **vf; /* Pointer to an array of num_vfs VF devices */
>  };
>  
> @@ -27,10 +26,11 @@ struct PCIESriovVF {
>  uint16_t vf_number; /* Logical VF number of this function */
>  };
>  
> -void pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset,
> +bool pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset,
>  const char *vfname, uint16_t vf_dev_id,
>  uint16_t init_vfs, uint16_t total_vfs,
> -uint16_t vf_offset, uint16_t vf_stride);
> +uint16_t vf_offset, uint16_t vf_stride,
> +Error **errp);
>  void pcie_sriov_pf_exit(PCIDevice *dev);
>  
>  /* Set up a VF bar in the SR/IOV bar area */
> diff --git a/hw/net/igb.c b/hw/net/igb.c
> index 0b5c31a58bba..1079a33d4000 100644
> --- a/hw/net/igb.c
> +++ b/hw/net/igb.c
> @@ -447,9 +447,16 @@ static void igb_pci_realize(PCIDevice *pci_dev, Error 
> **errp)
>  
>  pcie_ari_init(pci_dev, 0x150);
>  
> -pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF,
> -IGB_82576_VF_DEV_ID, IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
> -IGB_VF_OFFSET, IGB_VF_STRIDE);
> +if (!pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET,
> +TYPE_IGBVF, IGB_82576_VF_DEV_ID,
> +IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
> +IGB_VF_OFFSET, IGB_VF_STRIDE,
> +errp)) {
> +pcie_cap_exit(pci_dev);
> +igb_cleanup_msix(s);
> +msi_uninit(pci_dev);
> +return;
> +}
>  
>  pcie_sriov_pf_init_vf_bar(pci_dev, IGBVF_MMIO_BAR_IDX,
>  PCI_BASE_ADDRESS_MEM_TYPE_64 | PCI_BASE_ADDRESS_MEM_PREFETCH,
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index f026245d1e9e..f8df622fe590 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@