On Wed, 31 Jan 2024 15:22:59 +0530
Vinayak Kale <vk...@nvidia.com> wrote:

> On 31/01/24 12:28 am, Alex Williamson wrote:
> > 
> > On Tue, 30 Jan 2024 23:32:26 +0530
> > Vinayak Kale <vk...@nvidia.com> wrote:
> >   
> >> Missed adding Michael, Marcel, Alex and Avihai earlier, apologies.
> >>
> >> Regards,
> >> Vinayak
> >>
> >> On 30/01/24 3:26 pm, Vinayak Kale wrote:  
> >>> In case of migration, during restore operation, qemu checks the config 
> >>> space of the pci device with the config space
> >>> in the migration stream captured during save operation. In case of config 
> >>> space data mismatch, restore operation is failed.
> >>>
> >>> config space check is done in function get_pci_config_device(). By 
> >>> default VSC (vendor-specific-capability) in config space is checked.
> >>>
> >>> Ideally qemu should not check VSC during restore/load. This patch skips 
> >>> the check by not setting pdev->cmask[] for VSC offsets in 
> >>> pci_add_capability().
> >>> If cmask[] is not set for an offset, then qemu skips config space check 
> >>> for that offset.
> >>>
> >>> Signed-off-by: Vinayak Kale <vk...@nvidia.com>
> >>> ---
> >>>    hw/pci/pci.c | 7 +++++--
> >>>    1 file changed, 5 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>> index 76080af580..32429109df 100644
> >>> --- a/hw/pci/pci.c
> >>> +++ b/hw/pci/pci.c
> >>> @@ -2485,8 +2485,11 @@ int pci_add_capability(PCIDevice *pdev, uint8_t 
> >>> cap_id,
> >>>        memset(pdev->used + offset, 0xFF, QEMU_ALIGN_UP(size, 4));
> >>>        /* Make capability read-only by default */
> >>>        memset(pdev->wmask + offset, 0, size);
> >>> -    /* Check capability by default */
> >>> -    memset(pdev->cmask + offset, 0xFF, size);
> >>> +
> >>> +    if (cap_id != PCI_CAP_ID_VNDR) {
> >>> +        /* Check non-vendor specific capability by default */
> >>> +        memset(pdev->cmask + offset, 0xFF, size);
> >>> +    }
> >>>        return offset;
> >>>    }
> >>>  
> >>  
> > 
> > If there is a possibility that the data within the vendor specific cap
> > can be consumed by the driver or diagnostic tools, then it's part of
> > the device ABI and should be consistent across migration.  A mismatch
> > can certainly cause a migration failure, but why shouldn't it?  
> 
> Sure, the device ABI should be consistent across migration. In case of 
> VSC, it should represent same format on source and destination. But 
> shouldn't VSC content check or its interpretation be left to vendor 
> driver instead of qemu?

By "vendor driver" here, are you suggesting that QEMU device models (ex.
hw/net/{e1000*,igb*,rtl8139*}) should perform that validation?  If so,
where's the patch that introduces any sort of validation hooks for
vendors to provide?  Where is this validation going to happen in the
case of a migratable vfio-pci variant devices?  Nothing about this
patch suggests that it's deferring responsibility to some other code
entity, it only indicates "checking this breaks, let's not do it".

It's possible that the device you care about only reports volatile
diagnostic information through the vendor specific capability, but
another device might use it to report information relative to the
internal hardware configuration.  Without knowing what the vendor
specific capability contains, QEMU needs to take the most conservative
approach by default.  Thanks,

Alex


Reply via email to