On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
> This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of
> virtio pci device").
>
> Virtio drivers and PCI devices have never fully supported true
> surprise (aka hot unplug) removal. Drivers historically continued
> processing and waiting for pending I/O and even continued synchronous
> device reset during surprise removal. Devices have also continued
> completing I/Os, doing DMA and allowing device reset after surprise
> removal to support such drivers.
>
> Supporting it correctly would require a new device capability
If a device is removed, it is removed. Windows drivers supported
this since forever and it's just a Linux bug that it does not
handle all the cases. This is not something you can handle
with a capability.
> and
> driver negotiation in the virtio specification to safely stop
> I/O and free queue memory. Failure to do so either breaks all the
> existing drivers with call trace listed in the commit or crashes the
> host on continuing the DMA.
If the device is gone, then DMA does not continue.
IIUC what is going on for you, is that you have developed a surprise
removal emulation that pretends to remove the device but
actually the device is doing DMA. So of course things break then.
> Hence, until such specification and devices
> are invented, restore the previous behavior of treating surprise
> removal as graceful removal to avoid regressions and maintain system
> stability same as before the
> commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci
> device").
>
> As explained above, previous analysis of solving this only in driver
> was incomplete and non-reliable at [1] and at [2]; Hence reverting commit
> 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device")
> is still the best stand to restore failures of virtio net and
> block devices.
>
> [1]
> https://lore.kernel.org/virtualization/cy8pr12mb719506cc5613eb100bc6c638dc...@cy8pr12mb7195.namprd12.prod.outlook.com/#t
I can only repeat what I said then, this is not how we do kernel
development.
> [2]
> https://lore.kernel.org/virtualization/[email protected]/
What was missing here, is handling corner cases. So let us please
try to handle them.
Here is how I would try to do it:
- add a new driver callback
- start a periodic timer task in virtio core on remove
- in the timer, probe that the device is still present.
if not, invoke a driver callback
- cancel the task on device reset
If you do not have the time, let me know and I will try to look into it.
> Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci
> device")
> Cc: [email protected]
> Reported-by: [email protected]
> Closes:
> https://lore.kernel.org/virtualization/[email protected]/
> Signed-off-by: Parav Pandit <[email protected]>
> ---
> drivers/virtio/virtio_pci_common.c | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/drivers/virtio/virtio_pci_common.c
> b/drivers/virtio/virtio_pci_common.c
> index d6d79af44569..dba5eb2eaff9 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
> struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
> struct device *dev = get_device(&vp_dev->vdev.dev);
>
> - /*
> - * Device is marked broken on surprise removal so that virtio upper
> - * layers can abort any ongoing operation.
> - */
> - if (!pci_device_is_present(pci_dev))
> - virtio_break_device(&vp_dev->vdev);
> -
> pci_disable_sriov(pci_dev);
>
> unregister_virtio_device(&vp_dev->vdev);
> --
> 2.26.2