On Sun, Aug 24, 2025 at 02:36:23AM +0000, Parav Pandit wrote:
>
> > From: Michael S. Tsirkin <m...@redhat.com>
> > Sent: 22 August 2025 07:30 PM
> >
> > On Fri, Aug 22, 2025 at 01:49:36PM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <m...@redhat.com>
> > > > Sent: 22 August 2025 06:34 PM
> > > >
> > > > On Fri, Aug 22, 2025 at 12:22:50PM +0000, Parav Pandit wrote:
> > > > > > From: Michael S. Tsirkin <m...@redhat.com>
> > > > > > Sent: 22 August 2025 03:52 PM
> > > > > >
> > > > > > On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
> > > > > > > This reverts commit 43bb40c5b926 ("virtio_pci: Support
> > > > > > > surprise removal of virtio pci device").
> > > > > > >
> > > > > > > Virtio drivers and PCI devices have never fully supported true
> > > > > > > surprise (aka hot unplug) removal. Drivers historically
> > > > > > > continued processing and waiting for pending I/O and even
> > > > > > > continued synchronous device reset during surprise removal.
> > > > > > > Devices have also continued completing I/Os, doing DMA and
> > > > > > > allowing device reset after surprise removal to support such
> > > > > > > drivers.
> > > > > > >
> > > > > > > Supporting it correctly would require a new device capability
> > > > > >
> > > > > > If a device is removed, it is removed.
> > > > > This is how it was implemented and none of the virtio drivers
> > > > > supported it.
> > > > > So vendors had stepped away from such device implementation.
> > > > > (not just us).
> > > >
> > > >
> > > > If the slot does not have a mechanical interlock, I can pull the
> > > > device out. It's not up to a device implementation.
> > >
> > > Sure yes, stack is not there yet to support it.
> > > Each of the virtio device drivers are not there yet.
> > > Lets build that infra, let device indicate it and it will be smooth ride
> > > for driver and device.
> >
> > There is simply no way for the device to "support" for surprise removal,
> > or lack such support thereof.
> > The support is up to the slot, not the device. Any pci compliant device
> > can be placed in a slot that allows surprise removal and that is all.
> > The user can then remove the device.
> > Software can then either recover gracefully - it should - or hang or
> > crash - it does sometimes, now. The patch you are trying to revert is an
> > attempt to move some use-cases from the 1st to the 2nd category.
> >
> It is the driver (and not the device) who needs to tell the device that it
> will do sane cleanup and not wait infinitely.

You can invent a way for driver to tell the device that it is not broken.
But even if the driver does not do it, nothing at all prevents users from
removing the device.

> > But what is going on now, as far as I could tell, is that someone
> > developed a surprise removal emulation that does not actually remove the
> > device, and is using that for testing the code in linux that supports
> > surprise removal.
> Nop. Your analysis is incorrect.
> And I explained you that already.
> The device implementation supports correct implementation where device
> stops all the dma and also does not support register access.
> And no single virtio driver supported that.
>
> On a surprised removed device, driver expects I/Os to complete and this is
> beyond a 'bug fix' watermark.
>
> > That weird emulation seems to lead to all kind of weird issues. You
> > answer is to remove the existing code and tell your testing team "we do
> > not support surprise removal".
> >
> He he, it is no the device, it is the driver that does not support surprise
> removal as you can see in your proposed patches and other sw changes.

Then fix the driver. Or don't, for that matter, if you lack the time.

> > But just go ahead and tell this to them straight away. You do not need
> > this patch for this.
> >
> It is needed until infrastructure in multiple subsystem is built.

What I do not understand, is what good does the revert do. Sorry.

> >
> > Or better still, let's fix the issues please.
> >
> The implementation is more than a fix category for stable kernels.
> Hence, what is asked is to do proper implementation for future kernels and
> until that point restore the bad use experience.

I am not at all interested in discussing ease of backporting fixes before
they are developed. Not how we do kernel development, sorry.

> >
> > --
> > MST