On Sun, Aug 24, 2025 at 02:36:23AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <m...@redhat.com>
> > Sent: 22 August 2025 07:30 PM
> > 
> > On Fri, Aug 22, 2025 at 01:49:36PM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <m...@redhat.com>
> > > > Sent: 22 August 2025 06:34 PM
> > > >
> > > > On Fri, Aug 22, 2025 at 12:22:50PM +0000, Parav Pandit wrote:
> > > > > > From: Michael S. Tsirkin <m...@redhat.com>
> > > > > > Sent: 22 August 2025 03:52 PM
> > > > > >
> > > > > > On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
> > > > > > > This reverts commit 43bb40c5b926 ("virtio_pci: Support
> > > > > > > surprise removal of virtio pci device").
> > > > > > >
> > > > > > > Virtio drivers and PCI devices have never fully supported true
> > > > > > > surprise (aka hot unplug) removal. Drivers historically
> > > > > > > continued processing and waiting for pending I/O and even
> > > > > > > continued synchronous device reset during surprise removal.
> > > > > > > Devices have also continued completing I/Os, doing DMA and
> > > > > > > allowing device reset after surprise removal to support such 
> > > > > > > drivers.
> > > > > > >
> > > > > > > Supporting it correctly would require a new device capability
> > > > > >
> > > > > > If a device is removed, it is removed.
> > > > > This is how it was implemented and none of the virtio drivers 
> > > > > supported it.
> > > > > So vendors had stepped away from such device implementation.
> > > > > (not just us).
> > > >
> > > >
> > > > If the slot does not have a mechanical interlock, I can pull the
> > > > device out. It's not up to a device implementation.
> > >
> > > Sure, yes, the stack is not there yet to support it.
> > > Each of the virtio device drivers is not there yet.
> > > Let's build that infra, let the device indicate it, and it will be a
> > > smooth ride for driver and device.
> > 
> > There is simply no way for the device to "support" surprise removal, or
> > to lack such support.
> > The support is up to the slot, not the device.  Any pci compliant device
> > can be placed in a slot that allows surprise removal and that is all. The
> > user can then remove the device.
> > Software can then either recover gracefully - it should - or hang or crash
> > - it does sometimes, now. The patch you are trying to revert is an attempt
> > to move some use-cases from the 2nd to the 1st category.
> > 
> It is the driver (and not the device) who needs to tell the device that it 
> will do sane cleanup and not wait infinitely.

You can invent a way for driver to tell the device that it is not
broken. But even if the driver does not do it, nothing at all
prevents users from removing the device.


> > But what is going on now, as far as I could tell, is that someone developed
> > a surprise removal emulation that does not actually remove the device, and
> > is using that for testing the code in linux that supports surprise removal.
> Nope. Your analysis is incorrect.
> And I explained that to you already.
> The device implements this correctly: it stops all DMA and also does not
> support register access.
> And not a single virtio driver supported that.
> 
> On a surprise-removed device, the driver expects I/Os to complete, and this
> is beyond a 'bug fix' watermark.
> 
> > That weird emulation seems to lead to all kinds of weird issues. Your
> > answer is to remove the existing code and tell your testing team "we do
> > not support surprise removal".
> >
> He he, it is not the device, it is the driver that does not support surprise
> removal, as you can see in your proposed patches and other sw changes.

Then fix the driver. Or don't, for that matter, if you lack the time.

> > But just go ahead and tell this to them straight away. You do not need this
> > patch for this.
> > 
> It is needed until infrastructure in multiple subsystems is built.

What I do not understand, is what good does the revert do. Sorry.

> > 
> > Or better still, let's fix the issues please.
> > 
> The implementation is more than a fix category for stable kernels.
> Hence, what is asked is to do a proper implementation for future kernels and
> until that point restore the bad user experience.



I am not at all interested in discussing ease of backporting fixes
before they are developed.
Not how we do kernel development, sorry.

> > 
> > --
> > MST

