On Sun, Aug 24, 2025 at 02:36:11AM +0000, Parav Pandit wrote: > > > > From: Michael S. Tsirkin <m...@redhat.com> > > Sent: 22 August 2025 07:32 PM > > > > On Fri, Aug 22, 2025 at 01:53:02PM +0000, Parav Pandit wrote: > > > > > > > > > > From: Michael S. Tsirkin <m...@redhat.com> > > > > Sent: 22 August 2025 06:35 PM > > > > > > > > On Fri, Aug 22, 2025 at 12:24:06PM +0000, Parav Pandit wrote: > > > > > > > > > > > From: Li,Rongqing <lirongq...@baidu.com> > > > > > > Sent: 22 August 2025 03:57 PM > > > > > > > > > > > > > This reverts commit 43bb40c5b926 ("virtio_pci: Support > > > > > > > surprise removal of virtio pci device"). > > > > > > > > > > > > > > Virtio drivers and PCI devices have never fully supported true > > > > > > > surprise (aka hot > > > > > > > unplug) removal. Drivers historically continued processing and > > > > > > > waiting for pending I/O and even continued synchronous device > > > > > > > reset during surprise removal. Devices have also continued > > > > > > > completing I/Os, doing DMA and allowing device reset after > > > > > > > surprise > > > > removal to support such drivers. > > > > > > > > > > > > > > Supporting it correctly would require a new device capability > > > > > > > and driver negotiation in the virtio specification to safely > > > > > > > stop I/O and free queue > > > > > > memory. > > > > > > > Failure to do so either breaks all the existing drivers with > > > > > > > call trace listed in the commit or crashes the host on continuing > > > > > > > the > > DMA. > > > > > > > Hence, until such specification and devices are invented, > > > > > > > restore the previous behavior of treating surprise removal as > > > > > > > graceful removal to avoid regressions and maintain system > > > > > > > stability same as before the commit 43bb40c5b926 ("virtio_pci: > > > > > > > Support surprise removal of virtio pci > > > > > > device"). > > > > > > > > > > > > > > As explained above, previous analysis of solving this only in > > > > > > > driver was incomplete and non-reliable at [1] and at [2]; > > > > > > > Hence reverting commit > > > > > > > 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio > > > > > > > pci > > > > > > > device") is still the best stand to restore failures of virtio > > > > > > > net and block > > > > > > devices. > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB10 > > > > > > 0BC6 > > > > > > C6 > > > > > > > 38 dc...@cy8pr12mb7195.namprd12.prod.outlook.com/#t > > > > > > > [2] > > > > > > > https://lore.kernel.org/virtualization/20250602024358.57114-1- > > > > > > > para > > > > > > > v@nv > > > > > > > idia.c > > > > > > > om/ > > > > > > > > > > > > > > Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of > > > > > > > virtio pci device") > > > > > > > Cc: sta...@vger.kernel.org > > > > > > > Reported-by: lirongq...@baidu.com > > > > > > > Closes: > > > > > > > https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb7 > > > > > > > 3ca9 > > > > > > > b474 > > > > > > > 1@b > > > > > > > aidu.com/ > > > > > > > Signed-off-by: Parav Pandit <pa...@nvidia.com> > > > > > > > > > > > > > > > > > > > > > > > > Tested-by: Li RongQing <lirongq...@baidu.com> > > > > > > > > > > > > Thanks > > > > > > > > > > > > -Li > > > > > > > > > > > Multiple users are blocked to have this fix in stable kernel. > > > > > > > > what are these users doing that is blocked by this fix? > > > > > > > Not sure I understand the question. Let me try to answer. > > > They are unable to dynamically add/remove the virtio net, block, fs > > > devices in > > their systems. > > > Users have their networking applications running over NS network and > > database and file system through these devices. > > > Some of them keep reverting the patch. Some are unable to. > > > They are in search of stable kernel. > > > > > > Did I understand your question? > > > > > > > Not really, sorry. > > > > Does the system or does it not have a mechanical interlock? > > > It is modern system beyond mechanical interlock but has the ability for > surprise removal.
I am not sure what does "beyond" mean. I guess that it does not have it? > > If it does, how does a user run into surprise removal issues without the > > ability > > to remove the device? > > > User has the ability to surprise removal a device from the slot via the > slot's pci registers. I don't know what this means. Surprise removal is done by removing the device. Not via pci registers. > Yet the device is capable enough to fulfil the needs of broken drivers which > are waiting for the pending requests to arrive. I don't know what this means. A removed device can not do anything at all. > > If it does not, and a user pull out the working device, how does your patch > > help? > > > A driver must tell that it will not follow broken ancient behaviour and at > that point device would stop its ancient backward compatibility mode. I don't know what is "ancient backward compatibility mode". > > -- > > MST