> From: Michael S. Tsirkin <m...@redhat.com>
> Sent: 22 August 2025 07:32 PM
> 
> On Fri, Aug 22, 2025 at 01:53:02PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <m...@redhat.com>
> > > Sent: 22 August 2025 06:35 PM
> > >
> > > On Fri, Aug 22, 2025 at 12:24:06PM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Li,Rongqing <lirongq...@baidu.com>
> > > > > Sent: 22 August 2025 03:57 PM
> > > > >
> > > > > > This reverts commit 43bb40c5b926 ("virtio_pci: Support
> > > > > > surprise removal of virtio pci device").
> > > > > >
> > > > > > Virtio drivers and PCI devices have never fully supported true
> > > > > > surprise (aka hot
> > > > > > unplug) removal. Drivers historically continued processing and
> > > > > > waiting for pending I/O and even continued synchronous device
> > > > > > > reset during surprise removal. Devices have also continued
> > > > > > > completing I/Os, doing DMA and allowing device reset after
> > > > > > > surprise removal to support such drivers.
> > > > > >
> > > > > > > Supporting it correctly would require a new device capability
> > > > > > > and driver negotiation in the virtio specification to safely
> > > > > > > stop I/O and free queue memory.
> > > > > > > Failure to do so either breaks all the existing drivers with
> > > > > > > call trace listed in the commit or crashes the host on
> > > > > > > continuing the DMA.
> > > > > > Hence, until such specification and devices are invented,
> > > > > > restore the previous behavior of treating surprise removal as
> > > > > > graceful removal to avoid regressions and maintain system
> > > > > > > stability same as before the commit 43bb40c5b926 ("virtio_pci:
> > > > > > > Support surprise removal of virtio pci device").
> > > > > >
> > > > > > As explained above, previous analysis of solving this only in
> > > > > > driver was incomplete and non-reliable at [1] and at [2];
> > > > > > > Hence reverting commit 43bb40c5b926 ("virtio_pci: Support
> > > > > > > surprise removal of virtio pci device") is still the best
> > > > > > > stand to restore failures of virtio net and block devices.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB10
> > > > > 0BC6
> > > > > C6
> > > > > > 38 dc...@cy8pr12mb7195.namprd12.prod.outlook.com/#t
> > > > > > [2]
> > > > > > https://lore.kernel.org/virtualization/20250602024358.57114-1-
> > > > > > para
> > > > > > v@nv
> > > > > > idia.c
> > > > > > om/
> > > > > >
> > > > > > Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of
> > > > > > virtio pci device")
> > > > > > Cc: sta...@vger.kernel.org
> > > > > > Reported-by: lirongq...@baidu.com
> > > > > > > Closes:
> > > > > > > https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b4741@baidu.com/
> > > > > > Signed-off-by: Parav Pandit <pa...@nvidia.com>
> > > > >
> > > > >
> > > > >
> > > > > Tested-by: Li RongQing <lirongq...@baidu.com>
> > > > >
> > > > > Thanks
> > > > >
> > > > > -Li
> > > > >
> > > > Multiple users are blocked to have this fix in stable kernel.
> > >
> > > what are these users doing that is blocked by this fix?
> > >
> > Not sure I understand the question. Let me try to answer.
> > They are unable to dynamically add/remove the virtio net, block, fs
> > devices in their systems.
> > Users have their networking applications running over NS network and
> > database and file system through these devices.
> > Some of them keep reverting the patch. Some are unable to.
> > They are in search of stable kernel.
> >
> > Did I understand your question?
> >
> 
> Not really, sorry.
> 
> Does the system or does it not have a mechanical interlock?
> 
It is a modern system that has moved beyond a mechanical interlock, but it
still has the ability for surprise removal.

> If it does, how does a user run into surprise removal issues without the
> ability to remove the device?
> 
The user can surprise-remove a device from the slot via the slot's PCI
registers.
Yet the device is capable enough to fulfil the needs of broken drivers which
are waiting for the pending requests to complete.

> If it does not, and a user pull out the working device, how does your patch
> help?
>
A driver must tell the device that it will not follow the broken ancient
behaviour, and at that point the device would stop its ancient backward
compatibility mode.
 
> --
> MST

