On Thu, Aug 28 2025, "Michael S. Tsirkin" <m...@redhat.com> wrote:

> On Thu, Aug 28, 2025 at 02:16:28PM +0200, Cornelia Huck wrote:
>> On Thu, Aug 28 2025, Parav Pandit <pa...@nvidia.com> wrote:
>> 
>> >> From: Cornelia Huck <coh...@redhat.com>
>> >> Sent: 27 August 2025 05:04 PM
>> >> 
>> >> On Wed, Aug 27 2025, "Michael S. Tsirkin" <m...@redhat.com> wrote:
>> >> 
>> >> > On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote:
>> >> >> > What I do not understand, is what good does the revert do. Sorry.
>> >> >> >
>> >> >> Let me explain.
>> >> >> It prevents vblk requests from getting stuck due to a broken VQ.
>> >> >> It prevents the vnet driver's start_xmit() from getting stuck on skb
>> >> >> completions.
>> >> >
>> >> > This is the part I don't get.  In what scenario, before 43bb40c5b9265
>> >> > start_xmit is not stuck, but after 43bb40c5b9265 it is stuck?
>> >> >
>> >> > Once the device is gone, it is not using any buffers at all.
>> >> 
>> >> What I also don't understand: virtio-ccw does exactly the same thing
>> >> (virtio_break_device(), added in 2014), and it supports surprise removal
>> >> _only_, yet I don't remember seeing bug reports?
>> >
>> > I suspect that stress testing may not have happened for ccw with active
>> > vblk I/Os and outstanding transmit packets and cvq commands.
>> > Hard to say as we don't have ccw hw or systems.
>> 
>> cc:ing linux-s390 list. I'd be surprised if nobody ever tested surprise
>> removal on a loaded system in the last 11 years.
>
>
> As became very clear from the follow-up discussion, the issue has nothing
> to do with virtio; it lies with a broken hypervisor that allows a device to
> DMA into guest memory while also telling the guest that the device has
> been removed.
>
> I guess s390 is just not broken like this.

Ah good, I missed that -- that indeed sounds broken, and needs to be
fixed there.

