Re: [Qemu-devel] [patch v4 05/16] memory: introduce ref, unref interface for MemoryRegionOps

Paolo Bonzini Fri, 26 Oct 2012 08:06:50 -0700


----- Original Message -----
> From: "Avi Kivity" <a...@redhat.com>
> To: "Paolo Bonzini" <pbonz...@redhat.com>
> Cc: "Liu Ping Fan" <pingf...@linux.vnet.ibm.com>, qemu-devel@nongnu.org,
> "Anthony Liguori" <anth...@codemonkey.ws>,
> "Marcelo Tosatti" <mtosa...@redhat.com>, "Jan Kiszka"
> <jan.kis...@siemens.com>, "Stefan Hajnoczi"
> <stefa...@gmail.com>
> Sent: Thursday, 25 October 2012 18:28:27
> Subject: Re: [patch v4 05/16] memory: introduce ref,unref interface for
> MemoryRegionOps
> 
> On 10/24/2012 09:29 AM, Paolo Bonzini wrote:
> > On 23/10/2012 18:09, Avi Kivity wrote:
> >>> But our interfaces had better support asynchronicity, and indeed
> >>> they
> >>> do: after you write to the "eject" register, the "up" will show
> >>> the
> >>> device as present until after destroy is done.  This can be
> >>> changed to
> >>> show the device as present only until after step 4 is done.
> >> 
> >> Let's say we want to eject the hotplug hardware itself (just as an
> >> example).  With refcounts, the callback that updates "up" will hold
> >> on to it via refcounts.  With stop_machine(), you need to cancel
> >> that callback, or wait for it somehow, or it can arrive after the
> >> stop_machine() and bite you.
> > 
> > The callback that updates "up" is for the parent of the hotplug
> > hardware.  There is nothing that has to be updated in the hotplug
> > hardware itself.
> 
> I meant, as an unrealistic example, hot-unplugging the bridge itself.
> So we have a callback that updates information in the bridge (up
> register state) being called asynchronously.
> 
> A more realistic example would be hot-unplug of an HBA, then the block
> layer callback comes back to update the device.  So stop_machine()
> would need to cancel all I/O and wait for I/O that cannot be cancelled.


Cancellation+wait would be triggered by isolate (4a) and it would run
outside stop_machine().  We know that stop_machine() will eventually
run because the guest cannot place more requests for the devices to
process.

At this point we're here:

> > 4a. close all backends (also cancel or complete all pending I/O)
> 
> ^ long latency
> 

but none of this is done in stop_machine().  Once cancellation/wait
finishes, the HBA gives a green-light to the parent, which proceeds
as follows:

> > 4b. notify parent that we're done
> >     4ba. parent removes device from its bus
> >     4bb. parent notifies guest
> >     4bc. parent schedules stop_machine(qdev_free(child))
> > 5. a bottom half calls stop_machine(qdev_free(child))

All we're doing in stop_machine() is really calling the destructor,
which---in an isolate-enabled device---only includes calls to
qemu_del_timer, drive_put_ref, memory_region_destroy and the like.
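
To make the split between the two phases concrete, here is a minimal
sketch in C of the sequence above. It is only an illustration under
stated assumptions: Device, device_isolate, parent_child_isolated,
stop_machine_sketch and the one-slot bottom-half queue are all
hypothetical names invented for this example, not existing QEMU
interfaces, and the real stop_machine() would quiesce the other
threads instead of just calling the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hot-unpluggable device; not QEMU's real types. */
typedef struct Device {
    int pending_io;       /* requests still in flight in the backend */
    bool on_bus;          /* still linked on the parent's bus */
    bool guest_notified;  /* parent has told the guest about the removal */
    bool freed;           /* set by the destructor */
} Device;

typedef void (*DeviceFn)(Device *);

/* One-slot stand-in for a bottom-half queue, enough for the sketch. */
static DeviceFn scheduled_bh;
static Device *scheduled_dev;

/* Destructor: only releases resources, never waits for I/O. */
static void device_free(Device *d)
{
    assert(d->pending_io == 0);  /* isolate has already drained the I/O */
    assert(!d->on_bus);          /* parent has already unlinked the device */
    d->freed = true;
}

/* Stand-in for stop_machine(): the real thing would stop all other
 * threads, run fn, then resume them. Here it just runs fn. */
static void stop_machine_sketch(DeviceFn fn, Device *d)
{
    fn(d);
}

/* Step 5: the bottom half calls stop_machine(qdev_free(child)). */
static void free_bh(Device *d)
{
    stop_machine_sketch(device_free, d);
}

/* Step 4b: the child reports that isolation is complete. */
static void parent_child_isolated(Device *d)
{
    d->on_bus = false;         /* 4ba: remove the device from the bus */
    d->guest_notified = true;  /* 4bb: notify the guest */
    scheduled_bh = free_bh;    /* 4bc: schedule the free for later */
    scheduled_dev = d;
}

/* Step 4a: the long-latency part, run outside stop_machine(). */
static void device_isolate(Device *d)
{
    while (d->pending_io > 0) {
        d->pending_io--;       /* cancel or complete one request */
    }
    parent_child_isolated(d);
}

/* One main-loop iteration: run the scheduled bottom half, if any. */
static bool run_bottom_halves(void)
{
    if (scheduled_bh == NULL) {
        return false;
    }
    DeviceFn bh = scheduled_bh;
    Device *d = scheduled_dev;
    scheduled_bh = NULL;
    scheduled_dev = NULL;
    bh(d);
    return true;
}
```

Calling device_isolate() on a device with pending requests drains
them and unlinks the device but leaves it allocated; only the later
run_bottom_halves() call reaches the destructor, which by then has
nothing left to wait for.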

> Maybe my worry about long stop_machine latencies is premature.
> Everyone in the kernel hates it, but the kernel scales a lot more
> than qemu and is in a much better place wrt threading.

stop_machine may indeed require (or at least warmly suggest) a
conversion to isolate of storage devices, in order to reduce the
latency of the destructor.  We do not have that many though (the
IDE and SCSI buses, and virtio-blk).

Paolo
