Hi,
On 06/17/2011 07:12 PM, Alon Levy wrote:
On Fri, Jun 17, 2011 at 09:34:12AM +0200, Gerd Hoffmann wrote:
On 06/16/11 21:21, Alon Levy wrote:
Hi,

I just had a short conversation with Marc-Andre about a bug in
spice/qxl and it turns out there are a couple of possible solutions
and I require some more input.

The problem: spice was stalling in flush_display_commands caused by
QXL_UPDATE_AREA. Since the vcpu thread is holding the global qemu
mutex, qemu is unresponsive during this time to any other
clients; specifically, a libvirt call via virt-manager
(virDomainGetXMLDesc) was blocked.

Yes, qemu doesn't respond to anything.  I've seen up to 50
milliseconds latency (spice client connected via loopback), which is
enough to disturb EHCI emulation.

There are a number of wrong things here. I'll start from what I think
we should fix, but any of these would solve this specific problem:

1. QXL_IO_UPDATE_AREA is synchronous. If we made it async, by using an
interrupt to notify of its completion, then the I/O would complete
practically immediately, and the mutex would be relinquished.

QXL_IO_UPDATE_AREA isn't the only one, although it is the most
important.  And, yes, this is what we should do IMHO.  Make the I/O
complete immediately, then set a bit in a status register and raise
an IRQ when done.  This is how real hardware usually works.
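
A minimal sketch of that completion pattern (all names here are
hypothetical, not the actual qxl code): the I/O write returns
immediately, the work happens on a worker thread, and completion sets a
status bit and raises an IRQ:

#include <stdint.h>
#include <stdio.h>

#define QXL_STATUS_UPDATE_AREA_DONE (1u << 0)  /* hypothetical status bit */

struct qxl_dev {
    volatile uint32_t status;   /* status register the guest can read */
};

/* Stand-in for actually asserting the device's IRQ line. */
static void raise_irq(struct qxl_dev *d)
{
    (void)d;
    printf("IRQ raised\n");
}

/* The worker thread calls this once the deferred update finishes. */
static void qxl_update_area_complete(struct qxl_dev *d)
{
    d->status |= QXL_STATUS_UPDATE_AREA_DONE;  /* set the "done" bit first */
    raise_irq(d);                              /* then notify the guest */
}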

2. flush_display_commands is waiting for the client if the PIPE is too
large. This is wrong - why should a slow client prevent the guest
from doing an update area? Presumably we do this to keep the pipe
length limited, which presumably limits the amount of memory. I don't
have any numbers, but this looks wrong to me.

Sounds like this should be done anyway.  The spice client shouldn't
be able to delay update area requests of the guests, even if they
run asynchronously.

Actually, all guest operations that push commands can be delayed by the client (WaitForCmdRing), since we don't process commands as long as the pipe size is too big. That makes sense: we want to keep the producer and consumer rates balanced, in order to keep memory usage stable and not trigger QXL_IO_NOTIFY_OOM. I agree that update_area is exceptional because it might wait for more than one pipe item to be sent. First, as has already been discussed, we probably need to limit the pipe by the total size of the elements it holds, and not by its own length (see the sketch below). Second, in the update area case, maybe we can exceed the pipe size but keep track of memory usage, and in order to avoid subsequent OOMs we can free other drawables the worker holds (i.e., do what we actually do on OOM).
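
A minimal sketch of the size-based limit (the names and the byte budget
are assumptions, not existing spice-server code):

#include <stdbool.h>
#include <stddef.h>

/* Assumed budget; the real number would need tuning. */
#define PIPE_MAX_BYTES (4u * 1024 * 1024)

struct pipe_state {
    size_t total_bytes;   /* running sum of the sizes of queued items */
};

/* Producer-side check: throttle on total queued bytes, not item count,
 * so memory stays bounded regardless of the item size distribution. */
static bool pipe_can_push(const struct pipe_state *p, size_t item_size)
{
    return p->total_bytes + item_size <= PIPE_MAX_BYTES;
}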

3. we don't drop qemu's global mutex when dispatching a message. Since
an I/O can happen from any vcpu, we would need to add our own dispatcher
mutex in spice.

I've already looked briefly at how best to fix that.

One option is to simply kick off a thread in qemu to run the update
area request, and also handle the locking inside qemu.  Advantage is
that we don't need libspice-server api changes.  It is a bit unclean
though, especially as the work gets offloaded to a thread inside
libspice-server anyway.
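
A rough sketch of this first option, assuming plain pthreads and a
hypothetical do_blocking_update_area() standing in for the blocking
spice-server call:

#include <pthread.h>

/* Hypothetical stand-in for the blocking libspice-server request. */
static void do_blocking_update_area(void *qxl) { (void)qxl; }

static void *update_area_thread(void *opaque)
{
    /* Runs without the qemu global mutex held. */
    do_blocking_update_area(opaque);
    /* ...raise the completion IRQ here... */
    return NULL;
}

/* I/O handler: offload the work and return immediately. */
static void qxl_io_update_area(void *qxl)
{
    pthread_t t;
    pthread_create(&t, NULL, update_area_thread, qxl);
    pthread_detach(t);
}

The vcpu thread then returns from the I/O handler right away and
releases the global mutex on its normal path; only the helper thread
blocks.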

The other option is to extend the libspice-server API and add async
versions of all synchronous requests with completion callbacks.
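
What the async API variant might look like (the function name and
signature below are assumptions about a possible API, not the current
libspice-server one; QXLWorker/QXLRect are stand-ins for the real
types):

#include <stdint.h>

typedef struct QXLWorker QXLWorker;
typedef struct QXLRect { int32_t top, left, bottom, right; } QXLRect;

typedef void (*spice_async_complete_cb)(void *opaque);

/* Returns immediately; cb fires from the worker when the update is
 * done, at which point qemu can set the status bit and raise the
 * guest IRQ. */
void spice_qxl_update_area_async(QXLWorker *worker, uint32_t surface_id,
                                 const QXLRect *area,
                                 spice_async_complete_cb cb, void *opaque);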

No matter how we fix it on the qemu/spice-server side we need driver
updates for the guests ...

Yes. My current thinking (and I'm continuing with UPDATE_AREA as the example,
but like you noted this needs to be done for any of the other IOs that are
causing possible delays, like DESTROY_PRIMARY, CREATE_PRIMARY) is to add
UPDATE_AREA_ASYNC, so as not to break current drivers and to be able to do
the change in steps (qemu update, win driver update, lin driver update).
Regarding the Linux driver, we would need to have one for this to work -
currently we don't have an interrupt handler at all.

One issue is whether we want to handle multiple outstanding update area
requests or not. Doing a single one would just mean adding a field for
surface_id in the QXLRam, and adding an interrupt value (we use two out of
32 bits, so we have quite a few left).
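
For illustration, the single-outstanding-request ABI could be as small
as this (bit position and field names are hypothetical; the fields
would live inside QXLRam itself, shown as a separate struct here for
clarity):

#include <stdint.h>

/* A third interrupt cause next to the two bits already in use. */
#define QXL_INTERRUPT_UPDATE_AREA_DONE (1u << 2)

struct qxl_ram_update_request {
    uint32_t update_surface;     /* surface_id the guest wants flushed */
    struct {
        int32_t top, left, bottom, right;
    } update_area;               /* rectangle to update */
};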

You mean that while the driver performs update area, another driver thread attempts to call update area as well? It would be interesting to know if this really happens. I think successive calls to UpdateArea can be kept synchronous, i.e., we can use a mutex for these calls in the driver (see the sketch below).
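
A sketch of that serialization, written with pthreads for brevity
rather than the actual Windows/X driver primitives (the two helpers are
hypothetical):

#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t update_area_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical driver primitives. */
static void kick_update_area_async(uint32_t surface_id) { (void)surface_id; }
static void wait_for_update_done_irq(void) { /* block on the IRQ event */ }

/* Callers keep the old synchronous behaviour; the mutex guarantees
 * only one async request is outstanding at a time. */
static void drv_update_area(uint32_t surface_id)
{
    pthread_mutex_lock(&update_area_lock);
    kick_update_area_async(surface_id);  /* I/O write returns immediately */
    wait_for_update_done_irq();          /* sleep until the completion IRQ */
    pthread_mutex_unlock(&update_area_lock);
}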
If we want to handle multiple outstanding requests it becomes more difficult.
It seems to me it would be worth adding a ring in such a case, since we can't
reuse the existing command ring for update_area - the whole point of that
request is to flush the command ring (or at least only those operations
affecting that surface, but there is no way to know that without going over
the whole ring). On the other hand, maybe we could reuse the command ring; we
could even convert these update_area calls to something like a fence, but I
haven't thought that through yet.

There is a problem with reusing the command ring - if it is full you can't
push anything into it, including a QXLUpdateArea command (see the sketch
below).
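
To make the full-ring problem concrete, simplified ring bookkeeping in
the spirit of the qxl rings:

#include <stdbool.h>
#include <stdint.h>

struct ring {
    uint32_t prod, cons, num_items;
};

/* The ring is full when the producer is a whole ring ahead of the
 * consumer - at that point nothing, including the very update_area
 * command meant to drain the ring, can be pushed. */
static bool ring_has_room(const struct ring *r)
{
    return r->prod - r->cons < r->num_items;
}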

So to sum up the points:
  1. UPDATE_AREA_ASYNC; a Linux driver for X usage, while on Windows the
  driver already has an interrupt handler. A single outstanding request is
  available (so the driver will have to serialize requests).
  2. Need to consider if we want to have multiple outstanding updates.

In general I think we should be aiming to eliminate update_area usage as much
as possible (because it requires a vmexit), so I guess maybe 1 will be enough
in the long run.


cheers,
   Gerd