On Wed, Feb 29, 2012 at 11:58:37AM -0300, Luiz Capitulino wrote:
> On Wed, 29 Feb 2012 10:15:53 +0200
> Alon Levy <al...@redhat.com> wrote:
>
> > On Tue, Feb 28, 2012 at 05:10:39PM -0300, Luiz Capitulino wrote:
> > > On Sat, 25 Feb 2012 10:46:07 +0200
> > > Alon Levy <al...@redhat.com> wrote:
> > >
> > > > On Fri, Feb 24, 2012 at 04:40:15PM -0600, Anthony Liguori wrote:
> > > > > On 02/24/2012 03:22 PM, Alon Levy wrote:
> > > > > >This is an across the board change since I wanted to keep the
> > > > > >existing
> > > > > >(good imo) single graphic_console_init callback setter, instead of
> > > > > >introducing a new cb that isn't set by it but instead by a second
> > > > > >initialization function.
> > > > > >
> > > > > >Signed-off-by: Alon Levy<al...@redhat.com>
> > > > >
> > > > > What's the rationale for this?
> > > >
> > > > There is a hang possible with the current screendump command, qxl, a
> > > > spice client using libvirt and spice-gtk such as virt-viewer /
> > > > remote-viewer, where you have:
> > > > 1. libvirt waiting for screendump to complete
> > > > 2. screendump waiting for spice server thread to render
> > > > 3. spice server thread waiting for spice client to read messages
> > >
> > > Which messages?
> >
> > spice display channel messages.
> >
> > >
> > > > 4. spice client == libvirt client, waiting for screendump completion
> > >
> > > The way I had understood this problem is that qxl takes a long time to
> > > perform a screen dump, which would cause the global mutex to be held for
> > > a long time. If this is really serious, then a async command for it
> > > makes sense IMO.
> >
> > That is true, but it is not the immediate problem the bz is dealing with
> > - if it was only this there would be no hang.
>
> Well, this kind of hang always smells like a spice threading synchronization
> problem to me. I thought that I'd be capable of showing that if I really
> understood what was going on, but I can't, even with your diagram.

It isn't a spice threading synchronization problem, it's much simpler
than that: the spice server flushes messages to the spice client, so a
client that isn't reading them causes the server to sleep for a very
long time (the timeout is 150 seconds):

server: handle_dev_update/flush_display_commands

The client in this case is not reading from the socket. I couldn't
reproduce it, but I imagine it was simply waiting for completion of the
screendump command before it got around to reading the spice sockets.

>
> An asynchronous command solves the global mutex contention problem, but I
> think this hang should be further investigated, otherwise the async command
> risks just hiding the real problem.
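
To make the wait cycle from the numbered list concrete, here is a toy model of it. This is only a sketch and not spice or qemu code: every name in it (server_flush, screendump, client, run, FLUSH_TIMEOUT) is made up for illustration, the socket is modelled as a one-slot queue, the 150 second timeout is scaled down to a fraction of a second, and what the server does once the timeout expires is an assumption (here it just gives up). It also assumes the client really does wait for screendump completion before reading its socket, which is a guess, since the hang was not reproduced.

```python
import queue
import threading
import time

FLUSH_TIMEOUT = 0.2  # hypothetical stand-in for the 150 second timeout


def server_flush(sock, n_messages, timeout):
    """Push display messages to the client; block while the socket is full."""
    for i in range(n_messages):
        try:
            sock.put(f"display-msg-{i}", timeout=timeout)
        except queue.Full:
            return False  # assumption: give up once the timeout expires
    return True


def screendump(sock, done, timeout):
    """The command completes only after the server flush has returned."""
    flushed = server_flush(sock, n_messages=3, timeout=timeout)
    done.set()
    return flushed


def client(sock, done, read_while_waiting):
    """Client that issued screendump and is waiting for it to complete."""
    received = []
    while not done.is_set():
        if read_while_waiting:
            try:
                received.append(sock.get(timeout=0.01))
            except queue.Empty:
                pass
        else:
            done.wait(0.01)  # waiting on the command, not reading the socket
    # Drain whatever is still queued once the command has completed.
    while True:
        try:
            received.append(sock.get_nowait())
        except queue.Empty:
            return received


def run(read_while_waiting):
    """Return (flush succeeded, elapsed seconds) for one screendump."""
    sock = queue.Queue(maxsize=1)  # tiny buffer, like a full socket
    done = threading.Event()
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(
            flushed=screendump(sock, done, FLUSH_TIMEOUT)))
    start = time.monotonic()
    worker.start()
    client(sock, done, read_while_waiting)
    worker.join()
    return result["flushed"], time.monotonic() - start
```

With run(False) the flush stalls on the second message until the timeout expires, which is the hang in miniature; with run(True) the client keeps draining the queue and the flush goes through. No thread synchronization bug is needed for the stall, just a reader that isn't reading.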