Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On Tue, Oct 25, 2011 at 04:41:18PM -0200, Luiz Capitulino wrote: On Tue, 25 Oct 2011 15:21:11 +0200 Alon Levy al...@redhat.com wrote: On Tue, Oct 25, 2011 at 10:51:30AM -0200, Luiz Capitulino wrote: On Tue, 25 Oct 2011 12:13:09 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 10:31:48PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 19:29:37 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 17:13:14 +0200 Gerd Hoffmann kra...@redhat.com wrote: On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. There are two problems actually. The first one is that changing an existing command from synchronous to asynchronous is an incompatible change because asynchronous commands semantics is different. For an example of possible problems please check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. The second problem is that the existing asynchronous interface in the monitor is incomplete and has never been used for real. Our plan is to use QAPI's async support, but that has not landed in master yet and iirc there wasn't consensus about it. I also think it's a bit late for its inclusion in 1.0 (and certainly not a candidate for stable). If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged. My problem is that the io thread keeps the global mutex during the wait, that's why the async monitor is perfect for what I want - this is exactly what it does. Let's not mix internal implementation details with what we want as an external interface. Can't you just make a vga_hw_screen_dump() specific callback? I don't understand how that would help - if the monitor command doesn't return (normal sync operation) then the mutex is never dropped, and any callback won't change that. I'm trying to figure out a different solution. Our primary motivation for making a QMP command asynchronous must be to give clients the ability to keep issuing commands while slow commands are running. If that's not what you want nor your primary motivation, then you're probably taking the wrong approach. That sounds right, and it was what I assumed the async monitor implementation would do, boy was I surprised to discover it doesn't do any such thing, but what it does do is return early, allow *other* io related events (select returns) to be handled, and keeps the serialized only-one-command-ongoing monitor usage. Yes, if that turns out to be needed internally then we'll have to add such functionality (but probably not as part of QMP's async support iiuc). If that is what you want, then you'll have to add a new command, because changing from asynchronous to synchronous is an incompatible change _and_ you shouldn't use the current interface, because it's botched (actually, I think I'm going to drop it right now as my last series fixed its only user). This is not what I want. I understand of course that it is what one would design an async monitor / api to allow (it was after all what I thought async monitor support meant). Using a botched interface that doesn't do what's supposed to do but happens to solve a bug as a side effect will very likely end in tears at some point in the future. Right, but it's the opposite of the current case. Now, I did some research and found this description of the problem: In testing my patches for 'add_client' support with SPICE, I noticed deadlock in the QEMU/SPICE code. It only happens if I close a SPICE client and then immediately reconnect within about 1 second. If I wait a couple of seconds before reconnecting the SPICE client I don't see the deadlock. (http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg02599.html) Is this accurate? Why does it _work_ after 1 second? This is an unrelated bug, I know I said different on the thread but
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
Hi, If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged. A delayed monitor response is all we need. The command may take a bit to finish, where a bit is in the order of a few seconds max, usually less than a second. Blocking the monitor for that amount of time is fine, but blocking the io thread introduces insane long I/O latencies for the guest (and device emulation). cheers, Gerd
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On Mon, Oct 24, 2011 at 10:31:48PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 19:29:37 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 17:13:14 +0200 Gerd Hoffmann kra...@redhat.com wrote: On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. There are two problems actually. The first one is that changing an existing command from synchronous to asynchronous is an incompatible change because asynchronous commands semantics is different. For an example of possible problems please check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. The second problem is that the existing asynchronous interface in the monitor is incomplete and has never been used for real. Our plan is to use QAPI's async support, but that has not landed in master yet and iirc there wasn't consensus about it. I also think it's a bit late for its inclusion in 1.0 (and certainly not a candidate for stable). If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged. My problem is that the io thread keeps the global mutex during the wait, that's why the async monitor is perfect for what I want - this is exactly what it does. Let's not mix internal implementation details with what we want as an external interface. Can't you just make a vga_hw_screen_dump() specific callback? I don't understand how that would help - if the monitor command doesn't return (normal sync operation) then the mutex is never dropped, and any callback won't change that. On the other hand, thinking a bit about the reference to 623903 baloon bug, I don't see a problem: the change doesn't affect the semantics for any other device except qxl, which I've tested. For any other device, the only difference is that instead of: do_screen_dump call device specific implementation return It becomes do_screen_dump call device specific implementation (not qxl) callback called (always - not conditional, no one stores it except qxl) return I haven't looked at QAPI async support, but I understand it's a bit in the future. Yes, it's not for the immediate term.
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On Tue, 25 Oct 2011 12:13:09 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 10:31:48PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 19:29:37 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 17:13:14 +0200 Gerd Hoffmann kra...@redhat.com wrote: On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. There are two problems actually. The first one is that changing an existing command from synchronous to asynchronous is an incompatible change because asynchronous commands semantics is different. For an example of possible problems please check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. The second problem is that the existing asynchronous interface in the monitor is incomplete and has never been used for real. Our plan is to use QAPI's async support, but that has not landed in master yet and iirc there wasn't consensus about it. I also think it's a bit late for its inclusion in 1.0 (and certainly not a candidate for stable). If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged. My problem is that the io thread keeps the global mutex during the wait, that's why the async monitor is perfect for what I want - this is exactly what it does. Let's not mix internal implementation details with what we want as an external interface. Can't you just make a vga_hw_screen_dump() specific callback? I don't understand how that would help - if the monitor command doesn't return (normal sync operation) then the mutex is never dropped, and any callback won't change that. I'm trying to figure out a different solution. Our primary motivation for making a QMP command asynchronous must be to give clients the ability to keep issuing commands while slow commands are running. If that's not what you want nor your primary motivation, then you're probably taking the wrong approach. If that is what you want, then you'll have to add a new command, because changing from asynchronous to synchronous is an incompatible change _and_ you shouldn't use the current interface, because it's botched (actually, I think I'm going to drop it right now as my last series fixed its only user). Using a botched interface that doesn't do what's supposed to do but happens to solve a bug as a side effect will very likely end in tears at some point in the future. Now, I did some research and found this description of the problem: In testing my patches for 'add_client' support with SPICE, I noticed deadlock in the QEMU/SPICE code. It only happens if I close a SPICE client and then immediately reconnect within about 1 second. If I wait a couple of seconds before reconnecting the SPICE client I don't see the deadlock. (http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg02599.html) Is this accurate? Why does it _work_ after 1 second? On the other hand, thinking a bit about the reference to 623903 baloon bug, I don't see a problem: the change doesn't affect the semantics for any other device except qxl, which I've tested. For any other device, the only difference is that instead of: do_screen_dump call device specific implementation return It becomes do_screen_dump call device specific implementation (not qxl) callback called (always - not conditional, no one stores it except qxl) return I haven't looked at QAPI async support, but I understand it's a bit in the future. Yes, it's not for the immediate term.
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On Tue, Oct 25, 2011 at 10:51:30AM -0200, Luiz Capitulino wrote: On Tue, 25 Oct 2011 12:13:09 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 10:31:48PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 19:29:37 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 17:13:14 +0200 Gerd Hoffmann kra...@redhat.com wrote: On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. There are two problems actually. The first one is that changing an existing command from synchronous to asynchronous is an incompatible change because asynchronous commands semantics is different. For an example of possible problems please check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. The second problem is that the existing asynchronous interface in the monitor is incomplete and has never been used for real. Our plan is to use QAPI's async support, but that has not landed in master yet and iirc there wasn't consensus about it. I also think it's a bit late for its inclusion in 1.0 (and certainly not a candidate for stable). If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged. My problem is that the io thread keeps the global mutex during the wait, that's why the async monitor is perfect for what I want - this is exactly what it does. Let's not mix internal implementation details with what we want as an external interface. Can't you just make a vga_hw_screen_dump() specific callback? I don't understand how that would help - if the monitor command doesn't return (normal sync operation) then the mutex is never dropped, and any callback won't change that. I'm trying to figure out a different solution. Our primary motivation for making a QMP command asynchronous must be to give clients the ability to keep issuing commands while slow commands are running. If that's not what you want nor your primary motivation, then you're probably taking the wrong approach. That sounds right, and it was what I assumed the async monitor implementation would do, boy was I surprised to discover it doesn't do any such thing, but what it does do is return early, allow *other* io related events (select returns) to be handled, and keeps the serialized only-one-command-ongoing monitor usage. If that is what you want, then you'll have to add a new command, because changing from asynchronous to synchronous is an incompatible change _and_ you shouldn't use the current interface, because it's botched (actually, I think I'm going to drop it right now as my last series fixed its only user). This is not what I want. I understand of course that it is what one would design an async monitor / api to allow (it was after all what I thought async monitor support meant). Using a botched interface that doesn't do what's supposed to do but happens to solve a bug as a side effect will very likely end in tears at some point in the future. Right, but it's the opposite of the current case. Now, I did some research and found this description of the problem: In testing my patches for 'add_client' support with SPICE, I noticed deadlock in the QEMU/SPICE code. It only happens if I close a SPICE client and then immediately reconnect within about 1 second. If I wait a couple of seconds before reconnecting the SPICE client I don't see the deadlock. (http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg02599.html) Is this accurate? Why does it _work_ after 1 second? This is an unrelated bug, I know I said different on the thread but now rereading the callstack it is the channel_event locking workaround - and it is fixed by a spice-server only patch (which stops calling channel_event from the worker thread). On the other hand, thinking a bit about the reference to 623903 baloon bug, I don't see a problem: the change doesn't affect the semantics for any other device except qxl, which I've tested. For any other device, the only difference is that instead of: do_screen_dump call device specific implementation return It becomes do_screen_dump call device specific
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On Tue, 25 Oct 2011 15:21:11 +0200 Alon Levy al...@redhat.com wrote: On Tue, Oct 25, 2011 at 10:51:30AM -0200, Luiz Capitulino wrote: On Tue, 25 Oct 2011 12:13:09 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 10:31:48PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 19:29:37 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 17:13:14 +0200 Gerd Hoffmann kra...@redhat.com wrote: On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. There are two problems actually. The first one is that changing an existing command from synchronous to asynchronous is an incompatible change because asynchronous commands semantics is different. For an example of possible problems please check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. The second problem is that the existing asynchronous interface in the monitor is incomplete and has never been used for real. Our plan is to use QAPI's async support, but that has not landed in master yet and iirc there wasn't consensus about it. I also think it's a bit late for its inclusion in 1.0 (and certainly not a candidate for stable). If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged. My problem is that the io thread keeps the global mutex during the wait, that's why the async monitor is perfect for what I want - this is exactly what it does. Let's not mix internal implementation details with what we want as an external interface. Can't you just make a vga_hw_screen_dump() specific callback? I don't understand how that would help - if the monitor command doesn't return (normal sync operation) then the mutex is never dropped, and any callback won't change that. I'm trying to figure out a different solution. Our primary motivation for making a QMP command asynchronous must be to give clients the ability to keep issuing commands while slow commands are running. If that's not what you want nor your primary motivation, then you're probably taking the wrong approach. That sounds right, and it was what I assumed the async monitor implementation would do, boy was I surprised to discover it doesn't do any such thing, but what it does do is return early, allow *other* io related events (select returns) to be handled, and keeps the serialized only-one-command-ongoing monitor usage. Yes, if that turns out to be needed internally then we'll have to add such functionality (but probably not as part of QMP's async support iiuc). If that is what you want, then you'll have to add a new command, because changing from asynchronous to synchronous is an incompatible change _and_ you shouldn't use the current interface, because it's botched (actually, I think I'm going to drop it right now as my last series fixed its only user). This is not what I want. I understand of course that it is what one would design an async monitor / api to allow (it was after all what I thought async monitor support meant). Using a botched interface that doesn't do what's supposed to do but happens to solve a bug as a side effect will very likely end in tears at some point in the future. Right, but it's the opposite of the current case. Now, I did some research and found this description of the problem: In testing my patches for 'add_client' support with SPICE, I noticed deadlock in the QEMU/SPICE code. It only happens if I close a SPICE client and then immediately reconnect within about 1 second. If I wait a couple of seconds before reconnecting the SPICE client I don't see the deadlock. (http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg02599.html) Is this accurate? Why does it _work_ after 1 second? This is an unrelated bug, I know I said different on the thread but now rereading the callstack it is the channel_event locking workaround - and it is fixed by a spice-server only patch (which stops calling channel_event from the worker thread). So, can you try to describe
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. cheers, Gerd
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On Mon, 24 Oct 2011 17:13:14 +0200 Gerd Hoffmann kra...@redhat.com wrote: On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. There are two problems actually. The first one is that changing an existing command from synchronous to asynchronous is an incompatible change because asynchronous commands semantics is different. For an example of possible problems please check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. The second problem is that the existing asynchronous interface in the monitor is incomplete and has never been used for real. Our plan is to use QAPI's async support, but that has not landed in master yet and iirc there wasn't consensus about it. I also think it's a bit late for its inclusion in 1.0 (and certainly not a candidate for stable). If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged.
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 17:13:14 +0200 Gerd Hoffmann kra...@redhat.com wrote: On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. There are two problems actually. The first one is that changing an existing command from synchronous to asynchronous is an incompatible change because asynchronous commands semantics is different. For an example of possible problems please check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. The second problem is that the existing asynchronous interface in the monitor is incomplete and has never been used for real. Our plan is to use QAPI's async support, but that has not landed in master yet and iirc there wasn't consensus about it. I also think it's a bit late for its inclusion in 1.0 (and certainly not a candidate for stable). If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged. My problem is that the io thread keeps the global mutex during the wait, that's why the async monitor is perfect for what I want - this is exactly what it does. I haven't looked at QAPI async support, but I understand it's a bit in the future.
Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async
On Mon, 24 Oct 2011 19:29:37 +0200 Alon Levy al...@redhat.com wrote: On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote: On Mon, 24 Oct 2011 17:13:14 +0200 Gerd Hoffmann kra...@redhat.com wrote: On 10/24/11 14:02, Alon Levy wrote: Make screen_dump monitor command an async command to allow next for qxl to implement it as a initiating call to red_worker and completion on callback, to fix a deadlock when issueing a screendump command via libvirt while connected with a libvirt controlled spice-gtk client. Approach looks reasonable to me. Patch breaks the build though, you've missed a bunch of screen_dump functions in non-x86 targets. There are two problems actually. The first one is that changing an existing command from synchronous to asynchronous is an incompatible change because asynchronous commands semantics is different. For an example of possible problems please check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. The second problem is that the existing asynchronous interface in the monitor is incomplete and has never been used for real. Our plan is to use QAPI's async support, but that has not landed in master yet and iirc there wasn't consensus about it. I also think it's a bit late for its inclusion in 1.0 (and certainly not a candidate for stable). If all you need here is to delay sending the response, then maybe the current interface could work (although I honestly don't trust it and regret not having dropped it). Otherwise our only choice would be to work on getting the QAPI async support merged. My problem is that the io thread keeps the global mutex during the wait, that's why the async monitor is perfect for what I want - this is exactly what it does. Let's not mix internal implementation details with what we want as an external interface. Can't you just make a vga_hw_screen_dump() specific callback? I haven't looked at QAPI async support, but I understand it's a bit in the future. Yes, it's not for the immediate term.