On Mon, Sep 03, 2018 at 03:41:16PM +0100, Daniel P. Berrangé wrote: > On Mon, Sep 03, 2018 at 09:30:52AM -0500, Eric Blake wrote: > > On 09/03/2018 08:31 AM, Markus Armbruster wrote: > > > > > Example: > > > > > > client sends in-band command #1 > > > QEMU reads and queues > > > QEMU dequeues in-band command #1 > > > in-band command #1 starts executing, but it's slooow > > > client sends in-band command #2 > > > QEMU reads and queues > > > ... > > > client sends in-band command #8 > > > QEMU reads, queues and suspends the monitor > > > client sends out-of-band command > > > --> time passes... > > > in-band command #1 completes, QEMU sends reply > > > QEMU dequeues in-band command #2, resumes the monitor > > > in-band command #2 starts executing > > > QEMU reads and executes out-of-band command > > > out-of-band command completes, QEMU sends reply > > > in-band command #2 completes, QEMU sends reply > > > ... same for remaining in-band commands ... > > > > > > The out-of-band command gets stuck behind the in-band commands.
(It's a shame of me to have just noticed that the out-of-band command will be stuck after we dropped the COMMAND_DROP event... so now I agree it's not that ideal any more to drop the event but maybe still preferable) > > > > > > The client can avoid this by managing the flow of in-band commands: have > > > no more than 7 in flight, so the monitor never gets suspended. > > > > > > This is a potentially useful thing to do for clients, isn't it? > > > > > > Eric, Daniel, is it something libvirt would do? > > > > Right now, libvirt serializes commands - it never sends a QMP command until > > the previous command's response has been processed. But that may not help > > much, since libvirt does not send OOB commands. > > Note that is not merely due to the QMP monitor restriction either. > > Libvirt serializes all its public APIs that can change state of a running > domain. It usually aims to allow read-only APIs to be run in parallel with > APIs that change state. > > The exception to the rule right now are some of the migration APIs which > we allow to be invoked to manage the migration process. > > > I guess when we are designing what libvirt should do, and deciding WHEN it > > should send OOB commands, we have the luxury of designing libvirt to enforce > > how many in-flight in-band commands it will ever have pending at once > > (whether the current 'at most 1', or even if we make it more parallel to 'at > > most 7'), so that we can still be ensured that the OOB command will be > > processed without being stuck in the queue of suspended in-band commands. > > If we never send more than one in-band at a time, then it's not a concern > > how deep the qemu queue is; but if we do want libvirt to start parallel > > in-band commands, then you are right that having a way to learn the qemu > > queue depth is programmatically more precise than just guessing the maximum > > depth. But it's also hard to argue we need that complexity if we don't have > > an immediate use envisioned for it. > > In terms of what libvirt would want to parallelize, I think it is reasonable > to consider any of the query-XXXX commands desirable. Other stuff is likely > to remain serialized from libvirt's side. IMHO concurrency won't help much now even for query commands, since our current concurrency is still "partly" - the executions of query commands (which is the most time consuming part) will still be done sequentially, so even if we send multiple query commands in parallel (without waiting for a response of any sent commands), the total time used for the list of commands would be mostly the same. My understanding for why we have such a queue length now is that it came from a security concern: after we have a queue, we need that queue length to limit the memory usages for the QMP server. Though that might not help much for real users like Libvirt, it's majorly serving as a way to protect QEMU QMP from being attacked or from being turned down by a buggy QMP client. But I agree now that the queue length information might still be helpful some day. Maybe, we can hide that until we support executing commands in parallel for some of them. Regards, -- Peter Xu