= Motivation =

QEMU startup and initial configuration were designed many years ago for
a much, much simpler QEMU.  They have since changed beyond recognition
to adapt to new needs.  There was no real redesign.  Adaptation to new
needs has become more and more difficult.  A recent example of
"difficult" is Damien's "[RFC PATCH v3 0/5] QMP support for
cold-plugging devices".

I think it's time for a redesign.

= What users want for initial configuration =

1. QMP only

   Management applications need to use QMP for monitoring anyway.  They
   may want to use it for initial configuration, too.  Libvirt does.

   They still need to bootstrap a QMP monitor, and for that, CLI is fine
   as long as it's simple and stable.

2. CLI and configuration files

   Human users want a CLI and configuration files.

   CLI is good for quick tweaks, and to explore.

   For more permanent, non-trivial configuration, configuration files
   are more suitable, because they are easier to read, edit, and
   document than long command lines.

= What we have for initial configuration =

Half of 1. and half of 2., satisfying nobody's needs.

Management applications need to do a lot of non-trivial initial
configuration with the CLI.

Human users struggle with inconsistent syntax, insufficiently expressive
configuration files, and huge command lines.

= Why startup matters for initial configuration =

Startup and initial configuration are joined at the hip: the way startup
works limits how initial configuration can work.

Consider the startup we had before -preconfig:

0. Basic, configuration-independent initialization

1. Parse and partially process CLI

2. Startup with the remaining CLI processing mixed in

3. Startup complete, run main loop

Note:

* CLI is processed out of order.  Few people understand the order, and
  only while staring at the code.  Pretty much every time we mess with
  the order, we break something.

* QMP monitors become available only at 3, i.e. after startup is
  complete.  Precludes use of QMP for initial configuration.

The current startup with -preconfig permits a bit of initial
configuration with QMP, at the cost of complicating startup code and
interfaces.  More in "Appendix: Why further incremental change feels
impractical".

= Ways forward =

0. This is too hard, let's go shopping

   I can't really blame people for not wanting to mess with startup.
   It *is* hard, and there's plenty of other useful stuff to do.

   However, I believe we have to mess with it, i.e. this is not actually
   a way forward.

1. Improve what we have step by step, managing incompatible changes

   We tried.  It has eaten substantial time and energy, it has
   complicated code and interfaces further, and we're still nowhere near
   where we need to be.  Does not feel like a practical way forward to
   me.

   If you don't believe me, read the long version in "Appendix: Why
   further incremental change feels impractical".

2. Start over with a QMP-only QEMU, keep the traditional QEMU forever

   A QMP-only QEMU isn't just a matter of providing an alternate main()
   where we ditch most of the CLI for QMP replacements.

   It takes a rework of the startup code as "phase transitions"
   triggered by QMP commands.  Similar to how -preconfig triggers some
   startup work.

   To reduce code duplication, we'll probably want to rework the
   traditional QEMU's startup code some, too.

3. Start over with the aim to replace the traditional QEMU

   I don't want an end game where we have two QEMUs, one with bad CLI
   and limited QMP, and one without real CLI and decent QMP.  I want one
   QEMU with decent CLI and decent QMP.

   I propose that a QEMU with decent CLI and decent QMP isn't really
   harder than a QMP-only QEMU.  CLI is just another transport that
   happens to be one-way, i.e. the QMP commands encoded as CLI options
   can't return a value.  Same for configuration files.

   I believe the parts that are actually hard are shared with QMP-only:
   QAPIfying initial configuration and reworking the startup code as
   phase transitions.

   When that decent QEMU is ready, we can deprecate and later drop the
   traditional one.

= A design proposal =

0. Basic, configuration-independent initialization

1. Start main loop

2. Feed it CLI left to right.  Each option runs a handler just like each
   QMP command does.

   Options that read a configuration file inject the file into the feed.

   Options that create a monitor create it suspended.

   Options may advance the phase / run state, they may require certain
   phase(s), and semantics may depend on the phase.

3. When we're done with CLI, resume any monitors we created.

   As a convenience, provide an easy way to auto-advance phase / run
   state to "guest runs" right here.  The traditional way would be to
   auto-advance unless the user gives -S.

4. Monitors now feed commands to the main loop.  Commands may advance
   the phase / run state, and they may require certain phase(s), and
   semantics may depend on the phase.

   In particular, command "cont" advances phase / run state to "guest
   runs".

Users can do as much or as little with the CLI as they want.  A
management application might want to set up a QMP monitor and no more.

= A way to get to the proposed design =

Create an alternate QEMU binary with the CLI taken out and shot:

0. Basic, configuration-independent initialization

2. Startup

3. Startup complete, run main loop

Maybe hardcode a few things so we can actually run something without any
configuration.  To be ditched when we can configure again.

Rework 2. Startup as a sequence of state transitions we can later
trigger from commands / options.

Create a quasi-monitor so we can feed QMP commands to the main loop.

Make QMP command "cont" transition to "guest runs".

Feed "cont" to the main loop, and drop 2. Startup.

Create QAPI/CLI infrastructure, so we can define CLI options like QMP
commands.  I have (dusty and likely quite bit-rotten) RFC patches.

Feed CLI options to the main loop, and only then feed "cont".

Create option -S to suppress feeding "cont".

Create / reuse / rework QMP commands and CLI options to create QMP
monitors.  Monitors created by CLI start suspended, and get resumed only
when we're done with the CLI.  This is so monitor commands can't race
with the CLI.

Create CLI option to read a configuration file, and feed its contents to
the main loop.  Recursively.  We'll likely want nicer syntax than CLI or
QMP for configuration files.

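To make the proposed design a bit more concrete, here is a toy model in
Python.  It is only a sketch, not QEMU code: the class names, the phase
names other than "guest runs", and the commands "monitor-add" and
"machine-init" are all made up for illustration.  It shows one handler
table serving both CLI options and monitor commands, per-command phase
requirements, monitors that start suspended and are resumed once the
CLI is done, and auto-advancing to "guest runs" unless asked not to
(the -S behavior).

```python
from collections import deque
from enum import IntEnum


class Phase(IntEnum):
    # Hypothetical phase names; the proposal only names "guest runs".
    PRECONFIG = 0
    MACHINE_READY = 1
    GUEST_RUNS = 2


class Monitor:
    """Toy monitor: queues (command, argument) pairs, starts suspended."""

    def __init__(self, name):
        self.name = name
        self.suspended = True
        self.pending = deque()


class Vm:
    def __init__(self):
        self.phase = Phase.PRECONFIG
        self.monitors = []
        self.log = []
        # One table for CLI options and monitor commands alike.
        # Each entry: (phases where the command is allowed, handler).
        # "monitor-add" and "machine-init" are invented for this sketch.
        self.handlers = {
            "monitor-add": (set(Phase), self.do_monitor_add),
            "machine-init": ({Phase.PRECONFIG}, self.do_machine_init),
            "cont": (set(Phase), self.do_cont),
        }

    def dispatch(self, name, arg=None):
        allowed, handler = self.handlers[name]
        if self.phase not in allowed:
            raise RuntimeError(f"{name} not allowed in phase {self.phase.name}")
        handler(arg)
        self.log.append(name)

    def do_monitor_add(self, arg):
        self.monitors.append(Monitor(arg))

    def do_machine_init(self, arg):
        self.phase = Phase.MACHINE_READY

    def do_cont(self, arg):
        # "cont" advances the phase / run state to "guest runs".
        self.phase = Phase.GUEST_RUNS

    def run_cli(self, options, stopped=False):
        # Feed CLI options left to right, in order.
        for name, arg in options:
            self.dispatch(name, arg)
        # Done with CLI: resume the monitors it created.
        for mon in self.monitors:
            mon.suspended = False
        # Auto-advance unless suppressed (think -S).
        if not stopped and self.phase is not Phase.GUEST_RUNS:
            self.dispatch("cont")

    def run_monitors(self):
        # Monitors now feed commands; suspended ones are skipped.
        for mon in self.monitors:
            while not mon.suspended and mon.pending:
                self.dispatch(*mon.pending.popleft())


# A management application sets up a monitor on the CLI and no more,
# then does the rest of initial configuration through the monitor.
vm = Vm()
vm.run_cli([("monitor-add", "qmp0")], stopped=True)
vm.monitors[0].pending.extend([("machine-init", None), ("cont", None)])
vm.run_monitors()
```

Note how the CLI is just another (one-way) source of commands: run_cli()
and run_monitors() both end up in dispatch(), so phase checks live in
exactly one place.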
At this point, we have all the infrastructure we need.  What's still
missing is filling gaps in QMP and in CLI.

I believe "QMP only" would only be marginally easier.  Basically, it
processes (a minimal) CLI before the main loop, not in it.  We could
leave it unQAPIfied.  I can't quite see offhand how to do configuration
files sanely.

Stretch goals:

* Throw in QAPIfying HMP.

* Dust with syntactic sugar (sparingly, please) to make CLI/HMP easier
  to use for humans.

= Initial exploration, a.k.a. talk's cheap, show me the code =

I'm going to post "[PATCH RFC 00/11] vl: Explore redesign of startup"
shortly.

= Appendix: Why further incremental change feels impractical =

Recall from "Why startup matters for initial configuration" how the
traditional startup code made initial configuration with QMP impossible.

We added -preconfig (v3.0.0 in 2018) to support a bit of it.  If given,
we delay part of the startup until QMP command x-exit-preconfig, like
this:

0. Basic, configuration-independent initialization

1. Parse and partially process CLI left to right

2. Some startup with some of the remaining CLI processing mixed in

3. Run main loop

3.1. Execute QMP commands that are allowed before x-exit-preconfig

3.2. x-exit-preconfig: Remaining startup with the remaining CLI
     processing mixed in

3.3. Startup complete

Note:

* (Out of order) CLI processing is interleaved with (in-order) QMP
  execution.  One word: ugh!

* QMP becomes available earlier, but still not early enough to support
  "QMP only".

* The initial implementation of -preconfig was a hack.  Getting to the
  current implementation took effort.  Actually, *any* non-trivial
  change in this area now takes more effort than it should.  See also
  "pretty much every time [...] we break something" above.

We can try to push more and more startup work from 2. to 3 until "QMP
only" becomes possible.  Damien's "[RFC PATCH v3 0/5] QMP support for
cold-plugging devices" is a modest step in this direction.
Since the work to be pushed needs to move to a different point than
x-exit-preconfig, we need another command x-machine-init to provide a
suitable point.

At this point, we complicated startup code and interfaces (and arguably
compromised CLI sanity further), with more of the same needed to get
anywhere near "QMP only".  There is plenty of doubt on the interfaces
going around, and that's why -preconfig is still not a stable interface.

Crafting a big change in small steps sounds great.  It isn't when we
have to make things worse before they can get better, and every step is
painfully slow because it's just too hard, and the end state we aim for
isn't really what we want.