On Thu, 29 Oct 2020 21:02:05 +0800 Jason Wang <jasow...@redhat.com> wrote:
> On 2020/10/29 8:08 PM, Stefan Hajnoczi wrote:
> > Here are notes from the session:
> >
> > protocol stability:
> >     * vhost-user already exists for existing third-party applications
> >     * vfio-user is more general but will take more time to develop
> >     * libvfio-user can be provided to allow device implementations
> >
> > management:
> >     * Should QEMU launch device emulation processes?
> >     * Nicer user experience
> >     * Technical blockers: forking, hotplug, security is hard once
> >       QEMU has started running
> >     * Probably requires a new process model with a long-running
> >       QEMU management process proxying QMP requests to the emulator process
> >
> > migration:
> >     * dbus-vmstate
> >     * VFIO live migration ioctls
> >     * Source device can continue if migration fails
> >     * Opaque blobs are transferred to destination, destination can
> >       fail migration if it decides the blobs are incompatible
>
>
> I'm not sure this can work:
>
> 1) Reading something that is opaque to userspace is probably a hint of
> bad uAPI design
> 2) Did qemu even try to migrate opaque blobs before? It's probably a bad
> design of migration protocol as well.
>
> It looks to me have a migration driver in qemu that can clearly define
> each byte in the migration stream is a better approach.

Any time during the previous two years of development might have been
a more appropriate time to express your doubts.

Note that we're not talking about vDPA devices here; we're talking
about arbitrary devices with arbitrary state. Some degree of migration
support for assigned devices can be implemented in QEMU; Alex Graf
proved this several years ago with i40evf. Years later, we don't have
any vendors proposing device-specific migration code for QEMU.

Clearly we're also trying to account for proprietary devices where,
even for suspend/resume support, proprietary drivers may be required
for manipulating that internal state.

When we move device emulation outside of QEMU, whether into the kernel
or to other userspace processes, does it still make sense to require
code in QEMU to support interpretation of that device for migration
purposes? That seems counter to the actual goal of out-of-process
devices and clearly hasn't worked for us so far. Thanks,

Alex