On 11/08/2023 13:29, Cédric Le Goater wrote:
On 8/8/23 08:23, Avihai Horon wrote:
On 07/08/2023 18:53, Cédric Le Goater wrote:
[ Adding Juan and Peter for their awareness ]
On 8/2/23 10:14, Avihai Horon wrote:
Changing the device state from STOP_COPY to STOP can take time as the
device may need to free resources and do other operations as part of the
transition. Currently, this is done in vfio_save_complete_precopy() and
therefore it is counted in the migration downtime.
To avoid this, change the device state from STOP_COPY to STOP in
vfio_save_cleanup(), which is called after migration has completed and
thus is not part of migration downtime.
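
To make the effect on downtime concrete, here is a minimal, self-contained
sketch of the two orderings. It is a toy model and not QEMU code: the
struct, the helper names and the millisecond costs are hypothetical, and
"downtime" is simply modelled as time spent before migration is marked
complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the device states named in the thread. */
enum dev_state { DEV_RUNNING, DEV_STOP_COPY, DEV_STOP };

/* Toy model of one migration; not a QEMU structure. */
struct sim {
    enum dev_state state;
    int downtime_ms;      /* time charged while migration is not yet complete */
    bool migration_done;
};

/* Spend simulated time; it counts as downtime only before completion. */
static void spend(struct sim *s, int ms)
{
    if (!s->migration_done) {
        s->downtime_ms += ms;
    }
}

/* Change the device state, paying an assumed transition cost. */
static void set_state(struct sim *s, enum dev_state next, int cost_ms)
{
    spend(s, cost_ms);
    s->state = next;
}

/*
 * Return the simulated downtime of one migration.
 * stop_in_cleanup == false: STOP_COPY->STOP happens in the completion step.
 * stop_in_cleanup == true:  STOP_COPY->STOP is deferred to the cleanup step.
 */
static int simulate_downtime(bool stop_in_cleanup, int final_data_ms,
                             int teardown_ms)
{
    struct sim s = { DEV_RUNNING, 0, false };

    /* Completion step: enter STOP_COPY and send the final device data. */
    set_state(&s, DEV_STOP_COPY, 0);
    spend(&s, final_data_ms);
    if (!stop_in_cleanup) {
        set_state(&s, DEV_STOP, teardown_ms);
    }

    s.migration_done = true;

    /* Cleanup step: runs after completion, so it is not charged as downtime. */
    if (stop_in_cleanup) {
        set_state(&s, DEV_STOP, teardown_ms);
    }

    /* Either ordering leaves the device in STOP. */
    assert(s.state == DEV_STOP);
    return s.downtime_ms;
}
```

With an assumed 20 ms for the final data and 30 ms for the teardown,
simulate_downtime(false, 20, 30) charges both to downtime, while
simulate_downtime(true, 20, 30) charges only the final data transfer.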
What bothers me is that this looks like a device specific optimization
True, currently it helps mlx5, but this change is based on the
assumption that, in general, VFIO devices are likely to free
resources when transitioning from STOP_COPY to STOP.
So I think this is a good change to have in any case.
and we are losing the error part.
I don't think we lose the error part.
AFAIU, the crucial part is transitioning to STOP_COPY and sending the
final data.
If that's done successfully, then migration is successful.
The STOP_COPY->STOP transition is done as part of the cleanup flow,
after the migration is completed -- i.e., failure in it does not
affect the success of migration.
Furthermore, if there is an error in the STOP_COPY->STOP transition,
then it's reported by vfio_migration_set_state().
It is indeed. I am nit-picking. Pushed on :
https://github.com/legoater/qemu/tree/vfio-next
It can still be updated before I send a PR. I also provided custom
rpms to our QE team for extra tests.
Should follow Dynamic MSI-X allocation [1] and Joao's series regarding
vIOMMU [2] but first I will take some PTO. See you in a couple of weeks !
Thanks, have a pleasant vacation!
Cheers,
C.
[1] https://lore.kernel.org/qemu-devel/20230727072410.135743-1-jing2....@intel.com/
[2] https://lore.kernel.org/qemu-devel/20230622214845.3980-1-joao.m.mart...@oracle.com/