I ran some tests on the Charmed OpenStack Caracal release. When the 'watcher_cluster_data_model_collectors' periods are shorter than a continuous audit's interval, subsequent live migrations of the same VM work correctly; I was not able to reproduce the problem in that scenario.
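For reference, this is roughly how the two intervals relate in configuration. The option group and 'period' option come from the Watcher configuration reference; the audit interval is set when the continuous audit is created. The values below are illustrative only, chosen so that the data model is refreshed more often than the audit runs:

```
# /etc/watcher/watcher.conf (illustrative values)
[watcher_cluster_data_model_collectors.compute]
# Refresh the compute cluster data model every 30 minutes.
period = 1800
```

```
# Create a continuous audit that runs every hour, i.e. less often
# than the model refresh above ("my-template" is a placeholder name).
openstack optimize audit create --audit-template my-template \
    --audit-type CONTINUOUS --interval 3600
```

With the inverse relationship (e.g. period=3600 and interval=60, as in the failing test below), the audit can run many times against a model that no longer reflects the previous migration.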
When testing with the 'watcher_cluster_data_model_collectors' periods set to 3600s and an audit interval of 60s, the first VM migration succeeds, but subsequent migrations fail because the Watcher cluster data model is out of sync: Watcher incorrectly picks a migration target, which happens to be the node currently hosting the VM. Note that my test environment has only three compute nodes.

I submitted a patch for the Watcher charm [1][2], lowering the default data model refresh interval from 60 to 30 minutes, and added comments about the implications of using a long refresh interval. On the upstream Watcher project, I proposed a minor documentation update [3] and am looking forward to comments/reviews.

[1] https://review.opendev.org/c/openstack/charm-watcher/+/973822
[2] https://launchpad.net/bugs/2138626
[3] https://review.opendev.org/c/openstack/watcher/+/973839

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2131043

Title:
  watcher decision engine fails when evaluating previously migrated instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/watcher/+bug/2131043/+subscriptions

--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
