I ran some tests on the Charmed OpenStack Caracal release. When the
'watcher_cluster_data_model_collectors' periods are shorter than a
continuous audit's interval, subsequent live migrations of the same VM
work fine; I was not able to reproduce the problem in that scenario.

When testing with the 'watcher_cluster_data_model_collectors' periods
set to 3600s and an audit interval of 60s, the first VM migration
succeeds, but subsequent migrations fail because the Watcher cluster
data model is out of sync: Watcher incorrectly picks a migration
target, which happens to be the node currently hosting the VM. Note
that my test environment has only three compute nodes.
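
For reference, the failing scenario corresponds to a configuration
along these lines. The section/option layout below is a sketch from my
environment's notes, not an authoritative reference; exact names can
differ per collector plugin and release, so please double-check against
the Watcher configuration reference:

    [watcher_cluster_data_model_collectors]
    # Seconds between data model rebuilds. With 3600 here and a 60s
    # continuous audit interval, the model goes stale between audits,
    # so later audits still see the VM on its pre-migration host.
    period = 3600
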


I submitted a patch for the Watcher charm [1][2], lowering the default
data model refresh interval from 60 to 30 minutes, and added some extra
comments about the implications of using a long refresh interval.

Regarding the upstream Watcher project, I proposed a minor documentation
update [3], and I am looking forward to comments/reviews.

[1] https://review.opendev.org/c/openstack/charm-watcher/+/973822
[2] https://launchpad.net/bugs/2138626
[3] https://review.opendev.org/c/openstack/watcher/+/973839

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2131043

Title:
  watcher decision engine fails when evaluating previously migrated
  instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/watcher/+bug/2131043/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs