Hi all:
I deployed OpenStack with the VMware driver. One nova-compute service
connects to a VMware deployment that contains about 3000 VMs. I use
MySQL. The InstanceList.get_by_host method raises an RPC timeout error
when ComputeManager.init_host() and the _sync_power_states periodic task
run.
Currently, in the nova VMware driver, one nova-compute host maps to the
whole VMware deployment, which may contain several clusters. When
InstanceList.get_by_host is executed in ComputeManager, nova-compute
makes an RPC call to nova-conductor, and nova-conductor fetches all the
instances of the whole VMware deployment at once; in my case, that is
3000 instances. The long-running SQL query may lead to the RPC timeout
from nova-compute to nova-conductor. We only see this issue with the
VMware driver.
https://bugs.launchpad.net/nova/+bug/1420662
https://review.openstack.org/#/c/155676/
In the patch I split the large RPC request into multiple small RPC
requests using a pagination mechanism in order to fix this issue, but
sahid thinks it looks like a hack and that we need a real pattern to
handle this problem.
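For illustration, the pagination idea can be sketched roughly as below.
This is a standalone sketch under my own assumptions, not the code in the
review: fetch_page, limit and marker are made-up names standing in for one
small call to nova-conductor, and the fake conductor only exists so the
loop can be run without a real deployment.

```python
import bisect


def get_instances_paginated(fetch_page, host, limit=500):
    """Collect all instances for `host` using several small requests.

    `fetch_page(host, limit, marker)` is a hypothetical callable that
    returns at most `limit` instance UUIDs sorting after `marker`
    (or from the start when `marker` is None).
    """
    instances = []
    marker = None
    while True:
        page = fetch_page(host, limit, marker)
        instances.extend(page)
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < limit:
            return instances
        # Resume the next request after the last item already seen.
        marker = page[-1]


def make_fake_conductor(all_uuids):
    """Build an in-memory stand-in for the conductor side (assumption)."""
    ordered = sorted(all_uuids)

    def fetch_page(host, limit, marker):
        # Find the first entry strictly after the marker.
        start = 0 if marker is None else bisect.bisect_right(ordered, marker)
        return ordered[start:start + limit]

    return fetch_page


if __name__ == "__main__":
    uuids = ["uuid-%04d" % i for i in range(3000)]
    fetch = make_fake_conductor(uuids)
    result = get_instances_paginated(fetch, "vmware-host", limit=500)
    print(len(result))
```

Each request then returns a bounded number of rows, so no single call has
to wait for a query over all instances of the deployment.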
If you have a better idea, please let me know.
Feel free to discuss it. Thanks.
Best Regards
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev