Hi all,

I have deployed OpenStack with the VMware driver. A single nova-compute service connects to the VMware deployment, which contains about 3000 VMs. The database is MySQL.

InstanceList.get_by_host raises an RPC timeout error when ComputeManager.init_host() and the _sync_power_states periodic task run.

With the nova VMware driver, one nova-compute host currently maps to the whole VMware deployment, which may contain several clusters. When ComputeManager calls InstanceList.get_by_host, nova-compute makes an RPC call to nova-conductor, and nova-conductor fetches every instance in the whole VMware deployment at once; in my case that is 3000 instances. The long-running SQL query may be what causes the RPC call from nova-compute to nova-conductor to time out. We only see this issue with the VMware driver.
Bug: https://bugs.launchpad.net/nova/+bug/1420662
Patch: https://review.openstack.org/#/c/155676/

In the patch, I fix the issue by splitting the one large RPC request into multiple small RPC requests using a pagination mechanism. However, sahid thinks this looks like a hack and that we need a real pattern to handle the problem.

If you have a better idea, please let me know. Feel free to discuss it.

Thanks.

Best Regards
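
P.S. To make the pagination idea concrete, here is a minimal, self-contained sketch of splitting one large fetch into several bounded ones with a limit and a marker. It is not the code from the patch: fetch_page(), get_by_host_paged(), the in-memory INSTANCES list, the host name and the page size are all hypothetical stand-ins for illustration, not nova or nova-conductor APIs.

```python
# Illustrative sketch only. INSTANCES, fetch_page() and get_by_host_paged()
# are hypothetical stand-ins; they are not nova or nova-conductor APIs.

# Pretend database table: 3000 instances that all belong to one compute host.
INSTANCES = [
    {"uuid": "inst-%04d" % i, "host": "vmware-compute"} for i in range(3000)
]


def fetch_page(host, limit, marker=None):
    """Stand-in for one small RPC call.

    Returns at most `limit` instances of `host`, starting right after the
    instance whose uuid equals `marker` (or from the beginning if None).
    """
    rows = [inst for inst in INSTANCES if inst["host"] == host]
    start = 0
    if marker is not None:
        uuids = [row["uuid"] for row in rows]
        start = uuids.index(marker) + 1
    return rows[start:start + limit]


def get_by_host_paged(host, page_size=500):
    """Collect all instances of `host` using several bounded requests."""
    result = []
    marker = None
    while True:
        page = fetch_page(host, page_size, marker)
        result.extend(page)
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        # The last uuid of this page becomes the marker for the next request.
        marker = page[-1]["uuid"]
    return result
```

With a page size of 500, the 3000 instances would be fetched in several small requests instead of one large one, so no single request has to carry the whole result set. Whether this is the right pattern for nova is exactly the open question.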
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev