Re: [openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
On 5/18/2016 3:05 PM, melanie witt wrote:
> On Wed, 18 May 2016 14:30:00 -0500, Matt Riedemann wrote:
>> While convenient as a workaround, I'm not in favor of the idea of adding
>> something to the REST API so a user can force refresh the connection info -
>> this is a bug and leaks information out of the API about how the cloud is
>> configured. If you didn't have volumes attached to the instance at all then
>> this wouldn't matter.
>>
>> I think in an earlier version of the patch it was reloading and checking
>> the connection info every time the BDM list was retrieved for an instance,
>> which was a major issue for normal operations where this isn't a problem.
>> Since it's been scoped to just start/reboot operations, it's better, and
>> there are comments in the patch to make it a bit more efficient also (avoid
>> calling the DB multiple times for the same information).
>>
>> I'm not totally opposed to doing the refresh on start/reboot. We could make
>> it configurable, so if you're using a storage server backend where the IP
>> might change, then set this flag, but that's a bit clunky. And a periodic
>> task wouldn't help us out. I'm open to other ideas if anyone has them.
>
> I was thinking it may be possible to do something similar to how network
> info is periodically refreshed in _heal_instance_info_cache [1]. The task
> interval is configurable (defaults to 60 seconds) and it works on a queue of
> instances such that one is refreshed per period, to control the load on the
> host.
>
> To avoid doing anything for storage backends that can't change IP, maybe we
> could make the task return immediately after calling a driver method that
> would indicate whether the storage backend can be affected by an IP change.
> There would be some delay until the task runs on an affected instance,
> though.
>
> -melanie
>
> [1] https://github.com/openstack/nova/blob/9a05d38/nova/compute/manager.py#L5549

I like this idea. Sure it's a delay, but it resolves the problem eventually
and doesn't add overhead to the start/reboot operations, overhead that should
mostly be unnecessary if things are working. I like the short-circuit idea
too, although that's a nice-to-have. A deployer can always disable the
periodic task if they don't want it running.

--

Thanks,

Matt Riedemann

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
On Wed, 18 May 2016 14:30:00 -0500, Matt Riedemann wrote:
> While convenient as a workaround, I'm not in favor of the idea of adding
> something to the REST API so a user can force refresh the connection info -
> this is a bug and leaks information out of the API about how the cloud is
> configured. If you didn't have volumes attached to the instance at all then
> this wouldn't matter.
>
> I think in an earlier version of the patch it was reloading and checking the
> connection info every time the BDM list was retrieved for an instance, which
> was a major issue for normal operations where this isn't a problem. Since
> it's been scoped to just start/reboot operations, it's better, and there are
> comments in the patch to make it a bit more efficient also (avoid calling
> the DB multiple times for the same information).
>
> I'm not totally opposed to doing the refresh on start/reboot. We could make
> it configurable, so if you're using a storage server backend where the IP
> might change, then set this flag, but that's a bit clunky. And a periodic
> task wouldn't help us out. I'm open to other ideas if anyone has them.

I was thinking it may be possible to do something similar to how network info
is periodically refreshed in _heal_instance_info_cache [1]. The task interval
is configurable (defaults to 60 seconds) and it works on a queue of instances
such that one is refreshed per period, to control the load on the host.

To avoid doing anything for storage backends that can't change IP, maybe we
could make the task return immediately after calling a driver method that
would indicate whether the storage backend can be affected by an IP change.
There would be some delay until the task runs on an affected instance,
though.

-melanie

[1] https://github.com/openstack/nova/blob/9a05d38/nova/compute/manager.py#L5549
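[For readers following along: the driver-side short-circuit suggested above
could look roughly like the following. The method name and classes are
hypothetical illustrations, not actual Nova driver API.]

```python
class BaseDriverSketch:
    """Hypothetical driver interface for the short-circuit check."""

    def volume_backend_ip_can_change(self):
        # Default: assume a static backend, so the periodic task
        # returns immediately and costs essentially nothing.
        return False


class CephBackedDriverSketch(BaseDriverSketch):
    def volume_backend_ip_can_change(self):
        # Ceph monitors are addressed by IP inside the stored
        # connection_info, so a mon IP change can strand volumes.
        return True


def run_heal_task(driver):
    """The periodic task's entry point: bail out early unless the
    backend is one that an IP change can actually affect."""
    if not driver.volume_backend_ip_can_change():
        return 'skipped'
    # ... refresh one instance's connection_info here ...
    return 'refreshed'
```

This keeps the cost for unaffected deployments at a single cheap method call
per period, addressing the "waste of performance in the normal case" concern
from the earlier review.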
Re: [openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
On 5/16/2016 8:39 PM, zhou.b...@zte.com.cn wrote:
> Hi all,
>
> I ran into the problem described in
> https://bugs.launchpad.net/cinder/+bug/1452641, and my colleague hit a
> similar one described in https://bugs.launchpad.net/nova/+bug/1581367. Both
> are about the storage backend IP changing. With a storage backend such as
> Ceph or an IP SAN, when the backend's IP changes, the volumes attached to
> VMs become unavailable.
>
> I previously proposed auto-checking the consistency between the IP recorded
> in nova's bdm table and the storage backend, submitted in
> https://review.openstack.org/#/c/289813/. Reviewers pointed out that it
> wastes performance in the normal case and that a regularly-called function
> is not a good place to do these checks. I agree with that feedback, but the
> bug keeps troubling me and my colleagues.
>
> I think we could add an option to the nova API, such as "nova reboot
> --refresh-conn", to manually refresh the VM's bdm info when the bug occurs.
> The "--refresh-conn" flag would be parsed and passed down to the
> "reboot_instance" function in nova-compute. Without auto-checking, this
> would be more flexible and efficient. I'd appreciate all of your valued
> opinions.
>
> The fake code in nova-compute would look like this:
>
>     def reboot_instance(self, context, instance, block_device_info,
>                         reboot_type, refresh_conn=False):
>         """Reboot an instance on this host."""
>         ...
>         block_device_info = self._get_instance_block_device_info(
>             context, instance, refresh_conn)
>
> Thank you. Related links:
>
> https://bugs.launchpad.net/cinder/+bug/1452641
> https://bugs.launchpad.net/nova/+bug/1581367
> https://review.openstack.org/#/c/289813/

While convenient as a workaround, I'm not in favor of the idea of adding
something to the REST API so a user can force refresh the connection info -
this is a bug and leaks information out of the API about how the cloud is
configured. If you didn't have volumes attached to the instance at all then
this wouldn't matter.

I think in an earlier version of the patch it was reloading and checking the
connection info every time the BDM list was retrieved for an instance, which
was a major issue for normal operations where this isn't a problem. Since
it's been scoped to just start/reboot operations, it's better, and there are
comments in the patch to make it a bit more efficient also (avoid calling the
DB multiple times for the same information).

I'm not totally opposed to doing the refresh on start/reboot. We could make
it configurable, so if you're using a storage server backend where the IP
might change, then set this flag, but that's a bit clunky. And a periodic
task wouldn't help us out. I'm open to other ideas if anyone has them.

--

Thanks,

Matt Riedemann