[openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
> I like this idea. Sure it's a delay, but it resolves the problem
> eventually and doesn't add the overhead to the start/reboot operations
> that should mostly be unnecessary if things are working.
>
> I like the short-circuit idea too, although that's a nice to have. A
> deployer can always disable the periodic task if they don't want that
> running.
>
> --
>
> Thanks,
>
> Matt Riedemann

Hi Matt,

I was wondering whether this could be done when the nova-compute
service restarts. If an operator is going to change storage node IPs,
they will probably need to restart at least some services anyway, so we
can ask them to restart nova-compute as well if instances on that
compute node are going to be affected by the IP change.

To fix this issue, we could hard reboot the affected instances when
nova-compute restarts. That way the updated connection info is stored in
the BDM table and also recorded in the domain XML.

Please correct me if I am wrong.

Regards,
Rajesh

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
On 5/18/2016 3:05 PM, melanie witt wrote:
> I was thinking it may be possible to do something similar to how network
> info is periodically refreshed in _heal_instance_info_cache [1]. The
> task interval is configurable (defaults to 60 seconds) and works on a
> queue of instances such that one is refreshed per period, to control the
> load on the host. To avoid doing anything for storage backends that
> can't change IP, maybe we could make the task return immediately after
> calling a driver method that would indicate whether the storage backend
> can be affected by an IP change.
>
> There would be some delay until the task runs on an affected instance,
> though.
>
> -melanie
>
> [1] https://github.com/openstack/nova/blob/9a05d38/nova/compute/manager.py#L5549

I like this idea. Sure it's a delay, but it resolves the problem
eventually and doesn't add the overhead to the start/reboot operations
that should mostly be unnecessary if things are working.

I like the short-circuit idea too, although that's a nice to have. A
deployer can always disable the periodic task if they don't want that
running.

--

Thanks,

Matt Riedemann
Re: [openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
On Wed, 18 May 2016 14:30:00 -0500, Matt Riedemann wrote:
> I'm not totally opposed to doing the refresh on start/reboot. We could
> make it configurable, so if you're using a storage server backend where
> the IP might change, then set this flag, but that's a bit clunky. And a
> periodic task wouldn't help us out.
>
> I'm open to other ideas if anyone has them.

I was thinking it may be possible to do something similar to how network
info is periodically refreshed in _heal_instance_info_cache [1]. The
task interval is configurable (defaults to 60 seconds) and works on a
queue of instances such that one is refreshed per period, to control the
load on the host. To avoid doing anything for storage backends that
can't change IP, maybe we could make the task return immediately after
calling a driver method that would indicate whether the storage backend
can be affected by an IP change.

There would be some delay until the task runs on an affected instance,
though.

-melanie

[1] https://github.com/openstack/nova/blob/9a05d38/nova/compute/manager.py#L5549
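
To make the periodic-refresh idea above concrete, here is a minimal sketch in Python. It is not Nova code: `FakeDriver`, `backend_address_can_change`, `get_connection_info` and `ConnectionInfoHealer` are hypothetical names invented for illustration, and the stored records are plain dicts rather than real BDM objects. It only shows the two properties described in the message: one instance is refreshed per run, and the task returns immediately when the driver reports that the backend address cannot change.

```python
from collections import deque


class FakeDriver:
    """Hypothetical stand-in for a volume driver; not a real Nova class."""

    def __init__(self, address_can_change, current_hosts):
        self.address_can_change = address_can_change
        self.current_hosts = current_hosts  # volume_id -> list of addresses

    def backend_address_can_change(self):
        # Assumed driver hook used for the short-circuit.
        return self.address_can_change

    def get_connection_info(self, volume_id):
        return {"hosts": list(self.current_hosts[volume_id])}


class ConnectionInfoHealer:
    """Refreshes stored connection info for one instance per run."""

    def __init__(self, driver, bdms_by_instance):
        self.driver = driver
        # instance uuid -> {volume_id: stored connection info dict}
        self.bdms_by_instance = bdms_by_instance
        self._queue = deque()

    def run_once(self):
        """Heal at most one instance; return its uuid, or None if idle."""
        if not self.driver.backend_address_can_change():
            return None  # short-circuit for backends with fixed addresses
        if not self._queue:
            # Queue drained: start a new pass over all known instances.
            self._queue.extend(sorted(self.bdms_by_instance))
        if not self._queue:
            return None
        uuid = self._queue.popleft()
        stored_by_volume = self.bdms_by_instance[uuid]
        for volume_id in list(stored_by_volume):
            fresh = self.driver.get_connection_info(volume_id)
            if fresh != stored_by_volume[volume_id]:
                stored_by_volume[volume_id] = fresh
        return uuid
```

A scheduler would call `run_once()` at the configured interval, so the delay before a given instance is healed grows with the number of instances on the host.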
Re: [openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
On 5/16/2016 8:39 PM, zhou.b...@zte.com.cn wrote:
> Could we instead add an option to the nova API, such as "nova reboot
> --refresh-conn", to manually update the VM's BDM info when the bug
> happens? The "--refresh-conn" option would be parsed and passed to the
> "reboot_instance" function in nova-compute. Without auto-checking, it
> would be more flexible and efficient.

While convenient as a workaround, I'm not in favor of the idea of adding
something to the REST API so a user can force refresh the connection
info - this is a bug and leaks information out of the API about how the
cloud is configured. If you didn't have volumes attached to the instance
at all then this wouldn't matter.

I think in an earlier version of the patch it was reloading and checking
the connection info every time the BDM list was retrieved for an
instance, which was a major issue for normal operations where this isn't
a problem.

Since it's been scoped to just start/reboot operations, it's better, and
there are comments in the patch to make it a bit more efficient also
(avoid calling the DB multiple times for the same information).

I'm not totally opposed to doing the refresh on start/reboot. We could
make it configurable, so if you're using a storage server backend where
the IP might change, then set this flag, but that's a bit clunky. And a
periodic task wouldn't help us out.

I'm open to other ideas if anyone has them.

--

Thanks,

Matt Riedemann
[openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
Hi all:

I ran into the problem described in
https://bugs.launchpad.net/cinder/+bug/1452641, and my colleague hit a
similar problem described in
https://bugs.launchpad.net/nova/+bug/1581367. Both are about the storage
backend's IP address changing. With a storage backend, not only Ceph but
also IPSAN, when the backend's IP changes, the volumes attached to VMs
become unavailable.

Previously I proposed to automatically check that the IP recorded in
nova's BDM table is consistent with the storage backend, submitted in
https://review.openstack.org/#/c/289813/. Reviewers pointed out that
this wastes performance in the normal case and that a regular code path
is not a good place for these checks. I agree with that suggestion, but
the bug keeps troubling me and my colleagues.

Could we instead add an option to the nova API, such as "nova reboot
--refresh-conn", to manually update the VM's BDM info when the bug
happens? The "--refresh-conn" option would be parsed and passed to the
"reboot_instance" function in nova-compute. Without auto-checking, it
would be more flexible and efficient.

I would value your opinions and look forward to hearing from you soon.

The pseudocode in nova-compute would look like this:

    def reboot_instance(self, context, instance, block_device_info,
                        reboot_type, refresh_conn=False):
        """Reboot an instance on this host."""
        ...
        block_device_info = self._get_instance_block_device_info(
            context, instance, refresh_conn)

Thank you.

Related links:

https://bugs.launchpad.net/cinder/+bug/1452641
https://bugs.launchpad.net/nova/+bug/1581367
https://review.openstack.org/#/c/289813/

ZTE Information Security Notice: The information contained in this mail
(and any attachment transmitted herewith) is privileged and confidential
and is intended for the exclusive use of the addressee(s). If you are
not an intended recipient, any disclosure, reproduction, distribution or
other dissemination or use of the information contained is strictly
prohibited. If you have received this mail in error, please delete it
and notify us immediately.
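
The effect of a `refresh_conn` flag like the one proposed above can be sketched outside of Nova as follows. This is an illustration only, under stated assumptions: the BDM records are plain dicts holding a JSON `connection_info` string, and `fetch_connection_info` is a hypothetical callable standing in for whatever asks the volume service for current connection details. It is not the body of Nova's `_get_instance_block_device_info`.

```python
import json


def get_block_device_info(bdms, fetch_connection_info, refresh_conn=False):
    """Build block device info, optionally refreshing stored connection info.

    bdms: list of dicts with 'volume_id' and a JSON 'connection_info'
        string (an assumed, simplified record layout).
    fetch_connection_info: hypothetical callable, volume_id -> dict.
    """
    mappings = []
    for bdm in bdms:
        if refresh_conn:
            # Overwrite the stored record only when the caller asked for it,
            # so the normal path never contacts the backend.
            bdm["connection_info"] = json.dumps(
                fetch_connection_info(bdm["volume_id"]))
        mappings.append({
            "volume_id": bdm["volume_id"],
            "connection_info": json.loads(bdm["connection_info"]),
        })
    return {"block_device_mapping": mappings}
```

With `refresh_conn=False` the stored record is returned untouched, which is the behaviour reviewers wanted to keep cheap for normal reboots.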
[openstack-dev] [nova] upgrade connection_info when Ceph mon IP changed
Hi all:

Regarding https://bugs.launchpad.net/cinder/+bug/1452641, I submitted my
solution at https://review.openstack.org/#/c/289813/.

VM startup or reboot fails when the Ceph mon IP address of the volume
backend changes. The bug also happens when a VM is deployed with an
IPSAN volume backend and the IPSAN's iSCSI IP address changes. In our
product use, when we moved our equipment to another lab, all the IP
addresses changed, including the volume backends' IPs, and the
pre-deployed instances failed to start at all, which troubled us a
great deal.

I proposed to check and refresh the IP address recorded in the
instance's block_device_mapping table. Matt Riedemann pointed out that
it's not a good idea to perform these checks on every call to the
startup or reboot APIs, and gave the good advice to fix this in
nova-manage when the bug does happen.

My question is: is it reasonable to call the nova database, libvirt
driver, rbd driver, cinder's API and so on from nova-manage to perform
the check and refresh the connection info in the BDM when the volume
backends' IP changes? Or are there other reasonable solutions?

Thank you.

Best regards!

R&D Building, ZTE Plaza, #6 Huashen Ave.
Yuhuatai District, Nanjing, P.R.China, 210012
T: +86 02552878587
M: +86 15062283989
E: zhou.b...@zte.com.cn
www.zte.com.cn
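
As a rough picture of what a one-shot, operator-run check could do, the sketch below compares the addresses stored in each record with the addresses the backend reports now, and only rewrites them when asked. Everything here is an assumption for illustration: the function names are hypothetical, the records are plain dicts, and the `data`/`hosts` layout is a simplified guess at how an rbd-style connection info might look. It does not use any real nova-manage, libvirt or cinder call.

```python
def find_stale_bdms(bdms, current_hosts):
    """Return volume ids whose stored hosts differ from the backend's.

    bdms: list of dicts shaped like
        {'volume_id': ..., 'connection_info': {'data': {'hosts': [...]}}}
        (assumed layout, for illustration only).
    current_hosts: addresses the storage backend reports right now.
    """
    wanted = sorted(current_hosts)
    stale = []
    for bdm in bdms:
        stored = bdm["connection_info"].get("data", {}).get("hosts", [])
        if sorted(stored) != wanted:
            stale.append(bdm["volume_id"])
    return stale


def refresh_stale_bdms(bdms, current_hosts, dry_run=True):
    """Report stale records; rewrite their hosts only when dry_run is False."""
    stale = find_stale_bdms(bdms, current_hosts)
    if not dry_run:
        for bdm in bdms:
            if bdm["volume_id"] in stale:
                data = bdm["connection_info"].setdefault("data", {})
                data["hosts"] = list(current_hosts)
    return stale
```

Defaulting to a dry run lets an operator see which volumes would be touched before any stored record is changed.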