Re: [Openstack-operators] [cinder] volume in-use attached to None?
Hello Christopher, check out this:
https://ask.openstack.org/en/question/66918/how-to-delete-volume-with-available-status-and-attached-to/

Saverio

2017-10-16 20:45 GMT+02:00 Christopher Hull:
> Running Liberty.
> I'd like to be able to create new volumes from old ones. Launching
> instances from volumes results in the volume being "root attached", and
> therefore, it seems, forever wed to the instance. It cannot be copied.
> So I tried deleting the instance. Still no good. The volume is now in-use
> by None. So now the volume is completely useless.
> How do I force Cinder to detach from either a root-mounted volume or, of all
> things, None?
>
> -Chris
>
> - Christopher T. Hull
> http://faq.chrishull.com
> Sunnyvale CA. 94085
> (415) 385 4865
> chrishul...@gmail.com
> http://chrishull.com
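For reference, one common way out of the stuck "in-use / attached to None" state is Cinder's admin-only reset-state action, which simply rewrites the volume's status in Cinder's database. A minimal sketch with python-cinderclient follows; the auth URL, credentials and volume UUID are placeholders, and the exact reset_state signature can differ between cinderclient releases, so treat it as a starting point rather than a verbatim fix.

# Hedged sketch: force a volume stuck "in-use / attached to None" back to
# "available" via Cinder's admin reset-state action (python-cinderclient).
# All credentials, URLs and the volume UUID below are placeholders.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from cinderclient import client as cinder_client

auth = v3.Password(auth_url='http://controller:5000/v3',
                   username='admin', password='ADMIN_PASS',
                   project_name='admin',
                   user_domain_name='Default',
                   project_domain_name='Default')
sess = session.Session(auth=auth)
cinder = cinder_client.Client('2', session=sess)

vol = cinder.volumes.get('VOLUME_UUID')
# Maps to the os-reset_status admin action (same as `cinder reset-state --state available`).
# Use with care: it only rewrites Cinder's view of the volume, it does not detach
# anything on the hypervisor side.
cinder.volumes.reset_state(vol, 'available')

Newer cinderclient releases also let reset_state take an attach_status argument when the attachment record itself is stale, but check the client version you have before relying on it.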
[Openstack-operators] [cinder] volume in-use attached to None?
Running Liberty.

I'd like to be able to create new volumes from old ones. Launching instances from volumes results in the volume being "root attached", and therefore, it seems, forever wed to the instance. It cannot be copied. So I tried deleting the instance. Still no good. The volume is now in-use by None. So now the volume is completely useless.

How do I force Cinder to detach from either a root-mounted volume or, of all things, None?

-Chris

- Christopher T. Hull
http://faq.chrishull.com
Sunnyvale CA. 94085
(415) 385 4865
chrishul...@gmail.com
http://chrishull.com
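For the "create new volumes from old ones" part of the question, one route that does not require detaching the root volume first is to snapshot it (forced, since it is in-use) and build the new volume from that snapshot. A minimal python-cinderclient sketch with placeholder credentials and UUIDs; whether a forced snapshot of an attached volume is crash-consistent depends on the backend driver.

# Hedged sketch: copy an in-use (root-attached) volume by snapshotting it with
# force=True and creating a fresh volume from the snapshot. Placeholder values
# throughout.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from cinderclient import client as cinder_client

auth = v3.Password(auth_url='http://controller:5000/v3',
                   username='admin', password='ADMIN_PASS',
                   project_name='admin',
                   user_domain_name='Default',
                   project_domain_name='Default')
sess = session.Session(auth=auth)
cinder = cinder_client.Client('2', session=sess)

src = cinder.volumes.get('SOURCE_VOLUME_UUID')

# force=True allows snapshotting a volume that is still attached/in-use.
snap = cinder.volume_snapshots.create(src.id, force=True, name='copy-src')

# New volume built from the snapshot; a detached volume could instead be
# cloned directly with source_volid=src.id.
new_vol = cinder.volumes.create(size=src.size, snapshot_id=snap.id,
                                name='copy-of-root')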
Re: [Openstack-operators] [openstack-dev] [nova] Interesting bug when unshelving an instance in an AZ and the AZ is gone
On 10/16/2017 11:00 AM, Dean Troyer wrote:
> [not having a dog in this hunt, this is what I would expect as a cloud consumer]

Thanks for the user perspective, that's what I'm looking for here, and the operator perspective of course.

> On Mon, Oct 16, 2017 at 10:22 AM, Matt Riedemann wrote:
>> - The user creates an instance in a non-default AZ.
>> - They shelve offload the instance.
>> - The admin deletes the AZ that the instance was using, for whatever reason.
>> - The user unshelves the instance, which goes back through scheduling and fails with NoValidHost because the AZ on the original request spec no longer exists.
>>
>> 1. How reasonable is it for a user to expect in a stable production environment that AZs are going to be deleted from under them? We actually have a spec related to this but with AZ renames:
>
> Change happens...

>> 2. Should we null out the instance.availability_zone when it's shelved offloaded, like we do for the instance.host and instance.node attributes? Similarly, we would not take into account the RequestSpec.availability_zone when scheduling during unshelve. I tend to prefer this option because once you shelve offload an instance, it's no longer associated with a host and therefore no longer associated with an AZ. However, is it reasonable to assume that the user doesn't care that the instance, once unshelved, is no longer in the originally requested AZ? Probably not a safe assumption.
>
> Agreed, unless we keep track that the user specified a default or no AZ at create.

We do keep track of what the user originally requested; that is this RequestSpec object thing I keep referring to.

> I think nulling the AZ when the original doesn't exist would be reasonable from a user standpoint, but I'd feel handcuffed if that happens and I cannot select a new AZ. Or throwing a specific error and letting the user handle it in #3 below:

At the point of failure, the API has done an RPC cast and returned a 202 to the user, so the only way to provide a message like this to the user would be to check if the original AZ still exists in the API. We could do that; it would just be something to be aware of.

>> 3. When a user unshelves, they can't propose a new AZ (and I don't think we want to add that capability to the unshelve API). So if the original AZ is
>
> Here is my question... if I can specify an AZ on create, why not on unshelve? Is it the image location movement under the hood?

I just don't think it's ever come up. The reason I hesitate to add the ability to the unshelve API is more or less rooted in my bias toward not liking shelve/unshelve in general because of how complicated and half-baked it is (we've had a lot of bugs from these APIs, some of which are still unresolved). That's not the user's fault though, so one could argue that if we're not going to deprecate these APIs, we need to make them more robust. We, as developers, also don't have any idea how many users are actually using the shelve API, so it's hard to know if we should spend any time on improving it.

>> gone, should we automatically remove the RequestSpec.availability_zone when scheduling? I tend to not like this as it's very implicit, and the user could see the AZ on their instance change before and after unshelve and be confused.
>
> Agreed that explicit is better than implicit.

>> 4. We could simply do nothing about this specific bug and assert the behavior is correct. The user requested an instance in a specific AZ, shelved that instance, and when they wanted to unshelve it, it's no longer available, so it fails. The user would have to delete the instance and create a new instance from the shelve snapshot image in a new AZ. If we implemented
>
> I do not have the list of things in my head that are preserved in shelve/unshelve that would be lost in a recreate, but that's where my worry would come. Presumably that is why I shelved in the first place rather than snapshotting the server and removing it. Depends on the cost models too; if I lose my grandfathered-in pricing by being forced to recreate I may be unhappy.

The volumes and ports remain attached to the shelved instance; only the guest on the hypervisor is destroyed. It doesn't change anything about quota - you retain quota usage for a shelved instance so you have room in your quota to unshelve it later. From what I can tell, the os-simple-tenant-usage API will still count the instance and its consumed disk/ram/cpu against you even though the guest is deleted from the hypervisor while the instance is shelved offloaded. So the operator is happy about shelved offloaded instances because that means they have more free capacity for new instances and moving things, but the user is still getting charged the same, if your billing model is based on os-simple-tenant-usage (which Telemetry uses, I believe).

>> Sylvain's spec in #1 above, maybe we don't have this problem going forward since you couldn't remove/delete an AZ when there are even shelved offloaded instances still tied to it.
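To make option #2 and the implicit variant of #3 concrete, here is a purely illustrative sketch - not Nova's actual code, and the function and variable names are made up for the example - of what forgetting the AZ on shelve offload and ignoring a vanished AZ at unshelve time would amount to:

# Illustrative only -- not Nova's real shelve/unshelve code. It spells out
# option #2 from the thread (null the AZ on shelve offload, like host/node)
# and the implicit variant of option #3 (drop a stale AZ from the request
# spec at unshelve time). `instance`, `request_spec` and `known_azs` are
# stand-ins, not real Nova objects or APIs.

def shelve_offload(instance):
    # Once the guest is gone from the hypervisor, the instance is no longer
    # tied to a host, a node, or (under option #2) an AZ.
    instance.host = None
    instance.node = None
    instance.availability_zone = None


def build_unshelve_request_spec(request_spec, known_azs):
    # Option #3's implicit variant: if the originally requested AZ no longer
    # exists, forget it so the scheduler can pick any host. The objection in
    # the thread is that the user then sees their instance's AZ change
    # without having asked for it.
    if request_spec.availability_zone not in known_azs:
        request_spec.availability_zone = None
    return request_spec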
[Openstack-operators] UC IRC Meeting - Monday 10/16
Dear Community,

This is a kind reminder of our UC IRC meeting today at 18:00 UTC in the #openstack-meeting channel. So far, we have the following agenda:

https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee#Meeting_Agenda.2FPrevious_Meeting_Logs

Thanks,
Edgar
[Openstack-operators] [nova] Interesting bug when unshelving an instance in an AZ and the AZ is gone
This is interesting from the user point of view: https://bugs.launchpad.net/nova/+bug/1723880

- The user creates an instance in a non-default AZ.
- They shelve offload the instance.
- The admin deletes the AZ that the instance was using, for whatever reason.
- The user unshelves the instance, which goes back through scheduling and fails with NoValidHost because the AZ on the original request spec no longer exists.

Now the question is what, if anything, do we do about this bug? Some notes:

1. How reasonable is it for a user to expect in a stable production environment that AZs are going to be deleted from under them? We actually have a spec related to this but with AZ renames: https://review.openstack.org/#/c/446446/

2. Should we null out the instance.availability_zone when it's shelved offloaded, like we do for the instance.host and instance.node attributes? Similarly, we would not take into account the RequestSpec.availability_zone when scheduling during unshelve. I tend to prefer this option because once you shelve offload an instance, it's no longer associated with a host and therefore no longer associated with an AZ. However, is it reasonable to assume that the user doesn't care that the instance, once unshelved, is no longer in the originally requested AZ? Probably not a safe assumption.

3. When a user unshelves, they can't propose a new AZ (and I don't think we want to add that capability to the unshelve API). So if the original AZ is gone, should we automatically remove the RequestSpec.availability_zone when scheduling? I tend to not like this as it's very implicit, and the user could see the AZ on their instance change before and after unshelve and be confused.

4. We could simply do nothing about this specific bug and assert the behavior is correct. The user requested an instance in a specific AZ, shelved that instance, and when they wanted to unshelve it, it's no longer available, so it fails. The user would have to delete the instance and create a new instance from the shelve snapshot image in a new AZ. If we implemented Sylvain's spec in #1 above, maybe we don't have this problem going forward since you couldn't remove/delete an AZ when there are even shelved offloaded instances still tied to it.

Other options?

--
Thanks,
Matt
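For anyone who wants to see the failure locally, a rough reproduction with python-novaclient might look like the sketch below. The aggregate/AZ name, host name, image and flavor IDs and credentials are all placeholders, and the explicit shelve_offload call just avoids waiting for shelved_offload_time.

# Rough, hedged reproduction of the bug report above with python-novaclient.
# Placeholder credentials/IDs throughout; run only against a throwaway environment.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from novaclient import client as nova_client

auth = v3.Password(auth_url='http://controller:5000/v3',
                   username='admin', password='ADMIN_PASS',
                   project_name='admin',
                   user_domain_name='Default',
                   project_domain_name='Default')
nova = nova_client.Client('2.1', session=session.Session(auth=auth))

# 1. Create a non-default AZ by putting a host into an aggregate with an AZ set.
agg = nova.aggregates.create('agg-test', 'az-test')
nova.aggregates.add_host(agg, 'compute1')

# 2. Boot an instance into that AZ, then shelve and offload it.
server = nova.servers.create('az-bug-repro', image='IMAGE_UUID',
                             flavor='FLAVOR_ID', availability_zone='az-test')
nova.servers.shelve(server)          # wait for SHELVED before offloading
nova.servers.shelve_offload(server)  # or let shelved_offload_time do it

# 3. "Delete" the AZ out from under the instance.
nova.aggregates.remove_host(agg, 'compute1')
nova.aggregates.delete(agg)

# 4. Unshelving now goes back through the scheduler with the stale AZ in the
#    request spec and the instance lands in ERROR with NoValidHost.
nova.servers.unshelve(server)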
Re: [Openstack-operators] [openstack-ansible]: Container errors on task "lxc_container_create : Drop container network file (interfaces)"
On 13 October 2017 at 11:29, andres sanchez ramos wrote:
> Hello guys,
>
> I am trying to deploy a lab Openstack environment using ansible in order to
> get acquainted with this tool. Actually I am stuck with an error I am not
> being able to resolve. It happens on the following task:
>
> TASK [lxc_container_create : Drop container network file (interfaces)] *
>
> task path: /etc/ansible/roles/lxc_container_create/tasks/container_create.yml:262
>
> The console prints out this error for each one of the containers:
>
> container_name: "infra1_cinder_scheduler_container-8054b2be"
> physical_host: "infra1"
> Container confirmed
> <192.168.100.30> ESTABLISH SSH CONNECTION FOR USER: lab232
> An exception occurred during task execution. The full traceback is:
> Traceback (most recent call last):
>   File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 98, in run
>     item_results = self._run_loop(items)
>   File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 290, in _run_loop
>     res = self._execute(variables=task_vars)
>   File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 511, in _execute
>     result = self._handler.run(task_vars=variables)
>   File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/plugins/action/template.py", line 149, in run
>     tmp = self._make_tmp_path(remote_user)
>   File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/plugins/action/__init__.py", line 224, in _make_tmp_path
>     tmpdir = self._remote_expand_user(C.DEFAULT_REMOTE_TMP, sudoable=False)
>   File "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/plugins/action/__init__.py", line 506, in _remote_expand_user
>     initial_fragment = data['stdout'].strip().splitlines()[-1]
> IndexError: list index out of range
>
> fatal: [infra1_nova_scheduler_container-92a94180]: FAILED! => {
>     "failed": true,
>     "msg": "Unexpected failure during module execution.",
>     "stdout": ""
> }
>
> Regarding my setup, I have one infrastructure node and one compute node. I am
> attaching the interface config for each node and the openstack_user_config.yml
> file so you get a complete picture of my setup. I would deeply appreciate any
> help or any pointers that might help me troubleshoot this!
>
> I think the only relevant change I made from the instructions is my management
> network, which is 192.168.100.0/24, since I already have this set up for other
> equipment.
>
> Sent from Outlook

Sorry, I forgot to reply all.

With the information you gave me, I think the user could be the cause. Our connection plugin doesn't seem to work with the sudo trick you did. May I suggest you file a bug listing what you did in practice, explaining the issue, and what you'd expect?

In the meantime, could you try running as root on your destination node? And maybe later on your deploy node too, to see the results?

Best regards,
Jean-Philippe Evrard (evrardjp)
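For what it's worth, the traceback itself shows why this surfaces as an IndexError rather than a clearer message: Ansible expands the remote user's home directory by running a small command on the target and parsing its stdout, and when that command returns nothing (for example because the connection or privilege escalation into the container doesn't work for that user), the parsing line blows up. A minimal illustration, not the actual plugin code:

# Minimal illustration of the failing line from the traceback
# (initial_fragment = data['stdout'].strip().splitlines()[-1]).
# If the remote command that expands '~' returned no output -- e.g. the
# SSH/sudo hop into the container didn't work for this user -- the list is
# empty and indexing [-1] raises IndexError: list index out of range.

data = {'stdout': ''}  # what came back from the container in this case

lines = data['stdout'].strip().splitlines()
print(lines)                   # [] -> nothing to parse
initial_fragment = lines[-1]   # IndexError: list index out of range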