Re: [Openstack-operators] [cinder] volume in-use attached to None?

2017-10-16 Thread Saverio Proto
Hello Christopher,

check out this:
https://ask.openstack.org/en/question/66918/how-to-delete-volume-with-available-status-and-attached-to/
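
A rough sketch of the usual recovery there, as admin (the volume ID is a
placeholder; on a Liberty-era cinderclient the --attach-status option may
not exist, in which case people fall back to fixing the row directly in
the cinder database, after backing it up first):

  cinder reset-state --state available <VOLUME_ID>
  # newer clients can also clear the attach status:
  #   cinder reset-state --attach-status detached <VOLUME_ID>
  # last-resort DB fix if the client route is not available:
  #   mysql cinder -e "UPDATE volumes SET status='available', attach_status='detached' WHERE id='<VOLUME_ID>';"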

Saverio

2017-10-16 20:45 GMT+02:00 Christopher Hull :
> Running Liberty.
> I'd like to be able to create new volumes from old ones.  Launching
> instances from volumes results in the volume being "root attached", and
> therefore, it seems, forever wed to the instance.  It cannot be copied.
> So I tried deleting the instance.  Still no good.  The volume is now in use
> by None, so the volume is completely useless.
> How do I force Cinder to detach a volume from either a root-mounted
> instance or, of all things, None?
>
> -Chris
>
>
>
>
> - Christopher T. Hull
>
> http://faq.chrishull.com
> Sunnyvale CA. 94085
> (415) 385 4865
> chrishul...@gmail.com
> http://chrishull.com
>
>
>



[Openstack-operators] [cinder] volume in-use attached to None?

2017-10-16 Thread Christopher Hull
Running Liberty.
I'd like to be able to create new volumes from old ones.  Launching
instances from volumes results in the volume being "root attached", and
therefore, it seems, forever wed to the instance.  It cannot be copied.
So I tried deleting the instance.  Still no good.  The volume is now in use
by None, so the volume is completely useless.
How do I force Cinder to detach a volume from either a root-mounted
instance or, of all things, None?

-Chris




- Christopher T. Hull

http://faq.chrishull.com
Sunnyvale CA. 94085
(415) 385 4865
chrishul...@gmail.com
http://chrishull.com


Re: [Openstack-operators] [openstack-dev] [nova] Interesting bug when unshelving an instance in an AZ and the AZ is gone

2017-10-16 Thread Matt Riedemann

On 10/16/2017 11:00 AM, Dean Troyer wrote:

[not having a dog in this hunt, this is what I would expect as a cloud consumer]


Thanks for the user perspective, that's what I'm looking for here, along 
with the operator perspective of course.




On Mon, Oct 16, 2017 at 10:22 AM, Matt Riedemann  wrote:

- The user creates an instance in a non-default AZ.
- They shelve offload the instance.
- The admin deletes the AZ that the instance was using, for whatever reason.
- The user unshelves the instance which goes back through scheduling and
fails with NoValidHost because the AZ on the original request spec no longer
exists.



1. How reasonable is it for a user to expect in a stable production
environment that AZs are going to be deleted from under them? We actually
have a spec related to this but with AZ renames:


Change happens...


2. Should we null out the instance.availability_zone when it's shelved
offloaded like we do for the instance.host and instance.node attributes?
Similarly, we would not take into account the RequestSpec.availability_zone
when scheduling during unshelve. I tend to prefer this option because once
you unshelve offload an instance, it's no longer associated with a host and
therefore no longer associated with an AZ. However, is it reasonable to
assume that the user doesn't care that the instance, once unshelved, is no
longer in the originally requested AZ? Probably not a safe assumption.


Agreed, unless we keep track of whether the user specified a default AZ
or no AZ at create.


We do keep track of what the user originally requested; that is the 
RequestSpec object I keep referring to.




I think nulling the AZ when the original doesn't exist would be
reasonable from a user standpoint, but I'd feel handcuffed if that
happens and I cannot select a new AZ. Or throw a specific error
and let the user handle it, as in #3 below:


At the point of failure, the API has done an RPC cast and returned a 202 
to the user, so the only way to provide a message like this to the user 
would be to check if the original AZ still exists in the API. We could 
do that; it would just be something to be aware of.
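
For reference, the same check is easy to do by hand; a quick sketch,
assuming the usual OSC commands and a placeholder aggregate name:

  openstack availability zone list --compute
  # or, as admin, inspect the aggregate that backs the AZ:
  #   openstack aggregate show <AGGREGATE_NAME>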





3. When a user unshelves, they can't propose a new AZ (and I don't think we
want to add that capability to the unshelve API). So if the original AZ is


Here is my question... if I can specify an AZ on create, why not on
unshelve?  Is it the image location movement under the hood?


I just don't think it's ever come up. The reason I hesitate to add the 
ability to the unshelve API is more or less rooted in my bias toward not 
liking shelve/unshelve in general because of how complicated and 
half-baked it is (we've had a lot of bugs from these APIs, some of which 
are still unresolved). That's not the user's fault though, so one could 
argue that if we're not going to deprecate these APIs, we need to make 
them more robust. We, as developers, also don't have any idea how many 
users are actually using the shelve API, so it's hard to know if we 
should spend any time on improving it.





gone, should we automatically remove the RequestSpec.availability_zone when
scheduling? I tend to not like this as it's very implicit and the user could
see the AZ on their instance change before and after unshelve and be
confused.


Agreed that explicit is better than implicit.


4. We could simply do nothing about this specific bug and assert the
behavior is correct. The user requested an instance in a specific AZ,
shelved that instance and when they wanted to unshelve it, it's no longer
available so it fails. The user would have to delete the instance and create
a new instance from the shelve snapshot image in a new AZ. If we implemented


I do not have in my head the list of things that are preserved in
shelve/unshelve but would be lost in a recreate, but that's where my
worry would come from.  Presumably that is why I shelved in the first place
rather than snapshotting the server and removing it.  It depends on the
cost models too: if I lose my grandfathered-in pricing by being forced
to recreate, I may be unhappy.


The volumes and ports remain attached to the shelved instance; only the 
guest on the hypervisor is destroyed. It doesn't change anything about 
quota - you retain quota usage for a shelved instance so you have room 
in your quota to unshelve it later.
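
A quick way to see that on a shelved-offloaded server (the server name is
a placeholder; field names as in the usual openstack CLI output):

  openstack server show <SERVER> -c status -c OS-EXT-STS:vm_state -c volumes_attached
  # expect status SHELVED_OFFLOADED, vm_state shelved_offloaded, and the
  # attached volume IDs still listed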


From what I can tell, the os-simple-tenant-usage API will still count 
the instance and its consumed disk/RAM/CPU against you even though the 
guest is deleted from the hypervisor while the instance is shelved 
offloaded. So the operator is happy about shelved offloaded instances 
because that means they have more free capacity for new instances and 
moving things, but the user is still getting charged the same, if your 
billing model is based on os-simple-tenant-usage (which I believe 
Telemetry uses).
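
For anyone who wants to check their own deployment, those usage numbers
are visible from the CLI as well; a sketch with placeholder project ID
and dates:

  openstack usage show --project <PROJECT_ID> --start 2017-10-01 --end 2017-10-16
  # shelved-offloaded servers should still show up in the RAM/CPU/disk totals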






Sylvain's spec in #1 above, maybe we don't have this problem going forward
since you couldn't remove/delete an AZ when there are even shelved offloaded
instances still tied to it.

[Openstack-operators] UC IRC Meeting - Monday 10/16

2017-10-16 Thread Edgar Magana
Dear Community,

This is a kind reminder for our UC IRC meeting today at 18:00 UTC in the 
#openstack-meeting channel. So far, we have the following agenda:

https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee#Meeting_Agenda.2FPrevious_Meeting_Logs


Thanks,

Edgar


[Openstack-operators] [nova] Interesting bug when unshelving an instance in an AZ and the AZ is gone

2017-10-16 Thread Matt Riedemann

This is interesting from the user point of view:

https://bugs.launchpad.net/nova/+bug/1723880

- The user creates an instance in a non-default AZ.
- They shelve offload the instance.
- The admin deletes the AZ that the instance was using, for whatever reason.
- The user unshelves the instance, which goes back through scheduling and 
fails with NoValidHost because the AZ on the original request spec no 
longer exists (a rough CLI reproduction is sketched below).
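
A rough CLI reproduction of those steps (names are placeholders; since a
nova AZ is really host-aggregate metadata, "deleting the AZ" here means
removing the hosts from the aggregate and deleting it):

  # as the user
  openstack server create --availability-zone myaz --image <IMAGE> \
      --flavor <FLAVOR> --network <NET> testvm
  openstack server shelve testvm          # wait for shelve offload
  # as the admin
  openstack aggregate remove host <AGGREGATE> <HOST>
  openstack aggregate delete <AGGREGATE>
  # as the user again
  openstack server unshelve testvm        # fails with NoValidHost per the bug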


Now the question is what, if anything, do we do about this bug? Some notes:

1. How reasonable is it for a user to expect in a stable production 
environment that AZs are going to be deleted from under them? We 
actually have a spec related to this but with AZ renames:


https://review.openstack.org/#/c/446446/

2. Should we null out the instance.availability_zone when it's shelved 
offloaded like we do for the instance.host and instance.node attributes? 
Similarly, we would not take into account the 
RequestSpec.availability_zone when scheduling during unshelve. I tend to 
prefer this option because once you unshelve offload an instance, it's 
no longer associated with a host and therefore no longer associated with 
an AZ. However, is it reasonable to assume that the user doesn't care 
that the instance, once unshelved, is no longer in the originally 
requested AZ? Probably not a safe assumption.


3. When a user unshelves, they can't propose a new AZ (and I don't think 
we want to add that capability to the unshelve API). So if the original 
AZ is gone, should we automatically remove the 
RequestSpec.availability_zone when scheduling? I tend to not like this 
as it's very implicit and the user could see the AZ on their instance 
change before and after unshelve and be confused.


4. We could simply do nothing about this specific bug and assert the 
behavior is correct. The user requested an instance in a specific AZ, 
shelved that instance and when they wanted to unshelve it, it's no 
longer available so it fails. The user would have to delete the instance 
and create a new instance from the shelve snapshot image in a new AZ. If 
we implemented Sylvain's spec in #1 above, maybe we don't have this 
problem going forward since you couldn't remove/delete an AZ when there 
are even shelved offloaded instances still tied to it.


Other options?

--

Thanks,

Matt



Re: [Openstack-operators] [openstack-ansible]: Container errors on task "lxc_container_create : Drop container network file (interfaces)"

2017-10-16 Thread Jean-Philippe Evrard
On 13 October 2017 at 11:29, andres sanchez ramos
 wrote:
> Hello guys,
>
> I am trying to deploy a lab OpenStack environment using Ansible in order to
> get acquainted with this tool. I am currently stuck with an error I have not
> been able to resolve. It happens on the following task:
>
> TASK [lxc_container_create : Drop container network file (interfaces)]
> *
>
> task path:
> /etc/ansible/roles/lxc_container_create/tasks/container_create.yml:262
>
> The console prints out this error for each one of the containers:
>
> container_name: "infra1_cinder_scheduler_container-8054b2be"
> physical_host: "infra1"
> Container confirmed
> <192.168.100.30> ESTABLISH SSH CONNECTION FOR USER: lab232
> An exception occurred during task execution. The full traceback is:
> Traceback (most recent call last):
>   File
> "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/executor/task_executor.py",
> line 98, in run
> item_results = self._run_loop(items)
>   File
> "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/executor/task_executor.py",
> line 290, in _run_loop
> res = self._execute(variables=task_vars)
>   File
> "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/executor/task_executor.py",
> line 511, in _execute
> result = self._handler.run(task_vars=variables)
>   File
> "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/plugins/action/template.py",
> line 149, in run
> tmp = self._make_tmp_path(remote_user)
>   File
> "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/plugins/action/__init__.py",
> line 224, in _make_tmp_path
> tmpdir =  self._remote_expand_user(C.DEFAULT_REMOTE_TMP, sudoable=False)
>   File
> "/opt/ansible-runtime/local/lib/python2.7/site-packages/ansible/plugins/action/__init__.py",
> line 506, in _remote_expand_user
> initial_fragment = data['stdout'].strip().splitlines()[-1]
> IndexError: list index out of range
>
> fatal: [infra1_nova_scheduler_container-92a94180]: FAILED! => {
> "failed": true,
> "msg": "Unexpected failure during module execution.",
> "stdout": ""
> }
>
> Regarding my setup, I have one infrastructure node and one compute node. I am
> attaching the interface config for each node and the
> openstack_user_config.yml file so you get a complete picture of my setup. I
> would deeply appreciate any help or any pointers that might help me
> troubleshoot this!
>
> I think the only relevant change I made relative to the instructions is my
> management network, which is 192.168.100.0/24, since I already have this set
> up for other equipment.
>
>
>
>
> Sent from Outlook
>
>

Sorry, I forgot to reply all.

With the information you gave me, I think the user could be the cause.
Our connection plugin doesn't seem to
work with the sudo trick you did.

May I suggest you file a bug listing what you did in practice,
explaining this issue, and what you'd expect?

In the meantime, could you try running as root on your destination
node? And maybe later on your deploy node too, to see the results?
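
As a quick way to narrow it down, it is also worth checking that the remote
shell actually returns something for the home-directory expansion Ansible
does (the IndexError above comes from empty stdout); a sketch using the host
and user from your output:

  # from the deploy node: each of these should print a home directory,
  # not empty output
  ssh lab232@192.168.100.30 'echo ~'
  ssh root@192.168.100.30 'echo ~'
  # and a plain connectivity check as root, with your usual inventory:
  #   ansible infra1 -m ping -u root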

Best regards,
Jean-Philippe Evrard (evrardjp)
