Re: [Openstack] [Openstack-qa-team] wait_for_server_status and Compute API
On 06/18/2012 12:01 PM, David Kranz wrote:
> There are a few tempest tests, and many in the old kong suite that is
> still there, that wait for a server status that is something other than
> ACTIVE or VERIFY_RESIZE. These other states, such as BUILD or REBOOT,
> are transient so I don't understand why it is correct for code to poll
> for those states. Am I missing something or do those tests have race
> condition bugs?

No, you are correct, and I have made some comments in recent code reviews to that effect.

Here are all the task states:

https://github.com/openstack/nova/blob/master/nova/compute/task_states.py

Out of all those task states, I believe the only one safe to poll in a wait loop is RESIZE_VERIFY. All the others are prone to state transitions outside the control of the user.

For the VM states:

https://github.com/openstack/nova/blob/master/nova/compute/vm_states.py

I consider the following to be non-racy, quiescent states:

ACTIVE
DELETED
STOPPED
SHUTOFF
PAUSED
SUSPENDED
ERROR

I consider the following to be racy states that should not be tested for:

MIGRATING -- Instead, the final state should be checked for...
RESIZING -- Instead, the RESIZE_VERIFY and RESIZE_CONFIRM task states should be checked

I have absolutely no idea what the state termination is for the following VM states:

RESCUED -- is this a permanent state? Is this able to be queried for in a consistent manner before it transitions to some further state?
SOFT_DELETE -- I have no clue what the purpose or queryability of this state is, but would love to know...

Best,
-jay

___
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
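The race David asks about can be made concrete with a small simulation. This is only an illustrative sketch: the timeline, the 0.7-second REBOOT window, and the helper names `status_at` and `poll_observes` are assumptions made up for the example, not anything taken from tempest or nova. It shows that whether a poller ever observes a transient status depends on how the poll interval happens to line up with the transition.

```python
import bisect


def status_at(timeline, t):
    """Return the status in effect at time t.

    `timeline` is a list of (start_time, status) pairs sorted by start time.
    """
    starts = [start for start, _ in timeline]
    return timeline[bisect.bisect_right(starts, t) - 1][1]


def poll_observes(timeline, target, interval, timeout):
    """Poll at t = 0, interval, 2*interval, ... and report if target was seen."""
    t = 0.0
    while t <= timeout:
        if status_at(timeline, t) == target:
            return True
        t += interval
    return False


# Hypothetical reboot: the server reports REBOOT only between t=0.2s and t=0.9s.
timeline = [(0.0, "ACTIVE"), (0.2, "REBOOT"), (0.9, "ACTIVE")]

# Polling once per second samples t=0 and t=1, so REBOOT is never seen.
print(poll_observes(timeline, "REBOOT", interval=1.0, timeout=10.0))  # False
# Polling twice per second happens to land inside the window at t=0.5.
print(poll_observes(timeline, "REBOOT", interval=0.5, timeout=10.0))  # True
```

The same test therefore passes or times out depending on timing alone, which is what makes waiting on a transient status a race condition.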
Re: [Openstack] [Openstack-qa-team] wait_for_server_status and Compute API
I can verify that rescue is a non-race state. The transition is active to rescue on setting rescue, and rescue to active when leaving rescue.
Re: [Openstack] [Openstack-qa-team] wait_for_server_status and Compute API
Hi Jay et al,

There is a patch in review here to overhaul the state machine:

https://review.openstack.org/#/c/8254/

All transient states in vm state will be moved to task state. The stable state currently in task state (RESIZE_VERIFY) will be moved to vm state. There is also a state transition diagram in dot format. Comments welcome.

Thanks, All
Re: [Openstack] [Openstack-qa-team] wait_for_server_status and Compute API
On 06/18/2012 12:49 PM, Daryl Walleck wrote:
> I can verify that rescue is a non-race state. The transition is active
> to rescue on setting rescue, and rescue to active when leaving rescue.

I don't see a RESCUE state. I see a RESCUED state. Is that what you are referring to here? Want to make sure, since the semantics and tenses of the power, VM, and task states are a bit inconsistent.

Best,
-jay
Re: [Openstack] [Openstack-qa-team] wait_for_server_status and Compute API
Thanks, Yun. The problem is that the API calls give you status which is neither task state nor vm state. I think these are the stable states:

ACTIVE, VERIFY_RESIZE, STOPPED, SHUTOFF, PAUSED, SUSPENDED, RESCUE, ERROR, DELETED

Does that seem right to you, and is there a plan to change that set for Folsom?

-David
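A wait helper could enforce this by rejecting transient targets before it starts polling. What follows is a minimal sketch under stated assumptions: it uses the stable API statuses listed in this message, takes a hypothetical `get_status` callable that returns the server's current status string, and treats ERROR as terminal. The name `wait_for_stable_status` is made up for the example; this is not tempest's actual `wait_for_server_status`.

```python
import time

# Stable API statuses as listed in the message above.
STABLE_STATUSES = frozenset({
    "ACTIVE", "VERIFY_RESIZE", "STOPPED", "SHUTOFF", "PAUSED",
    "SUSPENDED", "RESCUE", "ERROR", "DELETED",
})


def wait_for_stable_status(get_status, target, timeout=60.0, interval=1.0,
                           sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns `target`, which must be stable."""
    if target not in STABLE_STATUSES:
        # Waiting on BUILD, REBOOT, etc. is racy: the status may come and go
        # between two polls, so such targets are rejected outright.
        raise ValueError("refusing to wait for transient status %r" % target)
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == target:
            return status
        if status == "ERROR":
            # Assumption of this sketch: ERROR is terminal, so stop polling.
            raise RuntimeError(
                "server went to ERROR while waiting for %s" % target)
        if clock() >= deadline:
            raise TimeoutError(
                "timed out waiting for %s; last status was %s"
                % (target, status))
        sleep(interval)
```

Passing `sleep` and `clock` as parameters is a design choice that keeps the loop testable without real delays; a caller would normally leave them at their defaults.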
Re: [Openstack] [Openstack-qa-team] wait_for_server_status and Compute API
Hi David,

Yes, there is a plan to change that for Folsom. vm_state will be purely stable state and task_state will be purely for transition state. See http://wiki.openstack.org/VMState for the new design rationale of (power_state, vm_state, task_state).

After the cleanup, vm_state will have:

    ACTIVE = 'active'  # VM is running
    BUILDING = 'building'  # VM only exists in DB
    PAUSED = 'paused'
    SUSPENDED = 'suspended'  # VM is suspended to disk.
    STOPPED = 'stopped'  # VM is powered off, the disk image is still there.
    RESCUED = 'rescued'  # A rescue image is running with the original VM image
                         # attached.
    RESIZED = 'resized'  # a VM with the new size is active. The user is expected
                         # to manually confirm or revert.
    SOFT_DELETED = 'soft-delete'  # VM is marked as deleted but the disk images are
                                  # still available to restore.
    DELETED = 'deleted'  # VM is permanently deleted.
    ERROR = 'error'

There is no SHUTOFF (merged with STOPPED), and VERIFY_RESIZE is renamed (from task state) to RESIZED (in vm state). The BUILDING state is not my favorite, but it's left there mostly for backward-compatibility reasons. This is still up for discussion and your input is welcome.

Thanks,
Yun
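Under the split Yun proposes, a test could decide whether a server has settled by looking at the pair (vm_state, task_state) rather than at a single status string. The sketch below is hedged accordingly: the enum values are copied from the list in this message, but the `is_settled` helper, the convention that a task_state of None means no operation is in flight, and the choice to treat BUILDING as unsettled are assumptions of the example, not part of the patch under review.

```python
from enum import Enum
from typing import Optional


class VMState(Enum):
    """vm_state values as proposed in the message above."""
    ACTIVE = "active"
    BUILDING = "building"
    PAUSED = "paused"
    SUSPENDED = "suspended"
    STOPPED = "stopped"
    RESCUED = "rescued"
    RESIZED = "resized"
    SOFT_DELETED = "soft-delete"
    DELETED = "deleted"
    ERROR = "error"


def is_settled(vm_state: VMState, task_state: Optional[str]) -> bool:
    """Return True when the server is in a state a test can safely wait for.

    Assumptions of this sketch: task_state is None when no transition is in
    progress, and BUILDING counts as unsettled because the message describes
    it as a VM that only exists in the DB.
    """
    return task_state is None and vm_state is not VMState.BUILDING


# "rebooting" is a hypothetical task_state value used only for illustration.
print(is_settled(VMState.ACTIVE, None))         # True
print(is_settled(VMState.ACTIVE, "rebooting"))  # False
print(is_settled(VMState.BUILDING, None))       # False
```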