Re: [openstack-dev] [mistral] bugfix for "Fix concurrency issues by using READ_COMMITTED" unveils / creates a different bug

2015-12-09 Thread KOFFMAN, Noa (Noa)
Hi,

I have reproduced this issue multiple times before your fix was merged.

In order to reproduce it, I used a workflow with multiple async actions and 
resumed all of them at the same time.

I have just created a ticket in Launchpad [1] with the workflow used and the 
mistral engine logs.

[1] - https://bugs.launchpad.net/mistral/+bug/1524477

If anyone could take a look and confirm the bug, it would be great.

Thanks
Noa Koffman


From: ELISHA, Moshe (Moshe) [moshe.eli...@alcatel-lucent.com]
Sent: Monday, December 07, 2015 6:29 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [mistral] bugfix for "Fix concurrency issues by using 
READ_COMMITTED" unveils / creates a different bug

Hi all,

The bugfix I am currently working on [1] has unveiled / created a bug.
The test “WorkflowResumeTest.test_resume_different_task_states” sometimes fails 
because “task4” is executed twice instead of once (see the unit test output and 
workflow below).

This happens because task2’s on-complete runs task4 as expected, but task3 also 
executes task4 by mistake.

It is not consistent, but it happens quite often. It occurs if the unit test 
resumes the WF, updates the action execution of task2, and finishes task2 before 
task3 is finished.
Scenario:


1. Task2, in method on_action_complete, changes task2’s state to RUNNING.

2. Task3, in method on_action_complete, changes task2’s state to RUNNING 
(before task2 calls _on_task_state_change).

3. Task3, in “_on_task_state_change” > “continue_workflow” > 
“DirectWorkflowController._find_next_commands”, finds task2 because task2 
is in SUCCESS with processed = False, and “_find_next_commands_for_task(task2)” 
returns task4.

4. Task3 executes a command to RunTask task4.

5. Task2, in “_on_task_state_change” > “continue_workflow” > 
“DirectWorkflowController._find_next_commands”, again finds task2 because task2 
is in SUCCESS with processed = False, and “_find_next_commands_for_task(task2)” 
returns task4.

6. Task2 executes a command to RunTask task4.
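The interleaving above can be replayed with a minimal, self-contained sketch. 
The Task class, find_next_commands, and the hard-coded "RunTask task4" command 
are illustrative stand-ins, not Mistral's real classes; the point is only that 
two callers that each observe task2 with processed = False will each schedule 
task4:

```python
# Illustrative replay of the race, NOT Mistral code: two separate
# "_find_next_commands" passes each see task2 unprocessed and both
# schedule task4.

class Task(object):
    def __init__(self, name):
        self.name = name
        self.state = "RUNNING"
        self.processed = False

def find_next_commands(tasks):
    """Mimics the controller logic: any task in SUCCESS with
    processed == False yields its on-complete successor."""
    commands = []
    for t in tasks:
        if t.state == "SUCCESS" and not t.processed:
            commands.append("RunTask task4")  # task2's on-complete target
            t.processed = True
    return commands

task2 = Task("task2")
task2.state = "SUCCESS"

# Step 3: task3's _on_task_state_change runs first and picks up task2 ...
cmds_from_task3 = find_next_commands([task2])

# Step 5: ... but task2's own pass read `processed` before task3's update
# was visible to it (the stale read), so it schedules task4 again.
task2.processed = False  # simulate the stale read of the flag
cmds_from_task2 = find_next_commands([task2])

print(cmds_from_task3 + cmds_from_task2)  # task4 is dispatched twice
```

The fix direction would be to make the read-and-set of the processed flag 
atomic with respect to concurrent transactions, rather than a plain read 
followed by a later write.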


[1] - https://review.openstack.org/#/c/253819/


If I am not mistaken, this issue also exists in the current code, and my bugfix 
only makes it occur much more often. Can you confirm?
I don’t have enough knowledge of this area to fix the issue myself.
For now, I have modified the test_resume_different_task_states unit test to 
wait for task3 to be processed before updating the action execution of task2.
If you agree this bug exists today as well, we can proceed with my bugfix and 
open a separate bug for that issue.
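The kind of wait the modified unit test performs could look roughly like the 
helper below. The get_task callable, the processed attribute, and the timeout 
values are assumptions for illustration, not the actual test code:

```python
# Hypothetical polling helper: block until a task is marked processed,
# so the test only touches task2 after task3 has been fully handled.
import time

def await_task_processed(get_task, name, timeout=10.0, interval=0.1):
    """Poll get_task(name).processed until True, or fail after timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if get_task(name).processed:
            return
        time.sleep(interval)
    raise AssertionError("task %r was not processed within %ss" % (name, timeout))
```

The test would then call something like await_task_processed(self._get_task, 
'task3') (both names hypothetical) before updating task2's action execution.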

Thanks.



[stack@melisha-devstack mistral(keystone_admin)]$ tox -e py27 -- 
WorkflowResumeTest.test_resume_different_task_states
...
==
FAIL: 
mistral.tests.unit.engine.test_workflow_resume.WorkflowResumeTest.test_resume_different_task_states
tags: worker-0
--
pythonlogging:'': {{{WARNING [oslo_db.sqlalchemy.utils] Id not in sort_keys; is 
sort_keys unique?}}}
stderr: {{{
/opt/stack/mistral/.tox/py27/local/lib/python2.7/site-packages/novaclient/v2/client.py:109:
 UserWarning: 'novaclient.v2.client.Client' is not designed to be initialized 
directly. It is inner class of novaclient. Please, use 
'novaclient.client.Client' instead. Related lp bug-report: 1493576
  _LW("'novaclient.v2.client.Client' is not designed to be "
}}}

stdout: {{{
Engine test case exception occurred: 4 != 5
Exception type: 

Printing workflow executions...

wb.wf1 [state=SUCCESS, output={u'__execution': {u'params': {}, u'id': 
u'2807dd99-ca6f-49d7-886d-7d3b79e1c49e', u'spec': {u'type': u'direct', u'name': 
u'wf1', u'tasks': {u'task4': {u'type': u'direct', u'name': u'task4', 
u'version': u'2.0', u'action': u'std.echo output="Task 4"'}, u'task2': 
{u'type': u'direct', u'name': u'task2', u'on-complete': [u'task4'], u'version': 
u'2.0', u'action': u'std.mistral_http url="http://google.com"'}, u'task3': 
{u'type': u'direct', u'name': u'task3', u'version': u'2.0', u'action': 
u'std.echo output="Task 3"'}, u'task1': {u'type': u'direct', u'name': u'task1', 
u'on-complete': [u'task3', u'pause'], u'version': u'2.0', u'action': u'std.echo 
output="Hi!"'}}, u'version': u'2.0'}, u'input': {}}, u'task4': u'Task 4', 
u'task3': u'Task 3', u'__tasks': {u'848c6e92-b1b1-4d54-b11d-c93cfb4fc88f': 
u'task2', u'00a546e7-8da9-4603-b6be-54d58b14c625': u'task1'}}]
 task2 [id=848c6e92-b1b1-4d54-b11d-c93cfb4fc88f, state=SUCCESS, 
published={}]
 task1 [id=00a546e7-8da9-4603-b6be-54d58b14c625, state=SUCCESS, 
published={}]
 task3 [id=8ce20324-7fba-4424-bcd2-1e0c9b27fd4a, state=SUCCESS, 
published={}]
 task4 [id=3758de43-9bc3-4ac9-b3f3-29eb543b16ef, state=SUCCESS, 
published={}]
 task4 [id=f12ee464-0ba5-48c7-8423-9f425a00e675, state=SUCCESS, 
published={}]
}}}

Traceback (most recent call last):
  File "mistral/tests/unit/engine/test_workflow_resume.

Re: [openstack-dev] [heat] suggestion for lock/protect stack blueprint

2015-04-09 Thread KOFFMAN, Noa (Noa)
Hey everyone,


Regarding the lock-stack blueprint: following Steve's and Pavlo's
suggestions, I created the blueprint in heat-specs.

This is the launchpad link:

https://blueprints.launchpad.net/heat/+spec/lock-stack

On Wed, Apr 8, 2015 at 4:54 PM Steve Hardy wrote:
>We might consider making this a stack "action", e.g like suspend/resume -
>actions are intended for stack-wide operations which affect the stack state
>but not it's definition, so it seems like potentially a good fit


On Wed, Apr 8, 2015 at 4:59 PM Pavlo Shchelokovskyy wrote:
>would you kindly propose this blueprint as a spec in heat-specs project on
>review.openstack.org? It is way easier to discuss specs in a Gerrit review
>format than in ML.


I would appreciate any comments, suggestions and reviews.

Thanks

Noa Koffman


-Original Message-
From: Pavlo Shchelokovskyy [pshchelokovs...@mirantis.com]
Received: Wednesday, 08 Apr 2015, 16:59
To: OpenStack Development Mailing List (not for usage questions) 
[openstack-dev@lists.openstack.org]
Subject: Re: [openstack-dev] [heat] suggestion for lock/protect stack blueprint

Hi Noa,

would you kindly propose this blueprint as a spec in the heat-specs project on 
review.openstack.org? It is way easier to discuss specs in a Gerrit review 
format than in the ML. If you need help with submitting a spec for review, 
come to our IRC channel (#heat at freenode.net), we'll gladly help you with 
that.

Best regards,

Pavlo Shchelokovskyy
Software Engineer
Mirantis Inc
www.mirantis.com



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



[openstack-dev] [heat] suggestion for lock/protect stack blueprint

2015-04-08 Thread KOFFMAN, Noa (Noa)
Hey,

I would like to suggest a blueprint to allow locking/protecting a stack, 
similar to the nova server "lock" or the glance image "--is-protected" 
flag.
Once a stack is locked, the only operation allowed on it is "unlock": 
the heat engine should reject any stack operations and ignore signals 
that modify the stack (such as scaling).

The lock operation should have a "lock_resources" flag (default = True):
When True: perform the heat lock and enable lock/protect for each stack 
resource that supports it (nova server, glance image, ...).
When False: perform the heat lock only - which would lock the stack and all 
nested stacks (actions on individual resources will not be affected).
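To make the proposed semantics concrete, here is a minimal sketch of the 
dispatch rule described above. The Stack class, dispatch function, 
LockedStackError, and the string operation names are all hypothetical, purely 
to illustrate "reject operations, ignore signals, allow only unlock" - none of 
this is real Heat code:

```python
# Illustrative only: a guard expressing the proposed lock semantics.

class LockedStackError(Exception):
    pass

class Stack(object):
    def __init__(self, name):
        self.name = name
        self.locked = False

def dispatch(stack, operation):
    """While locked: allow only 'unlock', silently ignore signals
    (e.g. an autoscaling alarm), and reject everything else."""
    if stack.locked:
        if operation == "unlock":
            stack.locked = False
            return "unlocked"
        if operation == "signal":
            return "ignored"
        raise LockedStackError("stack %s is locked" % stack.name)
    if operation == "lock":
        stack.locked = True
        return "locked"
    return "executed"
```

With lock_resources=True, the same lock call would additionally set the 
per-resource protection (nova lock, glance is-protected) where supported.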

Use-cases:
1. We have received several requests from application vendors to allow a 
"maintenance mode" for the application, in which no topology changes are 
permitted. For example, maintenance mode is required for a clustered DB app 
that needs a manual reboot of one of its servers: when the server reboots, 
the other servers redistribute the data among themselves, which causes high 
CPU load, which in turn might trigger an undesired scale-out (which would 
cause another CPU spike, and so on).
2. Some cloud admins have a configuration stack that initializes the cloud 
(creating networks, flavors, images, ...), and these resources should always 
exist while the cloud exists. Locking these configuration stacks would 
prevent someone from accidentally deleting/modifying the stack or its 
resources.

This feature might grow even further in significance once convergence phase 2 
is in place and many other automatic actions are performed by heat. 
The ability to manually perform admin actions on the stack without 
interruption is important.

Any thoughts/comments/suggestions are welcome.

Thanks   
Noa Koffman.


