Re: [openstack-dev] [mistral] bugfix for "Fix concurrency issues by using READ_COMMITTED" unveils / creates a different bug
Hi,

I reproduced this issue multiple times before your fix was merged. To reproduce it, I used a workflow with multiple async actions and resumed all of them at the same time.

I have just created a ticket in Launchpad [1] with the workflow used and the Mistral engine logs.

[1] - https://bugs.launchpad.net/mistral/+bug/1524477

If anyone could take a look and confirm the bug, that would be great.

Thanks,
Noa Koffman

From: ELISHA, Moshe (Moshe) [moshe.eli...@alcatel-lucent.com]
Sent: Monday, December 07, 2015 6:29 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [mistral] bugfix for "Fix concurrency issues by using READ_COMMITTED" unveils / creates a different bug

Hi all,

The current bugfix I am working on [1] has unveiled / created a bug. Test “WorkflowResumeTest.test_resume_different_task_states” sometimes fails because “task4” is executed twice instead of once (see unit test output and workflow below).

This happens because task2's on-complete runs task4 as expected, but task3 also executes task4 by mistake. It is not consistent, but it happens quite often: it occurs when the unit test resumes the WF, updates the action execution of task2, and finishes task2 before task3 is finished.

Scenario:
1. Task2 in method on_action_complete: changes task2 state to RUNNING.
2. Task3 in method on_action_complete: changes task2 state to RUNNING (before task2 calls _on_task_state_change).
3. Task3 in “_on_task_state_change” > “continue_workflow” > “DirectWorkflowController._find_next_commands”: it finds task2 because task2 is in SUCCESS and processed = False, and “_find_next_commands_for_task(task2)” returns task4.
4. Task3 executes command to RunTask task4.
5. Task2 in “_on_task_state_change” > “continue_workflow” > “DirectWorkflowController._find_next_commands”: it finds task2 because task2 is in SUCCESS and processed = False, and “_find_next_commands_for_task(task2)” returns task4.
6. Task2 executes command to RunTask task4.
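The scenario above is a check-then-act race: two completion handlers both see task2 in SUCCESS with processed = False before either one marks it as processed. Below is a minimal in-memory sketch of that pattern and of one possible guard. It is not Mistral code: the `Task` class, `find_next_commands`, `claim_next_commands` and the use of a `threading.Lock` are all hypothetical names chosen for illustration. In a database-backed engine the equivalent guard would have to live in the database (for example a row lock or a conditional update), which is an assumption and not something this thread confirms.

```python
import threading
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for a task execution row (not Mistral's model)."""
    name: str
    state: str = "RUNNING"
    processed: bool = False


def find_next_commands(tasks, on_complete):
    """Unguarded lookup: successors of every SUCCESS task not yet processed."""
    commands = []
    for task in tasks:
        if task.state == "SUCCESS" and not task.processed:
            commands.extend(on_complete.get(task.name, []))
    return commands


def claim_next_commands(tasks, on_complete, lock):
    """Guarded lookup: test and set `processed` inside one critical section."""
    commands = []
    with lock:
        for task in tasks:
            if task.state == "SUCCESS" and not task.processed:
                task.processed = True
                commands.extend(on_complete.get(task.name, []))
    return commands


on_complete = {"task2": ["task4"]}

# Unguarded: both handlers look before either marks task2 as processed
# (steps 3 and 5 of the scenario), so task4 is scheduled twice.
tasks = [Task("task2", state="SUCCESS"), Task("task3", state="SUCCESS")]
seen_by_task3 = find_next_commands(tasks, on_complete)
seen_by_task2 = find_next_commands(tasks, on_complete)
unguarded_runs = seen_by_task3 + seen_by_task2

# Guarded: whichever handler enters first claims task2; the other finds nothing.
tasks = [Task("task2", state="SUCCESS"), Task("task3", state="SUCCESS")]
lock = threading.Lock()
guarded_runs = (claim_next_commands(tasks, on_complete, lock)
                + claim_next_commands(tasks, on_complete, lock))

print(unguarded_runs)  # ['task4', 'task4']
print(guarded_runs)    # ['task4']
```

The point of the guarded variant is only that the check and the update of `processed` happen as one step, so a second handler cannot observe the intermediate state.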
[1] - https://review.openstack.org/#/c/253819/

If I am not mistaken, this issue also exists in the current code and my bugfix only made it happen much more often. Can you confirm? I don’t have enough knowledge to fix this issue…

For now, I have modified the test_resume_different_task_states unit test to wait for task3 to be processed before updating the action execution of task2.

If you agree this bug exists today as well, we can proceed with my bugfix and open a different bug for that issue.

Thanks.

[stack@melisha-devstack mistral(keystone_admin)]$ tox -e py27 -- WorkflowResumeTest.test_resume_different_task_states
...
==
FAIL: mistral.tests.unit.engine.test_workflow_resume.WorkflowResumeTest.test_resume_different_task_states
tags: worker-0
--
pythonlogging:'': {{{WARNING [oslo_db.sqlalchemy.utils] Id not in sort_keys; is sort_keys unique?}}}

stderr: {{{
/opt/stack/mistral/.tox/py27/local/lib/python2.7/site-packages/novaclient/v2/client.py:109: UserWarning: 'novaclient.v2.client.Client' is not designed to be initialized directly. It is inner class of novaclient. Please, use 'novaclient.client.Client' instead. Related lp bug-report: 1493576
  _LW("'novaclient.v2.client.Client' is not designed to be "
}}}

stdout: {{{
Engine test case exception occurred: 4 != 5
Exception type:
Printing workflow executions...
wb.wf1 [state=SUCCESS, output={u'__execution': {u'params': {}, u'id': u'2807dd99-ca6f-49d7-886d-7d3b79e1c49e', u'spec': {u'type': u'direct', u'name': u'wf1', u'tasks': {u'task4': {u'type': u'direct', u'name': u'task4', u'version': u'2.0', u'action': u'std.echo output="Task 4"'}, u'task2': {u'type': u'direct', u'name': u'task2', u'on-complete': [u'task4'], u'version': u'2.0', u'action': u'std.mistral_http url="http://google.com";'}, u'task3': {u'type': u'direct', u'name': u'task3', u'version': u'2.0', u'action': u'std.echo output="Task 3"'}, u'task1': {u'type': u'direct', u'name': u'task1', u'on-complete': [u'task3', u'pause'], u'version': u'2.0', u'action': u'std.echo output="Hi!"'}}, u'version': u'2.0'}, u'input': {}}, u'task4': u'Task 4', u'task3': u'Task 3', u'__tasks': {u'848c6e92-b1b1-4d54-b11d-c93cfb4fc88f': u'task2', u'00a546e7-8da9-4603-b6be-54d58b14c625': u'task1'}}]
task2 [id=848c6e92-b1b1-4d54-b11d-c93cfb4fc88f, state=SUCCESS, published={}]
task1 [id=00a546e7-8da9-4603-b6be-54d58b14c625, state=SUCCESS, published={}]
task3 [id=8ce20324-7fba-4424-bcd2-1e0c9b27fd4a, state=SUCCESS, published={}]
task4 [id=3758de43-9bc3-4ac9-b3f3-29eb543b16ef, state=SUCCESS, published={}]
task4 [id=f12ee464-0ba5-48c7-8423-9f425a00e675, state=SUCCESS, published={}]
}}}

Traceback (most recent call last):
File "mistral/tests/unit/engine/test_workflow_resume.
Re: [openstack-dev] [heat] suggestion for lock/protect stack blueprint
Hey everyone,

Regarding the lock-stack blueprint: following Steve's and Pavlo's suggestions, I created the following blueprint in heat-specs. This is the Launchpad link:
https://blueprints.launchpad.net/heat/+spec/lock-stack

On Wed, Apr 8, 2015 at 4:54 PM Steve Hardy wrote:
> We might consider making this a stack "action", e.g like suspend/resume -
> actions are intended for stack-wide operations which affect the stack state
> but not it's definition, so it seems like potentially a good fit

On Wed, Apr 8, 2015 at 4:59 PM Pavlo Shchelokovskyy wrote:
> would you kindly propose this blueprint as a spec in heat-specs project on
> review.openstack.org? It is way easier to discuss specs in a Gerrit review
> format than in ML.

I would appreciate any comments, suggestions and reviews.

Thanks,
Noa Koffman

-----Original Message-----
From: Pavlo Shchelokovskyy [pshchelokovs...@mirantis.com]
Received: Wednesday, 08 Apr 2015, 16:59
To: OpenStack Development Mailing List (not for usage questions) [openstack-dev@lists.openstack.org]
Subject: Re: [openstack-dev] [heat] suggestion for lock/protect stack blueprint

Hi Noa,

would you kindly propose this blueprint as a spec in heat-specs project on review.openstack.org? It is way easier to discuss specs in a Gerrit review format than in ML. If you need help with submitting a spec for review, come to our IRC channel (#heat at freenode.net), we'll gladly help you with that.

Best regards,
Pavlo Shchelokovskyy
Software Engineer
Mirantis Inc
www.mirantis.com

On Wed, Apr 8, 2015 at 3:43 PM, KOFFMAN, Noa (Noa) <noa.koff...@alcatel-lucent.com> wrote:

Hey,

I would like to suggest a blueprint to allow locking/protecting a stack, similar to the nova server "lock" or the glance-image "--is-protected" flag.
Once a stack is locked, the only operation allowed on the stack is "unlock": the heat engine should reject any other stack operation and ignore signals that modify the stack (such as scaling).

The lock operation should have a "lock_resources" flag (default = True):
When True: perform heat lock and enable lock/protect for each stack resource that supports it (nova server, glance image, ...).
When False: perform heat lock only, which would lock the stack and all nested stacks (actions on resources will not be affected).

Use-cases:
1. We received several requests from application vendors to allow a "maintenance mode" for the application. While in maintenance, no topology changes are permitted. For example, a maintenance mode is required for a clustered DB app that needs a manual reboot of one of its servers: when the server reboots, all the other servers redistribute the data among themselves, which causes high CPU levels, which in turn might cause an undesired scale out (which will cause another CPU spike, and so on...).
2. Some cloud admins have a configuration stack that initializes the cloud (creating networks, flavors, images, ...), and these resources should always exist while the cloud exists. Locking these configuration stacks will prevent someone from accidentally deleting/modifying the stack or its resources.

This feature might grow even more significant once convergence phase 2 is in place and many other automatic actions are performed by heat. The ability to manually perform admin actions on the stack with no interruptions is important.

Any thoughts/comments/suggestions are welcome.

Thanks,
Noa Koffman.
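To make the proposed semantics concrete, here is a small sketch of the guard described above: explicit operations on a locked stack are rejected, while signals such as scaling are silently ignored. This is only an illustration of the proposal, not Heat code. The `Stack` class, `StackLockedError`, `update`, `signal_scale_out` and the `capacity` attribute are hypothetical names invented for the example.

```python
class StackLockedError(Exception):
    """Raised when an operation is attempted on a locked stack."""


class Stack:
    """Hypothetical model of the proposed behaviour; not Heat's Stack class."""

    def __init__(self, name):
        self.name = name
        self.locked = False
        self.capacity = 1

    def lock(self):
        self.locked = True

    def unlock(self):
        # "unlock" is the only operation allowed on a locked stack.
        self.locked = False

    def update(self, capacity):
        # Explicit stack operations are rejected while the stack is locked.
        if self.locked:
            raise StackLockedError(f"stack {self.name} is locked")
        self.capacity = capacity

    def signal_scale_out(self):
        # Signals that would modify the stack are ignored, not failed.
        if self.locked:
            return False
        self.capacity += 1
        return True


stack = Stack("db-cluster")
stack.lock()

ignored = not stack.signal_scale_out()  # CPU alarm fires during maintenance
try:
    stack.update(capacity=3)
    rejected = False
except StackLockedError:
    rejected = True

stack.unlock()
stack.update(capacity=3)
print(ignored, rejected, stack.capacity)  # True True 3
```

Treating signals differently from explicit operations matches use-case 1: an alarm raised by the CPU spike during a manual reboot should not produce an error, it should simply have no effect until the stack is unlocked.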
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [heat] suggestion for lock/protect stack blueprint
Hey,

I would like to suggest a blueprint to allow locking/protecting a stack, similar to the nova server "lock" or the glance-image "--is-protected" flag.

Once a stack is locked, the only operation allowed on the stack is "unlock": the heat engine should reject any other stack operation and ignore signals that modify the stack (such as scaling).

The lock operation should have a "lock_resources" flag (default = True):
When True: perform heat lock and enable lock/protect for each stack resource that supports it (nova server, glance image, ...).
When False: perform heat lock only, which would lock the stack and all nested stacks (actions on resources will not be affected).

Use-cases:
1. We received several requests from application vendors to allow a "maintenance mode" for the application. While in maintenance, no topology changes are permitted. For example, a maintenance mode is required for a clustered DB app that needs a manual reboot of one of its servers: when the server reboots, all the other servers redistribute the data among themselves, which causes high CPU levels, which in turn might cause an undesired scale out (which will cause another CPU spike, and so on...).
2. Some cloud admins have a configuration stack that initializes the cloud (creating networks, flavors, images, ...), and these resources should always exist while the cloud exists. Locking these configuration stacks will prevent someone from accidentally deleting/modifying the stack or its resources.

This feature might grow even more significant once convergence phase 2 is in place and many other automatic actions are performed by heat. The ability to manually perform admin actions on the stack with no interruptions is important.

Any thoughts/comments/suggestions are welcome.

Thanks,
Noa Koffman.
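One way to read the "lock_resources" flag is sketched below: the heat lock always covers the stack and its nested stacks, and the flag only decides whether resources that support their own lock/protect mechanism are locked as well. This is an interpretation of the proposal, not an existing Heat API. `StackNode`, `ResourceNode`, `supports_lock` and `lock_stack` are hypothetical names used only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceNode:
    """Hypothetical resource; supports_lock models e.g. nova server, glance image."""
    name: str
    supports_lock: bool = False
    locked: bool = False


@dataclass
class StackNode:
    """Hypothetical stack with resources and nested stacks."""
    name: str
    resources: list = field(default_factory=list)
    nested: list = field(default_factory=list)
    locked: bool = False


def lock_stack(stack, lock_resources=True):
    """Lock a stack and all nested stacks; optionally lock capable resources."""
    stack.locked = True
    if lock_resources:
        for res in stack.resources:
            if res.supports_lock:
                res.locked = True
    for child in stack.nested:
        lock_stack(child, lock_resources)


def build():
    """Build a small configuration stack with one nested stack."""
    child = StackNode("nested", resources=[ResourceNode("server", supports_lock=True)])
    return StackNode(
        "config",
        resources=[ResourceNode("image", supports_lock=True), ResourceNode("network")],
        nested=[child],
    )


full = build()
lock_stack(full)  # default: lock_resources=True

stack_only = build()
lock_stack(stack_only, lock_resources=False)

print([r.locked for r in full.resources], full.nested[0].resources[0].locked)
print([r.locked for r in stack_only.resources], stack_only.nested[0].locked)
```

In this reading, a resource without lock support (the network here) is simply left alone when the flag is True, which follows the "for each stack resource that supports it" wording of the proposal.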