[ https://issues.apache.org/jira/browse/CLOUDSTACK-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908164#comment-13908164 ]
ASF subversion and git services commented on CLOUDSTACK-6124:
-------------------------------------------------------------
Commit c6a8659ac2c2c85c6f0e9fb8af11606aaa2f9a37 in cloudstack's branch
refs/heads/4.3-forward from [~koushikd]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=c6a8659 ]
CLOUDSTACK-6124: During MS maintenance unfinished work items are not cleaned up
resulting in them getting repeated for every subsequent maintenance
Updating the op_it_work table entry appropriately in db once the unfinished
work item is completed.
> During MS maintenance unfinished work items are not cleaned up resulting in
> them getting repeated for every subsequent maintenance
> ----------------------------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-6124
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-6124
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.3.0
> Reporter: Koushik Das
> Assignee: Koushik Das
> Fix For: 4.4.0
>
>
> During MS shutdown, all of its pending work items (op_it_work.step != 'Done')
> are picked up by another MS in the cluster. For each pending work item, the
> new MS checks whether the VM is running and, if it is not, tries to start it
> (using the same mechanism that is used to HA VMs). If the investigators find
> that the VM is still alive, no action is needed. This completes the
> processing of all pending work items.
> There appears to be a bug in the code: op_it_work.step is not marked as
> 'Done' in the above case, so the work items remain pending indefinitely.
> As a result, every time the MS owning these work items is shut down, the
> work items are picked up by another MS and the steps above are repeated.
> Scenario in which a pending work item may be created: if a VM deployment
> fails, type and step are set to 'Starting' and 'Release' respectively.
> Ideally, if the operation ends gracefully, the step is updated to 'Done'.
> But if there is an abrupt termination, the step may remain 'Release' for
> some work items. The step is then never updated to 'Done' for these items,
> and they are retried whenever a new MS takes over.
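
The takeover behaviour described above can be illustrated with a minimal
in-memory sketch. This is not CloudStack code: the WorkItem class, the
take_over function, the is_vm_alive and start_vm callbacks, and the mark_done
flag are all hypothetical names introduced here, and only the op_it_work
columns "type" and "step" and their values come from the description. It
assumes that ownership of a pending item moves to the MS that picks it up.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    """Hypothetical stand-in for one row of the op_it_work table."""
    vm_id: int
    mgmt_server_id: int  # MS that currently owns the item (assumed column)
    type: str            # e.g. 'Starting'
    step: str            # e.g. 'Release' or 'Done'


def take_over(items, old_ms, new_ms, is_vm_alive, start_vm, mark_done=True):
    """Process the pending items of old_ms on new_ms.

    Returns the number of items that were processed. With mark_done=False
    the step is left untouched, which models the reported bug.
    """
    handled = 0
    for item in items:
        # Only items owned by the departing MS that are not yet 'Done'.
        if item.mgmt_server_id != old_ms or item.step == "Done":
            continue
        item.mgmt_server_id = new_ms
        # Start the VM only if the check says it is not running.
        if not is_vm_alive(item.vm_id):
            start_vm(item.vm_id)
        if mark_done:
            # Assumed shape of the fix: record completion of the work item.
            item.step = "Done"
        handled += 1
    return handled
```

With mark_done=False, an item left in step 'Release' is processed again on
every handover between two servers. With mark_done=True it is processed once
and then skipped on later handovers.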