Koushik Das created CLOUDSTACK-6124:
---------------------------------------

             Summary: During MS maintenance unfinished work items are not 
cleaned up resulting in them getting repeated for every subsequent maintenance
                 Key: CLOUDSTACK-6124
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-6124
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.3.0
            Reporter: Koushik Das
            Assignee: Koushik Das
             Fix For: 4.4.0


During MS shutdown, all of its pending work items (op_it_work.step != 'Done') 
are picked up by another MS in the cluster. For each pending work item, the new 
MS checks whether the VM is running and, if it is not, tries to start it (using 
the same mechanism that is used to HA VMs). If the investigators find that the 
VM is still alive, no action is needed. This completes the check of all pending 
work items.

There appears to be a bug in the code: op_it_work.step is not marked as 'Done' 
in the above case, so the work items remain pending forever. As a result, every 
time the MS owning these work items is shut down, they are picked up by another 
MS and the steps described above are repeated.
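
The intended behaviour can be illustrated with a minimal sketch. This is not 
CloudStack's actual code: WorkItem, TakeoverSketch, processPending and the 
isVmAlive predicate are hypothetical names standing in for the op_it_work row, 
the takeover logic and the investigators. The point it shows is that a work 
item whose VM is found alive should be closed, so that it is not picked up 
again on the next MS shutdown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of a row in op_it_work; not CloudStack's actual class.
class WorkItem {
    final long vmId;
    String step; // e.g. "Release" or "Done"

    WorkItem(long vmId, String step) {
        this.vmId = vmId;
        this.step = step;
    }

    // Mirrors the condition op_it_work.step != 'Done'.
    boolean isPending() {
        return !"Done".equals(step);
    }
}

class TakeoverSketch {
    // Processes the work items inherited from an MS that was shut down.
    // isVmAlive stands in for the investigators (an assumption of this sketch).
    // Returns the ids of the VMs that need to be restarted through HA.
    static List<Long> processPending(List<WorkItem> items, Predicate<Long> isVmAlive) {
        List<Long> toRestart = new ArrayList<>();
        for (WorkItem item : items) {
            if (!item.isPending()) {
                continue; // already finished, nothing to do
            }
            if (isVmAlive.test(item.vmId)) {
                // No action is needed on the VM, but the work item must be
                // closed; leaving this out is the behaviour reported here.
                item.step = "Done";
            } else {
                // The restart path is outside this sketch, so the step of
                // these items is left unchanged.
                toRestart.add(item.vmId);
            }
        }
        return toRestart;
    }
}
```

With this change, a second takeover over the same items no longer re-checks 
the VMs that were already found alive.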

Scenario in which a pending work item may be created: if a VM deployment fails, 
type and step are set to 'Starting' and 'Release' respectively. Ideally, if the 
operation ends gracefully, the step is updated to 'Done'. But if there is an 
abrupt termination, the step may remain 'Release' for some work items. The step 
is then never updated to 'Done' for these items, and they are retried every 
time a new MS takes over.





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
