Dear Rafael
Many thanks for your deep analysis. Nice humor with the planet alignments, BTW ;)

We also suspected the HA flag to be related to this issue, as we never saw this issue in the past, until we enabled HA on all VMs and offerings.
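(Side note: to see which VMs currently carry the flag, we queried the DB directly. The following is only a sketch from memory — I am not 100% sure of the table/column names, e.g. vm_instance.ha_enabled, so please double-check before relying on it:

mysql> -- hypothetical check, names from memory: list running VMs with the HA flag set
mysql> SELECT id, instance_name, state
    -> FROM vm_instance
    -> WHERE ha_enabled = 1 AND removed IS NULL;

A similar check on the service offerings table would show the HA-enabled offerings.)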
Here is also the workers list:

mysql> select * from op_ha_work;
+------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
| id | instance_id | type | vm_type | state | mgmt_server_id | host_id | created | tried | taken | step | time_to_try | updated |
+------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
| 105 | 343 | HA | User | Running | 345049098498 | 29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
| 109 | 407 | HA | ConsoleProxy | Running | 345049098498 | 33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
| 113 | 407 | HA | ConsoleProxy | Running | 345049098498 | 33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
| 117 | 407 | HA | ConsoleProxy | Running | 345049098498 | 33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
| 121 | 343 | HA | User | Running | 345049098498 | 29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
| 125 | 405 | HA | SecondaryStorageVm | Running | 345049098498 | 29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
| 129 | 393 | HA | User | Running | 345049098498 | 25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056014 | 30 |
| 133 | 405 | HA | SecondaryStorageVm | Running | 345049098498 | 29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056014 | 3 |
| 137 | 402 | HA | ConsoleProxy | Running | 345049098498 | 25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056014 | 3 |
| 345 | 402 | HA | ConsoleProxy | Starting | 345049098498 | 25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 | Error | 1402527506 | 770 |
| 349 | 687 | HA | SecondaryStorageVm | Starting | 345049098498 | 25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 | Error | 1402527506 | 2 |
| 2057 | 1178 | Migration | User | Running | 345049098498 | 33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done | 1451024774 | 95 |
| 2058 | 1736 | Migration | User | Running | 345049098498 | 224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done | 1451024774 | 45 |
| 2059 | 1240 | Migration | User | Running | 345049098498 | 7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done | 1451024774 | 61 |
| 2060 | 1178 | Migration | User | Running | 345049098122 | 33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 | Done | 1451025374 | 95 |
| 2061 | 1178 | Migration | User | Stopped | NULL | 33 | 2017-01-31 07:56:09 | 5 | NULL | Done | 1451027900 | 109 |
| 2062 | 1690 | Migration | User | Running | 345049098498 | 224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 | Done | 1451024774 | 109 |
| 2063 | 1177 | Migration | User | Running | 345049098498 | 29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 | Done | 1451024774 | 82 |
| 2064 | 1690 | Migration | User | Running | 345049098122 | 224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done | 1451024900 | 109 |
| 2065 | 1240 | Migration | User | Running | 345049098122 | 7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done | 1451024900 | 61 |
| 2066 | 1736 | Migration | User | Running | 345049098122 | 224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done | 1451024900 | 45 |
| 2067 | 1240 | Migration | User | Stopped | NULL | 7 | 2017-01-31 07:56:09 | 5 | NULL | Done | 1451028003 | 80 |
| 2068 | 1178 | Migration | User | Running | 345049098122 | 33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done | 1451024781 | 95 |
| 2069 | 1736 | Migration | User | Running | 345049098122 | 224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done | 1451024781 | 45 |
| 2070 | 1240 | Migration | User | Running | 345049098122 | 7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done | 1451024781 | 61 |
| 2071 | 1177 | Migration | User | Running | 345049098122 | 29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done | 1451024781 | 82 |
| 2072 | 1177 | Migration | User | Stopped | NULL | 29 | 2017-01-31 07:56:16 | 5 | NULL | Done | 1451027973 | 101 |
| 2073 | 1177 | Migration | User | Running | 345049098122 | 29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 | Done | 1451024904 | 82 |
| 2074 | 1690 | Migration | User | Stopped | NULL | 224 | 2017-01-31 07:56:16 | 5 | NULL | Done | 1451027464 | 123 |
| 2075 | 1736 | Migration | User | Stopped | NULL | 224 | 2017-01-31 07:56:16 | 5 | NULL | Done | 1451027477 | 59 |
+------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+

If we put a host in maintenance, it may unfortunately happen that a timeout occurs and CS aborts the migration. As far as we can interpret the logs, it seems (we speculate) that CS then detects that the VM is still running on the original host, which is in maintenance, and somehow decides to stop it because of the "host not up" status. We then need to start the VM several times, until CS gives up stopping it.

After restarting the MSs I guess a similar issue may happen: the hosts are disconnected during the early stage and then reconnect. Maybe CS then again stops the VMs because of the "host not up" status? Cf.:

2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The last host of this VM is not UP or is not enabled, host status is: Disconnected, host resource state is: Enabled

But the problem after the MSs restart definitely impacts only VMs listed in the table op_ha_work.
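For reference, the VMs that still have such worker rows can be listed with a plain read-only query (using only the columns shown above):

mysql> SELECT instance_id, vm_type, type, state, step, COUNT(*) AS entries
    -> FROM op_ha_work
    -> GROUP BY instance_id, vm_type, type, state, step
    -> ORDER BY instance_id;

So far, every VM that was stopped after an MS restart shows up in this list.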
To your questions:

We have 3 ACS MSs managing several zones. Each zone has a single pod and cluster. The cluster of i-638-1735-VM has this summary:
  Total number of VMs: 121
  Total number of hypervisors: 4
  Total allocated RAM: 522.0 / 972 GB (53.7 %)
  Total allocated cores: 211
  Total allocated storage: 11285 GB
Used memory on the hosts currently varies from 28 to 77%, so there is enough RAM.

I did not check the XS logs. The same problem arose again this morning (shortly after a restart of all MSs), so I will look in the XS logs now.

I have some fair experience with Eclipse and Java; I will ask my boss if we can go this way.

The main issue for now is that restarting the MSs reproduces the problem, which is quite annoying for the impacted VMs. I thought of stopping the MSs, deleting all HA/Migration entries in op_ha_work, and starting the MSs again. Is that a viable option? (Or should I delete the Migration entries only? Or never touch this table?) I sketched the kind of statement I have in mind in a PS after the logs below. We will also very probably remove the HA flag on all VMs if it does not work reliably.

Best Regards
Francois Scheurer


Here are new logs regarding the same issue from today. I tried to extract the important parts related to the stop of i-189-1178-VM.

regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Deploy avoids pods: [], clusters: [], hosts: [29]
2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache for 232
2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) DeploymentPlanner allocation algorithm: com.cloud.deploy.UserDispersingPlanner@1734b2e2
2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25, requested cpu: 4200, requested ram: 4294967296
2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is ROOT volume READY (pool already allocated)?: Yes
2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) This VM has last host_id specified, trying to choose the same host: 25
2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The last host of this VM is not UP or is not enabled, host status is: Disconnected, host resource state is: Enabled
2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Cannot choose the last host to deploy this VM
2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Searching resources only under specified Cluster: 25
2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea FirstFitRoutingAllocator) Found 0 hosts for allocation after prioritization: []
2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No suitable hosts found
2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Could not find suitable Deployment Destination for this VM under any clusters, returning.
2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Searching resources only under specified Cluster: 25
2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The specified cluster is in avoid set, returning.
2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Failed to deploy vm 1178 with original planner, sending HAPlanner
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Deploy avoids pods: [], clusters: [], hosts: [29]
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25, requested cpu: 4200, requested ram: 4294967296
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is ROOT volume READY (pool already allocated)?: Yes
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) This VM has last host_id specified, trying to choose the same host: 25
2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The last host of this VM is not UP or is not enabled, host status is: Disconnected, host resource state is: Enabled
2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Cannot choose the last host to deploy this VM
2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Searching resources only under specified Cluster: 25
2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Deploying vm during HA process, so skipping disable threshold check
2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea FirstFitRoutingAllocator) Found 0 hosts for allocation after prioritization: []
2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No suitable hosts found
2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish async job-192986 complete on message bus
2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs related to job-192986
2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db status for job-192986
2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The last host of this VM is not UP or is not enabled, host status is: Disconnected, host resource state is: Enabled
2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) Cannot choose the last host to deploy this VM
2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) Searching resources only under specified Cluster: 35
2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Could not find suitable Deployment Destination for this VM under any clusters, returning.
2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Searching resources only under specified Cluster: 25
2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item: SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob, contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27, lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05 CET 2017}
2017-01-30 08:44:06,169 INFO [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add job-192992 into job monitoring
2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) Deploying vm during HA process, so skipping disable threshold check
2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The specified cluster is in avoid set, returning.
2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Invocation exception, caused by: com.cloud.exception.InsufficientServerCapacityException: Unable to find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
2017-01-30 08:44:06,187 INFO [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Rethrow exception com.cloud.exception.InsufficientServerCapacityException: Unable to find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with run of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
com.cloud.exception.InsufficientServerCapacityException: Unable to find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
        at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
        at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs joined with job-192986 and disjoin all subjobs created from job- 192986 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete async job-192989, jobStatus: FAILED, resultCode: 0, result: rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZlckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vwc0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRDYXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5nL0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xvdWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vycm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4aOxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNhdXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQAEkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFuZy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QAEExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBzZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFjZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFyaW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hlc3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EAfgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNxAH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5yZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhqYXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EAfgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pvYnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNsb3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRjaGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V
0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFsdWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-omaW2FAIAAHhwAA 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Deploy avoids pods: [], clusters: [], hosts: [3] 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Trying to allocate a host and storage pools from dc:14, pod:17,cluster:17, requested cpu: 4200, requested ram: 4294967296 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Is ROOT volume READY (pool already allocated)?: Yes 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) This VM has last host_id specified, trying to choose the same host: 213 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish async job-192989 complete on message bus 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs related to job-192989 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db status for job-192989 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) Checking resources in Cluster: 35 under Pod: 31 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f FirstFitRoutingAllocator) Looking for hosts in dc: 23 pod:31 cluster:35 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:05 CET 2017} 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953 2017-01-30 08:44:06,230 DEBUG 
[o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item: SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob, contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61, lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05 CET 2017} 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) The last host of this VM is not UP or is not enabled, host status is: Disconnected, host resource state is: Enabled 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Cannot choose the last host to deploy this VM 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner] (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Searching resources only under specified Cluster: 17 2017-01-30 08:44:06,262 INFO [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add job-192991 into job monitoring 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done executing com.cloud.vm.VmWorkMigrateAway for job-192986 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs joined with job-192989 and disjoin all subjobs created from job- 192989 2017-01-30 08:44:06,269 INFO [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove job-192986 from job monitoring 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done executing com.cloud.vm.VmWorkMigrateAway for job-192989 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:05 CET 2017} 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) Ping from 258 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f FirstFitRoutingAllocator) List of hosts in ascending order of number of VMs: [224, 19, 21, 15] 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for allocation: [] 2017-01-30 08:44:06,322 INFO [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated because agent is no longer in the PingMap 2017-01-30 08:44:06,326 INFO [o.a.c.f.j.i.AsyncJobMonitor] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove job-192989 from job monitoring
2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) Migration cancelled because state has changed: VM[User|i-189-1178-VM]
2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) Unable to migrate VM due to: Migration cancelled because state has changed: VM[User|i-189-1178-VM]
2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) VM state transitted from :Running to Stopping with event: StopRequestedvm's original host id: 29 new host id: 33 host id before state transition: 33
2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response Received:
2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request] (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing: { Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,"result":true,"details":"","wait":0}}] }
2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request] (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received: { Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, { NetworkUsageAnswer } }
2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl] (RouterMonitor-1:ctx-2e80f405) Details from executing class com.cloud.agent.api.NetworkUsageCommand:
2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request] (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) Seq 33-1917970491306410035: Sending { Cmd , MgmtId: 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}] }
2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request] (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) Seq 33-1917970491306410035: Executing: { Cmd , MgmtId: 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}] }
2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request
2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-4:ctx-36aae733 work-2049) Processing HAWork[2049-Migration-1178-Stopped-Migrating]
2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext] (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context, setup psudo job for the executing thread
2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object VmWorkJobQueue.1178
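PS: Regarding my cleanup question above, this is roughly the kind of statement I have in mind — completely untested and shown only to make the question concrete; I would run it with all MSs stopped and only after a dump of the table:

mysql> CREATE TABLE op_ha_work_backup AS SELECT * FROM op_ha_work;
mysql> DELETE FROM op_ha_work WHERE type = 'Migration';

Or, more conservatively, only the rows of one affected VM, e.g.:

mysql> DELETE FROM op_ha_work WHERE type = 'Migration' AND instance_id = 1178;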
On 31.01.2017 00:42, Rafael Weingärtner wrote:
This seems to be one of those nasty bugs that are very hard to debug. It seems to be caused by a combination of different factors, such as planet alignments ;)

At class “VirtualMachineManagerImpl”, line 2395 (method “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String, long, DeploymentPlanner)”) is the start of the code flow that triggers the migration. The point here is that, if an error occurs during the migration, the VM will be stopped; for some reason this is executed only for HA VMs. Going a little bit further, we find the point that triggers the error in the method “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO, long, DeployDestination)” at line 1941. This line calls the method “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event, Long, ItWorkVO, Step)” to change the VM state to migrating/prepare migration/or something like that (I am guessing here), but for some reason the state of the VM is not the one expected, and this causes a “ConcurrentOperationException” to be thrown. The problem is that “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event, Long, ItWorkVO, Step)” does not seem to provide many useful log entries.

Just to enable us to discuss the problem a little further, could you tell us a little bit about your environment setting? How many ACS MSs do you have? What is the VM workload being migrated (I mean, the number of VMs and computing resources)? Do the other servers have enough computing resources to support these VMs? Could you also check the XenServer logs on both the source and destination hosts to see what happened when the migration of VM “i-638-1735-VM” was triggered? Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to connect to a running JVM and debug the code? Because for me it will not be possible to go further without looking at the variables in a running environment.

On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <francois.scheu...@everyware.ch> wrote:

Dear Rafael

Sorry for the late answer (I was sick for a couple of days). I confirm we are migrating inside the same XenServer pool (within a CS cluster).

Best Regards
Francois

On 26.01.2017 04:35, Rafael Weingärtner wrote:

Found the code that is executed. I need you to confirm whether you are migrating within a cluster or across clusters.

On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <rafaelweingart...@gmail.com> wrote:

I do not have access to ACS code right now, but I suggest starting debugging here:

2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to Stopping with event: StopRequestedvm's original host id: 224 new host id: 15 host id before state transition: 15

I would try to understand first why ACS requested the hypervisor to stop the VM. I mean the following: check the source code for conditions that would make ACS request the shutdown of VMs. BTW: are you migrating within a cluster or across clusters? I will be able to analyze this further only after work hours, so if you find anything, keep me posted.
On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer < francois.scheu...@everyware.ch> wrote: Hello RafaelI think the important log lines are these: ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) Migration cancelled because state has changed: VM[User|i-638-1736-VM] ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: Migration cancelled because state has changed: VM[User|i-638-1736-VM] ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to Stopping with event: StopRequestedvm's original host id: 224 new host id: 15 host id before state transition: 15 The really disturbing thing is that CS stop again the VM if we start it manually, apparently because the migratingjob is still running. Note the the VM HA Flag is enabled. Thank you. Best Regards. Francois more details: ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for VM[User|i-638-1736-VM] ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64 job-14953/job-191351) Done with run of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953 ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8bacc64 job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway for job-191351 ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64 job-14953/job-191351) Remove job-191351 from job monitoring ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c work-1989) Completed HAWork[1989-Migration-1411-Running-Migrating] ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375: Forwarding Seq 19-8526158519542516375: { Cmd , MgmtId: 345049103441, via: 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{ "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id" :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other (64-bit)","platformEmulator":"Other install media","bootArgs":"","enable HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true, "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0"," hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033 -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de- 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org. 
apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563 ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079- ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3 f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName" :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin gType":"THIN","id":8655,"deviceId":0,"hypervisorType":" XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93 -609d5b980a34","type":"ROOT","_details":{"managed":"false"," storagePort":"0", "storageHost":"localhost","volumeSize":"53687091200"}},{"dat a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64," poolType":"PreSetup","host":"localhost","path":"/b8751b89-00 f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA- DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId ":638,"format":"VHD","provisioningType":"THIN","id":8656," deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1," path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS K","_details":{"managed":"false","storagePort":"0","stor ageHost":"localhos t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c- 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org. 
apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid" :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup"," host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0 -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20- e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894" ,"volumeId":86 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD"," provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1 20-062ee7a67894","type":"DATADISK","_details":{"managed":" false","storagePort":"0","storageHost":"localhost","volumeSize":"42949 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef"," id":201,"format":"ISO","accountId":1,"hvm":true,"displayText ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5 (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":" ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3 a8","ip":"192.168.1.42","netmask":"255.255.255.0"," gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8 .8.8","dns2":"8.8 .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":" vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to 345049098498 ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request] (AgentManager-Handler-10:null) Seq 19-8526158519542516375: Processing: { Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110, [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result": true,"wait":0}}] } ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.m.AgentAttache] (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more commands found ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375: Received: { Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } } ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) Migration cancelled because state has changed: VM[User|i-638-1736-VM] ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: Migration cancelled because state has changed: VM[User|i-638-1736-VM] ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to Stopping with event: StopRequestedvm's original host id: 224 new host id: 15 host id before state transition: 15 ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643: Forwarding Seq 15-6889944479923437643: { Cmd , MgmtId: 345049103441, via: 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut eInSequence":false,"checkBeforeCleanup":false,"vmName":"i- 638-1736-VM","wait":0}}] } to 345049098498 
ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 254-800498: Processing Seq 254-800498: { Cmd , MgmtId: -1, via: 254, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_ proxyVmId":1530,"_loadInfo":"{\n \"connections\": []\n}","wait":0}}] } ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 254-800498: Sending Seq 254-800498: { Ans: , MgmtId: 345049103441, via: 254, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentCo ntrolAnswer":{"result":true,"wait":0}}] } ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request] (AgentManager-Handler-11:null) Seq 15-6889944479923437632: Processing: { Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10, [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result ":true,"details":"Stop VM i-638-1735-VM Succeed","wait":0}}] } ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Seq 15-6889944479923437632: Received: { Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10, { StopAnswer } } ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped on the host. Proceeding to release resource held. ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported in the network id=519 ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for network id=519 on -1 ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1 92.168.1.11-null ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Successfully released network resources for the vm VM[User|i-638-1735-VM] ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Successfully released storage resources for the vm VM[User|i-638-1735-VM] ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) VM state transitted from :Stopping to Stopped with event: OperationSucceededvm's original host id: 15 new host id: null host id before state transition: 15 ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976 and CPU after applying overprovisioning: 331904 ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM: 261961927808 and RAM after applying overprovisioning: 261961924608 ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old used: 71300,reserved: 2100, actual total: 82976, total with overprovisioning: 331904; new used: 67100,reserved:6300; movedfromreserved: false,moveToReserveredtrue ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old used: 98784247808,reserved: 1073741824, total: 261961924608; new used: 94489280512,reserved:5368709120 <(53)%206870-9120>; movedfromreserved: false,moveToReserveredtrue ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported in the network id=519 ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported in the network id=519 ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by: com.cloud.utils.exception.CloudRuntimeException: Unable to migrate VM[User|i-638-1735-VM] ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) Rethrow exception com.cloud.utils.exception.CloudRuntimeException: Unable to migrate VM[User|i-638-1735-VM] ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Done with run of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733 ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Unable to complete AsyncJobVO {id:191379, userId: 1, accountId: 1, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2 VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW FuYWdlckltcGwAAAAAAAAA4A, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 345049098498, completeMsid: null, lastUpdated: null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job origin:13733 ewcstack-man02-prod: com.cloud.utils.exception.CloudRuntimeException: Unable to migrate VM[User|i-638-1735-VM] ewcstack-man02-prod: at com.cloud.vm.VirtualMachineMan agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405) ewcstack-man02-prod: at com.cloud.vm.VirtualMachineMan agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517) ewcstack-man02-prod: at sun.reflect.GeneratedMethodAcc essor402.invoke(Unknown Source) ewcstack-man02-prod: at sun.reflect.DelegatingMethodAc cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ewcstack-man02-prod: at java.lang.reflect.Method.invok e(Method.java:606) ewcstack-man02-prod: at com.cloud.vm.VmWorkJobHandlerP roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107) ewcstack-man02-prod: at com.cloud.vm.VirtualMachineMan agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636) ewcstack-man02-prod: at com.cloud.vm.VmWorkJobDispatch er.runJob(VmWorkJobDispatcher.java:103) ewcstack-man02-prod: at 
org.apache.cloudstack.framewor k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag erImpl.java:537) ewcstack-man02-prod: at org.apache.cloudstack.managed. context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) ewcstack-man02-prod: at org.apache.cloudstack.managed. context.impl.DefaultManagedContext$1.call(DefaultManagedCont ext.java:56) ewcstack-man02-prod: at org.apache.cloudstack.managed. context.impl.DefaultManagedContext.callWithContext(DefaultMa nagedContext.java:103) ewcstack-man02-prod: at org.apache.cloudstack.managed. context.impl.DefaultManagedContext.runWithContext(DefaultMan agedContext.java:53) ewcstack-man02-prod: at org.apache.cloudstack.managed. context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) ewcstack-man02-prod: at org.apache.cloudstack.framewor k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494) ewcstack-man02-prod: at java.util.concurrent.Executors $RunnableAdapter.call(Executors.java:471) ewcstack-man02-prod: at java.util.concurrent.FutureTas k.run(FutureTask.java:262) ewcstack-man02-prod: at java.util.concurrent.ThreadPoo lExecutor.runWorker(ThreadPoolExecutor.java:1145) ewcstack-man02-prod: at java.util.concurrent.ThreadPoo lExecutor$Worker.run(ThreadPoolExecutor.java:615) ewcstack-man02-prod: at java.lang.Thread.run(Thread.java:745) ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Complete async job-191379, jobStatus: FAILED, resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____ 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB- AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4 dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h 
cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0 TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB- ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91 dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Publish async job-191379 complete on message bus ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Wake up jobs related to job-191379 ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Update db status for job-191379 ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin all subjobs created from job- 191379 ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway for job-191379 ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379) Remove job-191379 from job monitoring On 25.01.2017 15:53, Rafael Weingärtner wrote: Did you check the ACS log files looking for some exceptions when themigrations for this specific VM started? On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer < francois.scheu...@everyware.ch> wrote: Dear CS contributorsWe use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared primary storage. When we put a host in maintenance, cs evacuates all VM's on other hosts. But it happens regularly that some VM's are stopped instead of live migrated. If we restart that stopped VM's, CS will then stop them again later. We need to start them several times until they stays up. Is it a known issue? Is it fixed on CS 4.9.2 ? Migrating manually all VM's is working but impracticable because too much time consuming. Many thanks in advance for your help. 
Best Regards
Francois

--
EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheu...@everyware.ch
web: http://www.everyware.ch

--
Rafael Weingärtner
--
EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheu...@everyware.ch
web: http://www.everyware.ch