I get the idea of your workaround now.

I have not paid much attention to the table data you posted (sorry for
that). Well, it seems you have it all taken care of.

I have a suggestion: instead of changing data directly in the database, what
about using CloudMonkey to set the HA flag to false on those VMs? Currently
it is the UI that blocks such customization, but according to the API you
should be able to do it.
https://cloudstack.apache.org/api/apidocs-4.5/root_admin/updateVirtualMachine.html
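
For example, something along these lines with CloudMonkey should do it (a
sketch only, untested on my side; <vm-uuid> is a placeholder for the VM's
UUID, and haenable is the flag documented for updateVirtualMachine):

    update virtualmachine id=<vm-uuid> haenable=false

I have not verified whether the flag takes effect on a running VM, but it at
least keeps the change inside the API instead of bypassing it with SQL.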

On Tue, Feb 7, 2017 at 11:25 AM, Francois Scheurer <
francois.scheu...@everyware.ch> wrote:

> Dear Rafael
>
>
> We changed it on the VM to avoid the required stop and start. Changing it
> on the offering in use does not affect vm_instance.ha_enabled.
> We could create a new offering without HA, but the VM needs to be stopped
> before we can apply it.
>
> We did not disable the HA on these 4 VMs to solve the problem, but just to
> minimize its impact (now only i-189-1177-VM should be stopped
> intermittently) in case cleaning op_it_work does not fix the issue.
> Of course we could filter the update query by "instance id", but if you
> look at the table output, all instance_id values are related to the 5
> problematic VMs.
> The only exceptions are the first 3 rows, which relate to VMs that have not
> existed since July 2015 ...
> That is why I thought it may be safe to do.
>
>
> Best Regards
>
> Francois Scheurer
>
>
>
>
>
>
>
>
> On 07.02.2017 16:27, Rafael Weingärtner wrote:
>
> So, you changed the flags directly on "vm_instance". Why did you not
> change the service offering of these VMs using the ACS UI or API? To me, it
> feels that the best approach would be to change the service offering of the
> affected VMs (the HA flag is first configured at the service offering). You
> could assign a service offering without HA; I mean, it would be a matter of
> creating a similar service offering (same configuration as the one they are
> using right now), but without HA.
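>
> Roughly something like this with CloudMonkey (a sketch only, untested; the
> offering parameters are placeholders to be copied from the current
> offering, and the UUIDs are of course your own):
>
>     create serviceoffering name=NoHA-offering displaytext=NoHA-offering cpunumber=<n> cpuspeed=<mhz> memory=<mb> offerha=false
>     stop virtualmachine id=<vm-uuid>
>     change serviceforvirtualmachine virtualmachineid=<vm-uuid> serviceofferingid=<new-offering-uuid>
>     start virtualmachine id=<vm-uuid>
>
> The drawback is that the VM has to be stopped to change its offering.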
>
> I am assuming that the change directly on "vm_instance" did not solve your
> problem, right? That is why you want to change some data in "op_it_work". I
> took a quick look at "op_it_work", and it seems that this table may be used
> for more things than HA. Here is a tip for complex production systems:
> never run update SQLs in production environments without a very, very
> restrictive and well thought out "where" clause. In my opinion this is not
> restrictive enough (update op_it_work set step="Done" where step!="Done").
> I would advise you not to go further and change configurations directly in
> the database (unless you have a dev team that is capable of reading,
> debugging, and working with the ACS source code). ACS has over 500k lines
> of Java code, and God knows how many lines of shell, Python and other
> languages, software and systems that may be used; it is sometimes pretty
> hard to get a clear picture of how a change in the database will
> reverberate through the whole system without knowing in detail the
> installation configuration and underlying structures (networking,
> hypervisors, storage systems and other pieces). However, if you want to
> proceed with that, I would suggest adding some extra filters to that
> "where" clause. What about adding the "instance id" there?
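>
> For example (just a sketch; double-check that the instance ids from your
> table output are really the only ones you want to touch):
>
>     update op_it_work set step="Done"
>     where step != "Done" and instance_id in (1177, 1178, 1240, 1690, 1736);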
>
> PS: BTW, have you already heard about the CloudStack fork from Schuberg
> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
> Yes, I have. They were part of the active ACS community back then. I was
> already here when they created the fork. Still, I stayed with ACS.
>
>
> On Tue, Feb 7, 2017 at 3:54 AM, Francois Scheurer <
> francois.scheu...@everyware.ch> wrote:
>
>> Dear Rafael
>>
>>
>> We temporarily disabled the HA flag on these VMs by updating the DB
>> directly: update vm_instance set ha_enabled=0 where removed is null and
>> id=1178 or id=1240 or id=1690 or id=1736;
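>>
>> (Side note: since AND binds tighter than OR, the "removed is null"
>> condition above only applies to id=1178; an equivalent but unambiguous
>> form would be:
>>
>>     update vm_instance set ha_enabled=0
>>     where removed is null and id in (1178, 1240, 1690, 1736);
>>
>> The result is the same here because none of these VMs is removed.)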
>>
>> We kept the test VM i-189-1177-VM with the HA flag enabled.
>>
>> Restarting the MS and cleaning the op_it_work table will hopefully fix
>> the issue. And if the issue persists, then only i-189-1177-VM should be
>> affected, hence mitigating the impact.
>>
>> On the other hand, if everything runs fine, we would afterwards re-enable
>> the HA flag on the problematic VMs.
>>
>>
>> Do you think it is safe to run this update?: update op_it_work set
>> step="Done" where step!="Done";
>>
>>
>> Best Regards
>>
>> Francois Scheurer
>>
>>
>>
>> PS: BTW, have you already heard about the CloudStack fork from Schuberg
>> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
>>
>>
>>
>>
>>
>> On 06.02.2017 16:41, Rafael Weingärtner wrote:
>>
>> I totally understand your position of not being able to debug the ACS in
>> production ;)
>>
>>
>> There is one bit I did not quite understand. If you have disabled the HA
>> for the VMs, why do you still need to change data in the database? How
>> did you disable the HA of the VMs?
>>
>> On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer <
>> francois.scheu...@everyware.ch> wrote:
>>
>>> Dear Rafael / Dear CS contributors
>>>
>>>
>>> I tried the following:
>>>
>>> 1.     stop all MS
>>> 2.     backup DB
>>> 3.     select * from op_ha_work where type="Migration" and
>>> mgmt_server_id is null and taken is null;
>>> 4.     delete from op_ha_work where type="Migration" and mgmt_server_id
>>> is null and taken is null;
>>> 5.     start all MS
>>>
>>> But actually, after the MS stopped, the op_ha_work table was
>>> automatically cleaned, making my steps useless.
>>> After the MS started, the op_ha_work table was filled again and the
>>> related VMs were stopped several times (3-7 times), requiring close
>>> attention to manually restart them after each stop.
>>>
>>> I later found another table that could be the cause of the issue:
>>>
>>> mysql> select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>>> op_it_work.instance_id=vm_instance.id order by instance_id;
>>> +--------------------------------------+----------------+---
>>> ---------+-----------------------+-----------+--------------
>>> ------+-----------+------------+-------------+--------------
>>> -+-------------+---------------+---------------------+
>>> | id                                   | mgmt_server_id | created_at |
>>> thread                | type      | vm_type            | step      |
>>> updated_at | instance_id | resource_type | resource_id | instance_name |
>>> removed             |
>>> +--------------------------------------+----------------+---
>>> ---------+-----------------------+-----------+--------------
>>> ------+-----------+------------+-------------+--------------
>>> -+-------------+---------------+---------------------+
>>> | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498 | 1436187965 |
>>> consoleproxy-1        | Starting  | ConsoleProxy       | Release   |
>>> 1436188265 |         402 | NULL          |           0 | v-402-VM      |
>>> 2015-07-06 13:23:22 |
>>> | fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498 | 1436187965 |
>>> secstorage-1          | Starting  | SecondaryStorageVm | Release   |
>>> 1436188265 |         687 | NULL          |           0 | s-687-VM      |
>>> 2015-07-06 13:11:34 |
>>> | 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498 | 1436964548 |
>>> Work-Job-Executor-165 | Migrating | User               | Migrating |
>>> 1436964548 |         776 | Host          |          29 | i-51-776-VM   |
>>> 2015-07-15 12:55:38 |
>>> | 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498 | 1486020173 |
>>> Work-Job-Executor-19  | Migrating | User               | Prepare   |
>>> 1486020173 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122 | 1486019693 |
>>> Work-Job-Executor-6   | Migrating | User               | Prepare   |
>>> 1486019693 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498 | 1486020953 |
>>> Work-Job-Executor-33  | Migrating | User               | Prepare   |
>>> 1486020953 |        1177 | Host          |          33 | i-189-1177-VM |
>>> NULL                |
>>> | 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122 | 1486021433 |
>>> Work-Job-Executor-12  | Migrating | User               | Prepare   |
>>> 1486021433 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 | 1486020773 |
>>> Work-Job-Executor-11  | Migrating | User               | Prepare   |
>>> 1486020773 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 | 1486020353 |
>>> Work-Job-Executor-25  | Migrating | User               | Prepare   |
>>> 1486020353 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 | 1486018924 |
>>> Work-Job-Executor-2   | Migrating | User               | Prepare   |
>>> 1486018924 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 | 1486021613 |
>>> Work-Job-Executor-41  | Migrating | User               | Prepare   |
>>> 1486021613 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 | 1486019584 |
>>> Work-Job-Executor-5   | Migrating | User               | Prepare   |
>>> 1486019584 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 | 1486020244 |
>>> Work-Job-Executor-6   | Migrating | User               | Prepare   |
>>> 1486020244 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 | 1486020893 |
>>> Work-Job-Executor-32  | Migrating | User               | Prepare   |
>>> 1486020893 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 | 1486020953 |
>>> Work-Job-Executor-34  | Migrating | User               | Prepare   |
>>> 1486020953 |        1240 | Host          |           3 | i-388-1240-VM |
>>> NULL                |
>>> | ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 | 1486019693 |
>>> Work-Job-Executor-14  | Migrating | User               | Prepare   |
>>> 1486019693 |        1240 | Host          |           3 | i-388-1240-VM |
>>> NULL                |
>>> | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 | 1486020353 |
>>> Work-Job-Executor-26  | Migrating | User               | Prepare   |
>>> 1486020353 |        1240 | Host          |           3 | i-388-1240-VM |
>>> NULL                |
>>> | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 | 1465387271 |
>>> Work-Job-Executor-18  | Running   | User               | Prepare   |
>>> 1465387271 |        1388 | Host          |           7 | i-361-1388-VM |
>>> NULL                |
>>> | 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 | 1486019524 |
>>> Work-Job-Executor-4   | Migrating | User               | Prepare   |
>>> 1486019524 |        1690 | Host          |          15 | i-648-1690-VM |
>>> NULL                |
>>> | 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 | 1486021433 |
>>> Work-Job-Executor-13  | Migrating | User               | Prepare   |
>>> 1486021433 |        1690 | Host          |          15 | i-648-1690-VM |
>>> NULL                |
>>> | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 | 1486020173 |
>>> Work-Job-Executor-20  | Migrating | User               | Prepare   |
>>> 1486020173 |        1690 | Host          |          15 | i-648-1690-VM |
>>> NULL                |
>>> | a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 | 1486020773 |
>>> Work-Job-Executor-28  | Migrating | User               | Prepare   |
>>> 1486020773 |        1690 | Host          |          15 | i-648-1690-VM |
>>> NULL                |
>>> | 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 | 1484224516 |
>>> Work-Job-Executor-161 | Running   | User               | Prepare   |
>>> 1484224516 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 | 1484223910 |
>>> Work-Job-Executor-163 | Running   | User               | Prepare   |
>>> 1484223910 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 | 1484224636 |
>>> Work-Job-Executor-86  | Running   | User               | Prepare   |
>>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 | 1484224636 |
>>> Work-Job-Executor-92  | Running   | User               | Prepare   |
>>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 | 1484224576 |
>>> Work-Job-Executor-95  | Running   | User               | Prepare   |
>>> 1484224576 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 | 1484224630 |
>>> Work-Job-Executor-13  | Running   | User               | Prepare   |
>>> 1484224630 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 | 1485763613 |
>>> Work-Job-Executor-28  | Migrating | User               | Prepare   |
>>> 1485763613 |        1735 | Host          |          19 | i-638-1735-VM |
>>> 2017-01-30 15:24:10 |
>>> | 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 | 1485763013 |
>>> Work-Job-Executor-21  | Migrating | User               | Prepare   |
>>> 1485763013 |        1735 | Host          |          19 | i-638-1735-VM |
>>> 2017-01-30 15:24:10 |
>>> | feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 | 1485764285 |
>>> Work-Job-Executor-19  | Migrating | User               | Prepare   |
>>> 1485764285 |        1735 | Host          |          19 | i-638-1735-VM |
>>> 2017-01-30 15:24:10 |
>>> | 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 | 1486020293 |
>>> Work-Job-Executor-9   | Migrating | User               | Prepare   |
>>> 1486020293 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> | 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 | 1486019573 |
>>> Work-Job-Executor-11  | Migrating | User               | Prepare   |
>>> 1486019573 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> | 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 | 1486021493 |
>>> Work-Job-Executor-38  | Migrating | User               | Prepare   |
>>> 1486021493 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> | 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 | 1486020173 |
>>> Work-Job-Executor-21  | Migrating | User               | Prepare   |
>>> 1486020173 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> | f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 | 1486020833 |
>>> Work-Job-Executor-30  | Migrating | User               | Prepare   |
>>> 1486020833 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> +--------------------------------------+----------------+---
>>> ---------+-----------------------+-----------+--------------
>>> ------+-----------+------------+-------------+--------------
>>> -+-------------+---------------+---------------------+
>>>
>>> As you can see, the problematic VMs are all present in that table:
>>>
>>> i-189-1177-VM
>>> i-189-1178-VM
>>> i-388-1240-VM
>>> i-648-1690-VM
>>> i-638-1736-VM
>>>
>>> As the MSs are running production environments, we cannot debug much and
>>> just need the shortest path to resolution.
>>> For example, restarting the MSs has an impact on our customers and we
>>> need to keep the number of restarts to a minimum.
>>>
>>> I have disabled the HA on all problematic VMs except i-189-1177-VM,
>>> which is less critical.
>>>
>>> The next steps could be:
>>> 1.     stop all MS
>>> 2.     backup DB
>>> 3.     select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>>> op_it_work.instance_id=vm_instance.id order by instance_id;
>>> 4.     update op_it_work set step="Done" where step!="Done";
>>> 5.     start all MS
>>>
>>> What do you think about step 4, safe or risky?
>>> Some of these entries are related to deleted VMs and seem safe to mark
>>> as "Done".
>>> All the other entries are related to the problematic VMs that keep
>>> stopping... so marking them as "Done" may (hopefully) eliminate the
>>> cause of the Migrating/Stopping.
>>> But we are not sure whether marking them as "Done" also requires changes
>>> in other tables for consistency.
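>>>
>>> A possible extra safety net for step 4 (assuming the table is InnoDB)
>>> would be to run the update inside a transaction and check the affected
>>> row count before committing, e.g.:
>>>
>>>     start transaction;
>>>     update op_it_work set step="Done" where step!="Done";
>>>     select row_count();  -- number of rows the update changed
>>>     -- then: commit;  (or rollback; if the count looks wrong)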
>>>
>>>
>>> Thank you for your help.
>>>
>>>
>>> Best Regards
>>> Francois Scheurer
>>>
>>>
>>>
>>>
>>>
>>> On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>>
>>> Ok, I dug a little further into the code.
>>>
>>> If we put a host in maintenance, it may unfortunately happen that a
>>>> timeout occurs and CS aborts the migration. As far as we can interpret the
>>>> logs, it seems (we speculate) that CS detects that the VM is then still
>>>> running on the original host, which is in maintenance, and decides somehow
>>>> to stop it because of the "host not up" status.
>>>>
>>>
>>> From what I have seen in the code, this may happen (as you speculated).
>>> If a timeout or some other event interrupts the migration, the VM will
>>> be stopped. This happens in the method
>>> "com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>> long, DeploymentPlanner)" at line 2396 and onwards. This was implemented
>>> on purpose; if the migration does not happen, the VM is stopped on the
>>> source host. Probably it was implemented this way on the philosophy of
>>> gradient descent from artificial intelligence:
>>>
>>> “pick a random point in the field, update the values according to the
>>>> slope and hopefully after successful iterations you converge to a minimum
>>>> point.” (the problem is that sometimes the function diverges).
>>>>
>>>
>>> I mean, probably this was done this way because HA was supposed to see
>>> that a VM with HA is stopped, and should try to start it again somewhere
>>> else. The log you posted does not provide any other details to debug
>>> this further, because the log is from after an MS restart, right?
>>>
>>>
>>> After restarting the MSs I guess that a similar issue may happen: the
>>>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>>>> then again stops the VMs because of the "host not up" status?
>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>> Disconnected, host resource state is: Enabled
>>>>
>>>
>>> Here is something interesting that you have found. I checked the code,
>>> but without a live debug it is pretty hard for me to get the right flow
>>> of execution. By looking at the code and at the logs you provided, your
>>> deduction is right. After ACS boots up, the check for hosts that are
>>> shut down/deactivated starts, and the method
>>> "com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long,
>>> Event)" is executed. The problem is that all of the hosts are considered
>>> disconnected (ACS may not have finished connecting to all of them yet),
>>> and then a whole HA process is kicked off. The problem is that there is
>>> no host with status "UP". That is why this happens ("No suitable hosts
>>> found under this Cluster: 25"), and then the whole HA flow, which shuts
>>> down the VM, is kicked off.
>>>
>>> In theory, the HA worker should see that a VM with HA is down, and
>>> should try to start it. I do not understand why this is not working. If
>>> you decide to debug the running ACS, I may be able to help you; just
>>> keep me posted.
>>>
>>>
>>> On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner <
>>> rafaelweingart...@gmail.com> wrote:
>>>
>>>> I will be able to re-check some code and answer you with new thoughts
>>>> on this problem tonight (the log files you provided today have some very
>>>> insightful information regarding this problem). I believe this is a bug in
>>>> HA, a very tricky one that has not been reported yet (at least I do not
>>>> remember seeing anything similar). And thanks to randomness/destiny now you
>>>> have found/uncovered a way to reproduce the problem (bug?!).
>>>>
>>>> I do not suggest changing any data at all directly in the database
>>>> (unless you know the ACS code well and are 100% sure of what you are
>>>> doing). This can cause even bigger and more complicated problems. If
>>>> you remove the HA feature from these VMs, this problem will stop.
>>>> However, if you can/are willing to keep the HA until we fully uncover
>>>> the problem, map it and report it, that would be great.
>>>> BTW: why are you restarting MSs?
>>>> What are the configurations of your MSs?
>>>>
>>>> Thanks for your time and for sharing information with our community.
>>>>
>>>>
>>>>
>>>> On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer <
>>>> francois.scheu...@everyware.ch> wrote:
>>>>
>>>>> Dear Rafael
>>>>>
>>>>>
>>>>> Many thanks for your deep analysis.
>>>>> Nice humor with the planet alignments BTW ;)
>>>>>
>>>>> We also suspected the HA flag to be related to this issue, as we never
>>>>> got this issue in the past, until we enabled the HA on all VMs and
>>>>> offerings.
>>>>>
>>>>> Here is also the workers list:
>>>>>
>>>>> mysql> select * from op_ha_work;
>>>>> +------+-------------+-----------+--------------------+-----
>>>>> -----+----------------+---------+---------------------+-----
>>>>> --+---------------------+-------+-------------+---------+
>>>>> | id   | instance_id | type      | vm_type            | state    |
>>>>> mgmt_server_id | host_id | created             | tried | taken
>>>>>  | step  | time_to_try | updated |
>>>>> +------+-------------+-----------+--------------------+-----
>>>>> -----+----------------+---------+---------------------+-----
>>>>> --+---------------------+-------+-------------+---------+
>>>>> |  105 |         343 | HA        | User               | Running  |
>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  109 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  113 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  117 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  121 |         343 | HA        | User               | Running  |
>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  125 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  129 |         393 | HA        | User               | Running  |
>>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056014 |      30 |
>>>>> |  133 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056014 |       3 |
>>>>> |  137 |         402 | HA        | ConsoleProxy       | Running  |
>>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056014 |       3 |
>>>>> |  345 |         402 | HA        | ConsoleProxy       | Starting |
>>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>>> Error |  1402527506 |     770 |
>>>>> |  349 |         687 | HA        | SecondaryStorageVm | Starting |
>>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>>> Error |  1402527506 |       2 |
>>>>> | 2057 |        1178 | Migration | User               | Running  |
>>>>>  345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>> Done  |  1451024774 |      95 |
>>>>> | 2058 |        1736 | Migration | User               | Running  |
>>>>>  345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>> Done  |  1451024774 |      45 |
>>>>> | 2059 |        1240 | Migration | User               | Running  |
>>>>>  345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>> Done  |  1451024774 |      61 |
>>>>> | 2060 |        1178 | Migration | User               | Running  |
>>>>>  345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 |
>>>>> Done  |  1451025374 |      95 |
>>>>> | 2061 |        1178 | Migration | User               | Stopped  |
>>>>>        NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL                |
>>>>> Done  |  1451027900 |     109 |
>>>>> | 2062 |        1690 | Migration | User               | Running  |
>>>>>  345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>>> Done  |  1451024774 |     109 |
>>>>> | 2063 |        1177 | Migration | User               | Running  |
>>>>>  345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>>> Done  |  1451024774 |      82 |
>>>>> | 2064 |        1690 | Migration | User               | Running  |
>>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>> Done  |  1451024900 |     109 |
>>>>> | 2065 |        1240 | Migration | User               | Running  |
>>>>>  345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>> Done  |  1451024900 |      61 |
>>>>> | 2066 |        1736 | Migration | User               | Running  |
>>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>> Done  |  1451024900 |      45 |
>>>>> | 2067 |        1240 | Migration | User               | Stopped  |
>>>>>        NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL                |
>>>>> Done  |  1451028003 |      80 |
>>>>> | 2068 |        1178 | Migration | User               | Running  |
>>>>>  345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>> Done  |  1451024781 |      95 |
>>>>> | 2069 |        1736 | Migration | User               | Running  |
>>>>>  345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>> Done  |  1451024781 |      45 |
>>>>> | 2070 |        1240 | Migration | User               | Running  |
>>>>>  345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>> Done  |  1451024781 |      61 |
>>>>> | 2071 |        1177 | Migration | User               | Running  |
>>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>> Done  |  1451024781 |      82 |
>>>>> | 2072 |        1177 | Migration | User               | Stopped  |
>>>>>        NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>> Done  |  1451027973 |     101 |
>>>>> | 2073 |        1177 | Migration | User               | Running  |
>>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 |
>>>>> Done  |  1451024904 |      82 |
>>>>> | 2074 |        1690 | Migration | User               | Stopped  |
>>>>>        NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>> Done  |  1451027464 |     123 |
>>>>> | 2075 |        1736 | Migration | User               | Stopped  |
>>>>>        NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>> Done  |  1451027477 |      59 |
>>>>> +------+-------------+-----------+--------------------+-----
>>>>> -----+----------------+---------+---------------------+-----
>>>>> --+---------------------+-------+-------------+---------+
>>>>>
>>>>> If we put a host in maintenance, it may unfortunately happen that a
>>>>> timeout occurs and CS aborts the migration.
>>>>> As far as we can interpret the logs, it seems (we speculate) that CS
>>>>> detects that the VM is then still running on the original host, which
>>>>> is in maintenance, and decides somehow to stop it because of the "host
>>>>> not up" status.
>>>>> We then need to start the VM several times, until CS gives up stopping
>>>>> it.
>>>>>
>>>>> After restarting the MSs I guess that a similar issue may happen: the
>>>>> hosts are disconnected during the early stage and then reconnect.
>>>>> Maybe CS then again stops the VMs because of the "host not up" status?
>>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>>
>>>>> But the problem after the MS restarts is definitely only impacting VMs
>>>>> listed in the op_ha_work table.
>>>>>
>>>>>
>>>>> To your questions:
>>>>>         We have 3 ACS MSs managing several Zones. Each zone has a
>>>>> single pod and cluster.
>>>>>         The Cluster of i-638-1735-VM has this summary:
>>>>>          Total number of VMs: 121
>>>>>          Total number hypervisors: 4
>>>>>          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>>>>          Total allocated cores: 211
>>>>>          Total allocated storage: 11285 GB
>>>>>         Used Memory on the hosts is now varying from 28 to 77%, so
>>>>> there is enough RAM.
>>>>>         I did not check the XS logs. The same problem arose again this
>>>>> morning (shortly after a restart of all MSs), so I will look at the XS
>>>>> logs now.
>>>>>
>>>>>         I have some fair experience with Eclipse and Java; I will ask
>>>>> my boss if we can go this way.
>>>>>
>>>>> The main issue for now is that restarting the MSs reproduces the
>>>>> problem, which is quite annoying for the impacted VMs.
>>>>> I thought of stopping the MSs, deleting all HA/Migration entries in
>>>>> op_ha_work and starting the MSs.
>>>>> Is that a viable option? (Or should I delete the Migration entries
>>>>> only? Or never touch this table?)
>>>>> We will also very probably remove the HA flag on all VMs if it does
>>>>> not work reliably.
>>>>>
>>>>> Best Regards
>>>>> Francois Scheurer
>>>>>
>>>>>
>>>>>
>>>>> Here are new logs regarding the same issue from today.
>>>>> I tried to extract the important parts related to the stop of
>>>>> i-189-1178-VM.
>>>>>
>>>>> regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRe
>>>>> questedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae73
>>>>> 3|work-2049)
>>>>>
>>>>> 2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
>>>>> AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null,
>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>>>> F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm
>>>>> 1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTA
>>>>> ALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAA
>>>>> AAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>>>>> ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null,
>>>>> lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 
>>>>> 2017}
>>>>> 2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work
>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>>>>>
>>>>> 2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>>>>> (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache
>>>>> for 232
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.UserDispersin
>>>>> gPlanner@1734b2e2
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> This VM has last host_id specified, trying to choose the same host: 25
>>>>>
>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Cannot choose the last host to deploy this VM
>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Searching resources only under specified Cluster: 25
>>>>>
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>>> prioritization: []
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> No suitable hosts found
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> No suitable hosts found under this Cluster: 25
>>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Could not find suitable Deployment Destination for this VM under any
>>>>> clusters, returning.
>>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Searching resources only under specified Cluster: 25
>>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> The specified cluster is in avoid set, returning.
>>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>>>> 2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Failed to deploy vm 1178 with original planner, sending HAPlanner
>>>>>
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>>>> sPlanner@4fdf86ce
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> This VM has last host_id specified, trying to choose the same host: 25
>>>>>
>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Cannot choose the last host to deploy this VM
>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Searching resources only under specified Cluster: 25
>>>>> 2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Deploying vm during HA process, so skipping disable threshold check
>>>>>
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>>> prioritization: []
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> No suitable hosts found
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> No suitable hosts found under this Cluster: 25
>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish async
>>>>> job-192986 complete on message bus
>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>>>> related to job-192986
>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db
>>>>> status for job-192986
>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> Cannot choose the last host to deploy this VM
>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> Searching resources only under specified Cluster: 35
>>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Could not find suitable Deployment Destination for this VM under any
>>>>> clusters, returning.
>>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Searching resources only under specified Cluster: 25
>>>>> 2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>>> SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
>>>>> contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27,
>>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 
>>>>> 08:44:05
>>>>> CET 2017}
>>>>> 2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add
>>>>> job-192992 into job monitoring
>>>>> 2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> Deploying vm during HA process, so skipping disable threshold check
>>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> The specified cluster is in avoid set, returning.
>>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>>>> 2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Invocation exception, caused by: 
>>>>> com.cloud.exception.InsufficientServerCapacityException:
>>>>> Unable to find a server to migrate to.Scope=interface
>>>>> com.cloud.org.Cluster; id=25
>>>>> 2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Rethrow exception com.cloud.exception.InsufficientServerCapacityException:
>>>>> Unable to find a server to migrate to.Scope=interface
>>>>> com.cloud.org.Cluster; id=25
>>>>> 2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with run
>>>>> of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin:
>>>>> 15393
>>>>> 2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
>>>>> complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
>>>>> null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>> AAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion:
>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>> initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled:
>>>>> null, created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>>>>> com.cloud.exception.InsufficientServerCapacityException: Unable to
>>>>> find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>         at sun.reflect.NativeMethodAccess
>>>>> orImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>         at sun.reflect.DelegatingMethodAc
>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>         at com.cloud.vm.VmWorkJobHandlerP
>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>         at com.cloud.vm.VmWorkJobDispatch
>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>         at org.apache.cloudstack.framewor
>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>> erImpl.java:537)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>> ext.java:56)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>> nagedContext.java:103)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>> agedContext.java:53)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>         at org.apache.cloudstack.framewor
>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>         at java.util.concurrent.Executors
>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>         at java.util.concurrent.ThreadPoo
>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>         at java.util.concurrent.ThreadPoo
>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>> 2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>>>>> 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>>>> joined with job-192986 and disjoin all subjobs created from job- 192986
>>>>> 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete
>>>>> async job-192989, jobStatus: FAILED, resultCode: 0, result:
>>>>> rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZl
>>>>> ckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vw
>>>>> c0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRD
>>>>> YXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5n
>>>>> L0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xv
>>>>> dWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vy
>>>>> cm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2
>>>>> YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4a
>>>>> OxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNh
>>>>> dXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQA
>>>>> EkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFu
>>>>> Zy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QA
>>>>> EExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBz
>>>>> ZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFj
>>>>> ZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNr
>>>>> VHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFy
>>>>> aW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-
>>>>> AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2Vy
>>>>> SW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hl
>>>>> c3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EA
>>>>> fgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1w
>>>>> bHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNx
>>>>> AH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5y
>>>>> ZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0
>>>>> aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhq
>>>>> YXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EA
>>>>> fgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0
>>>>> ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pv
>>>>> YnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNs
>>>>> b3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRj
>>>>> aGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91
>>>>> ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1w
>>>>> bCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0
>>>>> c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNv
>>>>> bnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRl
>>>>> eHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUu
>>>>> Y2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdl
>>>>> ZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2Fs
>>>>> bHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5j
>>>>> b250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxs
>>>>> V2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250
>>>>> ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2Vk
>>>>> LmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-
>>>>> ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5j
>>>>> b25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRv
>>>>> cnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50
>>>>> LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0
>>>>> ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdU
>>>>> aHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0
>>>>> AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29y
>>>>> a2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAAL
>>>>> VGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5t
>>>>> b2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEu
>>>>> dXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte
>>>>> 9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEu
>>>>> dXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4
>>>>> cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1
>>>>> ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMA
>>>>> AAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFs
>>>>> dWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-o
>>>>> maW2FAIAAHhwAA
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Deploy avoids pods: [], clusters: [], hosts: [3]
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>>>> sPlanner@4fdf86ce
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Trying to allocate a host and storage pools from dc:14, pod:17,cluster:17,
>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> This VM has last host_id specified, trying to choose the same host: 213
>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish async
>>>>> job-192989 complete on message bus
>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>>>> related to job-192989
>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db
>>>>> status for job-192989
>>>>> 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> Checking resources in Cluster: 35 under Pod: 31
>>>>> 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>> FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
>>>>> 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
>>>>> AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null,
>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>> AAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion:
>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>>> 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work
>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
>>>>> 2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>>> SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
>>>>> contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61,
>>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 
>>>>> 08:44:05
>>>>> CET 2017}
>>>>> 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>> 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Cannot choose the last host to deploy this VM
>>>>> 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Searching resources only under specified Cluster: 17
>>>>> 2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add
>>>>> job-192991 into job monitoring
>>>>> 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done
>>>>> executing com.cloud.vm.VmWorkMigrateAway for job-192986
>>>>> 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>>>> joined with job-192989 and disjoin all subjobs created from job- 192989
>>>>> 2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>>>>> job-192986 from job monitoring
>>>>> 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done
>>>>> executing com.cloud.vm.VmWorkMigrateAway for job-192989
>>>>> 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
>>>>> AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null,
>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>> AAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion:
>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>>> 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work
>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
>>>>> 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>> (AgentManager-Handler-13:null) Ping from 258
>>>>> 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>> FirstFitRoutingAllocator) List of hosts in ascending order of number of
>>>>> VMs: [224, 19, 21, 15]
>>>>> 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>> FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for
>>>>> allocation: []
>>>>> 2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
>>>>> (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated
>>>>> because agent is no longer in the PingMap
>>>>> 2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>>>>> job-192989 from job monitoring
>>>>>
>>>>> 2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> Migration cancelled because state has changed: VM[User|i-189-1178-VM]
>>>>> 2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> Unable to migrate VM due to: Migration cancelled because state has 
>>>>> changed:
>>>>> VM[User|i-189-1178-VM]
>>>>> 2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> VM state transitted from :Running to Stopping with event: 
>>>>> StopRequestedvm's
>>>>> original host id: 29 new host id: 33 host id before state transition: 33
>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response
>>>>> Received:
>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  {
>>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
>>>>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-
>>>>> 1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,
>>>>> "result":true,"details":"","wait":0}}] }
>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>> (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  {
>>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, {
>>>>> NetworkUsageAnswer } }
>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>> (RouterMonitor-1:ctx-2e80f405) Details from executing class
>>>>> com.cloud.agent.api.NetworkUsageCommand:
>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> Seq 33-1917970491306410035: Sending  { Cmd , MgmtId: 345049098498, via:
>>>>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>> }
>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> Seq 33-1917970491306410035: Executing:  { Cmd , MgmtId: 345049098498, via:
>>>>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>> }
>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>> (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing 
>>>>> request
>>>>> 2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>> (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing 
>>>>> request
>>>>>
>>>>> 2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>>>> HAWork[2049-Migration-1178-Stopped-Migrating]
>>>>> 2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context,
>>>>> setup psudo job for the executing thread
>>>>> 2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object
>>>>> VmWorkJobQueue.1178
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>>>
>>>>>> This seems to be one of those nasty bugs that are very hard to debug. It
>>>>>> seems to be caused by a combination of different factors, such as
>>>>>> planetary alignments ;)
>>>>>>
>>>>>> In class “VirtualMachineManagerImpl”, line 2395 (method
>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>>>> long, DeploymentPlanner)”) is the start of the code flow that triggers
>>>>>> the migration. The point here is that, if an error occurs during the
>>>>>> migration, the VM will be stopped; for some reason this is done only for
>>>>>> HA VMs.
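
As a minimal sketch of the behaviour described above, assuming made-up names (Vm, migrateAway, migrate, stop) rather than the real ACS types and methods, the pattern looks roughly like this: a failed migration is caught and, only when the VM is HA-enabled, the fallback is to stop it.

    // Illustrative sketch only, not the actual ACS source.
    interface Vm {
        boolean isHaEnabled();
        String name();
    }

    class MigrateAwaySketch {
        void migrateAway(Vm vm, long srcHostId) {
            try {
                migrate(vm, srcHostId);   // may fail, e.g. on an unexpected VM state
            } catch (Exception e) {
                if (vm.isHaEnabled()) {
                    stop(vm);             // the stop seen in the logs for HA VMs
                } else {
                    throw new RuntimeException("Unable to migrate " + vm.name(), e);
                }
            }
        }

        void migrate(Vm vm, long srcHostId) { /* placeholder */ }
        void stop(Vm vm)                    { /* placeholder */ }
    }
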
>>>>>>
>>>>>> Going a little bit further, we find the point that triggers the error in
>>>>>> the method “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>>>>>> long, DeployDestination)” at line 1941. This line calls the method
>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>>>>> Long, ItWorkVO, Step)” to change the VM state to migrating/preparing for
>>>>>> migration/or something like that (I am guessing here), but for some
>>>>>> reason the state of the VM is not the expected one, and this causes a
>>>>>> “ConcurrentOperationException” to be thrown. The problem is that
>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>>>>> Long, ItWorkVO, Step)” does not seem to produce many useful log entries.
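
A rough, hypothetical sketch of the guarded state transition being described (the real changeState method and ACS state machine are far richer): when the VM's current state does not match the expected one, the transition is refused and a ConcurrentOperationException surfaces.

    // Simplified, hypothetical guard; names and states are illustrative only.
    class StateGuardSketch {
        enum State { Running, Migrating, Stopping, Stopped }

        static class ConcurrentOperationException extends Exception {
            ConcurrentOperationException(String msg) { super(msg); }
        }

        // Only a VM currently in 'expected' may move to 'next'.
        static State changeState(State current, State expected, State next)
                throws ConcurrentOperationException {
            if (current != expected) {
                throw new ConcurrentOperationException("Cannot move to " + next
                        + ": VM is " + current + ", expected " + expected);
            }
            return next;
        }
    }
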
>>>>>>
>>>>>> Just to enable us to discuss the problem a little further, could you
>>>>>> tell us a bit about your environment settings? How many ACS management
>>>>>> servers do you have? What is the VM workload being migrated (I mean, the
>>>>>> number of VMs and their computing resources)? Do the other hosts have
>>>>>> enough computing resources to support these VMs? Could you also check
>>>>>> the XenServer logs and see what happened on both the source and
>>>>>> destination hosts when the migration of VM “i-638-1735-VM” was
>>>>>> triggered?
>>>>>>
>>>>>> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to connect
>>>>>> to a running JVM and debug the code? For me it will not be possible to
>>>>>> go further without looking at the variables in a running environment.
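
For reference, attaching an IDE to a running management server is typically done by starting the JVM with the standard JDWP agent and pointing the IDE's remote-debug configuration at the chosen port; the port below (8787) is only an example and any free port works.

    -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8787
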
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>>>>> francois.scheu...@everyware.ch> wrote:
>>>>>>
>>>>>> Dear Rafael
>>>>>>>
>>>>>>>
>>>>>>> Sorry for the late answer (I was sick for a couple of days).
>>>>>>>
>>>>>>> I confirm we are migrating inside the same XenServer pool (within a CS
>>>>>>> cluster).
>>>>>>>
>>>>>>>
>>>>>>> Best Regards
>>>>>>> Francois
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>>>>>
>>>>>>> Found the code that is executed.
>>>>>>>> I need you to confirm whether you are migrating within a cluster or
>>>>>>>> across clusters.
>>>>>>>>
>>>>>>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>>>>>>> rafaelweingart...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I do not have access to the ACS code right now, but I suggest you
>>>>>>>>> start debugging here:
>>>>>>>>>
>>>>>>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>>>>>
>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>>> ctx-175d37df)
>>>>>>>>>> VM state transitted from :Running to Stopping with event:
>>>>>>>>>> StopRequestedvm's
>>>>>>>>>> original host id: 224 new host id: 15 host id before state
>>>>>>>>>> transition:
>>>>>>>>>> 15
>>>>>>>>>>
>>>>>>>>> I would first try to understand why ACS requested the hypervisor to
>>>>>>>>> stop the VM. I mean the following: check the source code for
>>>>>>>>> conditions that would make ACS request the shutdown of VMs.
>>>>>>>>> BTW: are you migrating within a cluster or across clusters?
>>>>>>>>>
>>>>>>>>> I will be able to analyze this further only after work hours, so
>>>>>>>>> if you
>>>>>>>>> find anything, keep me posted.
>>>>>>>>>
>>>>>>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>>>>>>> francois.scheu...@everyware.ch> wrote:
>>>>>>>>>
>>>>>>>>> Hello Rafael
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I think the important log lines are these:
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>>> state has
>>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>>> Migration
>>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>>> :Running to
>>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>>>> host
>>>>>>>>>> id:
>>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>>
>>>>>>>>>> The really disturbing thing is that CS stops the VM again if we
>>>>>>>>>> start it manually, apparently because the migrating job is still
>>>>>>>>>> running.
>>>>>>>>>>
>>>>>>>>>> Note that the VM HA flag is enabled.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best Regards.
>>>>>>>>>>
>>>>>>>>>> Francois
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> more details:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>>>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>>>>>>> VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8b
>>>>>>>>>> acc64
>>>>>>>>>> job-14953/job-191351) Done executing
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>>> job-191351
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>>>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>>>>>>> work-1989)
>>>>>>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>>>>>>> Forwarding
>>>>>>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>>>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>>>>>> (64-bit)","platformEmulator":"Other install
>>>>>>>>>> media","bootArgs":"","enable
>>>>>>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio"
>>>>>>>>>> :"4.0","
>>>>>>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8
>>>>>>>>>> 751b89-00f
>>>>>>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"
>>>>>>>>>> path":"d3
>>>>>>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>>>>>> storagePort":"0",
>>>>>>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","
>>>>>>>>>> id":64,"
>>>>>>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a
>>>>>>>>>> 32b-ee3e2
>>>>>>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>>>>>>> ageHost":"localhos
>>>>>>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"
>>>>>>>>>> PreSetup","
>>>>>>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>>>>>>
>>>>>>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>>>>>> ,"volumeId":86
>>>>>>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>>>>>> false","storagePort":"0","storageHost":"localhost","volumeSi
>>>>>>>>>> ze":"42949
>>>>>>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>>>>>>>>>> 4.5
>>>>>>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>>>>>> .8.8","dns2":"8.8
>>>>>>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>>>>>>>>> 345049098498
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375:
>>>>>>>>>> Processing:
>>>>>>>>>> {
>>>>>>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>>>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>>>>>> true,"wait":0}}]
>>>>>>>>>> }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>> [c.c.a.m.AgentAttache]
>>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more
>>>>>>>>>> commands
>>>>>>>>>> found
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>>> ctx-175d37df)
>>>>>>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId:
>>>>>>>>>> 345049103441,
>>>>>>>>>> via:
>>>>>>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>>> state has
>>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>>> Migration
>>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>>> :Running to
>>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>>>> host
>>>>>>>>>> id:
>>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>>>>>>> Forwarding
>>>>>>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>>>>>> 638-1736-VM","wait":0}}]
>>>>>>>>>> } to 345049098498
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>>> 254-800498:
>>>>>>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>>>>>>>> Flags:
>>>>>>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>>>>>>> \"connections\": []\n}","wait":0}}] }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>>> 254-800498:
>>>>>>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254,
>>>>>>>>>> Ver:
>>>>>>>>>> v1,
>>>>>>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>>>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632:
>>>>>>>>>> Processing:
>>>>>>>>>> {
>>>>>>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>>>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>>>>>> ":true,"details":"Stop
>>>>>>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379
>>>>>>>>>> ctx-bbabfeae)
>>>>>>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId:
>>>>>>>>>> 345049103441,
>>>>>>>>>> via:
>>>>>>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is
>>>>>>>>>> stopped on
>>>>>>>>>> the
>>>>>>>>>> host. Proceeding to release resource held.
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>> supported
>>>>>>>>>> in the network id=519
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics
>>>>>>>>>> for
>>>>>>>>>> network id=519 on -1
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to
>>>>>>>>>> release
>>>>>>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>>>>>> 92.168.1.11-null
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>>>>>>> resources
>>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>>>>>>> resources
>>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from
>>>>>>>>>> :Stopping to
>>>>>>>>>> Stopped with event: OperationSucceededvm's original host id: 15
>>>>>>>>>> new host
>>>>>>>>>> id: null host id before state transition: 15
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU:
>>>>>>>>>> 82976 and
>>>>>>>>>> CPU
>>>>>>>>>> after applying overprovisioning: 331904
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>>>>>>> 261961927808
>>>>>>>>>> and RAM after applying overprovisioning: 261961924608
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old
>>>>>>>>>> used:
>>>>>>>>>> 71300,reserved: 2100, actual total: 82976, total with
>>>>>>>>>> overprovisioning:
>>>>>>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old
>>>>>>>>>> used:
>>>>>>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>>>>>>> 94489280512,reserved:5368709120;
>>>>>>>>>> movedfromreserved:
>>>>>>>>>>
>>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>> supported
>>>>>>>>>> in the network id=519
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>> supported
>>>>>>>>>> in the network id=519
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused
>>>>>>>>>> by:
>>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to
>>>>>>>>>> migrate
>>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to
>>>>>>>>>> migrate
>>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379,
>>>>>>>>>> userId:
>>>>>>>>>> 1,
>>>>>>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
>>>>>>>>>> 0,
>>>>>>>>>> result: null, initMsid: 345049098498, completeMsid: null,
>>>>>>>>>> lastUpdated:
>>>>>>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017},
>>>>>>>>>> job
>>>>>>>>>> origin:13733
>>>>>>>>>> ewcstack-man02-prod: com.cloud.utils.exception.Clou
>>>>>>>>>> dRuntimeException:
>>>>>>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>>> va:2405)
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>>> va:4517)
>>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>>>>>>> essor402.invoke(Unknown
>>>>>>>>>> Source)
>>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>>>>>>> e(Method.java:606)
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>>>>>> erImpl.java:537)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.
>>>>>>>>>> java:49)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>>>>>> ext.java:56)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>>>>>> nagedContext.java:103)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>>>>>> agedContext.java:53)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.ja
>>>>>>>>>> va:46)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.ja
>>>>>>>>>> va:494)
>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>>>>>>> k.run(FutureTask.java:262)
>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Complete async job-191379, jobStatus:
>>>>>>>>>> FAILED,
>>>>>>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Publish async job-191379 complete on
>>>>>>>>>> message bus
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Update db status for job-191379
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and
>>>>>>>>>> disjoin
>>>>>>>>>> all
>>>>>>>>>> subjobs created from job- 191379
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Done executing
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>>> job-191379
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>>>>>>
>>>>>>>>>>> Did you check the ACS log files looking for some exceptions when
>>>>>>>>>>> the migrations for this specific VM started?
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>>>>>>>> francois.scheu...@everyware.ch> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Dear CS contributors
>>>>>>>>>>>
>>>>>>>>>>>> We use CS 4.5.1 with 3 clusters on XenServer 6.5 and shared
>>>>>>>>>>>> primary storage.
>>>>>>>>>>>>
>>>>>>>>>>>> When we put a host in maintenance, CS evacuates all VMs to other
>>>>>>>>>>>> hosts.
>>>>>>>>>>>>
>>>>>>>>>>>> But it happens regularly that some VMs are stopped instead of live
>>>>>>>>>>>> migrated.
>>>>>>>>>>>>
>>>>>>>>>>>> If we restart those stopped VMs, CS will stop them again later.
>>>>>>>>>>>>
>>>>>>>>>>>> We need to start them several times until they stay up.
>>>>>>>>>>>>
>>>>>>>>>>>> Is this a known issue? Is it fixed in CS 4.9.2?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Manually migrating all VMs works but is impracticable because it
>>>>>>>>>>>> is far too time-consuming.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Many thanks in advance for your help.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Best Regards
>>>>>>>>>>>> Francois
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> EveryWare AG
>>>>>>>>>>>> François Scheurer
>>>>>>>>>>>> Senior Systems Engineer
>>>>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>>>>> CH-8003 Zürich
>>>>>>>>>>>>
>>>>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>>>>> mail: francois.scheu...@everyware.ch
>>>>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> EveryWare AG
>>>>>>>>>> François Scheurer
>>>>>>>>>> Senior Systems Engineer
>>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>>> CH-8003 Zürich
>>>>>>>>>>
>>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>>> mail: francois.scheu...@everyware.ch
>>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>> Rafael Weingärtner
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>> EveryWare AG
>>>>>>> François Scheurer
>>>>>>> Senior Systems Engineer
>>>>>>> Zurlindenstrasse 52a
>>>>>>> CH-8003 Zürich
>>>>>>>
>>>>>>> tel: +41 44 466 60 00
>>>>>>> fax: +41 44 466 60 10
>>>>>>> mail: francois.scheu...@everyware.ch
>>>>>>> web: http://www.everyware.ch
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> EveryWare AG
>>>>> François Scheurer
>>>>> Senior Systems Engineer
>>>>> Zurlindenstrasse 52a
>>>>> CH-8003 Zürich
>>>>>
>>>>> tel: +41 44 466 60 00
>>>>> fax: +41 44 466 60 10
>>>>> mail: francois.scheu...@everyware.ch
>>>>> web: http://www.everyware.ch
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Rafael Weingärtner
>>>>
>>>
>>>
>>>
>>> --
>>> Rafael Weingärtner
>>>
>>>
>>> --
>>>
>>>
>>> EveryWare AG
>>> François Scheurer
>>> Senior Systems Engineer
>>> Zurlindenstrasse 52a
>>> CH-8003 Zürich
>>>
>>> tel: +41 44 466 60 00
>>> fax: +41 44 466 60 10
>>> mail: francois.scheu...@everyware.ch
>>> web: http://www.everyware.ch
>>>
>>>
>>
>>
>> --
>> Rafael Weingärtner
>>
>>
>> --
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheu...@everyware.ch
>> web: http://www.everyware.ch
>>
>>
>
>
> --
> Rafael Weingärtner
>
>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheu...@everyware.ch
> web: http://www.everyware.ch
>
>


-- 
Rafael Weingärtner
