OK, this is a new thread centered on a serious problem in my CS 3.0.2 cloud, running XenServer 6.0.2 hosts. Here is what has transpired so far:
1: user reports console proxy not available
2: confirm console proxy not available, issue reboot via cloudstack UI
3: CS reports VM booted ok, still unavailable
4: tried to migrate to different host, VM stuck in migrating state
5: Log in to host, list_domains command does not show the VM, but shows a domain in this state:
117 | deadbeef-dead-beef-dead-beef00000075 | DS
which is a bad sign: the VM appears to be badly hung.
6: attempt to destroy domain according to Citrix Support article:
/opt/xensource/debug/destroy_domain -domid 117
7: command hangs
8: I then restart xe api toolstack, it appears to restart fine. I should note that ALL VMs are on this host via the "first_fit" VM provisioning algorithm
9: I attempt to start migrating VMs to two other available hosts in preparation for a hard reboot of host
10: migrating VMs fails, and host is now in alert state in CS, and CS log states that host is unavailable. Force reconnect fails.
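
For anyone wanting to repeat the check from step 5, here is a minimal sketch of how it could be scripted. It assumes list_domains prints pipe-separated "id | uuid | state" rows as in the line quoted above, and that a "D" in the state column marks a dying domain. The function name find_dying_domains is hypothetical, not part of XenServer.

```shell
# Hypothetical helper (not a XenServer tool): read list_domains-style
# output on stdin and print the domid of every domain whose state
# column contains "D" (assumed here to mean the domain is dying).
find_dying_domains() {
    awk -F'|' 'NF >= 3 {
        id = $1; state = $3
        gsub(/[[:space:]]/, "", id)      # strip padding around the domid
        gsub(/[[:space:]]/, "", state)   # strip padding around the state flags
        # Skip the header row (non-numeric id); match any state containing D
        if (id ~ /^[0-9]+$/ && state ~ /D/) print id
    }'
}

# Intended use on the host (assumes list_domains is on the PATH):
#   list_domains | find_dying_domains
```

This only reports the stuck domain IDs; it does not try to destroy anything, since that is exactly the step that hangs.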

So, here I am, in a production environment with a scenario that the whole premise of cloud-based computing is specifically designed to address, and the platform itself is the root cause of the very issue it is intended to prevent.

Do I have any other options to prevent downtime? I have exhausted everything I know to do. I have already scheduled a maintenance window, and fudged the truth to my customers by stating that there should be no downtime during this window, which I have zero faith will actually be true.


--

Regards,

Nik

Nik Martin
nfina Technologies, Inc.
+1.251.243.0043 x1003
http://nfinausa.com
Relentless Reliability
