I was able to get it working by following these steps:

1. stop all instances
2. service cloudstack-management stop
3. service cloudstack-agent stop
4. virsh shutdown {domain} (for each of the system VMs)
5. service libvirtd stop
6. umount the primary and secondary storage mounts
7. reboot
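
For anyone who wants to script it, steps 2 through 7 could be sketched roughly as below. This is only a sketch under assumptions: the DRY_RUN switch, the SYSTEM_VMS and STORAGE_MOUNTS variables, and the function names are made up for illustration, and step 1 (stopping the instances) is assumed to have been done already. It does no error checking between steps.

```shell
#!/bin/sh
# Sketch of the shutdown sequence (steps 2-7). All variable and function
# names here are hypothetical, not part of CloudStack or libvirt.

# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"
# Space-separated libvirt domain names of the system VMs (set these yourself).
SYSTEM_VMS="${SYSTEM_VMS:-}"
# Space-separated mount points of primary and secondary storage.
STORAGE_MOUNTS="${STORAGE_MOUNTS:-}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

shutdown_sequence() {
    run service cloudstack-management stop
    run service cloudstack-agent stop
    # Ask each system VM to shut down through libvirt.
    for domain in $SYSTEM_VMS; do
        run virsh shutdown "$domain"
    done
    run service libvirtd stop
    # Unmount storage before rebooting.
    for mnt in $STORAGE_MOUNTS; do
        run umount "$mnt"
    done
    run reboot
}
```

After sourcing the file, setting SYSTEM_VMS and STORAGE_MOUNTS and calling shutdown_sequence with the default DRY_RUN=1 prints the commands in order without touching the host, so the sequence can be checked before running it for real as root.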

The console proxy is working again.  I expect it will probably break again
in a day or two.  I suspect it's caused by this libvirtd bug, since I've
seen the "cannot acquire state change lock" error several times.

https://bugs.launchpad.net/nova/+bug/1254872
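
A quick way to see how often that error shows up is to count it in the libvirtd log. The sketch below assumes the log lives at /var/log/libvirt/libvirtd.log, which may not match your setup; the function name is made up for illustration.

```shell
#!/bin/sh
# Count lines containing the lock error in a libvirtd log file.
# The default path is an assumption; pass the real log path as the
# first argument if libvirtd logs somewhere else on your host.
count_lock_errors() {
    log_file="${1:-/var/log/libvirt/libvirtd.log}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'cannot acquire state change lock' "$log_file"
}

# Usage (after sourcing this file):
#   count_lock_errors /path/to/libvirtd.log
```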

I might try building my own libvirtd 1.0.3 for EL6.


On Tue, May 20, 2014 at 6:21 PM, Ian Young <[email protected]> wrote:

> So I got the console proxy working via HTTPS (by managing my own
> "realhostip.com" DNS) last week and everything was working fine.  Today,
> all of a sudden, the console proxy stopped working again.  The browser
> says, "Connecting to 192-168-100-159.realhostip.com..." and eventually
> times out.  I tried to restart the console proxy VM and it went into a
> "Stopping" state that never completed, with the Agent State showing
> "Disconnected."  I could not shut down the VM using virsh or with "kill -9"
> because libvirtd kept saying, "cannot acquire state change lock," so I
> gracefully shut down the remaining instances and rebooted the entire
> management server/hypervisor to start over.
>
> When it came back up, the SSVM and console proxy started but the virtual
> router was stopped.  I was able to manually start it from the UI.  The
> console proxy still times out when I try to access it from a browser.  I
> don't see any errors in the management or agent logs, just this:
>
> 2014-05-20 18:04:27,632 DEBUG [c.c.a.t.Request] (catalina-exec-10:null)
> Seq 1-2130378876: Sending  { Cmd , MgmtId: 55157049428734, via: 1(
> virthost1.redacted.com), Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.GetVncPortCommand":{"id":4,"name":"r-4-VM","wait":0}}]
> }
> 2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request]
> (AgentManager-Handler-3:null) Seq 1-2130378876: Processing:  { Ans: ,
> MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5902,"result":true,"wait":0}}]
> }
> 2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (catalina-exec-10:null)
> Seq 1-2130378876: Received:  { Ans: , MgmtId: 55157049428734, via: 1, Ver:
> v1, Flags: 10, { GetVncPortAnswer } }
> 2014-05-20 18:04:27,684 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-10:null) Port info 192.168.100.6
> 2014-05-20 18:04:27,684 INFO  [c.c.s.ConsoleProxyServlet]
> (catalina-exec-10:null) Parse host info returned from executing
> GetVNCPortCommand. host info: 192.168.100.6
> 2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-10:null) Compose console url:
> https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
> 2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-10:null) the console url is ::
> <html><title>r-4-VM</title><frameset><frame src="
> https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
> "></frame></frameset></html>
> 2014-05-20 18:04:29,216 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-4:null) SeqA 2-545: Processing Seq 2-545:  { Cmd ,
> MgmtId: -1, via: 2, Ver: v1, Flags: 11,
> [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":2,"_loadInfo":"{\n
>  \"connections\": []\n}","wait":0}}] }
>
> If I try to restart the system VMs with cloudstack-sysvmadm, it says:
>
> Stopping and starting 1 secondary storage vm(s)...
> curl: (7) couldn't connect to host
> ERROR: Failed to stop secondary storage vm with id 1
>
> Done stopping and starting secondary storage vm(s)
>
> Stopping and starting 1 console proxy vm(s)...
> curl: (7) couldn't connect to host
> ERROR: Failed to stop console proxy vm with id 2
>
> Done stopping and starting console proxy vm(s) .
>
> Stopping and starting 1 running routing vm(s)...
> curl: (7) couldn't connect to host
> 2
> Done restarting router(s).
>
> I notice there are now four entries for the same management server in the
> mshost table, and they all are in an "Up" state and the "removed" field is
> NULL.  What's wrong with this system?
>
