Can you check to see if the firewall rules on the management servers match
up?
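
If it helps, here is a rough Python sketch for testing whether the ports the
management servers use to talk to each other are reachable. The addresses are
placeholders, and the port numbers (8250 for agent traffic, 9090 for
management-server clustering) are my assumption of the defaults, so check them
against your own setup before trusting the result.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable hosts.
        return False


def check_peers(peers, ports, probe=port_open):
    """Map every (host, port) pair to whether the probe could reach it."""
    return {(host, port): probe(host, port) for host in peers for port in ports}


# Example call (hypothetical addresses, assumed ports):
#   check_peers(["192.0.2.10", "192.0.2.11"], [8250, 9090])
```

Run it from each management server in turn, pointing at the other one. Any
False entry suggests a firewall rule that differs between the two machines, or
a service that is not listening.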
On Apr 24, 2013 12:45 PM, "Warren Nicholson" <warren.nichol...@nfinausa.com>
wrote:

> OK.
>
> I also tried to create a new instance, and it is stuck in
> "starting"
> mode.
>
> Therefore, while it appeared everything was peachy, it really wasn't
> functional.
>
> Warren
>
> 2013-04-24 14:41:54,874 DEBUG [cloud.ha.ManagementIPSystemVMInvestigator]
> (HA-Worker-0:work-24) Unable to find a management nic, cannot ping this
> system VM, unable to determine state of VM[User|i-2-9-VM] returning null
> 2013-04-24 14:41:54,874 INFO  [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-0:work-24) ManagementIPSysVMInvestigator found
> VM[User|i-2-9-VM]to be alive? null
> 2013-04-24 14:41:54,874 DEBUG [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-0:work-24) Fencing off VM that we don't know the state of
> 2013-04-24 14:41:54,879 DEBUG [agent.manager.ClusteredAgentAttache]
> (HA-Worker-0:work-24) Seq 2-1948647459: Forwarding Seq 2-1948647459:  { Cmd
> , MgmtId: 130580632156, via: 2, Ver: v1, Flags: 100011,
> [{"FenceCommand":{"vmName":"i-2-9-VM","hostGuid":"0b198af1-ef8b-45cd-a601-dbaeccca02ad","hostIp":"172.16.5.3","inSeq":false,"wait":0}}] } to 130582343935
> 2013-04-24 14:41:54,879 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-14:null) Seq 2-1948647459: Routing from 130580632156
> 2013-04-24 14:41:54,879 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-14:null) Seq 2-1948647459: Link is closed
> 2013-04-24 14:41:54,880 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-14:null) Seq 2-1948647459: MgmtId 130580632156: Req:
> Resource [Host:2] is unreachable: Host 2: Link is closed
> 2013-04-24 14:41:54,880 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-14:null) Seq 2--1: MgmtId 130580632156: Req: Routing
> to peer
> 2013-04-24 14:41:54,880 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-4:null) Seq 2--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:41:54,880 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-4:null) Seq 2-1948647459: Cancelling.
> 2013-04-24 14:41:54,881 DEBUG [agent.manager.AgentAttache]
> (HA-Worker-0:work-24) Seq 2-1948647459: Waiting some more time because this
> is the current command
> 2013-04-24 14:41:54,881 DEBUG [agent.manager.AgentAttache]
> (HA-Worker-0:work-24) Seq 2-1948647459: Waiting some more time because this
> is the current command
> 2013-04-24 14:41:54,881 WARN  [agent.manager.AgentAttache]
> (HA-Worker-0:work-24) Seq 2-1948647459: Timed out on Seq 2-1948647459:  {
> Cmd , MgmtId: 130580632156, via: 2, Ver: v1, Flags: 100011,
> [{"FenceCommand":{"vmName":"i-2-9-VM","hostGuid":"0b198af1-ef8b-45cd-a601-dbaeccca02ad","hostIp":"172.16.5.3","inSeq":false,"wait":0}}] }
> 2013-04-24 14:41:54,881 DEBUG [agent.manager.AgentAttache]
> (HA-Worker-0:work-24) Seq 2-1948647459: Cancelling.
> 2013-04-24 14:41:54,881 DEBUG [cloud.ha.XenServerFencer]
> (HA-Worker-0:work-24) Moving on to the next host because Host[-2-Routing]
> is
> unavailable
> 2013-04-24 14:41:54,881 DEBUG [cloud.ha.XenServerFencer]
> (HA-Worker-0:work-24) Unable to fence off VM[User|i-2-9-VM] on
> Host[-1-Routing]
> 2013-04-24 14:41:54,881 INFO  [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-0:work-24) Fencer XenServerFenceBuilder returned false
> 2013-04-24 14:41:54,881 DEBUG [cloud.ha.KVMFencer] (HA-Worker-0:work-24)
> Don't know how to fence non kvm hosts XenServer
> 2013-04-24 14:41:54,881 INFO  [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-0:work-24) Fencer KVMFenceBuilder returned null
> 2013-04-24 14:41:54,881 INFO  [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-0:work-24) Fencer VmwareFenceBuilder returned null
> 2013-04-24 14:41:54,882 DEBUG [ovm.hypervisor.OvmFencer]
> (HA-Worker-0:work-24) Don't know how to fence non Ovm hosts XenServer
> 2013-04-24 14:41:54,882 INFO  [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-0:work-24) Fencer OvmFenceBuilder returned null
> 2013-04-24 14:41:54,882 DEBUG [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-0:work-24) We were unable to fence off the VM VM[User|i-2-9-VM]
> 2013-04-24 14:41:54,946 INFO  [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-4:work-20) Rescheduling HAWork[20-HA-6-Running-Investigating] to
> try again at Wed Apr 24 14:52:08 CDT 2013
> 2013-04-24 14:41:55,057 INFO  [cloud.ha.HighAvailabilityManagerImpl]
> (HA-Worker-0:work-24) Rescheduling HAWork[24-HA-9-Starting-Investigating]
> to
> try again at Wed Apr 24 14:52:08 CDT 2013
> 2013-04-24 14:41:57,105 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-5:null) Ping from 4
> 2013-04-24 14:41:57,565 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-6:null) Ping from 5
> 2013-04-24 14:42:08,160 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) HostStatsCollector is running...
> 2013-04-24 14:42:08,176 DEBUG [agent.manager.ClusteredAgentAttache]
> (StatsCollector-3:null) Seq 1-875167777: Forwarding null to 130582343935
> 2013-04-24 14:42:08,177 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-7:null) Seq 1-875167777: Routing from 130580632156
> 2013-04-24 14:42:08,177 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-7:null) Seq 1-875167777: Link is closed
> 2013-04-24 14:42:08,177 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-7:null) Seq 1-875167777: MgmtId 130580632156: Req:
> Resource [Host:1] is unreachable: Host 1: Link is closed
> 2013-04-24 14:42:08,178 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-7:null) Seq 1--1: MgmtId 130580632156: Req: Routing
> to
> peer
> 2013-04-24 14:42:08,178 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-8:null) Seq 1--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:42:08,178 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-8:null) Seq 1-875167777: Cancelling.
> 2013-04-24 14:42:08,179 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167777: Waiting some more time because
> this
> is the current command
> 2013-04-24 14:42:08,179 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167777: Waiting some more time because
> this
> is the current command
> 2013-04-24 14:42:08,179 WARN  [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167777: Timed out on null
> 2013-04-24 14:42:08,180 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167777: Cancelling.
> 2013-04-24 14:42:08,180 WARN  [agent.manager.AgentManagerImpl]
> (StatsCollector-3:null) Operation timed out: Commands 875167777 to Host 1
> timed out after 3600
> 2013-04-24 14:42:08,180 WARN  [cloud.resource.ResourceManagerImpl]
> (StatsCollector-3:null) Unable to obtain host 1 statistics.
> 2013-04-24 14:42:08,180 WARN  [cloud.server.StatsCollector]
> (StatsCollector-3:null) Received invalid host stats for host: 1
> 2013-04-24 14:42:08,188 DEBUG [agent.manager.ClusteredAgentAttache]
> (StatsCollector-3:null) Seq 2-1948647460: Forwarding null to 130582343935
> 2013-04-24 14:42:08,188 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-13:null) Seq 2-1948647460: Routing from 130580632156
> 2013-04-24 14:42:08,188 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-13:null) Seq 2-1948647460: Link is closed
> 2013-04-24 14:42:08,188 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-13:null) Seq 2-1948647460: MgmtId 130580632156: Req:
> Resource [Host:2] is unreachable: Host 2: Link is closed
> 2013-04-24 14:42:08,189 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-13:null) Seq 2--1: MgmtId 130580632156: Req: Routing
> to peer
> 2013-04-24 14:42:08,189 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-9:null) Seq 2--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:42:08,189 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-9:null) Seq 2-1948647460: Cancelling.
> 2013-04-24 14:42:08,189 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647460: Waiting some more time because
> this is the current command
> 2013-04-24 14:42:08,189 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647460: Waiting some more time because
> this is the current command
> 2013-04-24 14:42:08,190 WARN  [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647460: Timed out on null
> 2013-04-24 14:42:08,190 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647460: Cancelling.
> 2013-04-24 14:42:08,190 WARN  [agent.manager.AgentManagerImpl]
> (StatsCollector-3:null) Operation timed out: Commands 1948647460 to Host 2
> timed out after 3600
> 2013-04-24 14:42:08,190 WARN  [cloud.resource.ResourceManagerImpl]
> (StatsCollector-3:null) Unable to obtain host 2 statistics.
> 2013-04-24 14:42:08,190 WARN  [cloud.server.StatsCollector]
> (StatsCollector-3:null) Received invalid host stats for host: 2
> 2013-04-24 14:42:08,221 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-1:null) VmStatsCollector is running...
> 2013-04-24 14:42:08,236 DEBUG [agent.manager.ClusteredAgentAttache]
> (StatsCollector-1:null) Seq 1-875167778: Forwarding null to 130582343935
> 2013-04-24 14:42:08,237 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-10:null) Seq 1-875167778: Routing from 130580632156
> 2013-04-24 14:42:08,237 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-10:null) Seq 1-875167778: Link is closed
> 2013-04-24 14:42:08,237 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-10:null) Seq 1-875167778: MgmtId 130580632156: Req:
> Resource [Host:1] is unreachable: Host 1: Link is closed
> 2013-04-24 14:42:08,237 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-10:null) Seq 1--1: MgmtId 130580632156: Req: Routing
> to peer
> 2013-04-24 14:42:08,238 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-11:null) Seq 1--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:42:08,238 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-11:null) Seq 1-875167778: Cancelling.
> 2013-04-24 14:42:08,238 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-1:null) Seq 1-875167778: Waiting some more time because
> this
> is the current command
> 2013-04-24 14:42:08,238 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-1:null) Seq 1-875167778: Waiting some more time because
> this
> is the current command
> 2013-04-24 14:42:08,238 WARN  [agent.manager.AgentAttache]
> (StatsCollector-1:null) Seq 1-875167778: Timed out on null
> 2013-04-24 14:42:08,238 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-1:null) Seq 1-875167778: Cancelling.
> 2013-04-24 14:42:08,238 WARN  [agent.manager.AgentManagerImpl]
> (StatsCollector-1:null) Operation timed out: Commands 875167778 to Host 1
> timed out after 3600
> 2013-04-24 14:42:08,239 WARN  [cloud.vm.UserVmManagerImpl]
> (StatsCollector-1:null) Unable to obtain VM statistics.
> 2013-04-24 14:42:08,248 DEBUG [agent.manager.ClusteredAgentAttache]
> (StatsCollector-1:null) Seq 2-1948647461: Forwarding null to 130582343935
> 2013-04-24 14:42:08,249 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-12:null) Seq 2-1948647461: Routing from 130580632156
> 2013-04-24 14:42:08,249 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-12:null) Seq 2-1948647461: Link is closed
> 2013-04-24 14:42:08,249 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-12:null) Seq 2-1948647461: MgmtId 130580632156: Req:
> Resource [Host:2] is unreachable: Host 2: Link is closed
> 2013-04-24 14:42:08,249 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-12:null) Seq 2--1: MgmtId 130580632156: Req: Routing
> to peer
> 2013-04-24 14:42:08,250 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-15:null) Seq 2--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:42:08,250 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-15:null) Seq 2-1948647461: Cancelling.
> 2013-04-24 14:42:08,250 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-1:null) Seq 2-1948647461: Waiting some more time because
> this is the current command
> 2013-04-24 14:42:08,250 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-1:null) Seq 2-1948647461: Waiting some more time because
> this is the current command
> 2013-04-24 14:42:08,250 WARN  [agent.manager.AgentAttache]
> (StatsCollector-1:null) Seq 2-1948647461: Timed out on null
> 2013-04-24 14:42:08,251 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-1:null) Seq 2-1948647461: Cancelling.
> 2013-04-24 14:42:08,251 WARN  [agent.manager.AgentManagerImpl]
> (StatsCollector-1:null) Operation timed out: Commands 1948647461 to Host 2
> timed out after 3600
> 2013-04-24 14:42:08,251 WARN  [cloud.vm.UserVmManagerImpl]
> (StatsCollector-1:null) Unable to obtain VM statistics.
> 2013-04-24 14:42:08,720 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) StorageCollector is running...
> 2013-04-24 14:42:08,784 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (AgentManager-Handler-1:null) Cleanup succeeded. Details null
> 2013-04-24 14:42:08,784 DEBUG [agent.transport.Request]
> (StatsCollector-3:null) Seq 4-1584201741: Received:  { Ans: , MgmtId:
> 130580632156, via: 4, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
> 2013-04-24 14:42:08,784 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (StatsCollector-3:null) Cleanup succeeded. Details null
> 2013-04-24 14:42:08,793 DEBUG [agent.manager.ClusteredAgentAttache]
> (StatsCollector-3:null) Seq 1-875167779: Forwarding null to 130582343935
> 2013-04-24 14:42:08,794 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-3:null) Seq 1-875167779: Routing from 130580632156
> 2013-04-24 14:42:08,794 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-3:null) Seq 1-875167779: Link is closed
> 2013-04-24 14:42:08,794 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-3:null) Seq 1-875167779: MgmtId 130580632156: Req:
> Resource [Host:1] is unreachable: Host 1: Link is closed
> 2013-04-24 14:42:08,794 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-3:null) Seq 1--1: MgmtId 130580632156: Req: Routing
> to
> peer
> 2013-04-24 14:42:08,795 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-2:null) Seq 1--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:42:08,795 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-2:null) Seq 1-875167779: Cancelling.
> 2013-04-24 14:42:08,795 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167779: Waiting some more time because
> this
> is the current command
> 2013-04-24 14:42:08,795 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167779: Waiting some more time because
> this
> is the current command
> 2013-04-24 14:42:08,795 WARN  [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167779: Timed out on null
> 2013-04-24 14:42:08,795 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167779: Cancelling.
> 2013-04-24 14:42:08,795 DEBUG [cloud.storage.StorageManagerImpl]
> (StatsCollector-3:null) Unable to send storage pool command to
> Pool[200|IscsiLUN] via 1
> com.cloud.exception.OperationTimedoutException: Commands 875167779 to Host 1 timed out after 3600
>         at com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:425)
>         at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:501)
>         at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:454)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1937)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:448)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:462)
>         at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:307)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:679)
> 2013-04-24 14:42:08,801 DEBUG [agent.manager.ClusteredAgentAttache]
> (StatsCollector-3:null) Seq 2-1948647462: Forwarding null to 130582343935
> 2013-04-24 14:42:08,802 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-14:null) Seq 2-1948647462: Routing from 130580632156
> 2013-04-24 14:42:08,802 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-14:null) Seq 2-1948647462: Link is closed
> 2013-04-24 14:42:08,802 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-14:null) Seq 2-1948647462: MgmtId 130580632156: Req:
> Resource [Host:2] is unreachable: Host 2: Link is closed
> 2013-04-24 14:42:08,802 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-14:null) Seq 2--1: MgmtId 130580632156: Req: Routing
> to peer
> 2013-04-24 14:42:08,803 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-4:null) Seq 2--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:42:08,803 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-4:null) Seq 2-1948647462: Cancelling.
> 2013-04-24 14:42:08,803 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647462: Waiting some more time because
> this is the current command
> 2013-04-24 14:42:08,803 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647462: Waiting some more time because
> this is the current command
> 2013-04-24 14:42:08,803 WARN  [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647462: Timed out on null
> 2013-04-24 14:42:08,803 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647462: Cancelling.
> 2013-04-24 14:42:08,803 DEBUG [cloud.storage.StorageManagerImpl]
> (StatsCollector-3:null) Unable to send storage pool command to
> Pool[200|IscsiLUN] via 2
> com.cloud.exception.OperationTimedoutException: Commands 1948647462 to Host 2 timed out after 3600
>         at com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:425)
>         at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:501)
>         at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:454)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1937)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:448)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:462)
>         at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:307)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:679)
> 2013-04-24 14:42:08,803 INFO  [cloud.server.StatsCollector]
> (StatsCollector-3:null) Unable to reach Pool[200|IscsiLUN]
> com.cloud.exception.StorageUnavailableException: Resource [StoragePool:200] is unreachable: Unable to send command to the pool
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1947)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:448)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:462)
>         at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:307)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:679)
> 2013-04-24 14:42:08,810 DEBUG [agent.manager.ClusteredAgentAttache]
> (StatsCollector-3:null) Seq 1-875167780: Forwarding null to 130582343935
> 2013-04-24 14:42:08,810 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-5:null) Seq 1-875167780: Routing from 130580632156
> 2013-04-24 14:42:08,811 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-5:null) Seq 1-875167780: Link is closed
> 2013-04-24 14:42:08,811 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-5:null) Seq 1-875167780: MgmtId 130580632156: Req:
> Resource [Host:1] is unreachable: Host 1: Link is closed
> 2013-04-24 14:42:08,811 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-5:null) Seq 1--1: MgmtId 130580632156: Req: Routing
> to
> peer
> 2013-04-24 14:42:08,811 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-6:null) Seq 1--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:42:08,812 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-6:null) Seq 1-875167780: Cancelling.
> 2013-04-24 14:42:08,812 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167780: Waiting some more time because
> this
> is the current command
> 2013-04-24 14:42:08,812 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167780: Waiting some more time because
> this
> is the current command
> 2013-04-24 14:42:08,812 WARN  [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167780: Timed out on null
> 2013-04-24 14:42:08,812 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 1-875167780: Cancelling.
> 2013-04-24 14:42:08,812 DEBUG [cloud.storage.StorageManagerImpl]
> (StatsCollector-3:null) Unable to send storage pool command to
> Pool[201|IscsiLUN] via 1
> com.cloud.exception.OperationTimedoutException: Commands 875167780 to Host 1 timed out after 3600
>         at com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:425)
>         at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:501)
>         at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:454)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1937)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:448)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:462)
>         at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:307)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:679)
> 2013-04-24 14:42:08,817 DEBUG [agent.manager.ClusteredAgentAttache]
> (StatsCollector-3:null) Seq 2-1948647463: Forwarding null to 130582343935
> 2013-04-24 14:42:08,817 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-7:null) Seq 2-1948647463: Routing from 130580632156
> 2013-04-24 14:42:08,817 DEBUG [agent.manager.ClusteredAgentAttache]
> (AgentManager-Handler-7:null) Seq 2-1948647463: Link is closed
> 2013-04-24 14:42:08,817 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-7:null) Seq 2-1948647463: MgmtId 130580632156: Req:
> Resource [Host:2] is unreachable: Host 2: Link is closed
> 2013-04-24 14:42:08,817 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-7:null) Seq 2--1: MgmtId 130580632156: Req: Routing
> to
> peer
> 2013-04-24 14:42:08,818 DEBUG [agent.manager.ClusteredAgentManagerImpl]
> (AgentManager-Handler-8:null) Seq 2--1: MgmtId 130580632156: Req: Cancel
> request received
> 2013-04-24 14:42:08,818 DEBUG [agent.manager.AgentAttache]
> (AgentManager-Handler-8:null) Seq 2-1948647463: Cancelling.
> 2013-04-24 14:42:08,818 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647463: Waiting some more time because
> this is the current command
> 2013-04-24 14:42:08,818 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647463: Waiting some more time because
> this is the current command
> 2013-04-24 14:42:08,818 WARN  [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647463: Timed out on null
> 2013-04-24 14:42:08,819 DEBUG [agent.manager.AgentAttache]
> (StatsCollector-3:null) Seq 2-1948647463: Cancelling.
> 2013-04-24 14:42:08,819 DEBUG [cloud.storage.StorageManagerImpl]
> (StatsCollector-3:null) Unable to send storage pool command to
> Pool[201|IscsiLUN] via 2
> com.cloud.exception.OperationTimedoutException: Commands 1948647463 to Host 2 timed out after 3600
>         at com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:425)
>         at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:501)
>         at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:454)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1937)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:448)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:462)
>         at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:307)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:679)
> 2013-04-24 14:42:08,819 INFO  [cloud.server.StatsCollector]
> (StatsCollector-3:null) Unable to reach Pool[201|IscsiLUN]
> com.cloud.exception.StorageUnavailableException: Resource [StoragePool:201] is unreachable: Unable to send command to the pool
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1947)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:448)
>         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:462)
>         at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:307)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:679)
>
>
> -----Original Message-----
> From: Ahmad Emneina [mailto:aemne...@gmail.com]
> Sent: Wednesday, April 24, 2013 2:22 PM
> To: Cloudstack users mailing list
> Subject: Re: Procedure to replace management controller
>
> Hey Warren, it could be a slew of reasons. I'd like to see more before I
> make a recommendation. Would you be able to paste more logs? Say, 100-200
> more lines above the stack trace you cited.
> On Apr 24, 2013 12:12 PM, "Warren Nicholson" <
> warren.nichol...@nfinausa.com>
> wrote:
>
> > Using Cloudstack 3.02.
> >
> >
> >
> > I've done this a couple of times now, with the same result.
> >
> >
> >
> > After importing the cloud and the cloud_usage DB files from
> >
> > the mysql  backup run from the original controller, the replacement
> >
> > Controller appears fully functional with one exception.
> >
> >
> >
> > All templates, instances, physical resources, system VMs,
> >
> > Dashboard, Snapshots, etc. seem to be reporting correctly
> >
> > and the VMs are fully accessible and functional.
> >
> >
> >
> > However, the Infrastructure selection
> >
> > shows the Total Storage as 0.00 KB.
> >
> >
> >
> > The Dashboard shows it as 3.62 TB which is correct.
> >
> >
> >
> > 1. How do I get the Infrastructure to agree with the resource?
> >
> >
> >
> > I am getting the following blurb in the management-server.log:
> >
> >
> >
> > 2013-04-24 14:03:06,562 INFO  [cloud.server.StatsCollector]
> > (StatsCollector-3:null) Unable to reach Pool[201|IscsiLUN]
> >
> > com.cloud.exception.StorageUnavailableException: Resource [StoragePool:201] is unreachable: Unable to send command to the pool
> >         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1947)
> >         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:448)
> >         at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:462)
> >         at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:307)
> >         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> >         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> >         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> >         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >         at java.lang.Thread.run(Thread.java:679)
> >
> >
> >
> > Warren
> >
> >
>
>
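
For anyone reading a log like the one quoted above, a small script can show at
a glance which hosts are reported unreachable and which peer management server
the requests are being forwarded to. This is only a sketch: it assumes each log
entry sits on a single line in the actual management-server.log (the mail
client wrapped them here), and the two regular expressions are my own guess at
the message shapes, based purely on the entries in this thread.

```python
import re
from collections import Counter

# Patterns inferred from the log excerpts quoted in this thread (assumption).
UNREACHABLE = re.compile(r"Resource \[Host:(\d+)\] is unreachable")
FORWARDED = re.compile(r"Forwarding .* to (\d+)\s*$")


def summarize(lines):
    """Count unreachable-host reports and forwarding targets in log lines."""
    hosts, peers = Counter(), Counter()
    for line in lines:
        match = UNREACHABLE.search(line)
        if match:
            hosts[match.group(1)] += 1
        match = FORWARDED.search(line)
        if match:
            peers[match.group(1)] += 1
    return hosts, peers


# Example call, with the log path supplied by the reader:
#   with open(path_to_log) as handle:
#       hosts, peers = summarize(handle)
```

If every forwarded request targets the same peer ID and every one of them ends
in "Link is closed", that is a hint the new server may still be trying to route
through the old management server's entry.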
