Restarted the MS and tailed management-server.log. Unfortunately I'm not seeing any smoking guns in the logs to explain why my install is being cranky:

2013-03-08 14:44:55,815 DEBUG [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone host is ready, but secondary storage vm template: 1 is not ready on secondary storage: 6
2013-03-08 14:44:55,817 DEBUG [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone 1 is not ready to launch secondary storage VM yet
2013-03-08 14:44:56,091 DEBUG [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone host is ready, but console proxy template: 1 is not ready on secondary storage: 6
2013-03-08 14:44:56,092 DEBUG [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone 1 is not ready to launch console proxy yet

Although I did catch an interesting error when the MS was talking to the agent on my HV host:

2013-03-08 14:44:05,628 INFO [xen.discoverer.XcpServerDiscoverer] (AgentTaskPool-1:null) Host: CS-XS-01 connected with hypervisor type: XenServer. Checking CIDR...
2013-03-08 14:44:05,782 DEBUG [cloud.resource.ResourceState] (AgentTaskPool-1:null) Resource state update: [id = 3; name = CS-XS-01; old state = Enabled; event = InternalCreated; new state = Enabled]
2013-03-08 14:44:05,783 DEBUG [cloud.host.Status] (AgentTaskPool-1:null) Transition:[Resource state = Enabled, Agent event = AgentConnected, Host id = 3, name = CS-XS-01]
2013-03-08 14:44:05,805 DEBUG [cloud.host.Status] (AgentTaskPool-1:null) Agent status update: [id = 3; name = CS-XS-01; old status = Disconnected; event = AgentConnected; new status = Connecting; old update count = 21; new update count = 22]
2013-03-08 14:44:05,808 DEBUG [agent.manager.ClusteredAgentManagerImpl] (AgentTaskPool-1:null) create ClusteredDirectAgentAttache for 3
2013-03-08 14:44:05,809 INFO [agent.manager.DirectAgentAttache] (AgentTaskPool-1:null) StartupAnswer received 3 Interval = 60
2013-03-08 14:44:05,819 DEBUG [agent.manager.AgentManagerImpl] (AgentTaskPool-1:null) Sending Connect to listener: XcpServerDiscoverer$$EnhancerByCGLIB$$d3a31083
2013-03-08 14:44:05,829 DEBUG [xen.discoverer.XcpServerDiscoverer] (AgentTaskPool-1:null) Setting up host 3
2013-03-08 14:44:05,839 DEBUG [agent.transport.Request] (AgentTaskPool-1:null) Seq 3-1184497665: Sending { Cmd , MgmtId: 345050858316, via: 3, Ver: v1, Flags: 100111, [{"SetupCommand":{"env":{},"multipath":false,"needSetup":false,"wait":0}}] }
2013-03-08 14:44:05,840 DEBUG [agent.transport.Request] (AgentTaskPool-1:null) Seq 3-1184497665: Executing: { Cmd , MgmtId: 345050858316, via: 3, Ver: v1, Flags: 100111, [{"SetupCommand":{"env":{},"multipath":false,"needSetup":false,"wait":0}}] }
2013-03-08 14:44:05,843 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-1:null) Seq 3-1184497665: Executing request
2013-03-08 14:44:06,038 INFO [xen.resource.CitrixResourceBase] (DirectAgent-1:null) Host 147.26.14.170 OpaqueRef:be755447-313a-6624-38e5-69c15735b352: Host 147.26.14.170 is already setup.
2013-03-08 14:44:08,719 WARN [xen.resource.CitrixResourceBase] (DirectAgent-1:null) forget SR catch Exception due to The server failed to handle your request, due to an internal error. The given message may give details useful for debugging the problem.
    at com.xensource.xenapi.Types.checkResponse(Types.java:1510)
    at com.xensource.xenapi.Connection.dispatch(Connection.java:368)
    at com.cloud.hypervisor.xen.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:909)
    at com.xensource.xenapi.PBD.unplug(PBD.java:465)
    at com.cloud.hypervisor.xen.resource.CitrixResourceBase.cleanupTemplateSR(CitrixResourceBase.java:4518)
    at com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:4544)
    at com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:485)
    at com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:73)
    at com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:191)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
2013-03-08 14:44:08,857 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-1:null) Seq 3-1184497665: Response Received:
2013-03-08 14:44:08,858 DEBUG [agent.transport.Request] (DirectAgent-1:null) Seq 3-1184497665: Processing: { Ans: , MgmtId: 345050858316, via: 3, Ver: v1, Flags: 110, [{"SetupAnswer":{"_reconnect":false,"result":true,"wait":0}}] }
2013-03-08 14:44:08,858 DEBUG [agent.transport.Request] (AgentTaskPool-1:null) Seq 3-1184497665: Received: { Ans: , MgmtId: 345050858316, via: 3, Ver: v1, Flags: 110, { SetupAnswer } }

However this seems to me
unrelated.

On Fri, Mar 8, 2013 at 2:24 PM, Ahmad Emneina <aemne...@gmail.com> wrote:

> got it, let's see the full management server log. we should be able to find
> out where the MS isn't cooperating.
>
> On Fri, Mar 8, 2013 at 12:19 PM, Jason Davis <scr...@gmail.com> wrote:
>
>> Yup that's what I did, however the MS refuses to spin up a fresh copy of
>> the SSVM.
>>
>> On Fri, Mar 8, 2013 at 2:09 PM, Ahmad Emneina <aemne...@gmail.com> wrote:
>>
>>> I believe you also have to destroy the old secondary storage vm. That way
>>> it gets programmed with the new path to mount.
>>>
>>> On Fri, Mar 8, 2013 at 11:51 AM, Jason Davis <scr...@gmail.com> wrote:
>>>
>>> > Sorry for bumping this old thread but...
>>> >
>>> > Did you ever get this figured out Andrei? I am running into the exact same
>>> > issue and after some playtime in the DB I can't seem to get this to behave.
>>> >
>>> > On Mon, Feb 4, 2013 at 4:59 PM, Andrei Mikhailovsky <and...@arhont.com> wrote:
>>> >
>>> > > >did you make the db changes while the management server was up and running?
>>> > > >Have you restarted the management server since making the db modifications?
>>> > >
>>> > > AM: Yes, I've done the change while the management server was running, and
>>> > > restarted it right after the change has been made. I did go back to db
>>> > > after the restart of the management server to make sure the values have
>>> > > been saved in db. They are correct.
>>> > >
>>> > > On Mon, Feb 4, 2013 at 6:20 AM, Andrei Mikhailovsky <and...@arhont.com> wrote:
>>> > >
>>> > > > Hello guys,
>>> > > >
>>> > > > I am having an issue with the SSVM not starting after I've changed the URL
>>> > > > of the secondary storage server. I am running a single instance of CS 4.0.0
>>> > > > on CentOS 6. Here is what I've done:
>>> > > >
>>> > > > 1. I've modified the host and host_details tables in the DB to change the
>>> > > > URL of the secondary storage server.
>>> > > > 2. I've restarted the CS management server
>>> > > > 3. Logged in to CS gui and made sure the secondary storage server shows
>>> > > > correct details. It did.
>>> > > > 4. Restarted SSVM and logged in to SSVM and ran the ssvm check script. It
>>> > > > showed that nfs mountpoint is not mounted.
>>> > > > 5. Verified that SSVM has network and it can reach the nfs server. It did.
>>> > > > 6. Manually mounted the nfs share using: mount -t nfs -o mountproto=tcp
>>> > > > server:/path /path. That worked as well.
>>> > > > 7. Restarted SSVM again and ran the check script again. No joy.
>>> > > > 8. Deleted SSVM server hoping CS would create a new ssvm instance and all
>>> > > > will work okay. The new SSVM is not being created. Log file entries show:
>>> > > >
>>> > > > ----
>>> > > >
>>> > > > 2013-02-04 13:57:19,336 DEBUG [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone host is ready, but secondary storage vm template: 3 is not ready on secondary storage: 6
>>> > > > 2013-02-04 13:57:19,336 DEBUG [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone 1 is not ready to launch secondary storage VM yet
>>> > > > 2013-02-04 13:57:19,444 DEBUG [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone host is ready, but console proxy template: 3 is not ready on secondary storage: 6
>>> > > > 2013-02-04 13:57:19,444 DEBUG [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone 1 is not ready to launch console proxy yet
>>> > > > 2013-02-04 13:57:19,956 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 7 routers.
>>> > > > 2013-02-04 13:57:23,600 DEBUG [agent.manager.AgentManagerImpl] (AgentManager-Handler-8:null) Ping from 21
>>> > > > 2013-02-04 13:57:35,517 DEBUG [agent.manager.AgentManagerImpl] (AgentManager-Handler-13:null) Ping from 20
>>> > > > 2013-02-04 13:57:41,166 DEBUG [cloud.server.StatsCollector] (StatsCollector-1:null) StorageCollector is running...
>>> > > > 2013-02-04 13:57:41,168 DEBUG [cloud.server.StatsCollector] (StatsCollector-1:null) There is no secondary storage VM for secondary storage host nfs://192.168.169.200/cloudstack-secondary
>>> > > >
>>> > > > ----
>>> > > >
>>> > > > I do not see any errors or exceptions in the logs. I've even rebooted the
>>> > > > CS management server. Still, no joy ((
>>> > > >
>>> > > > I've checked the vm_template table and the template with id 3 looks okay:
>>> > > >
>>> > > > | 3 | routing-3 | SystemVM Template (KVM) | 8d335295-558c-4378-839a-f2e816aebb6c | 0 | 0 | SYSTEM | 0 | 64 | http://download.cloud.com/templates/acton/acton-systemvm-02062012.qcow2.bz2 | QCOW2 | 2012-10-29 23:39:25 | NULL | 1 | 2755de1f9ef2ce4d6f2bee2efbb4da92 | SystemVM Template (KVM) | 0 | 0 | 15 | 1 | 0 | 1 | 0 | KVM | NULL | NULL | 0 |
>>> > > >
>>> > > > The secondary storage host entry has an Alert status (which could cause
>>> > > > the problem):
>>> > > >
>>> > > > | 6 | nfs://192.168.169.200/cloudstack-secondary | 8e143df9-580c-481d-9e1d-eadfe7474867 | Alert | SecondaryStorage | nfs | 255.255.255.0 | 00:19:bb:34:35:1e | 192.168.169.250 | 255.255.255.0 | 00:19:bb:34:35:1e | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 | NULL | NULL | NULL | nfs://192.168.169.200/cloudstack-secondary | NULL | None | NULL | 0 | NULL | 4.0.0.20121029120443 | 4e31c7b3-9333-3e6f-8a04-86d4bec5b576 | 2064199680 | NULL | nfs://192.168.169.200/cloudstack-secondary | 1 | 0 | 0 | 1319922183 | NULL | NULL | 2012-10-30 12:31:55 | NULL | 3 | Enabled |
>>> > > >
>>> > > > I am not sure if I can simply change the db entry of the Status column
>>> > > > from Alert to UP? I do not want to lose the secondary storage server as
>>> > > > I've got a bunch of templates, isos and snapshots that I do not want to
>>> > > > recreate. Does anyone know what else to try to get back the SSVM?
>>> > > >
>>> > > > Many thanks
>>> > > >
>>> > > > Andrei