Hi Wei,

Please see:

root@acs-apac:/etc/cloudstack/management# hostname -f
acs-apac

One thing I just noticed in the hosts table: mgmt_server_id still corresponds to the previous server. Right now, all the nodes are in state Up, but with a timeout in the logs. I changed mgmt_server_id for one host to the new id (the current server), and its state was updated to Disconnected.

Now I'm a bit confused: should all hosts not have the new mgmt_server_id, or is this updated over time?

For clarification, the hosts are up and reachable from CloudStack and vice versa.

Regards,
Cristian

-----Original Message-----
From: Wei ZHOU <[email protected]>
Sent: Wednesday, January 17, 2024 4:56 PM
To: [email protected]
Subject: Re: MgmtId 345050133919: Req: Resource [Host:11] is unreachable: Host 11: Link is closed

The msid should be the same after restart. Can you check if "hostname -f" works?

On Wednesday, January 17, 2024, <[email protected]> wrote:

> I already have a headache. Anyone here who can provide support for this (not for a fee)? I need to restore this CloudStack, but it is impossible on this version.
>
> Now, for any task I execute, I see this repeated:
>
> 2024-01-17 14:41:20,892 DEBUG [c.c.a.m.ClusteredAgentAttache] (AgentManager-Handler-9:null) (logid:) Seq 6-5811613844144914446: Forwarding Seq 6-5811613844144914446: { Cmd , MgmtId: 345050149016, via: 6, Ver: v1, Flags: 100111, [{"com.cloud.agent.api.RebootCommand":{"vmName":"s-24960-VM","executeInSequence":"true","wait":"0","bypassHostMaintenance":"false"}}] } to 52239141482
>
> So, I only see this in the log, one after another. Here I just tried to reboot a secondary storage to see if it works. No matter what I try, I see a similar repetitive event in the log, and it will not stop until I restart cloudstack-management.
>
> Regards,
> Cristian
>
> -----Original Message-----
> From: [email protected] <[email protected]>
> Sent: Wednesday, January 17, 2024 12:46 PM
> To: [email protected]
> Subject: RE: MgmtId 345050133919: Req: Resource [Host:11] is unreachable: Host 11: Link is closed
>
> Hi Wei,
>
> I just did this, now I set this for both entries:
>
> |id |msid           |runid            |name    |uuid                                |state|version |service_ip   |service_port|last_update        |removed            |alert_count|
> |---|---------------|-----------------|--------|------------------------------------|-----|--------|-------------|------------|-------------------|-------------------|-----------|
> |1  |52,239,141,482 |1,704,875,017,724|acs-apac|bd77b0ae-66d4-4783-bc17-5c3707cc1e49|Down |4.18.1.0|181.41.xxx.12|9,090       |2024-01-16 08:00:01|2024-01-17 12:36:18|0          |
> |2  |345,050,133,919|1,705,475,667,602|acs-apac|5fd7fd16-8134-4408-b5b7-fa11327ebe29|Down |4.18.1.0|181.41.xxx.12|9,090       |2024-01-17 07:20:02|2024-01-17 12:36:23|0          |
>
> And after start I see this:
>
> |id |msid           |runid            |name    |uuid                                |state|version |service_ip   |service_port|last_update        |removed            |alert_count|
> |---|---------------|-----------------|--------|------------------------------------|-----|--------|-------------|------------|-------------------|-------------------|-----------|
> |1  |52,239,141,482 |1,704,875,017,724|acs-apac|bd77b0ae-66d4-4783-bc17-5c3707cc1e49|Down |4.18.1.0|181.41.xxx.12|9,090       |2024-01-16 08:00:01|2024-01-17 12:36:18|0          |
> |2  |345,050,133,919|1,705,487,986,668|acs-apac|5fd7fd16-8134-4408-b5b7-fa11327ebe29|Up   |4.18.1.0|181.41.xxx.12|9,090       |2024-01-17 10:41:05|                   |0          |
>
> This is what I have in mshost status:
>
> |id |ms_id                               |last_jvm_start     |last_jvm_stop      |last_system_boot   |os_distribution   |java_name|java_version                       |updated            |created            |removed|
> |---|------------------------------------|-------------------|-------------------|-------------------|------------------|---------|-----------------------------------|-------------------|-------------------|-------|
> |2  |bd77b0ae-66d4-4783-bc17-5c3707cc1e49|2024-01-10 08:23:25|2024-01-10 08:16:12|2024-01-10 08:23:15|Ubuntu 20.04.6 LTS|Ubuntu   |11.0.21+9-post-Ubuntu-0ubuntu120.04|2024-01-16 07:59:59|2022-09-30 07:40:52|       |
> |3  |5fd7fd16-8134-4408-b5b7-fa11327ebe29|2024-01-17 07:14:16|2024-01-17 07:14:14|2024-01-16 21:34:09|Ubuntu 22.04.3 LTS|Ubuntu   |11.0.21+9-post-Ubuntu-0ubuntu122.04|2024-01-17 07:19:46|2024-01-16 11:53:18|       |
>
> And I have the same issue, host timeout.
>
> I'm out of ideas.....
>
> Thank you,
> Cristian
>
> -----Original Message-----
> From: Wei ZHOU <[email protected]>
> Sent: Wednesday, January 17, 2024 12:19 PM
> To: [email protected]
> Subject: Re: MgmtId 345050133919: Req: Resource [Host:11] is unreachable: Host 11: Link is closed
>
> Hi,
>
> Can you mark the mgmt server as Down (or set "removed" to "now()") in the "mshost" table, and retry?
>
> -Wei
>
> On Wed, 17 Jan 2024 at 10:29, <[email protected]> wrote:
>
> > Looks like restoring on the exact same OS version is not fixing the issue...
> > when starting CloudStack, I get this error:
> >
> > 2024-01-17 09:26:26,067 DEBUG [c.c.u.s.Script] (main:null) (logid:) System resource: file:/usr/share/cloudstack-common/scripts/vm/systemvm/injectkeys.sh
> > 2024-01-17 09:26:26,067 DEBUG [c.c.u.s.Script] (main:null) (logid:) Absolute path = /usr/share/cloudstack-common/scripts/vm/systemvm/injectkeys.sh
> > 2024-01-17 09:26:26,068 DEBUG [c.c.s.ConfigurationServerImpl] (main:null) (logid:) Executing: /bin/bash /usr/share/cloudstack-common/scripts/vm/systemvm/injectkeys.sh /var/lib/cloudstack/management/.ssh/id_rsa
> > 2024-01-17 09:26:26,070 DEBUG [c.c.s.ConfigurationServerImpl] (main:null) (logid:) Executing while with timeout : 3600000
> > 2024-01-17 09:26:26,078 DEBUG [c.c.s.ConfigurationServerImpl] (main:null) (logid:) Execution is successful.
> > 2024-01-17 09:26:26,083 INFO [c.c.s.ConfigurationServerImpl] (main:null) (logid:) The script injectkeys.sh was run with result : null
> > 2024-01-17 09:26:26,106 DEBUG [c.c.u.d.DbProperties] (main:null) (logid:) DB properties were already loaded
> > 2024-01-17 09:26:26,106 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (main:null) (logid:) Start AsyncJobManager API executor thread pool in size 125
> > 2024-01-17 09:26:26,107 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (main:null) (logid:) Start AsyncJobManager Work executor thread pool in size 166
> > 2024-01-17 09:26:26,108 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Start configuring cluster manager : ClusterManagerImpl
> > 2024-01-17 09:26:26,108 DEBUG [c.c.u.d.DbProperties] (main:null) (logid:) DB properties were already loaded
> > 2024-01-17 09:26:26,109 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Cluster node IP : 10.60.0.2
> > 2024-01-17 09:26:26,122 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Trying to connect to 10.60.0.2
> > 2024-01-17 09:26:26,133 ERROR [c.c.c.ClusterManagerImpl] (main:null) (logid:) Unable to ping management server at 10.60.0.2:9090 due to ConnectException
> > 2024-01-17 09:26:26,134 DEBUG [c.c.c.ClusterManagerImpl] (main:null) (logid:) Unable to ping management server at 10.60.0.2:9090 due to ConnectException
> > java.net.ConnectException: Connection refused
> >     at java.base/sun.nio.ch.Net.connect0(Native Method)
> >     at java.base/sun.nio.ch.Net.connect(Net.java:483)
> >     at java.base/sun.nio.ch.Net.connect(Net.java:472)
> >     at java.base/sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:692)
> >     at com.cloud.cluster.ClusterManagerImpl.pingManagementNode(ClusterManagerImpl.java:1190)
> >     at com.cloud.cluster.ClusterManagerImpl.pingManagementNode(ClusterManagerImpl.java:1159)
> >     at com.cloud.cluster.ClusterManagerImpl.checkConflicts(ClusterManagerImpl.java:1240)
> >     at com.cloud.cluster.ClusterManagerImpl.configure(ClusterManagerImpl.java:1115)
> >     at org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle$3.with(CloudStackExtendedLifeCycle.java:114)
> >     at org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.with(CloudStackExtendedLifeCycle.java:153)
> >     at org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.configure(CloudStackExtendedLifeCycle.java:110)
> >     at org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.start(CloudStackExtendedLifeCycle.java:55)
> >     at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:178)
> >     at org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:54)
> >     at org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:356)
> >     at java.base/java.lang.Iterable.forEach(Iterable.java:75)
> >     at org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:155)
> >     at org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:123)
> >     at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:937)
> >     at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
> >     at org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.loadContext(DefaultModuleDefinitionSet.java:144)
> >     at org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet$2.with(DefaultModuleDefinitionSet.java:121)
> >     at org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:244)
> >     at org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:249)
> >     at org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:249)
> >     at org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:232)
> >     at org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.loadContexts(DefaultModuleDefinitionSet.java:116)
> >     at org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.load(DefaultModuleDefinitionSet.java:78)
> >     at org.apache.cloudstack.spring.module.factory.ModuleBasedContextFactory.loadModules(ModuleBasedContextFactory.java:37)
> >     at org.apache.cloudstack.spring.module.factory.CloudStackSpringContext.init(CloudStackSpringContext.java:70)
> >     at org.apache.cloudstack.spring.module.factory.CloudStackSpringContext.<init>(CloudStackSpringContext.java:57)
> >     at org.apache.cloudstack.spring.module.factory.CloudStackSpringContext.<init>(CloudStackSpringContext.java:61)
> >     at org.apache.cloudstack.spring.module.web.CloudStackContextLoaderListener.contextInitialized(CloudStackContextLoaderListener.java:52)
> >     at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:1073)
> >     at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:572)
> >     at org.eclipse.jetty.server.handler.ContextHandler.contextInitialized(ContextHandler.java:1002)
> >     at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:765)
> >     at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:379)
> >     at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1449)
> >     at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1414)
> >     at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:916)
> >     at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:288)
> >     at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
> >     at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
> >     at org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:426)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
> >     at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
> >     at org.eclipse.jetty.server.Server.start(Server.java:423)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
> >     at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
> >     at org.eclipse.jetty.server.Server.doStart(Server.java:387)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
> >     at org.apache.cloudstack.ServerDaemon.start(ServerDaemon.java:192)
> >     at org.apache.cloudstack.ServerDaemon.main(ServerDaemon.java:107)
> > 2024-01-17 09:26:26,137 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Detected that another management node with the same IP 10.60.0.2 is considered as running in DB, however it is not pingable, we will continue cluster initialization with this management server node
> > 2024-01-17 09:26:26,137 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Cluster manager is configured.
> >
> > Regards,
> > Cristian
> >
> > -----Original Message-----
> > From: [email protected] <[email protected]>
> > Sent: Wednesday, January 17, 2024 9:24 AM
> > To: [email protected]
> > Subject: RE: MgmtId 345050133919: Req: Resource [Host:11] is unreachable: Host 11: Link is closed
> >
> > Hi Jithin,
> >
> > I changed the status to Down and set it as removed. I still have the same issue... So, I think I must create a new server with the previous OS version; in any case, I am not sure what the logic is.
> >
> > Thank you!
> > Cristian
> >
> > -----Original Message-----
> > From: Jithin Raju <[email protected]>
> > Sent: Wednesday, January 17, 2024 6:16 AM
> > To: [email protected]
> > Subject: Re: MgmtId 345050133919: Req: Resource [Host:11] is unreachable: Host 11: Link is closed
> >
> > Hi Cristian,
> >
> > Could you locate any stale management server record in the mshost table? If there is any, specifically the one in the error, update it as removed.
> >
> > -Jithin
> >
> > From: Cristian Ciobanu <[email protected]>
> > Date: Wednesday, 17 January 2024 at 4:16 AM
> > To: [email protected] <[email protected]>
> > Subject: Re: MgmtId 345050133919: Req: Resource [Host:11] is unreachable: Host 11: Link is closed
> >
> > Looks like I have the same issue as I had some time ago: "host time out after cloudstack management restore", an old topic from 2022. That time I restored the backup on a different OS, from Rocky to Ubuntu, but now I used the same OS, just a different version.
> >
> > This looks like a bug. I'm trying to identify the root cause.
> >
> > Regards,
> > Cristian
> >
> > On Wed, Jan 17, 2024, 00:20 <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > I have a CloudStack 4.18.1 restored from a DB backup due to server failure. The CloudStack is running; I see all the hosts with state 'Up', but it looks like it's stuck, nothing is working. I get a similar error for all hosts: “MgmtId 345050133919: Req: Resource [Host:11] is unreachable: Host 11: Link is closed,” even if the hosts are reachable from CloudStack and vSphere, where they are connected and manageable. I do not see any events in vSphere.
> > >
> > > Any suggestions as to why CloudStack does not see that the servers are reachable or why I get a timeout?
> > >
> > > Thank you,
> > >
> > > Cristian
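
Editorial sketch, not part of the original messages: the check described at the top of this thread (hosts whose mgmt_server_id still points at the old management server while only the new msid is Up) can be expressed as a read-only query. The example below is a minimal illustration under stated assumptions. It uses an in-memory SQLite database in place of the real MySQL "cloud" database, the two tables are reduced to a handful of columns, and the host names and host id 12 are hypothetical. The two msid values are the ones quoted in the thread. It only reports rows and changes nothing.

```python
import sqlite3


def find_hosts_with_stale_owner(conn):
    """Return (host_id, mgmt_server_id) for hosts whose owning management
    server has no matching mshost row that is both Up and not removed."""
    query = """
        SELECT h.id, h.mgmt_server_id
        FROM host AS h
        LEFT JOIN mshost AS m
               ON m.msid = h.mgmt_server_id
              AND m.state = 'Up'
              AND m.removed IS NULL
        WHERE h.removed IS NULL
          AND h.mgmt_server_id IS NOT NULL
          AND m.id IS NULL
        ORDER BY h.id
    """
    return conn.execute(query).fetchall()


def build_demo_db():
    """Build a tiny stand-in database (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE mshost (
            id INTEGER PRIMARY KEY, msid INTEGER, state TEXT, removed TEXT
        );
        CREATE TABLE host (
            id INTEGER PRIMARY KEY, name TEXT, status TEXT,
            mgmt_server_id INTEGER, removed TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO mshost VALUES (?, ?, ?, ?)",
        [
            (1, 52239141482, "Down", "2024-01-17 12:36:18"),  # old server
            (2, 345050133919, "Up", None),                    # current server
        ],
    )
    conn.executemany(
        "INSERT INTO host VALUES (?, ?, ?, ?, ?)",
        [
            (6, "host-6", "Up", 52239141482, None),             # hypothetical name
            (11, "host-11", "Up", 52239141482, None),           # hypothetical name
            (12, "host-12", "Disconnected", 345050133919, None),  # hypothetical row
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for host_id, owner in find_hosts_with_stale_owner(demo):
        print(f"host {host_id} is still owned by msid {owner}")
```

If a query like this returns rows on a real installation, that would be consistent with the "Forwarding ... to 52239141482" log line quoted earlier, but whether and how to correct those rows is a question for the list, not something this sketch answers.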
