Hi Ahmad,

Thanks for the mail. Telnet to port 443 on CS from the XS hosts failed. I did not see any service listening on 443 on the CS management server. Is something wrong with the management server?

Thanks
Leeno

On Tue, Jul 16, 2013 at 11:50 PM, Ahmad Emneina <aemne...@gmail.com> wrote:

> can you check if your hosts can connect back to the management server
> (ping, telnet 22,443)? there might be some firewall rules in place, or
> routing issues, preventing this.
>
>
> On Tue, Jul 16, 2013 at 9:01 AM, Leeno Jose.P.A <leeno...@gmail.com> wrote:
>
> > CS startup logs,
> >
> > 2013-07-16 11:25:30,702 INFO [utils.component.ComponentContext] (Timer-1:null) Starting com.cloud.network.guru.NiciraNvpGuestNetworkGuru_EnhancerByCloudStack_1f6b4bb6
> > 2013-07-16 11:25:30,702 INFO [utils.component.ComponentContext] (Timer-1:null) Starting com.cloud.server.ManagementServerImpl_EnhancerByCloudStack_d54e1bb1
> > 2013-07-16 11:25:30,702 INFO [cloud.server.ManagementServerImpl] (Timer-1:null) Startup CloudStack management server...
> > 2013-07-16 11:25:30,707 INFO [cloud.cluster.ClusterServiceServletContainer] (Thread-18:null) Cluster service servlet container listening on port 9090
> > 2013-07-16 11:25:31,832 DEBUG [utils.db.ConnectionConcierge] (Cluster-Heartbeat-1:null) Registering a database connection for ClusterManagerHeartBeat2
> > 2013-07-16 11:25:31,845 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) We are good, no orphan management server msid in host table is found
> > 2013-07-16 11:25:31,845 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Found 1 inactive management server node based on timestamp
> > 2013-07-16 11:25:31,846 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) management server node msid: 130602634328, name: cstagcms, service ip: 192.168.10.251, version: 4.1.0
> > 2013-07-16 11:25:31,846 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to 192.168.10.251
> > 2013-07-16 11:25:31,860 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node joined, id:2, nodeIP:192.168.10.251
> > 2013-07-16 11:25:33,348 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Notification-1:null) Notify management server node join to listeners.
> > 2013-07-16 11:25:33,349 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Notification-1:null) Joining node, IP: 192.168.10.251, msid: 81375086018793
> > 2013-07-16 11:25:33,350 DEBUG [cloud.alert.ClusterAlertAdapter] (Cluster-Notification-1:null) Receive cluster alert, EventArgs: com.cloud.cluster.ClusterNodeJoinEventArgs
> > 2013-07-16 11:25:33,350 DEBUG [cloud.alert.ClusterAlertAdapter] (Cluster-Notification-1:null) Handle cluster node join alert, joined node: 192.168.10.251, msidL: 81375086018793
> > 2013-07-16 11:25:33,350 DEBUG [cloud.alert.ClusterAlertAdapter] (Cluster-Notification-1:null) Management server node 192.168.10.251 is up, send alert
> > 2013-07-16 11:25:33,361 WARN [cloud.cluster.ClusterManagerImpl] (Cluster-Notification-1:null) Notifying management server join event took 12 ms
> > 2013-07-16 11:25:45,450 DEBUG [cloud.server.StatsCollector] (StatsCollector-2:null) HostStatsCollector is running...
> > 2013-07-16 11:25:45,452 DEBUG [cloud.server.StatsCollector] (StatsCollector-1:null) VmStatsCollector is running...
> > 2013-07-16 11:25:45,467 DEBUG [cloud.server.StatsCollector] (StatsCollector-3:null) StorageCollector is running...
> > 2013-07-16 11:25:45,498 DEBUG [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-2:null) create forwarding ClusteredAgentAttache for 39
> > 2013-07-16 11:25:45,491 DEBUG [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-3:null) create forwarding ClusteredAgentAttache for 50
> > 2013-07-16 11:25:45,751 INFO [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-3:null) SSL: Handshake done
> > 2013-07-16 11:25:45,752 DEBUG [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-3:null) Connection to peer opened: 130602634328, ip: 192.168.10.251
> > 2013-07-16 11:25:45,757 DEBUG [agent.manager.ClusteredAgentAttache] (StatsCollector-2:null) Seq 39-282525697: Forwarding null to 130602634328
> > 2013-07-16 11:25:45,758 DEBUG [agent.manager.ClusteredAgentAttache] (StatsCollector-3:null) Seq 50-1962541057: Forwarding null to 130602634328
> > 2013-07-16 11:25:45,804 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-2:null) Seq 39-282525697: Routing from 81375086018793
> > 2013-07-16 11:25:45,804 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-2:null) Seq 39-282525697: Link is closed
> > 2013-07-16 11:25:45,806 DEBUG [agent.manager.ClusteredAgentManagerImpl] (AgentManager-Handler-2:null) Seq 39-282525697: MgmtId 81375086018793: Req: Resource [Host:39] is unreachable: Host 39: Link is closed
> >
> >
> > Thanks
> > Leeno
> >
> >
> > On Tue, Jul 16, 2013 at 6:10 PM, Leeno Jose.P.A <leeno...@gmail.com> wrote:
> >
> >> Hi Todd,
> >>
> >> Thanks for the help.
> >>
> >> I executed the steps as you mentioned above but that did not help. Still
> >> I get same error message. But I can do ping, telnet ports 22, 80 and 443
> >> on XS hosts from CS.
> >>
> >> Thanks
> >> Leeno
> >>
> >>
> >> On Tue, Jul 16, 2013 at 5:12 PM, Todd Pigram <t...@toddpigram.com> wrote:
> >>
> >>> Did you remove the Tags on each XenServer host prior to starting?
> >>>
> >>> Management Controller Failure and Replacement
> >>>
> >>> In setting up your cloud, you should have a backup routine for your
> >>> controller. The most important item to back up is the MySQL databases
> >>> that Cloudstack uses. A suitable backup script is attached to this page.
> >>> In the event of a cloud management controller failure, the steps to
> >>> replace the controller with a new one are:
> >>>
> >>> These instructions assume your cluster is Xenserver - Contributors
> >>> using other Hypervisor OSs, please contribute.
> >>>
> >>> 1. Setup new management server hardware
> >>> 2. Install your OS
> >>> 3. Install Cloudstack, up to and including the "Install Database step"
> >>> 4. Import your database backup
> >>> 5. In Xencenter, connect to your Cloudstack host pool.
> >>> 6. On each host, remove the tags on Host > General Tab > Tags by
> >>>    editing the tags and un-checking each one.
> >>> 7. On the management controller, start Cloudstack
> >>>    1. service cloud-management start
> >>> 8. the new cloud management controller will connect to each host in
> >>>    the database and push out new tags and keys to each host in the pool.
> >>>
> >>>
> >>> On Jul 16, 2013, at 1:13 AM, Leeno Jose.P.A <leeno...@gmail.com> wrote:
> >>>
> >>> After restoring the old database dump to the new installation, CS is
> >>> unable to contact the Xenserver hosts. I am getting the following errors
> >>> in management-server.log:
> >>>
> >>>
> >>> 2013-07-15 11:57:49,646 DEBUG [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-1:null) Connection to peer opened: 130602634328, ip: 192.168.10.251
> >>> 2013-07-15 11:57:49,652 DEBUG [agent.manager.ClusteredAgentAttache] (StatsCollector-2:null) Seq 50-185008129: Forwarding null to 130602634328
> >>> 2013-07-15 11:57:49,662 DEBUG [agent.manager.ClusteredAgentAttache] (StatsCollector-1:null) Seq 39-1272840193: Forwarding null to 130602634328
> >>> 2013-07-15 11:57:49,699 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-2:null) Seq 50-185008129: Routing from 81375086018793
> >>> 2013-07-15 11:57:49,699 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-2:null) Seq 50-185008129: Link is closed
> >>> 2013-07-15 11:57:49,699 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-3:null) Seq 39-1272840193: Routing from 81375086018793
> >>> 2013-07-15 11:57:49,700 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-3:null) Seq 39-1272840193: Link is closed
> >>> 2013-07-15 11:57:49,700 DEBUG [agent.manager.ClusteredAgentManagerImpl] (AgentManager-Handler-3:null) Seq 39-1272840193: MgmtId 81375086018793: Req: Resource [Host:39] is unreachable: Host 39: Link is closed
> >>>
> >>>
> >>> 2013-07-15 11:57:49,861 DEBUG [agent.manager.ClusteredAgentManagerImpl] (AgentManager-Handler-8:null) Seq 39--1: MgmtId 81375086018793: Req: Cancel request received
> >>> 2013-07-15 11:57:49,861 DEBUG [agent.manager.AgentAttache] (AgentManager-Handler-8:null) Seq 39-1272840194: Cancelling.
> >>> 2013-07-15 11:57:49,861 DEBUG [agent.manager.AgentAttache] (StatsCollector-2:null) Seq 39-1272840194: Waiting some more time because this is the current command
> >>> 2013-07-15 11:57:49,862 DEBUG [agent.manager.AgentAttache] (StatsCollector-2:null) Seq 39-1272840194: Waiting some more time because this is the current command
> >>> 2013-07-15 11:57:49,862 INFO [utils.exception.CSExceptionErrorCode] (StatsCollector-2:null) Could not find exception: com.cloud.exception.OperationTimedoutException in error code list for exceptions
> >>> 2013-07-15 11:57:49,862 WARN [agent.manager.AgentAttache] (StatsCollector-2:null) Seq 39-1272840194: Timed out on null
> >>> 2013-07-15 11:57:49,862 DEBUG [agent.manager.AgentAttache] (StatsCollector-2:null) Seq 39-1272840194: Cancelling.
> >>> 2013-07-15 11:57:49,863 DEBUG [cloud.storage.StorageManagerImpl] (StatsCollector-2:null) Unable to send storage pool command to Pool[210|NetworkFilesystem] via 39
> >>> com.cloud.exception.OperationTimedoutException: Commands 1272840194 to Host 39 timed out after 3600
> >>>     at com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:429)
> >>>     at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:511)
> >>>     at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:464)
> >>>     at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:2347)
> >>>     at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:422)
> >>>     at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:436)
> >>>     at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:316)
> >>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> >>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> >>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> >>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> >>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> >>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>     at java.lang.Thread.run(Thread.java:679)
> >>> 2013-07-15 11:57:49,863 INFO [cloud.server.StatsCollector] (StatsCollector-2:null) Unable to reach Pool[210|NetworkFilesystem]
> >>> com.cloud.exception.StorageUnavailableException: Resource [StoragePool:210] is unreachable: Unable to send command to the pool
> >>>     at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:2357)
> >>>     at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:422)
> >>>     at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:436)
> >>>     at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:316)
> >>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> >>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> >>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> >>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> >>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> >>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>     at java.lang.Thread.run(Thread.java:679)
> >>>
> >>>
> >>> Thanks
> >>> Leeno
> >>>
> >>>
> >>> On Tue, Jul 16, 2013 at 10:21 AM, Leeno Jose.P.A <leeno...@gmail.com> wrote:
> >>>
> >>> This is a dev box. We are planning an HA-enabled environment for the
> >>> prod setup. Thanks Geoff.
> >>>
> >>>
> >>> On Tue, Jul 16, 2013 at 12:11 AM, Geoff Higginbottom <geoff.higginbot...@shapeblue.com> wrote:
> >>>
> >>> Hi Leeno,
> >>>
> >>> In theory that should work, but obviously you will lose all changes made
> >>> since the dump was taken. If any new VMs have been created, they will
> >>> get purged by the system etc.
> >>>
> >>> I would highly recommend splitting the DB and the Management Server, and
> >>> if possible add a 2nd instance of each.
> >>>
> >>> Regards
> >>>
> >>> Geoff Higginbottom
> >>>
> >>> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
> >>>
> >>> geoff.higginbot...@shapeblue.com
> >>>
> >>> -----Original Message-----
> >>> From: Leeno Jose.P.A [mailto:leeno...@gmail.com]
> >>> Sent: 15 July 2013 18:46
> >>> To: users@cloudstack.apache.org
> >>> Subject: Re: Rebuilding management server
> >>>
> >>> Hi Geoff,
> >>>
> >>> 1. I have only one management server.
> >>> 2. Management server is not functioning now but 'cloud' database dump is
> >>>    available in backup. CS version was 4.1.0. Hosts were Xenserver 6.1.0.
> >>> 3. DB server was on the same machine where the management server was
> >>>    installed.
> >>>
> >>> Now I am planning to do a fresh install of CS 4.1.0 and restore the cloud
> >>> database with the old installation dump, which is available in backup.
> >>> Will it work?
> >>>
> >>> Thanks
> >>> Leeno
> >>>
> >>>
> >>> On Mon, Jul 15, 2013 at 9:56 PM, Geoff Higginbottom <geoff.higginbot...@shapeblue.com> wrote:
> >>>
> >>> The Management Servers are 'Stateless' so as Chip points out, it's the
> >>> DB that stores all the info.
> >>>
> >>> How you actually go about it depends on your current setup.
> >>>
> >>> 1. How many management servers do you currently have?
> >>> 2. Are the original Management Server(s) still functioning, or are
> >>>    they down?
> >>> 3. Is DB on a separate server, or the same as the Management Server?
> >>>
> >>> Regards
> >>>
> >>> Geoff Higginbottom
> >>>
> >>> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
> >>>
> >>> geoff.higginbot...@shapeblue.com
> >>>
> >>> -----Original Message-----
> >>> From: Chip Childers [mailto:chip.child...@sungard.com]
> >>> Sent: 15 July 2013 15:50
> >>> To: users@cloudstack.apache.org
> >>> Subject: Re: Rebuilding management server
> >>>
> >>> On Mon, Jul 15, 2013 at 03:19:42PM +0530, Leeno Jose.P.A wrote:
> >>>
> >>> Hi Users,
> >>>
> >>> Has anyone tried to rebuild management server with Xenserver hosts?
> >>> If yes, could you please share experience?
> >>>
> >>>
> >>> --
> >>> Leeno Jose .P.A
> >>>
> >>>
> >>> I have not, but one of the most critical aspects of this is to ensure
> >>> that your database is retained.
> >>>
> >>> This email and any attachments to it may be confidential and are
> >>> intended solely for the use of the individual to whom it is addressed.
> >>> Any views or opinions expressed are solely those of the author and do
> >>> not necessarily represent those of Shape Blue Ltd or related
> >>> companies. If you are not the intended recipient of this email, you
> >>> must neither take any action based upon its contents, nor copy or show
> >>> it to anyone. Please contact the sender if you believe you have
> >>> received this email in error. Shape Blue Ltd is a company incorporated
> >>> in England & Wales. ShapeBlue Services India LLP is operated under
> >>> license from Shape Blue Ltd. ShapeBlue is a registered trademark.
> >>>
> >>>
> >>> --
> >>> Leeno Jose .P.A
> >>>
> >>>
> >>> Todd Pigram
> >>> t...@toddpigram.com
> >>>
> >>
> >>
> >> --
> >> Leeno Jose .P.A
> >>
> >
> >
> > --
> > Leeno Jose .P.A
>

--
Leeno Jose .P.A
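
The connectivity check suggested in this thread (ping, then telnet on ports 22 and 443 between the XS hosts and the CS management server) can also be scripted. What follows is only a minimal sketch, assuming Python 3 is available on the machine running the check; `check_ports` is a hypothetical helper name and is not part of CloudStack or XenServer.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Try a TCP connection to each port on host.

    Returns a dict mapping port -> True if the connection succeeded,
    False if it was refused, timed out, or failed for another reason.
    This is a hypothetical helper for illustration, not a CloudStack API.
    """
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and performs the TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers connection refused, unreachable host, and timeouts
            results[port] = False
    return results
```

For example, `check_ports("192.168.10.251", [22, 443, 9090])` would probe the service IP and the cluster service port that appear in the startup logs quoted above. A `False` result only shows that the TCP connection failed; it does not say whether a firewall, a routing problem, or a missing listener is the cause.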