I ran into the same issue when I reused an old XenServer host that had previously been added to another CloudStack installation. The cause was that the XenServer host could not SSH into the virtual router with its authorized key. Try copying systemvm.iso and id_rsa.pub from the management server and replacing the copies currently on the XenServer host; you should then be able to start your virtual router. If that doesn't work, I think you can try reinstalling XenServer.
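In case it is useful, here is a rough sketch of how the replacement could be done carefully: compare each file on the XenServer host with the copy fetched from the management server, and only overwrite it (keeping a backup) when they differ. The file locations are left as arguments on purpose, since they vary between installs and I don't want to guess them; `sync_if_different` is a hypothetical helper name, not something shipped with CloudStack or XenServer.

```shell
#!/bin/bash
# Hypothetical helper: replace TARGET with REFERENCE only when they differ.
# REFERENCE is a copy fetched from the management server (for example via scp);
# TARGET is the file currently on the XenServer host.
# Both paths are supplied by the caller; none are assumed here.
sync_if_different() {
    local reference="$1" target="$2"

    if [ ! -f "$reference" ]; then
        echo "missing reference: $reference" >&2
        return 2
    fi

    # cmp -s is silent and exits 0 only when the files are byte-identical.
    if [ -f "$target" ] && cmp -s "$reference" "$target"; then
        echo "unchanged: $target"
        return 0
    fi

    # Keep a backup of the old file so the change can be reverted.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.bak"
    fi

    cp -p "$reference" "$target" && echo "replaced: $target"
}
```

You would run it once for systemvm.iso and once for id_rsa.pub, passing the fetched copy first and the host's copy second, and then try starting the virtual router again.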
2013/8/25 Kent Johnson <[email protected]>

> So I found something that could have caused my Virtual Router to not be able to start up.
>
> The permissions on my vhd-util may have been set wrong. They were set to 0777 and it looks like some things were looking for the permissions to be 0755. It doesn't make sense to me why that would be the problem but after I changed the permissions on the file in /usr/share/cloudstack-common/scripts/vm/hypervisor/xenserver/vhd-util and then copied that file to /usr/share/cloudstack-common/scripts/vm/hypervisor/xenserver/xenserver60 then my virtual router started up properly. Why would the difference in permissions be the problem? Was it that I had not copied the file to the .../xenserver60 directory? Nothing in the docs told me about copying to .../xenserver60 and nothing mentioned particular permissions to be set.
>
> Whether or not that was the cause of my problem I do not know but I do know my Virtual Router is up and running now.
>
> Also, my instances used to go to an error state but now they don't. However, they stay in a perpetual starting state. I started a small instance and after 40 minutes it is still not running. It was successfully created and is now in the "Starting" state. How long should I give my instances before I start looking for issues?
>
> Kent Johnson
> University of Utah
> Graduate Student
> MSIS Program
>
> -----Original Message-----
> From: Kent Johnson [mailto:[email protected]]
> Sent: Saturday, August 24, 2013 5:20 PM
> To: [email protected]
> Subject: RE: Cs 4.1.0 plus Xen 6.1 No Instances or VR's starting up
>
> Which host are you referring to? My XenServer host mounts the secondary just fine. My secondary storage is a directory on my NAS using NFS. By the way, my NAS is on a separate subnet than all the rest of my devices. Would that cause any problems? My management server and my XenServer host mount it just fine and I haven't had any problems with it to date.
>
> Kent Johnson
> University of Utah
> Graduate Student
> MSIS Program
>
> -----Original Message-----
> From: Carlos Reátegui [mailto:[email protected]]
> Sent: Saturday, August 24, 2013 5:05 PM
> To: [email protected]
> Subject: Re: Cs 4.1.0 plus Xen 6.1 No Instances or VR's starting up
>
> Is your host able to mount secondary storage?
>
> On Aug 24, 2013, at 3:34 PM, Kent Johnson <[email protected]> wrote:
>
> > Thank you Kirk.
> > As a preface to my response a virtual router is automatically created every time I create my first virtual machine. I have not yet successfully been able to add an instance. The virtual router created when I make an instance is the VR I am having problems with.
> >
> > In answering your questions,
> > 1. The virtual router goes into the starting state and never makes it to running. It errors out and then goes from starting state to stopping. I did see some suspicious lines on the log file:
> > 2013-08-23 18:47:58,361 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-9:null) Trying to connect to 169.254.3.124
> > 2013-08-23 18:47:59,178 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-9:null) Ping command port succeeded for vm r-11-VM
> > 2013-08-23 18:47:59,561 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-9:null) Seq 1-585236521: Cancelling because one of the answers is false and it is stop on error.
> >
> > Do you think the problem with my router not starting up is related to the "stop on error" part of those log lines?
> > A few lines later in the log it tells me the guru did not like the answers so is stopping the router:
> >
> > 2013-08-23 18:47:59,623 INFO [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-11:job-31) The guru did not like the answers so stopping VM[DomainRouter|r-11-VM]
> > 2013-08-23 18:47:59,626 DEBUG [agent.transport.Request] (Job-Executor-11:job-31) Seq 1-585236524: Sending { Cmd , MgmtId: 166316981724, via: 1, Ver: v1, Flags: 100111, [{"StopCommand":{"isProxy":false,"vmName":"r-11-VM","wait":0}}] }
> > 2013-08-23 18:47:59,627 DEBUG [agent.transport.Request] (Job-Executor-11:job-31) Seq 1-585236524: Executing: { Cmd , MgmtId: 166316981724, via: 1, Ver: v1, Flags: 100111, [{"StopCommand":{"isProxy":false,"vmName":"r-11-VM","wait":0}}] }
> > 2013-08-23 18:47:59,627 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-22:null) Seq 1-585236524: Executing request
> > 2013-08-23 18:47:59,774 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-22:null) 9. The VM r-11-VM is in Stopping state
> > 2013-08-23 18:48:00,101 INFO [xen.resource.CitrixResourceBase] (DirectAgent-22:null) Removed network rules for vm r-11-VM
> >
> > Later it says it is in stopped state and that stopping succeeded:
> > 2013-08-23 18:48:08,312 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-22:null) 10. The VM r-11-VM is in Stopped state
> > 2013-08-23 18:48:08,313 DEBUG [agent.manager.AgentManagerImpl] (Job-Executor-11:job-31) Details from executing class com.cloud.agent.api.StopCommand: Stop VM r-11-VM Succeed
> >
> > Then it tells me the error was in finalizeStart:
> > 2013-08-23 18:48:08,313 ERROR [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-11:job-31) Failed to start instance VM[DomainRouter|r-11-VM]
> > com.cloud.utils.exception.ExecutionException: Unable to start VM[DomainRouter|r-11-VM] due to error in finalizeStart, not retrying
> >
> > Where can I see the finalizeStart method and the part of it that is failing?
> >
> > 2. I checked the /var/log/SMlog on the XenServer host and did not find any lines that contained "getDomRVersion" How would I know if any of the entries are related to the getDomRVersion script?
> > I did notice that when I tried to start the VR then the SMlog showed a failure after running router_proxy.sh and get_template_version.sh (shown in the log below):
> > [31327] 2013-08-24 16:14:20.346551 #### VMOPS enter routerProxy ####
> > [31327] 2013-08-24 16:14:20.346675 ['/bin/bash', '/opt/xensource/bin/router_proxy.sh', 'get_template_version.sh', '169.254.3.168']
> > [31327] 2013-08-24 16:14:20.464558 FAILED in util.pread: (rc 255) stdout: '', stderr: ''
> > [31327] 2013-08-24 16:14:20.464766 routerProxy command get_template_version.sh 169.254.3.168 failed
> > [31327] 2013-08-24 16:14:20.464872 #### VMOPS exit routerProxy ####
> >
> > 3. I checked for the /root/.ssh/id_rsa.cloud file on the XenServer host and it did exist.
> >
> > 4. I tried forcing reconnect through the UI and that did not solve the problem.
> > 5. I tried unmanaging and re-managing the cluster (I had to put my host in maintenance mode first)
> > 6. I tried clearing the host tags
> >
> > None of this solved my problem.
> > I did see some entries in the SMlog that made me wonder if vhd-util needs to be in my primary or secondary storage directory. To your knowledge, does it need to be? I thought vhd-util only needed to be in /usr/bin and /opt/xensource/bin. Is there somewhere else it needs to be?
> >
> > I am starting to wonder if I should reinstall everything and use KVM with Libvirt. I really want to use Xen and CloudStack. This problem with my instances and the virtual router really is an impasse for my project.
> >
> > Thank you for your help, Kirk.
> >
> > Best,
> >
> > Kent Johnson
> > University of Utah
> > Graduate Student
> > MSIS Program
> >
> > -----Original Message-----
> > From: Kirk Kosinski [mailto:[email protected]]
> > Sent: Friday, August 23, 2013 8:35 PM
> > To: [email protected]
> > Cc: Ahmad Emneina
> > Subject: Re: Cs 4.1.0 plus Xen 6.1 No Instances or VR's starting up
> >
> > Does the virtual router start on XenServer and stop after a few seconds or minutes? Or does it never start at all?
> >
> > Check /var/log/SMlog on the XenServer host for any entries or (especially) errors related to the getDomRVersion script. If there are no useful errors, find the script arguments in SMlog and try running it manually with bash -x to see where it is failing.
> >
> > One potential cause is a missing /root/.ssh/id_rsa.cloud on the XenServer host, so confirm it exists. Besides that, some general steps that might clear up the problem include:
> > 1. Force reconnect the host (CS API or UI).
> > 2. Unmanage and re-manage the cluster (CS API or UI).
> > 3. Unmanage cluster / clear host tags (xe host-param-clear uuid=host_id param-name=tags) / re-manage cluster
> >
> > Best regards,
> > Kirk
> >
> >
> > On 08/23/2013 04:24 PM, Ahmad Emneina wrote:
> >> looks like your management server reaches out to the router but barfs out here:
> >> 2013-08-23 17:04:53,938 DEBUG [agent.transport.Request] (Job-Executor-1:job-18) Seq 1-1363673129: Received: { Ans: , MgmtId: 166316981724, via: 1, Ver: v1, Flags: 110, { StartAnswer, CheckSshAnswer, GetDomRVersionAnswer } }
> >> 2013-08-23 17:04:53,983 WARN [network.router.VirtualNetworkApplianceManagerImpl] (Job-Executor-1:job-18) Unable to get the template/scripts version of router r-6-VM due to: getDomRVersionCmd failed
> >> 2013-08-23 17:04:53,984 INFO [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-1:job-18) The guru did not like the answers so stopping VM[DomainRouter|r-6-VM]
> >> 2013-08-23 17:04:53,988 DEBUG [agent.transport.Request] (Job-Executor-1:job-18) Seq 1-1363673130: Sending { Cmd , MgmtId: 166316981724, via: 1, Ver: v1, Flags: 100111, [{"StopCommand":{"isProxy":false,"vmName":"r-6-VM","wait":0}}] }
> >>
> >> so the good news is your network config looks good, i just wonder if youre using the right template or if the system vm isnt patched properly.
> >>
> >>
> >> On Fri, Aug 23, 2013 at 4:16 PM, Kent Johnson <[email protected]> wrote:
> >>
> >>> Thank you Marty and Ahmad.
> >>>
> >>> Here is my management-server.log log: http://pastebin.com/TnRpsB8j
> >>>
> >>> Here is my catalina.out log: http://pastebin.com/index
> >>>
> >>> Here is my xensource.log from my xenserver host: http://pastebin.com/RC6LhXQ4
> >>>
> >>> Best,
> >>>
> >>> Kent Johnson
> >>>
> >>> -----Original Message-----
> >>> From: Ahmad Emneina [mailto:[email protected]]
> >>> Sent: Friday, August 23, 2013 5:00 PM
> >>> To: Cloudstack users mailing list
> >>> Subject: Re: Cs 4.1.0 plus Xen 6.1 No Instances or VR's starting up
> >>>
> >>> probably best to post your logs to pastebin and have us sift over them. see if we can spot anything.
> >>>
> >>>
> >>> On Fri, Aug 23, 2013 at 3:53 PM, Kent Johnson <[email protected]> wrote:
> >>>
> >>>> Can anyone point me in the right direction on solving my instance startup problem? They won't start up, nor will my virtual router.
> >>>>
> >>>> I am using CloudStack 4.1.0 on CentOS for my Management Server and XenServer 6.1 for my VM host.
> >>>>
> >>>> I installed CS as per the documentation and have successfully added a zone, pod, cluster, host, primary, and secondary storage.
> >>>> My System VM's start up and run correctly. My systemVM template and the default CentOS templates download correctly.
> >>>> I can create instances but they always try to start up and then fail and end up in "Error" state. My Virtual Router tries to start up But it always fails.
> >>>>
> >>>> Are there any gotchas or quick suggestions anyone could give me to help me understand how to get my instances to run and possibly my virtual router?
> >>>> I can provide more details from the logs if needed.
> >>>>
> >>>> Kent Johnson
> >>>> University of Utah
> >>>> Graduate Student
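
For anyone triaging the same symptom, the failure markers quoted in this thread can be pulled out of a management-server.log with a short filter. This is only a sketch: `vr_failures` is a hypothetical function name, the log path is whatever your install uses, and the patterns are simply the phrases that appear in the log excerpts above, so other versions may word them differently.

```shell
#!/bin/bash
# Print the router-start failure markers quoted in this thread.
# $1 is the path to a management-server.log, supplied by the caller.
vr_failures() {
    grep -E 'getDomRVersionCmd failed|The guru did not like the answers|error in finalizeStart|stop on error' "$1"
}
```

If the filter prints a "getDomRVersionCmd failed" line, the next place to look is /var/log/SMlog on the XenServer host, as suggested earlier in the thread.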
