On the KVM host, raise the agent log level to DEBUG:

sed -i 's/INFO/DEBUG/g' /etc/cloud/agent/log4j-cloud.xml

To disable the reboot:

sed -i 's/reboot/#reboot/g' /usr/lib64/cloud/agent/scripts/vm/hypervisor/kvm/kvmheartbeat.sh

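If you want to be able to undo the two sed edits above, here is a rough sketch that keeps a copy of each file before touching it. The function name patch_file is made up for this example, and it assumes GNU sed and cp as found on a typical KVM host; treat it as an illustration, not as something shipped with the agent.

```shell
#!/bin/sh
# Hypothetical helper (not part of CloudStack): apply a sed substitution
# in place, keeping the untouched original as <file>.bak.
# usage: patch_file <file> <sed-script>
patch_file() {
    file=$1
    script=$2
    # Refuse to run on a missing file instead of letting sed fail noisily.
    [ -f "$file" ] || { echo "skip: $file not found" >&2; return 1; }
    # Only back up once, so a second run does not overwrite the pristine copy.
    [ -e "$file.bak" ] || cp -p "$file" "$file.bak"
    sed -i "$script" "$file"
}
```

It would be called with the same arguments as the commands above, for example: patch_file /etc/cloud/agent/log4j-cloud.xml 's/INFO/DEBUG/g'. To revert, copy the .bak file back over the edited one. Note that running the reboot substitution twice would turn #reboot into ##reboot, which is harmless in a shell script but worth knowing.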
> -----Original Message-----
> From: Alexey Zilber [mailto:[email protected]]
> Sent: Friday, June 22, 2012 10:58 AM
> To: [email protected]
> Subject: RE: Cloudstack agent keeps rebooting kvm host..
>
> Hi Edison,
>
> I did that earlier, before I added the host. It mounted perfectly. I
> will test it again in a bit after some sleep.
> Is there a way to increase the debug level in the agent?
>
> Thanks,
> Alex
>
> -Alexey (sent via Android)
> On Jun 23, 2012 1:39 AM, "Edison Su" <[email protected]> wrote:
>
> > Are you using NFS primary storage? The NFS primary storage will be
> > mounted at /mnt/9c2be815-de2b-3c14-84bb-54025d782794 after the agent
> > connects to the mgt server.
> > Then an NFS storage monitor is started, which writes a timestamp file
> > into the NFS primary storage. If that fails, the NFS primary storage is
> > not usable.
> > How to diagnose the issue:
> > Mount primary storage on the kvm host and check the permissions of the
> > mount point, or simply create a file under the mount point.
> > Usually this error comes from the NFS server setup. Please check the NFS
> > server setup and make sure primary storage works on the kvm host before
> > adding it to the mgt server.
> >
> > > -----Original Message-----
> > > From: Alexey Zilber [mailto:[email protected]]
> > > Sent: Friday, June 22, 2012 10:12 AM
> > > To: [email protected]
> > > Subject: Re: Cloudstack agent keeps rebooting kvm host..
> > >
> > > Hi Sadhu,
> > >
> > > /mnt isn't a mounted filesystem. It's on the root filesystem. There
> > > should be no write errors, and I see it was able to create the main
> > > directory:
> > >
> > > [root@kvm1 mnt]# ls -altrh
> > > total 12K
> > > drwxr-xr-x   2 root root    6 Jun 22 22:14 kvm_primary_storage
> > > drwxr-xr-x.  4 root root 4.0K Jun 22 23:58 .
> > > drwxrwxrwx   3 root root 4.0K Jun 23 01:01 9c2be815-de2b-3c14-84bb-54025d782794
> > > dr-xr-xr-x. 24 root root 4.0K Jun 23 01:05 ..
> > >
> > > I even just created /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA/
> > > with full permissions and it still rebooted:
> > >
> > > 2012-06-23 01:09:23,830{GMT} INFO [cloud.agent.Agent] (Agent-Handler-2:) Startup Response Received: agent id = 5
> > > 2012-06-23 01:09:23,830 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 5
> > > 2012-06-23 01:10:22,774{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > 2012-06-23 01:10:22,774 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > 2012-06-23 01:10:22,797{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > 2012-06-23 01:10:22,797 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > 2012-06-23 01:10:22,821{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > 2012-06-23 01:10:22,821 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > 2012-06-23 01:10:22,843{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > 2012-06-23 01:10:22,843 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > 2012-06-23 01:10:22,866{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > 2012-06-23 01:10:22,866 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > 2012-06-23 01:10:22,867{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > 2012-06-23 01:10:22,867 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > >
> > > Broadcast message from [email protected]
> > > (unknown) at 1:10 ...
> > >
> > > The system is going down for reboot NOW!
> > > Killing VMOps Agent (PID 4074) with SIGTERM
> > > Waiting for agent to exit
> > >
> > > -Alex
> > >
> > > On Sat, Jun 23, 2012 at 12:55 AM, Suresh Sadhu <[email protected]> wrote:
> > >
> > > > Hi Alex,
> > > >
> > > > When the heartbeat fails, the host will reboot continuously until the
> > > > problem is resolved (heartbeat successful)...
> > > >
> > > > The heartbeat failure might be caused by a failure to write to the
> > > > mounted storage path.
> > > > Did you see any permission denied messages in the logs? And do your
> > > > mounted storage paths still have rw permissions after this problem?
> > > > Due to some corruption, your mounted file system might have become
> > > > read-only. That might cause the heartbeat failure.
> > > >
> > > > Regards
> > > > Sadhu
> > > >
> > > > -----Original Message-----
> > > > From: Alexey Zilber [mailto:[email protected]]
> > > > Sent: 22 June 2012 21:52
> > > > To: [email protected]
> > > > Subject: Cloudstack agent keeps rebooting kvm host..
> > > >
> > > > Hi,
> > > >
> > > > The saga continues! I added a KVM host. The agent decided it wants to
> > > > constantly reboot the server:
> > > >
> > > > 2012-06-23 00:11:32,083{GMT} INFO [cloud.agent.Agent] (Agent-Handler-2:) Startup Response Received: agent id = 5
> > > > 2012-06-23 00:11:32,083 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 5
> > > > 2012-06-23 00:12:30,187{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > > 2012-06-23 00:12:30,187 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> > > > 2012-06-23 00:12:30,209{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > > 2012-06-23 00:12:30,209 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> > > > 2012-06-23 00:12:30,232{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > > 2012-06-23 00:12:30,232 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> > > > 2012-06-23 00:12:30,254{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > > 2012-06-23 00:12:30,254 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> > > > 2012-06-23 00:12:30,275{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > > 2012-06-23 00:12:30,275 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> > > > 2012-06-23 00:12:30,275{GMT} WARN [resource.computing.KVMHAMonitor] (Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > > 2012-06-23 00:12:30,275 WARN [resource.computing.KVMHAMonitor] (Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the host
> > > >
> > > > Broadcast message from [email protected]
> > > > (unknown) at 0:12 ...
> > > >
> > > > The system is going down for reboot NOW!
> > > >
> > > > It looks like the agent was, in fact, at least able to create the
> > > > initial directory:
> > > >
> > > > [root@kvm1 ~]# ls -al /mnt/9c2be815-de2b-3c14-84bb-54025d782794
> > > > total 8
> > > > drwxrwxrwx  2 root root 4096 Jun 22 23:58 .
> > > > drwxr-xr-x. 4 root root 4096 Jun 22 23:58 ..
> > > >
> > > > Here's the agent properties file:
> > > >
> > > > #Storage
> > > > #Sat Jun 23 00:11:32 MYT 2012
> > > > guest.network.device=cloudbr0
> > > > workers=5
> > > > private.network.device=cloudbr0
> > > > port=8250
> > > > resource=com.cloud.agent.resource.computing.LibvirtComputingResource
> > > > pod=1
> > > > zone=1
> > > > guid=0f0f4f5c-99d0-3813-a7a6-00248cdfd17e
> > > > cluster=2
> > > > public.network.device=cloudbr0
> > > > local.storage.uuid=fbefb2ea-f3e0-4f02-96cb-1b8abb6e8c54
> > > > host=10.1.1.18
> > > > LibvirtComputingResource.id=5
> > > >
> > > > First time I'm seeing this error... Last time my kvm setup went well,
> > > > but KVM was my first hypervisor, now it's the second.
> > > >
> > > > Thanks!
> > > > Alex
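
The check suggested earlier in the thread (create a file under the mount point) can be sketched roughly as below. The function name check_writable and the probe file name are made up for this example. It only tests whether a file can be created in a directory and does not reproduce what the agent's heartbeat does.

```shell
#!/bin/sh
# Hypothetical helper (not part of CloudStack): report whether a file can be
# created in the given directory, then clean up the probe file.
# usage: check_writable <dir>
check_writable() {
    dir=$1
    [ -d "$dir" ] || { echo "FAIL: $dir is not a directory"; return 1; }
    # Use the shell's PID to make a probe name unlikely to clash with real files.
    probe="$dir/.write-test.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "OK: can create files in $dir"
    else
        echo "FAIL: cannot create files in $dir"
        return 1
    fi
}
```

With the path from the logs above, that would be: check_writable /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA. Keep in mind that root can usually write to a local directory regardless of its mode bits, so a failure here points more towards a read-only or squashed NFS mount than towards directory permissions.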
