----- Original Message -----
> From: "ILanit Stein" <ist...@redhat.com>
> To: "Artyom Lukianov" <aluki...@redhat.com>, "Eli Mesika" <emes...@redhat.com>
> Cc: users@ovirt.org, rabsh...@citytwist.net
> Sent: Tuesday, January 27, 2015 5:19:12 PM
> Subject: Fwd: [ovirt-users] Host remains Non-Responsive after reboot
>
>
> Hi Guys,
>
> Can you please look into this please?

Hi,

From the logs I can clearly see that the host is turned on at 2015-01-26 11:56:51,191.
However, there is a STOMP exception at 2015-01-26 11:56:53,544 and a connection
timeout at 2015-01-26 11:56:53,553 that might be related.

Piotr, can you please have a look?

>
> Thanks,
> Ilanit.
> ----- Forwarded Message -----
> From: "Rob Abshear" <rabsh...@citytwist.net>
> To: "ILanit Stein" <ist...@redhat.com>
> Sent: Tuesday, January 27, 2015 3:05:56 PM
> Subject: Re: [ovirt-users] Host remains Non-Responsive after reboot
>
> Here are the logs you requested. The shutdown of the node was at 11:53
> and vdsmd was manually restarted at 12:01 to get the node back online.
>
> On Tue, Jan 27, 2015 at 2:05 AM, ILanit Stein <ist...@redhat.com> wrote:
>
> > It might be a bug.
> > Would you please attach the logs I mentioned below,
> > which can bring more details on the failure?
> > Adding Eli, who may want to give some input on this issue.
> >
> > Thanks,
> > Ilanit.
> >
> > ----- Original Message -----
> > From: "Rob Abshear" <rabsh...@citytwist.net>
> > To: "ILanit Stein" <ist...@redhat.com>
> > Cc: users@ovirt.org
> > Sent: Monday, January 26, 2015 9:43:14 PM
> > Subject: Re: [ovirt-users] Host remains Non-Responsive after reboot
> >
> > I have done a bit more investigating on this matter. If I restart the node
> > from within oVirt using the power management option "restart", then the
> > node restarts and vdsmd DOES NOT start. If I go into the DRAC and issue
> > the command to power cycle the machine, then the machine restarts and vdsmd
> > DOES start. I can run the following command from another node in the
> > cluster:
> > fence_drac5 -a 192.168.200.105 -l root -p <password> -x -o reboot
> > and the node restarts and vdsmd DOES start.
> >
> > On Sun, Jan 25, 2015 at 1:56 AM, ILanit Stein <ist...@redhat.com> wrote:
> >
> > > Hi Rob,
> > >
> > > Thanks for this report.
> > >
> > > Would you please provide these logs, for the time frame in which the
> > > host failure occurs:
> > > 1. oVirt Engine: /var/log/ovirt-engine/engine.log
> > > 2. host: /var/log/vdsm/vdsm.log
> > >
> > > If it is reproducible, please add this info as well.
> > >
> > > You can also check the vdsm service status on the host, while the host
> > > is reported as Non Responsive,
> > > by running on the host 'service vdsmd status'
> > > There might be some problem that prevented the vdsm service from
> > > coming up on the host.
> > >
> > > Ilanit.
> > >
> > > ----- Original Message -----
> > > From: "Rob Abshear" <rabsh...@citytwist.net>
> > > To: users@ovirt.org
> > > Sent: Friday, January 23, 2015 9:22:42 PM
> > > Subject: [ovirt-users] Host remains Non-Responsive after reboot
> > >
> > >
> > > I am running oVirt Engine Version 3.5.0.1-1.el6. I have 4 hosts in the
> > > cluster. Each host has a drac5 and it is configured and working. I am
> > > trying to simulate a node failure. I am running one HA VM on one of the
> > > hosts for testing. I simulate the failure by powering off the host with
> > > the VM running.
> > >
> > > Here is what is happening.
> > >
> > >
> > > * Host is powered off
> > > * ~4 minutes pass and the host is recognized as not responding
> > > * Automatic fence runs and the VM migrates. Another host in the node
> > > is chosen as a proxy to execute Status command on the host.
> > > * Same host is chosen as proxy to execute Start command on the host.
> > > * Same host is chosen as proxy to execute Status command on the host.
> > > * The host DOES physically start.
> > > * The host never shows status of UP.
> > > * I select “confirm host has been rebooted” and I see a manual fence
> > > start.
> > > * Host stays non-responsive.
> > > * I put the host in maintenance and then activate it.
> > > * Host still non-responsive
> > > * I put the host in maintenance and do a reinstall
> > > * Reinstall finishes and host becomes UP
> > >
> > > So, everything seems to go fine with the HA functionality, but the host
> > > never recovers without being reinstalled.
> > > Please let me know which logs you
> > > need to look at to help me out with this.
> > >
> > > Thanks
> > >
> > > _______________________________________________
> > > Users mailing list
> > > Users@ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/users
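
For anyone gathering the logs requested in this thread, it may help to trim engine.log and vdsm.log to the window around the failure (here, roughly the 11:53 shutdown to the 12:01 manual vdsmd restart) before attaching them. The following is only a minimal sketch, not an oVirt tool: the helper name `lines_in_window` is hypothetical, and it assumes each log entry carries a timestamp shaped like the ones quoted above (`2015-01-26 11:56:51,191`) somewhere in the line.

```python
import re
from datetime import datetime

# Timestamp shape as quoted in this thread, e.g. "2015-01-26 11:56:51,191".
# Assumption: both logs use this shape; adjust the pattern if yours differ.
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
TS_FMT = "%Y-%m-%d %H:%M:%S,%f"


def lines_in_window(lines, start, end):
    """Yield lines whose first timestamp falls within [start, end].

    Lines without a timestamp (for example traceback continuations) are
    kept only if the preceding timestamped line was kept.
    """
    keep = False
    for line in lines:
        match = TS_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(0), TS_FMT)
            keep = start <= stamp <= end
        if keep:
            yield line
```

To use it, open the log file, pass the file object as `lines` together with `datetime(2015, 1, 26, 11, 53)` and `datetime(2015, 1, 26, 12, 1)`, and write the yielded lines to a new file for attaching.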