> On 09 Aug 2016, at 19:14, Nicolás <nico...@devels.es> wrote:
>
> Hi,
>
> It worked (thanks Ekin, I probably would not have turned it off and the
> host would indeed have been restarted): restarting libvirtd and vdsmd made
> all machines set their status to up. It seems the culprit is libvirtd in
> this case, as I could see some errors in the log. I'm attaching the log FWIW.
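
For anyone finding this thread later, the restart that worked above can be sketched roughly as follows. This is only a sketch, assuming a systemd-based host where the units are named libvirtd and vdsmd and restarting them in the order they are mentioned in the thread. The helper names restart_commands and restart_services are hypothetical, and the script does not touch power management, which has to be turned off for the host in the engine beforehand, as Ekin noted.

```python
import subprocess

# Order as mentioned in the thread: libvirtd, then vdsmd.
# The unit names are an assumption about a systemd-based host.
SERVICES = ["libvirtd", "vdsmd"]


def restart_commands(services=SERVICES):
    """Build one `systemctl restart` command per service, in order."""
    return [["systemctl", "restart", name] for name in services]


def restart_services(dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in restart_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises on the first restart that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default, so nothing is restarted by accident.
    restart_services(dry_run=True)
```

Running it as-is only prints the two commands; call restart_services(dry_run=False) as root on the affected host to actually restart the daemons.
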
Hi,

I suppose you use RHEL/CentOS; a libvirt responsiveness issue was fixed in
https://rhn.redhat.com/errata/RHBA-2016-1290.html

Let us know if the same thing reproduces again.

Thanks,
michal

>
> Thanks.
>
> On 07/08/16 at 19:37, Ekin Meroğlu wrote:
>> Hi,
>>
>> Just a reminder, if you have power management configured, first turn that
>> off for the host - when you restart vdsmd with the power management
>> configured, engine finds it not responding and tries to fence (e.g. reboot)
>> the host.
>>
>> Other than that, restarting vdsmd has been safe in my experience...
>>
>> Regards,
>>
>> On Thu, Aug 4, 2016 at 6:10 PM, Nicolás <nico...@devels.es> wrote:
>>
>> On 04/08/16 at 15:25, Arik Hadas wrote:
>>
>> ----- Original Message -----
>> On 2016-08-04 08:24, Arik Hadas wrote:
>> ----- Original Message -----
>>
>> On 04/08/16 at 07:18, Arik Hadas wrote:
>> ----- Original Message -----
>> Hi,
>>
>> We're running oVirt 4.0.1 and today I found out that one of our hosts
>> has all its VMs in an unknown state. I actually don't know how (and
>> when) this happened, but I'd like to restore service, if possible without
>> turning off these machines. The host is up, the VMs are up, the 'qemu'
>> processes exist, no errors, it's just the VMs running on it that have a
>> '?' where status is defined.
>>
>> Is it safe in this case to simply modify the database and set those VMs'
>> status to 'up'? I remember having to do this some time ago when we faced
>> storage issues, and it didn't break anything back then. If not, is there a
>> "safe" way to migrate those VMs to a different host and restart the host
>> that marked them as unknown?
>>
>> Hi Nicolás,
>>
>> I assume that the host these VMs are running on is empty in the webadmin,
>> right? If that is the case then you've probably hit [1]. Changing their
>> status to up is not the way to go since these VMs will not be monitored.
>>
>> Hi Arik,
>>
>> By "empty" you mean the webadmin reports the host as running 0 VMs?
>> If so, that's not the case; actually the VM count seems to be correct in
>> relation to "qemu-*" processes (about 32 VMs), and I can even see the
>> machines in the "Virtual machines" tab of the host, it's just that they
>> are all marked with the '?' mark.
>>
>> No, I meant the 'Host' column in the Virtual Machines tab, but if you see
>> the VMs in the "Virtual machines" sub-tab of the host then run_on_vds
>> points to the right host.
>>
>> The host is up in the webadmin as well?
>> Can you share the engine log?
>>
>> Yes, the host is up in the webadmin, there are no issues with it, just
>> the VMs running on it have the '?' mark. I've made 3 tests:
>>
>> 1) Restart engine: did not help.
>> 2) Check firewall: seems to be ok.
>> 3) PostgreSQL: UPDATE vm_dynamic SET status = 1 WHERE status = 8;
>> After a while, I see lots of entries like this:
>>
>> 2016-08-04 09:23:10,910 WARN
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (DefaultQuartzScheduler4) [6ad135b8] Correlation ID: null, Call Stack:
>> null, Custom Event ID: -1, Message: VM xxx is not responding.
>>
>> I'm attaching the engine log, but I don't know when this happened for
>> the first time, though. If there's a manual way/command to migrate VMs
>> to a different host I'd appreciate a hint about it.
>>
>> Is it safe to restart vdsmd on this host?
>>
>> The engine log looks fine - the VMs are reported as not-responding for
>> some reason. I would restart libvirtd and vdsmd then.
>>
>> Is restarting those two daemons safe? I mean, will that stop all qemu-*
>> processes, so the VMs marked as unknown will stop?
>>
>> Thanks.
>>
>> Thanks.
>>
>> Yes, there is no other way to resolve it other than changing the DB, but
>> the change should be to update the run_on_vds field of these VMs to the
>> host you know they are running on. Their status will then be updated in
>> 15 sec.
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1354494
>>
>> Arik.
>>
>> Thanks.
>>
>> Nicolás
>>
>> _______________________________________________
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>> --
>> Ekin Meroğlu Red Hat Certified Architect
>>
>> linuxera Özgür Yazılım Çözüm ve Hizmetleri
>> T +90 (850) 22 LINUX | GSM +90 (532) 137 77 04
>> www.linuxera.com | bi...@linuxera.com
> <libvirtd.tar.gz>