> On 09 Aug 2016, at 19:14, Nicolás <nico...@devels.es> wrote:
> 
> Hi,
> 
> It worked (thanks Ekin, I probably would not have turned power management 
> off and the host would indeed have been restarted). Restarting libvirtd and 
> vdsmd set all machines' status back to up. The culprit seems to be libvirtd 
> in this case, as I could see some errors in the log. I'm attaching the log 
> FWIW.

Hi,
I suppose you use RHEL/CentOS; a libvirt responsiveness issue was fixed in 
https://rhn.redhat.com/errata/RHBA-2016-1290.html
Let us know if the same thing happens again.
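
If it does recur before the erratum is applied, the workaround reported in the quoted message (restart libvirtd, then vdsmd, with power management disabled first so the engine does not fence the host) could be sketched roughly as below. This is only an illustration in Python: restart_plan and its parameter are hypothetical names, and it assumes a systemd host where "systemctl restart" is how both daemons are restarted.

```python
from typing import List


def restart_plan(power_management_enabled: bool) -> List[List[str]]:
    """Return the commands to run on the host, in order (hypothetical helper).

    Refuses to build a plan while power management is still enabled,
    because the engine may fence (reboot) a host whose vdsmd stops
    responding during the restart.
    """
    if power_management_enabled:
        raise RuntimeError(
            "Disable power management for this host in the engine first"
        )
    # libvirtd first, then vdsmd: the order mentioned in the thread.
    # Assumption: the host uses systemd, so systemctl is available.
    return [
        ["systemctl", "restart", "libvirtd"],
        ["systemctl", "restart", "vdsmd"],
    ]


if __name__ == "__main__":
    # Only prints the commands; running them is left to the administrator.
    for cmd in restart_plan(power_management_enabled=False):
        print(" ".join(cmd))
```

The helper only prints the commands, so nothing is restarted by accident; power management itself still has to be switched off for the host in the engine beforehand.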

Thanks,
michal

> 
> Thanks.
> 
> On 07/08/16 at 19:37, Ekin Meroğlu wrote:
>> Hi,
>> 
>> Just a reminder, if you have power management configured, first turn that 
>> off for the host - when you restart vdsmd with the power management 
>> configured, engine finds it not responding and tries to fence (e.g. reboot) 
>> the host.
>> 
>> Other than that, restarting vdsmd has been safe in my experience...
>> 
>> Regards,  
>> 
>> On Thu, Aug 4, 2016 at 6:10 PM, Nicolás <nico...@devels.es> wrote:
>> 
>> 
>> On 04/08/16 at 15:25, Arik Hadas wrote:
>> 
>> ----- Original Message -----
>> On 2016-08-04 08:24, Arik Hadas wrote:
>> ----- Original Message -----
>> 
>> On 04/08/16 at 07:18, Arik Hadas wrote:
>> ----- Original Message -----
>> Hi,
>> 
>> We're running oVirt 4.0.1 and today I found out that one of our hosts
>> has all its VMs in an unknown state. I don't actually know how (or
>> when) this happened, but I'd like to restore service, if possible without
>> turning off these machines. The host is up, the VMs are up, the 'qemu'
>> processes exist, there are no errors; it's just that the VMs running on it
>> have a '?' where the status is shown.
>> 
>> Is it safe in this case to simply modify the database and set those VMs'
>> status to 'up'? I remember having to do this some time ago when we faced
>> storage issues, and it didn't break anything back then. If not, is there a
>> "safe" way to migrate those VMs to a different host and restart the host
>> that marked them as unknown?
>> Hi Nicolás,
>> 
>> I assume that the host these VMs are running on is empty in the webadmin,
>> right? If that is the case then you've probably hit [1]. Changing their
>> status to up is not the way to go since these VMs will not be monitored.
>> Hi Arik,
>> 
>> By "empty" you mean the webadmin reports the host as running 0 VMs?
>> If so, that's not the case; the VM count actually seems to be correct in
>> relation to the "qemu-*" processes (about 32 VMs). I can even see the
>> machines in the "Virtual machines" tab of the host, it's just that they are
>> all marked with the '?' mark.
>> No, I meant the 'Host' column in the Virtual Machines tab, but if you see
>> the VMs in the "Virtual machines" sub-tab of the host then run_on_vds
>> points to the right host.
>> 
>> The host is up in the webadmin as well?
>> Can you share the engine log?
>> 
>> Yes, the host is up in the webadmin, there are no issues with it, just
>> the VMs running on it have the '?' mark. I've made 3 tests:
>> 
>> 1) Restarted the engine: did not help.
>> 2) Checked the firewall: seems to be OK.
>> 3) PostgreSQL: UPDATE vm_dynamic SET status = 1 WHERE status = 8;
>> After a while, I see lots of entries like this:
>> 
>>       2016-08-04 09:23:10,910 WARN
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (DefaultQuartzScheduler4) [6ad135b8] Correlation ID: null, Call Stack:
>> null, Custom Event ID: -1, Message: VM xxx is not responding.
>> 
>> I'm attaching the engine log, but I don't know when this first happened.
>> If there's a manual way/command to migrate VMs to a different host, I'd
>> appreciate a hint about it.
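
To narrow down when the "not responding" messages first appeared, the engine log can be scanned for entries like the one quoted above. The following is a minimal Python sketch, assuming each entry occupies a single line in the log (the excerpt above is wrapped by the mail client) and that log lines are in chronological order. The pattern is derived only from that one quoted entry, and parse_not_responding and first_seen are hypothetical helper names.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Pattern based on the single audit-log entry quoted in this thread;
# other engine versions may format the line differently.
NOT_RESPONDING = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) WARN\s.*"
    r"Message: VM (?P<vm>\S+) is not responding\."
)


def parse_not_responding(line: str) -> Optional[Tuple[str, str]]:
    """Return (timestamp, vm_name) for a matching line, else None."""
    match = NOT_RESPONDING.search(line)
    if match is None:
        return None
    return match.group("ts"), match.group("vm")


def first_seen(lines: Iterable[str]) -> Dict[str, str]:
    """Map each VM name to the timestamp of its first matching entry.

    Assumes the lines are already in chronological order.
    """
    seen: Dict[str, str] = {}
    for line in lines:
        parsed = parse_not_responding(line)
        if parsed is not None and parsed[1] not in seen:
            seen[parsed[1]] = parsed[0]
    return seen
```

Passing an open log file to first_seen yields one timestamp per affected VM, which can then be compared against the libvirtd log for the same period.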
>> 
>> Is it safe to restart vdsmd on this host?
>> The engine log looks fine - the VMs are reported as not-responding for
>> some reason. I would restart libvirtd and vdsmd then
>> 
>> Is restarting those two daemons safe? I mean, will that stop all qemu-* 
>> processes, so the VMs marked as unknown will stop?
>> 
>> 
>> Thanks.
>> 
>> Thanks.
>> 
>> Yes, there is no way to resolve it other than changing the DB, but the
>> change should be to update the run_on_vds field of these VMs to the host
>> you know they are running on. Their status will then be updated in 15
>> sec.
>> 
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1354494
>> 
>> Arik.
>> 
>> Thanks.
>> 
>> Nicolás
>> 
>> _______________________________________________
>> Users mailing list
>> Users@ovirt.org <mailto:Users@ovirt.org>
>> http://lists.ovirt.org/mailman/listinfo/users 
>> <http://lists.ovirt.org/mailman/listinfo/users>
>> 
>> -- 
>>      Ekin Meroğlu Red Hat Certified Architect 
>> 
>> linuxera Özgür Yazılım Çözüm ve Hizmetleri 
>> T +90 (850) 22 LINUX | GSM +90 (532) 137 77 04
>> www.linuxera.com <http://www.linuxera.com/> | bi...@linuxera.com 
>> <mailto:bi...@linuxera.com>
> <libvirtd.tar.gz>
