Hi Will,

The engine relies on the status reported by VDSM for the management network 'ovirtmgmt' and for its underlying nics/vlans.

To see the configuration of the 'ovirtmgmt' network, please run the following command on the host and paste its output:

    vdsClient -s 0 getVdsCaps

In addition, to see the reported status of the networks, run the following on the host and paste its output:

    vdsClient -s 0 getVdsStats

That should indicate which nic vdsm reports as down for ovirtmgmt.

On Wed, Jan 6, 2016 at 11:15 AM, Eliraz Levi <[email protected]> wrote:

> Hi Will, how are you?
> The log first points to certificate issues:
> 2016-01-04 00:02:11,259 ERROR
> [org.ovirt.engine.core.vdsbroker.jsonrpc.JsonRpcVdsServer]
> (DefaultQuartzScheduler_Worker-81) [] Failed to get peer certification for
> host 'ovirt-node-02': SSL session is invalid
> 2016-01-04 00:02:11,259 ERROR
> [org.ovirt.engine.core.bll.CertificationValidityChecker]
> (DefaultQuartzScheduler_Worker-81) [] Failed to retrieve peer
> certifications for host 'ovirt-node-02'
>
> So the first thing we should do is try to solve this problem.
> Please try to reinstall the host.
> Thanks.
> Eliraz :)
>
> ----- Original Message -----
> From: "Will Dennis" <[email protected]>
> To: "Eliraz Levi" <[email protected]>, "users" <[email protected]>
> Sent: Tuesday, 5 January, 2016 5:46:23 AM
> Subject: Re: [ovirt-users] host status "Non Operational" - how to diagnose
> & fix?
>
> I must admit I’m getting a bit weary of fighting oVirt problems at this
> point… Before I move on to deploying any VMs onto my new infra, I’d like to
> get the base infra working…
>
> I’m still experiencing a “Non Operational” problem on my “ovirt-node-02”
> host:
>
> http://s1096.photobucket.com/user/willdennis/media/ovirt-node-02_problem.png.html
>
> I have pored thru the logs (all the engine logs, plus the syslogs from the
> engine VM + and my three hypervisor/storage hosts) and I can’t pin down why
> the one node is having a problem… Of course with how voluminous all these
> logs are, it’s kind of like looking for a needle in a haystack, and I’m not
> even sure what the needle looks like, or if it’s even a needle :-/
>
> I have also rebooted this host in past days, this also did not fix the
> problem.
>
> Note that on the screenshot I posted above, that the webadmin hosts screen
> says that -node-01 has one VM running, and the others 0… You’d think that
> would be the HE VM running on there, but it’s actually on -node-02:
>
> $ ansible istgroup-ovirt -f 1 -i prod -u root -m shell -a "hosted-engine --vm-status | grep -e '^Hostname' -e '^Engine'"
> ovirt-node-01 | success | rc=0 >>
> Hostname : ovirt-node-01
> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
> Hostname : ovirt-node-02
> Engine status : {"health": "good", "vm": "up", "detail": "up"}
> Hostname : ovirt-node-03
> Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
>
> ovirt-node-02 | success | rc=0 >>
> Hostname : ovirt-node-01
> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
> Hostname : ovirt-node-02
> Engine status : {"health": "good", "vm": "up", "detail": "up"}
> Hostname : ovirt-node-03
> Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
>
> ovirt-node-03 | success | rc=0 >>
> Hostname : ovirt-node-01
> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
> Hostname : ovirt-node-02
> Engine status : {"health": "good", "vm": "up", "detail": "up"}
> Hostname : ovirt-node-03
> Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
>
> So it looks like the webadmin UI is wrong as well…
>
> It would be awesome if the UI would give a reason for the “Non
> Operational” status somehow… Or if there was a troubleshooter that could be
> used to analyze the problem… As it is, being so new to all of this, I am
> completely at the list’s mercy to figure this out.
>
> This software has such promise, so I’ll keep working thru these issues,
> but it sure hasn’t been a smooth ride so far…
>
>
> On Jan 4, 2016, at 7:54 AM, Will Dennis <[email protected]<mailto:[email protected]>> wrote:
>
> I put all of the engine logs up there now… Try engine.log-20160103.gz
> http://i1096.photobucket.com/albums/g330/willdennis/ovirt-node-02_problem.png
> _______________________________________________
> Users mailing list
> [email protected]
> http://lists.ovirt.org/mailman/listinfo/users
>

--
Regards,
Moti
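
A minimal sketch for the `hosted-engine --vm-status` check quoted above: rather than reading the filtered output by eye, a small awk filter can print only the host(s) whose engine status contains "vm": "up". The function name `engine_host` is hypothetical, and the sketch assumes the output keeps the `Hostname : ...` / `Engine status : {...}` layout shown in this thread, with each engine status on a single line.

```shell
# engine_host: read `hosted-engine --vm-status` output on stdin and print
# the hostname(s) whose "Engine status" line reports "vm": "up".
# Hypothetical helper; assumes the line layout shown in this thread.
engine_host() {
  awk '
    /^Hostname/ { host = $NF }                          # remember the latest hostname seen
    /^Engine status/ && /"vm": "up"/ { print host }     # engine VM is up on that host
  '
}

# Usage on a hosted-engine host (not executed here):
#   hosted-engine --vm-status | engine_host
```

Run against the status output quoted above, this should name ovirt-node-02, which is the host that should be compared with what the webadmin hosts screen shows.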

