The cause turned out to be a split-brain on the dom_md/ids file, which was causing an input/output error. Thanks!

On Mon, Feb 3, 2014 at 10:43 PM, Andrew Lau <and...@andrewklau.com> wrote:

> On Mon, Feb 3, 2014 at 10:40 PM, Doron Fediuck <dfedi...@redhat.com> wrote:
>
>> ----- Original Message -----
>> > From: "Andrew Lau" <and...@andrewklau.com>
>> > To: "Doron Fediuck" <dfedi...@redhat.com>
>> > Cc: "users" <users@ovirt.org>, "Jiri Moskovcak" <jmosk...@redhat.com>, "Greg Padgett" <gpadg...@redhat.com>
>> > Sent: Monday, February 3, 2014 1:35:01 PM
>> > Subject: Re: [Users] Hosted Engine always reports "unknown stale-data"
>> >
>> > On Mon, Feb 3, 2014 at 9:53 PM, Doron Fediuck <dfedi...@redhat.com> wrote:
>> >
>> > > ----- Original Message -----
>> > > > From: "Andrew Lau" <and...@andrewklau.com>
>> > > > To: "users" <users@ovirt.org>
>> > > > Sent: Monday, February 3, 2014 12:32:45 PM
>> > > > Subject: [Users] Hosted Engine always reports "unknown stale-data"
>> > > >
>> > > > Hi,
>> > > >
>> > > > I was wondering if anyone has this same notice when they run:
>> > > > hosted-engine --vm-status
>> > > >
>> > > > The "engine status" will always be "unknown stale-data" even when the VM is
>> > > > powered on and the engine is online. engine-health will actually report the
>> > > > correct status.
>> > > >
>> > > > eg.
>> > > >
>> > > > --== Host 1 status ==--
>> > > >
>> > > > Status up-to-date : False
>> > > > Hostname          : 172.16.0.11
>> > > > Host ID           : 1
>> > > > Engine status     : unknown stale-data
>> > > >
>> > > > Is it some sort of blocked port causing this or is this by design?
>> > > >
>> > > > Thanks,
>> > > > Andrew
>> > > >
>> > > > _______________________________________________
>> > > > Users mailing list
>> > > > Users@ovirt.org
>> > > > http://lists.ovirt.org/mailman/listinfo/users
>> > > >
>> > >
>> > > Hi Andrew,
>> > > it looks like an issue with the time stamp.
>> > > Which time stamp do you have? How relevant is it?
>> > >
>> >
>> > timestamps seem to be outdated by a lot, interesting error in the broker.log
>> >
>> > Thread-24::INFO::2014-02-03 22:33:14,801::engine_health::90::engine_health.CpuLoadNoEngine::(action) VM not running on this host, status down
>> > Thread-22::INFO::2014-02-03 22:33:14,834::mem_free::53::mem_free.MemFree::(action) memFree: 27382
>> > Thread-23::ERROR::2014-02-03 22:33:14,922::cpu_load_no_engine::156::cpu_load_no_engine.EngineHealth::(update_stat_file) Failed to getVmStats: 'pid'
>> > Thread-23::INFO::2014-02-03 22:33:14,923::cpu_load_no_engine::121::cpu_load_no_engine.EngineHealth::(calculate_load) System load total=0.0124, engine=0.0000, non-engine=0.0124
>> >
>> > I'm assuming that update_stat_file is the metadata file the vm-status is
>> > getting pulled from?
>> >
>>
>> Yep.
>> Can you please verify the time your host actually has?
>> ie- we have a known issue with time, since we assume all
>> hosts are in sync. So if one of your hosts has a time sync
>> issue, this can explain the problem you see.
>>
>
> --== Host 1 status ==--
>
> Status up-to-date : False
> Hostname          : 172.16.0.11
> Host ID           : 1
> Engine status     : unknown stale-data
> Score             : 0
> Local maintenance : False
> Host timestamp    : 1391417611
>
> --== Host 2 status ==--
>
> Status up-to-date : False
> Hostname          : 172.16.0.12
> Host ID           : 2
> Engine status     : unknown stale-data
> Score             : 0
> Local maintenance : False
> Host timestamp    : 1391417171
>
> [root@hv01 ~]# date +%s
> 1391427754
>
> [root@hv02 ~]# date +%s
> 1391427755
>
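
To put a number on how stale the metadata is, the "Host timestamp" values can be subtracted from the `date +%s` readings above. This is a minimal Python sketch using only the values pasted in this thread. The dictionary layout and the `staleness` helper are illustrative and not part of hosted-engine, and it assumes that hv01 is Host 1 and hv02 is Host 2, and that the status output and the `date` commands were captured at roughly the same time.

```python
# Values copied from the thread: "Host timestamp" from `hosted-engine --vm-status`
# and the epoch time printed by `date +%s` on each host.
# Assumption: hv01 corresponds to Host 1 and hv02 to Host 2.
hosts = {
    "hv01": {"host_timestamp": 1391417611, "now": 1391427754},
    "hv02": {"host_timestamp": 1391417171, "now": 1391427755},
}


def staleness(entry):
    """Seconds between the host's current clock and its last recorded timestamp."""
    return entry["now"] - entry["host_timestamp"]


for name, entry in hosts.items():
    lag = staleness(entry)
    print(f"{name}: recorded timestamp is {lag} s behind the clock (~{lag / 3600:.1f} h)")

# Difference between the two hosts' own clocks, as reported by `date +%s`.
clock_skew = abs(hosts["hv01"]["now"] - hosts["hv02"]["now"])
print(f"clock difference between hosts: {clock_skew} s")
```

Under those assumptions the two clocks differ by about a second, while both recorded timestamps lag by hours. That would suggest the hosts are in sync and the metadata simply is not being updated, rather than a clock drift problem.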