I see an ERROR there on stopMonitoringDomain, but I cannot see the corresponding startMonitoringDomain call; could you please look for it?
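
In case it helps with the search, here is a minimal sketch that lists every line of a log file mentioning either verb. It only does plain substring matching and assumes nothing about the vdsm.log line layout; the function name `find_monitoring_calls` and the optional `sd_uuid` filter are made up for illustration, and the filter only helps if the relevant lines actually contain the storage domain UUID.

```python
import io
import sys

# Verb names as mentioned in the message above.
VERBS = ("startMonitoringDomain", "stopMonitoringDomain")


def find_monitoring_calls(lines, sd_uuid=None):
    """Return (line_number, verb, text) for each line mentioning one of VERBS.

    If sd_uuid is given, keep only lines that also contain that string
    (assumption: the log line includes the storage domain UUID).
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        for verb in VERBS:
            if verb in line and (sd_uuid is None or sd_uuid in line):
                hits.append((number, verb, line.rstrip("\n")))
                break  # report each line once
    return hits


# Usage: python find_monitoring.py /path/to/vdsm.log [sd_uuid]
if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    sd_uuid = sys.argv[2] if len(sys.argv) > 2 else None
    # io.open accepts encoding/errors on both Python 2.7 and Python 3.
    with io.open(path, encoding="utf-8", errors="replace") as log:
        for number, verb, text in find_monitoring_calls(log, sd_uuid):
            print("%d: [%s] %s" % (number, verb, text))
```

Running it once without the UUID filter and once with 7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96 should show whether a startMonitoringDomain entry precedes the failing stopMonitoringDomain at 19:25:34.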

On Fri, Feb 3, 2017 at 1:16 PM, Ralf Schenk <r...@databay.de> wrote:

> Hello,
>
> attached is my vdsm.log from the host with hosted-engine-ha, around the
> time frame of the agent timeout. The host is not working anymore for
> engine HA (it works in oVirt and is active). It simply isn't working
> for engine-ha anymore after the update.
>
> At 2017-02-02 19:25:34,248 you'll find an error corresponding to the
> agent timeout error.
>
> Bye
>
>
>
> On 03.02.2017 at 11:28, Simone Tiraboschi wrote:
>
>>> 3. Three of my hosts have the hosted engine deployed for HA. At first
>>> all three were marked by a crown (running was gold and the others were
>>> silver). After upgrading, hosted engine HA on the 3rd host is not
>>> active anymore.
>>>
>>> I can't get this host back with a working ovirt-ha-agent/broker. I already
>>> rebooted and manually restarted the services, but it isn't able to get
>>> the cluster state according to "hosted-engine --vm-status". The other
>>> hosts state the host status as "unknown stale-data".
>>>
>>> I already shut down all agents on all hosts and issued a
>>> "hosted-engine --reinitialize-lockspace" but that didn't help.
>>>
>>> The agent stops working after a timeout error, according to the log:
>>>
>>> MainThread::INFO::2017-02-02 19:24:52,040::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:24:59,185::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:06,333::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:13,554::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:20,710::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:27,865::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::815::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
>>> MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
>>> MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 443, in start_monitoring
>>>     self._initialize_domain_monitor()
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 816, in _initialize_domain_monitor
>>>     raise Exception(msg)
>>> Exception: Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
>>> MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
>>> MainThread::INFO::2017-02-02 19:25:32,087::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:34,250::hosted_engine::769::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) Failed to stop monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96): Storage domain is member of pool: u'domain=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'
>>> MainThread::INFO::2017-02-02 19:25:34,254::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
>>>
>> Simone, Martin, can you please follow up on this?
>>
>
> Ralf, could you please attach vdsm logs from one of your hosts for the
> relevant time frame?
>
> --
> Ralf Schenk
> phone +49 (0) 24 05 / 40 83 70
> fax +49 (0) 24 05 / 40 83 759
> email r...@databay.de
>
> Databay AG
> Jens-Otto-Krag-Straße 11
> D-52146 Würselen
> www.databay.de
>
> Registered office/District court: Aachen • HRB:8437 • VAT ID: DE 210844202
> Executive board: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm. Philipp Hermanns
> Chairman of the supervisory board: Wilhelm Dohmen