Btw, this whole thing is probably the oldest untouched piece of hosted engine we have. It worked till now :)
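
For reference, a minimal sketch of the kind of check suggested in the quoted thread below: match on the *beginning* of the message and/or on the error code instead of the exact string. This is not the code from the gerrit change. The response layout, the helper name `classify_vm_status` and the numeric value of `NO_SUCH_VM_CODE` are assumptions made for illustration only.

```python
# Hypothetical helper, NOT the actual ovirt-hosted-engine-ha implementation.
# Assumption: the stats response is a dict shaped like
#   {"status": {"code": <int>, "message": <str>}}

NO_SUCH_VM_PREFIX = "Virtual machine does not exist"
NO_SUCH_VM_CODE = 1  # assumed value for the NoSuchVM error; verify against vdsm


def classify_vm_status(response):
    """Map a stats response to 'up', 'down' or 'unknown'."""
    status = response.get("status")
    if not status:
        # Nothing to inspect, so make no claim about the VM.
        return "unknown"

    code = status.get("code")
    message = status.get("message", "")

    if code == 0:
        return "up"

    # Accept either signal: the error code, or a message that merely
    # *starts with* the known text, so extra context appended to the
    # exception message no longer breaks detection.
    if code == NO_SUCH_VM_CODE or message.startswith(NO_SUCH_VM_PREFIX):
        return "down"

    return "unknown"


if __name__ == "__main__":
    reply = {
        "status": {
            "code": NO_SUCH_VM_CODE,
            "message": "Virtual machine does not exist: "
                       "{'vmId': u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'}",
        }
    }
    print(classify_vm_status(reply))
```

Checking both the code and the prefix is deliberately redundant: if either one changes again, the other still identifies the missing VM.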
Martin

On Wed, Feb 22, 2017 at 3:38 PM, Martin Sivak <[email protected]> wrote:
>> broke HA. Could you perhaps fixing it checking that the message *begins*
>> with that string, and/or checking the error code. bests,
>
> Already done:
>
> https://gerrit.ovirt.org/#/c/72791/
>
> Martin
>
> On Wed, Feb 22, 2017 at 3:32 PM, Francesco Romani <[email protected]> wrote:
>> On 02/22/2017 01:53 PM, Simone Tiraboschi wrote:
>>
>> On Wed, Feb 22, 2017 at 1:33 PM, Simone Tiraboschi <[email protected]>
>> wrote:
>>>
>>> When ovirt-ha-agent checks the status of the engine VM we get:
>>>
>>> 2017-02-21 22:21:14,738-0500 ERROR (jsonrpc/2) [api] FINISH getStats
>>> error=Virtual machine does not exist: {'vmId':
>>> u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} (api:69)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 67, in
>>> method
>>>     ret = func(*args, **kwargs)
>>>   File "/usr/share/vdsm/API.py", line 335, in getStats
>>>     vm = self.vm
>>>   File "/usr/share/vdsm/API.py", line 130, in vm
>>>     raise exception.NoSuchVM(vmId=self._UUID)
>>> NoSuchVM: Virtual machine does not exist: {'vmId':
>>> u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'}
>>>
>>> While in ovirt-ha-agent logs we have:
>>>
>>> MainThread::INFO::2017-02-21
>>> 22:21:18,583::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>> Current state UnknownLocalVmState (score: 3400)
>>>
>>> ...
>>>
>>> MainThread::INFO::2017-02-21
>>> 22:21:31,199::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>>> Unknown local engine vm status no actions taken
>>>
>>> Probably it's a bug or a regression somewhere on master.
>>
>> On ovirt-ha-broker side the detection is based on a strict string match on
>> the error message that is expected to be exactly 'Virtual machine does not
>> exist' to set down status otherwise we set unknown status as in this case:
>> https://gerrit.ovirt.org/gitweb?p=ovirt-hosted-engine-ha.git;a=blob;f=ovirt_hosted_engine_ha/broker/submonitors/engine_health.py;h=d633cb860b811e84021221771bf706a9a4ac1d63;hb=refs/heads/master#l54
>>
>> Adding Francesco here to understand if something has recently changed there
>> on vdsm side.
>>
>> It has changed indeed; we had a series of changes which added context to
>> some exceptions. I believe the straw who broke the camel's back was
>> I32ec3f86f8d53f8412f4c0526fc85e2a42e30ea5 It is unfortunate that this change
>> broke HA. Could you perhaps fixing it checking that the message *begins*
>> with that string, and/or checking the error code. bests,
>>
>> --
>> Francesco Romani
>> Red Hat Engineering Virtualization R & D
>> IRC: fromani
>>
>>
>> _______________________________________________
>> Devel mailing list
>> [email protected]
>> http://lists.ovirt.org/mailman/listinfo/devel
