> On 22 Feb 2017, at 13:53, Simone Tiraboschi <[email protected]> wrote:
> 
> 
> 
> On Wed, Feb 22, 2017 at 1:33 PM, Simone Tiraboschi <[email protected] 
> <mailto:[email protected]>> wrote:
> When ovirt-ha-agent checks the status of the engine VM we get:
> 2017-02-21 22:21:14,738-0500 ERROR (jsonrpc/2) [api] FINISH getStats 
> error=Virtual machine does not exist: {'vmId': 
> u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} (api:69)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 67, in 
> method
>     ret = func(*args, **kwargs)
>   File "/usr/share/vdsm/API.py", line 335, in getStats
>     vm = self.vm
>   File "/usr/share/vdsm/API.py", line 130, in vm
>     raise exception.NoSuchVM(vmId=self._UUID)
> NoSuchVM: Virtual machine does not exist: {'vmId': 
> u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'}
> 
> While in ovirt-ha-agent logs we have:
> MainThread::INFO::2017-02-21 
> 22:21:18,583::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>  Current state UnknownLocalVmState (score: 3400)
> ...
> MainThread::INFO::2017-02-21 
> 22:21:31,199::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>  Unknown local engine vm status no actions taken
> Probably it's a bug or a regression somewhere on master.
> 
> On the ovirt-ha-broker side, detection relies on a strict string match: the 
> error message must be exactly 'Virtual machine does not exist' for the status 
> to be set to down; otherwise the status is set to unknown, as in this case:
> https://gerrit.ovirt.org/gitweb?p=ovirt-hosted-engine-ha.git;a=blob;f=ovirt_hosted_engine_ha/broker/submonitors/engine_health.py;h=d633cb860b811e84021221771bf706a9a4ac1d63;hb=refs/heads/master#l54
>  
> Adding Francesco here to understand if something has recently changed there 
> on vdsm side.

That's not very robust error handling.
Yes, the text changed: the VM ID was added to the message.
And yes, I guess it may change again at any time.
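
To illustrate the difference, here is a minimal sketch, not the actual broker code. The function and constant names are hypothetical, and it assumes the broker only has the error message string to work with. It compares an exact match with a prefix match against the message seen in the vdsm.log:

```python
# Hypothetical sketch: these names are illustrative and are not taken from
# ovirt-hosted-engine-ha or vdsm.
NO_SUCH_VM_PREFIX = "Virtual machine does not exist"


def strict_match(message):
    """Exact comparison, as described for the broker: breaks on any suffix."""
    return message == NO_SUCH_VM_PREFIX


def prefix_match(message):
    """Tolerates extra detail (such as the vmId dict) appended to the text."""
    return message.startswith(NO_SUCH_VM_PREFIX)


def engine_vm_status(message):
    """Map a getStats error message to the 'down' or 'unknown' status."""
    return "down" if prefix_match(message) else "unknown"


# The message from the vdsm.log, with the vmId appended.
new_message = ("Virtual machine does not exist: "
               "{'vmId': u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'}")

print(strict_match(new_message))      # False
print(engine_vm_status(new_message))  # down
```

A prefix match still depends on the wording, so it would break again if the leading text changed. If the response carries a stable numeric error code, matching on that would likely be the safer option.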

> 
> 
> 
> On Wed, Feb 22, 2017 at 1:02 PM, Sandro Bonazzola <[email protected] 
> <mailto:[email protected]>> wrote:
> Adding Lev
> 
> On Wed, Feb 22, 2017 at 12:59 PM, Sahina Bose <[email protected] 
> <mailto:[email protected]>> wrote:
> Hi all,
> 
> On the HC setup, the HE VM is not restarted.
> The agent.log has 
> MainThread::INFO::2017-02-21 
> 22:09:58,022::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>  Global metadata: {}
> MainThread::INFO::2017-02-21 
> 22:09:58,023::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>  Local (id 1): {'engine-health': {'reason': 'failed to getVmStats', 'health': 
> 'unknown', 'vm': 'unknown', 'detail': 'unknown'}, 'bridge': True, 'mem-free': 
> 4079.0, 'maintenance': False, 'cpu-load': 0.0491, 'gateway': True}
> ...
> MainThread::INFO::2017-02-21 
> 22:10:29,219::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>  Unknown local engine vm status no actions taken
> MainThread::INFO::2017-02-21 
> 22:10:29,219::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: 
> notify time=1487733029.22 type=state_transition 
> detail=ReinitializeFSM-UnknownLocalVmState 
> hostname='lago-hc-basic-suite-master-host0'
> MainThread::INFO::2017-02-21 
> 22:10:29,317::brokerlink::121::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, 
> was notification of state_transition (ReinitializeFSM-UnknownLocalVmState) 
> sent? ignored
> and the vdsm.log 
> 
> 2017-02-21 22:09:11,962-0500 INFO  (libvirt/events) [virt.vm] 
> (vmId='2ccc0ef0-cc31-45b8-8e91-a78fa4cad671') Changed state to Down: User 
> shut down from within the guest (code=7) (vm:1269)
> 2017-02-21 22:09:11,962-0500 INFO  (libvirt/events) [virt.vm] 
> (vmId='2ccc0ef0-cc31-45b8-8e91-a78fa4cad671') Stopping connection 
> (guestagent:429)
> 
> 2017-02-21 22:09:29,727-0500 ERROR (jsonrpc/4) [api] FINISH getStats 
> error=Virtual machine does not exist: {'vmId': 
> u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} (api:69)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 67, in 
> method
>     ret = func(*args, **kwargs)
>   File "/usr/share/vdsm/API.py", line 335, in getStats
>     vm = self.vm
>   File "/usr/share/vdsm/API.py", line 130, in vm
>     raise exception.NoSuchVM(vmId=self._UUID)
> NoSuchVM: Virtual machine does not exist: {'vmId': 
> u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'}
> 
> 
> What should I be looking for to identify the issue?
> 
> The logs are at 
> http://jenkins.ovirt.org/job/ovirt_master_hc-system-tests/lastCompletedBuild/artifact/exported-artifacts/test_logs/hc-basic-suite-master/post-002_bootstrap.py/lago-hc-basic-suite-master-host0
> 
> thanks
> sahina
> 
> _______________________________________________
> Devel mailing list
> [email protected] <mailto:[email protected]>
> http://lists.ovirt.org/mailman/listinfo/devel 
> <http://lists.ovirt.org/mailman/listinfo/devel>
> 
> 
> 
> -- 
> Sandro Bonazzola
> Better technology. Faster innovation. Powered by community collaboration.
> See how it works at redhat.com <http://redhat.com/>
> 
