Thanks for your reply. Can you please provide step-by-step instructions on how to upgrade vdsm from a node command line?
On Tue, Mar 19, 2019 at 2:49 PM Simone Tiraboschi <[email protected]> wrote:

> Hi Ada,
> here the error:
>
> 2019-03-19 14:08:25,833+0200 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
> 2019-03-19 14:08:25,839+0200 INFO (vm/a492d2eb) [vdsm.api] FINISH prepareImage error=Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal, task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
> 2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb) [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257') Unexpected error (task:875)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
>     return fn(*args, **kargs)
>   File "<string>", line 2, in prepareImage
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
>     ret = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3199, in prepareImage
>     legality = dom.produceVolume(imgUUID, volUUID).getLegality()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 822, in produceVolume
>     volUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 801, in __init__
>     self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line 71, in __init__
>     volUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 86, in __init__
>     self.validate()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 112, in validate
>     self.validateVolumePath()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line 131, in validateVolumePath
>     raise se.VolumeDoesNotExist(self.volUUID)
> VolumeDoesNotExist: Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
> 2019-03-19 14:08:25,840+0200 INFO (vm/a492d2eb) [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257') aborting: Task is aborted: "Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code 201 (task:1181)
> 2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher] FINISH prepareImage error=Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)
>
> I think it's still https://bugzilla.redhat.com/1666795
> <https://bugzilla.redhat.com/show_bug.cgi?id=1666795>
>
> Can you please try updating vdsm to vdsm-4.30.10 since the bug is reported
> as solved in that version?
>
> On Tue, Mar 19, 2019 at 12:30 PM ada per <[email protected]> wrote:
>
>> an vdsm:
>>
>> On Tue, Mar 19, 2019 at 1:24 PM ada per <[email protected]> wrote:
>>
>>> Thank you! please see attached files:
>>>
>>> On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi <[email protected]> wrote:
>>>
>>>> Can you please check/attach also
>>>> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log ?
>>>>
>>>> On Tue, Mar 19, 2019 at 11:36 AM ada per <[email protected]> wrote:
>>>>
>>>>> Hello everyone,
>>>>>
>>>>> For a strange reason the hosted engine went down and I cannot restart
>>>>> it. I tried manually restarting it without any success can you please
>>>>> advice?
>>>>>
>>>>> For all the nodes the engine status is the same as the one below.
>>>>> --== Host nodex. (id: 6) status ==--
>>>>> conf_on_shared_storage : True
>>>>> Status up-to-date : True
>>>>> Hostname : nodex
>>>>> Host ID : 6
>>>>> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>>>>> Score : 3400
>>>>> stopped : False
>>>>> Local maintenance : False
>>>>> crc32 : 323a9f45
>>>>> local_conf_timestamp : 2648874
>>>>> Host timestamp : 2648874
>>>>> Extra metadata (valid at timestamp):
>>>>>   metadata_parse_version=1
>>>>>   metadata_feature_version=1
>>>>>   timestamp=2648874 (Tue Mar 19 12:25:44 2019)
>>>>>   host-id=6
>>>>>   score=3400
>>>>>   vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
>>>>>   conf_on_shared_storage=True
>>>>>   maintenance=False
>>>>>   state=GlobalMaintenance
>>>>>   stopped=False
>>>>>
>>>>> When I try the commands
>>>>> root@node5# hosted-engine --vm-shutdown
>>>>> I ge the response:
>>>>> root@node5# Command VM.shutdown with args {'delay': '120', 'message': 'VM is shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'} failed:(code=1, message=Virtual machine does not exist)
>>>>>
>>>>> But when I run : hosted-engine --vm-start
>>>>> I get the response: VM exists and is down, cleaning up and restarting
>>>>>
>>>>> Below you can see the # journalctl -u ovirt-ha-agent logs
>>>>>
>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled monitoring loop exception
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 430, in start_monitoring
>>>>>     self._monitoring_loop()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 449, in _monitoring_loop
>>>>>     for old_state, state, delay in self.fsm:
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
>>>>>     new_data = self.refresh(self._state.data)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 81, in refresh
>>>>>     stats.update(self.hosted_engine.collect_stats())
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 737, in collect_stats
>>>>>     all_stats = self._broker.get_stats_from_storage()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
>>>>>     result = self._proxy.get_stats()
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>>>     return self.__send(self.__name, args)
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>>>     verbose=self.__verbose
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>>>     return self.single_request(host, handler, request_body, verbose)
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>>>>     self.send_content(h, request_body)
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>>>>     connection.endheaders(request_body)
>>>>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>>>>     self._send_output(message_body)
>>>>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>>>>     self.send(msg)
>>>>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>>>>     self.connect()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>>>>     self.sock.connect(base64.b16decode(self.host))
>>>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>>>     return getattr(self._sock,name)(*args)
>>>>> error: [Errno 2] No such file or directory
>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>>>     return action(he)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>>>     return he.start_monitoring()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring
>>>>>     self.publish(stopped)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 337, in publish
>>>>>     self._push_to_storage(blocks)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 708, in _push_to_storage
>>>>>     self._broker.put_stats_on_storage(self.host_id, blocks)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 113, in put_stats_on_storage
>>>>>     self._proxy.put_stats(host_id, xmlrpclib.Binary(data))
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>>>     return self.__send(self.__name, args)
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>>>     verbose=self.__verbose
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>>>     return self.single_request(host, handler, request_body, verbose)
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>>>>     self.send_content(h, request_body)
>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>>>>     connection.endheaders(request_body)
>>>>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>>>>     self._send_output(message_body)
>>>>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>>>>     self.send(msg)
>>>>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>>>>     self.connect()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>>>>     self.sock.connect(base64.b16decode(self.host))
>>>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>>>     return getattr(self._sock,name)(*args)
>>>>> error: [Errno 2] No such file or directory
>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
>>>>> Mar 14 12:04:42 node7. systemd[1]: Unit ovirt-ha-agent.service entered failed state.
>>>>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service failed.
>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
>>>>> Mar 14 12:04:52 node7. systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
>>>>> Mar 14 12:04:52 node7. systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to start necessary monitors
>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>>>     return action(he)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>>>     return he.start_monitoring()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 413, in start_monitoring
>>>>>     self._initialize_broker()
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 537, in _initialize_broker
>>>>>     m.get('options', {}))
>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 86, in start_monitor
>>>>>     ).format(t=type, o=options, e=e)
>>>>> RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'ping', options: {'addr': '19
>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
>>>>> Mar 14 12:04:52 node7. systemd[1]: Unit ovirt-ha-agent.service entered failed state.
>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service failed.
>>>>> Mar 14 12:05:02 node7. systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
>>>>> Mar 14 12:05:02 node7. systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
>>>>> Mar 14 12:05:02 node7. systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
>>>>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine vm with /usr/sbin/hosted-engine --vm-poweroff: Co
>>>>> (code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>>>>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine VM: Command VM.destroy with args {'vmID': 'a492d2
>>>>> (code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>>>>> Mar 15 14:28:16 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:28:36 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:29:00 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:29:22 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:29:44 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:30:06 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:30:28 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:30:50 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:31:12 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:31:33 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:31:56 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:32:18 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:32:40 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> Mar 15 14:33:02 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>> _______________________________________________
>>>>> Users mailing list -- [email protected]
>>>>> To unsubscribe send an email to [email protected]
>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>> oVirt Code of Conduct:
>>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>>> List Archives:
>>>>> https://lists.ovirt.org/archives/list/[email protected]/message/NS2SASAK66TEO3MZQYIW64HCDLXVTIL6/

