Re: [ovirt-users] ovirt-3.6 : Hosted-engine crashed and can't restart

2016-07-21 Thread Alexis HAUSER

> The issue seems to be here: please ensure that you can correctly connect
> to your storage server.
> Can you please attach the vdsm logs?

Yes, I actually figured out it was a DNS problem: as mentioned in the log 
messages I provided, the host wasn't able to reach the NFS server where the 
engine is stored (it seems to use the FQDN rather than the IP for NFS; I will 
fix that so it no longer depends on DNS).
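
A quick way to confirm this kind of failure is to check whether the host can 
still resolve the NFS server name. The sketch below is only an illustration: 
the storage spec nfs.example.com:/export/engine is a made-up placeholder, and 
nfs_server_of / check_resolves are hypothetical helper names, not part of oVirt.

```shell
#!/bin/sh
# Extract the server part of an NFS spec of the form "server:/path".
nfs_server_of() {
    printf '%s\n' "${1%%:*}"
}

# Succeed if the name resolves through the system resolver
# (getent consults /etc/hosts and DNS according to nsswitch.conf).
check_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Placeholder value (assumption); substitute your own storage path.
spec="nfs.example.com:/export/engine"
server=$(nfs_server_of "$spec")
if check_resolves "$server"; then
    echo "$server resolves"
else
    echo "$server does not resolve; NFS mounts by FQDN will fail"
fi
```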

This is my setup: only Em1 is plugged in, and it carries ovirtmgmt plus one 
other logical VLAN network. This VLAN network was set to DHCP and never had an 
IP, and everything was working fine.
Since I added an IP address to that interface, the manager crashed. The host is 
now trying to use that VLAN interface as the default route, I have no idea why, 
and that causes the DNS issue (one of the DNS servers was on another network, 
the second was on the same network... it should actually have worked anyway...).
The only way I found to resolve this was to ifdown that interface and run route 
add default gw  ovirtmgmt
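
To see which interface currently holds the default route, the output of 
"ip route show default" can be parsed. This is a minimal sketch, assuming 
iproute2 is installed; default_route_dev is a hypothetical helper name and the 
gateway address in the example is a placeholder.

```shell
#!/bin/sh
# Print the device named after "dev" in the first default-route line on stdin.
default_route_dev() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") { print $(i + 1); exit }
    }'
}

# Live check on a host (requires iproute2):
#   ip route show default | default_route_dev
# Example with canned input; 192.0.2.1 is a placeholder gateway.
printf 'default via 192.0.2.1 dev ovirtmgmt\n' | default_route_dev
```

If the printed device is the VLAN interface rather than ovirtmgmt, the host is 
in the situation described above.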

After that, I had errors like "unknown stale data" and "failed to reinitialize 
lockspace"; removing the lockfile with the hosted-engine command and manually 
removing the __DIRECT_IO__ file on the engine storage didn't fix it.

I actually found out what was happening: ovirt-ha-agent had errors in its 
status (checked with systemctl), ovirt-ha-broker had errors related to 
ha-agent, and vdsmd had errors related to those two services.

I resolved my issue by restarting the services in the right order:

# systemctl restart ovirt-ha-agent.service
# systemctl restart ovirt-ha-broker.service
# systemctl restart vdsmd
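
The same sequence can be wrapped in a small function that stops at the first 
failed restart. This is only a sketch: restart_in_order and the DRY_RUN switch 
are hypothetical names introduced here, and the order is simply the one listed 
above.

```shell
#!/bin/sh
# Restart the services in the order given above, stopping at the first failure.
# With DRY_RUN=1 the commands are printed instead of executed.
restart_in_order() {
    for unit in ovirt-ha-agent.service ovirt-ha-broker.service vdsmd; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "systemctl restart $unit"
        else
            systemctl restart "$unit" || return 1
        fi
    done
}

# Preview only; unset DRY_RUN and run as root to actually restart.
DRY_RUN=1
restart_in_order
```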

Anyway, thanks for your answer. I hope this thread will help people with 
similar issues.


Re: [ovirt-users] ovirt-3.6 : Hosted-engine crashed and can't restart

2016-07-21 Thread Simone Tiraboschi
On Wed, Jul 20, 2016 at 5:01 PM, Alexis HAUSER wrote:
> After assigning an IP address to a VLAN network (it was using DHCP by default) 
> that was on the same NIC as ovirtmgmt, my hosted-engine crashed and can't 
> start again... I have no idea how to fix this.
> I had a similar issue some months ago but with a different error. I tried to 
> restart the HA agent that seems to be linked with this error, and also 
> restarted the host. I also tried to remove the _DIRECT_IO_ lockfile on the 
> engine storage, as it fixed my problem last time, but it didn't help...
>
> Any ideas? Do you think manually editing the logical networks on the host and 
> reverting them to how they were before the crash can help?
>
> hosted-engine --vm-status
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 117, in <module>
>     if not status_checker.print_status():
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 60, in print_status
>     all_host_stats = ha_cli.get_all_host_stats()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 160, in get_all_host_stats
>     return self.get_all_stats(self.StatModes.HOST)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
>     self._configure_broker_conn(broker)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 180, in _configure_broker_conn
>     dom_type=dom_type)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain
>     .format(sd_type, options, e))
> ovirt_hosted_engine_ha.lib.exceptions.RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': 'e41807e5-ee68-40a2-a642-cc226ba0e82d'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>
>
>
> vdsClient -s 0 list
>
> 16450089-911e-4bad-a8b7-98e84a79ef3a
> Status = Down
> nicModel = rtl8139,pv
> statusTime = 4295559350
> exitMessage = Unable to get volume size for domain 
> e41807e5-ee68-40a2-a642-cc226ba0e82d volume 
> 053df3a6-db18-445a-8f75-61c630ab0003
> emulatedMachine = rhel6.5.0
> pid = 0
> vmName = HostedEngine
> devices = [{'index': '0', 'iface': 'virtio', 'format': 'raw', 
> 'bootOrder': '1', 'address': {'slot': '0x06', 'bus': '0x00', 'domain': 
> '0x', 'type': 'pci', 'function': '0x0'}, 'volumeID': 
> '053df3a6-db18-445a-8f75-61c630ab0003', 'imageID': 
> 'b6daa50d-adad-46a5-8f5f-accfb155a1e1', 'readonly': 'false', 'domainID': 
> 'e41807e5-ee68-40a2-a642-cc226ba0e82d', 'deviceId': 
> 'b6daa50d-adad-46a5-8f5f-accfb155a1e1', 'poolID': 
> '----', 'device': 'disk', 'shared': 
> 'exclusive', 'propagateErrors': 'off', 'type': 'disk'}, {'nicModel': 'pv', 
> 'macAddr': '00:16:3e:1c:4b:81', 'linkActive': 'true', 'network': 'ovirtmgmt', 
> 'deviceId': '0aeaea2f-a419-43cc-92d7-8422f6aa9223', 'address': 'None', 
> 'device': 'bridge', 'type': 'interface'}, {'index': '2', 'iface': 'ide', 
> 'readonly': 'true', 'deviceId': '8c3179ac-b322-4f5c-9449-c52e3665e0ae', 
> 'address': {'bus': '1', 'controller': '0', 'type': 'drive', 'target': '0', 
> 'unit': '0'}, 'device': 'cdrom', 'shared': 'false', 'path': '', 'type': 
> 'disk'}, {'device': 'scsi', 'model': 'virtio-scsi', 'type': 'controller', 
> 'deviceId': '21db0c6e-071c-48ff-b905-95478b37c384', 'address': {'slot': 
> '0x04', 'bus': '0x00', 'domain': '0x', 'type': 'pci', 'function': 
> '0x0'}}, {'device': 'usb', 'type': 'controller', 'deviceId': 
> 'c0384f68-d0c9-4ebb-a779-8dc9911ce2f8', 'address': {'slot': '0x01', 'bus': 
> '0x00', 'domain': '0x', 'type': 'pci', 'function': '0x2'}}, {'device': 
> 'ide', 'type': 'controller', 'deviceId': 
> 'd5a2dd13-138a-482b-9bc3-994b10ec4100', 'address': {'slot': '0x01', 'bus': 
> '0x00', 'domain': '0x', 'type': 'pci', 'function': '0x1'}}, {'device': 
> 'virtio-serial', 'type': 'controller', 'deviceId': 
> '9e695172-c9b0-47df-bc76-8170219dec28', 'address': {'slot': '0x05', 'bus': 
> '0x00', 'domain': '0x', 'type': 'pci', 'function': '0x0'}}]
> guestDiskMapping = {}
> vmType = kvm
> displaySecurePort = -1
> exitReason = 1
> memSize = 6000
> displayPort = -1
> clientIp =
> spiceSecureChannels = 
> smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
> smp = 4
> displayIp = 0
> display = vnc
> exitCode = 1
>
>
> systemctl status ovirt-ha-agent.service -l
> ● 

[ovirt-users] ovirt-3.6 : Hosted-engine crashed and can't restart

2016-07-20 Thread Alexis HAUSER
After assigning an IP address to a VLAN network (it was using DHCP by default) 
that was on the same NIC as ovirtmgmt, my hosted-engine crashed and can't 
start again... I have no idea how to fix this.
I had a similar issue some months ago but with a different error. I tried to 
restart the HA agent that seems to be linked with this error, and also 
restarted the host. I also tried to remove the _DIRECT_IO_ lockfile on the 
engine storage, as it fixed my problem last time, but it didn't help...

Any ideas? Do you think manually editing the logical networks on the host and 
reverting them to how they were before the crash can help?

hosted-engine --vm-status
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 117, in <module>
    if not status_checker.print_status():
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 60, in print_status
    all_host_stats = ha_cli.get_all_host_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 160, in get_all_host_stats
    return self.get_all_stats(self.StatModes.HOST)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
    self._configure_broker_conn(broker)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 180, in _configure_broker_conn
    dom_type=dom_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain
    .format(sd_type, options, e))
ovirt_hosted_engine_ha.lib.exceptions.RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': 'e41807e5-ee68-40a2-a642-cc226ba0e82d'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>


vdsClient -s 0 list

16450089-911e-4bad-a8b7-98e84a79ef3a
Status = Down
nicModel = rtl8139,pv
statusTime = 4295559350
exitMessage = Unable to get volume size for domain 
e41807e5-ee68-40a2-a642-cc226ba0e82d volume 053df3a6-db18-445a-8f75-61c630ab0003
emulatedMachine = rhel6.5.0
pid = 0
vmName = HostedEngine
devices = [{'index': '0', 'iface': 'virtio', 'format': 'raw', 
'bootOrder': '1', 'address': {'slot': '0x06', 'bus': '0x00', 'domain': 
'0x', 'type': 'pci', 'function': '0x0'}, 'volumeID': 
'053df3a6-db18-445a-8f75-61c630ab0003', 'imageID': 
'b6daa50d-adad-46a5-8f5f-accfb155a1e1', 'readonly': 'false', 'domainID': 
'e41807e5-ee68-40a2-a642-cc226ba0e82d', 'deviceId': 
'b6daa50d-adad-46a5-8f5f-accfb155a1e1', 'poolID': 
'----', 'device': 'disk', 'shared': 
'exclusive', 'propagateErrors': 'off', 'type': 'disk'}, {'nicModel': 'pv', 
'macAddr': '00:16:3e:1c:4b:81', 'linkActive': 'true', 'network': 'ovirtmgmt', 
'deviceId': '0aeaea2f-a419-43cc-92d7-8422f6aa9223', 'address': 'None', 
'device': 'bridge', 'type': 'interface'}, {'index': '2', 'iface': 'ide', 
'readonly': 'true', 'deviceId': '8c3179ac-b322-4f5c-9449-c52e3665e0ae', 
'address': {'bus': '1', 'controller': '0', 'type': 'drive', 'target': '0', 
'unit': '0'}, 'device': 'cdrom', 'shared': 'false', 'path': '', 'type': 
'disk'}, {'device': 'scsi', 'model': 'virtio-scsi', 'type': 'controller', 
'deviceId': '21db0c6e-071c-48ff-b905-95478b37c384', 'address': {'slot': '0x04', 
'bus': '0x00', 'domain': '0x', 'type': 'pci', 'function': '0x0'}}, 
{'device': 'usb', 'type': 'controller', 'deviceId': 
'c0384f68-d0c9-4ebb-a779-8dc9911ce2f8', 'address': {'slot': '0x01', 'bus': 
'0x00', 'domain': '0x', 'type': 'pci', 'function': '0x2'}}, {'device': 
'ide', 'type': 'controller', 'deviceId': 
'd5a2dd13-138a-482b-9bc3-994b10ec4100', 'address': {'slot': '0x01', 'bus': 
'0x00', 'domain': '0x', 'type': 'pci', 'function': '0x1'}}, {'device': 
'virtio-serial', 'type': 'controller', 'deviceId': 
'9e695172-c9b0-47df-bc76-8170219dec28', 'address': {'slot': '0x05', 'bus': 
'0x00', 'domain': '0x', 'type': 'pci', 'function': '0x0'}}]
guestDiskMapping = {}
vmType = kvm
displaySecurePort = -1
exitReason = 1
memSize = 6000
displayPort = -1
clientIp = 
spiceSecureChannels = 
smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
smp = 4
displayIp = 0
display = vnc
exitCode = 1


systemctl status ovirt-ha-agent.service -l
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring 
Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; 
vendor preset: disabled)
   Active: active (running) since Wed 2016-07-20 14:56:22 UTC; 2min 29s ago
 Main PID: 20236 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
   └─20236 /usr/bin/python