It looks like oVirt 4.0 Node and ovirt-hosted-engine-setup still need some work before they are ready for general consumption. I have spent days trying to get this to work, and only got it running (on one host) after encountering 8 serious issues (7 below and the initial glusterfs one). I have not been able to successfully deploy a second host (see issue 7 below). I will be moving back to deploying hosts using CentOS (with either oVirt 4.0 or oVirt 3.6), as I need a working oVirt deployment up and running.

In case anyone is interested in reproducing the issues, I used the Node ISO here [1] and the latest (7/2/2016) engine appliance OVA here [2]. Those seem to be the "official" files as far as I can tell (which is difficult to determine, as the documentation is not clear).

List of issues:
1. The error I mentioned seems to be a problem with the code. I bypassed it by deleting /usr/libexec/vdsm/hooks/before_network_setup/50_fcoe.
2. ovirt-hosted-engine-setup is unable to connect to the vdsm service if the FQDN of the node is not resolvable (i.e. if a DNS server is not entered in the initial setup). This should be checked either in the initial oVirt Node setup process or at the beginning of ovirt-hosted-engine-setup.
3. The management bridge does not get created properly when the server is set up with a manually configured DNS server and is running NetworkManager (the default on Node). It seems a bug was filed for this back in 2014. [3]
4. Using cloud-init with default values to customize the engine appliance can fail on the line "Creating/refreshing DWH database schema" if it takes longer than 600 seconds to return output. This may apply to any other step that takes a long time to complete. The VM no longer appears to exist after the setup exits, so I am unable to debug.
5. Without using cloud-init, the setup creates an engine VM that I cannot log into (it does not seem to use the engine admin password or a blank password).
6. Destroying the VM (option 4) leaves the files intact on the shared storage, so I cannot restart setup without deleting those first. This may be intentional, but the use of KVM terminology (destroy for power off) is not common, not to mention that "virsh -r list --all" does not list the VM anymore.
7. Unable to deploy a second host through the web UI (error "Failed to configure management network on host node2 due to setup networks failure.") or using ovirt-hosted-engine-setup (it looks like it can't connect to, or doesn't start, the broker service).
8. Random errors to stderr: "vcpu0 unhandled rdmsr" (this seems to be an upstream bug) and "multipath: error getting device" (this has been an issue for years with oVirt and seems to be due to multipathing being on by default even for systems where that does not apply).
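
For issue 2, the kind of pre-flight check I have in mind could look roughly like the sketch below. This is only an illustration, not code from oVirt Node or ovirt-hosted-engine-setup; the function name fqdn_resolves is made up, and it simply asks the system resolver whether the host's FQDN maps to any address.

```python
import socket


def fqdn_resolves(hostname=None):
    """Return True if hostname resolves to at least one address.

    If hostname is not given, this machine's FQDN (as reported by
    socket.getfqdn()) is checked. Hypothetical helper for illustration.
    """
    name = hostname or socket.getfqdn()
    try:
        # getaddrinfo uses the system resolver (/etc/hosts, DNS, ...).
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        return False
    return True
```

Calling fqdn_resolves() with no argument on the node before starting setup would flag the missing DNS configuration early instead of failing later when connecting to vdsm.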

[1] http://resources.ovirt.org/pub/ovirt-4.0/iso/ovirt-node-ng-installer/ovirt-node-ng-installer-ovirt-4.0-2016062412.iso
[2] http://jenkins.ovirt.org/view/All/job/ovirt-appliance_ovirt-4.0_build-artifacts-el7-x86_64/
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1160423

On 7/1/2016 8:37 PM, Kevin Hung wrote:
It looks like I'm now getting an error when the deployment tries to configure the management bridge.

Setup log:

2016-07-01 20:29:47 INFO otopi.plugins.gr_he_common.network.bridge bridge._misc:372 Configuring the management bridge
2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:384 networks: {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}
2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:385 bonds: {}
2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:386 options: {'connectivityCheck': False}
2016-07-01 20:29:48 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/network/bridge.py", line 387, in _misc
    _setupNetworks(conn, networks, bonds, options)
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/network/bridge.py", line 405, in _setupNetworks
    'message: "%s"' % (networks, code, message))
RuntimeError: Failed to setup networks {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}. Error code: "78" message: "Hook error: Hook Error: ('Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_network_setup/50_fcoe", line 18, in <module>\n from vdsm.netconfpersistence import RunningConfig\nImportError: No module named netconfpersistence\n',)"
2016-07-01 20:29:48 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Misc configuration': Failed to setup networks {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}. Error code: "78" message: "Hook error: Hook Error: ('Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_network_setup/50_fcoe", line 18, in <module>\n from vdsm.netconfpersistence import RunningConfig\nImportError: No module named netconfpersistence\n',)"
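
The ImportError in the hook traceback above can probably be reproduced outside of the setup run. A minimal sketch, assuming it is run on the node with the same Python interpreter the vdsm hooks use (Python 2.7, going by the paths in the traceback); the helper name module_available is made up for illustration:

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# The module that the 50_fcoe hook tries to import, per the traceback.
print(module_available("vdsm.netconfpersistence"))
```

If this prints False on the node, the hook will fail the same way whenever network setup runs, independent of ovirt-hosted-engine-setup.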


On 7/1/2016 5:21 PM, Kevin Hung wrote:
Thank you Sahina, that was the issue. I upgraded my glusterfs server to 3.7.11 and I was able to continue with the deployment. I am seeing other issues with deployment, but I will look into those myself first. Bug has been logged [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1352165


_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
