On Wed, Aug 23, 2017 at 9:21 AM, Nir Soffer <[email protected]> wrote:
> On Tue, Aug 22, 2017 at 1:48 PM Dan Kenigsberg <[email protected]> wrote:
>
>> This seems to be my fault, https://gerrit.ovirt.org/80908 should fix it.
>
> This fixes the actual error, but we still have bad logging.
>
> Piotr, can you fix error handling so we get something like:
>
> Error configuring "foobar": actual error...
>

Thank you for your suggestion. Yes, we will improve the logging.
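Something along these lines is what I have in mind (a rough sketch only,
not the final patch; configure_all() and the configurators' .name
attribute are illustrative names, not existing vdsm-tool code):

    # Sketch: run every configurator and report failures per module,
    # instead of letting the first exception escape anonymously.
    import logging
    import sys

    def configure_all(configurators):
        failed = []
        for configurator in configurators:
            try:
                configurator.configure()
            except Exception as e:
                # The message format Nir asked for:
                #   Error configuring "foobar": actual error...
                logging.exception('Error configuring "%s": %s',
                                  configurator.name, e)
                failed.append(configurator.name)
        if failed:
            sys.stderr.write('Error: failed to configure modules: %s\n'
                             % ', '.join(failed))
        return 1 if failed else 0

With that, the log below would have named the failing configurator
instead of printing a bare ServiceNotExistError.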
>> On Tue, Aug 22, 2017 at 1:14 PM, Nir Soffer <[email protected]> wrote:
>>
>>> On Tue, Aug 22, 2017 at 12:57, Yedidyah Bar David <[email protected]> wrote:
>>>
>>>> On Tue, Aug 22, 2017 at 12:52 PM, Anton Marchukov <[email protected]> wrote:
>>>>
>>>>> Hello All.
>>>>>
>>>>> Any news on this? I see the latest failure for vdsm is the same [1] and
>>>>> the job is still not working for it.
>>>>>
>>>>> [1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1901/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20170822035135-lago-basic-suite-master-host0-1f46d892.log
>>>>
>>>> This log has:
>>>>
>>>> 2017-08-22 03:51:28,272-0400 DEBUG otopi.context context._executeMethod:128 Stage closeup METHOD otopi.plugins.ovirt_host_deploy.vdsm.packages.Plugin._reconfigure
>>>> 2017-08-22 03:51:28,272-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:813 execute: ('/bin/vdsm-tool', 'configure', '--force'), executable='None', cwd='None', env=None
>>>> 2017-08-22 03:51:30,687-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:863 execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
>>>> 2017-08-22 03:51:30,688-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
>>>>
>>>> Checking configuration status...
>>>>
>>>> abrt is not configured for vdsm
>>>> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
>>>> lvm requires configuration
>>>> libvirt is not configured for vdsm yet
>>>> FAILED: conflicting vdsm and libvirt-qemu tls configuration.
>>>> vdsm.conf with ssl=True requires the following changes:
>>>> libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
>>>> qemu.conf: spice_tls=1.
>>>> multipath requires configuration
>>>>
>>>> Running configure...
>>>> Reconfiguration of abrt is done.
>>>> Reconfiguration of passwd is done.
>>>> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
>>>> Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.201708220351
>>>> Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf
>>>> Units need configuration: {'lvm2-lvmetad.service': {'LoadState': 'loaded', 'ActiveState': 'active'}, 'lvm2-lvmetad.socket': {'LoadState': 'loaded', 'ActiveState': 'active'}}
>>>> Reconfiguration of lvm is done.
>>>> Reconfiguration of sebool is done.
>>>>
>>>> 2017-08-22 03:51:30,688-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:926 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
>>>> Error: ServiceNotExistError: Tried all alternatives but failed:
>>>> ServiceNotExistError: dev-hugepages1G.mount is not native systemctl service
>>>> ServiceNotExistError: dev-hugepages1G.mount is not a SysV service
>>>>
>>>> 2017-08-22 03:51:30,689-0400 WARNING otopi.plugins.ovirt_host_deploy.vdsm.packages packages._reconfigure:155 Cannot configure vdsm
>>>>
>>>> Nir, any idea?
>>>
>>> Looks like some configurator has failed after sebool, but we don't have a
>>> proper error message with the name of the configurator.
>>>
>>> Piotr, can you take a look?
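For context, the "Tried all alternatives" text in the stderr above comes
from the pattern vdsm uses to try several service backends in turn;
schematically (a sketch of the pattern only, not vdsm's actual code):

    # Sketch: each backend (systemd, SysV init, ...) is tried in turn;
    # we give up only when every one of them fails, concatenating the
    # individual errors into one message.
    class ServiceNotExistError(Exception):
        pass

    def run_alternatives(service_name, alternatives):
        errors = []
        for backend in alternatives:
            try:
                return backend(service_name)
            except ServiceNotExistError as e:
                errors.append('%s: %s' % (type(e).__name__, e))
        raise ServiceNotExistError(
            'Tried all alternatives but failed:\n%s' % '\n'.join(errors))

Both backends rejected dev-hugepages1G.mount, hence the two
ServiceNotExistError lines in the log, but nothing recorded which
configurator triggered the call.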
>>>>> On Sun, Aug 20, 2017 at 12:39 PM, Nir Soffer <[email protected]> wrote:
>>>>>
>>>>>> On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <[email protected]> wrote:
>>>>>>
>>>>>>> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <[email protected]> wrote:
>>>>>>>
>>>>>>>> On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Failed test: basic_suite_master/002_bootstrap
>>>>>>>>> Version: oVirt Master
>>>>>>>>> Link to failed job: ovirt-master_change-queue-tester/1860/
>>>>>>>>> Link to logs (Jenkins): test logs
>>>>>>>>> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
>>>>>>>>>
>>>>>>>>> From what I was able to find, it seems that for some reason VDSM
>>>>>>>>> failed to start on host 1. The VDSM log is empty, and the only error
>>>>>>>>> I could find in supervdsm.log is that the start of LLDP failed (not
>>>>>>>>> sure if it's related).
>>>>>>>>
>>>>>>>> Can you check the networking on the hosts? Something's very strange
>>>>>>>> there. For example:
>>>>>>>>
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]: <info> [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device (/org/freedesktop/NetworkManager/Devices/17)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to layer2+3 (2)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap2+3 (3)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap3+4 (4)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option xmit_hash_policy: invalid value (5)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to always (0)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to better (1)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to failure (2)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option primary_reselect: invalid value (3)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to any (0)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to all (1)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option arp_all_targets: invalid value (2)
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding: e7NZWeNDXwIjQia is being deleted...
>>>>>>>> Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event interface): No buffer space available
>>>>>>>>
>>>>>>>> Y.
>>>>>>>
>>>>>>> The post-boot noise with funny-looking bonds is due to our calling of
>>>>>>> `vdsm-tool dump-bonding-options` every boot, in order to find the
>>>>>>> bonding defaults for the current kernel.
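(For anyone puzzled by that log noise: as far as I understand, the probing
works roughly like the sketch below; this is the idea only, not vdsm's
actual implementation. The real tool apparently also writes candidate
values into each option, which is what produces the "Setting xmit hash
policy ..." and "invalid value" kernel lines.)

    # Sketch: create a throwaway bond under a random name (hence
    # "e7NZWeNDXwIjQia" above), read the kernel's default value of
    # every bonding option from sysfs, then delete the bond again
    # (the "... is being deleted" line).
    import os
    import random
    import string

    BONDING_MASTERS = '/sys/class/net/bonding_masters'

    def dump_bonding_defaults():
        bond = ''.join(random.choice(string.ascii_letters)
                       for _ in range(15))
        with open(BONDING_MASTERS, 'w') as f:
            f.write('+%s' % bond)  # the kernel creates the bond device
        try:
            opts_dir = '/sys/class/net/%s/bonding' % bond
            defaults = {}
            for opt in os.listdir(opts_dir):
                with open(os.path.join(opts_dir, opt)) as f:
                    defaults[opt] = f.read().strip()
            return defaults
        finally:
            with open(BONDING_MASTERS, 'w') as f:
                f.write('-%s' % bond)  # the kernel deletes the bond device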
>>>>>>>>> From host-deploy log:
>>>>>>>>>
>>>>>>>>> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd systemd.state:130 starting service vdsmd
>>>>>>>>> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'start', 'vdsmd.service'), executable='None', cwd='None', env=None
>>>>>>>>> 2017-08-19 16:38:44,628-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start', 'vdsmd.service'), rc=1
>>>>>>>>> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stdout:
>>>>>>>>>
>>>>>>>>> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stderr:
>>>>>>>>> Job for vdsmd.service failed because the control process exited with error code. See "systemctl status vdsmd.service" and "journalctl -xe" for details.
>>>>>>>>>
>>>>>>>>> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context context._executeMethod:142 method exception
>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>   File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line 132, in _executeMethod
>>>>>>>>>     method['method']()
>>>>>>>>>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 224, in _start
>>>>>>>>>     self.services.state('vdsmd', True)
>>>>>>>>>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py", line 141, in state
>>>>>>>>>     service=name,
>>>>>>>>> RuntimeError: Failed to start service 'vdsmd'
>>>>>>>>>
>>>>>>>>> From /var/log/messages:
>>>>>>>>>
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Error:
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: One of the modules is not configured to work with VDSM.
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: To configure the module use the following:
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure [--module module-name]'.
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: If all modules are not configured try to use:
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure --force'
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: (The force flag will stop the module's service and start it
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: afterwards automatically to load the new configuration.)
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: abrt is already configured for vdsm
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: lvm is configured for vdsm
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: libvirt is already configured for vdsm
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: multipath requires configuration
>>>>>>>>> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Modules sanlock, multipath are not configured
>>>>>>
>>>>>> This means the host was not deployed correctly. When deploying vdsm,
>>>>>> host deploy must run "vdsm-tool configure --force", which configures
>>>>>> multipath and sanlock.
>>>>>>
>>>>>> We did not change anything in the multipath and sanlock configurators
>>>>>> lately.
>>>>>>
>>>>>> Didi, can you check this?
>>>>>
>>>>> --
>>>>> Anton Marchukov
>>>>> Team Lead - Release Management - RHV DevOps - Red Hat
>>>>
>>>> --
>>>> Didi
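For anyone hitting this later: the vdsmd_init_common.sh messages quoted
above boil down to a pre-start check that is roughly equivalent to the
following (a Python sketch of what the shell script does around
`vdsm-tool is-configured`; not the actual script):

    # Sketch: vdsmd refuses to start until every module is configured;
    # the operator is pointed at `vdsm-tool configure [--force]`.
    import subprocess
    import sys

    def ensure_configured():
        rc = subprocess.call(['vdsm-tool', 'is-configured'])
        if rc != 0:
            sys.stderr.write(
                "One of the modules is not configured to work with "
                "VDSM.\nTo configure the module use 'vdsm-tool configure "
                "[--module module-name]'.\nIf all modules are not "
                "configured try to use: 'vdsm-tool configure --force'\n")
        return rc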
_______________________________________________
Devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/devel
