On Tue, Aug 22, 2017 at 1:48 PM Dan Kenigsberg <[email protected]> wrote:
> This seems to be my fault, https://gerrit.ovirt.org/80908 should fix it.

This fixes the actual error, but we still have bad logging. Piotr, can you fix
error handling so we get something like:

    Error configuring "foobar": actual error...

(rough sketch of what I mean at the end of this mail)

> On Tue, Aug 22, 2017 at 1:14 PM, Nir Soffer <[email protected]> wrote:
> >
> > On Tue, Aug 22, 2017, 12:57 Yedidyah Bar David <[email protected]> wrote:
> >>
> >> On Tue, Aug 22, 2017 at 12:52 PM, Anton Marchukov <[email protected]> wrote:
> >> > Hello All.
> >> >
> >> > Any news on this? I see the latest failures for vdsm are the same [1]
> >> > and the job is still not working for it.
> >> >
> >> > [1]
> >> > http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/1901/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20170822035135-lago-basic-suite-master-host0-1f46d892.log
> >>
> >> This log has:
> >>
> >> 2017-08-22 03:51:28,272-0400 DEBUG otopi.context context._executeMethod:128 Stage closeup METHOD otopi.plugins.ovirt_host_deploy.vdsm.packages.Plugin._reconfigure
> >> 2017-08-22 03:51:28,272-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:813 execute: ('/bin/vdsm-tool', 'configure', '--force'), executable='None', cwd='None', env=None
> >> 2017-08-22 03:51:30,687-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.executeRaw:863 execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
> >> 2017-08-22 03:51:30,688-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
> >>
> >> Checking configuration status...
> >>
> >> abrt is not configured for vdsm
> >> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
> >> lvm requires configuration
> >> libvirt is not configured for vdsm yet
> >> FAILED: conflicting vdsm and libvirt-qemu tls configuration.
> >> vdsm.conf with ssl=True requires the following changes:
> >> libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
> >> qemu.conf: spice_tls=1.
> >> multipath requires configuration
> >>
> >> Running configure...
> >> Reconfiguration of abrt is done.
> >> Reconfiguration of passwd is done.
> >> WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
> >> Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.201708220351
> >> Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf
> >> Units need configuration: {'lvm2-lvmetad.service': {'LoadState': 'loaded', 'ActiveState': 'active'}, 'lvm2-lvmetad.socket': {'LoadState': 'loaded', 'ActiveState': 'active'}}
> >> Reconfiguration of lvm is done.
> >> Reconfiguration of sebool is done.
> >>
> >> 2017-08-22 03:51:30,688-0400 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:926 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
> >> Error: ServiceNotExistError: Tried all alternatives but failed:
> >> ServiceNotExistError: dev-hugepages1G.mount is not native systemctl service
> >> ServiceNotExistError: dev-hugepages1G.mount is not a SysV service
> >>
> >> 2017-08-22 03:51:30,689-0400 WARNING otopi.plugins.ovirt_host_deploy.vdsm.packages packages._reconfigure:155 Cannot configure vdsm
> >>
> >> Nir, any idea?
> >
> > Looks like some configurator has failed after sebool, but we don't have a
> > proper error message with the name of the configurator.
> >
> > Piotr, can you take a look?
> >
> >> >
> >> > On Sun, Aug 20, 2017 at 12:39 PM, Nir Soffer <[email protected]> wrote:
> >> >>
> >> >> On Sun, Aug 20, 2017 at 11:08 AM Dan Kenigsberg <[email protected]> wrote:
> >> >>>
> >> >>> On Sun, Aug 20, 2017 at 10:39 AM, Yaniv Kaul <[email protected]> wrote:
> >> >>> >
> >> >>> > On Sun, Aug 20, 2017 at 8:48 AM, Daniel Belenky <[email protected]> wrote:
> >> >>> >>
> >> >>> >> Failed test: basic_suite_master/002_bootstrap
> >> >>> >> Version: oVirt Master
> >> >>> >> Link to failed job: ovirt-master_change-queue-tester/1860/
> >> >>> >> Link to logs (Jenkins): test logs
> >> >>> >> Suspected patch: https://gerrit.ovirt.org/#/c/80749/3
> >> >>> >>
> >> >>> >> From what I was able to find, it seems that for some reason VDSM failed to
> >> >>> >> start on host 1. The VDSM log is empty, and the only error I could find in
> >> >>> >> supervdsm.log is that the start of LLDP failed (not sure if it's related).
> >> >>> >
> >> >>> > Can you check the networking on the hosts? Something's very strange there.
> >> >>> > For example:
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 NetworkManager[685]: <info> [1503175122.2682] manager: (e7NZWeNDXwIjQia): new Bond device (/org/freedesktop/NetworkManager/Devices/17)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to layer2+3 (2)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap2+3 (3)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting xmit hash policy to encap3+4 (4)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option xmit_hash_policy: invalid value (5)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to always (0)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to better (1)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting primary_reselect to failure (2)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option primary_reselect: invalid value (3)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to any (0)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: Setting arp_all_targets to all (1)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: e7NZWeNDXwIjQia: option arp_all_targets: invalid value (2)
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 kernel: bonding: e7NZWeNDXwIjQia is being deleted...
> >> >>> > Aug 19 16:38:42 lago-basic-suite-master-host0 lldpad: recvfrom(Event interface): No buffer space available
> >> >>> >
> >> >>> > Y.
> >> >>>
> >> >>> The post-boot noise with funny-looking bonds is due to our calling of
> >> >>> `vdsm-tool dump-bonding-options` every boot, in order to find the
> >> >>> bonding defaults for the current kernel.
> >> >>>
> >> >>> >>
> >> >>> >> From host-deploy log:
> >> >>> >>
> >> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd systemd.state:130 starting service vdsmd
> >> >>> >> 2017-08-19 16:38:41,476-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'start', 'vdsmd.service'), executable='None', cwd='None', env=None
> >> >>> >> 2017-08-19 16:38:44,628-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start', 'vdsmd.service'), rc=1
> >> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stdout:
> >> >>> >>
> >> >>> >>
> >> >>> >> 2017-08-19 16:38:44,630-0400 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stderr:
> >> >>> >> Job for vdsmd.service failed because the control process exited with error code. See "systemctl status vdsmd.service" and "journalctl -xe" for details.
> >> >>> >>
> >> >>> >> 2017-08-19 16:38:44,631-0400 DEBUG otopi.context context._executeMethod:142 method exception
> >> >>> >> Traceback (most recent call last):
> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/pythonlib/otopi/context.py", line 132, in _executeMethod
> >> >>> >>     method['method']()
> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 224, in _start
> >> >>> >>     self.services.state('vdsmd', True)
> >> >>> >>   File "/tmp/ovirt-dunwHj8Njn/otopi-plugins/otopi/services/systemd.py", line 141, in state
> >> >>> >>     service=name,
> >> >>> >> RuntimeError: Failed to start service 'vdsmd'
> >> >>> >>
> >> >>> >>
> >> >>> >> From /var/log/messages:
> >> >>> >>
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Error:
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: One of the modules is not configured to work with VDSM.
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: To configure the module use the following:
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure [--module module-name]'.
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: If all modules are not configured try to use:
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: 'vdsm-tool configure --force'
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: (The force flag will stop the module's service and start it
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: afterwards automatically to load the new configuration.)
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: abrt is already configured for vdsm
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: lvm is configured for vdsm
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: libvirt is already configured for vdsm
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: multipath requires configuration
> >> >>> >> Aug 19 16:38:44 lago-basic-suite-master-host0 vdsmd_init_common.sh: Modules sanlock, multipath are not configured
> >> >>
> >> >> This means the host was not deployed correctly. When deploying vdsm,
> >> >> host deploy must run "vdsm-tool configure --force", which configures
> >> >> multipath and sanlock.
> >> >>
> >> >> We did not change anything in the multipath and sanlock configurators
> >> >> lately.
> >> >>
> >> >> Didi, can you check this?
> >> >>
> >> >> _______________________________________________
> >> >> Devel mailing list
> >> >> [email protected]
> >> >> http://lists.ovirt.org/mailman/listinfo/devel
> >> >
> >> > --
> >> > Anton Marchukov
> >> > Team Lead - Release Management - RHV DevOps - Red Hat
> >>
> >> --
> >> Didi
> >
> > _______________________________________________
> > Devel mailing list
> > [email protected]
> > http://lists.ovirt.org/mailman/listinfo/devel
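
A rough sketch of the error wrapping I have in mind (for illustration only:
the `configurators` list and its `name`/`configure()` attributes are made up
here, this is not vdsm's actual interface):

    class ConfiguratorError(Exception):
        """One or more configurators failed; the message carries their names."""


    def configure_all(configurators, force=False):
        """Run every configurator, keeping each failure with its module name."""
        errors = []
        for c in configurators:
            try:
                c.configure(force=force)
            except Exception as e:
                # Keep the configurator name next to the underlying error, e.g.
                #   Error configuring "multipath": ...
                errors.append('Error configuring "%s": %s' % (c.name, e))
        if errors:
            raise ConfiguratorError("\n".join(errors))

That way the traceback in host-deploy would tell us which module broke
instead of a bare "Cannot configure vdsm".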
_______________________________________________
Devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/devel
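
Aside on Dan's note above about `vdsm-tool dump-bonding-options`: the kernel
log shows a randomly named bond being created and deleted on every boot, which
is consistent with probing the current kernel's bonding defaults by creating a
throwaway bond and reading its options back from sysfs. A minimal standalone
sketch of that idea (assuming only the standard sysfs bonding interface, not
vdsm's actual code; needs root and the bonding module loaded):

    import os
    import random
    import string

    BONDING_MASTERS = "/sys/class/net/bonding_masters"


    def dump_default_bonding_options():
        """Create a temporary bond, read its default options from sysfs, remove it."""
        # Random name, much like the e7NZWeNDXwIjQia seen in the kernel log.
        name = "".join(random.choice(string.ascii_letters) for _ in range(15))
        with open(BONDING_MASTERS, "w") as f:
            f.write("+%s" % name)  # the kernel creates the bond device
        try:
            opts_dir = "/sys/class/net/%s/bonding" % name
            defaults = {}
            for opt in os.listdir(opts_dir):
                with open(os.path.join(opts_dir, opt)) as f:
                    defaults[opt] = f.read().strip()
            return defaults
        finally:
            with open(BONDING_MASTERS, "w") as f:
                f.write("-%s" % name)  # and deletes it again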
