On Fri, Dec 23, 2016 at 6:20 PM, Barak Korren <[email protected]> wrote:
> On 22 December 2016 at 21:56, Nir Soffer <[email protected]> wrote:
>> On Thu, Dec 22, 2016 at 9:12 PM, Fred Rolland <[email protected]> wrote:
>>> SuperVdsm fails to start:
>>>
>>> MainThread::ERROR::2016-12-22 12:42:08,699::supervdsmServer::317::SuperVdsm.Server::(main) Could not start Super Vdsm
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/supervdsmServer", line 297, in main
>>>     server = manager.get_server()
>>>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 493, in get_server
>>>     self._authkey, self._serializer)
>>>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 162, in __init__
>>>     self.listener = Listener(address=address, backlog=16)
>>>   File "/usr/lib64/python2.7/multiprocessing/connection.py", line 136, in __init__
>>>     self._listener = SocketListener(address, family, backlog)
>>>   File "/usr/lib64/python2.7/multiprocessing/connection.py", line 260, in __init__
>>>     self._socket.bind(address)
>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>     return getattr(self._sock,name)(*args)
>>> error: [Errno 2] No such file or directory
>>>
>>>
>>> On Thu, Dec 22, 2016 at 7:54 PM, Barak Korren <[email protected]> wrote:
>>>>
>>>> It's hard to tell when this started, because we had so many package
>>>> issues that made the tests fail before reaching that point for most
>>>> of the day.
>>>>
>>>> Since we currently have an issue in Lago with collecting AddHost logs
>>>> (hopefully we'll resolve this in the next release early next week),
>>>> I've run the tests locally and attached the bundle of generated logs
>>>> to this message.
>>>>
>>>> Included in the attached file are engine logs, host-deploy logs and
>>>> VDSM logs for both test hosts.
>>>>
>>>> From a quick look inside, it seems the issue is with VDSM failing to start.
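The bottom of that traceback is a unix-domain socket bind failing with ENOENT, which happens when the directory that should hold the socket file does not exist. A minimal sketch reproducing that failure mode on Linux (the socket path below is hypothetical, chosen only for illustration):

```python
import errno
import socket

def bind_errno(path):
    """Try to bind a unix-domain socket at `path`; return the errno on
    failure, or None if the bind succeeded."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as e:
        return e.errno
    finally:
        sock.close()
    return None

# Binding under a directory that does not exist fails exactly like the
# supervdsm traceback: [Errno 2] No such file or directory.
print(bind_errno("/no/such/dir/svdsm.sock") == errno.ENOENT)  # True
```

So a likely reading is that whatever directory supervdsm's manager socket lives in was missing when the service started.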
>>
>> From host-deploy/ovirt-host-deploy-20161222124209-192.168.203.4-604a4799.log:
>>
>> 2016-12-22 12:42:05 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.executeRaw:813 execute: ('/bin/systemctl', 'start',
>> 'vdsmd.service'), executable='None', cwd='None', env=None
>> 2016-12-22 12:42:09 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'start',
>> 'vdsmd.service'), rc=1
>> 2016-12-22 12:42:09 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.execute:921 execute-output: ('/bin/systemctl', 'start',
>> 'vdsmd.service') stdout:
>>
>> 2016-12-22 12:42:09 DEBUG otopi.plugins.otopi.services.systemd
>> plugin.execute:926 execute-output: ('/bin/systemctl', 'start',
>> 'vdsmd.service') stderr:
>> A dependency job for vdsmd.service failed. See 'journalctl -xe' for details.
>>
>> This means that one of the services vdsm depends on could not start.
>>
>> 2016-12-22 12:42:09 DEBUG otopi.context context._executeMethod:142 method exception
>> Traceback (most recent call last):
>>   File "/tmp/ovirt-bUCuRxXXzU/pythonlib/otopi/context.py", line 132, in _executeMethod
>>     method['method']()
>>   File "/tmp/ovirt-bUCuRxXXzU/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 209, in _start
>>     self.services.state('vdsmd', True)
>>   File "/tmp/ovirt-bUCuRxXXzU/otopi-plugins/otopi/services/systemd.py", line 141, in state
>>     service=name,
>> RuntimeError: Failed to start service 'vdsmd'
>>
>> This error is not very useful to anyone. What we need in the otopi log
>> is the output of 'journalctl -xe' (as suggested by systemctl).
>>
>> Didi, can we collect this info when starting a service fails?
>>
>> Barak, can you log in to the host with this error and collect the output?
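For what Nir is asking, here is a minimal sketch of how a deploy tool could capture the journal context when starting a unit fails. The helper name and the direct use of `subprocess` are my assumptions for illustration; this is not the actual otopi API (the `run` parameter is injectable so the behaviour can be tested without systemd):

```python
import subprocess

def collect_failure_context(unit, run=subprocess.run):
    """Try to start a systemd unit. On failure, return an error message
    that includes recent journal output for the unit, i.e. the context
    that 'journalctl -xe' points at. Returns None on success.

    Hypothetical helper, not otopi code; `run` defaults to
    subprocess.run and can be replaced with a stub in tests."""
    result = run(["systemctl", "start", unit],
                 capture_output=True, text=True)
    if result.returncode == 0:
        return None  # started fine, nothing to report
    # Grab the last journal entries for the unit so the real reason
    # (e.g. a failed dependency) lands in the deploy log.
    journal = run(["journalctl", "--no-pager", "-u", unit, "-n", "50"],
                  capture_output=True, text=True)
    return "Failed to start {}:\n{}\n{}".format(
        unit, result.stderr.strip(), journal.stdout)
```

With something like this in place, the host-deploy log would show the dependency failure itself instead of just "Failed to start service 'vdsmd'".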
>>
> By the time I logged in to the host, all IP addresses were gone (I'm
> guessing the setup process killed dhclient), so I'm having to work via
> the serial console.
>
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 54:52:c0:a8:cb:02 brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5652:c0ff:fea8:cb02/64 scope link
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 54:52:c0:a8:cc:02 brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5652:c0ff:fea8:cc02/64 scope link
>        valid_lft forever preferred_lft forever
> 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 54:52:c0:a8:cc:03 brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5652:c0ff:fea8:cc03/64 scope link
>        valid_lft forever preferred_lft forever
> 5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 54:52:c0:a8:ca:02 brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5652:c0ff:fea8:ca02/64 scope link
>        valid_lft forever preferred_lft forever
>
>
> Here is the interesting stuff I can gather from journalctl:
>
> Dec 22 12:42:06 lago-basic-suite-master-host0 ovirt-imageio-daemon[5007]: Traceback (most recent call last):
> Dec 22 12:42:06 lago-basic-suite-master-host0 ovirt-imageio-daemon[5007]:   File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Thanks, Barak. My guess remains that Bug 1400003 (imageio fails during system startup) is the culprit.

_______________________________________________
Devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/devel
