What happens if you run "/usr/bin/vdsm-tool restore-nets" manually?
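
In case it helps, here is a minimal sketch of how that manual run could be bounded so it cannot hang your shell the way the unit start does. The helper name run_bounded, the 120-second limit and the /tmp log path are my own assumptions for illustration; only the vdsm-tool command itself comes from this thread. It may also be worth checking first that the stuck unit is not still running the same command.

```shell
#!/bin/sh
# Hypothetical helper (not part of vdsm): run a command under a time limit
# and keep its combined stdout/stderr in a log file.
# Usage: run_bounded SECONDS LOGFILE COMMAND [ARGS...]
run_bounded() {
    rb_limit=$1
    rb_log=$2
    shift 2
    # timeout(1) from GNU coreutils exits with 124 when the limit is reached.
    timeout "$rb_limit" "$@" >"$rb_log" 2>&1
    rb_rc=$?
    if [ "$rb_rc" -eq 124 ]; then
        echo "timed out after ${rb_limit}s: $*"
    else
        echo "exit status $rb_rc: $*"
    fi
    return "$rb_rc"
}

# Example invocation on the affected host (as root); limit and path are
# arbitrary choices:
#   run_bounded 120 /tmp/restore-nets.out /usr/bin/vdsm-tool restore-nets
#   cat /tmp/restore-nets.out
```

If it times out, the captured output should at least show how far the restore got before it stopped making progress.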

On 19 July 2017 at 16:22, Anthony.Fillmore <anthony.fillm...@target.com>
wrote:

> All services are active and running except vdsm-network.service, whose last
> entry is “activating”:
>
>
>
> [root@t0894bmh1001 vdsm.conf.d]# systemctl status -l vdsm-network.service -l
> ● vdsm-network.service - Virtual Desktop Server Manager network restoration
>    Loaded: loaded (/usr/lib/systemd/system/vdsm-network.service; enabled; vendor preset: enabled)
>    Active: activating (start) since Tue 2017-07-18 10:42:57 CDT; 23h ago
>   Process: 8216 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append --logfile=/var/log/vdsm/upgrade.log upgrade-unified-persistence (code=exited, status=0/SUCCESS)
>  Main PID: 8231 (vdsm-tool)
>    CGroup: /system.slice/vdsm-network.service
>            ├─8231 /usr/bin/python /usr/bin/vdsm-tool restore-nets
>            └─8240 /usr/bin/python /usr/share/vdsm/vdsm-restore-net-config
>
> *From:* Alan Griffiths [mailto:apgriffith...@gmail.com]
> *Sent:* Wednesday, July 19, 2017 10:13 AM
>
> *To:* Anthony.Fillmore <anthony.fillm...@target.com>
> *Cc:* Pavel Gashev <p...@acronis.com>; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after
> Network Outage
>
>
>
> Looking at vdsmd.service on one of my 4.0 hosts.
>
>
>
> Requires=multipathd.service libvirtd.service time-sync.target \
>          iscsid.service rpcbind.service supervdsmd.service sanlock.service \
>          vdsm-network.service
>
>
>
> Are all these services present and running?
>
>
>
>
>
> On 19 July 2017 at 16:05, Anthony.Fillmore <anthony.fillm...@target.com>
> wrote:
>
> Are the vdsm.conf or mom.conf files in /etc/vdsm relevant in this
> situation?
>
>
>
> *From:* Anthony.Fillmore
> *Sent:* Wednesday, July 19, 2017 9:57 AM
> *To:* 'Alan Griffiths' <apgriffith...@gmail.com>
> *Cc:* Pavel Gashev <p...@acronis.com>; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* RE: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after
> Network Outage
>
>
>
> [boxname ~]# systemctl | grep -i dead
>
> mom-vdsm.service   start MOM instance configured for VDSM purposes
> vdsmd.service      start Virtual Desktop Server Manager
>
>
>
>
>
> [boxname ~]# systemctl | grep -i exited
>
> blk-availability.service            Availability of block devices
> iptables.service                    IPv4 firewall with iptables
> kdump.service                       Crash recovery kernel arming
> kmod-static-nodes.service           Create list of required static device nodes for the current kernel
> lvm2-monitor.service                Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
> lvm2-pvscan@253:3.service           LVM2 PV scan on device 253:3
> lvm2-pvscan@253:4.service           LVM2 PV scan on device 253:4
> lvm2-pvscan@8:3.service             LVM2 PV scan on device 8:3
> network.service                     LSB: Bring up/down networking
> openvswitch-nonetwork.service       Open vSwitch Internal Unit
> openvswitch.service                 Open vSwitch
> rhel-dmesg.service                  Dump dmesg to /var/log/dmesg
> rhel-import-state.service           Import network configuration from initramfs
> rhel-readonly.service               Configure read-only root support
> systemd-journal-flush.service       Flush Journal to Persistent Storage
> systemd-modules-load.service        Load Kernel Modules
> systemd-random-seed.service         Load/Save Random Seed
> systemd-readahead-collect.service   Collect Read-Ahead Data
> systemd-readahead-replay.service    Replay Read-Ahead Data
> systemd-remount-fs.service          Remount Root and Kernel File Systems
> systemd-sysctl.service              Apply Kernel Variables
> systemd-tmpfiles-setup-dev.service  Create Static Device Nodes in /dev
> systemd-tmpfiles-setup.service      Create Volatile Files and Directories
> systemd-udev-trigger.service        udev Coldplug all Devices
> systemd-update-utmp.service         Update UTMP about System Boot/Shutdown
> systemd-user-sessions.service       Permit User Sessions
> systemd-vconsole-setup.service      Setup Virtual Console
> vdsm-network-init.service           Virtual Desktop Server Manager network IP+link restoration
>
>
>
> *From:* Alan Griffiths [mailto:apgriffith...@gmail.com]
> *Sent:* Wednesday, July 19, 2017 9:47 AM
>
>
> *To:* Anthony.Fillmore <anthony.fillm...@target.com>
> *Cc:* Pavel Gashev <p...@acronis.com>; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after
> Network Outage
>
>
>
> Are there other failed services?
>
>
>
> systemctl --state=failed
>
>
>
> On 19 July 2017 at 15:40, Anthony.Fillmore <anthony.fillm...@target.com>
> wrote:
>
> Hey Alan,
>
>
>
> Rpcbind is running on my box, so it looks like there is no issue there.  Any
> other ideas on what could be keeping vdsmd dead?  I even uninstalled all
> oVirt-related components from the host and attempted a reinstall of the host
> through oVirt (just short of fully removing the host from oVirt and re-adding
> it, which I want to avoid), and the reinstall ends up timing out when it
> attempts to start VDSM (checking the logs, I can see the service is dead when
> it gets there).
>
>
>
> Thanks,
>
> Tony
>
>
>
> *From:* Alan Griffiths [mailto:apgriffith...@gmail.com]
> *Sent:* Wednesday, July 19, 2017 4:14 AM
> *To:* Anthony.Fillmore <anthony.fillm...@target.com>
> *Cc:* Pavel Gashev <p...@acronis.com>; users@ovirt.org; Brandon.Markgraf <
> brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* Re: [ovirt-users] [EXTERNAL] Re: Host stuck unresponsive after
> Network Outage
>
>
>
> Is rpcbind running? This is a dependency for vdsmd.
>
>
>
> I've seen issues where rpcbind will not start on boot if IPv6 is disabled.
> The solution for me was to rebuild the initramfs with "dracut -f".
>
>
>
> On 18 July 2017 at 18:13, Anthony.Fillmore <anthony.fillm...@target.com>
> wrote:
>
> [boxname ~]# systemctl status -l vdsm-network
> ● vdsm-network.service - Virtual Desktop Server Manager network restoration
>    Loaded: loaded (/usr/lib/systemd/system/vdsm-network.service; enabled; vendor preset: enabled)
>    Active: activating (start) since Tue 2017-07-18 10:42:57 CDT; 1h 29min ago
>   Process: 8216 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append --logfile=/var/log/vdsm/upgrade.log upgrade-unified-persistence (code=exited, status=0/SUCCESS)
>  Main PID: 8231 (vdsm-tool)
>    CGroup: /system.slice/vdsm-network.service
>            ├─8231 /usr/bin/python /usr/bin/vdsm-tool restore-nets
>            └─8240 /usr/bin/python /usr/share/vdsm/vdsm-restore-net-config
>
> Jul 18 10:42:57 t0894bmh1001.stores.target.com systemd[1]: Starting Virtual Desktop Server Manager network restoration...
>
>
>
> Thanks,
>
> Tony
>
> *From:* Pavel Gashev [mailto:p...@acronis.com]
> *Sent:* Tuesday, July 18, 2017 11:17 AM
> *To:* Anthony.Fillmore <anthony.fillm...@target.com>; users@ovirt.org
> *Cc:* Brandon.Markgraf <brandon.markg...@target.com>; Sandeep.Mendiratta <
> sandeep.mendira...@target.com>
> *Subject:* [EXTERNAL] Re: [ovirt-users] Host stuck unresponsive after
> Network Outage
>
>
>
> Anthony,
>
>
>
> Output of “systemctl status -l vdsm-network” would help.
>
>
>
>
>
> *From: *<users-boun...@ovirt.org> on behalf of "Anthony.Fillmore" <
> anthony.fillm...@target.com>
> *Date: *Tuesday, 18 July 2017 at 18:13
> *To: *"users@ovirt.org" <users@ovirt.org>
> *Cc: *"Brandon.Markgraf" <brandon.markg...@target.com>,
> "Sandeep.Mendiratta" <sandeep.mendira...@target.com>
> *Subject: *[ovirt-users] Host stuck unresponsive after Network Outage
>
>
>
> Hey Ovirt Users and Team,
>
>
>
> I have a host that I am unable to recover after a network outage.  The host
> is stuck in unresponsive mode, even though it is on the network, reachable
> over SSH and seems to be healthy.  I’ve tried several things to recover the
> host in oVirt, but have had no success so far.  I’d like to reach out to
> the community before blowing away and rebuilding the host.
>
>
>
> *Environment*: I have an Ovengine server with about 26 Datacenters, with
> 2 to 3 hosts per Datacenter.  My Ovengine server is hosted centrally, with
> my hosts being bare-metal and distributed throughout my environment.
>   Ovengine is version 4.0.6.
>
>
>
> *What I’ve tried: *put the host into maintenance mode and rebooted it.
> Confirmed the host was rebooted and tried to activate it; it goes back to
> unresponsive.   Attempted a reinstall, which fails.
>
>
>
> *Checking from the host perspective, I can see the following problems: *
>
>
>
> [boxname~]# systemctl status vdsmd
> ● vdsmd.service - Virtual Desktop Server Manager
>    Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
>    Active: inactive (dead)
>
> Jul 14 12:34:28 boxname systemd[1]: Dependency failed for Virtual Desktop Server Manager.
> Jul 14 12:34:28 boxname systemd[1]: Job vdsmd.service/start failed with result 'dependency'.
>
>
>
> *Going a bit deeper, the results of journalctl -xe: *
>
>
>
> [root@boxname ~]# journalctl -xe
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit libvirtd.service has begun shutting down.
> Jul 18 09:07:31 boxname systemd[1]: Stopped Virtualization daemon.
> -- Subject: Unit libvirtd.service has finished shutting down
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit libvirtd.service has finished shutting down.
> Jul 18 09:07:31 boxname systemd[1]: Reloading.
> Jul 18 09:07:31 boxname systemd[1]: Binding to IPv6 address not available since kernel does not support IPv6.
> Jul 18 09:07:31 boxname systemd[1]: [/usr/lib/systemd/system/rpcbind.socket:6] Failed to parse address value, ignoring: [::
> Jul 18 09:07:31 boxname systemd[1]: Started Auxiliary vdsm service for running helper functions as root.
> -- Subject: Unit supervdsmd.service has finished start-up
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit supervdsmd.service has finished starting up.
> --
> -- The start-up result is done.
> Jul 18 09:07:31 boxname systemd[1]: Starting Auxiliary vdsm service for running helper functions as root...
> -- Subject: Unit supervdsmd.service has begun start-up
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit supervdsmd.service has begun starting up.
> Jul 18 09:07:31 boxname systemd[1]: Starting Virtualization daemon...
> -- Subject: Unit libvirtd.service has begun start-up
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit libvirtd.service has begun starting up.
> Jul 18 09:07:32 boxname systemd[1]: Started Virtualization daemon.
> -- Subject: Unit libvirtd.service has finished start-up
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit libvirtd.service has finished starting up.
> --
> -- The start-up result is done.
> Jul 18 09:07:32 boxname systemd[1]: Starting Virtual Desktop Server Manager network restoration...
> -- Subject: Unit vdsm-network.service has begun start-up
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit vdsm-network.service has begun starting up.
> lines 2751-2797/2797 (END)
>
>
>
> Does the community have suggestions on what can be done next to recover
> this host within oVirt?  I can provide additional log dumps as needed;
> please let me know what you need in order to assist further.
>
>
>
> Thank you,
>
> Tony
>
>
>
>
> _______________________________________________
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
>
>
>
>