Re: [ovirt-users] Hosted engine install failed; vdsm upset about broker
> On Apr 21, 2017, at 6:38 AM, knarra wrote: > > On 04/21/2017 06:34 PM, Jamie Lawrence wrote: >>> On Apr 20, 2017, at 10:36 PM, knarra wrote: The installer claimed it did, but I believe it didn’t. Below the error from my original email, there’s the below (apologies for not including it earlier; I missed it). Note: 04ff4cf1-135a-4918-9a1f-8023322f89a3 is the HE - I’m pretty sure it is complaining about itself. (In any case, I verified that there are no other VMs running with both virsh and vdsClient.) >> ^^^ >> 2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:128 Stage late_setup METHOD otopi.plugins.gr_he_setup.vm.runvm.Plugin._late_setup 2017-04-19 12:27:02 DEBUG otopi.plugins.gr_he_setup.vm.runvm runvm._late_setup:83 {'status': {'message': 'Done', 'code': 0}, 'items': [u'04ff4cf1-135a-4918-9a1f-8023322f89a3']} 2017-04-19 12:27:02 ERROR otopi.plugins.gr_he_setup.vm.runvm runvm._late_setup:91 The following VMs have been found: 04ff4cf1-135a-4918-9a1f-8023322f89a3 2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:142 method exception Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod method['method']() File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-setup/vm/runvm.py", line 95, in _late_setup _('Cannot setup Hosted Engine with other VMs running') RuntimeError: Cannot setup Hosted Engine with other VMs running 2017-04-19 12:27:02 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Environment setup': Cannot setup Hosted Engine with other VMs running 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:760 ENVIRONMENT DUMP - BEGIN 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV BASE/error=bool:'True' 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV BASE/exceptionInfo=list:'[(, RuntimeError('Cannot setup Hosted Engine with other VMs running',), )]' 2017-04-19 12:27:02 DEBUG 
otopi.context context.dumpEnvironment:774 ENVIRONMENT DUMP - END >>> James, generally this issue happens when the setup failed once and you >>> tried re running it again. Can you clean it and deploy it again? HE >>> should come up successfully. Below are the steps for cleaning it up. >> Knarra, >> >> I realize that. However, that is not the situation in my case. See above, at >> the mark - the UUID it is complaining about is the UUID of the hosted-engine >> it just installed. From the answers file generated from the run (whole thing >> below): >> >> OVEHOSTED_VM/vmUUID=str:04ff4cf1-135a-4918-9a1f-8023322f89a3 >> Also see the WARNs I mentioned previously, quoted below. Excerpt: >> >> Apr 19 12:29:20 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm root WARN >> File: >> /var/lib/libvirt/qemu/channels/04ff4cf1-135a-4918-9a1f-8023322f89a3.com.redhat.rhevm.vdsm >> already removed >> Apr 19 12:29:20 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm root WARN >> File: >> /var/lib/libvirt/qemu/channels/04ff4cf1-135a-4918-9a1f-8023322f89a3.org.qemu.guest_agent.0 >> already removed >> Apr 19 12:29:30 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm >> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect >> to broker, the number of errors has exceeded the limit (1) >> I’m not clear on what it is attempting to do there, but it seems relevant. > I remember that you said HE vm was not started when the installation was > successful. Is Local Maintenance enabled on that host? > > can you please check if the services 'ovirt-ha-agent' and 'ovirt-ha-broker' > running fine and try to restart them once ? Agent and broker logs from before are down in the original message quoting. They’re running, but not fine. 
[root@sc5-ovirt-2 jlawrence]# ps ax|grep ha-
130599 ?        Ssl    3:52 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
132869 ?        Ss     0:13 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
133501 pts/0    S+     0:00 grep --color=auto ha-
[root@sc5-ovirt-2 jlawrence]# systemctl restart ovirt-ha-agent ovirt-ha-broker
[root@sc5-ovirt-2 jlawrence]# tail -40 /var/log/ovirt-hosted-engine-ha/broker.log
Thread-46::INFO::2017-04-21 10:52:57,058::storage_backends::119::ovirt_hosted_engine_ha.lib.storage_backends::(_check_symlinks) Cleaning up stale LV link '/rhev/data-center/mnt/glusterSD/sc5-gluster-1.squaretrade.com:_ovirt__engine/a1155699-0bcf-44c5-aa55-a574ca3ad313/ha_agent/hosted-engine.metadata'
Thread-53::INFO::2017-04-21 10:52:57,070::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
Thread-50::INFO::2017-04-21 10:52:57,118::mem_free::
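The point made earlier in this message, that the UUID the installer complains about is the hosted engine's own, can be checked mechanically. Below is a minimal Python sketch, assuming the setup log and the answers file have been read into strings. The function names are hypothetical; the two patterns are taken from the log line and the answers-file line quoted in this thread, and other oVirt versions may word them differently.

```python
import re

# Lowercase UUID as it appears in the quoted log and answers file.
UUID_RE = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"


def found_vm_uuids(setup_log_text):
    """Collect UUIDs listed after 'The following VMs have been found:'."""
    uuids = set()
    pattern = r"The following VMs have been found:\s*(.*)"
    for match in re.finditer(pattern, setup_log_text):
        uuids.update(re.findall(UUID_RE, match.group(1)))
    return uuids


def answer_file_vm_uuid(answers_text):
    """Return the OVEHOSTED_VM/vmUUID value, or None if it is absent."""
    pattern = r"^OVEHOSTED_VM/vmUUID=str:(" + UUID_RE + r")\s*$"
    match = re.search(pattern, answers_text, re.MULTILINE)
    return match.group(1) if match else None


def only_self_detected(setup_log_text, answers_text):
    """True when the only VM the installer found is the hosted engine itself."""
    he_uuid = answer_file_vm_uuid(answers_text)
    return he_uuid is not None and found_vm_uuids(setup_log_text) == {he_uuid}
```

If `only_self_detected` returns True, the "other VMs running" error is about the engine VM from the same run rather than a leftover guest, which is the situation described above.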
Re: [ovirt-users] Hosted engine install failed; vdsm upset about broker
On 04/21/2017 06:34 PM, Jamie Lawrence wrote: On Apr 20, 2017, at 10:36 PM, knarra wrote: The installer claimed it did, but I believe it didn’t. Below the error from my original email, there’s the below (apologies for not including it earlier; I missed it). Note: 04ff4cf1-135a-4918-9a1f-8023322f89a3 is the HE - I’m pretty sure it is complaining about itself. (In any case, I verified that there are no other VMs running with both virsh and vdsClient.) ^^^ 2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:128 Stage late_setup METHOD otopi.plugins.gr_he_setup.vm.runvm.Plugin._late_setup 2017-04-19 12:27:02 DEBUG otopi.plugins.gr_he_setup.vm.runvm runvm._late_setup:83 {'status': {'message': 'Done', 'code': 0}, 'items': [u'04ff4cf1-135a-4918-9a1f-8023322f89a3']} 2017-04-19 12:27:02 ERROR otopi.plugins.gr_he_setup.vm.runvm runvm._late_setup:91 The following VMs have been found: 04ff4cf1-135a-4918-9a1f-8023322f89a3 2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:142 method exception Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod method['method']() File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-setup/vm/runvm.py", line 95, in _late_setup _('Cannot setup Hosted Engine with other VMs running') RuntimeError: Cannot setup Hosted Engine with other VMs running 2017-04-19 12:27:02 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Environment setup': Cannot setup Hosted Engine with other VMs running 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:760 ENVIRONMENT DUMP - BEGIN 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV BASE/error=bool:'True' 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV BASE/exceptionInfo=list:'[(, RuntimeError('Cannot setup Hosted Engine with other VMs running',), )]' 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:774 ENVIRONMENT DUMP - 
END James, generally this issue happens when the setup failed once and you tried re running it again. Can you clean it and deploy it again? HE should come up successfully. Below are the steps for cleaning it up. Knarra, I realize that. However, that is not the situation in my case. See above, at the mark - the UUID it is complaining about is the UUID of the hosted-engine it just installed. From the answers file generated from the run (whole thing below): OVEHOSTED_VM/vmUUID=str:04ff4cf1-135a-4918-9a1f-8023322f89a3 Also see the WARNs I mentioned previously, quoted below. Excerpt: Apr 19 12:29:20 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm root WARN File: /var/lib/libvirt/qemu/channels/04ff4cf1-135a-4918-9a1f-8023322f89a3.com.redhat.rhevm.vdsm already removed Apr 19 12:29:20 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm root WARN File: /var/lib/libvirt/qemu/channels/04ff4cf1-135a-4918-9a1f-8023322f89a3.org.qemu.guest_agent.0 already removed Apr 19 12:29:30 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (1) I’m not clear on what it is attempting to do there, but it seems relevant. I remember that you said HE vm was not started when the installation was successful. Is Local Maintenance enabled on that host? can you please check if the services 'ovirt-ha-agent' and 'ovirt-ha-broker' running fine and try to restart them once ? I know there is no failed install left on the gluster volume, because when I attempt an install, part of my scripted prep process is deleting and recreating the Gluster volume. The below instructions are more or less what I’m doing already in a script[1]. 
(the gluster portion of the script process is: stop the volume, delete the volume, remove the mount point directory to avoid Gluster’s xattr problem with recycling directories, recreate the directory, change perms, create the volume, start the volume, set Ovirt-recc’ed volume options.) -j [1] We have a requirement for automated setup of all production resources, so all of this ends up being scripted. 1) vdsClient -s 0 list table | awk '{print $1}' | xargs vdsClient -s 0 destroy 2) stop the volume and delete all the information inside the bricks from all the hosts 3) try to umount storage from /rhev/data-center/mnt/ - umount -f /rhev/data-center/mnt/ if it is mounted 4) remove all dirs from /rhev/data-center/mnt/ - rm -rf /rhev/data-center/mnt/* 5) start volume again and start the deployment. Thanks kasturi If I start it manually, the default DC is down, the default cluster has the installation host in the cluster, there is no storage, and the VM doesn’t show up in the GUI. In this install run, I have not yet started the engine manually. you wont be seeing HE vm until HE storage is imported into the UI. HE storage will be automatically imported into the
Re: [ovirt-users] Hosted engine install failed; vdsm upset about broker
> On Apr 20, 2017, at 10:36 PM, knarra wrote: >> The installer claimed it did, but I believe it didn’t. Below the error from >> my original email, there’s the below (apologies for not including it >> earlier; I missed it). Note: 04ff4cf1-135a-4918-9a1f-8023322f89a3 is the HE >> - I’m pretty sure it is complaining about itself. (In any case, I verified >> that there are no other VMs running with both virsh and vdsClient.) ^^^ >> 2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:128 Stage >> late_setup METHOD otopi.plugins.gr_he_setup.vm.runvm.Plugin._late_setup >> 2017-04-19 12:27:02 DEBUG otopi.plugins.gr_he_setup.vm.runvm >> runvm._late_setup:83 {'status': {'message': 'Done', 'code': 0}, 'items': >> [u'04ff4cf1-135a-4918-9a1f-8023322f89a3']} >> 2017-04-19 12:27:02 ERROR otopi.plugins.gr_he_setup.vm.runvm >> runvm._late_setup:91 The following VMs have been found: >> 04ff4cf1-135a-4918-9a1f-8023322f89a3 >> 2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:142 method >> exception >> Traceback (most recent call last): >> File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in >> _executeMethod >> method['method']() >> File >> "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-setup/vm/runvm.py", >> line 95, in _late_setup >> _('Cannot setup Hosted Engine with other VMs running') >> RuntimeError: Cannot setup Hosted Engine with other VMs running >> 2017-04-19 12:27:02 ERROR otopi.context context._executeMethod:151 Failed to >> execute stage 'Environment setup': Cannot setup Hosted Engine with other VMs >> running >> 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:760 >> ENVIRONMENT DUMP - BEGIN >> 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV >> BASE/error=bool:'True' >> 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV >> BASE/exceptionInfo=list:'[(, >> RuntimeError('Cannot setup Hosted Engine with other VMs running',), >> )]' >> 2017-04-19 12:27:02 
DEBUG otopi.context context.dumpEnvironment:774 >> ENVIRONMENT DUMP - END > James, generally this issue happens when the setup failed once and you tried > re running it again. Can you clean it and deploy it again? HE should come > up successfully. Below are the steps for cleaning it up. Knarra, I realize that. However, that is not the situation in my case. See above, at the mark - the UUID it is complaining about is the UUID of the hosted-engine it just installed. From the answers file generated from the run (whole thing below): OVEHOSTED_VM/vmUUID=str:04ff4cf1-135a-4918-9a1f-8023322f89a3 Also see the WARNs I mentioned previously, quoted below. Excerpt: Apr 19 12:29:20 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm root WARN File: /var/lib/libvirt/qemu/channels/04ff4cf1-135a-4918-9a1f-8023322f89a3.com.redhat.rhevm.vdsm already removed Apr 19 12:29:20 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm root WARN File: /var/lib/libvirt/qemu/channels/04ff4cf1-135a-4918-9a1f-8023322f89a3.org.qemu.guest_agent.0 already removed Apr 19 12:29:30 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (1) I’m not clear on what it is attempting to do there, but it seems relevant. I know there is no failed install left on the gluster volume, because when I attempt an install, part of my scripted prep process is deleting and recreating the Gluster volume. The below instructions are more or less what I’m doing already in a script[1]. (the gluster portion of the script process is: stop the volume, delete the volume, remove the mount point directory to avoid Gluster’s xattr problem with recycling directories, recreate the directory, change perms, create the volume, start the volume, set Ovirt-recc’ed volume options.) -j [1] We have a requirement for automated setup of all production resources, so all of this ends up being scripted. 
> 1) vdsClient -s 0 list table | awk '{print $1}' | xargs vdsClient -s 0 destroy
>
> 2) stop the volume and delete all the information inside the bricks from all
> the hosts
>
> 3) try to umount storage from /rhev/data-center/mnt/ - umount -f
> /rhev/data-center/mnt/ if it is mounted
>
> 4) remove all dirs from /rhev/data-center/mnt/ - rm -rf
> /rhev/data-center/mnt/*
>
> 5) start volume again and start the deployment.
>
> Thanks
> kasturi
>>
>> If I start it manually, the default DC is down, the default cluster has the installation host in the cluster, there is no storage, and the VM doesn’t show up in the GUI. In this install run, I have not yet started the engine manually.
>>> you wont be seeing HE vm until HE storage is imported into the UI. HE
>>> storage will be automatically imported into the UI (which will import HE vm
>>> too )once a master domain is pr
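The five cleanup steps quoted above can be sketched as a dry-run plan in Python. This is only an illustration, not the script mentioned in this message: `cleanup_plan` and its parameters are hypothetical names, the VM IDs and mounted directories are assumed to be collected beforehand (for example from `vdsClient -s 0 list table`), and brick paths are site-specific, so wiping them is left as a reminder line rather than a generated command. Nothing is executed.

```python
import shlex

MNT_ROOT = "/rhev/data-center/mnt"  # mount root named in the thread


def cleanup_plan(volume, vm_ids, mounted_dirs):
    """Return the cleanup steps as shell command strings; nothing is run."""
    plan = []
    # Step 1: destroy every VM that vdsm still lists.
    for vm_id in vm_ids:
        plan.append("vdsClient -s 0 destroy " + shlex.quote(vm_id))
    # Step 2: stop the volume. Brick contents must be removed by hand
    # because the brick paths are not known to this sketch.
    plan.append("gluster volume stop " + shlex.quote(volume))
    plan.append("# on every host: delete the contents of each brick directory")
    # Step 3: force-unmount anything still mounted under MNT_ROOT.
    for path in mounted_dirs:
        if not path.startswith(MNT_ROOT + "/"):
            raise ValueError("refusing to unmount outside %s: %s" % (MNT_ROOT, path))
        plan.append("umount -f " + shlex.quote(path))
    # Step 4: remove leftover directories (the glob is left unquoted on purpose).
    plan.append("rm -rf " + MNT_ROOT + "/*")
    # Step 5: start the volume again; the deployment is started separately.
    plan.append("gluster volume start " + shlex.quote(volume))
    return plan
```

Printing the result with `print("\n".join(cleanup_plan(...)))` gives a list that can be reviewed before any of it is run as root.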
Re: [ovirt-users] Hosted engine install failed; vdsm upset about broker
On 04/20/2017 10:48 PM, Jamie Lawrence wrote: On Apr 19, 2017, at 11:35 PM, knarra wrote: On 04/20/2017 03:15 AM, Jamie Lawrence wrote: I trialed installing the hosted engine, following the instructions at http://www.ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_Engine/ . This is using Gluster as the backend storage subsystem. Answer file at the end. Per the docs, "When the hosted-engine deployment script completes successfully, the oVirt Engine is configured and running on your host. The Engine has already configured the data center, cluster, host, the Engine virtual machine, and a shared storage domain dedicated to the Engine virtual machine.” In my case, this is false. The installation claims success, but the hosted engine VM stays stopped, unless I start it manually. During the install process there is a step where HE vm is stopped and started. Can you check if this has happened correctly ? The installer claimed it did, but I believe it didn’t. Below the error from my original email, there’s the below (apologies for not including it earlier; I missed it). Note: 04ff4cf1-135a-4918-9a1f-8023322f89a3 is the HE - I’m pretty sure it is complaining about itself. (In any case, I verified that there are no other VMs running with both virsh and vdsClient.) 
2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:128 Stage late_setup METHOD otopi.plugins.gr_he_setup.vm.runvm.Plugin._late_setup 2017-04-19 12:27:02 DEBUG otopi.plugins.gr_he_setup.vm.runvm runvm._late_setup:83 {'status': {'message': 'Done', 'code': 0}, 'items': [u'04ff4cf1-135a-4918-9a1f-8023322f89a3']} 2017-04-19 12:27:02 ERROR otopi.plugins.gr_he_setup.vm.runvm runvm._late_setup:91 The following VMs have been found: 04ff4cf1-135a-4918-9a1f-8023322f89a3 2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:142 method exception Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod method['method']() File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-setup/vm/runvm.py", line 95, in _late_setup _('Cannot setup Hosted Engine with other VMs running') RuntimeError: Cannot setup Hosted Engine with other VMs running 2017-04-19 12:27:02 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Environment setup': Cannot setup Hosted Engine with other VMs running 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:760 ENVIRONMENT DUMP - BEGIN 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV BASE/error=bool:'True' 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV BASE/exceptionInfo=list:'[(, RuntimeError('Cannot setup Hosted Engine with other VMs running',), )]' 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:774 ENVIRONMENT DUMP - END James, generally this issue happens when the setup failed once and you tried re running it again. Can you clean it and deploy it again? HE should come up successfully. Below are the steps for cleaning it up. 
1) vdsClient -s 0 list table | awk '{print $1}' | xargs vdsClient -s 0 destroy

2) stop the volume and delete all the information inside the bricks from all the hosts

3) try to umount storage from /rhev/data-center/mnt/ - umount -f /rhev/data-center/mnt/ if it is mounted

4) remove all dirs from /rhev/data-center/mnt/ - rm -rf /rhev/data-center/mnt/*

5) start volume again and start the deployment.

Thanks
kasturi

If I start it manually, the default DC is down, the default cluster has the installation host in the cluster, there is no storage, and the VM doesn’t show up in the GUI. In this install run, I have not yet started the engine manually.

you wont be seeing HE vm until HE storage is imported into the UI. HE storage will be automatically imported into the UI (which will import HE vm too )once a master domain is present .

Sure; I’m just attempting to provide context.

I assume this is related to the errors in ovirt-hosted-engine-setup.log, below. (The timestamps are confusing; it looks like the Python errors are logged some time after they’re captured or something.) The HA broker and agent logs just show them looping in the sequence below.

Is there a decent way to pick this up and continue? If not, how do I make this work?

Can you please check the following things.

1) is glusterd running on all the nodes ? 'systemctl status glusterd'
2) Are you able to connect to your storage server which is ovirt_engine in your case.
3) Can you check if all the brick process in the volume is up ?

1) Verified that glusterd is running on all three nodes.

2)
[root@sc5-thing-1]# mount -tglusterfs sc5-gluster-1:/ovirt_engine /mnt/ovirt_engine
[root@sc5-thing-1]# df -h
Filesystem                   Size  Used Avail Use% Mounted on
[…]
sc5-gluster-1:/ovirt_engine  300G  2.6G  298G   1% /mnt/ovirt_engine

3)
[root@sc5-gluster-1 jla
Re: [ovirt-users] Hosted engine install failed; vdsm upset about broker (revised)
> On Apr 20, 2017, at 9:18 AM, Simone Tiraboschi wrote:
> Could you please share the output of
> sudo -u vdsm sudo service sanlock status

That command line prompts for vdsm’s password, which it doesn’t have. But output returned as root is below. Is that ‘operation not permitted’ related?

Thanks,

-j

[root@sc5-ovirt-2 jlawrence]# service sanlock status
Redirecting to /bin/systemctl status sanlock.service
● sanlock.service - Shared Storage Lease Manager
   Loaded: loaded (/usr/lib/systemd/system/sanlock.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-04-19 16:56:40 PDT; 17h ago
  Process: 16764 ExecStart=/usr/sbin/sanlock daemon (code=exited, status=0/SUCCESS)
 Main PID: 16765 (sanlock)
   CGroup: /system.slice/sanlock.service
           ├─16765 /usr/sbin/sanlock daemon
           └─16766 /usr/sbin/sanlock daemon

Apr 19 16:56:40 sc5-ovirt-2.squaretrade.com systemd[1]: Starting Shared Storage Lease Manager...
Apr 19 16:56:40 sc5-ovirt-2.squaretrade.com systemd[1]: Started Shared Storage Lease Manager.
Apr 19 16:56:40 sc5-ovirt-2.squaretrade.com sanlock[16765]: 2017-04-19 16:56:40-0700 482 [16765]: set scheduler RR|RESET_ON_FORK priority 99 failed: Operation not permitted

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Hosted engine install failed; vdsm upset about broker
> On Apr 19, 2017, at 11:35 PM, knarra wrote: > > On 04/20/2017 03:15 AM, Jamie Lawrence wrote: >> I trialed installing the hosted engine, following the instructions at >> http://www.ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_Engine/ >> . This is using Gluster as the backend storage subsystem. >> >> Answer file at the end. >> >> Per the docs, >> >> "When the hosted-engine deployment script completes successfully, the oVirt >> Engine is configured and running on your host. The Engine has already >> configured the data center, cluster, host, the Engine virtual machine, and a >> shared storage domain dedicated to the Engine virtual machine.” >> >> In my case, this is false. The installation claims success, but the hosted >> engine VM stays stopped, unless I start it manually. > During the install process there is a step where HE vm is stopped and > started. Can you check if this has happened correctly ? The installer claimed it did, but I believe it didn’t. Below the error from my original email, there’s the below (apologies for not including it earlier; I missed it). Note: 04ff4cf1-135a-4918-9a1f-8023322f89a3 is the HE - I’m pretty sure it is complaining about itself. (In any case, I verified that there are no other VMs running with both virsh and vdsClient.) 
2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:128 Stage late_setup METHOD otopi.plugins.gr_he_setup.vm.runvm.Plugin._late_setup 2017-04-19 12:27:02 DEBUG otopi.plugins.gr_he_setup.vm.runvm runvm._late_setup:83 {'status': {'message': 'Done', 'code': 0}, 'items': [u'04ff4cf1-135a-4918-9a1f-8023322f89a3']} 2017-04-19 12:27:02 ERROR otopi.plugins.gr_he_setup.vm.runvm runvm._late_setup:91 The following VMs have been found: 04ff4cf1-135a-4918-9a1f-8023322f89a3 2017-04-19 12:27:02 DEBUG otopi.context context._executeMethod:142 method exception Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod method['method']() File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-setup/vm/runvm.py", line 95, in _late_setup _('Cannot setup Hosted Engine with other VMs running') RuntimeError: Cannot setup Hosted Engine with other VMs running 2017-04-19 12:27:02 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Environment setup': Cannot setup Hosted Engine with other VMs running 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:760 ENVIRONMENT DUMP - BEGIN 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV BASE/error=bool:'True' 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:770 ENV BASE/exceptionInfo=list:'[(, RuntimeError('Cannot setup Hosted Engine with other VMs running',), )]' 2017-04-19 12:27:02 DEBUG otopi.context context.dumpEnvironment:774 ENVIRONMENT DUMP - END >> If I start it manually, the default DC is down, the default cluster has the >> installation host in the cluster, there is no storage, and the VM doesn’t >> show up in the GUI. In this install run, I have not yet started the engine >> manually. > you wont be seeing HE vm until HE storage is imported into the UI. HE storage > will be automatically imported into the UI (which will import HE vm too )once > a master domain is present . 
Sure; I’m just attempting to provide context.

>> I assume this is related to the errors in ovirt-hosted-engine-setup.log,
>> below. (The timestamps are confusing; it looks like the Python errors are
>> logged some time after they’re captured or something.) The HA broker and
>> agent logs just show them looping in the sequence below.
>>
>> Is there a decent way to pick this up and continue? If not, how do I make
>> this work?
> Can you please check the following things.
>
> 1) is glusterd running on all the nodes ? 'systemctl status glusterd'
> 2) Are you able to connect to your storage server which is ovirt_engine in
> your case.
> 3) Can you check if all the brick process in the volume is up ?

1) Verified that glusterd is running on all three nodes.

2)
[root@sc5-thing-1]# mount -tglusterfs sc5-gluster-1:/ovirt_engine /mnt/ovirt_engine
[root@sc5-thing-1]# df -h
Filesystem                   Size  Used Avail Use% Mounted on
[…]
sc5-gluster-1:/ovirt_engine  300G  2.6G  298G   1% /mnt/ovirt_engine

3)
[root@sc5-gluster-1 jlawrence]# gluster volume status
Status of volume: ovirt_engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sc5-gluster-1:/gluster-bricks/ovirt_e
ngine/ovirt_engine-1                        49217     0          Y       22102
Brick sc5-gluster-2:/gluster-bricks/ovirt_e
ngine/ovirt_engine-1                        49157     0          Y       37842
Brick sc5-gluster-3:/gluster-bricks/ovirt_e
ngine/ovirt_engine-1                        49157     0          Y       112018
Self-
Re: [ovirt-users] Hosted engine install failed; vdsm upset about broker (revised)
On Thu, Apr 20, 2017 at 2:14 AM, Jamie Lawrence wrote:

>
> So, tracing this further, I’m pretty sure this is something about sanlock.
>
> As best I can tell this[1] seems to be the failure that is blocking
> importing the pool, creating storage domains, importing the HE, etc.
> Contrary to the log, sanlock is running; I verified it starts on
> system-boot and restarts just fine.
>
> I found one reference to someone having a similar problem in 3.6, but that
> appeared to have been a permission issue I’m not afflicted with.
>
> How can I move past this?
>

Could you please share the output of
sudo -u vdsm sudo service sanlock status
?

>
> TIA,
>
> -j
>
>
> [1] agent.log:
> MainThread::WARNING::2017-04-19 17:07:13,537::agent::209::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt '6'
> MainThread::INFO::2017-04-19 17:07:13,567::hosted_engine::242::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: sc5-ovirt-2.squaretrade.com
> MainThread::INFO::2017-04-19 17:07:13,569::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
> MainThread::INFO::2017-04-19 17:07:16,044::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
> MainThread::INFO::2017-04-19 17:07:16,045::storage_server::219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> MainThread::INFO::2017-04-19 17:07:20,876::storage_server::226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> MainThread::INFO::2017-04-19 17:07:20,893::storage_server::233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
> MainThread::INFO::2017-04-19 17:07:21,160::hosted_engine::657::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
> MainThread::INFO::2017-04-19 17:07:21,160::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
> MainThread::INFO::2017-04-19 17:07:23,954::hosted_engine::660::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Refreshing vm.conf
> MainThread::INFO::2017-04-19 17:07:23,955::config::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
> MainThread::INFO::2017-04-19 17:07:23,955::config::412::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
> MainThread::WARNING::2017-04-19 17:07:26,741::ovf_store::107::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
> MainThread::ERROR::2017-04-19 17:07:26,744::config::450::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
> MainThread::INFO::2017-04-19 17:07:26,770::hosted_engine::509::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
> MainThread::INFO::2017-04-19 17:07:26,771::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '10.181.26.1'}
> MainThread::INFO::2017-04-19 17:07:26,774::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140621269798096
> MainThread::INFO::2017-04-19 17:07:26,774::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}
> MainThread::INFO::2017-04-19 17:07:26,791::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140621269798544
> MainThread::INFO::2017-04-19 17:07:26,792::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
> MainThread::INFO::2017-04-19 17:07:26,793::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140621269798224
> MainThread::INFO::2017-04-19 17:07:26,794::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': '04ff4cf1-135a-4918-9a1f-8023322f89a3', 'address': '0'}
> MainThread::INFO::2017-04-19 17:07:26,796::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
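Since most of the agent.log excerpt above is INFO noise around one WARNING/ERROR pair about OVF_STORE, it can help to filter by level. The following is a minimal Python sketch, assuming each entry sits on a single line in the `Thread::LEVEL::timestamp::module::line::logger::(function) message` layout seen in this excerpt (the copy quoted in mail is wrapped, the file on disk normally is not). `Entry`, `parse_ha_log_line` and `problems` are hypothetical names, not part of ovirt-hosted-engine-ha.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "thread level timestamp module lineno logger message")


def parse_ha_log_line(line):
    """Split one agent.log/broker.log line into fields, or return None."""
    # Six '::' separators precede the message; the message itself may
    # contain '::', so limit the split to the first six.
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7 or not parts[4].isdigit():
        return None
    thread, level, timestamp, module, lineno, logger, message = parts
    return Entry(thread, level, timestamp, module, int(lineno), logger, message)


def problems(lines):
    """Return only the WARNING and ERROR entries, in file order."""
    entries = (parse_ha_log_line(line) for line in lines)
    return [e for e in entries if e is not None and e.level in ("WARNING", "ERROR")]
```

Run against `/var/log/ovirt-hosted-engine-ha/agent.log` with `problems(open(path))`, this would reduce each restart loop to its few relevant lines; lines that do not match the layout (such as traceback continuations) are skipped.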
Re: [ovirt-users] Hosted engine install failed; vdsm upset about broker
On 04/20/2017 03:15 AM, Jamie Lawrence wrote:

I trialed installing the hosted engine, following the instructions at http://www.ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_Engine/ . This is using Gluster as the backend storage subsystem.

Answer file at the end.

Per the docs,

"When the hosted-engine deployment script completes successfully, the oVirt Engine is configured and running on your host. The Engine has already configured the data center, cluster, host, the Engine virtual machine, and a shared storage domain dedicated to the Engine virtual machine.”

In my case, this is false. The installation claims success, but the hosted engine VM stays stopped, unless I start it manually.

During the install process there is a step where HE vm is stopped and started. Can you check if this has happened correctly ?

If I start it manually, the default DC is down, the default cluster has the installation host in the cluster, there is no storage, and the VM doesn’t show up in the GUI. In this install run, I have not yet started the engine manually.

you wont be seeing HE vm until HE storage is imported into the UI. HE storage will be automatically imported into the UI (which will import HE vm too )once a master domain is present .

I assume this is related to the errors in ovirt-hosted-engine-setup.log, below. (The timestamps are confusing; it looks like the Python errors are logged some time after they’re captured or something.) The HA broker and agent logs just show them looping in the sequence below.

Is there a decent way to pick this up and continue? If not, how do I make this work?

Can you please check the following things.

1) is glusterd running on all the nodes ? 'systemctl status glusterd'
2) Are you able to connect to your storage server which is ovirt_engine in your case.
3) Can you check if all the brick process in the volume is up ?

Thanks
kasturi.
Thanks,

-j

- - - - ovirt-hosted-engine-setup.log snippet: - - - -

2017-04-19 12:29:55 DEBUG otopi.context context._executeMethod:128 Stage late_setup METHOD otopi.plugins.gr_he_setup.system.vdsmenv.Plugin._late_setup
2017-04-19 12:29:55 DEBUG otopi.plugins.otopi.services.systemd systemd.status:90 check service vdsmd status
2017-04-19 12:29:55 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'status', 'vdsmd.service'), executable='None', cwd='None', env=None
2017-04-19 12:29:55 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/bin/systemctl', 'status', 'vdsmd.service'), rc=0
2017-04-19 12:29:55 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:921 execute-output: ('/bin/systemctl', 'status', 'vdsmd.service') stdout:
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2017-04-19 12:26:59 PDT; 2min 55s ago
  Process: 67370 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
  Process: 69995 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 70062 (vdsm)
   CGroup: /system.slice/vdsmd.service
           └─70062 /usr/bin/python2 /usr/share/vdsm/vdsm

Apr 19 12:29:00 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (1)
Apr 19 12:29:00 sc5-ovirt-2.squaretrade.com vdsm[70062]: vdsm root ERROR failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 231, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)