[ovirt-users] Re: Failed HostedEngine Deployment
I think you may be right, here. I decided to just start over and use the actual oVirt Node installation media, rather than CentOS Stream installation media. Hopefully that gets the software side situated. Thanks for the pointers.

From: Strahil Nikolov
Sent: Sunday, January 23, 2022 5:46 PM
To: Robert Tongue; users
Subject: Re: [ovirt-users] Failed HostedEngine Deployment

yum downgrade qemu-kvm-block-gluster-6.0.0-33.el8s libvirt-daemon-driver-qemu-6.0.0-33.el8s qemu-kvm-common-6.0.0-33.el8s qemu-kvm-hw-usbredir-6.0.0-33.el8s qemu-kvm-ui-opengl-6.0.0-33.el8s qemu-kvm-block-rbd-6.0.0-33.el8s qemu-img-6.0.0-33.el8s qemu-kvm-6.0.0-33.el8s qemu-kvm-block-curl-6.0.0-33.el8s qemu-kvm-block-ssh-6.0.0-33.el8s qemu-kvm-ui-spice-6.0.0-33.el8s ipxe-roms-qemu-6.0.0-33.el8s qemu-kvm-core-6.0.0-33.el8s qemu-kvm-docs-6.0.0-33.el8s qemu-kvm-block-6.0.0-33.el8s

Best Regards,
Strahil Nikolov

On Sun, Jan 23, 2022 at 22:47, Robert Tongue wrote:
Ahh, I did some repoquery commands and can see a good bit of the qemu* packages are coming from appstream rather than ovirt-4.4-centos-stream-advanced-virtualization. What's the recommended fix?

From: Strahil Nikolov
Sent: Sunday, January 23, 2022 3:41 PM
To: users; Robert Tongue
Subject: Re: [ovirt-users] Failed HostedEngine Deployment

I've seen this. Ensure that all qemu-related packages are coming from the centos-advanced-virtualization repo (6.0.0-33.el8s.x86_64). There is a known issue with the latest packages in CentOS Stream. Also, you can set the following alias on the hypervisors:

alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'

Best Regards,
Strahil Nikolov
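A follow-up note for anyone landing on this thread: after a downgrade like the one above, it may be worth pinning the qemu packages so a later "dnf update" does not pull them straight back in from AppStream. This is only a sketch -- the plugin and repo names assume a stock CentOS Stream 8 host, and the package globs should match whatever was downgraded:

# Pin the currently installed (downgraded) versions
dnf install -y python3-dnf-plugin-versionlock
dnf versionlock add 'qemu-kvm*' qemu-img ipxe-roms-qemu libvirt-daemon-driver-qemu

# Or keep AppStream from offering its qemu builds at all
dnf config-manager --save --setopt=appstream.exclude='qemu-kvm*,qemu-img'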
[ovirt-users] Re: Failed HostedEngine Deployment
Thanks for the response. How can I verify this? Has something with the installation procedures changed recently?

From: Strahil Nikolov
Sent: Sunday, January 23, 2022 3:41 PM
To: users; Robert Tongue
Subject: Re: [ovirt-users] Failed HostedEngine Deployment

I've seen this. Ensure that all qemu-related packages are coming from the centos-advanced-virtualization repo (6.0.0-33.el8s.x86_64). There is a known issue with the latest packages in CentOS Stream. Also, you can set the following alias on the hypervisors:

alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'

Best Regards,
Strahil Nikolov
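Regarding "How can I verify this?" -- one quick way to see which repository each installed qemu package came from (the package globs here are only examples; the repo id to look for is the advanced-virtualization one mentioned above):

dnf list installed 'qemu*' libvirt-daemon-driver-qemu
# the right-hand column should show something like
# @ovirt-4.4-centos-stream-advanced-virtualization rather than @appstream

# the same information via repoquery, on reasonably recent dnf:
dnf repoquery --installed --qf '%{name}-%{evr} %{from_repo}' 'qemu*'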
[ovirt-users] Re: Failed HostedEngine Deployment
Ahh, I did some repoquery commands and can see a good bit of the qemu* packages are coming from appstream rather than ovirt-4.4-centos-stream-advanced-virtualization. What's the recommended fix?

From: Strahil Nikolov
Sent: Sunday, January 23, 2022 3:41 PM
To: users; Robert Tongue
Subject: Re: [ovirt-users] Failed HostedEngine Deployment

I've seen this. Ensure that all qemu-related packages are coming from the centos-advanced-virtualization repo (6.0.0-33.el8s.x86_64). There is a known issue with the latest packages in CentOS Stream. Also, you can set the following alias on the hypervisors:

alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'

Best Regards,
Strahil Nikolov
[ovirt-users] Failed HostedEngine Deployment
Greetings oVirt people,

I am having a problem with the hosted-engine deployment, and unfortunately after a weekend spent trying to get this far, I am finally stuck, and cannot figure out how to fix this. I am starting with 1 host, and will have 4 when this is finished. Storage is GlusterFS, hyperconverged, but I am managing that myself outside of oVirt. It's a single-node GlusterFS volume, which I will expand out across the other 4 nodes as well.

I get all the way through the initial hosted-engine deployment (via the cockpit interface) pre-storage, then get most of the way through the storage portion of it. It fails at starting the HostedEngine VM in its final state after copying the VM disk to shared storage. This is where it gets weird.

[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM IP address is while the engine's he_fqdn ovirt.deleted.domain resolves to 192.168.x.x. If you are using DHCP, check your DHCP reservation configuration"}

I've masked out the domain and IP for obvious reasons. However I think this deployment error isn't really the reason for the failure, it's just where it is at when it fails. The HostedEngine VM is starting, but not actually booting. I was able to change the VNC password with `hosted-engine --add-console-password`, and see the local console display with that, however it just displays "The guest has not initialized the display (yet)". I also did:

# hosted-engine --console
The engine VM is running on this host
Escape character is ^]

Yet that doesn't move any further, nor allow any input. The VM does not respond on the network. I am thinking it's just not making it to the initial BIOS screen and booting at all. What would cause that? Here is the glusterfs volume for clarity.

# gluster volume info storage

Volume Name: storage
Type: Distribute
Volume ID: e9544310-8890-43e3-b49c-6e8c7472dbbb
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node1:/var/glusterfs/storage/1
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.ping-timeout: 5
performance.client-io-threads: on
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 1024
cluster.locking-scheme: full
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
performance.strict-o-direct: on
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Xeon(R) CPU E3-1280 V2 @ 3.60GHz
stepping : 9
microcode : 0x21
cpu MHz : 4000.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
bogomips : 7199.86
clflush size : 64
cache_alignment: 64
address sizes : 36 bits physical, 48 bits virtual
power management:
[ plus 7 more ]

Thanks for any insight that can be provided.

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/JZQYGXQP5DO4HJSLONTBNMPQ5YUX54MX/
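A few places that are usually worth checking when the HostedEngine VM starts but never reaches the firmware/boot stage (paths are the stock EL8/oVirt defaults, and the log file name assumes the default VM name of HostedEngine):

hosted-engine --vm-status                        # the HA agent's view of the VM
virsh -r list --all                              # libvirt's (read-only) view
less /var/log/libvirt/qemu/HostedEngine.log      # qemu startup errors for the VM
journalctl -u vdsmd -u libvirtd --since "-1h"    # recent vdsm/libvirt messages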
[ovirt-users] Re: HostedEngine VM Paused after power failure
I've seen this happen with the VM disk itself becoming corrupt. If you try to read the contents of the file, and it gives you "Input/Output Error", then it is not good news. I've been testing oVirt recently, and these issues alone are preventing me from using it full time. I cannot help further, unfortunately, as I have no idea how to fix it. So the best I can say is, hopefully someone else chimes in and helps both of us.

-phunyguy

From: ieas...@telvue.com
Sent: Tuesday, February 9, 2021 6:25 PM
To: users@ovirt.org
Subject: [ovirt-users] Re: HostedEngine VM Paused after power failure

Attempting to resume or start the VM doesn't yield any results. Here is the status of the VM:

Host ID : 1
Host timestamp : 115601
Score : 3400
Engine status : {"vm": "up", "health": "bad", "detail": "Paused", "reason": "bad vm status"}
Hostname :
Local maintenance : False
stopped : False
crc32 : 68efbf40
conf_on_shared_storage : True
local_conf_timestamp : 115601
Status up-to-date : True
Extra metadata (valid at timestamp):
  metadata_parse_version=1
  metadata_feature_version=1
  timestamp=115601 (Tue Feb 9 18:25:48 2021)
  host-id=1
  score=3400
  vm_conf_refresh_time=115601 (Tue Feb 9 18:25:48 2021)
  conf_on_shared_storage=True
  maintenance=False
  state=EngineStarting
  stopped=False

Here is a chunk in agent.log that is a bit perplexing. I'm not too sure what it means that the VM doesn't exist. Storage is correctly mounted, everything looks fully operational. I can see the HostedEngine disk available to the Host.

MainThread::INFO::2021-02-09 18:08:13,843::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineDown (score: 3400)
MainThread::INFO::2021-02-09 18:08:23,864::states::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down and local host has best score (3400), attempting to start engine VM
MainThread::INFO::2021-02-09 18:08:23,894::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineStart) sent? ignored
MainThread::INFO::2021-02-09 18:08:23,983::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStart (score: 3400)
MainThread::INFO::2021-02-09 18:08:24,000::hosted_engine::895::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state) Ensuring VDSM state is clear for engine VM
MainThread::INFO::2021-02-09 18:08:24,005::hosted_engine::907::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state) Vdsm state for VM clean
MainThread::INFO::2021-02-09 18:08:24,005::hosted_engine::853::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Starting vm using `/usr/sbin/hosted-engine --vm-start`
MainThread::INFO::2021-02-09 18:08:24,519::hosted_engine::862::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stdout: VM in WaitForLaunch
MainThread::INFO::2021-02-09 18:08:24,519::hosted_engine::863::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stderr: Command VM.getStats with args {'vmID': '74b3c839-c89c-4857-ada0-95715672348a'} failed: (code=1, message=Virtual machine does not exist: {'vmId': '74b3c839-c89c-4857-ada0-95715672348a'})
MainThread::INFO::2021-02-09 18:08:24,519::hosted_engine::875::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Engine VM started on localhost
MainThread::INFO::2021-02-09 18:08:24,552::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineStart-EngineStarting) sent? ignored
MainThread::INFO::2021-02-09 18:08:24,565::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2021-02-09 18:08:34,585::states::736::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2021-02-09 18:08:34,590::state_decorators::99::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Timeout set to Tue Feb 9 18:18:34 2021 while transitioning ->

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/UDKODQL5A4NNIWJMONVYTFIG
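If disk corruption is the suspicion, a rough way to test the hosted-engine volume itself is to let qemu-img check the qcow2 metadata and then force a full read; I/O errors from the underlying storage will surface in either step. The path below is a placeholder -- substitute the real volume path reported by vdsm or the storage domain:

IMG=/path/to/hosted-engine/disk/volume            # <- substitute the actual volume path
qemu-img check "$IMG"                             # structural check (qcow2 volumes)
dd if="$IMG" of=/dev/null bs=1M status=progress   # full read; surfaces Input/Output errors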
[ovirt-users] Custom Fence Agent
Greetings everyone,

I am having another problem that I was hoping to get some assistance with. I have created my own custom fence agent for some Tasmota-flashed wifi smart plugs that can control the power input to the oVirt nodes. This works great; however, I am running into a problem getting it added to oVirt as a power manager. I got the custom fence agent added with engine-config -s, and it shows up in the web UI to select as a power management agent. I then put in the details for the plug (IP address, login, password) and press the "Test" button, which passes and shows the status as power=on. Once I save the settings, however, it is logged in the engine.log file that fencing will fail because there is no node available to proxy the operation. When I go back into the power management settings and press "Test" again, I get the error: "Test failed: Failed to run fence status-check on host 'ovirt1'. No other host was available to serve as proxy for the operation."

I have the agent script in /usr/sbin/ on all nodes, execute permissions set, and I can run it manually at the command line just fine, so I am really at a loss here as to what to check. What am I missing here? Please help. Thank you for your time.

Script:

#!/usr/libexec/platform-python -tt

from urllib.parse import quote
import requests
import sys
import atexit

sys.path.append("/usr/share/fence")
from fencing import *


def set_power_status(conn, options):
    # Switch the plug on or off, depending on the requested fence action.
    if "on" in options["--action"]:
        response = requests.get(buildUrl(options, "on"))
    elif "off" in options["--action"]:
        response = requests.get(buildUrl(options, "off"))
    return


def get_power_status(conn, options):
    # Query the plug and map Tasmota's JSON power state to on/off.
    response = requests.get(buildUrl(options, "status"))
    if "\"Power\":0" in response.text:
        return "off"
    elif "\"Power\":1" in response.text:
        return "on"


def buildUrl(options, action):
    # Build the Tasmota HTTP API URL for the requested command.
    cmnd = {'on': 'Power On', 'off': 'Power Off', 'status': 'Status'}
    return "http://" + options["--ip"] + "/cm?user=" + quote(options["--username"]) + \
        "&password=" + quote(options["--password"]) + "&cmnd=" + quote(cmnd.get(action, "Error"))


def main():
    device_opt = ["ipaddr", "login", "passwd", "web"]
    atexit.register(atexit_handler)
    all_opt["power_wait"]["default"] = 5
    options = check_input(device_opt, process_input(device_opt))

    docs = {}
    docs["shortdesc"] = "Fence agent for Tasmota-flashed Smarthome Plugs"
    docs["longdesc"] = ""
    docs["vendorurl"] = ""
    show_docs(options, docs)

    # Fence operations
    result = fence_action(None, options, set_power_status, get_power_status)
    sys.exit(result)


if __name__ == "__main__":
    main()

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/5CPYEXCOJWRXKFJ2CFBTKOWVX5MHRKHB/
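For what it is worth, the engine/vdsm invoke fence agents with key=value options on stdin rather than command-line flags, so testing the script that way can rule out input-parsing problems. The script name below is simply whatever name it was installed under in /usr/sbin, and the values are placeholders; the keys mirror the device_opt list above (ipaddr/login/passwd), with the authoritative list available from the agent's own metadata (run it with -o metadata):

/usr/sbin/fence_tasmota <<'EOF'
action=status
ipaddr=192.168.1.50
login=admin
passwd=secret
EOF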
[ovirt-users] Re: VM templates
Ahh, OK, I didn't realize that. I appreciate it, and was hoping to get good feedback like this when I posted what I did. Will make the changes. What is meant by "gluster-only machine" here?

From: Strahil Nikolov
Sent: Wednesday, January 27, 2021 7:11 AM
To: Robert Tongue; users
Subject: Re: [ovirt-users] Re: VM templates

You should create a file like mine, because vdsm manages /etc/multipath.conf:

# cat /etc/multipath/conf.d/blacklist.conf
blacklist {
    devnode "*"
    wwid nvme.1cc1-324a31313230303131343036-414441544120535838323030504e50-0001
    wwid TOSHIBA-TR200_Z7KB600SK46S
    wwid ST500NM0011_Z1M00LM7
    wwid WDC_WD5003ABYX-01WERA0_WD-WMAYP2303189
    wwid WDC_WD15EADS-00P8B0_WD-WMAVU0885453
    wwid WDC_WD5003ABYZ-011FA0_WD-WMAYP0F35PJ4
}

Keep in mind 'devnode *' is OK only for a gluster-only machine.

Best Regards,
Strahil Nikolov
Sent from Yahoo Mail on Android

On Wed, Jan 27, 2021 at 6:02, Robert Tongue wrote:

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/6SZZKTZNH67XUZ66FYQ7XI2KGXCFRCZN/
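For reference, a few ways to collect the identifiers that go into such a blacklist file (the device names below are only examples):

multipath -ll                                   # maps multipath currently claims
/usr/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/sda
cat /sys/block/nvme0n1/wwid                     # NVMe namespace identifier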
[ovirt-users] Re: VM templates
Correction, the issue came back, but I fixed it again; the actual issue was multipathd. I had to set up device filters in /etc/multipath.conf:

blacklist {
    protocol "(scsi:adt|scsi:sbp)"
    devnode "^hd[a-z]"
    devnode "^sd[a-z]$"
    devnode "^sd[a-z]"
    devnode "^nvme0n1"
    devnode "^nvme0n1p$"
}

Probably overkill, but it works.

From: Robert Tongue
Sent: Tuesday, January 26, 2021 2:24 PM
To: users
Subject: Re: VM templates

I fixed my own issue, and for everyone else that may run into this: the issue was the fact that I created the first oVirt node VM inside VMware, got it fully configured with all the software/disks/partitioning/settings, then cloned it to two more VMs. Then I ran the hosted-engine deployment and set up the cluster. I think it was because I used clones for each cluster node, and that confused things due to device/system identifiers. I rebuilt all 3 node VMs from scratch, and everything works perfectly now. Thanks for listening.

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/YD7ROMATPWFWO2IIUQ3G3WHJDXLFK2DD/
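To apply and double-check a blacklist change like the one above, something along these lines should work with the stock device-mapper-multipath setup:

systemctl reload multipathd     # re-reads the configuration (equivalent to multipathd reconfigure)
multipath -F                    # flush existing, unused multipath maps
multipath -ll                   # blacklisted local disks should no longer be listed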
[ovirt-users] Re: VM templates
I fixed my own issue, and for everyone else that may run into this: the issue was the fact that I created the first oVirt node VM inside VMware, got it fully configured with all the software/disks/partitioning/settings, then cloned it to two more VMs. Then I ran the hosted-engine deployment and set up the cluster. I think it was because I used clones for each cluster node, and that confused things due to device/system identifiers. I rebuilt all 3 node VMs from scratch, and everything works perfectly now. Thanks for listening.

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XXHXJEOMZQA2JYL252GZXKZWWHLI6T6W/
[ovirt-users] oVirt initramfs regeneration
Is it possible to force an initramfs regeneration in oVirt Node 4.4.4? I am doing some advanced partitioning and cannot seem to figure out how to properly do that, if it's even possible. Thanks.

-phunyguy

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/7HE7Z6XYJP566PV3BSW225U4UCYE5FW3/
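In case it helps as a starting point: on a plain EL8 host the initramfs for the running kernel can be rebuilt with dracut as below. oVirt Node uses a layered (imgbased) image, so treat this as a sketch rather than a guaranteed-supported procedure for Node:

dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
lsinitrd /boot/initramfs-$(uname -r).img | less   # inspect what ended up inside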
[ovirt-users] Re: VM templates
Thanks for the reply. Here are my glusterfs options for the volume; am I missing anything critical?

[root@cluster1-vm ~]# gluster volume info storage

Volume Name: storage
Type: Distributed-Disperse
Volume ID: 67112b70-e319-4629-b768-03df9d9a0e84
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: node1-vm:/var/glusterfs/storage/1
Brick2: node2-vm:/var/glusterfs/storage/1
Brick3: node3-vm:/var/glusterfs/storage/1
Brick4: node1-vm:/var/glusterfs/storage/2
Brick5: node2-vm:/var/glusterfs/storage/2
Brick6: node3-vm:/var/glusterfs/storage/2
Brick7: node1-vm:/var/glusterfs/storage/3
Brick8: node2-vm:/var/glusterfs/storage/3
Brick9: node3-vm:/var/glusterfs/storage/3
Brick10: node1-vm:/var/glusterfs/storage/4
Brick11: node2-vm:/var/glusterfs/storage/4
Brick12: node3-vm:/var/glusterfs/storage/4
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.ping-timeout: 5
performance.client-io-threads: on
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 1
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
performance.strict-o-direct: on
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on

From: Strahil Nikolov
Sent: Monday, January 25, 2021 10:56 AM
To: Robert Tongue; users
Subject: Re: [ovirt-users] VM templates

First of all, verify the gluster volume options (gluster volume info; gluster volume status). When you use HCI, oVirt sets up a lot of optimized options in order to get the maximum out of the Gluster storage.

Best Regards,
Strahil Nikolov

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ONDAHJU23YKKF4TBJPST7X7R4ZOZTQRK/
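One quick cross-check, since oVirt's hyperconverged setup normally applies the glusterfs "virt" group profile to VM-store volumes: compare against (or simply apply) that profile. This assumes the glusterfs-server package ships the group file in its usual location:

cat /var/lib/glusterd/groups/virt        # the recommended option set for VM storage
gluster volume set storage group virt    # apply the whole profile to the "storage" volume
gluster volume get storage all | egrep 'shard|o-direct|remote-dio|quorum'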
[ovirt-users] VM templates
Hello,

Another weird issue over here. I have the latest oVirt running inside VMware vCenter, as a proof of concept/testing platform. Things are finally working well, for the most part; however, I am noticing strange behavior with templates and VMs deployed from those templates. Let me explain: I created a basic Ubuntu Server VM, captured that VM as a template, then deployed 4 VMs from that template. The deployment went fine; however, I can only start 3 of the 4 VMs. If I shut down one of the 3 that I started, I can then start the one that refused to start, and the one I JUST shut down will then refuse to start. The error is:

VM test3 is down with error. Exit message: Bad volume specification {'device': 'disk', 'type': 'disk', 'diskType': 'file', 'specParams': {}, 'alias': 'ua-2dc7fbff-da30-485d-891f-03a0ed60fd0a', 'address': {'bus': '0', 'controller': '0', 'unit': '0', 'type': 'drive', 'target': '0'}, 'domainID': '804c6a0c-b246-4ccc-b3ab-dd4ceb819cea', 'imageID': '2dc7fbff-da30-485d-891f-03a0ed60fd0a', 'poolID': '3208bbce-5e04-11eb-9313-00163e281c6d', 'volumeID': 'f514ab22-07ae-40e4-9146-1041d78553fd', 'path': '/rhev/data-center/3208bbce-5e04-11eb-9313-00163e281c6d/804c6a0c-b246-4ccc-b3ab-dd4ceb819cea/images/2dc7fbff-da30-485d-891f-03a0ed60fd0a/f514ab22-07ae-40e4-9146-1041d78553fd', 'discard': True, 'format': 'cow', 'propagateErrors': 'off', 'cache': 'none', 'iface': 'scsi', 'name': 'sda', 'bootOrder': '1', 'serial': '2dc7fbff-da30-485d-891f-03a0ed60fd0a', 'index': 0, 'reqsize': '0', 'truesize': '2882392576', 'apparentsize': '3435134976'}.

The underlying storage is GlusterFS, self-managed outside of oVirt. I can provide any logs needed, please let me know which. Thanks in advance.

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/TZMAQ56NIQK7DY4WM4RZV4YRERMDEZRO/
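One thing that may be worth checking for a template-based disk that refuses to start is whether the qcow2 backing chain (i.e. the template image) still resolves from the path in the error; the path here is taken directly from the error message above:

qemu-img info --backing-chain /rhev/data-center/3208bbce-5e04-11eb-9313-00163e281c6d/804c6a0c-b246-4ccc-b3ab-dd4ceb819cea/images/2dc7fbff-da30-485d-891f-03a0ed60fd0a/f514ab22-07ae-40e4-9146-1041d78553fd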
[ovirt-users] VMware Fence Agent
Greetings all,

I am new to oVirt, and have a proof-of-concept setup with a 3-node oVirt cluster nested inside VMware vCenter to learn it, so I can then efficiently migrate that back out to the physical nodes to replace vCenter. I have gotten all the way to a working cluster setup, with the exception of fencing. I used engine-config to pull in the vmware_soap fence agent, and enabled all the options, however there is one small thing I cannot figure out. The connection uses a self-signed certificate on the vCenter side, and I cannot figure out the proper combination of engine-config -s commands to get the script called with the "ssl-insecure" option, which doesn't take a value; it just needs the option passed. Is there anyone out there in the ether that can help me out? I can provide any information you request. Thanks in advance.

The fence agent script is called with the following syntax in my tests, and returned the proper status:

[root@cluster2-vm ~]# /usr/sbin/fence_vmware_soap -o status -a vcenter.address --username="administrator@vsphere.local" --password="0bfusc@t3d" --ssl-insecure -n cluster1-vm
Status: ON

-phunyguy

Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/3PTMUPHR3ZOSQL3SEMTJPAWOAFL5ZUY2/
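In case it is useful: the engine hands options to fence agents as key=value pairs on stdin, and flag-style options such as ssl-insecure are normally passed as ssl_insecure=1 (the Options field of the power management dialog should accept the same key=value form). The exact key names can be confirmed with `fence_vmware_soap -o metadata`. A manual stdin test, reusing the values from the command above, would look like:

/usr/sbin/fence_vmware_soap <<'EOF'
action=status
ipaddr=vcenter.address
login=administrator@vsphere.local
passwd=0bfusc@t3d
port=cluster1-vm
ssl_insecure=1
EOF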