[ovirt-users] Re: Failed HostedEngine Deployment

2022-01-23 Thread Robert Tongue
I think you may be right here. I decided to just start over and use the actual oVirt Node installation media rather than CentOS Stream installation media. Hopefully that gets the software side situated. Thanks for the pointers.

From: Strahil Nikolov 
Sent: Sunday, January 23, 2022 5:46 PM
To: Robert Tongue ; users 
Subject: Re: [ovirt-users] Failed HostedEngine Deployment

yum downgrade qemu-kvm-block-gluster-6.0.0-33.el8s libvirt-daemon-driver-qemu-6.0.0-33.el8s \
    qemu-kvm-common-6.0.0-33.el8s qemu-kvm-hw-usbredir-6.0.0-33.el8s qemu-kvm-ui-opengl-6.0.0-33.el8s \
    qemu-kvm-block-rbd-6.0.0-33.el8s qemu-img-6.0.0-33.el8s qemu-kvm-6.0.0-33.el8s \
    qemu-kvm-block-curl-6.0.0-33.el8s qemu-kvm-block-ssh-6.0.0-33.el8s qemu-kvm-ui-spice-6.0.0-33.el8s \
    ipxe-roms-qemu-6.0.0-33.el8s qemu-kvm-core-6.0.0-33.el8s qemu-kvm-docs-6.0.0-33.el8s \
    qemu-kvm-block-6.0.0-33.el8s
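
(If the downgrade sorts it out, the versionlock plugin can keep a later update from pulling the broken builds back in. This is only a sketch and assumes the python3-dnf-plugin-versionlock package is available:

yum install -y python3-dnf-plugin-versionlock
yum versionlock add qemu-kvm-6.0.0-33.el8s qemu-kvm-core-6.0.0-33.el8s qemu-img-6.0.0-33.el8s
yum versionlock list)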

Best Regards,
Strahil Nikolov

On Sun, Jan 23, 2022 at 22:47, Robert Tongue
 wrote:
Ahh, I ran some repoquery commands and can see that a good number of qemu* packages are coming from appstream rather than ovirt-4.4-centos-stream-advanced-virtualization.
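
(For reference, a query along these lines shows which repo each installed package came from; the exact queryformat tags vary a bit between dnf versions, so treat this as a sketch:

dnf list installed 'qemu*' 'libvirt-daemon-driver-qemu'
dnf repoquery --installed --qf '%{name}-%{version}-%{release} %{from_repo}' 'qemu*')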

What's the recommended fix?

From: Strahil Nikolov 
Sent: Sunday, January 23, 2022 3:41 PM
To: users ; Robert Tongue 
Subject: Re: [ovirt-users] Failed HostedEngine Deployment

I've seen this.

Ensure that all qemu-related packages are coming from the centos-advanced-virtualization repo (6.0.0-33.el8s.x86_64).
There is a known issue with the latest packages in CentOS Stream.

Also, you can set the following alias on the hypervisors:
alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
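
(With the alias in place, the usual read-only queries work against the hosted-engine libvirt instance, which is a quick way to confirm whether the engine VM is even defined and running, for example:

virsh list --all
virsh dumpxml HostedEngine)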


Best Regards,
Strahil Nikolov
[ovirt-users] Re: Failed HostedEngine Deployment

2022-01-23 Thread Robert Tongue
Thanks for the response. How can I verify this? Has something in the installation procedure changed recently?

From: Strahil Nikolov 
Sent: Sunday, January 23, 2022 3:41 PM
To: users ; Robert Tongue 
Subject: Re: [ovirt-users] Failed HostedEngine Deployment

I've seen this.

Ensure that all qemu-related packages are coming from the centos-advanced-virtualization repo (6.0.0-33.el8s.x86_64).
There is a known issue with the latest packages in CentOS Stream.

Also, you can set the following alias on the hypervisors:
alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'


Best Regards,
Strahil Nikolov

[ovirt-users] Re: Failed HostedEngine Deployment

2022-01-23 Thread Robert Tongue
Ahh, I ran some repoquery commands and can see that a good number of qemu* packages are coming from appstream rather than ovirt-4.4-centos-stream-advanced-virtualization.

What's the recommended fix?

From: Strahil Nikolov 
Sent: Sunday, January 23, 2022 3:41 PM
To: users ; Robert Tongue 
Subject: Re: [ovirt-users] Failed HostedEngine Deployment

I've seen this.

Ensure that all qemu-related packages are coming from the centos-advanced-virtualization repo (6.0.0-33.el8s.x86_64).
There is a known issue with the latest packages in CentOS Stream.

Also, you can set the following alias on the hypervisors:
alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'


Best Regards,
Strahil Nikolov

[ovirt-users] Failed HostedEngine Deployment

2022-01-23 Thread Robert Tongue
Greetings oVirt people,

I am having a problem with the hosted-engine deployment, and unfortunately, after a weekend spent trying to get this far, I am finally stuck and cannot figure out how to fix it.

I am starting with 1 host and will have 4 when this is finished. Storage is GlusterFS, hyperconverged, but I am managing that myself outside of oVirt. It's a single-node GlusterFS volume, which I will expand out across the other nodes as well. I get all the way through the initial hosted-engine deployment (via the cockpit interface) pre-storage, then get most of the way through the storage portion of it. It fails at the final step of starting the HostedEngine VM, after copying the VM disk to shared storage.

This is where it gets weird.

[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM 
IP address is while the engine's he_fqdn ovirt.deleted.domain resolves to 
192.168.x.x. If you are using DHCP, check your DHCP reservation configuration"}

I've masked out the domain and IP for obvious reasons. However, I think this deployment error isn't really the reason for the failure; it's just where the process is when it fails. The HostedEngine VM is starting, but not actually booting. I was able to change the VNC password with `hosted-engine --add-console-password` and view the local console with that; however, it just displays "The guest has not initialized the display (yet)".

I also did:

# hosted-engine --console
The engine VM is running on this host
Escape character is ^]

Yet that doesn't move any further, nor allow any input. The VM does not respond on the network. I am thinking it's not even making it to the initial BIOS screen, let alone booting. What would cause that?
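
(A couple of read-only checks that usually show whether the VM ever starts executing at all; these assume the standard libvirt paths on the host:

# virsh -r list --all
# virsh -r dumpxml HostedEngine
# tail -n 50 /var/log/libvirt/qemu/HostedEngine.log)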

Here is the GlusterFS volume info, for clarity.

# gluster volume info storage

Volume Name: storage
Type: Distribute
Volume ID: e9544310-8890-43e3-b49c-6e8c7472dbbb
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node1:/var/glusterfs/storage/1
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.ping-timeout: 5
performance.client-io-threads: on
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 1024
cluster.locking-scheme: full
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
performance.strict-o-direct: on
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Xeon(R) CPU E3-1280 V2 @ 3.60GHz
stepping : 9
microcode : 0x21
cpu MHz : 4000.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm 
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid 
aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr 
pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx f16c rdrand 
lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority 
ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs 
itlb_multihit srbds
bogomips : 7199.86
clflush size : 64
cache_alignment: 64
address sizes : 36 bits physical, 48 bits virtual
power management:

[ plus 7 more ]



Thanks for any insight that can be provided.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JZQYGXQP5DO4HJSLONTBNMPQ5YUX54MX/


[ovirt-users] Re: HostedEngine VM Paused after power failure

2021-02-09 Thread Robert Tongue
I've seen this happen when the VM disk itself becomes corrupt. If you try to read the contents of the file and it gives you "Input/Output Error", then it is not good news. I've been testing oVirt recently, and these issues alone are preventing me from using it full time. I cannot help further, unfortunately, as I have no idea how to fix it. So the best I can say is that hopefully someone else chimes in and helps both of us.
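
(A quick read-only sanity check on the engine disk image is something along these lines, where the path is a placeholder for the actual volume under /rhev/data-center:

# qemu-img info /rhev/data-center/mnt/<storage>/<sd_uuid>/images/<imageID>/<volumeID>
# dd if=/rhev/data-center/mnt/<storage>/<sd_uuid>/images/<imageID>/<volumeID> of=/dev/null bs=1M status=progress

If either of those throws I/O errors, the problem is the underlying storage rather than oVirt itself.)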

-phunyguy

From: ieas...@telvue.com 
Sent: Tuesday, February 9, 2021 6:25 PM
To: users@ovirt.org 
Subject: [ovirt-users] Re: HostedEngine VM Paused after power failure

Attempting to resume or start the VM doesn't yield any results.

Here is the status of the VM:
Host ID: 1
Host timestamp : 115601
Score  : 3400
Engine status  : {"vm": "up", "health": "bad", "detail": 
"Paused", "reason": "bad vm status"}
Hostname   :
Local maintenance  : False
stopped: False
crc32  : 68efbf40
conf_on_shared_storage : True
local_conf_timestamp   : 115601
Status up-to-date  : True
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=115601 (Tue Feb  9 18:25:48 2021)
host-id=1
score=3400
vm_conf_refresh_time=115601 (Tue Feb  9 18:25:48 2021)
conf_on_shared_storage=True
maintenance=False
state=EngineStarting
stopped=False


Here is a chunk of agent.log that is a bit perplexing. I'm not too sure what it means that the VM doesn't exist. Storage is correctly mounted, and everything looks fully operational. I can see the HostedEngine disk available to the host.

MainThread::INFO::2021-02-09 
18:08:13,843::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineDown (score: 3400)
MainThread::INFO::2021-02-09 
18:08:23,864::states::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Engine down and local host has best score (3400), attempting to start engine VM
MainThread::INFO::2021-02-09 
18:08:23,894::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
 Success, was notification of state_transition (EngineDown-EngineStart) sent? 
ignored
MainThread::INFO::2021-02-09 
18:08:23,983::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineStart (score: 3400)
MainThread::INFO::2021-02-09 
18:08:24,000::hosted_engine::895::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
 Ensuring VDSM state is clear for engine VM
MainThread::INFO::2021-02-09 
18:08:24,005::hosted_engine::907::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
 Vdsm state for VM clean
MainThread::INFO::2021-02-09 
18:08:24,005::hosted_engine::853::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
 Starting vm using `/usr/sbin/hosted-engine --vm-start`
MainThread::INFO::2021-02-09 
18:08:24,519::hosted_engine::862::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
 stdout: VM in WaitForLaunch

MainThread::INFO::2021-02-09 
18:08:24,519::hosted_engine::863::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
 stderr: Command VM.getStats with args {'vmID': 
'74b3c839-c89c-4857-ada0-95715672348a'} failed:
(code=1, message=Virtual machine does not exist: {'vmId': 
'74b3c839-c89c-4857-ada0-95715672348a'})

MainThread::INFO::2021-02-09 
18:08:24,519::hosted_engine::875::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
 Engine VM started on localhost
MainThread::INFO::2021-02-09 
18:08:24,552::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
 Success, was notification of state_transition (EngineStart-EngineStarting) 
sent? ignored
MainThread::INFO::2021-02-09 
18:08:24,565::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineStarting (score: 3400)
MainThread::INFO::2021-02-09 
18:08:34,585::states::736::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 VM is powering up..
MainThread::INFO::2021-02-09 
18:08:34,590::state_decorators::99::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
 Timeout set to Tue Feb  9 18:18:34 2021 while transitioning  -> 
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UDKODQL5A4NNIWJMONVYTFIG

[ovirt-users] Custom Fence Agent

2021-01-31 Thread Robert Tongue
Greetings everyone, I am having another problem that I was hoping to get some 
assistance with.  

I have created my own custom fence agent for some Tasmota-flashed Wi-Fi smart plugs that can control the power input to the oVirt nodes. This works great; however, I am running into a problem getting it added to oVirt as a power management agent. I added the custom fence agent with engine-config -s, and it shows up in the web UI as a selectable power management agent. I then put in the details for the plug (IP address, login, password) and press the "Test" button, which passes and shows the status as power=on. Once I save the settings, however, engine.log records that fencing will fail because there is no host available to proxy the operation. When I go back into the power management settings and press "Test" again, I get the error: "Test failed: Failed to run fence status-check on host 'ovirt1'. No other host was available to serve as proxy for the operation."

I have the agent script in /usr/sbin/ on all nodes with execute permissions set, and I can run it manually at the command line just fine, so I am really at a loss as to what to check. What am I missing here? Please help.

Thank you for your time. 

Script:
#!/usr/libexec/platform-python -tt

from urllib.parse import quote
import requests
import sys
import atexit
sys.path.append("/usr/share/fence")
from fencing import *

def set_power_status(conn, options):
    if "on" in options["--action"]:
        response = requests.get(buildUrl(options, "on"))
    elif "off" in options["--action"]:
        response = requests.get(buildUrl(options, "off"))
    return

def get_power_status(conn, options):
    response = requests.get(buildUrl(options, "status"))
    if "\"Power\":0" in response.text:
        return "off"
    elif "\"Power\":1" in response.text:
        return "on"

def buildUrl(options, action):
    cmnd = {
        'on' : 'Power On',
        'off' : 'Power Off',
        'status' : 'Status'
    }
    return ("http://" + options["--ip"] + "/cm?user=" +
            quote(options["--username"]) + "&password=" + quote(options["--password"]) +
            "&cmnd=" + quote(cmnd.get(action, "Error")))

def main():
    device_opt = ["ipaddr", "login", "passwd", "web"]

    atexit.register(atexit_handler)

    all_opt["power_wait"]["default"] = 5

    options = check_input(device_opt, process_input(device_opt))

    docs = {}
    docs["shortdesc"] = "Fence agent for Tasmota-flashed Smarthome Plugs"
    docs["longdesc"] = ""
    docs["vendorurl"] = ""
    show_docs(options, docs)

    ##
    ## Fence operations

    result = fence_action(None, options, set_power_status, get_power_status)
    sys.exit(result)

if __name__ == "__main__":
    main()
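
(Since the script is built on the standard fencing library, it gets the usual command-line options for free, so a manual test can look roughly like this; the script name fence_tasmota and the address/credentials are placeholders, not what I actually use:

# /usr/sbin/fence_tasmota -a 192.168.x.x --username=admin --password=secret -o status

That should print the plug status the same way fence_vmware_soap does.)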
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5CPYEXCOJWRXKFJ2CFBTKOWVX5MHRKHB/


[ovirt-users] Re: VM templates

2021-01-27 Thread Robert Tongue
Ahh, OK, I didn't realize that.  I appreciate it, and was hoping to get good 
feedback like this when I posted what I did.  Will make the changes.

What is meant by "gluster only machine" here?

From: Strahil Nikolov 
Sent: Wednesday, January 27, 2021 7:11 AM
To: Robert Tongue ; users 
Subject: Re: [ovirt-users] Re: VM templates

You should create a file like mine, because vdsm manages /etc/multipath.conf itself:

# cat /etc/multipath/conf.d/blacklist.conf
blacklist {
devnode "*"
wwid nvme.1cc1-324a31313230303131343036-414441544120535838323030504e50-0001
wwid TOSHIBA-TR200_Z7KB600SK46S
wwid ST500NM0011_Z1M00LM7
wwid WDC_WD5003ABYX-01WERA0_WD-WMAYP2303189
wwid WDC_WD15EADS-00P8B0_WD-WMAVU0885453
wwid WDC_WD5003ABYZ-011FA0_WD-WMAYP0F35PJ4
}

Keep in mind 'devnode "*"' is OK only for a gluster-only machine.

Best Regards,
Strahil Nikolov


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6SZZKTZNH67XUZ66FYQ7XI2KGXCFRCZN/


[ovirt-users] Re: VM templates

2021-01-26 Thread Robert Tongue
Correction: the issue came back, but I fixed it again. The actual issue was multipathd; I had to set up device filters in /etc/multipath.conf:

blacklist {
    protocol "(scsi:adt|scsi:sbp)"
    devnode "^hd[a-z]"
    devnode "^sd[a-z]$"
    devnode "^sd[a-z]"
    devnode "^nvme0n1"
    devnode "^nvme0n1p$"
}

Probably overkill, but it works.
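
(After changing the config, something along these lines reloads multipath and confirms the local disks are no longer claimed; these are standard EL8 commands, nothing oVirt-specific:

# multipathd reconfigure
# multipath -ll
# lsblk)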

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YD7ROMATPWFWO2IIUQ3G3WHJDXLFK2DD/


[ovirt-users] Re: VM templates

2021-01-26 Thread Robert Tongue
I fixed my own issue, and for everyone else who may run into this: the issue was that I created the first oVirt node VM inside VMware, got it fully configured with all the software/disks/partitioning/settings, and then cloned it to two more VMs. Then I ran the hosted-engine deployment and set up the cluster. I think it was because I used clones for each cluster node, and that confused things due to duplicated device/system identifiers.

I rebuilt all 3 node VMs from scratch, and everything works perfectly now.

Thanks for listening.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XXHXJEOMZQA2JYL252GZXKZWWHLI6T6W/


[ovirt-users] oVirt initramfs regeneration

2021-01-25 Thread Robert Tongue
Is it possible to force an initramfs regeneration in oVirt Node 4.4.4? I am doing some advanced partitioning and cannot seem to figure out how to do that properly, if it's even possible. Thanks.
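
(On a regular EL8 host the usual tool is dracut, for example the sketch below; whether the result survives oVirt Node's image-based upgrades is exactly the open question here, so treat it as a starting point only:

# dracut -f /boot/initramfs-$(uname -r).img $(uname -r))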

-phunyguy
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7HE7Z6XYJP566PV3BSW225U4UCYE5FW3/


[ovirt-users] Re: VM templates

2021-01-25 Thread Robert Tongue
Thanks for the reply. Here are my GlusterFS options for the volume; am I missing anything critical?

[root@cluster1-vm ~]# gluster volume info storage

Volume Name: storage
Type: Distributed-Disperse
Volume ID: 67112b70-e319-4629-b768-03df9d9a0e84
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: node1-vm:/var/glusterfs/storage/1
Brick2: node2-vm:/var/glusterfs/storage/1
Brick3: node3-vm:/var/glusterfs/storage/1
Brick4: node1-vm:/var/glusterfs/storage/2
Brick5: node2-vm:/var/glusterfs/storage/2
Brick6: node3-vm:/var/glusterfs/storage/2
Brick7: node1-vm:/var/glusterfs/storage/3
Brick8: node2-vm:/var/glusterfs/storage/3
Brick9: node3-vm:/var/glusterfs/storage/3
Brick10: node1-vm:/var/glusterfs/storage/4
Brick11: node2-vm:/var/glusterfs/storage/4
Brick12: node3-vm:/var/glusterfs/storage/4
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.ping-timeout: 5
performance.client-io-threads: on
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 1
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
performance.strict-o-direct: on
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on

From: Strahil Nikolov 
Sent: Monday, January 25, 2021 10:56 AM
To: Robert Tongue ; users 
Subject: Re: [ovirt-users] VM templates

First of all, verify the gluster volume options (gluster volume info VOLNAME; gluster volume status VOLNAME). When you use HCI, oVirt sets up a lot of optimized options in order to get the most out of the Gluster storage.
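
(For reference, the optimized option set applied in HCI deployments largely corresponds to gluster's "virt" profile, so a self-managed volume can usually pick up the same defaults in one go, using the volume name from above:

# gluster volume set storage group virt)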

Best Regards,
Strahil Nikolov

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ONDAHJU23YKKF4TBJPST7X7R4ZOZTQRK/


[ovirt-users] VM templates

2021-01-25 Thread Robert Tongue
Hello,

Another weird issue over here. I have the latest oVirt running inside VMware vCenter as a proof-of-concept/testing platform. Things are finally working well, for the most part; however, I am noticing strange behavior with templates and the VMs deployed from them. Let me explain:

I created a basic Ubuntu Server VM, captured that VM as a template, then deployed 4 VMs from that template. The deployment went fine; however, I can only start 3 of the 4 VMs. If I shut down one of the 3 that I started, I can then start the one that refused to start, and then the one I JUST shut down will refuse to start. The error is:

VM test3 is down with error. Exit message: Bad volume specification {'device': 
'disk', 'type': 'disk', 'diskType': 'file', 'specParams': {}, 'alias': 
'ua-2dc7fbff-da30-485d-891f-03a0ed60fd0a', 'address': {'bus': '0', 
'controller': '0', 'unit': '0', 'type': 'drive', 'target': '0'}, 'domainID': 
'804c6a0c-b246-4ccc-b3ab-dd4ceb819cea', 'imageID': 
'2dc7fbff-da30-485d-891f-03a0ed60fd0a', 'poolID': 
'3208bbce-5e04-11eb-9313-00163e281c6d', 'volumeID': 
'f514ab22-07ae-40e4-9146-1041d78553fd', 'path': 
'/rhev/data-center/3208bbce-5e04-11eb-9313-00163e281c6d/804c6a0c-b246-4ccc-b3ab-dd4ceb819cea/images/2dc7fbff-da30-485d-891f-03a0ed60fd0a/f514ab22-07ae-40e4-9146-1041d78553fd',
 'discard': True, 'format': 'cow', 'propagateErrors': 'off', 'cache': 'none', 
'iface': 'scsi', 'name': 'sda', 'bootOrder': '1', 'serial': 
'2dc7fbff-da30-485d-891f-03a0ed60fd0a', 'index': 0, 'reqsize': '0', 'truesize': 
'2882392576', 'apparentsize': '3435134976'}.

The underlying storage is GlusterFS, self-managed outside of oVirt.

I can provide any logs needed; please let me know which. Thanks in advance.
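
(For reference, a quick check of the volume named in the error, using the 'path' value from the message above, would be something along these lines; vdsm expects the image to be owned by uid/gid 36:36:

# ls -ln <path from the error>
# qemu-img info <path from the error>)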
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TZMAQ56NIQK7DY4WM4RZV4YRERMDEZRO/


[ovirt-users] VMware Fence Agent

2021-01-20 Thread Robert Tongue
Greetings all, I am new to oVirt and have a proof-of-concept setup with a 3-node oVirt cluster nested inside VMware vCenter to learn it, so that I can then efficiently migrate it back out to the physical nodes and replace vCenter. I have gotten all the way to a working cluster setup, with the exception of fencing. I used engine-config to pull in the vmware_soap fence agent and enabled all the options; however, there is one small thing I cannot figure out. The connection uses a self-signed certificate on the vCenter side, and I cannot figure out the proper combination of engine-config -s commands to get the script called with the "ssl-insecure" option, which doesn't take a value; it just needs the option passed. Is there anyone out there in the ether who can help me out? I can provide any information you request. Thanks in advance.

The fence agent script is called with the following syntax in my tests, and 
returned the proper status:

[root@cluster2-vm ~]# /usr/sbin/fence_vmware_soap -o status -a vcenter.address 
--username="administrator@vsphere.local" --password="0bfusc@t3d" --ssl-insecure 
-n cluster1-vm

Status: ON
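
(For what it's worth, when the engine invokes a fence agent it passes options as key=value pairs rather than command-line flags, and the CLI flag --ssl-insecure normally maps to an option named ssl_insecure. That mapping is an assumption on my part, but it can be checked by feeding the agent its options on stdin, the same way the engine would:

# echo -e "action=status\nip=vcenter.address\nusername=administrator@vsphere.local\npassword=0bfusc@t3d\nssl_insecure=1\nplug=cluster1-vm" | /usr/sbin/fence_vmware_soap

If that also reports Status: ON, then ssl_insecure=1 in the agent options should be the missing piece.)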


-phunyguy
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3PTMUPHR3ZOSQL3SEMTJPAWOAFL5ZUY2/