[ovirt-users] Re: HA VM and vm leases usage with site failure

2021-08-06 Thread Klaas Demter

Hi,

I'd say your scenario is exactly what the storage leases are for.

I can tell you that it used to work in 4.3, but I haven't tested/needed 
the feature in quite a while :)


Maybe open a Bugzilla and attach the relevant logs to get the developers' 
attention.



Greetings

Klaas


On 8/5/21 4:42 PM, Gianluca Cecchi wrote:

Hello,
supposing a latest 4.4.7 environment installed with an external engine 
and two hosts, one in one site and one in another site.

For storage I have one FC storage domain.
I'm trying to simulate a sort of "site failure" scenario to see what kind 
of HA I should expect.


The 2 hosts have power mgmt configured through fence_ipmilan.

I have 2 VMs, one configured as HA with lease on storage (Resume 
Behavior: kill) and one not marked as HA.


Initially host1 is SPM and it is the host that runs the two VMs.

Fencing of host1 from host2 initially works OK. I can also test it from the 
command line:
# fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L 
operator -S /usr/local/bin/pwd.sh -o status

Status: ON

On host2 I then block access to host1's iDRAC:
firewall-cmd --direct --add-rule ipv4 filter OUTPUT 0 -d 10.10.193.152 
-p udp --dport 623 -j DROP

firewall-cmd --direct --add-rule ipv4 filter OUTPUT 1 -j ACCEPT

so that:

# fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L 
operator -S /usr/local/bin/pwd.sh -o status
2021-08-05 15:06:07,254 ERROR: Failed: Unable to obtain correct plug 
status or plug is not available


On host1 I trigger a kernel panic:
# date ; echo 1 > /proc/sys/kernel/sysrq ; echo c > /proc/sysrq-trigger
Thu Aug  5 15:06:24 CEST 2021

host1 correctly completes its crash dump (kdump integration is enabled) 
and reboots, but I stop it at the GRUB prompt, so that from host2's point 
of view host1 is unreachable and its power state cannot be determined by 
fencing.


At this point I thought that the VM lease functionality would come into 
play and host2 would be able to restart the HA VM, since it can see that 
the lease is not held by the other host and so it can acquire the lock 
itself.

Instead, host2 goes through a loop of power-fencing attempts.
I wait about 25 minutes without any effect other than the continuous attempts.

After 2 minutes host2 correctly becomes SPM and VMs are marked as unknown

At a certain point, after the repeated failures to power fence host1, I see 
this event:


Failed to power fence host host1. Please check the host status and 
it's power management settings, and then manually reboot it and click 
"Confirm Host Has Been Rebooted"


If I select the host and choose "Confirm Host Has Been Rebooted", then the 
two VMs are marked as down and the HA one is correctly booted by host2.


But this requires my manual intervention.

Is the behavior above the expected one, or should the use of VM leases have 
allowed host2 to bypass its inability to fence and start the HA VM with the 
lease? Otherwise I don't understand the reason to have the lease at all.
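
For reference, on the hosts I can get a rough view of whether the lease is 
actually held by looking at sanlock (just a sanity check; the output details 
vary by version):

# list sanlock lockspaces and resources on the host; the HA VM's lease is a
# sanlock resource on the storage domain's xleases volume
sanlock client status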


Thanks,
Gianluca


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FK254O4WOPWV56F753BVSK5GYQFZ4E5Q/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SGYIN5U57JJCHHQGJFT3FL2BZ3QTYI4H/


[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM

2021-08-06 Thread Strahil Nikolov via Users
Hi David,
I hope you manage to recover the VM or most of the data. If you have multiple 
disks in that VM (easily observable in the oVirt UI), you might need to repeat 
the procedure for the rest of the disks.

Check with xfs_info the inode size (isize), as the default used to be 256, but 
I have noticed that in some cases mkfs.xfs picked a higher value (EL7). Also, 
check Gluster's logs, or at least keep them for a later check. Usually a 
smaller inode size can cause a lot of really awkward issues in Gluster, but 
this needs to be verified.
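
A quick sketch of that check, assuming the brick is mounted under 
/gluster_bricks/data (adjust the path to your setup):

# print the XFS geometry of the brick filesystem; the isize= value is on the
# meta-data line
xfs_info /gluster_bricks/data
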
Once the RAID is fully rebuilt, you will have to add both the HW RAID brick 
and the arbiter brick (add-brick replica 3 arbiter 1). As you will be reusing 
the arbiter brick, the safest approach is to mkfs.xfs it again and also 
increase the inode ratio to 90%.
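
Roughly like this (volume name, host names and brick paths below are only 
placeholders for your environment):

# re-add the rebuilt data brick plus the wiped arbiter brick in one step
gluster volume add-brick myvol replica 3 arbiter 1 \
    host2:/gluster_bricks/myvol/brick arbiter1:/gluster_bricks/myvol/brick
# then watch the heal progress
gluster volume heal myvol info summary
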
Can you provide your volume info? The default shard size is just 64MB and 
transfer is quite fast, so there should be no locking or the symptoms you 
reported.

Once the healing is over, you should be ready for the rebuild of the other node.

Best Regards,
Strahil Nikolov


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XPUTVULWCBIRWJJZMHWRO7XB4VBVBSHU/


[ovirt-users] Re: Architecture design docs

2021-08-06 Thread Jesse Hu
Thanks Tony and Thomas. https://www.ovirt.org/documentation/ contains docs for 
using oVirt, but not docs for developing oVirt.

I somewhat agree with "I also have not encountered any useful oVirt/RHV 
architecture books. The ones I found were very much "do this, do that" and 
didn't help me as a technical architect." oVirt is open sourced under the 
Apache License, but it seems like it's currently not open to contributors or 
3rd-party developers who want to modify and rebuild it from the source code?
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KEXLSWDN4AAD4X3NBNIYFCFRNZKUJHPK/


[ovirt-users] Add nodes to single node gluster hyperconverged

2021-08-06 Thread Mathieu Valois

Hi everyone,

is it possible to add nodes to a single-node Gluster hyperconverged 
setup? I have 3 nodes; one has been reinstalled with a single-node hosted 
engine and the VMs have been migrated onto it. How can I add the two other 
nodes to this setup?


Thank you,

M.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WJV7ST4BQ5P2U4SCSXKYB4VZAX36U4PW/


[ovirt-users] non-critical request - Disk volume label - Web-ui

2021-08-06 Thread Jorge Visentini
Hi everyone!

Firstly, congratulations on the evolution of oVirt 4.4.7.

A *non-critical* request for a future version... if possible, add a label 
to disk volumes in the web UI.

Thank you all!

[image: image.png]

--
Att,
Jorge Visentini
+55 55 98432-9868
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BCFKDMSTRGJGTQ5SCQ26GNG27TQTDZCG/


[ovirt-users] Re: Data recovery from (now unused, but still mounted) Gluster Volume for a single VM

2021-08-06 Thread David White via Users
Thank you for all the responses.
Following Strahil's instructions, I *think* that I was able to reconstruct the 
disk image. I'm just waiting for that image to finish downloading onto my local 
machine, at which point I'll try to import into VirtualBox or something. 
Fingers crossed!

Worst case scenario, I do have backups for that particular VM from 3 months 
ago, which I have already restored onto a new VM.
Losing 3 months of data is much better than losing 100% of the data from the 
past 2-3+ years.

Thank you.

> First of all, you didn't 'mkfs.xfs -i size=512'. You just ran 'mkfs.xfs', 
> which is not good and could have caused your VM problems. Also, check with 
> xfs_info the isize of the FS.

Ok, so right now, my production cluster is operating off of a single brick. I 
was planning on expanding the storage on the 2nd host next week, and adding 
that back into the cluster, and getting the Replica 2, Arbiter 1 redundancy 
working again.

How would you recommend I proceed with that plan, knowing that I'm currently 
operating off of a single brick in which I did NOT specify the size with 
`mkfs.xfs -i size=512`?
Should I specify the size on the new brick I build next week, and then once 
everything is healed, reformat the current brick?
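
In other words, building the new brick with something along these lines, where 
the device and mount point below are just placeholders:

# format the rebuilt brick with 512-byte inodes before adding it back
mkfs.xfs -f -i size=512 /dev/sdb1
mount /dev/sdb1 /gluster_bricks/data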

> And then there is a lot of information missing between the lines: I guess you 
> are using a 3 node HCI setup and were adding new disks (/dev/sdb) on all 
> three nodes and trying to move the glusterfs to those new bigger disks?

You are correct in that I'm using 3-node HCI. I originally built HCI with 
Gluster replication on all 3 nodes (Replica 3). As I'm increasing the storage, 
I'm also moving to an architecture of Replica 2/Arbiter 1. So yes, the plan was:

1) Convert FROM Replica 3 TO replica 2/arbiter 1
2) Convert again down to a Replica 1 (so no replication... just operating 
storage on a single host)
3) Rebuild the RAID array (with larger storage) on one of the unused hosts, and 
rebuild the gluster bricks
4) Add the larger RAID back into gluster, let it heal
5) Now, remove the bricks from the host with the smaller storage -- THIS is 
where things went awry, and what caused the data loss on this 1 particular VM
--- This is where I am currently ---
6) Rebuild the RAID array on the remaining host that is now unused (This is 
what I am / was planning to do next week)




Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐

On Thursday, August 5th, 2021 at 3:12 PM, Thomas Hoberg  
wrote:

> If you manage to export the disk image via the GUI, the result should be a 
> qcow2 format file, which you can mount/attach to anything Linux (well, if the 
> VM was Linux... it didn't say)
> 

> But it's perhaps easier to simply try to attach the disk of the failed VM as 
> a secondary to a live VM to recover the data.
> 

> Users mailing list -- users@ovirt.org
> 

> To unsubscribe send an email to users-le...@ovirt.org
> 

> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> 

> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> 

> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/SLXLQ4BLQUPBV5355DFFACF6LFJX4MWY/

publickey - dmwhite823@protonmail.com - 0x320CD582.asc
Description: application/pgp-keys


signature.asc
Description: OpenPGP digital signature
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CZDHKGQES4ZOGGFJIBB46CZEGD647DLZ/


[ovirt-users] Adding new host to a cluster and stuck at "Detect if host is a prebuilt image".

2021-08-06 Thread fauzuwan . nazri93
Hello everyone, I tried to add a host to a new cluster, and
during provisioning the event log showed "Detect if host is a prebuilt image."
and it just stays stuck there forever.

Version: ovirt-node-ng-image-update-placeholder-4.4.7.1-1.el8.noarch
Actual results: Host provisioning stuck at "Detect if host is a prebuilt 
image."
Expected results: Provisioning successful.

Checking the oVirt host deployment playbook logs, I can't find any useful 
output.
Below is the last output from the logs.


2021-08-06 19:25:12 MYT - TASK [ovirt-host-deploy-facts : Detect if host is a 
prebuilt image] 
2021-08-06 19:25:12 MYT - ok: [ovirth02.zyzyx.virtnet]
2021-08-06 19:25:12 MYT - {
  "status" : "OK",
  "msg" : "",
  "data" : {
"uuid" : "a7b74be9-30ab-4ca4-b2b6-e7af6b86bb6c",
"counter" : 24,
"stdout" : "ok: [ovirth02.zyzyx.virtnet]",
"start_line" : 22,
"end_line" : 23,
"runner_ident" : "f155ccc8-f6a8-11eb-94c2-00e04cf8ff45",
"event" : "runner_on_ok",
"pid" : 1549383,
"created" : "2021-08-06T11:25:10.333884",
"parent_uuid" : "00e04cf8-ff45-b99d-863c-0182",
"event_data" : {
  "playbook" : "ovirt-host-deploy.yml",
  "playbook_uuid" : "c1e368bf-e183-47c0-b6f5-377114c20eab",
  "play" : "all",
  "play_uuid" : "00e04cf8-ff45-b99d-863c-0007",
  "play_pattern" : "all",
  "task" : "Detect if host is a prebuilt image",
  "task_uuid" : "00e04cf8-ff45-b99d-863c-0182",
  "task_action" : "set_fact",
  "task_args" : "",
  "task_path" : 
"/usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-host-deploy-facts/tasks/host-os.yml:26",
  "role" : "ovirt-host-deploy-facts",
  "host" : "ovirth02.zyzyx.virtnet",
  "remote_addr" : "ovirth02.zyzyx.virtnet",
  "res" : {
"changed" : false,
"ansible_facts" : {
  "node_host" : true
},
"_ansible_no_log" : false
  },
  "start" : "2021-08-06T11:25:10.254355",
  "end" : "2021-08-06T11:25:10.333550",
  "duration" : 0.079195,
  "event_loop" : null,
  "uuid" : "a7b74be9-30ab-4ca4-b2b6-e7af6b86bb6c"
}
  }
}

2021-08-06 19:25:12 MYT - TASK [ovirt-host-deploy-facts : Reset configuration 
of advanced virtualization module] ***
2021-08-06 19:25:12 MYT - TASK [ovirt-host-deploy-facts : Find relevant 
advanced virtualization module version] ***
2021-08-06 19:25:12 MYT - TASK [ovirt-host-deploy-facts : Enable advanced 
virtualization module] *
2021-08-06 19:25:12 MYT - TASK [ovirt-host-deploy-facts : Ensure Python3 is 
installed for CentOS/RHEL8 hosts] ***



Thank You.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DWLRVZEU6DYZFUCWRNWGLTA2SNJ25ST2/


[ovirt-users] Wondering a way to create a universal USB Live OS to connect to VM Portal

2021-08-06 Thread eta . entropy
Hi All,

once the oVirt infrastructure has been set up, VMs created and assigned to 
users, and the users manage them through the VM Portal,

I'm wondering if there is an easy way to provide end users with a USB key that 
will just:

> plug into any computer and boot from it
> connect to the given VM Portal to manage the assigned VMs
> connect to the given SPICE console to enter the assigned VMs

Just like a Linux Live USB with a preconfigured service that starts up and 
connects to the VM Portal or a SPICE console.
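
For the SPICE part I imagine the key would just autostart something like this 
at boot (the console address below is only a placeholder), plus a kiosk-mode 
browser pointed at the VM Portal URL:

# open the SPICE client against a preconfigured console address
remote-viewer spice://vmportal.example.com:5900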

Is there something already available to start from?

Is this something doable, or am I dreaming?

Thanks for any input
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RELZ52FICDVN6YSNQ3YXFEUVHH7IJHMD/


[ovirt-users] Re: Question about PCI storage passthrough for a single guest VM

2021-08-06 Thread Thomas Hoberg
You're welcome!

My machine learning team members, whom I maintain oVirt for, tend to load 
training data in large sequential batches, which means bandwidth is nice to 
have. While I give them local SSD storage on the compute nodes, I also give 
them lots of HDD/VDO-based Gluster file space, which might do miserably on 
OLTP, but pipes out sequential data at rates at least similar to SATA SSDs 
over a 10Gbit network. It seems to work for them, because to CUDA applications 
even RAM is barely faster than block storage.

PCIe 4.0 NVMe at 8GB/s per device becomes a challenge to any block storage 
abstraction, inside or outside a VM. And when we are talking about NVMe storage 
with "native" KV APIs like FusionIO did back then, PCI pass-through will be a 
necessity, unless somebody comes up with a new hardware abstraction layer.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JTVITH24IOZXOJLQAW6UCQAYHTZWAYUI/


[ovirt-users] Re: live merge of snapshots failed

2021-08-06 Thread g . vasilopoulos
I think these are the corresponding logs:
qcow2: Marking image as corrupt: Cluster allocation offset 0x7890c000 unaligned 
(L2 offset: 0x39e0, L2 index: 0); further corruption events will be 
suppressed
main_channel_link: add main channel client
main_channel_client_handle_pong: net test: latency 12.959000 ms, bitrate 
3117199391 bps (2972.792998 Mbps)
inputs_connect: inputs channel client create
red_qxl_set_cursor_peer: 
red_channel_client_disconnect: rcc=0x56405bdf69c0 (channel=0x56405ad7c940 
type=3 id=0)
red_channel_client_disconnect: rcc=0x56405e78cdd0 (channel=0x56405bb96900 
type=4 id=0)
red_channel_client_disconnect: rcc=0x56405e79c5b0 (channel=0x56405ad7c220 
type=2 id=0)
red_channel_client_disconnect: rcc=0x56405bdea9f0 (channel=0x56405ad7c150 
type=1 id=0)
main_channel_client_on_disconnect: rcc=0x56405bdea9f0
red_client_destroy: destroy client 0x56405c383110 with #channels=4
red_qxl_disconnect_cursor_peer: 
red_qxl_disconnect_display_peer: 
2021-08-03T08:10:50.516974Z qemu-kvm: terminating on signal 15 from pid 6847 
()
2021-08-03 08:10:50.717+: shutting down, reason=destroyed
2021-08-03 11:02:57.502+: starting up libvirt version: 4.5.0, package: 
33.el7_8.1 (CentOS BuildSystem , 2020-05-12-16:25:35, 
x86-01.bsys.centos.org), qemu version: 2.12.0qemu-kvm-ev-2.12.0-44.1.el7_8.1, 
kernel: 3.10.0-1127.8.2.el7.x86_64, hostname: ovirt3-5.vmmgmt-int.uoc.gr
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
QEMU_AUDIO_DRV=none \
/usr/libexec/qemu-kvm \
-name guest=anova.admin.uoc.gr,debug-threads=on \
-S \
-object 
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-12-anova.admin.uoc.gr/master-key.aes
 \
-machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off \
-cpu 
Westmere,vme=on,pclmuldq=on,x2apic=on,hypervisor=on,arat=on,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_synic,hv_stimer
 \
-m size=8388608k,slots=16,maxmem=33554432k \
-realtime mlock=off \
-smp 2,maxcpus=16,sockets=16,cores=1,threads=1 \
-object iothread,id=iothread1 \
-numa node,nodeid=0,cpus=0-1,mem=8192 \
-uuid 1c1d20ed-3167-4be7-bff3-29845142fc57 \
-smbios 'type=1,manufacturer=oVirt,product=oVirt 
Node,version=7-8.2003.0.el7.centos,serial=4c4c4544-0053-4b10-8059-cac04f475832,uuid=1c1d20ed-3167-4be7-bff3-29845142fc57'
 \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=33,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=2021-08-03T12:02:56,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-global PIIX4_PM.disable_s3=1 \
-global PIIX4_PM.disable_s4=1 \
-boot strict=on \
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-device 
virtio-scsi-pci,iothread=iothread1,id=ua-90ae154d-56b8-499a-9173-c4cd225ba0c6,bus=pci.0,addr=0x7
 \
-device 
virtio-serial-pci,id=ua-a8dc285c-6fa9-45b2-a4f9-c8862be71342,max_ports=16,bus=pci.0,addr=0x4
 \
-drive 
file=/rhev/data-center/mnt/10.252.80.208:_home_isos/5b1a0f29-8f97-42c3-bea2-39f83bbfbf24/images/----/virtio-win-0.1.185.iso,format=raw,if=none,id=drive-ua-cfb42882-2eba-41b9--43781eeff382,werror=report,rerror=report,readonly=on
 \
-device 
ide-cd,bus=ide.1,unit=0,drive=drive-ua-cfb42882-2eba-41b9--43781eeff382,id=ua-cfb42882-2eba-41b9--43781eeff382,bootindex=2
 \
-drive 
file=/rhev/data-center/mnt/blockSD/a5a492a7-f770-4472-baa3-ac7297a581a9/images/2e6e3cd3-f0cb-47a7-8bda-7738bd7c1fb5/84c005da-cbec-4ace-8619-5a8e2ae5ea75,format=raw,if=none,id=drive-ua-2e6e3cd3-f0cb-47a7-8bda-7738bd7c1fb5,serial=2e6e3cd3-f0cb-47a7-8bda-7738bd7c1fb5,werror=stop,rerror=stop,cache=none,aio=native,throttling.bps-read=157286400,throttling.bps-write=73400320,throttling.iops-read=1200,throttling.iops-write=180
 \
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZHUBVVF5QJC7QBX526R7BXBILTHRGMNP/


[ovirt-users] Re: Architecture design docs

2021-08-06 Thread Thomas Hoberg
In the two years that I have been using oVirt, I've been yearning for a nice 
architecture primer myself, but I have not been able to find a "textbook 
style" architecture document.

And it does not help that some of the more in-depth information on the oVirt 
site doesn't seem navigable from the main "Documentation" link.

This is my very personal opinion, others might have different impressions.

oVirt isn't really a product in the sense that all parts were designed to go 
together. Instead it's a package that has been assembled from quite a few 
rather distinct pieces of technology that Red Hat has acquired over the last 
decades. What some might view as proof of extreme flexibility, others will see 
as lack of integration. The oVirt team is careful not to re-implement anything 
on their side that some other component already delivers. Unfortunately, that 
means you had better understand those components underneath, their range of 
functionality and the tooling around them, because the oVirt guys won't explain 
what other teams do (e.g. KVM, Gluster, VDO, LVM, Ansible).

KVM originates from Moshe Bar's Qumranet, is the key ingredient of oVirt/RHV 
but also leads a somewhat independent life on its own.

Gluster was a separate scale-out storage company that Red Hat acquired, which 
has been passing through its very own trials and tribulations and suffers from 
a lack of large-scale adoption, especially since scale-out is either cloud or 
HPC, where Gluster seems to hold little appeal. I think it's stagnating, and 
its level of integration with oVirt is really minimal, even with the tons of 
work developers have done. I consider oVirt's HCI a brilliant value proposition 
in theory and a half-baked implementation that one certainly should not use 
"for the entire enterprise".

VDSM is core to oVirt, and the philosophical principle AFAIK has remained 
unchanged over the last ten years. Its approach also isn't exactly novel (but 
solid!); I see parallels not only with vSphere but right back to things like 
the Tivoli Workload Scheduler, a mainframe batch scheduler going back decades.

The working principle is to make a deployment plan (~batch schedule) and 
engrave that plan into persistent shared storage, so that every worker (host) 
can follow this optimal and conflict-free plan, while the manager is free to 
rest or die or be rebooted. It relieves the manager of having to be clustered 
for availability itself.

VDSM is the agent on every host (or node) responsible for reading and running 
that plan, and the engine is the manager, which continuously creates the new 
plans.

Originally the manager ran on a separate physical machine, but then somebody 
managed to get it "teleported" into a VM running on the oVirt farm itself. 
It's a very nice feat and mostly just eliminates the need for a separate host 
to run the engine. It also allows for an automatic restart of the management 
engine on another host, should the host it ran on fail. But it still needs some 
special treatment compared to any other VM. Again, this is something VMware and 
Oracle also managed to do with their VM orchestration tools; perhaps Nutanix 
was the first out of the door with that feature.

Perhaps looking at what the other vendors do, sometimes helps to understand how 
oVirt works, because they do copy ideas from each other (and may be shy about 
documenting that).

There are architecture presentations out there, which unfortunately mostly 
describe the implementation changes made over the years, not the fundamental 
design philosophy nor the current implementation state. That's mostly because 
the implementation has changed fundamentally and keeps changing rapidly so the 
effort to maintain up-to-date docs seems too great. E.g. one of the more recent 
key efforts has been to make Ansible do as much work as possible, where the 
original implementation seems to have used scripts.

But that should not keep someone from doing a textbook on the architecture 
design principles behind oVirt, and perhaps a condensed overview about the 
implementation changes over the years and their motivations.

One could argue that while Red Hat is an open source company, it's not an open 
knowledge company. It doesn't necessarily publish all the internal 
documentation and training material it creates for its support engineers. 
They do want to sell commercial support.

On the other hand, there is conference material, and there are lots of 
RHV-related YouTube videos scattered around, so you can find a lot of 
information, just not in that tight, nice little book you and I seem to wish 
for.

Unfortunately, I also have not encountered any useful oVirt/RHV architecture 
books. The ones I found were very much "do this, do that" and didn't help me as 
a technical architect.

If I thought that oVirt had a bright and bullish future, I'd be tempted to 
write such a book myself.

With vSphere struggling against clouds, I don't see RHV/oVirt doing the right 
things to 

[ovirt-users] Re: live merge of snapshots failed

2021-08-06 Thread Benny Zlotnik
2021-08-03 15:50:59,040+0300 ERROR (libvirt/events) [virt.vm]
(vmId='1c1d20ed-3167-4be7-bff3-29845142fc57') Block job ACTIVE_COMMIT
for drive 
/rhev/data-center/mnt/blockSD/a5a492a7-f770-4472-baa3-ac7297a581a9/images/2e6e3cd3-f0cb-47a7-8bda-7738bd7c1fb5/b43b7c33-5b53-4332-a2e0-f950debb919b
has failed (vm:5847)

Do you have access to libvirtd logs?
Since you're using an outdated version, it's possible you've hit an old
bug that has already been fixed.
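
If debug logging isn't already enabled there, you can turn it up on the host 
with something like this and then reproduce the merge (the filter and path 
values below are just the usual suggestions, adjust as needed):

# enable verbose libvirtd logging, then restart the daemon (running VMs keep
# running)
echo 'log_filters="1:qemu 1:libvirt 4:object 4:json 4:event"' >> /etc/libvirt/libvirtd.conf
echo 'log_outputs="1:file:/var/log/libvirt/libvirtd.log"' >> /etc/libvirt/libvirtd.conf
systemctl restart libvirtd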

On Wed, Aug 4, 2021 at 10:30 AM  wrote:
>
> here is the vdsm.log from the SPM
> there is a report for the second disk of the VM, but the first one (the one 
> which fails to merge) does not seem to be anywhere
> 2021-08-03 15:51:40,051+0300 INFO  (jsonrpc/7) [vdsm.api] START 
> getVolumeInfo(sdUUID=u'96000ec9-e181-44eb-893f-e0a36e3a6775', 
> spUUID=u'5da76866-7b7d-11eb-9913-00163e1f2643', 
> imgUUID=u'205a30a3-fc06-4ceb-8ef2-018f16d4ccbb', 
> volUUID=u'7611ebcf-5323-45ca-b16c-9302d0bdedc6', options=None) 
> from=:::10.252.80.201,58850, 
> flow_id=3bf9345d-fab2-490f-ba44-6aa014bbb743, 
> task_id=be6c50d9-a8e4-4ef5-85cf-87a00d79d77e (api:48)
> 2021-08-03 15:51:40,052+0300 INFO  (jsonrpc/7) [storage.VolumeManifest] Info 
> request: sdUUID=96000ec9-e181-44eb-893f-e0a36e3a6775 
> imgUUID=205a30a3-fc06-4ceb-8ef2-018f16d4ccbb volUUID = 
> 7611ebcf-5323-45ca-b16c-9302d0bdedc6  (volume:240)
> 2021-08-03 15:51:40,081+0300 INFO  (jsonrpc/7) [storage.VolumeManifest] 
> 96000ec9-e181-44eb-893f-e0a36e3a6775/205a30a3-fc06-4ceb-8ef2-018f16d4ccbb/7611ebcf-5323-45ca-b16c-9302d0bdedc6
>  info is {'status': 'OK', 'domain': '96000ec9-e181-44eb-893f-e0a36e3a6775', 
> 'voltype': 'LEAF', 'description': 
> '{"DiskAlias":"anova.admin.uoc.gr_Disk2","DiskDescription":""}', 'parent': 
> '----', 'format': 'RAW', 'generation': 0, 
> 'image': '205a30a3-fc06-4ceb-8ef2-018f16d4ccbb', 'disktype': 'DATA', 
> 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '42949672960', 'children': 
> [], 'pool': '', 'ctime': '1625846644', 'capacity': '42949672960', 'uuid': 
> u'7611ebcf-5323-45ca-b16c-9302d0bdedc6', 'truesize': '42949672960', 'type': 
> 'PREALLOCATED', 'lease': {'path': 
> '/dev/96000ec9-e181-44eb-893f-e0a36e3a6775/leases', 'owners': [], 'version': 
> None, 'offset': 105906176}} (volume:279)
> 2021-08-03 15:51:40,081+0300 INFO  (jsonrpc/7) [vdsm.api] FINISH 
> getVolumeInfo return={'info': {'status': 'OK', 'domain': 
> '96000ec9-e181-44eb-893f-e0a36e3a6775', 'voltype': 'LEAF', 'description': 
> '{"DiskAlias":"anova.admin.uoc.gr_Disk2","DiskDescription":""}', 'parent': 
> '----', 'format': 'RAW', 'generation': 0, 
> 'image': '205a30a3-fc06-4ceb-8ef2-018f16d4ccbb', 'disktype': 'DATA', 
> 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '42949672960', 'children': 
> [], 'pool': '', 'ctime': '1625846644', 'capacity': '42949672960', 'uuid': 
> u'7611ebcf-5323-45ca-b16c-9302d0bdedc6', 'truesize': '42949672960', 'type': 
> 'PREALLOCATED', 'lease': {'path': 
> '/dev/96000ec9-e181-44eb-893f-e0a36e3a6775/leases', 'owners': [], 'version': 
> None, 'offset': 105906176}}} from=:::10.252.80.201,58850, 
> flow_id=3bf9345d-fab2-490f-ba44-6aa014bbb743, 
> task_id=be6c50d9-a8e4-4ef5-85cf-87a00d79d77e (api:54)
> 2021-08-03 15:51:40,083+0300 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC 
> call Volume.getInfo succeeded in 0.04 seconds (__init__:312)
>
> the last appearance of this drive in the SPM vdsm.log is when the snapshot 
> download finishes:
> 2021-08-03 15:34:18,619+0300 INFO  (jsonrpc/6) [vdsm.api] FINISH 
> get_image_ticket return={'result': {u'timeout': 300, u'idle_time': 0, 
> u'uuid': u'5c1943a9-cac4-4398-9ec1-46ab82cacd04', u'ops': [u'read'], u'url': 
> u'file:///rhev/data-center/mnt/blockSD/a5a492a7-f770-4472-baa3-ac7297a581a9/images/2e6e3cd3-f0cb-47a7-8bda-7738bd7c1fb5/84c005da-cbec-4ace-8619-5a8e2ae5ea75',
>  u'expires': 6191177, u'transferred': 150256746496, u'transfer_id': 
> u'7dcb75c0-4373-4986-b25f-5629b1b68f5d', u'sparse': False, u'active': True, 
> u'size': 150323855360}} from=:::10.252.80.201,58850, 
> flow_id=3035db30-8a8c-48a5-b0c6-0781fda6ac2e, 
> task_id=674028a2-e37c-46e4-a463-eeae1b09aef0 (api:54)
> 2021-08-03 15:34:18,620+0300 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC 
> call Host.get_image_ticket succeeded in 0.00 seconds (__init__:312)
>
> If I can send any more information or test something please let me know.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/KEJ24BI6PLXYFQHJ6O2AESK3M4SXMUID/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html

[ovirt-users] Major network issue with 4.4.7

2021-08-06 Thread Andrea Chierici

Dear all,
I've been using oVirt for at least 6 years, and only lately have I stepped 
into a weird problem that I hope someone will be able to help with.


My hardware is:
- blade lenovo for hosts, with dual switch
- dell equallogic for iscsi storage, directly connected to the blade 
switches
- The two host network cards are configured with bonding, and all the 
VLANs are accessed through it (MTU 9000)

- all the hosts and the oVirt engine have the firewalld service disabled

My engine is hosted on a separate vmware vm (I will evaluate the self 
hosted engine later...). I want to stress the fact that for years this 
setup worked smoothly without any significant issue (and all the minor 
updates were completed flawlessly).


A few weeks ago I started the update from the rock-solid 4.3 to the 
latest 4.4.7. I began with the manager, following the docs, installing a 
new CentOS 8 VM and importing the backup: everything went smoothly and I 
was able to access the manager without any problem, all the 
machines still there :)

I then began updating the hosts, from CentOS 7 to CentOS 8 Stream, one by one.
Immediately I noticed network issues with the VMs hosted on the first 
updated host. Migrating VMs from a CentOS 8 host to another CentOS 8 host quite 
often fails, but the main issue is this: *if I start one of the VMs on 
a CentOS 8 host, it has no network connectivity. If I migrate it 
to a CentOS 7 host the network starts to work, and if I migrate the VM 
back to the CentOS 8 host, the network keeps working.*
I am puzzled and can't understand what's going on. Generally speaking, 
all the CentOS 8 hosts (I have 6 in my cluster; 3 are now CentOS 8 
while the rest are still CentOS 7) seem to be very unstable, meaning that 
the VMs they host quite often show network issues and temporary 
glitches.


Can someone give a hint on how to solve this weird issue?


Thanks,
Andrea


--
Andrea Chierici - INFN-CNAF 
Viale Berti Pichat 6/2, 40127 BOLOGNA
Office Tel: +39 051 2095463 
SkypeID ataruz
--



smime.p7s
Description: S/MIME Cryptographic Signature
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KIFOGDEAA2TCQPNDCVT56VPSDBVY47BF/


[ovirt-users] glance.ovirt.org planned outage: 10.08.2021 at 01:00 UTC

2021-08-06 Thread Evgheni Dereveanchin
Hi everyone,

There's an outage scheduled in order to move glance.ovirt.org to new
hardware. This will happen after midnight the upcoming Tuesday between 1AM
and 3AM UTC. It will not be possible to pull images from our Glance image
registry during this period. Other services will not be affected.

If you see any CI jobs failing on Glance tests - please re-run them in the
morning after the planned outage window is over. If issues persist please
report it via JIRA or reach out to me personally.

-- 
Regards,
Evgheni Dereveanchin
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4J6JEU6IVPN7YJB7SBHG34AQTY4QKAFW/


[ovirt-users] [ANN] oVirt 4.4.8 Fourth Release Candidate is now available for testing

2021-08-06 Thread Sandro Bonazzola
oVirt 4.4.8 Fourth Release Candidate is now available for testing

The oVirt Project is pleased to announce the availability of oVirt 4.4.8
Fourth Release Candidate for testing, as of August 6th, 2021.

This update is the eighth in a series of stabilization updates to the 4.4
series.
Documentation

   - If you want to try oVirt as quickly as possible, follow the instructions
     on the Download page.
   - For complete installation, administration, and usage instructions, see
     the oVirt Documentation.
   - For upgrading from a previous version, see the oVirt Upgrade Guide.
   - For a general overview of oVirt, see About oVirt.

Important notes before you try it

Please note this is a pre-release build.

The oVirt Project makes no guarantees as to its suitability or usefulness.

This pre-release must not be used in production.
Installation instructions

For installation instructions and additional information please refer to:

https://ovirt.org/documentation/

This release is available now on x86_64 architecture for:

* Red Hat Enterprise Linux 8.4 or similar

* CentOS Stream 8

This release supports Hypervisor Hosts on x86_64 and ppc64le architectures
for:

* Red Hat Enterprise Linux 8.4 or similar

* CentOS Stream 8

* oVirt Node 4.4 based on CentOS Stream 8 (available for x86_64 only)

See the release notes [1] for installation instructions and a list of new
features and bugs fixed.

Notes:

- oVirt Appliance is already available based on CentOS Stream 8

- oVirt Node NG is already available based on CentOS Stream 8

Additional Resources:

* Read more about the oVirt 4.4.8 release highlights:
http://www.ovirt.org/release/4.4.8/

* Get more oVirt project updates on Twitter: https://twitter.com/ovirt

* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/


[1] http://www.ovirt.org/release/4.4.8/
[2] http://resources.ovirt.org/pub/ovirt-4.4-pre/iso/

-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com


*Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.*
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/O3US6YVYGCLJDAGV7DSH4YHT4VTMR56U/


[ovirt-users] Re: ISO Upload in in Paused by System Status

2021-08-06 Thread Strahil Nikolov via Users
Have you tried with an empty password?
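
One quick way to narrow it down is to check the downloaded file itself (the 
file name below is just a placeholder):

# verify the file really is PKCS#12 and whether a password opens it
openssl pkcs12 -info -noout -in downloaded-cert.p12
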
Best Regards,
Strahil Nikolov
 
 
I obtained the certificate from the link on the oVirt console main 
page. The certificate has been saved to storage. I attempt to import the 
certificate into a Firefox browser and get the following message:

Please enter the password that was used to encrypt this certificate backup:

I enter the same password used during the installation of oVirt. After 
entering the password, the following message is displayed:

Failed to decode the file. Either it is not in PKCS #12 format, has been 
corrupted, or the password you entered was incorrect.

What could be the problem here? I don't have another password to enter.

Thanks
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/D3DRU37QBB4V43M2NT7BQKEUDHCFI635/
  
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2JKT2K5J26W3I22IACXSBYFQHZ5EUGCK/