[ovirt-users] Re: [Non-DoD Source] Re: NVIDIA vGPU driver for Ovirt 4.5.4

2024-01-18 Thread Vinícius Ferrão via Users
Yes, and it’s expensive.

In our case we just use PCI-E Passthrough.


Sent from my iPhone

On 18 Jan 2024, at 12:50, samuel@horebdata.cn wrote:


Great.

Another related question: is it true that one has to buy an NVIDIA vGPU license
in addition to the GPU hardware?


Do Right Thing (做正确的事) / Pursue Excellence (追求卓越) / Help Others Succeed (成就他人)

From: Silveira, Michael A CTR USN NAVSTA NEWPORT RI (USA) via 
Users
Date: 2024-01-18 12:48
To: gianluca.amato...@gmail.com
CC: users@ovirt.org
Subject: [ovirt-users] Re: [Non-DoD Source] Re: NVIDIA vGPU driver for Ovirt 
4.5.4
Hello,

I would need to virtualize it using GRID so it can be used by multiple VMs. I
ended up upgrading the kernel to 4.18.0-477.10.1.el8_8.x86_64 and was able to
install the 16.2 NVIDIA driver for RHEL 8.8, and that works.
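
For anyone hitting the same wall, a rough sketch of that sequence on a plain
EL8 host (the exact kernel package and vGPU .run file names below are
assumptions; check the downloads in your NVIDIA licensing portal):

```
# Move to the newer kernel first, then load the 16.2 vGPU/GRID host driver.
dnf install kernel-4.18.0-477.10.1.el8_8        # the kernel mentioned above
reboot
sh NVIDIA-Linux-x86_64-<16.2-build>-vgpu-kvm.run  # placeholder file name
nvidia-smi                                      # should now reach the driver
```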

V/r,
Mike

From: Gianluca Amato 
Sent: Thursday, January 18, 2024 5:06 AM
To: Silveira, Michael A CTR USN NAVSTA NEWPORT RI (USA) 

Cc: users@ovirt.org
Subject: [Non-DoD Source] Re: [ovirt-users] NVIDIA vGPU driver for Ovirt 4.5.4

Do you want to virtualize the Tesla V100, or are you just assigning the GPU to
a single VM via PCI passthrough?

--gianluca

On Thu, Jan 18, 2024 at 9:23 AM michael.a.silveira3.ctr--- via Users
<users@ovirt.org> wrote:
Hello,

Does anyone know which, if any, NVIDIA GRID driver supports oVirt 4.5.4 on
oVirt Node (kernel 4.18.0-408.el8.x86_64)?  I've recently upgraded to oVirt 4.5
and can't find an NVIDIA GRID driver that will connect to my Tesla V100 on the
new kernel.  nvidia-smi returns the following no matter what driver I install:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.  
Make sure that the latest NVIDIA driver is installed and running.
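
When nvidia-smi reports that, a few generic checks usually narrow it down
(standard tools, nothing oVirt-specific; a hedged sketch):

```
uname -r                           # kernel the node is actually running
lsmod | grep nvidia                # is the nvidia module loaded at all?
dmesg | grep -i -e nvidia -e nvrm  # load failures / version mismatches
modinfo nvidia | grep ^vermagic    # kernel the module was built against
```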
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SEOQQJRD67Z3ZBX2X2LD7E53EQEADB5J/


[ovirt-users] GPU Passthrough issues with oVirt 4.5

2023-06-30 Thread Vinícius Ferrão via Users
Hello, is anyone else having issues with device passthrough on oVirt 4.5?

I can pass the devices through to a given VM without issue, but inside the VM
not all of them are recognized.

In my case I’ve added 4x GPUs to a VM, but only one shows up, and the following
errors appear inside the VM:

[   23.006655] nvidia 0000:0a:00.0: enabling device (0000 -> 0002)
[   23.008026] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)
[   23.008035] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR2 is 0M @ 0x0 (PCI:0000:0a:00.0)
[   23.008040] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR3 is 0M @ 0x0 (PCI:0000:0a:00.0)
[   23.008045] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR4 is 0M @ 0x0 (PCI:0000:0a:00.0)
[   23.008049] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR5 is 0M @ 0x0 (PCI:0000:0a:00.0)
[   23.012339] NVRM: The NVIDIA GPU 0000:0a:00.0 (PCI ID: 10de:1db1)
               NVRM: installed in this system is not supported by the
               NVRM: NVIDIA 535.54.03 driver release.
               NVRM: Please see 'Appendix A - Supported NVIDIA GPU Products'
               NVRM: in this release's README, available on the operating system
               NVRM: specific graphics driver download page at www.nvidia.com.
[   23.016175] nvidia: probe of 0000:0a:00.0 failed with error -1
[   23.016838] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR0 is 0M @ 0x0 (PCI:0000:0b:00.0)
[   23.016842] nvidia: probe of 0000:0b:00.0 failed with error -1
[   23.017211] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR0 is 0M @ 0x0 (PCI:0000:0c:00.0)
[   23.017215] nvidia: probe of 0000:0c:00.0 failed with error -1
[   23.017248] NVRM: The NVIDIA probe routine failed for 3 device(s).
[   23.214409] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  535.54.03  Tue Jun  6 22:20:39 UTC 2023
[   23.485704] [drm] [nvidia-drm] [GPU ID 0x00000900] Loading driver
[   23.485708] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:09:00.0 on minor 1

On the host, dmesg shows the following, which looks right:

[  709.572845] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[  709.572877] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[  709.572883] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[  710.660813] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[  710.660845] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[  710.660851] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[  711.748760] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[  711.748791] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[  711.748797] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[  712.836687] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[  712.836718] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[  712.836725] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
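
BARs reported as "0M @ 0x0" usually mean the guest firmware could not assign
the GPU's large 64-bit BARs (a V100 exposes a multi-GiB BAR1, and four of them
need a correspondingly large MMIO window). A hedged way to inspect this from
inside the guest, using the address from the log above:

```
lspci -vvv -s 0a:00.0 | grep -i region   # BAR sizes/assignments the guest sees
dmesg | grep -iE 'BAR|mmio'              # resource-allocation failures at boot
```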


Thanks.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OXQGKJWVKRCJ5ABZFEG4TG4VOBKBIO2I/


[ovirt-users] Re: Future of ovirt - end-of-life RHV

2023-04-06 Thread Vinícius Ferrão via Users
Sandro, is there any proposed migration path from RHV to oVirt that doesn’t
involve reinstalling everything and restoring the engine from a backup?

Thank you.

Sent from my iPhone

On 6 Apr 2023, at 03:54, Sandro Bonazzola  wrote:




On Thu, Apr 6, 2023 at 06:56 Margit Meyer <m...@htwsaar.de> wrote:
What about the future of ovirt when RHV comes to end-of-life?

I presented the future of oVirt here: 
https://blogs.ovirt.org/2022/02/future-of-ovirt-february-2022/
The overall situation didn't change over the past year.

Is there a chance for further usage?

Yes

RedHat OpenShift will not be an option for us


Just in case, there's a corresponding community project, OKD: 
https://www.okd.io/
which provides Virtualization support: 
https://docs.okd.io/latest/virt/about-virt.html
But nothing prevents you from continuing to use oVirt as long as the community
keeps maintaining it.

--

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING - Red Hat In-Vehicle Operating System

Red Hat EMEA

sbona...@redhat.com

[https://static.redhat.com/libs/redhat/brand-assets/2/corp/logo--200.png]
Red Hat respects your work life balance. Therefore there is no need to answer 
this email out of your office hours.


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LGHT7TZKRGRUPIK2AJYYDDY5G4JMFT4Z/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/EQL64OMTVD4DHGJ7LVXJKX65I7JY7RTZ/


[ovirt-users] Re: Is it possible to auto-start VMs on single LocalStorage host without the engine?

2022-11-25 Thread Vinícius Ferrão via Users
It’s not recommended, but you can put hard-coded entries in /etc/hosts.

It used to work in the past (4 years ago), but I haven’t relied on it recently.
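
For illustration, hypothetical entries (names and addresses made up):

```
# /etc/hosts on each host, so the engine resolves before the DNS VM is up
192.168.0.10   engine.example.com   engine
192.168.0.11   host1.example.com    host1
```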

> On 25 Nov 2022, at 07:51, ernest.beinr...@axonpro.sk wrote:
> 
> I currently use KVM/virsh for my DNS. I would like to move it to oVirt, but I
> need DNS up for the engine to work, so I need to start the DNS VM before the
> engine. Is that possible with oVirt? I was thinking I could use the same
> mechanism as the hosted engine, as that autostarts.
> 
> In my current KVM setup I needed only to symlink /etc/libvirt/qemu/dns.xml
> to /etc/libvirt/qemu/autostart/ to get it to run. 
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/JPREYFWDOYKCPIBL75QGSHCNNS333KMJ/
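
As an aside, libvirt has a native command for the symlink trick quoted above
(VM name taken from the example; note that on an oVirt host vdsm owns libvirt,
so the engine will not track a domain autostarted this way):

```
virsh autostart dns             # equivalent to the autostart symlink
virsh autostart --disable dns   # undo it
```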

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YEBPMSLQZVKIZUO5UKXW2MKHL7SJBBMC/


[ovirt-users] Re: Not able to log in as admin after successful deployment of hosted engine (oVirt 4.5.1)

2022-07-15 Thread Vinícius Ferrão via Users
admin@ovirt should be the correct way if you’re using Keycloak.

Sent from my iPhone

On 15 Jul 2022, at 16:49, Ralf Schenk  wrote:



Hello list,

"admin@localhost" did the trick That was frustrating but searching through 
the list helped !

Bye

Am 15.07.2022 um 21:26 schrieb Ralf Schenk:

Hello List,

I successfully deployed a fresh hosted engine, but I'm not able to log in to
the Administration Portal. I'm perfectly sure about the password; I had to type
it multiple times.

I'm running ovirt-node-ng-4.5.1-0.20220622.0 and deployed the engine via the
CLI-based ovirt-hosted-engine-setup.

Neither "admin" nor "admin@internal" are working (A profile cannot be choosen 
as in earlier versions).

I can log in to the monitoring part (Grafana!) and also Cockpit, but not the
Administration Portal or the VM Portal.

I can ssh into the engine and look up the user database, which has the user.

[root@engine02 ~]# ovirt-aaa-jdbc-tool query --what=user
Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
-- User admin(2be16cf0-5eb7-4b0e-923e-7bdc7bc2aa6f) --
Namespace: *
Name: admin
ID: 2be16cf0-5eb7-4b0e-923e-7bdc7bc2aa6f
Display Name:
Email: root@localhost
First Name: admin
Last Name:
Department:
Title:
Description:
Account Disabled: false
Account Locked: false
Account Unlocked At: 1970-01-01 00:00:00Z
Account Valid From: 2022-07-15 18:23:47Z
Account Valid To: -07-15 18:23:47Z
Account Without Password: false
Last successful Login At: 1970-01-01 00:00:00Z
Last unsuccessful Login At: 1970-01-01 00:00:00Z
Password Valid To: -05-28 18:23:49Z

However, there are no groups by default?

[root@engine02 ~]# ovirt-aaa-jdbc-tool query --what=group
Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false

Any solution? I don't want to repeat the hosted-engine deployment a fourth
time after I mastered all the problems with NFS permissions, the GUI deployment
not accepting my bond (which is perfectly OK and called "bond0"), etc.

Bye

--
Ralf Schenk
fon:  02405 / 40 83 70
mail: r...@databay.de
web:  www.databay.de
Databay AG
Jens-Otto-Krag-Str. 11
52146 Würselen


Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.Kfm. Philipp 
Hermanns
Aufsichtsratsvorsitzender: Dr. Jan Scholzen



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to 
users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZU2X36SQEBD5WIC7S6X4F6LPJ2XZ4WRK/


--
Ralf Schenk
fon:  02405 / 40 83 70
mail: r...@databay.de
web:  www.databay.de
Databay AG
Jens-Otto-Krag-Str. 11
52146 Würselen


Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.Kfm. Philipp 
Hermanns
Aufsichtsratsvorsitzender: Dr. Jan Scholzen
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GF7OYHRMD25BQRSCTIDPIMDPUYWJPSQV/


[ovirt-users] Re: Seeking best performance on oVirt cluster

2022-07-14 Thread Vinícius Ferrão via Users
Hi David, a word of advice: you should not set sync=disabled on TrueNAS. Doing
that, you’re treating every write as async, and if you have a power loss you’ll
lose data.

Some conservative admins state that you should do the opposite, sync=always,
which bogs down performance; I personally use sync=standard.

I think you should look at the real issue on your pool. Is it more than 80%
full? Do you have an SLOG device (to offload the sync writes, so you can
continue using sync=standard)?

Another issue is that RAID-Z is not really recommended for latency-sensitive
workloads (VMs). You should be using a stripe of mirrors instead. RAID-Z2 would
be a choice if you have lots of RAM and L2ARC devices to compensate for the
slow disk access. ZFS is not a performance beast; data safety is this
filesystem’s first priority.
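
For reference, hedged examples of the knobs mentioned above (pool, dataset and
device names are hypothetical):

```
zfs set sync=standard tank/vmstore          # honor sync writes only when requested
zfs get sync tank/vmstore                   # verify the current setting
zpool add tank log mirror nvme0n1 nvme1n1   # attach a mirrored SLOG
zpool list -v tank                          # watch occupancy (stay under ~80%)
```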

Regards.

PS: I’ve run FreeNAS (and now TrueNAS) since the 0.7 days, and with VM storage
since 8.2. It’s good if you know what you’re doing.

On 14 Jul 2022, at 12:26, David Johnson <djohn...@maxistechnology.com> wrote:

Thank you Nir, this is good information.


On Thu, Jul 14, 2022 at 9:34 AM Nir Soffer <nsof...@redhat.com> wrote:
On Tue, Jul 12, 2022 at 9:02 AM David Johnson <djohn...@maxistechnology.com> wrote:
Good morning all,

I am trying to get the best performance out of my cluster possible,

Here are the details of what I have now:

Ovirt version: 4.4.10.7-1.el8
Bare metal for the ovirt engine
two hosts
TrueNAS cluster storage
   1 NFS share
   3 vdevs, 6 drives in raidz2 in each vdev
   2 NVMe drives for SLOG
Storage network is 10 GBit all static IP addresses

Tonight, I built a new VM from a template.  It had 5 attached disks totalling 
100 GB.  It took 30 minutes to deploy the new VM from the template.

Global utilization was 9%.
The SPM has 50% of its memory free and never showed more than 12% network
utilization.

62 out of 65 TB are available on the newly created NFS backing store (no
fragmentation). The TrueNAS system is probably overprovisioned for our use.

There were peak throughputs of up to 4 GBytes/second (on a 10 GBit network), 
but overall throughput on the NAS and the network were low.
ARC hits were 95 to 100%
L2 hits were 0 to 70%

Here are the NFS usage stats:


I believe the first peak is where the SLOG buffered the initial burst of
instructions, followed by sustained IO as the VM volumes were built in
parallel, and then finally tapering off to the one 50 GB volume that took 40
minutes to copy.

The indications of the NFS stats graph are that the network performance is just 
fine.

Here are the disk IO stats covering the same time frame, plus a bit before to
show a spike in IO:


The spike at 2250 (10 minutes before I started building my VM) shows that the
spinners actually hit a write speed of almost 20 MBytes per second briefly,
then settled in at a sustained 3 to 4 MBytes per second.  The SLOG absorbs
several spikes, but remains mostly idle, with activity measured in kilobytes
per second.

The HGST HUS726060AL5210 drives have a 12 Gb/s SAS interface and a sustained
throughput of 227 MB/s.

--
Now to the questions:
1. Am I asking on the right list? Does this look like something where tuning
oVirt might make a difference, or is this more likely a configuration issue
with my storage appliances?

2. Am I expecting too much?  Is this well within the bounds of acceptable 
(expected) performance?

3. How would I go about identifying the bottleneck, should I need to dig deeper?

One thing that can be interesting to try is to apply this patch for vdsm:

diff --git a/lib/vdsm/storage/sd.py b/lib/vdsm/storage/sd.py
index 36c393b5a..9cb7486c0 100644
--- a/lib/vdsm/storage/sd.py
+++ b/lib/vdsm/storage/sd.py
@@ -401,7 +401,7 @@ class StorageDomainManifest(object):
 Unordered writes improve copy performance but are recommended only for
 preallocated devices and raw format.
 """
-return format == sc.RAW_FORMAT and not self.supportsSparseness
+return True

 @property
 def oop(self):

This enables unordered writes for qemu-img convert, which can be up to 6 times
faster on block storage. When we tested it with file storage it did not give a
lot of improvement, but this was tested a long time ago, and since then we use
unordered writes everywhere else in the system.
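
If you want to experiment with it, a hedged sketch of applying the change on a
host (the path is the usual vdsm install location on EL8; verify it on your
version, and keep in mind this is an unsupported local hack):

```
# Save the diff above as unordered.diff, then on each host:
patch /usr/lib/python3.6/site-packages/vdsm/storage/sd.py < unordered.diff
systemctl restart vdsmd
# The flag this enables corresponds to qemu-img's out-of-order writes:
qemu-img convert -W -O raw source.img dest.img
```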

Another thing to try is NFS 4.2, which can be much faster when copying images,
since it supports sparseness. But I don't think TrueNAS supports NFS 4.2 yet
(in 12.x they did not).

If you must work with older NFS, using qcow2 disks will be much faster when
copying disks (e.g. creating a VM from a template). The way to get qcow2 disks
is to check "enable incremental backup" when creating disks.

Nir

[ovirt-users] Re: DO I HAVE TO PAY MONEY OR DO PEOPLE PAY MONEY TO GET LITTLE INFORMATION?

2022-07-12 Thread Vinícius Ferrão via Users
Did you try our suggestions, abiolaemma01?

Regards.

> On 8 Jul 2022, at 14:35, abiolaemm...@gmail.com wrote:
> 
> Good day Vinícius Ferrão,
>Appreciated your rapid reply. For more than 6 weeks now, I have been at war 
> with installing oVirt and deploying it on Oracle Linux Server 8.6. I have 
> watched all the YouTube videos on how to solve these problems but still 
> can't find a SOLUTION; all I got is ERRORS.
> I need to deploy oVirt, but all I got are errors, as shown in the screenshot, 
> and ovirt-imageio is another issue: how to upload ISO image files. 
> 
> I can't attach screenshot files here, so I have sent you an email with a 
> screenshot of the errors.
> 
> I will be most grateful if you can help me find a solution to this 
> problem.
> 
> Appreciated
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZU6S7GVTPAKCNKWQXMKT6L5NJNTAVU62/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VUZGRBWAI32BG4HGD45U4YGHNETEIXC3/


[ovirt-users] Re: Upgrade from 4.2 directly to 4.5

2022-07-08 Thread Vinícius Ferrão via Users
Hi Sandro, I’m writing back to thank you for the support and give some feedback.

I was unable to properly restore the engine from 4.3 to 4.5 (or even 4.4);
there were some errors during the network phase, and I think that was because
ovirtmgmt was running on top of a bond0, and in the 4.2 days, when the engine
was originally deployed, that was broken and unsupported.

After 4 or 5 tries I ended up reconfiguring the entire engine on a brand new
one. It was faster… there were a lot of errors on
OVESETUP_OVN/ovirtProviderOvnSecret and I gave up.

Thank you.

On 30 Jun 2022, at 03:32, Sandro Bonazzola <sbona...@redhat.com> wrote:



On Thu, Jun 30, 2022 at 05:56 Vinícius Ferrão <fer...@versatushpc.com.br> wrote:


On 29 Jun 2022, at 02:43, Sandro Bonazzola <sbona...@redhat.com> wrote:



On Tue, Jun 28, 2022 at 17:52 Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello, I would like to know if I can do an oVirt upgrade directly from 4.2 to 
4.5.

I don’t have a free host to upgrade the oVirt Node, so I was hoping it would
be possible to fire up a new engine (with restore-backup) on the old oVirt
Nodes and later on upgrade the hosts.

Is this possible?

Please upgrade the engine to 4.3 and take a backup.
Direct upgrade from 4.3 to 4.5 should work fine but direct upgrade from 4.2 
backup is not working.
As for the host, oVirt Node 4.5 is supposed to be compatible with ovirt engine 
>= 4.2.
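
For reference, a minimal sketch of that backup/restore flow (file names
hypothetical):

```
# On the 4.3 engine:
engine-backup --mode=backup --file=engine43.backup --log=backup.log
# Restore into a newly deployed 4.5/el8 hosted engine:
hosted-engine --deploy --restore-from-file=engine43.backup
```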

Thank you Sandro. I’m following the instructions over here:
https://www.ovirt.org/documentation/upgrade_guide/#Upgrading_the_Manager_to_4-3_4-2_local_db
https://www.ovirt.org/documentation/upgrade_guide/#Upgrading_Remote_Databases_from_PG95_to_PG10_4-2_remote_db

I had to upgrade Postgres to version 10 on the Self-Hosted Engine to proceed.
I just followed the “remote database procedure” even without a remote DB.

I just have one more question: to do the jump from 4.3 to 4.5 on the engine,
the only way is restoring a backup, right? Since 4.5 is based on EL8, I cannot
use the yum update ovirt\*setup\* method, right?

Right

Is there another way?

No


Also, regarding the host upgrades: I understand that I can run oVirt Node 4.5
with Engine 4.2 (and now 4.3), which is awesome. But is there a path where I
can upgrade the engine to the latest version and then move the hosts?

You can move your hosted engine to bare metal and then upgrade the hosts. If
you have a hosted engine, installation from backup requires one host on 4.5
with the corresponding oVirt engine appliance.



I have those questions because I can’t see where or when the OS upgrade will be 
done on the SHE during 4.3 -> 4.5.

When installing the first 4.5/el8 host you can restore the backup into a newly
deployed 4.5/el8 hosted engine, so both the engine and the first host get
upgraded in the same step.



Thank you.


--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA<https://www.redhat.com/>

sbona...@redhat.com<mailto:sbona...@redhat.com>
Red Hat respects your work life balance. Therefore there is no need to answer 
this email out of your office hours.





--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA<https://www.redhat.com/>

sbona...@redhat.com<mailto:sbona...@redhat.com>
[https://static.redhat.com/libs/redhat/brand-assets/2/corp/logo--200.png]<https://www.redhat.com/>
Red Hat respects your work life balance. Therefore there is no need to answer 
this email out of your office hours.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TSYJYKDXB3YNF7JI32HIU45UVJDSKKXV/


[ovirt-users] Re: DO I HAVE TO PAY MONEY OR DO PEOPLE PAY MONEY TO GET LITTLE INFORMATION?

2022-07-08 Thread Vinícius Ferrão via Users
Hello.

I would not recommend using OL as a base system; this is mainly because OL
has a special fork of oVirt that fits it better, which is Oracle OLKVM:
https://www.oracle.com/virtualization/

With that being said, the way forward is to run oVirt Node 4.5.1 (with EL8 
Stream; EL9 is still experimental) and a self hosted engine.

Can you use this instead of OL or do you have a custom OL requirement? I’m 
saying that because I do run Oracle Linux in some places and I know that for 
KVM it’s not 1:1 compatible with oVirt.

I didn’t fully understand your other messages, but I see you tried to upload an
.ISO image and it failed. There’s some work that must be done before ISO upload
works correctly, and to be honest I never cared for it; I just use the legacy
ISO domain instead and manually copy the ISO files to my NFS server. This may
solve the issue for you.

Regards.

On 8 Jul 2022, at 14:35, abiolaemm...@gmail.com wrote:

Good day Vinícius Ferrão,
   Appreciated your rapid reply. For more than 6 weeks now, I have been at war
with installing oVirt and deploying it on Oracle Linux Server 8.6. I have
watched all the YouTube videos on how to solve these problems but still can't
find a SOLUTION; all I got is ERRORS.
I need to deploy oVirt, but all I got are errors, as shown in the screenshot,
and ovirt-imageio is another issue: how to upload ISO image files.

I can't attach screenshot files here, so I have sent you an email with a
screenshot of the errors.

I will be most grateful if you can help me find a solution to this
problem.

Appreciated
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZU6S7GVTPAKCNKWQXMKT6L5NJNTAVU62/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OCEFUM6E2UVANGOWJYHEWPUNDLFMDJ2E/


[ovirt-users] Re: DO I HAVE TO PAY MONEY OR DO PEOPLE PAY MONEY TO GET LITTLE INFORMATION?

2022-07-08 Thread Vinícius Ferrão via Users
What info do you need?

> On 8 Jul 2022, at 12:35, abiolaemm...@gmail.com wrote:
> 
> I might be asking the wrong question. I'm sorry if I am.
> DO I HAVE TO PAY SOME MONEY FOR ME TO GET SOME INFORMATION OR HELP ON THIS 
> FORUM?
> I mean information about the oVirt product and how to fix the bugs.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/GBIJDPXU6O267R3QN4MFYE5TKQ32R5G4/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KIZNTRBCGWMOPVANCUKP346UFMSYXZWV/


[ovirt-users] Re: Invalid username or password

2022-07-04 Thread Vinícius Ferrão via Users
I’ve found it in the 4.5.1 release notes:
https://www.ovirt.org/release/4.5.1/#keycloak-sso-setup-for-ovirt-engine

On 4 Jul 2022, at 19:31, less foobar via Users <users@ovirt.org> wrote:

Nowhere in the docs does it say that I should use admin@ovirt... how did you
find that? And yes, you are correct: admin@ovirt works.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to 
users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KB6CW4O65I3C2I32LO3FNPFH65XUE2IO/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PB7YP5W7LWPU47FCYBWWJULINEDGQDAI/


[ovirt-users] Re: Invalid username or password

2022-07-04 Thread Vinícius Ferrão via Users
New username is admin@ovirt if you’re using Keycloak integration.

I’ve scratched my head with this yesterday.

Sent from my iPhone

> On 4 Jul 2022, at 14:29, less foobar via Users  wrote:
> 
> I've installed a fresh oVirt. The default engine user doesn't work for me. 
> * I've tried changing the password with:
> ```
> ovirt-aaa-jdbc-tool user password-reset admin --password-valid-to="2035-12-31 
> 12:00:00Z"
> Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
> Password:
> Reenter password:
> updating user admin...
> user updated successfully
> ```
> * I've tried unlocking the admin user:
> ```
> ovirt-aaa-jdbc-tool user unlock admin
> Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
> updating user admin...
> user updated successfully
> ```
> * I've tried `admin`, `admin@internal` and `root@localhost`
> * Here is my admin details:
> ```
> ovirt-aaa-jdbc-tool user show admin
> Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
> -- User admin(bca1d04a-cc83-4ab6-8233-602ab66df6d9) --
> Namespace: *
> Name: admin
> ID: bca1d04a-cc83-4ab6-8233-602ab66df6d9
> Display Name: 
> Email: root@localhost
> First Name: admin
> Last Name: 
> Department: 
> Title: 
> Description: 
> Account Disabled: false
> Account Locked: false
> Account Unlocked At: 2022-07-04 17:16:50Z
> Account Valid From: 2022-06-28 21:27:59Z
> Account Valid To: -06-28 21:27:59Z
> Account Without Password: false
> Last successful Login At: 1970-01-01 00:00:00Z
> Last unsuccessful Login At: 1970-01-01 00:00:00Z
> Password Valid To: 2035-12-31 12:00:00Z
> ```
> * Above the Keycloak sign-in page I see ovirt-internal; this is why I'm 
> assuming I'm on the right page. No matter whether I click on `Administration 
> Portal`, `VM Portal` or in the upper right corner, I land on that page. This 
> is why I'm assuming it is the internal login page and my "default" admin 
> account should work. 
> * If I try the `Monitoring Portal` I can log in without any issues. 
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/LODSF7V7QX5GWG2ZDUP6XXDZSBQTZOHP/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JEEU4Q2FY2RTTGNF4E5O6BGNKJ2UZTLV/


[ovirt-users] Re: Upgrade from 4.2 directly to 4.5

2022-06-29 Thread Vinícius Ferrão via Users


On 29 Jun 2022, at 02:43, Sandro Bonazzola <sbona...@redhat.com> wrote:



On Tue, Jun 28, 2022 at 17:52 Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello, I would like to know if I can do an oVirt upgrade directly from 4.2 to 
4.5.

I don’t have a free host to upgrade the oVirt Node, so I was hoping it would
be possible to fire up a new engine (with restore-backup) on the old oVirt
Nodes and later on upgrade the hosts.

Is this possible?

Please upgrade the engine to 4.3 and take a backup.
Direct upgrade from 4.3 to 4.5 should work fine but direct upgrade from 4.2 
backup is not working.
As for the host, oVirt Node 4.5 is supposed to be compatible with ovirt engine 
>= 4.2.

Thank you Sandro. I’m following the instructions over here:
https://www.ovirt.org/documentation/upgrade_guide/#Upgrading_the_Manager_to_4-3_4-2_local_db
https://www.ovirt.org/documentation/upgrade_guide/#Upgrading_Remote_Databases_from_PG95_to_PG10_4-2_remote_db

I had to upgrade Postgres to version 10 on the Self-Hosted Engine to proceed.
I just followed the “remote database procedure” even without a remote DB.

I just have one more question: to do the jump from 4.3 to 4.5 on the engine,
the only way is restoring a backup, right? Since 4.5 is based on EL8, I cannot
use the yum update ovirt\*setup\* method, right? Is there another way?

Also, regarding the host upgrades: I understand that I can run oVirt Node 4.5
with Engine 4.2 (and now 4.3), which is awesome. But is there a path where I
can upgrade the engine to the latest version and then move the hosts?

I have those questions because I can’t see where or when the OS upgrade will be 
done on the SHE during 4.3 -> 4.5.

Thank you.


--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA<https://www.redhat.com/>

sbona...@redhat.com<mailto:sbona...@redhat.com>
[https://static.redhat.com/libs/redhat/brand-assets/2/corp/logo--200.png]<https://www.redhat.com/>
Red Hat respects your work life balance. Therefore there is no need to answer 
this email out of your office hours.



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YP2TJA2KFZCDFWOOS3IDC24FPJ6XBSNP/


[ovirt-users] Upgrade from 4.2 directly to 4.5

2022-06-28 Thread Vinícius Ferrão via Users
Hello, I would like to know if I can do an oVirt upgrade directly from 4.2 to 
4.5.

I don’t have a free host to upgrade the oVirt Node, so I was hoping it would
be possible to fire up a new engine (with restore-backup) on the old oVirt
Nodes and later on upgrade the hosts.

Is this possible?

Thank you.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MXGKCZP5Q45Y4WLZLF2ZGS3HZBLZAWKP/


[ovirt-users] Re: VM access to infiniband network

2022-06-01 Thread Vinícius Ferrão via Users
Hmm, BeeGFS natively supports RDMA.

Which is good.

So the customer wants to access BeeGFS storage inside the VM? Is that the
purpose? If so, you should use the IOMMU to provide ib0 interfaces directly to
the VM.

If it’s a single VM, just pass the ConnectX card through to it.

If it’s more than one VM, you may need to look at SR-IOV and enable it.

Which case are you looking for?
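
For the SR-IOV route, a hedged sketch of creating VFs on a Mellanox card
(interface name and VF count are hypothetical; the VFs can then be passed
through to VMs):

```
echo 4 > /sys/class/net/ib0/device/sriov_numvfs   # create 4 virtual functions
lspci | grep -i mellanox                          # the VFs should now appear
```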

Regards,

> On 1 Jun 2022, at 14:47, Roberto Bertucci  wrote:
> 
> Thank you Vinícius, the customer has BeeGFS over InfiniBand and he wants VMs 
> to have access to the BeeGFS storage.
> I will try to convince him to buy a new Mellanox card dedicated to BeeGFS.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/KBONEZNZBC6UXRTU3PBA6ZXSS6357FGX/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5YVXUJPLXITZ5RX22PYDQPSZSAV7UMU5/


[ovirt-users] Re: VM access to infiniband network

2022-05-26 Thread Vinícius Ferrão via Users
The error says that you can’t use IPoIB, and you’re using it: an IP address on
InfiniBand is IPoIB.

I don’t know the reasoning behind this, but in my experience with IB, IPoIB is
a resource hog when you have NFS on top of it. There’s no offload on the
hardware. A 1GbE Ethernet card will perform better in some cases.

Regards,

> On 26 May 2022, at 10:14, Roberto Bertucci  wrote:
> 
> Hi all,
> I am facing a problem while trying to associate a Mellanox InfiniBand 
> interface to a network and using it for VM traffic.
> 
> The vdsm log shows the following message:
> The bridge <bridge name> cannot use IP over InfiniBand interface <interface 
> name> as port. Please use RoCE interface instead.
> 
> Did anybody face the same problem and solve it?
> Actually the ib interface is configured with an IP address and we are 
> mounting NFS filesystems on cluster nodes through the InfiniBand network.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/S4B554ANIYXAFEJJ34KQYLWETHDGVWQ4/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QDXTEKAQXHOFRD4PG2EQO7LTOFHHWBOE/


[ovirt-users] Re: Import an snapshot of an iSCSI Domain

2022-03-07 Thread Vinícius Ferrão via Users
Hi Nir and Strahil.

On 6 Mar 2022, at 19:09, Nir Soffer <nsof...@redhat.com> wrote:

On Fri, Mar 4, 2022 at 8:28 AM Vinícius Ferrão via Users
<users@ovirt.org> wrote:

Hi again. I don’t know if it will be possible to import the storage domain due
to conflicts with the UUIDs of the LVM devices. I’ve tried to issue a
vgimportclone to change the UUIDs and import the volume, but it still does not
show up in oVirt.

LVM can change VG/PV UUIDs and names, but storage domain metadata kept in the
VG tags and volume metadata area contain the old VG and PV names and UUIDs,
so it is unlikely to work.

The system is designed so if the original PVs are bad, you can
disconnect them and
connect a backup of the PVs, and import the storage domain again to the system.

Can you explain in more details what you are trying to do?

Nir

What I was trying to accomplish was to get some VM snapshots from days ago.

On my storage system a snapshot of the entire disk pool is generated on a
daily basis. So it was natural, at least for me, to mount a snapshot from a
given time to export some VMs from it. The environment had suffered an attack
and needed to be recovered, and rolling back was the approach.

But that’s where the issue happened: when trying to import the snapshot from
the iSCSI share, it conflicted with the running iSCSI share; since it’s a
storage snapshot, it has exactly the same metadata.

I think the issue here is a missing feature in oVirt to remap the metadata and
permit the mount in this situation, right?

What I ended up doing:
* Removed one of the servers from oVirt Cluster / Datacenter
* Reinstalled it from the ground up
* Fired up a new hosted engine on a new iSCSI HE dedicated share
* Reconfigured everything: network, VLAN, iSCSI, etc.
* == IMPORTED the snapshot on this new engine
* Created an export domain on another NFS share
* Exported the 6 VMs that I needed from the storage-level snapshot to the
export domain
* Detached the export domain from the temporary engine
* Added the export domain on the production engine
* Deleted the compromised VMs
* Imported the “snapshotted” VMs from the export domain
* == INFRA is back
* Destroyed the new engine, the snapshot iSCSI share and the export domain
* Reinstalled the host
* Added back to the original Datacenter / Cluster.

As you can see, it was tiresome work just to get the VMs back from a
storage-level snapshot of the iSCSI share, but it is what I ended up doing.

Lesson learned: it’s too difficult to recover storage-level snapshots. It’s
hard on NFS, and on iSCSI it’s far worse, since you can’t simply mount an iSCSI
volume wherever you want.

My opinion: there should be a feature in oVirt to at least allow mounting this
kind of conflicting volume as read-only, for recovery purposes only.

Thanks.


I don’t know how to mount the iSCSI volume to recover the data. The data is
there, but it’s extremely difficult to get at it.

Any ideas?

Thanks.


On 3 Mar 2022, at 20:56, Vinícius Ferrão <fer...@versatushpc.com.br> wrote:

I think I’ve found the root cause, and it’s the LVM inside the iSCSI volume:

[root@rhvh5 ~]# pvscan
WARNING: Not using device /dev/mapper/36589cfc00db9cf56949c63d338ef for PV 
fTIrnd-gnz2-dI8i-DesK-vIqs-E1BK-mvxtha.
WARNING: PV fTIrnd-gnz2-dI8i-DesK-vIqs-E1BK-mvxtha prefers device 
/dev/mapper/36589cfc006f6c96763988802912b because device is used by LV.
PV /dev/mapper/36589cfc006f6c96763988802912b   VG 9377d243-2c18-4620-995f-5fc680e7b4f3   lvm2 [<10.00 TiB / 7.83 TiB free]
PV /dev/mapper/36589cfc00a1b985d3908c07e41ad   VG 650b0003-7eec-4fa5-85ea-c019f6408248   lvm2 [199.62 GiB / <123.88 GiB free]
PV /dev/mapper/3600605b00805d8a01c2180fd0d8d8dad3   VG rhvh_rhvh5   lvm2 [<277.27 GiB / 54.55 GiB free]
Total: 3 [<10.47 TiB] / in use: 3 [<10.47 TiB] / in no VG: 0 [0   ]

The device that’s not being used is the snapshot. Is there a way to change the
ID of the device so I can import the data domain?

Thanks.

On 3 Mar 2022, at 20:21, Vinícius Ferrão via Users <users@ovirt.org> wrote:

Hello,

I need to import an old snapshot of my Data domain but oVirt does not find the 
snapshot version when importing on the web interface.

To be clear, I’ve mounted a snapshot on my storage and exported it over iSCSI.
I was expecting that I would be able to import it in the engine.

On the web interface, “Import Pre-Configured Domain” finds the corresponding
IQN, but it does not show up as a target.

Any ideas?


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/

[ovirt-users] Re: Import an snapshot of an iSCSI Domain

2022-03-03 Thread Vinícius Ferrão via Users
Hi again. I don’t know if it will be possible to import the storage domain due
to conflicts with the UUIDs of the LVM devices. I’ve tried to issue a
vgimportclone to change the UUIDs and import the volume, but it still does not
show up in oVirt.
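
For reference, roughly what was attempted (the device path is a hypothetical
placeholder):

```
vgimportclone --basevgname recovered_vg /dev/mapper/<snapshot-lun-wwid>
```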

I don’t know how to mount the iSCSI volume to recover the data. The data is
there, but it’s extremely difficult to get at it.

Any ideas?

Thanks.


> On 3 Mar 2022, at 20:56, Vinícius Ferrão  wrote:
> 
> I think I’ve found the root cause, and it’s the LVM inside the iSCSI volume:
> 
> [root@rhvh5 ~]# pvscan 
>  WARNING: Not using device /dev/mapper/36589cfc00db9cf56949c63d338ef for 
> PV fTIrnd-gnz2-dI8i-DesK-vIqs-E1BK-mvxtha.
>  WARNING: PV fTIrnd-gnz2-dI8i-DesK-vIqs-E1BK-mvxtha prefers device 
> /dev/mapper/36589cfc006f6c96763988802912b because device is used by LV.
>  PV /dev/mapper/36589cfc006f6c96763988802912b   VG 
> 9377d243-2c18-4620-995f-5fc680e7b4f3   lvm2 [<10.00 TiB / 7.83 TiB free]
>  PV /dev/mapper/36589cfc00a1b985d3908c07e41ad   VG 
> 650b0003-7eec-4fa5-85ea-c019f6408248   lvm2 [199.62 GiB / <123.88 GiB free]
>  PV /dev/mapper/3600605b00805d8a01c2180fd0d8d8dad3   VG rhvh_rhvh5 
> lvm2 [<277.27 GiB / 54.55 GiB free]
>  Total: 3 [<10.47 TiB] / in use: 3 [<10.47 TiB] / in no VG: 0 [0   ]
> 
> The device that’s not being used is the snapshot. Is there a way to change 
> the ID of the device so I can import the data domain?
> 
> Thanks.
> 
>> On 3 Mar 2022, at 20:21, Vinícius Ferrão via Users  wrote:
>> 
>> Hello,
>> 
>> I need to import an old snapshot of my Data domain but oVirt does not find 
>> the snapshot version when importing on the web interface.
>> 
>> To be clear, I’ve mounted a snapshot on my storage and exported it over 
>> iSCSI. I was expecting that I would be able to import it in the engine.
>> 
>> On the web interface, “Import Pre-Configured Domain” finds the corresponding 
>> IQN, but it does not show up as a target.
>> 
>> Any ideas?
>> 
>> 
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>> oVirt Code of Conduct: 
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives: 
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/3WEQQHZ46DKQJXHVX5QF4S2UVBYF4URR/
> 

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3XDKQK32V6E4K3IB7BLY5XOGDNHJBW3L/


[ovirt-users] Re: Import an snapshot of an iSCSI Domain

2022-03-03 Thread Vinícius Ferrão via Users
I think I’ve found the root cause, and it’s the LVM inside the iSCSI volume:

[root@rhvh5 ~]# pvscan 
  WARNING: Not using device /dev/mapper/36589cfc00db9cf56949c63d338ef for 
PV fTIrnd-gnz2-dI8i-DesK-vIqs-E1BK-mvxtha.
  WARNING: PV fTIrnd-gnz2-dI8i-DesK-vIqs-E1BK-mvxtha prefers device 
/dev/mapper/36589cfc006f6c96763988802912b because device is used by LV.
   PV /dev/mapper/36589cfc006f6c96763988802912b   VG 9377d243-2c18-4620-995f-5fc680e7b4f3   lvm2 [<10.00 TiB / 7.83 TiB free]
   PV /dev/mapper/36589cfc00a1b985d3908c07e41ad   VG 650b0003-7eec-4fa5-85ea-c019f6408248   lvm2 [199.62 GiB / <123.88 GiB free]
   PV /dev/mapper/3600605b00805d8a01c2180fd0d8d8dad3   VG rhvh_rhvh5   lvm2 [<277.27 GiB / 54.55 GiB free]
   Total: 3 [<10.47 TiB] / in use: 3 [<10.47 TiB] / in no VG: 0 [0   ]

The device that’s not being used is the snapshot. Is there a way to change the
ID of the device so I can import the data domain?

Thanks.

> On 3 Mar 2022, at 20:21, Vinícius Ferrão via Users  wrote:
> 
> Hello,
> 
> I need to import an old snapshot of my Data domain but oVirt does not find 
> the snapshot version when importing on the web interface.
> 
> To be clear, I’ve mounted a snapshot on my storage and exported it over 
> iSCSI. I was expecting that I would be able to import it in the engine.
> 
> On the web interface, “Import Pre-Configured Domain” finds the corresponding 
> IQN, but it does not show up as a target.
> 
> Any ideas?
> 
> 
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/3WEQQHZ46DKQJXHVX5QF4S2UVBYF4URR/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MU3FOHTMKWSEJ4UERNFOGCVUZIOOC2SR/


[ovirt-users] Import an snapshot of an iSCSI Domain

2022-03-03 Thread Vinícius Ferrão via Users
Hello,

I need to import an old snapshot of my Data domain but oVirt does not find the 
snapshot version when importing on the web interface.

To be clear, I’ve mounted a snapshot on my storage and exported it over iSCSI.
I was expecting that I would be able to import it in the engine.

On the web interface, “Import Pre-Configured Domain” finds the corresponding
IQN, but it does not show up as a target.

Any ideas?


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3WEQQHZ46DKQJXHVX5QF4S2UVBYF4URR/


[ovirt-users] Re: oVirt + TrueNAS: Unable to create iSCSI domain - I am missing something obvious

2022-03-02 Thread Vinícius Ferrão via Users
And as a complement to my last message: having 500 GB of SLOG is almost
irrelevant.

SLOG isn’t a write cache mechanism; it’s a write offload one. The maximum size
used by the SLOG is equal to the largest transaction group that you could have.

I don’t know the exact maths to measure this for you, but I do know that a
saturated 10GbE link will only generate about 8 GB of SLOG.
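
(Rough sketch of where that figure comes from, assuming ZFS's default
transaction group flush of roughly every 5 seconds: 10 Gbit/s ≈ 1.25 GB/s, and
1.25 GB/s × ~5 s ≈ 6 GB of dirty data in flight, so ~8 GB of SLOG with some
margin.)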

The reason you want to use a bigger device for SLOG is endurance, and some
vendors only achieve full performance with bigger devices. It’s not the
capacity you’re looking for.

This information is not easily gathered on the web, but the TrueNAS
documentation is getting better, as you can see here:
https://www.truenas.com/docs/references/slog/

Please excuse me if any of the information I provided about SLOG on ZFS is
wrong. I don’t think it is, but I haven’t rechecked it for a while (maybe 3-5
years).

Sent from my iPhone

On 3 Mar 2022, at 00:28, Vinícius Ferrão  wrote:

David, do yourself a favor and move away from NFS on TrueNAS for VM hosting.

As a personal experience, hosting VMs on NFS may cause your entire
infrastructure to go down if you change something on TrueNAS; even adding a new
NFS share may trigger an NFS server restart, and suddenly all your VMs will be
trashed. Emphasis on _may_.

I’ve been using the product since FreeNAS 8, which was 2012, and that’s
observed behavior.

Also, oVirt has its quirks with iSCSI, mainly around MPIO (Multipath I/O), but
for the combination with TrueNAS, just stick with iSCSI.
Sent from my iPhone

On 3 Mar 2022, at 00:02, David Johnson  wrote:


The cluster is on NFS today, with a 500 GB NVMe SLOG. Under heavy IO the VMs
are thrown into a paused state instead of iowait. A prior email chain
identified a code error in qemu, with a repro using nothing more than dd to
set 2 GB on the virtual disk to zeros.
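
A hedged reconstruction of that repro (device name hypothetical; destructive,
so only run it against a scratch disk inside a test VM):

```
dd if=/dev/zero of=/dev/vdb bs=1M count=2048 oflag=direct
```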

Since the point of the system is to handle massive IO workloads, this is 
obviously not acceptable.

If there is a way to make the NFS mount more robust, I'm all for it over the
headaches that go with managing block IO.

On Wed, Mar 2, 2022, 8:46 AM Nir Soffer <nsof...@redhat.com> wrote:
On Wed, Mar 2, 2022 at 3:01 PM David Johnson <djohn...@maxistechnology.com> wrote:
Good morning folks, and thank you in advance.

I am working on migrating my oVirt backing store from NFS to iSCSI.

oVirt Environment:
oVirt Open Virtualization Manager
Software Version:4.4.4.7-1.el8
TrueNAS environment:
FreeBSD truenas.local 12.2-RELEASE-p11 75566f060d4(HEAD) TRUENAS amd64

The iSCSI share is on a TrueNAS server, exposed to user VDSM and group 36.

oVirt sees the targeted share, but is unable to make use of it.

The latest issue is "Error while executing action New SAN Storage Domain: 
Volume Group block size error, please check your Volume Group configuration, 
Supported block size is 512 bytes."

As near as I can tell, oVirt does not support any block size other than 512 
bytes, while TrueNAS's smallest OOB block size is 4k.

This is correct, oVirt does not support 4k block storage.


I know that oVirt on TrueNAS is a common configuration, so I expect I am 
missing something really obvious here, probably a TrueNAS configuration needed 
to make TrueNAS work with 512 byte blocks.

Any advice would be helpful.

You can use NFS exported by TrueNAS. With NFS the underlying block size is
hidden, since direct I/O on NFS does not perform direct I/O on the server.

Another way is to use Managed Block Storage (MBS) - if there is a Cinder
driver that can manage your storage server, you can use MBS disks with any
block size. The block size limit comes from the traditional LVM-based storage
domain code. When using MBS, you use one LUN per disk, and qemu does not have
any issue working with such LUNs.

Check with TrueNAS whether they support emulating a 512 block size or have
another way to support clients that do not support 4k storage.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6NLGE4Q2ABJ2DEP7MXFRZ3QLQNP37A5V/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2FPJSYW6LKWOQISRI26X4REYJSQVWL7E/


[ovirt-users] Re: oVirt + TrueNAS: Unable to create iSCSI domain - I am missing something obvious

2022-03-02 Thread Vinícius Ferrão via Users
David, do yourself a favor and move away from NFS on TrueNAS for VM hosting.

As a personal experience, hosting VMs on NFS may cause your entire
infrastructure to go down if you change something on TrueNAS; even adding a new
NFS share may trigger an NFS server restart, and suddenly all your VMs will be
trashed. Emphasis on _may_.

I’ve been using the product since FreeNAS 8, which was 2012, and that’s
observed behavior.

Also, oVirt has its quirks with iSCSI, mainly around MPIO (Multipath I/O), but
for the combination with TrueNAS, just stick with iSCSI.

Sent from my iPhone

On 3 Mar 2022, at 00:02, David Johnson  wrote:


The cluster is on NFS today, with a 500 GB NVMe SLOG. Under heavy IO the VMs
are thrown into a paused state instead of iowait. A prior email chain
identified a code error in qemu, with a repro using nothing more than dd to
set 2 GB on the virtual disk to zeros.

Since the point of the system is to handle massive IO workloads, this is 
obviously not acceptable.

If there is a way to make the NFS mount more robust, I'm all for it over the
headaches that go with managing block IO.

On Wed, Mar 2, 2022, 8:46 AM Nir Soffer <nsof...@redhat.com> wrote:
On Wed, Mar 2, 2022 at 3:01 PM David Johnson <djohn...@maxistechnology.com> wrote:
Good morning folks, and thank you in advance.

I am working on migrating my oVirt backing store from NFS to iSCSI.

oVirt Environment:
oVirt Open Virtualization Manager
Software Version:4.4.4.7-1.el8
TrueNAS environment:
FreeBSD truenas.local 12.2-RELEASE-p11 75566f060d4(HEAD) TRUENAS amd64

The iSCSI share is on a TrueNAS server, exposed to user VDSM and group 36.

oVirt sees the targeted share, but is unable to make use of it.

The latest issue is "Error while executing action New SAN Storage Domain: 
Volume Group block size error, please check your Volume Group configuration, 
Supported block size is 512 bytes."

As near as I can tell, oVirt does not support any block size other than 512 
bytes, while TrueNAS's smallest OOB block size is 4k.

This is correct, oVirt does not support 4k block storage.


I know that oVirt on TrueNAS is a common configuration, so I expect I am 
missing something really obvious here, probably a TrueNAS configuration needed 
to make TrueNAS work with 512 byte blocks.

Any advice would be helpful.

You can use NFS exported by TrueNAS. With NFS the underlying block size is
hidden, since direct I/O on NFS does not perform direct I/O on the server.

Another way is to use Managed Block Storage (MBS) - if there is a Cinder
driver that can manage your storage server, you can use MBS disks with any
block size. The block size limit comes from the traditional LVM-based storage
domain code. When using MBS, you use one LUN per disk, and qemu does not have
any issue working with such LUNs.

Check with TrueNAS whether they support emulating a 512-byte block size or have 
another way to support clients that do not support 4k storage.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6NLGE4Q2ABJ2DEP7MXFRZ3QLQNP37A5V/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/EXV6ZV25IKS2SIFULUNMCZZSHKBEBSUA/


[ovirt-users] Re: oVirt + TrueNAS: Unable to create iSCSI domain - I am missing something obvious

2022-03-02 Thread Vinícius Ferrão via Users
TrueNAS supports 512-byte blocks in iSCSI.

Check at: Sharing => iSCSI => Extent.

Edit your Extent configuration and look for Block Size.

I’m running three different oVirt DCs with TrueNAS and iSCSI in all of them.
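If you want to confirm the change took effect, a minimal sketch from the oVirt 
host side (the portal address and /dev/sdX below are placeholders for your 
environment):

# discover and log in so the host sees the extent with its new block size
iscsiadm -m discovery -t sendtargets -p 192.0.2.10
iscsiadm -m node -l

# the LUN must report 512-byte logical sectors for oVirt to accept it
lsblk -o NAME,LOG-SEC,PHY-SEC
blockdev --getss /dev/sdX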

Sent from my iPhone

On 2 Mar 2022, at 11:49, Nir Soffer  wrote:


On Wed, Mar 2, 2022 at 3:01 PM David Johnson <djohn...@maxistechnology.com> wrote:
Good morning folks, and thank you in advance.

I am working on migrating my oVirt backing store from NFS to iSCSI.

oVirt Environment:
oVirt Open Virtualization Manager
Software Version: 4.4.4.7-1.el8
TrueNAS environment:
FreeBSD truenas.local 12.2-RELEASE-p11 75566f060d4(HEAD) TRUENAS amd64

The iSCSI share is on a TrueNAS server, exposed to user VDSM and group 36.

oVirt sees the targeted share, but is unable to make use of it.

The latest issue is "Error while executing action New SAN Storage Domain: 
Volume Group block size error, please check your Volume Group configuration, 
Supported block size is 512 bytes."

As near as I can tell, oVirt does not support any block size other than 512 
bytes, while TrueNAS's smallest OOB block size is 4k.

This is correct, oVirt does not support 4k block storage.


I know that oVirt on TrueNAS is a common configuration, so I expect I am 
missing something really obvious here, probably a TrueNAS configuration needed 
to make TrueNAS work with 512 byte blocks.

Any advice would be helpful.

You can use NFS exported by TrueNAS. With NFS the underlying block size is 
hidden
since direct I/O on NFS does not perform direct I/O on the server.

Another way is to use Managed Block Storage (MBS) - if there is a Cinder driver 
that can manage your storage server, you can use MBS disks with any block size. 
The block size limit comes from the traditional LVM-based storage domain code. 
When using MBS, you use one LUN per disk, and qemu does not have any issue 
working with such LUNs.

Check with TrueNAS whether they support emulating a 512-byte block size or have 
another way to support clients that do not support 4k storage.

Nir
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FOPLGL4BQTTDSMDIAJPGGFDFMGDIZ4OT/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3NN2RD6UO2F3NN5NTBYEBGI5JDQHPIIG/


[ovirt-users] Re: Importing VM from Xen Server 7.1

2021-08-27 Thread Vinícius Ferrão via Users
Hi Francesco, I was never able to achieve this migration that way.

After many hours of trying I just gave up, used virt-p2v, and treated all the 
VMs from XenServer as physical servers.

xen+ssh AFAIK does not work correctly with XAPI (the Xen API), which is what 
XenServer uses.
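For what it's worth, the "Ncat: No such file or directory" error usually means 
the remote end of the xen+ssh transport is missing what libvirt needs: the 
transport SSHes into the source host and runs netcat against the local libvirt 
socket there. A quick check (the host name is an example; on a XAPI-based 
XenServer that socket is normally absent, which matches the error):

# does the source host have nc/ncat, and is there a libvirt socket to reach?
ssh root@xen7.target 'command -v nc ncat; ls -l /var/run/libvirt/libvirt-sock'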

Sent from my iPhone

> On 27 Aug 2021, at 08:05, francesco--- via Users  wrote:
> 
> Hi all,
> 
> resuming the "dead" thread "Importing VM from Xen Server 6.5" 
> (https://lists.ovirt.org/pipermail/users/2016-August/075213.html), I'm trying 
> to import, via the GUI, a VM from Xen Server 7.1 on a CentOS 8.4 host running oVirt 4.4.
> 
> Created the SSH key for the vdsm user, added the IP in the target host firewall; 
> an SSH connection works; installed netcat on the target host. But like in the 
> original thread, when I execute the "load" command to get the VMs list, 
> I get the following error on the host and in the engine:
> 
> Aug 27 12:04:43 centos8-4.host vdsm[78126]: ERROR error connecting to 
> hypervisor#012Traceback (most recent call last):#012  File 
> "/usr/lib/python3.6/site-packages/vdsm/v2v.py", line 193, in 
> get_external_vm_names#012passwd=password)#012  File 
> "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 
> 107, in open_connection#012return function.retry(libvirtOpen, timeout=10, 
> sleep=0.2)#012  File 
> "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 58, in 
> retry#012return func()#012  File 
> "/usr/lib64/python3.6/site-packages/libvirt.py", line 148, in openAuth#012
> raise libvirtError('virConnectOpenAuth() failed')#012libvirt.libvirtError: 
> End of file while reading data: Ncat: No such file or directory.: 
> Input/output error
> 
> Trying the command virsh -c xen+ssh://root@xen7.target the same error:
> 
> error: failed to connect to the hypervisor
> error: End of file while reading data: Ncat: No such file or directory.: 
> Input/output error
> 
> Any ideas?
> 
> Thank you for your time and help.
> 
> Francesco
> 
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/QR2B62PLDI2XO7WLTVTBG3MZRDC4RQ2Q/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SGYSJPJCUC3ICJFY5CHVYWNOXHRWASQ7/


[ovirt-users] Re: Is there a way to support Mellanox OFED with oVirt/RHV?

2021-08-05 Thread Vinícius Ferrão via Users
Oh, I got it: --enablerepo is a yum/dnf flag, not an mlnxofedinstall one.

Alright, running mlnxofedinstall without arguments would do the job. Thank 
you Edward!

On 5 Aug 2021, at 17:26, Edward Berger <edwber...@gmail.com> wrote:

The ovirt node-ng installer iso creates imgbased systems with baseos and 
appstream repos disabled,
not something you would have with a regular base OS installed system with added 
oVirt repos...

so with node, 'dnf install foo' usually fails without adding an extra 
--enablerepo flag, and the repo names seem to change with the OS version.

here's some old notes I had.

# download the latest MLNX OFED archive, then:
tar xfvz *.tgz
cd *64

mount -o loop MLNX*iso /mnt
cd /mnt

# ./mlnxofedinstall requires more RPMs to be installed first
# note: some versions of CentOS use different-case repo names; look at the
# contents of the /etc/yum.repos.d files
yum --enablerepo baseos install perl-Term-ANSIColor
yum --enablerepo baseos --enablerepo appstream install perl-Getopt-Long tcl gcc-gfortran tcsh tk make
./mlnxofedinstall

On Thu, Aug 5, 2021 at 3:32 PM Vinícius Ferrão <fer...@versatushpc.com.br> wrote:
Hi Edward, it seems that running mlnxofedinstall would do the job, although I 
have some questions.

You mentioned the --enable-repo option but I didn't find it. There's a disable 
one, so I'm assuming that it's enabled by default. Anyway, no repos were added 
after running the script.

I've run the script with the arguments: ./mlnxofedinstall --with-nfsrdma -vvv; 
and everything went fine:

[root@rhvepyc2 mnt]# /etc/init.d/openibd status

  HCA driver loaded

Configured IPoIB devices:
ib0

Currently active IPoIB devices:
ib0
Configured Mellanox EN devices:

Currently active Mellanox devices:
ib0

The following OFED modules are loaded:

  rdma_ucm
  rdma_cm
  ib_ipoib
  mlx5_core
  mlx5_ib
  ib_uverbs
  ib_umad
  ib_cm
  ib_core
  mlxfw

[root@rhvepyc2 mnt]# rpm -qa | grep -i mlnx
libibverbs-54mlnx1-1.54103.x86_64
infiniband-diags-54mlnx1-1.54103.x86_64
mlnx-ethtool-5.10-1.54103.x86_64
rdma-core-54mlnx1-1.54103.x86_64
dapl-utils-2.1.10.1.mlnx-OFED.4.9.0.1.4.54103.x86_64
kmod-mlnx-nfsrdma-5.4-OFED.5.4.1.0.3.1.rhel8u4.x86_64
dapl-2.1.10.1.mlnx-OFED.4.9.0.1.4.54103.x86_64
mlnx-tools-5.2.0-0.54103.x86_64
libibumad-54mlnx1-1.54103.x86_64
opensm-5.9.0.MLNX20210617.c9f2ade-0.1.54103.x86_64
kmod-kernel-mft-mlnx-4.17.0-1.rhel8u4.x86_64
ibacm-54mlnx1-1.54103.x86_64
dapl-devel-static-2.1.10.1.mlnx-OFED.4.9.0.1.4.54103.x86_64
ar_mgr-1.0-5.9.0.MLNX20210617.g5dd71ee.54103.x86_64
mlnx-ofa_kernel-5.4-OFED.5.4.1.0.3.1.rhel8u4.x86_64
rdma-core-devel-54mlnx1-1.54103.x86_64
opensm-static-5.9.0.MLNX20210617.c9f2ade-0.1.54103.x86_64
srp_daemon-54mlnx1-1.54103.x86_64
sharp-2.5.0.MLNX20210613.83fe753-1.54103.x86_64
mlnx-iproute2-5.11.0-1.54103.x86_64
kmod-knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.rhel8u4.x86_64
librdmacm-54mlnx1-1.54103.x86_64
opensm-libs-5.9.0.MLNX20210617.c9f2ade-0.1.54103.x86_64
mlnx-ofa_kernel-devel-5.4-OFED.5.4.1.0.3.1.rhel8u4.x86_64
dapl-devel-2.1.10.1.mlnx-OFED.4.9.0.1.4.54103.x86_64
dump_pr-1.0-5.9.0.MLNX20210617.g5dd71ee.54103.x86_64
mlnxofed-docs-5.4-1.0.3.0.noarch
opensm-devel-5.9.0.MLNX20210617.c9f2ade-0.1.54103.x86_64
knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.rhel8u4.x86_64
librdmacm-utils-54mlnx1-1.54103.x86_64
mlnx-fw-updater-5.4-1.0.3.0.x86_64
kmod-mlnx-ofa_kernel-5.4-OFED.5.4.1.0.3.1.rhel8u4.x86_64
libibverbs-utils-54mlnx1-1.54103.x86_64
ibutils2-2.1.1-0.136.MLNX20210617.g4883fca.54103.x86_64

As a final question, did you select the --add-kernel-support option on the 
script? I couldn't find the difference between enabling it or not.

Thank you.

On 5 Aug 2021, at 15:20, Vinícius Ferrão <fer...@versatushpc.com.br> wrote:

Hmmm. Running the mlnx_ofed_install.sh script is a pain. But I got your idea. 
I'll do this test right now and report back. Ideally using the repo would 
guarantee an easy upgrade path between releases, but Mellanox is lacking on 
this part.

And yes Edward, I want to use the virtual Infiniband interfaces too.

Thank you.

On 5 Aug 2021, at 10:52, Edward Berger <edwber...@gmail.com> wrote:

I don't know if you can just remove the gluster-rdma rpm.

I'm using mlnx ofed on some 4.4 ovirt node hosts by installing it via the 
mellanox tar/iso and
running the mellanox install script after adding the required dependencies with 
--enable-repo,
which isn't the same as adding a repository and 'dnf install'.  So I would try 
that on a test host.

I use it for the 'virtual infiniband' interfaces that get attached to VMs as 
'host device passthru'.

I'll note the node versions of gluster are 7.8 (node 4.4.4.0/CentOS8.3) and 7.9 
(node 4.4.4.1/CentOS8.3), unlike your glusterfs version 6.0.x.

I'll be trying to install mellanox ofed on node 4.4.7.1 (CentOS 8 stream) soon 
to see how that works out.



On Wed, Aug 4, 2021 at 10:04 PM Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello,

Is there

[ovirt-users] Re: Is there a way to support Mellanox OFED with oVirt/RHV?

2021-08-05 Thread Vinícius Ferrão via Users
Hi Edward, it seems that running mlnxofedinstall would do the job, although I 
have some questions.

You mentioned the --enable-repo option but I didn't find it. There's a disable 
one, so I'm assuming that it's enabled by default. Anyway, no repos were added 
after running the script.

I've run the script with the arguments: ./mlnxofedinstall --with-nfsrdma -vvv; 
and everything went fine:

[root@rhvepyc2 mnt]# /etc/init.d/openibd status

  HCA driver loaded

Configured IPoIB devices:
ib0

Currently active IPoIB devices:
ib0
Configured Mellanox EN devices:

Currently active Mellanox devices:
ib0

The following OFED modules are loaded:

  rdma_ucm
  rdma_cm
  ib_ipoib
  mlx5_core
  mlx5_ib
  ib_uverbs
  ib_umad
  ib_cm
  ib_core
  mlxfw

[root@rhvepyc2 mnt]# rpm -qa | grep -i mlnx
libibverbs-54mlnx1-1.54103.x86_64
infiniband-diags-54mlnx1-1.54103.x86_64
mlnx-ethtool-5.10-1.54103.x86_64
rdma-core-54mlnx1-1.54103.x86_64
dapl-utils-2.1.10.1.mlnx-OFED.4.9.0.1.4.54103.x86_64
kmod-mlnx-nfsrdma-5.4-OFED.5.4.1.0.3.1.rhel8u4.x86_64
dapl-2.1.10.1.mlnx-OFED.4.9.0.1.4.54103.x86_64
mlnx-tools-5.2.0-0.54103.x86_64
libibumad-54mlnx1-1.54103.x86_64
opensm-5.9.0.MLNX20210617.c9f2ade-0.1.54103.x86_64
kmod-kernel-mft-mlnx-4.17.0-1.rhel8u4.x86_64
ibacm-54mlnx1-1.54103.x86_64
dapl-devel-static-2.1.10.1.mlnx-OFED.4.9.0.1.4.54103.x86_64
ar_mgr-1.0-5.9.0.MLNX20210617.g5dd71ee.54103.x86_64
mlnx-ofa_kernel-5.4-OFED.5.4.1.0.3.1.rhel8u4.x86_64
rdma-core-devel-54mlnx1-1.54103.x86_64
opensm-static-5.9.0.MLNX20210617.c9f2ade-0.1.54103.x86_64
srp_daemon-54mlnx1-1.54103.x86_64
sharp-2.5.0.MLNX20210613.83fe753-1.54103.x86_64
mlnx-iproute2-5.11.0-1.54103.x86_64
kmod-knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.rhel8u4.x86_64
librdmacm-54mlnx1-1.54103.x86_64
opensm-libs-5.9.0.MLNX20210617.c9f2ade-0.1.54103.x86_64
mlnx-ofa_kernel-devel-5.4-OFED.5.4.1.0.3.1.rhel8u4.x86_64
dapl-devel-2.1.10.1.mlnx-OFED.4.9.0.1.4.54103.x86_64
dump_pr-1.0-5.9.0.MLNX20210617.g5dd71ee.54103.x86_64
mlnxofed-docs-5.4-1.0.3.0.noarch
opensm-devel-5.9.0.MLNX20210617.c9f2ade-0.1.54103.x86_64
knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.rhel8u4.x86_64
librdmacm-utils-54mlnx1-1.54103.x86_64
mlnx-fw-updater-5.4-1.0.3.0.x86_64
kmod-mlnx-ofa_kernel-5.4-OFED.5.4.1.0.3.1.rhel8u4.x86_64
libibverbs-utils-54mlnx1-1.54103.x86_64
ibutils2-2.1.1-0.136.MLNX20210617.g4883fca.54103.x86_64

As a final question, did you select the --add-kernel-support option on the 
script? I couldn't find the difference between enabling it or not.

Thank you.

On 5 Aug 2021, at 15:20, Vinícius Ferrão <fer...@versatushpc.com.br> wrote:

Hmmm. Running the mlnx_ofed_install.sh script is a pain. But I got your idea. 
I'll do this test right now and report back. Ideally using the repo would 
guarantee an easy upgrade path between releases, but Mellanox is lacking on 
this part.

And yes Edward, I want to use the virtual Infiniband interfaces too.

Thank you.

On 5 Aug 2021, at 10:52, Edward Berger <edwber...@gmail.com> wrote:

I don't know if you can just remove the gluster-rdma rpm.

I'm using mlnx ofed on some 4.4 ovirt node hosts by installing it via the 
mellanox tar/iso and
running the mellanox install script after adding the required dependencies with 
--enable-repo,
which isn't the same as adding a repository and 'dnf install'.  So I would try 
that on a test host.

I use it for the 'virtual infiniband' interfaces that get attached to VMs as 
'host device passthru'.

I'll note the node versions of gluster are 7.8 (node 4.4.4.0/CentOS8.3) and 7.9 
(node 4.4.4.1/CentOS8.3), unlike your glusterfs version 6.0.x.

I'll be trying to install mellanox ofed on node 4.4.7.1 (CentOS 8 stream) soon 
to see how that works out.



On Wed, Aug 4, 2021 at 10:04 PM Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello,

Is there a way to keep Mellanox OFED and oVirt/RHV playing nice with each other?

The real issue is regarding GlusterFS. It seems to be a Mellanox issue, but I 
would like to know if there's something we can do to make both play nice on 
the same machine:

[root@rhvepyc2 ~]# dnf update --nobest
Updating Subscription Management repositories.
Last metadata expiration check: 0:14:25 ago on Wed 04 Aug 2021 02:01:11 AM -03.
Dependencies resolved.

 Problem: both package mlnx-ofed-all-user-only-5.4-1.0.3.0.rhel8.4.noarch and 
mlnx-ofed-all-5.4-1.0.3.0.rhel8.4.noarch obsolete glusterfs-rdma
  - cannot install the best update candidate for package 
glusterfs-rdma-6.0-49.1.el8.x86_64
  - package ovirt-host-4.4.7-1.el8ev.x86_64 requires glusterfs-rdma, but none 
of the providers can be installed
  - package mlnx-ofed-all-5.4-1.0.3.0.rhel8.4.noarch obsoletes glusterfs-rdma 
provided by glusterfs-rdma-6.0-49.1.el8.x86_64
  - package glusterfs-rdma-3.12.2-40.2.el8.x86_64 requires glusterfs(x86-64) = 
3.12.2-40.2.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-15.el8.x86_64 requires glusterfs(x

[ovirt-users] Re: Is there a way to support Mellanox OFED with oVirt/RHV?

2021-08-05 Thread Vinícius Ferrão
Hmmm. Running the mlnx_ofed_install.sh script is a pain. But I got your idea. 
I'll do this test right now and report back. Ideally using the repo would 
guarantee an easy upgrade path between releases, but Mellanox is lacking on 
this part.

And yes Edward, I want to use the virtual Infiniband interfaces too.

Thank you.

On 5 Aug 2021, at 10:52, Edward Berger <edwber...@gmail.com> wrote:

I don't know if you can just remove the gluster-rdma rpm.

I'm using mlnx ofed on some 4.4 ovirt node hosts by installing it via the 
mellanox tar/iso and
running the mellanox install script after adding the required dependencies with 
--enable-repo,
which isn't the same as adding a repository and 'dnf install'.  So I would try 
that on a test host.

I use it for the 'virtual infiniband' interfaces that get attached to VMs as 
'host device passthru'.

I'll note the node versions of gluster are 7.8 (node 4.4.4.0/CentOS8.3) and 7.9 
(node 4.4.4.1/CentOS8.3), unlike your glusterfs version 6.0.x.

I'll be trying to install mellanox ofed on node 4.4.7.1 (CentOS 8 stream) soon 
to see how that works out.



On Wed, Aug 4, 2021 at 10:04 PM Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello,

Is there a way to keep Mellanox OFED and oVirt/RHV playing nice with each other?

The real issue is regarding GlusterFS. It seems to be a Mellanox issue, but I 
would like to know if there's something we can do to make both play nice on 
the same machine:

[root@rhvepyc2 ~]# dnf update --nobest
Updating Subscription Management repositories.
Last metadata expiration check: 0:14:25 ago on Wed 04 Aug 2021 02:01:11 AM -03.
Dependencies resolved.

 Problem: both package mlnx-ofed-all-user-only-5.4-1.0.3.0.rhel8.4.noarch and 
mlnx-ofed-all-5.4-1.0.3.0.rhel8.4.noarch obsolete glusterfs-rdma
  - cannot install the best update candidate for package 
glusterfs-rdma-6.0-49.1.el8.x86_64
  - package ovirt-host-4.4.7-1.el8ev.x86_64 requires glusterfs-rdma, but none 
of the providers can be installed
  - package mlnx-ofed-all-5.4-1.0.3.0.rhel8.4.noarch obsoletes glusterfs-rdma 
provided by glusterfs-rdma-6.0-49.1.el8.x86_64
  - package glusterfs-rdma-3.12.2-40.2.el8.x86_64 requires glusterfs(x86-64) = 
3.12.2-40.2.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-15.el8.x86_64 requires glusterfs(x86-64) = 
6.0-15.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-20.el8.x86_64 requires glusterfs(x86-64) = 
6.0-20.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-37.el8.x86_64 requires glusterfs(x86-64) = 
6.0-37.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-37.2.el8.x86_64 requires glusterfs(x86-64) = 
6.0-37.2.el8, but none of the providers can be installed
  - cannot install both glusterfs-3.12.2-40.2.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-15.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-20.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-37.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-37.2.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install the best update candidate for package 
ovirt-host-4.4.7-1.el8ev.x86_64
  - cannot install the best update candidate for package 
glusterfs-6.0-49.1.el8.x86_64
=
 PackageArchitectureVersion 
  RepositorySize
=
Installing dependencies:
 openvswitchx86_64  2.14.1-1.54103  
  mlnx_ofed_5.4-1.0.3.0_base17 M
 ovirt-openvswitch  noarch  2.11-1.el8ev
  rhv-4-mgmt-agent-for-rhel-8-x86_64-rpms  8.7 k
 replacing  rhv-openvswitch.noarch 1:2.11-7.el8ev
 unboundx86_64  1.7.3-15.el8
  rhel-8-for-x86_64-appstream-rpms 895 k
Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
 glusterfs  x86_64  3.12.2-40.2.el8 
  rhel-8-for-x86_64-baseos-rpms558 k
 glusterfs  x86_64  6.0-15.el8  
  rhel-8-for-x86_64-baseos-rpms658 k
 glusterfs  x86_64

[ovirt-users] Re: Is there a way to support Mellanox OFED with oVirt/RHV?

2021-08-05 Thread Vinícius Ferrão via Users
Yes, it is deprecated on RHGS 3.5; but I really don't care for Gluster and I 
don't use it. What I would like to use is things like NFS over RDMA, which only 
Mellanox OFED provides, and the host has other users, so we need MLNX OFED to 
get support from Mellanox.

That's why I'm trying to install the MLNX OFED distribution. This is a 
development machine, not production, so we don't care if things break. But even 
when I try to force the install of the MLNX OFED packages, things do not work 
as expected.

Thank you.

On 5 Aug 2021, at 06:55, Strahil Nikolov <hunter86...@yahoo.com> wrote:

As far as I know RDMA is deprecated on glusterfs, but it most probably works.

Best Regards,
Strahil Nikolov

On Thu, Aug 5, 2021 at 5:05, Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello,

Is there a way to keep Mellanox OFED and oVirt/RHV playing nice with each other?

The real issue is regarding GlusterFS. It seems to be a Mellanox issue, but I 
would like to know if there's something we can do to make both play nice on 
the same machine:

[root@rhvepyc2 ~]# dnf update --nobest
Updating Subscription Management repositories.
Last metadata expiration check: 0:14:25 ago on Wed 04 Aug 2021 02:01:11 AM -03.
Dependencies resolved.

Problem: both package mlnx-ofed-all-user-only-5.4-1.0.3.0.rhel8.4.noarch and 
mlnx-ofed-all-5.4-1.0.3.0.rhel8.4.noarch obsolete glusterfs-rdma
  - cannot install the best update candidate for package 
glusterfs-rdma-6.0-49.1.el8.x86_64
  - package ovirt-host-4.4.7-1.el8ev.x86_64 requires glusterfs-rdma, but none 
of the providers can be installed
  - package mlnx-ofed-all-5.4-1.0.3.0.rhel8.4.noarch obsoletes glusterfs-rdma 
provided by glusterfs-rdma-6.0-49.1.el8.x86_64
  - package glusterfs-rdma-3.12.2-40.2.el8.x86_64 requires glusterfs(x86-64) = 
3.12.2-40.2.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-15.el8.x86_64 requires glusterfs(x86-64) = 
6.0-15.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-20.el8.x86_64 requires glusterfs(x86-64) = 
6.0-20.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-37.el8.x86_64 requires glusterfs(x86-64) = 
6.0-37.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-37.2.el8.x86_64 requires glusterfs(x86-64) = 
6.0-37.2.el8, but none of the providers can be installed
  - cannot install both glusterfs-3.12.2-40.2.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-15.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-20.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-37.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-37.2.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install the best update candidate for package 
ovirt-host-4.4.7-1.el8ev.x86_64
  - cannot install the best update candidate for package 
glusterfs-6.0-49.1.el8.x86_64
=
PackageArchitectureVersion  
RepositorySize
=
Installing dependencies:
openvswitchx86_64  2.14.1-1.54103   
 mlnx_ofed_5.4-1.0.3.0_base17 M
ovirt-openvswitch  noarch  2.11-1.el8ev 
 rhv-4-mgmt-agent-for-rhel-8-x86_64-rpms  8.7 k
replacing  rhv-openvswitch.noarch 1:2.11-7.el8ev
unboundx86_64  1.7.3-15.el8 
 rhel-8-for-x86_64-appstream-rpms895 k
Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
glusterfs  x86_64  3.12.2-40.2.el8  
rhel-8-for-x86_64-baseos-rpms558 k
glusterfs  x86_64  6.0-15.el8   
 rhel-8-for-x86_64-baseos-rpms658 k
glusterfs  x86_64  6.0-20.el8   
 rhel-8-for-x86_64-baseos-rpms659 k
glusterfs  x86_64  6.0-37.el8   
 rhel-8-for-x86_64-baseos-rpms663 k
glusterfs  x86_64  6.0-37.2.el8 
 rhel-8-for-x86_64-baseos-rpms662 k
Skipping packages with b

[ovirt-users] Is there a way to support Mellanox OFED with oVirt/RHV?

2021-08-04 Thread Vinícius Ferrão via Users
Hello,

Is there a way to keep Mellanox OFED and oVirt/RHV playing nice with each other?

The real issue is regarding GlusterFS. It seems to be a Mellanox issue, but I 
would like to know if there's something we can do to make both play nice on 
the same machine:

[root@rhvepyc2 ~]# dnf update --nobest
Updating Subscription Management repositories.
Last metadata expiration check: 0:14:25 ago on Wed 04 Aug 2021 02:01:11 AM -03.
Dependencies resolved.

 Problem: both package mlnx-ofed-all-user-only-5.4-1.0.3.0.rhel8.4.noarch and 
mlnx-ofed-all-5.4-1.0.3.0.rhel8.4.noarch obsolete glusterfs-rdma
  - cannot install the best update candidate for package 
glusterfs-rdma-6.0-49.1.el8.x86_64
  - package ovirt-host-4.4.7-1.el8ev.x86_64 requires glusterfs-rdma, but none 
of the providers can be installed
  - package mlnx-ofed-all-5.4-1.0.3.0.rhel8.4.noarch obsoletes glusterfs-rdma 
provided by glusterfs-rdma-6.0-49.1.el8.x86_64
  - package glusterfs-rdma-3.12.2-40.2.el8.x86_64 requires glusterfs(x86-64) = 
3.12.2-40.2.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-15.el8.x86_64 requires glusterfs(x86-64) = 
6.0-15.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-20.el8.x86_64 requires glusterfs(x86-64) = 
6.0-20.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-37.el8.x86_64 requires glusterfs(x86-64) = 
6.0-37.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-37.2.el8.x86_64 requires glusterfs(x86-64) = 
6.0-37.2.el8, but none of the providers can be installed
  - cannot install both glusterfs-3.12.2-40.2.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-15.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-20.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-37.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install both glusterfs-6.0-37.2.el8.x86_64 and 
glusterfs-6.0-49.1.el8.x86_64
  - cannot install the best update candidate for package 
ovirt-host-4.4.7-1.el8ev.x86_64
  - cannot install the best update candidate for package 
glusterfs-6.0-49.1.el8.x86_64
=
 PackageArchitectureVersion 
  RepositorySize
=
Installing dependencies:
 openvswitchx86_64  2.14.1-1.54103  
  mlnx_ofed_5.4-1.0.3.0_base17 M
 ovirt-openvswitch  noarch  2.11-1.el8ev
  rhv-4-mgmt-agent-for-rhel-8-x86_64-rpms  8.7 k
 replacing  rhv-openvswitch.noarch 1:2.11-7.el8ev
 unboundx86_64  1.7.3-15.el8
  rhel-8-for-x86_64-appstream-rpms 895 k
Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
 glusterfs  x86_64  3.12.2-40.2.el8 
  rhel-8-for-x86_64-baseos-rpms558 k
 glusterfs  x86_64  6.0-15.el8  
  rhel-8-for-x86_64-baseos-rpms658 k
 glusterfs  x86_64  6.0-20.el8  
  rhel-8-for-x86_64-baseos-rpms659 k
 glusterfs  x86_64  6.0-37.el8  
  rhel-8-for-x86_64-baseos-rpms663 k
 glusterfs  x86_64  6.0-37.2.el8
  rhel-8-for-x86_64-baseos-rpms662 k
Skipping packages with broken dependencies:
 glusterfs-rdma x86_64  3.12.2-40.2.el8 
  rhel-8-for-x86_64-baseos-rpms 49 k
 glusterfs-rdma x86_64  6.0-15.el8  
  rhel-8-for-x86_64-baseos-rpms 46 k
 glusterfs-rdma x86_64  6.0-20.el8  
  rhel-8-for-x86_64-baseos-rpms 46 k
 glusterfs-rdma x86_64  6.0-37.2.el8
  rhel-8-for-x86_64-baseos-rpms 48 k
 glusterfs-rdma x86_64  6.0-37.el8  
  rhel-8-for-x86_64-baseos-rpms 48 k

Transaction Summary

[ovirt-users] Re: Host not becoming active due to VDSM failure

2021-08-03 Thread Vinícius Ferrão via Users
As a follow-up to the mailing list: updating the machine solved this issue. But 
the bugzilla still applies, since the problem was blocking the upgrade.

Thank you all.


On 2 Aug 2021, at 13:22, Vinícius Ferrão via Users <users@ovirt.org> wrote:

Hi Ales, Nir.

Sorry for the delayed answer. I didn't have the opportunity to answer it before.

I'm running RHV (not RHVH) on RHEL 8.4 and on top of ppc64le. So it's not 
vanilla oVirt.

Right now its based on:
ovirt-host-4.4.1-4.el8ev.ppc64le

I'm already with nmstate >= 0.3 as I can see:
nmstate-1.0.2-11.el8_4.noarch

VDSM in fact is old, I tried upgrading VDSM but there's a failed dependency on 
openvswitch:
[root@rhvpower ~]# dnf update vdsm
Updating Subscription Management repositories.
Last metadata expiration check: 0:11:15 ago on Mon 02 Aug 2021 12:06:44 PM EDT.
Error:
 Problem 1: package vdsm-python-4.40.70.6-1.el8ev.noarch requires vdsm-network 
= 4.40.70.6-1.el8ev, but none of the providers can be installed
  - package vdsm-4.40.70.6-1.el8ev.ppc64le requires vdsm-python = 
4.40.70.6-1.el8ev, but none of the providers can be installed
  - package vdsm-network-4.40.70.6-1.el8ev.ppc64le requires openvswitch >= 
2.11, but none of the providers can be installed
  - cannot install the best update candidate for package 
vdsm-4.40.35.1-1.el8ev.ppc64le
  - nothing provides openvswitch2.11 needed by 
rhv-openvswitch-1:2.11-7.el8ev.noarch
  - nothing provides openvswitch2.11 needed by 
ovirt-openvswitch-2.11-1.el8ev.noarch
 Problem 2: package vdsm-python-4.40.70.6-1.el8ev.noarch requires vdsm-network 
= 4.40.70.6-1.el8ev, but none of the providers can be installed
  - package vdsm-4.40.70.6-1.el8ev.ppc64le requires vdsm-python = 
4.40.70.6-1.el8ev, but none of the providers can be installed
  - package vdsm-network-4.40.70.6-1.el8ev.ppc64le requires openvswitch >= 
2.11, but none of the providers can be installed
  - cannot install the best update candidate for package 
vdsm-hook-vmfex-dev-4.40.35.1-1.el8ev.noarch
  - nothing provides openvswitch2.11 needed by 
rhv-openvswitch-1:2.11-7.el8ev.noarch
  - nothing provides openvswitch2.11 needed by 
ovirt-openvswitch-2.11-1.el8ev.noarch
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use 
not only best candidate packages)

Nothing seems to provide an openvswitch release that satisfies VDSM. There's no 
openvswitch package installed right now, nor available on the repositories:

[root@rhvpower ~]# dnf install openvswitch
Updating Subscription Management repositories.
Last metadata expiration check: 0:15:28 ago on Mon 02 Aug 2021 12:06:44 PM EDT.
Error:
 Problem: cannot install the best candidate for the job
  - nothing provides openvswitch2.11 needed by 
ovirt-openvswitch-2.11-1.el8ev.noarch
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use 
not only best candidate packages)

Any ideas on how to get past this issue? This is probably only related to 
ppc64le.
I already opened a bugzilla about the openvswitch issue here: 
https://bugzilla.redhat.com/show_bug.cgi?id=1988507
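In case it helps someone hitting the same wall, a hedged way to check where 
openvswitch2.11 should come from on a subscription-managed host (repo 
availability per architecture was exactly the open question on ppc64le):

# which repo, if any, provides the missing package?
dnf provides 'openvswitch2.11*'

# list the repos this host is entitled to; on x86_64 the openvswitch2.x
# builds normally ship in the Fast Datapath repo
subscription-manager repos --list-enabled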

Thank you all.

On 2 Aug 2021, at 02:09, Ales Musil <amu...@redhat.com> wrote:



On Fri, Jul 30, 2021 at 8:54 PM Nir Soffer <nsof...@redhat.com> wrote:
On Fri, Jul 30, 2021 at 7:41 PM Vinícius Ferrão via Users <users@ovirt.org> wrote:
...
> restore-net::ERROR::2021-07-30 
> 12:34:56,167::restore_net_config::462::root::(restore) restoration failed.
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 460, in restore
> unified_restoration()
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 112, in unified_restoration
> classified_conf = _classify_nets_bonds_config(available_config)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 237, in _classify_nets_bonds_config
> net_info = NetInfo(netswitch.configurator.netinfo())
>   File 
> "/usr/lib/python3.6/site-packages/vdsm/network/netswitch/configurator.py", 
> line 323, in netinfo
> _netinfo = netinfo_get(vdsmnets, compatibility)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 268, in get
> return _get(vdsmnets)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 76, in _get
> extra_info.update(_get_devices_info_from_nmstate(state, devices))
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 165, in _get_devices_info_from_nmstate
> nmstate.get_interfaces(state, filter=devices)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 164, in 
> for ifname, ifstate in six.viewitems(
>   File "/usr/lib/python3.6/site-p

[ovirt-users] Re: Host not becoming active due to VDSM failure

2021-08-02 Thread Vinícius Ferrão via Users
Hi Ales, Nir.

Sorry for the delayed answer. I didn't have the opportunity to answer it before.

I'm running RHV (not RHVH) on RHEL 8.4 and on top of ppc64le. So it's not 
vanilla oVirt.

Right now its based on:
ovirt-host-4.4.1-4.el8ev.ppc64le

I'm already with nmstate >= 0.3 as I can see:
nmstate-1.0.2-11.el8_4.noarch

VDSM in fact is old, I tried upgrading VDSM but there's a failed dependency on 
openvswitch:
[root@rhvpower ~]# dnf update vdsm
Updating Subscription Management repositories.
Last metadata expiration check: 0:11:15 ago on Mon 02 Aug 2021 12:06:44 PM EDT.
Error:
 Problem 1: package vdsm-python-4.40.70.6-1.el8ev.noarch requires vdsm-network 
= 4.40.70.6-1.el8ev, but none of the providers can be installed
  - package vdsm-4.40.70.6-1.el8ev.ppc64le requires vdsm-python = 
4.40.70.6-1.el8ev, but none of the providers can be installed
  - package vdsm-network-4.40.70.6-1.el8ev.ppc64le requires openvswitch >= 
2.11, but none of the providers can be installed
  - cannot install the best update candidate for package 
vdsm-4.40.35.1-1.el8ev.ppc64le
  - nothing provides openvswitch2.11 needed by 
rhv-openvswitch-1:2.11-7.el8ev.noarch
  - nothing provides openvswitch2.11 needed by 
ovirt-openvswitch-2.11-1.el8ev.noarch
 Problem 2: package vdsm-python-4.40.70.6-1.el8ev.noarch requires vdsm-network 
= 4.40.70.6-1.el8ev, but none of the providers can be installed
  - package vdsm-4.40.70.6-1.el8ev.ppc64le requires vdsm-python = 
4.40.70.6-1.el8ev, but none of the providers can be installed
  - package vdsm-network-4.40.70.6-1.el8ev.ppc64le requires openvswitch >= 
2.11, but none of the providers can be installed
  - cannot install the best update candidate for package 
vdsm-hook-vmfex-dev-4.40.35.1-1.el8ev.noarch
  - nothing provides openvswitch2.11 needed by 
rhv-openvswitch-1:2.11-7.el8ev.noarch
  - nothing provides openvswitch2.11 needed by 
ovirt-openvswitch-2.11-1.el8ev.noarch
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use 
not only best candidate packages)

Nothing seems to provide an openvswitch release that satisfies VDSM. There's no 
openvswitch package installed right now, nor available on the repositories:

[root@rhvpower ~]# dnf install openvswitch
Updating Subscription Management repositories.
Last metadata expiration check: 0:15:28 ago on Mon 02 Aug 2021 12:06:44 PM EDT.
Error:
 Problem: cannot install the best candidate for the job
  - nothing provides openvswitch2.11 needed by 
ovirt-openvswitch-2.11-1.el8ev.noarch
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use 
not only best candidate packages)

Any ideas on how to get past this issue? This is probably only related to 
ppc64le.
I already opened a bugzilla about the openvswitch issue here: 
https://bugzilla.redhat.com/show_bug.cgi?id=1988507

Thank you all.

On 2 Aug 2021, at 02:09, Ales Musil <amu...@redhat.com> wrote:



On Fri, Jul 30, 2021 at 8:54 PM Nir Soffer <nsof...@redhat.com> wrote:
On Fri, Jul 30, 2021 at 7:41 PM Vinícius Ferrão via Users <users@ovirt.org> wrote:
...
> restore-net::ERROR::2021-07-30 
> 12:34:56,167::restore_net_config::462::root::(restore) restoration failed.
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 460, in restore
> unified_restoration()
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 112, in unified_restoration
> classified_conf = _classify_nets_bonds_config(available_config)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/restore_net_config.py", 
> line 237, in _classify_nets_bonds_config
> net_info = NetInfo(netswitch.configurator.netinfo())
>   File 
> "/usr/lib/python3.6/site-packages/vdsm/network/netswitch/configurator.py", 
> line 323, in netinfo
> _netinfo = netinfo_get(vdsmnets, compatibility)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 268, in get
> return _get(vdsmnets)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 76, in _get
> extra_info.update(_get_devices_info_from_nmstate(state, devices))
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 165, in _get_devices_info_from_nmstate
> nmstate.get_interfaces(state, filter=devices)
>   File "/usr/lib/python3.6/site-packages/vdsm/network/netinfo/cache.py", line 
> 164, in 
> for ifname, ifstate in six.viewitems(
>   File "/usr/lib/python3.6/site-packages/vdsm/network/nmstate/api.py", line 
> 228, in is_dhcp_enabled
> return util_is_dhcp_enabled(family_info)
>   File 
> "/usr/lib/python3.6/site-packages/vdsm/network/nmstate/bridge_util.py", line 
> 137, in is_dhc

[ovirt-users] Host not becoming active due to VDSM failure

2021-07-30 Thread Vinícius Ferrão via Users
Hello,

I have a host that's failing to bring up VDSM. The logs don't say anything 
specific, but there's a Python error about DHCP in them. Is anyone else seeing 
a similar issue?

[root@rhvpower ~]# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor 
preset: disabled)
   Active: inactive (dead)

Jul 30 01:53:40 rhvpower.local.versatushpc.com.br systemd[1]: Dependency failed 
for Virtual Desktop Server Manager.
Jul 30 01:53:40 rhvpower.local.versatushpc.com.br systemd[1]: vdsmd.service: 
Job vdsmd.service/start failed with result 'dependency'.
Jul 30 12:34:12 rhvpower.local.versatushpc.com.br systemd[1]: Dependency failed 
for Virtual Desktop Server Manager.
Jul 30 12:34:12 rhvpower.local.versatushpc.com.br systemd[1]: vdsmd.service: 
Job vdsmd.service/start failed with result 'dependency'.
[root@rhvpower ~]# systemctl start vdsmd
A dependency job for vdsmd.service failed. See 'journalctl -xe' for details.
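A quick way to find which dependency actually failed (a sketch using standard 
systemd tooling; on oVirt hosts the failing unit is typically 
vdsm-network.service when network restoration breaks):

# walk the dependency tree and list any failed units
systemctl list-dependencies vdsmd
systemctl --failed

# vdsm-network logs usually carry the real traceback
journalctl -u vdsm-network -b --no-pager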


On the logs I got the following messages:

==> /var/log/vdsm/upgrade.log <==
MainThread::DEBUG::2021-07-30 12:34:55,143::libvirtconnection::168::root::(get) 
trying to connect libvirt
MainThread::INFO::2021-07-30 
12:34:55,167::netconfpersistence::238::root::(_clearDisk) Clearing netconf: 
/var/lib/vdsm/staging/netconf
MainThread::INFO::2021-07-30 
12:34:55,178::netconfpersistence::188::root::(save) Saved new config 
RunningConfig({'ovirtmgmt': {'netmask': '255.255.255.0', 'bonding': 'bond0', 
'ipv6autoconf': False, 'bridged': True, 'ipaddr': '10.20.0.106', 
'defaultRoute': True, 'dhcpv6': False, 'gateway': '10.20.0.1', 'mtu': 1500, 
'switch': 'legacy', 'stp': False, 'bootproto': 'none', 'nameservers': 
['10.20.0.1']}, 'servers': {'vlan': 172, 'bonding': 'bond0', 'ipv6autoconf': 
False, 'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}, 
'xcat-other': {'vlan': 2020, 'bonding': 'bond0', 'ipv6autoconf': False, 
'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}, 
'xcat-nodes1': {'vlan': 2021, 'bonding': 'bond0', 'ipv6autoconf': False, 
'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}, 
'xcat-nodes3': {'vlan': 2023, 'bonding': 'bond0', 'ipv6autoconf': False, 
'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}, 
'xcat-nodes2': {'vlan': 2022, 'bonding': 'bond0', 'ipv6autoconf': False, 
'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}, 
'nfs': {'vlan': 200, 'bonding': 'bond0', 'ipv6autoconf': False, 'bridged': 
True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 'defaultRoute': False, 
'stp': False, 'bootproto': 'none', 'nameservers': []}, 'storage': {'vlan': 192, 
'netmask': '255.255.255.240', 'bonding': 'bond0', 'ipv6autoconf': False, 
'bridged': False, 'ipaddr': '192.168.10.6', 'dhcpv6': False, 'mtu': 1500, 
'switch': 'legacy', 'defaultRoute': False, 'bootproto': 'none', 'nameservers': 
[]}, 'xcat-nodes4': {'vlan': 2024, 'bonding': 'bond0', 'ipv6autoconf': False, 
'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}}, 
{'bond0': {'nics': ['enP48p1s0f2', 'enP48p1s0f3'], 'options': 'mode=4', 
'switch': 'legacy', 'hwaddr': '98:be:94:78:cc:72'}}, {}) to 
[/var/lib/vdsm/staging/netconf/nets,/var/lib/vdsm/staging/netconf/bonds,/var/lib/vdsm/staging/netconf/devices]
MainThread::INFO::2021-07-30 
12:34:55,179::netconfpersistence::238::root::(_clearDisk) Clearing netconf: 
/var/lib/vdsm/persistence/netconf
MainThread::INFO::2021-07-30 
12:34:55,188::netconfpersistence::188::root::(save) Saved new config 
PersistentConfig({'ovirtmgmt': {'netmask': '255.255.255.0', 'bonding': 'bond0', 
'ipv6autoconf': False, 'bridged': True, 'ipaddr': '10.20.0.106', 
'defaultRoute': True, 'dhcpv6': False, 'gateway': '10.20.0.1', 'mtu': 1500, 
'switch': 'legacy', 'stp': False, 'bootproto': 'none', 'nameservers': 
['10.20.0.1']}, 'servers': {'vlan': 172, 'bonding': 'bond0', 'ipv6autoconf': 
False, 'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}, 
'xcat-other': {'vlan': 2020, 'bonding': 'bond0', 'ipv6autoconf': False, 
'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}, 
'xcat-nodes1': {'vlan': 2021, 'bonding': 'bond0', 'ipv6autoconf': False, 
'bridged': True, 'dhcpv6': False, 'mtu': 1500, 'switch': 'legacy', 
'defaultRoute': False, 'stp': False, 'bootproto': 'none', 'nameservers': []}, 

[ovirt-users] Re: LACP across multiple switches

2021-07-27 Thread Vinícius Ferrão via Users
Yes.

I have it running this way. You must configure it as 802.3ad normally on oVirt, 
but keep in mind that you must use bond and not teaming.

On the switches, just configure MLAG, VLT, vPC, or whatever multi-chassis 
aggregation is supported by your switch vendor.

For ovirtmgmt there are some caveats to adding it on top of bonds. I'm not sure 
whether that is solved as of today, but you need to preconfigure vdsm if you 
want the bonded interfaces to host ovirtmgmt.
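As a rough sketch of the host side (interface names are examples; on oVirt you 
would normally build the bond from the engine UI or during host deploy, this is 
just the equivalent nmcli shape):

# 802.3ad bond over two NICs; the switch side must be a multi-chassis
# port-channel (MLAG/VLT/vPC) spanning both switches
nmcli con add type bond con-name bond0 ifname bond0 bond.options "mode=802.3ad,miimon=100"
nmcli con add type ethernet con-name bond0-port1 ifname ens1f0 master bond0
nmcli con add type ethernet con-name bond0-port2 ifname ens1f1 master bond0
nmcli con up bond0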

Sent from my iPhone

On 27 Jul 2021, at 12:01, Jorge Visentini  wrote:


Hi all.

Is it possible to configure oVirt to work with two NICs in bond/LACP across 
two switches, as in the image below?

[image not included in the archive]

Thank you all.
You guys do a wonderful job.
--
Att,
Jorge Visentini
+55 55 98432-9868
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ARRD5B7M6RHOV2DR7UOMGQVZ7AWBTFOU/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6MAD7CG4J44QNVAY5RYANXU6TALOGQBT/


[ovirt-users] Re: Slow VM replication

2021-04-30 Thread Vinícius Ferrão via Users
As far as I know FreeNAS supports VMware-like snapshots, but not the language 
that oVirt speaks.

Another point to observe is that FreeNAS with RAID-Z3 is not recommended for VM 
storage, because it is just slow for this purpose. NFS usually issues sync 
requests, which will be slow due to its nature. iSCSI tends to be faster since 
it can request both sync and async operations.

If your pool does not have a SLOG device for sync write offload, you'll get 
this kind of performance.

I have three oVirt installs backed by FreeNAS, and all of them have 24 spindles 
in a stripe-of-mirrors configuration, because of those issues.
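If you want to see what the sync path is costing you, a minimal sketch run from 
the hypervisor against the NFS-mounted domain (the mount path is an example):

cd /rhev/data-center/mnt/truenas.local:_mnt_tank_vmstore

# buffered write, flushed once at the end
dd if=/dev/zero of=ddtest bs=1M count=1024 conv=fdatasync

# every write synchronous, closer to what VM storage over NFS produces
dd if=/dev/zero of=ddtest bs=1M count=1024 oflag=sync
rm -f ddtest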

Regards,


On 30 Apr 2021, at 14:05, Strahil Nikolov via Users <users@ovirt.org> wrote:

I have the feeling that it's using the wrong network.

Also, check if TrueNAS supports NFS server-side copy.
In order to use that feature, mount the share via NFS v4.2.
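A quick sanity check on the oVirt host (a sketch; server-side copy also has to 
be supported on the server end, and in oVirt the NFS version is set in the 
storage domain's advanced parameters):

# the storage domain mount must actually have negotiated v4.2
mount | grep -E 'type nfs4.*vers=4\.2'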


Best Regards,
Strahil Nikolov

On Fri, Apr 30, 2021 at 19:10, David Johnson <djohn...@maxistechnology.com> wrote:
Hi everyone,

When cloning a VM, I discovered that the time to clone appears excessive given 
the underlying platform I am using. It appears that the clone operation is not 
making efficient use of the network. I have seen up to 4 Gbit sustained 
throughput from applications on VMs in the cluster.

Is there a configuration I might be missing?

System specifics:
Backing store: NFS on TrueNAS running RAID-Z3 on 11 spinning disks

Ovirt Controller: I7 desktop
General network: 1 GBit Ethernet

Ovirt Host:  Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz, 2x32 cores, 256 GB RAM
Data Network: 10 GBit 10GBase-Twinax, dedicated to this Host and the TrueNAS


Operation:
Copying the 60 GB partition, the copy operation never exceeds 40 megabytes per 
second (less than 0.5 Gbit), even though the dedicated 10 Gigabit data network 
is not otherwise busy.



David Johnson
Director of Development, Maxis Technology
844.696.2947 ext 702 (o) | 479.531.3590 (c)


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to 
users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WTEWHJ6NMFGGKQHQNGGI3C5TPC2OETGU/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to 
users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WGJLJUBHAUXA23NHVRXYHY2NL6IYWWVL/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/46E3UOOTVXJSO2BUNQ4AIJR3XQJ36SGM/


[ovirt-users] Re: Migrate windows 2003 server 64bits from libvirt to ovirt

2021-02-22 Thread Vinícius Ferrão via Users
Hi Fernando.

The blue screen message is in Portuguese and the majority of the list speaks 
English, so it will be hard to get some help on this.

For non-Portuguese speakers, the message says that the BIOS and/or the 
firmware isn't compatible with ACPI.

Since the OS is legacy, this may be related to missing drivers or something 
similar, like leftover drivers from other hypervisors. You said that you've 
imported the VM from another hypervisor, so it was preinstalled. Did you 
install the oVirt Guest Tools before uploading the VM to oVirt? The guest 
tools should add the required drivers so the VM can boot. It's a good idea to 
remove the other hypervisor's tools too.

On 22 Feb 2021, at 12:15, Fernando Hallberg <ferna...@webgenium.com.br> wrote:

Hi,

I have a VM with Windows Server 2003 x64, and I uploaded the VM image to oVirt.

The VM boots on oVirt, but a blue screen appears with an error message:

[screenshot not included in the archive]

Does anybody have some information about this?

I tried to convert the img file from raw to qcow2, but the error persists.

Regards,
Fernando Hallberg
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to 
users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UPP7EBZZ776WE4JEEIOJCLOTMCKJIQWM/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OOBJRVP2JBEA7Q7LTMKOD5JJKEDOR6Z2/


[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2021-01-14 Thread Vinícius Ferrão via Users
Hi all.

DISCLAIMER: read this if you are running TrueNAS 12.0-RELEASE or 12.0-U1.

After struggling for almost 2 months I’ve finally nailed down the issue to the 
storage subsystem.

Everything we’ve tried to solve the issue turned out to be only mitigation. In 
fact there’s nothing wrong with oVirt in the first place, nor with the NFSv4 
storage backend. Indeed, changing to NFSv3 as recommended by Abhishek greatly 
mitigated the issue, but the issue still existed.

The issue was due to a bug (not fixed yet, but already identified) in a new 
feature of TrueNAS 12.0-RELEASE and 12.0-U1: Asynchronous Copy-on-Write. After 
numerous days of testing, and constantly losing data even on other hypervisor 
solutions, the storage was identified as the only common denominator in 
everything I’ve tested.

So that’s it. I’ll leave the links here for the iXsystems Jira issue, with all 
the data, for those who want to check it out.
Jira Issue: https://jira.ixsystems.com/browse/NAS-108627
TrueNAS Forums: 
https://www.truenas.com/community/threads/freenas-now-truenas-is-no-longer-stable.89445

I would like to especially thank Strahil and Abhishek for giving ideas and 
suggestions to figure out what may be happening. And as a final disclaimer: if 
you’re running FreeNAS up to 11.3-U5, do not upgrade to 12.0 yet. Wait for 
12.0-U1.1 or 12.0-U2, because I think they will either have the feature 
disabled or fixed in those future versions.

Thank you all,
Vinícius.

On 13 Dec 2020, at 00:34, Vinícius Ferrão via Users <users@ovirt.org> wrote:

Hi Abhishek,

I haven’t found any critical corruption after the change. But I’m not sure this 
was the issue; right now I’m suspecting the storage subsystem. I’ll give it 
some more days to see how things end up.

Definitely there’s an improvement but, again, not sure yet if it was solved.

Thanks,

On 2 Dec 2020, at 09:21, Abhishek Sahni <abhishek.sahni1...@gmail.com> wrote:

I have been through a similar kind of weird situation and ended up discovering 
that it was because of the NFS mount.

ENV:
STORAGE: Dell EMC VNX 5200 - NFS shares.

a) Given below is the mount of the storage (an NFSv4 share) on the nodes, where 
creating new VMs failed during installation while existing VMs kept running 
fine. [nfsv4]

# mount

A.B.C.D:/VIRT_CC on /rhev/data-center/mnt/A.B.C.D:_VIRT__CC type nfs4 
(rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=A.B.C.D,local_lock=none,addr=A.B.C.D)


b) Switched back to NFSv3 and everything came back to normal.

# mount

A.B.C.D:/VIRT_CC on /rhev/data-center/mnt/A.B.C.D:_VIRT__CC type nfs 
(rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=A.B.C.D,mountvers=3,mountport=1234,mountproto=udp,local_lock=all,addr=A.B.C.D)

Conclusion: I checked logs everywhere (nodes and storage) but didn't find 
anything that could explain the error.

WORKAROUND:
NFS Storage domain was configured with "AUTO" negotiated  option.

1) I put the storage domain in maintenance mode.
2) Changed it to NFS v3 and removed it from maintenance mode.

and Boom everything came back to normal.

You can check if that workaround will work for you.

On Wed, Dec 2, 2020 at 10:42 AM Vinícius Ferrão via Users <users@ovirt.org> wrote:
Can this be related to the case?
https://bugzilla.redhat.com/show_bug.cgi?id=810082

On 1 Dec 2020, at 10:25, Vinícius Ferrão <fer...@versatushpc.com.br> wrote:

ECC RAM everywhere: hosts and storage.

I even ran Memtest86 on both hypervisor hosts just to be sure. No errors. I 
haven’t had the opportunity to run it on the storage yet.

After I sent that message yesterday, the engine VM crashed again and the 
filesystem went offline. There were some discards (again) on the switch, 
probably due to the “boot storm” of other VMs. But this time a simple reboot 
fixed the filesystem and the hosted engine VM was back.

Since it happened over an extremely small window of time, I’ve checked 
everything again, and only the discard issues came up; there are ~90k discards 
on Po2 (which is the LACP interface of the hypervisor). Since then, I enabled 
hardware flow control on the ports of the switch, but discards are still 
happening:

PortAlign-Err FCS-ErrXmit-Err Rcv-Err  UnderSize  
OutDiscards
Po1 0   0   0   0  00
Po2 0   0   0   0  0 
3650
Po3 0   0   0   0  00
Po4 0   0   0   0  00
Po5 0   0   0   0  00
Po6 0   0   0   0  00
Po7 0   0   0   0  00
Po20   

[ovirt-users] Re: Shrink iSCSI Domain

2020-12-29 Thread Vinícius Ferrão via Users
It provides but isn’t enabled.

I run on TrueNAS, and in the past FreeNAS didn’t recommend deduplication due to 
a messy requirement that the deduplication table must fit in RAM, or else the 
pool will be unable to mount. So I’ve avoided using it. Not sure how it is 
today…

Thanks.

> On 28 Dec 2020, at 13:43, Strahil Nikolov  wrote:
> 
> Vinícius,
> 
> does your storage provide deduplication? If yes, then you can provide a new 
> thin-provisioned LUN and migrate the data from the old LUN to the new one.
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> On Monday, December 28, 2020, 18:27:38 GMT+2, Vinícius Ferrão via Users  wrote:
> 
> Hi Shani, thank you!
> 
> It’s only one LUN :(
> 
> So it may be a best practice to split an SD into multiple LUNs?
> 
> Thank you.
> 
>> On 28 Dec 2020, at 09:08, Shani Leviim  wrote:
>> 
>> Hi,
>> You can reduce LUNs from an iSCSI storage domain once it's in maintenance. [1]
>> On the UI, after putting the storage domain in maintenance > Manage Domain >
>> select the LUNs to be removed from the storage domain.
>> 
>> Note that reducing LUNs is applicable in case the storage domain has more
>> than 1 LUN.
>> (Otherwise, removing the single LUN means removing the whole storage domain).
>> 
>> [1] https://www.ovirt.org/develop/release-management/features/storage/reduce-luns-from-sd.html
>> 
>> Regards,
>> Shani Leviim
>> 
>> On Sun, Dec 27, 2020 at 8:16 PM Vinícius Ferrão via Users  wrote:
>> 
>>> Hello,
>>> 
>>> Is there any way to reduce the size of an iSCSI Storage Domain? I can’t
>>> seem to figure this out myself. It’s probably unsupported, and the path
>>> would be to create a new iSCSI Storage Domain with the reduced size, move
>>> the virtual disks there, and then delete the old one.
>>> 
>>> But I would like to confirm if this is the only way to do this…
>>> 
>>> In the past I had a requirement, so I’ve created the VM Domains with 10TB;
>>> now it’s just too much, and I need to use the space on the storage for
>>> other activities.
>>> 
>>> Thanks all and happy new year.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MYAM2QM7D2RZJZ632IRHLTIZ6XGMPC4Y/


[ovirt-users] Re: Shrink iSCSI Domain

2020-12-28 Thread Vinícius Ferrão via Users
Hi Shani, thank you!

It’s only one LUN :(

So it may be a best practice to split an SD into multiple LUNs?

Thank you.

On 28 Dec 2020, at 09:08, Shani Leviim  wrote:

Hi,
You can reduce LUNs from an iSCSI storage domain once it's in maintenance. [1]
On the UI, after putting the storage domain in maintenance > Manage Domain > 
select the LUNs to be removed from the storage domain.

Note that reducing LUNs is applicable in case the storage domain has more than 
1 LUN.
(Otherwise, removing the single LUN means removing the whole storage domain).

[1] 
https://www.ovirt.org/develop/release-management/features/storage/reduce-luns-from-sd.html
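
If I recall correctly, the same action is exposed over the REST API; a rough,
untested sketch (the endpoint name and body are inferred from the feature page
above, so treat them as assumptions):

POST /ovirt-engine/api/storagedomains/SD_ID/reduceluns
<action>
  <logical_units>
    <logical_unit id="LUN_ID"/>
  </logical_units>
</action>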

Regards,
Shani Leviim


On Sun, Dec 27, 2020 at 8:16 PM Vinícius Ferrão via Users  wrote:
Hello,

Is there any way to reduce the size of an iSCSI Storage Domain? I can’t seem to
figure this out myself. It’s probably unsupported, and the path would be to
create a new iSCSI Storage Domain with the reduced size, move the virtual disks
there, and then delete the old one.

But I would like to confirm if this is the only way to do this…

In the past I had a requirement, so I’ve created the VM Domains with 10TB; now
it’s just too much, and I need to use the space on the storage for other
activities.

Thanks all and happy new year.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OWQ2WQZ35U3XEU67MWKPB7CJK7YMNTTG/


[ovirt-users] Shrink iSCSI Domain

2020-12-27 Thread Vinícius Ferrão via Users
Hello,

Is there any way to reduce the size of an iSCSI Storage Domain? I can’t seem to
figure this out myself. It’s probably unsupported, and the path would be to
create a new iSCSI Storage Domain with the reduced size, move the virtual disks
there, and then delete the old one.

But I would like to confirm if this is the only way to do this…

In the past I had a requirement, so I’ve created the VM Domains with 10TB; now
it’s just too much, and I need to use the space on the storage for other
activities.

Thanks all and happy new year.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4B26ZBZUMRXZ6MLJ6YQTK26SZNZOYQLF/


[ovirt-users] Re: CentOS 8 is dead

2020-12-25 Thread Vinícius Ferrão via Users
Oracle took that college meme, "just change the variable names", too seriously.

> On 25 Dec 2020, at 16:35, James Loker-Steele via Users  
> wrote:
> 
> Yes.
> We use OEL and have set up Oracle's branded oVirt, as well as tested oVirt on
> Oracle Linux, and it works a treat.
> 
> 
> Sent from my iPhone
> 
>> On 25 Dec 2020, at 18:23, Diggy Mc  wrote:
>> 
>> Is Oracle Linux a viable alternative for the oVirt project?  It is, after 
>> all, a rebuild of RHEL like CentOS.  If not viable, why not?  I need to make 
>> some decisions posthaste about my pending oVirt 4.4 deployments.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6IZXVHFR6JV6CQCPQFJALFBI5ZORBB7M/


[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2020-12-12 Thread Vinícius Ferrão via Users
Hi Abhishek,

I haven’t found any critical corruption after the change. But I’m not sure this
was the issue; right now I suspect the storage subsystem. I’ll give it a few
more days to see how things end up.

There’s definitely an improvement but, again, I’m not sure yet if it is solved.

Thanks,

On 2 Dec 2020, at 09:21, Abhishek Sahni  wrote:

I have been through a similar kind of weird situation and ended up discovering
that it was because of the NFS mount.

ENV:
STORAGE: Dell EMC VNX 5200 - NFS shares.

a) Given below is the mount of the storage (NFSv4 share) on the nodes, where
creating new VMs failed during installation while existing VMs kept running
fine.  [nfsv4]

# mount

A.B.C.D:/VIRT_CC on /rhev/data-center/mnt/A.B.C.D:_VIRT__CC type nfs4 
(rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=A.B.C.D,local_lock=none,addr=A.B.C.D)


b) Switched back to NFSv3 and everything came back to normal.

# mount

A.B.C.D:/VIRT_CC on /rhev/data-center/mnt/A.B.C.D:_VIRT__CC type nfs 
(rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=A.B.C.D,mountvers=3,mountport=1234,mountproto=udp,local_lock=all,addr=A.B.C.D)

Conclusion: Checked logs everywhere (nodes and storage) but didn't find
anything that could explain the error.

WORKAROUND:
The NFS storage domain was configured with the "AUTO" negotiation option.

1) I put the storage domain in maintenance mode.
2) Changed it to NFS v3 and removed it from maintenance mode.

and boom, everything came back to normal.

You can check if that workaround will work for you.
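
A quick way to confirm which version a host actually negotiated is to inspect
the mounts themselves, e.g.:

# nfsstat -m
# mount | grep rhev

nfsstat -m lists every NFS mount together with the effective vers= option, so
you can see whether "AUTO" ended up as 4.x or 3.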

On Wed, Dec 2, 2020 at 10:42 AM Vinícius Ferrão via Users  wrote:
Can this be related to this case?
https://bugzilla.redhat.com/show_bug.cgi?id=810082

On 1 Dec 2020, at 10:25, Vinícius Ferrão  wrote:

ECC RAM everywhere: hosts and storage.

I even ran Memtest86 on both hypervisor hosts just to be sure. No errors. I
haven’t had the opportunity to run it on the storage yet.

After I sent that message yesterday, the engine VM crashed again and its
filesystem went offline. There were some discards (again) on the switch,
probably due to the “boot storm” of other VMs. But this time a simple reboot
fixed the filesystem and the hosted engine VM was back.

Since it was down for only an extremely short time, I checked everything again,
and only the discard issues came up; there are ~90k discards on Po2 (which is
the LACP interface of the hypervisor). Since then, I have enabled hardware flow
control on the switch ports, but discards are still happening:

Port    Align-Err  FCS-Err  Xmit-Err  Rcv-Err  UnderSize  OutDiscards
Po1     0          0        0         0        0          0
Po2     0          0        0         0        0          3650
Po3     0          0        0         0        0          0
Po4     0          0        0         0        0          0
Po5     0          0        0         0        0          0
Po6     0          0        0         0        0          0
Po7     0          0        0         0        0          0
Po20    0          0        0         0        0          13788

I think this may be related… but it’s just a guess.

Thanks,


On 1 Dec 2020, at 05:06, Strahil Nikolov  wrote:

Could it be faulty RAM?
Do you use ECC RAM?

Best Regards,
Strahil Nikolov






On Tuesday, December 1, 2020, 06:17:10 GMT+2, Vinícius Ferrão via Users  wrote:






Hi again,



I had to shut down everything because of a power outage in the office. When
trying to get the infra up again, even the Engine got corrupted:



[  772.466982] XFS (dm-4): Invalid superblock magic number
mount: /var: wrong fs type, bad option, bad superblock on 
/dev/mapper/ovirt-var, missing codepage or helper program, or other error.
[  772.472885] XFS (dm-3): Mounting V5 Filesystem
[  773.629700] XFS (dm-3): Starting recovery (logdev: internal)
[  773.731104] XFS (dm-3): Metadata CRC error detected at 
xfs_agfl_read_verify+0xa1/0xf0 [xfs], xfs_agfl block 0xf3
[  773.734352] XFS (dm-3): Unmount and run xfs_repair
[  773.736216] XFS (dm-3): First 128 bytes of corrupted metadata buffer:
[  773.738458] : 23 31 31 35 36 35 35 34 29 00 2d 20 52 65 62 75  
#1156554).- Rebu
[  773.741044] 0010: 69 6c 74 20 66 6f 72 20 68 74 74 70 73 3a 2f 2f  ilt 
for https://
[  773.743636] 0020: 66 65 64 6f 72 61 70 72 6f 6a 65 63 74 2e 6f 72  
fedoraproject.or
[  773.746191] 0030: 67 2f 77 69 6b 69 2f 46 65 64 6f 72 61 5f 32 33  
g/wiki/Fedora_23
[  773.748818] 0040: 5f 4d 61 73 73 5f 52 65 62 75 69 6c 64 00 2d 20  
_Mass_Rebuild.-

[ovirt-users] Re: CentOS 8 is dead

2020-12-08 Thread Vinícius Ferrão via Users
CentOS Stream is unstable at best.

I’ve used it recently and it was just a mess. There’s no binary compatibility 
with the current point release and there’s no version pinning. So it will be 
really difficult to keep track of things.

I’m really curious how oVirt will handle this.

From: Wesley Stewart 
Sent: Tuesday, December 8, 2020 4:56 PM
To: Strahil Nikolov 
Cc: users 
Subject: [ovirt-users] Re: CentOS 8 is dead

This is a little concerning.

But it seems pretty easy to convert:
https://www.centos.org/centos-stream/
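
From memory, the conversion boils down to the two commands below; double-check
against that page before running it on a real host:

# dnf install centos-release-stream
# dnf distro-sync

The first swaps in the Stream repositories, and distro-sync then moves every
installed package to the Stream versions.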

However, I would be curious to see someone test this with an active oVirt node!

On Tue, Dec 8, 2020 at 2:39 PM Strahil Nikolov via Users  wrote:
Hello All,

I'm really worried about the following news:
https://blog.centos.org/2020/12/future-is-centos-stream/

Did anyone try to port oVirt to SLES/openSUSE or any Debian-based distro?

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZLTWP255MVLDGSBYEG266FDMGZKOE4J5/


[ovirt-users] Re: difference between CPU server and client family

2020-12-08 Thread Vinícius Ferrão via Users
AFAIK Client is for the i3/i5/i7/i9 families and the other one is for Xeon
platforms.

But you have a pretty unusual Xeon, so it may be missing some flags needed to
properly classify the CPU.

You can run this on the host to check what’s detected:


[root]# vdsm-client Host getCapabilities
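
The output is JSON, so you can pull out just the engine-visible CPU models
(they show up in cpuFlags with a model_ prefix), for example:

[root]# vdsm-client Host getCapabilities | grep -o 'model_[A-Za-z0-9_-]*' | sort -u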

Sent from my iPhone

On 8 Dec 2020, at 10:52, jb  wrote:

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RM3AE7FLYVNDIESMXCGUAABHWIEK5AG2/


[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2020-12-01 Thread Vinícius Ferrão via Users
Can this be related to this case?
https://bugzilla.redhat.com/show_bug.cgi?id=810082

On 1 Dec 2020, at 10:25, Vinícius Ferrão  wrote:

ECC RAM everywhere: hosts and storage.

I even ran Memtest86 on both hypervisor hosts just to be sure. No errors. I
haven’t had the opportunity to run it on the storage yet.

After I sent that message yesterday, the engine VM crashed again and its
filesystem went offline. There were some discards (again) on the switch,
probably due to the “boot storm” of other VMs. But this time a simple reboot
fixed the filesystem and the hosted engine VM was back.

Since it was down for only an extremely short time, I checked everything again,
and only the discard issues came up; there are ~90k discards on Po2 (which is
the LACP interface of the hypervisor). Since then, I have enabled hardware flow
control on the switch ports, but discards are still happening:

Port    Align-Err  FCS-Err  Xmit-Err  Rcv-Err  UnderSize  OutDiscards
Po1     0          0        0         0        0          0
Po2     0          0        0         0        0          3650
Po3     0          0        0         0        0          0
Po4     0          0        0         0        0          0
Po5     0          0        0         0        0          0
Po6     0          0        0         0        0          0
Po7     0          0        0         0        0          0
Po20    0          0        0         0        0          13788

I think this may be related… but it’s just a guess.

Thanks,


On 1 Dec 2020, at 05:06, Strahil Nikolov  wrote:

Could it be faulty RAM?
Do you use ECC RAM?

Best Regards,
Strahil Nikolov






On Tuesday, December 1, 2020, 06:17:10 GMT+2, Vinícius Ferrão via Users  wrote:






Hi again,



I had to shut down everything because of a power outage in the office. When
trying to get the infra up again, even the Engine got corrupted:



[  772.466982] XFS (dm-4): Invalid superblock magic number
mount: /var: wrong fs type, bad option, bad superblock on 
/dev/mapper/ovirt-var, missing codepage or helper program, or other error.
[  772.472885] XFS (dm-3): Mounting V5 Filesystem
[  773.629700] XFS (dm-3): Starting recovery (logdev: internal)
[  773.731104] XFS (dm-3): Metadata CRC error detected at 
xfs_agfl_read_verify+0xa1/0xf0 [xfs], xfs_agfl block 0xf3
[  773.734352] XFS (dm-3): Unmount and run xfs_repair
[  773.736216] XFS (dm-3): First 128 bytes of corrupted metadata buffer:
[  773.738458] : 23 31 31 35 36 35 35 34 29 00 2d 20 52 65 62 75  
#1156554).- Rebu
[  773.741044] 0010: 69 6c 74 20 66 6f 72 20 68 74 74 70 73 3a 2f 2f  ilt 
for https://
[  773.743636] 0020: 66 65 64 6f 72 61 70 72 6f 6a 65 63 74 2e 6f 72  
fedoraproject.or
[  773.746191] 0030: 67 2f 77 69 6b 69 2f 46 65 64 6f 72 61 5f 32 33  
g/wiki/Fedora_23
[  773.748818] 0040: 5f 4d 61 73 73 5f 52 65 62 75 69 6c 64 00 2d 20  
_Mass_Rebuild.-
[  773.751399] 0050: 44 72 6f 70 20 6f 62 73 6f 6c 65 74 65 20 64 65  Drop 
obsolete de
[  773.753933] 0060: 66 61 74 74 72 20 73 74 61 6e 7a 61 73 20 28 23  fattr 
stanzas (#
[  773.756428] 0070: 31 30 34 37 30 33 31 29 00 2d 20 49 6e 73 74 61  
1047031).- Insta
[  773.758873] XFS (dm-3): metadata I/O error in "xfs_trans_read_buf_map" at 
daddr 0xf3 len 1 error 74
[  773.763756] XFS (dm-3): xfs_do_force_shutdown(0x8) called from line 446 of 
file fs/xfs/libxfs/xfs_defer.c. Return address = 962bd5ee
[  773.769363] XFS (dm-3): Corruption of in-memory data detected.  Shutting 
down filesystem
[  773.772643] XFS (dm-3): Please unmount the filesystem and rectify the 
problem(s)
[  773.776079] XFS (dm-3): xfs_imap_to_bp: xfs_trans_read_buf() returned error 
-5.
[  773.779113] XFS (dm-3): xlog_recover_clear_agi_bucket: failed to clear agi 
3. Continuing.
[  773.783039] XFS (dm-3): xfs_imap_to_bp: xfs_trans_read_buf() returned error 
-5.
[  773.785698] XFS (dm-3): xlog_recover_clear_agi_bucket: failed to clear agi 
3. Continuing.
[  773.790023] XFS (dm-3): Ending recovery (logdev: internal)
[  773.792489] XFS (dm-3): Error -5 recovering leftover CoW allocations.
mount: /var/log: can't read superblock on /dev/mapper/ovirt-log.
mount: /var/log/audit: mount point does not exist.




/var seems to be completely trashed.




The only time I’ve seen something like this was with faulty hardware. But
nothing shows up in the logs, as far as I can tell.




After forcing repairs with -L I’ve got other issues:




mount -a
[  326.170941] XFS (dm-4): Mounting V5 Filesystem
[  326.404788] XFS (dm-4): Ending clean mount
[  326.415291] XFS (dm-3): Mounting V5 Filesystem
[  326.611673] XFS (dm-3): Ending clean mount
[  326.621705] XFS (dm-2): Mounting V5 Filesystem

[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2020-12-01 Thread Vinícius Ferrão via Users
ECC RAM everywhere: hosts and storage.

I even ran Memtest86 on both hypervisor hosts just to be sure. No errors. I
haven’t had the opportunity to run it on the storage yet.

After I sent that message yesterday, the engine VM crashed again and its
filesystem went offline. There were some discards (again) on the switch,
probably due to the “boot storm” of other VMs. But this time a simple reboot
fixed the filesystem and the hosted engine VM was back.

Since it was down for only an extremely short time, I checked everything again,
and only the discard issues came up; there are ~90k discards on Po2 (which is
the LACP interface of the hypervisor). Since then, I have enabled hardware flow
control on the switch ports, but discards are still happening:

Port    Align-Err  FCS-Err  Xmit-Err  Rcv-Err  UnderSize  OutDiscards
Po1     0          0        0         0        0          0
Po2     0          0        0         0        0          3650
Po3     0          0        0         0        0          0
Po4     0          0        0         0        0          0
Po5     0          0        0         0        0          0
Po6     0          0        0         0        0          0
Po7     0          0        0         0        0          0
Po20    0          0        0         0        0          13788

I think this may be related… but it’s just a guess.

Thanks,


> On 1 Dec 2020, at 05:06, Strahil Nikolov  wrote:
> 
> Could it be faulty RAM?
> Do you use ECC RAM?
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> 
> 
> 
> 
> On Tuesday, December 1, 2020, 06:17:10 GMT+2, Vinícius Ferrão via Users  wrote:
> 
> 
> 
> 
> 
> 
> Hi again,
> 
> 
> 
> I had to shut down everything because of a power outage in the office. When
> trying to get the infra up again, even the Engine got corrupted:
> 
> 
> 
> [  772.466982] XFS (dm-4): Invalid superblock magic number
> mount: /var: wrong fs type, bad option, bad superblock on 
> /dev/mapper/ovirt-var, missing codepage or helper program, or other error.
> [  772.472885] XFS (dm-3): Mounting V5 Filesystem
> [  773.629700] XFS (dm-3): Starting recovery (logdev: internal)
> [  773.731104] XFS (dm-3): Metadata CRC error detected at 
> xfs_agfl_read_verify+0xa1/0xf0 [xfs], xfs_agfl block 0xf3 
> [  773.734352] XFS (dm-3): Unmount and run xfs_repair
> [  773.736216] XFS (dm-3): First 128 bytes of corrupted metadata buffer:
> [  773.738458] : 23 31 31 35 36 35 35 34 29 00 2d 20 52 65 62 75  
> #1156554).- Rebu
> [  773.741044] 0010: 69 6c 74 20 66 6f 72 20 68 74 74 70 73 3a 2f 2f  ilt 
> for https://
> [  773.743636] 0020: 66 65 64 6f 72 61 70 72 6f 6a 65 63 74 2e 6f 72  
> fedoraproject.or
> [  773.746191] 0030: 67 2f 77 69 6b 69 2f 46 65 64 6f 72 61 5f 32 33  
> g/wiki/Fedora_23
> [  773.748818] 0040: 5f 4d 61 73 73 5f 52 65 62 75 69 6c 64 00 2d 20  
> _Mass_Rebuild.- 
> [  773.751399] 0050: 44 72 6f 70 20 6f 62 73 6f 6c 65 74 65 20 64 65  
> Drop obsolete de
> [  773.753933] 0060: 66 61 74 74 72 20 73 74 61 6e 7a 61 73 20 28 23  
> fattr stanzas (#
> [  773.756428] 0070: 31 30 34 37 30 33 31 29 00 2d 20 49 6e 73 74 61  
> 1047031).- Insta
> [  773.758873] XFS (dm-3): metadata I/O error in "xfs_trans_read_buf_map" at 
> daddr 0xf3 len 1 error 74
> [  773.763756] XFS (dm-3): xfs_do_force_shutdown(0x8) called from line 446 of 
> file fs/xfs/libxfs/xfs_defer.c. Return address = 962bd5ee
> [  773.769363] XFS (dm-3): Corruption of in-memory data detected.  Shutting 
> down filesystem
> [  773.772643] XFS (dm-3): Please unmount the filesystem and rectify the 
> problem(s)
> [  773.776079] XFS (dm-3): xfs_imap_to_bp: xfs_trans_read_buf() returned 
> error -5.
> [  773.779113] XFS (dm-3): xlog_recover_clear_agi_bucket: failed to clear agi 
> 3. Continuing.
> [  773.783039] XFS (dm-3): xfs_imap_to_bp: xfs_trans_read_buf() returned 
> error -5.
> [  773.785698] XFS (dm-3): xlog_recover_clear_agi_bucket: failed to clear agi 
> 3. Continuing.
> [  773.790023] XFS (dm-3): Ending recovery (logdev: internal)
> [  773.792489] XFS (dm-3): Error -5 recovering leftover CoW allocations.
> mount: /var/log: can't read superblock on /dev/mapper/ovirt-log.
> mount: /var/log/audit: mount point does not exist.
> 
> 
> 
> 
> /var seems to be completely trashed.
> 
> 
> 
> 
> The only time that I’ve seem something like this was faulty hardware. But 
> nothing shows up on logs, as far as I know.
> 
> 
> 
> 
> After forcing repairs with -L I’ve got other issues:
> 
> 

[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2020-11-30 Thread Vinícius Ferrão via Users
Hi again,

I had to shut down everything because of a power outage in the office. When
trying to get the infra up again, even the Engine got corrupted:

[  772.466982] XFS (dm-4): Invalid superblock magic number
mount: /var: wrong fs type, bad option, bad superblock on 
/dev/mapper/ovirt-var, missing codepage or helper program, or other error.
[  772.472885] XFS (dm-3): Mounting V5 Filesystem
[  773.629700] XFS (dm-3): Starting recovery (logdev: internal)
[  773.731104] XFS (dm-3): Metadata CRC error detected at 
xfs_agfl_read_verify+0xa1/0xf0 [xfs], xfs_agfl block 0xf3
[  773.734352] XFS (dm-3): Unmount and run xfs_repair
[  773.736216] XFS (dm-3): First 128 bytes of corrupted metadata buffer:
[  773.738458] : 23 31 31 35 36 35 35 34 29 00 2d 20 52 65 62 75  
#1156554).- Rebu
[  773.741044] 0010: 69 6c 74 20 66 6f 72 20 68 74 74 70 73 3a 2f 2f  ilt 
for https://
[  773.743636] 0020: 66 65 64 6f 72 61 70 72 6f 6a 65 63 74 2e 6f 72  
fedoraproject.or
[  773.746191] 0030: 67 2f 77 69 6b 69 2f 46 65 64 6f 72 61 5f 32 33  
g/wiki/Fedora_23
[  773.748818] 0040: 5f 4d 61 73 73 5f 52 65 62 75 69 6c 64 00 2d 20  
_Mass_Rebuild.-
[  773.751399] 0050: 44 72 6f 70 20 6f 62 73 6f 6c 65 74 65 20 64 65  Drop 
obsolete de
[  773.753933] 0060: 66 61 74 74 72 20 73 74 61 6e 7a 61 73 20 28 23  fattr 
stanzas (#
[  773.756428] 0070: 31 30 34 37 30 33 31 29 00 2d 20 49 6e 73 74 61  
1047031).- Insta
[  773.758873] XFS (dm-3): metadata I/O error in "xfs_trans_read_buf_map" at 
daddr 0xf3 len 1 error 74
[  773.763756] XFS (dm-3): xfs_do_force_shutdown(0x8) called from line 446 of 
file fs/xfs/libxfs/xfs_defer.c. Return address = 962bd5ee
[  773.769363] XFS (dm-3): Corruption of in-memory data detected.  Shutting 
down filesystem
[  773.772643] XFS (dm-3): Please unmount the filesystem and rectify the 
problem(s)
[  773.776079] XFS (dm-3): xfs_imap_to_bp: xfs_trans_read_buf() returned error 
-5.
[  773.779113] XFS (dm-3): xlog_recover_clear_agi_bucket: failed to clear agi 
3. Continuing.
[  773.783039] XFS (dm-3): xfs_imap_to_bp: xfs_trans_read_buf() returned error 
-5.
[  773.785698] XFS (dm-3): xlog_recover_clear_agi_bucket: failed to clear agi 
3. Continuing.
[  773.790023] XFS (dm-3): Ending recovery (logdev: internal)
[  773.792489] XFS (dm-3): Error -5 recovering leftover CoW allocations.
mount: /var/log: can't read superblock on /dev/mapper/ovirt-log.
mount: /var/log/audit: mount point does not exist.

/var seems to be completely trashed.

The only time I’ve seen something like this was with faulty hardware. But
nothing shows up in the logs, as far as I can tell.

After forcing repairs with -L I’ve got other issues:

mount -a
[  326.170941] XFS (dm-4): Mounting V5 Filesystem
[  326.404788] XFS (dm-4): Ending clean mount
[  326.415291] XFS (dm-3): Mounting V5 Filesystem
[  326.611673] XFS (dm-3): Ending clean mount
[  326.621705] XFS (dm-2): Mounting V5 Filesystem
[  326.784067] XFS (dm-2): Starting recovery (logdev: internal)
[  326.792083] XFS (dm-2): Metadata CRC error detected at 
xfs_agi_read_verify+0xc7/0xf0 [xfs], xfs_agi block 0x2
[  326.794445] XFS (dm-2): Unmount and run xfs_repair
[  326.795557] XFS (dm-2): First 128 bytes of corrupted metadata buffer:
[  326.797055] : 4d 33 44 34 39 56 00 00 80 00 00 00 f0 cf 00 00  
M3D49V..
[  326.799685] 0010: 00 00 00 00 02 00 00 00 23 10 00 00 3d 08 01 08  
#...=...
[  326.802290] 0020: 21 27 44 34 39 56 00 00 00 d0 00 00 01 00 00 00  
!'D49V..
[  326.804748] 0030: 50 00 00 00 00 00 00 00 23 10 00 00 41 01 08 08  
P...#...A...
[  326.807296] 0040: 21 27 44 34 39 56 00 00 10 d0 00 00 02 00 00 00  
!'D49V..
[  326.809883] 0050: 60 00 00 00 00 00 00 00 23 10 00 00 41 01 08 08  
`...#...A...
[  326.812345] 0060: 61 2f 44 34 39 56 00 00 00 00 00 00 00 00 00 00  
a/D49V..
[  326.814831] 0070: 50 34 00 00 00 00 00 00 23 10 00 00 82 08 08 04  
P4..#...
[  326.817237] XFS (dm-2): metadata I/O error in "xfs_trans_read_buf_map" at 
daddr 0x2 len 1 error 74
mount: /var/log/audit: mount(2) system call failed: Structure needs cleaning.

But after more xfs_repair -L the engine is up…

Now I need to scavenge other VMs and do the same thing.

That’s it.

Thanks all,
V.

PS: For those interested, there’s a paste of the fixes: 
https://pastebin.com/jsMguw6j
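
PPS: The rough sequence per filesystem was the one below. The device names are
from my layout, and note that -L zeroes the XFS log, so it can lose in-flight
transactions; it is strictly a last resort:

# umount /var
# xfs_repair -L /dev/mapper/ovirt-var
# mount -a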

On 29 Nov 2020, at 17:03, Strahil Nikolov  wrote:

Damn...

You are using EFI boot. Does this happen only to EFI machines?
Did you notice if only EL 8 is affected?

Best Regards,
Strahil Nikolov






On Sunday, November 29, 2020, 19:36:09 GMT+2, Vinícius Ferrão  wrote:





Yes!

I have a live VM right now that will be dead on a reboot:

[root@kontainerscomk ~]# cat /etc/*release
NAME="Red Hat Enterprise Linux"
VERSION="8.3 (Ootpa)"
ID="rhel"
ID_LIKE="fedor

[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2020-11-29 Thread Vinícius Ferrão via Users
Hi Strahil,

The majority of the VMs are UEFI. But I do have some Legacy BIOS VMs and they 
are corrupting too. I have a mix of RHEL/CentOS 7 and 8.

All of them are corrupting. XFS on everything with default values from 
installation.

There’s one VM with Ubuntu 18.04 LTS and ext4 where no corruption is found.
And the three NTFS VMs that I have are good too.

So the common denominator is XFS on Enterprise Linux (7 or 8).

Any other ideas?

Thanks.

PS: The VM that will die after the reboot is almost new. It was installed on
November 19th, and oVirt still shows the Run Once flag because it has never
rebooted since installation.


Sent from my iPhone

> On 29 Nov 2020, at 17:03, Strahil Nikolov  wrote:
> 
> Damn...
> 
> You are using EFI boot. Does this happen only to EFI machines?
> Did you notice if only EL 8 is affected?
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> 
> 
> 
> 
> On Sunday, November 29, 2020, 19:36:09 GMT+2, Vinícius Ferrão  wrote:
> 
> 
> 
> 
> 
> Yes!
> 
> I have a live VM right now that will be dead on a reboot:
> 
> [root@kontainerscomk ~]# cat /etc/*release
> NAME="Red Hat Enterprise Linux"
> VERSION="8.3 (Ootpa)"
> ID="rhel"
> ID_LIKE="fedora"
> VERSION_ID="8.3"
> PLATFORM_ID="platform:el8"
> PRETTY_NAME="Red Hat Enterprise Linux 8.3 (Ootpa)"
> ANSI_COLOR="0;31"
> CPE_NAME="cpe:/o:redhat:enterprise_linux:8.3:GA"
> HOME_URL="https://www.redhat.com/"
> BUG_REPORT_URL="https://bugzilla.redhat.com/"
> 
> REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
> REDHAT_BUGZILLA_PRODUCT_VERSION=8.3
> REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
> REDHAT_SUPPORT_PRODUCT_VERSION="8.3"
> Red Hat Enterprise Linux release 8.3 (Ootpa)
> Red Hat Enterprise Linux release 8.3 (Ootpa)
> 
> [root@kontainerscomk ~]# sysctl -a | grep dirty
> vm.dirty_background_bytes = 0
> vm.dirty_background_ratio = 10
> vm.dirty_bytes = 0
> vm.dirty_expire_centisecs = 3000
> vm.dirty_ratio = 30
> vm.dirty_writeback_centisecs = 500
> vm.dirtytime_expire_seconds = 43200
> 
> [root@kontainerscomk ~]# xfs_db -r /dev/dm-0
> xfs_db: /dev/dm-0 is not a valid XFS filesystem (unexpected SB magic number 
> 0xa82a)
> Use -F to force a read attempt.
> [root@kontainerscomk ~]# xfs_db -r /dev/dm-0 -F
> xfs_db: /dev/dm-0 is not a valid XFS filesystem (unexpected SB magic number 
> 0xa82a)
> xfs_db: size check failed
> xfs_db: V1 inodes unsupported. Please try an older xfsprogs.
> 
> [root@kontainerscomk ~]# cat /etc/fstab
> #
> # /etc/fstab
> # Created by anaconda on Thu Nov 19 22:40:39 2020
> #
> # Accessible filesystems, by reference, are maintained under '/dev/disk/'.
> # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
> #
> # After editing this file, run 'systemctl daemon-reload' to update systemd
> # units generated from this file.
> #
> /dev/mapper/rhel-root  /  xfsdefaults0 0
> UUID=ad84d1ea-c9cc-4b22-8338-d1a6b2c7d27e /boot  xfs
> defaults0 0
> UUID=4642-2FF6  /boot/efi  vfat
> umask=0077,shortname=winnt 0 2
> /dev/mapper/rhel-swap  noneswapdefaults0 0
> 
> Thanks,
> 
> 
> -Original Message-
> From: Strahil Nikolov  
> Sent: Sunday, November 29, 2020 2:33 PM
> To: Vinícius Ferrão 
> Cc: users 
> Subject: Re: [ovirt-users] Re: Constantly XFS in memory corruption inside VMs
> 
> Can you check the output on the VM that was affected:
> # cat /etc/*release
> # sysctl -a | grep dirty
> 
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> 
> 
> 
> On Sunday, November 29, 2020, 19:07:48 GMT+2, Vinícius Ferrão via Users  wrote:
> 
> 
> 
> 
> 
> Hi Strahil.
> 
> I’m not using barrier options on mount; these are the default settings from
> the CentOS install.
> 
> I have some additional findings: there’s a big number of discarded packets
> on the switch on the hypervisor interfaces.
> 
> Discards are OK as far as I know; I hope TCP handles this and does the proper
> retransmissions, but I wonder if this may be related or not. Our storage is
> over NFS. My general expertise is with iSCSI and I’ve never seen this kind of
> issue with iSCSI, not that I’m aware of.
> 
> In other clusters, I’ve seen a high number of discards with iSCSI on 
> XenServer 7.2 but there’s no corruption on the VMs there...
> 
> Thanks,
> 
> Sent from my iPhone
> 
>> On 29 Nov 2020, at 04:00, Strahil Nikolov  wrote:
>> 
>> Are you using "nobarrier" mou

[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2020-11-29 Thread Vinícius Ferrão via Users
Yes!

I have a live VM right now that will be dead on a reboot:

[root@kontainerscomk ~]# cat /etc/*release
NAME="Red Hat Enterprise Linux"
VERSION="8.3 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.3"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.3 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.3:GA"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.3
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.3"
Red Hat Enterprise Linux release 8.3 (Ootpa)
Red Hat Enterprise Linux release 8.3 (Ootpa)

[root@kontainerscomk ~]# sysctl -a | grep dirty
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 30
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200

[root@kontainerscomk ~]# xfs_db -r /dev/dm-0
xfs_db: /dev/dm-0 is not a valid XFS filesystem (unexpected SB magic number 
0xa82a)
Use -F to force a read attempt.
[root@kontainerscomk ~]# xfs_db -r /dev/dm-0 -F
xfs_db: /dev/dm-0 is not a valid XFS filesystem (unexpected SB magic number 
0xa82a)
xfs_db: size check failed
xfs_db: V1 inodes unsupported. Please try an older xfsprogs.
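
(For comparison, on a healthy XFS filesystem the superblock magic reads back as
0x58465342, which is ASCII "XFSB"; a quick check against my device:

# xfs_db -r -c 'sb 0' -c 'p magicnum' /dev/dm-0
magicnum = 0x58465342

On this VM xfs_db cannot even open the device, as shown above.)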

[root@kontainerscomk ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Thu Nov 19 22:40:39 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
/dev/mapper/rhel-root   /   xfs defaults0 0
UUID=ad84d1ea-c9cc-4b22-8338-d1a6b2c7d27e /boot   xfs 
defaults0 0
UUID=4642-2FF6  /boot/efi   vfat
umask=0077,shortname=winnt 0 2
/dev/mapper/rhel-swap   noneswapdefaults    0 0

Thanks,


-Original Message-
From: Strahil Nikolov  
Sent: Sunday, November 29, 2020 2:33 PM
To: Vinícius Ferrão 
Cc: users 
Subject: Re: [ovirt-users] Re: Constantly XFS in memory corruption inside VMs

Can you check the output on the VM that was affected:
# cat /etc/*release
# sysctl -a | grep dirty


Best Regards,
Strahil Nikolov





On Sunday, November 29, 2020, 19:07:48 GMT+2, Vinícius Ferrão via Users  wrote:





Hi Strahil.

I’m not using barrier options on mount; these are the default settings from the
CentOS install.

I have some additional findings: there’s a big number of discarded packets on
the switch on the hypervisor interfaces.

Discards are OK as far as I know; I hope TCP handles this and does the proper
retransmissions, but I wonder if this may be related or not. Our storage is over
NFS. My general expertise is with iSCSI and I’ve never seen this kind of issue
with iSCSI, not that I’m aware of.

In other clusters, I’ve seen a high number of discards with iSCSI on XenServer 
7.2 but there’s no corruption on the VMs there...

Thanks,

Sent from my iPhone

> On 29 Nov 2020, at 04:00, Strahil Nikolov  wrote:
> 
> Are you using "nobarrier" mount options in the VM ?
> 
> If yes, can you try to remove the "nobarrrier" option.
> 
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> 
> 
> 
> 
> On Saturday, November 28, 2020, 19:25:48 GMT+2, Vinícius Ferrão  wrote:
> 
> 
> 
> 
> 
> Hi Strahil,
> 
> I moved a running VM to another host, rebooted, and no corruption was found.
> If there's any corruption it may be silent corruption... I've had cases where
> the VM was new, just installed; I ran dnf -y update to get the updated
> packages, rebooted, and boom, XFS corruption. So perhaps the migration process
> isn't the one to blame.
> 
> But, in fact, I remember moving a VM that went down during the process, and
> when I rebooted it, it was corrupted. But this may not be related. It was
> perhaps already in an inconsistent state.
> 
> Anyway, here's the mount options:
> 
> Host1:
> 192.168.10.14:/mnt/pool0/ovirt/vm on 
> /rhev/data-center/mnt/192.168.10.14:_mnt_pool0_ovirt_vm type nfs4 
> (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,noshar
> ecache,proto=tcp,timeo=100,retrans=3,sec=sys,clientaddr=192.168.10.1,l
> ocal_lock=none,addr=192.168.10.14)
> 
> Host2:
> 192.168.10.14:/mnt/pool0/ovirt/vm on 
> /rhev/data-center/mnt/192.168.10.14:_mnt_pool0_ovirt_vm type nfs4 
> (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,noshar
> ecache,proto=tcp,timeo=100,retrans=3,sec=sys,clientaddr=192.168.10.1,l
> ocal_lock=none,addr=192.168.10.14)
> 
> The options are the default ones. I haven't changed anything when configuring
> this cluster.

[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2020-11-29 Thread Vinícius Ferrão via Users
Hi Strahil.

I’m not using barrier options on mount; these are the default settings from the
CentOS install.

I have some additional findings: there’s a big number of discarded packets on
the switch on the hypervisor interfaces.

Discards are OK as far as I know; I hope TCP handles this and does the proper
retransmissions, but I wonder if this may be related or not. Our storage is over
NFS. My general expertise is with iSCSI and I’ve never seen this kind of issue
with iSCSI, not that I’m aware of.

In other clusters, I’ve seen a high number of discards with iSCSI on XenServer 
7.2 but there’s no corruption on the VMs there...

Thanks,

Sent from my iPhone

> On 29 Nov 2020, at 04:00, Strahil Nikolov  wrote:
> 
> Are you using "nobarrier" mount options in the VM ?
> 
> If yes, can you try to remove the "nobarrrier" option.
> 
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> 
> 
> 
> 
> On Saturday, November 28, 2020, 19:25:48 GMT+2, Vinícius Ferrão  wrote:
> 
> 
> 
> 
> 
> Hi Strahil,
> 
> I moved a running VM to another host, rebooted, and no corruption was found.
> If there's any corruption it may be silent corruption... I've had cases where
> the VM was new, just installed; I ran dnf -y update to get the updated
> packages, rebooted, and boom, XFS corruption. So perhaps the migration process
> isn't the one to blame.
> 
> But, in fact, I remember moving a VM that went down during the process, and
> when I rebooted it, it was corrupted. But this may not be related. It was
> perhaps already in an inconsistent state.
> 
> Anyway, here's the mount options:
> 
> Host1:
> 192.168.10.14:/mnt/pool0/ovirt/vm on 
> /rhev/data-center/mnt/192.168.10.14:_mnt_pool0_ovirt_vm type nfs4 
> (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=100,retrans=3,sec=sys,clientaddr=192.168.10.1,local_lock=none,addr=192.168.10.14)
> 
> Host2:
> 192.168.10.14:/mnt/pool0/ovirt/vm on 
> /rhev/data-center/mnt/192.168.10.14:_mnt_pool0_ovirt_vm type nfs4 
> (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=100,retrans=3,sec=sys,clientaddr=192.168.10.1,local_lock=none,addr=192.168.10.14)
> 
> The options are the default ones. I haven't changed anything when configuring 
> this cluster.
> 
> Thanks.
> 
> 
> 
> -Original Message-
> From: Strahil Nikolov  
> Sent: Saturday, November 28, 2020 1:54 PM
> To: users ; Vinícius Ferrão 
> Subject: Re: [ovirt-users] Constantly XFS in memory corruption inside VMs
> 
> Can you check with a test VM whether this happens after a virtual machine
> migration?
> 
> What are your mount options for the storage domain?
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> 
> 
> 
> 
> On Saturday, November 28, 2020, 18:25:15 GMT+2, Vinícius Ferrão via Users  wrote:
> 
> 
> 
> 
> 
>   
> 
> 
> Hello,
> 
>  
> 
> I’m trying to discover why an oVirt 4.4.3 Cluster with two hosts and NFS 
> shared storage on TrueNAS 12.0 is constantly getting XFS corruption inside 
> the VMs.
> 
>  
> 
> For random reasons VMs get corrupted, sometimes halting, or just getting
> silently corrupted; after a reboot the system is unable to boot due to
> "corruption of in-memory data detected". Sometimes the corrupted data is
> "all zeroes", sometimes there's data there. In extreme cases XFS superblock 0
> gets corrupted and the system cannot even detect an XFS partition anymore,
> since the XFS magic key is corrupted in the first blocks of the virtual disk.
> 
>  
> 
> This has been happening for a month now. We had to roll back to some backups,
> and I no longer trust the state of the VMs.
> 
>  
> 
> Using xfs_db I can see that some VMs have corrupted superblocks while the VM
> is up. One in particular had sb0 corrupted, so I knew that when a reboot
> kicked in the machine would be gone, and that's exactly what happened.
> 
>  
> 
> The other day I was just installing a new CentOS 8 VM, and after running
> dnf -y update and a reboot the VM was corrupted, needing XFS repair. That was
> an extreme case.
> 
>  
> 
> So, I've looked at the TrueNAS logs, and there's apparently nothing wrong with
> the system. No errors logged in dmesg, nothing in /var/log/messages and no
> errors on the zpools, not even after scrub operations. On the switch, a
> Catalyst 2960X, we've been monitoring all of its interfaces. There are no
> "up and down" events and zero errors on all interfaces (we have a 4x port LACP
> on the TrueNAS side and a 2x port LACP on each host); everything seems to be
> fine. The only metric that I was unable to get is "dropped packets"

[ovirt-users] Re: Constantly XFS in memory corruption inside VMs

2020-11-28 Thread Vinícius Ferrão via Users
Hi Strahil,

I moved a running VM to another host, rebooted, and no corruption was found. If
there's any corruption it may be silent corruption... I've had cases where the
VM was new, just installed; I ran dnf -y update to get the updated packages,
rebooted, and boom, XFS corruption. So perhaps the migration process isn't the
one to blame.

But, in fact, I remember moving a VM that went down during the process, and
when I rebooted it, it was corrupted. But this may not be related. It was
perhaps already in an inconsistent state.

Anyway, here's the mount options:

Host1:
192.168.10.14:/mnt/pool0/ovirt/vm on 
/rhev/data-center/mnt/192.168.10.14:_mnt_pool0_ovirt_vm type nfs4 
(rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=100,retrans=3,sec=sys,clientaddr=192.168.10.1,local_lock=none,addr=192.168.10.14)

Host2:
192.168.10.14:/mnt/pool0/ovirt/vm on 
/rhev/data-center/mnt/192.168.10.14:_mnt_pool0_ovirt_vm type nfs4 
(rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=100,retrans=3,sec=sys,clientaddr=192.168.10.1,local_lock=none,addr=192.168.10.14)

The options are the default ones. I haven't changed anything when configuring 
this cluster.

Thanks.



-Original Message-
From: Strahil Nikolov  
Sent: Saturday, November 28, 2020 1:54 PM
To: users ; Vinícius Ferrão 
Subject: Re: [ovirt-users] Constantly XFS in memory corruption inside VMs

Can you check with a test VM whether this happens after a virtual machine migration?

What are your mount options for the storage domain?

Best Regards,
Strahil Nikolov






On Saturday, November 28, 2020, 18:25:15 GMT+2, Vinícius Ferrão via Users  wrote:





  


Hello,

 

I’m trying to discover why an oVirt 4.4.3 Cluster with two hosts and NFS shared 
storage on TrueNAS 12.0 is constantly getting XFS corruption inside the VMs.

 

For random reasons VMs get corrupted, sometimes halting, or just getting
silently corrupted; after a reboot the system is unable to boot due to
“corruption of in-memory data detected”. Sometimes the corrupted data is “all
zeroes”, sometimes there’s data there. In extreme cases XFS superblock 0 gets
corrupted and the system cannot even detect an XFS partition anymore, since the
XFS magic key is corrupted in the first blocks of the virtual disk.

 

This has been happening for a month now. We had to roll back to some backups,
and I no longer trust the state of the VMs.

 

Using xfs_db I can see that some VMs have corrupted superblocks while the VM is
up. One in particular had sb0 corrupted, so I knew that when a reboot kicked in
the machine would be gone, and that’s exactly what happened.

 

The other day I was just installing a new CentOS 8 VM, and after running
dnf -y update and a reboot the VM was corrupted, needing XFS repair. That was an
extreme case.

 

So, I’ve looked at the TrueNAS logs, and there’s apparently nothing wrong with
the system. No errors logged in dmesg, nothing in /var/log/messages and no
errors on the zpools, not even after scrub operations. On the switch, a
Catalyst 2960X, we’ve been monitoring all of its interfaces. There are no
“up and down” events and zero errors on all interfaces (we have a 4x port LACP
on the TrueNAS side and a 2x port LACP on each host); everything seems to be
fine. The only metric that I was unable to get is “dropped packets”, but I
don’t know if this can be an issue or not.

 

Finally, on oVirt, I can’t find anything either. I looked at /var/log/messages
and /var/log/sanlock.log but found nothing suspicious.

 

Is anyone out there experiencing this? Our VMs are mainly CentOS 7/8 with XFS;
there are 3 Windows VMs that do not seem to be affected, but everything else is
affected.

 

Thanks all.



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OWT5U6UTXNSBZELWFVID42XKYMUSCPDF/


[ovirt-users] Constantly XFS in memory corruption inside VMs

2020-11-28 Thread Vinícius Ferrão via Users
Hello,

I'm trying to discover why an oVirt 4.4.3 Cluster with two hosts and NFS shared 
storage on TrueNAS 12.0 is constantly getting XFS corruption inside the VMs.

For random reasons VMs get corrupted, sometimes halting, or just getting
silently corrupted; after a reboot the system is unable to boot due to
"corruption of in-memory data detected". Sometimes the corrupted data is "all
zeroes", sometimes there's data there. In extreme cases XFS superblock 0 gets
corrupted and the system cannot even detect an XFS partition anymore, since the
XFS magic key is corrupted in the first blocks of the virtual disk.

This has been happening for a month now. We had to roll back to some backups,
and I no longer trust the state of the VMs.

Using xfs_db I can see that some VMs have corrupted superblocks while the VM is
up. One in particular had sb0 corrupted, so I knew that when a reboot kicked in
the machine would be gone, and that's exactly what happened.

The other day I was just installing a new CentOS 8 VM, and after running
dnf -y update and a reboot the VM was corrupted, needing XFS repair. That was an
extreme case.

So, I've looked at the TrueNAS logs, and there's apparently nothing wrong with
the system. No errors logged in dmesg, nothing in /var/log/messages and no
errors on the zpools, not even after scrub operations. On the switch, a
Catalyst 2960X, we've been monitoring all of its interfaces. There are no
"up and down" events and zero errors on all interfaces (we have a 4x port LACP
on the TrueNAS side and a 2x port LACP on each host); everything seems to be
fine. The only metric that I was unable to get is "dropped packets", but I
don't know if this can be an issue or not.

Finally, on oVirt, I can't find anything either. I looked at /var/log/messages
and /var/log/sanlock.log but found nothing suspicious.

Is anyone out there experiencing this? Our VMs are mainly CentOS 7/8 with XFS;
there are 3 Windows VMs that do not seem to be affected, but everything else is
affected.

Thanks all.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VLYSE7HCFNWTWFZZTL2EJHV36OENHUGB/


[ovirt-users] Re: EPYC CPU not being detected correctly on cluster

2020-11-25 Thread Vinícius Ferrão via Users
Lucia, I ended up figuring it out.

The culprit was that I was pinned to the wrong virt module; after running these
commands the CPU was properly detected:

# dnf module reset virt
# dnf module enable virt:8.3
# dnf upgrade --nobest

I think virt was in 8.2.
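
To double-check which stream ended up enabled:

# dnf module list virt

The enabled stream is flagged with [e] in the output.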

Thank you!

From: Lucia Jelinkova 
Sent: Monday, November 23, 2020 6:25 AM
To: Vinícius Ferrão 
Cc: users 
Subject: Re: [ovirt-users] EPYC CPU not being detected correctly on cluster

Hi Vinícius,

Thank you for the libvirt output - libvirt marked the EPYC CPU as not usable.
Let's ask qemu why that is. You do not need an oVirt VM to do that, just any
VM running on qemu, e.g. one created by Virtual Machine Manager, or you can
follow the command from the answer here:

https://unix.stackexchange.com/questions/309788/how-to-create-a-vm-from-scratch-with-virsh

Then you can use the following commands:
sudo virsh list --all
sudo virsh qemu-monitor-command [your-vm's-name] --pretty 
'{"execute":"query-cpu-definitions"}'

I do not know if this could be related to the UEFI firmware; let's check the
qemu output first.

Regards,

Lucia


On Fri, Nov 20, 2020 at 4:07 PM Vinícius Ferrão  wrote:
Hi Lucia,

I had to create a user for virsh:
# saslpasswd2 -a libvirt test
Password:
Again (for verification):

With that in mind, here’s the outputs:


  /usr/libexec/qemu-kvm
  kvm
  pc-i440fx-rhel7.6.0
  x86_64
  
  
  


  /usr/share/OVMF/OVMF_CODE.secboot.fd
  
rom
pflash
  
  
yes
no
  
  
no
  

  
  


  EPYC-IBPB
  AMD
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  


  qemu64
  qemu32
  phenom
  pentium3
  pentium2
  pentium
  n270
  kvm64
  kvm32
  coreduo
  core2duo
  athlon
  Westmere-IBRS
  Westmere
  Skylake-Server-noTSX-IBRS
  Skylake-Server-IBRS
  Skylake-Server
  Skylake-Client-noTSX-IBRS
  Skylake-Client-IBRS
  Skylake-Client
  SandyBridge-IBRS
  SandyBridge
  Penryn
  Opteron_G5
  Opteron_G4
  Opteron_G3
  Opteron_G2
  Opteron_G1
  Nehalem-IBRS
  Nehalem
  IvyBridge-IBRS
  IvyBridge
  Icelake-Server-noTSX
  Icelake-Server
  Icelake-Client-noTSX
  Icelake-Client
  Haswell-noTSX-IBRS
  Haswell-noTSX
  Haswell-IBRS
  Haswell
  EPYC-IBPB
  EPYC
  Dhyana
 Cooperlake
  Conroe
  Cascadelake-Server-noTSX
  Cascadelake-Server
  Broadwell-noTSX-IBRS
  Broadwell-noTSX
  Broadwell-IBRS
  Broadwell
  486

  
  

  
disk
cdrom
floppy
lun
  
  
ide
fdc
scsi
virtio
usb
sata
  
  
virtio
virtio-transitional
virtio-non-transitional
  


  
sdl
vnc
spice
  


  
vga
cirrus
qxl
virtio
none
bochs
ramfb
  


  
subsystem
  
  
default
mandatory
requisite
optional
  
  
usb
pci
scsi
  
  
  
default
vfio
  


  
virtio
virtio-transitional
virtio-non-transitional
  
  
random
egd
  

  
  






  47
  1

  


Regarding the last two commands, I don’t have any VMs running, since I cannot
start anything from the engine.

I’m starting to suspect that this may be something in the UEFI Firmware.

Any thoughts?

Thanks,

From: Lucia Jelinkova 
Sent: Friday, November 20, 2020 5:30 AM
To: Vinícius Ferrão 
Cc: users 
Subject: Re: [ovirt-users] EPYC CPU not being detected correctly on cluster

Hi,

oVirt CPU detection depends on libvirt (and that depends on qemu) CPU models. 
Could you please run the following command to see what libvirt reports?

virsh domcapabilities

That should give you the list of CPUs known to libvirt with a usability flag 
for each CPU.
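
If the list is long, you can jump straight to the relevant models, for example:

virsh domcapabilities | grep -i epyc

and check whether the EPYC entry carries usable='yes'.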

If you find out that the CPU is not usable by libvirt, you might want to dig
deeper by querying qemu directly.

Locate any VM running on the system by
sudo virsh list --all

Use the name of a VM in the following command:
sudo virsh qemu-monitor-command [your-vm's-name] --pretty 
'{"execute":"query-cpu-definitions"}'

That will give you the list of all CPUs supported by qemu, and for each one the
CPU features that are not available on your system.

Regards,

Lucia

On Thu, Nov 19, 2020 at 9:38 PM Vinícius Ferrão via Users  wrote:

[ovirt-users] Re: EPYC CPU not being detected correctly on cluster

2020-11-20 Thread Vinícius Ferrão via Users
Hi Lucia,

I had to create a user for virsh:
# saslpasswd2 -a libvirt test
Password:
Again (for verification):

With that in mind, here’s the outputs:


  /usr/libexec/qemu-kvm
  kvm
  pc-i440fx-rhel7.6.0
  x86_64
  
  
  


  /usr/share/OVMF/OVMF_CODE.secboot.fd
  
rom
pflash
  
  
yes
no
  
  
no
  

  
  


  EPYC-IBPB
  AMD
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  


  qemu64
  qemu32
  phenom
  pentium3
  pentium2
  pentium
  n270
  kvm64
  kvm32
  coreduo
  core2duo
  athlon
  Westmere-IBRS
  Westmere
  Skylake-Server-noTSX-IBRS
  Skylake-Server-IBRS
  Skylake-Server
  Skylake-Client-noTSX-IBRS
  Skylake-Client-IBRS
  Skylake-Client
  SandyBridge-IBRS
  SandyBridge
  Penryn
  Opteron_G5
  Opteron_G4
  Opteron_G3
  Opteron_G2
  Opteron_G1
  Nehalem-IBRS
  Nehalem
  IvyBridge-IBRS
  IvyBridge
  Icelake-Server-noTSX
  Icelake-Server
  Icelake-Client-noTSX
  Icelake-Client
  Haswell-noTSX-IBRS
  Haswell-noTSX
  Haswell-IBRS
  Haswell
  EPYC-IBPB
  EPYC
  Dhyana
 Cooperlake
  Conroe
  Cascadelake-Server-noTSX
  Cascadelake-Server
  Broadwell-noTSX-IBRS
  Broadwell-noTSX
  Broadwell-IBRS
  Broadwell
  486

  
  

  
disk
cdrom
floppy
lun
  
  
ide
fdc
scsi
virtio
usb
sata
  
  
virtio
virtio-transitional
virtio-non-transitional
  


  
sdl
vnc
spice
  


  
vga
cirrus
qxl
virtio
none
bochs
ramfb
  


  
subsystem
  
  
default
mandatory
requisite
optional
  
  
usb
pci
scsi
  
  
  
default
vfio
  


  
virtio
virtio-transitional
virtio-non-transitional
  
  
random
egd
  

  
  






  47
  1

  


Regarding the last two commands, I don’t have any VMs running, since I cannot
start anything from the engine.

I’m starting to suspect that this may be something in the UEFI Firmware.

Any thoughts?

Thanks,

From: Lucia Jelinkova 
Sent: Friday, November 20, 2020 5:30 AM
To: Vinícius Ferrão 
Cc: users 
Subject: Re: [ovirt-users] EPYC CPU not being detected correctly on cluster

Hi,

oVirt CPU detection depends on libvirt (and that depends on qemu) CPU models. 
Could you please run the following command to see what libvirt reports?

virsh domcapabilities

That should give you the list of CPUs known to libvirt with a usability flag 
for each CPU.

If you find out that the CPU is not usable by libvirt, you might want to dig
deeper by querying qemu directly.

Locate any VM running on the system by
sudo virsh list --all

Use the name of a VM in the following command:
sudo virsh qemu-monitor-command [your-vm's-name] --pretty 
'{"execute":"query-cpu-definitions"}'

That will give you the list of all CPUs supported by qemu, and for each one the
CPU features that are not available on your system.

Regards,

Lucia

On Thu, Nov 19, 2020 at 9:38 PM Vinícius Ferrão via Users  wrote:
Hi

I have a strange issue with two hosts (not using the hypervisor image) with
EPYC CPUs; on the engine I got this message:

The host CPU does not match the Cluster CPU Type and is running in a degraded 
mode. It is missing the following CPU flags: model_EPYC. Please update the host 
CPU microcode or change the Cluster CPU Type.

But it is an EPYC CPU and the firmware is updated to the latest version, yet
for some reason oVirt does not like it.

Here’s the relevant output from VDSM:
"cpuCores": "128",
"cpuFlags": 
"ibs,vme,abm,sep,ssse3,perfctr_core,sse4_2,skip-l1dfl-vmentry,cx16,pae,misalignsse,avx2,smap,movbe,vgif,rdctl-no,extapic,clflushopt,de,sse4_1,xsaveerptr,perfctr_llc,fma,mca,sse,rdtscp,monitor,umip,mwaitx,cr8_legacy,mtrr,stibp,bmi2,pclmulqdq,amd-ssbd,lbrv,pdpe1gb,constant_tsc,vmmcall,f16c,ibrs,fsgsbase,invtsc,nopl,lm,3dnowprefetch,smca,ht,tsc_adjust,popcnt,cpb,bmi1,mmx,arat,aperfmperf,bpext,cqm_occup_llc,virt-ssbd,tce,pse,xsave,xgetbv1,topoext,sha_ni,amd_ppin,rdrand,cpuid,tsc_scale,extd_apicid,cqm,rep_good,tsc,sse4a,flushbyasid,pschange-mc-no,mds-no,ibpb,smep,clflush,tsc-deadline,fxsr,pat,avx,pfthreshold,v_vmsave_vmload,osvw,xsavec,cdp_l3,clzero,svm_lock,nonstop_tsc,adx,hw_pstate,spec-ctrl,arch-capabilities,xsaveopt,skinit,rdt_a,svm,rdpid,lah

[ovirt-users] EPYC CPU not being detected correctly on cluster

2020-11-19 Thread Vinícius Ferrão via Users
Hi

I have a strange issue with two hosts (not using the hypervisor image) with
EPYC CPUs; on the engine I got this message:

The host CPU does not match the Cluster CPU Type and is running in a degraded 
mode. It is missing the following CPU flags: model_EPYC. Please update the host 
CPU microcode or change the Cluster CPU Type.

But it is an EPYC CPU and the firmware is updated to the latest version, yet
for some reason oVirt does not like it.

Here's the relevant output from VDSM:
"cpuCores": "128",
"cpuFlags": 
"ibs,vme,abm,sep,ssse3,perfctr_core,sse4_2,skip-l1dfl-vmentry,cx16,pae,misalignsse,avx2,smap,movbe,vgif,rdctl-no,extapic,clflushopt,de,sse4_1,xsaveerptr,perfctr_llc,fma,mca,sse,rdtscp,monitor,umip,mwaitx,cr8_legacy,mtrr,stibp,bmi2,pclmulqdq,amd-ssbd,lbrv,pdpe1gb,constant_tsc,vmmcall,f16c,ibrs,fsgsbase,invtsc,nopl,lm,3dnowprefetch,smca,ht,tsc_adjust,popcnt,cpb,bmi1,mmx,arat,aperfmperf,bpext,cqm_occup_llc,virt-ssbd,tce,pse,xsave,xgetbv1,topoext,sha_ni,amd_ppin,rdrand,cpuid,tsc_scale,extd_apicid,cqm,rep_good,tsc,sse4a,flushbyasid,pschange-mc-no,mds-no,ibpb,smep,clflush,tsc-deadline,fxsr,pat,avx,pfthreshold,v_vmsave_vmload,osvw,xsavec,cdp_l3,clzero,svm_lock,nonstop_tsc,adx,hw_pstate,spec-ctrl,arch-capabilities,xsaveopt,skinit,rdt_a,svm,rdpid,lahf_lm,fpu,rdseed,fxsr_opt,sse2,nrip_save,vmcb_clean,sme,cat_l3,cqm_mbm_local,irperf,overflow_recov,avic,mce,mmxext,msr,cx8,hypervisor,wdt,mba,nx,decodeassists,cmp_legacy,x2apic,perfctr_nb,succor,pni,xsaves,clwb,cqm_llc,syscall,apic,pge,npt,pse36,cmov,ssbd,pausefilter,sev,aes,wbnoinvd,cqm_mbm_total,spec_ctrl,model_qemu32,model_Opteron_G3,model_Nehalem-IBRS,model_qemu64,model_Conroe,model_kvm64,model_Penryn,model_SandyBridge,model_pentium,model_pentium2,model_kvm32,model_Nehalem,model_Opteron_G2,model_pentium3,model_Opteron_G1,model_SandyBridge-IBRS,model_486,model_Westmere-IBRS,model_Westmere",
"cpuModel": "AMD EPYC 7H12 64-Core Processor",
"cpuSockets": "2",
"cpuSpeed": "3293.405",
"cpuThreads": "256",

Any idea why, or what to do to fix it?
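
For reference, a quick way to list exactly which model_ flags VDSM computed on a 
host (a sketch; vdsm-client ships with the host packages):

vdsm-client Host getCapabilities | grep -o 'model_[A-Za-z0-9_.-]*' | sort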

Thanks,

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WP6XL6ODTLJVB46MAXKCOA34PEFN576Q/


[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-22 Thread Vinícius Ferrão via Users
Hi again Strahil,

It’s oVirt 4.3.10. Same CPU on the entire cluster: three machines with 
Xeon E5-2620v2 (Ivy Bridge), all identical in model and specs.

I’ve changed the VM CPU Model to:
Nehalem,+spec-ctrl,+ssbd

Let’s see how it behaves. If it crashes again I’ll definitely look at rolling 
back the OS updates.

Thank you all.

PS: I can try upgrading to 4.4.

> On 22 Sep 2020, at 04:28, Strahil Nikolov  wrote:
> 
> This looks much like my OpenBSD 6.6 issue under the latest AMD CPUs. KVM did not accept 
> a pretty valid instruction and it was a bug in KVM.
> 
> Maybe you can try to :
> - power off the VM
> - pick an older CPU type for that VM only
> - power on and monitor in the next days 
> 
> Do you have a cluster with different cpu vendor (if currently on AMD -> Intel 
> and if currently Intel -> AMD)? Maybe you can move it to another cluster and 
> identify if the issue happens there too.
> 
> Another option is to try to roll back the Windows updates, to identify if any 
> of them caused the problem. Yet, that's a workaround and not a fix.
> 
> 
> Are you using oVirt 4.3 or 4.4 ?
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> 
> 
> 
> 
> On Tuesday, 22 September 2020 at 10:08:44 GMT+3, Vinícius Ferrão 
>  wrote: 
> 
> 
> 
> 
> 
> Hi Strahil, yes I can’t find anything recent either. You dug way further 
> than me. I found some regressions in the kernel, but I don’t know if they’re 
> related or not: 
> 
> 
> 
> https://patchwork.kernel.org/patch/5526561/
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1045027
> 
> 
> 
> 
> Regarding the OS, nothing new was installed, just regular Windows Updates.
> 
> And finally, about nested virtualisation: it’s disabled on the hypervisor.
> 
> 
> 
> 
> One thing that caught my attention in the link you sent is regarding a 
> rootkit: https://devblogs.microsoft.com/oldnewthing/20060421-12/?p=31443
> 
> 
> 
> 
> But come on, it’s from 2006…
> 
> 
> 
> 
> Well, I’m up to other ideas, VM just crashed once again:
> 
> 
> 
> 
> EAX= EBX=075c5180 ECX=75432002 EDX=000400b6
> ESI=c8ddc080 EDI=075d6800 EBP=a19bbdfe ESP=7db5d770
> EIP=8000 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=1 HLT=0
> ES =   00809300
> CS =9900 7ff99000  00809300
> SS =   00809300
> DS =   00809300
> FS =   00809300
> GS =   00809300
> LDT=  000f 
> TR =0040 075da000 0067 8b00
> GDT= 075dbfb0 0057
> IDT=  
> CR0=00050032 CR2=242cb25a CR3=001ad002 CR4=
> DR0= DR1= DR2= 
> DR3= 
> DR6=4ff0 DR7=0400
> EFER=
> Code=ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ff ff 
> ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> ff
> 
> 
> 
> 
> [519192.536247] *** Guest State ***
> [519192.536275] CR0: actual=0x00050032, shadow=0x00050032, 
> gh_mask=fff7
> [519192.536324] CR4: actual=0x2050, shadow=0x, 
> gh_mask=f871
> [519192.537322] CR3 = 0x001ad002
> [519192.538166] RSP = 0xfb047db5d770  RIP = 0x8000
> [519192.539017] RFLAGS=0x0002 DR7 = 0x0400
> [519192.539861] Sysenter RSP= CS:RIP=:
> [519192.540690] CS:   sel=0x9900, attr=0x08093, limit=0x, 
> base=0x7ff99000
> [519192.541523] DS:   sel=0x, attr=0x08093, limit=0x, 
> base=0x
> [519192.542356] SS:   sel=0x, attr=0x08093, limit=0x, 
> base=0x
> [519192.543167] ES:   sel=0x, attr=0x08093, limit=0x, 
> base=0x
> [519192.543961] FS:   sel=0x, attr=0x08093, limit=0x, 
> base=0x
> [519192.544747] GS:   sel=0x, attr=0x08093, limit=0x, 
> base=0x
> [519192.545511] GDTR:   limit=0x0057, 
> base=0xad01075dbfb0
> [519192.546275] LDTR: sel=0x, attr=0x1, limit=0x000f, 
> base=0x
> [519192.547052] IDTR:   limit=0x, 
> base=0x
> [519192.547841] TR:   sel=0x0040, attr=0x0008b, limit=0x0067, 
> base=0xad01075da000
> [519192.548639] EFER = 0x  PAT = 0x0007010600070106
> [519192.549460] DebugCtl = 0x  DebugExceptions = 
> 0x
> [519192.550302] Interruptibility =

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-22 Thread Vinícius Ferrão via Users
Hi Gianluca.

On 22 Sep 2020, at 04:24, Gianluca Cecchi 
<gianluca.cec...@gmail.com> wrote:



On Tue, Sep 22, 2020 at 9:12 AM Vinícius Ferrão via Users 
<users@ovirt.org> wrote:
Hi Strahil, yes I can’t find anything recent either. You dug way further 
than me. I found some regressions in the kernel, but I don’t know if they’re 
related or not:

https://patchwork.kernel.org/patch/5526561/
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1045027

Regarding the OS, nothing new was installed, just regular Windows Updates.
And finally, about nested virtualisation: it’s disabled on the hypervisor.



In your original post you wrote about the VM becoming suspended.
So I think there could be something useful in engine.log on the engine and/or 
vdsm.log on the hypervisor.
Could you check those?

Yes, it goes to suspend. I think this is just the engine not knowing what 
really happened and guessing it was suspended. In engine.log I only have these 
two lines:

# grep "2020-09-22 01:51" /var/log/ovirt-engine/engine.log
2020-09-22 01:51:52,604-03 INFO  
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] 
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-57) [] VM 
'351db98a-5f74-439f-99a4-31f611b2d250'(cerulean) moved from 'Up' --> 'Paused'
2020-09-22 01:51:52,699-03 INFO  
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-57) [] 
EVENT_ID: VM_PAUSED(1,025), VM cerulean has been paused.

Note that I’ve grepped by time. There are only these two lines from when it 
crashed, about 2h30m ago.

In vdsm.log, around that time, the only thing I found with the name of the VM 
was a huge JSON with the characteristics of the VM. Is there something I should 
check specifically? I tried some combinations of grep but found nothing really useful.
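
One way to narrow that down (a sketch, reusing the VM id from the engine log above):

grep '351db98a-5f74-439f-99a4-31f611b2d250' /var/log/vdsm/vdsm.log | grep -i paus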

Also, do you see anything in the Event Viewer of the Windows VM and/or in the 
FreeNAS logs?

FreeNAS is just fine, nothing wrong there. No errors in dmesg, no resource 
starvation on ZFS, no overload on the disks, nothing… the storage is running 
easy.

The Windows Event Viewer is my Achilles’ heel; nothing relevant there either, as 
far as I can tell. There are of course some mentions of an improper shutdown 
due to the crash, but nothing else. I’m looking further here and will report back 
if I find something useful.

Thanks,


Gianluca

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XTTUYAGYB6EE5I3XNNLBZEBWY363XTIQ/


[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-22 Thread Vinícius Ferrão via Users
Hi Strahil, yes I can’t find anything recent either. You dug way further 
than me. I found some regressions in the kernel, but I don’t know if they’re 
related or not:

https://patchwork.kernel.org/patch/5526561/
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1045027

Regarding the OS, nothing new was installed, just regular Windows Updates.
And finally, about nested virtualisation: it’s disabled on the hypervisor.

One thing that caught my attention in the link you sent is regarding a 
rootkit: https://devblogs.microsoft.com/oldnewthing/20060421-12/?p=31443

But come on, it’s from 2006…

Well, I’m up to other ideas, VM just crashed once again:

EAX= EBX=075c5180 ECX=75432002 EDX=000400b6
ESI=c8ddc080 EDI=075d6800 EBP=a19bbdfe ESP=7db5d770
EIP=8000 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=1 HLT=0
ES =   00809300
CS =9900 7ff99000  00809300
SS =   00809300
DS =   00809300
FS =   00809300
GS =   00809300
LDT=  000f 
TR =0040 075da000 0067 8b00
GDT= 075dbfb0 0057
IDT=  
CR0=00050032 CR2=242cb25a CR3=001ad002 CR4=
DR0= DR1= DR2= 
DR3=
DR6=4ff0 DR7=0400
EFER=
Code=ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ff ff ff 
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

[519192.536247] *** Guest State ***
[519192.536275] CR0: actual=0x00050032, shadow=0x00050032, 
gh_mask=fff7
[519192.536324] CR4: actual=0x2050, shadow=0x, 
gh_mask=f871
[519192.537322] CR3 = 0x001ad002
[519192.538166] RSP = 0xfb047db5d770  RIP = 0x8000
[519192.539017] RFLAGS=0x0002 DR7 = 0x0400
[519192.539861] Sysenter RSP= CS:RIP=:
[519192.540690] CS:   sel=0x9900, attr=0x08093, limit=0x, 
base=0x7ff99000
[519192.541523] DS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[519192.542356] SS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[519192.543167] ES:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[519192.543961] FS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[519192.544747] GS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[519192.545511] GDTR:   limit=0x0057, 
base=0xad01075dbfb0
[519192.546275] LDTR: sel=0x, attr=0x1, limit=0x000f, 
base=0x
[519192.547052] IDTR:   limit=0x, 
base=0x
[519192.547841] TR:   sel=0x0040, attr=0x0008b, limit=0x0067, 
base=0xad01075da000
[519192.548639] EFER = 0x  PAT = 0x0007010600070106
[519192.549460] DebugCtl = 0x  DebugExceptions = 
0x
[519192.550302] Interruptibility = 0009  ActivityState = 
[519192.551137] *** Host State ***
[519192.551963] RIP = 0xc150a034  RSP = 0x88cd9cafbc90
[519192.552805] CS=0010 SS=0018 DS= ES= FS= GS= TR=0040
[519192.553646] FSBase=7f7da762a700 GSBase=88d45f2c 
TRBase=88d45f2c4000
[519192.554496] GDTBase=88d45f2cc000 IDTBase=ff528000
[519192.555347] CR0=80050033 CR3=00033dc82000 CR4=001627e0
[519192.556202] Sysenter RSP= CS:RIP=0010:91596cc0
[519192.557058] EFER = 0x0d01  PAT = 0x0007050600070106
[519192.557913] *** Control State ***
[519192.558757] PinBased=003f CPUBased=b6a1edfa SecondaryExec=0ceb
[519192.559605] EntryControls=d1ff ExitControls=002fefff
[519192.560453] ExceptionBitmap=00060042 PFECmask= PFECmatch=
[519192.561306] VMEntry: intr_info= errcode=0006 ilen=
[519192.562158] VMExit: intr_info= errcode= ilen=0001
[519192.563006] reason=8021 qualification=
[519192.563860] IDTVectoring: info= errcode=
[519192.564695] TSC Offset = 0xfffcc6c7d53f16d7
[519192.565526] TPR Threshold = 0x00
[519192.566345] EPT pointer = 0x000b9397901e
[519192.567162] PLE Gap=0080 Window=1000
[519192.567984] Virtual processor ID = 0x0005


Thank you!


On 22 Sep 2020, at 02:30, Strahil Nikolov 
<hunter86...@yahoo.com> wrote:

Interesting is that I don't find anything recent, but this one:
https://devblogs.microsoft.com/oldnewthing/20120511-00/?p=7653

Can you check if anything in the OS was updated/changed recently?

Also check if the VM has nested virtualization enabled.

Best Regards,
Strahil Nikolov






On Monday, 21 September 2020 at 23:56:26 GMT+3, Vinícius Ferrão 
 wrote:





Strahil, thank you man. We finally got some 

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-21 Thread Vinícius Ferrão via Users
Strahil, thank you man. We finally got some output:

2020-09-15T12:34:49.362238Z qemu-kvm: warning: CPU(s) not present in any NUMA 
nodes: CPU 10 [socket-id: 10, core-id: 0, thread-id: 0], CPU 11 [socket-id: 11, 
core-id: 0, thread-id: 0], CPU 12 [socket-id: 12, core-id: 0, thread-id: 0], 
CPU 13 [socket-id: 13, core-id: 0, thread-id: 0], CPU 14 [socket-id: 14, 
core-id: 0, thread-id: 0], CPU 15 [socket-id: 15, core-id: 0, thread-id: 0]
2020-09-15T12:34:49.362265Z qemu-kvm: warning: All CPU(s) up to maxcpus should 
be described in NUMA config, ability to start up with partial NUMA mappings is 
obsoleted and will be removed in future
KVM: entry failed, hardware error 0x8021

If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.

EAX= EBX=01746180 ECX=4be7c002 EDX=000400b6
ESI=8b3d6080 EDI=02d70400 EBP=a19bbdfe ESP=82883770
EIP=8000 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=1 HLT=0
ES =   00809300
CS =8d00 7ff8d000  00809300
SS =   00809300
DS =   00809300
FS =   00809300
GS =   00809300
LDT=  000f 
TR =0040 04c59000 0067 8b00
GDT= 04c5afb0 0057
IDT=  
CR0=00050032 CR2=c1b7ec48 CR3=001ad002 CR4=
DR0= DR1= DR2= 
DR3= 
DR6=0ff0 DR7=0400
EFER=
Code=ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ff ff ff 
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
2020-09-16T04:11:55.344128Z qemu-kvm: terminating on signal 15 from pid 1 
()
2020-09-16 04:12:02.212+: shutting down, reason=shutdown






That’s the issue; I got this in the logs of both physical machines. The 
probability that both machines are damaged is quite low, right? So even with the 
log saying it’s a hardware error, it may be software related? And again, this 
only happens with this VM.

> On 21 Sep 2020, at 17:36, Strahil Nikolov  wrote:
> 
> Usually libvirt's log might provide hints (yet, no clues) of any issues.
> 
> For example: 
> /var/log/libvirt/qemu/<vm-name>.log
> 
> Anything changed recently (maybe the oVirt version was increased)?
> 
> Best Regards,
> Strahil Nikolov
> 
> 
> 
> 
> 
> 
> В понеделник, 21 септември 2020 г., 23:28:13 Гринуич+3, Vinícius Ferrão 
>  написа: 
> 
> 
> 
> 
> 
> Hi Strahil, 
> 
> 
> 
> Both disks are VirtIO-SCSI and are Preallocated:
> 
> [inline screenshot of the disk configuration omitted]
> 
> Thanks,
> 
> 
> 
> 
> 
> 
> 
> 
>>   
>> On 21 Sep 2020, at 17:09, Strahil Nikolov  wrote:
>> 
>> 
>>   
>> What type of disks are you using? Any chance you use thin disks?
>> 
>> Best Regards,
>> Strahil Nikolov
>> 
>> 
>> 
>> 
>> 
>> 
>> On Monday, 21 September 2020 at 07:20:23 GMT+3, Vinícius Ferrão via 
>> Users  wrote: 
>> 
>> 
>> 
>> 
>> 
>> Hi, sorry to bump the thread.
>> 
>> But I’m still having this issue on the VM. These crashes are still happening, 
>> and I really don’t know what to do. Since there’s nothing in the logs, except 
>> that message in `dmesg` of the host machine, I started changing settings to 
>> see if anything changes or if at least I get a pattern.
>> 
>> What I’ve tried:
>> 1. Disabled I/O Threading on VM.
>> 2. Increased I/O Threading to 2 from 1.
>> 3. Disabled Memory Balooning.
>> 4. Reduced VM resources from 10 CPUs and 48GB of RAM to 6 CPUs and 24GB of 
>> RAM.
>> 5. Moved the VM to another host.
>> 6. Dedicated a host specific to this VM.
>> 7. Checked the storage system to see if there’s any resource starvation, 
>> but everything seems to be fine.
>> 8. Checked both iSCSI switches to see if there’s something wrong with the 
>> fabrics: 0 errors.
>> 
>> I’m really running out of ideas. The VM was working normally and suddenly 
>> this started.
>> 
>> Thanks,
>> 
>> PS: When I was typing this message it crashed again:
>> 
>> [427483.126725] *** Guest State ***
>> [427483.127661] CR0: actual=0x00050032, shadow=0x00050032, 
>> gh_mask=fff7
>> [427483.128505] CR4: actual=0x2050, shadow=0x, 
>> gh_mask=f871
>> [427483.129342] CR3 = 0x0001849ff002
>> [427483.130177] RSP = 0xb10186b0  RI

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-20 Thread Vinícius Ferrão via Users
Hi, sorry to bump the thread.

But I’m still having this issue on the VM. These crashes are still happening, and I 
really don’t know what to do. Since there’s nothing in the logs, except that 
message in `dmesg` of the host machine, I started changing settings to see if 
anything changes or if at least I get a pattern.

What I’ve tried:
1. Disabled I/O Threading on VM.
2. Increased I/O Threading to 2 from 1.
3. Disabled Memory Balooning.
4. Reduced VM resources from 10 CPUs and 48GB of RAM to 6 CPUs and 24GB of 
RAM.
5. Moved the VM to another host.
6. Dedicated a host specific to this VM.
7. Checked the storage system to see if there’s any resource starvation, but 
everything seems to be fine.
8. Checked both iSCSI switches to see if there’s something wrong with the 
fabrics: 0 errors.

I’m really running out of ideas. The VM was working normally and suddenly this 
started.
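
One more data point that may be worth grabbing the next time it pauses (a 
sketch; on oVirt hosts virsh usually needs the read-only flag to skip 
authentication):

virsh -r domstate cerulean --reason

That prints the state together with the reason libvirt recorded for the pause.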

Thanks,

PS: When I was typing this message it crashed again:

[427483.126725] *** Guest State ***
[427483.127661] CR0: actual=0x00050032, shadow=0x00050032, 
gh_mask=fff7
[427483.128505] CR4: actual=0x2050, shadow=0x, 
gh_mask=f871
[427483.129342] CR3 = 0x0001849ff002
[427483.130177] RSP = 0xb10186b0  RIP = 0x8000
[427483.131014] RFLAGS=0x0002 DR7 = 0x0400
[427483.131859] Sysenter RSP= CS:RIP=:
[427483.132708] CS:   sel=0x9b00, attr=0x08093, limit=0x, 
base=0x7ff9b000
[427483.133559] DS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[427483.134413] SS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[427483.135237] ES:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[427483.136040] FS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[427483.136842] GS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[427483.137629] GDTR:   limit=0x0057, 
base=0xb10186eb4fb0
[427483.138409] LDTR: sel=0x, attr=0x1, limit=0x000f, 
base=0x
[427483.139202] IDTR:   limit=0x, 
base=0x
[427483.139998] TR:   sel=0x0040, attr=0x0008b, limit=0x0067, 
base=0xb10186eb3000
[427483.140816] EFER = 0x  PAT = 0x0007010600070106
[427483.141650] DebugCtl = 0x  DebugExceptions = 
0x
[427483.142503] Interruptibility = 0009  ActivityState = 
[427483.143353] *** Host State ***
[427483.144194] RIP = 0xc0c65024  RSP = 0x9253c0b9bc90
[427483.145043] CS=0010 SS=0018 DS= ES= FS= GS= TR=0040
[427483.145903] FSBase=7fcc13816700 GSBase=925adf24 
TRBase=925adf244000
[427483.146766] GDTBase=925adf24c000 IDTBase=ff528000
[427483.147630] CR0=80050033 CR3=0010597b6000 CR4=001627e0
[427483.148498] Sysenter RSP= CS:RIP=0010:8f196cc0
[427483.149365] EFER = 0x0d01  PAT = 0x0007050600070106
[427483.150231] *** Control State ***
[427483.151077] PinBased=003f CPUBased=b6a1edfa SecondaryExec=0ceb
[427483.151942] EntryControls=d1ff ExitControls=002fefff
[427483.152800] ExceptionBitmap=00060042 PFECmask= PFECmatch=
[427483.153661] VMEntry: intr_info= errcode=0006 ilen=
[427483.154521] VMExit: intr_info= errcode= ilen=0004
[427483.155376] reason=8021 qualification=
[427483.156230] IDTVectoring: info= errcode=
[427483.157068] TSC Offset = 0xfffccfc261506dd9
[427483.157905] TPR Threshold = 0x0d
[427483.158728] EPT pointer = 0x0009b437701e
[427483.159550] PLE Gap=0080 Window=0008
[427483.160370] Virtual processor ID = 0x0004


> On 16 Sep 2020, at 17:11, Vinícius Ferrão  wrote:
> 
> Hello,
> 
> I have an Exchange Server VM that’s going into suspend without the possibility of 
> recovery. I need to click on shutdown and then power on. I can’t find 
> anything useful in the logs, except in “dmesg” of the host:
> 
> [47807.747606] *** Guest State ***
> [47807.747633] CR0: actual=0x00050032, shadow=0x00050032, 
> gh_mask=fff7
> [47807.747671] CR4: actual=0x2050, shadow=0x, 
> gh_mask=f871
> [47807.747721] CR3 = 0x001ad002
> [47807.747739] RSP = 0xc20904fa3770  RIP = 0x8000
> [47807.747766] RFLAGS=0x0002 DR7 = 0x0400
> [47807.747792] Sysenter RSP= CS:RIP=:
> [47807.747821] CS:   sel=0x9100, attr=0x08093, limit=0x, 
> base=0x7ff91000
> [47807.747855] DS:   sel=0x, attr=0x08093, limit=0x, 
> base=0x
> [47807.747889] SS:   sel=0x, attr=0x08093, limit=0x, 

[ovirt-users] How to discover why a VM is getting suspended without recovery possibility?

2020-09-16 Thread Vinícius Ferrão via Users
Hello,

I have an Exchange Server VM that’s going into suspend without the possibility of 
recovery. I need to click on shutdown and then power on. I can’t find anything 
useful in the logs, except in “dmesg” of the host:

[47807.747606] *** Guest State ***
[47807.747633] CR0: actual=0x00050032, shadow=0x00050032, 
gh_mask=fff7
[47807.747671] CR4: actual=0x2050, shadow=0x, 
gh_mask=f871
[47807.747721] CR3 = 0x001ad002
[47807.747739] RSP = 0xc20904fa3770  RIP = 0x8000
[47807.747766] RFLAGS=0x0002 DR7 = 0x0400
[47807.747792] Sysenter RSP= CS:RIP=:
[47807.747821] CS:   sel=0x9100, attr=0x08093, limit=0x, 
base=0x7ff91000
[47807.747855] DS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[47807.747889] SS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[47807.747923] ES:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[47807.747957] FS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[47807.747991] GS:   sel=0x, attr=0x08093, limit=0x, 
base=0x
[47807.748025] GDTR:   limit=0x0057, 
base=0x80817e7d5fb0
[47807.748059] LDTR: sel=0x, attr=0x1, limit=0x000f, 
base=0x
[47807.748093] IDTR:   limit=0x, 
base=0x
[47807.748128] TR:   sel=0x0040, attr=0x0008b, limit=0x0067, 
base=0x80817e7d4000
[47807.748162] EFER = 0x  PAT = 0x0007010600070106
[47807.748189] DebugCtl = 0x  DebugExceptions = 
0x
[47807.748221] Interruptibility = 0009  ActivityState = 
[47807.748248] *** Host State ***
[47807.748263] RIP = 0xc0c65024  RSP = 0x9252bda5fc90
[47807.748290] CS=0010 SS=0018 DS= ES= FS= GS= TR=0040
[47807.748318] FSBase=7f46d462a700 GSBase=9252ffac 
TRBase=9252ffac4000
[47807.748351] GDTBase=9252ffacc000 IDTBase=ff528000
[47807.748377] CR0=80050033 CR3=00105ac8c000 CR4=001627e0
[47807.748407] Sysenter RSP= CS:RIP=0010:8f196cc0
[47807.748435] EFER = 0x0d01  PAT = 0x0007050600070106
[47807.748461] *** Control State ***
[47807.748478] PinBased=003f CPUBased=b6a1edfa SecondaryExec=0ceb
[47807.748507] EntryControls=d1ff ExitControls=002fefff
[47807.748531] ExceptionBitmap=00060042 PFECmask= PFECmatch=
[47807.748561] VMEntry: intr_info= errcode=0006 ilen=
[47807.748589] VMExit: intr_info= errcode= ilen=0001
[47807.748618] reason=8021 qualification=
[47807.748645] IDTVectoring: info= errcode=
[47807.748669] TSC Offset = 0xf9b8c8d943b6
[47807.748699] TPR Threshold = 0x00
[47807.748715] EPT pointer = 0x00105cd5601e
[47807.748735] PLE Gap=0080 Window=1000
[47807.748755] Virtual processor ID = 0x0003

So something really went crazy. The VM has been going down at least twice a day 
for the last 5 days.

At first I thought it was a hardware issue, so I restarted the VM on another 
host, and the same thing happened.

The VM is configured with 10 CPUs and 48GB of RAM, running on oVirt 4.3.10 
with iSCSI storage to a FreeNAS box, where the VM disks live; there is 
a 300GB disk for C:\ and a 2TB disk for D:\.

Any idea how to start troubleshooting it?

Thanks,

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/X34PTPXY5GLAULTQ2ZCB3PGZA2MON5KX/


[ovirt-users] Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Vinícius Ferrão via Users
Thanks Michael and Arman.

To make things clear, you guys are using Passthrough, right? It’s not vGPU. The 
4x GPUs are added on the “Host Devices” tab of the VM.
What I’m trying to achieve is to add the 4x V100 directly to one specific VM.

And finally, can you guys confirm which BIOS type is being used on your 
machines? I’m on the Q35 chipset with UEFI BIOS. I haven’t tested with legacy; 
perhaps I’ll give it a try.

Thanks again.

On 4 Sep 2020, at 14:09, Michael Jones 
<m...@mikejonesey.co.uk> wrote:

Also use multiple t4, also p4, titans, no issues but never used the nvlink

On Fri, 4 Sep 2020, 16:02 Arman Khalatyan, 
<arm2...@gmail.com> wrote:
hi,
with the 2xT4 we haven't seen any trouble. we have no nvlink there.

did u try to disable the nvlink?



Vinícius Ferrão via Users <users@ovirt.org> wrote on 
Fri, 4 Sept 2020, 08:39:
Hello, here we go again.

I’m trying to pass through 4x NVIDIA Tesla V100 GPUs (with NVLink) to a single 
VM, but things aren’t that good. Only one GPU shows up in the VM. lspci is able 
to show the GPUs, but three of them are unusable:

08:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev 
a1)
09:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev 
a1)
0a:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev 
a1)
0b:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev 
a1)

There are some errors on dmesg, regarding a misconfigured BIOS:

[   27.295972] nvidia: loading out-of-tree module taints kernel.
[   27.295980] nvidia: module license 'NVIDIA' taints kernel.
[   27.295981] Disabling lock debugging due to kernel taint
[   27.304180] nvidia: module verification failed: signature and/or required 
key missing - tainting kernel
[   27.364244] nvidia-nvlink: Nvlink Core is being initialized, major device 
number 241
[   27.579261] nvidia :09:00.0: enabling device ( -> 0002)
[   27.579560] NVRM: This PCI I/O region assigned to your NVIDIA device is 
invalid:
   NVRM: BAR1 is 0M @ 0x0 (PCI::09:00.0)
[   27.579560] NVRM: The system BIOS may have misconfigured your GPU.
[   27.579566] nvidia: probe of :09:00.0 failed with error -1
[   27.580727] NVRM: This PCI I/O region assigned to your NVIDIA device is 
invalid:
   NVRM: BAR0 is 0M @ 0x0 (PCI::0a:00.0)
[   27.580729] NVRM: The system BIOS may have misconfigured your GPU.
[   27.580734] nvidia: probe of :0a:00.0 failed with error -1
[   27.581299] NVRM: This PCI I/O region assigned to your NVIDIA device is 
invalid:
   NVRM: BAR0 is 0M @ 0x0 (PCI::0b:00.0)
[   27.581300] NVRM: The system BIOS may have misconfigured your GPU.
[   27.581305] nvidia: probe of :0b:00.0 failed with error -1
[   27.581333] NVRM: The NVIDIA probe routine failed for 3 device(s).
[   27.581334] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  450.51.06  Sun 
Jul 19 20:02:54 UTC 2020
[   27.649128] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for 
UNIX platforms  450.51.06  Sun Jul 19 20:06:42 UTC 2020

The host is Secure Intel Skylake (x86_64). VM is running with Q35 Chipset with 
UEFI (pc-q35-rhel8.2.0)

I’ve tried to change the I/O mapping options on the host, tried with 56TB and 
12TB without success. Same results. Didn’t try with 512GB since the machine 
has 768GB of system RAM.

Tried blacklisting nouveau on the host: nothing.
Installed the NVIDIA drivers on the host: nothing.

On the host I can use the 4x V100, but inside a single VM it’s impossible.

Any suggestions?



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/73CXU27AX6ND6EXUJKBKKRWM6DJH7UL7/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PIO4DIVUU4JWG5FXYW3NQSVXCFZWYV26/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FY5J2VGAZXUOE3K5QJIS3ETXP76M3CHO/


[ovirt-users] Multiple GPU Passthrough with NVLink (Invalid I/O region)

2020-09-04 Thread Vinícius Ferrão via Users
Hello, here we go again.

I’m trying to pass through 4x NVIDIA Tesla V100 GPUs (with NVLink) to a single 
VM, but things aren’t that good. Only one GPU shows up in the VM. lspci is able 
to show the GPUs, but three of them are unusable:

08:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev 
a1)
09:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev 
a1)
0a:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev 
a1)
0b:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev 
a1)

There are some errors on dmesg, regarding a misconfigured BIOS:

[   27.295972] nvidia: loading out-of-tree module taints kernel.
[   27.295980] nvidia: module license 'NVIDIA' taints kernel.
[   27.295981] Disabling lock debugging due to kernel taint
[   27.304180] nvidia: module verification failed: signature and/or required 
key missing - tainting kernel
[   27.364244] nvidia-nvlink: Nvlink Core is being initialized, major device 
number 241
[   27.579261] nvidia :09:00.0: enabling device ( -> 0002)
[   27.579560] NVRM: This PCI I/O region assigned to your NVIDIA device is 
invalid:
   NVRM: BAR1 is 0M @ 0x0 (PCI::09:00.0)
[   27.579560] NVRM: The system BIOS may have misconfigured your GPU.
[   27.579566] nvidia: probe of :09:00.0 failed with error -1
[   27.580727] NVRM: This PCI I/O region assigned to your NVIDIA device is 
invalid:
   NVRM: BAR0 is 0M @ 0x0 (PCI::0a:00.0)
[   27.580729] NVRM: The system BIOS may have misconfigured your GPU.
[   27.580734] nvidia: probe of :0a:00.0 failed with error -1
[   27.581299] NVRM: This PCI I/O region assigned to your NVIDIA device is 
invalid:
   NVRM: BAR0 is 0M @ 0x0 (PCI::0b:00.0)
[   27.581300] NVRM: The system BIOS may have misconfigured your GPU.
[   27.581305] nvidia: probe of :0b:00.0 failed with error -1
[   27.581333] NVRM: The NVIDIA probe routine failed for 3 device(s).
[   27.581334] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  450.51.06  Sun 
Jul 19 20:02:54 UTC 2020
[   27.649128] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for 
UNIX platforms  450.51.06  Sun Jul 19 20:06:42 UTC 2020

The host is Secure Intel Skylake (x86_64). VM is running with Q35 Chipset with 
UEFI (pc-q35-rhel8.2.0)

I’ve tried to change the I/O mapping options on the host, tried with 56TB and 
12TB without success. Same results. Didn’t try with 512GB since the machine 
has 768GB of system RAM.

Tried blacklisting nouveau on the host: nothing.
Installed the NVIDIA drivers on the host: nothing.

On the host I can use the 4x V100, but inside a single VM it’s impossible.
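
Since NVRM reports the BARs as 0M, it may be worth confirming from inside the 
guest whether the regions were assigned at all (a sketch, using one of the bus 
addresses above):

lspci -vs 09:00.0 | grep -i 'memory at'

A healthy assignment shows large prefetchable regions; an unassigned BAR 
typically shows up as <unassigned> or [disabled].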

Any suggestions?



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/73CXU27AX6ND6EXUJKBKKRWM6DJH7UL7/


[ovirt-users] Mellanox OFED with oVirt

2020-09-01 Thread Vinícius Ferrão via Users
Hello,

Has anyone had success using Mellanox OFED with oVirt? I’ve already learned some things:

1. I can’t use oVirt Node.
2. Mellanox OFED cannot be installed with mlnx-ofed-all since it breaks dnf. We 
need to rely on the upstream RDMA implementation.
3. The way to go is running: dnf install mlnx-ofed-dpdk-upstream-libs

But after the installation I ended up with a broken dnf:

[root@c4140 ~]# dnf update
Updating Subscription Management repositories.
Last metadata expiration check: 0:03:54 ago on Tue 01 Sep 2020 11:52:41 PM -03.
Error: 
 Problem: both package mlnx-ofed-all-user-only-5.1-0.6.6.0.rhel8.2.noarch and 
mlnx-ofed-all-5.1-0.6.6.0.rhel8.2.noarch obsolete glusterfs-rdma
  - cannot install the best update candidate for package 
glusterfs-rdma-6.0-37.el8.x86_64
  - package ovirt-host-4.4.1-4.el8ev.x86_64 requires glusterfs-rdma, but none 
of the providers can be installed
  - package mlnx-ofed-all-5.1-0.6.6.0.rhel8.2.noarch obsoletes glusterfs-rdma 
provided by glusterfs-rdma-6.0-37.el8.x86_64
  - package glusterfs-rdma-3.12.2-40.2.el8.x86_64 requires glusterfs(x86-64) = 
3.12.2-40.2.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-15.el8.x86_64 requires glusterfs(x86-64) = 
6.0-15.el8, but none of the providers can be installed
  - package glusterfs-rdma-6.0-20.el8.x86_64 requires glusterfs(x86-64) = 
6.0-20.el8, but none of the providers can be installed
  - cannot install both glusterfs-3.12.2-40.2.el8.x86_64 and 
glusterfs-6.0-37.el8.x86_64
  - cannot install both glusterfs-6.0-15.el8.x86_64 and 
glusterfs-6.0-37.el8.x86_64
  - cannot install both glusterfs-6.0-20.el8.x86_64 and 
glusterfs-6.0-37.el8.x86_64
  - cannot install the best update candidate for package 
ovirt-host-4.4.1-4.el8ev.x86_64
  - cannot install the best update candidate for package 
glusterfs-6.0-37.el8.x86_64
(try to add '--allowerasing' to command line to replace conflicting packages or 
'--skip-broken' to skip uninstallable packages or '--nobest' to use not only 
best candidate packages)

These are the packages installed:

[root@c4140 ~]# rpm -qa *mlnx*
mlnx-dpdk-19.11.0-1.51066.x86_64
mlnx-ofa_kernel-devel-5.1-OFED.5.1.0.6.6.1.rhel8u2.x86_64
mlnx-ethtool-5.4-1.51066.x86_64
mlnx-dpdk-devel-19.11.0-1.51066.x86_64
mlnx-ofa_kernel-5.1-OFED.5.1.0.6.6.1.rhel8u2.x86_64
mlnx-dpdk-doc-19.11.0-1.51066.noarch
mlnx-dpdk-tools-19.11.0-1.51066.x86_64
mlnx-ofed-dpdk-upstream-libs-5.1-0.6.6.0.rhel8.2.noarch
kmod-mlnx-ofa_kernel-5.1-OFED.5.1.0.6.6.1.rhel8u2.x86_64
mlnx-iproute2-5.6.0-1.51066.x86_64

And finally this is the repo that I’m using:
[root@c4140 ~]# cat /etc/yum.repos.d/mellanox_mlnx_ofed.repo 
#
# Mellanox Technologies Ltd. public repository configuration file.
# For more information, refer to http://linux.mellanox.com
#

[mlnx_ofed_latest_base]
name=Mellanox Technologies rhel8.2-$basearch mlnx_ofed latest
baseurl=http://linux.mellanox.com/public/repo/mlnx_ofed/latest/rhel8.2/$basearch
enabled=1
gpgkey=http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
gpgcheck=1
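
One workaround that might be worth trying (an untested sketch, not something 
confirmed on this list): keep the obsoleting meta-packages out of dnf’s view by 
adding an exclude to that repo section:

# append to the [mlnx_ofed_latest_base] section above
excludepkgs=mlnx-ofed-all*,mlnx-ofed-all-user-only*

With those hidden, the glusterfs-rdma obsoletes should no longer enter the 
transaction.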


So, has anyone had success with this?

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Z2QBLGLN5NNUUCGHYM5HL4QDHIPZ6J72/


[ovirt-users] Re: Missing model_FLAGS on specific host

2020-08-28 Thread Vinícius Ferrão via Users
Hi, just a follow-up here.

Something was messed up with the Dell firmware.

I’ve refreshed it and completely erased the NVRAM. Reconfigured the firmware, 
and now the host behaves as expected:

[root@c4140 ~]# !vdsm
vdsm-client Host getCapabilities | egrep "cpuFlags|cpuModel|Skylake"
"cpuFlags": 
"pln,msr,acpi,sse2,smx,rdrand,cqm_occup_llc,xsaveopt,rdseed,rtm,epb,sse,hypervisor,ibrs,cmov,nopl,cpuid_fault,pse,f16c,spec_ctrl,adx,constant_tsc,bts,rdt_a,pae,nx,tsc,x2apic,sep,pat,cqm_mbm_total,pebs,xsave,smep,ds_cpl,fma,ospke,mca,mmx,pge,pku,pcid,aperfmperf,ssse3,flexpriority,cqm,avx512dq,avx512vl,fpu,umip,flush_l1d,ssbd,lm,syscall,movbe,vpid,ht,xsavec,invpcid_single,3dnowprefetch,tsc_deadline_timer,cx8,rep_good,tm2,avx,cx16,rdtscp,ss,popcnt,lahf_lm,stibp,arch_perfmon,smap,clflushopt,invtsc,vmx,dts,xsaves,md_clear,dtherm,avx512f,bmi2,mpx,arch-capabilities,dtes64,avx512cd,mba,avx2,ept,pts,vme,vnmi,fxsr,pschange-mc-no,dca,avx512bw,tsc_adjust,cqm_llc,pclmulqdq,cat_l3,bmi1,monitor,pti,arat,abm,cpuid,clflush,mce,sse4_2,erms,nonstop_tsc,apic,cdp_l3,fsgsbase,sdbg,art,xgetbv1,tpr_shadow,cqm_mbm_local,clwb,pdpe1gb,xtpr,ida,de,pbe,intel_pt,ibpb,est,intel_ppin,tm,pni,aes,amd-ssbd,md-clear,skip-l1dfl-vmentry,hle,pdcm,invpcid,mtrr,pse36,sse4_1,xtopology,model_core2duo,model_pentium2,model_Skylake-Server-IBRS,model_Haswell,model_Skylake-Server,model_IvyBridge-IBRS,model_Penryn,model_Broadwell-noTSX-IBRS,model_qemu64,model_n270,model_kvm32,model_coreduo,model_Broadwell-IBRS,model_Skylake-Client-noTSX-IBRS,model_Haswell-IBRS,model_Broadwell-noTSX,model_Skylake-Client,model_SandyBridge,model_Skylake-Server-noTSX-IBRS,model_SandyBridge-IBRS,model_Broadwell,model_kvm64,model_Nehalem-IBRS,model_IvyBridge,model_pentium,model_Skylake-Client-IBRS,model_Conroe,model_Haswell-noTSX,model_Opteron_G2,model_Westmere,model_qemu32,model_486,model_pentium3,model_Opteron_G1,model_Westmere-IBRS,model_Haswell-noTSX-IBRS,model_Nehalem",
"cpuModel": "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz",

Not sure what really happened, but those actions solved the issue.

On 28 Aug 2020, at 03:29, Vinícius Ferrão via Users 
<users@ovirt.org> wrote:

Hi,

I have a strange issue on one of my hosts: it’s missing a lot of CPU flags that 
oVirt seems to require:

[root@c4140 ~]# vdsm-client Host getCapabilities | egrep "cpuFlags|cpuModel"
"cpuFlags": 
"ssse3,mca,ept,pdpe1gb,vmx,clwb,smep,msr,acpi,pge,sse4_2,nopl,cqm_mbm_total,cx16,avx512vl,aperfmperf,xsaves,3dnowprefetch,nonstop_tsc,cmov,mce,intel_pt,avx512f,fpu,pku,tsc,sdbg,erms,pse36,md_clear,apic,sse,pcid,clflushopt,xtopology,pts,monitor,vpid,cpuid,hle,mba,ss,cqm,avx2,ibpb,xgetbv1,flush_l1d,mmx,epb,pti,fxsr,dca,nx,syscall,stibp,mtrr,cx8,sse2,avx,sep,intel_ppin,lm,tm,bts,adx,bmi1,smx,popcnt,pclmulqdq,lahf_lm,mpx,rdseed,cqm_llc,avx512cd,cdp_l3,f16c,invpcid,fsgsbase,cpuid_fault,tm2,smap,dts,pse,xsave,sse4_1,constant_tsc,pat,tsc_deadline_timer,vnmi,avx512dq,dtes64,xsaveopt,ida,pdcm,tpr_shadow,pln,de,x2apic,avx512bw,pae,rdrand,clflush,rdtscp,art,cqm_mbm_local,pebs,ssbd,movbe,pbe,tsc_adjust,vme,ht,est,bmi2,cat_l3,dtherm,ospke,rdt_a,aes,ibrs,rep_good,fma,xtpr,ds_cpl,abm,xsavec,invpcid_single,flexpriority,cqm_occup_llc,pni,rtm,arat,arch_perfmon",
"cpuModel": "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz",

A properly working host has flags like these:

model_Westmere-IBRS,model_kvm32,model_core2duo,model_Opteron_G1,model_Broadwell,model_qemu64,model_Broadwell-noTSX,model_Nehalem-IBRS,model_Haswell-IBRS,model_pentium2,model_Broadwell-IBRS,model_Haswell-noTSX,model_Haswell,model_Haswell-noTSX-IBRS,model_Conroe,model_pentium,model_n270,model_Nehalem,model_IvyBridge-IBRS,model_kvm64,model_SandyBridge,model_pentium3,model_Broadwell-noTSX-IBRS,model_qemu32,model_486,model_IvyBridge,model_SandyBridge-IBRS,model_Westmere,model_Penryn,model_Opteron_G2,model_coreduo",

But on this machine they’re totally missing. I know these model_ flags are an oVirt 
thing, since they aren’t native CPU flags.

The host machine is a Dell C4140 compute node and the firmware is fully updated, 
so I’ve done the basics to figure out what’s happening.

Thanks,

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XKXBZLYRMMJGBFATFUOWLN2CBS6T75DM/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Z5DOWN4F4TYI6FMCOVYDEGT4GBZ27SSF/


[ovirt-users] Missing model_FLAGS on specific host

2020-08-28 Thread Vinícius Ferrão via Users
Hi,

I have a strange issue on one of my hosts: it’s missing a lot of CPU flags that 
oVirt seems to require:

[root@c4140 ~]# vdsm-client Host getCapabilities | egrep "cpuFlags|cpuModel"
"cpuFlags": 
"ssse3,mca,ept,pdpe1gb,vmx,clwb,smep,msr,acpi,pge,sse4_2,nopl,cqm_mbm_total,cx16,avx512vl,aperfmperf,xsaves,3dnowprefetch,nonstop_tsc,cmov,mce,intel_pt,avx512f,fpu,pku,tsc,sdbg,erms,pse36,md_clear,apic,sse,pcid,clflushopt,xtopology,pts,monitor,vpid,cpuid,hle,mba,ss,cqm,avx2,ibpb,xgetbv1,flush_l1d,mmx,epb,pti,fxsr,dca,nx,syscall,stibp,mtrr,cx8,sse2,avx,sep,intel_ppin,lm,tm,bts,adx,bmi1,smx,popcnt,pclmulqdq,lahf_lm,mpx,rdseed,cqm_llc,avx512cd,cdp_l3,f16c,invpcid,fsgsbase,cpuid_fault,tm2,smap,dts,pse,xsave,sse4_1,constant_tsc,pat,tsc_deadline_timer,vnmi,avx512dq,dtes64,xsaveopt,ida,pdcm,tpr_shadow,pln,de,x2apic,avx512bw,pae,rdrand,clflush,rdtscp,art,cqm_mbm_local,pebs,ssbd,movbe,pbe,tsc_adjust,vme,ht,est,bmi2,cat_l3,dtherm,ospke,rdt_a,aes,ibrs,rep_good,fma,xtpr,ds_cpl,abm,xsavec,invpcid_single,flexpriority,cqm_occup_llc,pni,rtm,arat,arch_perfmon",
"cpuModel": "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz",

A properly working host has flags like these:

model_Westmere-IBRS,model_kvm32,model_core2duo,model_Opteron_G1,model_Broadwell,model_qemu64,model_Broadwell-noTSX,model_Nehalem-IBRS,model_Haswell-IBRS,model_pentium2,model_Broadwell-IBRS,model_Haswell-noTSX,model_Haswell,model_Haswell-noTSX-IBRS,model_Conroe,model_pentium,model_n270,model_Nehalem,model_IvyBridge-IBRS,model_kvm64,model_SandyBridge,model_pentium3,model_Broadwell-noTSX-IBRS,model_qemu32,model_486,model_IvyBridge,model_SandyBridge-IBRS,model_Westmere,model_Penryn,model_Opteron_G2,model_coreduo",

But on this machine they’re totally missing. I know these model_ flags are an oVirt 
thing, since they aren’t native CPU flags.

The host machine is a Dell C4140 compute node and the firmware is fully updated, 
so I’ve done the basics to figure out what’s happening.
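
To compare precisely against a healthy host, the model_ flag lists can be diffed 
(a sketch; run the first command on each host, copy one file over, then diff — 
the /tmp filenames are arbitrary):

vdsm-client Host getCapabilities | grep -o 'model_[A-Za-z0-9_.-]*' | sort > /tmp/model-flags-$(hostname -s).txt
diff /tmp/model-flags-c4140.txt /tmp/model-flags-goodhost.txt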

Thanks,

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XKXBZLYRMMJGBFATFUOWLN2CBS6T75DM/


[ovirt-users] Re: POWER9 (ppc64le) Support on oVirt 4.4.1

2020-08-27 Thread Vinícius Ferrão via Users


On 27 Aug 2020, at 16:03, Arik Hadas 
<aha...@redhat.com> wrote:



On Thu, Aug 27, 2020 at 8:40 PM Vinícius Ferrão via Users 
<users@ovirt.org> wrote:
Hi Michal,

On 27 Aug 2020, at 05:08, Michal Skrivanek 
<michal.skriva...@redhat.com> wrote:



On 26 Aug 2020, at 20:50, Vinícius Ferrão via Users 
<users@ovirt.org> wrote:

Okay here we go Arik.

With your insight I’ve done the following:

# rpm -Va

This showed what’s zeroed on the machine; since it was a lot of things, I 
just went crazy and did:

you should still have host deploy logs on the engine machine. it’s weird it 
succeeded, unless it somehow happened afterwards?

It only succeeded after my yum reinstall rampage.

yum list installed | cut -f 1 -d " " > file
yum -y reinstall `cat file | xargs`

Reinstalled everything.

Everything worked as expected and I finally added the machine back to the 
cluster. It’s operational.

eh, I wouldn’t trust it much. did you run redeploy at least?

I’ve done a reinstall from the web interface of the engine. I can reinstall the 
host; there’s nothing running on it… gonna try a third format.



Now I have another issue: I have 3 VMs that are ppc64le, and when trying to import 
them, the Hosted Engine identifies them as x86_64:



So…

This appears to be a bug. Any idea how to force it back to ppc64? I can’t 
manually force the import on the Hosted Engine since there are no buttons to do 
this…

how exactly did you import them? could be a bug indeed.
we don’t support changing it as it doesn’t make sense, the guest can’t be 
converted

Yeah. I did the normal procedure: added the storage domain to the engine and 
clicked on “Import VM”. Immediately it was detected as x86_64.

Since I wasn’t able to upgrade my environment from 4.3.10 to 4.4.1, due to 
random errors when redeploying the engine with the backup from 4.3.10, I just 
reinstalled it, reconfigured everything and then imported the storage domains.

I don’t know where the information about architecture is stored in the storage 
domain. I tried to search for some metadata files inside the domain but nothing 
came up. Is there a way to force this change? There must be a way.

I even tried to import the machine as x86_64, so I could delete the VM and just 
reattach the disks to a new one, effectively not losing the data, but…



Yeah, so something is broken. The check during the import appears to be OK, but 
the interface does not allow me to import it to the ppc64le machine, since it’s 
read as x86_64.

Could you please provide the output of the following query from the database:
select * from unregistered_ovf_of_entities where 
entity_name='energy.versatushpc.com.br';

Sure, there you go:

[The raw query output was mangled by the archive (XML tags and long fields 
stripped). What remains recoverable: a single row for entity_name 
'energy.versatushpc.com.br' (id 46ad1d80-2649-48f5-92e6-e5489d11d30c, entity 
type VM, a column value of 1, and storage-domain id 
d19456e4-0051-456e-b33c-57348a78c2e0) whose stored OVF (ovf:version="4.1.0.0", 
generated by ENGINE 4.1.0.0) describes a VM with 2 CPUs and 4096 MB of memory, 
OS type "other_linux_ppc64", one bootable VirtIO-SCSI disk 
"energy.versatushpc.com.br_Disk1" (RAW, Sparse, image id 
b1d9832e-076f-48f3-a300-0b5cdf0949af), description "Holds Kosen backend and 
frontend prod services (nginx + docker)", created 2020/08/19 20:11:33.]

[ovirt-users] Re: POWER9 (ppc64le) Support on oVirt 4.4.1

2020-08-27 Thread Vinícius Ferrão via Users
Hi Michal,

On 27 Aug 2020, at 05:08, Michal Skrivanek 
<michal.skriva...@redhat.com> wrote:



On 26 Aug 2020, at 20:50, Vinícius Ferrão via Users 
<users@ovirt.org> wrote:

Okay here we go Arik.

With your insight I’ve done the following:

# rpm -Va

This showed what’s zeroed on the machine; since it was a lot of things, I 
just went crazy and did:

you should still have host deploy logs on the engine machine. it’s weird it 
succeeded, unless it somehow happened afterwards?

It only succeeded after my yum reinstall rampage.

yum list installed | cut -f 1 -d " " > file
yum -y reinstall `cat file | xargs`

Reinstalled everything.

Everything worked as expected and I finally added the machine back to the 
cluster. It’s operational.

eh, I wouldn’t trust it much. did you run redeploy at least?

I’ve done a reinstall from the web interface of the engine. I can reinstall the 
host; there’s nothing running on it… gonna try a third format.



Now I have another issue: I have 3 VMs that are ppc64le, and when trying to import 
them, the Hosted Engine identifies them as x86_64:



So…

This appears to be a bug. Any idea how to force it back to ppc64? I can’t 
manually force the import on the Hosted Engine since there are no buttons to do 
this…

how exactly did you import them? could be a bug indeed.
we don’t support changing it as it doesn’t make sense, the guest can’t be 
converted

Yeah. I did the normal procedure: added the storage domain to the engine and 
clicked on “Import VM”. Immediately it was detected as x86_64.

Since I wasn’t able to upgrade my environment from 4.3.10 to 4.4.1, due to 
random errors when redeploying the engine with the backup from 4.3.10, I just 
reinstalled it, reconfigured everything and then imported the storage domains.

I don’t know where the information about architecture is stored in the storage 
domain. I tried to search for some metadata files inside the domain but nothing 
came up. Is there a way to force this change? There must be a way.

I even tried to import the machine as x86_64, so I could delete the VM and just 
reattach the disks to a new one, effectively not losing the data, but…

[screenshot omitted]

Yeah, so something is broken. The check during the import appears to be OK, but 
the interface does not allow me to import it to the ppc64le machine, since it’s 
read as x86_64.


Thanks,
michal


Ideas?

On 26 Aug 2020, at 15:04, Vinícius Ferrão 
<fer...@versatushpc.com.br> wrote:

What a strange thing is happening here:

[root@power ~]# file /usr/bin/vdsm-client
/usr/bin/vdsm-client: empty
[root@power ~]# ls -l /usr/bin/vdsm-client
-rwxr-xr-x. 1 root root 0 Jul  3 06:23 /usr/bin/vdsm-client

A lot of files are just empty. I’ve tried reinstalling vdsm-client, and it worked, 
but there are other zeroed files:

Transaction test succeeded.
Running transaction
  Preparing:
 1/1
  Reinstalling : vdsm-client-4.40.22-1.el8ev.noarch 
 1/2
  Cleanup  : vdsm-client-4.40.22-1.el8ev.noarch 
 2/2
  Running scriptlet: vdsm-client-4.40.22-1.el8ev.noarch 
 2/2
/sbin/ldconfig: File /lib64/libkadm5clnt.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4.4.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libisns.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libiscsi.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0 is emp

[ovirt-users] Re: POWER9 (ppc64le) Support on oVirt 4.4.1

2020-08-26 Thread Vinícius Ferrão via Users
Okay here we go Arik.

With your insight I’ve done the following:

# rpm -Va

This showed what’s zeroed on the machine; since it was a lot of things, I 
just went crazy and did:
yum list installed | cut -f 1 -d " " > file
yum -y reinstall `cat file | xargs`

Reinstalled everything.
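
For the record, a more surgical variant of that rampage (an untested sketch): 
reinstall only the packages whose files fail rpm verification:

rpm -Va | awk '{print $NF}' | xargs -r rpm -qf 2>/dev/null | sort -u | xargs -r yum -y reinstall

rpm -Va prints the offending path in its last field; rpm -qf maps each path back 
to its owning package.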

Everything worked as expected and I finally added the machine back to the 
cluster. It’s operational.

Now I have another issue: I have 3 VMs that are ppc64le, and when trying to import 
them, the Hosted Engine identifies them as x86_64:

[screenshot omitted]

So…

This appears to be a bug. Any idea how to force it back to ppc64? I can’t 
manually force the import on the Hosted Engine since there are no buttons to do 
this…

Ideas?

On 26 Aug 2020, at 15:04, Vinícius Ferrão 
<fer...@versatushpc.com.br> wrote:

What a strange thing is happening here:

[root@power ~]# file /usr/bin/vdsm-client
/usr/bin/vdsm-client: empty
[root@power ~]# ls -l /usr/bin/vdsm-client
-rwxr-xr-x. 1 root root 0 Jul  3 06:23 /usr/bin/vdsm-client

A lot of files are just empty. I’ve tried reinstalling vdsm-client, and it worked, 
but there are other zeroed files:

Transaction test succeeded.
Running transaction
  Preparing:
 1/1
  Reinstalling : vdsm-client-4.40.22-1.el8ev.noarch 
 1/2
  Cleanup  : vdsm-client-4.40.22-1.el8ev.noarch 
 2/2
  Running scriptlet: vdsm-client-4.40.22-1.el8ev.noarch 
 2/2
/sbin/ldconfig: File /lib64/libkadm5clnt.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4.4.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libisns.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libiscsi.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0.2.0 is empty, not checked.

/sbin/ldconfig: File /lib64/libkadm5clnt.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4.4.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libisns.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libiscsi.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0.2.0 is empty, not checked.

  Verifying: vd

[ovirt-users] Re: POWER9 (ppc64le) Support on oVirt 4.4.1

2020-08-26 Thread Vinícius Ferrão via Users
What a strange thing is happening here:

[root@power ~]# file /usr/bin/vdsm-client
/usr/bin/vdsm-client: empty
[root@power ~]# ls -l /usr/bin/vdsm-client
-rwxr-xr-x. 1 root root 0 Jul  3 06:23 /usr/bin/vdsm-client

A lot of files are just empty. I’ve tried reinstalling vdsm-client, which worked,
but there are other zeroed files:

Transaction test succeeded.
Running transaction
  Preparing        :                                                  1/1
  Reinstalling     : vdsm-client-4.40.22-1.el8ev.noarch               1/2
  Cleanup          : vdsm-client-4.40.22-1.el8ev.noarch               2/2
  Running scriptlet: vdsm-client-4.40.22-1.el8ev.noarch               2/2
/sbin/ldconfig: File /lib64/libkadm5clnt.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4.4.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libisns.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libiscsi.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0.2.0 is empty, not checked.

/sbin/ldconfig: File /lib64/libkadm5clnt.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5clnt_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11 is empty, not checked.
/sbin/ldconfig: File /lib64/libkadm5srv_mit.so.11.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4 is empty, not checked.
/sbin/ldconfig: File /lib64/libsensors.so.4.4.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-admin.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-lxc.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt-qemu.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libvirt.so.0.6000.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libisns.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libiscsi.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0 is empty, not checked.
/sbin/ldconfig: File /lib64/libopeniscsiusr.so.0.2.0 is empty, not checked.

  Verifying        : vdsm-client-4.40.22-1.el8ev.noarch               1/2
  Verifying        : vdsm-client-4.40.22-1.el8ev.noarch               2/2
Installed products updated.

Reinstalled:
  vdsm-client-4.40.22-1.el8ev.noarch



I’ve never seen anything like this.

I’ve already reinstalled the host from scratch and the same thing happens.
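If it happens again, here is how I would map the damage before reinstalling.
A minimal sketch, assuming rpm and dnf themselves are intact, limited to the
trees that showed empty files above:

# Hedged sketch: find zero-length files, map them to their owning
# packages, and reinstall those packages in one pass.
find /usr/bin /lib64 -type f -empty -print0 \
  | xargs -0 -r rpm -qf --queryformat '%{NAME}\n' 2>/dev/null \
  | sort -u \
  | xargs -r dnf -y reinstall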



[ovirt-users] Re: POWER9 (ppc64le) Support on oVirt 4.4.1

2020-08-26 Thread Vinícius Ferrão via Users
Hello Arik,
This is probably the issue. Output totally empty:

[root@power ~]# vdsm-client Host getCapabilities
[root@power ~]#
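For comparison, on a healthy host that call prints a large JSON blob; the
fields the engine matches against the cluster CPU type can be pulled out like
this (a sketch; the key names follow the VDSM capabilities schema as far as I
know it):

# Hedged sketch: on a working POWER9 host these keys should include the
# model_POWER9 / powernv flags the engine is complaining about.
vdsm-client Host getCapabilities | grep -E '"cpuModel"|"cpuFlags"'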

Here are the packages installed on the machine (rpm -qa, grepped for ovirt and
vdsm):
ovirt-imageio-daemon-2.0.8-1.el8ev.ppc64le
ovirt-imageio-client-2.0.8-1.el8ev.ppc64le
ovirt-host-4.4.1-4.el8ev.ppc64le
ovirt-vmconsole-host-1.0.8-1.el8ev.noarch
ovirt-host-dependencies-4.4.1-4.el8ev.ppc64le
ovirt-imageio-common-2.0.8-1.el8ev.ppc64le
ovirt-vmconsole-1.0.8-1.el8ev.noarch
vdsm-hook-vmfex-dev-4.40.22-1.el8ev.noarch
vdsm-hook-fcoe-4.40.22-1.el8ev.noarch
vdsm-hook-ethtool-options-4.40.22-1.el8ev.noarch
vdsm-hook-openstacknet-4.40.22-1.el8ev.noarch
vdsm-common-4.40.22-1.el8ev.noarch
vdsm-python-4.40.22-1.el8ev.noarch
vdsm-jsonrpc-4.40.22-1.el8ev.noarch
vdsm-api-4.40.22-1.el8ev.noarch
vdsm-yajsonrpc-4.40.22-1.el8ev.noarch
vdsm-4.40.22-1.el8ev.ppc64le
vdsm-network-4.40.22-1.el8ev.ppc64le
vdsm-http-4.40.22-1.el8ev.noarch
vdsm-client-4.40.22-1.el8ev.noarch
vdsm-hook-vhostmd-4.40.22-1.el8ev.noarch

Any ideas to try?

Thanks.

On 26 Aug 2020, at 05:09, Arik Hadas <aha...@redhat.com> wrote:



On Mon, Aug 24, 2020 at 1:30 AM Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello, I was using oVirt 4.3.10 with IBM AC922 (POWER9 / ppc64le) without any 
issues.

Since I moved to 4.4.1 I can’t add the AC922 machine to the engine anymore;
it complains with the following error:
The host CPU does not match the Cluster CPU type and is running in degraded 
mode. It is missing the following CPU flags: model_POWER9, powernv.

Any idea what may be happening? The engine runs on x86_64, and I was using it
this way on 4.3.10.

Machine info:
timebase: 51200
platform: PowerNV
model   : 8335-GTH
machine : PowerNV 8335-GTH
firmware: OPAL
MMU : Radix

Can you please provide the output of 'vdsm-client Host getCapabilities' on that 
host?


Thanks,




[ovirt-users] POWER9 (ppc64le) Support on oVirt 4.4.1

2020-08-23 Thread Vinícius Ferrão via Users
Hello, I was using oVirt 4.3.10 with IBM AC922 (POWER9 / ppc64le) without any 
issues.

Since I moved to 4.4.1 I can’t add the AC922 machine to the engine anymore;
it complains with the following error:
The host CPU does not match the Cluster CPU type and is running in degraded 
mode. It is missing the following CPU flags: model_POWER9, powernv.

Any idea what may be happening? The engine runs on x86_64, and I was using it
this way on 4.3.10.

Machine info:
timebase: 51200
platform: PowerNV
model   : 8335-GTH
machine : PowerNV 8335-GTH
firmware: OPAL
MMU : Radix

Thanks,




[ovirt-users] Hosted Engine stuck in Firmware

2020-08-22 Thread Vinícius Ferrão via Users
Hello, I have a strange issue with oVirt 4.4.1.

The hosted engine VM is stuck in the UEFI firmware and never boots.

I think this happened when I changed the default VM mode for the cluster inside 
the datacenter.

Is there a way to fix this without redeploying the engine?
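A few hedged places to look before redeploying; all three commands are
standard hosted-engine/libvirt tooling, but treat this as a sketch:

# Hedged sketch: check the HE VM state, grab a console to see where the
# firmware stalls, and confirm which machine type/firmware it booted with.
hosted-engine --vm-status
hosted-engine --console
virsh -r dumpxml HostedEngine | grep -A3 '<os'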



[ovirt-users] Re: Support for Shared SAS storage

2020-08-07 Thread Vinícius Ferrão via Users
Really??

Treat it as FC??

That’s new to me.


On 8 Aug 2020, at 00:35, Jeff Bailey <bai...@cs.kent.edu> wrote:


I haven't tried with 4.4 but shared SAS works just fine with 4.3 (and has for 
many, many years).  You simply treat it as Fibre Channel.  If your LUNs aren't 
showing up I'd make sure they're being claimed as multipath devices.  You want 
them to be.  After that, just make sure they're sufficiently wiped so they 
don't look like they're in use.
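Following Jeff's advice, the checks would look roughly like this (a sketch;
the WWID below is a placeholder and the wipefs step is destructive):

# Hedged sketch: confirm the shared-SAS LUNs are claimed by multipathd
# and carry no stale signatures that make them look in use.
multipath -ll
lsblk -o NAME,SIZE,TYPE,WWN
# Destructive: wipe old signatures so oVirt will offer the LUN.
wipefs -a /dev/mapper/36001405abcdef000000000000000000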


On 8/7/2020 10:49 PM, Lao Dh via Users wrote:
Wow. That sounds bad. Then which storage type did you choose in the end (with
your SAS-connected storage)? VMware vSphere supports DAS. Red Hat should do
something.

On Saturday, 8 Aug 2020 at 4:06:34 GMT+8, Vinícius Ferrão via Users
<users@ovirt.org> wrote:


No, there’s no support for direct attached shared SAS storage on oVirt/RHV.

Fibre Channel is a different thing that oVirt/RHV supports.

> On 7 Aug 2020, at 08:52, hkexdong--- via Users <users@ovirt.org> wrote:
>
> Hello Vinícius,
> Were you able to connect the SAS external storage?
> Now I have a problem during hosted engine setup: selecting Fibre Channel ends
> up showing "No LUNs found".






[ovirt-users] Re: Support for Shared SAS storage

2020-08-07 Thread Vinícius Ferrão via Users
No, there’s no support for direct attached shared SAS storage on oVirt/RHV.

Fibre Channel is a different thing that oVirt/RHV supports.

> On 7 Aug 2020, at 08:52, hkexdong--- via Users  wrote:
> 
> Hello Vinícius,
> Were you able to connect the SAS external storage?
> Now I have a problem during hosted engine setup: selecting Fibre Channel ends
> up showing "No LUNs found".


[ovirt-users] Re: iSCSI multipath with separate subnets... still not possible in 4.4.x?

2020-07-18 Thread Vinícius Ferrão via Users
I second that. I’ve talked about this tirelessly and just given up; it’s a
basic feature whose absence keeps oVirt lagging behind.

> On 18 Jul 2020, at 04:47, Uwe Laverenz  wrote:
> 
> Hi Mark,
> 
> Am 14.07.20 um 02:14 schrieb Mark R:
> 
>> I'm looking through quite a few bug reports and mailing list threads,
>> but want to make sure I'm not missing some recent development.  It
>> appears that doing iSCSI with two separate, non-routed subnets is
>> still not possible with 4.4.x. I have the dead-standard iSCSI setup
>> with two separate switches, separate interfaces on hosts and storage,
>> and separate subnets that have no gateway and are completely
>> unreachable except from directly attached interfaces.
> 
> I haven't tested 4.4 yet but AFAIK nothing has changed, OVirt iSCSI bonds 
> don't work with separated, isolated subnets:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1474904
> 
> I don't use them, as multipathing generally works without oVirt bonds in my
> setup; I configured multipathd directly to use round-robin, for example.
> 
> cu,
> Uwe
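For reference, Uwe's workaround of configuring multipathd directly, instead of
relying on oVirt's iSCSI bonds, amounts to something like this sketch; the
drop-in directory and the keywords are assumptions to verify against your
multipath version:

# Hedged sketch: set round-robin across both subnets' paths via a
# multipathd drop-in, leaving VDSM's /etc/multipath.conf untouched.
mkdir -p /etc/multipath/conf.d
cat > /etc/multipath/conf.d/local.conf <<'EOF'
defaults {
    path_selector        "round-robin 0"
    path_grouping_policy multibus
}
EOF
systemctl reload multipathd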


[ovirt-users] Re: New fenceType in oVirt code for IBM OpenBMC

2020-07-07 Thread Vinícius Ferrão via Users
@Martin, if needed I can raise an RFE for this. Just point me to where, and I
will do it.

Thank you.



[ovirt-users] Re: New fenceType in oVirt code for IBM OpenBMC

2020-07-01 Thread Vinícius Ferrão via Users
Hi Martin,

On 1 Jul 2020, at 03:26, Martin Perina <mper...@redhat.com> wrote:



On Wed, Jul 1, 2020 at 1:57 AM Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello,

After some days scratching my head I found that oVirt is probably missing 
fenceTypes for IBM’s implementation of OpenBMC in the Power Management section. 
The host machine is an OpenPOWER AC922 (ppc64le).

The BMC basically is an “ipmilan” device but the ciphers must be defined as 3 
or 17 by default:

[root@h01 ~]# ipmitool -I lanplus -H 10.20.10.2 root -P 0penBmc -L operator -C 
3 channel getciphers ipmi
ID   IANAAuth AlgIntegrity Alg   Confidentiality Alg
3N/A hmac_sha1   hmac_sha1_96aes_cbc_128
17   N/A hmac_sha256 sha256_128  aes_cbc_128

The default ipmilan connector forces the option cipher=1 which breaks the 
communication.

Hi,

have you tried to overwrite the default by adding cipher=3 into Options field 
when adding/updating fence agent configuration for specific host?

Eli, looking at 
https://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/ipmi-second-gen-interface-spec-v2-rev1-1.pdf
 I'm not sure our defaults make sense, because by default we enable IPMIv2 
(lanplus=1), but we set IPMIv1 cipher support (cipher=1). Or am I missing 
something?

Yes, I’m running this way right now: ipmilan with cipher=17 in the options.

But it took me almost a month to figure out. Really. I sent a message to the
list on 5 June: Power Management on IBM AC922 Power9 (ppc64le); and I had been
trying to solve it since then.

This was mainly due to poor documentation. I only figured it out after doing a
lot of searches on GitHub to read the oVirt code. That is how the cipher=1
thing showed up, and I guessed that it might be it. And it was…

I know that no one cares about ppc64le, haha. But I think a change to the list
of supported fenceTypes would save some people the time I have lost on this.
Something like an "openbmc" type would be great.

Or at least a better explanation in the Power Management configuration box. The
options are not explained properly either; even guessing lanplus=1 was hard. I
tried a lot of combinations like:
I=lanplus
-I lanplus
-I=lanplus

Thanks,


Regards,
Martin

So I was reading the code and found this “fenceType” class, but I wasn't able
to find where those classes are defined, so that I could create another one,
called something like openbmc, that sets cipher=17 by default.

Another issue is how unhelpful the output is: it only returns a generic
JSON-RPC error. But I don’t know how to suggest a fix for this.

Thanks,



--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.



[ovirt-users] New fenceType in oVirt code for IBM OpenBMC

2020-06-30 Thread Vinícius Ferrão via Users
Hello,

After some days scratching my head I found that oVirt is probably missing 
fenceTypes for IBM’s implementation of OpenBMC in the Power Management section. 
The host machine is an OpenPOWER AC922 (ppc64le).

The BMC basically is an “ipmilan” device but the ciphers must be defined as 3 
or 17 by default:

[root@h01 ~]# ipmitool -I lanplus -H 10.20.10.2 root -P 0penBmc -L operator -C 
3 channel getciphers ipmi
ID   IANAAuth AlgIntegrity Alg   Confidentiality Alg
3N/A hmac_sha1   hmac_sha1_96aes_cbc_128
17   N/A hmac_sha256 sha256_128  aes_cbc_128 

The default ipmilan connector forces the option cipher=1 which breaks the 
communication.
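For anyone testing this outside the engine, the same probe straight from the
fence agent would look roughly like this (a sketch; option names per
fence-agents' fence_ipmilan):

# Hedged sketch: talk to the OpenBMC with the agent the engine wraps,
# forcing lanplus and a cipher the BMC actually offers (3 or 17).
fence_ipmilan --ip=10.20.10.2 --username=root --password=0penBmc \
    --lanplus --cipher=17 --action=status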

So I was reading the code and found this “fenceType” class, but I wasn't able
to find where those classes are defined, so that I could create another one,
called something like openbmc, that sets cipher=17 by default.

Another issue is how unhelpful the output is: it only returns a generic
JSON-RPC error. But I don’t know how to suggest a fix for this.

Thanks,



[ovirt-users] Re: Clean old mount points in hosts VDSM

2020-06-25 Thread Vinícius Ferrão via Users
Strahil, thank you.

Reinstalling the host solved the issue.
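For anyone hitting the same thing and wanting to look before reinstalling: the
connection the host keeps retrying should also be visible on the engine side.
A hedged, read-only sketch; the table name is an assumption from the engine
schema:

# Hedged sketch: list the storage connections the engine knows about and
# compare them with the stale path from vdsm.log.
sudo -u postgres psql engine -c \
  "select id, connection, storage_type from storage_server_connections;"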

> On 25 Jun 2020, at 15:48, Vinícius Ferrão via Users  wrote:
> 
> I think yes. But I’m not sure. 
> 
> I can do it again, there’s an update so I’ll do both and report back.
> 
> Thank you Strahil.
> 
>> On 25 Jun 2020, at 00:37, Strahil Nikolov  wrote:
>> 
>> Did you reinstall the node via the WEB UI ?
>> 
>> Best Regards,
>> Strahil  Nikolov
>> 

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OICOEQ47EFZH2EPOLRXH7CKAT3XL4XJE/


[ovirt-users] Re: Clean old mount points in hosts VDSM

2020-06-25 Thread Vinícius Ferrão via Users
I think so. But I’m not sure.

I can do it again; there’s an update, so I’ll do both and report back.

Thank you Strahil.

> On 25 Jun 2020, at 00:37, Strahil Nikolov  wrote:
> 
> Did you reinstall the node via the WEB UI ?
> 
> Best Regards,
> Strahil  Nikolov
> 


[ovirt-users] Clean old mount points in hosts VDSM

2020-06-24 Thread Vinícius Ferrão via Users
Hello,

For reasons unknown, one of my hosts is trying to mount an old storage mount
point that was removed some time ago.

/var/log/vdsm/vdsm.log:2020-06-24 19:57:35,958-0300 INFO  (tmap-65016/0) 
[IOProcessClient] (/192.168.10.6:_mnt_pool0_ovirt_he) Starting client 
(__init__:308)
/var/log/vdsm/vdsm.log:2020-06-24 19:57:35,968-0300 INFO  (ioprocess/12115) 
[IOProcess] (/192.168.10.6:_mnt_pool0_ovirt_he) Starting ioprocess 
(__init__:434)
/var/log/vdsm/vdsm.log:2020-06-24 19:57:42,167-0300 INFO  (jsonrpc/6) 
[vdsm.api] START connectStorageServer(domType=1, 
spUUID=u'----', conList=[{u'protocol_version': 
u'auto', u'connection': u'192.168.10.6:/mnt/pool0/ovirt/he', u'user': u'kvm', 
u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}], options=None) from=::1,59090, 
task_id=5ea81925-ec92-4031-aa36-bb6f436321d5 (api:48)
/var/log/vdsm/vdsm.log:2020-06-24 19:57:42,169-0300 INFO  (jsonrpc/6) 
[storage.StorageServer.MountConnection] Creating directory 
u'/rhev/data-center/mnt/192.168.10.6:_mnt_pool0_ovirt_he' (storageServer:168)
/var/log/vdsm/vdsm.log:2020-06-24 19:57:42,169-0300 INFO  (jsonrpc/6) 
[storage.fileUtils] Creating directory: 
/rhev/data-center/mnt/192.168.10.6:_mnt_pool0_ovirt_he mode: None (fileUtils:199)
/var/log/vdsm/vdsm.log:2020-06-24 19:57:42,169-0300 INFO  (jsonrpc/6) 
[storage.Mount] mounting 192.168.10.6:/mnt/pool0/ovirt/he at 
/rhev/data-center/mnt/192.168.10.6:_mnt_pool0_ovirt_he (mount:204)
/var/log/vdsm/vdsm.log:MountError: (32, ';mount.nfs: mounting 
192.168.10.6:/mnt/pool0/ovirt/he failed, reason given by server: No such file 
or directory\n')
/var/log/vdsm/vdsm.log:2020-06-24 19:57:43,683-0300 INFO  (jsonrpc/5) 
[vdsm.api] START connectStorageServer(domType=1, 
spUUID=u'----', conList=[{u'protocol_version': 
u'auto', u'connection': u'192.168.10.6:/mnt/pool0/ovirt/he', u'user': u'kvm', 
u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}], options=None) from=::1,59094, 
task_id=9ce61858-dea1-4059-b942-a52c8c82afdc (api:48)
/var/log/vdsm/vdsm.log:2020-06-24 19:57:43,685-0300 INFO  (jsonrpc/5) 
[storage.StorageServer.MountConnection] Creating directory 
u'/rhev/data-center/mnt/192.168.10.6:_mnt_pool0_ovirt_he' (storageServer:168)
/var/log/vdsm/vdsm.log:2020-06-24 19:57:43,685-0300 INFO  (jsonrpc/5) 
[storage.fileUtils] Creating directory: 
/rhev/data-center/mnt/192.168.10.6:_mnt_pool0_ovirt_he mode: None (fileUtils:199)
/var/log/vdsm/vdsm.log:2020-06-24 19:57:43,685-0300 INFO  (jsonrpc/5) 
[storage.Mount] mounting 192.168.10.6:/mnt/pool0/ovirt/he at 
/rhev/data-center/mnt/192.168.10.6:_mnt_pool0_ovirt_he (mount:204)
/var/log/vdsm/vdsm.log:MountError: (32, ';mount.nfs: mounting 
192.168.10.6:/mnt/pool0/ovirt/he failed, reason given by server: No such file 
or directory\n’)

This only happens on one host, and it’s spamming /var/log/vdsm/vdsm.log.

Any idea how to debug this and remove the entry?

Thanks,



[ovirt-users] Re: teaming vs bonding

2020-06-10 Thread Vinícius Ferrão via Users
Only bonding; teaming is not supported by the hypervisor.

This was valid up to 4.3; I’m not sure whether anything changed in 4.4, since I
haven’t checked it.


> On 10 Jun 2020, at 15:30, Diggy Mc  wrote:
> 
> Does 4.4.x support adapter teaming?  If yes, which is preferred, teaming or 
> bonding?


[ovirt-users] Re: What happens when shared storage is down?

2020-06-09 Thread Vinícius Ferrão via Users


> On 7 Jun 2020, at 08:34, Strahil Nikolov  wrote:
> 
> 
> 
> On 7 June 2020 at 1:58:27 GMT+03:00, "Vinícius Ferrão via Users"
> <users@ovirt.org> wrote:
>> Hello,
>> 
>> This is a pretty vague and difficult question to answer. But what
>> happens if the shared storage holding the VMs is down or unavailable
>> for a period of time?
> Once a pending I/O is blocked, libvirt will pause the VM.
> 
>> I’m aware that a longer timeout may put the VMs on pause state, but how
>> this is handled? Is it a time limit? Requests limit? Who manages this?
> You’ve got sanlock.service, which notifies the engine when a storage domain is
> inaccessible for more than 60s.
> 
> Libvirt will also pause a VM when a pending I/O cannot be completed.
> 
>> In an event of self recovery of the storage backend what happens next?
> Usually the engine should resume the VM, and from the application’s
> perspective nothing has happened.

Hmm, thanks Strahil. I was thinking of upgrading the storage backend of one of
my oVirt clusters without powering off the VMs, just to be lazy.

The storage does not have dual controllers, so downtime is needed. I’m trying 
to understand what happens so I can evaluate this update without turning off 
the VMs.
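What I plan to watch during the window, sketched with read-only commands under
the assumption that sanlock and libvirt behave as Strahil describes:

# Hedged sketch: watch sanlock's view of the leases and whether libvirt
# has flipped any guest to "paused" while the storage is away.
sanlock client status
virsh -r list --all
grep -iE 'pause|abnormal' /var/log/vdsm/vdsm.log | tail -n 20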

>> Manual intervention is required? The VMs may be down or they just
>> continue to run? It depends on the guest OS running like in XenServer
>> where different scenarios may happen?
>> 
>> I’ve looked here:
>> https://www.ovirt.org/documentation/admin-guide/chap-Storage.html but
>> there’s nothing that goes about this question.
>> 
>> Thanks,
>> 
>> Sent from my iPhone



[ovirt-users] Re: Cannot start ppc64le VM's

2020-06-09 Thread Vinícius Ferrão via Users


On 8 Jun 2020, at 07:43, Michal Skrivanek <michal.skriva...@redhat.com> wrote:



On 5 Jun 2020, at 20:23, Vinícius Ferrão <fer...@versatushpc.com.br> wrote:

Hi Michal

On 5 Jun 2020, at 04:39, Michal Skrivanek <michal.skriva...@redhat.com> wrote:



On 5 Jun 2020, at 08:19, Vinícius Ferrão via Users <users@ovirt.org> wrote:

Hello, I’m trying to run ppc64le VMs on POWER9, but qemu-kvm fails, complaining
about NUMA issues:

that is not a line you should be looking at, it’s just a harmless warning.
I suppose it’s the other one, about spectre fixes

VM ppc64le.local.versatushpc.com.br
is down with error. Exit message: internal error: qemu unexpectedly closed the 
monitor: 2020-06-05T06:16:10.716052Z qemu-kvm: warning: CPU(s) not present in 
any NUMA nodes: CPU 4 [core-id: 4], CPU 5 [core-id: 5], CPU 6 [core-id: 6], CPU 
7 [core-id: 7], CPU 8 [core-id: 8], CPU 9 [core-id: 9], CPU 10 [core-id: 10], 
CPU 11 [core-id: 11], CPU 12 [core-id: 12], CPU 13 [core-id: 13], CPU 14 
[core-id: 14], CPU 15 [core-id: 15] 2020-06-05T06:16:10.716067Z qemu-kvm: 
warning: All CPU(s) up to maxcpus should be described in NUMA config, ability 
to start up with partial NUMA mappings is obsoleted and will be removed in 
future 2020-06-05T06:16:11.155924Z qemu-kvm: Requested safe indirect branch 
capability level not supported by kvm, try cap-ibs=fixed-ibs.

Any idea of what’s happening?

I found some links, but I’m not sure if they are related or not:
https://bugzilla.redhat.com/show_bug.cgi?id=1732726
https://bugzilla.redhat.com/show_bug.cgi?id=1592648

yes, they look relevant if that’s the hw you have. We do use the
pseries-rhel7.6.0-sxxm machine type in 4.3 (not in 4.4; upgrading would be the
preferred solution).
If you don’t care about security you can also modify the machine type per VM 
(or in engine db for all VMs) to "pseries-rhel7.6.0"
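Per VM that would be the Custom Emulated Machine field in the Edit VM dialog;
the engine-db route would look roughly like the sketch below. The
vm_static.custom_emulated_machine column is an assumption from my reading of
the schema, so run engine-backup first:

# Hedged sketch: override the machine type for a single VM in the engine
# DB; 'myvm' is a placeholder. Untested; back up the engine DB first.
sudo -u postgres psql engine -c \
  "update vm_static set custom_emulated_machine = 'pseries-rhel7.6.0' \
   where vm_name = 'myvm';"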

I’m using an AC922 machine.

and is it oVirt  4.3 or 4.4?
Bug 1732726 is on RHEL 8, so relevant only for oVirt 4.4, i.e. you’d have to 
have a 4.3 cluster level?
if you really want to keep using -sxxm you need to modify it to add the extra 
flag the bug talks about

this shouldn’t be needed in 4.4 cluster level though

Hi Michal, I’m running 4.3.10. Not in 4.4 yet.

So the workaround would be to add cap-ibs=fixed-ibs to the VM parameters so
that sxxm would work? Where do I add this? Do you know?

Thanks.



In fact I can boot the VMs with pseries-rhel7.6.0 but not with
pseries-rhel7.6.0-sxxm; how did you make pseries-rhel7.6.0-sxxm work on the 4.3
release?

# lscpu
Architecture:  ppc64le
Byte Order:Little Endian
CPU(s):128
On-line CPU(s) list:   0-127
Thread(s) per core:4
Core(s) per socket:16
Socket(s): 2
NUMA node(s):  6
Model: 2.2 (pvr 004e 1202)
Model name:POWER9, altivec supported
CPU max MHz:   3800.
CPU min MHz:   2300.
L1d cache: 32K
L1i cache: 32K
L2 cache:  512K
L3 cache:  10240K
NUMA node0 CPU(s): 0-63
NUMA node8 CPU(s): 64-127
NUMA node252 CPU(s):
NUMA node253 CPU(s):
NUMA node254 CPU(s):
NUMA node255 CPU(s):

Thank you for helping out.


Thanks,
michal

Thanks,



[ovirt-users] Re: Power Management on IBM AC922 Power9 (ppc64le)

2020-06-08 Thread Vinícius Ferrão via Users
Yes… actually IBM uses pretty standard stuff. IPMI is enabled by default and,
as I said, I can use ipmitool on the CLI and it works normally.

I do have some updates: I upgraded the OpenBMC firmware and now I can use
ipmitool like anything else, with -U and -P; so I was hoping that oVirt would
handle the Power Management with IPMI over LAN (exactly as you suggested), but
the issue remains. JSON-RPC error. :(

Now I really think this is a bug, but I would like to get some confirmation
from the oVirt devs before raising it on Bugzilla.

> On 8 Jun 2020, at 14:00, bernadette.pfau--- via Users  wrote:
> 
> Making a guess here -- on Dell iDRAC there is a setting for "IPMI over LAN".  
> Is there an equivalent on the IBM?


[ovirt-users] What happens when shared storage is down?

2020-06-06 Thread Vinícius Ferrão via Users
Hello,

This is a pretty vague and difficult question to answer. But what happens if 
the shared storage holding the VMs is down or unavailable for a period of time?

I’m aware that a longer timeout may put the VMs into a paused state, but how is
this handled? Is it a time limit? A request limit? Who manages this?

In the event of self-recovery of the storage backend, what happens next? Is
manual intervention required? Are the VMs down, or do they just continue to
run? Does it depend on the guest OS, as in XenServer, where different scenarios
may happen?

I’ve looked here:
https://www.ovirt.org/documentation/admin-guide/chap-Storage.html but there’s
nothing that addresses this question.

Thanks,

Sent from my iPhone


[ovirt-users] Re: Cannot start ppc64le VM's

2020-06-05 Thread Vinícius Ferrão via Users
Hi Michal

On 5 Jun 2020, at 04:39, Michal Skrivanek <michal.skriva...@redhat.com> wrote:



On 5 Jun 2020, at 08:19, Vinícius Ferrão via Users <users@ovirt.org> wrote:

Hello, I’m trying to run ppc64le VMs on POWER9, but qemu-kvm fails, complaining
about NUMA issues:

that is not a line you should be looking at, it’s just a harmless warning.
I suppose it’s the other one, about spectre fixes

VM ppc64le.local.versatushpc.com.br
is down with error. Exit message: internal error: qemu unexpectedly closed the 
monitor: 2020-06-05T06:16:10.716052Z qemu-kvm: warning: CPU(s) not present in 
any NUMA nodes: CPU 4 [core-id: 4], CPU 5 [core-id: 5], CPU 6 [core-id: 6], CPU 
7 [core-id: 7], CPU 8 [core-id: 8], CPU 9 [core-id: 9], CPU 10 [core-id: 10], 
CPU 11 [core-id: 11], CPU 12 [core-id: 12], CPU 13 [core-id: 13], CPU 14 
[core-id: 14], CPU 15 [core-id: 15] 2020-06-05T06:16:10.716067Z qemu-kvm: 
warning: All CPU(s) up to maxcpus should be described in NUMA config, ability 
to start up with partial NUMA mappings is obsoleted and will be removed in 
future 2020-06-05T06:16:11.155924Z qemu-kvm: Requested safe indirect branch 
capability level not supported by kvm, try cap-ibs=fixed-ibs.

Any idea of what’s happening?

I found some links, but I’m not sure if they are related or not:
https://bugzilla.redhat.com/show_bug.cgi?id=1732726
https://bugzilla.redhat.com/show_bug.cgi?id=1592648

yes, they look relevant if that’s the hw you have. We do use the
pseries-rhel7.6.0-sxxm machine type in 4.3 (not in 4.4; upgrading would be the
preferred solution).
If you don’t care about security you can also modify the machine type per VM 
(or in engine db for all VMs) to "pseries-rhel7.6.0"

I’m using an AC922 machine.

In fact I can boot the VMs with pseries-rhel7.6.0 but not with
pseries-rhel7.6.0-sxxm; how did you make pseries-rhel7.6.0-sxxm work on the 4.3
release?

# lscpu
Architecture:  ppc64le
Byte Order:Little Endian
CPU(s):128
On-line CPU(s) list:   0-127
Thread(s) per core:4
Core(s) per socket:16
Socket(s): 2
NUMA node(s):  6
Model: 2.2 (pvr 004e 1202)
Model name:POWER9, altivec supported
CPU max MHz:   3800.
CPU min MHz:   2300.
L1d cache: 32K
L1i cache: 32K
L2 cache:  512K
L3 cache:  10240K
NUMA node0 CPU(s): 0-63
NUMA node8 CPU(s): 64-127
NUMA node252 CPU(s):
NUMA node253 CPU(s):
NUMA node254 CPU(s):
NUMA node255 CPU(s):

Thank you for helping out.


Thanks,
michal

Thanks,



[ovirt-users] Cannot start ppc64le VM's

2020-06-05 Thread Vinícius Ferrão via Users
Hello, I’m trying to run ppc64le VMs on POWER9, but qemu-kvm fails, complaining
about NUMA issues:

VM ppc64le.local.versatushpc.com.br is 
down with error. Exit message: internal error: qemu unexpectedly closed the 
monitor: 2020-06-05T06:16:10.716052Z qemu-kvm: warning: CPU(s) not present in 
any NUMA nodes: CPU 4 [core-id: 4], CPU 5 [core-id: 5], CPU 6 [core-id: 6], CPU 
7 [core-id: 7], CPU 8 [core-id: 8], CPU 9 [core-id: 9], CPU 10 [core-id: 10], 
CPU 11 [core-id: 11], CPU 12 [core-id: 12], CPU 13 [core-id: 13], CPU 14 
[core-id: 14], CPU 15 [core-id: 15] 2020-06-05T06:16:10.716067Z qemu-kvm: 
warning: All CPU(s) up to maxcpus should be described in NUMA config, ability 
to start up with partial NUMA mappings is obsoleted and will be removed in 
future 2020-06-05T06:16:11.155924Z qemu-kvm: Requested safe indirect branch 
capability level not supported by kvm, try cap-ibs=fixed-ibs.

Any idea of what’s happening?

I found some links, but I’m not sure if they are related or not:
https://bugzilla.redhat.com/show_bug.cgi?id=1732726
https://bugzilla.redhat.com/show_bug.cgi?id=1592648

Thanks,



[ovirt-users] Power Management on IBM AC922 Power9 (ppc64le)

2020-06-04 Thread Vinícius Ferrão via Users
Hello,

I would like to know how to enable Power Management on AC922 hardware from IBM.
It’s the ppc64le architecture and runs OpenBMC as its management controller.

I only get "Test failed: Internal JSON-RPC error" when adding the info with
ipmilan on the engine. From the command line I can use ipmitool without
specifying any user, but on the engine I must specify a user; there’s no way to
leave it blank.
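When the engine only reports the generic error, the underlying ipmitool failure
is usually spelled out in the logs. A hedged sketch of where I would look
(standard engine and host log paths):

# Hedged sketch: the fence test's real failure normally lands here, on
# the engine and on the proxy host that ran the test.
grep -iE 'fence|power management' /var/log/ovirt-engine/engine.log | tail -n 20
journalctl -u vdsmd | grep -i fence | tail -n 20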

Thanks,


 


[ovirt-users] Re: POWER9 Support: VDSM requiring LVM2 package that's missing

2020-05-14 Thread Vinícius Ferrão via Users
Hi Amit, I think I found the answer: It’s not available yet.

https://bugzilla.redhat.com/show_bug.cgi?id=1829348

It's this bug, right?

Thanks,



[ovirt-users] Re: POWER9 Support: VDSM requiring LVM2 package that's missing

2020-05-14 Thread Vinícius Ferrão via Users
Hi Amit, thanks for confirming.

Do you know in which repository VDSM 4.30.36 is available?

It’s not available on any of both:
rhel-7-for-power-9-rpms/ppc64le                          Red Hat Enterprise Linux 7 for POWER9 (RPMs)                                   9,156
rhel-7-server-rhv-4-mgmt-agent-for-power-9-rpms/ppc64le  Red Hat Virtualization 4 Management Agents (for RHEL 7 Server for IBM POWER9)  814


Thank you!


On 14 May 2020, at 20:09, Amit Bawer <aba...@redhat.com> wrote:


On Fri, May 15, 2020 at 12:19 AM Vinícius Ferrão via Users <users@ovirt.org> wrote:
Hello,

I would like to know whether this is a bug; if it is, I will submit it to Red Hat.
Fixed on vdsm-4.30.46

I’m trying to add a ppc64le (POWER9) machine to the hosts pool, but there are
missing dependencies for VDSM:

--> Processing Dependency: lvm2 >= 7:2.02.186-7.el7_8.1 for package: 
vdsm-4.30.44-1.el7ev.ppc64le
--> Finished Dependency Resolution
Error: Package: vdsm-4.30.44-1.el7ev.ppc64le 
(rhel-7-server-rhv-4-mgmt-agent-for-power-9-rpms)
   Requires: lvm2 >= 7:2.02.186-7.el7_8.1
   Available: 7:lvm2-2.02.171-8.el7.ppc64le (rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.171-8.el7
   Available: 7:lvm2-2.02.177-4.el7.ppc64le (rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.177-4.el7
   Available: 7:lvm2-2.02.180-8.el7.ppc64le (rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-8.el7
   Available: 7:lvm2-2.02.180-10.el7_6.1.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.1
   Available: 7:lvm2-2.02.180-10.el7_6.2.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.2
   Available: 7:lvm2-2.02.180-10.el7_6.3.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.3
   Available: 7:lvm2-2.02.180-10.el7_6.7.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.7
   Available: 7:lvm2-2.02.180-10.el7_6.8.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.8
   Installing: 7:lvm2-2.02.180-10.el7_6.9.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.9


Thanks,

___
Users mailing list -- users@ovirt.org<mailto:users@ovirt.org>
To unsubscribe send an email to 
users-le...@ovirt.org<mailto:users-le...@ovirt.org>
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/I3YDM2VN7K2GHNLNLWCEXZRSAHI4F4L7/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/W3P6LKEPJBNECFYFJEML6G5W3XDAS43Q/


[ovirt-users] POWER9 Support: VDSM requiring LVM2 package that's missing

2020-05-14 Thread Vinícius Ferrão via Users
Hello,

I would like to know if this is a bug or not; if yes, I will submit it to Red Hat.

I’m trying to add a ppc64le (POWER9) machine to the hosts pool, but there are missing dependencies on VDSM:

--> Processing Dependency: lvm2 >= 7:2.02.186-7.el7_8.1 for package: 
vdsm-4.30.44-1.el7ev.ppc64le
--> Finished Dependency Resolution
Error: Package: vdsm-4.30.44-1.el7ev.ppc64le 
(rhel-7-server-rhv-4-mgmt-agent-for-power-9-rpms)
   Requires: lvm2 >= 7:2.02.186-7.el7_8.1
   Available: 7:lvm2-2.02.171-8.el7.ppc64le (rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.171-8.el7
   Available: 7:lvm2-2.02.177-4.el7.ppc64le (rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.177-4.el7
   Available: 7:lvm2-2.02.180-8.el7.ppc64le (rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-8.el7
   Available: 7:lvm2-2.02.180-10.el7_6.1.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.1
   Available: 7:lvm2-2.02.180-10.el7_6.2.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.2
   Available: 7:lvm2-2.02.180-10.el7_6.3.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.3
   Available: 7:lvm2-2.02.180-10.el7_6.7.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.7
   Available: 7:lvm2-2.02.180-10.el7_6.8.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.8
   Installing: 7:lvm2-2.02.180-10.el7_6.9.ppc64le 
(rhel-7-for-power-9-rpms)
   lvm2 = 7:2.02.180-10.el7_6.9


Thanks,
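
For reference, a minimal way to diagnose this kind of mismatch from the host, assuming the repo IDs shown above (the pinned version below is illustrative, taken from the reply in this thread):

# List every lvm2 build the POWER9 base repo actually ships:
yum --showduplicates list lvm2

# List every vdsm build the RHV management-agent repo offers:
yum --showduplicates list vdsm

# Until a vdsm build whose lvm2 requirement is satisfiable reaches the repo,
# one possible workaround is to pin an older, installable vdsm explicitly:
yum install vdsm-4.30.36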



[ovirt-users] Re: Host fails to enter in maintenance due to migration failure

2020-04-13 Thread Vinícius Ferrão
Hi Eric, thanks for the reply.

Actually, setting the host to Activate did stop the maintenance and aborted the migration. Thank you for that.

Regarding the segregated migration network: I don’t have additional physical interfaces, so I can only do it on a VLAN. I don’t know if that would help…

The attached picture shows my networks.




On 13 Apr 2020, at 14:43, eev...@digitaldatatechs.com wrote:

It should give you the option to Activate the host to end its "preparing for maintenance" state. Once it's active, if the VMs are shut down it should go straight to maintenance mode in just a few seconds.
My question is: are you using a migration network separate from the ovirtmgmt network? That would probably speed things along for you.
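
The same steps can also be driven from the REST API; a rough sketch (engine hostname, credentials and IDs are placeholders, not taken from this thread):

# Re-activate the host, which cancels the pending "preparing for maintenance":
curl -k -u admin@internal:PASSWORD -H "Content-Type: application/xml" \
     -d "<action/>" \
     https://engine.example.com/ovirt-engine/api/hosts/HOST_ID/activate

# Or cancel the stuck live migration on the VM itself:
curl -k -u admin@internal:PASSWORD -H "Content-Type: application/xml" \
     -d "<action/>" \
     https://engine.example.com/ovirt-engine/api/vms/VM_ID/cancelmigration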

Eric Evans
Digital Data Services LLC.
304.660.9080


-Original Message-
From: Vinícius Ferrão <fer...@versatushpc.com.br>
Sent: Monday, April 13, 2020 12:04 PM
To: users <users@ovirt.org>
Subject: [ovirt-users] Host fails to enter in maintenance due to migration 
failure

Hello,

I have a host that has been preparing for maintenance for almost 20 hours now.

There’s a huge VM on it, with 32 GB of RAM, and this VM is failing migration.

So, is there a way to cancel "prepare for maintenance", so the host can stop retrying this migration that keeps failing?

I can just shut down the VM to do the maintenance on the host...

Thanks,



[ovirt-users] Host fails to enter in maintenance due to migration failure

2020-04-13 Thread Vinícius Ferrão
Hello,

I have a host that has been preparing for maintenance for almost 20 hours now.

There’s a huge VM on it, with 32 GB of RAM, and this VM is failing migration.

So, is there a way to cancel "prepare for maintenance", so the host can stop retrying this migration that keeps failing?

I can just shut down the VM to do the maintenance on the host...

Thanks,



[ovirt-users] Re: Ovirt and Dell Compellent in ISCSI

2020-04-09 Thread Vinícius Ferrão
It’s the same problem all over again.

iSCSI in oVirt/RHV has been broken for years.

I reported this a while ago: https://bugzilla.redhat.com/show_bug.cgi?id=1474904

iSCSI Multipath in the engine does not mean a thing. It’s broken.

I don’t know why the oVirt team does not acknowledge this. I feel like an idiot, always answering and inquiring about this, but I can’t understand why nobody takes it seriously. It’s real, it affects everyone, and it simply does not make sense.


Sent from my iPhone

On 9 Apr 2020, at 10:39, "dalma...@cines.fr"  wrote:


Hi Shani,
thanks for the reply.
In this case, bonding, I think, is inappropriate.
The Dell Compellent has 2 "fault domains" with different IP networks.
This is an iSCSI array with 8 front-end ports (4 per controller). The iSCSI network is simple: two independent switches with a single VLAN; front-end ports are split equally between the two switches.
And for each server, one Ethernet controller is connected to each switch. So bonding seems inappropriate.
(See this Dell documentation: https://downloads.dell.com/manuals/common/scv30x0iscsi-setup_en.pdf)
Maybe I misunderstood how iSCSI bonding works in oVirt?

Regards,
Sylvain


De: "Shani Leviim" 
À: dalma...@cines.fr
Cc: "users" 
Envoyé: Mardi 7 Avril 2020 15:41:16
Objet: Re: [ovirt-users] Ovirt and Dell Compellent in ISCSI

Hi Sylvain,
Not sure that's exactly what you're looking for, but you can define an iscsi 
bond (iscsi multipath) using the UI and REST API:
https://www.ovirt.org/develop/release-management/features/storage/iscsi-multipath.html

Note that this is a property of the data center (DC).

Hope it helps.

Regards,
Shani Leviim


On Wed, Apr 1, 2020 at 12:35 PM <dalma...@cines.fr> wrote:
Hi all,
We use oVirt 4.3 on Dell R640 servers running CentOS 7.7, with a Dell Compellent SCv3020 storage array over iSCSI.
We use two 10 Gb interfaces for the iSCSI connection on each Dell server.
If we configure the iSCSI connection directly from the web UI, we can't specify the two physical Ethernet interfaces, and there are missing paths (only 4 paths out of 8).
So, in the hypervisor's shell, we use these commands to configure the connections:
iscsiadm -m iface -I em1 --op=new    # create an iface record for the 1st Ethernet interface
iscsiadm -m iface -I p3p1 --op=new   # create an iface record for the 2nd Ethernet interface
iscsiadm -m discovery -t sendtargets -p xx.xx.xx.xx   # discover targets through the portal
iscsiadm -m node -o show             # review the discovered node records
iscsiadm -m node --login             # log in to all discovered paths
After this, in the web UI we can connect our LUN with all paths.

Also, I don’t understand how to configure multipath in the web UI. By default, the configuration is failover:
multipath -ll :
36000d3100457e405 dm-3 COMPELNT,Compellent Vol
size=500G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 23:0:0:1 sdb 8:16   active ready running
  |- 24:0:0:1 sdd 8:48   active ready running
  |- 25:0:0:1 sdc 8:32   active ready running
  |- 26:0:0:1 sde 8:64   active ready running
  |- 31:0:0:1 sdf 8:80   active ready running
  |- 32:0:0:1 sdg 8:96   active ready running
  |- 33:0:0:1 sdh 8:112  active ready running
  |- 34:0:0:1 sdi 8:128  active ready running

I think round-robin or another policy would be more performant.
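
For what it's worth, a hedged sketch of what a round-robin policy for this array could look like in /etc/multipath.conf. The vendor/product strings are taken from the multipath -ll output above; the options are illustrative, so verify them against your release and the array documentation. Also note that VDSM manages /etc/multipath.conf on oVirt hosts and may overwrite local edits unless the file carries the "# VDSM PRIVATE" marker near the top:

devices {
    device {
        vendor                "COMPELNT"
        product               "Compellent Vol"
        path_grouping_policy  multibus          # one path group, all 8 paths active
        path_selector         "round-robin 0"   # spread I/O across the paths
        no_path_retry         queue
    }
}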

So, can we make this configuration (selecting physical interfaces and configuring multipath) in the web UI, for easier maintenance and for adding other servers?

Thank you.

Sylvain.


[ovirt-users] Re: ISO and Export domains deprecated

2020-04-07 Thread Vinícius Ferrão
You can have another data domain named ISO. Just organize it there.

Sent from my iPhone

On 6 Apr 2020, at 19:37, "eev...@digitaldatatechs.com" wrote:


My understanding is that the export (data) domains will hold the ISOs and VFD files now. Personally, I like having an ISO domain to keep those files separate from VM hard disks and configs.
I guess I’m a bit old school.

Eric Evans
Digital Data Services LLC.
304.660.9080


From: Colin Coe 
Sent: Monday, April 6, 2020 6:25 PM
To: eev...@digitaldatatechs.com
Cc: users 
Subject: [ovirt-users] Re: ISO and Export domains deprecated

I'm pretty sure export domains are being deprecated; however, my main concern now is the sharing of ISO domains, and whether I can do the equivalent with data domains.

On Tue, 7 Apr 2020 at 04:20, <eev...@digitaldatatechs.com> wrote:
What will replace export domains? I thought only ISO domains were being deprecated.

Eric Evans
Digital Data Services LLC.
304.660.9080


From: Colin Coe <colin@gmail.com>
Sent: Sunday, April 5, 2020 2:19 AM
To: users <Users@ovirt.org>
Subject: [ovirt-users] ISO and Export domains deprecated

Hi all

I'm trying to understand how ISO and Export domains going away is going to affect us.

We have four RHV instances:
- Prod DC1 (RHV4.1)
- Prod DC2 (RHV4.1)
- DEV (RHV4.3)
- TEST (RHV4.3)

Prod DC1, DEV and TEST all share export and ISO domains.

Prod DC2 is remote and currently has its own ISO and Export domains.

When ISO and Export domains go away, can I still share ISO domains between the 
RHV instances?

Thanks





[ovirt-users] Re: Windows deployment

2020-02-27 Thread Vinícius Ferrão
Eric,

On the WDS server you must add the drivers to the image.

But pay attention here: the drivers from the oVirt release aren’t signed by a certificate authority trusted by Windows. Only the drivers from RHV (the downstream, paid product from Red Hat, based on oVirt) are.

WDS needs properly signed drivers for network install.

You’ll never be able to install it over PXE from WDS with the oVirt Drivers.

Change the network card to something generic like rtl8139 and then change back to VirtIO, or install manually without PXE.
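
If you want to script that NIC flip, a hypothetical REST sketch (VM/NIC IDs, engine hostname and credentials are placeholders):

# Switch the vNIC to rtl8139 for the PXE install:
curl -k -u admin@internal:PASSWORD -H "Content-Type: application/xml" \
     -X PUT -d "<nic><interface>rtl8139</interface></nic>" \
     https://engine.example.com/ovirt-engine/api/vms/VM_ID/nics/NIC_ID

# After Windows and the signed drivers are installed, switch back:
curl -k -u admin@internal:PASSWORD -H "Content-Type: application/xml" \
     -X PUT -d "<nic><interface>virtio</interface></nic>" \
     https://engine.example.com/ovirt-engine/api/vms/VM_ID/nics/NIC_ID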



Sent from my iPhone

On 27 Feb 2020, at 04:44, Dominik Holler  wrote:




On Wed, Feb 26, 2020 at 11:45 PM <eev...@digitaldatatechs.com> wrote:
It is a Red Hat Virtio Ethernet Adapter.


Would an e1000 or rtl8139 work for your scenario?


Eric Evans
Digital Data Services LLC.
304.660.9080


From: Dominik Holler <dhol...@redhat.com>
Sent: Wednesday, February 26, 2020 11:55 AM
To: eev...@digitaldatatechs.com
Cc: users <users@ovirt.org>
Subject: [ovirt-users] Re: Windows deployment



On Wed, Feb 26, 2020 at 4:45 PM <eev...@digitaldatatechs.com> wrote:
I have a Windows Deployment Services (WDS) server that works with Hyper-V. I am trying to see if it will do bare-metal provisioning to oVirt.

Do you want to provision Windows into an oVirt VM?

I have the necessary drivers loaded

I am not familiar with the Windows deployment server.
Would you help me to understand which drivers are loaded to what?

and initially it gets an IP and starts the installation.

I understand that the Windows installer is starting. Where does it get the files from?

When it comes to the screen starting setup, I get an IP error. I have included 
a screenshot.

Might this be related to network drivers?
If Windows in an oVirt VM, what is the type of the virtual NIC?
Maybe using e1000 or rtl8139 as the type of the virtual NIC helps until the VirtIO drivers are installed?

Has anyone see this before?

I also considered provisioning from Foreman but I would rather use what I have 
instead of recreating the entire installation.
Any help would be appreciated.

Here is a link to the actual error:

ftp://ftp.digitaldatatechs.com/pub/Windows%20Provision.jpg


[ovirt-users] Re: populating the ISO domain

2020-02-19 Thread Vinícius Ferrão
Hi.

ISO domains were deprecated. Now you should use a data domain instead and fill it with the ISOs.

If you’re still using an ISO domain, like me, just scp the ISOs directly to the storage. You don’t need to use the ISO uploader script; I never made it work anyway LOL.
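
A minimal sketch of that manual copy, assuming an NFS ISO domain (the export path and ISO name are placeholders; the all-ones image UUID is the fixed directory an ISO domain uses for its files):

# Copy the ISO into the domain's magic images directory:
scp CentOS-7-x86_64-Minimal.iso \
    root@storage:/exports/iso/SD_UUID/images/11111111-1111-1111-1111-111111111111/

# Make sure vdsm:kvm (36:36) owns the file, so the engine can list it:
ssh root@storage chown 36:36 \
    /exports/iso/SD_UUID/images/11111111-1111-1111-1111-111111111111/CentOS-7-x86_64-Minimal.iso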

Sent from my iPhone

> On 19 Feb 2020, at 04:33, "eev...@digitaldatatechs.com" wrote:
> 


[ovirt-users] Re: Reimport disks

2020-02-13 Thread Vinícius Ferrão
Import Domain will work. The VM metadata is available in the OVF_STORE container inside the domain, so even the names and settings come back.

Then you gradually start moving the VMs to the Gluster storage.
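
That flow can also be driven from the REST API; a hedged sketch (all IDs are placeholders, and parameter spellings may differ between 4.x releases, so treat this as a pointer to the endpoints rather than a recipe):

# 1. Attach the imported domain to the data center:
curl -k -u admin@internal:PASSWORD -H "Content-Type: application/xml" \
     -d '<storage_domain id="SD_ID"/>' \
     https://engine.example.com/ovirt-engine/api/datacenters/DC_ID/storagedomains

# 2. List the unregistered VMs recovered from its OVF_STORE:
curl -k -u admin@internal:PASSWORD \
     "https://engine.example.com/ovirt-engine/api/storagedomains/SD_ID/vms?unregistered=true"

# 3. Register a VM into a cluster:
curl -k -u admin@internal:PASSWORD -H "Content-Type: application/xml" \
     -d '<action><cluster id="CLUSTER_ID"/></action>' \
     https://engine.example.com/ovirt-engine/api/storagedomains/SD_ID/vms/VM_ID/register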

Sent from my iPhone

> On 13 Feb 2020, at 11:42, Robert Webb  wrote:
> 
> Off the top of my head, would you use the "Import Domain" option?
> 
> 
> From: Christian Reiss 
> Sent: Thursday, February 13, 2020 9:30 AM
> To: users
> Subject: [ovirt-users] Reimport disks
> 
> Hey folks,
> 
> I created a new cluster with a new engine, everything is green and
> running again (3 HCI, Gluster, this time Gluster 7.0 and CentOS7 hosts).
> 
> I do have a backup of the /images/ directory from the old installation.
> I tried copying (and preserving user/ permissions) into the new images
> gluster dir and trying a domain -> scan to no avail.
> 
> What is the correct way to introduce oVirt to "new" (or unknown) images?
> 
> -Chris.
> 
> --
> with kind regards,
> mit freundlichen Gruessen,
> 
> Christian Reiss


[ovirt-users] Re: Deploy Hosted Engine fails at "Set VLAN ID at datacenter level"

2020-02-05 Thread Vinícius Ferrão
Man, Mode 4 is LACP.

You must use the correct technology or things will get messy.

I run a lot of RHV and oVirt clusters with Mode 4 (802.3ad, LACP, or whatever other name you want to use), but you must comply with the rules. I do recommend some reading about bonding modes; don’t just enable them and see what happens, that is not responsible at all.

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html/Installation_Guide/Bonding_Modes.html
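
For illustration only, this is roughly what a mode 4 bond looks like on an EL7 host (interface names and options are placeholders; the switch ports must be members of an LACP-active port-channel, or the bond will flap exactly as described in this thread):

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=802.3ad miimon=100 xmit_hash_policy=layer2+3"
ONBOOT=yes
BOOTPROTO=none

# Verify negotiation (look for "802.3ad" and a matching aggregator ID on
# every slave):
cat /proc/net/bonding/bond0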

And man, open a new thread, this one was from a VLAN issue on the storage 
network.

> On 5 Feb 2020, at 19:53, eev...@digitaldatatechs.com wrote:
> 
> So if the bond is set to mode 4, should I LAG the ports it connects to? It 
> doesn't affect an identical server not in the oVirt cluster, which shows a 2 Gb 
> connection. I have tried modes 1 and 2 with the same result. Also, the other 
> oVirt node is in mode 4 and has no issues at all.
> 
> Eric Evans
> Digital Data Services LLC.
> 304.660.9080
> 
> 
> -Original Message-
> From: Vinícius Ferrão  
> Sent: Wednesday, February 05, 2020 4:51 PM
> To: eev...@digitaldatatechs.com
> Cc: users@ovirt.org
> Subject: [ovirt-users] Re: Deploy Hosted Engine fails at "Set VLAN ID at 
> datacenter level"
> 
> Mode 4 is LACP.
> 
> So the switches must be talking LACP.
> 
> 
> 
>> On 5 Feb 2020, at 18:28, eev...@digitaldatatechs.com wrote:
>> 
>> I tried to set LAGs, but it killed communication altogether. So no, they are 
>> not.
>> 
>> Eric Evans
>> Digital Data Services LLC.
>> 304.660.9080
>> 
>> 
>> -Original Message-
>> From: Vinícius Ferrão 
>> Sent: Wednesday, February 05, 2020 3:55 PM
>> To: eev...@digitaldatatechs.com
>> Cc: users@ovirt.org
>> Subject: [ovirt-users] Re: Deploy Hosted Engine fails at "Set VLAN ID at 
>> datacenter level"
>> 
>> The switches are configured for LACP Active mode?
>> 
>> 
>>> On 5 Feb 2020, at 17:40, eev...@digitaldatatechs.com wrote:
>>> 
>>> I have a hypervisor with NIC bonding in place and it has some communication 
>>> issues. It constantly goes up and down and into non-operational mode, but 
>>> comes back up. Is there a fix for this? Using mode 4 for bonding.
>>> 
>>> Eric Evans
>>> Digital Data Services LLC.
>>> 304.660.9080
>>> 
>>> 
>>> -Original Message-
>>> From: clam2...@gmail.com 
>>> Sent: Wednesday, February 05, 2020 12:48 PM
>>> To: users@ovirt.org
>>> Subject: [ovirt-users] Re: Deploy Hosted Engine fails at "Set VLAN ID at 
>>> datacenter level"
>>> 
>>> Thank you Guillaume!  My mistake.  Resolved.
>>> 
>>> I am now having further issues, which I believe are because I am using tagged 
>>> VLANs with NIC teaming. It appears that teaming is not well supported in 
>>> oVirt - is that accurate, and should I rebuild using bonds? If you have any 
>>> experience to shed light on this, it is much appreciated.
>>> 
>>> [ INFO ] TASK [ovirt.hosted_engine_setup : Fail with error 
>>> description] [ ERROR ] fatal: [localhost]: FAILED! => {"changed":
>>> false, "msg": "The host has been set in non_operational status, 
>>> deployment errors: code 505: Host fmov1n1.bcn.dtcorp.com installation 
>>> failed. Failed to configure management network on the host., code
>>> 9000: Failed to verify Power Management configuration for Host 
>>> fmov1n1.bcn.dtcorp.com., fix accordingly and re-deploy."}
>>> 
>>> Thanks so very much,
>>> Charles
  1   2   >