from:"Nicolas Ecarnot"

[ovirt-users] Re: Proper way to upgrade hosts OS?

2019-07-01 Thread Nicolas Ecarnot


On 26/06/2019 at 12:34, Nicolas Ecarnot wrote:

Hello,

We're not using nodes but CentOS 7.x hosts.
Do you know if some documentation has been written about the proper way 
to upgrade the operating system of the hosts, and especially how to 
avoid breaking dependencies or causing version mismatches?


Thank you.



Hello,

As no answer came, could anyone just tell me whether there is any risk of 
breaking something?


Thank you.

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GZYVUCMBIZIZOMKSZUIJRZ6IMWBBI2X6/

[ovirt-users] Proper way to upgrade hosts OS?

2019-06-26 Thread Nicolas Ecarnot


Hello,

We're not using nodes but CentOS 7.x hosts.
Do you know if some documentation has been written about the proper way 
to upgrade the operating system of the hosts, and especially how to 
avoid breaking dependencies or causing version mismatches?


Thank you.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4SYJWWODEY2VZOAMU5NIRDOJCPANNR6S/

[ovirt-users] Re: Old mailing list SPAM

2019-05-15 Thread Nicolas Ecarnot


On 15/05/2019 at 07:46, Markus Stockhausen wrote:

Hi,

does anyone currently get old mails of 2016 from the mailing list?


I do.

(Though it is annoying, it got me an answer to a question I had never 
thought to ask. Thanks Nir, by the way.)



We are spammed with something like this from teknikservice.nu:

...
Received: from mail.ovirt.org (localhost [IPv6:::1]) by mail.ovirt.org
  (Postfix) with ESMTP id A33EA46AD3; Tue, 14 May 2019 14:48:48 -0400 (EDT)

Received: by mail.ovirt.org (Postfix, from userid 995) id D283A407D0; Tue, 14
  May 2019 14:42:29 -0400 (EDT)

Received: from bauhaus.teknikservice.nu (smtp.teknikservice.nu [81.216.61.60])
  (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No
  client certificate requested) by mail.ovirt.org (Postfix) with ESMTPS id
  BF954467FE for ; Tue, 14 May 2019 14:36:54 -0400 (EDT)

Received: by bauhaus.teknikservice.nu (Postfix, from userid 0) id 259822F504;
  Tue, 14 May 2019 20:32:33 +0200 (CEST) <- 3 YEAR TIME WARP ?

Received: from washer.actnet.nu (washer.actnet.nu [212.214.67.187]) (using
  TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client
  certificate requested) by bauhaus.teknikservice.nu (Postfix) with ESMTPS id
  430FEDA541 for ; Thu,  6 Oct 2016 18:02:51 +0200 (CEST)

Received: from lists.ovirt.org (lists.ovirt.org [173.255.252.138]) (using
  TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client
  certificate requested) by washer.actnet.nu (Postfix) with ESMTPS id
  D75A82293FC for ; Thu,  6 Oct 2016 18:04:11 +0200
  (CEST)
...
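
The "time warp" in the Received chain above can be located mechanically. What follows is only a rough sketch, assuming each Received header has already been unfolded into a single string and that the date is whatever follows the last ';'. The function names and the shortened SAMPLE headers are made up for illustration; only the two timestamps come from the trace above.

```python
import re
from email.utils import parsedate_to_datetime


def hop_times(received_headers):
    """Return the timestamp of every Received header, oldest hop first."""
    stamps = []
    for header in received_headers:
        # Keep what follows the last ';' and drop comments such as "(CEST)".
        date_part = re.sub(r"\(.*?\)", "", header.rsplit(";", 1)[1])
        stamps.append(parsedate_to_datetime(" ".join(date_part.split())))
    # Headers are listed newest first in a message, so reverse them.
    return list(reversed(stamps))


def largest_gap(received_headers):
    """Return the longest delay (a timedelta) between two consecutive hops."""
    times = hop_times(received_headers)
    return max(later - earlier for earlier, later in zip(times, times[1:]))


# Shortened, illustrative copies of two hops quoted above (newest first).
SAMPLE = [
    "by bauhaus.teknikservice.nu (Postfix, from userid 0) id 259822F504; "
    "Tue, 14 May 2019 20:32:33 +0200 (CEST)",
    "from washer.actnet.nu by bauhaus.teknikservice.nu (Postfix) with ESMTPS "
    "id 430FEDA541; Thu,  6 Oct 2016 18:02:51 +0200 (CEST)",
]

if __name__ == "__main__":
    print("Longest delay between hops:", largest_gap(SAMPLE))
```

With the two hops above, the longest delay sits between the 2016 delivery to bauhaus.teknikservice.nu and its local re-injection in 2019, which is where the old mails were replayed from.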

Markus


List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XI3LV4GPACT7ILZ3BNJLHHQBEWI3HWLI/




--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IEOIF3KVPKLBO2UNZ65FSRX7EFPXHF3V/

[ovirt-users] Re: VM has paused due to no storage space error

2019-05-15 Thread Nicolas Ecarnot

Hi Nir, hi Sandvik,

As I have seen this issue many times, and as I'm using thin provisioning
on block storage, this concerns me too.

Read my question below.

On 02/10/2016 at 12:55, Nir Soffer wrote:

On Sun, Oct 2, 2016 at 12:06 PM, Sandvik Agustin
wrote:

Hi users,

I have this problem that sometimes 1 to 3 VMs just pause automatically, without
user interaction, and I get this error: "VM has paused due to no storage
space error". Any input from you guys would be much appreciated.

This is expected - when there is no storage space :-)

The vm is paused when there are pending io requests that
could not be fulfilled since you don't have enough space.

In a real machine the io requests would fail. In a vm, the vm can pause,
you can fix the issue (extend the storage domain), and resume the vm.

But I guess there is storage space available, otherwise you would
not spend the time sending this mail.

This can happen when using thin provisioned disks on block storage
(iSCSI, FC). We provision such a disk with 1G, and extend the disk
(add 1G) when it becomes too full (by default, free space < 0.5G).

If we fail to extend the disk quick enough,

"quick enough" -> Is there some place where this threshold can be
configured?

the vm will pause before the
extend was completed. Once the extend was completed, we resume
the vm.

So you may see very short pauses, but they should be rare.
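
To get a feel for what "quick enough" means here, the race between the guest filling the remaining free space and the extension completing can be put into numbers. This is a deliberately simplified model, not what vdsm actually computes: it assumes a constant write rate and that the extension is requested at the exact moment free space crosses the threshold. Only the 0.5G default comes from the explanation above; the function name and the rates are made up.

```python
THRESHOLD_GIB = 0.5  # default free-space threshold quoted above


def pause_seconds(write_rate_gib_s, extend_latency_s, threshold_gib=THRESHOLD_GIB):
    """Seconds the guest would stay paused for one extension (0.0 if none).

    Assumed model: the guest keeps writing at a constant rate while the
    extension is in flight, and it pauses once the remaining free space
    is used up.
    """
    if write_rate_gib_s <= 0:
        return 0.0
    time_to_full = threshold_gib / write_rate_gib_s
    return max(0.0, extend_latency_s - time_to_full)


if __name__ == "__main__":
    for rate, latency in [(0.1, 2.0), (0.25, 4.0)]:
        print(f"{rate} GiB/s, extend takes {latency}s ->",
              pause_seconds(rate, latency), "s paused")
```

Under these assumptions, a guest writing at 0.25 GiB/s fills the last 0.5G in 2 seconds, so any extension that takes longer than that results in a pause.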

To understand the issue, we need to inspect vdsm logs from the host
running the vm that paused, showing the timeframe when the vm
was paused.

You should see this message in the log each time a vm pauses:

abnormal vm stop device error ENOSPC
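
A small sketch for finding these lines in a vdsm log. It assumes only that the pause is reported on a single line containing the words quoted above; the matching is kept loose (case-insensitive, device name allowed in between) because the exact log format may differ between versions. The function name is made up.

```python
def paused_for_enospc(log_lines):
    """Yield (line_number, line) for each line reporting an ENOSPC pause."""
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        # Loose match: the device name may sit between the two fragments.
        if "abnormal vm stop device" in lowered and "enospc" in lowered:
            yield number, line.rstrip("\n")


if __name__ == "__main__":
    import sys

    # Usage: python find_enospc.py <path to a vdsm log file>
    with open(sys.argv[1], errors="replace") as log:
        for number, line in paused_for_enospc(log):
            print(f"{number}: {line}")
```

The log to feed it is the vdsm log of the host that was running the paused vm, typically /var/log/vdsm/vdsm.log.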

Nir

List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5MAYP4SZZQC5BB2VVPQBXYWH4OOJ7LUW/

--
Nicolas ECARNOT
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KF4SVQOE7U7ELLOIE4CNPSH2TAN7MW3K/

[ovirt-users] Re: DISCARD support?

2019-05-14 Thread Nicolas Ecarnot

Hello,

Sending this here to share knowledge.

Here is what I learned from reading many BZ entries and mailing list posts. I'm
not working at Red Hat, so please correct me if I'm wrong.

We are using thin-provisioned block storage LUNs (Equallogic), on which
oVirt is creating numerous Logical Volumes, and we're very happy with it.
When oVirt is removing a virtual disk, the SAN is not informed, because
the LVM layer is not sending the "issue_discard" flag.

/etc/lvm/lvm.conf is not the natural place to try to change this
parameter, as VDSM is not using it.

Efforts are currently being made to include issue_discard setting support
directly in vdsm.conf, first at datacenter scope (4.0.x), then per
storage domain (4.1.x), and maybe via a web GUI check-box. Part of the
effort is to make sure every bit of an LV that is about to be removed gets
wiped. Part is to inform the block storage side about the deletion, in
the case of thin-provisioned LUNs.

https://bugzilla.redhat.com/show_bug.cgi?id=1342919
https://bugzilla.redhat.com/show_bug.cgi?id=981626

--
Nicolas ECARNOT

On Mon, Oct 3, 2016 at 2:24 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote:

Yaniv,

While browsing the web at random, I found that you posted some
information about DISCARD support on Twitter
(https://twitter.com/YanivKaul/status/773513216664174592).

I did not dig any further, but is it related to the fact
that, so far, oVirt has not reclaimed lost storage space among the
logical volumes of its storage domains?

A BZ exists about this, but we were told no work would be done on
it until 4.x.y. Now that we're there, I was wondering if you knew more?

Feel free to send such questions to the mailing list (ovirt users or
devel), so others will be able to both chime in and see the response.
We've supported a custom hook for enabling discard per disk (which is
only relevant for virtio-SCSI and IDE) for some versions now (3.5 I
believe).

We are planning to add this via a UI and API in 4.1.
In addition, we are looking into discard (instead of wipe after delete,
when discard is also zero'ing content) as well as discard when removing LVs.

See:
http://www.ovirt.org/develop/release-management/features/storage/pass-discard-from-guest-to-underlying-storage/
http://www.ovirt.org/develop/release-management/features/storage/wipe-volumes-using-blkdiscard/
http://www.ovirt.org/develop/release-management/features/storage/discard-after-delete/

Best,

--
Nicolas ECARNOT

List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XNWYONXSWEN5AJVUJURRL7G3QJW62SNJ/

[ovirt-users] Logical Volume extend failed

2019-03-11 Thread Nicolas Ecarnot

Hello,

[Context :
I'm moving all my VMs from an old 3.6 DC to a brand new 4.3 DC.
For local reasons, I'm doing it using an export domain, and one by one.
]

Today, for no obvious reason, error messages began to appear :
"
VDSM SPM-servername command failed: Logical Volume extend failed
"

Lots of similar errors appear in the engine log, with no obvious
additional hint.

In the VDSM log, I'm not skilled enough to see what's wrong either.

The 3.6 engine and vdsm log files are here :

https://framadrop.org/r/6cFSb0GRc1#VQ6XqYWg9HzniHMjgKmXVpXy0I+RIS/MiMGBpU+1bak=

https://framadrop.org/r/JFswiD3fkA#fdU+m3JCVMVg/eLjtJVTqOiAKIj4eyhsRWisxcrea7I=

It may come from one of our storage domains that was close to full, but I
have freed 200 GB of space since, and the issue keeps appearing.

Now, my attempts to export a VM are failing.
I still can stop and start a VM.

(I'm not completely relaxed with this situation.)

I read some similar experience here
(https://www.canarytek.com/2017/07/21/Harmfull_bug_in_oVirt_block_storage.html)
but I'm not sure it is related.

I can psql-query and check things if needed, but I mostly need advice.

Thank you.

[ovirt-users] Re: Fencing : SSL or not?

2019-02-22 Thread Nicolas Ecarnot


On 22/02/2019 at 15:45, Martin Perina wrote:
If I understand that correctly, this is a request to open session to 
IPMI. If you haven't received any response, then I'd check:


1. Do you have IPMI enabled?



Hello Martin,

you nailed it.

IPMI was not enabled (anymore).

IPMI has been enabled by default for years on all our hosts.

But recent firmware upgrades on some of our Dell hosts, especially 
iDRAC firmware upgrades, led to IPMI being disabled.



I'm sorry for having bothered you and the audience. Sorry for this waste 
of time. Thank you Dell :-\


--
Nicolas ECARNOT

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KO7REWCFUWRGU453N5XYSFZSS75RFFU6/

[ovirt-users] Re: Host choice when migrating VMs

2019-02-22 Thread Nicolas Ecarnot


On 22/02/2019 at 15:48, Dominik Holler wrote:

Hosts _needs_ the same networks to be available in the same cluster. Different 
networked hosts needs to be put in a separate cluster.



This is the most straightforward approach, which is supported by oVirt.
But there is the possibility to attach logical networks, which are
neither required in the cluster, nor attached to all hosts in the
cluster, to a VM. oVirt's scheduling will respect this.


So you're saying oVirt knows which other hosts in the cluster have the 
non-mandatory network(s) the VM has and only chooses between those a host to 
migrate the VM to?



Yes. If you try to trigger the migration manually, UI will provide you
the list of possible hosts to migrate the VM.
https://github.com/oVirt/ovirt-engine/blob/7d111f3aa089f77f92049f4d3ec792e5ff7e5324/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/scheduling/policyunits/NetworkPolicyUnit.java#L132
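
The filtering described above can be pictured with a few lines of Python. This is only an illustration of the idea and not the engine's code (the linked NetworkPolicyUnit is Java and handles more cases); the function name, host names and network names are all made up.

```python
def candidate_hosts(vm_networks, host_networks):
    """Return the hosts whose attached networks cover every network the VM uses.

    vm_networks: iterable of logical network names attached to the VM.
    host_networks: dict mapping a host name to the networks attached to it.
    """
    needed = set(vm_networks)
    return sorted(
        host for host, networks in host_networks.items()
        if needed <= set(networks)
    )


if __name__ == "__main__":
    # Hypothetical cluster: "dmz" is a non-required network missing on hv02.
    hosts = {
        "hv01": {"ovirtmgmt", "dmz"},
        "hv02": {"ovirtmgmt"},
        "hv03": {"ovirtmgmt", "dmz", "backup"},
    }
    print(candidate_hosts({"ovirtmgmt", "dmz"}, hosts))
```

In this made-up cluster, a VM attached to "dmz" would only ever be offered hv01 and hv03 as migration targets.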




*THIS* is precisely the answer I was expecting.

Thank you Dominik.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LT6I4GS42VIPQYBF4EGT7HBS2LVLUN2Z/

[ovirt-users] Re: Host choice when migrating VMs

2019-02-22 Thread Nicolas Ecarnot


On 22/02/2019 at 15:02, Karli Sjöberg wrote:



On 22 Feb. 2019 at 09:24, Nicolas Ecarnot wrote:

Hello,

I'm almost sure the following is useless as I think I know how it's
working, but as I'm preparing a major change in our infrastructure, I'd
rather be sure and not mess up. And also to be sure.
(Just to be sure)

For some reasons, and for the first time in our infra., one of our new
DC will temporarily include heterogeneous hosts : some networks will be
available only on some of them.




Hi Karli,

Hosts _needs_ the same networks to be available in the same cluster. 


Correct me if I'm wrong, but I think that your statement is true *if* 
the networks are set as mandatory, which is not automatically wanted nor 
true. In our case, we have to disable this mandatory attribute.


I agree that when the networks are mandatory, every host unable to use 
them will end up unavailable.



Different networked hosts needs to be put in a separate cluster.

/K



--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CGPHGFXYI3OZX2XKTLCFZ6W3GN4Q6U4Q/

[ovirt-users] Re: Fencing : SSL or not?

2019-02-22 Thread Nicolas Ecarnot


On 22/02/2019 at 12:13, Martin Perina wrote:

Unfortunately, fence_ipmilan cannot display more 
debugging details, so as mentioned earlier, could you please run 
ipmitool directly?


ipmitool -vv -I lanplus -H c-hv05.prd.sdis38.fr -p 623 -U stonith -P  -L 
ADMINISTRATOR chassis power status


Above should display more details ...


root@hv04:/etc# ipmitool -vv -I lanplus -H c-hv05.prd.sdis38.fr -p 623 -U 
stonith -P 'xxx' -L ADMINISTRATOR chassis power status


Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x8e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x8e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x8e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x8e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x0e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x0e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x0e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x0e 0x04 


Get Auth Capabilities error
Error issuing Get Channel Authentication Capabilities request
Error: Unable to establish IPMI v2 / RMCP+ session
root@hv04:/etc#

--
Nicolas ECARNOT

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DQKUC2G745CKN6BT2SC3T6LSCEEML7NN/

[ovirt-users] Re: Fencing : SSL or not?

2019-02-22 Thread Nicolas Ecarnot


Hi Martin,

On 21/02/2019 at 13:04, Martin Perina wrote:

Hi Nicolas,

see my reply inline


See mine below.



On Mon, Feb 18, 2019 at 9:51 AM Nicolas Ecarnot <nico...@ecarnot.net> wrote:


Hello,

As fence_idrac has never worked for us, and as fence_ipmilan has worked
nicely for years, we are using fence_ipmilan with the lanplus=1 option
and we're happy with it.

We upgraded to 4.3.0.4 and we're witnessing that we cannot fence our
hosts anymore :

2019-02-18 09:42:08,678+01 ERROR
[org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-11)
[2f78ed99-6703-4d92-b7cb-948c2d24b623] Can not run fence action on host
'x', no suitable proxy host was found.


This is not related to the fence_ipmi issue below. To be able to 
execute a fencing operation, the engine needs at least one other host in Up 
status, which is used as a proxy host to perform the fencing operation. So 
do you have at least one host in Up status in the same 
cluster/datacenter as the host you want to run the fencing operation on?


Yes.

If so, then please enable debug information to find out why we cannot 
find any host acting as fence proxy:


1. Please download the log-control.sh script from 
https://github.com/oVirt/ovirt-engine/tree/master/contrib#log-control-sh 
and save it on the engine machine
2. Please execute the following on the engine machine:
   log-control.sh org.ovirt.engine.core.bll.pm DEBUG
3. Go to the problematic host, click Edit, go to the Power Management tab, 
click on the existing fence agent and click on the Test button
4. Take a look at engine.log; it should contain logged information about why we 
were not able to find a fence proxy


I followed the instructions above, but I feel this is not the best debug 
path. I learned nothing new.
The fence proxy is not missing. It is known and found, and it is trying 
to do its job, as written below :





and on the SPM :

fence_ipmilan: Failed: Unable to obtain correct plug status or plug is
not available


Could you please provide debug output of below command?

ipmitool -vv -I lanplus -H  -p 623 -U  
-P  -L ADMINISTRATOR chassis power status


See below a debug session.
I'm comparing two hosts, and only one is answering fence status queries.

I must add that before the upgrade to 4.3, both hosts were responding 
correctly.


fence_ipmilan --username=stonith --password='xxx' --lanplus 
--ip=c-serv-hv-prds01.sdis.isere.fr --action=status -v
2019-02-22 11:34:01,537 INFO: Executing: /usr/bin/ipmitool -I lanplus -H 
c-serv-hv-prds01.sdis.isere.fr -p 623 -U stonith -P [set] -L 
ADMINISTRATOR chassis power status


2019-02-22 11:34:01,654 DEBUG: 0 Chassis Power is on


Status: ON
root@hv04:/etc# fence_ipmilan --username=stonith --password='xxx' 
--lanplus --ip=c-hv05.prd.sdis38.fr --action=status -v
2019-02-22 11:34:15,335 INFO: Executing: /usr/bin/ipmitool -I lanplus -H 
c-hv05.prd.sdis38.fr -p 623 -U stonith -P [set] -L ADMINISTRATOR chassis 
power status


2019-02-22 11:34:35,338 ERROR: Connection timed out


root@hv04:/etc# nmap c-serv-hv-prds01.sdis.isere.fr

Starting Nmap 6.40 ( http://nmap.org ) at 2019-02-22 11:34 CET
Nmap scan report for c-serv-hv-prds01.sdis.isere.fr (192.168.53.2)
Host is up (0.010s latency).
rDNS record for 192.168.53.2: c-5g3yxx1.sdis.isere.fr
Not shown: 996 closed ports
PORT STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
443/tcp  open  https
5900/tcp open  vnc

Nmap done: 1 IP address (1 host up) scanned in 0.45 seconds
root@hv04:/etc# nmap c-hv05.prd.sdis38.fr

Starting Nmap 6.40 ( http://nmap.org ) at 2019-02-22 11:34 CET
Nmap scan report for c-hv05.prd.sdis38.fr (192.168.50.194)
Host is up (0.00060s latency).
rDNS record for 192.168.50.194: C-550W2S2.sdis.isere.fr
Not shown: 996 closed ports
PORT STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
443/tcp  open  https
5900/tcp open  vnc
MAC Address: CC:C5:E5:57:26:E0 (Unknown)

Nmap done: 1 IP address (1 host up) scanned in 0.20 seconds
root@hv04:/etc# ping -c 1 c-serv-hv-prds01.sdis.isere.fr
PING c-5g3yxx1.sdis.isere.fr (192.168.53.2) 56(84) bytes of data.
64 bytes from c-5g3yxx1.sdis.isere.fr (192.168.53.2): icmp_seq=1 ttl=61 
time=2.37 ms


--- c-5g3yxx1.sdis.isere.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.371/2.371/2.371/0.000 ms
root@hv04:/etc# ping -c 1 c-hv05.prd.sdis38.fr
PING c-550w2s2.prd.sdis38.fr (192.168.50.194) 56(84) bytes of data.
64 bytes from C-550W2S2.sdis.isere.fr (192.168.50.194): icmp_seq=1 
ttl=64 time=0.189 ms


--- c-550w2s2.prd.sdis38.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.189/0.189/0.189/0.000 ms
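
One caveat about the checks above: a default nmap scan and ping only show TCP ports and ICMP, whereas IPMI over LAN is carried by RMCP on UDP port 623 (the -p 623 in the ipmitool command line). A rough way to probe that port without ipmitool is an RMCP/ASF "presence ping". The sketch below assumes the BMC answers such pings when IPMI over LAN is enabled, which is common but not guaranteed, so a timeout is only a hint. The function names are made up.

```python
import socket

ASF_IANA = (4542).to_bytes(4, "big")  # ASF enterprise number, 0x000011BE


def build_presence_ping(tag=0):
    """Build an RMCP packet carrying an ASF Presence Ping."""
    # RMCP header: version 6, reserved, sequence 0xFF (no ack), class ASF (6).
    rmcp = bytes([0x06, 0x00, 0xFF, 0x06])
    # ASF message: enterprise number, type 0x80 (ping), tag, reserved, length 0.
    asf = ASF_IANA + bytes([0x80, tag & 0xFF, 0x00, 0x00])
    return rmcp + asf


def is_presence_pong(data, tag=0):
    """Tell whether a datagram looks like the Presence Pong for our ping."""
    return (
        len(data) >= 12
        and data[3] == 0x06          # RMCP class: ASF
        and data[4:8] == ASF_IANA
        and data[8] == 0x40          # ASF message type: Presence Pong
        and data[9] == (tag & 0xFF)
    )


def bmc_answers(host, port=623, timeout=2.0):
    """Send one presence ping over UDP and report whether a pong came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_presence_ping(), (host, port))
            data, _ = sock.recvfrom(512)
        except OSError:  # timeout, unreachable port, name resolution failure
            return False
    return is_presence_pong(data)


if __name__ == "__main__":
    import sys

    # Usage: python rmcp_ping.py <BMC hostname or address>
    print(bmc_answers(sys.argv[1]))
```

Running it against both BMC addresses from the fence proxy host would show whether UDP 623 is answered at all, independently of credentials and of the fence agent.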




Above is the command which fence_ipmi is internally executing, and -vv 
adds debugging output which can reveal issue with the plug status


Regards,
Martin


I found the sugg

[ovirt-users] Host choice when migrating VMs

2019-02-22 Thread Nicolas Ecarnot


Hello,

I'm almost sure the following is useless as I think I know how it's 
working, but as I'm preparing a major change in our infrastructure, I'd 
rather be sure and not mess up. And also to be sure.

(Just to be sure)

For some reasons, and for the first time in our infra., one of our new 
DC will temporarily include heterogeneous hosts : some networks will be 
available only on some of them.


Could someone please confirm that with every load balancing / VM 
startup / VM migration / host choice, oVirt will smartly choose an 
available host equipped with the adequate networks?


--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QGX3PHA4T3SXXDTYZ4VGY6UHECO7P6V5/

[ovirt-users] Fencing : SSL or not?

2019-02-18 Thread Nicolas Ecarnot


Hello,

As fence_idrac has never worked for us, and as fence_ipmilan has worked 
nicely for years, we are using fence_ipmilan with the lanplus=1 option 
and we're happy with it.


We upgraded to 4.3.0.4 and we're witnessing that we cannot fence our 
hosts anymore :


2019-02-18 09:42:08,678+01 ERROR 
[org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-11) 
[2f78ed99-6703-4d92-b7cb-948c2d24b623] Can not run fence action on host 
'x', no suitable proxy host was found.


and on the SPM :

fence_ipmilan: Failed: Unable to obtain correct plug status or plug is 
not available


I found the suggested workaround here :

https://access.redhat.com/solutions/3349841

but no combination of
- lanplus={0,1}
- -z
- ssl=={0,1}

led to a solution.

The package version is the same as what's described in the KB :
fence-agents-rhevm-4.2.1-11.el7_6.7.x86_64

What should I test now?

Thank you.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SEUAZ6JB6CIYY2GOBNJN2XSWOSH6DHDJ/

[ovirt-users] Re: Forum available

2019-02-08 Thread Nicolas Ecarnot


On 08/02/2019 at 09:05, Josep Manel Andrés Moscardó wrote:

Hi all,
I am just wondering if anyone like me would like to have everything that 
is bump here in a forum, with all the benefits it brings


Absolutely.

Digging through mail archives is sometimes painful.

(and people 
will still be able to subscribe and reply through email). Something like 
Discourse would be nice in my opinion.


Best.


List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TUU357HINGWFA23T3SMKDVTM7EKLX6VS/




--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/H427EVNMN3NZHB7NGW4Z62IOPRIGFNGP/

[ovirt-users] Re: Bug in the web interface?

2019-02-06 Thread Nicolas Ecarnot

On 06/02/2019 at 15:42, Greg Sheremeta wrote:
On Wed, Feb 6, 2019 at 6:33 AM Nicolas Ecarnot <nico...@ecarnot.net> wrote:

On 06/02/2019 at 10:53, Lucie Leistnerova wrote:
>
> On 2/6/19 10:22 AM, Nicolas Ecarnot wrote:
>> Hi Lucie,
>>
>> On 06/02/2019 at 10:02, Lucie Leistnerova wrote:
>>> I'm sorry, my mistake I did not mention to remove the package without
>>> dependencies.

Same -- sorry, ugh.
For anyone in the same situation, the better thing to do now is simply 
'yum update ovirt-engine-ui-extensions'

That will remove the old dashboard correctly.
https://github.com/oVirt/ovirt-engine-ui-extensions/blob/master/packaging/spec.in#L16

Thank you. We need this kind of wheels greasing as oVirt's complexity 
increases.

To sum up, I think what I'm missing is clear and solid documentation
or an official Red Hat message about whether/what/how/when we can or cannot
update (with "yum update") the engine host and/or the hosts.

Not Red Hat -- oVirt :)

Yep, Greg Sheremeta  ;-)

Indeed, we need an Upgrade Guide update. I'll look into it.

Generally, on my dev instances (which are probably nowhere near as 
complicated as your setups), I run 'yum update' followed by 
'engine-setup'.

Actually, my experience is that yum-upgrading the engine was harmless most of 
the time, but yum-upgrading the hosts led to complex situations.

I'm at a point where I no longer update my hosts with yum update, and 
rely only on oVirt's update (either via the web GUI or ansible's 
cluster upgrade), which only updates part of the packages.

I'd rather have a strong enough RPM environment around oVirt preventing 
any issue (the version lock usage shows that it's already a concern 
oVirt's people are dealing with and I thank you. Keep strengthening.)

--

Nicolas ECARNOT

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TQAYEZSGMLQCWFJTMAUERABCUNYWG3N6/

[ovirt-users] Re: Bug in the web interface?

2019-02-06 Thread Nicolas Ecarnot


On 06/02/2019 at 10:53, Lucie Leistnerova wrote:


On 2/6/19 10:22 AM, Nicolas Ecarnot wrote:

Hi Lucie,

On 06/02/2019 at 10:02, Lucie Leistnerova wrote:
I'm sorry, my mistake I did not mention to remove the package without 
dependencies.


rpm -e --nodeps ...


I'll write that down.



When looking at the log file above
(https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=) 
[...]
"/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py", 



The error is caused by the missing ovirt-engine-dbscripts.


OK

Well, I thought I had messed up the packages, and I thought a complete 
yum update would help, as I read :

On 05/02/2019 at 15:19, Greg Sheremeta wrote:



The fix is pushed. Standalone engine upgrades should be fine starting
now. `yum update` any appliance engines or already upgraded 
engines to get the latest ovirt-engine-ui-extensions, which fixes 
the problem.


So I ran a yum update.

This package is part of ovirt-engine versionlock so can't be 
installed/updated separately.
engine-setup should install the missing packages. I tried it by 
myself and it fixed the issue.


   [install] 
ovirt-engine-dbscripts-4.3.0.5-0.0.master.20190205084851.gitaaebfc9.el7.noarch 
will be installed


I see I have this package, though in an older version :
# rpm -qa|grep -i dbscripts
ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch

The version shouldn't be a problem. I tested it in u/s ovirt. Now I tried 
with the same version.


Try to remove that package and install again. Versionlock seems to 
differ here so I was able to install it separately, if not run 
engine-setup.


# rpm -e --nodeps ovirt-engine-dbscripts


Indeed, it found a lot of missing files/dir.



# yum install ovirt-engine-dbscripts


I forgot to set LANG=C, so you'll read some parts in French, but you get 
the idea :



root@mvm01:/tmp# yum install ovirt-engine-dbscripts
Modules complémentaires chargés : fastestmirror, versionlock
Loading mirror speeds from cached hostfile
 * base: centos.mirror.fr.planethoster.net
 * epel: pkg.adfinis-sygroup.ch
 * extras: ftp.pasteur.fr
 * ovirt-4.3: ovirt.repo.nfrance.com
 * ovirt-4.3-epel: pkg.adfinis-sygroup.ch
 * updates: centos.mirror.fr.planethoster.net
Excluding 1 update due to versionlock (use "yum versionlock status" to 
show it)

Résolution des dépendances
--> Lancement de la transaction de test
---> Le paquet ovirt-engine-dbscripts.noarch 0:4.3.0.4-1.el7 sera installé
--> Résolution des dépendances terminée

Dépendances résolues

================================================================================
 Package                  Architecture   Version          Dépôt        Taille
================================================================================
Installation :
 ovirt-engine-dbscripts   noarch         4.3.0.4-1.el7    ovirt-4.3    331 k

Résumé de la transaction
================================================================================
Installation   1 Paquet

Taille totale des téléchargements : 331 k
Taille d'installation : 1.6 M
Is this ok [y/d/N]: y
Downloading packages:
ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch.rpm            | 331 kB   00:00:02
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Avertissement : RPMDB a été modifiée par une autre application que yum.
** 1 problèmes RPMDB préexistants trouvés, la sortie de « yum check » est la suivante :
ovirt-engine-4.3.0.4-1.el7.noarch a des dépendances manquantes de ovirt-engine-dbscripts = ('0', '4.3.0.4', '1.el7')
  Installation : ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch                1/1
  Vérification : ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch                1/1

Installé :
  ovirt-engine-dbscripts.noarch 0:4.3.0.4-1.el7

Terminé !

-

After that, I ran again engine-setup and it went OK.
Now, my ovirt DC and dashboard is back to life, thanks to you Lucie.
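
For anyone wanting to catch this state before running engine-setup, here is a small sketch that checks a package list for the situation "yum check" complained about: ovirt-engine installed, but ovirt-engine-dbscripts missing or at another version. It assumes the list comes from "rpm -qa" in the usual name-version-release.arch form; the function names and the idea of a fixed "required" tuple are mine, not something oVirt ships.

```python
def split_nvra(package):
    """Split 'name-version-release.arch' into (name, version, release.arch)."""
    name, version, release_arch = package.rsplit("-", 2)
    return name, version, release_arch


def check_subpackages(installed, base="ovirt-engine",
                      required=("ovirt-engine-dbscripts",)):
    """Return {package: problem} for required packages not matching the base."""
    by_name = {}
    for package in installed:
        name, version, release_arch = split_nvra(package)
        by_name[name] = (version, release_arch)
    if base not in by_name:
        raise ValueError(f"{base} is not installed")
    problems = {}
    for name in required:
        if name not in by_name:
            problems[name] = "missing"
        elif by_name[name] != by_name[base]:
            problems[name] = "version differs from " + base
    return problems


if __name__ == "__main__":
    import subprocess

    # Assumes an RPM-based host where "rpm -qa" is available.
    listing = subprocess.run(["rpm", "-qa"], capture_output=True,
                             text=True, check=True).stdout.split()
    print(check_subpackages(listing) or "ovirt-engine-dbscripts matches ovirt-engine")
```

It only compares the two packages named in this thread; other ovirt-engine-* packages follow their own versioning and are deliberately left out.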

To sum up, I think what I'm missing is clear and solid documentation 
or an official Red Hat message about whether/what/how/when we can or cannot 
update (with "yum update") the engine host and/or the hosts.


??

--
Nicolas ECARNOT

[ovirt-users] Re: Bug in the web interface?

2019-02-06 Thread Nicolas Ecarnot


Hi Lucie,

On 06/02/2019 at 10:02, Lucie Leistnerova wrote:
I'm sorry, my mistake I did not mention to remove the package without 
dependencies.


rpm -e --nodeps ...


I'll write that down.



When looking at the log file above
(https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=) 
[...]
"/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py", 

The error is caused by the missing ovirt-engine-dbscripts.


OK

Well, I thought I had messed up the packages, and I thought a complete yum
update would help, as I read :

On 05/02/2019 at 15:19, Greg Sheremeta wrote:



The fix is pushed. Standalone engine upgrades should be fine starting
now. `yum update` any appliance engines or already upgraded engines 
to get the latest ovirt-engine-ui-extensions, which fixes the problem.


So I ran a yum update.

This package is part of ovirt-engine versionlock so can't be 
installed/updated separately.
engine-setup should install the missing packages. I tried it by myself 
and it fixed the issue.


   [install] 
ovirt-engine-dbscripts-4.3.0.5-0.0.master.20190205084851.gitaaebfc9.el7.noarch 
will be installed


I see I have this package, though in an older version :
# rpm -qa|grep -i dbscripts
ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch



Not sure what went wrong on your side; please send the setup log and the


>> 
(https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=)


ovirt-engine* rpms list. And also result of 'ls 
/usr/share/ovirt-engine/dbscripts'


# LANG=C ls -la /usr/share/ovirt-engine/dbscripts
ls: cannot access /usr/share/ovirt-engine/dbscripts: No such file or 
directory


You seem to have hit the point.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DA3RSDSTLAHWDCIAZNAGRUMKFHT7Y2GN/

[ovirt-users] Re: Bug in the web interface?

2019-02-06 Thread Nicolas Ecarnot

On 05/02/2019 at 15:19, Greg Sheremeta wrote:


The fix is pushed. Standalone engine upgrades should be fine starting 
now. `yum update` any appliance engines or already upgraded engines to 
get the latest ovirt-engine-ui-extensions, which fixes the problem.


So I ran a yum update.

After running engine-setup again, it fails the same way.
I compared the complete rpm list with another 4.3 DC that has no issue, and
apart from the removed ovirt-engine-dashboard package and obviously many
upgraded packages, I see no obvious missing parts.
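
For anyone repeating this kind of comparison, here is a minimal Python sketch. It assumes each engine's package names were dumped one per line (for instance with rpm -qa --qf '%{NAME}\n'); the function names and the sample data are illustrative and not taken from the thread.

```python
def parse_rpm_names(listing):
    """Return the set of package names, expecting one name per line."""
    return {line.strip() for line in listing.splitlines() if line.strip()}


def missing_packages(reference_listing, broken_listing):
    """Names present on the reference engine but absent on the broken one."""
    reference = parse_rpm_names(reference_listing)
    broken = parse_rpm_names(broken_listing)
    return sorted(reference - broken)


if __name__ == "__main__":
    # Hypothetical sample data, not the real package lists from this thread.
    healthy = "ovirt-engine\novirt-engine-dbscripts\novirt-engine-ui-extensions\n"
    broken = "ovirt-engine\novirt-engine-ui-extensions\n"
    print(missing_packages(healthy, broken))
```

Comparing names only (without versions) keeps the many upgraded packages from drowning out a package that is simply absent.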


I'm at a loss and don't know how to save this DC, so any help is welcome.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7QT44H4DEIZPZVMBO6UPRQ6GZWAKWP3S/

[ovirt-users] Re: [4.3.0] VNC Virt-viewer console not opening

2019-02-05 Thread Nicolas Ecarnot


Hello Greg,

On 04/02/2019 at 21:13, Greg Sheremeta wrote:

When I try to use Spice instead of VNC, it is working nicely.


My goal is to stick to VNC.


When I try to use noVNC, the additional tab opens and shows
"Unsupported
security types: 19"


Looks like https://bugzilla.redhat.com/show_bug.cgi?id=1659155

Can you try disabling vnc security on the cluster and then reboot the host?


VNC security is already disabled.


What could I give to help you help me?

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ARBA5SBJLY3QS73XYRJYQ7F7TZJ5KOYT/

[ovirt-users] [4.3.0] VNC Virt-viewer console not opening

2019-02-04 Thread Nicolas Ecarnot


Hello,

First, congratulations to all of you who worked for this 4.3.0 release, 
and obviously thank you.


Today, I upgraded 4 oVirt setups (4 DCs) from 4.2.7 to 4.3.0.
It went well on all 4 DCs.

But on one of them, when I try to open a console, the window opens and
closes immediately.


I'm using Firefox 64.0 with Ubuntu 18.10, and all my VMs are set up like
this :

- video type : QXL
- Gfx protocol : VNC
- VNC Kbd layout : fr
and I'm using virt-viewer

On the problematic DC, all the VMs are showing the same issue.

When I try to use Spice instead of VNC, it is working nicely.
When I try to use noVNC, the additional tab opens and shows "Unsupported 
security types: 19"


I tried to track down this issue thanks to the firefox dev console, but 
it's beyond my understanding.


Trying the same with Chromium does the same blinking open/close.

I'd like to provide additional debug messages, but
/var/log/ovirt-engine/engine.log does not give any useful hint :

2019-02-04 16:57:04,150+01 INFO
[org.ovirt.engine.core.bll.SetVmTicketCommand] (default task-24)
[1fb01d42] Running command: SetVmTicketCommand internal: false. Entities
affected :  ID: 0c3e02b3-7fec-4bb1-b3d6-2e6c228e7278 Type: VMAction group
CONNECT_TO_VM with role type USER
2019-02-04 16:57:04,155+01 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SetVmTicketVDSCommand]
(default task-24) [1fb01d42] START, SetVmTicketVDSCommand(HostName =
hv01.prd.sdis38.fr,
SetVmTicketVDSCommandParameters:{hostId='687c1c01-a5e1-449c-89d2-9713ccfc2487',
vmId='0c3e02b3-7fec-4bb1-b3d6-2e6c228e7278', protocol='VNC',
ticket='IivrpGHx5zSw', validTime='120', userName='admin',
userId='4a340386-851a-11e8-863d-3417ebeef1af', disconnectAction='NONE'}),
log id: 2a897f30
2019-02-04 16:57:04,188+01 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SetVmTicketVDSCommand]
(default task-24) [1fb01d42] FINISH, SetVmTicketVDSCommand, return: ,
log id: 2a897f30
2019-02-04 16:57:04,211+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-24) [1fb01d42] EVENT_ID: VM_SET_TICKET(164), User
admin@internal-authz initiated console session for VM ad02.ctat.sdis38.fr

What could I give to help you help me?

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KGCM25ILBTQTY6NLVJUDE7CNF5C5BRE7/

[ovirt-users] Re: The admin portal ui should be more simplified

2019-01-10 Thread Nicolas Ecarnot


On 10/01/2019 at 15:13, fle...@hotmail.com wrote:

We have an RHV setup of 11 datacenters, 11 clusters, 40 hosts and 300 VMs.
The 4 of us administrators are suffering from the new 4.2 UI's lack of active
area. The manipulation logic also confuses us.
A simple operation needs more clicks than before.
Please just make the UI simpler.
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ETR6Q5YWUFTF6Y6RN6SHEAURJBK7OGOQ/



Hello,

Would it be wise to suggest two clever ways to deal with complexity :

- ManageIQ
- Ansible

We use them both, and are quite happy with them.

Regards,

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MSVKUQMBBXUOVOAWE5FICFL5MACXWERT/

[ovirt-users] Re: Trouble connecting to IDRAC7

2018-08-01 Thread Nicolas Ecarnot


On 01/08/2018 at 15:28, Jayme wrote:
I just enabled power management/fencing successfully on two of my hosts
(Dell PowerEdge R720s with iDRAC 7) but am failing to add the third.  I
enter the IP and user/pass like the others, it takes 15 seconds or so,
then spits out "Test Failed: Internal JSON-RPC error"


I tried resetting the IDRAC on that server.  I can also ping it and 
access it fine in a web browser.  I can ping it from the host as well.


Is there any configuration in IDRAC that could be blocking the fence 
attempt or any logs in oVirt I can look at to figure out what might be 
happening with the connection?


I see there is a "fence_idrac" command on the hosts but unsure what 
switches to use with it to test.


Thanks!


List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UJQDE3W6NSZWLMSZJQZD7OZM4CYEMNKI/



Hello Jayme,

All our iDrac are successfully power-managed this way :

type : ipmilan
options : lanplus=1

In the Drac, we use a dedicated user with the appropriate rights.
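
To test the same settings from a host's command line, the sketch below builds a read-only status query for fence_ipmilan, the agent behind the "ipmilan" type. It assumes the fence-agents long options --ip, --username, --password, --lanplus and --action; the address and credentials are placeholders, not values from this thread.

```python
def fence_status_command(address, username, password):
    """Build a fence_ipmilan status query matching type=ipmilan, lanplus=1."""
    return [
        "fence_ipmilan",
        "--ip={}".format(address),
        "--username={}".format(username),
        "--password={}".format(password),
        "--lanplus",        # command-line counterpart of the lanplus=1 option
        "--action=status",  # only reads the power state, does not fence
    ]


if __name__ == "__main__":
    # Placeholder values: replace them with the iDRAC address and the
    # dedicated fencing user before running the printed command on a host.
    print(" ".join(fence_status_command("192.0.2.10", "fenceuser", "secret")))
```

Note that a password passed on the command line is visible in the process list while the command runs.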

HTH

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FTT6IBBAONVMLWWHDW3W76KWT433AYQ2/

[ovirt-users] [No question] NFS disabled, hosts wandering tearful

2018-08-01 Thread Nicolas Ecarnot

Hello,

This is a simple testimony about what happened yesterday in one of our DCs.
This DC runs on a dedicated bare-metal engine, oversized compared to the
need, thus I've added an NFS service on it to host a small storage domain
and the ISO storage domain.
Yesterday, after having received the colorful announcement about the 4.2.5
version, I decided to upgrade.
As our engine was still on CentOS 7.4, I first upgraded its OS version
to 7.5, then rebooted. Smooth.

Then I followed the very usual oVirt engine upgrade path. Smooth.
Eventually, I upgraded the hosts with ovirt-ansible-cluster-upgrade as
usual.

The result was frightening because the hosts were put in maintenance,
upgraded, back to life, seen unavailable, unreachable, connecting,
alive, rebooted, then back to another turn and looping...
During this, the SPM role was obviously jumping around, and that did not
help the debugging.

In the end, it appeared that something during an upgrade stopped and
disabled the NFS service. My hosts partially relied on it, so after
having restarted the NFS service, all came back to life.
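
A check like the following could be run after an OS upgrade to catch a service that was silently stopped or disabled. This is only a sketch: it assumes a systemd host and that the NFS server unit is called nfs-server, which should be verified locally. The run parameter exists only so the function can be exercised without systemctl.

```python
import subprocess


def unit_state(unit, run=subprocess.run):
    """Return (enabled_state, active_state) as reported by systemctl."""
    states = []
    for query in ("is-enabled", "is-active"):
        result = run(
            ["systemctl", query, unit],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
        # systemctl prints the state on stdout even when its exit code is non-zero.
        states.append(result.stdout.strip() or "unknown")
    return tuple(states)


def needs_attention(unit, run=subprocess.run):
    """True when the unit is not both enabled at boot and currently running."""
    enabled, active = unit_state(unit, run)
    return enabled != "enabled" or active != "active"
```

On the engine, needs_attention("nfs-server") returning True right after the upgrade would have pointed at the cause before the hosts started looping.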

The NFS disabling may come from the CentOS upgrade, except if someone
tells me it could come from something on the oVirt side?

I'm sure the RH people will advise me not to run NFS on the engine, but
apart from this event, I have had no trouble doing this in years.

Regards,

[ovirt-users] Re: Is enabling Epel repo will break the installation?

2018-07-23 Thread Nicolas Ecarnot


On 23/07/2018 at 15:33, Arman Khalatyan wrote:

Hello,
As I remember some time ago the epel collectd was in conflict with the
ovirt one.
Is it still the case?
Thanks,
Arman.
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/S4SYV6L5EIW36B3CIR7VWA42FNJCDCUG/



Hello,

With a recent 4.2.4.5-1.el7 it was still the case...

I just excluded collectd from epel.repo and it was OK.
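
One way to make that exclusion is sketched below in Python. It assumes the repository section is named [epel] and that the pattern collectd* is what needs excluding; neither detail is stated above. It relies on yum's per-repository exclude= option.

```python
import configparser
import io


def exclude_from_repo(repo_text, section, pattern):
    """Return the repo file text with `pattern` added to the section's exclude list."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(repo_text)
    current = parser.get(section, "exclude", fallback="").split()
    if pattern not in current:
        current.append(pattern)
    parser.set(section, "exclude", " ".join(current))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical, shortened repo definition used only for illustration.
    sample = "[epel]\nname=Extra Packages for Enterprise Linux 7\nenabled=1\n"
    print(exclude_from_repo(sample, "epel", "collectd*"))
```

configparser drops comments when it rewrites the text, so adding the exclude= line by hand in /etc/yum.repos.d/epel.repo is just as valid.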

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GYZPPUBDSNGKKUYANCEHRRCOHKPUY24N/

[ovirt-users] Host reboot failed

2018-07-13 Thread Nicolas Ecarnot


Hello,

[oVirt 4.2.4.5-1.el7]

Sequence :
- Among 7 active UP hosts, one of them runs zero VM
- On this (still in UP state) host, I run an SSH restart via the web GUI
- The host gracefully shuts down then reboots, with no issue
- In the web GUI, as in real life, the host stays in Reboot state forever

At this point, the engine can ping it and can SSH-connect to it; the host
seems to have zero issue.


In the web GUI, I cannot put it into active state because it is not in
maintenance state. It stays in reboot state.
Nor can I put it into maintenance state, because it stays in reboot
state.


This state lasts long enough to allow me to type this mail and look into
logs, and as I was about to send logs, I see the host is returning to life
(its state comes back as UP).
I don't type fast, so after the host has finished rebooting, maybe 5 or
10 minutes have passed before the engine links again to the host.


Before posting additional logs and comments, does anybody know if this
is a known bug or behavior, or do I have to open a BZ?


Regards,

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/B5HCXSJR57LQ2SNRFK4POUIX7Z2DX2S6/

[ovirt-users] Re: Lost host after upgrade/reboot

2018-06-19 Thread Nicolas Ecarnot


On 19/06/2018 at 10:14, Nicolas Ecarnot wrote:
In this engine log above, you see that I'm using my account to manage
this engine, as I have been doing for years with no issue.
I'll try the exact same path with admin@internal to see what could
change, but I don't see the link.


I just tried on another host, using admin@internal, and the same issue 
occurred.



What other logs could I give you to debug this?

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Q2KI7OJKUYJLZ3MQU5LPBQW77A5A4YOX/

[ovirt-users] Lost host after upgrade/reboot

2018-06-19 Thread Nicolas Ecarnot


Hello,

TL;DR : engine stops talking with rebooted host.


[oVirt 4.2.3.5-1.el7.centos]

- From the web GUI, upgrading a host, leaving the reboot checkbox checked
- upgrade is OK (/var/log/yum.log shows successful updates, and the
Ansible host deploy log is also OK)

- reboot is OK (clean, SSH OK...)
- the host eventually appears as "Install failed"
- engine.log says :


2018-06-19 10:02:24,896+02 ERROR
[org.ovirt.engine.core.bll.SshHostRebootCommand]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] SSH
reboot command failed on host 'serv-hv-prds06': SSH session timeout
host 'root@ serv-hv-prds06' Stdout: Stderr:
2018-06-19 10:02:25,028+02 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac]
EVENT_ID: SYSTEM_FAILED_SSH_HOST_RESTART(198), A restart using SSH
initiated by the engine to Host serv-hv-prds06 has failed.
2018-06-19 10:02:25,185+02 INFO
[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac]
START, SetVdsStatusVDSCommand(HostName = serv-hv-prds06,
SetVdsStatusVDSCommandParameters:{hostId='9c1566a4-8432-4de6-b30d-fd3b8e5fafca',
status='InstallFailed', nonOperationalReason='NONE',
stopSpmFailureLogged='false', maintenanceReason='null'}), log id:
833f9bd
2018-06-19 10:02:25,191+02 INFO
[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac]
FINISH, SetVdsStatusVDSCommand, log id: 833f9bd
2018-06-19 10:02:25,191+02 ERROR
[org.ovirt.engine.core.bll.hostdeploy.UpgradeHostInternalCommand]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac]
Engine failed to restart via ssh host 'serv-hv-prds06'
('9c1566a4-8432-4de6-b30d-fd3b8e5fafca') after upgrade
2018-06-19 10:02:25,256+02 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7)
[8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID:
HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06
(User: necar...@sdis.isere.fr@SDIS38-authz).
2018-06-19 10:02:30,755+02 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-69)
[8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID:
HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06
(User: necar...@sdis.isere.fr@SDIS38-authz).


- Manually activating the host puts it back on track without issue
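
To pull only the failing entries out of a long engine.log, a small filter such as the sketch below can help. It assumes each entry sits on a single line in the file, in the "timestamp LEVEL [logger] message" layout seen in the excerpt above; the regular expression is my own reading of that layout, not an official format description.

```python
import re

# Timestamp, level, logger in brackets, then the rest of the entry.
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+(?P<rest>.*)$"
)


def error_entries(lines):
    """Yield (timestamp, logger, message) for each ERROR entry."""
    for line in lines:
        match = ENTRY.match(line.rstrip("\n"))
        if match and match.group("level") == "ERROR":
            yield match.group("ts"), match.group("logger"), match.group("rest")
```

Typical use would be to open /var/log/ovirt-engine/engine.log and print the tuples from error_entries(log_file), then search the full log for the bracketed correlation ID that the errors share.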

The SSH communications between the engine and the host are usually
very sound (VM migrations, maintenance...).


On this oVirt DC, I reproduced this issue twice on 2 different hosts.

In this engine log above, you see that I'm using my account to manage
this engine, as I have been doing for years with no issue.
I'll try the exact same path with admin@internal to see what could
change, but I don't see the link.


What other logs could I give you to debug this?

Regards,

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CT5KHY3C2ASOXBVNUIEBG5WA42JKJGXH/

[ovirt-users] Re: Hosts : Upgrade failed - 4.2.3

2018-05-17 Thread Nicolas Ecarnot


On 16/05/2018 at 12:55, Fred Rolland wrote:

It looks like you still have 4.1 repos...


Yes.

I thought Ansible was in charge of disabling older repos.

That does not seem to be the case, does it?

--
Nicolas ECARNOT

[ovirt-users] Hosts : Upgrade failed - 4.2.3

2018-05-16 Thread Nicolas Ecarnot


Hello,

I was on 4.2.2 and it failed.
I upgraded to 4.2.3 and it's still failing.

From the GUI, I switch one host into maintenance mode, try to upgrade 
it, and it is failing.


On the engine, the engine.log is not saying anything helpful.
But still on the engine, in
/var/log/ovirt-engine/host-deploy/ovirt-host-mgmt-ansible-20180516121013-xxx-dacf1972-f184-4d01-a863-7974579e6bc8.log,
I see :



http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.8/repodata/repomd.xml:
 [Errno 14] HTTP Error 404 - Not Found
Essai d'un autre miroir.
To address this issue please refer to the below wiki article 


https://wiki.centos.org/yum-errors

If above article doesn't help to resolve this issue please use 
https://bugs.centos.org/.

http://mirror.centos.org/centos/7/virt/x86_64/ovirt-4.1/repodata/repomd.xml: 
[Errno 14] HTTP Error 404 - Not Found
Essai d'un autre miroir.


This is French, but I'm sure you understand that it translates into
"gluster repo issue".


Is there something I could do?
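
Since the two 404 errors above point at the gluster-3.8 and ovirt-4.1 repository paths, one thing worth checking on the host is which enabled repo definitions still reference them. A minimal Python sketch, assuming standard .repo files and treating those two path fragments as the markers of a stale repository:

```python
import configparser


def stale_repos(repo_text, markers=("ovirt-4.1", "gluster-3.8")):
    """Return names of enabled repo sections whose URLs contain a marker."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(repo_text)
    stale = []
    for section in parser.sections():
        # yum treats a repository as enabled when the option is absent.
        enabled = parser.get(section, "enabled", fallback="1").strip() == "1"
        urls = " ".join(
            parser.get(section, key, fallback="")
            for key in ("baseurl", "mirrorlist")
        )
        if enabled and any(marker in urls for marker in markers):
            stale.append(section)
    return stale
```

Running it over the text of each file in /etc/yum.repos.d/ lists the sections to disable (enabled=0) or remove before retrying the upgrade from the GUI.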

Thank you.

--
Nicolas ECARNOT

[ovirt-users] Why RAW images when using GlusterFS?

2018-04-05 Thread Nicolas Ecarnot


Hello,

Amongst others, I have one 3.6 DC that has been working very well for years,
all based on GlusterFS.
When having a close look (qemu-img info) at the images, I see their
format is all RAW and not QCOW2.
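
To survey many images at once, the format can be read from qemu-img's JSON output rather than by eye. A minimal Python sketch, assuming qemu-img is on the PATH and supports info --output=json; the helper names are mine:

```python
import json
import subprocess


def image_format(info_json):
    """Return the 'format' field from `qemu-img info --output=json` text."""
    return json.loads(info_json)["format"]


def qemu_img_info(path):
    """Run qemu-img on `path` and return its JSON report as text."""
    return subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
```

image_format(qemu_img_info(path)) then yields "raw" or "qcow2" for each volume path under the storage domain mount.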


I never noticed or bothered before, but I'm wondering :
- is it by design?
- is it something we can change (I'd prefer qcow2)?
- are there some limitations?

And finally, I have the same questions about NFS storage domains.

Thank you.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

Re: [ovirt-users] VM has been paused due to NO STORAGE SPACE ERROR ?!?!?!?!

2018-03-16 Thread Nicolas Ecarnot


On 16/03/2018 at 15:48, Alex Crow wrote:

On 16/03/18 13:46, Nicolas Ecarnot wrote:

On 16/03/2018 at 13:28, Karli Sjöberg wrote:



On 16 March 2018 at 12:26, Enrico Becchetti
<enrico.becche...@pg.infn.it> wrote:


   Dear All,
    Has anyone seen that error?


Yes, I experienced it dozens of times on 3.6 (my 4.2 setup has 
insufficient workload to trigger such event).

And in every case, there was no actual lack of space.


    Enrico Becchetti Servizio di Calcolo e Reti
I think I remember something to do with thin provisioning and not 
being able to grow fast enough, so out of space. Are the VM's disk 
thick or thin?


All our storage domains are thin-prov. and served by iSCSI (Equallogic 
PS6xxx and 4xxx).


Enrico, do you know if a bug has been filed about this?

Did the VM remain paused? In my experience the VM just gets temporarily 
paused while the storage is expanded. RH confirmed to me in a ticket 
that this is expected behaviour.


AFAIR, most of them went back up and running by themselves (we had to
manually resume some of them from time to time).

The storage side weakness is an interesting trail to follow.
We also experienced this behavior when migrating lots of VMs at once, 
yet using a dedicated storage network.


Having been on this mailing list for a long time, I remember we have
discussed several times how sensitive oVirt can appear to storage
latencies. On my side, the site where most of our workload resides is
still on 3.6, so I cannot yet witness the efforts the oVirt devs have
made to cope with this in 4.2, but I'm sure they did.


--
Nicolas ECARNOT

Re: [ovirt-users] VM has been paused due to NO STORAGE SPACE ERROR ?!?!?!?!

2018-03-16 Thread Nicolas Ecarnot


On 16/03/2018 at 13:28, Karli Sjöberg wrote:



On 16 March 2018 at 12:26, Enrico Becchetti <enrico.becche...@pg.infn.it> wrote:

   Dear All,
Has anyone seen that error?


Yes, I experienced it dozens of times on 3.6 (my 4.2 setup has 
insufficient workload to trigger such event).

And in every case, there was no actual lack of space.


Enrico Becchetti Servizio di Calcolo e Reti
I think I remember something to do with thin provisioning and not being 
able to grow fast enough, so out of space. Are the VM's disk thick or thin?


All our storage domains are thin-prov. and served by iSCSI (Equallogic 
PS6xxx and 4xxx).


Enrico, do you know if a bug has been filed about this?

--
Nicolas ECARNOT

Re: [ovirt-users] firewall node

2018-03-09 Thread Nicolas Ecarnot


https://www.mail-archive.com/users@ovirt.org/msg46608.html


On 09/03/2018 at 20:12, Fabrice SOLER wrote:

Hello,

I am trying to open a port on the node.

For that, in the cluster configuration I have chosen firewalld, and I have
created the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file:

- name: Enable additional port on firewalld
  firewalld:
    port: "12345/tcp"
    permanent: yes
    immediate: yes
    state: enabled

Then I rebooted the node as described at this link:

https://www.ovirt.org/blog/2017/12/host-deploy-customization/

On the node, after the reboot, I read the iptables (iptables -L) and
the port is not open.

I have just updated the engine and the node is 4.2.1.1.

Is there some change about firewalld in this version? (in 4.2.0 it
worked)

Sincerely

--






--
Nicolas ECARNOT

Re: [ovirt-users] Power off VM from VM portal

2018-03-07 Thread Nicolas Ecarnot


On 07/03/2018 at 13:42, Alexandr Krivulya wrote:



On 06.03.2018 at 17:39, Nicolas Ecarnot wrote:

On 06/03/2018 at 16:02, Alexandr Krivulya wrote:

Hi,

is there any way to power off VM from VM portal (4.2.1.7)? I can't 
find "power off" button, just "shutdown".





Hello Alexandr,

After having clicked on the VM link, you'll notice that on the right 
of the Shutdown button is an arrow allowing you to access to the Power 
Off feature.


I can't find this arrow on the Shutdown button







Oh sorry, I answered in the context of the admin portal.
Indeed, in the VM portal, I don't see this power-off button either.

--
Nicolas ECARNOT

Re: [ovirt-users] Power off VM from VM portal

2018-03-06 Thread Nicolas Ecarnot


On 06/03/2018 at 16:02, Alexandr Krivulya wrote:

Hi,

is there any way to power off VM from VM portal (4.2.1.7)? I can't find 
"power off" button, just "shutdown".





Hello Alexandr,

After having clicked on the VM link, you'll notice that to the right of
the Shutdown button there is an arrow giving you access to the Power Off
feature.


Regards,

--
Nicolas ECARNOT

[ovirt-users] Importing VM fails with "No space left on device"

2018-03-06 Thread Nicolas Ecarnot


Hello,

When importing a VM, I'm facing the known bug :
https://access.redhat.com/solutions/2770791

QImgError: ecode=1, stdout=[], stderr=['qemu-img: error while writing 
sector 93569024: No space left on device'


The difference between my case and what is described in the RH webpage 
is that I have no "Failed to flush the refcount block cache".


Here is what I see :


ecfbd1a4-f9d2-463a-ade6-def5bd217b43::DEBUG::2018-03-06 09:57:36,460::utils::718::root::(watchCmd) FAILED: <err> = ['qemu-img: error while writing sector 205517952: No space left on device']; <rc> = 1
ecfbd1a4-f9d2-463a-ade6-def5bd217b43::ERROR::2018-03-06 09:57:36,460::image::865::Storage.Image::(copyCollapsed) conversion failure for volume ac08bc8d-1eea-449a-a102-cf763c6726c8
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 860, in copyCollapsed
    volume.fmt2str(dstVolFormat))
  File "/usr/lib/python2.7/site-packages/vdsm/qemuimg.py", line 207, in convert
    raise QImgError(rc, out, err)
QImgError: ecode=1, stdout=[], stderr=['qemu-img: error while writing sector 205517952: No space left on device'], message=None
ecfbd1a4-f9d2-463a-ade6-def5bd217b43::ERROR::2018-03-06 09:57:36,461::image::878::Storage.Image::(copyCollapsed) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 866, in copyCollapsed
    raise se.CopyImageError(str(e))
CopyImageError: low level Image copy failed: ("ecode=1, stdout=[], stderr=['qemu-img: error while writing sector 205517952: No space left on device'], message=None",)


I followed the advice in the RH webpage (check whether the figures
agree between the qemu-img sizes and the metadata file), and they
seem to be correct :


root@serv-hv-adm30:/etc# qemu-img info 
/rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr\:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8
image: 
/rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8

file format: qcow2
virtual size: 98G (105226698752 bytes)
disk size: 97G
cluster_size: 65536
Format specific information:
compat: 0.10
refcount bits: 16

root@serv-hv-adm30:/etc# cat 
/rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr\:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8.meta 
DOMAIN=be2878c9-2c46-476b-bfae-8b02a4679022

CTIME=1520318755
FORMAT=COW
DISKTYPE=1
LEGALITY=LEGAL
SIZE=205520896
VOLTYPE=LEAF
DESCRIPTION=
IMAGE=a5d68d88-3b54-488d-a61e-7995a1906994
PUUID=----
MTIME=0
POOL_UUID=
TYPE=SPARSE
EOF


So I don't see what's wrong?
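
The comparison of the two figures can be written out as a short Python check. It assumes that SIZE in the .meta file counts 512-byte sectors, which is what makes the numbers above line up exactly.

```python
SECTOR_BYTES = 512  # assumption: SIZE in the .meta file is in 512-byte sectors

META_SIZE_SECTORS = 205520896      # SIZE= from the .meta file
QEMU_VIRTUAL_BYTES = 105226698752  # "virtual size" reported by qemu-img info
FAILING_SECTOR = 205517952         # sector named in the qemu-img error


def meta_matches_qemu(sectors, virtual_bytes, sector_bytes=SECTOR_BYTES):
    """True when the metadata size equals the qcow2 virtual size."""
    return sectors * sector_bytes == virtual_bytes


if __name__ == "__main__":
    print(meta_matches_qemu(META_SIZE_SECTORS, QEMU_VIRTUAL_BYTES))  # True
    # The failing write is inside the image, 2944 sectors before its end.
    print(META_SIZE_SECTORS - FAILING_SECTOR)  # 2944
```

So the metadata and the image agree (98 GiB exactly), and the failing sector lies within the virtual size; the check does not by itself say where the space is missing.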

--
Nicolas ECARNOT

Re: [ovirt-users] oVIRT 4.1 / iSCSI Multipathing

2018-03-05 Thread Nicolas Ecarnot

Hello,

[Unusual setup]
Last week, I eventually managed to make a 4.2.1.7 oVirt work with
iscsi-multipathing on both hosts and guest, connected to a Dell
Equallogic SAN which is providing one single virtual ip - my hosts have
two dedicated NICS for iscsi, but on the same VLAN. Torture-tests showed
good resilience.

[Classical setup]
But this year, we plan to create at least two additional DCs but to
connect their hosts to a "classical" SAN, ie which provides TWO IPs on
segregated VLANs (not routed), and we'd like to use the same
iscsi-multipathing feature.

The discussion below could lead one to think that oVirt needs the two iscsi
VLANs to be routed, allowing the hosts in one VLAN to access
resources in the other.

As Vinicius explained, this is not a best practice, to say the least.

Searching through the mailing list archive, I found no answer to
Vinicius' question.

May a Redhat storage and/or network expert enlighten us on these points?

Regards,

--
Nicolas Ecarnot

On 21/07/2017 at 20:56, Vinícius Ferrão wrote:

On 21 Jul 2017, at 15:12, Yaniv Kaul <yk...@redhat.com> wrote:

On Wed, Jul 19, 2017 at 9:13 PM, Vinícius Ferrão <fer...@if.ufrj.br>
wrote:

Hello,

I’ve skipped this message entirely yesterday. So this is per
design? Because the best practices of iSCSI MPIO, as far as I
know, recommends two completely separate paths. If this can’t be
achieved with oVirt what’s the point of running MPIO?

With regular storage it is quite easy to achieve using 'iSCSI bonding'.
I think the Dell storage is a bit different and requires some more
investigation - or experience with it.

Yaniv, thank you for answering this. I’m really hoping that a solution
would be found.

Actually I’m not running anything from DELL. My storage system is
FreeNAS, which is pretty standard, and, as far as I know, iSCSI
practices dictate segregated networks for proper working.

All other major virtualization products support iSCSI this way:
vSphere, XenServer and Hyper-V. So I was really surprised that oVirt
(and even RHV, I requested a trial yesterday) does not implement iSCSI
with the well-known best practices.

There’s a picture of the architecture that I took from Google when
searching for ”mpio best practices”:
https://image.slidesharecdn.com/2010-12-06-midwest-reg-vmug-101206110506-phpapp01/95/nextgeneration-best-practices-for-vmware-and-storage-15-728.jpg?cb=1296301640

And as you can see, it’s segregated networks on a machine reaching the
same target.

In my case, my datacenter has five hypervisor machines, with two NICs
dedicated to iSCSI. Both NICs connect to different converged ethernet
switches and the iStorage is connected the same way.

So it really does not make sense that the first NIC can reach the
second NIC’s target. In case of a switch failure the cluster will go
down anyway, so what’s the point of running MPIO? Right?

Thanks once again,
V.

Re: [ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials

2018-03-01 Thread Nicolas Ecarnot


On 01/03/2018 at 15:50, Nicolas Ecarnot wrote:

Couldn't the Redhat documentation mentioned above be more accurate?


Something like 'scl enable rh-postgrsql95' should help.


Not that much...

root@serv-mvm-prds01:/etc/ovirt-engine-setup.conf.d# cd /tmp
root@serv-mvm-prds01:/tmp# su - postgres
Last login: Thu Mar  1 15:42:40 CET 2018 on pts/2
-bash-4.2$ scl enable rh-postgrsql95
Need at least 3 arguments.
Run scl --help to get help.


After reading and reading again:

For the record, here are the steps that allowed me to add this user:

su - postgres

scl enable rh-postgresql95 'psql ovirt_engine_history'

CREATE ROLE cfme with LOGIN ENCRYPTED PASSWORD 'xxx';

SELECT 'GRANT SELECT ON ' || relname || ' TO cfme;' FROM pg_class JOIN 
pg_namespace ON pg_namespace.oid = pg_class.relnamespace WHERE nspname = 
'public' AND relkind IN ('r', 'v','S');


\q

exit
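
A note on the steps above: the SELECT only returns the GRANT statements as rows of text, so they still have to be executed in a separate step. The sketch below is a minimal illustration of what those generated statements look like and of the `scl enable <collection> <command>` form that worked here. The helper names (`quote_ident`, `grant_statements`, `scl_psql_argv`) and the relation name used in the usage note are hypothetical, not part of oVirt or ManageIQ.

```python
import shlex


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_statements(relations, role="cfme"):
    """Build one GRANT SELECT statement per relation name.

    This mirrors the text rows produced by the SELECT in the steps above;
    it does not talk to the database.
    """
    return [
        "GRANT SELECT ON {} TO {};".format(quote_ident(rel), quote_ident(role))
        for rel in relations
    ]


def scl_psql_argv(database, collection="rh-postgresql95"):
    """Build the argv for: scl enable <collection> 'psql <database>'.

    scl expects the command to run as a single argument, which is why
    running it with the collection name alone fails.
    """
    return ["scl", "enable", collection, "psql " + shlex.quote(database)]
```

For example, `grant_statements(["some_view"])` yields a single statement granting SELECT on the (hypothetical) relation `some_view` to `cfme`.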



--
Nicolas ECARNOT

Re: [ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials

2018-03-01 Thread Nicolas Ecarnot


On 01/03/2018 at 15:00, Yaniv Kaul wrote:

On Thu, Mar 1, 2018 at 2:13 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote:


Hello,

As for my 4 previous oVirt DCs, I'm trying to add them to ManageIQ
providers.

I tried to follow this guide :


https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.6/html-single/deployment_planning_guide/#data_collection_for_rhev_33_34


But when trying to run psql, the shell tells me the command is not
found.




Hello Yaniv,

Thank you for answering.


Because you are probably on PG 9.5 SCL, I assume?


I had never heard about that before today.
I installed a bare-metal CentOS 7.4 on which I installed oVirt 4.2.
I saw no reference to SCL anywhere, neither during the setup nor in 
the oVirt install documentation.


How is an average user supposed to behave in such a situation?
(In my case, as usual, I read and read again)

Couldn't the Red Hat documentation mentioned above be more accurate?


Something like 'scl enable rh-postgrsql95' should help.


Not that much...

root@serv-mvm-prds01:/etc/ovirt-engine-setup.conf.d# cd /tmp
root@serv-mvm-prds01:/tmp# su - postgres
Last login: Thu Mar  1 15:42:40 CET 2018 on pts/2
-bash-4.2$ scl enable rh-postgrsql95
Need at least 3 arguments.
Run scl --help to get help.

--
Nicolas ECARNOT

[ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials

2018-03-01 Thread Nicolas Ecarnot


Hello,

As for my 4 previous oVirt DCs, I'm trying to add them to ManageIQ 
providers.


I tried to follow this guide :

https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.6/html-single/deployment_planning_guide/#data_collection_for_rhev_33_34

But when trying to run psql, the shell tells me the command is not found.

I made a very simple setup: when running engine-setup, I accepted the 
default answer to the question about DWH, so the DB is local.


When viewing the roles of this new PostgreSQL DB with pgAdmin, I see 
there is no 'cfme' user.
Do I have to re-run the setup and answer differently to ensure that 
other packages are installed and configured?


I saw 
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/data_warehouse_guide/#Overview_of_Configuring_Data_Warehouse 
telling me to re-run.


But I see that :
rpm -qa|grep -i dwh
ovirt-engine-dwh-4.2.1.2-1.el7.centos.noarch
ovirt-engine-dwh-setup-4.2.1.2-1.el7.centos.noarch

so I thought it was already enough... ?

--
Nicolas ECARNOT

Re: [ovirt-users] Hosts firewall custom setup

2018-02-27 Thread Nicolas Ecarnot


Hello,

For the record :
The workaround you suggest below is successful.

Thank you.

--
Nicolas Ecarnot

On 27/02/2018 at 14:15, Ondra Machacek wrote:

On 02/27/2018 11:29 AM, Nicolas Ecarnot wrote:

On 26/02/2018 at 15:00, Yedidyah Bar David wrote:

But how do we add custom rules in case of firewalld type?


Please see: https://ovirt.org/blog/2017/12/host-deploy-customization/

Hello Didi et al.,

- I followed the advice found in this blog post and created a file with 
the exact same name and the appropriate content.

- I've set the cluster firewall type to firewalld
- I restarted ovirt-engine
- I reinstalled a host

I see no usage of this Ansible yml file.
I see the creation of an Ansible deploy log file for my host, and I 
see the usual firewall ports being opened, but nowhere do I see any usage 
of the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file.

- I added the debug msg part to the Ansible recipe, but to no avail.
- Extensive grepping through /var/log on the engine shows no calls to 
this script.


Thus, I see no effect on ports of the host's firewalld config.

What should I look at now?


It looks like you hit the following bug:

  https://bugzilla.redhat.com/show_bug.cgi?id=1549163

It will be fixed in 4.2.2 release.

I believe you can meanwhile remove line:

  - oVirt-metrics

from file:

/usr/share/ovirt-engine/playbooks/roles/ovirt-host-deploy/meta/main.yml
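
The workaround above amounts to deleting one list item from a YAML file. A minimal sketch of that edit, assuming the dependency appears in the plain `- oVirt-metrics` form quoted in this message (other YAML spellings are not handled); the function names are hypothetical, and a `.bak` copy is kept so the change can be reverted once 4.2.2 is installed:

```python
import re
import shutil


def drop_role_dependency(text, role="oVirt-metrics"):
    """Return `text` without the YAML list item that names `role`.

    Only the simple "- <role>" form is matched; every other line is
    kept unchanged.
    """
    pattern = re.compile(r"^\s*-\s*" + re.escape(role) + r"\s*$")
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not pattern.match(line.rstrip("\r\n"))
    ]
    return "".join(kept)


def patch_file(path, role="oVirt-metrics"):
    """Remove the dependency from the file at `path`, keeping a backup.

    Returns True when the file was modified.
    """
    with open(path) as handle:
        original = handle.read()
    patched = drop_role_dependency(original, role)
    if patched == original:
        return False
    shutil.copy2(path, path + ".bak")  # backup next to the original
    with open(path, "w") as handle:
        handle.write(patched)
    return True
```

Calling `patch_file()` with the path of the main.yml file named above would apply the change; ovirt-engine presumably needs the same restart and host reinstall as in the steps already tried.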



Thank you.



Re: [ovirt-users] Hosts firewall custom setup

2018-02-27 Thread Nicolas Ecarnot


On 26/02/2018 at 15:00, Yedidyah Bar David wrote:

But how do we add custom rules in case of firewalld type?


Please see: https://ovirt.org/blog/2017/12/host-deploy-customization/

Hello Didi et al.,

- I followed the advice found in this blog post and created a file with 
the exact same name and the appropriate content.

- I've set the cluster firewall type to firewalld
- I restarted ovirt-engine
- I reinstalled a host

I see no usage of this Ansible yml file.
I see the creation of an Ansible deploy log file for my host, and I see 
the usual firewall ports being opened, but nowhere do I see any usage of 
the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file.

- I added the debug msg part to the Ansible recipe, but to no avail.
- Extensive grepping through /var/log on the engine shows no calls to 
this script.


Thus, I see no effect on ports of the host's firewalld config.

What should I look at now?

Thank you.

--
Nicolas ECARNOT

Re: [ovirt-users] Hosts firewall custom setup

2018-02-26 Thread Nicolas Ecarnot


On 26/02/2018 at 14:03, Yedidyah Bar David wrote:

On Mon, Feb 26, 2018 at 2:01 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote:

Hello,

On oVirt 4.2.1.7, I'm trying to set up custom iptables rules, as I have been
doing for years with engine-config --set IPTablesConfigSiteCustom="blah blah
blah".

On my hosts, I can see that /etc/sysconfig/iptables does contain
the correct custom rules I added, but when manually checking with iptables
-L, I don't see my rules active.

On my hosts, I see that the iptables service is stopped and disabled, and
that the firewalld service is up and running.

That explains why iptables customization has no effect.


Indeed.

IIRC the type of firewall is now set per cluster or something like that, not
sure about the details - adding Ondra.


Per cluster, one can indeed choose the firewall type.
I suppose that on the hosts it translates into the activation of the 
appropriate service.

But how do we add custom rules when the type is firewalld?

On the hosts, I imagine that could translate into changes in:
/etc/firewalld/zones/public.xml

--
Nicolas ECARNOT

[ovirt-users] Hosts firewall custom setup

2018-02-26 Thread Nicolas Ecarnot


Hello,

On oVirt 4.2.1.7, I'm trying to set up custom iptables rules, as I have 
been doing for years with engine-config --set 
IPTablesConfigSiteCustom="blah blah blah".


On my hosts, I can see that /etc/sysconfig/iptables does 
contain the correct custom rules I added, but when manually checking 
with iptables -L, I don't see my rules active.


On my hosts, I see that the iptables service is stopped and disabled, 
and that the firewalld service is up and running.


That explains why iptables customization has no effect.

In the engine setup, I see that 
/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf contains :

OVESETUP_CONFIG/firewallManager=none:None

I'm confused about this setting: when running engine-setup, I'm not 
sure I understand whether answering yes to the question about the 
firewall will modify the engine, the hosts, or all of them.


Actually, I'd like my engine to stay with a disabled firewall, but my 
hosts with an active one.


Is it true to say that this is not an option and I have to answer yes, 
enable the firewall on the engine, allowing the 
OVESETUP_CONFIG/firewallManager option to be set up (to firewalld or 
iptables), thus allowing the spread of this setup towards the hosts?


Thank you.

--
Nicolas ECARNOT

Re: [ovirt-users] [Qemu-block] qcow2 images corruption

2018-02-14 Thread Nicolas Ecarnot




https://framadrop.org/r/Lvvr392QZo#/wOeYUUlHQAtkUw1E+x2YdqTqq21Pbic6OPBIH0TjZE=

On 14/02/2018 at 00:01, John Snow wrote:

On 02/13/2018 04:41 AM, Kevin Wolf wrote:

On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:

TL; DR : qcow2 images keep getting corrupted. Any workaround?


Not without knowing the cause.

The first thing to make sure is that the image isn't touched by a second
process while QEMU is running a VM. The classic one is using 'qemu-img
snapshot' on the image of a running VM, which is instant corruption (and
newer QEMU versions have locking in place to prevent this), but we have
seen more absurd cases of things outside QEMU tampering with the image
when we were investigating previous corruption reports.

This covers the majority of all reports, we haven't had a real
corruption caused by a QEMU bug in ages.


After having found (https://access.redhat.com/solutions/1173623) the right
logical volume hosting the qcow2 image, I can run qemu-img check on it.
- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using
"qemu-img check -r all"
- On 5% of them, I find Leaked clusters errors and further fatal errors,
which can not be corrected with qemu-img.
In rare cases, qemu-img can correct them, but destroys large parts of the
image (becomes unusable), and on other cases it can not correct them at all.


It would be good if you could make the 'qemu-img check' output available
somewhere.

It would be even better if we could have a look at the respective image.
I seem to remember that John (CCed) had a few scripts to analyse
corrupted qcow2 images, maybe we would be able to see something there.



Hi! I did write a pretty simplistic tool for trying to tell the shape of
a corruption at a glance. It seems to work pretty similarly to the other
tool you already found, but it won't hurt anything to run it:

https://github.com/jnsnow/qcheck

(Actually, that other tool looks like it has an awful lot of options.
I'll have to check it out.)

It can print a really upsetting amount of data (especially for very
corrupt images), but in the default case, the simple setting should do
the trick just fine.

You could always put the output from this tool in a pastebin too; it
might help me visualize the problem a bit more -- I find seeing the
exact offsets and locations of where all the various tables and things
to be pretty helpful.

You can also always use the "deluge" option and compress it if you want,
just don't let it print to your terminal:

jsnow@probe (dev) ~/s/qcheck> ./qcheck -xd /home/bos/jsnow/src/qemu/bin/git/install_test_f26.qcow2 > deluge.log; and ls -sh deluge.log
4.3M deluge.log

but it compresses down very well:

jsnow@probe (dev) ~/s/qcheck> 7z a -t7z -m0=ppmd deluge.ppmd.7z deluge.log
jsnow@probe (dev) ~/s/qcheck> ls -s deluge.ppmd.7z
316 deluge.ppmd.7z

So I suppose if you want to send along:
(1) The basic output without any flags, in a pastebin
(2) The zipped deluge output, just in case

and I will try my hand at guessing what went wrong.


(Also, maybe my tool will totally choke for your image, who knows. It
hasn't received an overwhelming amount of testing apart from when I go
to use it personally and inevitably wind up displeased with how it
handles certain situations, so ...)


What I read that is similar to my case:
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the Proxmox thread, they tend to say that using virtio-scsi is the
solution. I asked this question to oVirt experts
(https://lists.ovirt.org/pipermail/users/2018-February/086753.html), but it's
not clear the driver is to blame.


This seems very unlikely. The corruption you're seeing is in the qcow2
metadata, not only in the guest data. If anything, virtio-scsi exercises
more qcow2 code paths than virtio-blk, so any potential bug that affects
virtio-blk should also affect virtio-scsi, but not the other way around.


I agree with the answer Yaniv Kaul gave me, saying I have to properly
report the issue, so I'd like to know which specific information I can
give you now.


To be honest, debugging corruption after the fact is pretty hard. We'd
need the 'qemu-img check' output and ideally the image to do anything,
but I can't promise that anything would come out of this.

Best would be a reproducer, or at least some operation that you can link
to the appearance of the corruption. Then we could take a more targeted
look at the respective code.


As you can imagine, all this setup is in production, and for most of the
VMs, I cannot "play" with them. Moreover, we launched a nightly campaign:
stop every VM, run qemu-img check on each one, then boot it again.
So it might take some time before I find another corrupted image
(which I'll carefully keep for debugging).

Other information: we very rarely take snapshots, but I suspect
that automated migrations of VMs could trigger similar behavior on qcow2

Re: [ovirt-users] [Qemu-block] qcow2 images corruption

2018-02-13 Thread Nicolas Ecarnot


On 13/02/2018 at 16:26, Nicolas Ecarnot wrote:
>> It would be good if you could make the 'qemu-img check' output available
>> somewhere.
>

I found this :
https://github.com/ShijunDeng/qcow2-dump

and the transcript (beautiful colors when viewed with "more") is attached :


--
Nicolas ECARNOT
Script started on Tue 13 Feb 2018 17:31:05 CET
root@serv-hv-adm13:/home# /root/qcow2-dump -m check serv-term-adm4-corr.qcow2.img

File: serv-term-adm4-corr.qcow2.img

magic: 0x514649fb
version: 2
backing_file_offset: 0x0
backing_file_size: 0
fs_type: xfs
virtual_size: 64424509440 / 61440M / 60G
disk_size: 36507222016 / 34816M / 34G
seek_end: 36507222016 [0x88000] / 34816M / 34G
cluster_bits: 16
cluster_size: 65536
crypt_method: 0
csize_shift: 54
csize_mask: 255
cluster_offset_mask: 0x3f
l1_table_offset: 0x76a46
l1_size: 120
l1_vm_state_index: 120
l2_size: 8192
refcount_order: 4
refcount_bits: 16
refcount_block_bits: 15
refcount_block_size: 32768
refcount_table_offset: 0x1
refcount_table_clusters: 1
snapshots_offset: 0x0
nb_snapshots: 0
incompatible_features: 
compatible_features: 
autoclear_features: 

Active Snapshot:

L1 Table:   [offset: 0x76a46, len: 120]

Result:
L1 Table:   unaligned: 0, invalid: 0, unused: 53, used: 67
L2 Table:   unaligned: 0, invalid: 0, unused: 20304, used: 528560

Refcount Table:

Refcount Table: [offset: 0x1, len: 8192]

Result:
Refcount Table: unaligned: 0, invalid: 0, unused: 8175, used: 17
Refcount:   error: 4342, leak: 0, unused: 28426, used: 524288

COPIED OFLAG:

Result:
L1 Table ERROR OFLAG_COPIED: 1
L2 Table ERROR OFLAG_COPIED: 4323
Active L2 COPIED: 528560 [34639708160 / 33035M / 32G]

Active Cluster:

Result:
Active Cluster: reuse: 17

Summary:
preallocation:  off
Active Cluster: reuse: 17
Refcount Table: unaligned: 0, invalid: 0, unused: 8175, used: 17
Refcount:   error: 4342, leak: 0, rebuild: 4325, unused: 28426, used: 524288
L1 Table:   unaligned: 0, invalid: 0, unused: 53, used: 67
oflag copied: 1
L2 Table:   unaligned: 0, invalid: 0, unused: 20304, used: 528560
oflag copied: 4323

### qcow2 image has refcount errors!   (=_=#)###
###and qcow2 image has copied errors!  (o_0)?###
###  Sadly: refcount error cause active cluster reused! Orz  ###
### Please backup this image and contact the author! ###

root@serv-hv-adm13:/home# exit

Script finished on Tue 13 Feb 2018 17:31:13 CET
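
For readers who want to cross-check the first fields of such a dump (magic, version, cluster_bits, virtual size, l1_size) without a third-party tool, here is a minimal sketch that reads the fixed 72-byte part of a qcow2 header. It assumes the big-endian header layout described in the QEMU qcow2 format specification; the function names are hypothetical, and it only reads the header, it performs no consistency check on the L1/L2 or refcount tables.

```python
import struct

# Big-endian layout of the first 72 bytes of a qcow2 header
# (common to qcow2 versions 2 and 3).
_HEADER = struct.Struct(">4sIQIIQIIQQIIQ")
_FIELDS = (
    "magic",
    "version",
    "backing_file_offset",
    "backing_file_size",
    "cluster_bits",
    "size",  # virtual disk size in bytes
    "crypt_method",
    "l1_size",
    "l1_table_offset",
    "refcount_table_offset",
    "refcount_table_clusters",
    "nb_snapshots",
    "snapshots_offset",
)


def parse_qcow2_header(data):
    """Decode the fixed header fields from the first bytes of an image."""
    if len(data) < _HEADER.size:
        raise ValueError("need at least %d bytes" % _HEADER.size)
    header = dict(zip(_FIELDS, _HEADER.unpack_from(data)))
    if header["magic"] != b"QFI\xfb":  # 0x514649fb
        raise ValueError("not a qcow2 image")
    header["cluster_size"] = 1 << header["cluster_bits"]
    # One L2 table occupies one cluster and holds 8-byte entries.
    header["l2_size"] = header["cluster_size"] // 8
    return header


def read_qcow2_header(path):
    """Read and decode the header of the image file at `path`."""
    with open(path, "rb") as handle:
        return parse_qcow2_header(handle.read(_HEADER.size))
```

With cluster_bits at 16, this gives a cluster_size of 65536 and an l2_size of 8192, the same values the transcript reports.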

Re: [ovirt-users] [Qemu-block] qcow2 images corruption

2018-02-13 Thread Nicolas Ecarnot


Hello Kevin,

On 13/02/2018 at 10:41, Kevin Wolf wrote:

On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:

TL; DR : qcow2 images keep getting corrupted. Any workaround?


Not without knowing the cause.


Actually, my main concern is finding the cause rather than 
repairing my corrupted VMs.


To put it another way: I would rather help oVirt than help myself.


The first thing to make sure is that the image isn't touched by a second
process while QEMU is running a VM.


Indeed, I read some BZs about this issue: they were raised by a user who 
ran some qemu-img commands on a "mounted" image, thus leading to some 
corruption.
In my case, I'm not playing with this, and the corrupted VMs were only 
touched by standard oVirt actions.



The classic one is using 'qemu-img
snapshot' on the image of a running VM, which is instant corruption (and
newer QEMU versions have locking in place to prevent this), but we have
seen more absurd cases of things outside QEMU tampering with the image
when we were investigating previous corruption reports.

This covers the majority of all reports, we haven't had a real
corruption caused by a QEMU bug in ages.


May I ask in which QEMU version this kind of locking was added?
As I wrote, our oVirt setup is 3.6, so not recent.




After having found (https://access.redhat.com/solutions/1173623) the right
logical volume hosting the qcow2 image, I can run qemu-img check on it.
- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using
"qemu-img check -r all"
- On 5% of them, I find Leaked clusters errors and further fatal errors,
which can not be corrected with qemu-img.
In rare cases, qemu-img can correct them, but destroys large parts of the
image (becomes unusable), and on other cases it can not correct them at all.


It would be good if you could make the 'qemu-img check' output available
somewhere.


See attachment.



It would be even better if we could have a look at the respective image.
I seem to remember that John (CCed) had a few scripts to analyse
corrupted qcow2 images, maybe we would be able to see something there.


I just exported it like this :
qemu-img convert /dev/the_correct_path /home/blablah.qcow2.img

The resulting file is 32G and I need an idea to transfer this img to you.




What I read that is similar to my case:
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the Proxmox thread, they tend to say that using virtio-scsi is the
solution. I asked this question to oVirt experts
(https://lists.ovirt.org/pipermail/users/2018-February/086753.html), but it's
not clear the driver is to blame.


This seems very unlikely. The corruption you're seeing is in the qcow2
metadata, not only in the guest data.


Are you saying:
- the corruption is in the metadata and in the guest data
OR
- the corruption is only in the metadata
?


If anything, virtio-scsi exercises
more qcow2 code paths than virtio-blk, so any potential bug that affects
virtio-blk should also affect virtio-scsi, but not the other way around.


I get that.




I agree with the answer Yaniv Kaul gave me, saying I have to properly
report the issue, so I'd like to know which specific information I can
give you now.


To be honest, debugging corruption after the fact is pretty hard. We'd
need the 'qemu-img check' output


Done.


and ideally the image to do anything,


I remember some Red Hat people once gave me temporary access to upload 
large files to a dedicated server. Is that still possible?



but I can't promise that anything would come out of this.

Best would be a reproducer, or at least some operation that you can link
to the appearance of the corruption. Then we could take a more targeted
look at the respective code.


Sure.
Alas, I find no obvious pattern leading to corruption:
On the guest side, it appeared with Windows 2003, 2008, 2012, and Linux 
CentOS 6 and 7. It appeared with virtio-blk; I changed some VMs to 
use virtio-scsi, but it's too soon to see whether corruption appears in 
that case.
As I said, I use snapshots VERY rarely, and our versions are too old, 
so we only take them the cold way (VM shut down). So very safely.
The "weirdest" thing we do is migrate VMs: you see how conservative 
we are!



As you can imagine, all this setup is in production, and for most of the
VMs, I cannot "play" with them. Moreover, we launched a nightly campaign:
stop every VM, run qemu-img check on each one, then boot it again.
So it might take some time before I find another corrupted image
(which I'll carefully keep for debugging).

Other information: we very rarely take snapshots, but I suspect
that automated migrations of VMs could trigger similar behavior on qcow2
images.


To my knowledge, oVirt only uses external snapshots and creates them
with QMP. This should be perfectly safe because from the perspective of
the qcow2 image being snapshotted,

Re: [ovirt-users] qcow2 images corruption

2018-02-08 Thread Nicolas Ecarnot


On 08/02/2018 at 13:59, Yaniv Kaul wrote:

On Feb 7, 2018 7:08 PM, "Nicolas Ecarnot" <nico...@ecarnot.net> wrote:


Hello,

TL; DR : qcow2 images keep getting corrupted. Any workaround?

Long version:
I already started this discussion on the oVirt and qemu-block
mailing lists under similar circumstances, but I have learned more
over the past months, and here is some information:

- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using
CentOS 7.{2,3} hosts
- Hosts :
  - CentOS 7.2 1511 :
    - Kernel = 3.10.0 327
    - KVM : 2.3.0-31
    - libvirt : 1.2.17
    - vdsm : 4.17.32-1
  - CentOS 7.3 1611 :
    - Kernel 3.10.0 514
    - KVM : 2.3.0-31
    - libvirt 2.0.0-10
    - vdsm : 4.17.32-1


All are somewhat old releases. I suggest upgrading to the latest RHEL 
and qemu-kvm bits.


Later on, upgrade oVirt.
Y.

Hello Yaniv,

We could discuss for hours the fact that CentOS 7.3 was released 
in January 2017, and is thus not that old.
We could also discuss for hours the gap between developers' wish 
to push their freshest releases and the brake we, industry users, put 
on adopting such new versions. In my case, the virtualization 
infrastructure is just one of the 30+ domains I have to master every day, 
and the more stable the better.
In the setup described previously, the qemu qcow2 images were correct, 
then not. We did not change anything. We have to find a workaround and 
we need your expertise.


Not understanding the cause of the corruption exposes us to the same 
situation in oVirt 4.2.


--
Nicolas Ecarnot

[ovirt-users] qcow2 images corruption

2018-02-07 Thread Nicolas Ecarnot

Hello,

TL; DR : qcow2 images keep getting corrupted. Any workaround?

Long version:
I already started this discussion on the oVirt and qemu-block
mailing lists under similar circumstances, but I have learned more
over the past months, and here is some information:

- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using CentOS
7.{2,3} hosts

- Hosts :
  - CentOS 7.2 1511 :
    - Kernel = 3.10.0 327
    - KVM : 2.3.0-31
    - libvirt : 1.2.17
    - vdsm : 4.17.32-1
  - CentOS 7.3 1611 :
    - Kernel 3.10.0 514
    - KVM : 2.3.0-31
    - libvirt 2.0.0-10
    - vdsm : 4.17.32-1
- Our storage is 2 Equallogic SANs connected via iSCSI on a dedicated
network
- It varies from week to week, but all in all, there are around 32 hosts, 8
storage domains and, for various reasons, very few VMs (fewer than 200).
- One peculiar point is that most of our VMs are given an additional
dedicated network interface that is iSCSI-connected to some volumes of
our SAN; these volumes are not part of the oVirt setup. That could
lead to a lot of additional iSCSI traffic.

From time to time, a random VM appears paused by oVirt.
Digging into the oVirt engine logs, then into the host vdsm logs, it
appears that the host considers the qcow2 image corrupted.
In what I consider conservative behavior, vdsm stops any
interaction with this image and marks it as paused.

Any attempt to unpause it leads to the same conservative pause.

After having found (https://access.redhat.com/solutions/1173623) the
right logical volume hosting the qcow2 image, I can run qemu-img check
on it.

- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using
"qemu-img check -r all"
- On 5% of them, I find Leaked clusters errors and further fatal errors,
which can not be corrected with qemu-img.
In rare cases, qemu-img can correct them, but destroys large parts of
the image (becomes unusable), and on other cases it can not correct them
at all.
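
The three categories above map onto the exit status of qemu-img check, so the nightly pass can sort images automatically. A minimal sketch, assuming the exit codes documented in the qemu-img man page and that each image is checked only while its VM is shut down; the function names and category labels are hypothetical, and the check is read-only (no -r option):

```python
import subprocess

# Exit codes documented for "qemu-img check" in the qemu-img man page.
CHECK_RESULTS = {
    0: "clean",               # check completed, image is consistent
    1: "check-failed",        # check not completed (internal errors)
    2: "corrupted",           # check completed, image is corrupted
    3: "leaks-only",          # leaked clusters, but no corruption
    63: "unsupported-format", # checks not supported by the image format
}


def classify_check(returncode):
    """Map a qemu-img check exit code to a short category label."""
    return CHECK_RESULTS.get(returncode, "unknown")


def check_image(path, qemu_img="qemu-img"):
    """Run a read-only check on one image; return (category, output).

    Only call this on an image that no running VM is using.
    """
    proc = subprocess.run(
        [qemu_img, "check", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return classify_check(proc.returncode), proc.stdout


def summarize(categories):
    """Count how many images fell into each category."""
    counts = {}
    for category in categories:
        counts[category] = counts.get(category, 0) + 1
    return counts
```

Images reported as "leaks-only" are the ones that "qemu-img check -r all" could repair in the cases above; images reported as "corrupted" are worth copying aside before any repair attempt, so they remain available for analysis.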

Months ago, I already sent a similar message but the error message was
about No space left on device
(https://www.mail-archive.com/qemu-block@gnu.org/msg00110.html).

This time, I don't have this message about space, but only corruption.

I kept reading and found a similar discussion in the Proxmox group :
https://lists.ovirt.org/pipermail/users/2018-February/086750.html

https://forum.proxmox.com/threads/qcow2-corruption-after-snapshot-or-heavy-disk-i-o.32865/page-2

What I read that is similar to my case:
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the Proxmox thread, they tend to say that using virtio-scsi is the
solution. I asked this question to oVirt experts
(https://lists.ovirt.org/pipermail/users/2018-February/086753.html), but
it's not clear the driver is to blame.

I agree with the answer Yaniv Kaul gave me, saying I have to properly
report the issue, so I'd like to know which specific information I
can give you now.

As you can imagine, all this setup is in production, and for most of the
VMs, I cannot "play" with them. Moreover, we launched a nightly
campaign: stop every VM, run qemu-img check on each one, then boot it
again.

So it might take some time before I find another corrupted image
(which I'll carefully keep for debugging).

Other information: we very rarely take snapshots, but I suspect
that automated migrations of VMs could trigger similar behavior
on qcow2 images.

Last point about the versions we use : yes that's old, yes we're
planning to upgrade, but we don't know when.

Regards,

--
Nicolas ECARNOT