[ovirt-users] Re: Proper way to upgrade hosts OS?

2019-07-01 Thread Nicolas Ecarnot

On 26/06/2019 at 12:34, Nicolas Ecarnot wrote:

Hello,

We're not using nodes but CentOS 7.x hosts.
Do you know if any documentation has been written about the proper way 
to upgrade the operating system of the hosts, and especially how to 
avoid breaking dependencies or causing version mismatches?


Thank you.



Hello,

As no answer came, could anyone just tell me whether there is any risk of 
breaking something?


Thank you.

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GZYVUCMBIZIZOMKSZUIJRZ6IMWBBI2X6/


[ovirt-users] Proper way to upgrade hosts OS?

2019-06-26 Thread Nicolas Ecarnot

Hello,

We're not using nodes but CentOS 7.x hosts.
Do you know if any documentation has been written about the proper way 
to upgrade the operating system of the hosts, and especially how to 
avoid breaking dependencies or causing version mismatches?


Thank you.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4SYJWWODEY2VZOAMU5NIRDOJCPANNR6S/


[ovirt-users] Re: Old mailing list SPAM

2019-05-15 Thread Nicolas Ecarnot

On 15/05/2019 at 07:46, Markus Stockhausen wrote:

Hi,

does anyone currently get old mails of 2016 from the mailing list?


I do.

(Though it is annoying, it got me an answer to a question I had never 
thought to ask. Thanks Nir, by the way.)



We are spammed with something like this from teknikservice.nu:

...
Received: from mail.ovirt.org (localhost [IPv6:::1])
  by mail.ovirt.org (Postfix) with ESMTP id A33EA46AD3;
  Tue, 14 May 2019 14:48:48 -0400 (EDT)

Received: by mail.ovirt.org (Postfix, from userid 995)
  id D283A407D0; Tue, 14 May 2019 14:42:29 -0400 (EDT)

Received: from bauhaus.teknikservice.nu (smtp.teknikservice.nu [81.216.61.60])
  (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
  (No client certificate requested)
  by mail.ovirt.org (Postfix) with ESMTPS id BF954467FE
  for ; Tue, 14 May 2019 14:36:54 -0400 (EDT)

Received: by bauhaus.teknikservice.nu (Postfix, from userid 0)
  id 259822F504; Tue, 14 May 2019 20:32:33 +0200 (CEST) <- 3 YEAR TIME WARP ?

Received: from washer.actnet.nu (washer.actnet.nu [212.214.67.187])
  (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits))
  (No client certificate requested)
  by bauhaus.teknikservice.nu (Postfix) with ESMTPS id 430FEDA541
  for ; Thu,  6 Oct 2016 18:02:51 +0200 (CEST)

Received: from lists.ovirt.org (lists.ovirt.org [173.255.252.138])
  (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits))
  (No client certificate requested)
  by washer.actnet.nu (Postfix) with ESMTPS id D75A82293FC
  for ; Thu,  6 Oct 2016 18:04:11 +0200 (CEST)
...

Markus


List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XI3LV4GPACT7ILZ3BNJLHHQBEWI3HWLI/




--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IEOIF3KVPKLBO2UNZ65FSRX7EFPXHF3V/


[ovirt-users] Re: VM has paused due to no storage space error

2019-05-15 Thread Nicolas Ecarnot

Hi Nir, hi Sandvik,

As I have seen this issue many times, and as I'm using thin provisioning on 
block storage, I'm concerned.

Read my question below.

On 02/10/2016 at 12:55, Nir Soffer wrote:

On Sun, Oct 2, 2016 at 12:06 PM, Sandvik Agustin
 wrote:

Hi users,

I have this problem: sometimes 1 to 3 VMs are automatically paused without
user interaction, with this error: "VM has paused due to no storage
space error". Any input from you guys is very much appreciated.

This is expected - when there is no storage space :-)

The vm is paused when there are pending io requests that
could not be fulfilled since you don't have enough space.

In a real machine the io requests would fail. In a vm, the vm can pause,
you can fix the issue (extend the storage domain), and resume the vm.

But I guess there is storage space available, otherwise you would
not spend the time sending this mail.

This can happen when using thin provisioned disks on block storage
(iSCSI, FC). We provision such a disk with 1G, and extend the disk
(add 1G) when it becomes too full (by default, free space < 0.5G).

If we fail to extend the disk quick enough,



"quick enough" -> Is there some place where this threshold can be 
configured?




  the vm will pause before the
extend was completed. Once the extend was completed, we resume
the vm.

So you may see very short pauses, but they should be rare.
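
To make the race described above concrete, here is a minimal sketch in Python. It is not VDSM code: the constant and function names are made up for illustration, the sizes are the ones quoted in this mail (1G initial size, 1G chunk, extend when free space < 0.5G), and a constant guest write rate is assumed. As for where the threshold is configured, I believe VDSM reads it from the [irs] section of vdsm.conf (volume_utilization_percent and volume_utilization_chunk_mb), but please verify that against your own VDSM version.

```python
GIB = 1024 ** 3

# Values taken from the explanation above; the names are illustrative only.
INITIAL_SIZE = 1 * GIB      # a thin disk starts with 1G allocated
EXTEND_CHUNK = 1 * GIB      # each extension adds 1G
FREE_THRESHOLD = GIB // 2   # an extension is requested when free space < 0.5G


def needs_extend(allocated: int, written: int) -> bool:
    """Return True when the free space left in the volume is below the threshold."""
    return allocated - written < FREE_THRESHOLD


def vm_would_pause(write_rate: float, extend_delay: float) -> bool:
    """Return True if the guest fills the volume before the extension lands.

    write_rate   -- bytes written per second by the guest (assumed constant)
    extend_delay -- seconds between crossing the threshold and the volume
                    actually being extended
    """
    # When the threshold is crossed, FREE_THRESHOLD bytes are still free.
    seconds_until_full = FREE_THRESHOLD / write_rate
    return extend_delay > seconds_until_full
```

With these numbers, a guest writing 100 MiB/s leaves a bit more than 5 seconds for the extension to complete; a slower extension means a paused VM until the extra space arrives.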

To understand the issue, we need to inspect vdsm logs from the host
running the vm that paused, showing the timeframe when the vm
was paused.

You should see this message in the log each time a vm pauses:

 abnormal vm stop device  error ENOSPC
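
A small helper along these lines could pull those messages out of a vdsm log. This is only a sketch: the device name is elided in the message quoted above, so the pattern captures it loosely, and the exact log layout around the message is an assumption.

```python
import re
from typing import Iterable, List

# Matches the message quoted above. The device name between "device" and
# "error" is elided in the original mail, so it is captured loosely here.
ENOSPC_PAUSE = re.compile(r"abnormal vm stop device\s+(\S*)\s*error ENOSPC")


def find_enospc_pauses(lines: Iterable[str]) -> List[str]:
    """Return the log lines that report a VM pause caused by ENOSPC."""
    return [line.rstrip("\n") for line in lines if ENOSPC_PAUSE.search(line)]
```

Typical usage would be `find_enospc_pauses(open("/var/log/vdsm/vdsm.log"))` on the host that was running the paused VM, assuming the log is in its default location.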

Nir
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

--
IMPORTANT!
This message has been scanned for viruses and phishing links.
However, it is your responsibility to evaluate the links and attachments you 
choose to click.
If you are uncertain, we always try to help.
Greetings helpd...@actnet.se




List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5MAYP4SZZQC5BB2VVPQBXYWH4OOJ7LUW/



--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KF4SVQOE7U7ELLOIE4CNPSH2TAN7MW3K/


[ovirt-users] Re: DISCARD support?

2019-05-14 Thread Nicolas Ecarnot

Hello,

Sending this here to share knowledge.

Here is what I learned from reading many BZ entries and mailing list posts. 
I don't work at Red Hat, so please correct me if I'm wrong.


We are using thin-provisioned block storage LUNs (Equallogic), on which 
oVirt is creating numerous Logical Volumes, and we're very happy with it.
When oVirt removes a virtual disk, the SAN is not informed, because 
the LVM layer does not issue discards (the "issue_discards" setting).


/etc/lvm/lvm.conf is not the natural place to try to change this 
parameter, as VDSM is not using it.


Efforts are currently being made to support the issue_discards setting 
directly in vdsm.conf, first at datacenter scope (4.0.x), then per 
storage domain (4.1.x), and maybe via a web GUI check-box. Part of the 
effort is to make sure every bit of an LV that is about to be removed gets 
wiped. Part is to inform the block storage side about the deletion, in 
the case of thin-provisioned LUNs.


https://bugzilla.redhat.com/show_bug.cgi?id=1342919
https://bugzilla.redhat.com/show_bug.cgi?id=981626

--
Nicolas ECARNOT

On Mon, Oct 3, 2016 at 2:24 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote:


   Yaniv,

   While surfing the web at random, I found that you posted some
   information about DISCARD support on Twitter.
   (https://twitter.com/YanivKaul/status/773513216664174592)

   I did not dig any further, but is it related to the fact
   that so far, oVirt has not reclaimed the storage space freed within
   the logical volumes of its storage domains?

   A BZ exists about this, but we were told no work would be done on
   it until 4.x.y. Now that we're there, I was wondering if you knew more?


Feel free to send such questions to the mailing list (ovirt users or 
devel), so others will be able to both chime in and see the response.
We've supported a custom hook for enabling discard per disk (which is 
only relevant for virtio-SCSI and IDE) for some versions now (3.5 I 
believe).

We are planning to add this via a UI and API in 4.1.
In addition, we are looking into discard (instead of wipe after delete, 
when discard is also zero'ing content) as well as discard when removing LVs.

See:
http://www.ovirt.org/develop/release-management/features/storage/pass-discard-from-guest-to-underlying-storage/
http://www.ovirt.org/develop/release-management/features/storage/wipe-volumes-using-blkdiscard/
http://www.ovirt.org/develop/release-management/features/storage/discard-after-delete/

Y.


   Best,

   -- 
   Nicolas ECARNOT






List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XNWYONXSWEN5AJVUJURRL7G3QJW62SNJ/


[ovirt-users] Logical Volume extend failed

2019-03-11 Thread Nicolas Ecarnot

Hello,

[Context :
I'm moving all my VMs from an old 3.6 DC to a brand new 4.3 DC.
For local reasons, I'm doing it using an export domain, and one by one.
]

Today, for no obvious reason, error messages began to appear :
"
VDSM SPM-servername command failed: Logical Volume extend failed
"

Lots of similar errors appear in the engine log, with no obvious 
additional hint.

In the VDSM log, I'm not skilled enough to see what's wrong either.

The 3.6 engine and vdsm log files are here :

https://framadrop.org/r/6cFSb0GRc1#VQ6XqYWg9HzniHMjgKmXVpXy0I+RIS/MiMGBpU+1bak=

https://framadrop.org/r/JFswiD3fkA#fdU+m3JCVMVg/eLjtJVTqOiAKIj4eyhsRWisxcrea7I=

It may come from one of our storage domains that was close to full, but I 
have since freed 200 GB of space, and the issue keeps appearing.


Now, my attempts to export a VM are failing.
I still can stop and start a VM.

(I'm not entirely comfortable with this situation.)

I read about a similar experience here 
(https://www.canarytek.com/2017/07/21/Harmfull_bug_in_oVirt_block_storage.html) 
but I'm not sure it is related.

I can psql-query and check things if needed, but I mostly need advice.

Thank you.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OFE5IWWFKQLWWJR3KHCIDMTS2JHLHEC4/


[ovirt-users] Re: Fencing : SSL or not?

2019-02-22 Thread Nicolas Ecarnot

On 22/02/2019 at 15:45, Martin Perina wrote:
If I understand that correctly, this is a request to open a session to 
IPMI. If you haven't received any response, then I'd check:


1. Do you have IPMI enabled?



Hello Martin,

you nailed it.

IPMI was not enabled (anymore).

IPMI has been enabled by default for years on all our hosts.

But recent firmware upgrades on some of our Dell hosts, especially 
iDRAC firmware upgrades, led to IPMI being disabled.



I'm sorry for having bothered you and the audience. Sorry for this waste 
of time. Thank you Dell :-\


--
Nicolas ECARNOT

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KO7REWCFUWRGU453N5XYSFZSS75RFFU6/


[ovirt-users] Re: Host choice when migrating VMs

2019-02-22 Thread Nicolas Ecarnot

On 22/02/2019 at 15:48, Dominik Holler wrote:

Hosts _need_ the same networks to be available in the same cluster. Differently 
networked hosts need to be put in a separate cluster.



This is the most straightforward approach, which is supported by oVirt.
But there is the possibility to attach logical networks, which are
neither required in the cluster, nor attached to all hosts in the
cluster, to a VM. oVirt's scheduling will respect this.


So you're saying oVirt knows which other hosts in the cluster have the 
non-mandatory network(s) the VM uses, and only picks a migration target 
among those?



Yes. If you try to trigger the migration manually, the UI will show you
the list of possible hosts to migrate the VM to.
https://github.com/oVirt/ovirt-engine/blob/7d111f3aa089f77f92049f4d3ec792e5ff7e5324/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/scheduling/policyunits/NetworkPolicyUnit.java#L132
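
The idea behind that scheduling filter can be sketched in a few lines of Python. This is not the engine's Java implementation linked above, only an illustration of the rule as described in this thread: a host is a candidate only if it has every network the VM uses. The host and network names in the usage example are invented.

```python
from typing import Dict, List, Set


def hosts_for_vm(vm_networks: Set[str],
                 host_networks: Dict[str, Set[str]]) -> List[str]:
    """Return the hosts (sorted by name) that have every network the VM uses."""
    return sorted(
        host for host, networks in host_networks.items()
        if vm_networks <= networks  # subset test: all VM networks are present
    )
```

For example, with `{"hv01": {"mgmt", "dmz"}, "hv02": {"mgmt"}}`, a VM attached to `{"mgmt", "dmz"}` could only land on hv01, while a VM attached to `{"mgmt"}` alone could go to either host.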




*THIS* is precisely the answer I was expecting.

Thank you Dominik.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LT6I4GS42VIPQYBF4EGT7HBS2LVLUN2Z/


[ovirt-users] Re: Host choice when migrating VMs

2019-02-22 Thread Nicolas Ecarnot

On 22/02/2019 at 15:02, Karli Sjöberg wrote:



On 22 Feb 2019 at 09:24, Nicolas Ecarnot wrote:

Hello,

I'm almost sure the following is useless as I think I know how it's
working, but as I'm preparing a major change in our infrastructure, I'd
rather be sure and not mess up. And also to be sure.
(Just to be sure)

For some reasons, and for the first time in our infra., one of our new
DC will temporarily include heterogeneous hosts: some networks will be
available only on some of them.




Hi Karli,

Hosts _need_ the same networks to be available in the same cluster. 


Correct me if I'm wrong, but I think that your statement is true *if* 
the networks are set as mandatory, which is neither always wanted nor 
always the case. In our case, we have to disable this mandatory attribute.


I agree that when the networks are mandatory, every host unable to use 
them will end up unavailable.



Differently networked hosts need to be put in a separate cluster.

/K



--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CGPHGFXYI3OZX2XKTLCFZ6W3GN4Q6U4Q/


[ovirt-users] Re: Fencing : SSL or not?

2019-02-22 Thread Nicolas Ecarnot

On 22/02/2019 at 12:13, Martin Perina wrote:

Unfortunately, fence_ipmilan cannot display more debugging details, 
so as mentioned earlier, could you please run ipmitool directly?


ipmitool -vv -I lanplus -H c-hv05.prd.sdis38.fr -p 623 -U stonith -P  -L ADMINISTRATOR chassis power status


Above should display more details ...


root@hv04:/etc# ipmitool -vv -I lanplus -H c-hv05.prd.sdis38.fr -p 623 -U stonith -P 'xxx' -L ADMINISTRATOR chassis power status


Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x8e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x8e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x8e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x8e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x0e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x0e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x0e 0x04 




Sending IPMI command payload
   netfn   : 0x06
   command : 0x38
   data: 0x0e 0x04 


Get Auth Capabilities error
Error issuing Get Channel Authentication Capabilities request
Error: Unable to establish IPMI v2 / RMCP+ session
root@hv04:/etc#
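
For readers wondering what those payloads mean: netfn 0x06 / command 0x38 is the IPMI "Get Channel Authentication Capabilities" request, and its two data bytes can be decoded as in the sketch below. The function name is mine; the bit layout follows my reading of the IPMI v2.0 specification (bit 7 of the first byte asks for IPMI v2.0 extended data, the low nibble is the channel number, and the second byte is the requested privilege level).

```python
from typing import Dict

PRIVILEGE_LEVELS = {
    0x01: "CALLBACK",
    0x02: "USER",
    0x03: "OPERATOR",
    0x04: "ADMINISTRATOR",
    0x05: "OEM",
}


def decode_get_auth_cap(data: bytes) -> Dict[str, object]:
    """Decode the 2-byte request data of Get Channel Authentication
    Capabilities (netfn 0x06, command 0x38)."""
    if len(data) != 2:
        raise ValueError("expected exactly 2 data bytes")
    channel_byte, priv_byte = data
    return {
        "ipmi_v2_extended": bool(channel_byte & 0x80),  # bit 7
        "channel": channel_byte & 0x0F,                 # 0x0E = current channel
        "privilege": PRIVILEGE_LEVELS.get(priv_byte & 0x0F, "UNKNOWN"),
    }
```

Read this way, the first four requests in the trace (0x8e 0x04) ask for v2.0 capabilities at ADMINISTRATOR level on the current channel, and the last four (0x0e 0x04) look like a fallback without the v2.0 bit. None of the eight gets an answer, which fits a BMC that is not listening for IPMI over LAN at all.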

--
Nicolas ECARNOT

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DQKUC2G745CKN6BT2SC3T6LSCEEML7NN/


[ovirt-users] Re: Fencing : SSL or not?

2019-02-22 Thread Nicolas Ecarnot

Hi Martin,

On 21/02/2019 at 13:04, Martin Perina wrote:

Hi Nicolas,

see my reply inline


See mine below.



On Mon, Feb 18, 2019 at 9:51 AM Nicolas Ecarnot <nico...@ecarnot.net> wrote:


Hello,

As fence_idrac has never worked for us, and as fence_ipmilan has worked
nicely for years, we are using fence_ipmilan with the lanplus=1 option
and we're happy with it.

We upgraded to 4.3.0.4 and we're witnessing that we cannot fence our
hosts anymore :

2019-02-18 09:42:08,678+01 ERROR
[org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-11)
[2f78ed99-6703-4d92-b7cb-948c2d24b623] Can not run fence action on host
'x', no suitable proxy host was found.


This is not related to the fence_ipmilan issue below. In order to be able 
to execute a fencing operation, the engine needs at least one other host in Up 
status, which is used as a proxy host to perform the fencing operation. So 
do you have at least one host in Up status in the same 
cluster/datacenter as the host you want to run the fencing operation on?


Yes.

If so, then please enable debug information to find out why we cannot 
find any host acting as fence proxy:


1. Please download the log-control.sh script from 
https://github.com/oVirt/ovirt-engine/tree/master/contrib#log-control-sh 
and save it on the engine machine

2. Please execute the following on the engine machine
   log-control.sh org.ovirt.engine.core.bll.pm DEBUG
3. Go to the problematic host, click Edit, go to the Power Management tab, 
click on the existing fence agent and click on the Test button
4. Take a look at engine.log; it should contain information about why we 
were not able to find a fence proxy


I followed the instructions above, but I feel this is not the best debug 
path. I learned nothing new.
The fence proxy is not missing. It is known and found, and it is trying 
to do its job, as written below :





and on the SPM :

fence_ipmilan: Failed: Unable to obtain correct plug status or plug is
not available


Could you please provide debug output of below command?

ipmitool -vv -I lanplus -H  -p 623 -U  
-P  -L ADMINISTRATOR chassis power status


See below a debug session.
I'm comparing two hosts, and only one is answering fence status queries.

I must add that before the upgrade to 4.3, both hosts were responding 
correctly.


fence_ipmilan --username=stonith --password='xxx' --lanplus 
--ip=c-serv-hv-prds01.sdis.isere.fr --action=status -v
2019-02-22 11:34:01,537 INFO: Executing: /usr/bin/ipmitool -I lanplus -H 
c-serv-hv-prds01.sdis.isere.fr -p 623 -U stonith -P [set] -L 
ADMINISTRATOR chassis power status


2019-02-22 11:34:01,654 DEBUG: 0 Chassis Power is on


Status: ON
root@hv04:/etc# fence_ipmilan --username=stonith --password='xxx' 
--lanplus --ip=c-hv05.prd.sdis38.fr --action=status -v
2019-02-22 11:34:15,335 INFO: Executing: /usr/bin/ipmitool -I lanplus -H 
c-hv05.prd.sdis38.fr -p 623 -U stonith -P [set] -L ADMINISTRATOR chassis 
power status


2019-02-22 11:34:35,338 ERROR: Connection timed out


root@hv04:/etc# nmap c-serv-hv-prds01.sdis.isere.fr

Starting Nmap 6.40 ( http://nmap.org ) at 2019-02-22 11:34 CET
Nmap scan report for c-serv-hv-prds01.sdis.isere.fr (192.168.53.2)
Host is up (0.010s latency).
rDNS record for 192.168.53.2: c-5g3yxx1.sdis.isere.fr
Not shown: 996 closed ports
PORT STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
443/tcp  open  https
5900/tcp open  vnc

Nmap done: 1 IP address (1 host up) scanned in 0.45 seconds
root@hv04:/etc# nmap c-hv05.prd.sdis38.fr

Starting Nmap 6.40 ( http://nmap.org ) at 2019-02-22 11:34 CET
Nmap scan report for c-hv05.prd.sdis38.fr (192.168.50.194)
Host is up (0.00060s latency).
rDNS record for 192.168.50.194: C-550W2S2.sdis.isere.fr
Not shown: 996 closed ports
PORT STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
443/tcp  open  https
5900/tcp open  vnc
MAC Address: CC:C5:E5:57:26:E0 (Unknown)

Nmap done: 1 IP address (1 host up) scanned in 0.20 seconds
root@hv04:/etc# ping -c 1 c-serv-hv-prds01.sdis.isere.fr
PING c-5g3yxx1.sdis.isere.fr (192.168.53.2) 56(84) bytes of data.
64 bytes from c-5g3yxx1.sdis.isere.fr (192.168.53.2): icmp_seq=1 ttl=61 
time=2.37 ms


--- c-5g3yxx1.sdis.isere.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.371/2.371/2.371/0.000 ms
root@hv04:/etc# ping -c 1 c-hv05.prd.sdis38.fr
PING c-550w2s2.prd.sdis38.fr (192.168.50.194) 56(84) bytes of data.
64 bytes from C-550W2S2.sdis.isere.fr (192.168.50.194): icmp_seq=1 
ttl=64 time=0.189 ms


--- c-550w2s2.prd.sdis38.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.189/0.189/0.189/0.000 ms
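
One caveat about the session above: by default nmap scans TCP ports only, whereas IPMI over LAN (RMCP) uses UDP port 623, so the two identical port lists say nothing about whether IPMI is reachable. A more direct probe is an RMCP/ASF Presence Ping, sketched below in Python. The function name and timeout are my own choices; the packet layout is the standard 12-byte ASF ping as I understand it. A BMC with IPMI over LAN enabled is expected to answer with a Presence Pong, but treat a missing answer as a hint rather than proof.

```python
import socket

# RMCP header: version 0x06, reserved, sequence 0xFF (no ACK), class 0x06 (ASF)
# ASF body: IANA enterprise number 4542 (0x000011BE), type 0x80 (Presence Ping),
# message tag, reserved, data length 0
RMCP_PRESENCE_PING = bytes([
    0x06, 0x00, 0xFF, 0x06,
    0x00, 0x00, 0x11, 0xBE,
    0x80, 0x00, 0x00, 0x00,
])


def rmcp_ping(host: str, port: int = 623, timeout: float = 2.0) -> bool:
    """Send an ASF Presence Ping over UDP and report whether a Pong came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(RMCP_PRESENCE_PING, (host, port))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts, name resolution failures and refused ports
            return False
    # A Presence Pong carries the ASF class in byte 3 and type 0x40 in byte 8
    return len(reply) >= 9 and reply[3] == 0x06 and reply[8] == 0x40
```

Running `rmcp_ping("c-hv05.prd.sdis38.fr")` and `rmcp_ping("c-serv-hv-prds01.sdis.isere.fr")` from hv04 would show whether the two BMCs differ at the UDP level, which the ping and nmap output above cannot tell.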




Above is the command which fence_ipmi is internally executing, and -vv 
adds debugging output which can reveal issue with the plug status


Regards,
Martin


I found the sugg

[ovirt-users] Host choice when migrating VMs

2019-02-22 Thread Nicolas Ecarnot

Hello,

I'm almost sure the following is useless as I think I know how it's 
working, but as I'm preparing a major change in our infrastructure, I'd 
rather be sure and not mess up. And also to be sure.

(Just to be sure)

For some reasons, and for the first time in our infra., one of our new 
DC will temporarily include heterogeneous hosts: some networks will be 
available only on some of them.


Could someone please confirm that, for every load balancing / VM 
startup / VM migration / host choice decision, oVirt will smartly choose an 
available host equipped with the adequate networks?


--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QGX3PHA4T3SXXDTYZ4VGY6UHECO7P6V5/


[ovirt-users] Fencing : SSL or not?

2019-02-18 Thread Nicolas Ecarnot

Hello,

As fence_idrac has never worked for us, and as fence_ipmilan has worked 
nicely for years, we are using fence_ipmilan with the lanplus=1 option 
and we're happy with it.


We upgraded to 4.3.0.4 and we're witnessing that we cannot fence our 
hosts anymore :


2019-02-18 09:42:08,678+01 ERROR 
[org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-11) 
[2f78ed99-6703-4d92-b7cb-948c2d24b623] Can not run fence action on host 
'x', no suitable proxy host was found.


and on the SPM :

fence_ipmilan: Failed: Unable to obtain correct plug status or plug is 
not available


I found the suggested workaround here :

https://access.redhat.com/solutions/3349841

but no combination of
- lanplus={0,1}
- -z
- ssl={0,1}

led to a solution.

The package version is the same as what's described in the KB :
fence-agents-rhevm-4.2.1-11.el7_6.7.x86_64

What should I test now?

Thank you.

--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SEUAZ6JB6CIYY2GOBNJN2XSWOSH6DHDJ/


[ovirt-users] Re: Forum available

2019-02-08 Thread Nicolas Ecarnot

On 08/02/2019 at 09:05, Josep Manel Andrés Moscardó wrote:

Hi all,
I am just wondering if anyone, like me, would like to have everything that 
is posted here in a forum, with all the benefits it brings


Absolutely.

Digging through mail archives is sometimes painful.

(and people 
will still be able to subscribe and reply through email). Something like 
Discourse would be nice in my opinion.


Best.


List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TUU357HINGWFA23T3SMKDVTM7EKLX6VS/




--
Nicolas ECARNOT
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/H427EVNMN3NZHB7NGW4Z62IOPRIGFNGP/


[ovirt-users] Re: Bug in the web interface?

2019-02-06 Thread Nicolas Ecarnot

On 06/02/2019 at 15:42, Greg Sheremeta wrote:
On Wed, Feb 6, 2019 at 6:33 AM Nicolas Ecarnot <nico...@ecarnot.net> wrote:


On 06/02/2019 at 10:53, Lucie Leistnerova wrote:
>
> On 2/6/19 10:22 AM, Nicolas Ecarnot wrote:
>> Hi Lucie,
>>
>> On 06/02/2019 at 10:02, Lucie Leistnerova wrote:
>>> I'm sorry, my mistake: I did not mention that the package should be
>>> removed without dependencies.


Same -- sorry, ugh.
For anyone in the same situation, the better thing to do now is simply 
'yum update ovirt-engine-ui-extensions'

That will remove the old dashboard correctly.
https://github.com/oVirt/ovirt-engine-ui-extensions/blob/master/packaging/spec.in#L16



Thank you. We need this kind of wheel-greasing as oVirt's complexity 
increases.






To sum up, I think what I'm missing is clear and solid
documentation, or an official Red Hat statement, about whether, what, how and
when we can or cannot update (with "yum update") the engine host and/or
the hosts.


Not Red Hat -- oVirt :)


Yep, Greg Sheremeta  ;-)


Indeed, we need an Upgrade Guide update. I'll look into it.

Generally, on my dev instances (which are probably nowhere near as 
complicated as your setups), I run 'yum update' followed by 
'engine-setup'.


Actually, my experience is that yum-upgrading the engine was harmless most 
of the time, but yum-upgrading the hosts led to complex situations.


I'm at a point where I no longer update my hosts with yum update, and 
rely only on oVirt's update (either via the web GUI or ansible's 
cluster upgrade), which only updates some of the packages.


I'd rather have a strong enough RPM environment around oVirt preventing 
any issue (the version lock usage shows that it's already a concern 
oVirt's people are dealing with and I thank you. Keep strengthening.)



--

Nicolas ECARNOT

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TQAYEZSGMLQCWFJTMAUERABCUNYWG3N6/


[ovirt-users] Re: Bug in the web interface?

2019-02-06 Thread Nicolas Ecarnot

On 06/02/2019 at 10:53, Lucie Leistnerova wrote:


On 2/6/19 10:22 AM, Nicolas Ecarnot wrote:

Hi Lucie,

On 06/02/2019 at 10:02, Lucie Leistnerova wrote:
I'm sorry, my mistake: I did not mention that the package should be removed 
without dependencies.


rpm -e --nodeps ...


I'll write that down.



When looking at the log file above
(https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=) 
[...]
"/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py", 



The error is caused by the missing ovirt-engine-dbscripts.


OK

Well, I thought I had messed up the packages, and I thought a complete 
yum update would help, as I read:

On 05/02/2019 at 15:19, Greg Sheremeta wrote:



The fix is pushed. Standalone engine upgrades should be fine starting
now. `yum update` any appliance engines or already upgraded 
engines to get the latest ovirt-engine-ui-extensions, which fixes 
the problem.


So I ran a yum update.

This package is part of ovirt-engine versionlock so can't be 
installed/updated separately.
engine-setup should install the missing packages. I tried it by 
myself and it fixed the issue.


   [install] 
ovirt-engine-dbscripts-4.3.0.5-0.0.master.20190205084851.gitaaebfc9.el7.noarch 
will be installed


I see I have this package, though in an older version :
# rpm -qa|grep -i dbscripts
ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch

The version shouldn't be a problem. I tested it in upstream oVirt. Now I 
tried with the same version.


Try to remove that package and install again. Versionlock seems to 
differ here so I was able to install it separately, if not run 
engine-setup.


# rpm -e --nodeps ovirt-engine-dbscripts


Indeed, it found a lot of missing files/dir.



# yum install ovirt-engine-dbscripts


I forgot to set LANG=C so you'll read some parts in French, but you'll get 
the idea:



root@mvm01:/tmp# yum install ovirt-engine-dbscripts
Modules complémentaires chargés : fastestmirror, versionlock
Loading mirror speeds from cached hostfile
 * base: centos.mirror.fr.planethoster.net
 * epel: pkg.adfinis-sygroup.ch
 * extras: ftp.pasteur.fr
 * ovirt-4.3: ovirt.repo.nfrance.com
 * ovirt-4.3-epel: pkg.adfinis-sygroup.ch
 * updates: centos.mirror.fr.planethoster.net
Excluding 1 update due to versionlock (use "yum versionlock status" to 
show it)

Résolution des dépendances
--> Lancement de la transaction de test
---> Le paquet ovirt-engine-dbscripts.noarch 0:4.3.0.4-1.el7 sera installé
--> Résolution des dépendances terminée

Dépendances résolues

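================================================================================
 Package                   Architecture  Version         Dépôt        Taille
================================================================================
Installation :
 ovirt-engine-dbscripts    noarch        4.3.0.4-1.el7   ovirt-4.3    331 k

Résumé de la transaction
================================================================================
Installation   1 Paquet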

Taille totale des téléchargements : 331 k
Taille d'installation : 1.6 M
Is this ok [y/d/N]: y
Downloading packages:
ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch.rpm            | 331 kB  00:00:02

Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Avertissement : RPMDB a été modifiée par une autre application que yum.
** 1 problèmes RPMDB préexistants trouvés, la sortie de « yum check » est la suivante :
ovirt-engine-4.3.0.4-1.el7.noarch a des dépendances manquantes de ovirt-engine-dbscripts = ('0', '4.3.0.4', '1.el7')
  Installation : ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch              1/1
  Vérification : ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch              1/1

Installé :
  ovirt-engine-dbscripts.noarch 0:4.3.0.4-1.el7

Terminé !

-

After that, I ran engine-setup again and it went OK.
Now, my oVirt DC and dashboard are back to life, thanks to you Lucie.
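
For readers hitting the same engine-setup failure, the recovery described in this thread could be sketched as follows. This is only a sketch under assumptions: the helper name dbscripts_missing is made up for illustration, the directory and package names are the ones quoted in this thread, and the package commands are only printed, since they need root on a real engine machine.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the dbscripts directory is absent.
# The default path is the one checked with "ls" earlier in this thread.
dbscripts_missing() {
    dir=${1:-/usr/share/ovirt-engine/dbscripts}
    [ ! -d "$dir" ]
}

if dbscripts_missing; then
    # Print the sequence that worked here; nothing is executed automatically.
    echo "dbscripts directory is missing; the sequence used in this thread was:"
    echo "  rpm -e --nodeps ovirt-engine-dbscripts"
    echo "  yum install ovirt-engine-dbscripts"
    echo "  engine-setup"
else
    echo "dbscripts directory is present"
fi
```

Printing the commands rather than running them is deliberate: removing a package with --nodeps leaves the RPM database temporarily inconsistent, so it seems safer to run each step by hand and read its output.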

To sum up, I think what I'm missing is clear and solid documentation, 
or an official Red Hat statement, about whether, what, how and when we can 
or cannot update (with "yum update") the engine host and/or the hosts.


??

--
Nicolas ECARNOT
__

[ovirt-users] Re: Bug in the web interface?

2019-02-06 Thread Nicolas Ecarnot

Hi Lucie,

Le 06/02/2019 à 10:02, Lucie Leistnerova a écrit :
I'm sorry, my mistake I did not mention to remove the package without 
dependencies.


rpm -e --nodeps ...


I'll write that down.



When looking at the log file above
(https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=) 
[...]
"/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py", 

The error is caused by the missing ovirt-engine-dbscripts.


OK

Well, I thought I had messed up the packages, and I thought a complete yum 
update would help, as I read :

Le 05/02/2019 à 15:19, Greg Sheremeta wrote :



The fix is pushed. Standalone engine upgrades should be fine starting
now. `yum update` any appliance engines or already upgraded engines 
to get the latest ovirt-engine-ui-extensions, which fixes the problem.


So I ran a yum update.

This package is part of ovirt-engine versionlock so can't be 
installed/updated separately.
engine-setup should install the missing packages. I tried it by myself 
and it fixed the issue.


   [install] 
ovirt-engine-dbscripts-4.3.0.5-0.0.master.20190205084851.gitaaebfc9.el7.noarch 
will be installed


I see I have this package, though in an older version :
# rpm -qa|grep -i dbscripts
ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch



Not sure what went wrong by you, send please the setup log and the 


>> 
(https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=)


ovirt-engine* rpms list. And also result of 'ls 
/usr/share/ovirt-engine/dbscripts'


# LANG=C ls -la /usr/share/ovirt-engine/dbscripts
ls: cannot access /usr/share/ovirt-engine/dbscripts: No such file or 
directory


You seem to hit the point.

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DA3RSDSTLAHWDCIAZNAGRUMKFHT7Y2GN/


[ovirt-users] Re: Bug in the web interface?

2019-02-06 Thread Nicolas Ecarnot
Le 05/02/2019 à 15:19, Greg Sheremeta wrote :


The fix is pushed. Standalone engine upgrades should be fine starting 
now. `yum update` any appliance engines or already upgraded engines to 
get the latest ovirt-engine-ui-extensions, which fixes the problem.


So I ran a yum update.

After running engine-setup again, it is failing the same way.
I compared the complete rpm list with another 4.3 DC with no issue, and 
apart from the removed ovirt-engine-dashboard package and obviously many 
upgraded packages, I see no obvious missing parts.


I'm at a loss and don't know how to save this DC, so any help is welcome.

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7QT44H4DEIZPZVMBO6UPRQ6GZWAKWP3S/


[ovirt-users] Re: [4.3.0] VNC Virt-viewer console not opening

2019-02-05 Thread Nicolas Ecarnot

Hello Greg,

Le 04/02/2019 à 21:13, Greg Sheremeta a écrit :

When I try to use Spice instead of VNC, it is working nicely.


My goal is to stick to VNC.


When I try to use noVNC, the additional tab opens and shows
"Unsupported
security types: 19"


Looks like https://bugzilla.redhat.com/show_bug.cgi?id=1659155

Can you try disabling vnc security on the cluster and then reboot the host?


VNC security is already disabled.


What could I give to help you help me?

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ARBA5SBJLY3QS73XYRJYQ7F7TZJ5KOYT/


[ovirt-users] [4.3.0] VNC Virt-viewer console not opening

2019-02-04 Thread Nicolas Ecarnot

Hello,

First, congratulations to all of you who worked for this 4.3.0 release, 
and obviously thank you.


Today, I upgraded 4 oVirt setups (4 DC) from 4.2.7 to 4.3.0.
It went well on all 4 DCs.

But on one of them, when I try to open a console, I see it open as a 
flash (it opens and closes immediately).


I'm using Firefox 64.0 with Ubuntu 18.10, and all my VMs are set up like 
this :

- video type : QXL
- Gfx protocol : VNC
- VNC Kbd layout : fr
and I'm using virt-viewer

On the problematic DC, all the VMs are showing the same issue.

When I try to use Spice instead of VNC, it is working nicely.
When I try to use noVNC, the additional tab opens and shows "Unsupported 
security types: 19"


I tried to track down this issue thanks to the firefox dev console, but 
it's beyond my understanding.


Trying the same with Chromium does the same blinking open/close.

I'd like to learn how to provide additional debug messages, but
/var/log/ovirt-engine/engine.log does not give any useful hint :

2019-02-04 16:57:04,150+01 INFO [org.ovirt.engine.core.bll.SetVmTicketCommand] (default task-24) [1fb01d42] Running command: SetVmTicketCommand internal: false. Entities affected :  ID: 0c3e02b3-7fec-4bb1-b3d6-2e6c228e7278 Type: VMAction group CONNECT_TO_VM with role type USER
2019-02-04 16:57:04,155+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SetVmTicketVDSCommand] (default task-24) [1fb01d42] START, SetVmTicketVDSCommand(HostName = hv01.prd.sdis38.fr, SetVmTicketVDSCommandParameters:{hostId='687c1c01-a5e1-449c-89d2-9713ccfc2487', vmId='0c3e02b3-7fec-4bb1-b3d6-2e6c228e7278', protocol='VNC', ticket='IivrpGHx5zSw', validTime='120', userName='admin', userId='4a340386-851a-11e8-863d-3417ebeef1af', disconnectAction='NONE'}), log id: 2a897f30
2019-02-04 16:57:04,188+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SetVmTicketVDSCommand] (default task-24) [1fb01d42] FINISH, SetVmTicketVDSCommand, return: , log id: 2a897f30
2019-02-04 16:57:04,211+01 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-24) [1fb01d42] EVENT_ID: VM_SET_TICKET(164), User admin@internal-authz initiated console session for VM ad02.ctat.sdis38.fr

What could I give to help you help me?

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KGCM25ILBTQTY6NLVJUDE7CNF5C5BRE7/


[ovirt-users] Re: The admin portal ui should be more simplified

2019-01-10 Thread Nicolas Ecarnot

Le 10/01/2019 à 15:13, fle...@hotmail.com a écrit :

We have an RHV setup of 11 datacenters, 11 clusters, 40 hosts and 300 VMs.
The 4 of us administrators are suffering from the lack of active area in 
the new 4.2 UI. The manipulation logic also confuses us.
A simple operation needs more clicks than before.
Please just make the UI simpler.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ETR6Q5YWUFTF6Y6RN6SHEAURJBK7OGOQ/



Hello,

Would it be wise to suggest two clever ways to deal with complexity :

- ManageIQ
- Ansible

We use them both, and are quite happy with them.

Regards,

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MSVKUQMBBXUOVOAWE5FICFL5MACXWERT/


[ovirt-users] Re: Trouble connecting to IDRAC7

2018-08-01 Thread Nicolas Ecarnot

Le 01/08/2018 à 15:28, Jayme a écrit :
I just enabled power management/fencing successfully on two of my hosts 
(Dell poweredge R720s with Idrac 7) but am failing to add the third.  I 
enter the IP and user/pass like the others, it takes 15 seconds or so, 
then it spits out "Test Failed: Internal JSON-RPC error"


I tried resetting the IDRAC on that server.  I can also ping it and 
access it fine in a web browser.  I can ping it from the host as well.


Is there any configuration in IDRAC that could be blocking the fence 
attempt or any logs in oVirt I can look at to figure out what might be 
happening with the connection?


I see there is a "fence_idrac" command on the hosts but unsure what 
switches to use with it to test.


Thanks!


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UJQDE3W6NSZWLMSZJQZD7OZM4CYEMNKI/



Hello Jayme,

All our iDRACs are successfully power-managed this way :

type : ipmilan
options : lanplus=1

In the Drac, we use a dedicated user with the appropriate rights.
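
As for which switches to use when testing from a host: with the settings above (type ipmilan, lanplus=1), a status query could look like the command built below. This is a sketch under assumptions: the address, user and password are placeholders and not values from this thread, the helper name fence_status_cmd is made up, and the long option names should be checked against "fence_ipmilan --help" on the installed version. The script only prints the command, it does not run it.

```shell
#!/bin/sh
# Build (and print) a fence_ipmilan status query; nothing is executed here.
# --lanplus corresponds to the "lanplus=1" option set in the oVirt GUI.
fence_status_cmd() {
    ip=$1; user=$2; pass=$3
    printf 'fence_ipmilan --ip=%s --username=%s --password=%s --lanplus --action=status\n' \
        "$ip" "$user" "$pass"
}

# Placeholder values, to be replaced with the iDRAC address and dedicated user.
fence_status_cmd 192.0.2.10 fenceuser secret
```

Running the printed command by hand on the host usually gives a more readable error than the "Internal JSON-RPC error" shown by the GUI test button.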

HTH

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FTT6IBBAONVMLWWHDW3W76KWT433AYQ2/


[ovirt-users] [No question] NFS disabled, hosts wandering tearful

2018-08-01 Thread Nicolas Ecarnot

Hello,

This is a simple testimony about what happened yesterday in one of our DC.
This DC runs on a dedicated bare-metal engine, oversized compared to the 
need, thus I've added a NFS service on it to host a small storage domain 
and the ISO storage domain.
Yesterday, after having received the colorful announcement about the 4.2.5 
version, I decided to upgrade.
As our engine was still on CentOS 7.4, I first upgraded its OS version 
to 7.5, then rebooted. Smooth.

Then I followed the very usual oVirt engine upgrade path. Smooth.
Eventually, I upgraded the hosts with ovirt-ansible-cluster-upgrade as 
usual.


The result was frightening because the hosts were put in maintenance, 
upgraded, back to life, seen unavailable, unreachable, connecting, 
alive, rebooted, then back to another turn and looping...
During this, the SPM role was obviously jumping around, and that did not 
help the debug.


In the end, it appeared that something during an upgrade stopped and 
disabled the NFS service. My hosts partially relied on it, so after 
having restarted the NFS service, all came back to life.
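
One way to catch this kind of surprise would be to snapshot the list of enabled units before an upgrade and compare it afterwards. The sketch below assumes two text files with one unit name per line; the helper name units_disabled_since and the file paths in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Print the units present in the "before" snapshot but absent from the "after" one.
# Both files are expected to hold one unit name per line.
units_disabled_since() {
    before=$1; after=$2
    # comm needs sorted input, so sort both snapshots into temporary copies.
    sort "$before" > "$before.sorted"
    sort "$after"  > "$after.sorted"
    comm -23 "$before.sorted" "$after.sorted"
    rm -f "$before.sorted" "$after.sorted"
}

# Usage on a real machine (first line before the upgrade, second line after it):
#   systemctl list-unit-files --state=enabled --no-legend | awk '{print $1}' > /root/enabled.before
#   systemctl list-unit-files --state=enabled --no-legend | awk '{print $1}' > /root/enabled.after
#   units_disabled_since /root/enabled.before /root/enabled.after
```

This only detects units that were disabled; a unit that was merely stopped would need a separate check, for example with "systemctl is-active".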


The NFS disabling may come from the CentOS upgrade, except if someone 
tells me it could come from something on the oVirt side?


I'm sure the RH people will advise me not to run NFS on the engine, but 
apart from this event, I have had no trouble doing this for years.


Regards,

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GB72URRHAB3TNUO4QQBRMWITGTLSJBZJ/


[ovirt-users] Re: Is enabling Epel repo will break the installation?

2018-07-23 Thread Nicolas Ecarnot

Le 23/07/2018 à 15:33, Arman Khalatyan a écrit :

Hello,
As I remember some time ago the epel collectd was in conflict with the
ovirt one.
Is it still the case?
Thanks,
Arman.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/S4SYV6L5EIW36B3CIR7VWA42FNJCDCUG/



Hello,

With a recent 4.2.4.5-1.el7 it was still the case...

I just excluded collectd from epel.repo and it was OK.
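
For the record, excluding a package from one repository is done with an "exclude=" line in that repository's section. The sketch below shows one possible way to add it; the helper name add_repo_exclude is made up, the section name [epel] and the file path in the usage comment are assumptions about a standard EPEL install, and the function is not idempotent (running it twice adds the line twice).

```shell
#!/bin/sh
# Add an "exclude=<pattern>" line right after the "[<section>]" header of a repo file.
add_repo_exclude() {
    repo_file=$1; section=$2; pattern=$3
    awk -v sec="[$section]" -v pat="$pattern" '
        { print }                                 # copy every line unchanged
        $0 == sec { print "exclude=" pat }        # then add the exclude after the header
    ' "$repo_file" > "$repo_file.new" && mv "$repo_file.new" "$repo_file"
}

# Usage (as root, after making a backup of the file):
#   add_repo_exclude /etc/yum.repos.d/epel.repo epel 'collectd*'
```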

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GYZPPUBDSNGKKUYANCEHRRCOHKPUY24N/


[ovirt-users] Host reboot failed

2018-07-13 Thread Nicolas Ecarnot

Hello,

[oVirt 4.2.4.5-1.el7]

Sequence :
- Among 7 active UP hosts, one of them runs zero VM
- On this (still in UP state) host, I run a SSH-restart via the web GUI
- The host gracefully shuts down then reboots, with no issue
- In the web GUI, as in real life, the host stays in Reboot state forever

At this point, the engine can ping it, can ssh-connect to it, the host 
seems to have zero issue.


In the web GUI, I can not put it into active state because it is not in 
maintenance state. It stays in reboot state.
Nor can I put it in maintenance state, because it stays in reboot 
state.


This state lasted long enough to allow me to type this mail and look into 
the logs, and as I was about to send the logs, I saw the host returning to 
life (its state came back as UP).
I don't type fast, so after the host finished rebooting, maybe 5 or 
10 minutes passed before the engine linked to the host again.


Before posting additional logs and comments, does anybody know if this 
is a known bug or behavior, or do I have to open a BZ?


Regards,

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/B5HCXSJR57LQ2SNRFK4POUIX7Z2DX2S6/


[ovirt-users] Re: Lost host after upgrade/reboot

2018-06-19 Thread Nicolas Ecarnot

Le 19/06/2018 à 10:14, Nicolas Ecarnot a écrit :
In this engine log above, you see that I'm using my account to manage 
this engine, as I have been doing for years with no issue.
I'll try the exact same path with admin@internal to see what could 
change, but I don't see the link.


I just tried on another host, using admin@internal, and the same issue 
occurred.



What other logs could I give you to debug this?

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Q2KI7OJKUYJLZ3MQU5LPBQW77A5A4YOX/


[ovirt-users] Lost host after upgrade/reboot

2018-06-19 Thread Nicolas Ecarnot

Hello,

TL;DR : engine stops talking with rebooted host.


[oVirt 4.2.3.5-1.el7.centos]

- From the web GUI, upgrading a host, leaving the reboot checkbox checked
- upgrade is OK (/var/log/yum.log is showing successful updates + the 
Ansible host deploy log is also OK)

- reboot is OK (clean, SSH OK...)
- the host eventually appears as "Install failed"
- the engine.log is telling :


2018-06-19 10:02:24,896+02 ERROR [org.ovirt.engine.core.bll.SshHostRebootCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] SSH reboot command failed on host 'serv-hv-prds06': SSH session timeout host 'root@ serv-hv-prds06' Stdout: Stderr:
2018-06-19 10:02:25,028+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] EVENT_ID: SYSTEM_FAILED_SSH_HOST_RESTART(198), A restart using SSH initiated by the engine to Host serv-hv-prds06 has failed.
2018-06-19 10:02:25,185+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] START, SetVdsStatusVDSCommand(HostName = serv-hv-prds06, SetVdsStatusVDSCommandParameters:{hostId='9c1566a4-8432-4de6-b30d-fd3b8e5fafca', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 833f9bd
2018-06-19 10:02:25,191+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] FINISH, SetVdsStatusVDSCommand, log id: 833f9bd
2018-06-19 10:02:25,191+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.UpgradeHostInternalCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] Engine failed to restart via ssh host 'serv-hv-prds06' ('9c1566a4-8432-4de6-b30d-fd3b8e5fafca') after upgrade
2018-06-19 10:02:25,256+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID: HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06 (User: necar...@sdis.isere.fr@SDIS38-authz).
2018-06-19 10:02:30,755+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-69) [8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID: HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06 (User: necar...@sdis.isere.fr@SDIS38-authz).


- Manually activating the host puts it back on track without issue

The usual SSH communications between the engine and the host are usually 
very sound (VM migrations, maintenance...).


On this oVirt DC, I reproduced this issue twice on 2 different hosts.

In this engine log above, you see that I'm using my account to manage 
this engine, as I have been doing for years with no issue.
I'll try the exact same path with admin@internal to see what could 
change, but I don't see the link.


What other logs could I give you to debug this?

Regards,

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CT5KHY3C2ASOXBVNUIEBG5WA42JKJGXH/


[ovirt-users] Re: Hosts : Upgrade failed - 4.2.3

2018-05-17 Thread Nicolas Ecarnot

Le 16/05/2018 à 12:55, Fred Rolland a écrit :

It looks like you still have 4.1 repos...


Yes.

I thought Ansible was in charge of disabling the older repos.

It does not seem to be the case, does it?
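
To check a host for leftover repositories by hand, something like the sketch below could help. It is only an illustration: the helper name stale_repo_files is made up, and the repo id pattern in the usage comment is an assumption based on the "ovirt-4.1" path seen in the 404 errors of this thread.

```shell
#!/bin/sh
# List the .repo files under a directory that still mention a given release string.
stale_repo_files() {
    repo_dir=$1; old=$2
    grep -l -- "$old" "$repo_dir"/*.repo 2>/dev/null
}

# Usage on a host:
#   stale_repo_files /etc/yum.repos.d ovirt-4.1
# Repos found this way could then be disabled with yum-utils, for example:
#   yum-config-manager --disable 'ovirt-4.1*'
```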

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org


[ovirt-users] Hosts : Upgrade failed - 4.2.3

2018-05-16 Thread Nicolas Ecarnot

Hello,

I was on 4.2.2 and it failed.
I upgraded to 4.2.3 and it's still failing.

From the GUI, I switch one host into maintenance mode, try to upgrade 
it, and it is failing.


On the engine, the engine.log is not saying anything helpful.
But on the engine, in 
/var/log/ovirt-engine/host-deploy/ovirt-host-mgmt-ansible-20180516121013-xxx-dacf1972-f184-4d01-a863-7974579e6bc8.log, 
I see :



http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.8/repodata/repomd.xml:
 [Errno 14] HTTP Error 404 - Not Found
Essai d'un autre miroir.
To address this issue please refer to the below wiki article 


https://wiki.centos.org/yum-errors

If above article doesn't help to resolve this issue please use 
https://bugs.centos.org/.

http://mirror.centos.org/centos/7/virt/x86_64/ovirt-4.1/repodata/repomd.xml: 
[Errno 14] HTTP Error 404 - Not Found
Essai d'un autre miroir.


This is french, but I'm sure you understand that it translates into 
"gluster repo issue".


Is there something I could do?

Thank you.

--
Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org


[ovirt-users] Why RAW images when using GlusterFS?

2018-04-05 Thread Nicolas Ecarnot

Hello,

Amongst others, I have one 3.6 DC that has been working very well for 
years, all based on GlusterFS.
When having a close look (qemu-img info) at the images, I see their 
format is all RAW and not QCOW2.
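
The check mentioned above can be scripted by extracting the "file format:" line from the text printed by qemu-img info. A minimal sketch, where the helper name image_format and the path in the usage comment are made up for illustration:

```shell
#!/bin/sh
# Read "qemu-img info" text on stdin and print the value of its "file format:" line.
image_format() {
    awk -F': ' '$1 == "file format" { print $2 }'
}

# Usage:
#   qemu-img info /path/to/volume | image_format
```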


I never noticed or bothered before, but I'm wondering :
- is it by design?
- is it something we can change (I'd prefer qcow2)?
- are there some limitations?

And finally, I have the same questions about NFS storage domains.

Thank you.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] VM has been paused due to NO STORAGE SPACE ERROR ?!?!?!?!

2018-03-16 Thread Nicolas Ecarnot

Le 16/03/2018 à 15:48, Alex Crow a écrit :

On 16/03/18 13:46, Nicolas Ecarnot wrote:

Le 16/03/2018 à 13:28, Karli Sjöberg a écrit :



Den 16 mars 2018 12:26 skrev Enrico Becchetti 
<enrico.becche...@pg.infn.it>:


   Dear All,
    Has anyone seen that error ?


Yes, I experienced it dozens of times on 3.6 (my 4.2 setup has 
insufficient workload to trigger such event).

And in every case, there was no actual lack of space.


    Enrico Becchetti Servizio di Calcolo e Reti
I think I remember something to do with thin provisioning and not 
being able to grow fast enough, so out of space. Are the VM's disk 
thick or thin?


All our storage domains are thin-prov. and served by iSCSI (Equallogic 
PS6xxx and 4xxx).


Enrico, do you know if a bug has been filed about this?

Did the VM remain paused? In my experience the VM just gets temporarily 
paused while the storage is expanded. RH confirmed to me in a ticket 
that this is expected behaviour.


AFAIR, most of them went back up and running by themselves (we had to 
manually resume some of them from time to time).

The storage side weakness is an interesting trail to follow.
We also experienced this behavior when migrating lots of VMs at once, 
yet using a dedicated storage network.


Having been on this mailing list for a long time, I remember we already 
discussed several times how oVirt can appear sensitive to storage 
latencies to some users. On my side, the site where most of our workload 
resides is still on 3.6, so I cannot yet witness the efforts the oVirt devs 
have made to cope with this in 4.2, but I'm sure they did.


--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] VM has been paused due to NO STORAGE SPACE ERROR ?!?!?!?!

2018-03-16 Thread Nicolas Ecarnot

Le 16/03/2018 à 13:28, Karli Sjöberg a écrit :



Den 16 mars 2018 12:26 skrev Enrico Becchetti <enrico.becche...@pg.infn.it>:

   Dear All,
Has anyone seen that error ?


Yes, I experienced it dozens of times on 3.6 (my 4.2 setup has 
insufficient workload to trigger such event).

And in every case, there was no actual lack of space.


Enrico Becchetti Servizio di Calcolo e Reti
I think I remember something to do with thin provisioning and not being 
able to grow fast enough, so out of space. Are the VM's disk thick or thin?


All our storage domains are thin-prov. and served by iSCSI (Equallogic 
PS6xxx and 4xxx).


Enrico, do you know if a bug has been filed about this?

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] firewall node

2018-03-09 Thread Nicolas Ecarnot

https://www.mail-archive.com/users@ovirt.org/msg46608.html


Le 09/03/2018 à 20:12, Fabrice SOLER a écrit :

Hello,

I am trying to open a port on the node.

For that, in the cluster configuration I have chosen firewalld, and I have 
created the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file:

- name: Enable additional port on firewalld
  firewalld:
    port: "12345/tcp"
    permanent: yes
    immediate: yes
    state: enabled

Then I rebooted the node as described at this link:

https://www.ovirt.org/blog/2017/12/host-deploy-customization/

On the node, after the reboot, I checked iptables (iptables -L) and 
the port is not open.

I have just updated the engine and the node is 4.2.1.1.

Is there some change about firewalld in this version? (in 4.2.0 it 
worked)

Sincerely
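
Since the cluster is configured for firewalld, it may be worth checking the port with firewall-cmd rather than with iptables -L alone. A minimal sketch, assuming the default zone is the relevant one; the helper name port_listed is made up for illustration:

```shell
#!/bin/sh
# Succeed when the given port/protocol appears in the space-separated list
# that "firewall-cmd --list-ports" prints on stdin.
port_listed() {
    tr ' ' '\n' | grep -qx -- "$1"
}

# Usage on the node:
#   firewall-cmd --list-ports | port_listed 12345/tcp && echo "port is open"
```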

--






--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Power off VM from VM portal

2018-03-07 Thread Nicolas Ecarnot

Le 07/03/2018 à 13:42, Alexandr Krivulya a écrit :



06.03.2018 17:39, Nicolas Ecarnot пишет:

Le 06/03/2018 à 16:02, Alexandr Krivulya a écrit :

Hi,

is there any way to power off VM from VM portal (4.2.1.7)? I can't 
find "power off" button, just "shutdown".





Hello Alexandr,

After having clicked on the VM link, you'll notice that on the right 
of the Shutdown button is an arrow allowing you to access to the Power 
Off feature.


I can't find this arrow on the Shutdown button







Oh sorry, I answered in the context of the admin portal.
Indeed, in the VM portal, I don't see this Power Off button either.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Power off VM from VM portal

2018-03-06 Thread Nicolas Ecarnot

Le 06/03/2018 à 16:02, Alexandr Krivulya a écrit :

Hi,

is there any way to power off VM from VM portal (4.2.1.7)? I can't find 
"power off" button, just "shutdown".





Hello Alexandr,

After having clicked on the VM link, you'll notice that on the right of 
the Shutdown button is an arrow allowing you to access to the Power Off 
feature.


Regards,

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Importing VM fails with "No space left on device"

2018-03-06 Thread Nicolas Ecarnot

Hello,

When importing a VM, I'm facing the known bug :
https://access.redhat.com/solutions/2770791

QImgError: ecode=1, stdout=[], stderr=['qemu-img: error while writing 
sector 93569024: No space left on device'


The difference between my case and what is described in the RH webpage 
is that I have no "Failed to flush the refcount block cache".


Here is what I see :


ecfbd1a4-f9d2-463a-ade6-def5bd217b43::DEBUG::2018-03-06 09:57:36,460::utils::718::root::(watchCmd) FAILED: <err> = ['qemu-img: error while writing sector 205517952: No space left on device']; <rc> = 1
ecfbd1a4-f9d2-463a-ade6-def5bd217b43::ERROR::2018-03-06 09:57:36,460::image::865::Storage.Image::(copyCollapsed) conversion failure for volume ac08bc8d-1eea-449a-a102-cf763c6726c8
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 860, in copyCollapsed
    volume.fmt2str(dstVolFormat))
  File "/usr/lib/python2.7/site-packages/vdsm/qemuimg.py", line 207, in convert
    raise QImgError(rc, out, err)
QImgError: ecode=1, stdout=[], stderr=['qemu-img: error while writing sector 205517952: No space left on device'], message=None
ecfbd1a4-f9d2-463a-ade6-def5bd217b43::ERROR::2018-03-06 09:57:36,461::image::878::Storage.Image::(copyCollapsed) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 866, in copyCollapsed
    raise se.CopyImageError(str(e))
CopyImageError: low level Image copy failed: ("ecode=1, stdout=[], stderr=['qemu-img: error while writing sector 205517952: No space left on device'], message=None",)


I followed the advice in the RH webpage (check whether the figures agree 
between the qemu-img sizes and the meta-data file), and they 
seem to be correct :


root@serv-hv-adm30:/etc# qemu-img info 
/rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr\:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8
image: 
/rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8

file format: qcow2
virtual size: 98G (105226698752 bytes)
disk size: 97G
cluster_size: 65536
Format specific information:
compat: 0.10
refcount bits: 16

root@serv-hv-adm30:/etc# cat 
/rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr\:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8.meta 
DOMAIN=be2878c9-2c46-476b-bfae-8b02a4679022

CTIME=1520318755
FORMAT=COW
DISKTYPE=1
LEGALITY=LEGAL
SIZE=205520896
VOLTYPE=LEAF
DESCRIPTION=
IMAGE=a5d68d88-3b54-488d-a61e-7995a1906994
PUUID=----
MTIME=0
POOL_UUID=
TYPE=SPARSE
EOF
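
The comparison between the two outputs above can be written down as a small calculation. It assumes that SIZE in the .meta file is counted in 512-byte sectors; the variable names are made up for illustration, the two numbers are the ones shown above.

```shell
#!/bin/sh
# SIZE from the .meta file, assumed to be a count of 512-byte sectors.
meta_size_sectors=205520896
# "virtual size" in bytes as reported by qemu-img info.
virtual_size_bytes=105226698752

meta_size_bytes=$((meta_size_sectors * 512))
echo "$meta_size_bytes"

if [ "$meta_size_bytes" -eq "$virtual_size_bytes" ]; then
    echo "sizes match"
else
    echo "sizes differ"
fi
```

Under that assumption, 205520896 * 512 = 105226698752, which is exactly the virtual size reported by qemu-img, so the metadata and the image agree on the size.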


So I don't see what's wrong?

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVIRT 4.1 / iSCSI Multipathing

2018-03-05 Thread Nicolas Ecarnot

Hello,

[Unusual setup]
Last week, I eventually managed to make a 4.2.1.7 oVirt work with 
iscsi-multipathing on both hosts and guest, connected to a Dell 
Equallogic SAN which is providing one single virtual ip - my hosts have 
two dedicated NICS for iscsi, but on the same VLAN. Torture-tests showed 
good resilience.


[Classical setup]
But this year, we plan to create at least two additional DCs but to 
connect their hosts to a "classical" SAN, ie which provides TWO IPs on 
segregated VLANs (not routed), and we'd like to use the same 
iscsi-multipathing feature.


The discussion below could lead one to think that oVirt needs the two iSCSI 
VLANs to be routed, allowing the hosts in one VLAN to access 
resources in the other.

As Vinicius explained, this is not a best practice to say the least.

Searching through the mailing list archive, I found no answer to 
Vinicius' question.


Could a Red Hat storage and/or network expert enlighten us on these points?

Regards,

--
Nicolas Ecarnot

Le 21/07/2017 à 20:56, Vinícius Ferrão a écrit :


On 21 Jul 2017, at 15:12, Yaniv Kaul <yk...@redhat.com 
<mailto:yk...@redhat.com>> wrote:




On Wed, Jul 19, 2017 at 9:13 PM, Vinícius Ferrão <fer...@if.ufrj.br 
<mailto:fer...@if.ufrj.br>> wrote:


Hello,

I’ve skipped this message entirely yesterday. So this is per
design? Because the best practices of iSCSI MPIO, as far as I
know, recommends two completely separate paths. If this can’t be
achieved with oVirt what’s the point of running MPIO?


With regular storage it is quite easy to achieve using 'iSCSI bonding'.
I think the Dell storage is a bit different and requires some more 
investigation - or experience with it.

 Y.


Yaniv, thank you for answering this. I’m really hoping that a solution 
would be found.


Actually I’m not running anything from Dell. My storage system is 
FreeNAS, which is pretty standard, and as far as I know iSCSI best 
practices dictate segregated networks for proper operation.


All other major virtualization products support iSCSI this way: 
vSphere, XenServer and Hyper-V. So I was really surprised that oVirt 
(and even RHV, I requested a trial yesterday) does not implement iSCSI 
according to the well-known best practices.


Here is a picture of the architecture that I found on Google when 
searching for "mpio best practices": 
https://image.slidesharecdn.com/2010-12-06-midwest-reg-vmug-101206110506-phpapp01/95/nextgeneration-best-practices-for-vmware-and-storage-15-728.jpg?cb=1296301640


And as you can see, it shows segregated networks on a machine reaching 
the same target.


In my case, my datacenter has five Hypervisor Machines, with two NICs 
dedicated for iSCSI. Both NICs connect to different converged ethernet 
switches and the iStorage is connected the same way.


So it really does not make sense for the first NIC to be able to reach 
the second NIC's target. In case of a switch failure the cluster would 
go down anyway, so what’s the point of running MPIO? Right?
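
To make the segregated-path layout described above concrete, here is a minimal Python sketch. The NIC names, host addresses and portal IPs are hypothetical and not taken from this thread. It only illustrates that, with two non-routed subnets, each iSCSI NIC can reach exactly one target portal, which is what yields two independent paths.

```python
import ipaddress


def reachable_portals(host_nics, target_portals):
    """Map each host NIC to the portals that sit in its own subnet.

    Without routing between the iSCSI VLANs, a NIC can only reach
    portals that belong to the same IP network.
    """
    paths = {}
    for nic, cidr in host_nics.items():
        network = ipaddress.ip_interface(cidr).network
        paths[nic] = [
            portal for portal in target_portals
            if ipaddress.ip_address(portal) in network
        ]
    return paths


# Hypothetical addressing: two dedicated NICs, two segregated subnets.
HOST_NICS = {"iscsi0": "10.10.1.21/24", "iscsi1": "10.10.2.21/24"}
TARGET_PORTALS = ["10.10.1.1", "10.10.2.1"]

PATHS = reachable_portals(HOST_NICS, TARGET_PORTALS)
print(PATHS)
```

If both NICs listed both portals, the subnets would overlap or be routed, and the paths would no longer be independent.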


Thanks once again,
V.


Re: [ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials

2018-03-01 Thread Nicolas Ecarnot

On 01/03/2018 at 15:50, Nicolas Ecarnot wrote:

Couldn't the Redhat documentation mentioned above be more accurate?


Something like 'scl enable rh-postgrsql95' should help.


Not that much...

root@serv-mvm-prds01:/etc/ovirt-engine-setup.conf.d# cd /tmp
root@serv-mvm-prds01:/tmp# su - postgres
Last login: Thursday 1 March 2018 at 15:42:40 CET on pts/2
-bash-4.2$ scl enable rh-postgrsql95
Need at least 3 arguments.
Run scl --help to get help.


After reading and re-reading:

For the record, here are the steps that allowed me to add this user:

su - postgres

scl enable rh-postgresql95 'psql ovirt_engine_history'

CREATE ROLE cfme with LOGIN ENCRYPTED PASSWORD 'xxx';

SELECT 'GRANT SELECT ON ' || relname || ' TO cfme;' FROM pg_class JOIN 
pg_namespace ON pg_namespace.oid = pg_class.relnamespace WHERE nspname = 
'public' AND relkind IN ('r', 'v','S');


\q

exit
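
A note on the SELECT above: it only returns the GRANT statements as rows of text and does not run them. The Python sketch below mimics what the query produces, so the statements can be reviewed before being executed in psql. The relation names are hypothetical (the real ones come from pg_class), and the function name is mine.

```python
def grant_statements(relations, role="cfme"):
    """Build one 'GRANT SELECT' statement per relation, as the query does."""
    return ["GRANT SELECT ON {} TO {};".format(name, role) for name in relations]


# Hypothetical relation names, for illustration only.
EXAMPLE_RELATIONS = ["example_table", "example_view"]
STATEMENTS = grant_statements(EXAMPLE_RELATIONS)

for statement in STATEMENTS:
    print(statement)
```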



--
Nicolas ECARNOT


Re: [ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials

2018-03-01 Thread Nicolas Ecarnot

On 01/03/2018 at 15:00, Yaniv Kaul wrote:



On Thu, Mar 1, 2018 at 2:13 PM, Nicolas Ecarnot <nico...@ecarnot.net 
<mailto:nico...@ecarnot.net>> wrote:


Hello,

As for my 4 previous oVirt DCs, I'm trying to add them to ManageIQ
providers.

I tried to follow this guide :


https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.6/html-single/deployment_planning_guide/#data_collection_for_rhev_33_34

<https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.6/html-single/deployment_planning_guide/#data_collection_for_rhev_33_34>

But when trying to run psql, the shell tells me the command is not
found.




Hello Yaniv,

Thank you for answering.


Because you are probably on PG 9.5 SCL, I assume?


I had never heard about that before today.
I installed a bare-metal CentOS 7.4, on which I installed oVirt 4.2.
I saw no reference to SCL anywhere, neither during the setup nor in 
the oVirt install documentation.


How is an average user supposed to behave in such a situation?
(In my case, as usual, I read and read again)

Couldn't the Redhat documentation mentioned above be more accurate?


Something like 'scl enable rh-postgrsql95' should help.


Not that much...

root@serv-mvm-prds01:/etc/ovirt-engine-setup.conf.d# cd /tmp
root@serv-mvm-prds01:/tmp# su - postgres
Last login: Thursday 1 March 2018 at 15:42:40 CET on pts/2
-bash-4.2$ scl enable rh-postgrsql95
Need at least 3 arguments.
Run scl --help to get help.

--
Nicolas ECARNOT


[ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials

2018-03-01 Thread Nicolas Ecarnot

Hello,

As for my 4 previous oVirt DCs, I'm trying to add them to ManageIQ 
providers.


I tried to follow this guide :

https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.6/html-single/deployment_planning_guide/#data_collection_for_rhev_33_34

But when trying to run psql, the shell tells me the command is not found.

I made a very simple setup: when running engine-setup, I accepted the 
default answer to the question about DWH, so the DB is local.


When viewing (with pgAdmin) the roles of this new PostgreSQL DB, I see 
there is no 'cfme' user.
Do I have to re-run the setup and answer differently to ensure that 
other packages get installed and configured?


I saw 
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/data_warehouse_guide/#Overview_of_Configuring_Data_Warehouse 
telling me to re-run.


But I see that :
rpm -qa|grep -i dwh
ovirt-engine-dwh-4.2.1.2-1.el7.centos.noarch
ovirt-engine-dwh-setup-4.2.1.2-1.el7.centos.noarch

so I thought it was already enough... ?

--
Nicolas ECARNOT


Re: [ovirt-users] Hosts firewall custom setup

2018-02-27 Thread Nicolas Ecarnot

Hello,

For the record :
The workaround you suggest below is successful.

Thank you.

--
Nicolas Ecarnot

On 27/02/2018 at 14:15, Ondra Machacek wrote:



On 02/27/2018 11:29 AM, Nicolas Ecarnot wrote:

On 26/02/2018 at 15:00, Yedidyah Bar David wrote:

But how do we add custom rules in case of firewalld type?


Please see: https://ovirt.org/blog/2017/12/host-deploy-customization/

Hello Didi et al.,

- I followed the advice found in this blog page and created the exact 
same filename with the appropriate content.

- I've setup the cluster type to firewalld
- I restarted ovirt-engine
- I reinstalled a host

I see no usage of this Ansible yml file.
I see the creation of an ansible deploy log file for my host, and I 
see the usual firewall ports being opened, but I see nowhere any usage 
of the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file.

- I added the debug msg part in the ansible recipe, but to no avail.
- Huge grepping through the /var/log of the engine shows no calls of 
this script.


Thus, I see no effect on ports of the host's firewalld config.

What should I look at now?


It looks like you hit the following bug:

  https://bugzilla.redhat.com/show_bug.cgi?id=1549163

It will be fixed in 4.2.2 release.

I believe you can meanwhile remove line:

  - oVirt-metrics

from file:

/usr/share/ovirt-engine/playbooks/roles/ovirt-host-deploy/meta/main.yml



Thank you.




Re: [ovirt-users] Hosts firewall custom setup

2018-02-27 Thread Nicolas Ecarnot

On 26/02/2018 at 15:00, Yedidyah Bar David wrote:

But how do we add custom rules in case of firewalld type?


Please see: https://ovirt.org/blog/2017/12/host-deploy-customization/

Hello Didi et al.,

- I followed the advice found in this blog page and created the exact 
same filename with the appropriate content.

- I've setup the cluster type to firewalld
- I restarted ovirt-engine
- I reinstalled a host

I see no usage of this Ansible yml file.
I see the creation of an ansible deploy log file for my host, and I see 
the usual firewall ports being opened, but I see nowhere any usage of 
the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file.

- I added the debug msg part in the ansible recipe, but to no avail.
- Huge grepping through the /var/log of the engine shows no calls of 
this script.


Thus, I see no effect on ports of the host's firewalld config.

What should I look at now?

Thank you.

--
Nicolas ECARNOT


Re: [ovirt-users] Hosts firewall custom setup

2018-02-26 Thread Nicolas Ecarnot

On 26/02/2018 at 14:03, Yedidyah Bar David wrote:

On Mon, Feb 26, 2018 at 2:01 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote:

Hello,

On oVirt 4.2.1.7, I'm trying to set up custom iptables rules as I have been
doing for years with engine-config --set IPTablesConfigSiteCustom="blah blah
blah".

On my hosts, I can see that /etc/sysconfig/iptables does contain
the correct custom rules I added, but when manually checking with iptables
-L, I don't see my rules active.

On my hosts, I see that the iptables service is stopped and disabled, and
that the firewalld service is up and running.

That explains why iptables customization has no effect.


Indeed.

IIRC the type of firewall is now set per cluster or something like that, not
sure about the details - adding Ondra.


Per cluster, one can indeed choose the firewall type.
I suppose it translates on the hosts into the activation of the adequate 
service.

But how do we add custom rules in case of firewalld type?

On the hosts, I imagine that could translate into changes in :
/etc/firewalld/zones/public.xml

--
Nicolas ECARNOT


[ovirt-users] Hosts firewall custom setup

2018-02-26 Thread Nicolas Ecarnot

Hello,

On oVirt 4.2.1.7, I'm trying to set up custom iptables rules as I have 
been doing for years with engine-config --set 
IPTablesConfigSiteCustom="blah blah blah".


On my hosts, I can see that /etc/sysconfig/iptables does 
contain the correct custom rules I added, but when manually checking 
with iptables -L, I don't see my rules active.


On my hosts, I see that the iptables service is stopped and disabled, 
and that the firewalld service is up and running.


That explains why iptables customization has no effect.

In the engine setup, I see that 
/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf contains :

OVESETUP_CONFIG/firewallManager=none:None

I'm confused about this setting: when running engine-setup, I'm not 
sure whether answering yes to the question about the firewall 
will modify the engine, the hosts, or all of them.


Actually, I'd like my engine to stay with a disabled firewall, but my 
hosts with an active one.


Is it true to say that this is not an option and I have to answer yes, 
enable the firewall on the engine, allowing the 
OVESETUP_CONFIG/firewallManager option to be set up (to firewalld or 
iptables), thus allowing the spread of this setup towards the hosts?


Thank you.

--
Nicolas ECARNOT


Re: [ovirt-users] [Qemu-block] qcow2 images corruption

2018-02-14 Thread Nicolas Ecarnot



https://framadrop.org/r/Lvvr392QZo#/wOeYUUlHQAtkUw1E+x2YdqTqq21Pbic6OPBIH0TjZE=

On 14/02/2018 at 00:01, John Snow wrote:



On 02/13/2018 04:41 AM, Kevin Wolf wrote:

On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:

TL; DR : qcow2 images keep getting corrupted. Any workaround?


Not without knowing the cause.

The first thing to make sure is that the image isn't touched by a second
process while QEMU is running a VM. The classic one is using 'qemu-img
snapshot' on the image of a running VM, which is instant corruption (and
newer QEMU versions have locking in place to prevent this), but we have
seen more absurd cases of things outside QEMU tampering with the image
when we were investigating previous corruption reports.

This covers the majority of all reports, we haven't had a real
corruption caused by a QEMU bug in ages.


After having found (https://access.redhat.com/solutions/1173623) the right
logical volume hosting the qcow2 image, I can run qemu-img check on it.
- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using
"qemu-img check -r all"
- On 5% of them, I find Leaked clusters errors and further fatal errors,
which can not be corrected with qemu-img.
In rare cases, qemu-img can correct them, but destroys large parts of the
image (becomes unusable), and on other cases it can not correct them at all.


It would be good if you could make the 'qemu-img check' output available
somewhere.

It would be even better if we could have a look at the respective image.
I seem to remember that John (CCed) had a few scripts to analyse
corrupted qcow2 images, maybe we would be able to see something there.



Hi! I did write a pretty simplistic tool for trying to tell the shape of
a corruption at a glance. It seems to work pretty similarly to the other
tool you already found, but it won't hurt anything to run it:

https://github.com/jnsnow/qcheck

(Actually, that other tool looks like it has an awful lot of options.
I'll have to check it out.)

It can print a really upsetting amount of data (especially for very
corrupt images), but in the default case, the simple setting should do
the trick just fine.

You could always put the output from this tool in a pastebin too; it
might help me visualize the problem a bit more -- I find seeing the
exact offsets and locations of where all the various tables and things
to be pretty helpful.

You can also always use the "deluge" option and compress it if you want,
just don't let it print to your terminal:

jsnow@probe (dev) ~/s/qcheck> ./qcheck -xd
/home/bos/jsnow/src/qemu/bin/git/install_test_f26.qcow2 > deluge.log;
and ls -sh deluge.log
4.3M deluge.log

but it compresses down very well:

jsnow@probe (dev) ~/s/qcheck> 7z a -t7z -m0=ppmd deluge.ppmd.7z deluge.log
jsnow@probe (dev) ~/s/qcheck> ls -s deluge.ppmd.7z
316 deluge.ppmd.7z

So I suppose if you want to send along:
(1) The basic output without any flags, in a pastebin
(2) The zipped deluge output, just in case

and I will try my hand at guessing what went wrong.


(Also, maybe my tool will totally choke for your image, who knows. It
hasn't received an overwhelming amount of testing apart from when I go
to use it personally and inevitably wind up displeased with how it
handles certain situations, so ...)


What I read similar to my case is :
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the proxmox thread, they tend to say that using virtio-scsi is the
solution. I asked this question to oVirt experts
(https://lists.ovirt.org/pipermail/users/2018-February/086753.html) but it's
not clear the driver is to blame.


This seems very unlikely. The corruption you're seeing is in the qcow2
metadata, not only in the guest data. If anything, virtio-scsi exercises
more qcow2 code paths than virtio-blk, so any potential bug that affects
virtio-blk should also affect virtio-scsi, but not the other way around.


I agree with the answer Yaniv Kaul gave to me, saying I have to properly
report the issue, so I'm longing to know which peculiar information I can
give you now.


To be honest, debugging corruption after the fact is pretty hard. We'd
need the 'qemu-img check' output and ideally the image to do anything,
but I can't promise that anything would come out of this.

Best would be a reproducer, or at least some operation that you can link
to the appearance of the corruption. Then we could take a more targeted
look at the respective code.


As you can imagine, all this setup is in production, and for most of the
VMs, I can not "play" with them. Moreover, we launched a campaign of nightly
stopping every VM, qemu-img check them one by one, then boot.
So it might take some time before I find another corrupted image.
(which I'll preciously store for debug)

Other informations : We very rarely do snapshots, but I'm close to imagine
that automated migrations of VMs could trigger similar behaviors on qcow2

Re: [ovirt-users] [Qemu-block] qcow2 images corruption

2018-02-13 Thread Nicolas Ecarnot

On 13/02/2018 at 16:26, Nicolas Ecarnot wrote:
>> It would be good if you could make the 'qemu-img check' output available
>> somewhere.
>

I found this :
https://github.com/ShijunDeng/qcow2-dump

and the transcript (beautiful colors when viewed with "more") is attached :


--
Nicolas ECARNOT
Script started on Tue 13 Feb 2018 17:31:05 CET
root@serv-hv-adm13:/home# /root/qcow2-dump -m check serv-term-adm4-corr.qcow2.img

File: serv-term-adm4-corr.qcow2.img


magic: 0x514649fb
version: 2
backing_file_offset: 0x0
backing_file_size: 0
fs_type: xfs
virtual_size: 64424509440 / 61440M / 60G
disk_size: 36507222016 / 34816M / 34G
seek_end: 36507222016 [0x88000] / 34816M / 34G
cluster_bits: 16
cluster_size: 65536
crypt_method: 0
csize_shift: 54
csize_mask: 255
cluster_offset_mask: 0x3f
l1_table_offset: 0x76a46
l1_size: 120
l1_vm_state_index: 120
l2_size: 8192
refcount_order: 4
refcount_bits: 16
refcount_block_bits: 15
refcount_block_size: 32768
refcount_table_offset: 0x1
refcount_table_clusters: 1
snapshots_offset: 0x0
nb_snapshots: 0
incompatible_features: 
compatible_features: 
autoclear_features: 



Active Snapshot:

L1 Table:   [offset: 0x76a46, len: 120]

Result:
L1 Table:   unaligned: 0, invalid: 0, unused: 53, used: 67
L2 Table:   unaligned: 0, invalid: 0, unused: 20304, used: 528560



Refcount Table:

Refcount Table: [offset: 0x1, len: 8192]

Result:
Refcount Table: unaligned: 0, invalid: 0, unused: 8175, used: 17
Refcount:   error: 4342, leak: 0, unused: 28426, used: 524288



COPIED OFLAG:


Result:
L1 Table ERROR OFLAG_COPIED: 1
L2 Table ERROR OFLAG_COPIED: 4323
Active L2 COPIED: 528560 [34639708160 / 33035M / 32G]



Active Cluster:


Result:
Active Cluster: reuse: 17



Summary:
preallocation:  off
Active Cluster: reuse: 17
Refcount Table: unaligned: 0, invalid: 0, unused: 8175, used: 17
Refcount:   error: 4342, leak: 0, rebuild: 4325, unused: 28426, used: 524288
L1 Table:   unaligned: 0, invalid: 0, unused: 53, used: 67
oflag copied: 1
L2 Table:   unaligned: 0, invalid: 0, unused: 20304, used: 528560
oflag copied: 4323


### qcow2 image has refcount errors!   (=_=#)###
###and qcow2 image has copied errors!  (o_0)?###
###  Sadly: refcount error cause active cluster reused! Orz  ###
### Please backup this image and contact the author! ###



root@serv-hv-adm13:/home# exit

Script done on Tue 13 Feb 2018 17:31:13 CET
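
As a sanity check on the header values in the transcript above, the derived sizes can be recomputed from cluster_bits, refcount_order, virtual_size and refcount_table_clusters. This is a small sketch based on the qcow2 on-disk format (8-byte L1, L2 and refcount-table entries); the function name is mine and is not part of qcow2-dump.

```python
def qcow2_geometry(cluster_bits, refcount_order, virtual_size,
                   refcount_table_clusters):
    """Recompute derived qcow2 sizes from the raw header fields."""
    cluster_size = 1 << cluster_bits
    # Each L2 entry is 8 bytes and an L2 table fills one cluster.
    l2_entries = cluster_size // 8
    refcount_bits = 1 << refcount_order
    # A refcount block is one cluster of refcount_bits-wide counters.
    refcount_block_entries = cluster_size * 8 // refcount_bits
    # One L2 table maps l2_entries clusters of guest data.
    bytes_per_l2_table = l2_entries * cluster_size
    # Ceiling division: number of L1 entries needed for the virtual size.
    l1_entries = -(-virtual_size // bytes_per_l2_table)
    # Refcount table entries are 8 bytes each.
    refcount_table_entries = refcount_table_clusters * cluster_size // 8
    return {
        "cluster_size": cluster_size,
        "l2_entries": l2_entries,
        "refcount_bits": refcount_bits,
        "refcount_block_entries": refcount_block_entries,
        "l1_entries": l1_entries,
        "refcount_table_entries": refcount_table_entries,
    }


# Values copied from the transcript above.
GEOMETRY = qcow2_geometry(
    cluster_bits=16,
    refcount_order=4,
    virtual_size=64424509440,
    refcount_table_clusters=1,
)
print(GEOMETRY)
```

The results agree with the dump (cluster_size 65536, l2_size 8192, refcount_block_size 32768, l1_size 120, refcount table length 8192), which suggests the header itself is coherent and the damage is in the refcount and OFLAG_COPIED data.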


Re: [ovirt-users] [Qemu-block] qcow2 images corruption

2018-02-13 Thread Nicolas Ecarnot

Hello Kevin,

On 13/02/2018 at 10:41, Kevin Wolf wrote:

On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:

TL; DR : qcow2 images keep getting corrupted. Any workaround?


Not without knowing the cause.


Actually, my main concern is mostly about finding the cause rather than 
correcting my corrupted VMs.


Another way to say it : I prefer to help oVirt than help myself.


The first thing to make sure is that the image isn't touched by a second
process while QEMU is running a VM.


Indeed, I read some BZ about this issue : they were raised by a user who 
ran some qemu-img commands on a "mounted" image, thus leading to some 
corruption.
In my case, I'm not playing with this, and the corrupted VMs were only 
touched by classical oVirt actions.



The classic one is using 'qemu-img
snapshot' on the image of a running VM, which is instant corruption (and
newer QEMU versions have locking in place to prevent this), but we have
seen more absurd cases of things outside QEMU tampering with the image
when we were investigating previous corruption reports.

This covers the majority of all reports, we haven't had a real
corruption caused by a QEMU bug in ages.


May I ask in which QEMU version this kind of locking was added?
As I wrote, our oVirt setup is 3.6, so not recent.




After having found (https://access.redhat.com/solutions/1173623) the right
logical volume hosting the qcow2 image, I can run qemu-img check on it.
- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using
"qemu-img check -r all"
- On 5% of them, I find Leaked clusters errors and further fatal errors,
which can not be corrected with qemu-img.
In rare cases, qemu-img can correct them, but destroys large parts of the
image (becomes unusable), and on other cases it can not correct them at all.


It would be good if you could make the 'qemu-img check' output available
somewhere.


See attachment.



It would be even better if we could have a look at the respective image.
I seem to remember that John (CCed) had a few scripts to analyse
corrupted qcow2 images, maybe we would be able to see something there.


I just exported it like this :
qemu-img convert /dev/the_correct_path /home/blablah.qcow2.img

The resulting file is 32G and I need an idea to transfer this img to you.




What I read similar to my case is :
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the proxmox thread, they tend to say that using virtio-scsi is the
solution. I asked this question to oVirt experts
(https://lists.ovirt.org/pipermail/users/2018-February/086753.html) but it's
not clear the driver is to blame.


This seems very unlikely. The corruption you're seeing is in the qcow2
metadata, not only in the guest data.


Are you saying:
- the corruption is in the metadata and in the guest data
OR
- the corruption is only in the metadata
?


If anything, virtio-scsi exercises
more qcow2 code paths than virtio-blk, so any potential bug that affects
virtio-blk should also affect virtio-scsi, but not the other way around.


I get that.




I agree with the answer Yaniv Kaul gave to me, saying I have to properly
report the issue, so I'm longing to know which peculiar information I can
give you now.


To be honest, debugging corruption after the fact is pretty hard. We'd
need the 'qemu-img check' output


Done.


and ideally the image to do anything,


I remember some Red Hat people once gave me temporary access to upload 
large files to a dedicated server. Is that still possible?



but I can't promise that anything would come out of this.

Best would be a reproducer, or at least some operation that you can link
to the appearance of the corruption. Then we could take a more targeted
look at the respective code.


Sure.
Alas, I find no obvious pattern leading to corruption:
On the guest side, it appeared with Windows 2003, 2008, 2012, and Linux 
CentOS 6 and 7. It appeared with virtio-blk; I changed some VMs to 
use virtio-scsi, but it's too soon to tell whether corruption appears in 
that case.
As I said, I'm using snapshots VERY rarely, and our versions are too old, 
so we only take them the cold way (VM shut down). So very safely.
The "weirdest" thing we do is to migrate VMs: you see how conservative 
we are!



As you can imagine, all this setup is in production, and for most of the
VMs, I can not "play" with them. Moreover, we launched a campaign of nightly
stopping every VM, qemu-img check them one by one, then boot.
So it might take some time before I find another corrupted image.
(which I'll preciously store for debug)

Other informations : We very rarely do snapshots, but I'm close to imagine
that automated migrations of VMs could trigger similar behaviors on qcow2
images.


To my knowledge, oVirt only uses external snapshots and creates them
with QMP. This should be perfectly safe because from the perspective of
the qcow2 image being snapshotted,

Re: [ovirt-users] qcow2 images corruption

2018-02-08 Thread Nicolas Ecarnot

On 08/02/2018 at 13:59, Yaniv Kaul wrote:



On Feb 7, 2018 7:08 PM, "Nicolas Ecarnot" <nico...@ecarnot.net 
<mailto:nico...@ecarnot.net>> wrote:


Hello,

TL; DR : qcow2 images keep getting corrupted. Any workaround?

Long version:
I already started this discussion on the oVirt and
qemu-block mailing lists under similar circumstances, but I have
learned more over the past months, and here is some information:

- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using
CentOS 7.{2,3} hosts
- Hosts :
  - CentOS 7.2 1511 :
    - Kernel = 3.10.0 327
    - KVM : 2.3.0-31
    - libvirt : 1.2.17
    - vdsm : 4.17.32-1
  - CentOS 7.3 1611 :
    - Kernel 3.10.0 514
    - KVM : 2.3.0-31
    - libvirt 2.0.0-10
    - vdsm : 4.17.32-1


All are somewhat old releases. I suggest upgrading to the latest RHEL 
and qemu-kvm bits.


Later on, upgrade oVirt.
Y.

Hello Yaniv,

We could discuss for hours the fact that CentOS 7.3 was released 
in January 2017, and thus is not that old.
We could also discuss for hours the gap between developers' wish 
to push their freshest releases and the brake we - industry users - put 
on adopting such new versions. In my case, the virtualization 
infrastructure is just one of the 30+ domains I have to master every day, 
and the more stable the better.
In the setup described previously, the qemu qcow2 images were correct, 
then not. We did not change anything. We have to find a workaround and 
we need your expertise.


Not understanding the cause of the corruption leaves us exposed to the 
same situation in oVirt 4.2.


--
Nicolas Ecarnot


[ovirt-users] qcow2 images corruption

2018-02-07 Thread Nicolas Ecarnot

Hello,

TL; DR : qcow2 images keep getting corrupted. Any workaround?

Long version:
I already started this discussion on the oVirt and qemu-block mailing 
lists under similar circumstances, but I have learned more over the 
past months, and here is some information:


- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using CentOS 
7.{2,3} hosts

- Hosts :
  - CentOS 7.2 1511 :
    - Kernel = 3.10.0 327
    - KVM : 2.3.0-31
    - libvirt : 1.2.17
    - vdsm : 4.17.32-1
  - CentOS 7.3 1611 :
    - Kernel 3.10.0 514
    - KVM : 2.3.0-31
    - libvirt 2.0.0-10
    - vdsm : 4.17.32-1
- Our storage is 2 Equallogic SANs connected via iSCSI on a dedicated 
network
- It varies from week to week, but overall there are around 32 hosts, 
8 storage domains and, for various reasons, very few VMs (fewer than 200).
- One peculiar point is that most of our VMs are provided an additional 
dedicated network interface that is iSCSI-connected to some volumes of 
our SAN - these volumes not being part of the oVirt setup. That could 
lead to a lot of additional iSCSI traffic.


From time to time, a random VM appears paused by oVirt.
Digging into the oVirt engine logs, then into the host vdsm logs, it 
appears that the host considers the qcow2 image as corrupted.
In what I consider conservative behavior, vdsm stops any 
interaction with this image and marks it as paused.

Any attempt to unpause it leads to the same conservative pause.

After having found (https://access.redhat.com/solutions/1173623) the 
right logical volume hosting the qcow2 image, I can run qemu-img check 
on it.

- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using 
"qemu-img check -r all"
- On 5% of them, I find Leaked clusters errors and further fatal errors, 
which can not be corrected with qemu-img.
In rare cases, qemu-img can correct them, but destroys large parts of 
the image (becomes unusable), and on other cases it can not correct them 
at all.
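
The three outcomes listed above map onto the documented exit codes of `qemu-img check`, so a nightly check can sort images without parsing the text output. The sketch below is one possible way to do that, not the procedure actually used here; the category labels are mine, and the image must not be in use by a running VM when it is checked.

```python
import subprocess

# Exit codes documented in the qemu-img man page for the 'check' command.
QEMU_IMG_CHECK_STATUS = {
    0: "consistent",         # check completed, no errors found
    1: "check-failed",       # check not completed (internal errors)
    2: "corrupted",          # check completed, image is corrupted
    3: "leaks-only",         # leaked clusters, but no corruption
    63: "check-unsupported", # format does not support checks
}


def classify_check(returncode):
    """Translate a 'qemu-img check' exit code into a short label."""
    return QEMU_IMG_CHECK_STATUS.get(returncode, "unknown")


def check_image(path):
    """Run a read-only 'qemu-img check' on path and classify the result."""
    result = subprocess.run(
        ["qemu-img", "check", path],
        capture_output=True,
        text=True,
    )
    return classify_check(result.returncode), result.stdout
```

Images reported as "leaks-only" are the ones that "qemu-img check -r all" can normally repair; images reported as "corrupted" are worth copying aside before any repair attempt.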


Months ago, I already sent a similar message but the error message was 
about No space left on device 
(https://www.mail-archive.com/qemu-block@gnu.org/msg00110.html).


This time, I don't have this message about space, but only corruption.

I kept reading and found a similar discussion in the Proxmox group :
https://lists.ovirt.org/pipermail/users/2018-February/086750.html

https://forum.proxmox.com/threads/qcow2-corruption-after-snapshot-or-heavy-disk-i-o.32865/page-2

What I read similar to my case is :
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the proxmox thread, they tend to say that using virtio-scsi is the 
solution. I asked this question to oVirt experts 
(https://lists.ovirt.org/pipermail/users/2018-February/086753.html) but 
it's not clear the driver is to blame.


I agree with the answer Yaniv Kaul gave me, saying I have to properly 
report the issue, so I'm eager to know which specific information I 
can give you now.


As you can imagine, all this setup is in production, and for most of the 
VMs, I can not "play" with them. Moreover, we launched a campaign of 
nightly stopping every VM, qemu-img check them one by one, then boot.

So it might take some time before I find another corrupted image.
(which I'll preciously store for debug)

Other information: we very rarely take snapshots, but I'm inclined to 
think that automated migrations of VMs could trigger similar behaviors 
on qcow2 images.


Last point about the versions we use : yes that's old, yes we're 
planning to upgrade, but we don't know when.


Regards,

--
Nicolas ECARNOT


[ovirt-users] qcow2 images corruption

2018-02-07 Thread Nicolas Ecarnot

Hello,

TL; DR : qcow2 images keep getting corrupted. Any workaround?

Long version:
This discussion has already been launched by me on the oVirt and on 
qemu-block mailing list, under similar circumstances but I learned 
further things since months and here are some informations :


- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using CentOS 
7.{2,3} hosts

- Hosts :
  - CentOS 7.2 1511 :
- Kernel = 3.10.0 327
- KVM : 2.3.0-31
- libvirt : 1.2.17
- vdsm : 4.17.32-1
  - CentOS 7.3 1611 :
- Kernel 3.10.0 514
- KVM : 2.3.0-31
- libvirt 2.0.0-10
- vdsm : 4.17.32-1
- Our storage is 2 Equallogic SANs connected via iSCSI on a dedicated 
network
- Depends on weeks, but all in all, there are around 32 hosts, 8 storage 
domains and for various reasons, very few VMs (less than 200).
- One peculiar point is that most of our VMs are provided an additional 
dedicated network interface that is iSCSI-connected to some volumes of 
our SAN - these volumes not being part of the oVirt setup. That could 
lead to a lot of additional iSCSI traffic.


From times to times, a random VM appears paused by oVirt.
Digging into the oVirt engine logs, then into the host vdsm logs, it 
appears that the host considers the qcow2 image as corrupted.
Along what I consider as a conservative behavior, vdsm stops any 
interaction with this image and marks it as paused.

Any try to unpause it leads to the same conservative pause.

After having found (https://access.redhat.com/solutions/1173623) the 
right logical volume hosting the qcow2 image, I can run qemu-img check 
on it.

- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using 
"qemu-img check -r all"
- On 5% of them, I find Leaked clusters errors and further fatal errors, 
which can not be corrected with qemu-img.
In rare cases, qemu-img can correct them, but destroys large parts of 
the image (becomes unusable), and on other cases it can not correct them 
at all.


Months ago, I already sent a similar message but the error message was 
about No space left on device 
(https://www.mail-archive.com/qemu-block@gnu.org/msg00110.html).


This time, I don't have this message about space, but only corruption.

I kept reading and found a similar discussion in the Proxmox group :
https://lists.ovirt.org/pipermail/users/2018-February/086750.html

https://forum.proxmox.com/threads/qcow2-corruption-after-snapshot-or-heavy-disk-i-o.32865/page-2

What I read similar to my case is :
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the Proxmox thread, they tend to say that using virtio-scsi is the
solution. I asked this question to oVirt experts
(https://lists.ovirt.org/pipermail/users/2018-February/086753.html), but
it's not clear the driver is to blame.


I agree with the answer Yaniv Kaul gave me, saying I have to properly
report the issue, so I'd like to know which specific information I
can give you now.


As you can imagine, all this setup is in production, and for most of the
VMs, I cannot "play" with them. Moreover, we launched a campaign of
nightly maintenance : stop every VM, run qemu-img check on them one by
one, then boot them again.

So it might take some time before I find another corrupted image
(which I'll carefully store for debugging).

Other information : we very rarely take snapshots, but I suspect that
automated migrations of VMs could trigger similar behavior on qcow2
images.


Last point about the versions we use : yes that's old, yes we're 
planning to upgrade, but we don't know when.


Regards,

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] qemu-kvm images corruption

2018-02-06 Thread Nicolas Ecarnot

Hello,

On our two 3.6 DCs, we're still facing qcow2 corruptions, even on 
freshly installed VMs (CentOS7, win2012, win2008...).


(We are still hoping to find some time to migrate all this to 4.2, but 
it's a big work and our one-person team - me - is overwhelmed.)


My workaround is described in my previous thread below, but it's just a 
workaround.


Reading further, I found that :

https://forum.proxmox.com/threads/qcow2-corruption-after-snapshot-or-heavy-disk-i-o.32865/page-2

There are many things I don't know or understand, and I'd like your 
opinion :


- Is "virtio" a synonym of "virtio-blk"?
- Is it true that the development of virtio-scsi is active and the one
of virtio is stopped?
- People in the Proxmox forum seem to say that no qcow2 corruption
occurs when using IDE (not an option for me) or virtio-scsi. Has
anyone at Red Hat ever heard of this?
- Is converting all my VMs to use virtio-scsi a guarantee against
further corruptions?
- What is the unofficial but nonetheless recommended driver, according
to oVirt devs, in terms of future development and stability?


Regards,

--
Nicolas ECARNOT

Le 15/09/2017 à 14:06, Nicolas Ecarnot a écrit :

TL;DR:
How to avoid images corruption?


Hello,

On two of our old 3.6 DCs, a recent series of VM migrations led to some
issues :

- I'm putting a host into maintenance mode
- most of the VM are migrating nicely
- one remaining VM never migrates, and the logs are showing :

* engine.log : "...VM has been paused due to I/O error..."
* vdsm.log : "...Improbable extension request for volume..."

After digging amongst the RH BZ tickets, I saved the day by :
- stopping the VM
- lvchange -ay the adequate /dev/...
- qemu-img check [-r all] /rhev/blahblah
- lvchange -an...
- boot the VM
- enjoy!

Yesterday this worked for a VM where only one error occurred on the qemu 
image, and the repair was easily done by qemu-img.


Today, facing the same issue on another VM, it failed because the errors 
were very numerous, and also because of this message :


[...]
Rebuilding refcount structure
ERROR writing refblock: No space left on device
qemu-img: Check failed: No space left on device
[...]

The PV/VG/LV are far from being full, so I guess I don't know where to look.
I tried many ways to solve it but I'm not comfortable at all with qemu
images, corruption and repair, so I ended up exporting this VM (to an
NFS export domain) and importing it into another DC : this had the side
effect of running qemu-img convert from qcow2 to qcow2, and (maybe?) of
solving some errors???
I also copied it into another qcow2 file the same way, with qemu-img
convert, and this too led to a clean qcow2 image without errors.


I saw that on 4.x some bugs are fixed about VM migrations, but this is 
not the point here.
I checked my SANs, my network layers, my blades, the OS (CentOS 7.2) of 
my hosts, but I see nothing special.


The real reason behind my message is not to know how to repair anything,
but rather to understand what could have led to this situation.

Where to keep a keen eye?




--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] critical production issue for a vm

2017-12-06 Thread Nicolas Ecarnot

Le 06/12/2017 à 11:21, Nathanaël Blanchet a écrit :

Hi all,

I'm about to lose one very important vm. I shut down this vm for
maintenance and then I moved the four disks to a newly created lun. This
vm has 2 snapshots.


After successful move, the vm refuses to start with this message:

Bad volume specification {u'index': 0, u'domainID': 
u'961ea94a-aced-4dd0-a9f0-266ce1810177', 'reqsize': '0', u'format': 
u'cow', u'bootOrder': u'1', u'discard': False, u'volumeID': 
u'a0b6d5cb-db1e-4c25-aaaf-1bbee142c60b', 'apparentsize': '2147483648', 
u'imageID': u'4a95614e-bf1d-407c-aa72-2df414abcb7a', u'specParams': {}, 
u'readonly': u'false', u'iface': u'virtio', u'optional': u'false', 
u'deviceId': u'4a95614e-bf1d-407c-aa72-2df414abcb7a', 'truesize': 
'2147483648', u'poolID': u'48ca3019-9dbf-4ef3-98e9-08105d396350', 
u'device': u'disk', u'shared': u'false', u'propagateErrors': u'off', 
u'type': u'disk'}.


I tried to merge the snapshots, export, clone from snapshot, copy disks,
or deactivate disks, and every action fails as soon as it involves a disk.


I began to dd the LVs to get a new vm intended for a standalone
libvirt/kvm; the vm quite boots up but it is an outdated version from
before the first snapshot. There are a lot of disks when doing a "lvs | grep
961ea94a", supposed to be disk snapshots. Which of them must I choose to
get the last state of the vm before shutdown? I'm not used to dealing with
snapshots in virsh/libvirt, so some help will be much appreciated.


Is there some unknown command to recover this vm into ovirt?

Thank you in advance.



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



Besides specific oVirt answers, did you try to get information about the
snapshot tree with qemu-img info --backing-chain on the adequate
/dev/... logical volume?
As you know how to dd from LVs, you could extract every needed snapshot
file and rebuild your VM outside of oVirt.

Then take time to re-import it later and safely.
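
To make the snapshot tree easier to read, the output of qemu-img info --backing-chain --output=json can be parsed. Here is a small Python sketch, assuming the JSON is a list of images starting from the one given on the command line, each with "filename", "format" and an optional "backing-filename" key; the sample data and the /dev/vg/... names are invented for illustration.

```python
import json


def describe_chain(info_json):
    """Return one line per image, top layer first, from the JSON printed by
    `qemu-img info --backing-chain --output=json`."""
    lines = []
    for depth, image in enumerate(json.loads(info_json)):
        backing = image.get("backing-filename", "(none, base image)")
        lines.append("%d: %s [%s] -> %s" % (
            depth, image["filename"], image["format"], backing))
    return lines


# Hypothetical sample shaped like qemu-img's JSON output.
sample = json.dumps([
    {"filename": "/dev/vg/top", "format": "qcow2",
     "backing-filename": "base"},
    {"filename": "/dev/vg/base", "format": "qcow2"},
])
for line in describe_chain(sample):
    print(line)
```

On block storage domains, every LV in the chain presumably has to be activated before qemu-img can follow the backing files, so the topmost volume alone may not be enough.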

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] iSCSI multipathing missing tab

2017-11-22 Thread Nicolas Ecarnot

Le 21/11/2017 à 15:21, Nicolas Ecarnot a écrit :

Hello,

oVirt 4.1.6.2-1.el7.centos

Under the datacenter section, I see no iSCSI multipathing tab.
As I'm building this new DC, could this be because this DC is not yet 
initialized?




Self-replying (sorry, once again), for the record :

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/administration_guide/#Configuring_iSCSI_Multipathing


Prerequisites

Ensure you have created an iSCSI storage domain and discovered and logged into all the paths to the iSCSI target(s). 


As usual : Me, Read The Fine Manual...

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] iSCSI multipathing missing tab

2017-11-21 Thread Nicolas Ecarnot

Hello,

oVirt 4.1.6.2-1.el7.centos

Under the datacenter section, I see no iSCSI multipathing tab.
As I'm building this new DC, could this be because this DC is not yet 
initialized?


--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Cannot remove snapshot

2017-11-21 Thread Nicolas Ecarnot

Le 17/11/2017 à 16:38, Nicolas Ecarnot a écrit :
- export the VM then re-import, if this is related to some LV space 
missing. Then removing the snapshot the usual way.


Self-replying, for the record:

The backing image was seen as full of errors by qemu-img check.
I exported the whole backing + img without committing the snapshot.
I then imported with committing, and it all went well.

4 hours of doubt.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Cannot remove snapshot

2017-11-17 Thread Nicolas Ecarnot

Hello,

oVirt 3.6.7.5-1
I'm trying to remove a snapshot in a cold way (VM shut down).
It is failing, and VDSM reports :

4f1588f3-ae2d-4702-b7e1-4ef53b5b5a1d::DEBUG::2017-11-17 13:04:11,448::lvm::290::Storage.Misc.excCmd::(cmd) SUCCESS:  = ' WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!\n';  = 0
4f1588f3-ae2d-4702-b7e1-4ef53b5b5a1d::DEBUG::2017-11-17 13:04:11,456::lvm::462::Storage.LVM::(_reloadlvs) lvs reloaded
4f1588f3-ae2d-4702-b7e1-4ef53b5b5a1d::DEBUG::2017-11-17 13:04:11,456::lvm::462::Storage.OperationMutex::(_reloadlvs) Operation 'lvm reload operation' released the operation mutex
4f1588f3-ae2d-4702-b7e1-4ef53b5b5a1d::ERROR::2017-11-17 13:04:11,457::image::1302::Storage.Image::(merge) Unexpected error

Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 1293, in merge
    sdDom, srcVolParams, volParams, reqSize, chain)
  File "/usr/share/vdsm/storage/image.py", line 1039, in _baseCowVolumeMerge
    unsafe=False, rollback=True)
  File "/usr/share/vdsm/storage/volume.py", line 278, in rebase
    raise se.MergeSnapshotsError(self.volUUID)
MergeSnapshotsError: Error merging snapshots: ('4a8c17aa-5882-45a1-8a6e-40db39ed06ca',)


I read this : https://bugzilla.redhat.com/show_bug.cgi?id=1069610 , 
hoping I could find some workaround. But I couldn't.


If there is no obvious workaround, would there be other ways like :
- find the bare logical volume, shut down the VM, and play with low
level qemu-img commands. I think I know how to do that, but I'm worried
the oVirt database won't be in sync once I've removed the snapshot


or

- export the VM then re-import, if this is related to some LV space 
missing. Then removing the snapshot the usual way.


Any advice (apart from the obvious upgrade-to-4.X-sir)?

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] LVM structure

2017-10-05 Thread Nicolas Ecarnot

Hi Adam,


Le 04/10/2017 à 16:48, Adam Litke a écrit :
Sure.  vdsm-tool should be disabling lvmetad on the host automatically.  
Maybe some of the hosts were fresh installed and others have been 
upgraded from older versions?  In any case, you should be able to run on 
any host in maintenance mode:


     sudo vdsm-tool configure --force

And this should edit the lvm.conf file to disable lvmetad globally and 
also prevent the lvmetad service from starting.


Sorry, but nope.



# vdsm-tool configure --force

Checking configuration status...

Current revision of multipath.conf detected, preserving
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts

Running configure...
Reconfiguration of sebool is done.
Reconfiguration of libvirt is done.

Done configuring modules to VDSM.

# grep use_lvmetad /etc/lvm/lvm.conf |grep -v '#'
use_lvmetad = 1
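
The grep above can be turned into a small check that is easier to run across many hosts. This is a minimal Python sketch, assuming only the plain "use_lvmetad = N" syntax seen in this lvm.conf; it ignores commented lines and does not look at other override files or at --config given on a command line.

```python
import re


def effective_use_lvmetad(lvm_conf_text):
    """Return the last uncommented use_lvmetad value as an int, or None."""
    value = None
    for line in lvm_conf_text.splitlines():
        code = line.split("#", 1)[0]  # drop anything after a comment mark
        match = re.match(r"\s*use_lvmetad\s*=\s*(\d+)", code)
        if match:
            value = int(match.group(1))
    return value
```

For example, effective_use_lvmetad(open("/etc/lvm/lvm.conf").read()) would return 1 on the host shown above.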


Actually, as you found a workaround, it's not a big deal, especially if
this point has been fixed in versions later than 3.6.7.


It's just to let people know.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] LVM structure

2017-10-04 Thread Nicolas Ecarnot

Le 04/10/2017 à 15:30, Adam Litke a écrit :


On Wed, Oct 4, 2017 at 4:12 AM Nicolas Ecarnot <nico...@ecarnot.net 
<mailto:nico...@ecarnot.net>> wrote:


Adam,

TL;DR : You nailed it!


Great!  Glad you're back up and running. One additional note about LVM 
commands. It's dangerous to use lvmetad for some commands while vdsm is 
running since it will not use lvmetad. You could end up with conflicting 
operations. In general it's safest to not issue any lvm commands while 
the host is activated but if you must, don't forget to disable lvmetad 
for all commands.


OK.

Is it worth trying to understand why amongst our 32 hosts in 2 DC, all 
in the same version (OS, vdsm, qemu packages...) some are showing 
they're using lvmetad and some not?


--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt blog screenshots: paging jmarks

2017-10-04 Thread Nicolas Ecarnot

Le 04/10/2017 à 09:39, ov...@fateknollogee.com a écrit :

https://www.ovirt.org/blog/2017/09/introducing-ovirt-4.2.0/

https://www.ovirt.org/blog/2017/10/introducing-high-performance-vms/

@jmarks : can you include the full resolution of your screenshots from 
those 2 blog posts?


Those screenshots are hard to see any detail
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Hello,

I'm using Firefox, and on every picture, I click right mouse button > 
view image

and it shows in really decent resolution.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] LVM structure

2017-10-04 Thread Nicolas Ecarnot

Adam,

TL;DR : You nailed it!

Le 03/10/2017 à 18:12, Adam Litke a écrit :
Does this report an error on the host where you are having problems 
activating logical volumes?


     lvs -a -o +devices


On the hosts where I can't activate a LV, this command returns nothing 
interesting :


root@serv-hv-prd03:~# lvs -a -o +devices
  LV   VG Attr   LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  home cl -wi-ao 56,25g                                                     /dev/sda2(1024)
  root cl -wi-ao 50,00g                                                     /dev/sda2(15423)
  swap cl -wi-ao  4,00g                                                     /dev/sda2(0)


and so goes for pvs and vgs.



Also, do the lvm commands succeed when you explicitly disable lvmetad, ie...

     lvchange --config 'global {use_lvmetad=0}' -ay ...


Disabling lvmetad usage allows the activation to succeed.
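
Since every manual LVM command on a host should bypass lvmetad, a tiny wrapper avoids forgetting the option. A minimal Python sketch follows; the helper name lvm_command is made up, and only the --config string comes from the command quoted above.

```python
NO_LVMETAD = ["--config", "global {use_lvmetad=0}"]


def lvm_command(tool, *args):
    """Build an argv list for an LVM tool that bypasses lvmetad."""
    return [tool] + NO_LVMETAD + list(args)
```

The resulting list can be handed to subprocess.check_call(), for instance lvm_command("lvchange", "-ay", "vg/lv").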

Having understood that, I tried to run some usual LVM commands like pvs,
vgs, lvs, pvscan, vgscan, lvscan, lvmdiskscan, and they all returned
some quite empty answers (to be short : only the local LV).


Having understood the role of lvmetad, I ran pvscan --cache, and all of a
sudden it filled up the LVM information : I found all my oVirt LVM
storage domains again, as I could see on other hosts.


Things to note :
- trying to run a VM on empty LVM cache was nonetheless successful
- before filling the lvmetad cache, I checked this daemon was running 
and it was.




--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] LVM structure

2017-09-20 Thread Nicolas Ecarnot

Hello,

I'm still coping with my qemu image corruption, and I'm following some
Red Hat guidelines that explain the way to go :

- Start the VM
- Identify the host
- On this host, run the ps command to identify the disk image location :

# ps ax|grep qemu-kvm|grep vm_name

- Look for "-drive 
file=/rhev/data-center/0001-0001-0001-0001-033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75"

(YMMV)

- Resolve this symbolic link
# ls -la 
/rhev/data-center/0001-0001-0001-0001-033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75
lrwxrwxrwx 1 vdsm kvm 78  3 oct.   2016 
/rhev/data-center/0001-0001-0001-0001-033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75 
-> 
/dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75


- Shutdown the VM
- On the SPM, activate the logical volume :
# lvchange -ay 
/dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75


- Verify the state of the qemu image :
# qemu-img check 
/dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75


- If needed, attempt a repair :
# qemu-img check -r all /dev/...

- In any case, deactivate the LV :
# lvchange -an /dev/...
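
The activate / check / deactivate part of the steps above can be sketched in Python so that the LV is deactivated in any case, even if qemu-img fails. This is only a sketch under assumptions : the VM is already shut down, the function name is made up, and the run parameter is a hypothetical hook that makes dry runs possible.

```python
import subprocess


def check_image(lv_path, repair=False, run=subprocess.call):
    """Activate the LV, run qemu-img check, and always deactivate the LV.

    Returns the qemu-img exit status. The VM must already be shut down.
    """
    check = ["qemu-img", "check"]
    if repair:
        check += ["-r", "all"]
    check.append(lv_path)

    if run(["lvchange", "-ay", lv_path]) != 0:
        raise RuntimeError("could not activate %s" % lv_path)
    try:
        return run(check)
    finally:
        # Deactivate in any case, as in the last step above.
        run(["lvchange", "-an", lv_path])
```

Passing a function that only prints the command as run shows what would be executed without touching the storage.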


I followed these steps dozens of times, and finding the LV and activating
it was obvious and successful.
Since yesterday, I'm finding some VMs on which these steps are not
working : I can identify the symbolic link, but neither the SPM nor the
host is able to find the LV device, and thus cannot activate it :


# lvchange -ay 
/dev/de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d
  Failed to find logical volume 
"de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d"


Either I need two more coffees, or I may be missing a step or
something to check.
Looking at the SPM /dev/disk/* structure, it looks very sound (I
can see my three storage domains' dm-name-* series of links).


As the VM can be started and stopped without trouble, does the host
activate something more before launching it?


--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] qemu-kvm images corruption

2017-09-15 Thread Nicolas Ecarnot

TL;DR:
How to avoid images corruption?


Hello,

On two of our old 3.6 DCs, a recent series of VM migrations led to some
issues :

- I'm putting a host into maintenance mode
- most of the VM are migrating nicely
- one remaining VM never migrates, and the logs are showing :

* engine.log : "...VM has been paused due to I/O error..."
* vdsm.log : "...Improbable extension request for volume..."

After digging amongst the RH BZ tickets, I saved the day by :
- stopping the VM
- lvchange -ay the adequate /dev/...
- qemu-img check [-r all] /rhev/blahblah
- lvchange -an...
- boot the VM
- enjoy!

Yesterday this worked for a VM where only one error occurred on the qemu 
image, and the repair was easily done by qemu-img.


Today, facing the same issue on another VM, it failed because the errors 
were very numerous, and also because of this message :


[...]
Rebuilding refcount structure
ERROR writing refblock: No space left on device
qemu-img: Check failed: No space left on device
[...]

The PV/VG/LV are far from being full, so I guess I don't know where to look.
I tried many ways to solve it but I'm not comfortable at all with qemu
images, corruption and repair, so I ended up exporting this VM (to an
NFS export domain) and importing it into another DC : this had the side
effect of running qemu-img convert from qcow2 to qcow2, and (maybe?) of
solving some errors???
I also copied it into another qcow2 file the same way, with qemu-img
convert, and this too led to a clean qcow2 image without errors.


I saw that on 4.x some bugs are fixed about VM migrations, but this is 
not the point here.
I checked my SANs, my network layers, my blades, the OS (CentOS 7.2) of 
my hosts, but I see nothing special.


The real reason behind my message is not to know how to repair anything,
but rather to understand what could have led to this situation.

Where to keep a keen eye?

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] iSCSI Multipath issues

2017-07-25 Thread Nicolas Ecarnot

Le 25/07/2017 à 10:26, Maor Lipchuk a écrit :

Hi Vinícius,

For some reason it looks like your networks are both connected to the same IPs.


Hi,

Sorry to jump in this thread, but I'm concerned with this issue.

Correct me if I'm wrong, but in this thread, many people are using
Equallogic SANs, which provide only one virtual IP to connect to.


--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] SQL : last time halted?

2017-07-06 Thread Nicolas Ecarnot

[For the record]

Juan,

Thanks to your hint, I eventually found it more convenient for me to use
a SQL query to find out which VMs have been unused for months :


SELECT
  vm_static.vm_name,
  vm_dynamic.status,
  vm_dynamic.vm_ip,
  vm_dynamic.vm_host,
  vm_dynamic.last_start_time,
  vm_dynamic.vm_guid,
  vm_dynamic.last_stop_time
FROM
  public.vm_dynamic,
  public.vm_static
WHERE
  vm_dynamic.vm_guid = vm_static.vm_guid AND
  vm_dynamic.status = 0
ORDER BY
  vm_dynamic.last_stop_time ASC;

Thank you.

--
Nicolas ECARNOT

Le 30/05/2017 à 17:29, Juan Hernández a écrit :

On 05/30/2017 05:02 PM, Nicolas Ecarnot wrote:

Hello,

I'm trying to find a way to clean up the VMs list of my DCs.
I think some of my users have created VMs they're not using anymore, but
it's difficult to sort them out.
In some cases, I can shut down some of them and wait.
Is the date of the last VM shutdown stored somewhere in the db tables?

Thank you.



Did you consider using the API? There is a 'stop_time' attribute that
you can use. For example, to list all the VMs and sort them by stop time
you can use the following Python script:

---8<---
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Create the connection to the server:
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='...',
    ca_file='/etc/pki/ovirt-engine/ca.pem'
)

# List the virtual machines:
vms_service = connection.system_service().vms_service()
vms = vms_service.list()

# Sort them by stop time:
vms.sort(key=lambda vm: vm.stop_time)

# Print the result:
for vm in vms:
    print("%s: %s" % (vm.name, vm.stop_time))

# Close the connection to the server:
connection.close()
--->8---
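
One possible pitfall with the sort in the script above : under Python 3, comparing a datetime with None raises a TypeError, so the sort would fail if some VM has no stop time. Whether stop_time can actually be None is an assumption here (for instance a VM that was never stopped). A minimal sketch of a sort that tolerates it, using stand-in objects instead of real SDK VMs :

```python
from collections import namedtuple
from datetime import datetime


def sort_by_stop_time(vms):
    """Oldest stop time first; VMs without a stop time go last."""
    # The boolean decides the order between "has a stop time" and "has
    # none", so a datetime is never compared with None.
    return sorted(vms, key=lambda vm: (vm.stop_time is None, vm.stop_time))


# Stand-in for SDK VM objects, for illustration only.
VM = namedtuple("VM", "name stop_time")
vms = [
    VM("never-stopped", None),
    VM("stopped-2017", datetime(2017, 5, 1)),
    VM("stopped-2016", datetime(2016, 1, 1)),
]
for vm in sort_by_stop_time(vms):
    print("%s: %s" % (vm.name, vm.stop_time))
```

The same key function could replace the lambda in vms.sort() in the script above.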




--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] SQL : last time halted?

2017-05-30 Thread Nicolas Ecarnot

Hello,

I'm trying to find a way to clean up the VMs list of my DCs.
I think some of my users have created VMs they're not using anymore, but
it's difficult to sort them out.

In some cases, I can shut down some of them and wait.
Is the date of the last VM shutdown stored somewhere in the db tables?


Thank you.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 3.6 on CentOS 6.7 based HyperConverged DC : Upgradable?

2017-03-29 Thread Nicolas Ecarnot

Le 29/03/2017 à 15:54, Yedidyah Bar David a écrit :

On Wed, Mar 29, 2017 at 4:35 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote:

[Please ignore the previous msg]

Hello,


Hello Didi,


One of our DC is a very small one, though quite critical.
It's almost hyper converged : hosts are compute+storage, but the engine is
standalone.


And you intend to keep it that way? You didn't mention below.


I intend to keep it this way.


At first glance, I would go this way (feel free to comment) :
- upgrade the OS of the engine : 6.7 -> 7.3


How? This isn't supported, in principle, although it might work.


Mmmm, yep, you're right. Many people documented it.
I'm not very fond of playing with such a critical part.


The "official" way is using engine-backup to backup and restore it.
See also:
https://bugzilla.redhat.com/show_bug.cgi?id=1332463


Ok,
Sent you a question there.


During this upgrade, I have no constraint to keep everything running, total
shutdown is acceptable.


So you can also do a full backup and restore.


I could.
But I may run out of hosts at present.


Create an NFS export domain
somewhere, export all your VMs there, recreate everything from scratch,
import the VMs. Will take much longer, but then you don't need to risk/
test/prepare for problems in upgrading gluster.


Well, so your overall opinion is that I should stick to KISS, correct?

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] oVirt 3.6 on CentOS 6.7 based HyperConverged DC : Upgradable?

2017-03-29 Thread Nicolas Ecarnot

[Please ignore the previous msg]

Hello,

One of our DC is a very small one, though quite critical.
It's almost hyper converged : hosts are compute+storage, but the engine 
is standalone.


It's made of :

Hardware :
- one physical engine : CentOS 6.7
- 3 physical hosts : CentOS 7.2

Software :
- oVirt 3.6.5
- glusterFS 3.7.16 in replica-3, sharded.

The goal is to upgrade all this to oVirt 4.1.1, and also upgrade the 
OSes. (oV 4.x only available on cOS 7.x)


At present, only 3 VMs here are critical, and I have backups for them.
Though, I'm quite nervous with the path I have to follow and the 
hazards. Especially about the gluster parts.


At first glance, I would go this way (feel free to comment) :
- upgrade the OS of the engine : 6.7 -> 7.3
- upgrade the OS of the hosts  : 7.2 -> 7.3
- upgrade and manage the upgrade of gluster, check the volumes...
- upgrade oVirt (engine then hosts)

But when upgrading the OSes, I guess it will also upgrade the gluster layer.

During this upgrade, I have no constraint to keep everything running, 
total shutdown is acceptable.


Does the above procedure seem OK, or am I missing some essential points?

Thank you.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] oVirt 3.6 on CentOS 7.1 based HyperConverged DC : Upgradable?

2017-03-29 Thread Nicolas Ecarnot

Hello,

One of our DC is a very small one, though quite critical.
It's almost hyper converged : hosts are compute+storage, but the engine 
is standalone.


It's made of :

Hardware :
- one physical engine : CentOS 7.1
- 3 physical hosts : CentOS 7.2

Software :
- oVirt 3.6.5
- glusterFS 3.7.16 in replica-3, sharded.

The goal is to upgrade all this to oVirt 4.1.1, and if possible also 
upgrade the OSes.


At present, only 3 VMs here are critical, and I have backups for them.
Though, I'm quite nervous with the path I have to follow and the 
hazards. Especially about the gluster parts.


At first glance, I would go this way (feel free to comment) :
- upgrade the OS of the engine : 7.1 -> 7.3
- upgrade the OS of the hosts  : 7.2 -> 7.3
- upgrade and manage the upgrade of gluster, check the volumes...
- upgrade oVirt (engine then hosts)

But when upgrading the OSes, I guess it will also upgrade the gluster layer.

During this upgrade, I have no constraint to keep everything running, 
total shutdown is acceptable.


Does the above procedure seem OK, or am I missing some essential points?

Thank you.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Upgrade guide for oVirt 4.1.x?

2017-03-27 Thread Nicolas Ecarnot

Le 27/03/2017 à 14:43, Yaniv Dary a écrit :

This is the page covering minor releases:
http://www.ovirt.org/documentation/upgrade-guide/chap-Updates_between_Minor_Releases/

Yaniv Dary Technical Product Manager Red Hat Israel Ltd. 34 Jerusalem
Road Building A, 4th floor Ra'anana, Israel 4350109 Tel : +972 (9)
7692306 8272306 Email: yd...@redhat.com <mailto:yd...@redhat.com> IRC :
ydary


Hi Yaniv,

Just a small note to say that on this page
http://www.ovirt.org/documentation/upgrade-guide/upgrade-guide/
the third link ("Chapter 3...") is pointing to the first one ("Chapter 
1: Updating the oVirt Environment").


Regards,

--
Nicolas Ecarnot
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt-engine failed to check for updates

2017-02-01 Thread Nicolas Ecarnot

Le 01/02/2017 à 18:18, Nicolas Ecarnot a écrit :

Le 01/02/2017 à 17:37, Michael Watters a écrit :

I have ovirt-engine 3.6 set up on a dedicated host which is managing
two ovirt hosts.  I am seeing errors when the engine attempts to
check for updates as follows.

Failed to check for available updates on host ovirt-node-production2
with message 'Command returned failure code 1 during SSH session
'root@ovirt-node-production2''.

I checked the logs on the host and it appears to be an issue with
yum.

2017-01-30 10:21:05 ERROR
otopi.plugins.ovirt_host_mgmt.packages.update update.error:102 Yum:
Cannot queue package ovirt-imageio-daemon: Package
ovirt-imageio-daemon cannot be found

2017-01-30 10:21:05 ERROR otopi.context context._executeMethod:151
Failed to execute stage 'Package installation': Package
ovirt-imageio-daemon cannot be found

Is there a way to resolve this?  I don't see any package named
ovirt-imageio-daemon in my repos when running a yum search.



___ Users mailing list
Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users



Hello Michael,

Last time we spoke about this issue, it was the fault of IPv6, which had
to be turned off (I'll let you search the relevant posts).


OK Michael,
Your case is different.

Anyway, for the record, I was referring to this :

http://lists.ovirt.org/pipermail/users/2016-September/076113.html



-
cat /etc/sysctl.d/noipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
-




--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt-engine failed to check for updates

2017-02-01 Thread Nicolas Ecarnot

Le 01/02/2017 à 17:37, Michael Watters a écrit :

I have ovirt-engine 3.6 set up on a dedicated host which is managing
two ovirt hosts.  I am seeing errors when the engine attempts to
check for updates as follows.

Failed to check for available updates on host ovirt-node-production2
with message 'Command returned failure code 1 during SSH session
'root@ovirt-node-production2''.

I checked the logs on the host and it appears to be an issue with
yum.

2017-01-30 10:21:05 ERROR
otopi.plugins.ovirt_host_mgmt.packages.update update.error:102 Yum:
Cannot queue package ovirt-imageio-daemon: Package
ovirt-imageio-daemon cannot be found

2017-01-30 10:21:05 ERROR otopi.context context._executeMethod:151
Failed to execute stage 'Package installation': Package
ovirt-imageio-daemon cannot be found

Is there a way to resolve this?  I don't see any package named
ovirt-imageio-daemon in my repos when running a yum search.



___ Users mailing list
Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users



Hello Michael,

Last time we spoke about this issue, it was the fault of IPv6, which had
to be turned off (I'll let you search the relevant posts).


-
cat /etc/sysctl.d/noipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
-

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Multipath handling in oVirt

2017-02-01 Thread Nicolas Ecarnot

Le 01/02/2017 à 15:31, Yura Poltoratskiy a écrit :

Here you are:

iSCSI multipathing
<https://dl.dropboxusercontent.com/u/106774860/iSCSI_multipathing.png>

network setup of a host
<https://dl.dropboxusercontent.com/u/106774860/host_network.png>



01.02.2017 15:31, Nicolas Ecarnot пишет:

Hello,

Before replying further, may I ask you, Yura, to post a screenshot of
your iSCSI multipathing setup in the web GUI?

And also the same for the network setup of a host ?

Thank you.





Thank you Yura.

To Yaniv and Pavel, yes, this leads to this oVirt feature of iSCSI 
multipathing, indeed.


I would be curious to see (on Yura's hosts for instance) the translation 
of the oVirt iSCSI multipathing in CLI commands (multipath -ll, iscsiadm 
-m session -P3, dmsetup table, ...)


Yura's setup seems to be perfectly fitted to oVirt (2 NICs, 2 VLANs, 2 
targets in different VLANs, iSCSI multipathing), but I'm trying to see 
how I could make this work with our Equallogic presenting one and only 
one virtual ip (thus one target VLAN)...


--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Multipath handling in oVirt

2017-02-01 Thread Nicolas Ecarnot

Hello,

Before replying further, may I ask you, Yura, to post a screenshot of 
your iSCSI multipathing setup in the web GUI?


And also the same for the network setup of a host ?

Thank you.

--
Nicolas ECARNOT

Le 01/02/2017 à 13:14, Yura Poltoratskiy a écrit :

Hi,

As for me personally, I have the following config: compute nodes with 4x1G nics
and storages with 2x1G nics and 2 switches (not stackable). All servers
run on CentOS 7.X (7.3 at this moment).

On compute nodes I have bonding with two nic1 and nic2 (attached to
different switches) for mgmt and VM's network, and the other two nics
nic3 and nic4 without bonding (and also attached to different switches).
On storage nodes I have no bonding and nics nic1 and nic2 connected to
different switches.

I have two networks for iSCSI: 10.0.2.0/24 and 10.0.3.0/24, nic1 of
storage and nic3 of computes connected to one network; nic2 of storage
and nic4 of computes - to another one.

On webUI I've created network iSCSI1 and iSCSI2 for nic3 and nic4, also
created multipath. To have active/active links with double bw throughput
I've added 'path_grouping_policy "multibus"' in defaults section of
/etc/multipath.conf.

After all of that, I have 200+MB/sec throughput to the storage (like
raid0 with 2 sata hdd) and I can lose one nic/link/switch without
stopping vms.

[root@compute02 ~]# multipath -ll
360014052f28c9a60 dm-6 LIO-ORG ,ClusterLunHDD
size=902G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 6:0:0:0 sdc 8:32  active ready running
  `- 8:0:0:0 sdf 8:80  active ready running
36001405551a9610d09b4ff9aa836b906 dm-40 LIO-ORG ,SSD_DOMAIN
size=915G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 7:0:0:0 sde 8:64  active ready running
  `- 9:0:0:0 sdh 8:112 active ready running
360014055eb8d30a91044649bda9ee620 dm-5 LIO-ORG ,ClusterLunSSD
size=135G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 6:0:0:1 sdd 8:48  active ready running
  `- 8:0:0:1 sdg 8:96  active ready running
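A minimal sketch of how output like the above could be summarised per LUN, for anyone wanting to compare hosts. It assumes the plain `multipath -ll` layout shown here (WWID first, no user-friendly alias); the names `parse_multipath`, `MAP_RE` and `PATH_RE` are made up for this illustration, and the sample is a trimmed copy of the listing above.

```python
import re

# Sample trimmed from the `multipath -ll` output pasted above.
SAMPLE = """\
360014052f28c9a60 dm-6 LIO-ORG ,ClusterLunHDD
size=902G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 6:0:0:0 sdc 8:32  active ready running
  `- 8:0:0:0 sdf 8:80  active ready running
36001405551a9610d09b4ff9aa836b906 dm-40 LIO-ORG ,SSD_DOMAIN
size=915G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 7:0:0:0 sde 8:64  active ready running
  `- 9:0:0:0 sdh 8:112 active ready running
"""

# Map header line: "<wwid> dm-N vendor,product" (assumes no alias is configured).
MAP_RE = re.compile(r"^(?P<wwid>[0-9a-f]+)\s+(?P<dm>dm-\d+)\s")
# Path line: "|- H:C:T:L sdX major:minor state state state".
PATH_RE = re.compile(
    r"^\s*[|`]-\s+(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>\w+)\s+\S+\s+(?P<state>.+)$"
)


def parse_multipath(text):
    """Return {dm_name: [(device, [state words]), ...]} for each multipath map."""
    maps = {}
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = maps.setdefault(header.group("dm"), [])
            continue
        path = PATH_RE.match(line)
        if path and current is not None:
            current.append((path.group("dev"), path.group("state").split()))
    return maps


if __name__ == "__main__":
    for dm, paths in sorted(parse_multipath(SAMPLE).items()):
        print(dm, [dev for dev, _ in paths])
```

Two paths per map, all "active ready running", is what one would expect from the two-NIC layout described in this message.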

[root@compute02 ~]# iscsiadm -m session
tcp: [1] 10.0.3.200:3260,1 iqn.2015-09.lab.lnx-san:storage (non-flash)
tcp: [2] 10.0.3.203:3260,1 iqn.2016-10.local.ntu:storage3 (non-flash)
tcp: [3] 10.0.3.200:3260,1 iqn.2015-09.lab.lnx-san:storage (non-flash)
tcp: [4] 10.0.3.203:3260,1 iqn.2016-10.local.ntu:storage3 (non-flash)

[root@compute02 ~]# ip route show | head -4
default via 10.0.1.1 dev ovirtmgmt
10.0.1.0/24 dev ovirtmgmt  proto kernel  scope link  src 10.0.1.102
10.0.2.0/24 dev enp5s0.2  proto kernel  scope link  src 10.0.2.102
10.0.3.0/24 dev enp2s0.3  proto kernel  scope link  src 10.0.3.102

[root@compute02 ~]# brctl show ovirtmgmt
bridge name bridge id   STP enabled interfaces
ovirtmgmt   8000.000475b4f262   no bond0.1001

[root@compute02 ~]# cat /proc/net/bonding/bond0 | grep "Bonding\|Slave Interface"
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: fault-tolerance (active-backup)
Slave Interface: enp4s6
Slave Interface: enp6s0


01.02.2017 12:50, Nicolas Ecarnot пишет:

Hello,

I'm starting over on this subject because I wanted to clarify what was
the oVirt way to manage multipathing.

(Here I will talk only about the data/iSCSI/SAN/LUN/you name it
networks.)
According to what I see in the host network setup, one can assign
*ONE* data network to an interface or to a group of interfaces.

That implies that if my host has two data-dedicated interfaces, I can
- either group them using bonding (and oVirt is handy for that in the
host network setup), then assign the data virtual network to this bond
- or give each nic its own ip in its own VLAN, define two different data
networks, and assign one of them to each nic. I have never tried this
and don't know where it leads.

At first, may the oVirt storage experts comment on the above to check
it's ok.

Then, as many users here, our hardware is this :
- Hosts : Dell poweredge, mostly blades (M610,620,630), or rack servers
- SANs : Equallogic PS4xxx and PS6xxx

Equallogic's recommendation is that bonding is evil in iSCSI access.
To them, multipath is the only true way.
After reading tons of docs and using Dell support, everything is
telling me to use at least two different NICs with different ip, not
bonded - using the same network is bad but ok.

How can oVirt handle that ?






--
Nicolas ECARNOT


[ovirt-users] Multipath handling in oVirt

2017-02-01 Thread Nicolas Ecarnot

Hello,

I'm starting over on this subject because I wanted to clarify what was 
the oVirt way to manage multipathing.


(Here I will talk only about the data/iSCSI/SAN/LUN/you name it networks.)
According to what I see in the host network setup, one can assign *ONE* 
data network to an interface or to a group of interfaces.


That implies that if my host has two data-dedicated interfaces, I can
- either group them using bonding (and oVirt is handy for that in the 
host network setup), then assign the data virtual network to this bond
- or give each nic its own ip in its own VLAN, define two different data 
networks, and assign one of them to each nic. I have never tried this 
and don't know where it leads.


At first, may the oVirt storage experts comment on the above to check 
it's ok.


Then, as many users here, our hardware is this :
- Hosts : Dell poweredge, mostly blades (M610,620,630), or rack servers
- SANs : Equallogic PS4xxx and PS6xxx

Equallogic's recommendation is that bonding is evil in iSCSI access.
To them, multipath is the only true way.
After reading tons of docs and using Dell support, everything is telling 
me to use at least two different NICs with different ip, not bonded - 
using the same network is bad but ok.


How can oVirt handle that ?

--
Nicolas ECARNOT


Re: [ovirt-users] VMWare VSAN like setup with oVirt

2017-01-31 Thread Nicolas Ecarnot

Le 31/01/2017 à 09:15, Anantha Raghava a écrit :

Hi,

We are trying to create a setup that uses the internal disks of the
hosts / nodes, yet provide the high availability, replication and
failover using oVirt. The setup we are trying to build is close to
VMWare VSAN which allows for all the above just using the internal disks
of the ESXi servers.

Can we achieve something similar with oVirt with Gluster?


Absolutely. One of our oVirt setups is built this way.
Three hosts are set up as GlusterFS servers (replica 3), as well as 
oVirt nodes.
We chose to add a fourth host as a standalone engine, but you can 
run the engine in a VM instead (hyperconverged setup).


I have no experience with a similar setup on an arbitrary number of nodes, 
nor do I know whether it is achievable (some kind of network RAID-10)... (?)


--
Nicolas ECARNOT


Re: [ovirt-users] guest often looses connectivity I have to ping gateway

2017-01-26 Thread Nicolas Ecarnot

Le 26/01/2017 à 09:03, Gianluca Cecchi a écrit :

On Thu, Jan 26, 2017 at 8:45 AM, Pavel Gashev <p...@acronis.com
<mailto:p...@acronis.com>> wrote:

Gianluca,

It looks like VM doesn't receive broadcasts. It can be a network
topology issue.
Could you double check /sys/class/net/bond1/bonding/mode and
/sys/class/net/bond1/bonding/slaves ?

Is it possible you have another VM with the same MAC address in the
same network segment?


Pavel, I think you are right! Thanks!
I didn't take into consideration that there is another oVirt environment
that has some VMs on this vlan..
And I found a VM with the same mac 00:1a:4a:16:01:51 (and a different ip)
Now I powered off that other VM, restarted my one and things seem ok.

What is the best way to manage when several oVirt environments have VMs on
the same vlans?


I encountered the same problem some years ago, as we have multiple oVirt 
environments.

We decided to assign specific MAC pools for each env to avoid overlapping.
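As a rough illustration of what "no overlapping" means in practice, here is a small sketch that checks whether two MAC ranges intersect. The pool boundaries below are hypothetical (only the colliding address 00:1a:4a:16:01:51 comes from this thread), and the helper names are invented for the example.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer so ranges can be compared."""
    return int(mac.replace(":", ""), 16)


def in_range(mac, pool):
    """True if mac lies inside the (first, last) pool, bounds included."""
    first, last = (mac_to_int(m) for m in pool)
    return first <= mac_to_int(mac) <= last


def ranges_overlap(pool_a, pool_b):
    """True if the two (first, last) MAC ranges share at least one address."""
    a_first, a_last = (mac_to_int(m) for m in pool_a)
    b_first, b_last = (mac_to_int(m) for m in pool_b)
    return a_first <= b_last and b_first <= a_last


# Hypothetical pools, one per oVirt environment.
POOL_ENV_A = ("00:1a:4a:16:01:00", "00:1a:4a:16:01:ff")
POOL_ENV_B = ("00:1a:4a:16:02:00", "00:1a:4a:16:02:ff")

if __name__ == "__main__":
    print(ranges_overlap(POOL_ENV_A, POOL_ENV_B))       # disjoint pools
    print(in_range("00:1a:4a:16:01:51", POOL_ENV_A))    # MAC seen in this thread
```

Running such a check over every pair of pools before a new environment goes live would catch the kind of duplicate Gianluca ran into.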

--
Nicolas ECARNOT


Re: [ovirt-users] [PySDK v3] Choose storage domain

2017-01-24 Thread Nicolas Ecarnot

Le 24/01/2017 à 13:18, Nicolas Ecarnot a écrit :

OK, just one second before sending this e-mail, I made a quick test with
the template object and it is working anyway.


Sorry for the noise, but...
No, actually, it wasn't working. I need to start from the api object to 
reach the actual disks.


--
Nicolas ECARNOT


Re: [ovirt-users] [PySDK v3] Choose storage domain

2017-01-24 Thread Nicolas Ecarnot

Juan,

Thank you very much for your help, this is working.

Some comments below.

Le 24/01/2017 à 11:04, Juan Hernández a écrit :

In order to do that you need to specify that you want to clone the disks
of the template, and for each disk you need to specify the storage
domain where you want to create it.
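The approach described just above could be sketched roughly as follows for SDK v3. This is an assumption-laden illustration, not Juan's exact code: it relies on the v3 `params` classes `Disks` (with `clone=True`), `Disk`, `StorageDomains` and `StorageDomain` nesting as remembered from that SDK, and the function name `build_clone_vm_params` is invented here. The `params` module is passed in as an argument so the builder can be exercised without a running engine.

```python
def build_clone_vm_params(params, vm_name, cluster_id, template_id,
                          disk_ids, storage_domain_id):
    """Build a VM description that clones every template disk to one domain.

    `params` is expected to be the `ovirtsdk.xml.params` module (SDK v3).
    Only ids are embedded, so nothing heavier than needed is sent to the engine.
    """
    # The target domain, wrapped the way the v3 SDK nests collections.
    target = params.StorageDomains(
        storage_domain=[params.StorageDomain(id=storage_domain_id)]
    )
    # One entry per template disk, each pointing at the target domain.
    disks = params.Disks(
        clone=True,
        disk=[params.Disk(id=disk_id, storage_domains=target)
              for disk_id in disk_ids],
    )
    return params.VM(
        name=vm_name,
        cluster=params.Cluster(id=cluster_id),
        template=params.Template(id=template_id),
        disks=disks,
    )
```

With a real connection this would presumably be used as `api.vms.add(build_clone_vm_params(params, ...))`, after `from ovirtsdk.xml import params` and after collecting the disk ids from the template.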


This is what I feared, and it was not really obvious at first sight (to 
prepare a disks list...).

I think I saw it was less weird in V4.


[...]



Also, please be careful when specifying the cluster and the template.
You are currently doing this:

  cluster=vm_cluster,
  template=vm_template,

Not sure how you are assigning the values to those 'vm_cluster' and
'vm_template' variables, but you are probably doing this:

  vm_cluster = api.clusters.get(name='mycluster')
  vm_template = api.templates.get(name='mytemplate')


Here is what I was doing (please don't laugh) :

c_list = api.clusters.list()
# At present (2017), each datacenter contains only one cluster
vm_cluster = c_list[0]

vm_template = api.templates.get(name=template_name)


That combination isn't ideal, because you will be sending with the 'add'
request the complete representation of the cluster and the template,
when the server only needs the id or the name. Consider doing this instead:

  cluster=params.Cluster(
    id=vm_cluster.get_id()
  ),
  template=params.Template(
    id=vm_template.get_id()
  ),


Is it right to say that this last method is only valid in the api 
context, and would not work outside of the api object scope?
I mean : if I want to use your method, I can no longer separate the 
preparation of a vm_params object before calling the vms.add, right?





OK, just one second before sending this e-mail, I made a quick test with 
the template object and it is working anyway.

Does it mean nothing is instantiated before the api.vms.add call?

--
Nicolas ECARNOT


[ovirt-users] [PySDK v3] Choose storage domain

2017-01-24 Thread Nicolas Ecarnot

Hello,

When trying to create a VM by cloning a template, I found out I couldn't 
choose the target storage domain :


[...]
vm_storage_domain = api.storagedomains.get(name=storage_domain)
vm_params = params.VM(name=vm_name,
                      memory=vm_memory,
                      cluster=vm_cluster,
                      template=vm_template,
                      os=vm_os,
                      storage_domain=vm_storage_domain,
                      )
try:
    api.vms.add(vm=vm_params)
[...]

I'm getting :
Failed to create VM from Template:

status: 400
reason: Bad Request
detail: Cannot add VM. The selected Storage Domain does not contain the 
VM Template.


... which I know, but I thought that, as with the GUI, I could specify 
the target storage domain.


I made my homework, and I found a nice answer from Juan :
http://lists.ovirt.org/pipermail/users/2016-January/037321.html
but this relates to snapshots, and not to template usage, so I'm still 
stuck.

May I ask for advice?

--
Nicolas ECARNOT


Re: [ovirt-users] oVirt Python SDK on Ubuntu

2017-01-23 Thread Nicolas Ecarnot

Le 23/01/2017 à 15:05, Ondra Machacek a écrit :

Alas, though we already have one DC in V4, most of our production DCs are
still in V3 for one year, and I have to maintain them.
So far, I have no clue how to add ovirtsdk v3 to my Ubuntu.


You just need to specify which version you would like to install, in your case:

 easy_install ovirt-engine-sdk-python==3.6.9.2


Nice, this is working!
Thank you.

--
Nicolas ECARNOT


[ovirt-users] oVirt Python SDK on Ubuntu

2017-01-23 Thread Nicolas Ecarnot

Hello,

I'm trying to follow 
http://www.ovirt.org/develop/release-management/features/infra/python-sdk/ 
and I'm successfully discovering Python + oVirt SDK on CentOS.


I'd like to do the same on Ubuntu, but the instructions seem incomplete :

"
easy_install ovirt-engine-sdk-python
"

is working, but "import ovirtsdk" doesn't give anything.



"
apt-get install python-lxml
cd ovirt-engine-sdk
python setup.py install
"

is wrong because the "cd" isn't going anywhere, obviously.

What am I missing?

--
Nicolas ECARNOT


Re: [ovirt-users] Delay difference between queries (Python vs REST)

2017-01-17 Thread Nicolas Ecarnot

Le 17/01/2017 à 16:26, Juan Hernández a écrit :

On 01/17/2017 03:56 PM, Nicolas Ecarnot wrote:

Hello,

On a 3.6.5 DC, I'm trying to figure out how many VMs there are, using
two methods :

Python SDK:

from ovirtsdk.xml import params
from ovirtsdk.api import API
api = API(url='https://engine.fqdn/ovirt-engine/api',
          username='admin@internal', password='xxx', insecure=True)
print len(api.vms.list())

time ./getMvm.py
62

real    0m23.016s
user    0m22.288s
sys     0m0.054s


REST:

time curl -H "Version: 3" -H "Prefer: persistent-auth" -H "Filter: false" \
  -H "Accept: application/xml" -H "Content-Type: application/xml" \
  -k -u 'admin@internal:xxx' https://engine.fqdn/ovirt-engine/api/vms

(Then grep or anything that would get the values from the xml returned.)

real    0m0.383s
user    0m0.036s
sys     0m0.038s


I am a beginner in both methods, but I would prefer to play with Python.
I'm very surprised to have to wait more than 20 seconds to get an answer.
Looking at the engine log, I see that the authentication part is
finished after about 3 seconds, followed by 20 seconds with absolutely
no error message, no CPU load, no RAM burst, nothing.
On the SPM, there is likewise nothing at all (nada, niet, void) that
would obviously explain such a delay.

I'm wondering if this extreme sluggishness is somehow related to
the global GUI slowness I have been experiencing, like other users, since
we left 3.2.x. I would love some oVirt ninja to use the comparison above
to tell which parts of oVirt are involved in each case and could explain
such a difference (database, access to SPM, LVM, network access, whatever...)

--
Nicolas ECARNOT



The performance problem is inside version 3 of the Python SDK. That is
one of the reasons that we had to do a new version of the Python SDK for
version 4 of the engine. If you are using version 4 of the engine then
you can use version 4 of the SDK:

  https://github.com/oVirt/ovirt-engine-sdk/tree/master/sdk

https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/list_vms.py

It should be much faster. Would be nice if you can repeat your test and
report the results.
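For reference, the same VM count with version 4 of the SDK could look roughly like the sketch below. The URL and credentials are the placeholders used earlier in this thread; the optional `sdk` argument is a convenience invented for this example so the function can be tried with a stand-in module when `ovirtsdk4` is not installed.

```python
def count_vms(url, username, password, sdk=None):
    """Return the number of VMs known to the engine, using SDK v4 calls."""
    if sdk is None:
        # Provided by the python-ovirt-engine-sdk4 package.
        import ovirtsdk4 as sdk
    connection = sdk.Connection(
        url=url,
        username=username,
        password=password,
        insecure=True,  # same as -k with curl: skips certificate checks
    )
    try:
        vms_service = connection.system_service().vms_service()
        return len(vms_service.list())
    finally:
        # Always release the session, even if the listing fails.
        connection.close()


if __name__ == "__main__":
    print(count_vms("https://engine.fqdn/ovirt-engine/api",
                    "admin@internal", "xxx"))
```

In production one would pass a CA file instead of `insecure=True`.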



Hello Juan,

Indeed, you were right. I tried the same from a recent server with a 
recent SDK; have a look :


# rpm -q python-ovirt-engine-sdk4
python-ovirt-engine-sdk4-4.0.1-1.el7.centos.x86_64

# time ./getMvm.py
62

real    0m1.004s
user    0m0.234s
sys     0m0.031s

And repeating the same test gives a very decent average, so thank you.

--
Nicolas ECARNOT


[ovirt-users] Delay difference between queries (Python vs REST)

2017-01-17 Thread Nicolas Ecarnot

Hello,

On a 3.6.5 DC, I'm trying to figure out how many VMs there are, using 
two methods :


Python SDK:

from ovirtsdk.xml import params
from ovirtsdk.api import API
api = API(url='https://engine.fqdn/ovirt-engine/api',
          username='admin@internal', password='xxx', insecure=True)

print len(api.vms.list())

time ./getMvm.py
62

real    0m23.016s
user    0m22.288s
sys     0m0.054s


REST:

time curl -H "Version: 3" -H "Prefer: persistent-auth" -H "Filter: false" \
  -H "Accept: application/xml" -H "Content-Type: application/xml" \
  -k -u 'admin@internal:xxx' https://engine.fqdn/ovirt-engine/api/vms


(Then grep or anything that would get the values from the xml returned.)

real    0m0.383s
user    0m0.036s
sys     0m0.038s
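Instead of grep, the XML returned by the REST call could be counted with a few lines of Python. This is only a sketch: the sample document below is a hypothetical, heavily trimmed stand-in for the real `/api/vms` answer (assumed to be a `<vms>` root with one `<vm>` child per machine), and the VM names are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for the body returned by GET /ovirt-engine/api/vms.
SAMPLE_XML = """<vms>
  <vm id="11111111-1111-1111-1111-111111111111"><name>web01</name></vm>
  <vm id="22222222-2222-2222-2222-222222222222"><name>db01</name></vm>
</vms>"""


def vm_names(xml_text):
    """Return the names of all <vm> elements directly under the root."""
    root = ET.fromstring(xml_text)
    return [vm.findtext("name") for vm in root.findall("vm")]


if __name__ == "__main__":
    names = vm_names(SAMPLE_XML)
    print(len(names), names)
```

Piping the curl output into such a script keeps the fast REST path while still giving a count comparable to `len(api.vms.list())`.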


I am a beginner in both methods, but I would prefer to play with Python. 
I'm very surprised to have to wait more than 20 seconds to get an answer.
Looking at the engine log, I see that the authentication part is 
finished after about 3 seconds, followed by 20 seconds with absolutely 
no error message, no CPU load, no RAM burst, nothing.
On the SPM, there is likewise nothing at all (nada, niet, void) that 
would obviously explain such a delay.


I'm wondering if this extreme sluggishness is somehow related to 
the global GUI slowness I have been experiencing, like other users, since 
we left 3.2.x. I would love some oVirt ninja to use the comparison above 
to tell which parts of oVirt are involved in each case and could explain 
such a difference (database, access to SPM, LVM, network access, whatever...)


--
Nicolas ECARNOT


Re: [ovirt-users] Ovirt host activation and lvm looping with high CPU load trying to mount iSCSI storage

2017-01-12 Thread Nicolas Ecarnot

Hi,

As our hardware and usage are very similar to Mark's (Dell 
PowerEdge hosts, Dell Equallogic SAN, iSCSI, and tons of LUNs for all 
those VMs), I'm jumping into this thread.


Le 12/01/2017 à 16:29, Yaniv Kaul a écrit :


While it's a bit of a religious war on what is preferred with iSCSI - 
network level bonding (LACP) or multipathing on the iSCSI level, I'm 
on the multipathing side. The main reason is that you may end up 
easily using just one of the paths in a bond - if your policy is not 
set correctly on how to distribute connections between the physical 
links (remember that each connection sticks to a single physical link. 
So it really depends on the hash policy and even then - not so sure). 
With iSCSI multipathing you have more control - and it can also be 
determined by queue depth, etc.
(In your example, if you have SRC A -> DST 1 and SRC B -> DST 1 (as 
you seem to have), both connections may end up on the same physical NIC.)
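To make the point above concrete, here is a deliberately simplified sketch of a layer2-style transmit hash, loosely following the description in the Linux bonding documentation (XOR of the last MAC bytes, modulo the number of slaves; the packet-type term is left out). All MAC addresses are hypothetical and `layer2_slave` is a name made up for this example.

```python
def layer2_slave(src_mac, dst_mac, slave_count):
    """Pick a slave index from the last byte of each MAC (simplified layer2 hash)."""
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % slave_count


# Hypothetical addresses for two initiators and one target portal.
SRC_A = "52:54:00:00:00:10"
SRC_B = "52:54:00:00:00:12"
DST_1 = "52:54:00:00:00:20"

if __name__ == "__main__":
    # Both flows hash to the same slave, so one physical link carries everything.
    print(layer2_slave(SRC_A, DST_1, 2), layer2_slave(SRC_B, DST_1, 2))
```

With multipath, by contrast, the path selector decides per I/O, so both links can be used regardless of how the addresses happen to hash.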


If we reduce the number of storage domains, we reduce the number
of devices and therefore the number of LVM Physical volumes that
appear in Linux correct? At the moment each connection results in
a Linux device which has its own queue. We have some guests with
high IO loads on their device whilst others are low. All the
storage domain / datastore sizing guides we found seem to imply
it’s a trade-off between ease of management (i.e not having
millions of domains to manage), IO contention between guests on a
single large storage domain / datastore and possible wasted space
on storage domains. If you have further information on
recommendations, I am more than willing to change things as this
problem is making our environment somewhat unusable at the moment.
I have hosts that I can’t bring online and therefore reduced
resiliency in clusters. They used to work just fine but the
environment has grown over the last year and we also upgraded the
Ovirt version from 3.6 to 4.x. We certainly had other problems,
but host activation wasn’t one of them and it’s a problem that’s
driving me mad.


I would say that each path has its own device (and therefore its own 
queue). So I'd argue that you may want to have (for example) 4 paths 
to each LUN or perhaps more (8?). For example, with 2 NICs, each 
connecting to two controllers, each controller having 2 NICs (so no 
SPOF and nice number of paths).

Here is one key point I have been trying (to no avail) to discuss for years 
with Red Hat people; either I did not understand, or I wasn't clear 
enough, or the Red Hat people who answered owned no Equallogic SAN to 
test it on :
My Equallogic SAN (and maybe many others') has two controllers, but 
publishes only *ONE* virtual ip address.
On another SAN of ours, an EMC publishing *TWO* ip addresses, which can be 
placed in two different subnets, I fully understand the benefits and 
workings of multipathing (and even in the same subnet, our oVirt setup is 
happily using multipath).


But on one of our oVirt setups using the Equallogic SAN, we have no 
choice but to point our hosts' iSCSI interfaces to one single SAN ip, so no 
multipath here.


At this point, we saw no other means than using bonding mode 1 to reach 
our SAN, which is terrible in the eyes of storage experts.



To come back to Mark's story, we are still using 3.6.5 DCs and planning 
to upgrade.

Reading all this is making me delay this step.

--
Nicolas ECARNOT


Re: [ovirt-users] Request for oVirt Ansible modules testing feedback

2017-01-04 Thread Nicolas Ecarnot

Hello,

Le 04/01/2017 à 11:49, Nathanaël Blanchet a écrit :



Le 04/01/2017 à 10:09, Andrea Ghelardi a écrit :


Personally I don’t think ansible and ovirt-shell are mutually exclusive.

Those who are in ansible and devops realms are not really scared by
making python/ansible work with ovirt.

From what I gather, playbooks are quite a de-facto pre-requisite to
build up a real SaaC “Software as a Code” environment.



On the other hand, ovirt-shell can and is a fast/easy way to perform
“normal daily tasks”.


totally agree but ovirt-shell is deprecated in 4.1 and will be removed in
4.2. Ansible or sdk4 are proposed as an alternative.


Could someone point me to a URL where sdk4 is fully documented, as I 
have to get ready for ovirt-shell deprecation?


I'm sure no one at Redhat thought about deprecating a tool in favor of a 
new one before providing a complete user doc!


--
Nicolas ECARNOT


Re: [ovirt-users] Windows Product Activation

2016-12-22 Thread Nicolas Ecarnot

Le 22/12/2016 à 22:06, Michal Skrivanek a écrit :

I read more about this WPA issue, and I also checked : all our
licences are of the MAK_B kind which, from everything I read, should
not induce such WPA trouble once they are correctly registered
(which I obviously take care of). I also read the list of
components that are checked to create a hashed key linked to the
licence. As you wrote, changing too many components triggers a
validity break.

Knowing this, may I ask you to comment on the promising "VM Custom
Serial Number" Alex was talking about : it sounded perfect, but
eventually not enough to cope with the hardware change?


That’s what it is for. Does it not work for this case?


I still have additional tests to do to validate it.
Moreover, as this WPA issue is triggered after 30 days, this kind of
tests is taking quite a lot of time.

Stay tuned.

PS : As usual, I'm very thankful to all who replied, and more generally
to everyone on this mailing list for your help throughout the year.

--
Nicolas ECARNOT


Re: [ovirt-users] Windows Product Activation

2016-12-22 Thread Nicolas Ecarnot

Le 22/12/2016 à 17:26, Yaniv Kaul a écrit :

Windows activation, at least for 2008 and below, depends on enough 
hardware changes happening. Each HW (of non-pluggable devices) change 
is a single 'penalty' point - except for NIC (based on MAC address) 
which is more. 4 or so points - and it requires re-activation. This 
does not apply to KMS licenses.


So unless you drastically change the hardware, you should be safe.
Y.

Hello Yaniv,

When migrating, these VMs can jump from a recent hardware host to an 
older one, with a different generation CPU (though of the same intel kind).


I read more about this WPA issue, and I also checked : all our licences 
are of the MAK_B kind which, from everything I read, should not induce such 
WPA trouble once they are correctly registered (which I obviously take 
care of).
I also read the list of components that are checked to create a hashed 
key linked to the licence.

As you wrote, changing too many components triggers a validity break.

Knowing this, may I ask you to comment on the promising "VM Custom 
Serial Number" Alex was talking about : it sounded perfect, but 
eventually not enough to cope with the hardware change?


--
Nicolas ECARNOT


Re: [ovirt-users] Windows Product Activation

2016-12-21 Thread Nicolas Ecarnot

Le 21/12/2016 à 16:13, Tom Gamull a écrit :

My “not relevant” response may be relevant now
Snapshot that server before and run your tests over and over. If you hit
the limit you can restore the snapshot.  That’s what I was trying to
explain.  If you hit the rearm limit without a backup to restore you are
going to be in a tough place.


You're totally right, at first I didn't understand where you were going.
Indeed, this sounds like a perfect time to use snapshots.

Thanks Tom.

Nicolas ECARNOT



Tom Gamull



On Dec 21, 2016, at 10:11 AM, Nicolas Ecarnot <nico...@ecarnot.net
<mailto:nico...@ecarnot.net>> wrote:

Le 21/12/2016 à 16:04, Tom Gamull a écrit :

Are there any events in the event log (usually Application Log entries


With a test server, I'm trying to forcibly reproduce the issue, so
I'll tell you soon.


under Source: Software Licensing Service) similar to this (this error is
unrelated, just example of event log)
- https://support.microsoft.com/en-us/kb/921471
I would consider reporting this to Microsoft, I am unaware of 2008 R2
having this behavior (I have seen 2008 R2 used on KVM and libvirt for
openstack without issue and be migrated).

Tom Gamull



On Dec 21, 2016, at 9:48 AM, Nicolas Ecarnot <nico...@ecarnot.net
<mailto:nico...@ecarnot.net>> wrote:

Le 21/12/2016 à 15:36, Tom Gamull a écrit :

Under Edit Virtual Machine -> System -> (Advanced Parameters) there is a
Custom CPU Type you may be able to set, are all the hosts in the same
cluster?


On every VM we use the cluster default setting.

And on all our DC we use the same cpu setting.

--
Nicolas ECARNOT





--
Nicolas ECARNOT





--
Nicolas ECARNOT


Re: [ovirt-users] Windows Product Activation

2016-12-21 Thread Nicolas Ecarnot

Le 21/12/2016 à 16:04, Tom Gamull a écrit :

Are there any events in the event log (usually Application Log entries


With a test server, I'm trying to forcibly reproduce the issue, so I'll 
tell you soon.



under Source: Software Licensing Service) similar to this (this error is
unrelated, just example of event log)
- https://support.microsoft.com/en-us/kb/921471
I would consider reporting this to Microsoft, I am unaware of 2008 R2
having this behavior (I have seen 2008 R2 used on KVM and libvirt for
openstack without issue and be migrated).

Tom Gamull



On Dec 21, 2016, at 9:48 AM, Nicolas Ecarnot <nico...@ecarnot.net
<mailto:nico...@ecarnot.net>> wrote:

Le 21/12/2016 à 15:36, Tom Gamull a écrit :

Under Edit Virtual Machine -> System -> (Advanced Parameters) there is a
Custom CPU Type you may be able to set, are all the hosts in the same
cluster?


On every VM we use the cluster default setting.

And on all our DC we use the same cpu setting.

--
Nicolas ECARNOT





--
Nicolas ECARNOT


Re: [ovirt-users] Windows Product Activation

2016-12-21 Thread Nicolas Ecarnot

Le 21/12/2016 à 15:17, Alexander Wels a écrit :

On Wednesday, December 21, 2016 2:27:05 PM EST Nicolas Ecarnot wrote:

Hello,

Most of our virtual machines are Linux, but an increasing number of
windows VMs are being integrated into our oVirt DCs.

We bought tons of windows server licences, and successfully activated them.

Due to how Windows Product Activation is working, when a windows VM is
migrating from a host to another, this product activation is reset,
launching a 30 days countdown to auto-shutdown.

According to this old page :

https://mazimi.wordpress.com/2007/07/11/getting-around-windows-activation-when-virtualizing/

and what I can read in microsoft's 2012 server documentations, I then
can re-activate it twice during the next 90 days.

Assuming I *want* to have *no* control upon the location of the VMs
amongst their hosts (I want them to fly freely, confident in the lovely
auto-balance scheduler), I understand all this is not the way to go.

At present, we have 2003, 2008 and 2012 server editions.
The only things I can read about windows 2012 server are related to the
commercial aspects (standard licence = 2 VMs, datacenter licence =
infinite # of VMs), but not about this Windows Product Activation trouble.

How do you deal with this?
Is there a special licence type or something dedicated that would
prevent such an uncomfortable situation? (Christmas is near, I favor
soft terms.)

Regards.


Nicolas,

IIRC this is what the custom serial number setting is for. As far as I know
what happens when you migrate is that some id that windows looks for is
changed (because it is generated based on an id at the host level). You can set
a custom single value regardless of which host the VM is running on by opening
up the edit virtual machine in the UI, then clicking system, at the bottom
there is a check box called 'Provide custom serial number policy'. Then you
can select VM ID.

Once you have done that, if I understand the feature correctly, the ID won't
change and windows will not think you have new hardware each time the VM
migrates.

I could be wrong, but I believe this is what you are looking for.



This sounds very encouraging.
I have additional tests to drive.

I hope I will report here soon.

Thank you Alex.

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Windows Product Activation

2016-12-21 Thread Nicolas Ecarnot

Le 21/12/2016 à 15:36, Tom Gamull a écrit :

Under Edit Virtual Machine -> System -> (Advanced Parameters) there is a
Custom CPU Type you may be able to set, are all the hosts in the same
cluster?


On every VM we use the cluster default setting.

And on all our DC we use the same cpu setting.

--
Nicolas ECARNOT


Re: [ovirt-users] Windows Product Activation

2016-12-21 Thread Nicolas Ecarnot

Tom,

Thank you for answering.

Le 21/12/2016 à 14:47, Tom Gamull a écrit :

Something is triggering the activation that windows is detecting as a
change in hardware.


Our DCs are made of hosts from 3 different models, so chances are that 
windows is detecting a different CPU ID or something (that is a pity, as 
I thought all this was hidden from the guest)



I’ve not had this problem on 2012 or past versions,


It may be true that we only encountered these issues on 2008 R2 guests.


you’ll usually encounter it when changing the hardware drivers (such as
converting from physical to virtual).  Generally you want to install
compatible drivers (like the ovirt windows guest tools).


Every guest here has the oVirt guest tools installed.
Since then, we made no driver change, neither on hosts nor guests.


 A good
practice though is to snapshot before you make a change such as drivers
in case you need to set the activation key.
For Desktops in VDI when you use a gold image, you generally make a
snapshot before activation - see here for an answer
- 
https://social.technet.microsoft.com/Forums/en-US/25c4c85c-c8a9-4316-8bfa-d3b7848e6dc6/microsoft-vdi-collections-and-activation?forum=winserver8setup


I'm not sure this was relevant.


what kind of activation keys are you using?


Further readings lead me to think that the kind of key IS the main 
reason I'm facing this.



Do you have a KMS server?


No. I was told to be very prudent with using KMS servers, so not planned.

--
Nicolas ECARNOT


[ovirt-users] Windows Product Activation

2016-12-21 Thread Nicolas Ecarnot

Hello,

Most of our virtual machines are Linux, but an increasing number of 
windows VMs are being integrated into our oVirt DCs.


We bought tons of Windows Server licences, and successfully activated them.

Because of how Windows Product Activation works, when a Windows VM 
migrates from one host to another, its product activation is reset, 
which starts a 30-day countdown to auto-shutdown.


According to this old page:

https://mazimi.wordpress.com/2007/07/11/getting-around-windows-activation-when-virtualizing/

and what I can read in Microsoft's Server 2012 documentation, I can then 
re-activate it twice during the next 90 days.


Assuming I *want* to have *no* control over the location of the VMs 
among their hosts (I want them to fly freely, confident in the lovely 
auto-balance scheduler), I understand that this is not the way to go.


At present, we have the 2003, 2008 and 2012 server editions.
The only things I can find about Windows Server 2012 relate to the 
commercial aspects (standard licence = 2 VMs, datacenter licence = 
unlimited number of VMs), but nothing about this Windows Product 
Activation trouble.


How do you deal with this?
Is there a special licence type or something dedicated that would 
prevent such an uncomfortable situation? (Christmas is near, I favor 
soft terms.)


Regards.

--
Nicolas ECARNOT


Re: [ovirt-users] GFS2 and OCFS2 for Shared Storage

2016-11-23 Thread Nicolas Ecarnot

On 23/11/2016 at 13:03, Fernando Frediani wrote:

Has anyone managed to use GFS2 or OCFS2 for shared block storage between
hosts? How scalable was it, and which of the two works better?

Traditional CLVM is far from a good starting point because of the lack of
thin provisioning, so I'm willing to consider either of the filesystems.

Thanks

Fernando


Hello Fernando,

Red Hat has taken a clear direction towards GlusterFS for its 
software-defined storage, and a lot of effort is going into making it 
work smoothly with oVirt/RHEV.
I know GlusterFS is not block storage, but it is worth considering, 
especially if you intend to set up hyper-converged clusters.


--
Nicolas ECARNOT


[ovirt-users] What is "hosted storage domain"?

2016-11-17 Thread Nicolas Ecarnot

On 16/11/2016 at 10:25, Nicolas Ecarnot wrote:

Cc to list, still asking.


Forwarded message
Subject: Re: [ovirt-users] Problem moving master storage domain to
maintenance
Date: Thu, 10 Nov 2016 10:53:04 +0100
From: Nicolas Ecarnot <nico...@ecarnot.net>
Organization: Si peu...
To: Roy Golan <rgo...@redhat.com>

On 10/11/2016 at 10:40, knarra wrote:

On 11/09/2016 06:26 PM, Roy Golan wrote:



On 9 November 2016 at 14:49, knarra <kna...@redhat.com
<mailto:kna...@redhat.com>> wrote:

Can someone please help me to understand the queries below?

On 11/03/2016 06:43 PM, Maor Lipchuk wrote:

Hi kasturi,

Which version of oVirt are you using?

Apologies for the late reply. I am using the latest master.

Roy, I assume it is related to 4.0 version where the import of
hosted storage domain was introduced.


Hi Roy,

I'm jumping into this thread just to ask what you mean by hosted
storage domain, and/or where I could read more about it.
I saw nothing obvious in the 4.0.0 release notes, so just wondering...

Thank you




--
Nicolas ECARNOT


[ovirt-users] Fwd: Re: Problem moving master storage domain to maintenance

2016-11-16 Thread Nicolas Ecarnot

Cc to list, still asking.


Forwarded message
Subject: Re: [ovirt-users] Problem moving master storage domain to 
maintenance
Date: Thu, 10 Nov 2016 10:53:04 +0100
From: Nicolas Ecarnot <nico...@ecarnot.net>
Organization: Si peu...
To: Roy Golan <rgo...@redhat.com>

On 10/11/2016 at 10:40, knarra wrote:

On 11/09/2016 06:26 PM, Roy Golan wrote:



On 9 November 2016 at 14:49, knarra <kna...@redhat.com
<mailto:kna...@redhat.com>> wrote:

Can someone please help me to understand the queries below?

On 11/03/2016 06:43 PM, Maor Lipchuk wrote:

Hi kasturi,

Which version of oVirt are you using?

Apologies for the late reply. I am using the latest master.

Roy, I assume it is related to 4.0 version where the import of
hosted storage domain was introduced.


Hi Roy,

I'm jumping into this thread just to ask what you mean by hosted 
storage domain, and/or where I could read more about it.

I saw nothing obvious in the 4.0.0 release notes, so just wondering...

Thank you

--
Nicolas ECARNOT


Re: [ovirt-users] ovirt homeserver

2016-10-28 Thread Nicolas Ecarnot

On 28/10/2016 at 18:46, david caughey wrote:

Hi,

I'm building a homeserver to run ovirt and wanted to get opinions on the
best approach.
The server will be used as a test/studybed for
ovirt/kvm/vcloud/openstack/ceph.
The server will be based around a Xeon E5 10 core with 128GB ram.
Option 1:
Build server with CentOS 7.2 and deploy ovirt directly on top.
Option 2:
Build server with CentOS 7.2 and deploy multiple ovirt instances on top
of KVM.
Which will be the most stable and versatile method?
If a GPU is used as a passthrough device, can it be used on several VMs
or is it restricted to 1 VM?
If 2 GPUs are used, can 1 be used as a dedicated passthrough to 1 VM and
the other shared between the remaining VMs?
Is CentOS/RH the best platform for ovirt?
Is it okay/advisable to load the latest kernel, (4.8 ish), on to CentOS
before installing ovirt?

Any and all comments/advice welcome,

David





Did no one find it worth mentioning Lago?
It is only for testing, but since you mentioned that use case, consider 
reading about Lago.


--
Nicolas ECARNOT


Re: [ovirt-users] Upgrading oVirt 3.6 with existing HTTPS certificate signed by custom CA to oVirt 4

2016-10-27 Thread Nicolas Ecarnot

On 27/10/2016 at 00:14, Kenneth Bingham wrote:

I did install a server certificate from a private CA on the engine
server for the oVirt 4 Manager GUI, but haven't figured out how to
configure engine to trust the same CA which also issued the server
certificate presented by vdsm. This is important for us because this is
the same server certificate presented by the host when using the console
(e.g. websocket console fails silently if the user agent doesn't trust
the console server's certificate).


Hello,

Possibly a related bug: on an oVirt 4 setup, I followed the procedure 
below to install a custom CA, with *SUCCESS*.


Today, I had to reinstall one of the hosts, and it is failing with:
"CA certificate and CA private key do not match":

http://pastebin.com/9JS05JtJ

Which certificate did we (Kenneth and I) misuse?
What did we do wrong?
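
As a side note, one way to check whether a given certificate and private key 
belong together is sketched below in Python. This is only a sketch under 
assumptions, not a diagnosis of the failure above: the two paths are where 
I would expect the engine CA files to be and may differ on your setup, and 
check_pair is a made-up helper name. It relies on 
ssl.SSLContext.load_cert_chain, which raises ssl.SSLError when the key does 
not match the certificate. The same error is raised for a malformed PEM 
file, so "mismatch" below really means "mismatch or unreadable PEM". A 
passphrase-protected key would prompt on the terminal.

```python
import os
import ssl


def check_pair(cert_path, key_path):
    """Return "missing", "mismatch" or "match" for a PEM certificate/key pair."""
    for path in (cert_path, key_path):
        if not os.path.isfile(path):
            return "missing"
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # Raises ssl.SSLError if the key does not belong to the certificate
        # (or if either file is not valid PEM).
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except ssl.SSLError:
        return "mismatch"
    return "match"


if __name__ == "__main__":
    # Assumed locations of the engine CA certificate and key; adjust as needed.
    print(check_pair("/etc/pki/ovirt-engine/ca.pem",
                     "/etc/pki/ovirt-engine/private/ca.pem"))
```

Reading the private key normally requires root on the engine machine.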

Regards,

Nicolas ECARNOT




On Wed, Oct 26, 2016, 16:58 Beckman, Daniel
<daniel.beck...@ingramcontent.com
<mailto:daniel.beck...@ingramcontent.com>> wrote:

We have oVirt 3.6.7 and I am preparing to upgrade to 4.0.4 release.
I read the release notes (https://www.ovirt.org/release/4.0.4/) and
noted comment #4 under “Install / Upgrade from previous version”:

/If you are using HTTPS certificate signed by custom certificate
authority, please take a look at https://bugzilla.redhat.com/1336838
for steps which need to be done after migration to 4.0. Also please
consult https://bugzilla.redhat.com/1313379 how to setup this custom
CA for use with virt-viewer clients./

So I referred to the first bugzilla
(https://bugzilla.redhat.com/show_bug.cgi?id=1336838), where it
states as follows:

If a customer wants to use a custom HTTPS certificate signed by a
different CA, then the following steps have to be performed:

1. Install the custom CA (the one that signed the HTTPS certificate) into
the host-wide truststore (more info can be found in the update-ca-trust man page)

2. Configure HTTPS certificate in Apache (this step is same as in
previous versions) 

3. Create new configuration file (for example
/etc/ovirt-engine/engine.conf.d/99-custom-truststore.conf) with
following content: 

ENGINE_HTTPS_PKI_TRUST_STORE="/etc/pki/java/cacerts"
ENGINE_HTTPS_PKI_TRUST_STORE_PASSWORD="" 

4. Restart ovirt-engine service
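
The steps above could be scripted roughly as follows. This is a dry-run
sketch in Python that only prints what it would do and changes nothing. The
anchors directory and the "update-ca-trust extract" call are my reading of
the update-ca-trust man page on CentOS/RHEL, the CA file name is a
placeholder, and the function names are hypothetical. Step 2 (Apache) is
left out since it is the same as in previous versions.

```python
CONF_PATH = "/etc/ovirt-engine/engine.conf.d/99-custom-truststore.conf"
ANCHORS_DIR = "/etc/pki/ca-trust/source/anchors/"  # per update-ca-trust(8)


def render_truststore_conf(trust_store="/etc/pki/java/cacerts", password=""):
    """Build the content of the configuration file described in step 3."""
    return (
        f'ENGINE_HTTPS_PKI_TRUST_STORE="{trust_store}"\n'
        f'ENGINE_HTTPS_PKI_TRUST_STORE_PASSWORD="{password}"\n'
    )


def build_plan(ca_pem):
    """Return the shell commands for steps 1 and 4, in order (not executed)."""
    return [
        ["cp", ca_pem, ANCHORS_DIR],
        ["update-ca-trust", "extract"],
        ["systemctl", "restart", "ovirt-engine"],
    ]


if __name__ == "__main__":
    plan = build_plan("my-custom-ca.pem")  # placeholder file name
    for cmd in plan[:2]:
        print("step 1:", " ".join(cmd))
    print("step 3: write", CONF_PATH)
    print(render_truststore_conf(), end="")
    print("step 4:", " ".join(plan[2]))
```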

I find it humorous that step # 1 suggests reading the “man page”
which is only slightly better than suggesting to “google” it. 

Has anyone using a custom CA for their HTTPS certificate
successfully upgraded to oVirt 4? If so could you share your
detailed steps? Or can anyone point me to an actual example of this
procedure? I’m a little nervous about the upgrade if you can’t
already tell. 

Thanks,

Daniel








--
Nicolas ECARNOT


Re: [ovirt-users] migrating via pivotting the storage domain between instances of Manager

2016-10-27 Thread Nicolas Ecarnot

Hello Kenneth,

On 26/10/2016 at 20:38, Kenneth Bingham wrote:

Is it possible to "pivot" up or down guests from one instance of Manager
to another instance of Manager by detaching the data storage domain from
the source Manager's data center and attaching it to the destination
Manager's data center?


Here, we are in the middle of doing exactly that: we have done it 6 times 
so far, with 2 still to go.


We are detaching and attaching these SDs between two oVirt 3.6.5 setups 
(CentOS 7).


After shutting down all VMs, cleanly detaching the SD and attaching it to 
the target, one still has to:
- activate the storage domain (depending on your setup and the version, 
this step may be automatic)
- manually import each VM: this step is long and has to be done one by 
one (Red Hat people: please comment).


I found this last step quite fast, but there are some pitfalls to 
avoid, such as:

- pay attention to the MAC addresses and avoid duplicates
- pay attention to the access rights of the hosts towards your storage 
solution (iSCSI, NFS, and so on)
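
To make the MAC address point concrete, here is a small Python sketch that 
flags addresses used on both sides before importing. It assumes you have 
already collected, by whatever means, a mapping of VM name to MAC addresses 
for the target setup and for the VMs about to be imported; the function 
names and the sample data are made up.

```python
from collections import defaultdict


def normalize_mac(mac):
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def find_duplicate_macs(target, imported):
    """Map each MAC used by more than one VM to the sorted list of its users.

    Both arguments are dicts of VM name -> list of MAC addresses.
    """
    owners = defaultdict(set)
    for side, inventory in (("target", target), ("imported", imported)):
        for vm, macs in inventory.items():
            for mac in macs:
                owners[normalize_mac(mac)].add(f"{side}:{vm}")
    return {mac: sorted(vms) for mac, vms in owners.items() if len(vms) > 1}


if __name__ == "__main__":
    # Made-up sample data.
    target = {"web01": ["00:1a:4a:16:01:51"]}
    imported = {"db01": ["00:1A:4A:16:01:51"], "db02": ["00:1a:4a:16:01:52"]}
    for mac, vms in find_duplicate_macs(target, imported).items():
        print(mac, "is used by", ", ".join(vms))
```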


*I faced no issue regarding any relation between guests and hosts.*

I guess the Red Hat people will ask you to share the logs from the 
relevant machines:

- engine.log of both managers
- vdsm.log of both SPMs


I tried this with a block type storage domain,
but there are no storage domains found when I do "import domain" in the
destination Manager.

If I do "import domain" on the source Manager and choose the same oVirt
host to perform the import then the detached storage domain is able to
be imported. If I re-attach the storage domain to the source Manager's
data center the unregistered guests are available for import. Why does
this require that the oVirt host performing the import be the same host
that had formerly mounted the domain? Or, is it that the host is still
recognized as enrolled in the DC that had created the unregistered
guests on the detached storage domain?

What if, instead of detaching the storage domain from the source
Manager's data center, I shut down all guests and put the oVirt host in
maintenance mode and enroll that same oVirt host in the destination
Manager's data center, then would it be possible to import the detached
storage domain and import the unregistered guests and templates stored
there?






--
Nicolas ECARNOT


Re: [ovirt-users] Average VM per host

2016-10-19 Thread Nicolas Ecarnot

On 19/10/2016 at 19:49, Yaniv Kaul wrote:

On Oct 19, 2016 5:56 PM, "Nicolas Ecarnot" <nico...@ecarnot.net
<mailto:nico...@ecarnot.net>> wrote:


Hello,


Hello Yaniv,



Though I have read some surveys about this, I'd rather put the question
directly to the oVirt community, and especially to people running it as a
production cluster: on average, how many VMs are running on each of
your hosts?

The host with 1TB RAM or 64GB?

There is no meaningful average for two reasons :
- Host specs (example above) - some believe in scale up (fewer but
stronger hosts), some believe in scale out (more hosts, not as high end).


*This* is mainly the point on which I am hoping to get some stats.
Literature about competitors' products commonly describes the scenario 
with few, strong hosts, but I'd like to know how it goes in oVirt's world.


--
Nicolas ECARNOT

