[ovirt-users] Re: Proper way to upgrade hosts OS?
On 26/06/2019 at 12:34, Nicolas Ecarnot wrote:
> Hello,
> We're not using nodes but CentOS 7.x hosts. Is there any documentation on the proper way to upgrade the operating system of the hosts, and especially on how to avoid breaking dependencies or causing version mismatches?
> Thank you.

Hello,
As no answer came, could anyone just tell me whether there is any chance of breaking something?
Thank you.
-- Nicolas ECARNOT
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/GZYVUCMBIZIZOMKSZUIJRZ6IMWBBI2X6/
[ovirt-users] Proper way to upgrade hosts OS?
Hello,
We're not using nodes but CentOS 7.x hosts. Is there any documentation on the proper way to upgrade the operating system of the hosts, and especially on how to avoid breaking dependencies or causing version mismatches?
Thank you.
-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/4SYJWWODEY2VZOAMU5NIRDOJCPANNR6S/
[ovirt-users] Re: Old mailing list SPAM
On 15/05/2019 at 07:46, Markus Stockhausen wrote:
> Hi,
> does anyone currently get old mails of 2016 from the mailing list?

I do. (Though it is annoying, it got me an answer to a question I had never thought to ask. Thanks Nir, by the way.)

> We are spammed with something like this from teknikservice.nu:
> ...
> Received: from mail.ovirt.org (localhost [IPv6:::1]) by mail.ovirt.org (Postfix) with ESMTP id A33EA46AD3; Tue, 14 May 2019 14:48:48 -0400 (EDT)
> Received: by mail.ovirt.org (Postfix, from userid 995) id D283A407D0; Tue, 14 May 2019 14:42:29 -0400 (EDT)
> Received: from bauhaus.teknikservice.nu (smtp.teknikservice.nu [81.216.61.60]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.ovirt.org (Postfix) with ESMTPS id BF954467FE for ; Tue, 14 May 2019 14:36:54 -0400 (EDT)
> Received: by bauhaus.teknikservice.nu (Postfix, from userid 0) id 259822F504; Tue, 14 May 2019 20:32:33 +0200 (CEST) <- 3 YEAR TIME WARP ?
> Received: from washer.actnet.nu (washer.actnet.nu [212.214.67.187]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bauhaus.teknikservice.nu (Postfix) with ESMTPS id 430FEDA541 for ; Thu, 6 Oct 2016 18:02:51 +0200 (CEST)
> Received: from lists.ovirt.org (lists.ovirt.org [173.255.252.138]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by washer.actnet.nu (Postfix) with ESMTPS id D75A82293FC for ; Thu, 6 Oct 2016 18:04:11 +0200 (CEST)
> ...
> Markus

-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/IEOIF3KVPKLBO2UNZ65FSRX7EFPXHF3V/
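The "time warp" in the quoted headers can be spotted mechanically by comparing the timestamps of consecutive Received hops. The following is only a minimal sketch: the header strings are shortened copies of the ones quoted in this thread, and the function names and the one-day limit are illustrative assumptions.

```python
import re
from datetime import timedelta
from email.utils import parsedate_to_datetime


def hop_time(received):
    """Return the timestamp a relay stamped on one Received header."""
    # The date is whatever follows the last ';' in the header.
    stamp = received.rsplit(";", 1)[1]
    # Drop trailing comments such as "(EDT)" before parsing.
    stamp = re.sub(r"\(.*?\)", "", stamp).strip()
    return parsedate_to_datetime(stamp)


def suspicious_gaps(received_headers, limit=timedelta(days=1)):
    """List (earlier, later, delay) for hops that took longer than `limit`.

    Headers are expected newest first, as they appear in a message.
    """
    times = [hop_time(h) for h in reversed(received_headers)]
    return [
        (earlier, later, later - earlier)
        for earlier, later in zip(times, times[1:])
        if later - earlier > limit
    ]


# Shortened copies of the quoted headers, newest first.
HEADERS = [
    "Received: from bauhaus.teknikservice.nu by mail.ovirt.org; Tue, 14 May 2019 14:36:54 -0400 (EDT)",
    "Received: by bauhaus.teknikservice.nu; Tue, 14 May 2019 20:32:33 +0200 (CEST)",
    "Received: from washer.actnet.nu by bauhaus.teknikservice.nu; Thu, 6 Oct 2016 18:02:51 +0200 (CEST)",
    "Received: from lists.ovirt.org by washer.actnet.nu; Thu, 6 Oct 2016 18:04:11 +0200 (CEST)",
]

for earlier, later, delay in suspicious_gaps(HEADERS):
    print(f"{earlier.isoformat()} -> {later.isoformat()}: held for {delay.days} days")
```

Small negative delays between hops (clock skew, as between washer and bauhaus here) are ignored; only the hop where the message sat for years is reported.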
[ovirt-users] Re: VM has paused due to no storage space error
Hi Nir, hi Sandvik,
As I have seen this issue many times, and as I'm using thin provisioning on block storage, I feel concerned. Read my question below.

On 02/10/2016 at 12:55, Nir Soffer wrote:
> On Sun, Oct 2, 2016 at 12:06 PM, Sandvik Agustin wrote:
>> Hi users,
>> I have this problem that sometimes 1 to 3 VMs just automatically pause without user interaction, with this error: "VM has paused due to no storage space error". Any input from you guys is very appreciated.
>
> This is expected - when there is no storage space :-)
> The vm is paused when there are some pending io requests that could not be fulfilled since you don't have enough space. In a real machine the io requests would fail. In a vm, the vm can pause, you can fix the issue (extend the storage domain), and resume the vm.
> But I guess there is storage space available, otherwise you would not spend the time sending this mail.
> This can happen when using thin provisioned disks on block storage (iSCSI, FC). We provision such a disk with 1G, and extend the disk (add 1G) when it becomes too full (by default, free space < 0.5G). If we fail to extend the disk quick enough,

"quick enough" -> Is there some place where this threshold can be configured?

> the vm will pause before the extend was completed. Once the extend was completed, we resume the vm. So you may see very short pauses, but they should be rare.
> To understand the issue, we need to inspect vdsm logs from the host running the vm that paused, showing the timeframe when the vm was paused.
> You should see this message in the log each time a vm pauses:
> abnormal vm stop device error ENOSPC
> Nir

-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/KF4SVQOE7U7ELLOIE4CNPSH2TAN7MW3K/
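To get a feeling for what "quick enough" means in Nir's explanation, here is a rough back-of-the-envelope model. It is only a sketch under simplifying assumptions: the extension is requested exactly when free space drops below the 0.5G threshold, the guest writes at a constant rate, and all function and parameter names are made up for illustration (they are not vdsm settings).

```python
def seconds_until_full(free_gib, write_rate_gib_s):
    """Time the guest needs to consume the remaining free space."""
    return free_gib / write_rate_gib_s


def pauses_before_extend(write_rate_gib_s, extend_latency_s, free_threshold_gib=0.5):
    """True if the guest fills the disk before the extension completes.

    free_threshold_gib: free space at which the extension is requested
    (0.5G by default according to the explanation above).
    extend_latency_s: hypothetical time between the request and the
    moment the extra 1G is usable.
    """
    return extend_latency_s > seconds_until_full(free_threshold_gib, write_rate_gib_s)


# A guest writing 0.125 GiB/s has 4 seconds of headroom: a 2 s extend is fine.
print(pauses_before_extend(write_rate_gib_s=0.125, extend_latency_s=2.0))
# A guest writing 0.25 GiB/s has only 2 seconds: a 6 s extend means a pause.
print(pauses_before_extend(write_rate_gib_s=0.25, extend_latency_s=6.0))
```

In this model, the faster the guest writes and the slower the storage answers the extension request, the more likely the VM is to hit ENOSPC and pause. A real host checks disk usage periodically rather than continuously, so the actual headroom is presumably smaller than what this computes.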
[ovirt-users] Re: DISCARD support?
Hello,
Sending this here to share knowledge. Here is what I learned from reading many BZ entries and mailing list posts. I do not work at Red Hat, so please correct me if I'm wrong.
We are using thin-provisioned block storage LUNs (Equallogic), on which oVirt creates numerous Logical Volumes, and we're very happy with it.
When oVirt removes a virtual disk, the SAN is not informed, because the LVM layer is not configured to issue discards (the "issue_discards" setting). /etc/lvm/lvm.conf is not the natural place to try to change this parameter, as VDSM is not using it.
Efforts are presently being made to include support for this setting directly in vdsm.conf, first on a datacenter scope (4.0.x), then per storage domain (4.1.x), and maybe via a web GUI check-box.
Part of the effort is to make sure every bit of an LV that is about to be removed gets wiped out. Part is to inform the block storage side about the deletion, in the case of thin-provisioned LUNs.
https://bugzilla.redhat.com/show_bug.cgi?id=1342919
https://bugzilla.redhat.com/show_bug.cgi?id=981626
-- Nicolas ECARNOT

> On Mon, Oct 3, 2016 at 2:24 PM, Nicolas Ecarnot wrote:
>> Yaniv,
>> While randomly surfing the web, I found that you posted some information about DISCARD support on Twitter (https://twitter.com/YanivKaul/status/773513216664174592).
>> I did not dig any further, but is it related to the fact that so far, oVirt has not reclaimed lost storage space among the logical volumes of its storage domains?
>> A BZ exists about this, but we were told no work would be done on it until 4.x.y. Now that we're there, I was wondering if you knew more?
>
> Feel free to send such questions on the mailing list (ovirt users or devel), so others will be able to both chime in and see the response.
> We've supported a custom hook for enabling discard per disk (which is only relevant for virtio-SCSI and IDE) for some versions now (3.5 I believe).
> We are planning to add this via a UI and API in 4.1.
> In addition, we are looking into discard (instead of wipe after delete, when discard is also zero'ing content) as well as discard when removing LVs.
> See:
> http://www.ovirt.org/develop/release-management/features/storage/pass-discard-from-guest-to-underlying-storage/
> http://www.ovirt.org/develop/release-management/features/storage/wipe-volumes-using-blkdiscard/
> http://www.ovirt.org/develop/release-management/features/storage/discard-after-delete/
> Y.
>
>> Best,
>> -- Nicolas ECARNOT

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XNWYONXSWEN5AJVUJURRL7G3QJW62SNJ/
[ovirt-users] Logical Volume extend failed
Hello,
[Context : I'm moving all my VMs from an old 3.6 DC to a brand new 4.3 DC. For local reasons, I'm doing it using an export domain, and one by one.]
Today, for no obvious reason, error messages began to appear :
"VDSM SPM-servername command failed: Logical Volume extend failed"
Lots of similar errors appear in the engine log, with no obvious additional hint. In the VDSM log, I'm not skilled enough to see what's wrong either.
The 3.6 engine and vdsm log files are here :
https://framadrop.org/r/6cFSb0GRc1#VQ6XqYWg9HzniHMjgKmXVpXy0I+RIS/MiMGBpU+1bak=
https://framadrop.org/r/JFswiD3fkA#fdU+m3JCVMVg/eLjtJVTqOiAKIj4eyhsRWisxcrea7I=
It may come from one of our storage domains that was close to full, but I have freed 200 GB of space since then, and the issue keeps appearing.
Now, my attempts to export a VM are failing. I can still stop and start a VM. (I'm not completely relaxed about this situation.)
I read about a similar experience here (https://www.canarytek.com/2017/07/21/Harmfull_bug_in_oVirt_block_storage.html) but I'm not sure it is related.
I can psql-query and check things if needed, but I mostly need advice.
Thank you.
-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/OFE5IWWFKQLWWJR3KHCIDMTS2JHLHEC4/
[ovirt-users] Re: Fencing : SSL or not?
On 22/02/2019 at 15:45, Martin Perina wrote:
> If I understand that correctly, this is a request to open session to IPMI. If you haven't received any response, then I'd check:
> 1. Do you have IPMI enabled?

Hello Martin,
you hit the point. IPMI was not enabled (anymore).
IPMI has been activated by default for years on all our hosts. But recent firmware upgrades on some of our Dell hosts, especially iDRAC firmware upgrades, led to IPMI being disabled.
I'm sorry for having bothered you and the audience, and for this waste of time.
Thank you Dell :-\
-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/KO7REWCFUWRGU453N5XYSFZSS75RFFU6/
[ovirt-users] Re: Host choice when migrating VMs
On 22/02/2019 at 15:48, Dominik Holler wrote:
>> Hosts _needs_ the same networks to be available in the same cluster. Different networked hosts needs to be put in a separate cluster.
>
> This is the most straight approach, which is supported by oVirt.
> But there is the possibility to attach logical networks, which are neither required in the cluster, nor attached to all hosts in the cluster, to a VM. oVirt's scheduling will respect this.
>
>> So you're saying oVirt knows which other hosts in the cluster have the non-mandatory network(s) the VM has and only chooses between those a host to migrate the VM to?
>
> Yes. If you try to trigger the migration manually, UI will provide you the list of possible hosts to migrate the VM.
> https://github.com/oVirt/ovirt-engine/blob/7d111f3aa089f77f92049f4d3ec792e5ff7e5324/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/scheduling/policyunits/NetworkPolicyUnit.java#L132

*THIS* is precisely the answer I was expecting.
Thank you Dominik.
-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/LT6I4GS42VIPQYBF4EGT7HBS2LVLUN2Z/
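The behaviour Dominik describes boils down to a subset check: a host is a migration candidate only if it carries every network attached to the VM. The sketch below illustrates that idea only; it is not the code of oVirt's NetworkPolicyUnit, and the host and network names are invented.

```python
def eligible_hosts(vm_networks, host_networks):
    """Return the hosts that have every network the VM is attached to.

    vm_networks: iterable of network names used by the VM's vNICs.
    host_networks: mapping of host name -> iterable of networks on that host.
    """
    needed = set(vm_networks)
    return sorted(
        host for host, networks in host_networks.items()
        if needed <= set(networks)
    )


# Hypothetical cluster: "dmz" is a non-required network present on two hosts only.
cluster = {
    "hv01": ["ovirtmgmt", "lan", "dmz"],
    "hv02": ["ovirtmgmt", "lan"],
    "hv03": ["ovirtmgmt", "lan", "dmz"],
}

print(eligible_hosts(["lan", "dmz"], cluster))  # hosts able to run a VM on lan + dmz
print(eligible_hosts(["lan"], cluster))         # every host qualifies
```

A VM that uses only networks present everywhere can land on any host, while a VM using the partially deployed network is restricted to the hosts that carry it.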
[ovirt-users] Re: Host choice when migrating VMs
On 22/02/2019 at 15:02, Karli Sjöberg wrote:
> On 22 Feb 2019 at 09:24, Nicolas Ecarnot wrote:
>> Hello,
>> I'm almost sure the following is useless as I think I know how it's working, but as I'm preparing a major change in our infrastructure, I'd rather be sure and not mess up.
>> And also to be sure.
>> (Just to be sure)
>> For some reasons, and for the first time in our infra., one of our new DCs will temporarily include heterogeneous hosts : some networks will be available only on some of them.

Hi Karli,

> Hosts _needs_ the same networks to be available in the same cluster.

Correct me if I'm wrong, but I think that your statement is true *if* the networks are set as mandatory, which is neither always wanted nor always the case. In our case, we have to disable this mandatory attribute.
I agree that when the networks are mandatory, every host unable to use them will end up unavailable.

> Different networked hosts needs to be put in a separate cluster.
> /K

-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/CGPHGFXYI3OZX2XKTLCFZ6W3GN4Q6U4Q/
[ovirt-users] Re: Fencing : SSL or not?
On 22/02/2019 at 12:13, Martin Perina wrote:
> Unfortunately using fence_ipmilan is not possible to display more debugging details, so as mentioned earlier could you please run ipmitool directly?
> ipmitool -vv -I lanplus -H c-hv05.prd.sdis38.fr -p 623 -U stonith -P -L ADMINISTRATOR chassis power status
> Above should display more details ...

root@hv04:/etc# ipmitool -vv -I lanplus -H c-hv05.prd.sdis38.fr -p 623 -U stonith -P 'xxx' -L ADMINISTRATOR chassis power status
Sending IPMI command payload
   netfn : 0x06
   command : 0x38
   data: 0x8e 0x04
Sending IPMI command payload
   netfn : 0x06
   command : 0x38
   data: 0x8e 0x04
Sending IPMI command payload
   netfn : 0x06
   command : 0x38
   data: 0x8e 0x04
Sending IPMI command payload
   netfn : 0x06
   command : 0x38
   data: 0x8e 0x04
Sending IPMI command payload
   netfn : 0x06
   command : 0x38
   data: 0x0e 0x04
Sending IPMI command payload
   netfn : 0x06
   command : 0x38
   data: 0x0e 0x04
Sending IPMI command payload
   netfn : 0x06
   command : 0x38
   data: 0x0e 0x04
Sending IPMI command payload
   netfn : 0x06
   command : 0x38
   data: 0x0e 0x04
Get Auth Capabilities error
Error issuing Get Channel Authentication Capabilities request
Error: Unable to establish IPMI v2 / RMCP+ session
root@hv04:/etc#

-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/DQKUC2G745CKN6BT2SC3T6LSCEEML7NN/
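The trace shows ipmitool re-sending the same request and never getting a reply. One way to check whether a BMC answers IPMI-over-LAN at all, independently of user, password and privilege level, is an RMCP/ASF presence ping on UDP port 623 (a TCP port scan does not cover it). The sketch below rests on assumptions: the packet layout follows the ASF presence ping as commonly documented, the function names are invented, and some BMCs may not answer pings even with IPMI enabled.

```python
import socket

# RMCP/ASF presence ping, 12 bytes.
RMCP_PING = bytes([
    0x06, 0x00, 0xFF, 0x06,  # RMCP: version 6, reserved, sequence 0xFF (no ack), class ASF
    0x00, 0x00, 0x11, 0xBE,  # ASF IANA enterprise number (4542)
    0x80,                    # message type: presence ping
    0x00,                    # message tag
    0x00,                    # reserved
    0x00,                    # data length
])


def is_presence_pong(reply):
    """True if `reply` looks like an ASF presence pong (message type 0x40)."""
    return (
        len(reply) >= 12
        and reply[0] == 0x06   # RMCP version
        and reply[3] == 0x06   # ASF message class
        and reply[8] == 0x40   # presence pong
    )


def bmc_answers(host, port=623, timeout=2.0):
    """Send one presence ping and report whether a pong came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(RMCP_PING, (host, port))
            reply, _ = sock.recvfrom(1024)
        except OSError:  # timeout, unreachable host, name resolution failure
            return False
    return is_presence_pong(reply)
```

Calling `bmc_answers("c-hv05.prd.sdis38.fr")` from the fence proxy host would then give a quick yes/no before digging into credentials or cipher settings.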
[ovirt-users] Re: Fencing : SSL or not?
Hi Martin,

On 21/02/2019 at 13:04, Martin Perina wrote:
> Hi Nicolas,
> see my reply inline

See mine below.

> On Mon, Feb 18, 2019 at 9:51 AM Nicolas Ecarnot wrote:
>> Hello,
>> As fence_idrac has never worked for us, and as fence_ipmilan has worked nicely for years, we are using fence_ipmilan with the lanplus=1 option and we're happy with it.
>> We upgraded to 4.3.0.4 and we're witnessing that we cannot fence our hosts anymore :
>> 2019-02-18 09:42:08,678+01 ERROR [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-11) [2f78ed99-6703-4d92-b7cb-948c2d24b623] Can not run fence action on host 'x', no suitable proxy host was found.
>
> This is not related to the fence_ipmi issue below. The engine, in order to be able to execute a fencing operation, needs at least one other host in Up status, which is used as a proxy host to perform the fencing operation. So do you have at least one host in Up status in the same cluster/datacenter as the host you want to run the fencing operation on?

Yes.

> If so, then please enable debug information to find out why we cannot find any host acting as fence proxy:
> 1. Please download the log-control.sh script from https://github.com/oVirt/ovirt-engine/tree/master/contrib#log-control-sh and save it on the engine machine
> 2. Please execute the following on the engine machine:
>    log-control.sh org.ovirt.engine.core.bll.pm DEBUG
> 3. Go to the problematic host, click Edit, go to the Power Management tab, click on the existing fence agent and click on the Test button
> 4. Take a look at engine.log, where the reason why we were not able to find a fence proxy should be logged

I followed the instructions above, but I feel this is not the best debug path. I learned nothing new.
The fence proxy is not missing.
It is known and found, and it is trying to do its job, as written below :

>> and on the SPM :
>> fence_ipmilan: Failed: Unable to obtain correct plug status or plug is not available
>
> Could you please provide debug output of below command?
> ipmitool -vv -I lanplus -H -p 623 -U -P -L ADMINISTRATOR chassis power status

See below a debug session. I'm comparing two hosts, and only one of them is answering fence status queries. I must add that before the upgrade to 4.3, both hosts were responding correctly.

fence_ipmilan --username=stonith --password='xxx' --lanplus --ip=c-serv-hv-prds01.sdis.isere.fr --action=status -v
2019-02-22 11:34:01,537 INFO: Executing: /usr/bin/ipmitool -I lanplus -H c-serv-hv-prds01.sdis.isere.fr -p 623 -U stonith -P [set] -L ADMINISTRATOR chassis power status
2019-02-22 11:34:01,654 DEBUG: 0 Chassis Power is on
Status: ON

root@hv04:/etc# fence_ipmilan --username=stonith --password='xxx' --lanplus --ip=c-hv05.prd.sdis38.fr --action=status -v
2019-02-22 11:34:15,335 INFO: Executing: /usr/bin/ipmitool -I lanplus -H c-hv05.prd.sdis38.fr -p 623 -U stonith -P [set] -L ADMINISTRATOR chassis power status
2019-02-22 11:34:35,338 ERROR: Connection timed out

root@hv04:/etc# nmap c-serv-hv-prds01.sdis.isere.fr
Starting Nmap 6.40 ( http://nmap.org ) at 2019-02-22 11:34 CET
Nmap scan report for c-serv-hv-prds01.sdis.isere.fr (192.168.53.2)
Host is up (0.010s latency).
rDNS record for 192.168.53.2: c-5g3yxx1.sdis.isere.fr
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
443/tcp  open  https
5900/tcp open  vnc
Nmap done: 1 IP address (1 host up) scanned in 0.45 seconds

root@hv04:/etc# nmap c-hv05.prd.sdis38.fr
Starting Nmap 6.40 ( http://nmap.org ) at 2019-02-22 11:34 CET
Nmap scan report for c-hv05.prd.sdis38.fr (192.168.50.194)
Host is up (0.00060s latency).
rDNS record for 192.168.50.194: C-550W2S2.sdis.isere.fr
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
443/tcp  open  https
5900/tcp open  vnc
MAC Address: CC:C5:E5:57:26:E0 (Unknown)
Nmap done: 1 IP address (1 host up) scanned in 0.20 seconds

root@hv04:/etc# ping -c 1 c-serv-hv-prds01.sdis.isere.fr
PING c-5g3yxx1.sdis.isere.fr (192.168.53.2) 56(84) bytes of data.
64 bytes from c-5g3yxx1.sdis.isere.fr (192.168.53.2): icmp_seq=1 ttl=61 time=2.37 ms
--- c-5g3yxx1.sdis.isere.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.371/2.371/2.371/0.000 ms

root@hv04:/etc# ping -c 1 c-hv05.prd.sdis38.fr
PING c-550w2s2.prd.sdis38.fr (192.168.50.194) 56(84) bytes of data.
64 bytes from C-550W2S2.sdis.isere.fr (192.168.50.194): icmp_seq=1 ttl=64 time=0.189 ms
--- c-550w2s2.prd.sdis38.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.189/0.189/0.189/0.000 ms

> Above is the command which fence_ipmi is internally executing, and -vv adds debugging output which can reveal issue with the plug status
> Regards,
> Martin
>
>> I found the sugg
[ovirt-users] Host choice when migrating VMs
Hello,
I'm almost sure the following is useless as I think I know how it's working, but as I'm preparing a major change in our infrastructure, I'd rather be sure and not mess up.
And also to be sure.
(Just to be sure)
For some reasons, and for the first time in our infra., one of our new DCs will temporarily include heterogeneous hosts : some networks will be available only on some of them.
Could someone please confirm that for every load balancing / VM startup / VM migration / host choice, oVirt will smartly choose an available host equipped with the adequate networks?
-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/QGX3PHA4T3SXXDTYZ4VGY6UHECO7P6V5/
[ovirt-users] Fencing : SSL or not?
Hello,
As fence_idrac has never worked for us, and as fence_ipmilan has worked nicely for years, we are using fence_ipmilan with the lanplus=1 option and we're happy with it.
We upgraded to 4.3.0.4 and we're witnessing that we cannot fence our hosts anymore :
2019-02-18 09:42:08,678+01 ERROR [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-11) [2f78ed99-6703-4d92-b7cb-948c2d24b623] Can not run fence action on host 'x', no suitable proxy host was found.
and on the SPM :
fence_ipmilan: Failed: Unable to obtain correct plug status or plug is not available
I found the suggested workaround here : https://access.redhat.com/solutions/3349841
but no combination of
- lanplus={0,1}
- -z
- ssl={0,1}
led to a solution.
The package version is the same as what's described in the KB : fence-agents-rhevm-4.2.1-11.el7_6.7.x86_64
What should I test now?
Thank you.
-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/SEUAZ6JB6CIYY2GOBNJN2XSWOSH6DHDJ/
[ovirt-users] Re: Forum available
On 08/02/2019 at 09:05, Josep Manel Andrés Moscardó wrote:
> Hi all,
> I am just wondering if anyone like me would like to have everything that is posted here in a forum, with all the benefits it brings

Absolutely. Digging through mail archives is sometimes painful.

> (and people will still be able to subscribe and reply through email). Something like Discourse would be nice in my opinion.
> Best.

-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/H427EVNMN3NZHB7NGW4Z62IOPRIGFNGP/
[ovirt-users] Re: Bug in the web interface?
On 06/02/2019 at 15:42, Greg Sheremeta wrote:
> On Wed, Feb 6, 2019 at 6:33 AM Nicolas Ecarnot wrote:
>> On 06/02/2019 at 10:53, Lucie Leistnerova wrote:
>>> On 2/6/19 10:22 AM, Nicolas Ecarnot wrote:
>>>> Hi Lucie,
>>>> On 06/02/2019 at 10:02, Lucie Leistnerova wrote:
>>>>> I'm sorry, my mistake I did not mention to remove the package without dependencies.
>
> Same -- sorry, ugh.
> For anyone in the same situation, the better thing to do now is simply 'yum update ovirt-engine-ui-extensions'
> That will remove the old dashboard correctly.
> https://github.com/oVirt/ovirt-engine-ui-extensions/blob/master/packaging/spec.in#L16

Thank you. We need this kind of wheel greasing as oVirt's complexity increases.

>> To sum up, I think what I'm missing is clear and solid documentation, or an official Red Hat message, about whether/what/how/when we can or cannot update (with "yum update") the engine host and/or the hosts.
>
> Not Red Hat -- oVirt :)

Yep, Greg Sheremeta ;-)

> Indeed, we need an Upgrade Guide update. I'll look into it.
> Generally, on my dev instances (which are probably nowhere near as complicated as your setups), I run 'yum update' followed by 'engine-setup'.

Actually, my experience is that yum-upgrading the engine was harmless most of the time, but yum-upgrading the hosts led to complex situations.
I'm at a point where I no longer update my hosts with yum update, and only rely on oVirt's update (either via the web GUI or ansible's cluster upgrade), which only updates part of the packages.
I'd rather have an RPM environment around oVirt strong enough to prevent any issue (the version lock usage shows that it's already a concern oVirt's people are dealing with, and I thank you. Keep strengthening.)
-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/TQAYEZSGMLQCWFJTMAUERABCUNYWG3N6/
[ovirt-users] Re: Bug in the web interface?
On 06/02/2019 at 10:53, Lucie Leistnerova wrote:
> On 2/6/19 10:22 AM, Nicolas Ecarnot wrote:
>> Hi Lucie,
>> On 06/02/2019 at 10:02, Lucie Leistnerova wrote:
>>> I'm sorry, my mistake I did not mention to remove the package without dependencies.
>>> rpm -e --nodeps ...
>> I'll write that down.
>>>> When looking at the log file above (https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=)
>>>> [...] "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py",
>>> The error is caused by missing ovirt-engine-dbscripts.
>> OK
>>>> Well, I thought I had messed up the packages, and I thought a complete yum update would help, as I read :
>>>> On 05/02/2019 at 15:19, Greg Sheremeta wrote :
>>>>> The fix is pushed. Standalone engine upgrades should be fine starting now. `yum update` any appliance engines or already upgraded engines to get the latest ovirt-engine-ui-extensions, which fixes the problem.
>>>> So I ran a yum update.
>>> This package is part of ovirt-engine versionlock so can't be installed/updated separately. engine-setup should install the missing packages. I tried it by myself and it fixed the issue.
>>> [install] ovirt-engine-dbscripts-4.3.0.5-0.0.master.20190205084851.gitaaebfc9.el7.noarch will be installed
>> I see I have this package, though in an older version :
>> # rpm -qa|grep -i dbscripts
>> ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch
> The version shouldn't be problem. I tested it in u/s ovirt. Now I tried with same version. Try to remove that package and install again. Versionlock seems to differ here so I was able to install it separately, if not run engine-setup.
> # rpm -e --nodeps ovirt-engine-dbscripts

Indeed, it found a lot of missing files/dir.
> # yum install ovirt-engine-dbscripts

I forgot to set LANG=C, so the original output was partly in French, but I get the idea :

root@mvm01:/tmp# yum install ovirt-engine-dbscripts
Loaded plugins: fastestmirror, versionlock
Loading mirror speeds from cached hostfile
 * base: centos.mirror.fr.planethoster.net
 * epel: pkg.adfinis-sygroup.ch
 * extras: ftp.pasteur.fr
 * ovirt-4.3: ovirt.repo.nfrance.com
 * ovirt-4.3-epel: pkg.adfinis-sygroup.ch
 * updates: centos.mirror.fr.planethoster.net
Excluding 1 update due to versionlock (use "yum versionlock status" to show it)
Resolving Dependencies
--> Running transaction check
---> Package ovirt-engine-dbscripts.noarch 0:4.3.0.4-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=====================================================================
 Package                  Arch     Version         Repository   Size
=====================================================================
Installing:
 ovirt-engine-dbscripts   noarch   4.3.0.4-1.el7   ovirt-4.3    331 k

Transaction Summary
=====================================================================
Install  1 Package

Total download size: 331 k
Installed size: 1.6 M
Is this ok [y/d/N]: y
Downloading packages:
ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch.rpm | 331 kB 00:00:02
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
** Found 1 pre-existing rpmdb problem(s), 'yum check' output follows:
ovirt-engine-4.3.0.4-1.el7.noarch has missing requires of ovirt-engine-dbscripts = ('0', '4.3.0.4', '1.el7')
  Installing : ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch 1/1
  Verifying  : ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch 1/1

Installed:
  ovirt-engine-dbscripts.noarch 0:4.3.0.4-1.el7

Complete!

After that, I ran engine-setup again and it went OK.
Now, my oVirt DC and dashboard are back to life, thanks to you Lucie.
To sum up, I think what I'm missing is clear and solid documentation, or an official Red Hat message, about whether/what/how/when we can or cannot update (with "yum update") the engine host and/or the hosts.
-- Nicolas ECARNOT
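The root cause in this thread was a package removed with `rpm -e --nodeps`, which leaves a broken dependency that yum reports as "<package> has missing requires of <dependency>" when run with LANG=C. A small sketch for pulling such lines out of captured `yum check` output follows; the function name is invented and the sample text is the line from the session above in its English form.

```python
import re

# One line per problem: "<package> has missing requires of <dependency>"
MISSING = re.compile(r"^(?P<package>\S+) has missing requires of (?P<dependency>.+)$")


def missing_requires(yum_check_output):
    """Return (package, unmet dependency) pairs found in `yum check` output."""
    found = []
    for line in yum_check_output.splitlines():
        match = MISSING.match(line.strip())
        if match:
            found.append((match["package"], match["dependency"]))
    return found


SAMPLE = """\
Loaded plugins: fastestmirror, versionlock
ovirt-engine-4.3.0.4-1.el7.noarch has missing requires of ovirt-engine-dbscripts = ('0', '4.3.0.4', '1.el7')
"""

for package, dependency in missing_requires(SAMPLE):
    print(f"{package} needs {dependency}")
```

Running such a check before engine-setup would have pointed straight at ovirt-engine-dbscripts instead of a failure deep inside schema.py.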
[ovirt-users] Re: Bug in the web interface?
Hi Lucie,

On 06/02/2019 at 10:02, Lucie Leistnerova wrote:
> I'm sorry, my mistake I did not mention to remove the package without dependencies.
> rpm -e --nodeps ...

I'll write that down.

>> When looking at the log file above (https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=)
>> [...] "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py",
>
> The error is caused by missing ovirt-engine-dbscripts.

OK

>> Well, I thought I had messed up the packages, and I thought a complete yum update would help, as I read :
>> On 05/02/2019 at 15:19, Greg Sheremeta wrote :
>>> The fix is pushed. Standalone engine upgrades should be fine starting now. `yum update` any appliance engines or already upgraded engines to get the latest ovirt-engine-ui-extensions, which fixes the problem.
>> So I ran a yum update.
>
> This package is part of ovirt-engine versionlock so can't be installed/updated separately. engine-setup should install the missing packages. I tried it by myself and it fixed the issue.
> [install] ovirt-engine-dbscripts-4.3.0.5-0.0.master.20190205084851.gitaaebfc9.el7.noarch will be installed

I see I have this package, though in an older version :
# rpm -qa|grep -i dbscripts
ovirt-engine-dbscripts-4.3.0.4-1.el7.noarch

> Not sure what went wrong by you, send please the setup log and the ovirt-engine* rpms list.

(https://framadrop.org/r/ywTOD-Q02-#dA6hdYaxfZpgUB68gtJLB9inH5oJajrL4H9LTktDd6o=)

> And also result of 'ls /usr/share/ovirt-engine/dbscripts'

# LANG=C ls -la /usr/share/ovirt-engine/dbscripts
ls: cannot access /usr/share/ovirt-engine/dbscripts: No such file or directory

You seem to hit the point.
-- Nicolas ECARNOT
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/DA3RSDSTLAHWDCIAZNAGRUMKFHT7Y2GN/
[ovirt-users] Re: Bug in the web interface?
On 05/02/2019 at 15:19, Greg Sheremeta wrote: The fix is pushed. Standalone engine upgrades should be fine starting now. `yum update` any appliance engines or already upgraded engines to get the latest ovirt-engine-ui-extensions, which fixes the problem. So I ran a yum update. After running engine-setup again, it is failing the same way. I compared the complete rpm list with another 4.3 DC that has no issue and, apart from the removed ovirt-engine-dashboard package and obviously many upgraded packages, I see no obvious missing parts. I'm at a loss and don't know how to save this DC, so any help is welcome. -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/7QT44H4DEIZPZVMBO6UPRQ6GZWAKWP3S/
[ovirt-users] Re: [4.3.0] VNC Virt-viewer console not opening
Hello Greg, On 04/02/2019 at 21:13, Greg Sheremeta wrote: When I try to use Spice instead of VNC, it is working nicely. My goal is to stick to VNC. When I try to use noVNC, the additional tab opens and shows "Unsupported security types: 19" Looks like https://bugzilla.redhat.com/show_bug.cgi?id=1659155 Can you try disabling VNC security on the cluster and then reboot the host? VNC security is already disabled. What could I give to help you help me? -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ARBA5SBJLY3QS73XYRJYQ7F7TZJ5KOYT/
[ovirt-users] [4.3.0] VNC Virt-viewer console not opening
Hello, First, congratulations to all of you who worked on this 4.3.0 release, and obviously thank you. Today, I upgraded 4 oVirt setups (4 DCs) from 4.2.7 to 4.3.0. It went well on all 4 DCs. But on one of them, when I try to open a console, I see it open in a flash (it opens and closes immediately). I'm using Firefox 64.0 with Ubuntu 18.10, and all my VMs are set up like this: - video type: QXL - Gfx protocol: VNC - VNC Kbd layout: fr and I'm using virt-viewer. On the problematic DC, all the VMs are showing the same issue. When I try to use Spice instead of VNC, it is working nicely. When I try to use noVNC, the additional tab opens and shows "Unsupported security types: 19". I tried to track down this issue with the Firefox dev console, but it's beyond my understanding. Trying the same with Chromium gives the same blinking open/close. I'd gladly learn how to provide additional debug messages, but /var/log/ovirt-engine/engine.log does not give any useful hint: 2019-02-04 16:57:04,150+01 INFO [org.ovirt.engine.core.bll.SetVmTicketCommand] (default task-24) [1fb01d42] Running command: SetVmTicketCommand internal: false. 
Entities affected : ID: 0c3e02b3-7fec-4bb1-b3d6-2e6c228e7278 Type: VMAction group CONNECT_TO_VM with role type USER 2019-02-04 16:57:04,155+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SetVmTicketVDSCommand] (default task-24) [1fb01d42] START, SetVmTicketVDSCommand(HostName = hv01.prd.sdis38.fr, SetVmTicketVDSCommandParameters:{hostId='687c1c01-a5e1-449c-89d2-9713ccfc2487', vmId='0c3e02b3-7fec-4bb1-b3d6-2e6c228e7278', protocol='VNC', ticket='IivrpGHx5zSw', validTime='120', userName='admin', userId='4a340386-851a-11e8-863d-3417ebeef1af', disconnectAction='NONE'}), log id: 2a897f30 2019-02-04 16:57:04,188+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SetVmTicketVDSCommand] (default task-24) [1fb01d42] FINISH, SetVmTicketVDSCommand, return: , log id: 2a897f30 2019-02-04 16:57:04,211+01 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-24) [1fb01d42] EVENT_ID: VM_SET_TICKET(164), User admin@internal-authz initiated console session for VM ad02.ctat.sdis38.fr What could I give to help you help me? -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/KGCM25ILBTQTY6NLVJUDE7CNF5C5BRE7/
[ovirt-users] Re: The admin portal ui should be more simplified
On 10/01/2019 at 15:13, fle...@hotmail.com wrote: We have an RHV setup of 11 datacenters, 11 clusters, 40 hosts and 300 VMs. The 4 of us administrators are suffering from the new 4.2 UI's lack of active area. The manipulation logic also confuses us. A simple operation needs more clicks than before. Please just make the UI simpler. List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ETR6Q5YWUFTF6Y6RN6SHEAURJBK7OGOQ/ Hello, Would it be wise to suggest two clever ways to deal with complexity: - ManageIQ - Ansible We use them both, and are quite happy with them. Regards, -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MSVKUQMBBXUOVOAWE5FICFL5MACXWERT/
[ovirt-users] Re: Trouble connecting to IDRAC7
On 01/08/2018 at 15:28, Jayme wrote: I just enabled power management/fencing successfully on two of my hosts (Dell PowerEdge R720s with iDRAC 7) but am failing to add the third. I enter the IP and user/pass like the others; it takes 15 seconds or so, then spits out "Test Failed: Internal JSON-RPC error". I tried resetting the iDRAC on that server. I can also ping it and access it fine in a web browser. I can ping it from the host as well. Is there any configuration in iDRAC that could be blocking the fence attempt, or any logs in oVirt I can look at to figure out what might be happening with the connection? I see there is a "fence_idrac" command on the hosts but I am unsure what switches to use with it to test. Thanks! List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/UJQDE3W6NSZWLMSZJQZD7OZM4CYEMNKI/ Hello Jayme, All our iDRACs are successfully power-managed this way: type: ipmilan options: lanplus=1 In the DRAC, we use a dedicated user with the appropriate rights. HTH -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/FTT6IBBAONVMLWWHDW3W76KWT433AYQ2/
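To try the same settings from a host's command line, as asked above, something like the following might work. It is only a sketch: it assumes the fence_ipmilan agent (matching the "ipmilan" type above) is installed on the host, and the address, user and password are placeholders, not values from this thread.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with the iDRAC address and the dedicated user.
IDRAC_IP="192.0.2.10"
IDRAC_USER="fenceuser"
IDRAC_PASS="changeme"

# Print the fence_ipmilan command line matching "type: ipmilan, options: lanplus=1".
# --action=status only queries the power state; it does not power-cycle the host.
fence_status_cmd() {
    printf 'fence_ipmilan --ip=%s --username=%s --password=%s --lanplus --action=status\n' \
        "$1" "$2" "$3"
}

# Show the command so it can be reviewed before being run on a host.
fence_status_cmd "$IDRAC_IP" "$IDRAC_USER" "$IDRAC_PASS"
```

If the printed command reports the power state when run by hand but the oVirt test still fails, the problem is probably not on the iDRAC side.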
[ovirt-users] [No question] NFS disabled, hosts wandering tearful
Hello, This is a simple testimony about what happened yesterday in one of our DCs. This DC runs on a dedicated bare-metal engine, oversized compared to the need, so I added an NFS service on it to host a small storage domain and the ISO storage domain. Yesterday, after receiving the colorful announcement of the 4.2.5 version, I decided to upgrade. As our engine was still on CentOS 7.4, I first upgraded its OS to 7.5, then rebooted. Smooth. Then I followed the very usual oVirt engine upgrade path. Smooth. Eventually, I upgraded the hosts with ovirt-ansible-cluster-upgrade as usual. The result was frightening: the hosts were put in maintenance, upgraded, brought back to life, seen as unavailable, unreachable, connecting, alive, rebooted, then went for another turn, looping... Meanwhile, the SPM role was obviously jumping around, which did not help the debugging. In the end, it appeared that something during an upgrade had stopped and disabled the NFS service. My hosts partially relied on it, so once I restarted the NFS service, everything came back to life. The NFS disabling may come from the CentOS upgrade, unless someone tells me it could come from something on the oVirt side? I'm sure the RH people will advise me not to run NFS on the engine but, apart from this event, I have had no trouble doing this for years. Regards, -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/GB72URRHAB3TNUO4QQBRMWITGTLSJBZJ/
[ovirt-users] Re: Is enabling Epel repo will break the installation?
On 23/07/2018 at 15:33, Arman Khalatyan wrote: Hello, As I remember, some time ago the EPEL collectd was in conflict with the oVirt one. Is it still the case? Thanks, Arman. List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/S4SYV6L5EIW36B3CIR7VWA42FNJCDCUG/ Hello, With a recent 4.2.4.5-1.el7 it was still the case... I just excluded collectd from epel.repo and it was OK. -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/GYZPPUBDSNGKKUYANCEHRRCOHKPUY24N/
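The exclusion mentioned above can be sketched as follows. This assumes the repo file has an "[epel]" section header and that a glob such as "collectd*" is the intended pattern; the helper name is made up for illustration, and GNU sed is assumed (as on CentOS 7).

```shell
#!/bin/sh
# Hypothetical helper: add "exclude=collectd*" right after the [epel] section header.
# Does nothing if an exclude line mentioning collectd is already present in the file.
add_epel_exclude() {
    repo_file=$1
    if grep -q '^exclude=.*collectd' "$repo_file"; then
        return 0
    fi
    # GNU sed one-line "a" (append) command.
    sed -i '/^\[epel\]$/a exclude=collectd*' "$repo_file"
}

# Usage on the engine or a host (try it on a copy of the file first):
#   add_epel_exclude /etc/yum.repos.d/epel.repo
```

With the line in place, yum should no longer consider collectd packages from EPEL, so the oVirt-provided ones are the only candidates.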
[ovirt-users] Host reboot failed
Hello, [oVirt 4.2.4.5-1.el7] Sequence: - Among 7 active UP hosts, one of them runs zero VMs - On this host (still in UP state), I run an SSH restart via the web GUI - The host gracefully shuts down then reboots, with no issue - In the web GUI, as in real life, the host stays in Reboot state forever At this point, the engine can ping it and can ssh-connect to it; the host seems to have zero issues. In the web GUI, I cannot put it into active state because it is not in maintenance state. It stays in reboot state. Nor can I put it in maintenance state, because it stays in reboot state. This state lasts long enough to allow me to type this mail and look into the logs and, as I was about to send logs, I see the host returning to life (its state comes back as UP). I don't type fast, so after the host finished rebooting, maybe 5 or 10 minutes passed before the engine linked to the host again. Before posting additional logs and comments, does anybody know if this is a known bug or behavior, or do I have to open a BZ? Regards, -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/B5HCXSJR57LQ2SNRFK4POUIX7Z2DX2S6/
[ovirt-users] Re: Lost host after upgrade/reboot
On 19/06/2018 at 10:14, Nicolas Ecarnot wrote: In this engine log above, you see that I'm using my account to manage this engine, as I have been doing for years with no issue. I'll try the exact same path with admin@internal to see what could change, but I don't see the link. I just tried on another host, using admin@internal, and the same issue occurred. What other logs could I give you to debug this? -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/Q2KI7OJKUYJLZ3MQU5LPBQW77A5A4YOX/
[ovirt-users] Lost host after upgrade/reboot
Hello, TL;DR: the engine stops talking with a rebooted host. [oVirt 4.2.3.5-1.el7.centos] - From the web GUI, upgrading a host, leaving the reboot checkbox checked - upgrade is OK (/var/log/yum.log is showing successful updates + the Ansible host deploy log is also OK) - reboot is OK (clean, SSH OK...) - the host eventually appears as "Install failed" - engine.log says: 2018-06-19 10:02:24,896+02 ERROR [org.ovirt.engine.core.bll.SshHostRebootCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] SSH reboot command failed on host 'serv-hv-prds06': SSH session timeout host 'root@serv-hv-prds06' Stdout: Stderr: 2018-06-19 10:02:25,028+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] EVENT_ID: SYSTEM_FAILED_SSH_HOST_RESTART(198), A restart using SSH initiated by the engine to Host serv-hv-prds06 has failed. 2018-06-19 10:02:25,185+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] START, SetVdsStatusVDSCommand(HostName = serv-hv-prds06, SetVdsStatusVDSCommandParameters:{hostId='9c1566a4-8432-4de6-b30d-fd3b8e5fafca', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 833f9bd 2018-06-19 10:02:25,191+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] FINISH, SetVdsStatusVDSCommand, log id: 833f9bd 2018-06-19 10:02:25,191+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.UpgradeHostInternalCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] Engine failed to restart via ssh host 'serv-hv-prds06' ('9c1566a4-8432-4de6-b30d-fd3b8e5fafca') after upgrade 2018-06-19 10:02:25,256+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID: HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06 (User: necar...@sdis.isere.fr@SDIS38-authz). 2018-06-19 10:02:30,755+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-69) [8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID: HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06 (User: necar...@sdis.isere.fr@SDIS38-authz). - Manually activating the host puts it back on track without issue. The SSH communications between the engine and the host are usually very sound (VM migrations, maintenance...). On this oVirt DC, I reproduced this issue twice on 2 different hosts. In this engine log above, you see that I'm using my account to manage this engine, as I have been doing for years with no issue. I'll try the exact same path with admin@internal to see what could change, but I don't see the link. What other logs could I give you to debug this? Regards, -- Nicolas ECARNOT List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/CT5KHY3C2ASOXBVNUIEBG5WA42JKJGXH/
[ovirt-users] Re: Hosts : Upgrade failed - 4.2.3
On 16/05/2018 at 12:55, Fred Rolland wrote: It looks like you still have 4.1 repos... Yes. I thought Ansible was in charge of disabling old repos. It does not seem to be the case, does it? -- Nicolas ECARNOT
[ovirt-users] Hosts : Upgrade failed - 4.2.3
Hello, I was on 4.2.2 and it failed. I upgraded to 4.2.3 and it's still failing. From the GUI, I switch one host into maintenance mode, try to upgrade it, and it fails. On the engine, engine.log is not saying anything helpful. But, still on the engine, in /var/log/ovirt-engine/host-deploy/ovirt-host-mgmt-ansible-20180516121013-xxx-dacf1972-f184-4d01-a863-7974579e6bc8.log, I see: http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.8/repodata/repomd.xml: [Errno 14] HTTP Error 404 - Not Found Essai d'un autre miroir. To address this issue please refer to the below wiki article https://wiki.centos.org/yum-errors If above article doesn't help to resolve this issue please use https://bugs.centos.org/. http://mirror.centos.org/centos/7/virt/x86_64/ovirt-4.1/repodata/repomd.xml: [Errno 14] HTTP Error 404 - Not Found Essai d'un autre miroir. This is French, but I'm sure you understand that it translates into "gluster repo issue". Is there something I could do? Thank you. -- Nicolas ECARNOT
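The two 404 errors above point at the gluster-3.8 and ovirt-4.1 repositories. A small sketch to find which repo files on the host still reference them could look like this; the function name is made up, and the default directory is the usual yum location, taken here as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: list yum repo files that still mention the repositories
# returning 404 in the log above. The directory is a parameter so the function
# can be tried on a copy first.
find_stale_repos() {
    repo_dir=${1:-/etc/yum.repos.d}
    grep -lE 'ovirt-4\.1|gluster-3\.8' "$repo_dir"/*.repo 2>/dev/null
}

# Usage on the failing host:
#   find_stale_repos
```

What to do with the files it lists (setting enabled=0 in the matching sections, or removing the old release package) is left to the administrator.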
[ovirt-users] Why RAW images when using GlusterFS?
Hello, Amongst others, I have one 3.6 DC that has been working very well for years, entirely based on GlusterFS. When having a close look (qemu-img info) at the images, I see their format is all RAW and not QCOW2. I never noticed or bothered before, but I'm wondering: - is it by design? - is it something we can change (I'd prefer qcow2)? - are there any limitations? And finally, I have the same questions about NFS storage domains. Thank you. -- Nicolas ECARNOT ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
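For anyone wanting to repeat the check across many images, the format can be pulled out of the qemu-img info output with a small filter. This is a sketch only; the function name is made up and the image path in the usage line is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: extract the "file format" field from `qemu-img info`
# output read on standard input.
parse_image_format() {
    awk -F': ' '/^file format:/ { print $2 }'
}

# Usage on a host (placeholder path):
#   qemu-img info /path/to/image | parse_image_format
```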
Re: [ovirt-users] VM has been paused due to NO STORAGE SPACE ERROR ?!?!?!?!
On 16/03/2018 at 15:48, Alex Crow wrote: On 16/03/18 13:46, Nicolas Ecarnot wrote: On 16/03/2018 at 13:28, Karli Sjöberg wrote: On 16 March 2018 at 12:26, Enrico Becchetti <enrico.becche...@pg.infn.it> wrote: Dear All, Has anyone seen that error? Yes, I experienced it dozens of times on 3.6 (my 4.2 setup has insufficient workload to trigger such an event). And in every case, there was no actual lack of space. Enrico Becchetti Servizio di Calcolo e Reti I think I remember something to do with thin provisioning and not being able to grow fast enough, so out of space. Are the VMs' disks thick or thin? All our storage domains are thin-provisioned and served by iSCSI (Equallogic PS6xxx and 4xxx). Enrico, do you know if a bug has been filed about this? Did the VM remain paused? In my experience the VM just gets temporarily paused while the storage is expanded. RH confirmed to me in a ticket that this is expected behaviour. AFAIR, most of them went back up and running by themselves (we had to manually resume some of them from time to time). The storage-side weakness is an interesting trail to follow. We also experienced this behavior when migrating lots of VMs at once, even with a dedicated storage network. Having been on this mailing list for a long time, I remember we have already discussed several times how sensitive oVirt can appear to storage latencies. On my side, the site where most of our workload resides is still on 3.6, so I cannot yet witness the efforts the oVirt devs have made to cope with this in 4.2, but I'm sure they did. -- Nicolas ECARNOT
Re: [ovirt-users] VM has been paused due to NO STORAGE SPACE ERROR ?!?!?!?!
On 16/03/2018 at 13:28, Karli Sjöberg wrote: On 16 March 2018 at 12:26, Enrico Becchetti <enrico.becche...@pg.infn.it> wrote: Dear All, Has anyone seen that error? Yes, I experienced it dozens of times on 3.6 (my 4.2 setup has insufficient workload to trigger such an event). And in every case, there was no actual lack of space. Enrico Becchetti Servizio di Calcolo e Reti I think I remember something to do with thin provisioning and not being able to grow fast enough, so out of space. Are the VMs' disks thick or thin? All our storage domains are thin-provisioned and served by iSCSI (Equallogic PS6xxx and 4xxx). Enrico, do you know if a bug has been filed about this? -- Nicolas ECARNOT
Re: [ovirt-users] firewall node
https://www.mail-archive.com/users@ovirt.org/msg46608.html On 09/03/2018 at 20:12, Fabrice SOLER wrote: Hello, I am trying to open a port on the node. For that, I chose firewalld in the cluster configuration and created the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file:
- name: Enable additional port on firewalld
  firewalld:
    port: "12345/tcp"
    permanent: yes
    immediate: yes
    state: enabled
Then I rebooted the node as described at this link: https://www.ovirt.org/blog/2017/12/host-deploy-customization/ On the node, after the reboot, I read the iptables rules (iptables -L) and the port is not open. I have just updated the engine and the node is 4.2.1.1. Is there some change about firewalld in this version? (in 4.2.0 it worked) Sincerely -- Nicolas ECARNOT
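When the cluster uses firewalld, asking firewalld itself may give a more direct answer than reading iptables -L. The sketch below checks whether a port shows up in the output of firewall-cmd --list-ports; the helper name is made up, and the port is the one from the task above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the wanted port/protocol appears in a
# space-separated port list such as the output of `firewall-cmd --list-ports`.
port_is_listed() {
    wanted=$1
    ports=$2
    for p in $ports; do
        if [ "$p" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Usage on the node:
#   port_is_listed "12345/tcp" "$(firewall-cmd --list-ports)" && echo "port is open"
```

The same check with "firewall-cmd --permanent --list-ports" would tell whether the rule was also saved in the permanent configuration.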
Re: [ovirt-users] Power off VM from VM portal
On 07/03/2018 at 13:42, Alexandr Krivulya wrote: On 06.03.2018 at 17:39, Nicolas Ecarnot wrote: On 06/03/2018 at 16:02, Alexandr Krivulya wrote: Hi, is there any way to power off a VM from the VM portal (4.2.1.7)? I can't find a "power off" button, just "shutdown". Hello Alexandr, After clicking on the VM link, you'll notice that to the right of the Shutdown button there is an arrow giving access to the Power Off feature. I can't find this arrow on the Shutdown button. Oh sorry, I answered in the context of the admin portal. Indeed, in the VM portal, I don't see this power-off button either. -- Nicolas ECARNOT
Re: [ovirt-users] Power off VM from VM portal
On 06/03/2018 at 16:02, Alexandr Krivulya wrote: Hi, is there any way to power off a VM from the VM portal (4.2.1.7)? I can't find a "power off" button, just "shutdown". Hello Alexandr, After clicking on the VM link, you'll notice that to the right of the Shutdown button there is an arrow giving access to the Power Off feature. Regards, -- Nicolas ECARNOT
[ovirt-users] Importing VM fails with "No space left on device"
Hello, When importing a VM, I'm facing the known bug: https://access.redhat.com/solutions/2770791 QImgError: ecode=1, stdout=[], stderr=['qemu-img: error while writing sector 93569024: No space left on device' The difference between my case and what is described on the RH web page is that I have no "Failed to flush the refcount block cache". Here is what I see: ecfbd1a4-f9d2-463a-ade6-def5bd217b43::DEBUG::2018-03-06 09:57:36,460::utils::718::root::(watchCmd) FAILED: = ['qemu-img: error while writing sector 205517952: No space left on device']; = 1 ecfbd1a4-f9d2-463a-ade6-def5bd217b43::ERROR::2018-03-06 09:57:36,460::image::865::Storage.Image::(copyCollapsed) conversion failure for volume ac08bc8d-1eea-449a-a102-cf763c6726c8 Traceback (most recent call last): File "/usr/share/vdsm/storage/image.py", line 860, in copyCollapsed volume.fmt2str(dstVolFormat)) File "/usr/lib/python2.7/site-packages/vdsm/qemuimg.py", line 207, in convert raise QImgError(rc, out, err) QImgError: ecode=1, stdout=[], stderr=['qemu-img: error while writing sector 205517952: No space left on device'], message=None ecfbd1a4-f9d2-463a-ade6-def5bd217b43::ERROR::2018-03-06 09:57:36,461::image::878::Storage.Image::(copyCollapsed) Unexpected error Traceback (most recent call last): File "/usr/share/vdsm/storage/image.py", line 866, in copyCollapsed raise se.CopyImageError(str(e)) CopyImageError: low level Image copy failed: ("ecode=1, stdout=[], stderr=['qemu-img: error while writing sector 205517952: No space left on device'], message=None",) I followed the advice on the RH web page (check that the figures match between the qemu-img sizes and the metadata file), and they seem to be correct: root@serv-hv-adm30:/etc# qemu-img info /rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr\:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8 image: 
/rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8 file format: qcow2 virtual size: 98G (105226698752 bytes) disk size: 97G cluster_size: 65536 Format specific information: compat: 0.10 refcount bits: 16 root@serv-hv-adm30:/etc# cat /rhev/data-center/mnt/serv-lin-adm1.sdis.isere.fr\:_home_vmexport3/be2878c9-2c46-476b-bfae-8b02a4679022/images/a5d68d88-3b54-488d-a61e-7995a1906994/ac08bc8d-1eea-449a-a102-cf763c6726c8.meta DOMAIN=be2878c9-2c46-476b-bfae-8b02a4679022 CTIME=1520318755 FORMAT=COW DISKTYPE=1 LEGALITY=LEGAL SIZE=205520896 VOLTYPE=LEAF DESCRIPTION= IMAGE=a5d68d88-3b54-488d-a61e-7995a1906994 PUUID=---- MTIME=0 POOL_UUID= TYPE=SPARSE EOF So I don't see what's wrong? -- Nicolas ECARNOT ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
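The comparison between the two figures above can be written down explicitly. The sketch below assumes that SIZE in the .meta file counts 512-byte sectors; that unit is an assumption, although the numbers quoted in the message agree with it exactly.

```shell
#!/bin/sh
# Assumption: SIZE in the .meta file is expressed in 512-byte sectors.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

meta_size=205520896          # SIZE from the .meta file above
virtual_size=105226698752    # "virtual size" in bytes from qemu-img info above

if [ "$(sectors_to_bytes "$meta_size")" -eq "$virtual_size" ]; then
    echo "metadata SIZE matches the qemu-img virtual size"
else
    echo "metadata SIZE and qemu-img virtual size differ"
fi
```

Under that assumption the two values agree, and the failing sector (205517952) lies below SIZE, so the write that failed was still inside the virtual disk.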
Re: [ovirt-users] oVIRT 4.1 / iSCSI Multipathing
Hello, [Unusual setup] Last week, I eventually managed to make a 4.2.1.7 oVirt work with iSCSI multipathing on both hosts and guest, connected to a Dell Equallogic SAN which provides one single virtual IP - my hosts have two dedicated NICs for iSCSI, but on the same VLAN. Torture tests showed good resilience. [Classical setup] But this year, we plan to create at least two additional DCs and to connect their hosts to a "classical" SAN, i.e. one which provides TWO IPs on segregated (non-routed) VLANs, and we'd like to use the same iSCSI multipathing feature. The discussion below could lead one to think that oVirt needs the two iSCSI VLANs to be routed, allowing the hosts in one VLAN to access resources in the other. As Vinicius explained, this is not a best practice, to say the least. Searching through the mailing list archive, I found no answer to Vinicius' question. Could a Red Hat storage and/or network expert enlighten us on these points? Regards, -- Nicolas Ecarnot On 21/07/2017 at 20:56, Vinícius Ferrão wrote: On 21 Jul 2017, at 15:12, Yaniv Kaul <yk...@redhat.com> wrote: On Wed, Jul 19, 2017 at 9:13 PM, Vinícius Ferrão <fer...@if.ufrj.br> wrote: Hello, I skipped this message entirely yesterday. So this is by design? Because the best practices of iSCSI MPIO, as far as I know, recommend two completely separate paths. If this can't be achieved with oVirt, what's the point of running MPIO? With regular storage it is quite easy to achieve using 'iSCSI bonding'. I think the Dell storage is a bit different and requires some more investigation - or experience with it. Y. Yaniv, thank you for answering this. I'm really hoping that a solution will be found. Actually I'm not running anything from Dell. My storage system is FreeNAS, which is pretty standard and, as far as I know, iSCSI practice dictates segregated networks for proper operation. 
All other major virtualization products support iSCSI this way: vSphere, XenServer and Hyper-V. So I was really surprised that oVirt (and even RHV, I requested a trial yesterday) does not implement iSCSI with the well-known best practices. Here is a picture of the architecture that I took from Google when searching for "mpio best practices": https://image.slidesharecdn.com/2010-12-06-midwest-reg-vmug-101206110506-phpapp01/95/nextgeneration-best-practices-for-vmware-and-storage-15-728.jpg?cb=1296301640 And as you can see, it's segregated networks on a machine reaching the same target. In my case, my datacenter has five hypervisor machines, with two NICs dedicated to iSCSI. Both NICs connect to different converged Ethernet switches and the iStorage is connected the same way. So it really does not make sense that the first NIC can reach the second NIC's target. In case of a switch failure the cluster will go down anyway, so what's the point of running MPIO? Right? Thanks once again, V.
Re: [ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials
On 01/03/2018 at 15:50, Nicolas Ecarnot wrote: Couldn't the Red Hat documentation mentioned above be more accurate? Something like 'scl enable rh-postgrsql95' should help. Not that much... root@serv-mvm-prds01:/etc/ovirt-engine-setup.conf.d# cd /tmp root@serv-mvm-prds01:/tmp# su - postgres Dernière connexion : jeudi 1 mars 2018 à 15:42:40 CET sur pts/2 -bash-4.2$ scl enable rh-postgrsql95 Need at least 3 arguments. Run scl --help to get help. After reading and reading again: For the record, here are the steps that allowed me to add this user: su - postgres scl enable rh-postgresql95 'psql ovirt_engine_history' CREATE ROLE cfme with LOGIN ENCRYPTED PASSWORD 'xxx'; SELECT 'GRANT SELECT ON ' || relname || ' TO cfme;' FROM pg_class JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace WHERE nspname = 'public' AND relkind IN ('r', 'v','S'); \q exit -- Nicolas ECARNOT
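Note that the SELECT in the steps above only prints GRANT statements; they still have to be executed for the cfme role to gain read access. A possible, untested way to chain both steps from the shell is sketched below. The function names and the temporary file are made up for illustration, and the script assumes it runs as the postgres user on the engine with the rh-postgresql95 collection, as in the message.

```shell
#!/bin/sh
# Names taken from the message above; run this as the postgres user on the engine.
DB=ovirt_engine_history
ROLE=cfme

# Print the query that generates one GRANT statement per table, view and
# sequence of the "public" schema (same query as in the steps above).
grant_generator_sql() {
    cat <<EOF
SELECT 'GRANT SELECT ON ' || relname || ' TO $ROLE;'
FROM pg_class JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
WHERE nspname = 'public' AND relkind IN ('r', 'v', 'S');
EOF
}

# Generate the GRANT statements, then feed them back to psql so they are executed.
# -A (unaligned) and -t (tuples only) reduce the first psql's output to bare statements.
apply_grants() {
    sql_file=$(mktemp) || return 1
    grant_generator_sql > "$sql_file"
    scl enable rh-postgresql95 "psql -At -d $DB -f $sql_file" |
        scl enable rh-postgresql95 "psql -d $DB"
    rm -f "$sql_file"
}

# Usage, after the CREATE ROLE step:
#   apply_grants
```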
Re: [ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials
On 01/03/2018 at 15:00, Yaniv Kaul wrote: On Thu, Mar 1, 2018 at 2:13 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote: Hello, As for my 4 previous oVirt DCs, I'm trying to add them to ManageIQ providers. I tried to follow this guide: https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.6/html-single/deployment_planning_guide/#data_collection_for_rhev_33_34 But when trying to run psql, the shell tells me the command is not found. Hello Yaniv, Thank you for answering. Because you are probably on PG 9.5 SCL, I assume? I had never heard about that before today. I installed a bare-metal CentOS 7.4 on which I installed oVirt 4.2. I saw no reference to SCL anywhere, neither during the setup nor in the oVirt install documentation. How is an average user supposed to behave in such a situation? (In my case, as usual, I read and read again) Couldn't the Red Hat documentation mentioned above be more accurate? Something like 'scl enable rh-postgrsql95' should help. Not that much... root@serv-mvm-prds01:/etc/ovirt-engine-setup.conf.d# cd /tmp root@serv-mvm-prds01:/tmp# su - postgres Dernière connexion : jeudi 1 mars 2018 à 15:42:40 CET sur pts/2 -bash-4.2$ scl enable rh-postgrsql95 Need at least 3 arguments. Run scl --help to get help. -- Nicolas ECARNOT
[ovirt-users] oVirt 4.2.x and ManageIQ : Adding 'cfme' credentials
Hello,

As for my 4 previous oVirt DCs, I'm trying to add them to ManageIQ providers. I tried to follow this guide: https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.6/html-single/deployment_planning_guide/#data_collection_for_rhev_33_34

But when trying to run psql, the shell tells me the command is not found.

I made a very simple setup: when running engine-setup, I accepted the default answer to the question about DWH, so the DB is local. When viewing (with pgAdmin) the roles of this new PostgreSQL DB, I see there is no 'cfme' user. Do I have to re-run the setup and answer differently to ensure that other packages are installed and configured?

I saw https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/data_warehouse_guide/#Overview_of_Configuring_Data_Warehouse telling me to re-run. But I see this:

    rpm -qa|grep -i dwh
    ovirt-engine-dwh-4.2.1.2-1.el7.centos.noarch
    ovirt-engine-dwh-setup-4.2.1.2-1.el7.centos.noarch

so I thought that was already enough...?

--
Nicolas ECARNOT
Re: [ovirt-users] Hosts firewall custom setup
Hello,

For the record: the workaround you suggest below is successful. Thank you.

--
Nicolas Ecarnot

On 27/02/2018 at 14:15, Ondra Machacek wrote:
> On 02/27/2018 11:29 AM, Nicolas Ecarnot wrote:
>> On 26/02/2018 at 15:00, Yedidyah Bar David wrote:
>>>> But how do we add custom rules in case of firewalld type?
>>> Please see: https://ovirt.org/blog/2017/12/host-deploy-customization/
>> Hello Didi et al.,
>> - I followed the advice found on this blog page and created the exact same filename with the adequate content.
>> - I set the cluster firewall type to firewalld
>> - I restarted ovirt-engine
>> - I reinstalled a host
>> I see no usage of this Ansible yml file. I see the creation of an ansible deploy log file for my host, and I see the usual firewall ports being opened, but nowhere do I see any usage of the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file.
>> - I added the debug msg part in the ansible recipe, but to no avail.
>> - Extensive grepping through /var/log on the engine shows no calls of this script.
>> Thus, I see no effect on the ports of the host's firewalld config.
>> What should I look at now?
> It looks like you hit the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=1549163
> It will be fixed in 4.2.2 release. I believe you can meanwhile remove line:
>     - oVirt-metrics
> from file: /usr/share/ovirt-engine/playbooks/roles/ovirt-host-deploy/meta/main.yml
>> Thank you.
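The workaround confirmed above (removing the `- oVirt-metrics` line from the role's meta/main.yml until 4.2.2) could be applied with a small helper like the one below. This is a sketch under assumptions: the function name is hypothetical, and it assumes the dependency sits on a line of its own, as the message suggests. It keeps a `.bak` copy so the change can be reverted.

```shell
# Hypothetical helper: drop the "- oVirt-metrics" entry from a role meta file,
# keeping a backup next to it. The file path is passed as first argument.
remove_metrics_dependency() {
    meta_file="$1"
    # Keep a copy of the original file so the workaround can be undone.
    cp -p "$meta_file" "$meta_file.bak" || return 1
    # Delete only lines made of "- oVirt-metrics" (optional indentation).
    sed '/^[[:space:]]*-[[:space:]]*oVirt-metrics[[:space:]]*$/d' \
        "$meta_file.bak" > "$meta_file"
}

# Possible usage on the engine, with the path quoted in the message:
#   remove_metrics_dependency /usr/share/ovirt-engine/playbooks/roles/ovirt-host-deploy/meta/main.yml
```

Since the file lives under /usr/share, a later engine package update will presumably replace it, which fits a temporary workaround.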
Re: [ovirt-users] Hosts firewall custom setup
On 26/02/2018 at 15:00, Yedidyah Bar David wrote:
>> But how do we add custom rules in case of firewalld type?
> Please see: https://ovirt.org/blog/2017/12/host-deploy-customization/

Hello Didi et al.,

- I followed the advice found on this blog page and created the exact same filename with the adequate content.
- I set the cluster firewall type to firewalld
- I restarted ovirt-engine
- I reinstalled a host

I see no usage of this Ansible yml file. I see the creation of an ansible deploy log file for my host, and I see the usual firewall ports being opened, but nowhere do I see any usage of the /etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml file.

- I added the debug msg part in the ansible recipe, but to no avail.
- Extensive grepping through /var/log on the engine shows no calls of this script.

Thus, I see no effect on the ports of the host's firewalld config.

What should I look at now?

Thank you.

--
Nicolas ECARNOT
Re: [ovirt-users] Hosts firewall custom setup
On 26/02/2018 at 14:03, Yedidyah Bar David wrote:
> On Mon, Feb 26, 2018 at 2:01 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote:
>> Hello,
>> On oVirt 4.2.1.7, I'm trying to set up custom iptables rules, as I have been doing for years, with engine-config --set IPTablesConfigSiteCustom="blah blah blah".
>> On my hosts, I can see that /etc/sysconfig/iptables does contain the correct custom rules I added, but when manually checking with iptables -L, I don't see my rules active.
>> On my hosts, I see that the iptables service is stopped and disabled, and that the firewalld service is up and running. That explains why iptables customization has no effect.
> Indeed. IIRC the type of firewall is now set per cluster or something like that, not sure about the details - adding Ondra.

Per cluster, one can indeed choose the firewall type. I suppose that, on the hosts, it translates into the activation of the corresponding service.

But how do we add custom rules in case of firewalld type?

On the hosts, I imagine that could translate into changes in: /etc/firewalld/zones/public.xml

--
Nicolas ECARNOT
[ovirt-users] Hosts firewall custom setup
Hello,

On oVirt 4.2.1.7, I'm trying to set up custom iptables rules, as I have been doing for years, with engine-config --set IPTablesConfigSiteCustom="blah blah blah".

On my hosts, I can see that /etc/sysconfig/iptables does contain the correct custom rules I added, but when manually checking with iptables -L, I don't see my rules active.

On my hosts, I see that the iptables service is stopped and disabled, and that the firewalld service is up and running. That explains why iptables customization has no effect.

In the engine setup, I see that /etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf contains:

    OVESETUP_CONFIG/firewallManager=none:None

I'm confused about this setting: when running engine-setup, I'm not sure I understand whether answering yes to the question about the firewall will modify the engine, the hosts, or all of them.

Actually, I'd like my engine to stay with a disabled firewall, but my hosts to have an active one. Is it true that this is not an option, and that I have to answer yes and enable the firewall on the engine, so that the OVESETUP_CONFIG/firewallManager option is set (to firewalld or iptables) and this setup is propagated to the hosts?

Thank you.

--
Nicolas ECARNOT
Re: [ovirt-users] [Qemu-block] qcow2 images corruption
https://framadrop.org/r/Lvvr392QZo#/wOeYUUlHQAtkUw1E+x2YdqTqq21Pbic6OPBIH0TjZE=

On 14/02/2018 at 00:01, John Snow wrote:
> On 02/13/2018 04:41 AM, Kevin Wolf wrote:
>> On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:
>>> TL; DR : qcow2 images keep getting corrupted. Any workaround?
>> Not without knowing the cause.
>> The first thing to make sure is that the image isn't touched by a second process while QEMU is running a VM. The classic one is using 'qemu-img snapshot' on the image of a running VM, which is instant corruption (and newer QEMU versions have locking in place to prevent this), but we have seen more absurd cases of things outside QEMU tampering with the image when we were investigating previous corruption reports.
>> This covers the majority of all reports, we haven't had a real corruption caused by a QEMU bug in ages.
>>> After having found (https://access.redhat.com/solutions/1173623) the right logical volume hosting the qcow2 image, I can run qemu-img check on it.
>>> - On 80% of my VMs, I find no errors.
>>> - On 15% of them, I find Leaked cluster errors that I can correct using "qemu-img check -r all"
>>> - On 5% of them, I find Leaked clusters errors and further fatal errors, which can not be corrected with qemu-img. In rare cases, qemu-img can correct them, but destroys large parts of the image (becomes unusable), and on other cases it can not correct them at all.
>> It would be good if you could make the 'qemu-img check' output available somewhere.
>> It would be even better if we could have a look at the respective image. I seem to remember that John (CCed) had a few scripts to analyse corrupted qcow2 images, maybe we would be able to see something there.
> Hi! I did write a pretty simplistic tool for trying to tell the shape of a corruption at a glance. It seems to work pretty similarly to the other tool you already found, but it won't hurt anything to run it:
> https://github.com/jnsnow/qcheck
> (Actually, that other tool looks like it has an awful lot of options. I'll have to check it out.)
> It can print a really upsetting amount of data (especially for very corrupt images), but in the default case, the simple setting should do the trick just fine.
> You could always put the output from this tool in a pastebin too; it might help me visualize the problem a bit more -- I find seeing the exact offsets and locations of where all the various tables and things to be pretty helpful.
> You can also always use the "deluge" option and compress it if you want, just don't let it print to your terminal:
>     jsnow@probe (dev) ~/s/qcheck> ./qcheck -xd /home/bos/jsnow/src/qemu/bin/git/install_test_f26.qcow2 > deluge.log; and ls -sh deluge.log
>     4.3M deluge.log
> but it compresses down very well:
>     jsnow@probe (dev) ~/s/qcheck> 7z a -t7z -m0=ppmd deluge.ppmd.7z deluge.log
>     jsnow@probe (dev) ~/s/qcheck> ls -s deluge.ppmd.7z
>     316 deluge.ppmd.7z
> So I suppose if you want to send along:
> (1) The basic output without any flags, in a pastebin
> (2) The zipped deluge output, just in case
> and I will try my hand at guessing what went wrong.
> (Also, maybe my tool will totally choke for your image, who knows. It hasn't received an overwhelming amount of testing apart from when I go to use it personally and inevitably wind up displeased with how it handles certain situations, so ...)
>>> What I read similar to my case is :
>>> - usage of qcow2
>>> - heavy disk I/O
>>> - using the virtio-blk driver
>>> In the proxmox thread, they tend to say that using virtio-scsi is the solution. Having asked this question to oVirt experts (https://lists.ovirt.org/pipermail/users/2018-February/086753.html) but it's not clear the driver is to blame.
>> This seems very unlikely. The corruption you're seeing is in the qcow2 metadata, not only in the guest data. If anything, virtio-scsi exercises more qcow2 code paths than virtio-blk, so any potential bug that affects virtio-blk should also affect virtio-scsi, but not the other way around.
>>> I agree with the answer Yaniv Kaul gave to me, saying I have to properly report the issue, so I'm longing to know which peculiar information I can give you now.
>> To be honest, debugging corruption after the fact is pretty hard. We'd need the 'qemu-img check' output and ideally the image to do anything, but I can't promise that anything would come out of this.
>> Best would be a reproducer, or at least some operation that you can link to the appearance of the corruption. Then we could take a more targeted look at the respective code.
>>> As you can imagine, all this setup is in production, and for most of the VMs, I can not "play" with them. Moreover, we launched a campaign of nightly stopping every VM, qemu-img check them one by one, then boot. So it might take some time before I find another corrupted image. (which I'll preciously store for debug)
>>> Other informations : We very rarely do snapshots, but I'm close to imagine that automated migrations of VMs could trigger similar behaviors on qcow2
Re: [ovirt-users] [Qemu-block] qcow2 images corruption
On 13/02/2018 at 16:26, Nicolas Ecarnot wrote:
>> It would be good if you could make the 'qemu-img check' output available somewhere.
> I found this : https://github.com/ShijunDeng/qcow2-dump

and the transcript (beautiful colors when viewed with "more") is attached:

--
Nicolas ECARNOT

Script started on Tue 13 Feb 2018 17:31:05 CET
root@serv-hv-adm13:/home# /root/qcow2-dump -m check serv-term-adm4-corr.qcow2.img

File: serv-term-adm4-corr.qcow2.img

magic: 0x514649fb
version: 2
backing_file_offset: 0x0
backing_file_size: 0
fs_type: xfs
virtual_size: 64424509440 / 61440M / 60G
disk_size: 36507222016 / 34816M / 34G
seek_end: 36507222016 [0x88000] / 34816M / 34G
cluster_bits: 16
cluster_size: 65536
crypt_method: 0
csize_shift: 54
csize_mask: 255
cluster_offset_mask: 0x3f
l1_table_offset: 0x76a46
l1_size: 120
l1_vm_state_index: 120
l2_size: 8192
refcount_order: 4
refcount_bits: 16
refcount_block_bits: 15
refcount_block_size: 32768
refcount_table_offset: 0x1
refcount_table_clusters: 1
snapshots_offset: 0x0
nb_snapshots: 0
incompatible_features:
compatible_features:
autoclear_features:

Active Snapshot:
L1 Table: [offset: 0x76a46, len: 120]
Result:
L1 Table: unaligned: 0, invalid: 0, unused: 53, used: 67
L2 Table: unaligned: 0, invalid: 0, unused: 20304, used: 528560

Refcount Table:
Refcount Table: [offset: 0x1, len: 8192]
Result:
Refcount Table: unaligned: 0, invalid: 0, unused: 8175, used: 17
Refcount: error: 4342, leak: 0, unused: 28426, used: 524288

COPIED OFLAG:
Result:
L1 Table ERROR OFLAG_COPIED: 1
L2 Table ERROR OFLAG_COPIED: 4323
Active L2 COPIED: 528560 [34639708160 / 33035M / 32G]

Active Cluster:
Result:
Active Cluster: reuse: 17

Summary:
preallocation: off
Active Cluster: reuse: 17
Refcount Table: unaligned: 0, invalid: 0, unused: 8175, used: 17
Refcount: error: 4342, leak: 0, rebuild: 4325, unused: 28426, used: 524288
L1 Table: unaligned: 0, invalid: 0, unused: 53, used: 67 oflag copied: 1
L2 Table: unaligned: 0, invalid: 0, unused: 20304, used: 528560 oflag copied: 4323

### qcow2 image has refcount errors! (=_=#) ###
### and qcow2 image has copied errors! (o_0)? ###
### Sadly: refcount error cause active cluster reused! Orz ###
### Please backup this image and contact the author! ###

root@serv-hv-adm13:/home# exit
Script finished on Tue 13 Feb 2018 17:31:13 CET
Re: [ovirt-users] [Qemu-block] qcow2 images corruption
Hello Kevin,

On 13/02/2018 at 10:41, Kevin Wolf wrote:
> On 07.02.2018 at 18:06, Nicolas Ecarnot wrote:
>> TL; DR : qcow2 images keep getting corrupted. Any workaround?
> Not without knowing the cause.

Actually, my main concern is finding the cause rather than repairing my corrupted VMs. In other words: I would rather help oVirt than help myself.

> The first thing to make sure is that the image isn't touched by a second process while QEMU is running a VM.

Indeed, I read some BZ about this issue: they were raised by a user who ran some qemu-img commands on a "mounted" image, thus leading to some corruption. In my case, I'm not playing with this, and the corrupted VMs were only touched by classical oVirt actions.

> The classic one is using 'qemu-img snapshot' on the image of a running VM, which is instant corruption (and newer QEMU versions have locking in place to prevent this), but we have seen more absurd cases of things outside QEMU tampering with the image when we were investigating previous corruption reports.
> This covers the majority of all reports, we haven't had a real corruption caused by a QEMU bug in ages.

May I ask in which QEMU version this kind of locking was added? As I wrote, our oVirt setup is 3.6, so not recent.

>> After having found (https://access.redhat.com/solutions/1173623) the right logical volume hosting the qcow2 image, I can run qemu-img check on it.
>> - On 80% of my VMs, I find no errors.
>> - On 15% of them, I find Leaked cluster errors that I can correct using "qemu-img check -r all"
>> - On 5% of them, I find Leaked clusters errors and further fatal errors, which can not be corrected with qemu-img. In rare cases, qemu-img can correct them, but destroys large parts of the image (becomes unusable), and on other cases it can not correct them at all.
> It would be good if you could make the 'qemu-img check' output available somewhere.

See attachment.

> It would be even better if we could have a look at the respective image. I seem to remember that John (CCed) had a few scripts to analyse corrupted qcow2 images, maybe we would be able to see something there.

I just exported it like this:

    qemu-img convert /dev/the_correct_path /home/blablah.qcow2.img

The resulting file is 32G and I need an idea of how to transfer this image to you.

>> What I read similar to my case is :
>> - usage of qcow2
>> - heavy disk I/O
>> - using the virtio-blk driver
>> In the proxmox thread, they tend to say that using virtio-scsi is the solution. Having asked this question to oVirt experts (https://lists.ovirt.org/pipermail/users/2018-February/086753.html) but it's not clear the driver is to blame.
> This seems very unlikely. The corruption you're seeing is in the qcow2 metadata, not only in the guest data.

Are you saying:
- the corruption is in the metadata and in the guest data
OR
- the corruption is only in the metadata
?

> If anything, virtio-scsi exercises more qcow2 code paths than virtio-blk, so any potential bug that affects virtio-blk should also affect virtio-scsi, but not the other way around.

I get that.

>> I agree with the answer Yaniv Kaul gave to me, saying I have to properly report the issue, so I'm longing to know which peculiar information I can give you now.
> To be honest, debugging corruption after the fact is pretty hard. We'd need the 'qemu-img check' output

Done.

> and ideally the image to do anything,

I remember some Red Hat people once gave me temporary access to put heavy files on some dedicated server. Is it still possible?

> but I can't promise that anything would come out of this.
> Best would be a reproducer, or at least some operation that you can link to the appearance of the corruption. Then we could take a more targeted look at the respective code.

Sure. Alas, I find no obvious pattern leading to corruption: on the guest side, it appeared with Windows 2003, 2008, 2012, and Linux CentOS 6 and 7. It appeared with virtio-blk; I changed some VMs to use virtio-scsi, but it's too soon to tell whether corruption appears in that case.

As I said, I'm using snapshots VERY rarely, and our versions are too old, so we do them the cold way only (VM shut down). So very safely.

The "weirdest" thing we do is to migrate VMs: you see how conservative we are!

>> As you can imagine, all this setup is in production, and for most of the VMs, I can not "play" with them. Moreover, we launched a campaign of nightly stopping every VM, qemu-img check them one by one, then boot. So it might take some time before I find another corrupted image. (which I'll preciously store for debug)
>> Other informations : We very rarely do snapshots, but I'm close to imagine that automated migrations of VMs could trigger similar behaviors on qcow2 images.
> To my knowledge, oVirt only uses external snapshots and creates them with QMP. This should be perfectly safe because from the perspective of the qcow2 image being snapshotted,
Re: [ovirt-users] qcow2 images corruption
On 08/02/2018 at 13:59, Yaniv Kaul wrote:
> On Feb 7, 2018 7:08 PM, "Nicolas Ecarnot" <nico...@ecarnot.net> wrote:
>> Hello,
>> TL; DR : qcow2 images keep getting corrupted. Any workaround?
>> Long version: This discussion has already been launched by me on the oVirt and on qemu-block mailing list, under similar circumstances but I learned further things since months and here are some informations :
>> - We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, using CentOS 7.{2,3} hosts
>> - Hosts :
>>   - CentOS 7.2 1511 :
>>     - Kernel = 3.10.0 327
>>     - KVM : 2.3.0-31
>>     - libvirt : 1.2.17
>>     - vdsm : 4.17.32-1
>>   - CentOS 7.3 1611 :
>>     - Kernel 3.10.0 514
>>     - KVM : 2.3.0-31
>>     - libvirt 2.0.0-10
>>     - vdsm : 4.17.32-1
> All are somewhat old releases. I suggest upgrading to the latest RHEL and qemu-kvm bits. Later on, upgrade oVirt.
> Y.

Hello Yaniv,

We could discuss for hours the fact that CentOS 7.3 was released in January 2017, and is thus not that old. We could also discuss for hours the gap between developers' will to push their freshest releases and the curb we, industry users, put on adopting such new versions.

In my case, the virtualization infrastructure is just one of the 30+ domains I have to master every day, and the more stable, the better.

In the setup described previously, the qemu qcow2 images were correct, then they were not. We did not change anything. We have to find a workaround and we need your expertise.

Not understanding the cause of the corruption exposes us to the same situation in oVirt 4.2.

--
Nicolas Ecarnot
[ovirt-users] qcow2 images corruption
Hello,

TL;DR: qcow2 images keep getting corrupted. Any workaround?

Long version:
I have already raised this discussion on the oVirt and qemu-block mailing lists, under similar circumstances, but I have learned more over the past months, and here is some information:

- We are using 2 oVirt 3.6.7.5-1.el7.centos datacenters, with CentOS 7.{2,3} hosts
- Hosts:
  - CentOS 7.2 1511:
    - Kernel: 3.10.0 327
    - KVM: 2.3.0-31
    - libvirt: 1.2.17
    - vdsm: 4.17.32-1
  - CentOS 7.3 1611:
    - Kernel: 3.10.0 514
    - KVM: 2.3.0-31
    - libvirt: 2.0.0-10
    - vdsm: 4.17.32-1
- Our storage is 2 Equallogic SANs connected via iSCSI on a dedicated network
- It varies from week to week, but all in all, there are around 32 hosts, 8 storage domains and, for various reasons, very few VMs (fewer than 200).
- One peculiar point is that most of our VMs are given an additional dedicated network interface that is iSCSI-connected to some volumes of our SAN, these volumes not being part of the oVirt setup. That could lead to a lot of additional iSCSI traffic.

From time to time, a random VM appears paused by oVirt. Digging into the oVirt engine logs, then into the host vdsm logs, it appears that the host considers the qcow2 image corrupted. In what I consider conservative behavior, vdsm stops any interaction with this image and marks it as paused. Any attempt to unpause it leads to the same conservative pause.

After having found (https://access.redhat.com/solutions/1173623) the right logical volume hosting the qcow2 image, I can run qemu-img check on it.
- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using "qemu-img check -r all".
- On 5% of them, I find Leaked cluster errors and further fatal errors, which cannot be corrected with qemu-img. In rare cases, qemu-img can correct them, but destroys large parts of the image (which becomes unusable), and in other cases it cannot correct them at all.

Months ago, I already sent a similar message, but the error message was about No space left on device (https://www.mail-archive.com/qemu-block@gnu.org/msg00110.html). This time, I don't have this message about space, only corruption.

I kept reading and found a similar discussion in the Proxmox group:
https://lists.ovirt.org/pipermail/users/2018-February/086750.html
https://forum.proxmox.com/threads/qcow2-corruption-after-snapshot-or-heavy-disk-i-o.32865/page-2

What I read that is similar to my case:
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the Proxmox thread, they tend to say that using virtio-scsi is the solution. I asked oVirt experts this question (https://lists.ovirt.org/pipermail/users/2018-February/086753.html), but it's not clear the driver is to blame.

I agree with the answer Yaniv Kaul gave me, saying I have to properly report the issue, so I'm eager to know which specific information I can give you now.

As you can imagine, all this setup is in production, and for most of the VMs, I cannot "play" with them. Moreover, we launched a campaign of nightly stopping every VM, running qemu-img check on them one by one, then booting them. So it might take some time before I find another corrupted image (which I'll carefully store for debugging).

Other information: we very rarely do snapshots, but I'm inclined to think that automated migrations of VMs could trigger similar behaviors on qcow2 images.

Last point about the versions we use: yes, that's old; yes, we're planning to upgrade, but we don't know when.

Regards,

--
Nicolas ECARNOT
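For the nightly check campaign described above, the three groups of results (no errors, leaked clusters only, real corruption) can be told apart from the exit status of `qemu-img check`, as documented in the qemu-img manual page. The helper below is only a sketch: its name and the labels it prints are made up for illustration.

```shell
# Hypothetical helper: turn a `qemu-img check` exit status into a short label.
# Exit codes follow the qemu-img manual page.
classify_qemu_img_check() {
    case "$1" in
        0)  echo "clean" ;;              # check completed, image is consistent
        1)  echo "check-failed" ;;       # check not completed (internal errors)
        2)  echo "corrupted" ;;          # check completed, image is corrupted
        3)  echo "leaks-only" ;;         # leaked clusters, but not corrupted
        63) echo "check-unsupported" ;;  # format does not support checks
        *)  echo "unknown" ;;
    esac
}

# Possible usage on a stopped VM (the path is a placeholder):
#   qemu-img check /dev/VG_NAME/LV_NAME; classify_qemu_img_check $?
```

Only the "leaks-only" case corresponds to the images that `qemu-img check -r all` repaired without damage in the report above; "corrupted" images are the ones worth keeping aside for debugging.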
Re: [ovirt-users] qemu-kvm images corruption
Hello,

On our two 3.6 DCs, we're still facing qcow2 corruptions, even on freshly installed VMs (CentOS7, win2012, win2008...).

(We are still hoping to find some time to migrate all this to 4.2, but it's a big job and our one-person team - me - is overwhelmed.)

My workaround is described in my previous thread below, but it's just a workaround.

Reading further, I found this: https://forum.proxmox.com/threads/qcow2-corruption-after-snapshot-or-heavy-disk-i-o.32865/page-2

There are many things I don't know or understand, and I'd like your opinion:
- Is "virtio" a synonym of "virtio-blk"?
- Is it true that the development of virtio-scsi is active and that of virtio has stopped?
- People in the Proxmox forum seem to say that no qcow2 corruption occurs when using IDE (not an option for me) nor when using virtio-scsi. Has anyone at Red Hat ever heard of this?
- Is converting all my VMs to use virtio-scsi a guarantee against further corruptions?
- Which driver do the oVirt devs recommend, even unofficially, in terms of future, development and stability?

Regards,

--
Nicolas ECARNOT

On 15/09/2017 at 14:06, Nicolas Ecarnot wrote:
> TL;DR: How to avoid image corruption?
>
> Hello,
> On two of our old 3.6 DCs, a recent series of VM migrations led to some issues:
> - I'm putting a host into maintenance mode
> - most of the VMs are migrating nicely
> - one remaining VM never migrates, and the logs are showing:
>   * engine.log : "...VM has been paused due to I/O error..."
>   * vdsm.log : "...Improbable extension request for volume..."
>
> After digging amongst the RH BZ tickets, I saved the day by:
> - stopping the VM
> - lvchange -ay the adequate /dev/...
> - qemu-img check [-r all] /rhev/blahblah
> - lvchange -an...
> - boot the VM
> - enjoy!
>
> Yesterday this worked for a VM where only one error occurred on the qemu image, and the repair was easily done by qemu-img.
> Today, facing the same issue on another VM, it failed because the errors were very numerous, and also because of this message:
>
>     [...]
>     Rebuilding refcount structure
>     ERROR writing refblock: No space left on device
>     qemu-img: Check failed: No space left on device
>     [...]
>
> The PV/VG/LV are far from being full, so I guess I don't know where to look.
> I tried many ways to solve it, but I'm not comfortable at all with qemu images, corruption and repair, so I ended up exporting this VM (to an NFS export domain) and importing it into another DC: this had the side effect of using qemu-img convert from qcow2 to qcow2, and (maybe?) of solving some errors???
> I also copied it into another qcow2 file the same qemu-img convert way, and that too leads to a clean qcow2 image without errors.
>
> I saw that on 4.x some bugs about VM migrations are fixed, but this is not the point here.
> I checked my SANs, my network layers, my blades, the OS (CentOS 7.2) of my hosts, but I see nothing special.
>
> The real reason behind my message is not to learn how to repair anything, but rather to understand what could have led to this situation. Where should I keep a keen eye?
>
> --
> Nicolas ECARNOT
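The manual repair sequence quoted above (activate the LV, check, optionally repair, deactivate) can be written down as a plan that is printed for review before anything is run. This is a sketch only: the function name is hypothetical, both paths are placeholders to be supplied by the operator, and it assumes the VM has already been stopped, as in the message.

```shell
# Hypothetical helper: print, in order, the commands of the manual repair
# described in the message. Nothing is executed; the output is meant to be
# reviewed and then run by hand.
print_repair_plan() {
    lv_device="$1"    # the /dev/... logical volume backing the disk
    image_path="$2"   # the /rhev/... path of the same image
    printf 'lvchange -ay %s\n' "$lv_device"
    printf 'qemu-img check %s\n' "$image_path"
    printf 'qemu-img check -r all %s\n' "$image_path"
    printf 'lvchange -an %s\n' "$lv_device"
}

# Possible usage (placeholder paths):
#   print_repair_plan /dev/VG_NAME/LV_NAME /rhev/PATH/TO/IMAGE
```

Printing rather than executing is a deliberate choice here: the `-r all` step modifies the image, so a read-only check first, and a copy of badly damaged images, seems prudent given the failures reported in this thread.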
Re: [ovirt-users] critical production issue for a vm
Le 06/12/2017 à 11:21, Nathanaël Blanchet a écrit : Hi all, I'm about to lose one very important vm. I shut down this vm for maintenance and then I moved the four disks to a new created lun. This vm has 2 snapshots. After successful move, the vm refuses to start with this message: Bad volume specification {u'index': 0, u'domainID': u'961ea94a-aced-4dd0-a9f0-266ce1810177', 'reqsize': '0', u'format': u'cow', u'bootOrder': u'1', u'discard': False, u'volumeID': u'a0b6d5cb-db1e-4c25-aaaf-1bbee142c60b', 'apparentsize': '2147483648', u'imageID': u'4a95614e-bf1d-407c-aa72-2df414abcb7a', u'specParams': {}, u'readonly': u'false', u'iface': u'virtio', u'optional': u'false', u'deviceId': u'4a95614e-bf1d-407c-aa72-2df414abcb7a', 'truesize': '2147483648', u'poolID': u'48ca3019-9dbf-4ef3-98e9-08105d396350', u'device': u'disk', u'shared': u'false', u'propagateErrors': u'off', u'type': u'disk'}. I tried to merge the snaphots, export , clone from snapshot, copy disks, or deactivate disks and every action fails when it is about disk. I began to dd lv group to get a new vm intended to a standalone libvirt/kvm, the vm quite boots up but it is an outdated version before the first snapshot. There is a lot of disks when doing a "lvs | grep 961ea94a" supposed to be disks snapshots. Which of them must I choose to get the last vm before shutting down? I'm not used to deal snapshot with virsh/libvirt, so some help will be much appreciated. Is there some unknown command to recover this vm into ovirt? Thank you in advance. ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users Beside specific oVirt answers, did you try to get informations about the snapshot tree with qemu-img info --backing-chain on the adequate /dev/... logical volume? As you know how to dd from LVs, you could extract every needed snapshots files and rebuild your VM outside of oVirt. Then take time to re-import it later and safely. 
-- Nicolas ECARNOT
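The qemu-img info --backing-chain suggestion above can be scripted. What follows is only a minimal sketch, assuming a qemu-img recent enough to support --output=json and assuming every logical volume in the chain has already been activated so that qemu-img can follow it. The device names in the sample are hypothetical, not taken from the thread.

```python
import json
import subprocess


def read_backing_chain(device):
    """Ask qemu-img for the whole image chain of `device`, parsed from JSON."""
    out = subprocess.check_output(
        ["qemu-img", "info", "--backing-chain", "--output=json", device]
    )
    return json.loads(out)


def summarize_chain(chain):
    """Return (filename, format, backing file) tuples, top layer first."""
    return [
        (img.get("filename"), img.get("format"), img.get("backing-filename"))
        for img in chain
    ]


# Hypothetical sample shaped like qemu-img's JSON output (not real data).
SAMPLE = json.loads("""
[
  {"filename": "/dev/vg_uuid/top_volume", "format": "qcow2",
   "virtual-size": 2147483648, "backing-filename": "base_volume"},
  {"filename": "/dev/vg_uuid/base_volume", "format": "qcow2",
   "virtual-size": 2147483648}
]
""")

for filename, fmt, backing in summarize_chain(SAMPLE):
    print("%s (%s) <- backing: %s" % (filename, fmt, backing))
```

The layer with no backing file is the base of the chain; the first entry is the layer given on the command line. read_backing_chain() is defined but deliberately not called here, since it needs a real device.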
Re: [ovirt-users] iSCSI multipathing missing tab
On 21/11/2017 at 15:21, Nicolas Ecarnot wrote:
Hello, oVirt 4.1.6.2-1.el7.centos. Under the datacenter section, I see no iSCSI multipathing tab. As I'm building this new DC, could this be because this DC is not yet initialized?

Self-replying (sorry, once again), for the record: https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/administration_guide/#Configuring_iSCSI_Multipathing
Prerequisites: "Ensure you have created an iSCSI storage domain and discovered and logged into all the paths to the iSCSI target(s)."
As usual: note to self, Read The Fine Manual...
-- Nicolas ECARNOT
[ovirt-users] iSCSI multipathing missing tab
Hello, oVirt 4.1.6.2-1.el7.centos Under the datacenter section, I see no iSCSI multipathing tab. As I'm building this new DC, could this be because this DC is not yet initialized? -- Nicolas ECARNOT
Re: [ovirt-users] Cannot remove snapshot
On 17/11/2017 at 16:38, Nicolas Ecarnot wrote:
- export the VM then re-import, if this is related to some LV space missing. Then removing the snapshot the usual way.

Self-replying, for the record: qemu-img check reported the backing image as full of errors. I exported the whole backing image plus snapshot image without committing the snapshot. I then imported it with committing, and it all went well. 4 hours of doubt.
-- Nicolas ECARNOT
[ovirt-users] Cannot remove snapshot
Hello, oVirt 3.6.7.5-1. I'm trying to remove a snapshot in a cold way (VM shut down). It is failing, and VDSM reports:
4f1588f3-ae2d-4702-b7e1-4ef53b5b5a1d::DEBUG::2017-11-17 13:04:11,448::lvm::290::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = ' WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!\n'; <rc> = 0
4f1588f3-ae2d-4702-b7e1-4ef53b5b5a1d::DEBUG::2017-11-17 13:04:11,456::lvm::462::Storage.LVM::(_reloadlvs) lvs reloaded
4f1588f3-ae2d-4702-b7e1-4ef53b5b5a1d::DEBUG::2017-11-17 13:04:11,456::lvm::462::Storage.OperationMutex::(_reloadlvs) Operation 'lvm reload operation' released the operation mutex
4f1588f3-ae2d-4702-b7e1-4ef53b5b5a1d::ERROR::2017-11-17 13:04:11,457::image::1302::Storage.Image::(merge) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 1293, in merge
    sdDom, srcVolParams, volParams, reqSize, chain)
  File "/usr/share/vdsm/storage/image.py", line 1039, in _baseCowVolumeMerge
    unsafe=False, rollback=True)
  File "/usr/share/vdsm/storage/volume.py", line 278, in rebase
    raise se.MergeSnapshotsError(self.volUUID)
MergeSnapshotsError: Error merging snapshots: ('4a8c17aa-5882-45a1-8a6e-40db39ed06ca',)

I read this: https://bugzilla.redhat.com/show_bug.cgi?id=1069610 , hoping I could find some workaround. But I couldn't. If there is no obvious workaround, would there be other ways, like:
- find the bare logical volume, shut down the VM, and play with low-level qemu-img commands. I think I know how to do that, but I'm worried the oVirt database won't be in sync once I've removed the snapshot
or
- export the VM then re-import, if this is related to some LV space missing. Then removing the snapshot the usual way.
Any advice (apart from the obvious upgrade-to-4.X-sir)?
-- Nicolas ECARNOT
Re: [ovirt-users] LVM structure
Hi Adam,

On 04/10/2017 at 16:48, Adam Litke wrote:
Sure. vdsm-tool should be disabling lvmetad on the host automatically. Maybe some of the hosts were fresh installed and others have been upgraded from older versions? In any case, you should be able to run on any host in maintenance mode: sudo vdsm-tool configure --force And this should edit the lvm.conf file to disable lvmetad globally and also prevent the lvmetad service from starting.

Sorry, but nope.
# vdsm-tool configure --force
Checking configuration status...
Current revision of multipath.conf detected, preserving
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts
Running configure...
Reconfiguration of sebool is done.
Reconfiguration of libvirt is done.
Done configuring modules to VDSM.
# grep use_lvmetad /etc/lvm/lvm.conf |grep -v '#'
use_lvmetad = 1

Actually, as you found a workaround, it's not a big deal, especially if this point has been fixed in versions later than 3.6.7. It's just to let people know.
-- Nicolas ECARNOT
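The grep above can be reproduced in a few lines when the same check has to run across many hosts. This is only a sketch: it reads lvm.conf-style text, ignores comments, and returns the last uncommented value for a key. The sample text is hypothetical; on a real host the text would come from /etc/lvm/lvm.conf.

```python
import re


def effective_setting(conf_text, key):
    """Return the last uncommented 'key = value' found in the text, or None."""
    value = None
    pattern = re.compile(r"^%s\s*=\s*(\S+)$" % re.escape(key))
    for line in conf_text.splitlines():
        # Drop anything after '#', then surrounding whitespace.
        stripped = line.split("#", 1)[0].strip()
        match = pattern.match(stripped)
        if match:
            value = match.group(1)
    return value


# Hypothetical excerpt, shaped like the 'global' section of lvm.conf.
SAMPLE = """
global {
    # use_lvmetad = 0
    use_lvmetad = 1
}
"""

print("use_lvmetad =", effective_setting(SAMPLE, "use_lvmetad"))
```

A value of 1 here corresponds to the situation reported above, where vdsm-tool configure --force left lvmetad enabled.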
Re: [ovirt-users] LVM structure
On 04/10/2017 at 15:30, Adam Litke wrote:
On Wed, Oct 4, 2017 at 4:12 AM Nicolas Ecarnot <nico...@ecarnot.net> wrote: Adam, TL;DR: You nailed it!
Great! Glad you're back up and running. One additional note about LVM commands. It's dangerous to use lvmetad for some commands while vdsm is running since it will not use lvmetad. You could end up with conflicting operations. In general it's safest to not issue any lvm commands while the host is activated but if you must, don't forget to disable lvmetad for all commands.

OK. Is it worth trying to understand why, amongst our 32 hosts in 2 DCs, all on the same versions (OS, vdsm, qemu packages...), some show they're using lvmetad and some do not?
-- Nicolas ECARNOT
Re: [ovirt-users] oVirt blog screenshots: paging jmarks
On 04/10/2017 at 09:39, ov...@fateknollogee.com wrote:
https://www.ovirt.org/blog/2017/09/introducing-ovirt-4.2.0/ https://www.ovirt.org/blog/2017/10/introducing-high-performance-vms/ @jmarks : can you include the full resolution of your screenshots from those 2 blog posts? Those screenshots are hard to see any detail

Hello, I'm using Firefox, and on every picture I right-click > View Image, and it shows in a really decent resolution.
-- Nicolas ECARNOT
Re: [ovirt-users] LVM structure
Adam, TL;DR: You nailed it!

On 03/10/2017 at 18:12, Adam Litke wrote:
Does this report an error on the host where you are having problems activating logical volumes? lvs -a -o +devices

On the hosts where I can't activate an LV, this command returns nothing interesting:
root@serv-hv-prd03:~# lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Devices
home cl -wi-ao 56,25g /dev/sda2(1024)
root cl -wi-ao 50,00g /dev/sda2(15423)
swap cl -wi-ao 4,00g /dev/sda2(0)
and the same goes for pvs and vgs.

Also, do the lvm commands succeed when you explicitly disable lvmetad, ie... lvchange --config 'global {use_lvmetad=0}' -ay ...

Disabling lvmetad usage allows the activation to succeed. Having understood that, I tried to run some usual LVM commands like pvs, vgs, lvs, pvscan, vgscan, lvscan, lvmdiskscan, and they all returned nearly empty answers (in short: only the local LVs). Having understood the role of lvmetad, I ran pvscan --cache, and all of a sudden it filled in the LVM information: I found all my oVirt LVM storage domains again, as I could see them on other hosts.

Things to note:
- trying to run a VM with an empty LVM cache was nonetheless successful
- before filling the lvmetad cache, I checked that this daemon was running, and it was.
-- Nicolas ECARNOT
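Since the advice in this thread is to disable lvmetad for every manual LVM command on a host, a small wrapper can help avoid forgetting the override. The sketch below only builds the argument list and prints it. The function name and the vg_name/lv_name placeholder are illustrative assumptions; the --config string is the one quoted in the thread.

```python
import shlex

# Override quoted in the thread: bypass lvmetad for this single command.
LVMETAD_OFF = "global {use_lvmetad=0}"


def lvm_command(tool, *args):
    """Build an argv list for an LVM tool with lvmetad disabled."""
    return [tool, "--config", LVMETAD_OFF] + list(args)


cmd = lvm_command("lvchange", "-ay", "vg_name/lv_name")
# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to subprocess.check_call() as is; no shell quoting is needed in that case.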
[ovirt-users] LVM structure
Hello, I'm still coping with my qemu image corruption, and I'm following some Red Hat guidelines that explain the way to go:
- Start the VM
- Identify the host
- On this host, run the ps command to identify the disk image location:
  # ps ax|grep qemu-kvm|grep vm_name
- Look for "-drive file=/rhev/data-center/0001-0001-0001-0001-033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75" (YMMV)
- Resolve this symbolic link:
  # ls -la /rhev/data-center/0001-0001-0001-0001-033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75
  lrwxrwxrwx 1 vdsm kvm 78 3 oct. 2016 /rhev/data-center/0001-0001-0001-0001-033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75 -> /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- Shut down the VM
- On the SPM, activate the logical volume:
  # lvchange -ay /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- Verify the state of the qemu image:
  # qemu-img check /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- If needed, attempt a repair:
  # qemu-img check -r all /dev/...
- In any case, deactivate the LV:
  # lvchange -an /dev/...

I followed these steps dozens of times, and finding the LV and activating it was obvious and successful. Since yesterday, I'm finding some VMs on which these steps are not working: I can identify the symbolic link, but neither the SPM nor the host is able to find the LV device, and thus cannot activate it:
# lvchange -ay /dev/de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d
Failed to find logical volume "de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d"
Either I need two more coffees, or I may be missing a step or something to check. Looking at the SPM /dev/disk/* structure, it looks very sound (I can see the dm-name-* series of links for my three storage domains). As the VM can be started and stopped without trouble, does the host activate something more before launching it?
-- Nicolas ECARNOT
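The manual procedure above can be sketched as two helpers. The first derives the /dev/<storage domain>/<volume> device from the /rhev/data-center image path; that mapping is inferred only from the symbolic link shown in the message and is assumed to hold for block storage domains. The second wraps activation, check and deactivation so that the LV is always deactivated again. The function names are illustrative, and the VM is assumed to be shut down.

```python
import subprocess


def lv_device_from_image_path(path):
    """Map .../<sd_uuid>/images/<image_uuid>/<volume_uuid> to /dev/<sd_uuid>/<volume_uuid>."""
    parts = path.rstrip("/").split("/")
    if len(parts) < 4 or parts[-3] != "images":
        raise ValueError("unexpected image path layout: %r" % path)
    sd_uuid, vol_uuid = parts[-4], parts[-1]
    return "/dev/%s/%s" % (sd_uuid, vol_uuid)


def check_image(device, repair=False):
    """Activate the LV, run qemu-img check, and always deactivate the LV.

    Returns qemu-img's exit code. Only meant for a VM that is shut down.
    """
    subprocess.check_call(["lvchange", "-ay", device])
    try:
        cmd = ["qemu-img", "check"]
        if repair:
            cmd += ["-r", "all"]
        return subprocess.call(cmd + [device])
    finally:
        subprocess.check_call(["lvchange", "-an", device])


# Path taken from the message above.
IMAGE_PATH = (
    "/rhev/data-center/0001-0001-0001-0001-033e/"
    "b72773dc-c99c-472a-9548-503c122baa0b/images/"
    "91bfb2b4-5194-4ab3-90c8-3c172959f712/"
    "e7174214-3c2b-4353-98fd-2e504de72c75"
)
print(lv_device_from_image_path(IMAGE_PATH))
```

check_image() is defined but not called here, because it needs root privileges and a real logical volume.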
[ovirt-users] qemu-kvm images corruption
TL;DR: How to avoid image corruption?

Hello, On two of our old 3.6 DCs, a recent series of VM migrations led to some issues:
- I'm putting a host into maintenance mode
- most of the VMs are migrating nicely
- one remaining VM never migrates, and the logs are showing:
  * engine.log: "...VM has been paused due to I/O error..."
  * vdsm.log: "...Improbable extension request for volume..."
After digging amongst the RH BZ tickets, I saved the day by:
- stopping the VM
- lvchange -ay the appropriate /dev/...
- qemu-img check [-r all] /rhev/blahblah
- lvchange -an...
- boot the VM
- enjoy!
Yesterday this worked for a VM where only one error had occurred on the qemu image, and the repair was easily done by qemu-img. Today, facing the same issue on another VM, it failed because the errors were very numerous, and also because of this message: [...] Rebuilding refcount structure ERROR writing refblock: No space left on device qemu-img: Check failed: No space left on device [...] The PV/VG/LV are far from being full, so I don't know where to look. I tried many ways to solve it, but I'm not comfortable at all with qemu images, their corruption and their repair, so I ended up exporting this VM (to an NFS export domain) and importing it into another DC: this had the side effect of running qemu-img convert from qcow2 to qcow2, and (maybe?) of fixing some errors. I also copied it into another qcow2 file with the same qemu-img convert method, and this too produced a clean qcow2 image without errors. I saw that on 4.x some bugs about VM migrations are fixed, but this is not the point here. I checked my SANs, my network layers, my blades, the OS (CentOS 7.2) of my hosts, but I see nothing special. The real reason behind my message is not so much to know how to repair anything as to understand what could have led to this situation. Where should I keep a keen eye?
-- Nicolas ECARNOT
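When the check step above is scripted, the exit status of qemu-img check says whether a repair is worth attempting. The sketch below maps the exit codes documented in the qemu-img manual page to a short description; the helper name and the decision rule (repair only for codes 2 and 3) are assumptions made for illustration, not a recommendation from the thread.

```python
# Exit codes of "qemu-img check", as documented in the qemu-img manual page.
CHECK_EXIT_CODES = {
    0: "check completed, image is consistent",
    1: "check not completed because of internal errors",
    2: "check completed, image is corrupted",
    3: "check completed, image has leaked clusters but is not corrupted",
    63: "checks are not supported by the image format",
}


def describe_check_result(exit_code):
    """Return (description, repair_candidate) for a qemu-img check exit code."""
    description = CHECK_EXIT_CODES.get(exit_code, "unknown exit code")
    # Assumption: only corruption (2) or leaks (3) justify trying "-r all".
    repair_candidate = exit_code in (2, 3)
    return description, repair_candidate


for code in (0, 2, 3):
    print(code, describe_check_result(code))
```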
Re: [ovirt-users] iSCSI Multipath issues
On 25/07/2017 at 10:26, Maor Lipchuk wrote:
Hi Vinícius, For some reason it looks like your networks are both connected to the same IPs.

Hi, Sorry to jump into this thread, but this issue concerns me too. Correct me if I'm wrong, but in this thread many people are using Equallogic SANs, which provide only one virtual IP to connect to.
-- Nicolas ECARNOT
Re: [ovirt-users] SQL : last time halted?
[For the record] Juan, Thanks to your hint, I eventually found it more convenient to use a SQL query to find out which VMs had been unused for months:

SELECT vm_static.vm_name, vm_dynamic.status, vm_dynamic.vm_ip, vm_dynamic.vm_host,
       vm_dynamic.last_start_time, vm_dynamic.vm_guid, vm_dynamic.last_stop_time
FROM public.vm_dynamic, public.vm_static
WHERE vm_dynamic.vm_guid = vm_static.vm_guid
  AND vm_dynamic.status = 0
ORDER BY vm_dynamic.last_stop_time ASC;

Thank you.
-- Nicolas ECARNOT

On 30/05/2017 at 17:29, Juan Hernández wrote:
On 05/30/2017 05:02 PM, Nicolas Ecarnot wrote: Hello, I'm trying to find a way to clean up the VMs list of my DCs. I think some of my users have created VMs they're not using anymore, but it's difficult to sort them out. In some cases, I can shut down some of them and wait. Is the date of the last VM shutdown stored somewhere in the db tables? Thank you.

Did you consider using the API? There is a 'stop_time' attribute that you can use. For example, to list all the VMs and sort them by stop time you can use the following Python script:

---8<---
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Create the connection to the server:
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='...',
    ca_file='/etc/pki/ovirt-engine/ca.pem'
)

# List the virtual machines:
vms_service = connection.system_service().vms_service()
vms = vms_service.list()

# Sort them by stop time:
vms.sort(key=lambda vm: vm.stop_time)

# Print the result:
for vm in vms:
    print("%s: %s" % (vm.name, vm.stop_time))

# Close the connection to the server:
connection.close()
--->8---
-- Nicolas ECARNOT
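One detail to keep in mind with the sort in the script above: under Python 3, comparing None with a datetime raises TypeError, so the sort would fail if some VM has no stop time. Whether the SDK really returns None in that case is an assumption here, not something stated in the thread. The sketch below shows a sort key that tolerates missing values, using a stand-in Vm record instead of real SDK objects.

```python
from collections import namedtuple
from datetime import datetime

# Stand-in for the SDK's VM objects: only the two attributes used here.
Vm = namedtuple("Vm", ["name", "stop_time"])


def oldest_stopped_first(vms):
    """Sort by stop_time ascending; VMs without a stop time are listed last."""
    return sorted(
        vms,
        key=lambda vm: (vm.stop_time is None, vm.stop_time or datetime.min),
    )


# Hypothetical data, for illustration only.
SAMPLE = [
    Vm("recent", datetime(2017, 5, 1, 12, 0)),
    Vm("never-stopped", None),
    Vm("forgotten", datetime(2016, 1, 15, 8, 30)),
]

for vm in oldest_stopped_first(SAMPLE):
    print("%s: %s" % (vm.name, vm.stop_time))
```

The same key function can be passed to vms.sort() in the quoted script.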
[ovirt-users] SQL : last time halted?
Hello, I'm trying to find a way to clean up the VMs list of my DCs. I think some of my users have created VMs they're not using anymore, but it's difficult to sort them out. In some cases, I can shut down some of them and wait. Is the date of the last VM shutdown stored somewhere in the db tables? Thank you. -- Nicolas ECARNOT
Re: [ovirt-users] oVirt 3.6 on CentOS 6.7 based HyperConverged DC : Upgradable?
On 29/03/2017 at 15:54, Yedidyah Bar David wrote:
On Wed, Mar 29, 2017 at 4:35 PM, Nicolas Ecarnot <nico...@ecarnot.net> wrote: [Please ignore the previous msg] Hello,

Hello Didi,

One of our DC is a very small one, though quite critical. It's almost hyper converged : hosts are compute+storage, but the engine is standalone.

And you intend to keep it that way? You didn't mention below.

I intend to keep it this way.

At first glance, I would go this way (feel free to comment) : - upgrade the OS of the engine : 6.7 -> 7.3

How? This isn't supported, in principle, although it might work.

Mmmm, yep, you're right. Many people documented it. I'm not very fond of experimenting on such a critical part.

The "official" way is using engine-backup to backup and restore it. See also: https://bugzilla.redhat.com/show_bug.cgi?id=1332463

OK, I sent you a question there.

During this upgrade, I have no constraint to keep everything running, total shutdown is acceptable.

So you can also do a full backup and restore.

I could. But I may run out of hosts at present.

Create an NFS export domain somewhere, export all your VMs there, recreate everything from scratch, import the VMs. Will take much longer, but then you don't need to risk/test/prepare for problems in upgrading gluster.

Well, so your overall opinion is that I should stick to KISS, correct?
-- Nicolas ECARNOT
[ovirt-users] oVirt 3.6 on CentOS 6.7 based HyperConverged DC : Upgradable?
[Please ignore the previous msg]
Hello, One of our DCs is a very small one, though quite critical. It's almost hyperconverged: hosts are compute+storage, but the engine is standalone. It's made of:
Hardware:
- one physical engine: CentOS 6.7
- 3 physical hosts: CentOS 7.2
Software:
- oVirt 3.6.5
- glusterFS 3.7.16 in replica-3, sharded.
The goal is to upgrade all this to oVirt 4.1.1, and also upgrade the OSes (oVirt 4.x is only available on CentOS 7.x). At present, only 3 VMs here are critical, and I have backups for them. Still, I'm quite nervous about the path I have to follow and its hazards, especially the gluster parts. At first glance, I would go this way (feel free to comment):
- upgrade the OS of the engine: 6.7 -> 7.3
- upgrade the OS of the hosts: 7.2 -> 7.3
- upgrade and manage the upgrade of gluster, check the volumes...
- upgrade oVirt (engine then hosts)
But when upgrading the OSes, I guess it will also upgrade the gluster layer. During this upgrade, I have no constraint to keep everything running; total shutdown is acceptable. Does the above procedure seem OK, or am I missing some essential points? Thank you.
-- Nicolas ECARNOT
[ovirt-users] oVirt 3.6 on CentOS 7.1 based HyperConverged DC : Upgradable?
Hello, One of our DCs is a very small one, though quite critical. It's almost hyperconverged: hosts are compute+storage, but the engine is standalone. It's made of:
Hardware:
- one physical engine: CentOS 7.1
- 3 physical hosts: CentOS 7.2
Software:
- oVirt 3.6.5
- glusterFS 3.7.16 in replica-3, sharded.
The goal is to upgrade all this to oVirt 4.1.1, and if possible also upgrade the OSes. At present, only 3 VMs here are critical, and I have backups for them. Still, I'm quite nervous about the path I have to follow and its hazards, especially the gluster parts. At first glance, I would go this way (feel free to comment):
- upgrade the OS of the engine: 7.1 -> 7.3
- upgrade the OS of the hosts: 7.2 -> 7.3
- upgrade and manage the upgrade of gluster, check the volumes...
- upgrade oVirt (engine then hosts)
But when upgrading the OSes, I guess it will also upgrade the gluster layer. During this upgrade, I have no constraint to keep everything running; total shutdown is acceptable. Does the above procedure seem OK, or am I missing some essential points? Thank you.
-- Nicolas ECARNOT
Re: [ovirt-users] Upgrade guide for oVirt 4.1.x?
On 27/03/2017 at 14:43, Yaniv Dary wrote:
This is the page covering minor releases: http://www.ovirt.org/documentation/upgrade-guide/chap-Updates_between_Minor_Releases/ Yaniv Dary Technical Product Manager Red Hat Israel Ltd. 34 Jerusalem Road Building A, 4th floor Ra'anana, Israel 4350109 Tel : +972 (9) 7692306 8272306 Email: yd...@redhat.com IRC : ydary

Hi Yaniv, Just a small note to say that on this page http://www.ovirt.org/documentation/upgrade-guide/upgrade-guide/ the third link ("Chapter 3...") points to the first one ("Chapter 1: Updating the oVirt Environment"). Regards,
-- Nicolas Ecarnot
Re: [ovirt-users] ovirt-engine failed to check for updates
On 01/02/2017 at 18:18, Nicolas Ecarnot wrote:
On 01/02/2017 at 17:37, Michael Watters wrote:
I have ovirt-engine 3.6 set up on a dedicated host which is managing two ovirt hosts. I am seeing errors when the engine attempts to check for updates as follows. Failed to check for available updates on host ovirt-node-production2 with message 'Command returned failure code 1 during SSH session 'root@ovirt-node-production2''. I checked the logs on the host and it appears to be an issue with yum.
2017-01-30 10:21:05 ERROR otopi.plugins.ovirt_host_mgmt.packages.update update.error:102 Yum: Cannot queue package ovirt-imageio-daemon: Package ovirt-imageio-daemon cannot be found
2017-01-30 10:21:05 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Package installation': Package ovirt-imageio-daemon cannot be found
Is there a way to resolve this? I don't see any package named ovirt-imageio-daemon in my repos when running a yum search.

Hello Michael, Last time we spoke about this issue, the culprit was IPv6, which had to be turned off (I'll let you search for the relevant posts).

OK Michael, your case is different. Anyway, for the record, I was referring to this: http://lists.ovirt.org/pipermail/users/2016-September/076113.html
cat /etc/sysctl.d/noipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
-- Nicolas ECARNOT
Re: [ovirt-users] ovirt-engine failed to check for updates
On 01/02/2017 at 17:37, Michael Watters wrote:
I have ovirt-engine 3.6 set up on a dedicated host which is managing two ovirt hosts. I am seeing errors when the engine attempts to check for updates as follows. Failed to check for available updates on host ovirt-node-production2 with message 'Command returned failure code 1 during SSH session 'root@ovirt-node-production2''. I checked the logs on the host and it appears to be an issue with yum.
2017-01-30 10:21:05 ERROR otopi.plugins.ovirt_host_mgmt.packages.update update.error:102 Yum: Cannot queue package ovirt-imageio-daemon: Package ovirt-imageio-daemon cannot be found
2017-01-30 10:21:05 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Package installation': Package ovirt-imageio-daemon cannot be found
Is there a way to resolve this? I don't see any package named ovirt-imageio-daemon in my repos when running a yum search.

Hello Michael, Last time we spoke about this issue, the culprit was IPv6, which had to be turned off (I'll let you search for the relevant posts).
cat /etc/sysctl.d/noipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
-- Nicolas ECARNOT
Re: [ovirt-users] Multipath handling in oVirt
On 01/02/2017 at 15:31, Yura Poltoratskiy wrote:
Here you are: iSCSI multipathing <https://dl.dropboxusercontent.com/u/106774860/iSCSI_multipathing.png> network setup of a host <https://dl.dropboxusercontent.com/u/106774860/host_network.png>
On 01.02.2017 at 15:31, Nicolas Ecarnot wrote: Hello, Before replying further, may I ask you, Yura, to post a screenshot of your iSCSI multipathing setup in the web GUI? And also the same for the network setup of a host? Thank you.

Thank you Yura. To Yaniv and Pavel: yes, this does lead to the oVirt iSCSI multipathing feature. I would be curious to see (on Yura's hosts, for instance) how the oVirt iSCSI multipathing translates into CLI output (multipath -ll, iscsiadm -m session -P3, dmsetup table, ...). Yura's setup seems to fit oVirt perfectly (2 NICs, 2 VLANs, 2 targets in different VLANs, iSCSI multipathing), but I'm trying to see how I could make this work with our Equallogic, which presents one and only one virtual IP (thus one target VLAN)...
-- Nicolas ECARNOT
Re: [ovirt-users] Multipath handling in oVirt
Hello, Before replying further, may I ask you, Yura, to post a screenshot of your iSCSI multipathing setup in the web GUI? And also the same for the network setup of a host? Thank you.
-- Nicolas ECARNOT

On 01/02/2017 at 13:14, Yura Poltoratskiy wrote:
Hi, As for me personally, I have this config: compute nodes with 4x1G NICs, storage nodes with 2x1G NICs, and 2 switches (not stackable). All servers run CentOS 7.X (7.3 at this moment). On compute nodes I have a bond of nic1 and nic2 (attached to different switches) for the mgmt and VM networks, and the other two NICs, nic3 and nic4, without bonding (also attached to different switches). On storage nodes I have no bonding, and nic1 and nic2 are connected to different switches. I have two networks for iSCSI: 10.0.2.0/24 and 10.0.3.0/24; nic1 of the storage nodes and nic3 of the compute nodes are connected to one network, nic2 of the storage nodes and nic4 of the compute nodes to the other one. In the web UI I've created networks iSCSI1 and iSCSI2 for nic3 and nic4, and also created the multipath. To have active/active links with double bandwidth, I've added 'path_grouping_policy "multibus"' in the defaults section of /etc/multipath.conf. After all of that, I have 200+MB/sec throughput to the storage (like raid0 with 2 SATA HDDs) and I can lose one NIC/link/switch without stopping VMs.

[root@compute02 ~]# multipath -ll
360014052f28c9a60 dm-6 LIO-ORG ,ClusterLunHDD
size=902G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 6:0:0:0 sdc 8:32 active ready running
  `- 8:0:0:0 sdf 8:80 active ready running
36001405551a9610d09b4ff9aa836b906 dm-40 LIO-ORG ,SSD_DOMAIN
size=915G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 7:0:0:0 sde 8:64 active ready running
  `- 9:0:0:0 sdh 8:112 active ready running
360014055eb8d30a91044649bda9ee620 dm-5 LIO-ORG ,ClusterLunSSD
size=135G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 6:0:0:1 sdd 8:48 active ready running
  `- 8:0:0:1 sdg 8:96 active ready running

[root@compute02 ~]# iscsiadm -m session
tcp: [1] 10.0.3.200:3260,1 iqn.2015-09.lab.lnx-san:storage (non-flash)
tcp: [2] 10.0.3.203:3260,1 iqn.2016-10.local.ntu:storage3 (non-flash)
tcp: [3] 10.0.3.200:3260,1 iqn.2015-09.lab.lnx-san:storage (non-flash)
tcp: [4] 10.0.3.203:3260,1 iqn.2016-10.local.ntu:storage3 (non-flash)

[root@compute02 ~]# ip route show | head -4
default via 10.0.1.1 dev ovirtmgmt
10.0.1.0/24 dev ovirtmgmt proto kernel scope link src 10.0.1.102
10.0.2.0/24 dev enp5s0.2 proto kernel scope link src 10.0.2.102
10.0.3.0/24 dev enp2s0.3 proto kernel scope link src 10.0.3.102

[root@compute02 ~]# brctl show ovirtmgmt
bridge name bridge id STP enabled interfaces
ovirtmgmt 8000.000475b4f262 no bond0.1001

[root@compute02 ~]# cat /proc/net/bonding/bond0 | grep "Bonding\|Slave Interface"
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: fault-tolerance (active-backup)
Slave Interface: enp4s6
Slave Interface: enp6s0

On 01.02.2017 at 12:50, Nicolas Ecarnot wrote:
Hello, I'm starting over on this subject because I wanted to clarify what the oVirt way to manage multipathing is. (Here I will talk only about the data/iSCSI/SAN/LUN/you name it networks.) According to what I see in the host network setup, one can assign *ONE* data network to an interface or to a group of interfaces. That implies that if my host has two data-dedicated interfaces, I can
- either group them using bonding (and oVirt is handy for that in the host network setup), then assign the data virtual network to this bond
- or give each NIC a different IP, each in a different VLAN, create two different data networks, and assign one to each NIC. I never played this game and don't know where it leads.
First, may the oVirt storage experts comment on the above to check that it's OK. Then, like many users here, our hardware is this:
- Hosts: Dell PowerEdge, mostly blades (M610, 620, 630), or rack servers
- SANs: Equallogic PS4xxx and PS6xxx
Equallogic's recommendation is that bonding is evil for iSCSI access. To them, multipath is the only true way. After reading tons of docs and talking to Dell support, everything tells me to use at least two different NICs with different IPs, not bonded; using the same network is bad but OK. How can oVirt handle that?
-- Nicolas ECARNOT
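To compare what the web UI configuration produces on a host, the output of iscsiadm -m session can be grouped by target. The sketch below parses lines in the format shown in the message above and counts sessions and distinct portal addresses per target. The regular expression is an assumption based only on those sample lines (IPv4 portals); it says nothing about which NIC each session uses, for which iscsiadm -m session -P3 would be needed.

```python
import re
from collections import defaultdict

# Matches lines such as:
# tcp: [1] 10.0.3.200:3260,1 iqn.2015-09.lab.lnx-san:storage (non-flash)
SESSION_RE = re.compile(
    r"^\w+: \[\d+\] (?P<ip>[^:\s]+):(?P<port>\d+),\d+ (?P<target>\S+)"
)


def portals_by_target(text):
    """Return {target IQN: [portal IP of each session]} from iscsiadm output."""
    sessions = defaultdict(list)
    for line in text.splitlines():
        match = SESSION_RE.match(line.strip())
        if match:
            sessions[match.group("target")].append(match.group("ip"))
    return dict(sessions)


# Output copied from the message above.
SAMPLE = """\
tcp: [1] 10.0.3.200:3260,1 iqn.2015-09.lab.lnx-san:storage (non-flash)
tcp: [2] 10.0.3.203:3260,1 iqn.2016-10.local.ntu:storage3 (non-flash)
tcp: [3] 10.0.3.200:3260,1 iqn.2015-09.lab.lnx-san:storage (non-flash)
tcp: [4] 10.0.3.203:3260,1 iqn.2016-10.local.ntu:storage3 (non-flash)
"""

for target, ips in sorted(portals_by_target(SAMPLE).items()):
    print("%s: %d sessions, portals %s" % (target, len(ips), sorted(set(ips))))
```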
[ovirt-users] Multipath handling in oVirt
Hello, I'm starting over on this subject because I wanted to clarify what the oVirt way to manage multipathing is. (Here I will talk only about the data/iSCSI/SAN/LUN/you name it networks.) According to what I see in the host network setup, one can assign *ONE* data network to an interface or to a group of interfaces. That implies that if my host has two data-dedicated interfaces, I can
- either group them using bonding (and oVirt is handy for that in the host network setup), then assign the data virtual network to this bond
- or give each NIC a different IP, each in a different VLAN, create two different data networks, and assign one to each NIC. I never played this game and don't know where it leads.
First, may the oVirt storage experts comment on the above to check that it's OK. Then, like many users here, our hardware is this:
- Hosts: Dell PowerEdge, mostly blades (M610, 620, 630), or rack servers
- SANs: Equallogic PS4xxx and PS6xxx
Equallogic's recommendation is that bonding is evil for iSCSI access. To them, multipath is the only true way. After reading tons of docs and talking to Dell support, everything tells me to use at least two different NICs with different IPs, not bonded; using the same network is bad but OK. How can oVirt handle that?
-- Nicolas ECARNOT
Re: [ovirt-users] VMWare VSAN like setup with oVirt
On 31/01/2017 at 09:15, Anantha Raghava wrote:
Hi, We are trying to create a setup that uses the internal disks of the hosts / nodes, yet provides high availability, replication and failover using oVirt. The setup we are trying to build is close to VMWare VSAN, which allows all of the above using just the internal disks of the ESXi servers. Can we achieve something similar with oVirt and Gluster?

Absolutely. One of our oVirt setups is done this way. Three hosts are set up as glusterFS servers (replica-3), as well as oVirt nodes. We chose to add a fourth host as a standalone engine, but you can choose to use a VM for that (hyperconverged setup). I have no experience with a similar setup using an arbitrary number of nodes, nor do I know whether it is achievable (some kind of network RAID-10?).
-- Nicolas ECARNOT
Re: [ovirt-users] guest often looses connectivity I have to ping gateway
On 26/01/2017 at 09:03, Gianluca Cecchi wrote:
On Thu, Jan 26, 2017 at 8:45 AM, Pavel Gashev <p...@acronis.com> wrote: Gianluca, It looks like VM doesn't receive broadcasts. It can be a network topology issue. Could you double check /sys/class/net/bond1/bonding/mode and /sys/class/net/bond1/bonding/slaves ? Is it possible you have another VM with the same MAC address in the same network segment?
Pavel, I think you are right! Thanks! I didn't take into consideration that there is another oVirt environment that has some VMs on this vlan.. And I found a VM with the same mac 00:1a:4a:16:01:51 (and a different ip) Now I powered off that other VM, restarted my one and things seem ok. What is the best way to manage when more oVirt environments has VMs on the same vlans?

I encountered the same problem some years ago, as we have multiple oVirt environments. We decided to assign a specific MAC pool to each environment to avoid overlapping.
-- Nicolas ECARNOT
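When several environments share VLANs, a quick way to confirm that their MAC pools are disjoint is to compare the ranges numerically. The sketch below is only an illustration: the ranges are hypothetical, and each environment's real range has to be read from its own MAC pool configuration.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer for easy comparison."""
    return int(mac.replace(":", ""), 16)


def ranges_overlap(range_a, range_b):
    """Return True if two (first MAC, last MAC) ranges share any address."""
    a_lo, a_hi = sorted(mac_to_int(mac) for mac in range_a)
    b_lo, b_hi = sorted(mac_to_int(mac) for mac in range_b)
    return a_lo <= b_hi and b_lo <= a_hi


# Hypothetical pools for two environments on the same VLAN.
ENV_A = ("00:1a:4a:16:01:00", "00:1a:4a:16:01:ff")
ENV_B = ("00:1a:4a:16:01:80", "00:1a:4a:16:02:7f")
ENV_C = ("00:1a:4a:16:03:00", "00:1a:4a:16:03:ff")

print("A/B overlap:", ranges_overlap(ENV_A, ENV_B))
print("A/C overlap:", ranges_overlap(ENV_A, ENV_C))
```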
Re: [ovirt-users] [PySDK v3] Choose storage domain
On 24/01/2017 at 13:18, Nicolas Ecarnot wrote: OK, just one second before sending this e-mail, I made a quick test with the template object and it is working anyway. Sorry for the noise, but... No, actually, it wasn't working. I need to start from the api object to reach the actual disks. -- Nicolas ECARNOT
Re: [ovirt-users] [PySDK v3] Choose storage domain
Juan, Thank you very much for your help, this is working. Some comments below. On 24/01/2017 at 11:04, Juan Hernández wrote: In order to do that you need to specify that you want to clone the disks of the template, and for each disk you need to specify the storage domain where you want to create it. This is what I feared, and it was not really obvious at first sight (having to prepare a list of disks...). I think I saw it was less weird in V4. [...] Also, please be careful when specifying the cluster and the template. You are currently doing this: cluster=vm_cluster, template=vm_template, Not sure how you are assigning the values to those 'vm_cluster' and 'vm_template' variables, but you are probably doing this: vm_cluster = api.clusters.get(name='mycluster') vm_template = api.templates.get(name='mytemplate') Here is what I was doing (please don't laugh) : c_list = api.clusters.list() # At present (2017), each datacenter contains only one cluster vm_cluster = c_list[0] vm_template = api.templates.get(name=template_name) That combination isn't ideal, because you will be sending with the 'add' request the complete representation of the cluster and the template, when the server only needs the id or the name. Consider doing this instead: cluster=params.Cluster( id=vm_cluster.get_id() ), template=params.Template( id=vm_template.get_id() ), Is it right to say that this last method is only valid in the api context, and would not work outside of the api object scope? I mean : if I want to use your method, I can no longer prepare a vm_params object separately before calling vms.add, right? OK, just one second before sending this e-mail, I made a quick test with the template object and it is working anyway. Does it mean nothing is instantiated before the api.vms.add call? -- Nicolas ECARNOT
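To make the "clone the disks and give each disk a storage domain" structure concrete, here is a sketch that builds the equivalent request body as XML with the standard library only, so it runs without the SDK. The element names mirror what I understand the v3 API representation to be (disks / clone / disk / storage_domains / storage_domain); treat them as an assumption to check against the API documentation. The function name and all ids are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_clone_request(vm_name, cluster_id, template_id, disk_ids, storage_domain_id):
    """Build a <vm> element asking for every template disk to be cloned
    onto the given storage domain. Only ids are sent for the cluster and
    the template, not their complete representation."""
    vm = ET.Element("vm")
    ET.SubElement(vm, "name").text = vm_name
    ET.SubElement(vm, "cluster", id=cluster_id)
    ET.SubElement(vm, "template", id=template_id)

    disks = ET.SubElement(vm, "disks")
    ET.SubElement(disks, "clone").text = "true"
    for disk_id in disk_ids:
        # One <disk> per template disk, each pointing at the target domain.
        disk = ET.SubElement(disks, "disk", id=disk_id)
        domains = ET.SubElement(disk, "storage_domains")
        ET.SubElement(domains, "storage_domain", id=storage_domain_id)
    return vm


if __name__ == "__main__":
    # Placeholder ids; real ones would come from the template's disk list
    # and from the target storage domain.
    body = build_clone_request("myvm", "cluster-id", "template-id",
                               ["disk-id-1", "disk-id-2"], "sd-id")
    print(ET.tostring(body).decode())
```

The list of disk ids is the part that has to be prepared beforehand, by listing the disks of the template.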
[ovirt-users] [PySDK v3] Choose storage domain
Hello, When trying to create a VM by cloning a template, I found out I couldn't choose the target storage domain : [...] vm_storage_domain = api.storagedomains.get(name=storage_domain) vm_params = params.VM(name=vm_name, memory=vm_memory, cluster=vm_cluster, template=vm_template, os=vm_os, storage_domain=vm_storage_domain, ) try: api.vms.add(vm=vm_params) [...] I'm getting : Failed to create VM from Template: status: 400 reason: Bad Request detail: Cannot add VM. The selected Storage Domain does not contain the VM Template. ... which I know, but I thought that, as with the GUI, I could specify the target storage domain. I made my homework, and I found a nice answer from Juan : http://lists.ovirt.org/pipermail/users/2016-January/037321.html but this relates to snapshots, and not to template usage, so I'm still stuck. May I ask an advice? -- Nicolas ECARNOT ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] oVirt Python SDK on Ubuntu
On 23/01/2017 at 15:05, Ondra Machacek wrote: Alas, though we already have one DC in V4, most of our production DCs will stay in V3 for another year, and I have to maintain them. So far, I have no clue how to add ovirtsdk v3 to my Ubuntu. You just need to specify which version you would like to install, in your case: easy_install ovirt-engine-sdk-python==3.6.9.2 Nice, this is working! Thank you. -- Nicolas ECARNOT
[ovirt-users] oVirt Python SDK on Ubuntu
Hello, I'm trying to follow http://www.ovirt.org/develop/release-management/features/infra/python-sdk/ and I'm successfully discovering Python + oVirt SDK on CentOS. I'd like to do the same on Ubuntu, but the instructions seem incomplete : " easy_install ovirt-engine-sdk-python " is working, but "import ovirtsdk" doesn't give anything. " apt-get install python-lxml cd ovirt-engine-sdk python setup.py install " is wrong because the "cd" isn't going anywhere, obviously. What am I missing? -- Nicolas ECARNOT
Re: [ovirt-users] Delay difference between queries (Python vs REST)
On 17/01/2017 at 16:26, Juan Hernández wrote: On 01/17/2017 03:56 PM, Nicolas Ecarnot wrote: Hello, On a 3.6.5 DC, I'm trying to figure out how many VMs there are, using two methods : Python SDK : from ovirtsdk.xml import params from ovirtsdk.api import API api = API(url='https://engine.fqdn/ovirt-engine/api', username='admin@internal', password='xxx', insecure=True) print len(api.vms.list()) time ./getMvm.py 62 real 0m23.016s user 0m22.288s sys 0m0.054s REST : time curl -H "Version: 3" -H "Prefer: persistent-auth" -H "Filter: false" -H "Accept: application/xml" -H "Content-Type: application/xml" -k -u 'admin@internal:xxx' https://engine.fqdn/ovirt-engine/api/vms (Then grep or anything that would get the values from the xml returned.) real 0m0.383s user 0m0.036s sys 0m0.038s I am a beginner in both methods, but I would prefer to play with Python. I'm very surprised to have to wait more than 20 seconds to get an answer. Looking at the engine log, I see that the authentication part is finished after say 3 seconds, then 20 seconds with absolutely no error message, no CPU load, no RAM burst, no nothing. On the SPM, there is exactly triple null nothing nada niet void that would obviously explain such a delay. I'm wondering if this super hyper sluggishness is somewhat related to the global GUI slowness that I, like other users, have been experiencing since we left 3.2.x, and I would love some oVirt ninja to use the comparison above to tell which parts of oVirt are used or not that could explain such a difference (database, access to SPM, LVM, network access, whatever...) -- Nicolas ECARNOT The performance problem is inside version 3 of the Python SDK. That is one of the reasons that we had to do a new version of the Python SDK for version 4 of the engine.
If you are using version 4 of the engine then you can use version 4 of the SDK: https://github.com/oVirt/ovirt-engine-sdk/tree/master/sdk https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/list_vms.py It should be much faster. Would be nice if you can repeat your test and report the results. Hello Juan, Indeed, you were right. I tried the same from a recent server with a recent SDK, and I'll let you have a look : # rpm -q python-ovirt-engine-sdk4 python-ovirt-engine-sdk4-4.0.1-1.el7.centos.x86_64 # time ./getMvm.py 62 real 0m1.004s user 0m0.234s sys 0m0.031s And repeating the same test gives a very decent average, so thank you. -- Nicolas ECARNOT
[ovirt-users] Delay difference between queries (Python vs REST)
Hello, On a 3.6.5 DC, I'm trying to figure out how many VMs there are, using two methods : Python SDK : from ovirtsdk.xml import params from ovirtsdk.api import API api = API(url='https://engine.fqdn/ovirt-engine/api', username='admin@internal', password='xxx', insecure=True) print len(api.vms.list()) time ./getMvm.py 62 real 0m23.016s user 0m22.288s sys 0m0.054s REST : time curl -H "Version: 3" -H "Prefer: persistent-auth" -H "Filter: false" -H "Accept: application/xml" -H "Content-Type: application/xml" -k -u 'admin@internal:xxx' https://engine.fqdn/ovirt-engine/api/vms (Then grep or anything that would get the values from the xml returned.) real 0m0.383s user 0m0.036s sys 0m0.038s I am a beginner in both methods, but I would prefer to play with Python. I'm very surprised to have to wait more than 20 seconds to get an answer. Looking at the engine log, I see that the authentication part is finished after say 3 seconds, then 20 seconds with absolutely no error message, no CPU load, no RAM burst, no nothing. On the SPM, there is exactly triple null nothing nada niet void that would obviously explain such a delay. I'm wondering if this super hyper sluggishness is somewhat related to the global GUI slowness that I, like other users, have been experiencing since we left 3.2.x, and I would love some oVirt ninja to use the comparison above to tell which parts of oVirt are used or not that could explain such a difference (database, access to SPM, LVM, network access, whatever...) -- Nicolas ECARNOT
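For the "grep or anything" step of the REST method, counting the entries in the returned XML takes only a few lines of standard-library Python. This is a sketch that assumes the response is a vms root element with one vm child per machine; the sample document is invented and much smaller than a real answer.

```python
import xml.etree.ElementTree as ET

# Invented, trimmed-down sample of what the /api/vms call is assumed to return.
SAMPLE = """<vms>
  <vm id="1"><name>vm-a</name></vm>
  <vm id="2"><name>vm-b</name></vm>
  <vm id="3"><name>vm-c</name></vm>
</vms>"""


def count_vms(xml_text):
    """Count the direct <vm> children of the root element."""
    root = ET.fromstring(xml_text)
    return len(root.findall("vm"))


def vm_names(xml_text):
    """Return the text of each VM's <name> element, in document order."""
    root = ET.fromstring(xml_text)
    return [vm.findtext("name") for vm in root.findall("vm")]


if __name__ == "__main__":
    print(count_vms(SAMPLE))
```

In practice the XML text would be the output of the curl command above, saved to a file or piped in, rather than the inline sample.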
Re: [ovirt-users] Ovirt host activation and lvm looping with high CPU load trying to mount iSCSI storage
Hi, As we are using a very similar hardware and usage as Mark (Dell poweredge hosts, Dell Equallogic SAN, iSCSI, and tons of LUNs for all those VMs), I'm jumping into this thread. Le 12/01/2017 à 16:29, Yaniv Kaul a écrit : While it's a bit of a religious war on what is preferred with iSCSI - network level bonding (LACP) or multipathing on the iSCSI level, I'm on the multipathing side. The main reason is that you may end up easily using just one of the paths in a bond - if your policy is not set correct on how to distribute connections between the physical links (remember that each connection sticks to a single physical link. So it really depends on the hash policy and even then - not so sure). With iSCSI multipathing you have more control - and it can also be determined by queue depth, etc. (In your example, if you have SRC A -> DST 1 and SRC B -> DST 1 (as you seem to have), both connections may end up on the same physical NIC.) If we reduce the number of storage domains, we reduce the number of devices and therefore the number of LVM Physical volumes that appear in Linux correct? At the moment each connection results in a Linux device which has its own queue. We have some guests with high IO loads on their device whilst others are low. All the storage domain / datastore sizing guides we found seem to imply it’s a trade-off between ease of management (i.e not having millions of domains to manage), IO contention between guests on a single large storage domain / datastore and possible wasted space on storage domains. If you have further information on recommendations, I am more than willing to change things as this problem is making our environment somewhat unusable at the moment. I have hosts that I can’t bring online and therefore reduced resiliency in clusters. They used to work just fine but the environment has grown over the last year and we also upgraded the Ovirt version from 3.6 to 4.x. 
We certainly had other problems, but host activation wasn’t one of them and it’s a problem that’s driving me mad. I would say that each path has its own device (and therefore its own queue). So I'd argue that you may want to have (for example) 4 paths to each LUN or perhaps more (8?). For example, with 2 NICs, each connecting to two controllers, each controller having 2 NICs (so no SPOF and nice number of paths). Here is one key point I have been trying (to no avail) to discuss with Red Hat people for years; either I did not understand, or I wasn't clear enough, or Red Hat people answered that they owned no Equallogic SAN to test it : My (and maybe many others') Equallogic SAN has two controllers, but is publishing only *ONE* virtual IP address. On one of our other SANs, an EMC publishing *TWO* IP addresses, which can be placed in two different subnets, I fully understand the benefits and workings of multipathing (and even in the same subnet, our oVirt setup is happily using multipath). But on one of our oVirt setups using the Equallogic SAN, we have no choice but to point our hosts' iSCSI interfaces to one single SAN IP, so no multipath here. At this point, we saw no other means than using bonding mode 1 to reach our SAN, which is terrible for storage experts. To come back to Mark's story, we are still using 3.6.5 DCs and planning to upgrade. Reading all this is making me delay this step. -- Nicolas ECARNOT
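The path count in the example above (2 host NICs, two controllers with 2 NICs each) can be enumerated explicitly. The sketch below assumes every host NIC can reach every controller port, which gives 2 x 2 x 2 = 8 paths; the interface and controller names are placeholders.

```python
from itertools import product


def enumerate_paths(host_nics, controllers, ports_per_controller):
    """List every (host NIC, controller port) pair, i.e. every iSCSI path,
    assuming full connectivity between host NICs and controller ports."""
    targets = [
        "%s/port%d" % (controller, port)
        for controller in controllers
        for port in range(ports_per_controller)
    ]
    return list(product(host_nics, targets))


if __name__ == "__main__":
    paths = enumerate_paths(["eth2", "eth3"], ["ctrl-a", "ctrl-b"], 2)
    for nic, target in paths:
        print("%s -> %s" % (nic, target))
    print("%d paths per LUN" % len(paths))
```

Each of these paths appears on the host as its own block device with its own queue, which is why the number of LUNs multiplied by the number of paths drives the device count.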
Re: [ovirt-users] Request for oVirt Ansible modules testing feedback
Hello, On 04/01/2017 at 11:49, Nathanaël Blanchet wrote: On 04/01/2017 at 10:09, Andrea Ghelardi wrote: Personally I don’t think ansible and ovirt-shell are mutually exclusive. Those who are in ansible and devops realms are not really scared by making python/ansible work with ovirt. From what I gather, playbooks are quite a de-facto pre-requisite to build up a real SaaC “Software as a Code” environment. On the other hand, ovirt-shell can be, and is, a fast/easy way to perform “normal daily tasks”. Totally agree, but ovirt-shell is deprecated in 4.1 and will be removed in 4.2. Ansible or sdk4 are proposed as an alternative. Could someone point me to a URL where sdk4 is fully documented, as I have to get ready for the ovirt-shell deprecation? I'm sure no one at Red Hat thought about deprecating a tool in favor of a new one before providing complete user documentation! -- Nicolas ECARNOT
Re: [ovirt-users] Windows Product Activation
On 22/12/2016 at 22:06, Michal Skrivanek wrote: I read more about this WPA issue, and I also checked : all our licences are of the MAK_B kind, which, from everything I read, should not induce such WPA trouble once they are correctly registered (which I obviously take care of). I also read the list of components that are checked to create a hashed key linked to the licence. As you wrote, changing too many components triggers a validity break. Knowing this, may I ask you to comment on the promising "VM Custom Serial Number" Alex was talking about : it sounded perfect, but is it eventually not enough to cope with the hardware change? That’s what it is for. Does it not work for this case? I still have additional tests to do to validate it. Moreover, as this WPA issue is triggered after 30 days, this kind of test takes quite a lot of time. Stay tuned. PS : As usual, I'm very thankful to all who replied, and more generally to everyone on this mailing list for your help throughout the year. -- Nicolas ECARNOT
Re: [ovirt-users] Windows Product Activation
On 22/12/2016 at 17:26, Yaniv Kaul wrote: Windows activation, at least for 2008 and below, depends on enough hardware changes happening. Each HW (of non-pluggable devices) change is a single 'penalty' point - except for NIC (based on MAC address) which is more. 4 or so points - and it requires re-activation. This does not apply to KMS licenses. So unless you drastically change the hardware, you should be safe. Y. Hello Yaniv, When migrating, these VMs can jump from a host with recent hardware to an older one, with a different generation of CPU (though of the same Intel kind). I read more about this WPA issue, and I also checked : all our licences are of the MAK_B kind, which, from everything I read, should not induce such WPA trouble once they are correctly registered (which I obviously take care of). I also read the list of components that are checked to create a hashed key linked to the licence. As you wrote, changing too many components triggers a validity break. Knowing this, may I ask you to comment on the promising "VM Custom Serial Number" Alex was talking about : it sounded perfect, but is it eventually not enough to cope with the hardware change? -- Nicolas ECARNOT
Re: [ovirt-users] Windows Product Activation
On 21/12/2016 at 16:13, Tom Gamull wrote: My “not relevant” response may be relevant now. Snapshot that server before and run your tests over and over. If you hit the limit you can restore the snapshot. That’s what I was trying to explain. If you hit the rearm limit without a backup to restore you are going to be in a tough place. You're totally right, at first I didn't understand where you were going with this. Indeed, this sounds like a perfect time to use snapshots. Thanks Tom. Nicolas ECARNOT Tom Gamull On Dec 21, 2016, at 10:11 AM, Nicolas Ecarnot <nico...@ecarnot.net> wrote: On 21/12/2016 at 16:04, Tom Gamull wrote: Are there any events in the event log (usually Application Log entries With a test server, I'm trying to forcibly reproduce the issue, so I'll tell you soon. under Source: Software Licensing Service) similar to this (this error is unrelated, just example of event log) - https://support.microsoft.com/en-us/kb/921471 I would consider reporting this to Microsoft, I am unaware of 2008 R2 having this behavior (I have seen 2008 R2 used on KVM and libvirt for openstack without issue and be migrated). Tom Gamull On Dec 21, 2016, at 9:48 AM, Nicolas Ecarnot <nico...@ecarnot.net> wrote: On 21/12/2016 at 15:36, Tom Gamull wrote: Under Edit Virtual Machine -> System -> (Advanced Parameters) there is a Custom CPU Type you may be able to set, are all the hosts in the same cluster? On every VM we use the cluster default setting. And on all our DCs we use the same CPU setting. -- Nicolas ECARNOT
Re: [ovirt-users] Windows Product Activation
Le 21/12/2016 à 16:04, Tom Gamull a écrit : Are there any events in the event log (usually Application Log entries With a test server, I'm trying to forcibly reproduce the issue, so I'll tell you soon. under Source: Software Licensing Service) similar to this (this error is unrelated, just example of event log) - https://support.microsoft.com/en-us/kb/921471 I would consider reporting this to Microsoft, I am unaware of 2008 R2 having this behavior (I have seen 2008 R2 used on KVM and libvirt for openstack without issue and be migrated). Tom Gamull On Dec 21, 2016, at 9:48 AM, Nicolas Ecarnot <nico...@ecarnot.net <mailto:nico...@ecarnot.net>> wrote: Le 21/12/2016 à 15:36, Tom Gamull a écrit : Under Edit Virtual Machine -> System -> (Advanced Parameters) there is a Custom CPU Type you may be able to set, are all the hosts in the same cluster? On every VM we use the cluster default setting. And on all our DC we use the same cpu setting. -- Nicolas ECARNOT -- Nicolas ECARNOT ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Windows Product Activation
On 21/12/2016 at 15:17, Alexander Wels wrote: On Wednesday, December 21, 2016 2:27:05 PM EST Nicolas Ecarnot wrote: Hello, Most of our virtual machines are Linux, but an increasing number of Windows VMs are being integrated into our oVirt DCs. We bought tons of Windows Server licences, and successfully activated them. Due to how Windows Product Activation works, when a Windows VM migrates from one host to another, the product activation is reset, launching a 30-day countdown to auto-shutdown. According to this old page : https://mazimi.wordpress.com/2007/07/11/getting-around-windows-activation-when-virtualizing/ and what I can read in Microsoft's Server 2012 documentation, I can then re-activate it twice during the next 90 days. Assuming I *want* to have *no* control over the placement of the VMs among their hosts (I want them to fly freely, confident in the lovely auto-balance scheduler), I understand all this is not the way to go. At present, we have the 2003, 2008 and 2012 server editions. The only things I can find about Windows Server 2012 relate to the commercial aspects (standard licence = 2 VMs, datacenter licence = unlimited number of VMs), but not to this Windows Product Activation trouble. How do you deal with this? Is there a special licence type or something dedicated that would prevent such an uncomfortable situation? (Christmas is near, I favor soft terms.) Regards. Nicolas, IIRC this is what the custom serial number setting is for. As far as I know, what happens when you migrate is that some ID that Windows looks for is changed (because it is generated based on an ID at the host level). You can set a single custom value regardless of which host the VM is running on by opening the Edit Virtual Machine dialog in the UI, then clicking System; at the bottom there is a check box called 'Provide custom serial number policy'. Then you can select VM ID.
Once you have done that, if I understand the feature correctly, the ID won't change and windows will not think you have new hardware each time the VM migrates. I could be wrong, but I believe this is what you are looking for. This sounds very encouraging. I have additional tests to drive. I hope I will report here soon. Thank you Alex. -- Nicolas ECARNOT ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Windows Product Activation
On 21/12/2016 at 15:36, Tom Gamull wrote: Under Edit Virtual Machine -> System -> (Advanced Parameters) there is a Custom CPU Type you may be able to set, are all the hosts in the same cluster? On every VM we use the cluster default setting. And on all our DCs we use the same CPU setting. -- Nicolas ECARNOT
Re: [ovirt-users] Windows Product Activation
Tom, Thank you for answering. On 21/12/2016 at 14:47, Tom Gamull wrote: Something is triggering the activation that windows is detecting as a change in hardware. Our DCs are made of hosts of 3 different models, so chances are that Windows is detecting a different CPU ID or something (that is a pity, as I thought all this was hidden from the guest). I’ve not had this problem on 2012 or past versions, It may be true that we only encountered these issues on 2008 R2 guests. you’ll usually encounter it when changing the hardware drivers (such as converting from physical to virtual). Generally you want to install compatible drivers (like the ovirt windows guest tools). Every guest here has the oVirt guest tools installed. Since then, we have made no driver change, neither on hosts nor on guests. A good practice though is to snapshot before you make a change such as drivers in case you need to set the activation key. For Desktops in VDI when you use a gold image, you generally make a snapshot before activation - see here for an answer - https://social.technet.microsoft.com/Forums/en-US/25c4c85c-c8a9-4316-8bfa-d3b7848e6dc6/microsoft-vdi-collections-and-activation?forum=winserver8setup I'm not sure this was relevant. what kind of activation keys are you using? Further reading leads me to think that the kind of key IS the main reason I'm facing this. Do you have KMS server? No. I was told to be very prudent about using KMS servers, so that is not planned. -- Nicolas ECARNOT
[ovirt-users] Windows Product Activation
Hello, Most of our virtual machines are Linux, but an increasing number of Windows VMs are being integrated into our oVirt DCs. We bought tons of Windows Server licences, and successfully activated them. Due to how Windows Product Activation works, when a Windows VM migrates from one host to another, the product activation is reset, launching a 30-day countdown to auto-shutdown. According to this old page : https://mazimi.wordpress.com/2007/07/11/getting-around-windows-activation-when-virtualizing/ and what I can read in Microsoft's Server 2012 documentation, I can then re-activate it twice during the next 90 days. Assuming I *want* to have *no* control over the placement of the VMs among their hosts (I want them to fly freely, confident in the lovely auto-balance scheduler), I understand all this is not the way to go. At present, we have the 2003, 2008 and 2012 server editions. The only things I can find about Windows Server 2012 relate to the commercial aspects (standard licence = 2 VMs, datacenter licence = unlimited number of VMs), but not to this Windows Product Activation trouble. How do you deal with this? Is there a special licence type or something dedicated that would prevent such an uncomfortable situation? (Christmas is near, I favor soft terms.) Regards. -- Nicolas ECARNOT
Re: [ovirt-users] GFS2 and OCFS2 for Shared Storage
On 23/11/2016 at 13:03, Fernando Frediani wrote: Has anyone managed to use GFS2 or OCFS2 for Shared Block Storage between hosts ? How scalable was it and which of the two work better ? Using traditional CLVM is far from good starting because of the lack of Thinprovision so I'm willing to consider either of the Filesystems. Thanks Fernando ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users Hello Fernando, Red Hat took a clear direction towards the use of GlusterFS for its software-defined storage, and a lot of effort is going into making it work smoothly with oVirt/RHEV. I know GlusterFS is not block storage, but it's worth considering, especially if you intend to set up hyper-converged clusters. -- Nicolas ECARNOT
[ovirt-users] What is "hosted storage domain"?
On 16/11/2016 at 10:25, Nicolas Ecarnot wrote: Cc to list, still asking. Forwarded message Subject : Re: [ovirt-users] Problem moving master storage domain to maintenance Date : Thu, 10 Nov 2016 10:53:04 +0100 From : Nicolas Ecarnot <nico...@ecarnot.net> Organization : Si peu... To : Roy Golan <rgo...@redhat.com> On 10/11/2016 at 10:40, knarra wrote: On 11/09/2016 06:26 PM, Roy Golan wrote: On 9 November 2016 at 14:49, knarra <kna...@redhat.com> wrote: Can some one please help me to understand the queries below. On 11/03/2016 06:43 PM, Maor Lipchuk wrote: Hi kasturi, Which version of oVirt are you using? Apologies for the late reply. I am using the latest master. Roy, I assume it is related to 4.0 version where the import of hosted storage domain was introduced. Hi Roy, I'm jumping on this thread just to ask you what you mean by hosted storage domain, and/or where I could read more about it. I saw nothing obvious in the 4.0.0 release notes, so just wondering... Thank you -- Nicolas ECARNOT
[ovirt-users] Fwd: Re: Problem moving master storage domain to maintenance
Cc to list, still asking. Forwarded message Subject : Re: [ovirt-users] Problem moving master storage domain to maintenance Date : Thu, 10 Nov 2016 10:53:04 +0100 From : Nicolas Ecarnot <nico...@ecarnot.net> Organization : Si peu... To : Roy Golan <rgo...@redhat.com> On 10/11/2016 at 10:40, knarra wrote: On 11/09/2016 06:26 PM, Roy Golan wrote: On 9 November 2016 at 14:49, knarra <kna...@redhat.com> wrote: Can some one please help me to understand the queries below. On 11/03/2016 06:43 PM, Maor Lipchuk wrote: Hi kasturi, Which version of oVirt are you using? Apologies for the late reply. I am using the latest master. Roy, I assume it is related to 4.0 version where the import of hosted storage domain was introduced. Hi Roy, I'm jumping on this thread just to ask you what you mean by hosted storage domain, and/or where I could read more about it. I saw nothing obvious in the 4.0.0 release notes, so just wondering... Thank you -- Nicolas ECARNOT
Re: [ovirt-users] ovirt homeserver
On 28/10/2016 at 18:46, david caughey wrote: Hi, I'm building a homeserver to run ovirt and wanted to get opinions on the best approach. The server will be used as a test/studybed for ovirt/kvm/vcloud/openstack/ceph. The server will be based around a Xeon E5 10 core with 128GB ram. Option 1: Build server with CentOS 7.2 and deploy ovirt directly on top. Option 2: Build server with CentOS 7.2 and deploy multiple ovirt instances on top of KVM. Which will be the most stable versatile method? If a GPU is used as a passthrough device can it be used on several vm's or is it restricted to 1 vm? If 2 GPU's are used can 1 be used as a dedicated passthrough to 1 vm and the other shared between the remaining vm's? Is CentOS/RH the best platform for ovirt? Is it okay/advisable to load the latest kernel, (4.8 ish), on to CentOS before installing ovirt? Any and all comments/advice welcome, David ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users Did no one find it worth mentioning Lago? It is only for testing, but you mentioned that use case, so consider reading about Lago. -- Nicolas ECARNOT
Re: [ovirt-users] Upgrading oVirt 3.6 with existing HTTPS certificate signed by custom CA to oVirt 4
On 27/10/2016 at 00:14, Kenneth Bingham wrote: I did install a server certificate from a private CA on the engine server for the oVirt 4 Manager GUI, but haven't figured out how to configure engine to trust the same CA which also issued the server certificate presented by vdsm. This is important for us because this is the same server certificate presented by the host when using the console (e.g. websocket console fails silently if the user agent doesn't trust the console server's certificate). Hello, Maybe a related bug : on an oVirt 4, I followed the same procedure below to install a custom CA, with *SUCCESS*. Today, I had to reinstall one of the hosts, and it is failing with : "CA certificate and CA private key do not match" : http://pastebin.com/9JS05JtJ Which certificate did we (Kenneth and I) misuse? What did we do wrong? Regards, Nicolas ECARNOT On Wed, Oct 26, 2016, 16:58 Beckman, Daniel <daniel.beck...@ingramcontent.com> wrote: We have oVirt 3.6.7 and I am preparing to upgrade to 4.0.4 release. I read the release notes (https://www.ovirt.org/release/4.0.4/) and noted comment #4 under “Install / Upgrade from previous version”: If you are using HTTPS certificate signed by custom certificate authority, please take a look at https://bugzilla.redhat.com/1336838 for steps which need to be done after migration to 4.0. Also please consult https://bugzilla.redhat.com/1313379 how to setup this custom CA for use with virt-viewer clients. So I referred to the first bugzilla (https://bugzilla.redhat.com/show_bug.cgi?id=1336838), where it states as follows: If customer wants to use custom HTTPS certificate signed by different CA, then he has to perform following steps: 1. Install custom CA (that signed HTTPS certificate) into host wide trustore (more info can be found in update-ca-trust man page) 2. Configure HTTPS certificate in Apache (this step is same as in previous versions) 3. 
Create new configuration file (for example /etc/ovirt-engine/engine.conf.d/99-custom-truststore.conf) with following content: ENGINE_HTTPS_PKI_TRUST_STORE="/etc/pki/java/cacerts" ENGINE_HTTPS_PKI_TRUST_STORE_PASSWORD="" 4. Restart ovirt-engine service I find it humorous that step # 1 suggests reading the “man page” which is only slightly better than suggesting to “google” it. Has anyone using a custom CA for their HTTPS certificate successfully upgraded to oVirt 4? If so could you share your detailed steps? Or can anyone point me to an actual example of this procedure? I’m a little nervous about the upgrade if you can’t already tell. Thanks, Daniel -- Nicolas ECARNOT
Re: [ovirt-users] migrating via pivotting the storage domain between instances of Manager
Hello Kenneth, On 26/10/2016 at 20:38, Kenneth Bingham wrote: Is it possible to "pivot" up or down guests from one instance of Manager to another instance of Manager by detaching the data storage domain from the source Manager's data center and attaching it to the destination Manager's data center? Here, we're in a phase where we've done that exact action 6 times, with 2 still to go. We're detaching and attaching these SDs from 3.6.5 to 3.6.5 oVirt setups (CentOS 7). After having shut down all VMs, cleanly detached the SD and attached it to the target, one still has to: - activate the storage domain (depending on your setup and the version, this step may be automatic) - manually import each VM: this step is long and has to be done one by one (Red Hat people: please comment). In this last step, I found each import quite fast, but there are some pitfalls to avoid: - pay attention to the MAC addresses, avoid duplicates - pay attention to the access rights of the hosts towards your storage solution (iSCSI, NFS, and so on) *I faced no issue regarding any relation between guests and hosts.* I guess Red Hat people will encourage you to share the logs of the relevant machines: - engine.log of both managers - vdsm.log of both SPMs I tried this with a block type storage domain, but there are no storage domains found when I do "import domain" in the destination Manager. If I do "import domain" on the source Manager and choose the same oVirt host to perform the import then the detached storage domain is able to be imported. If I re-attach the storage domain to the source Manager's data center the unregistered guests are available for import. Why does this require that the oVirt host performing the import be the same host that had formerly mounted the domain? Or, is it that the host is still recognized as enrolled in the DC that had created the unregistered guests on the detached storage domain? 
What if, instead of detaching the storage domain from the source Manager's data center, I shut down all guests and put the oVirt host in maintenance mode and enroll that same oVirt host in the destination Manager's data center, then would it be possible to import the detached storage domain and import the unregistered guests and templates stored there? -- Nicolas ECARNOT
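The MAC address caveat mentioned in this message can be checked before importing the VMs. What follows is only a sketch under stated assumptions: the inventories are plain dictionaries mapping a VM name to its MAC addresses, exported by whatever means you have at hand, and the function name and sample data are made up for illustration. It does not call any oVirt API.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_macs(*inventories: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each MAC used by more than one VM to the sorted names of those VMs."""
    owners = defaultdict(set)
    for inventory in inventories:
        for vm_name, macs in inventory.items():
            for mac in macs:
                # Normalise case and separator so equal addresses compare equal.
                owners[mac.strip().lower().replace("-", ":")].add(vm_name)
    return {mac: sorted(vms) for mac, vms in owners.items() if len(vms) > 1}


if __name__ == "__main__":
    # Made-up sample data: VMs about to be imported, and VMs already on the target.
    to_import = {"web01": ["00:1A:4A:16:01:51"], "db01": ["00:1a:4a:16:01:52"]}
    on_target = {"mail01": ["00-1a-4a-16-01-51"]}
    print(find_duplicate_macs(to_import, on_target))
```

VMs are keyed by name here, so two different VMs that share the same name in both setups would be counted as one; if that can happen in your environment, key the inventories by a unique identifier instead.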
Re: [ovirt-users] Average VM per host
On 19/10/2016 at 19:49, Yaniv Kaul wrote: On Oct 19, 2016 5:56 PM, "Nicolas Ecarnot" <nico...@ecarnot.net> wrote: Hello, Hello Yaniv, Though I read some surveys about this, I'd rather ask the oVirt community directly, and especially people running it as a production cluster: on average, how many VMs are running on each of your hosts? The host with 1TB RAM or 64GB? There is no meaningful average for two reasons: - Host specs (example above) - some believe in scale up (fewer but stronger hosts), some believe in scale out (more hosts, not as high end). *This* is mainly the point on which I am expecting to get some stats. Literature about competitors' products commonly describes the scenario of a few strong hosts, but I'd like to know how it goes in the oVirt world. -- Nicolas ECARNOT