Re: [ovirt-users] VDSM memory consumption
On Mon, Mar 09, 2015 at 11:49:01PM +0100, Matt . wrote: Hi, I also see this on the latest 3.5 version, I'm thinking about setting up a cronjob to restart vdsm every night. I cannot believe that people say they don't have this issue. Can someone of the devs dive in maybe ?

10:01:54 AM saggi: YamakasY: it's in getCapabilities(). Here is the RSS graph. The flatlines are when I stopped calling it and called other verbs. http://i.imgur.com/CLm0Q75.png

I do ***NOT*** recall what issue Saggi and YamakasY were discussing (CCing the pair), or if it reached fruition as a patch. It is certainly something other than Bug 1158108, as the latter speaks about a leak in a normal working state, with no getCapabilities calls.

Please notice an important word that fell off my text. Do YOU recall if a fix was posted?

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)
On 09/03/15 17:53, Simone Tiraboschi wrote:
> it gathers the engine SSH public key from http://{enginefqdn}/engine.ssh.key.txt and it stores it under ~root/.ssh/authorized_keys to make the engine able to add the host without knowing the host root password.

Sorry that I'm getting off topic, but: are you sure this is done via _http_ (without s)? This should be done via https imho.

Should I open a BZ for this?

--
Regards
Sven Kieske
Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Managing director: Robert Meyer
Tax no.: 331/5721/1033, VAT ID: DE814773217, HRA 6640, AG Bad Oeynhausen
General partner: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
Re: [ovirt-users] oVirt engine installation with Oracle Virtualbox
Has anyone ever succeeded in running all-in-one inside a VM running on Windows? Any hints on what tools to use?

- Original Message -
From: Nasim Banu mobapp...@gmail.com
To: users@ovirt.org
Cc: Tomas Jelinek tjeli...@redhat.com
Sent: Monday, March 9, 2015 8:03:27 PM
Subject: oVirt engine installation with Oracle Virtualbox

Hello, I am a new user of oVirt. I am trying to install oVirt all-in-one version 3.5.1 on my Windows Vista/7 system with Oracle VirtualBox 4.3.10. When I create a new virtual machine in VirtualBox, what should I choose for the OS and memory settings so that the ISO file boots properly? When I tried to boot, a blue screen appears with CentOS written on it, and it says automatic boot in 10 sec, 9 sec, 8 sec... After 0 sec nothing happens. It gets stuck there. What might be wrong?

Regards
Nasim Banu
Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)
- Original Message -
From: Sven Kieske s.kie...@mittwald.de
To: users@ovirt.org
Sent: Tuesday, March 10, 2015 10:39:36 AM
Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)

> On 09/03/15 17:53, Simone Tiraboschi wrote:
> > it gathers the engine SSH public key from http://{enginefqdn}/engine.ssh.key.txt and it stores it under ~root/.ssh/authorized_keys to make the engine able to add the host without knowing the host root password.
>
> Sorry that I'm getting off topic, but: are you sure this is done via _http_ (without s)? this should be done via https imho.

Yes, I am.

> should I open a BZ for this?

In my opinion, no: you have just installed the engine, and the engine has just created its CA. In order to trust an https connection to the engine you have to trust its CA, but you don't know it yet because it's a private one that has just been created on the engine from scratch. Blindly downloading the engine CA cert and blindly trusting it is not that different from simply using http to download the public key: in order to fetch it you don't need to send any password or token, and since it is a public key, by definition it doesn't have to be kept secret, so you don't need encryption.
Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)
- Original Message -
From: Sven Kieske s.kie...@mittwald.de
To: Simone Tiraboschi stira...@redhat.com
Cc: users@ovirt.org
Sent: Tuesday, March 10, 2015 11:12:38 AM
Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)

> On 10/03/15 10:53, Simone Tiraboschi wrote:
> > In order to trust an https connection to the engine you have to trust its CA but you still don't know it cause it's a private one and it has been just created on the engine from scratch.
>
> Can't the setup display the necessary parameters to make sure I trust the right CA when I accept it in my browser? It could even create a consumable file, which I can copy to my workstation and import there.

This is another thing: having the user explicitly trust the CA cert by manually checking its fingerprint on both hosts is the right solution, but it is more invasive, and a lot of users are already complaining that hosted-engine involves too many steps.

> > Blindly downloading the engine CA cert and blindly trusting it is not that different from simply using http to download the public key:
>
> this is correct, but who would do this? of course you need to check if it is the right CA!
>
> > in order to fetch it you don't need to send any password or token and being a public key you don't need to crypt it by definition so you don't need encryption.
>
> this is not about keeping the public key secret, but about keeping the channel over which it is transferred secure, so no one can tamper with the key and send you another public key to a different machine (dns spoofing, arp spoofing etc.). if you don't check the public key and ensure you connect to the correct machine, there is no need for public keys anyway and you could just skip this step. imho this is a security bug. other people would just consider this a hardening. trusting the local network is a security mindset from the 90's.

No, I didn't say that I trust the network to be secure because it's a local network. I said something else; please read it carefully and follow me:
1. In order to trust an https connection you need to trust the CA that signed the cert that the engine host is using.
2. That CA is by default a private CA, so it has just been created on the engine VM and you don't have the CA cert on the host.
3. So, to trust the https connection, you need to have/download the CA cert from the engine VM to the host.
4. If you just download the engine CA via http (https is not more secure at this point, because you are still trusting everything since you don't have the CA cert), you have just moved the issue instead of solving it.

So the issue is that the CA cert should reach the host in a secure way; otherwise you are in the same situation: somebody could provide a tampered CA cert and make you trust a tampered https connection. It's just false security: it simply adds complexity without adding real security. It would be different if we asked the user to copy and paste the engine CA cert himself, or at least to validate its fingerprint; without that step it's really the same.

> most LANs have too many hosts which you might not even know. you could also be on some shared foreign network where third party machines from different users can tamper with the network. I have seen user reports who used some leased hardware in offsite data centers to install ovirt, where you can't fully trust all local clients. this should be more secure by default imho.
>
> --
> Sven Kieske
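The fingerprint validation both sides mention can be sketched roughly as below. This is only an illustration, not what hosted-engine-setup does: the function names and the sample key are made up, and it assumes the expected fingerprint was obtained out of band (for example, read on the engine VM console) in the SHA256 format printed by recent OpenSSH.

```python
import base64
import hashlib
import hmac


def ssh_fingerprint_sha256(pubkey_line: str) -> str:
    """Return the SHA256 fingerprint of one '<type> <base64> [comment]' line."""
    fields = pubkey_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64-blob> [comment]'")
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 without the trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def key_is_trusted(pubkey_line: str, expected_fingerprint: str) -> bool:
    """Accept the downloaded key only if it matches the out-of-band fingerprint."""
    actual = ssh_fingerprint_sha256(pubkey_line)
    return hmac.compare_digest(actual, expected_fingerprint)


# Hypothetical key material, only here to exercise the functions.
SAMPLE_KEY = (
    "ssh-rsa "
    + base64.b64encode(b"hypothetical-key-blob").decode("ascii")
    + " ovirt-engine"
)
```

With a check like this, downloading the key over plain http is no longer the weak point, because a tampered key fails the comparison; without it, as argued above, switching to https with an unverified private CA changes little.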
Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)
- Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Tuesday, March 10, 2015 2:40:13 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state) On 03/10/2015 04:58 AM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 11:48:03 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state) On 03/09/2015 02:47 PM, Bob Doolittle wrote: Resending with CC to list (and an update). On 03/09/2015 01:40 PM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 6:26:30 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed) ... OK, I've started over. Simply removing the storage domain was insufficient, the hosted-engine deploy failed when it found the HA and Broker services already configured. I decided to just start over fresh starting with re-installing the OS on my host. I can't deploy DNS at the moment, so I have to simply replicate /etc/hosts files on my host/engine. I did that this time, but have run into a new problem: [ INFO ] Engine replied: DB Up!Welcome to Health Status! Enter the name of the cluster to which you want to add the host (Default) [Default]: [ INFO ] Waiting for the host to become operational in the engine. This may take several minutes... [ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs. [ ERROR ] Unable to add ovirt-vm to the manager Please shutdown the VM allowing the system to launch it as a monitored service. 
The system will wait until the VM is down.
[ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection refused
[ INFO ] Stage: Clean up
[ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection refused

I've attached my engine log and the ovirt-hosted-engine-setup log. I think I had an issue with resolving external hostnames, or else a connectivity issue during the install.

For some reason your engine wasn't able to deploy your hosts but the SSH session this time was established.

2015-03-09 13:05:58,514 ERROR [org.ovirt.engine.core.bll.InstallVdsInternalCommand] (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.: java.io.IOException: Command returned failure code 1 during SSH session 'r...@xion2.smartcity.net'

Can you please attach host-deploy logs from the engine VM?

OK, attached. Like I said, it looks to me like a name-resolution issue during the yum update on the engine. I think I've fixed that, but do you have a better suggestion for cleaning up and re-deploying other than installing the OS on my host and starting all over again?

I just finished starting over from scratch, starting with OS installation on my host/node, and wound up with a very similar problem - the engine couldn't reach the hosts during the yum operation. But this time the error was Network is unreachable. Which is weird, because I can ssh into the engine and ping many of those hosts, after the operation has failed. Here's my latest host-deploy log from the engine. I'd appreciate any clues.

It seems that now your host is able to resolve those addresses but it's not able to connect over http. On your hosts some of them resolve as IPv6 addresses; can you please try to use curl to get one of the files that it wasn't able to fetch? Can you please check your network configuration before and after host-deploy?

I can give you the network configuration after host-deploy, at least for the host/Node.
The engine won't start for me this morning, after I shut down the host for the night. In order to give you the config before host-deploy (or, apparently for the engine), I'll have to re-install the OS on the host and start again from scratch. Obviously I'd rather not do that unless absolutely necessary. Here's the host config after the failed host-deploy:

Host/Node:
# ip route
169.254.0.0/16 dev ovirtmgmt scope link metric 1007
172.16.0.0/16 dev ovirtmgmt proto kernel scope link src 172.16.0.58

You are missing a default gateway, and that is the issue. Are you sure that it was properly configured before trying to deploy that host?

# ip addr
1: lo: LOOPBACK,UP,LOWER_UP mtu 65536 qdisc noqueue state
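The missing default gateway can also be checked mechanically. A minimal sketch, assuming the text comes from `ip route` as pasted above; the function name is made up for illustration, and the gateway address used in the positive check is hypothetical.

```python
def has_default_route(ip_route_output: str) -> bool:
    """Return True if any line of `ip route` output describes a default route."""
    for line in ip_route_output.splitlines():
        fields = line.split()
        # iproute2 prints the default route as "default via <gateway> dev <iface>".
        if fields and fields[0] == "default":
            return True
    return False


# Routing table exactly as pasted in this thread: no default route.
ROUTES_FROM_THREAD = """\
169.254.0.0/16 dev ovirtmgmt scope link metric 1007
172.16.0.0/16 dev ovirtmgmt proto kernel scope link src 172.16.0.58
"""

# Same table with a hypothetical gateway added, for comparison.
ROUTES_WITH_GATEWAY = "default via 172.16.0.1 dev ovirtmgmt\n" + ROUTES_FROM_THREAD

print(has_default_route(ROUTES_FROM_THREAD))
print(has_default_route(ROUTES_WITH_GATEWAY))
```

Running the same check before and after host-deploy would show whether the gateway was lost when the ovirtmgmt bridge was created.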
[ovirt-users] Communication errors between engine and nodes?
Setup: oVirt 3.5.1 w/hosted engine, nodes: CentOS 7, engine: CentOS 6

I am periodically seeing errors like this in my engine web UI:

2015-Mar-10, 04:42 Host node5 is not responding. It will stay in Connecting state for a grace period of 89 seconds and after that an attempt to fence the host will be issued.
2015-Mar-10, 04:42 Host node3 from cluster c1 was chosen as a proxy to execute Status command on Host node5.
2015-Mar-10, 04:42 Status of host node5 was set to Up.
2015-Mar-10, 04:42 Host node5 power management was verified successfully.

The engine.log file has this:

2015-03-10 04:42:23,310 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ListVDSCommand] (DefaultQuartzScheduler_Worker-40) [75b9e6d9] Command ListVDSCommand(HostName = node5, HostId = 8dfd0195-f386-4e16-9379-a5287221d5bd, vds=Host[node5,8dfd0195-f386-4e16-9379-a5287221d5bd]) execution failed. Exception: VDSNetworkException: VDSGenericException: VDSNetworkException: Heartbeat exeeded

This seems to happen with a random node sometimes. The VMs on the node stay up and don't appear to experience any problem. I can't find any sign of a network problem on either the node, the engine, the node hosting the engine, or the switches. I don't see anything obvious in the logs on any of the systems involved either.

The node network setup is VLANs on top of a bond of two NICs, each connected to a different switch in a two-switch stack.

--
Chris Adams c...@cmadams.net
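To see whether the heartbeat errors really hit random nodes, one could tally them per host from engine.log. A rough sketch, assuming each error sits on a single log line shaped like the one quoted above; the function name is illustrative and the only sample line is the one from this message.

```python
import re
from collections import Counter

# Matches the "HostName = node5" part of the quoted engine.log entry.
HOST_RE = re.compile(r"HostName = ([^,\s)]+)")


def count_heartbeat_errors(log_lines):
    """Count VDSNetworkException heartbeat errors per host name."""
    counts = Counter()
    for line in log_lines:
        # Match on "Heartbeat" only, so the check does not depend on the
        # exact spelling of the rest of the message.
        if "VDSNetworkException" in line and "Heartbeat" in line:
            match = HOST_RE.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts


SAMPLE_LINE = (
    "2015-03-10 04:42:23,310 ERROR "
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.ListVDSCommand] "
    "(DefaultQuartzScheduler_Worker-40) [75b9e6d9] Command "
    "ListVDSCommand(HostName = node5, HostId = 8dfd0195-f386-4e16-9379-a5287221d5bd, "
    "vds=Host[node5,8dfd0195-f386-4e16-9379-a5287221d5bd]) execution failed. "
    "Exception: VDSNetworkException: VDSGenericException: "
    "VDSNetworkException: Heartbeat exeeded"
)

print(count_heartbeat_errors([SAMPLE_LINE]))
```

Feeding it the lines of the real engine.log (for example via `open(path)`) would show whether one node or one time window dominates.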
Re: [ovirt-users] Error during host deploy for 3.5.1, package installation
- Original Message -
From: Erik Brakke hawkeyee...@gmail.com
To: Alon Bar-Lev alo...@redhat.com
Cc: users@ovirt.org
Sent: Tuesday, March 10, 2015 3:21:53 PM
Subject: Re: [ovirt-users] Error during host deploy for 3.5.1,package installation

> Hi Alon, thanks for replying.
>
> When I: yum update vdsm
> No packages marked for update

The issue is that you are trying to deploy on i686, while we now distribute binary RPMs only for x86_64. Other packages such as vdsm-xmlrpc are arch-independent, so yum finds them; vdsm is arch-specific, and vdsm-4.16 is not available for i686 in the oVirt repos, while older versions are still available for i686 in the Fedora ones, hence the fake dependency issue. Do you really want to deploy on i686, or is it a mistake?

> When I: yum update vdsm-xmlrpc
> Error: package: vdsm-4.14.8.1-0.fc20.i686 (@updates)
>   Requires: vdsm-xmlrpc = 4.14.8.1-0.fc20.noarch (@updates)
>   Removing: vdsm-xmlrpc-4.14.8.1-0.fc20.noarch (@updates)
>     vdsm-xmlrpc-4.14.8.1-0.fc20
>   Updated By: vdsm-xmlrpc-4.16.10-8.gitc937927.fc20.noarch (ovirt-3.5)
>     vdsm-xmlrpc-4.16.10-8.gitc937927.fc20
>   Available: vdsm-xmlrpc-4.12.1-1.fc20.noarch (fedora)
>     vdsm-xmlrpc-4.12.1-1.fc20
>   Available: vdsm-xmlrpc-4.16.7-1.gitdb83943.fc20.noarch (ovirt-3.5)
>     vdsm-xmlrpc-4.16.7-1.gitdb83943.fc20
>   Available: vdsm-xmlrpc-4.16.10-0.fc20.noarch (ovirt-3.5)
>     vdsm-xmlrpc-4.16.10-0.fc20
>
> I also get matching results for vdsm-python and vdsm-python-zombiereaper. Do I need to disable the Fedora updates repo?
>
> Thanks! -Erik
>
> On Tue, Mar 10, 2015 at 2:44 AM, Alon Bar-Lev alo...@redhat.com wrote:
> > Hi, What do you get when you try to update vdsm manually?
> > # yum update vdsm
> >
> > - Original Message -
> > From: Erik Brakke hawkeyee...@gmail.com
> > To: users@ovirt.org
> > Sent: Tuesday, March 10, 2015 4:25:53 AM
> > Subject: [ovirt-users] Error during host deploy for 3.5.1, package installation
> >
> > > Hello, When deploying a new host from the admin portal to FC20 target, the package dependency check fails (host-deploy log):
> > > ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:97 Yum [u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-xmlrpc = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python-zombiereaper = 4.14.8.1-0.fc20']
> > > I've tried the release 3.5 and 3.5-snapshot repos. Installing the packages manually does not satisfy host deploy. It appears vdsm 4.16 packages are available in the repository. Engine was previously running 3.5.0, updated to 3.5.1, no change. I was able to deploy hosts in January with 3.5.0. Any assistance greatly appreciated!
> > > Best - Erik
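The arch mismatch described above can be spotted straight from the package names. A small sketch, assuming names follow the usual name-version-release.arch form seen in the yum output; the helper names are made up, and x86_64 is used as the repo architecture because that is what the reply says the oVirt repos ship.

```python
from typing import Iterable, List, Tuple


def split_arch(package: str) -> Tuple[str, str]:
    """Split 'name-version-release.arch' into (name-version-release, arch)."""
    base, sep, arch = package.rpartition(".")
    if not sep or not base:
        raise ValueError("no architecture suffix in %r" % package)
    return base, arch


def wrong_arch_packages(packages: Iterable[str], repo_arch: str = "x86_64") -> List[str]:
    """Return packages that are neither noarch nor built for repo_arch."""
    return [p for p in packages if split_arch(p)[1] not in ("noarch", repo_arch)]


# Installed packages as they appear in the yum error in this thread.
INSTALLED = [
    "vdsm-4.14.8.1-0.fc20.i686",
    "vdsm-xmlrpc-4.14.8.1-0.fc20.noarch",
]

print(wrong_arch_packages(INSTALLED))  # ['vdsm-4.14.8.1-0.fc20.i686']
```

The noarch packages can be updated from the oVirt repo, but the i686 vdsm package pins them to the old version, which is the dependency error yum reports.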
Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)
On 03/10/2015 04:58 AM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 11:48:03 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state) On 03/09/2015 02:47 PM, Bob Doolittle wrote: Resending with CC to list (and an update). On 03/09/2015 01:40 PM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 6:26:30 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed) ... OK, I've started over. Simply removing the storage domain was insufficient, the hosted-engine deploy failed when it found the HA and Broker services already configured. I decided to just start over fresh starting with re-installing the OS on my host. I can't deploy DNS at the moment, so I have to simply replicate /etc/hosts files on my host/engine. I did that this time, but have run into a new problem: [ INFO ] Engine replied: DB Up!Welcome to Health Status! Enter the name of the cluster to which you want to add the host (Default) [Default]: [ INFO ] Waiting for the host to become operational in the engine. This may take several minutes... [ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs. [ ERROR ] Unable to add ovirt-vm to the manager Please shutdown the VM allowing the system to launch it as a monitored service. The system will wait until the VM is down. [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection refused [ INFO ] Stage: Clean up [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection refused I've attached my engine log and the ovirt-hosted-engine-setup log. 
I think I had an issue with resolving external hostnames, or else a connectivity issue during the install.

For some reason your engine wasn't able to deploy your hosts but the SSH session this time was established.

2015-03-09 13:05:58,514 ERROR [org.ovirt.engine.core.bll.InstallVdsInternalCommand] (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.: java.io.IOException: Command returned failure code 1 during SSH session 'r...@xion2.smartcity.net'

Can you please attach host-deploy logs from the engine VM?

OK, attached. Like I said, it looks to me like a name-resolution issue during the yum update on the engine. I think I've fixed that, but do you have a better suggestion for cleaning up and re-deploying other than installing the OS on my host and starting all over again?

I just finished starting over from scratch, starting with OS installation on my host/node, and wound up with a very similar problem - the engine couldn't reach the hosts during the yum operation. But this time the error was Network is unreachable. Which is weird, because I can ssh into the engine and ping many of those hosts, after the operation has failed. Here's my latest host-deploy log from the engine. I'd appreciate any clues.

It seems that now your host is able to resolve those addresses but it's not able to connect over http. On your hosts some of them resolve as IPv6 addresses; can you please try to use curl to get one of the files that it wasn't able to fetch? Can you please check your network configuration before and after host-deploy?

I can give you the network configuration after host-deploy, at least for the host/Node. The engine won't start for me this morning, after I shut down the host for the night. In order to give you the config before host-deploy (or, apparently for the engine), I'll have to re-install the OS on the host and start again from scratch.
Obviously I'd rather not do that unless absolutely necessary. Here's the host config after the failed host-deploy: Host/Node: # ip route 169.254.0.0/16 dev ovirtmgmt scope link metric 1007 172.16.0.0/16 dev ovirtmgmt proto kernel scope link src 172.16.0.58 # ip addr 1: lo: LOOPBACK,UP,LOWER_UP mtu 65536 qdisc noqueue state UNKNOWN group default link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: p3p2: BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc pfifo_fast master ovirtmgmt state UP group default qlen 1000 link/ether b8:ca:3a:79:22:12 brd ff:ff:ff:ff:ff:ff inet6 fe80::baca:3aff:fe79:2212/64 scope link valid_lft forever preferred_lft forever 3: bond0: NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP mtu 1500 qdisc noqueue
Re: [ovirt-users] VDSM memory consumption
On 03/10/2015 12:19 AM, Dan Kenigsberg wrote:
> On Mon, Mar 09, 2015 at 12:17:00PM -0500, Chris Adams wrote:
> > Once upon a time, Dan Kenigsberg dan...@redhat.com said:
> > > I'm afraid that we are yet to find a solution for this issue, which is completely different from the horrible leak of supervdsm 4.16.7. Could you corroborate the claim of Bug 1147148 - M2Crypto usage in vdsm leaks memory ? Does the leak disappear once you start using plaintext transport?
> >
> > So, to confirm, it looks like to do that, the steps would be:
> > - In the [vars] section of /etc/vdsm/vdsm.conf, set ssl = false.
> > - Restart the vdsmd service.
> > Is that all that is needed?
>
> No. You'd have to reconfigure libvirtd to work in plaintext
>     vdsm-tool configure --force
> and also set your Engine to work in plaintext (unfortunately, I don't recall how that's done. Surely Yaniv does)

If the host is already managed by the engine, you can move it to maintenance, then use the psql client against your db to set the 'EncryptHostCommunication' and 'SSLEnabled' options in the vdc_options table to False, and restart ovirt-engine. Apart from the engine side, also apply the changes on the host (ssl=False and configure --force, as Dan mentions above) and reactivate the host.

> > Is it safe to restart vdsmd on a node with active VMs?
>
> It's safe in the sense that I have not heard of a single failure to reconnect to already-running VMs in years. However, this is still not recommended for production environments, and particularly not if one of the VMs is defined as highly-available. This can end up with your host being fenced and all your VMs dead.
>
> Dan.

--
Yaniv Bronhaim.
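The host-side config change (ssl = false in the [vars] section) could be scripted along these lines. This is only a sketch using Python's standard configparser on the file's text; the function name is made up, it is not a vdsm tool, and it covers neither the `vdsm-tool configure --force` step nor the engine-side vdc_options change described above. Note that configparser drops comments when it writes the file back, so it is best tried on a copy first.

```python
import configparser
import io


def set_vdsm_ssl(conf_text: str, enabled: bool) -> str:
    """Return conf_text with [vars] ssl set to 'true' or 'false'."""
    # interpolation=None keeps any '%' characters in existing values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("vars"):
        parser.add_section("vars")
    parser.set("vars", "ssl", "true" if enabled else "false")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical minimal file content, only to show the round trip.
print(set_vdsm_ssl("[vars]\nssl = true\n", enabled=False))
```

In practice one would read /etc/vdsm/vdsm.conf, write the result back, and then follow the remaining steps from the thread (reconfigure libvirtd, change the engine options, reactivate the host).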
Re: [ovirt-users] VDSM memory consumption
NO! The fix that should have fixed it didn't change a thing... we lost track there as some devs were going to look at it.

2015-03-10 11:47 GMT+01:00 Dan Kenigsberg dan...@redhat.com:

On Mon, Mar 09, 2015 at 11:49:01PM +0100, Matt . wrote: Hi, I also see this on the latest 3.5 version, I'm thinking about setting up a cronjob to restart vdsm every night. I cannot believe that people say they don't have this issue. Can someone of the devs dive in maybe ?

10:01:54 AM saggi: YamakasY: it's in getCapabilities(). Here is the RSS graph. The flatlines are when I stopped calling it and called other verbs. http://i.imgur.com/CLm0Q75.png

I do ***NOT*** recall what issue Saggi and YamakasY were discussing (CCing the pair), or if it reached fruition as a patch. It is certainly something other than Bug 1158108, as the latter speaks about a leak in a normal working state, with no getCapabilities calls.

Please notice an important word that fell off my text. Do YOU recall if a fix was posted?
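For anyone who wants to reproduce an RSS graph like the one linked above, a minimal Linux-only sketch is to sample VmRSS from /proc for the vdsm process while calling one verb at a time. The function names are made up for illustration, and how the pid is found and how often to sample are left open.

```python
def parse_rss_kib(status_text: str) -> int:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     204800 kB".
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def read_rss_kib(pid: int) -> int:
    """Read the current resident set size of a process (Linux only)."""
    with open("/proc/%d/status" % pid) as status_file:
        return parse_rss_kib(status_file.read())


def growth_kib(samples) -> int:
    """Return last minus first sample; a steady positive value hints at a leak."""
    return samples[-1] - samples[0] if samples else 0
```

Sampling `read_rss_kib(pid)` every few seconds while repeatedly calling getCapabilities, and then while calling other verbs, should show the same rise and flatlines that saggi describes, if the leak is indeed in that verb.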
Re: [ovirt-users] hosted-engine --vm-status output
Hi Filipe, there is currently no easy way to clean the metadata section. We used to hide hosts that were not active in the past day or so, but I think we had to disable that as part of our time-shift recovery bugs. There is a patch that will add the metadata clean capability (https://gerrit.ovirt.org/#/c/38289/), but it has not been merged yet. So please stay tuned, we will add a way to remove a host from the hosted engine cluster very soon.

Best regards
--
Martin Sivák msi...@redhat.com
Red Hat Czech
RHEV-M SLA / Brno, CZ

- Original Message -

Hello guys,

I installed oVirt using the hosted-engine procedure with six physical hosts and more than 60 VMs, and until now everything has been OK and my environment works fine. I decided to use some of my hosts for other tasks, so I removed four of my six hosts and took them out of my environment. After a few days, my second host (hosted_engine_2) started to fail. It's a hardware issue: my 10GbE interface stopped working. I decided to make my host 4 the new hosted_engine_2. It works fine, but when I run hosted-engine --vm-status, it still returns all of the old hosted-engine members (1 to 6). How can I fix it so that only the active nodes are left?
See below the output for my hosted-engine --vm-status

[root@bmh0001 ~]# hosted-engine --vm-status

--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : bmh0001.place.brazil
Host ID                            : 1
Engine status                      : {reason: vm not running on this host, health: bad, vm: down, detail: unknown}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 68830
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=68830 (Sun Mar 8 17:38:05 2015)
    host-id=1
    score=2400
    maintenance=False
    state=EngineDown

--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : bmh0004.place.brazil
Host ID                            : 2
Engine status                      : {health: good, vm: up, detail: up}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 2427
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=2427 (Sun Mar 8 17:38:09 2015)
    host-id=2
    score=2400
    maintenance=False
    state=EngineUp

--== Host 3 status ==--

Status up-to-date                  : False
Hostname                           : bmh0003.place.brazil
Host ID                            : 3
Engine status                      : unknown stale-data
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 331389
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=331389 (Tue Mar 3 14:48:25 2015)
    host-id=3
    score=0
    maintenance=True
    state=LocalMaintenance

--== Host 4 status ==--

Status up-to-date                  : False
Hostname                           : bmh0004.place.brazil
Host ID                            : 4
Engine status                      : unknown stale-data
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 364358
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=364358 (Tue Mar 3 16:10:36 2015)
    host-id=4
    score=0
    maintenance=True
    state=LocalMaintenance

--== Host 5 status ==--

Status up-to-date                  : False
Hostname                           : bmh0005.place.brazil
Host ID                            : 5
Engine status                      : unknown stale-data
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 241930
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=241930 (Fri Mar 6 09:40:31 2015)
    host-id=5
    score=0
    maintenance=True
    state=LocalMaintenance

--== Host 6 status ==--

Status up-to-date                  : False
Hostname                           : bmh0006.place.brazil
Host ID                            : 6
Engine status                      : unknown stale-data
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 77376
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=77376 (Wed Mar 4 09:11:17 2015)
    host-id=6
    score=0
    maintenance=True
    state=LocalMaintenance

[root@bmh0001 ~]#

thank you very much.

--
Regards
Filipe Guarino
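Until the metadata clean patch lands, the stale entries can at least be filtered out when reading the status. A rough sketch, assuming the command prints one "field : value" pair per line under a "--== Host N status ==--" header, as in the output above; the function names are made up and the sample text is abbreviated from this thread.

```python
import re

HEADER_RE = re.compile(r"--== Host (\d+) status ==--")


def parse_vm_status(output: str) -> dict:
    """Map host id -> {field name: value} from hosted-engine --vm-status text."""
    hosts = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = {}
            hosts[int(header.group(1))] = current
            continue
        # Field lines look like "Hostname   : value"; metadata lines have no " : ".
        if current is None or " : " not in line:
            continue
        key, _, value = line.partition(" : ")
        current[key.strip()] = value.strip()
    return hosts


def stale_hosts(hosts: dict) -> list:
    """Return the ids of hosts whose status is not up to date."""
    return sorted(
        host_id
        for host_id, fields in hosts.items()
        if fields.get("Status up-to-date") == "False"
    )


SAMPLE = """\
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : bmh0001.place.brazil
Score                              : 2400

--== Host 3 status ==--

Status up-to-date                  : False
Hostname                           : bmh0003.place.brazil
Engine status                      : unknown stale-data
Score                              : 0
"""

print(stale_hosts(parse_vm_status(SAMPLE)))  # [3]
```

This only hides the stale entries on the reading side; it does not remove them from the shared metadata.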
Re: [ovirt-users] Error during host deploy for 3.5.1, package installation
Hi Alon, thanks for replying. When I: yum update vdsm No packages marked for update When I: yum update vdsm-xmlrpc Error: package: vdsm-4.14.8.1-0.fc20.i686 (@updates) Requires: vdsm-xmlrpc = 4.14.8.1-0.fc20.noarch (@updates) Removing: vdsm-xmlrpc-4.14.8.1-0.fc20.noarch (@updates) vdsm-xmlrpc-4.14.8.1-0.fc20 Updated By: vdsm-xmlrpc-4.16.10-8.gitc937927.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.10-8.gitc937927.fc20 Available: vdsm-xmlrpc-4.12.1-1.fc20.noarch (fedora) vdsm-xmlrpc-4.12.1-1.fc20 Available: vdsm-xmlrpc-4.16.7-1.gitdb83943.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.7-1.gitdb83943.fc20 Available: vdsm-xmlrpc-4.16.10-0.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.10-0.fc20 I also get matching results for vdsm-python and vdsm-python-zombiereaper. Do I need to disable the Fedora updates repo? Thanks! -Erik On Tue, Mar 10, 2015 at 2:44 AM, Alon Bar-Lev alo...@redhat.com wrote: Hi, What do you get when you try to update vdsm manually? # yum update vdsm - Original Message - From: Erik Brakke hawkeyee...@gmail.com To: users@ovirt.org Sent: Tuesday, March 10, 2015 4:25:53 AM Subject: [ovirt-users] Error during host deploy for 3.5.1,package installation Hello, When deploying a new host from the admin portal to FC20 target, the package dependency check fails (host-deploy log): ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:97 Yum [u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-xmlrpc = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python-zombiereaper = 4.14.8.1-0.fc20'] I've tried the release 3.5 and 3.5-snapshot repos. Installing the packages manually does not satisfy host deploy. It appears vdsm 4.16 packages are available in the repository. Engine was previously running 3.5.0, updated to 3.5.1, no change. I was able to deploy hosts in January with 3.5.0. Any assistance greatly appreciated! 
Best - Erik
Re: [ovirt-users] Ovirt engine-setup fails, Cannot get JAVA_HOME
- Original Message - From: Carter Kindley carter.kind...@deusmachine.com To: Yedidyah Bar David d...@redhat.com Cc: users@ovirt.org Sent: Thursday, March 5, 2015 2:22:31 AM Subject: RE: [ovirt-users] Ovirt engine-setup fails, Cannot get JAVA_HOME Hey folks, Icedtea-7 allows engine-setup to complete - almost... The setup now fails on cleanup: [ ERROR ] Failed to execute stage 'Closing up': Command '/usr/bin/systemctl' failed to execute. The log files indicate that systemctl is attempting to start a unit file (presumably ovirt-engine.service) which does not exist. I'm happy to write my own, but it would be awesome to see what is used as best practice from the oVirt community. Adding Alon again :-) Perhaps the unit files are not packaged for gentoo? Please attach setup log files just to make sure this indeed is the problem. Thanks, -- Didi ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
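Before writing a unit file by hand, it may help to confirm that the unit really is missing by asking systemd which unit files it knows about. A minimal sketch: the helper name `unit_is_listed` is hypothetical, and `ovirt-engine.service` is only the unit name presumed in the message above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named unit appears in
# `systemctl list-unit-files`-style output read from stdin.
unit_is_listed() {
    # The first whitespace-separated column is the unit file name.
    awk -v unit="$1" '$1 == unit { found = 1 } END { exit !found }'
}

# Usage on a systemd host (unit name is the one presumed in the thread):
#   systemctl list-unit-files --no-legend | unit_is_listed ovirt-engine.service \
#       || echo "ovirt-engine.service is not installed"
```

If the unit is not listed, the setup log requested above should show which unit name systemctl was asked to start.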
Re: [ovirt-users] Error during host deploy for 3.5.1, package installation
Yaniv, can you please assist, there seems to be a conflict and multilib issues of vdsm. - Original Message - From: Erik Brakke hawkeyee...@gmail.com To: Alon Bar-Lev alo...@redhat.com Cc: users@ovirt.org Sent: Tuesday, March 10, 2015 4:21:53 PM Subject: Re: [ovirt-users] Error during host deploy for 3.5.1, package installation Hi Alon, thanks for replying. When I: yum update vdsm No packages marked for update When I: yum update vdsm-xmlrpc Error: package: vdsm-4.14.8.1-0.fc20.i686 (@updates) Requires: vdsm-xmlrpc = 4.14.8.1-0.fc20.noarch (@updates) Removing: vdsm-xmlrpc-4.14.8.1-0.fc20.noarch (@updates) vdsm-xmlrpc-4.14.8.1-0.fc20 Updated By: vdsm-xmlrpc-4.16.10-8.gitc937927.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.10-8.gitc937927.fc20 Available: vdsm-xmlrpc-4.12.1-1.fc20.noarch (fedora) vdsm-xmlrpc-4.12.1-1.fc20 Available: vdsm-xmlrpc-4.16.7-1.gitdb83943.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.7-1.gitdb83943.fc20 Available: vdsm-xmlrpc-4.16.10-0.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.10-0.fc20 I also get matching results for vdsm-python and vdsm-python-zombiereaper. Do I need to disable the Fedora updates repo? Thanks! -Erik On Tue, Mar 10, 2015 at 2:44 AM, Alon Bar-Lev alo...@redhat.com wrote: Hi, What do you get when you try to update vdsm manually? # yum update vdsm - Original Message - From: Erik Brakke hawkeyee...@gmail.com To: users@ovirt.org Sent: Tuesday, March 10, 2015 4:25:53 AM Subject: [ovirt-users] Error during host deploy for 3.5.1,package installation Hello, When deploying a new host from the admin portal to FC20 target, the package dependency check fails (host-deploy log): ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:97 Yum [u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-xmlrpc = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python-zombiereaper = 4.14.8.1-0.fc20'] I've tried the release 3.5 and 3.5-snapshot repos. 
Installing the packages manually does not satisfy host deploy. It appears vdsm 4.16 packages are available in the repository. Engine was previously running 3.5.0, updated to 3.5.1, no change. I was able to deploy hosts in January with 3.5.0. Any assistance greatly appreciated! Best - Erik ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Error during host deploy for 3.5.1, package installation
Simone, you're so right, thanks. I was trying to deploy to 32-bit FC20. Thank you both for your time. On Tue, Mar 10, 2015 at 9:33 AM, Simone Tiraboschi stira...@redhat.com wrote: - Original Message - From: Erik Brakke hawkeyee...@gmail.com To: Alon Bar-Lev alo...@redhat.com Cc: users@ovirt.org Sent: Tuesday, March 10, 2015 3:21:53 PM Subject: Re: [ovirt-users] Error during host deploy for 3.5.1, package installation Hi Alon, thanks for replying. When I: yum update vdsm No packages marked for update The issue is that you are trying to deploy on i686, while we now distribute binary RPMs only for x86_64. Other packages, like vdsm-xmlrpc, are arch-independent, so yum finds them; vdsm is arch-specific, and vdsm-4.16 is not available for i686 in the oVirt repos, while older versions are still available for i686 in the Fedora repos, hence the spurious dependency issue. Do you really want to deploy on i686, or is it a mistake? When I: yum update vdsm-xmlrpc Error: package: vdsm-4.14.8.1-0.fc20.i686 (@updates) Requires: vdsm-xmlrpc = 4.14.8.1-0.fc20.noarch (@updates) Removing: vdsm-xmlrpc-4.14.8.1-0.fc20.noarch (@updates) vdsm-xmlrpc-4.14.8.1-0.fc20 Updated By: vdsm-xmlrpc-4.16.10-8.gitc937927.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.10-8.gitc937927.fc20 Available: vdsm-xmlrpc-4.12.1-1.fc20.noarch (fedora) vdsm-xmlrpc-4.12.1-1.fc20 Available: vdsm-xmlrpc-4.16.7-1.gitdb83943.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.7-1.gitdb83943.fc20 Available: vdsm-xmlrpc-4.16.10-0.fc20.noarch (ovirt-3.5) vdsm-xmlrpc-4.16.10-0.fc20 I also get matching results for vdsm-python and vdsm-python-zombiereaper. Do I need to disable the Fedora updates repo? Thanks! -Erik On Tue, Mar 10, 2015 at 2:44 AM, Alon Bar-Lev alo...@redhat.com wrote: Hi, What do you get when you try to update vdsm manually?
# yum update vdsm - Original Message - From: Erik Brakke hawkeyee...@gmail.com To: users@ovirt.org Sent: Tuesday, March 10, 2015 4:25:53 AM Subject: [ovirt-users] Error during host deploy for 3.5.1, package installation Hello, When deploying a new host from the admin portal to FC20 target, the package dependency check fails (host-deploy log): ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:97 Yum [u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-xmlrpc = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python-zombiereaper = 4.14.8.1-0.fc20'] I've tried the release 3.5 and 3.5-snapshot repos. Installing the packages manually does not satisfy host deploy. It appears vdsm 4.16 packages are available in the repository. Engine was previously running 3.5.0, updated to 3.5.1, no change. I was able to deploy hosts in January with 3.5.0. Any assistance greatly appreciated! Best - Erik ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
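The architecture mismatch diagnosed in this thread is visible in the package names themselves: the last dot-separated field of an RPM name-version-release.arch string is the architecture. A minimal sketch for checking this up front; both helper names are hypothetical, and the single supported architecture is taken only from the statement in this thread that binary RPMs are distributed for x86_64.

```shell
#!/bin/sh
# Print the architecture suffix of an RPM name-version-release.arch string.
rpm_arch() {
    printf '%s\n' "${1##*.}"
}

# Succeed only for the architecture the thread says is still built.
arch_is_supported() {
    [ "$1" = "x86_64" ]
}

# Usage on the target host before deploying:
#   arch_is_supported "$(uname -m)" || echo "host arch $(uname -m) has no vdsm 4.16 build"
```

Applied to the error above, `rpm_arch vdsm-4.14.8.1-0.fc20.i686` yields `i686`, which is why yum keeps the older Fedora build of vdsm while offering newer noarch subpackages.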
Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)
On 03/10/2015 10:20 AM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Tuesday, March 10, 2015 2:40:13 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state) On 03/10/2015 04:58 AM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 11:48:03 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state) On 03/09/2015 02:47 PM, Bob Doolittle wrote: Resending with CC to list (and an update). On 03/09/2015 01:40 PM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 6:26:30 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed) ... OK, I've started over. Simply removing the storage domain was insufficient, the hosted-engine deploy failed when it found the HA and Broker services already configured. I decided to just start over fresh starting with re-installing the OS on my host. I can't deploy DNS at the moment, so I have to simply replicate /etc/hosts files on my host/engine. I did that this time, but have run into a new problem: [ INFO ] Engine replied: DB Up!Welcome to Health Status! Enter the name of the cluster to which you want to add the host (Default) [Default]: [ INFO ] Waiting for the host to become operational in the engine. This may take several minutes... [ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs. 
[ ERROR ] Unable to add ovirt-vm to the manager Please shutdown the VM allowing the system to launch it as a monitored service. The system will wait until the VM is down. [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection refused [ INFO ] Stage: Clean up [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection refused I've attached my engine log and the ovirt-hosted-engine-setup log. I think I had an issue with resolving external hostnames, or else a connectivity issue during the install. For some reason your engine wasn't able to deploy your hosts, but the SSH session this time was established. 2015-03-09 13:05:58,514 ERROR [org.ovirt.engine.core.bll.InstallVdsInternalCommand] (org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed for host 217016bb-fdcd-4344-a0ca-4548262d10a8, ovirt-vm.: java.io.IOException: Command returned failure code 1 during SSH session 'r...@xion2.smartcity.net' Can you please attach host-deploy logs from the engine VM? OK, attached. Like I said, it looks to me like a name-resolution issue during the yum update on the engine. I think I've fixed that, but do you have a better suggestion for cleaning up and re-deploying other than installing the OS on my host and starting all over again? I just finished starting over from scratch, starting with OS installation on my host/node, and wound up with a very similar problem - the engine couldn't reach the hosts during the yum operation. But this time the error was Network is unreachable. Which is weird, because I can ssh into the engine and ping many of those hosts, after the operation has failed. Here's my latest host-deploy log from the engine. I'd appreciate any clues. It seems that your host is now able to resolve those addresses but is not able to connect over HTTP. On your host some of them resolve to IPv6 addresses; can you please try to use curl to get one of the files that it wasn't able to fetch?
Can you please check your network configuration before and after host-deploy? I can give you the network configuration after host-deploy, at least for the host/Node. The engine won't start for me this morning, after I shut down the host for the night. In order to give you the config before host-deploy (or, apparently for the engine), I'll have to re-install the OS on the host and start again from scratch. Obviously I'd rather not do that unless absolutely necessary. Here's the host config after the failed host-deploy: Host/Node: # ip route 169.254.0.0/16 dev ovirtmgmt scope link metric 1007 172.16.0.0/16 dev ovirtmgmt proto kernel scope link src 172.16.0.58 You are missing a default gateway, hence the issue. Are you sure that it was properly configured before trying to deploy that host? It should have been, it was a fresh OS install. So I'm starting again, and keeping careful records of my network config. Here
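The missing default gateway pointed out in this message can be checked mechanically before and after host-deploy. A minimal sketch: the helper name `has_default_route` is hypothetical, and it relies only on `ip route` printing a default route as a line whose first field is `default`.

```shell
#!/bin/sh
# Succeed if the routing table text on stdin contains a default route.
has_default_route() {
    awk '$1 == "default" { found = 1 } END { exit !found }'
}

# Usage on the host, before and after host-deploy:
#   ip route | has_default_route || echo "no default gateway configured"
```

Fed the two routes quoted in this message (169.254.0.0/16 and 172.16.0.0/16 on ovirtmgmt), the check fails, which matches the diagnosis.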
Re: [ovirt-users] Error during host deploy for 3.5.1, package installation
Hi, What do you get when you try to update vdsm manually? # yum update vdsm - Original Message - From: Erik Brakke hawkeyee...@gmail.com To: users@ovirt.org Sent: Tuesday, March 10, 2015 4:25:53 AM Subject: [ovirt-users] Error during host deploy for 3.5.1,package installation Hello, When deploying a new host from the admin portal to FC20 target, the package dependency check fails (host-deploy log): ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:97 Yum [u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-xmlrpc = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python = 4.14.8.1-0.fc20', u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python-zombiereaper = 4.14.8.1-0.fc20'] I've tried the release 3.5 and 3.5-snapshot repos. Installing the packages manually does not satisfy host deploy. It appears vdsm 4.16 packages are available in the repository. Engine was previously running 3.5.0, updated to 3.5.1, no change. I was able to deploy hosts in January with 3.5.0. Any assistance greatly appreciated! Best - Erik ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Troubles starting hosted engine
- Original Message - From: Simone Tiraboschi stira...@redhat.com To: John Florian jflor...@doubledog.org Cc: users@ovirt.org Sent: Monday, March 9, 2015 10:30:36 AM Subject: Re: [ovirt-users] Troubles starting hosted engine - Original Message - From: John Florian jflor...@doubledog.org To: users@ovirt.org Sent: Sunday, March 8, 2015 9:37:39 PM Subject: [ovirt-users] Troubles starting hosted engine I have lots of extra fun bringing up my hosted engine right now due to two issues. First, either during the hosted-engine --deploy or engine-setup (I can't remember) I was prompted for the IP address of my gateway. Since then that address has changed. I'm unable to start the engine VM if that address isn't reachable so my temporary workaround is to add this old address onto the current gateway. How/where do I change things so that this old address can be truly retired? It's written in /etc/ovirt-hosted-engine/hosted-engine.conf If you deployed more than one host, you need to explicitly fix it on each of them. My second issue might be harder. Again during the setup I was prompted for a location of an ISO file for installing the engine's OS. That location is served by NFS and is auto-mounted by /etc/fstab (and systemd). Here's the hitch: my NFS server is now a VM in my cluster. :-) Since I only have a single hypervisor host right now that ISO isn't reachable when I'm trying to start my engine VM so that I can also start the VM that provides the NFS share. I'm getting away with evil right now by touching an empty file at the same path, which gets obscured once the NFS share is mounted, but it's enough. You need that ISO file just to install the OS when you create the engine VM on the first host: you don't need a shared domain for that. So my suggestion is just to copy that ISO image on the first host and use it locally. You can destroy it when the setup is done. It's not at all clear to me how I'm supposed to edit things for my hosted engine setup. 
I am pretty certain that you can remove it by editing /etc/ovirt-hosted-engine/vm.conf; you can search the list archives for details. In 3.6 it might be editable from the web admin. Best, -- Didi ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
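For the changed gateway address discussed earlier in this message, the edit to /etc/ovirt-hosted-engine/hosted-engine.conf could be scripted so that it is applied identically on every host. A minimal sketch under loudly stated assumptions: the helper name `set_gateway` is hypothetical, and the file is assumed to hold `key=value` lines with the address stored under a key named `gateway`; please verify that against your own file, and back it up, before running anything like this.

```shell
#!/bin/sh
# Rewrite the gateway entry in a key=value config file.
# Assumption: the address is stored on a line starting with "gateway=".
# Usage: set_gateway FILE NEW_ADDRESS
set_gateway() {
    conf=$1
    new_gw=$2
    tmp=$(mktemp) || return 1
    # Replace only the "gateway=" line; every other line is copied unchanged.
    # Writing back with cat keeps the original file's owner and permissions.
    sed "s|^gateway=.*|gateway=${new_gw}|" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (the address is a placeholder), repeated on each deployed host:
#   cp /etc/ovirt-hosted-engine/hosted-engine.conf /root/hosted-engine.conf.bak
#   set_gateway /etc/ovirt-hosted-engine/hosted-engine.conf 192.0.2.1
```

The sketch only edits the file; whether the hosted-engine services need a restart to pick up the new value is not covered in this thread.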
Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state)
- Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 11:48:03 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (The VDSM host was found in a failed state) On 03/09/2015 02:47 PM, Bob Doolittle wrote: Resending with CC to list (and an update). On 03/09/2015 01:40 PM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 6:26:30 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed) On 03/09/2015 12:53 PM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Cc: users-ovirt users@ovirt.org Sent: Monday, March 9, 2015 12:48:37 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed) On 03/09/2015 07:12 AM, Simone Tiraboschi wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: Simone Tiraboschi stira...@redhat.com Sent: Monday, March 9, 2015 12:02:49 PM Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed) On Mar 9, 2015 5:23 AM, Simone Tiraboschi stira...@redhat.com wrote: - Original Message - From: Bob Doolittle b...@doolittle.us.com To: users-ovirt users@ovirt.org Sent: Friday, March 6, 2015 9:21:20 PM Subject: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... 
SSH has failed) Hi, I'm following the instructions here: http://www.ovirt.org/Hosted_Engine_Howto My self-hosted install failed near the end: To continue make a selection from the options below: (1) Continue setup - engine installation is complete (2) Power off and restart the VM (3) Abort setup (4) Destroy VM and abort setup (1, 2, 3, 4)[1]: 1 [ INFO ] Engine replied: DB Up!Welcome to Health Status! Enter the name of the cluster to which you want to add the host (Default) [Default]: [ ERROR ] Cannot automatically add the host to cluster Default: Cannot add Host. Connecting to host via SSH has failed, verify that the host is reachable (IP address, routable address etc.) You may refer to the engine.log file for further details. [ ERROR ] Failed to execute stage 'Closing up': Cannot add the host to cluster Default [ INFO ] Stage: Clean up [ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150306135624.conf' [ INFO ] Stage: Pre-termination [ INFO ] Stage: Termination I can ssh into the engine VM both locally and remotely. There is no /root/.ssh directory, however. Did I need to set that up somehow? It's the engine that needs to open an SSH connection to the host, calling it by its hostname. So please be sure that you can SSH to the host from the engine using its hostname and not its IP address. I'm assuming this should be a password-less login (key-based authentication?). Yes, it is. As what user? root OK, I see a couple of problems. First off, I didn't have my deploying-host hostname in the hosts map for my engine. This is enough by itself to make the deploy procedure fail. If possible, we recommend relying on a DNS infrastructure, especially if you are deploying more than one host. OK, I've started over. Simply removing the storage domain was insufficient, the hosted-engine deploy failed when it found the HA and Broker services already configured. I decided to just start over fresh starting with re-installing the OS on my host.
I can't deploy DNS at the moment, so I have to simply replicate /etc/hosts files on my host/engine. I did that this time, but have run into a new problem: [ INFO ] Engine replied: DB Up!Welcome to Health Status! Enter the name of the cluster to which you want to add the host (Default) [Default]: [ INFO ] Waiting for the host to become operational in the engine. This may take several minutes... [ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs. [ ERROR ] Unable to add ovirt-vm to the manager Please shutdown the VM allowing the system to launch it as a monitored service. The system will wait until the VM is down. [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection refused [ INFO ] Stage: Clean up [ ERROR ] Failed to execute stage