[ovirt-users] Unable to add host in oVirt 4.2
Hi,

When I add an additional node running version 4.2 to the engine, I get the error "Host has no default route." even though the node has a default gateway. The logs are below. Can you please help with this?

https://pastebin.com/sDyVkvVY

Thanks,
Nagaraju
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/OR7VHK3MMZCKPHAZRFL2OUP537EUJRKA/
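One way to double-check on the node itself that the kernel really has an IPv4 default route is to read /proc/net/route. The following is a minimal sketch in Python, assuming a Linux host. The function name `default_routes` and the sample table are made up for illustration, and this says nothing about how the engine performs its own check.

```python
def default_routes(route_table):
    """Return (interface, gateway) pairs for IPv4 default routes.

    `route_table` is the text of /proc/net/route, where addresses are
    little-endian hexadecimal and a default route has destination and
    mask both equal to 00000000.
    """
    routes = []
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, dest, gateway, mask = fields[0], fields[1], fields[2], fields[7]
        if dest == "00000000" and mask == "00000000":
            # Convert little-endian hex (e.g. 0101A8C0) to a dotted quad.
            octets = [str(int(gateway[i:i + 2], 16)) for i in (6, 4, 2, 0)]
            routes.append((iface, ".".join(octets)))
    return routes


# Hypothetical table: one default route via 192.168.1.1 and one subnet route.
SAMPLE = (
    "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n"
    "eth0\t00000000\t0101A8C0\t0003\t0\t0\t0\t00000000\t0\t0\t0\n"
    "eth0\t0001A8C0\t00000000\t0001\t0\t0\t0\t00FFFFFF\t0\t0\t0\n"
)
```

On a real node this could be called as `default_routes(open("/proc/net/route").read())`. An empty result would mean the kernel has no IPv4 default route. A non-empty result, as in this report, suggests the engine's message needs a closer look.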
Re: [ovirt-users] Unable to add host to cluster after network
Hi Stack,

Indeed this is a bug in the engine. We have opened a case [1] and are working to fix it on 4.2 ASAP.

Thanks for posting...

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1570388

Eitan
oVirt | Redhat

On Wed, Apr 18, 2018 at 8:35 PM, ~Stack~ wrote:
[snip]
> I agree with you, Eitan. I did a complete rebuild and made sure my
> alternate network was set to 'not required' before adding the second
> host. I successfully added a second host. It is possible I did something
> else wrong in that first test.
>
> Since this is an acceptable work-around for now, I am going to finish
> building my hosts out so I can move forward with this project.
>
> I would still like feedback on my other questions in the original post
> if anyone is willing.
>
> Thanks!
> ~Stack~

--
Eitan Raviv
IRC: erav (#ovirt #vdsm #devel #rhev-dev)
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Unable to add host to cluster after network
On 04/18/2018 09:55 AM, ~Stack~ wrote:
> On 04/18/2018 08:41 AM, Eitan Raviv wrote:
>> Hi Stack,
>>
>> I read through your ordeal and I would like to post a few comments:
>
> Thanks I appreciate it!
>
>> * When I try to reproduce your scenario with the second network set to
>>   'not required' before on-boarding the second host, it is processed
>>   and set to 'up' by the engine without any hiccups or any errors in
>>   the log.
>
> Hrm. Yeah, I think I can reproduce the failure. I've only done it once,
> but I have the chance to test so just to make sure I've got the right
> information I'm going to run another test specifically for it.
>

I agree with you, Eitan. I did a complete rebuild and made sure my alternate network was set to 'not required' before adding the second host. I successfully added a second host. It is possible I did something else wrong in that first test.

Since this is an acceptable work-around for now, I am going to finish building my hosts out so I can move forward with this project.

I would still like feedback on my other questions in the original post if anyone is willing.

Thanks!
~Stack~
Re: [ovirt-users] Unable to add host to cluster after network
On 04/18/2018 09:55 AM, ~Stack~ wrote:
> On 04/18/2018 08:41 AM, Eitan Raviv wrote:
[snip]
>> but on my setup it can be resolved: initially the second
>> network is proclaimed missing and the host becomes non-operational,
>> with its interfaces disappearing from the engine as you reported.
>> But if the second network is rendered 'not-required' or even deleted
>> for that matter from the engine, engine succeeds in reconnecting to
>> the second host within a couple of minutes, and the host gains 'up'
>> status.
>
> Setting the second network to 'not-required' does not seem to break my
> hosts out of their infinite loop.

Confirmed. Setting the second network to 'not required' did not break the loop. I hard powered off the box, let ovirt set it as down (thus breaking the loop), then powered it back on. The loop continued (at least twice anyway - takes roughly 5 minutes for a loop).

>
> I haven't tried deleting the second network yet. Let me try that before
> I rebuild to test the first point.

Confirmed. Same thing as above only this time I deleted every network but ovirtmgmt. Again, went through 2 full loops without resolving.

I am going to do a fresh rebuild and test by having the second network set to 'not required' before adding a second host.

~Stack~
Re: [ovirt-users] Unable to add host to cluster after network
On 04/18/2018 08:41 AM, Eitan Raviv wrote:
> Hi Stack,
>
> I read through your ordeal and I would like to post a few comments:

Thanks, I appreciate it!

> * When I try to reproduce your scenario with the second network set to
>   'not required' before on-boarding the second host, it is processed
>   and set to 'up' by the engine without any hiccups or any errors in
>   the log.

Hrm. Yeah, I think I can reproduce the failure. I've only done it once, but I have the chance to test, so just to make sure I've got the right information I'm going to run another test specifically for it.

> * On the other hand, if the network is 'required' the scenario
>   reproduces,

Whoo! I'm not completely crazy! I'm just lucky to discover a new bug I suppose. :-)

> but on my setup it can be resolved: initially the second
> network is proclaimed missing and the host becomes non-operational,
> with its interfaces disappearing from the engine as you reported.
> But if the second network is rendered 'not-required' or even deleted
> for that matter from the engine, engine succeeds in reconnecting to
> the second host within a couple of minutes, and the host gains 'up'
> status.

Setting the second network to 'not-required' does not seem to break my hosts out of their infinite loop.

I haven't tried deleting the second network yet. Let me try that before I rebuild to test the first point.

Thank you for your feedback. It is much appreciated.

~Stack~
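The 'required' versus 'not required' distinction discussed in this exchange can be pictured with a deliberately simplified model. This is a sketch in Python of the rule as described in the thread, not the engine's actual code. The function name `host_status`, the status labels, and the network name `bmc` are hypothetical, and the model does not reproduce the bug itself.

```python
def host_status(required_networks, host_networks):
    """Simplified model: a host is Non-Operational when any network the
    cluster marks as required is not attached to the host."""
    missing = sorted(set(required_networks) - set(host_networks))
    if missing:
        return ("NonOperational", missing)
    return ("Up", [])


# Second network marked 'required' but absent on the new host.
print(host_status({"ovirtmgmt", "bmc"}, {"ovirtmgmt"}))
# Second network marked 'not required': only ovirtmgmt is checked.
print(host_status({"ovirtmgmt"}, {"ovirtmgmt"}))
```

Under this model, marking the second network 'not required' removes it from the set being checked, which matches the work-around of changing the setting before adding the second host.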
Re: [ovirt-users] Unable to add host to cluster after network
Hi Stack,

I read through your ordeal and I would like to post a few comments:

- When I try to reproduce your scenario with the second network set to 'not required' before on-boarding the second host, it is processed and set to 'up' by the engine without any hiccups or any errors in the log.
- On the other hand, if the network is 'required' the scenario reproduces, but on my setup it can be resolved: initially the second network is proclaimed missing and the host becomes non-operational, with its interfaces disappearing from the engine as you reported. But if the second network is rendered 'not-required' or even deleted for that matter from the engine, engine succeeds in reconnecting to the second host within a couple of minutes, and the host gains 'up' status.

HTH

On Tue, Apr 17, 2018 at 11:35 PM, ~Stack~ wrote:
[snip]
[ovirt-users] Unable to add host to cluster after network
Greetings,

After a few days of trial, error, and madness - I *think* I found the source of my problem. Or at least I can now replicate it reliably. These are the basics of my speed-run-to-test-failures setup.

Fresh minimal install of Scientific Linux 7.4 on a physical host for my engine. Add the 4.2 repo and run engine-setup - just blast through the defaults. Configure it with default DC and cluster.

Fresh minimal install of Scientific Linux 7.4 on node1 - configure only the primary network card. Add the ovirt repo.

Add the host into cluster. Provisions just fine. Life is good.

Now here is where things split.

Scenario 1: build node2 same as node 1 configuring only the primary network card and add it as a host. Provisions just fine. Life is good.

Scenario 2: Configure a second network. In my case a BMC/IPMI network. Doesn't matter if it is required or not - both will cause failures however the errors are slightly more evident with required. Make sure the network is assigned to your node1 and is properly assigned an IP and configured in the up state. Now build node2 same as before with only the primary network configured and add it as a host.

Failure followed by infinite loop of setting it into Non-Operational!

The pop-up gives you some crap about "Host has no default route." but that is 100% a red-herring.

Dig a little deeper and you get a message like this:
"node2 does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'"

Ah. That's a bit more relevant, but why can't it configure it? Or at least get to the point where it asks me "Hey, networking is a bit off - do you want to configure that now?" That would be nice...

Fortunately the troubleshooting guide has something about that!
https://www.ovirt.org/documentation/how-to/troubleshooting/troubleshooting/

Unfortunately, it doesn't do anything to help. Even after doing these steps, the loop just keeps going...nothing changes.
https://www.ovirt.org/develop/developer-guide/vdsm/installing-vdsm-from-rpm/

Scratch it all and completely rebuild AGAIN for...

Scenario 3: Configure a second network (BMC) and assign it to node1 just like before. Build out node2 same as node1 but this time add in the EXACT SAME NETWORK CONFIGURATION THAT IS WORKING ON NODE1 - ALL of the ifcfg-* files (but update the IP address to correct host, obviously). Now add it as a host.

Doh! Same error. :-/

OK fine. Let's really get into it. First off, the networking page for the host is blank. It never pulls back the network cards so you can't actually make changes via the web page. Nor can you assign networks. So the web interface doesn't help at all.

Let's look at the engine log instead.

2018-04-17 14:33:00,336-05 INFO [org.ovirt.engine.core.bll.VdsEventListener] (EE-ManagedThreadFactory-engine-Thread-1091) [] ResourceManager::vdsNotResponding entered for Host 'f0a3d515-8ba2-490e-8d65-54edbb52cefc', '192.168.1.4'
2018-04-17 14:33:00,360-05 INFO [org.ovirt.engine.core.bll.pm.VdsNotRespondingTreatmentCommand] (EE-ManagedThreadFactory-engine-Thread-1091) [5291eee5] Lock Acquired to object 'EngineLock:{exclusiveLocks='[f0a3d515-8ba2-490e-8d65-54edbb52cefc=VDS_FENCE]', sharedLocks=''}'
2018-04-17 14:33:00,388-05 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-44) [2b853e43] Host 'node2' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt'
2018-04-17 14:33:00,403-05 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-44) [2b853e43] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host node2 does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
2018-04-17 14:33:00,407-05 INFO [org.ovirt.engine.core.bll.pm.VdsNotRespondingTreatmentCommand] (EE-ManagedThreadFactory-engine-Thread-1091) [5291eee5] Running command: VdsNotRespondingTreatmentCommand internal: true. Entities affected : ID: f0a3d515-8ba2-490e-8d65-54edbb52cefc Type: VDS

There's the message from before. Good. On the right track. Not sure why it thinks the host is unreachable because the host is just fine.

2018-04-17 14:33:01,978-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-31) [] Command 'GetAllVmStatsVDSCommand(HostName = node2, VdsIdVDSCommandParametersBase:{hostId='f0a3d515-8ba2-490e-8d65-54edbb52cefc'})' execution failed: java.net.NoRouteToHostException: No route to host

Huh. Again with the no route to host. But THERE IS! The network is functioning perfectly. IP's all work. DNS all works. Routing is fine. I have no idea what it is complaining about.

2018-04-17 14:33:03,873-05 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-39) [4f72afaa] START, SetVdsStatusVDSCommand(HostName = node2, SetVdsStatusVDSCommandParameters:{hos
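When the Non-Operational loop repeats every few minutes, it can help to pull just the relevant events out of the engine log instead of reading it by eye. Below is a small sketch in Python, assuming the message layout shown in the excerpt in this post. The regular expression, the helper name `missing_networks`, and the comma-separated handling of several network names are assumptions for illustration, not part of oVirt.

```python
import re

# Matches the SetNonOperationalVdsCommand message quoted in the post.
# The pattern is an assumption based only on that excerpt.
NON_OPERATIONAL = re.compile(
    r"Host '(?P<host>[^']+)' is set to Non-Operational, "
    r"it is missing the following networks: '(?P<networks>[^']*)'"
)


def missing_networks(log_text):
    """Map each host name to the set of networks the engine reported missing."""
    result = {}
    for match in NON_OPERATIONAL.finditer(log_text):
        # Assumed: several networks would be listed comma-separated.
        names = {n.strip() for n in match.group("networks").split(",") if n.strip()}
        result.setdefault(match.group("host"), set()).update(names)
    return result


# One line copied from the log excerpt in the post.
SAMPLE_LOG = (
    "2018-04-17 14:33:00,388-05 ERROR "
    "[org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] "
    "(EE-ManagedThreadFactory-engineScheduled-Thread-44) [2b853e43] "
    "Host 'node2' is set to Non-Operational, it is missing the following "
    "networks: 'ovirtmgmt'\n"
)

print(missing_networks(SAMPLE_LOG))
```

The same function could be fed the full text of the engine log, which is typically /var/log/ovirt-engine/engine.log on the engine machine, to see whether the set of reported networks changes between loops.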
Re: [ovirt-users] Unable to add host
On Sun, Nov 20, 2016 at 10:54 AM, Oscar Segarra wrote:
> I'd like to know the difference between None, Deploy and Undeploy from the
> Hosted Engine option as well:
>
> [image: Inline images 1]

These options refer to the deployment of hosted engine components that allow the hosted engine VM to run on a host. You can get more information from the feature page [1]. I am currently working on improving the UI around this functionality [2], so please let me know if you have any additional questions after checking out the feature page.

[1] http://www.ovirt.org/develop/release-management/features/sla/deploy-hosted-engine-hosts-via-engine/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1369827

> Thanks a lot.
Re: [ovirt-users] Unable to add host
On 11/20/2016 09:24 PM, Oscar Segarra wrote:
> Hi,
>
> When I try to add the second host from the ovirt interface I get the
> following error:
>
> [image: Inline images 2]
>
> Of course, host vdicnode02 does not appear in the GUI and the gluster
> looks perfectly up and in sync:

The UI supports a feature called "Importing host into Ovirt": if a cluster already exists, the user can import that cluster and manage it from the UI. In your case I see that you already have a cluster, so you just need to import it into the UI. To do that, go to the 'clusters' tab, where you will see a link called 'import'. Click that link and a popup for adding the hosts appears. Provide the root password for your hosts, and all hosts that are part of the cluster will be imported into the UI.

> [root@vdicnode02 ~]# gluster volume status
[snip]
> May I activate self-heal?

Activate self-heal? From the volume status output above I see that the SHD (self-heal daemon) process is started and its PID is listed, which simply means that self-heal is active and running.

> I'd like to know the difference between None, Deploy and Undeploy from the
> Hosted Engine option as well:
>
> [image: Inline images 1]

Ah! There is a lot to explain here. I suggest going through the link below for more details.

https://devconfcz2016.sched.org/event/5m20/ovirt-and-gluster-hyperconvergence

Hope the above helps.

> Thanks a lot.
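The check described in this reply (a process counts as running when the Online column shows Y and a PID is listed) can be applied to the whole `gluster volume status` output in this thread at once. Below is a rough sketch in Python, assuming the five-column text layout (process, TCP port, RDMA port, online, PID) with each process on a single line. The function names are made up for illustration, and real output that wraps long brick names across lines would need extra handling.

```python
def parse_volume_status(output):
    """Parse `gluster volume status` text into {volume: [process rows]}.

    Each row is a dict with the process name, TCP port, RDMA port,
    online flag and PID, read from the five-column layout.
    """
    volumes = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Status of volume:"):
            current = line.split(":", 1)[1].strip()
            volumes[current] = []
        elif current and line.startswith(("Brick ", "Self-heal Daemon")):
            # The last four columns never contain spaces; the name may.
            parts = line.rsplit(None, 4)
            if len(parts) != 5:
                continue  # skip lines that do not fit the assumed layout
            name, tcp, rdma, online, pid = parts
            volumes[current].append(
                {"name": name, "tcp": tcp, "rdma": rdma,
                 "online": online == "Y", "pid": pid}
            )
    return volumes


def offline_processes(volumes):
    """Return (volume, process name) pairs that are not reported online."""
    return [(vol, row["name"])
            for vol, rows in volumes.items()
            for row in rows if not row["online"]]
```

For the output posted in this thread, `offline_processes` would return an empty list, since every brick and self-heal daemon shows Y with a PID.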
[ovirt-users] Unable to add host
Hi,

When I try to add the second host from the ovirt interface I get the following error:

[image: Inline images 2]

Of course, host vdicnode02 does not appear in the GUI and the gluster looks perfectly up and in sync:

[root@vdicnode02 ~]# gluster volume status
Status of volume: vdic-infr-gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick vdicnode01-priv:/vdic-infr/gv0        49152     0          Y       3039
Brick vdicnode02-priv:/vdic-infr/gv0        49152     0          Y       1999
Brick vdicnode03-priv:/vdic-infr/gv0        49152     0          Y       3456
Self-heal Daemon on localhost               N/A       N/A        Y       3043
Self-heal Daemon on vdicnode03-priv         N/A       N/A        Y       3496
Self-heal Daemon on vdicnode01-priv         N/A       N/A        Y       3267

Task Status of Volume vdic-infr-gv0
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vdic-infr2-gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick vdicnode01-priv:/vdic-infr2/gv0       49153     0          Y       3048
Brick vdicnode02-priv:/vdic-infr2/gv0       49153     0          Y       2026
Brick vdicnode03-priv:/vdic-infr2/gv0       49153     0          Y       3450
Self-heal Daemon on localhost               N/A       N/A        Y       3043
Self-heal Daemon on vdicnode01-priv         N/A       N/A        Y       3267
Self-heal Daemon on vdicnode03-priv         N/A       N/A        Y       3496

Task Status of Volume vdic-infr2-gv0
------------------------------------------------------------------------------
There are no active volume tasks

[root@vdicnode02 ~]#

May I activate self-heal?

I'd like to know the difference between None, Deploy and Undeploy from the Hosted Engine option as well:

[image: Inline images 1]

Thanks a lot.