Hi Łukasz,

I'm happy to read that you found the root cause and fixed the issue. I have seen a good number of issues, some of them also related to DNS caching, but one happening during the provisioning process is a new one for me. Always interesting!
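For anyone who runs into the same situation: one way to catch this before running the playbooks is to compare what the control host resolves with what the authoritative name server returns. A minimal sketch, assuming bind-utils is installed on the control host; the host names and the name server below are placeholders:

    # Compare locally resolved addresses against the authoritative zone.
    # node01/node02.example.com and ns1.example.com are hypothetical names.
    for h in node01.example.com node02.example.com; do
        cached=$(getent hosts "$h" | awk '{print $1}')   # resolution via nsswitch, possibly cached
        fresh=$(dig +short "$h" @ns1.example.com)        # ask the authoritative server directly
        [ "$cached" = "$fresh" ] || echo "$h: cached=$cached, authoritative=$fresh -> stale entry?"
    done

Any host the loop reports on should be re-resolved (or the local cache flushed) before the inventory is used.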
Best regards,

Frédéric

On Wed, Jun 21, 2017 at 7:37 PM, Łukasz Strzelec <[email protected]> wrote:

> Hi Frédéric :),
>
> I found out what the root cause of that behaviour was. You would not
> believe it. But first things first:
>
> It is true that our OSO implementation is, let's say, "unusual". When
> scaling up our cluster we use FQDNs. We also have a self-service portal
> for provisioning new hosts. Our customer ordered 10 Atomic hosts in a
> dedicated VLAN and decided to attach them to the OSO cluster. Before
> doing this, it was decided to change their DNS names.
>
> And here is where the story starts :) The DNS zone was refreshed after
> 45 minutes, but the host from which we were executing the Ansible
> playbooks still had the old IP addresses cached.
>
> So what happened? All Atomic hosts had been properly configured, but all
> entries in the OpenShift configuration contained the wrong IP addresses.
> This is why the cluster was working at the network layer and all nodes
> were reported as "ready", while inside the cluster the configuration was
> messed up.
>
> Your link was very helpful. Thanks to it, I found the wrong
> configuration:
>
> # oc get hostsubnet
> NAME                   HOST                   HOST IP           SUBNET
> rh71-os1.example.com   rh71-os1.example.com   192.168.122.46    10.1.1.0/24
> rh71-os2.example.com   rh71-os2.example.com   192.168.122.18    10.1.2.0/24
> rh71-os3.example.com   rh71-os3.example.com   192.168.122.202   10.1.0.0/24
>
> and at first glance I noticed the wrong IP addresses.
>
> I re-ran the playbook and everything is working like a charm. Thanks a
> lot for your help.
>
> Best regards :)
>
> 2017-06-21 10:12 GMT+02:00 Frederic Giloux <[email protected]>:
>
>> Hi Łukasz,
>>
>> if you don't have connectivity at the service level, it is likely that
>> iptables has not been configured on your new node. You can validate
>> that with iptables -L -n. Compare the result on your new node with one
>> in the other VLAN. If this is confirmed, the master may not be able to
>> connect to the kubelet on the new node (port TCP 10250, as per my
>> previous email). Another thing that could have gone wrong is the
>> population of the OVS table. In that case restarting the node would
>> reinitialise it.
>> Another point: traffic between pods communicating through a service
>> should go through the SDN, which means your network team should only
>> see SDN packets between nodes at a firewall between the VLANs, not
>> traffic to your service IP range.
>> This resource should also be of help:
>> https://docs.openshift.com/container-platform/3.5/admin_guide/sdn_troubleshooting.html#debugging-a-service
>>
>> I hope this helps.
>>
>> Regards,
>>
>> Frédéric
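In concrete terms, the comparison and the service-level probe suggested above might look like the sketch below (the registry service IP is the one from the build log further down in the thread; the file names are illustrative):

    # On a working VLAN_A node and on the new VLAN_B node, dump the rules:
    iptables -L -n > /tmp/iptables-$(hostname).txt

    # Copy both dumps to one host and compare them:
    diff /tmp/iptables-nodeA.txt /tmp/iptables-nodeB.txt

    # Probe a service IP directly from each node, e.g. the registry service:
    curl -sv --connect-timeout 5 http://172.30.123.59:5000/ \
        || echo "registry service IP unreachable from $(hostname)"

If the rules match but the probe still fails with "no route to host" from the VLAN_B node only, the filtering between the VLANs (or the SDN/VXLAN path) is a more likely suspect than the node configuration itself.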
>> On Wed, Jun 21, 2017 at 9:27 AM, Łukasz Strzelec
>> <[email protected]> wrote:
>>
>>> Hello :)
>>>
>>> Thanks for the quick reply.
>>>
>>> I did; I mean, the mentioned ports have been opened. All nodes are
>>> visible to each other, and oc get nodes shows them as "Ready".
>>>
>>> But pushing to the registry, or simply testing connectivity to
>>> endpoint or service IPs, gives "no route to host".
>>>
>>> Do you know how to test this properly?
>>>
>>> The network guy is telling me that he sees some denies from VLAN_B to
>>> the 172.30.0.0/16 network. He also assures me that traffic on port
>>> 4789 is allowed.
>>>
>>> I did some tests once again:
>>>
>>> I tried to deploy the example Ruby application, and it stops at
>>> pushing to the registry :(
>>>
>>> Also, when I deploy a simple pod (hello-openshift) and then expose the
>>> service, I cannot reach the website. I'm seeing the default route page
>>> with the information that the application doesn't exist.
>>>
>>> Please see the logs below:
>>>
>>> Fetching gem metadata from https://rubygems.org/...............
>>> Fetching version metadata from https://rubygems.org/..
>>> Warning: the running version of Bundler is older than the version
>>> that created the lockfile. We suggest you upgrade to the latest
>>> version of Bundler by running `gem install bundler`.
>>> Installing puma 3.4.0 with native extensions
>>> Installing rack 1.6.4
>>> Using bundler 1.10.6
>>> Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
>>> Gems in the groups development and test were not installed.
>>> Bundled gems are installed into ./bundle.
>>> ---> Cleaning up unused ruby gems ...
>>> Warning: the running version of Bundler is older than the version
>>> that created the lockfile. We suggest you upgrade to the latest
>>> version of Bundler by running `gem install bundler`.
>>> Pushing image 172.30.123.59:5000/testshared/ddddd:latest ...
>>> Registry server Address:
>>> Registry server User Name: serviceaccount
>>> Registry server Email: [email protected]
>>> Registry server Password: <<non-empty>>
>>> error: build error: Failed to push image: Put
>>> http://172.30.123.59:5000/v1/repositories/testshared/ddddd/: dial tcp
>>> 172.30.123.59:5000: getsockopt: no route to host
>>>
>>> 2017-06-21 7:31 GMT+02:00 Frederic Giloux <[email protected]>:
>>>
>>>> Hi Łukasz,
>>>>
>>>> this is not an unusual setup. You will need:
>>>> - the SDN port: 4789 UDP (both directions: masters/nodes to nodes)
>>>> - the kubelet port: 10250 TCP (masters to nodes)
>>>> - the DNS port: 8053 TCP/UDP (nodes to masters)
>>>> If you can't reach VLAN_B pods from VLAN_A, the issue is probably
>>>> with the SDN port. Mind that it uses UDP.
>>>>
>>>> Regards,
>>>>
>>>> Frédéric
>>>>
>>>> On Wed, Jun 21, 2017 at 4:13 AM, Łukasz Strzelec
>>>> <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I have to install OSO with dedicated hardware nodes for one of my
>>>>> customers.
>>>>>
>>>>> The current cluster is placed in a VLAN called (for the sake of this
>>>>> question) VLAN_A.
>>>>>
>>>>> The customer's nodes have to be placed in another VLAN: VLAN_B.
>>>>>
>>>>> Now the question: which ports and routes do I have to set up to get
>>>>> this to work?
>>>>>
>>>>> The assumption is that traffic between the VLANs is filtered by
>>>>> default.
>>>>>
>>>>> Now, what I already did:
>>>>>
>>>>> I opened the ports in accordance with the documentation, then scaled
>>>>> up the cluster (Ansible playbook).
>>>>>
>>>>> At first sight everything was working fine. The nodes were ready and
>>>>> I could deploy a simple pod (e.g. hello-openshift), but I couldn't
>>>>> reach the service. During the S2I process, pushing to the registry
>>>>> ends with "no route to host". I've checked this out, and for nodes
>>>>> placed in VLAN_A (the same one as the registry and router)
>>>>> everything works fine. The problem is the traffic between VLANs A
>>>>> <-> B: I can't reach any service IP of pods deployed on the newly
>>>>> added nodes. Thus, traffic between pods over the service subnet is
>>>>> not allowed. The question is what I should open: the whole
>>>>> 172.30.0.0/16 range between those two VLANs, or dedicated rules
>>>>> to/from the registry, router, metrics and so on?
>>>>>
>>>>> --
>>>>> Ł.S.
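For reference, opening the three ports listed in the quoted email could be done along these lines (a sketch using plain iptables; a real setup would restrict the source addresses to the master and node subnets instead of accepting traffic from anywhere):

    # On the nodes: allow SDN and kubelet traffic.
    iptables -A INPUT -p udp --dport 4789  -j ACCEPT   # VXLAN/SDN, masters/nodes -> nodes
    iptables -A INPUT -p tcp --dport 10250 -j ACCEPT   # kubelet, masters -> nodes

    # On the masters: allow DNS queries coming from the nodes.
    iptables -A INPUT -p tcp --dport 8053 -j ACCEPT    # DNS, nodes -> masters
    iptables -A INPUT -p udp --dport 8053 -j ACCEPT

The equivalent permits would of course also have to exist on the firewall between VLAN_A and VLAN_B.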
--
Frédéric Giloux
Senior Middleware Consultant
Red Hat Germany

[email protected]    M: +49-174-172-4661

redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
________________________________________________________________________

Red Hat GmbH, http://www.de.redhat.com/ Sitz: Grasbrunn,
Handelsregister: Amtsgericht München, HRB 153243
Geschäftsführer: Paul Argiry, Charles Cachera, Michael Cunningham,
Michael O'Neill
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
