Hi Łukasz,

I'm happy to read that you found the root cause and fixed the issue. I have seen a good number of issues, some of them also related to DNS caching, but one happening during the provisioning process is a new one for me. Always interesting!
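For anyone who runs into the same situation: one way to catch this before running the playbooks is to compare what the control host resolves with what the authoritative name server returns. A minimal sketch, assuming bind-utils is installed on the control host; the host names and the name server below are placeholders:

    # Compare locally resolved addresses against the authoritative zone.
    # node01/node02.example.com and ns1.example.com are hypothetical names.
    for h in node01.example.com node02.example.com; do
        cached=$(getent hosts "$h" | awk '{print $1}')   # resolution via nsswitch, possibly cached
        fresh=$(dig +short "$h" @ns1.example.com)        # ask the authoritative server directly
        [ "$cached" = "$fresh" ] || echo "$h: cached=$cached, authoritative=$fresh -> stale entry?"
    done

Any host the loop reports on should be re-resolved (or the local cache flushed) before the inventory is used.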
Best regards,

Frédéric

On Wed, Jun 21, 2017 at 7:37 PM, Łukasz Strzelec <[email protected]> wrote:

> Hi Frédéric :),
>
> I found out what the root cause of that behaviour was. You would not
> believe it. But first things first:
>
> It is true that our OSO implementation is, let's say, "unusual". When
> scaling up our cluster we use FQDNs. We also have a self-service portal
> for provisioning new hosts. Our customer ordered 10 Atomic hosts in a
> dedicated VLAN and decided to attach them to the OSO cluster. Before
> doing this, it was decided to change their DNS names.
>
> And here is where the story starts :) The DNS zone was refreshed after
> 45 minutes, but the host from which we were executing the Ansible
> playbooks still had the old IP addresses cached.
>
> So what happened? All Atomic hosts had been properly configured, but all
> entries in the OpenShift configuration contained the wrong IP addresses.
> This is why the cluster was working at the network layer and all nodes
> were reported as "ready", while inside the cluster the configuration was
> messed up.
>
> Your link was very helpful. Thanks to it, I found the wrong
> configuration:
>
> # oc get hostsubnet
> NAME                   HOST                   HOST IP           SUBNET
> rh71-os1.example.com   rh71-os1.example.com   192.168.122.46    10.1.1.0/24
> rh71-os2.example.com   rh71-os2.example.com   192.168.122.18    10.1.2.0/24
> rh71-os3.example.com   rh71-os3.example.com   192.168.122.202   10.1.0.0/24
>
> and at first glance I noticed the wrong IP addresses.
>
> I re-ran the playbook and everything is working like a charm. Thanks a
> lot for your help.
>
> Best regards :)
>
> 2017-06-21 10:12 GMT+02:00 Frederic Giloux <[email protected]>:
>
>> Hi Łukasz,
>>
>> if you don't have connectivity at the service level, it is likely that
>> iptables has not been configured on your new node. You can validate
>> that with iptables -L -n. Compare the result on your new node with one
>> in the other VLAN. If this is confirmed, the master may not be able to
>> connect to the kubelet on the new node (port TCP 10250, as per my
>> previous email). Another thing that could have gone wrong is the
>> population of the OVS table. In that case restarting the node would
>> reinitialise it.
>> Another point: traffic between pods communicating through a service
>> should go through the SDN, which means your network team should only
>> see SDN packets between nodes at a firewall between the VLANs, not
>> traffic to your service IP range.
>> This resource should also be of help:
>> https://docs.openshift.com/container-platform/3.5/admin_guide/sdn_troubleshooting.html#debugging-a-service
>>
>> I hope this helps.
>>
>> Regards,
>>
>> Frédéric
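In concrete terms, the comparison and the service-level probe suggested above might look like the sketch below (the registry service IP is the one from the build log further down in the thread; the file names are illustrative):

    # On a working VLAN_A node and on the new VLAN_B node, dump the rules:
    iptables -L -n > /tmp/iptables-$(hostname).txt

    # Copy both dumps to one host and compare them:
    diff /tmp/iptables-nodeA.txt /tmp/iptables-nodeB.txt

    # Probe a service IP directly from each node, e.g. the registry service:
    curl -sv --connect-timeout 5 http://172.30.123.59:5000/ \
        || echo "registry service IP unreachable from $(hostname)"

If the rules match but the probe still fails with "no route to host" from the VLAN_B node only, the filtering between the VLANs (or the SDN/VXLAN path) is a more likely suspect than the node configuration itself.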
>> On Wed, Jun 21, 2017 at 9:27 AM, Łukasz Strzelec
>> <[email protected]> wrote:
>>
>>> Hello :)
>>>
>>> Thanks for the quick reply.
>>>
>>> I did; I mean, the mentioned ports have been opened. All nodes are
>>> visible to each other, and oc get nodes shows them as "Ready".
>>>
>>> But pushing to the registry, or simply testing connectivity to
>>> endpoint or service IPs, gives "no route to host".
>>>
>>> Do you know how to test this properly?
>>>
>>> The network guy is telling me that he sees some denies from VLAN_B to
>>> the 172.30.0.0/16 network. He also assures me that traffic on port
>>> 4789 is allowed.
>>>
>>> I did some tests once again:
>>>
>>> I tried to deploy the example Ruby application, and it stops at
>>> pushing to the registry :(
>>>
>>> Also, when I deploy a simple pod (hello-openshift) and then expose the
>>> service, I cannot reach the website. I'm seeing the default route page
>>> with the information that the application doesn't exist.
>>>
>>> Please see the logs below:
>>>
>>> Fetching gem metadata from https://rubygems.org/...............
>>> Fetching version metadata from https://rubygems.org/..
>>> Warning: the running version of Bundler is older than the version
>>> that created the lockfile. We suggest you upgrade to the latest
>>> version of Bundler by running `gem install bundler`.
>>> Installing puma 3.4.0 with native extensions
>>> Installing rack 1.6.4
>>> Using bundler 1.10.6
>>> Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
>>> Gems in the groups development and test were not installed.
>>> Bundled gems are installed into ./bundle.
>>> ---> Cleaning up unused ruby gems ...
>>> Warning: the running version of Bundler is older than the version
>>> that created the lockfile. We suggest you upgrade to the latest
>>> version of Bundler by running `gem install bundler`.
>>> Pushing image 172.30.123.59:5000/testshared/ddddd:latest ...
>>> Registry server Address:
>>> Registry server User Name: serviceaccount
>>> Registry server Email: [email protected]
>>> Registry server Password: <<non-empty>>
>>> error: build error: Failed to push image: Put
>>> http://172.30.123.59:5000/v1/repositories/testshared/ddddd/: dial tcp
>>> 172.30.123.59:5000: getsockopt: no route to host
>>>
>>> 2017-06-21 7:31 GMT+02:00 Frederic Giloux <[email protected]>:
>>>
>>>> Hi Łukasz,
>>>>
>>>> this is not an unusual setup. You will need:
>>>> - the SDN port: 4789 UDP (both directions: masters/nodes to nodes)
>>>> - the kubelet port: 10250 TCP (masters to nodes)
>>>> - the DNS port: 8053 TCP/UDP (nodes to masters)
>>>> If you can't reach VLAN_B pods from VLAN_A, the issue is probably
>>>> with the SDN port. Mind that it uses UDP.
>>>>
>>>> Regards,
>>>>
>>>> Frédéric
>>>>
>>>> On Wed, Jun 21, 2017 at 4:13 AM, Łukasz Strzelec
>>>> <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I have to install OSO with dedicated hardware nodes for one of my
>>>>> customers.
>>>>>
>>>>> The current cluster is placed in a VLAN called (for the sake of this
>>>>> question) VLAN_A.
>>>>>
>>>>> The customer's nodes have to be placed in another VLAN: VLAN_B.
>>>>>
>>>>> Now the question: which ports and routes do I have to set up to get
>>>>> this to work?
>>>>>
>>>>> The assumption is that traffic between the VLANs is filtered by
>>>>> default.
>>>>>
>>>>> Now, what I already did:
>>>>>
>>>>> I opened the ports in accordance with the documentation, then scaled
>>>>> up the cluster (Ansible playbook).
>>>>>
>>>>> At first sight everything was working fine. The nodes were ready and
>>>>> I could deploy a simple pod (e.g. hello-openshift), but I couldn't
>>>>> reach the service. During the S2I process, pushing to the registry
>>>>> ends with "no route to host". I've checked this out, and for nodes
>>>>> placed in VLAN_A (the same one as the registry and router)
>>>>> everything works fine. The problem is the traffic between VLANs A
>>>>> <-> B: I can't reach any service IP of pods deployed on the newly
>>>>> added nodes. Thus, traffic between pods over the service subnet is
>>>>> not allowed. The question is what I should open: the whole
>>>>> 172.30.0.0/16 range between those two VLANs, or dedicated rules
>>>>> to/from the registry, router, metrics and so on?
>>>>>
>>>>> --
>>>>> Ł.S.
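For reference, opening the three ports listed in the quoted email could be done along these lines (a sketch using plain iptables; a real setup would restrict the source addresses to the master and node subnets instead of accepting traffic from anywhere):

    # On the nodes: allow SDN and kubelet traffic.
    iptables -A INPUT -p udp --dport 4789  -j ACCEPT   # VXLAN/SDN, masters/nodes -> nodes
    iptables -A INPUT -p tcp --dport 10250 -j ACCEPT   # kubelet, masters -> nodes

    # On the masters: allow DNS queries coming from the nodes.
    iptables -A INPUT -p tcp --dport 8053 -j ACCEPT    # DNS, nodes -> masters
    iptables -A INPUT -p udp --dport 8053 -j ACCEPT

The equivalent permits would of course also have to exist on the firewall between VLAN_A and VLAN_B.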
--
Frédéric Giloux
Senior Middleware Consultant
Red Hat Germany

[email protected]    M: +49-174-172-4661

redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
________________________________________________________________________

Red Hat GmbH, http://www.de.redhat.com/ Sitz: Grasbrunn,
Handelsregister: Amtsgericht München, HRB 153243
Geschäftsführer: Paul Argiry, Charles Cachera, Michael Cunningham,
Michael O'Neill
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
