My environment is set up on VMs provided by OpenStack.

It seems that the nodes that were not working were created from a resource pool in 
which OpenStack has a different version of OVS.
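
In case it helps anyone hitting the same thing, the OVS version on each node can be 
compared with something like the following (run on every node; the package name may 
differ by distribution):

rpm -q openvswitch
ovs-vsctl --version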

As I have destroyed the environment and want to try again, I can't get more 
information right now.


Thanks,

Jared, (韦煜)
Software developer
Interested in open source software, big data, Linux

________________________________
From: Aleksandar Lazic <al...@me2digital.eu>
Sent: Tuesday, October 24, 2017 12:18:55 AM
To: Yu Wei; users@lists.openshift.redhat.com
Subject: Re: Network issues with openvswitch

Hi Yu Wei.

Interesting issue.
What's the difference between the nodes where the connection works and the ones 
where it does not?

Could you please share some more information?

I assume this is on AWS. Is UDP port 4789 open from everywhere, as described in 
the docs?
https://docs.openshift.org/3.6/install_config/install/prerequisites.html#prereq-network-access

And of course the other ports as well.
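
A quick way to check the VXLAN port on the hosts themselves (rule layout varies by 
installer; the cloud security groups in front of the VMs also need to allow UDP 4789):

# on each node, verify the firewall accepts VXLAN traffic
iptables -nL -v | grep 4789
# and watch for incoming VXLAN packets while contacting a pod on that node
# (interface name is just an example)
tcpdump -nn -i eth0 'udp port 4789'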

oc get nodes
oc describe svc -n default docker-registry

Have you rebooted the non-working nodes?
Are there errors in the journald logs?
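
For example, something along these lines (service names are from an RPM-based 
Origin 3.6 install and may differ on your setup):

journalctl -u origin-node --since "1 hour ago" --no-pager | grep -i error
journalctl -u openvswitch --since "1 hour ago" --no-pager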

Best Regards
Aleks

On Monday, 23 October 2017 at 04:38 was written:


        Hi Aleks,

I set up an OpenShift Origin cluster with 1 LB + 3 masters + 5 nodes.
On some nodes, the pods running on them can't be reached from other nodes or from 
pods running on other nodes. The error is "no route to host".
[root@host-10-1-130-32 ~]# curl -kv 
docker-registry.default.svc.cluster.local:5000
* About to connect() to docker-registry.default.svc.cluster.local port 5000 (#0)
*   Trying 172.30.22.28...
* No route to host
* Failed connect to docker-registry.default.svc.cluster.local:5000; No route to 
host
* Closing connection 0
curl: (7) Failed connect to docker-registry.default.svc.cluster.local:5000; No 
route to host

Other nodes work fine.
In my previous mail, the hostname of the node is host-10-1-130-32.
The output of "ifconfig tun0" is below:
[root@host-10-1-130-32 ~]# ifconfig tun0
tun0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
       inet 10.130.2.1  netmask 255.255.254.0  broadcast 0.0.0.0
       inet6 fe80::cc50:3dff:fe07:9ea2  prefixlen 64  scopeid 0x20<link>
       ether ce:50:3d:07:9e:a2  txqueuelen 1000  (Ethernet)
       RX packets 97906  bytes 8665783 (8.2 MiB)
       RX errors 0  dropped 0  overruns 0  frame 0
       TX packets 163379  bytes 27405744 (26.1 MiB)
       TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
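
For reference, the SDN bridge on the node can also be inspected directly; br0 and 
the OpenFlow13 protocol are what the openshift-sdn plugin uses by default, so adjust 
if your setup differs:

ovs-vsctl show
ovs-ofctl -O OpenFlow13 dump-flows br0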

I also captured packets via tcpdump and found the following:
10.1.130.32.58147 > 10.1.236.92.4789: [no cksum] VXLAN, flags [I] (0x08), vni 0
ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.128.1.45 tell 
10.130.2.1, length 28
       0x0000:  04f9 38ae 659b fa16 3e6c dd90 0800 4500  ..8.e...>l....E.
       0x0010:  004e 543c 4000 4011 63e4 0a01 8220 0a01  .NT<@.@.c.......
       0x0020:  ec5c e323 12b5 003a 0000 0800 0000 0000  .\.#...:........
       0x0030:  0000 ffff ffff ffff ce50 3d07 9ea2 0806  .........P=.....
       0x0040:  0001 0800 0604 0001 ce50 3d07 9ea2 0a82  .........P=.....
       0x0050:  0201 0000 0000 0000 0a80 012d            ...........-
  25  00:22:47.214387 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 
10.1.130.2 tell 10.1.130.45, length 46
       0x0000:  ffff ffff ffff fa16 3e5a a862 0806 0001  ........>Z.b....
       0x0010:  0800 0604 0001 fa16 3e5a a862 0a01 822d  ........>Z.b...-
       0x0020:  0000 0000 0000 0a01 8202 0000 0000 0000  ................
       0x0030:  0000 0000 0000 0000 0000 0000            ............
  26  00:22:47.258344 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 
24) :: > ff02::1:ffa1:1fbb: [icmp6 sum ok] ICMP6, neighbor solicitation, length 
24, who has fe80::824:c2ff:fea1:1fbb
       0x0000:  3333 ffa1 1fbb 0a24 c2a1 1fbb 86dd 6000  33.....$......`.
       0x0010:  0000 0018 3aff 0000 0000 0000 0000 0000  ....:...........
       0x0020:  0000 0000 0000 ff02 0000 0000 0000 0000  ................
       0x0030:  0001 ffa1 1fbb 8700 724a 0000 0000 fe80  ........rJ......
       0x0040:  0000 0000 0000 0824 c2ff fea1 1fbb       .......$......
  27  00:22:47.282619 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 
10.1.130.2 tell 10.1.130.73, length 46
       0x0000:  ffff ffff ffff fa16 3ec4 a9be 0806 0001  ........>.......
       0x0010:  0800 0604 0001 fa16 3ec4 a9be 0a01 8249  ........>......I
       0x0020:  0000 0000 0000 0a01 8202 0000 0000 0000  ................
       0x0030:  0000 0000 0000 0000 0000 0000            ............

I don't understand why the IPs marked in red above are involved.

Thanks,
Jared, (韦煜)
Software developer
Interested in open source software, big data, Linux
________________________________
From: Aleksandar Lazic <al...@me2digital.eu>
Sent: Monday, October 23, 2017 2:34:13 AM
To: Yu Wei; users@lists.openshift.redhat.com; d...@lists.openshift.redhat.com
Subject: Re: Network issues with openvswitch

Hi Yu Wei.

On Sunday, 22 October 2017 at 19:13 was written:

> Hi,

> I executed the following command on a worker node of an OpenShift Origin 3.6 cluster.
>
> [root@host-10-1-130-32 ~]# traceroute docker-registry.default.svc
> traceroute to docker-registry.default.svc (172.30.22.28), 30 hops max, 60 
> byte packets
>  1  bogon (10.130.2.1)  3005.715 ms !H  3005.682 ms !H  3005.664 ms !H
>  It seems the content marked in red should be the hostname of the worker node.
>  How can I debug such an issue? Where should I start?

What's the hostname of the node?
I'm not sure what you are trying to debug or what problem you are trying to
solve.

> Thanks,

> Jared, (韦煜)
>  Software developer
>  Interested in open source software, big data, Linux

--
Best Regards
Aleks

_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
