My environment is set up on VMs provided by OpenStack. It seems the non-working nodes were created from a resource pool in which OpenStack runs a different version of OVS.
As I have destroyed the environment and want to try again, I couldn't get more information now.

Thanks,
Jared, (韦煜)
Software developer
Interested in open source software, big data, Linux

________________________________
From: Aleksandar Lazic <al...@me2digital.eu>
Sent: Tuesday, October 24, 2017 12:18:55 AM
To: Yu Wei; users@lists.openshift.redhat.com
Subject: Re: Network issues with openvswitch

Hi Yu Wei.

Interesting issue.
What's the difference between the nodes from which the connection works and
the ones from which it does not?

Please can you share some more information?
I assume this is on AWS; is UDP port 4789 open from everywhere, as described
in the doc?

https://docs.openshift.org/3.6/install_config/install/prerequisites.html#prereq-network-access

and of course the other ports also.

oc get nodes
oc describe svc -n default docker-registry

Have you rebooted the non-working nodes?
Are there errors in the journald logs?

Best Regards
Aleks

on Monday, 23 October 2017 at 04:38 was written:

Hi Aleks,
I set up an OpenShift Origin cluster with 1 lb + 3 masters + 5 nodes.
On some nodes, the pods running there cannot be reached from other nodes or
from pods running on other nodes. It reports "no route to host".

[root@host-10-1-130-32 ~]# curl -kv docker-registry.default.svc.cluster.local:5000
* About to connect() to docker-registry.default.svc.cluster.local port 5000 (#0)
*   Trying 172.30.22.28...
* No route to host
* Failed connect to docker-registry.default.svc.cluster.local:5000; No route to host
* Closing connection 0
curl: (7) Failed connect to docker-registry.default.svc.cluster.local:5000; No route to host

Other nodes work fine. In my previous mail, the hostname of the node is
host-10-1-130-32.
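[Editor's note] curl's "No route to host" corresponds to the EHOSTUNREACH errno, which the kernel returns synchronously; a silent firewall drop would instead show up as a timeout, and a reachable host with the port closed gives "connection refused". A minimal Python sketch (the probe() helper is hypothetical, not part of any OpenShift tooling) that distinguishes these failure modes when checking a service endpoint:

```python
import errno
import socket

def probe(host, port, timeout=3.0):
    """Classify how a TCP connect to host:port fails (or succeeds)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"
    except socket.timeout:
        return "timeout"                 # typically a silent drop (filtered)
    except OSError as e:
        if e.errno == errno.EHOSTUNREACH:
            return "no route to host"    # what curl reported from this node
        if e.errno == errno.ECONNREFUSED:
            return "connection refused"  # host reachable, port closed
        raise
    finally:
        s.close()
```

Run from an affected node, probe("172.30.22.28", 5000) should return "no route to host", while a healthy node returns "open"; comparing the result across nodes narrows the fault to routing/SDN rather than the registry itself.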
The output of "ifconfig tun0" is as follows:

[root@host-10-1-130-32 ~]# ifconfig tun0
tun0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.130.2.1  netmask 255.255.254.0  broadcast 0.0.0.0
        inet6 fe80::cc50:3dff:fe07:9ea2  prefixlen 64  scopeid 0x20<link>
        ether ce:50:3d:07:9e:a2  txqueuelen 1000  (Ethernet)
        RX packets 97906  bytes 8665783 (8.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 163379  bytes 27405744 (26.1 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

I also tried to capture packets via tcpdump and found the following:

10.1.130.32.58147 > 10.1.236.92.4789: [no cksum] VXLAN, flags [I] (0x08), vni 0
ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.128.1.45 tell 10.130.2.1, length 28
        0x0000:  04f9 38ae 659b fa16 3e6c dd90 0800 4500  ..8.e...>l....E.
        0x0010:  004e 543c 4000 4011 63e4 0a01 8220 0a01  .NT<@.@.c.......
        0x0020:  ec5c e323 12b5 003a 0000 0800 0000 0000  .\.#...:........
        0x0030:  0000 ffff ffff ffff ce50 3d07 9ea2 0806  .........P=.....
        0x0040:  0001 0800 0604 0001 ce50 3d07 9ea2 0a82  .........P=.....
        0x0050:  0201 0000 0000 0000 0a80 012d            ...........-
25  00:22:47.214387 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.1.130.2 tell 10.1.130.45, length 46
        0x0000:  ffff ffff ffff fa16 3e5a a862 0806 0001  ........>Z.b....
        0x0010:  0800 0604 0001 fa16 3e5a a862 0a01 822d  ........>Z.b...-
        0x0020:  0000 0000 0000 0a01 8202 0000 0000 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000            ............
26  00:22:47.258344 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 24) :: > ff02::1:ffa1:1fbb: [icmp6 sum ok] ICMP6, neighbor solicitation, length 24, who has fe80::824:c2ff:fea1:1fbb
        0x0000:  3333 ffa1 1fbb 0a24 c2a1 1fbb 86dd 6000  33.....$......`.
        0x0010:  0000 0018 3aff 0000 0000 0000 0000 0000  ....:...........
        0x0020:  0000 0000 0000 ff02 0000 0000 0000 0000  ................
        0x0030:  0001 ffa1 1fbb 8700 724a 0000 0000 fe80  ........rJ......
        0x0040:  0000 0000 0000 0824 c2ff fea1 1fbb       .......$......
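[Editor's note] The capture above can be decoded by hand. tun0's MTU of 1450 is the physical 1500 minus the 50-byte VXLAN overhead (20 outer IP + 8 UDP + 8 VXLAN + 14 inner Ethernet), and the first frame is an ARP request from tun0 (10.130.2.1) for pod 10.128.1.45, carried over VXLAN to peer node 10.1.236.92 on UDP port 4789. As far as I know, VNI 0 is expected with the ovs-subnet plugin (under ovs-multitenant only the default project uses VNID 0). A sketch that parses the hex dump of that frame (offsets assume no IP options and no 802.1Q tag):

```python
# Hex bytes of the first VXLAN frame above, as printed by tcpdump -xx:
# outer Ethernet + IPv4 + UDP + VXLAN header + inner Ethernet + ARP.
FRAME = bytes.fromhex(
    "04f938ae659bfa163e6cdd9008004500"
    "004e543c4000401163e40a0182200a01"
    "ec5ce32312b5003a0000080000000000"
    "0000ffffffffffffce503d079ea20806"
    "0001080006040001ce503d079ea20a82"
    "02010000000000000a80012d"
)

def ip(b):
    """Render 4 bytes as a dotted-quad address."""
    return ".".join(str(x) for x in b)

outer_ip = FRAME[14:34]                      # outer IPv4 header (20 bytes)
src, dst = ip(outer_ip[12:16]), ip(outer_ip[16:20])
udp = FRAME[34:42]                           # outer UDP header
dport = int.from_bytes(udp[2:4], "big")
vxlan = FRAME[42:50]                         # 8-byte VXLAN header
vni = int.from_bytes(vxlan[4:7], "big")      # 24-bit VXLAN network ID
inner = FRAME[50:]                           # inner (encapsulated) Ethernet
arp = inner[14:]                             # inner payload: ARP request
sender_ip = ip(arp[14:18])
target_ip = ip(arp[24:28])

print(f"{src} -> {dst}, UDP dport {dport}, VNI {vni}")
# 10.1.130.32 -> 10.1.236.92, UDP dport 4789, VNI 0
print(f"ARP who-has {target_ip} tell {sender_ip}")
# ARP who-has 10.128.1.45 tell 10.130.2.1
```

So the outgoing side looks correct: the encapsulated ARP for the pod leaves this node toward the peer. The question is whether the peer's reply ever comes back, which points at UDP/4789 filtering or an OVS flow problem on one side.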
27  00:22:47.282619 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.1.130.2 tell 10.1.130.73, length 46
        0x0000:  ffff ffff ffff fa16 3ec4 a9be 0806 0001  ........>.......
        0x0010:  0800 0604 0001 fa16 3ec4 a9be 0a01 8249  ........>......I
        0x0020:  0000 0000 0000 0a01 8202 0000 0000 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000            ............

I didn't understand why the IPs marked in red above were involved.

Thanks,
Jared, (韦煜)
Software developer
Interested in open source software, big data, Linux

________________________________
From: Aleksandar Lazic <al...@me2digital.eu>
Sent: Monday, October 23, 2017 2:34:13 AM
To: Yu Wei; users@lists.openshift.redhat.com; d...@lists.openshift.redhat.com
Subject: Re: Network issues with openvswitch

Hi Yu Wei.

on Sunday, 22 October 2017 at 19:13 was written:

> Hi,
> I ran the following command on a worker node of an OpenShift Origin 3.6
> cluster.
>
> [root@host-10-1-130-32 ~]# traceroute docker-registry.default.svc
> traceroute to docker-registry.default.svc (172.30.22.28), 30 hops max, 60
> byte packets
>  1  bogon (10.130.2.1)  3005.715 ms !H  3005.682 ms !H  3005.664 ms !H
>
> It seemed the content marked in red should be the hostname of the worker
> node.
> How could I debug such an issue? Where should I start?

What's the hostname of the node?
I'm not sure what you are trying to debug or what problem you are trying to
solve?

> Thanks,
> Jared, (韦煜)
> Software developer
> Interested in open source software, big data, Linux

--
Best Regards
Aleks
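[Editor's note] A side note on the "bogon" label in the traceroute output: it is most likely not a mangled node hostname but the resolver labeling 10.130.2.1 as a "bogon", i.e. an address from private (RFC 1918) space that should never appear on the public internet. 10.130.2.1 is this node's own tun0 address, and the real signal is the !H flag: the node's SDN gateway is answering "host unreachable" for the service IP. The classification the resolver is effectively doing can be sketched with Python's ipaddress module:

```python
import ipaddress

def is_bogon(addr: str) -> bool:
    """True if addr lies in space that should never be routed publicly
    (RFC 1918 private ranges, link-local, reserved blocks)."""
    a = ipaddress.ip_address(addr)
    return a.is_private or a.is_link_local or a.is_reserved

print(is_bogon("10.130.2.1"))    # True: tun0 address, 10.0.0.0/8
print(is_bogon("172.30.22.28"))  # True: service IP, 172.16.0.0/12
```

Both the SDN pod network and the cluster service range fall in private space, so any reverse lookup of them from a bogon-aware resolver prints "bogon"; it says nothing about the actual fault.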
_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users