satwikk commented on issue #678: URL: https://github.com/apache/openwhisk-deploy-kube/issues/678#issuecomment-827461167
Yes, they were accessible in a way: internally within their own pod, and between pods on the same node. We were running `K8s 1.18.18` with `weave 2.8.1` on `CentOS7 3.10.0-1160.24.1`. After reproducing the issue we went back to square one, since this might not be a problem with the OpenWhisk deployment or the `ow` namespace alone. Sharing a case below showing where we can connect (telnet) to pods and where we cannot.

From the `rook-ceph-tools` pod, DNS resolves both services and telnet to `rook-ceph-mgr` connects, but telnet to `fn-couchdb` hangs:

```bash
[root@rook-ceph-tools-9wbw2 /]# nslookup rook-ceph-mgr.rook-ceph.svc.cluster.local
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   rook-ceph-mgr.rook-ceph.svc.cluster.local
Address: 10.101.148.210

[root@rook-ceph-tools-9wbw2 /]# telnet rook-ceph-mgr.rook-ceph.svc.cluster.local 9283
Trying 10.101.148.210...
Connected to rook-ceph-mgr.rook-ceph.svc.cluster.local.
Escape character is '^]'.
^]
telnet> quit
Connection closed.

[root@rook-ceph-tools-9wbw2 /]# nslookup fn-couchdb.fn.svc.cluster.local
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   fn-couchdb.fn.svc.cluster.local
Address: 10.99.20.83

[root@rook-ceph-tools-9wbw2 /]# telnet fn-couchdb.fn.svc.cluster.local 5984
Trying 10.99.20.83...
^C
```

Inside the CouchDB pod itself, the service is listening and reachable over localhost:

```bash
root@fn-couchdb-848f8bb7c9-dswkt:/# cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.32.0.14      fn-couchdb-848f8bb7c9-dswkt

root@fn-couchdb-848f8bb7c9-dswkt:/# netstat -tupln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:5984            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:9100            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:4369            0.0.0.0:*               LISTEN      -
tcp6       0      0 :::4369                 :::*                    LISTEN      -

root@fn-couchdb-848f8bb7c9-dswkt:/# telnet localhost 5984
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
```

From the `wskadmin` pod (which landed on the same node as CouchDB, per the landscape below), both the readiness check and telnet succeed:

```bash
root@enfn-wskadmin:/# READINESS_URL=http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
root@enfn-wskadmin:/# while true; do echo 'checking CouchDB readiness'; wget -T 5 --spider $READINESS_URL --header="Authorization: Basic d2hpc2tfYWRtaW46TVRreU9UWXpNelJp"; result=$?; if [ $result -eq 0 ]; then echo 'Success: CouchDB is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;
checking CouchDB readiness
Spider mode enabled. Check if remote file exists.
--2021-04-27 06:25:26--  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
Resolving enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)... 10.99.20.83
Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... connected.
HTTP request sent, awaiting response... 200 OK
Length: 442 [application/json]
Remote file exists.

Success: CouchDB is ready!
root@enfn-wskadmin:/# telnet enfn-couchdb.fn.svc.cluster.local 5984
Trying 10.99.20.83...
Connected to enfn-couchdb.fn.svc.cluster.local.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
root@enfn-wskadmin:/#
```
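As a quick aside for anyone debugging something similar: pinning throwaway pods to specific nodes makes the same-node vs. cross-node split easy to demonstrate. A minimal sketch, assuming `busybox` is pullable and using our node names (the `probe-*` pod names are just placeholders; adjust the service and port to your install):

```bash
# Pin one test pod per node via spec.nodeName (bypasses the scheduler).
for node in bm-k8s-master bm-k8s-slave-2 bm-k8s-slave-3; do
  kubectl run "probe-$node" --image=busybox --restart=Never \
    --overrides="{\"spec\":{\"nodeName\":\"$node\"}}" -- sleep 3600
done

# Probe the CouchDB service from each node with a 5-second timeout.
for node in bm-k8s-master bm-k8s-slave-2 bm-k8s-slave-3; do
  echo "--- $node ---"
  kubectl exec "probe-$node" -- \
    wget -T 5 --spider http://fn-couchdb.fn.svc.cluster.local:5984/ \
    && echo connected || echo failed
done

# Clean up.
kubectl delete pod probe-bm-k8s-master probe-bm-k8s-slave-2 probe-bm-k8s-slave-3
```

If only the probe on CouchDB's own node connects, the service and kube-proxy are fine and the overlay's node-to-node path becomes the prime suspect.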
Back on the failing path: the same readiness check succeeds from the CouchDB pod itself, but times out from the `rook-ceph-tools` pod:

```bash
root@enfn-couchdb-848f8bb7c9-dswkt:/# READINESS_URL=http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
root@enfn-couchdb-848f8bb7c9-dswkt:/# while true; do echo 'checking CouchDB readiness'; wget -T 5 --spider $READINESS_URL --header="Authorization: Basic d2hpc2tfYWRtaW46TVRreU9UWXpNelJp"; result=$?; if [ $result -eq 0 ]; then echo 'Success: CouchDB is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;
checking CouchDB readiness
Spider mode enabled. Check if remote file exists.
--2021-04-27 06:23:40--  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
Resolving enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)... 10.99.20.83
Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... connected.
HTTP request sent, awaiting response... 200 OK
Length: 442 [application/json]
Remote file exists.

Success: CouchDB is ready!

[root@rook-ceph-tools-9wbw2 /]# READINESS_URL=http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
[root@rook-ceph-tools-9wbw2 /]# while true; do echo 'checking CouchDB readiness'; wget -T 5 --spider $READINESS_URL --header="Authorization: Basic d2hpc2tfYWRtaW46TVRreU9UWXpNelJp"; result=$?; if [ $result -eq 0 ]; then echo 'Success: CouchDB is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;
checking CouchDB readiness
Spider mode enabled. Check if remote file exists.
--2021-04-27 06:24:21--  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
Resolving enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)... 10.99.20.83
Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... failed: Connection timed out. Retrying.

Spider mode enabled. Check if remote file exists.
--2021-04-27 06:24:27--  (try: 2)  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... failed: Connection timed out. Retrying.

Spider mode enabled. Check if remote file exists.
--2021-04-27 06:24:34--  (try: 3)  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... ^C
```

> In the case above, the wskadmin and couchdb pods were on the same node, whereas in other cases they might not have been.

We will look into this further as time permits and update accordingly. It took some time for me to get back and recreate the issue.
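Given that the failures line up with pod placement, suspicion falls on weave's inter-node data path rather than on kube-proxy or DNS. Some checks worth running (a sketch only; `weave-net-xxxxx` is a placeholder pod name, and weave needs TCP 6783 plus UDP 6783/6784 open between hosts, which firewalld on CentOS 7 can silently drop):

```bash
# Which weave-net pod runs on which node?
kubectl -n kube-system get pods -l name=weave-net -o wide

# Ask weave for its view of peer connections; anything other than an
# "established" fastdp/sleeve entry per peer points at the overlay itself.
kubectl -n kube-system exec weave-net-xxxxx -c weave -- \
  /home/weave/weave --local status connections

# On each CentOS 7 node: is firewalld in the way of weave's ports?
systemctl status firewalld
firewall-cmd --list-all   # TCP 6783 and UDP 6783/6784 must pass between nodes
```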
Our test environment was made up of three nodes:

```bash
NAME             STATUS   ROLES    AGE   VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME   LABELS
bm-k8s-master    Ready    master   22h   v1.18.18   10.99.97.118   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.15    beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=bm-k8s-master,kubernetes.io/os=linux,node-role.kubernetes.io/master=,openwhisk-role=invoker
bm-k8s-slave-2   Ready    <none>   18h   v1.18.18   10.99.97.116   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.15    beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=bm-k8s-slave-2,kubernetes.io/os=linux,openwhisk-role=invoker
bm-k8s-slave-3   Ready    <none>   18h   v1.18.18   10.99.97.115   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.15    beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=bm-k8s-slave-3,kubernetes.io/os=linux,openwhisk-role=invoker
```

The OpenWhisk landscape looks as follows:

```bash
kubectl -n fn get pod,svc,pvc -o wide
Tue Apr 27 11:47:19 2021

NAME                                    READY   STATUS      RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
pod/fn-alarmprovider-5b9c7b9b8d-sh8nm   0/1     Init:0/1    0          65m   10.36.0.10   bm-k8s-slave-3   <none>           <none>
pod/fn-apigateway-74b487d8cb-hn5sb      1/1     Running     0          65m   10.44.0.8    bm-k8s-slave-2   <none>           <none>
pod/fn-controller-0                     0/1     Init:1/2    0          65m   10.44.0.9    bm-k8s-slave-2   <none>           <none>
pod/fn-couchdb-848f8bb7c9-dswkt         1/1     Running     0          65m   10.32.0.14   bm-k8s-master    <none>           <none>
pod/fn-grafana-78dc6fcdff-smvm8         1/1     Running     0          65m   10.36.0.9    bm-k8s-slave-3   <none>           <none>
pod/fn-init-couchdb-fjn8s               0/1     Completed   0          65m   10.32.0.10   bm-k8s-master    <none>           <none>
pod/fn-install-packages-qgplt           0/1     Init:0/1    0          65m   10.32.0.12   bm-k8s-master    <none>           <none>
pod/fn-invoker-0                        0/1     Init:0/1    0          65m   10.32.0.13   bm-k8s-master    <none>           <none>
pod/fn-kafka-0                          1/1     Running     0          65m   10.44.0.10   bm-k8s-slave-2   <none>           <none>
pod/fn-nginx-5d7f747b95-25f7q           0/1     Init:0/1    0          65m   10.44.0.7    bm-k8s-slave-2   <none>           <none>
pod/fn-prometheus-server-0              1/1     Running     0          65m   10.36.0.11   bm-k8s-slave-3   <none>           <none>
pod/fn-redis-6d9f5f56b5-5gbrq           1/1     Running     0          65m   10.32.0.15   bm-k8s-master    <none>           <none>
pod/fn-user-events-7bf9665968-rpgsl     1/1     Running     1          65m   10.32.0.11   bm-k8s-master    <none>           <none>
pod/fn-wskadmin                         1/1     Running     0          65m   10.32.0.9    bm-k8s-master    <none>           <none>
pod/fn-zookeeper-0                      1/1     Running     0          65m   10.36.0.12   bm-k8s-slave-3   <none>           <none>

NAME                           TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE   SELECTOR
service/fn-apigateway          ClusterIP      10.97.7.233      <none>        8080/TCP,9000/TCP            65m   name=fn-apigateway
service/fn-controller          ClusterIP      10.97.171.152    <none>        8080/TCP                     65m   name=fn-controller
service/fn-couchdb             ClusterIP      10.99.20.83      <none>        5984/TCP                     65m   name=fn-couchdb
service/fn-grafana             ClusterIP      10.109.145.44    <none>        3000/TCP                     65m   name=fn-grafana
service/fn-kafka               ClusterIP      None             <none>        9092/TCP                     65m   name=fn-kafka
service/fn-nginx               LoadBalancer   10.102.125.90    10.99.97.5    80:31887/TCP,443:31425/TCP   65m   name=fn-nginx
service/fn-prometheus-server   ClusterIP      10.103.139.29    <none>        9090/TCP                     65m   name=fn-prometheus-server
service/fn-redis               ClusterIP      10.106.167.149   <none>        6379/TCP                     65m   name=fn-redis
service/fn-user-events         ClusterIP      10.108.197.26    <none>        9095/TCP                     65m   name=fn-user-events
service/fn-zookeeper           ClusterIP      None             <none>        2181/TCP,2888/TCP,3888/TCP   65m   name=fn-zookeeper
NAME                                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE   VOLUMEMODE
persistentvolumeclaim/fn-alarmprovider-pvc      Bound    pvc-9372adbd-473c-47e6-98aa-1be2d9d24468   1Gi        RWO            rook-ceph-block   65m   Filesystem
persistentvolumeclaim/fn-couchdb-pvc            Bound    pvc-120389ef-5906-4965-a715-ffee967abcae   2Gi        RWO            rook-ceph-block   65m   Filesystem
persistentvolumeclaim/fn-kafka-pvc              Bound    pvc-156c82de-36b4-4f68-94cb-49e72698ab51   512Mi      RWO            rook-ceph-block   65m   Filesystem
persistentvolumeclaim/fn-prometheus-pvc         Bound    pvc-391c4265-0896-4e59-9c1f-539b479cbbeb   1Gi        RWO            rook-ceph-block   65m   Filesystem
persistentvolumeclaim/fn-redis-pvc              Bound    pvc-f05dc3f5-602e-49a4-960d-3ddbff668773   256Mi      RWO            rook-ceph-block   65m   Filesystem
persistentvolumeclaim/fn-zookeeper-pvc-data     Bound    pvc-311fe633-1100-4ba9-b4e1-64b73b232617   256Mi      RWO            rook-ceph-block   65m   Filesystem
persistentvolumeclaim/fn-zookeeper-pvc-datalog  Bound    pvc-066dd5e7-e301-48a7-b690-7b4f289df7b4   256Mi      RWO            rook-ceph-block   65m   Filesystem
```

Nonetheless, we can confirm that after switching to Calico we were able to deploy OpenWhisk successfully in the same environment.
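For anyone wanting to try the same switch, a rough sketch of the weave-to-calico swap (untested as written; the manifest URLs are the stock ones from the weave and calico docs, and the CNI config file name should be verified on your own nodes):

```bash
# Remove weave (same manifest URL that installs it).
kubectl delete -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

# On every node: clear weave's CNI config and leftover bridge, restart kubelet.
rm -f /etc/cni/net.d/10-weave.conflist
ip link delete weave 2>/dev/null
systemctl restart kubelet

# Install calico; existing pods must be recreated to pick up new pod IPs.
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
```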