satwikk commented on issue #678:
URL: https://github.com/apache/openwhisk-deploy-kube/issues/678#issuecomment-827461167


   Yes, they were accessible in a way: internally within their own pod, and between pods on the same node. We were running `K8s 1.18.18` with `weave 2.8.1` on `CentOS 7` (kernel `3.10.0-1160.24.1`).
   
   After reproducing this issue, we went back to square one; this might not be an issue with the OpenWhisk deployment or the `ow` namespace alone.
   
   Sharing a case below that contrasts services we can connect to (via telnet) with services we cannot:
   ```bash
   [root@rook-ceph-tools-9wbw2 /]# nslookup rook-ceph-mgr.rook-ceph.svc.cluster.local
   Server:         10.96.0.10
   Address:        10.96.0.10#53
   
   Name:   rook-ceph-mgr.rook-ceph.svc.cluster.local
   Address: 10.101.148.210
   
   [root@rook-ceph-tools-9wbw2 /]# telnet rook-ceph-mgr.rook-ceph.svc.cluster.local 9283
   Trying 10.101.148.210...
   Connected to rook-ceph-mgr.rook-ceph.svc.cluster.local.
   Escape character is '^]'.
   ^]
   telnet> quit
   Connection closed.
   [root@rook-ceph-tools-9wbw2 /]# nslookup fn-couchdb.fn.svc.cluster.local
   Server:         10.96.0.10
   Address:        10.96.0.10#53
   
   Name:   fn-couchdb.fn.svc.cluster.local
   Address: 10.99.20.83
   
   [root@rook-ceph-tools-9wbw2 /]# telnet fn-couchdb.fn.svc.cluster.local 5984
   Trying 10.99.20.83...
   
   ^C
   
   root@fn-couchdb-848f8bb7c9-dswkt:/# cat /etc/hosts
   # Kubernetes-managed hosts file.
   127.0.0.1       localhost
   ::1     localhost ip6-localhost ip6-loopback
   fe00::0 ip6-localnet
   fe00::0 ip6-mcastprefix
   fe00::1 ip6-allnodes
   fe00::2 ip6-allrouters
   10.32.0.14      fn-couchdb-848f8bb7c9-dswkt
   root@fn-couchdb-848f8bb7c9-dswkt:/# netstat -tupln
   Active Internet connections (only servers)
   Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
   tcp        0      0 0.0.0.0:5984            0.0.0.0:*               LISTEN      -
   tcp        0      0 0.0.0.0:9100            0.0.0.0:*               LISTEN      -
   tcp        0      0 0.0.0.0:4369            0.0.0.0:*               LISTEN      -
   tcp6       0      0 :::4369                 :::*                    LISTEN      -
   root@fn-couchdb-848f8bb7c9-dswkt:/# telnet localhost 5984
   Trying 127.0.0.1...
   Connected to localhost.
   Escape character is '^]'.
   ^]
   telnet> quit
   Connection closed.
   
   
   root@enfn-wskadmin:/# READINESS_URL=http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
   root@enfn-wskadmin:/# while true; do echo 'checking CouchDB readiness'; wget -T 5 --spider $READINESS_URL --header="Authorization: Basic d2hpc2tfYWRtaW46TVRreU9UWXpNelJp"; result=$?; if [ $result -eq 0 ]; then echo 'Success: CouchDB is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;
   checking CouchDB readiness
   Spider mode enabled. Check if remote file exists.
   --2021-04-27 06:25:26--  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
   Resolving enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)... 10.99.20.83
   Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... connected.
   HTTP request sent, awaiting response... 200 OK
   Length: 442 [application/json]
   Remote file exists.
   
   Success: CouchDB is ready!
   root@enfn-wskadmin:/# telnet enfn-couchdb.fn.svc.cluster.local 5984
   Trying 10.99.20.83...
   Connected to enfn-couchdb.fn.svc.cluster.local.
   Escape character is '^]'.
   ^]
   telnet> quit
   Connection closed.
   root@enfn-wskadmin:/#
   
   
   root@enfn-couchdb-848f8bb7c9-dswkt:/# READINESS_URL=http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
   root@enfn-couchdb-848f8bb7c9-dswkt:/# while true; do echo 'checking CouchDB readiness'; wget -T 5 --spider $READINESS_URL --header="Authorization: Basic d2hpc2tfYWRtaW46TVRreU9UWXpNelJp"; result=$?; if [ $result -eq 0 ]; then echo 'Success: CouchDB is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;
   checking CouchDB readiness
   Spider mode enabled. Check if remote file exists.
   --2021-04-27 06:23:40--  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
   Resolving enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)... 10.99.20.83
   Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... connected.
   HTTP request sent, awaiting response... 200 OK
   Length: 442 [application/json]
   Remote file exists.
   
   Success: CouchDB is ready!
   
   
   [root@rook-ceph-tools-9wbw2 /]# READINESS_URL=http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
   [root@rook-ceph-tools-9wbw2 /]# while true; do echo 'checking CouchDB readiness'; wget -T 5 --spider $READINESS_URL --header="Authorization: Basic d2hpc2tfYWRtaW46TVRreU9UWXpNelJp"; result=$?; if [ $result -eq 0 ]; then echo 'Success: CouchDB is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;
   checking CouchDB readiness
   Spider mode enabled. Check if remote file exists.
   --2021-04-27 06:24:21--  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
   Resolving enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)... 10.99.20.83
   Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... failed: Connection timed out.
   Retrying.
   
   Spider mode enabled. Check if remote file exists.
   --2021-04-27 06:24:27--  (try: 2)  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
   Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... failed: Connection timed out.
   Retrying.
   
   Spider mode enabled. Check if remote file exists.
   --2021-04-27 06:24:34--  (try: 3)  http://enfn-couchdb.fn.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
   Connecting to enfn-couchdb.fn.svc.cluster.local (enfn-couchdb.fn.svc.cluster.local)|10.99.20.83|:5984... ^C
   
   ```
   > In the case above, the wskadmin and couchdb pods were on the same node, whereas in the other cases they were not (or might not have been) on the same node.
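   
   To pin down whether node placement is really the deciding factor, one could probe the CouchDB service from several pods in one pass. A minimal sketch, assuming the probe pods have `wget` available (as both `fn-wskadmin` and the rook toolbox pod did above); the pod and namespace names are the ones from this environment:
   ```bash
   # Probe the CouchDB service from each listed pod and report the result.
   # Extend the list with pods on other nodes to map out the failure pattern.
   for target in fn/fn-wskadmin rook-ceph/rook-ceph-tools-9wbw2; do
     ns=${target%/*}; pod=${target#*/}
     echo "--- probing from $ns/$pod ---"
     kubectl -n "$ns" exec "$pod" -- \
       wget -T 5 -q --spider http://fn-couchdb.fn.svc.cluster.local:5984/ \
       && echo reachable || echo "timed out / refused"
   done
   ```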
   
   We will look into this further as time permits and update accordingly.
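   
   Since the symptoms point at the overlay network rather than OpenWhisk itself, a likely next step is to ask each `weave-net` pod for its view of the mesh. A sketch based on the stock Weave Net manifest (the `name=weave-net` label and the `/home/weave/weave` path are assumptions that hold for the default install):
   ```bash
   # Dump Weave's local peer-connection status from every weave-net pod;
   # broken or 'pending' connections here would explain cross-node timeouts.
   for pod in $(kubectl -n kube-system get pods -l name=weave-net -o jsonpath='{.items[*].metadata.name}'); do
     echo "--- $pod ---"
     kubectl -n kube-system exec "$pod" -c weave -- /home/weave/weave --local status connections
   done
   ```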
   
   It took me some time to get back and recreate the issue. Our test environment was made up of three nodes:
   ```bash
   NAME             STATUS   ROLES    AGE   VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME   LABELS
   bm-k8s-master    Ready    master   22h   v1.18.18   10.99.97.118   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.15    beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=bm-k8s-master,kubernetes.io/os=linux,node-role.kubernetes.io/master=,openwhisk-role=invoker
   bm-k8s-slave-2   Ready    <none>   18h   v1.18.18   10.99.97.116   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.15    beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=bm-k8s-slave-2,kubernetes.io/os=linux,openwhisk-role=invoker
   bm-k8s-slave-3   Ready    <none>   18h   v1.18.18   10.99.97.115   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.15    beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=bm-k8s-slave-3,kubernetes.io/os=linux,openwhisk-role=invoker
   ```
   The OpenWhisk landscape looks as follows:
   ```bash
   kubectl -n fn get pod,svc,pvc -o wide                                             Tue Apr 27 11:47:19 2021
   
   NAME                                    READY   STATUS      RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
   pod/fn-alarmprovider-5b9c7b9b8d-sh8nm   0/1     Init:0/1    0          65m   10.36.0.10   bm-k8s-slave-3   <none>           <none>
   pod/fn-apigateway-74b487d8cb-hn5sb      1/1     Running     0          65m   10.44.0.8    bm-k8s-slave-2   <none>           <none>
   pod/fn-controller-0                     0/1     Init:1/2    0          65m   10.44.0.9    bm-k8s-slave-2   <none>           <none>
   pod/fn-couchdb-848f8bb7c9-dswkt         1/1     Running     0          65m   10.32.0.14   bm-k8s-master    <none>           <none>
   pod/fn-grafana-78dc6fcdff-smvm8         1/1     Running     0          65m   10.36.0.9    bm-k8s-slave-3   <none>           <none>
   pod/fn-init-couchdb-fjn8s               0/1     Completed   0          65m   10.32.0.10   bm-k8s-master    <none>           <none>
   pod/fn-install-packages-qgplt           0/1     Init:0/1    0          65m   10.32.0.12   bm-k8s-master    <none>           <none>
   pod/fn-invoker-0                        0/1     Init:0/1    0          65m   10.32.0.13   bm-k8s-master    <none>           <none>
   pod/fn-kafka-0                          1/1     Running     0          65m   10.44.0.10   bm-k8s-slave-2   <none>           <none>
   pod/fn-nginx-5d7f747b95-25f7q           0/1     Init:0/1    0          65m   10.44.0.7    bm-k8s-slave-2   <none>           <none>
   pod/fn-prometheus-server-0              1/1     Running     0          65m   10.36.0.11   bm-k8s-slave-3   <none>           <none>
   pod/fn-redis-6d9f5f56b5-5gbrq           1/1     Running     0          65m   10.32.0.15   bm-k8s-master    <none>           <none>
   pod/fn-user-events-7bf9665968-rpgsl     1/1     Running     1          65m   10.32.0.11   bm-k8s-master    <none>           <none>
   pod/fn-wskadmin                         1/1     Running     0          65m   10.32.0.9    bm-k8s-master    <none>           <none>
   pod/fn-zookeeper-0                      1/1     Running     0          65m   10.36.0.12   bm-k8s-slave-3   <none>           <none>
   
   NAME                           TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE   SELECTOR
   service/fn-apigateway          ClusterIP      10.97.7.233      <none>        8080/TCP,9000/TCP            65m   name=fn-apigateway
   service/fn-controller          ClusterIP      10.97.171.152    <none>        8080/TCP                     65m   name=fn-controller
   service/fn-couchdb             ClusterIP      10.99.20.83      <none>        5984/TCP                     65m   name=fn-couchdb
   service/fn-grafana             ClusterIP      10.109.145.44    <none>        3000/TCP                     65m   name=fn-grafana
   service/fn-kafka               ClusterIP      None             <none>        9092/TCP                     65m   name=fn-kafka
   service/fn-nginx               LoadBalancer   10.102.125.90    10.99.97.5    80:31887/TCP,443:31425/TCP   65m   name=fn-nginx
   service/fn-prometheus-server   ClusterIP      10.103.139.29    <none>        9090/TCP                     65m   name=fn-prometheus-server
   service/fn-redis               ClusterIP      10.106.167.149   <none>        6379/TCP                     65m   name=fn-redis
   service/fn-user-events         ClusterIP      10.108.197.26    <none>        9095/TCP                     65m   name=fn-user-events
   service/fn-zookeeper           ClusterIP      None             <none>        2181/TCP,2888/TCP,3888/TCP   65m   name=fn-zookeeper
   
   NAME                                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE   VOLUMEMODE
   persistentvolumeclaim/fn-alarmprovider-pvc         Bound    pvc-9372adbd-473c-47e6-98aa-1be2d9d24468   1Gi        RWO            rook-ceph-block   65m   Filesystem
   persistentvolumeclaim/fn-couchdb-pvc               Bound    pvc-120389ef-5906-4965-a715-ffee967abcae   2Gi        RWO            rook-ceph-block   65m   Filesystem
   persistentvolumeclaim/fn-kafka-pvc                 Bound    pvc-156c82de-36b4-4f68-94cb-49e72698ab51   512Mi      RWO            rook-ceph-block   65m   Filesystem
   persistentvolumeclaim/fn-prometheus-pvc            Bound    pvc-391c4265-0896-4e59-9c1f-539b479cbbeb   1Gi        RWO            rook-ceph-block   65m   Filesystem
   persistentvolumeclaim/fn-redis-pvc                 Bound    pvc-f05dc3f5-602e-49a4-960d-3ddbff668773   256Mi      RWO            rook-ceph-block   65m   Filesystem
   persistentvolumeclaim/fn-zookeeper-pvc-data        Bound    pvc-311fe633-1100-4ba9-b4e1-64b73b232617   256Mi      RWO            rook-ceph-block   65m   Filesystem
   persistentvolumeclaim/fn-zookeeper-pvc-datalog     Bound    pvc-066dd5e7-e301-48a7-b690-7b4f289df7b4   256Mi      RWO            rook-ceph-block   65m   Filesystem
   ```
   
   Nonetheless, we can confirm that after switching to Calico we were able to deploy OpenWhisk successfully in the same environment.
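   
   For anyone attempting the same switch, the rough sequence looks like the sketch below. The manifest URLs and cleanup paths are illustrative for these versions, not an exact record of our steps; the per-node cleanup matters because leftover Weave CNI config would otherwise take precedence over the new CNI:
   ```bash
   # Remove Weave Net and install Calico (URLs/versions are assumptions).
   kubectl delete -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
   # On every node, clear the old CNI config and Weave state:
   #   rm -rf /etc/cni/net.d/10-weave.conflist /var/lib/weave
   kubectl apply -f https://docs.projectcalico.org/v3.18/manifests/calico.yaml
   # Recreate workload pods so they get addresses from the new CNI:
   kubectl -n fn delete pods --all
   ```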

