ningyougang opened a new issue #622:
URL: https://github.com/apache/openwhisk-deploy-kube/issues/622


   ### k8s nodes and related environment
   ```shell
   kubectl get nodes --show-labels
   test-k8s-001-ncl   Ready    master   31d   v1.18.3  ... node-role.kubernetes.io/master=,openwhisk-role=edge
   test-k8s-002-ncl   Ready    <none>   31d   v1.18.3  ... openwhisk-role=core
   test-k8s-003-ncl   Ready    <none>   31d   v1.18.3  ... openwhisk-role=invoker
   test-k8s-004-ncl   Ready    <none>   31d   v1.18.3  ... openwhisk-role=invoker
   ```
   |  hostname   | ip |
   |  ----  | ----  |
   | test-k8s-001-ncl |  ip_a |
   | test-k8s-002-ncl  | ip_b |
   | test-k8s-003-ncl  | ip_c |
   | test-k8s-004-ncl  | ip_d |
   ### mycluster.yaml
   ```yaml
   whisk:
     ingress:
       type: NodePort
       apiHostName: ${ip_a}  # physical IP of test-k8s-001-ncl
       apiHostPort: 31001
   
   nginx:
     httpsNodePort: 31001
   ```
   I deployed using the command below:
   ```shell
   helm install ./helm/openwhisk --namespace=openwhisk --generate-name -f mycluster.yaml
   ```
   Then I configured `~/.wskprops`:
   ```shell
   APIHOST=${ip_a}:31001
   AUTH=23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP #guest
   ```
   and created the `hello` test action:
   ```
   wsk -i action create hello /tmp/hello.js --kind nodejs:10
   ```
   #### Reproducing the issue
   I invoked the `hello` action continuously in a `while true` loop:
   ```shell
   wsk -i action invoke hello --result >> result &
   ```
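   To put a number on the failure rate, the loop could also count failed invocations. A minimal sketch, assuming a POSIX shell and that `wsk` exits with a non-zero status on a connection error; `invoke_repeatedly` is a hypothetical helper, not part of `wsk`:

   ```shell
   # Hypothetical helper: run a command COUNT times, append its output to OUTFILE,
   # and print how many runs exited with a non-zero status.
   invoke_repeatedly() {
     count=$1
     outfile=$2
     shift 2
     failures=0
     i=0
     while [ "$i" -lt "$count" ]; do
       # "$@" is the command to run, e.g. wsk -i action invoke hello --result
       if ! "$@" >> "$outfile" 2>&1; then
         failures=$((failures + 1))
       fi
       i=$((i + 1))
     done
     echo "$failures"
   }
   ```

   For example, `invoke_repeatedly 100 result wsk -i action invoke hello --result` would print the number of failed invocations out of 100.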
   Every activation succeeded.
   Then I ran the command below:
   ```
   helm upgrade openwhisk-1595565216 ./helm/openwhisk --namespace=openwhisk --set nginx.existHealthFile=false --reuse-values
   ```
   Note: I added a small feature to my local copy of the openwhisk-deploy-kube project (it adds a volume to the nginx pod).
   
   After the `helm upgrade` above ran, the previous nginx pod was deleted and a new nginx pod was created on another node (in my case on test-k8s-002-ncl, which is expected).
   I then invoked `hello` continuously again. About 50% of the requests failed and 50% succeeded; the failures reported the error below:
   ```
   error: Unable to invoke action 'hello_permission': Post https://10.106.237.179:31001/api/v1/namespaces/_/actions/hello?blocking=true&result=true: dial tcp 10.106.237.179:31001: getsockopt: connection timed out
   ```
   Next, I changed `~/.wskprops` as below:
   ```shell
   APIHOST=${ip_b}:31001  # IP of the node now hosting the new nginx pod
   AUTH=23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP #guest
   ```
   After that, I invoked the `hello` action continuously and all activations succeeded. (I think this is expected, because `APIHOST` in `.wskprops` points to the IP of the node hosting the new nginx pod.)
   
   ### Issue analysis
   We use `NodePort` for the nginx service, so every k8s node exposes port `31001`.
   So even after the nginx pod is recreated on another k8s node, every node should still expose port `31001`.
   But I don't understand why some activations succeed and some fail.
   I didn't find any error in the logs.
   Does anyone know the reason?
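   
   One way to narrow this down might be to probe port `31001` on each node separately and see which node IPs time out. A rough sketch, assuming `curl` is available; `probe` and `probe_nodes` are hypothetical helper names, and the host arguments stand for the node IPs from the table above:

   ```shell
   # Hypothetical helper: try to reach https://HOST:PORT/ and report success.
   # -k skips certificate verification (like wsk -i); the 5 s connect timeout
   # keeps an unreachable node from blocking the loop. Without -f, curl exits 0
   # for any HTTP response, so success here only means the port answered.
   probe() {
     curl -k -s -o /dev/null --connect-timeout 5 "https://$1:$2/"
   }

   # Probe every host given after the port and print one line per host.
   probe_nodes() {
     port=$1
     shift
     for host in "$@"; do
       if probe "$host" "$port"; then
         echo "$host:$port reachable"
       else
         echo "$host:$port FAILED"
       fi
     done
   }
   ```

   For example, `probe_nodes 31001 "$ip_a" "$ip_b" "$ip_c" "$ip_d"`, run several times, would show whether the timeouts are tied to a particular node IP.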
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

