ningyougang opened a new issue #622:
URL: https://github.com/apache/openwhisk-deploy-kube/issues/622
### k8s nodes and related environment
```shell
kubectl get nodes --show-labels
test-k8s-001-ncl Ready master 31d v1.18.3 ... node-role.kubernetes.io/master=,openwhisk-role=edge
test-k8s-002-ncl Ready <none> 31d v1.18.3 ... openwhisk-role=core
test-k8s-003-ncl Ready <none> 31d v1.18.3 ... openwhisk-role=invoker
test-k8s-004-ncl Ready <none> 31d v1.18.3 ... openwhisk-role=invoker
```
| hostname | ip |
| ---- | ---- |
| test-k8s-001-ncl | ip_a |
| test-k8s-002-ncl | ip_b |
| test-k8s-003-ncl | ip_c |
| test-k8s-004-ncl | ip_d |
### mycluster.yaml
```yaml
whisk:
  ingress:
    type: NodePort
    apiHostName: ${ip_a} # physical IP of test-k8s-001-ncl
    apiHostPort: 31001
nginx:
  httpsNodePort: 31001
```
I deployed with the command below:
```shell
helm install ./helm/openwhisk --namespace=openwhisk --generate-name -f mycluster.yaml
```
and configured `~/.wskprops`:
```shell
APIHOST=${ip_a}:31001
AUTH=23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP #guest
```
Then I created a `hello` test action:
```
wsk -i action create hello /tmp/hello.js --kind nodejs:10
```
#### reproduce the issue
I invoked the `hello` action continuously, running the command below inside a `while true` loop:
```shell
wsk -i action invoke hello --result >> result &
```
Every activation succeeded.
Then I ran the command below:
```
helm upgrade openwhisk-1595565216 ./helm/openwhisk --namespace=openwhisk --set nginx.existHealthFile=false --reuse-values
```
Note: I added a small feature to my local copy of the openwhisk-deploy-kube project
(it adds a volume to the nginx pod).
After the `helm upgrade` above ran, the previous nginx pod was deleted and
a new nginx pod was created on another node (in my case on the
test-k8s-002-ncl node, which is expected).
I then invoked `hello` continuously again. About 50% of the requests failed and
50% succeeded. The failures reported the error below:
```
error: Unable to invoke action 'hello_permission': Post https://10.106.237.179:31001/api/v1/namespaces/_/actions/hello?blocking=true&result=true: dial tcp 10.106.237.179:31001: getsockopt: connection timed out
```
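To put a number on the failure ratio rather than estimating it, the invocation can be wrapped in a small counter. This is only a sketch: `count_results` is a hypothetical helper name, and it assumes `wsk` exits with a non-zero status when an invocation fails.

```shell
# Hypothetical helper: run a command N times and count successes and failures.
# Usage: count_results <N> <command> [args...]
count_results() {
  local n=$1; shift
  local ok=0 fail=0 i
  for i in $(seq 1 "$n"); do
    # Output is discarded; only the exit status of the command is inspected.
    if "$@" >/dev/null 2>&1; then
      ok=$((ok + 1))
    else
      fail=$((fail + 1))
    fi
  done
  echo "ok=${ok} fail=${fail}"
}

# Example (not run here): count_results 20 wsk -i action invoke hello --result
```

Running it once with each node IP as APIHOST would show whether the failures depend on which node receives the request.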
However, if I change `~/.wskprops` as below:
```shell
APIHOST=${ip_b}:31001 # IP of the node that now hosts the new nginx pod
AUTH=23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP #guest
```
and invoke the `hello` action continuously, all activations succeed. (I
think this is expected, because APIHOST in `~/.wskprops` now points to the IP of the
node hosting the new nginx pod.)
### Issue analysis
We use `NodePort` for the nginx service, so every k8s node exposes port `31001`.
So even if the nginx pod is recreated on another k8s node, all k8s nodes should
still expose port `31001`.
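One way to check that expectation is to probe port `31001` on each node directly. The following is a minimal sketch, not a definitive diagnostic: `probe_nodes` is a hypothetical helper name, and it only tests whether an HTTPS connection to the node completes within 5 seconds, not whether OpenWhisk answers correctly.

```shell
# Hypothetical helper: report whether each node accepts HTTPS connections on a port.
# Usage: probe_nodes <port> <node-ip> [<node-ip>...]
probe_nodes() {
  local port=$1; shift
  local ip
  for ip in "$@"; do
    # -k skips TLS verification (like `wsk -i`), -s silences progress output,
    # -m 5 gives up after 5 seconds. Without -f, curl exits 0 for any HTTP
    # response, so a non-zero status means the connection itself failed.
    if curl -k -s -m 5 -o /dev/null "https://${ip}:${port}/"; then
      echo "${ip} reachable"
    else
      echo "${ip} unreachable"
    fi
  done
}

# Example (not run here), with the placeholder IPs from the table above:
# probe_nodes 31001 "$ip_a" "$ip_b" "$ip_c" "$ip_d"
```

If some nodes are reported as unreachable, that points at how traffic reaches the node port rather than at OpenWhisk itself.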
But I don't understand why some activations succeed and others fail.
I didn't find any errors in the logs.
Does anyone know the reason?