I found the root cause of this issue. On my machine, I first deployed OCP with Calico, and it worked well. I then ran the uninstall playbook and reinstalled with the SDN plugin openshift-ovs-multitenant, and it no longer worked. I found the following:
[root@buzz1 openshift-ansible]# systemctl status atomic-openshift-node.service
● atomic-openshift-node.service - OpenShift Node
   Loaded: loaded (/etc/systemd/system/atomic-openshift-node.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-10-14 00:43:08 PDT; 22h ago
     Docs: https://github.com/openshift/origin
 Main PID: 87388 (hyperkube)
   CGroup: /system.slice/atomic-openshift-node.service
           ├─87388 /usr/bin/hyperkube kubelet --v=6 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-toke...
           └─88872 /opt/cni/bin/calico

Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.289674   87388 common.go:71] Using namespace "kube-s....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.289809   87388 file.go:199] Reading config file "/et...yaml"
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.292556   87388 common.go:62] Generated UID "598eab3c....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.293602   87388 common.go:66] Generated Name "master-....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.294512   87388 common.go:71] Using namespace "kube-s....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.295667   87388 file.go:199] Reading config file "/et...yaml"
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296350   87388 common.go:62] Generated UID "d71dc810....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296367   87388 common.go:66] Generated Name "master-....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.296379   87388 common.go:71] Using namespace "kube-s....yaml
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.300194   87388 config.go:303] Setting pods for source file
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361625   87388 kubelet.go:1884] SyncLoop (SYNC): 3 p...d33c)
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361693   87388 config.go:100] Looking for [api file]...e:{}]
Oct 14 23:15:48 buzz1.fyre.ibm.com atomic-openshift-node[87388]: I1014 23:15:48.361716   87388 kubelet.go:1907] SyncLoop (housekeeping)
Hint: Some lines were ellipsized, use -l to show in full.

[root@buzz1 openshift-ansible]# ps -ef | grep calico
root      88872  87388  0 23:15 ?        00:00:00 /opt/cni/bin/calico
root      88975  74601  0 23:15 pts/0    00:00:00 grep --color=auto calico
[root@buzz1 openshift-ansible]#

It seems that the calico process is left over here. Using the same inventory file, OCP 3.11 could then be deployed successfully on a clean VM. My guess is that the uninstall playbook did not clean up Calico thoroughly.

On Oct 12, 2019, at 11:52 PM, Yu Wei <yu20...@hotmail.com> wrote:

Hi,
I tried to install OCP 3.11 with the following variables set:

openshift_use_openshift_sdn=true
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'

Some pods are stuck in 'ContainerCreating'.
[root@buzz1 openshift-ansible]# oc get pods --all-namespaces
NAMESPACE               NAME                                   READY   STATUS              RESTARTS   AGE
default                 docker-registry-1-deploy               0/1     ContainerCreating   0          5h
default                 registry-console-1-deploy              0/1     ContainerCreating   0          5h
kube-system             master-api-buzz1.center1.com           1/1     Running             0          5h
kube-system             master-controllers-buzz1.center1.com   1/1     Running             0          5h
kube-system             master-etcd-buzz1.center1.com          1/1     Running             0          5h
openshift-node          sync-x8j7d                             1/1     Running             0          5h
openshift-sdn           ovs-ff7r7                              1/1     Running             0          5h
openshift-sdn           sdn-7frfw                              1/1     Running             10         5h
openshift-web-console   webconsole-85494cdb8c-s2dnh            0/1     ContainerCreating   0          5h

Running 'oc describe pods', I got the following:

Events:
  Type     Reason                  Age              From            Message
  ----     ------                  ----             ----            -------
  Warning  FailedCreatePodSandBox  2m               kubelet, buzz1  Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "8570c350953e29185ef8ab05d628f90c6791a56ac392e40f2f6e30a14a76ab22" network for pod "network-diag-test-pod-qz7hv": NetworkPlugin cni failed to set up pod "network-diag-test-pod-qz7hv_network-diag-global-ns-q7vbn" network: context deadline exceeded, failed to clean up sandbox container "8570c350953e29185ef8ab05d628f90c6791a56ac392e40f2f6e30a14a76ab22" network for pod "network-diag-test-pod-qz7hv": NetworkPlugin cni failed to teardown pod "network-diag-test-pod-qz7hv_network-diag-global-ns-q7vbn" network: context deadline exceeded]
  Normal   SandboxChanged          2s (x8 over 2m)  kubelet, buzz1  Pod sandbox changed, it will be killed and re-created.

How could I resolve this problem? Any thoughts?

Thanks,
Jared
_______________________________________________
users mailing list
us...@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
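A follow-up note on the diagnosis above: the kubelet picks the lexically first file in its CNI config directory, so a Calico conf left behind by the uninstall playbook will shadow the openshift-sdn one and the node keeps invoking /opt/cni/bin/calico. The sketch below illustrates that check and cleanup. It is a minimal, hedged example: the file names (10-calico.conf, 80-openshift-network.conf) are the usual defaults and may differ on your nodes, and CNI_CONF_DIR defaults to a scratch directory so it is safe to dry-run (on a real node it would be /etc/cni/net.d).

```shell
# Safe dry-run: use a scratch dir unless CNI_CONF_DIR is set explicitly.
CNI_CONF_DIR="${CNI_CONF_DIR:-$(mktemp -d)}"

# Simulate the post-uninstall state seen in the thread: both the stale
# Calico conf and the openshift-sdn conf are present.
touch "$CNI_CONF_DIR/10-calico.conf" "$CNI_CONF_DIR/80-openshift-network.conf"

# The kubelet uses the first config in sort order -- still Calico here.
first=$(ls "$CNI_CONF_DIR" | sort | head -n1)
echo "active CNI config: $first"

# Removing the stale Calico files lets openshift-sdn take over.
rm -f "$CNI_CONF_DIR"/*calico*
first=$(ls "$CNI_CONF_DIR" | sort | head -n1)
echo "after cleanup:     $first"
```

On a real node, one would presumably also remove the leftover plugin binaries under /opt/cni/bin (calico, calico-ipam) and restart atomic-openshift-node.service afterwards; treat those paths as assumptions to verify against your install.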
_______________________________________________
dev mailing list
dev@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/dev