I found that the issue occurred only for pods scheduled on one particular 
node, and restarting Docker on that node fixed it. Do we really need to 
restart Docker periodically? How are such issues handled in production 
clusters?


On Saturday, April 28, 2018 at 4:46:29 AM UTC+5:30, Vivek Kumar wrote:
>
> Hi All, 
>
> I am facing a strange issue with my pods. I launch around 20 containers 
> in my environment, and every time a random 3-4 of the pods hang in 
> Init:0/1 status. When I check such a pod, the init container shows as 
> Running, although it should terminate once its task is finished, and the 
> app container shows Waiting with reason PodInitializing. The same init 
> container image and spec are used across all 20 pods, but the issue hits 
> different pods each time. When I delete these stuck pods, they get stuck 
> in the Terminating state. If I SSH to the node where the pod was scheduled 
> and run docker ps, it shows the init container as running, but docker exec 
> fails with an error saying the container was not found. The init container 
> pulls configs from a Consul server; on checking the volume (path taken 
> from docker inspect), I found that it had pulled all the key-value pairs 
> correctly and saved them under the expected file name. I have checked 
> resources on all the nodes, and more than enough is available on each. 
> Below is a detailed example of one pod behaving like this. 
>
>
>
> kubectl version 
> Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", 
> GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", 
> BuildDate:"2017-12-15T21:07:38Z", GoVersion:"go1.9.2", Compiler:"gc", 
> Platform:"linux/amd64"} 
> Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", 
> GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", 
> BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", 
> Platform:"linux/amd64"} 
>
>
>
> kubectl get pods -n dev1|grep -i session-service 
> session-service-app-75c9c8b5d9-dsmhp               0/1       Init:0/1      0          10h 
> session-service-app-75c9c8b5d9-vq98k               0/1       Terminating   0          11h 
>
>
>
> kubectl describe pods session-service-app-75c9c8b5d9-dsmhp -n dev1 
> Name:           session-service-app-75c9c8b5d9-dsmhp 
> Namespace:      dev1 
> Node:           ip-192-168-44-18.ap-southeast-1.compute.internal/192.168.44.18 
> Start Time:     Fri, 27 Apr 2018 18:14:43 +0530 
> Labels:         app=session-service-app 
>                 pod-template-hash=3175746185 
>                 release=session-service-app 
> Status:         Pending 
> IP:             100.96.4.240 
> Controlled By:  ReplicaSet/session-service-app-75c9c8b5d9 
> Init Containers: 
>   initpullconsulconfig: 
>     Container ID:  docker://c658d59995636e39c9d03b06e4973b6e32f818783a21ad292a2cf20d0e43bb02 
>     Image:         shr-u-nexus-01.myops.de:8082/utils/app-init:1.0 
>     Image ID:      docker-pullable://shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd 
>     Port:          <none> 
>     Args: 
>       -consul-addr=consul-server.consul.svc.cluster.local:8500 
>     State:          Running 
>       Started:      Fri, 27 Apr 2018 18:14:44 +0530 
>     Ready:          False 
>     Restart Count:  0 
>     Environment: 
>       CONSUL_TEMPLATE_VERSION:  0.19.4 
>       POD:                      sand 
>       SERVICE:                  session-service-app 
>       ENV:                      dev1 
>     Mounts: 
>       /var/lib/app from shared-volume-sidecar (rw) 
>       /var/run/secrets/kubernetes.io/serviceaccount from default-token-bthkv (ro) 
> Containers: 
>   session-service-app: 
>     Container ID: 
>     Image:          shr-u-nexus-01.myops.de:8082/sand-images/sessionservice-init:sitv12 
>     Image ID: 
>     Port:           8080/TCP 
>     State:          Waiting 
>       Reason:       PodInitializing 
>     Ready:          False 
>     Restart Count:  0 
>     Environment:    <none> 
>     Mounts: 
>       /etc/appenv from shared-volume-sidecar (rw) 
>       /var/run/secrets/kubernetes.io/serviceaccount from default-token-bthkv (ro) 
> Conditions: 
>   Type           Status 
>   Initialized    False 
>   Ready          False 
>   PodScheduled   True 
> Volumes: 
>   shared-volume-sidecar: 
>     Type:    EmptyDir (a temporary directory that shares a pod's lifetime) 
>     Medium: 
>   default-token-bthkv: 
>     Type:        Secret (a volume populated by a Secret) 
>     SecretName:  default-token-bthkv 
>     Optional:    false 
> QoS Class:       BestEffort 
> Node-Selectors:  <none> 
> Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s 
>                  node.kubernetes.io/unreachable:NoExecute for 300s 
> Events:          <none> 
>
>
>
>
> sudo docker ps|grep -i session 
> c658d5999563   shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd   "/usr/bin/consul-t..."   10 hours ago   Up 10 hours   k8s_initpullconsulconfig_session-service-app-75c9c8b5d9-dsmhp_dev1_c2075f2a-4a18-11e8-88e7-02929cc89ab6_0 
> da120abd3dbb   gcr.io/google_containers/pause-amd64:3.0   "/pause"   10 hours ago   Up 10 hours   k8s_POD_session-service-app-75c9c8b5d9-dsmhp_dev1_c2075f2a-4a18-11e8-88e7-02929cc89ab6_0 
> f53d48c7d6ec   shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd   "/usr/bin/consul-t..."   10 hours ago   Up 10 hours   k8s_initpullconsulconfig_session-service-app-75c9c8b5d9-vq98k_dev1_42837d12-4a12-11e8-88e7-02929cc89ab6_0 
> c26415458d39   gcr.io/google_containers/pause-amd64:3.0   "/pause"   10 hours ago   Up 10 hours   k8s_POD_session-service-app-75c9c8b5d9-vq98k_dev1_42837d12-4a12-11e8-88e7-02929cc89ab6_0 
>
>
>
>
>
> sudo docker exec -it c658d5999563 bash 
> rpc error: code = 2 desc = containerd: container not found 
>
>