Hi All,

I am facing a weird issue with my pods. I launch around 20 pods in my
environment, and every time a random 3-4 of them hang in Init:0/1 status. When
I check the status of an affected pod, the init container shows as Running,
although it should terminate once its task is finished, and the app container
shows Waiting with reason PodInitializing. The same init container image and
spec are used across all 20 pods, yet the issue hits different pods every time.
When I delete these stuck pods, they get stuck in the Terminating state.

If I SSH to the node where such a pod was scheduled and run docker ps, it shows
the init container as running, but docker exec fails with an error saying the
container does not exist. The init container pulls configs from a Consul
server. On checking its volume (located via docker inspect), I found that it
had pulled all the key-value pairs correctly and saved them to the expected
file. I have checked resources on all the nodes, and more than enough is
available on each.

Below is a detailed example of one pod behaving like this.



kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T21:07:38Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}



kubectl get pods -n dev1|grep -i session-service
session-service-app-75c9c8b5d9-dsmhp               0/1       Init:0/1           0          10h
session-service-app-75c9c8b5d9-vq98k               0/1       Terminating        0          11h
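
Since the affected pods differ on every rollout, it can help to pick them out of the listing automatically. The following is only a minimal sketch, assuming the five-column `kubectl get pods` layout shown above (NAME, READY, STATUS, RESTARTS, AGE). The function names and the 10-minute threshold are hypothetical choices for illustration, not part of any Kubernetes tooling.

```python
import re

# Seconds per unit for the AGE column (e.g. "45s", "10h", "2d3h").
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def age_to_seconds(age):
    """Convert an AGE value such as '10h' or '2d3h' to seconds."""
    parts = re.findall(r"(\d+)([smhd])", age)
    # Reject anything that is not made up purely of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognised age: {age!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def find_stuck_pods(get_pods_output, min_age_seconds=600):
    """Return (name, status, age) for pods whose STATUS starts with
    'Init:' or equals 'Terminating' and whose AGE is at least
    min_age_seconds. Lines without exactly five columns are skipped."""
    stuck = []
    for line in get_pods_output.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, _ready, status, _restarts, age = fields
        if not (status.startswith("Init:") or status == "Terminating"):
            continue
        if age_to_seconds(age) >= min_age_seconds:
            stuck.append((name, status, age))
    return stuck
```

Feeding it the text printed by `kubectl get pods -n dev1` would flag both session-service pods listed above, since each has been in its state for hours. Pods that only just started initializing fall below the threshold and are ignored.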



kubectl describe pods session-service-app-75c9c8b5d9-dsmhp -n dev1
Name:           session-service-app-75c9c8b5d9-dsmhp
Namespace:      dev1
Node:           ip-192-168-44-18.ap-southeast-1.compute.internal/192.168.44.18
Start Time:     Fri, 27 Apr 2018 18:14:43 +0530
Labels:         app=session-service-app
                pod-template-hash=3175746185
                release=session-service-app
Status:         Pending
IP:             100.96.4.240
Controlled By:  ReplicaSet/session-service-app-75c9c8b5d9
Init Containers:
  initpullconsulconfig:
    Container ID:  docker://c658d59995636e39c9d03b06e4973b6e32f818783a21ad292a2cf20d0e43bb02
    Image:         shr-u-nexus-01.myops.de:8082/utils/app-init:1.0
    Image ID:      docker-pullable://shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd
    Port:          <none>
    Args:
      -consul-addr=consul-server.consul.svc.cluster.local:8500
    State:          Running
      Started:      Fri, 27 Apr 2018 18:14:44 +0530
    Ready:          False
    Restart Count:  0
    Environment:
      CONSUL_TEMPLATE_VERSION:  0.19.4
      POD:                      sand
      SERVICE:                  session-service-app
      ENV:                      dev1
    Mounts:
      /var/lib/app from shared-volume-sidecar (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bthkv (ro)
Containers:
  session-service-app:
    Container ID:
    Image:          shr-u-nexus-01.myops.de:8082/sand-images/sessionservice-init:sitv12
    Image ID:
    Port:           8080/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/appenv from shared-volume-sidecar (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bthkv (ro)
Conditions:
  Type           Status
  Initialized    False
  Ready          False
  PodScheduled   True
Volumes:
  shared-volume-sidecar:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-bthkv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bthkv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>




sudo docker ps|grep -i session
c658d5999563   shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd   "/usr/bin/consul-t..."   10 hours ago   Up 10 hours   k8s_initpullconsulconfig_session-service-app-75c9c8b5d9-dsmhp_dev1_c2075f2a-4a18-11e8-88e7-02929cc89ab6_0
da120abd3dbb   gcr.io/google_containers/pause-amd64:3.0   "/pause"   10 hours ago   Up 10 hours   k8s_POD_session-service-app-75c9c8b5d9-dsmhp_dev1_c2075f2a-4a18-11e8-88e7-02929cc89ab6_0
f53d48c7d6ec   shr-u-nexus-01.myops.de:8082/utils/app-init@sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd   "/usr/bin/consul-t..."   10 hours ago   Up 10 hours   k8s_initpullconsulconfig_session-service-app-75c9c8b5d9-vq98k_dev1_42837d12-4a12-11e8-88e7-02929cc89ab6_0
c26415458d39   gcr.io/google_containers/pause-amd64:3.0   "/pause"   10 hours ago   Up 10 hours   k8s_POD_session-service-app-75c9c8b5d9-vq98k_dev1_42837d12-4a12-11e8-88e7-02929cc89ab6_0
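
The container names in this output appear to follow the pattern k8s_<container>_<pod>_<namespace>_<pod UID>_<attempt>, which makes it possible to map each Docker container on the node back to its pod. A minimal sketch of that mapping follows, assuming the layout holds as seen in the names above. `KubeContainerName` and `parse_k8s_container_name` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class KubeContainerName(NamedTuple):
    container: str   # container name from the pod spec ("POD" for the pause container)
    pod: str
    namespace: str
    pod_uid: str
    attempt: int     # trailing counter in the name


def parse_k8s_container_name(name):
    """Split a Docker container name of the form
    k8s_<container>_<pod>_<namespace>_<pod UID>_<attempt> into its fields.

    Splitting on '_' relies on the assumption that container, pod and
    namespace names contain no underscores, as in the names above.
    """
    parts = name.lstrip("/").split("_")
    if len(parts) != 6 or parts[0] != "k8s":
        raise ValueError(f"not a k8s-style container name: {name!r}")
    _, container, pod, namespace, pod_uid, attempt = parts
    return KubeContainerName(container, pod, namespace, pod_uid, int(attempt))
```

Applied to the first name above, this yields container initpullconsulconfig in pod session-service-app-75c9c8b5d9-dsmhp, namespace dev1. Grouping the parsed names by pod UID shows which init and pause containers on the node belong to the same stuck pod.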




sudo docker exec -it c658d5999563 bash
rpc error: code = 2 desc = containerd: container not found



  • [kubernetes-users... vivek.kumar via Kubernetes user discussion and Q&A
