Hello,

I'm running a Spark 2.3 job on a Kubernetes cluster.

kubectl version

Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-09T21:51:06Z", GoVersion:"go1.9.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.3", GitCommit:"f0efb3cb883751c5ffdbe6d515f3cb4fbe7b7acd", GitTreeState:"clean", BuildDate:"2017-11-08T18:27:48Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}


When I run spark-submit against the k8s master, the driver pod gets stuck in the *Waiting: 
PodInitializing* state.
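For anyone else hitting this, these are the kinds of commands I use to inspect the stuck pod (the pod name and namespace below are placeholders; in Spark 2.3 the dependency-fetching init container is, as far as I know, named spark-init):

```shell
# Hypothetical pod name and namespace -- substitute your actual driver pod.
POD=accelerate-testing-2-8cecc18bb42f31a386c6304bd63e9eba-driver
NS=spark

# The Events section at the bottom usually says why init is stuck
# (image pull failure, volume mount problem, ...)
kubectl describe pod "$POD" -n "$NS"

# Status of the init containers themselves
kubectl get pod "$POD" -n "$NS" -o jsonpath='{.status.initContainerStatuses}'

# Logs from the init container (assuming it is named spark-init)
kubectl logs "$POD" -n "$NS" -c spark-init
```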


I ran kubectl describe node on the node where the driver pod is scheduled; the 
output is below. I do see that resources are overcommitted, but I expected the 
Kubernetes scheduler not to schedule a pod onto a node whose resources are 
overcommitted or which is in Not Ready state. In this case the node is Ready, 
but I observe the same behaviour when a node is in "*Not Ready*" state.


Name:               **********
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=****
                    node-role.kubernetes.io/worker=true
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             <none>
CreationTimestamp:  Tue, 31 Jul 2018 09:59:24 -0400
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Tue, 14 Aug 2018 09:31:20 -0400   Tue, 31 Jul 2018 09:59:24 -0400   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Tue, 14 Aug 2018 09:31:20 -0400   Tue, 31 Jul 2018 09:59:24 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Tue, 14 Aug 2018 09:31:20 -0400   Tue, 31 Jul 2018 09:59:24 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready            True    Tue, 14 Aug 2018 09:31:20 -0400   Sat, 11 Aug 2018 00:41:27 -0400   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  *****
  Hostname:    ******
Capacity:
 cpu:     16
 memory:  125827288Ki
 pods:    110
Allocatable:
 cpu:     16
 memory:  125724888Ki
 pods:    110
System Info:
 Machine ID:                 *************
 System UUID:                **************
 Boot ID:                    1493028d-0a80-4f2f-b0f1-48d9b8910e9f
 Kernel Version:             4.4.0-1062-aws
 OS Image:                   Ubuntu 16.04.4 LTS
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://Unknown
 Kubelet Version:            v1.8.3
 Kube-Proxy Version:         v1.8.3
PodCIDR:                     ******
ExternalID:                  **************
Non-terminated Pods:         (11 in total)
  Namespace    Name                                                          CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------    ----                                                          ------------  ----------  ---------------  -------------
  kube-system  calico-node-gj5mb                                             250m (1%)     0 (0%)      0 (0%)           0 (0%)
  kube-system  kube-proxy-****************************************           100m (0%)     0 (0%)      0 (0%)           0 (0%)
  kube-system  prometheus-prometheus-node-exporter-9cntq                     100m (0%)     200m (1%)   30Mi (0%)        50Mi (0%)
  logging      elasticsearch-elasticsearch-data-69df997486-gqcwg             400m (2%)     1 (6%)      8Gi (6%)         16Gi (13%)
  logging      fluentd-fluentd-elasticsearch-tj7nd                           200m (1%)     0 (0%)      612Mi (0%)       0 (0%)
  rook         rook-agent-6jtzm                                              0 (0%)        0 (0%)      0 (0%)           0 (0%)
  rook         rook-ceph-osd-10-6-42-250.accel.aws-cardda.cb4good.com-gwb8j  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  spark        accelerate-test-5-a3bfb8a597e83d459193a183e17f13b5-exec-1     2 (12%)       0 (0%)      10Gi (8%)        12Gi (10%)
  spark        accelerate-testing-1-8ed0482f3bfb3c0a83da30bb7d433dff-exec-5  2 (12%)       0 (0%)      10Gi (8%)        12Gi (10%)
  spark        accelerate-testing-2-8cecc18bb42f31a386c6304bd63e9eba-driver  1 (6%)        0 (0%)      2Gi (1%)         2432Mi (1%)
  spark        accelerate-testing-2-e8bd0607cc693bc8ae25cc6dc300b2c7-driver  1 (6%)        0 (0%)      2Gi (1%)         2432Mi (1%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  7050m (44%)   1200m (7%)  33410Mi (27%)    45874Mi (37%)
Events:         <none>
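One thing worth noting: as I understand it, the scheduler only compares pod *requests* against the node's allocatable capacity; limits are allowed to overcommit. A quick sanity check of the numbers above (my own arithmetic, not kubectl output) shows requests are well under capacity, which would explain why the pod was scheduled here:

```shell
# Scheduler placement is based on requests, not limits.
# Allocatable: 16 CPU (16000m), 125724888Ki memory.
# Summed requests from the table: 7050m CPU, 33410Mi memory.
awk 'BEGIN {
  printf "CPU requests:    %.0f%% of allocatable\n", 100 * 7050 / 16000
  printf "Memory requests: %.0f%% of allocatable\n", 100 * 33410 / (125724888 / 1024)
}'
# CPU requests:    44% of allocatable
# Memory requests: 27% of allocatable
```

Those match the 44% / 27% figures kubectl reports, so from the scheduler's point of view this node is not overcommitted.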

-- 
You received this message because you are subscribed to the Google Groups 
"Kubernetes user discussion and Q&A" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to kubernetes-users+unsubscr...@googlegroups.com.
To post to this group, send email to kubernetes-users@googlegroups.com.
Visit this group at https://groups.google.com/group/kubernetes-users.
For more options, visit https://groups.google.com/d/optout.