Today I switched to the latest officially released Flink image, version 1.11.3, and it still fails to start.
Here is my command:
./bin/kubernetes-session.sh \
  -Dkubernetes.cluster-id=rtdp \
  -Dtaskmanager.memory.process.size=4096m \
  -Dkubernetes.taskmanager.cpu=2 \
  -Dtaskmanager.numberOfTaskSlots=4 \
  -Dresourcemanager.taskmanager-timeout=3600000 \
  -Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8 \
  -Dkubernetes.namespace=rtdp



Events:
  Type     Reason          Age                From               Message
  ----     ------          ----               ----               -------
  Normal   Scheduled       88s                default-scheduler  Successfully assigned rtdp/rtdp-6d7794d65d-g6mb5 to cn-shanghai.192.168.16.130
  Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-rtdp" not found
  Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-rtdp" not found
  Normal   AllocIPSucceed  87s                terway-daemon      Alloc IP 192.168.32.25/22 for Pod
  Normal   Pulling         87s                kubelet            Pulling image "flink:1.11.3-scala_2.12-java8"
  Normal   Pulled          31s                kubelet            Successfully pulled image "flink:1.11.3-scala_2.12-java8"
  Normal   Created         18s (x2 over 26s)  kubelet            Created container flink-job-manager
  Normal   Started         18s (x2 over 26s)  kubelet            Started container flink-job-manager
  Normal   Pulled          18s                kubelet            Container image "flink:1.11.3-scala_2.12-java8" already present on machine
  Warning  BackOff         10s                kubelet            Back-off restarting failed container

Two ConfigMaps were not found here. Do they need to be created in advance? The official documentation does not mention this, or did I miss it?
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#start-flink-session
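
The ConfigMap names in the events look like flink-config-<cluster-id> and hadoop-config-<cluster-id>. One way to narrow this down is to check whether they exist now, and to read the log of the container that crashed before the BackOff event. Below is a minimal sketch, assuming kubectl points at the same cluster. The function names are hypothetical, and the naming pattern is only inferred from the events above, not taken from the Flink documentation.

```shell
#!/bin/sh
# Hypothetical helpers. The ConfigMap naming pattern is inferred from the
# events above, not from the Flink documentation.

# Print the two ConfigMap names expected for a given kubernetes.cluster-id.
expected_configmaps() {
  printf '%s\n' "flink-config-$1" "hadoop-config-$1"
}

# Report which of those ConfigMaps currently exist in the namespace.
check_configmaps() {
  namespace="$1"
  cluster_id="$2"
  for name in $(expected_configmaps "$cluster_id"); do
    if kubectl -n "$namespace" get configmap "$name" >/dev/null 2>&1; then
      echo "found: $name"
    else
      echo "missing: $name"
    fi
  done
}

# Show the log of the previous (crashed) container instance of a pod.
previous_logs() {
  kubectl -n "$1" logs "$2" --previous
}

# Example, using the values from the command and events above:
#   check_configmaps rtdp rtdp
#   previous_logs rtdp rtdp-6d7794d65d-g6mb5
```

If both ConfigMaps are reported as found, the FailedMount warnings were probably only a timing issue, and the log of the crashed container is the more useful place to look.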

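On 2020-12-27 22:50:32, "陈帅" <[email protected]> wrote:
>This is my first attempt at deploying Flink on k8s. I am using Flink 1.12.0, JDK 1.8.0_275, and Scala 2.12.12, with a single-node minikube environment installed on my Mac. These are the steps I followed: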
>
>
>git clone https://github.com/apache/flink-docker
>cd flink-docker/1.12/scala_2.12-java8-debian
>docker build --tag flink:1.12.0-scala_2.12-java8 .
>
>
>cd flink-1.12.0
>./bin/kubernetes-session.sh \
>-Dkubernetes.container.image=flink:1.12.0-scala_2.12-java8 \
>-Dkubernetes.rest-service.exposed.type=NodePort \
>-Dtaskmanager.numberOfTaskSlots=2 \
>-Dkubernetes.cluster-id=flink-session-cluster
>
>
>The output says the JM started, but I cannot reach it through the web UI:
>
>2020-12-27 22:08:12,387 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.99.100:8081
>
>
>
>
>`kubectl get pods` shows that the pod stays in the ContainerCreating state:
>
>NAME                                               READY   STATUS              RESTARTS   AGE
>flink-session-cluster-858bd55dff-bzjk2             0/1     ContainerCreating   0          5m59s
>kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running             0          6d14h
>
>
>
>
>So I ran `kubectl describe pod flink-session-cluster-858bd55dff-bzjk2` to see the details, with the following result:
>
>
>
>
>Name:         flink-session-cluster-858bd55dff-bzjk2
>Namespace:    default
>Priority:     0
>Node:         minikube/192.168.99.100
>Start Time:   Sun, 27 Dec 2020 22:21:56 +0800
>Labels:       app=flink-session-cluster
>              component=jobmanager
>              pod-template-hash=858bd55dff
>              type=flink-native-kubernetes
>Annotations:  <none>
>Status:       Pending
>IP:           172.17.0.4
>IPs:
>  IP:           172.17.0.4
>Controlled By:  ReplicaSet/flink-session-cluster-858bd55dff
>Containers:
>  flink-job-manager:
>    Container ID:
>    Image:         flink:1.12.0-scala_2.12-java8
>    Image ID:
>    Ports:         8081/TCP, 6123/TCP, 6124/TCP
>    Host Ports:    0/TCP, 0/TCP, 0/TCP
>    Command:
>      /docker-entrypoint.sh
>    Args:
>      native-k8s
>      $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=1073741824b -D jobmanager.memory.jvm-overhead.max=201326592b
>    State:          Waiting
>      Reason:       ImagePullBackOff
>    Ready:          False
>    Restart Count:  0
>    Limits:
>      cpu:     1
>      memory:  1600Mi
>    Requests:
>      cpu:     1
>      memory:  1600Mi
>    Environment:
>      _POD_IP_ADDRESS:   (v1:status.podIP)
>      HADOOP_CONF_DIR:  /opt/hadoop/conf
>    Mounts:
>      /opt/flink/conf from flink-config-volume (rw)
>      /opt/hadoop/conf from hadoop-config-volume (rw)
>      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s47ht (ro)
>Conditions:
>  Type              Status
>  Initialized       True
>  Ready             False
>  ContainersReady   False
>  PodScheduled      True
>Volumes:
>  hadoop-config-volume:
>    Type:      ConfigMap (a volume populated by a ConfigMap)
>    Name:      hadoop-config-flink-session-cluster
>    Optional:  false
>  flink-config-volume:
>    Type:      ConfigMap (a volume populated by a ConfigMap)
>    Name:      flink-config-flink-session-cluster
>    Optional:  false
>  default-token-s47ht:
>    Type:        Secret (a volume populated by a Secret)
>    SecretName:  default-token-s47ht
>    Optional:    false
>QoS Class:       Guaranteed
>Node-Selectors:  <none>
>Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
>                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
>Events:
>  Type     Reason       Age                  From               Message
>  ----     ------       ----                 ----               -------
>  Normal   Scheduled    21m                  default-scheduler  Successfully assigned default/flink-session-cluster-858bd55dff-bzjk2 to minikube
>  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-flink-session-cluster" not found
>  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-flink-session-cluster" not found
>  Normal   Pulling      13m (x4 over 21m)    kubelet            Pulling image "flink:1.12.0-scala_2.12-java8"
>  Warning  Failed       13m (x4 over 15m)    kubelet            Failed to pull image "flink:1.12.0-scala_2.12-java8": rpc error: code = Unknown desc = Error response from daemon: manifest for flink:1.12.0-scala_2.12-java8 not found: manifest unknown: manifest unknown
>  Normal   BackOff      13m (x5 over 15m)    kubelet            Back-off pulling image "flink:1.12.0-scala_2.12-java8"
>  Warning  Failed       11m (x5 over 15m)    kubelet            Error: ErrImagePull
>  Warning  Failed       100s (x53 over 15m)  kubelet            Error: ImagePullBackOff
>
>
>
>
>At first I suspected that the local image had not been built, so I checked with `docker images`:
>
>REPOSITORY                                             TAG                       IMAGE ID       CREATED        SIZE
>flink                                                  1.12.0-scala_2.12-java8   f7dd9b9e020b   12 hours ago   642MB
>
>
>
>
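>The image clearly exists, which is strange. Why does pulling the image fail when it is available locally? Is something misconfigured? And under minikube, how can I reach the dashboard of the Flink cluster running on k8s from a local browser?
>
>This is my first time using k8s, so any pointers would be appreciated. Thanks!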
>

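On the ImagePullBackOff in the quoted message: a likely explanation, not confirmed anywhere in this thread, is that the image was built against the Docker daemon on the Mac, while the kubelet inside the minikube VM uses its own daemon and therefore tries to pull from a registry. The sketch below rebuilds the image against minikube's daemon and forwards a local port to the dashboard. It assumes minikube runs with the Docker runtime. The function names are hypothetical, and the `<cluster-id>-rest` service name is an assumption to confirm with `kubectl get services`.

```shell
#!/bin/sh
# Hypothetical helpers; assumes minikube uses the Docker container runtime.

# Build the image against minikube's Docker daemon so the kubelet can find it
# locally instead of pulling it from a registry. The subshell keeps the
# docker-env variables from leaking into the calling shell.
build_in_minikube() (
  image_tag="$1"
  context_dir="$2"
  eval "$(minikube docker-env)"
  docker build --tag "$image_tag" "$context_dir"
)

# Forward local port 8081 to the REST service of the session cluster.
# The "<cluster-id>-rest" service name is an assumption; check the real name
# with `kubectl get services` first.
forward_dashboard() {
  cluster_id="$1"
  kubectl port-forward "service/${cluster_id}-rest" 8081:8081
}

# Example, using the values from the quoted message:
#   build_in_minikube flink:1.12.0-scala_2.12-java8 .
#   forward_dashboard flink-session-cluster
```

While the port-forward is running, the dashboard should be reachable at http://localhost:8081.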