I set up Minikube on a MacBook Pro, with VirtualBox installed. What are the correct steps to start Flink v1.11.3 on K8s?
The steps I tried were:
minikube start
cd /Users/admin/dev/flink-1.11.3
./bin/kubernetes-session.sh
At this point the image being pulled was shown as flink:1.11.3-scala_2.12, not the tag listed in the official flink repository on Docker Hub,
flink:1.11.3-scala_2.12-java8
So I ran the command again as
./bin/kubernetes-session.sh \
-Dkubernetes.cluster-id=my-flink-cluster \
-Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8
After waiting a while for the image to be pulled, get pods showed
SJ-DN0393:flink-1.11.3 admin$ kubectl get pods
NAME                                               READY   STATUS             RESTARTS   AGE
kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running            3          10d
my-flink-cluster-77c6f85879-9vcx8                  0/1     CrashLoopBackOff   5          29m
The describe pod command showed
Events:
  Type     Reason       Age                    From               Message
  ----     ------       ----                   ----               -------
  Normal   Scheduled    29m                    default-scheduler  Successfully assigned default/my-flink-cluster-77c6f85879-9vcx8 to minikube
  Warning  FailedMount  29m                    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-my-flink-cluster" not found
  Warning  FailedMount  29m                    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-my-flink-cluster" not found
  Normal   Pulling      29m                    kubelet            Pulling image "flink:1.11.3-scala_2.12-java8"
  Normal   Pulled       2m41s (x5 over 4m34s)  kubelet            Container image "flink:1.11.3-scala_2.12-java8" already present on machine
  Normal   Created      2m41s (x5 over 4m33s)  kubelet            Created container flink-job-manager
  Normal   Started      2m41s (x5 over 4m33s)  kubelet            Started container flink-job-manager
  Warning  BackOff      2m8s (x10 over 4m18s)  kubelet            Back-off restarting failed container
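
The events show the container being created and started five times, so the image was pulled and the process is exiting after startup; the events alone do not say why. A minimal sketch of a possible next diagnostic step follows. The helper name `jm_debug_cmds` is hypothetical and only prints the kubectl commands (a dry run), on the assumption that the pod lives in the `default` namespace as in the output above.

```shell
# Hypothetical helper (not part of Flink or kubectl): prints the commands
# that could be used to inspect a crash-looping JobManager pod.
# $1 = pod name, $2 = namespace (assumed to be "default" if omitted).
jm_debug_cmds() {
  pod="$1"
  ns="${2:-default}"
  # Log of the previous (crashed) container instance.
  printf 'kubectl -n %s logs %s --previous\n' "$ns" "$pod"
  # Check whether the ConfigMaps named in the FailedMount warnings exist by now.
  printf 'kubectl -n %s get configmap\n' "$ns"
}
```

To actually run the printed commands against the cluster, the output could be piped to a shell, for example `jm_debug_cmds my-flink-cluster-77c6f85879-9vcx8 | sh`.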
On 2020-12-28 10:40:59, "Yang Wang" <[email protected]> wrote:
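>There are two problems in your overall process:
>
>1. The image cannot be found
>This is most likely related to minikube's driver setting. With hyperkit or another VM-based driver, you need to
>minikube ssh into the VM and check whether the image actually exists there
>
>2. The JM link cannot be reached
>2020-12-27 22:08:12,387 INFO
>org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create
>flink session cluster session001 successfully, JobManager Web Interface:
>http://192.168.99.100:8081
>
>My guess is that the log line above was not printed by the command you pasted: that command uses NodePort, so the JM address it prints should not be on port 8081.
>As long as the job you submit on minikube sets kubernetes.rest-service.exposed.type=NodePort and the JM comes up, the printed JM address will be reachable
>
>You can of course also assemble the link by hand: use minikube ip to get the APIServer address, then use kubectl get svc to look up the NodePort of the rest svc
>that belongs to the Flink Session Cluster you created, and join the two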
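
The manual assembly described above can be sketched roughly as follows. The helper name `flink_ui_url` is hypothetical, and the service name `flink-session-cluster-rest` in the usage comment is an assumption (the cluster-id followed by `-rest`); the real name should be confirmed with `kubectl get svc`.

```shell
# Hypothetical helper: joins a node IP and a NodePort into the Web UI URL.
# $1 = node IP (e.g. the output of `minikube ip`), $2 = NodePort of the rest svc.
flink_ui_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Usage against a running minikube (service name is an assumption):
#   ip=$(minikube ip)
#   port=$(kubectl get svc flink-session-cluster-rest \
#            -o jsonpath='{.spec.ports[0].nodePort}')
#   flink_ui_url "$ip" "$port"
```

Keeping the URL construction separate from the lookups makes it easy to check the result by hand when either lookup returns something unexpected.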
>
>
>Best,
>Yang
>
>陈帅 <[email protected]> wrote on Sun, Dec 27, 2020 at 10:51 PM:
>
>>
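>> This is my first attempt at deploying Flink on k8s. I am using version 1.12.0 with JDK 1.8.0_275 and Scala 2.12.12, and my Mac has a single-node minikube environment installed. These are the steps I followed: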
>>
>>
>> git clone https://github.com/apache/flink-docker
>> cd flink-docker/1.12/scala_2.12-java8-debian
>> docker build --tag flink:1.12.0-scala_2.12-java8 .
>>
>>
>> cd flink-1.12.0
>> ./bin/kubernetes-session.sh \
>> -Dkubernetes.container.image=flink:1.12.0-scala_2.12-java8 \
>> -Dkubernetes.rest-service.exposed.type=NodePort \
>> -Dtaskmanager.numberOfTaskSlots=2 \
>> -Dkubernetes.cluster-id=flink-session-cluster
>>
>>
>> It reported that the JM had started, but the web UI could not be reached
>>
>> 2020-12-27 22:08:12,387 INFO
>> org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create
>> flink session cluster session001 successfully, JobManager Web Interface:
>> http://192.168.99.100:8081
>>
>>
>>
>>
>> `kubectl get pods` showed the pod stuck in the ContainerCreating state
>>
>> NAME                                               READY   STATUS              RESTARTS   AGE
>> flink-session-cluster-858bd55dff-bzjk2             0/1     ContainerCreating   0          5m59s
>> kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running             0          6d14h
>>
>>
>>
>>
>> So I ran `kubectl describe pod flink-session-cluster-858bd55dff-bzjk2` to see the details, with the following result:
>>
>>
>>
>>
>> Name: flink-session-cluster-858bd55dff-bzjk2
>>
>> Namespace: default
>>
>> Priority: 0
>>
>> Node: minikube/192.168.99.100
>>
>> Start Time: Sun, 27 Dec 2020 22:21:56 +0800
>>
>> Labels: app=flink-session-cluster
>>
>> component=jobmanager
>>
>> pod-template-hash=858bd55dff
>>
>> type=flink-native-kubernetes
>>
>> Annotations: <none>
>>
>> Status: Pending
>>
>> IP: 172.17.0.4
>>
>> IPs:
>>
>> IP: 172.17.0.4
>>
>> Controlled By: ReplicaSet/flink-session-cluster-858bd55dff
>>
>> Containers:
>>
>> flink-job-manager:
>>
>> Container ID:
>>
>> Image: flink:1.12.0-scala_2.12-java8
>>
>> Image ID:
>>
>> Ports: 8081/TCP, 6123/TCP, 6124/TCP
>>
>> Host Ports: 0/TCP, 0/TCP, 0/TCP
>>
>> Command:
>>
>> /docker-entrypoint.sh
>>
>> Args:
>>
>> native-k8s
>>
>> $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824
>> -Xms1073741824 -XX:MaxMetaspaceSize=268435456
>> -Dlog.file=/opt/flink/log/jobmanager.log
>> -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
>> -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
>> -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties
>> org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint
>> -D jobmanager.memory.off-heap.size=134217728b -D
>> jobmanager.memory.jvm-overhead.min=201326592b -D
>> jobmanager.memory.jvm-metaspace.size=268435456b -D
>> jobmanager.memory.heap.size=1073741824b -D
>> jobmanager.memory.jvm-overhead.max=201326592b
>>
>> State: Waiting
>>
>> Reason: ImagePullBackOff
>>
>> Ready: False
>>
>> Restart Count: 0
>>
>> Limits:
>>
>> cpu: 1
>>
>> memory: 1600Mi
>>
>> Requests:
>>
>> cpu: 1
>>
>> memory: 1600Mi
>>
>> Environment:
>>
>> _POD_IP_ADDRESS: (v1:status.podIP)
>>
>> HADOOP_CONF_DIR: /opt/hadoop/conf
>>
>> Mounts:
>>
>> /opt/flink/conf from flink-config-volume (rw)
>>
>> /opt/hadoop/conf from hadoop-config-volume (rw)
>>
>> /var/run/secrets/kubernetes.io/serviceaccount from
>> default-token-s47ht (ro)
>>
>> Conditions:
>>
>> Type Status
>>
>> Initialized True
>>
>> Ready False
>>
>> ContainersReady False
>>
>> PodScheduled True
>>
>> Volumes:
>>
>> hadoop-config-volume:
>>
>> Type: ConfigMap (a volume populated by a ConfigMap)
>>
>> Name: hadoop-config-flink-session-cluster
>>
>> Optional: false
>>
>> flink-config-volume:
>>
>> Type: ConfigMap (a volume populated by a ConfigMap)
>>
>> Name: flink-config-flink-session-cluster
>>
>> Optional: false
>>
>> default-token-s47ht:
>>
>> Type: Secret (a volume populated by a Secret)
>>
>> SecretName: default-token-s47ht
>>
>> Optional: false
>>
>> QoS Class: Guaranteed
>>
>> Node-Selectors: <none>
>>
>> Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
>>
>> node.kubernetes.io/unreachable:NoExecute op=Exists for
>> 300s
>>
>> Events:
>>
>>   Type     Reason       Age                  From               Message
>>   ----     ------       ----                 ----               -------
>>   Normal   Scheduled    21m                  default-scheduler  Successfully assigned default/flink-session-cluster-858bd55dff-bzjk2 to minikube
>>   Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-flink-session-cluster" not found
>>   Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-flink-session-cluster" not found
>>   Normal   Pulling      13m (x4 over 21m)    kubelet            Pulling image "flink:1.12.0-scala_2.12-java8"
>>   Warning  Failed       13m (x4 over 15m)    kubelet            Failed to pull image "flink:1.12.0-scala_2.12-java8": rpc error: code = Unknown desc = Error response from daemon: manifest for flink:1.12.0-scala_2.12-java8 not found: manifest unknown: manifest unknown
>>   Normal   BackOff      13m (x5 over 15m)    kubelet            Back-off pulling image "flink:1.12.0-scala_2.12-java8"
>>   Warning  Failed       11m (x5 over 15m)    kubelet            Error: ErrImagePull
>>   Warning  Failed       100s (x53 over 15m)  kubelet            Error: ImagePullBackOff
>>
>>
>>
>>
>> At first I suspected the local image had not been built, so I checked with `docker images`
>>
>> REPOSITORY   TAG                       IMAGE ID       CREATED        SIZE
>> flink        1.12.0-scala_2.12-java8   f7dd9b9e020b   12 hours ago   642MB
>>
>>
>>
>>
>>
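>> This shows the image does exist, which is odd. Why would pulling a local image fail? Is something misconfigured? And with minikube, how do I reach the dashboard of the Flink cluster running on k8s from a local browser?
>>
>> This is my first time using k8s, so any pointers would be appreciated. Thanks!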
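
One possible explanation, in line with the VM-driver point in the reply: `docker images` on the Mac lists images held by the host Docker daemon, while the kubelet inside the minikube VM pulls through the VM's own daemon, which may never have seen the locally built tag. A minimal sketch of how that could be checked is below. The helper name `has_image` is hypothetical; the usage comments assume a VM-based driver and that `minikube ssh` and `minikube docker-env` are available in the installed minikube version.

```shell
# Hypothetical helper: succeeds if the exact "repository:tag" given as $1
# appears in the image list read from stdin. Carriage returns are stripped
# because output relayed over an ssh terminal may end lines with \r\n.
has_image() {
  tr -d '\r' | grep -Fxq -- "$1"
}

# Usage (check the daemon inside the minikube VM, not the host daemon):
#   minikube ssh -- docker images --format '{{.Repository}}:{{.Tag}}' \
#     | has_image flink:1.12.0-scala_2.12-java8 && echo present || echo missing
#
# If the image is missing there, one option is to point the local docker CLI
# at minikube's daemon and rebuild in the same shell session:
#   eval "$(minikube docker-env)"
#   docker build --tag flink:1.12.0-scala_2.12-java8 .
```

Building against minikube's daemon avoids a registry round trip, at the cost of having to rebuild whenever the VM is recreated.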