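Native mode exposes the REST service as a LoadBalancer by default, so the client prints an address that you cannot reach.
You can add -Dkubernetes.rest-service.exposed.type=NodePort to expose it as a NodePort instead.
The address printed by the Flink client will then be correct.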
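
With NodePort exposure, the reachable address is a node IP plus the NodePort of the REST service. A minimal sketch of putting the two together follows. The function names are made up for illustration, and it assumes that minikube and kubectl are on the PATH, that the REST service is named "<cluster-id>-rest", and that the web port is the first port listed on it; please confirm with kubectl get svc before relying on it.

```shell
#!/bin/sh
# Sketch: assemble the JobManager web UI address for a NodePort-exposed
# session cluster. Function names are hypothetical helpers, not Flink tooling.

# Join a node IP and a NodePort into a URL (pure string formatting).
flink_ui_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Look up both values on a live minikube cluster.
# Assumptions: minikube and kubectl are installed, the REST service is
# named "<cluster-id>-rest", and the web port is its first port entry.
flink_ui_url_from_cluster() {
  namespace="$1"
  cluster_id="$2"
  node_ip="$(minikube ip)" || return 1
  node_port="$(kubectl get svc "${cluster_id}-rest" -n "${namespace}" \
    -o jsonpath='{.spec.ports[0].nodePort}')" || return 1
  flink_ui_url "${node_ip}" "${node_port}"
}

# Example call with the namespace and cluster id from this thread
# (needs a running cluster, so it is left commented out):
# flink_ui_url_from_cluster flink-session-cluster session001
```

Keeping the string formatting separate from the lookup means the URL can also be built by hand from values copied out of the terminal.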
You can also run minikube ip to look up the IP address, and kubectl get svc to get the NodePort of the service of the Flink cluster you created. Put the two together and you have the address.

As for the NoResourceAvailableException, check whether the TaskManager Pod has been created but is stuck in Pending.
If it is, minikube does not have enough resources. Either give minikube more resources or reduce the resources of the JobManager and TaskManager Pods.
If it is not, please send the complete JobManager log so that the problem is easier to track down.

Best,
Yang

陈帅 <[email protected]> wrote on Sat, Jan 2, 2021 at 10:43 AM:

> Environment: MacBook Pro, single machine, with minikube v1.15.1 and kubernetes v1.19.4 installed.
> With the flink v1.11.3 release I ran the following commands:
>
> kubectl create namespace flink-session-cluster
>
> kubectl create serviceaccount flink -n flink-session-cluster
>
> kubectl create clusterrolebinding flink-role-binding-flink \
>   --clusterrole=edit \
>   --serviceaccount=flink-session-cluster:flink
>
> ./bin/kubernetes-session.sh \
>   -Dkubernetes.namespace=flink-session-cluster \
>   -Dkubernetes.jobmanager.service-account=flink \
>   -Dkubernetes.cluster-id=session001 \
>   -Dtaskmanager.memory.process.size=8192m \
>   -Dkubernetes.taskmanager.cpu=1 \
>   -Dtaskmanager.numberOfTaskSlots=4 \
>   -Dresourcemanager.taskmanager-timeout=3600000
>
> The output shows the Flink web UI at http://192.168.64.2:8081 rather than
> at a 5-digit port such as http://192.168.50.135:31753. What is wrong? Should
> the host IP here be the minikube ip? I cannot reach http://192.168.64.2:8081
> from my local browser.
>
> 2021-01-02 10:28:04,177 INFO org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
> 2021-01-02 10:28:04,907 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.64.2:8081
>
> I checked the pods, service and deployment. They all started normally and show green.
>
> Then I submitted a job:
>
> ./bin/flink run -d \
>   -e kubernetes-session \
>   -Dkubernetes.namespace=flink-session-cluster \
>   -Dkubernetes.cluster-id=session001 \
>   examples/streaming/WindowJoin.jar
>
> Using windowSize=2000, data rate=3
> To customize example, use: WindowJoin [--windowSize
> <window-size-in-millis>] [--rate <elements-per-second>]
> 2021-01-02 10:21:48,658 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Retrieve flink cluster session001 successfully, JobManager Web Interface: http://10.106.136.236:8081
>
> I can reach the http://10.106.136.236:8081 shown here from my browser. It
> shows the job as running, and the available slots field shows 0. The JM log
> contains the following error:
>
> Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.
>     at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeWrapWithNoResourceAvailableException(DefaultScheduler.java:441) ~[flink-dist_2.12-1.11.3.jar:1.11.3]
>     ... 47 more
> Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
>     at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_275]
>     at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_275]
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607) ~[?:1.8.0_275]
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) ~[?:1.8.0_275]
>     ... 27 more
> Caused by: java.util.concurrent.TimeoutException
>     ... 25 more
>
> Why is this insufficient-resources error reported? Thanks for your help!
>
> At 2020-12-29 09:53:48, "Yang Wang" <[email protected]> wrote:
> >The ConfigMaps do not need to be created in advance. That Warning can be
> >ignored; it is normal, because the deployment is created first and the
> >ConfigMap afterwards.
> >You can follow the community documentation [1] to send the JM log to the
> >console and take a look.
> >
> >I suspect it is caused by not having created the service account [2].
> >
> >[1].
> > https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#log-files
> >[2].
> > https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#rbac
> >
> >Best,
> >Yang
> >
> >陈帅 <[email protected]> wrote on Mon, Dec 28, 2020 at 5:54 PM:
> >
> >> Today I switched to the latest officially released Flink image, version
> >> 1.11.3, and it does not start either.
> >> This is my command:
> >> ./bin/kubernetes-session.sh \
> >>   -Dkubernetes.cluster-id=rtdp \
> >>   -Dtaskmanager.memory.process.size=4096m \
> >>   -Dkubernetes.taskmanager.cpu=2 \
> >>   -Dtaskmanager.numberOfTaskSlots=4 \
> >>   -Dresourcemanager.taskmanager-timeout=3600000 \
> >>   -Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8 \
> >>   -Dkubernetes.namespace=rtdp
> >>
> >> Events:
> >>   Type     Reason          Age                From               Message
> >>   ----     ------          ----               ----               -------
> >>   Normal   Scheduled       88s                default-scheduler  Successfully assigned rtdp/rtdp-6d7794d65d-g6mb5 to cn-shanghai.192.168.16.130
> >>   Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-rtdp" not found
> >>   Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-rtdp" not found
> >>   Normal   AllocIPSucceed  87s                terway-daemon      Alloc IP 192.168.32.25/22 for Pod
> >>   Normal   Pulling         87s                kubelet            Pulling image "flink:1.11.3-scala_2.12-java8"
> >>   Normal   Pulled          31s                kubelet            Successfully pulled image "flink:1.11.3-scala_2.12-java8"
> >>   Normal   Created         18s (x2 over 26s)  kubelet            Created container flink-job-manager
> >>   Normal   Started         18s (x2 over 26s)  kubelet            Started container flink-job-manager
> >>   Normal   Pulled          18s                kubelet            Container image "flink:1.11.3-scala_2.12-java8" already present on machine
> >>   Warning  BackOff         10s                kubelet            Back-off restarting failed container
> >>
> >> Two ConfigMaps are not found here. Do they need to be created in
> >> advance? The official documentation does not say so, or did I miss it?
> >> https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#start-flink-session
> >>
> >> At 2020-12-27 22:50:32, "陈帅" <[email protected]> wrote:
> >>
> >> >This is my first attempt at deploying Flink on k8s. The version is
> >> >1.12.0, the JDK is 1.8.0_275 and Scala is 2.12.12. My Mac has a
> >> >single-node minikube environment installed. These are the steps I followed:
> >> >
> >> >git clone https://github.com/apache/flink-docker
> >> >cd flink-docker/1.12/scala_2.12-java8-debian
> >> >docker build --tag flink:1.12.0-scala_2.12-java8 .
> >> >
> >> >cd flink-1.12.0
> >> >./bin/kubernetes-session.sh \
> >> >  -Dkubernetes.container.image=flink:1.12.0-scala_2.12-java8 \
> >> >  -Dkubernetes.rest-service.exposed.type=NodePort \
> >> >  -Dtaskmanager.numberOfTaskSlots=2 \
> >> >  -Dkubernetes.cluster-id=flink-session-cluster
> >> >
> >> >The output says the JM has started, but I cannot reach it over the web.
> >> >
> >> >2020-12-27 22:08:12,387 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.99.100:8081
> >> >
> >> >`kubectl get pods` shows the pod stuck in the ContainerCreating state:
> >> >
> >> >NAME                                               READY   STATUS              RESTARTS   AGE
> >> >flink-session-cluster-858bd55dff-bzjk2             0/1     ContainerCreating   0          5m59s
> >> >kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running             0          6d14h
> >> >
> >> >So I ran `kubectl describe pod flink-session-cluster-858bd55dff-bzjk2`
> >> >to see the details. The result is:
> >> >
> >> >Name:         flink-session-cluster-858bd55dff-bzjk2
> >> >Namespace:    default
> >> >Priority:     0
> >> >Node:         minikube/192.168.99.100
> >> >Start Time:   Sun, 27 Dec 2020 22:21:56 +0800
> >> >Labels:       app=flink-session-cluster
> >> >              component=jobmanager
> >> >              pod-template-hash=858bd55dff
> >> >              type=flink-native-kubernetes
> >> >Annotations:  <none>
> >> >Status:       Pending
> >> >IP:           172.17.0.4
> >> >IPs:
> >> >  IP:           172.17.0.4
> >> >Controlled By:  ReplicaSet/flink-session-cluster-858bd55dff
> >> >Containers:
> >> >  flink-job-manager:
> >> >    Container ID:
> >> >    Image:         flink:1.12.0-scala_2.12-java8
> >> >    Image ID:
> >> >    Ports:         8081/TCP, 6123/TCP, 6124/TCP
> >> >    Host Ports:    0/TCP, 0/TCP, 0/TCP
> >> >    Command:
> >> >      /docker-entrypoint.sh
> >> >    Args:
> >> >      native-k8s
> >> >      $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=1073741824b -D jobmanager.memory.jvm-overhead.max=201326592b
> >> >    State:          Waiting
> >> >      Reason:       ImagePullBackOff
> >> >    Ready:          False
> >> >    Restart Count:  0
> >> >    Limits:
> >> >      cpu:     1
> >> >      memory:  1600Mi
> >> >    Requests:
> >> >      cpu:     1
> >> >      memory:  1600Mi
> >> >    Environment:
> >> >      _POD_IP_ADDRESS:   (v1:status.podIP)
> >> >      HADOOP_CONF_DIR:  /opt/hadoop/conf
> >> >    Mounts:
> >> >      /opt/flink/conf from flink-config-volume (rw)
> >> >      /opt/hadoop/conf from hadoop-config-volume (rw)
> >> >      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s47ht (ro)
> >> >Conditions:
> >> >  Type              Status
> >> >  Initialized       True
> >> >  Ready             False
> >> >  ContainersReady   False
> >> >  PodScheduled      True
> >> >Volumes:
> >> >  hadoop-config-volume:
> >> >    Type:      ConfigMap (a volume populated by a ConfigMap)
> >> >    Name:      hadoop-config-flink-session-cluster
> >> >    Optional:  false
> >> >  flink-config-volume:
> >> >    Type:      ConfigMap (a volume populated by a ConfigMap)
> >> >    Name:      flink-config-flink-session-cluster
> >> >    Optional:  false
> >> >  default-token-s47ht:
> >> >    Type:        Secret (a volume populated by a Secret)
> >> >    SecretName:  default-token-s47ht
> >> >    Optional:    false
> >> >QoS Class:       Guaranteed
> >> >Node-Selectors:  <none>
> >> >Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
> >> >                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
> >> >Events:
> >> >  Type     Reason       Age                  From               Message
> >> >  ----     ------       ----                 ----               -------
> >> >  Normal   Scheduled    21m                  default-scheduler  Successfully assigned default/flink-session-cluster-858bd55dff-bzjk2 to minikube
> >> >  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-flink-session-cluster" not found
> >> >  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-flink-session-cluster" not found
> >> >  Normal   Pulling      13m (x4 over 21m)    kubelet            Pulling image "flink:1.12.0-scala_2.12-java8"
> >> >  Warning  Failed       13m (x4 over 15m)    kubelet            Failed to pull image "flink:1.12.0-scala_2.12-java8": rpc error: code = Unknown desc = Error response from daemon: manifest for flink:1.12.0-scala_2.12-java8 not found: manifest unknown: manifest unknown
> >> >  Normal   BackOff      13m (x5 over 15m)    kubelet            Back-off pulling image "flink:1.12.0-scala_2.12-java8"
> >> >  Warning  Failed       11m (x5 over 15m)    kubelet            Error: ErrImagePull
> >> >  Warning  Failed       100s (x53 over 15m)  kubelet            Error: ImagePullBackOff
> >> >
> >> >At first I suspected the local image had not been built, so I checked
> >> >with `docker images`:
> >> >
> >> >REPOSITORY   TAG                       IMAGE ID       CREATED        SIZE
> >> >flink        1.12.0-scala_2.12-java8   f7dd9b9e020b   12 hours ago   642MB
> >> >
> >> >The image clearly exists, which is strange. Why does pulling the image
> >> >locally fail? Is something wrong somewhere? And with minikube, how do I
> >> >reach the dashboard of the Flink cluster running on k8s from my local
> >> >browser?
> >> >
> >> >This is my first time using k8s, so any pointers are appreciated. Thanks!
