I ran the same steps on two Kubernetes versions, with the following results:

v1.17.4  failed
v1.15.1  succeeded


The steps are as follows:


Create the RBAC resources

rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
 name: flink
 namespace: flink
---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
 name: flink-role-binding
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: ClusterRole
 name: edit
subjects:
- kind: ServiceAccount
  name: flink
  namespace: flink
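
After applying rbac.yaml, one way to confirm that the binding grants what the session cluster needs is to impersonate the service account with kubectl auth can-i. The sketch below is only an illustration: the helper name check_flink_rbac and the KUBECTL override are hypothetical, and the list of resource kinds is an assumption about what a native session cluster creates, not something taken from the Flink documentation.

```shell
# Hypothetical helper, not part of Flink or kubectl.
# KUBECTL can be overridden, e.g. KUBECTL=echo for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

check_flink_rbac() {
    # Namespace and service account come from rbac.yaml above.
    ns=flink
    sa="system:serviceaccount:flink:flink"
    # Assumed set of resource kinds the session cluster has to create.
    for resource in pods services configmaps deployments; do
        # kubectl prints "yes" or "no" for each check.
        $KUBECTL auth can-i create "$resource" --as="$sa" -n "$ns"
    done
}
```

Running check_flink_rbac on both the v1.15.1 and the v1.17.4 cluster and comparing the answers would show whether the two clusters differ at the RBAC level.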



Then run the following command:

/usr/local/flink/flink-1.10.1/bin/kubernetes-session.sh \
         -Dkubernetes.cluster-id=flink \
         -Dkubernetes.jobmanager.service-account=flink \
         -Dtaskmanager.memory.process.size=4096m \
         -Dkubernetes.taskmanager.cpu=2 \
         -Dtaskmanager.numberOfTaskSlots=4 \
         -Dkubernetes.namespace=flink \
         -Dkubernetes.rest-service.exposed.type=NodePort \
         -Dakka.framesize=104857600b \
         -Dkubernetes.container.image=flink:1.10.1


On Kubernetes 1.15 the cluster is built normally. On 1.17 the pod reports "Back-off restarting failed container", and the log contains nothing except the following:

Start command : /bin/bash -c $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH 
-Xms424m -Xmx424m -Dlog.file=/opt/flink/log/jobmanager.log 
-Dlogback.configurationFile=file:/opt/flink/conf/logback.xml 
-Dlog4j.configuration=file:/opt/flink/conf/log4j.properties 
org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint 1> 
/opt/flink/log/jobmanager.out 2> /opt/flink/log/jobmanager.err


I also tried submitting through a Job. The YAML is as follows:


apiVersion: batch/v1
kind: Job
metadata:
 name: boot-flink
 namespace: flink
spec:
 template:
   spec:
     serviceAccount: flink
     restartPolicy: OnFailure
     containers:
     - name: start
       image: flink:1.10.1
       workingDir: /opt/flink
       command: ["bash", "-c", "$FLINK_HOME/bin/kubernetes-session.sh \
         -Dkubernetes.cluster-id=roc \
         -Dkubernetes.jobmanager.service-account=flink \
         -Dtaskmanager.memory.process.size=1024m \
         -Dkubernetes.taskmanager.cpu=1 \
         -Dtaskmanager.numberOfTaskSlots=1 \
         -Dkubernetes.container.image=flink:1.10 \
         -Dkubernetes.namespace=flink"]

This works on 1.15. On 1.17 the corresponding service is never created, so the failure symptom is different from the one above.



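Is there something I have overlooked in these steps? The scenario I ran into is described above. I would appreciate any replies or explanations. Thank you very much.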

Looking forward to your reply and help.

Best


a511955993