Hi leilinee,

I'm not sure whether this is the best practice, but I'd like to share
our experience configuring HDFS as checkpoint storage with the
Flink Kubernetes Operator.
There are two steps.

*Step 1)* Mount krb5.conf and the keytab file into the Flink Kubernetes Operator pod

Create a ConfigMap for krb5.conf and a Secret for the keytab, then
add the configuration below to the Flink Kubernetes Operator's
*values.yaml*:

operatorVolumeMounts:
  create: true
  data:
    - mountPath: /opt/flink/krb5.conf
      name: krb5-conf
      subPath: krb5.conf
    - mountPath: /opt/flink/{keytab_file}
      name: custom-keytab
      subPath: {keytab_file}
operatorVolumes:
  create: true
  data:
    - configMap:
        name: krb5-configmap
      name: krb5-conf
    - name: custom-keytab
      secret:
        secretName: custom-keytab
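
For reference, the ConfigMap and Secret that these volumes point to could
look roughly like the sketch below. The resource names come from the
values.yaml above; the namespace and the file contents are placeholders
I'm assuming here, and the keys under "data" need to match the subPath
values used in operatorVolumeMounts.

# Sketch only: namespace and contents are placeholders, not real values.
apiVersion: v1
kind: ConfigMap
metadata:
  name: krb5-configmap              # referenced by operatorVolumes above
  namespace: {operator_namespace}   # assumption: namespace of the operator
data:
  krb5.conf: |                      # key must match subPath: krb5.conf
    {contents_of_krb5_conf}
---
apiVersion: v1
kind: Secret
metadata:
  name: custom-keytab               # referenced by secretName above
  namespace: {operator_namespace}   # assumption: namespace of the operator
type: Opaque
data:
  {keytab_file}: {base64_encoded_keytab}   # key must match subPath: {keytab_file}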


*Step 2)* Configure the FlinkDeployment for your application as follows:

apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
spec:
  flinkConfiguration:
    state.checkpoint-storage: "filesystem"
    state.checkpoints.dir: "hdfs:{path_for_checkpoint}"
    security.kerberos.login.keytab: "/opt/flink/{keytab_file}"   # Absolute path in flink k8s operator pod
    security.kerberos.login.principal: "{principal_name}"
    security.kerberos.relogin.period: "5m"
    security.kerberos.krb5-conf.path: "/opt/flink/krb5.conf"     # Absolute path in flink k8s operator pod
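
The savepoint and HA directories from the official example can presumably
be pointed at HDFS in the same way. This is only a sketch that we have not
described above: the two path placeholders are hypothetical, and the keys
are the ones from the example quoted below.

# Sketch only: extra entries under spec.flinkConfiguration.
spec:
  flinkConfiguration:
    state.savepoints.dir: "hdfs:{path_for_savepoint}"      # hypothetical placeholder
    high-availability.storageDir: "hdfs:{path_for_ha}"     # hypothetical placeholder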


I hope this helps with your work.

Best regards
dongwoo



On Wed, Jun 21, 2023 at 7:36 PM, 李 琳 <leili...@outlook.com> wrote:

> Hi all,
>
> Recently, I have been testing the Flink Kubernetes Operator. In the
> official example, the checkpoint/savepoint path is configured with a file
> system:
>
>
> state.savepoints.dir: file:///flink-data/savepoints
> state.checkpoints.dir: file:///flink-data/checkpoints
> high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
> high-availability.storageDir: file:///flink-data/ha
>
> However, in our production environment, we use HDFS to store checkpoint
> data. I'm wondering if it's possible to store checkpoint data in the Flink
> Kubernetes Operator as well. If so, could you please guide me on how to set
> up HDFS configuration in the Flink Kubernetes Operator?
>
> I would greatly appreciate any assistance you can provide. Thank you!
>
