Hi Sonam,

I am not a long-standing Flink user (only three months), so perhaps others will have a more authoritative view.
I am using Flink on Kubernetes, and have had good success with the Google Flink operator (https://github.com/GoogleCloudPlatform/flink-on-k8s-operator). It includes Custom Resource Definitions (CRDs) so that you can define your Flink clusters in YAML and deploy them using kustomize. The result is:

- A Flink cluster of one job-manager and one or more task-managers.
- A Kubernetes Job which acts as the Flink "client" to submit the job to the job-manager (the job-submitter), e.g.:

    flink-example-job-submitter-g4s6g   0/1   Completed   0   6d15h
    flink-example-jobmanager-0          1/1   Running     3   6d15h
    flink-example-taskmanager-0         1/1   Running     3   6d15h

This all seems in keeping with Flink's "Per-Job Mode" deployment option (https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/overview/#per-job-mode).

Note: I'm only just getting into state persistence and recovery, so there is still some work to do, but I think this is largely a matter of understanding and configuration.

Hope that helps,
Paul

> On 17 Jun 2021, at 23:55, Sonam Mandal <soman...@linkedin.com> wrote:
>
> Hello,
>
> We are exploring running multiple Flink clusters within a Kubernetes cluster
> such that each Flink cluster can run with a specified Flink image version.
> Since the Flink job graph needs to be compatible with the Flink version
> running in the Flink cluster, this brings a challenge in how we ensure that
> the SQL job graph or Flink job jars are compatible with the Flink cluster
> users want to run them on.
>
> E.g. if the Flink cluster is running version 1.12.1, the job graph generated
> from the SQL must be created using compatible 1.12.1 Flink libraries.
> Otherwise, we see issues with deserialization etc.
>
> Is there a recommended way to handle this scenario today?
>
> Thanks,
> Sonam
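P.S. For illustration, here is a minimal sketch of what defining a Flink cluster in YAML via the operator's CRDs looks like. This is an assumption-laden example, not taken from my actual deployment: the field names follow the operator's v1beta1 examples as I remember them, and the image tag, jar path, and class name are placeholders — check the operator's own samples for your version.

```yaml
# Hypothetical FlinkCluster resource for the Google flink-on-k8s-operator.
# Field names follow the operator's v1beta1 examples; values are placeholders.
apiVersion: flinkoperator.k8s.io/v1beta1
kind: FlinkCluster
metadata:
  name: flink-example
spec:
  image:
    name: flink:1.13.1          # placeholder Flink image/version
  jobManager:
    resources:
      limits:
        memory: "1Gi"
  taskManager:
    replicas: 1                 # one-or-more task-managers
    resources:
      limits:
        memory: "1Gi"
  job:
    # The operator creates the job-submitter Job from this section.
    jarFile: /opt/flink/examples/streaming/WordCount.jar   # placeholder jar
    className: org.apache.flink.streaming.examples.wordcount.WordCount
    parallelism: 1
```

Applying this (e.g. via kustomize or kubectl) is what produces the jobmanager, taskmanager, and job-submitter pods shown above.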