Hi all.

PROBLEM: We cannot run a PyFlink job with our custom libraries using the Flink
Kubernetes Operator.

We have a Kubernetes deployment of Flink 1.20.2, running in session cluster mode.

We have a bunch of PyFlink jobs, all sharing the same structure:

  *  Entrypoint Python script: defines the job.
  *  Core library: core Python classes, shared by all environments.
  *  Client library: client-specific Python classes.
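
For context, the entrypoint is roughly shaped like this (a minimal sketch;
core_lib, client1_lib, build_pipeline and Lookup are illustrative stand-ins
for our actual packages):

import argparse

from pyflink.datastream import StreamExecutionEnvironment

# illustrative imports from the core and client wheels
from core_lib.pipelines import build_pipeline
from client1_lib.lookups import Lookup

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--lookup", required=True)
    args = parser.parse_args()

    env = StreamExecutionEnvironment.get_execution_environment()
    build_pipeline(env, Lookup(args.lookup))
    env.execute("SCHED workflow")

if __name__ == "__main__":
    main()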

We can successfully launch those jobs from the command line, even from a
specialized deployer pod. The command we use is:

bin/flink run \
  --pyFiles core-1.1.0-…whl \
  --pyFiles client1-1.2.1…whl \
  --python workflows/SCHED/entrypoint.py \
    --lookup Name

And this works fine. However, when we try to launch them with the Flink
Kubernetes Operator, we use the following FlinkSessionJob manifest:

apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: client-shed-job
spec:
  deploymentName: client1-main-flink
  job:
    args:
    - --python
    - /opt/flink/workflows/SCHED/entrypoint.py
    entryClass: org.apache.flink.client.python.PythonDriver
    parallelism: 1
    state: running
    upgradeMode: savepoint
  restartNonce: 1

If we try to add --pyFiles arguments, PythonDriver complains about them. And
indeed, looking into the source, there is no such option in PythonDriver.
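
The only workaround we can think of (untested; we are not sure the operator
applies it at submission time) is to pass the wheels through the python.files
configuration option, which is the configuration counterpart of the CLI
--pyFiles option, along these lines:

spec:
  deploymentName: client1-main-flink
  flinkConfiguration:
    # paths and file names below are illustrative
    python.files: /opt/flink/libs/core-1.1.0.whl,/opt/flink/libs/client1-1.2.1.whl
  job:
    args:
    - --python
    - /opt/flink/workflows/SCHED/entrypoint.py
    entryClass: org.apache.flink.client.python.PythonDriver

But even then the wheels would have to be reachable from inside the cluster,
e.g. baked into the image or mounted as a volume.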

So, how is this intended to be used? Do our libraries have to be pre-installed
in the Docker image?

Nix.
