Abhinav Sarkari created BEAM-13787:
--------------------------------------

             Summary: Job in Flink + K8S Cluster exits without executing the 
pipeline
                 Key: BEAM-13787
                 URL: https://issues.apache.org/jira/browse/BEAM-13787
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-harness
    Affects Versions: 2.34.0
         Environment: Python version: 3.7.9
Flink version: 1.13.5
Beam version: 2.34.0
Beam job server: apache/beam_flink1.13_job_server
Kubernetes (GCP/GKE) version: 1.20.11-gke.1300
            Reporter: Abhinav Sarkari


 

I am running a Beam pipeline on Flink, deployed on standalone Kubernetes in 
session mode.

The Beam pipeline options are:

options = [
"--runner=PortableRunner",
"--job_endpoint=beam-job-server:8099",
"--artifact_endpoint=beam-job-server:8098",
"--environment_type=EXTERNAL",
"--environment_config=localhost:50000"
]
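
As a minimal sketch of how these options could be sanity-checked, the snippet below parses the flags into a dictionary and probes whether the worker pool address given in --environment_config accepts TCP connections. The helper names (parse_flags, split_endpoint, is_reachable) are hypothetical and not part of Beam; only the Python standard library is used.

```python
import socket

options = [
    "--runner=PortableRunner",
    "--job_endpoint=beam-job-server:8099",
    "--artifact_endpoint=beam-job-server:8098",
    "--environment_type=EXTERNAL",
    "--environment_config=localhost:50000",
]


def parse_flags(flags):
    """Turn ["--key=value", ...] into {"key": "value", ...} (hypothetical helper)."""
    parsed = {}
    for flag in flags:
        key, _, value = flag.lstrip("-").partition("=")
        parsed[key] = value
    return parsed


def split_endpoint(endpoint):
    """Split "host:port" into (host, int(port))."""
    host, _, port = endpoint.rpartition(":")
    return host, int(port)


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


flags = parse_flags(options)
worker_pool = split_endpoint(flags["environment_config"])
```

Calling is_reachable(*worker_pool) from inside the taskmanager pod would indicate whether the worker pool container is listening on the configured port. It only tests TCP connectivity, not whether the worker pool service behaves correctly.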

Worker pools are deployed as containers in the taskmanager pod. A separate job 
server runs as a Kubernetes deployment with the image 
apache/beam_flink1.13_job_server.

 

The worker pool uses a custom image that is built on top of 
apache/beam_python3.7_sdk:2.34.0 and contains the user code.

 

The client successfully submits the job, and it gets dispatched to the 
worker-pool container. However, the Python subprocess appears to crash fatally 
without executing the pipeline. The pipeline status on the client side shows 
as DONE:

2022-02-01 00:52:11,910 None wait_until_finish_read (ProcessId : 1) INFO 
portable_runner.py:(576) Job state changed to DONE

 

 

Log from the worker pool container:

 
beam-worker-pool
2022/02/01 00:52:05 Initializing python harness: /opt/apache/beam/boot --id=1-1 
--logging_endpoint=localhost:44717 --artifact_endpoint=localhost:42449 
--provision_endpoint=localhost:35353 --control_endpoint=localhost:34919

 
beam-worker-pool
2022/02/01 00:52:05 Installing setup packages ...

 
beam-worker-pool
2022/02/01 00:52:05 Executing: python -m 
apache_beam.runners.worker.sdk_worker_main

 
beam-worker-pool
2022/02/01 00:52:08 Python exited: <nil>
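
The timing in this log can be read off with a short script. The sketch below is only an illustration: it re-joins the wrapped log lines, drops the beam-worker-pool container labels, extracts the harness endpoints, and computes how long the Python worker process lived. The variable and function names are hypothetical.

```python
import re
from datetime import datetime

# Worker pool log from the report, with wrapped lines re-joined.
LOG = """\
2022/02/01 00:52:05 Initializing python harness: /opt/apache/beam/boot --id=1-1 --logging_endpoint=localhost:44717 --artifact_endpoint=localhost:42449 --provision_endpoint=localhost:35353 --control_endpoint=localhost:34919
2022/02/01 00:52:05 Installing setup packages ...
2022/02/01 00:52:05 Executing: python -m apache_beam.runners.worker.sdk_worker_main
2022/02/01 00:52:08 Python exited: <nil>
"""

LINE = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def parse(log):
    """Return a list of (timestamp, message) tuples for each log line."""
    events = []
    for line in log.splitlines():
        match = LINE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            events.append((stamp, match.group(2)))
    return events


events = parse(LOG)

# Endpoints passed to the boot binary on the first log line.
endpoints = dict(re.findall(r"--(\w+_endpoint)=(\S+)", events[0][1]))

# Seconds between launching the Python worker and its exit.
started = next(ts for ts, msg in events if msg.startswith("Executing:"))
exited = next(ts for ts, msg in events if msg.startswith("Python exited:"))
harness_lifetime = (exited - started).total_seconds()
```

With the log above, harness_lifetime is 3.0 seconds, so the worker process ended about three seconds after it was launched.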

 

 

There are no other log statements. The last one comes from the fatal log call 
at line 209 of the boot code:

https://github.com/apache/beam/blob/v2.34.0/sdks/python/container/boot.go
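
One possible reading of that last log line, offered as an assumption rather than a confirmed diagnosis: if the boot code prints the Go error returned from running the Python process, then <nil> would correspond to an exit status of zero, and a non-zero status would be printed as "exit status N". The Python sketch below mimics that convention with a hypothetical describe_exit helper. It does not reproduce the boot code itself.

```python
import subprocess
import sys


def describe_exit(returncode):
    """Format an exit code the way a Go error would print (assumed convention)."""
    if returncode == 0:
        # In Go, running a command that exits with status 0 yields a nil error.
        return "Python exited: <nil>"
    return f"Python exited: exit status {returncode}"


# A child process that ends normally, and one that ends with status 3.
clean = subprocess.run([sys.executable, "-c", "pass"])
failed = subprocess.run([sys.executable, "-c", "raise SystemExit(3)"])

print(describe_exit(clean.returncode))
print(describe_exit(failed.returncode))
```

Under this reading, the worker process may have returned normally instead of crashing, which would be worth ruling out when investigating why the pipeline did not run.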

 

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
