damccorm opened a new issue, #20973:
URL: https://github.com/apache/beam/issues/20973

   When running a Beam pipeline with Flink as the backend, the Python SDK 
harness hangs while trying to install pip packages. Tested using Flink 1.10.3.
   
   Images used: 
   
   apache/beam_python3.7_sdk:2.28.0
   
   apache/flink:1.10.3
   
   Beam args used are:
   
   "--runner=FlinkRunner",
   "--flink_version=1.10", //same with 1.13
   "--flink_master=http://flink-jobmanager.default:8081",
   f"--artifacts_dir=/mnt/flink",
   "--environment_type=EXTERNAL",
   "--environment_config=localhost:50000",
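
   For reference, the same arguments can be written as a plain Python list 
with ASCII double hyphens. This is only a sketch: `beam_args` and 
`malformed_flags` are hypothetical names, not part of Beam or TFX. The helper 
flags any argument that does not start with `--`, for example one where a 
typographic dash slipped in during copy and paste.

```python
# Pipeline arguments from the report, written with plain ASCII double hyphens.
beam_args = [
    "--runner=FlinkRunner",
    "--flink_version=1.10",  # same behaviour reported with 1.13
    "--flink_master=http://flink-jobmanager.default:8081",
    "--artifacts_dir=/mnt/flink",
    "--environment_type=EXTERNAL",
    "--environment_config=localhost:50000",
]


def malformed_flags(args):
    """Return the arguments that do not start with an ASCII '--'."""
    return [arg for arg in args if not arg.startswith("--")]
```

   An empty result from `malformed_flags(beam_args)` means every flag uses the 
expected prefix; it says nothing about whether the values themselves are 
valid.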
   
    
   
   Specifically this was tested by running a TFX pipeline which gets submitted 
and registered as it should, but the SDK Harness hangs when installing:
   
    2021/03/10 12:16:20 Initializing python harness: /opt/apache/beam/boot 
--id=1-1 --logging_endpoint=localhost:39795 
--artifact_endpoint=localhost:34095 --provision_endpoint=localhost:42999 
--control_endpoint=localhost:38129
    2021/03/10 12:16:20 Found artifact: tfx_ephemeral-0.27.0.tar.gz
    2021/03/10 12:16:20 Found artifact: extra_packages.txt
    2021/03/10 12:16:20 Installing setup packages ...
    2021/03/10 12:16:20 Installing extra package: tfx_ephemeral-0.27.0.tar.gz
   
   and nothing else is shown regardless of how long it is left running. I can 
manually install the TFX package by exec-ing into the container in under 3 
minutes.
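
   To put a number on how long the harness has been silent, the timestamps in 
the harness log can be parsed. The following is a minimal sketch that assumes 
the `YYYY/MM/DD HH:MM:SS` prefix seen in the lines above; 
`last_harness_step` and `silent_for_seconds` are hypothetical helper names, 
not Beam APIs.

```python
import re
from datetime import datetime

# Harness log lines in the report start with "YYYY/MM/DD HH:MM:SS ".
_STAMP = re.compile(r"^\s*(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def last_harness_step(lines):
    """Return (timestamp, message) of the last timestamped line, or None."""
    last = None
    for line in lines:
        match = _STAMP.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            last = (stamp, match.group(2))
    return last


def silent_for_seconds(lines, now):
    """Seconds between the last timestamped line and `now`, or None."""
    step = last_harness_step(lines)
    return None if step is None else (now - step[0]).total_seconds()
```

   Both functions ignore lines without a timestamp prefix, so wrapped 
continuation lines do not affect the result. `now` must be in the same time 
zone as the log timestamps.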
   
   The Flink task manager then sits idle and periodically logs:
   
   2021-03-10 11:29:26,287 INFO 
org.apache.beam.runners.fnexecution.environment.ExternalEnvironmentFactory - 
Still waiting for startup of environment from localhost:50000 for worker id 1-1
   
   Helm charts attached below.
   
   Imported from Jira 
[BEAM-11959](https://issues.apache.org/jira/browse/BEAM-11959). Original Jira 
may contain additional context.
   Reported by: ConverJens.


