Hi All,
I have a conceptual misunderstanding that is keeping me from running Beam
on Flink. The main blocker right now is that I cannot use Docker containers.
I would like to run using something like:
python -m wordcount_flink \
--input war_and_peace.txt \
--runner=PortableRunner \
--job_endpoint=my-host:8099 \
--output result_flink
The first thing I did was start up a Flink cluster using
${FLINK_ROOT}/bin/start-cluster.sh. The Flink web UI confirms I have workers
running.
Then, I started up a JobServer:
./gradlew :runners:flink:1.8:job-server:runShadow \
-PflinkMasterUrl=my-host:8082
Finally, I attempted to run the wordcount sample:
python -m wordcount_flink \
--input war_and_peace.txt \
--runner=PortableRunner \
--job_endpoint=my-host:8099 \
--output result_flink
This failed with:
... java.io.IOException: Received exit code 126 for command 'docker run -d
--network=host --env=DOCKER_MAC_CONTAINER=null apachebeam/python3.6_sdk:2.16.0
--id=1-1 --logging_endpoint=localhost:39858 --artifact_endpoint=localhost:39422
--provision_endpoint=localhost:40678 --control_endpoint=localhost:35088'
... docker: Got permission denied while trying to connect to the Docker daemon
socket at unix:///var/run/docker.soc
Instead, I want to use the FlinkRunner with environment_type=PROCESS and
environment_config=????. What do I need to set to make this work?
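To make the question concrete, here is a minimal sketch of what I assume the
flags would look like. My understanding is that for environment_type=PROCESS,
environment_config is a JSON string whose "command" entry names the executable
that starts the SDK harness. The helper name process_environment_args and the
path /path/to/boot are hypothetical placeholders, not something I have working.

```python
import json
import shlex

# Hypothetical placeholder: path to the SDK harness launcher ("boot") executable.
# Assumption: this is built from the Beam source tree; replace with the real path.
BOOT_COMMAND = "/path/to/boot"


def process_environment_args(boot_command, job_endpoint):
    """Build pipeline flags that request a PROCESS environment instead of Docker."""
    # Assumption: the PROCESS environment reads a JSON object from
    # environment_config, with "command" naming the harness executable.
    environment_config = json.dumps({"command": boot_command})
    return [
        "--runner=PortableRunner",
        "--job_endpoint=" + job_endpoint,
        "--environment_type=PROCESS",
        "--environment_config=" + environment_config,
    ]


if __name__ == "__main__":
    args = process_environment_args(BOOT_COMMAND, "my-host:8099")
    # Print a shell-quoted command line so the JSON survives copy and paste.
    print("python -m wordcount_flink " + " ".join(shlex.quote(a) for a in args))
```

If this is roughly right, the remaining question is where the boot executable
comes from and what else it needs on the worker machines.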
More generally, I'd like to see some sort of diagram describing which
component calls which, and when, if anything like that is available.
Thank you for any guidance.
Please note: I have cross-posted this on Stack Overflow:
https://stackoverflow.com/questions/58718098/what-environment-config-for-beam-launching-flink
If I get an answer here, I'll post it there (or you can).
Regards.