mosche commented on issue #23440: URL: https://github.com/apache/beam/issues/23440#issuecomment-1271567347
The solution is to set these two environment variables for the Spark workers:

```shell
# By default Beam expects workers (of the SDK worker pool) to connect to a Spark worker on `localhost`. When running
# the worker pool in docker on a Mac this isn't possible due to the lack of `host` networking. Using
# BEAM_WORKER_POOL_IN_DOCKER_VM=1, Beam will use `host.docker.internal` to communicate via the docker host instead.
export BEAM_WORKER_POOL_IN_DOCKER_VM=1

# DOCKER_MAC_CONTAINER=1 limits the ports on a Spark worker for communication with SDK workers to the range 8100 - 8200
# instead of using random ports. Ports of the range are used in a round-robin fashion and have to be published.
export DOCKER_MAC_CONTAINER=1
```

See https://github.com/mosche/beam-portable-spark for a working example.
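If the Spark worker itself runs in a container, the two variables and the published port range could be passed on the `docker run` command line. The following is only a rough sketch: the image name `my-spark-worker:latest` and the `SPARK_WORKER_IMAGE` variable are hypothetical placeholders, and mapping the range to the same host ports is an assumption. The function only prints the command so the flags can be checked before anything is started.

```shell
# Hypothetical image name; replace it with your own Spark worker image.
SPARK_WORKER_IMAGE="${SPARK_WORKER_IMAGE:-my-spark-worker:latest}"

# Print (rather than execute) a `docker run` command for a Spark worker that
# sets both variables and publishes the port range 8100-8200.
spark_worker_cmd() {
  echo docker run --rm \
    -e BEAM_WORKER_POOL_IN_DOCKER_VM=1 \
    -e DOCKER_MAC_CONTAINER=1 \
    -p 8100-8200:8100-8200 \
    "$SPARK_WORKER_IMAGE"
}

spark_worker_cmd
```

Once the printed command looks right, it can be run directly or adapted to an equivalent `docker-compose` service definition.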
