Thanks. I still need to pass parameters to the boot executable, such as the worker id, the control endpoint, the logging endpoint, etc.
 
Where can I extract these parameters from? (In the apache_beam Python code, they can be extracted from the StartWorker request parameters.)
 
Also, how can the Spark executor find the port that the gRPC server is running on?
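(For reference, my rough understanding, which may well be wrong: in PROCESS mode the runner launches the configured command itself and passes these values as flags. A sketch of what that invocation might look like, with flag names taken from the Python container's boot program and with placeholder path and endpoint values:

    import subprocess

    # Rough sketch only: the runner, not the user, normally launches the boot
    # executable once per worker. Flag names follow the Python container's
    # boot program and may differ by Beam version; the path and endpoint
    # addresses below are placeholders.
    subprocess.Popen([
        "/path/to/linux_amd64/boot",
        "--id=1-1",                              # worker id assigned by the runner
        "--provision_endpoint=localhost:33333",  # boot pulls the rest of its config from here
        "--logging_endpoint=localhost:33334",
        "--artifact_endpoint=localhost:33335",
        "--control_endpoint=localhost:33336",
    ])
)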
 
Sent: Wednesday, November 06, 2019 at 5:07 PM
From: "Kyle Weaver" <kcwea...@google.com>
To: dev <dev@beam.apache.org>
Subject: Re: Command for Beam worker on Spark cluster
In Docker mode, almost everything is taken care of for you, but in process mode you have to do a lot of the setup yourself. The command you're looking for is `sdks/python/container/build/target/launcher/linux_amd64/boot`. You will need both that executable (which you can build from source using `./gradlew :sdks:python:container:build`) and a Python installation, including Beam and its other dependencies, on all of your worker machines.
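For concreteness, here is a hedged sketch of pipeline options that point the portable runner at that boot executable; the job_endpoint value and the boot path are placeholders you would replace with your own setup:

    from apache_beam.options.pipeline_options import PipelineOptions

    # Sketch only: PROCESS-mode options for a portable runner. The job_endpoint
    # and the boot path are placeholders; adjust them for your cluster.
    options = PipelineOptions([
        "--runner=PortableRunner",
        "--job_endpoint=localhost:8099",   # Spark job server address (placeholder)
        "--environment_type=PROCESS",
        # environment_config is a JSON object; its "command" is run on each worker node
        '--environment_config={"command": '
        '"/path/to/sdks/python/container/build/target/launcher/linux_amd64/boot"}',
    ])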
 
On Wed, Nov 6, 2019 at 2:24 PM Matthew K. <softm...@gmx.com> wrote:
Hi all,
 
I am trying to run a *Python* Beam pipeline on a Spark cluster. Since the workers run on separate nodes, I am using "PROCESS" as the "environment_type" in the pipeline options, but I couldn't find any documentation on what "command" I should pass in "environment_config" to run on the worker so that the executor can communicate with it.
 
Can someone help me with that?
