On Sat, May 5, 2018 at 3:58 PM, Robert Bradshaw <rober...@google.com> wrote:

>
> I would welcome changes to
> https://github.com/apache/beam/blob/v2.4.0/model/pipeline/src/main/proto/beam_runner_api.proto#L730
> that would provide alternatives to docker (one of which comes to mind is "I
> already brought up a worker(s) for you (which could be the same process
> that handled pipeline construction in testing scenarios), here's how to
> connect to it/them.") Another option, which would seem to appeal to you in
> particular, would be "the worker code is linked into the runner's binary,
> use this process as the worker" (though note even for java-on-java, it can
> be advantageous to shield the worker and runner code from each other's
> environments, dependencies, and version requirements.) This latter should
> still likely use the FnApi to talk to itself (either over GRPC on local
> ports, or possibly better via direct function calls eliminating the RPC
> overhead altogether--this is how the fast local runner in Python works).
> There may be runner environments well controlled enough that "start up the
> workers" could be specified as "run this command line." We should make this
> environment message extensible to other alternatives than "docker container
> url," though of course we don't want the set of options to grow too large
> or we lose the promise of portability unless every runner supports every
> protocol.
>
>
The pre-launched worker would be an interesting option, which might work
well for a sidecar deployment.
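
To make the alternatives concrete, here is a rough sketch in Python of what
an extensible environment description could look like. All names here
(DockerEnv, ExternalEnv, InProcessEnv, CommandEnv, describe_startup) are
made up for illustration; this is not the actual proto or any Beam API.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class DockerEnv:
    """The existing option: the runner starts a container from this URL."""
    container_url: str


@dataclass(frozen=True)
class ExternalEnv:
    """Workers were brought up in advance; the runner only connects."""
    worker_addresses: Tuple[str, ...]


@dataclass(frozen=True)
class InProcessEnv:
    """Worker code is linked into the runner; use the runner's own process."""


@dataclass(frozen=True)
class CommandEnv:
    """The runner starts workers by running a command line."""
    argv: Tuple[str, ...]


# Hypothetical union standing in for a "oneof" in the environment message.
Environment = Union[DockerEnv, ExternalEnv, InProcessEnv, CommandEnv]


def describe_startup(env: Environment) -> str:
    """Return what a runner would have to do for the given environment."""
    if isinstance(env, DockerEnv):
        return "run container " + env.container_url
    if isinstance(env, ExternalEnv):
        return "connect to existing workers at " + ", ".join(env.worker_addresses)
    if isinstance(env, InProcessEnv):
        return "use this process as the worker"
    if isinstance(env, CommandEnv):
        return "run command: " + " ".join(env.argv)
    raise TypeError("unsupported environment: %r" % (env,))
```

A sidecar deployment would then be described with something like
ExternalEnv(("localhost:50000",)), where the address is just a placeholder.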

The current worker boot code, though, assumes that the runner endpoint to
phone home to is known when the worker process is launched. That doesn't
work well with a runner that establishes its endpoint dynamically. It also
bakes in the assumption that a worker will only ever serve a single
pipeline (provisioning API, etc.).
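
As a minimal sketch of what lifting both assumptions might look like, the
following Python models a worker that starts without any endpoint and is
told later where to connect, once per pipeline. The class and method names
(PreLaunchedWorker, attach, detach) are hypothetical and do not correspond
to the existing boot code or the provisioning API.

```python
from typing import Dict


class PreLaunchedWorker:
    """Hypothetical worker that is started before any runner endpoint exists."""

    def __init__(self) -> None:
        # Maps pipeline id -> control endpoint the runner asked us to use.
        self._endpoints: Dict[str, str] = {}

    def attach(self, pipeline_id: str, control_endpoint: str) -> None:
        """Called by a runner once its (dynamic) endpoint is known."""
        if pipeline_id in self._endpoints:
            raise ValueError("pipeline %r is already attached" % pipeline_id)
        self._endpoints[pipeline_id] = control_endpoint

    def detach(self, pipeline_id: str) -> None:
        """Forget a finished pipeline; the worker itself keeps running."""
        self._endpoints.pop(pipeline_id, None)

    def active_pipelines(self) -> Dict[str, str]:
        """Return a copy of the currently attached pipelines."""
        return dict(self._endpoints)
```

The point of the sketch is only that the endpoint arrives after launch and
is keyed by pipeline, instead of being a single value fixed at process start.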

Thanks,
Thomas
