Re: Bootstrapping Beam's Job Server

2018-08-27 Thread Maximilian Michels
Understood, so that's a generalized abstraction for creating RPC-based services that manage SDK harnesses (what we discussed as "external" in the other thread). I would prefer this to be REST-based, since that makes interfacing with other systems easier. So probably a shell script would already

Re: Bootstrapping Beam's Job Server

2018-08-27 Thread Robert Bradshaw
I mean that rather than a command line (or Docker image), a URL is given that's a gRPC (or REST or ...) endpoint, which is invoked to pass what would have been passed by command line arguments (e.g. the FnAPI control plane and logging endpoints). This could be implemented as a script that goes and
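A minimal sketch of the idea above, assuming the endpoint receives a JSON body: the handler turns the request into the arguments a command line would otherwise have carried. The field names, the flag names, and the `./sdk_harness` binary are hypothetical; the thread does not specify them.

```python
import json
import shlex

# Hypothetical request fields; the thread only names the control plane
# and logging endpoints as examples of what would be passed.
REQUIRED_FIELDS = ("control_endpoint", "logging_endpoint")


def harness_args_from_request(body):
    """Translate a JSON request body into command-line style arguments."""
    request = json.loads(body)
    args = []
    for key in REQUIRED_FIELDS:
        if key not in request:
            raise ValueError("missing required field: " + key)
        args.append("--{}={}".format(key, request[key]))
    return args


def launch_command(harness_binary, body):
    """Build the argv a startup script could use to bring up one harness."""
    return [harness_binary] + harness_args_from_request(body)


if __name__ == "__main__":
    example = json.dumps({
        "control_endpoint": "localhost:5001",
        "logging_endpoint": "localhost:5002",
    }).encode("utf-8")
    # "./sdk_harness" is a placeholder, not a real Beam binary.
    print(shlex.join(launch_command("./sdk_harness", example)))
```

The same function could sit behind either a gRPC or a REST handler, since only the transport differs.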

Re: Bootstrapping Beam's Job Server

2018-08-27 Thread Maximilian Michels
Robert, just to be clear about the "callback" proposal: do you mean that the process startup script listens for an RPC from the Runner to bring up SDK harnesses as needed? I agree this would be helpful for knowing the required parameters, e.g. you mentioned the Fn API network configuration. On

Re: Bootstrapping Beam's Job Server

2018-08-23 Thread Robert Bradshaw
On Thu, Aug 23, 2018 at 3:47 PM Maximilian Michels wrote:
> > Going down this path may start to get fairly involved, with an almost endless list of features that could be requested. Instead, I would suggest we keep process-based execution very simple, and specify bash script

Re: Bootstrapping Beam's Job Server

2018-08-23 Thread Thomas Weise
On Thu, Aug 23, 2018 at 6:47 AM Maximilian Michels wrote:
> > Going down this path may start to get fairly involved, with an almost endless list of features that could be requested. Instead, I would suggest we keep process-based execution very simple, and specify bash script

Re: Bootstrapping Beam's Job Server

2018-08-23 Thread Maximilian Michels
> Going down this path may start to get fairly involved, with an almost endless list of features that could be requested. Instead, I would suggest we keep process-based execution very simple, and specify bash script (that sets up the environment and whatever else one may want to do) as

Re: Bootstrapping Beam's Job Server

2018-08-23 Thread Robert Bradshaw
On Thu, Aug 23, 2018 at 1:54 PM Maximilian Michels wrote:
> Big +1. Process-based execution should be simple to reason about for users.
+1. In fact, this is exactly what the Python local job server does, with running Docker simply being a particular command line that's passed down here.

Re: Bootstrapping Beam's Job Server

2018-08-23 Thread Maximilian Michels
Big +1. Process-based execution should be simple to reason about for users. The implementation should not be too involved. The user has to ensure the environment is suitable for process-based execution. There are some minor features that we should support: - Activating a virtual environment
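As a rough sketch of the virtual-environment point, assuming the harness is started as a child process: activating a virtualenv essentially amounts to setting `VIRTUAL_ENV`, putting its `bin` directory first on `PATH`, and clearing `PYTHONHOME`. The function names and the way the harness command is passed in are hypothetical, not taken from Beam.

```python
import os
import subprocess


def venv_environment(venv_dir, base_env=None):
    """Return a copy of the environment with the virtualenv 'activated'."""
    env = dict(os.environ if base_env is None else base_env)
    # Virtualenvs keep executables in "Scripts" on Windows, "bin" elsewhere.
    bin_dir = os.path.join(venv_dir, "Scripts" if os.name == "nt" else "bin")
    env["VIRTUAL_ENV"] = venv_dir
    existing_path = env.get("PATH", "")
    env["PATH"] = bin_dir + os.pathsep + existing_path if existing_path else bin_dir
    # The activate scripts unset PYTHONHOME so it cannot override the venv.
    env.pop("PYTHONHOME", None)
    return env


def start_harness_process(command, venv_dir):
    """Start a (hypothetical) harness command inside the virtualenv."""
    return subprocess.Popen(command, env=venv_environment(venv_dir))
```

This keeps the runner side simple: it only spawns a process, and everything about the environment stays in one place that the user controls.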

Re: Bootstrapping Beam's Job Server

2018-08-21 Thread Henning Rohde
Agree with Luke. Perhaps something simple, prescriptive yet flexible, such as a custom command line (defined in the environment proto) rooted at the base of the provided artifacts and either passed the same arguments or defined in the container contract or made available through substitution. That

Re: Bootstrapping Beam's Job Server

2018-08-21 Thread Ismaël Mejía
It is also worth mentioning that, apart from the testing/development use case, there is also the case of supporting people running on Hadoop distributions. There are two extra reasons to want a process-based version: (1) Some Hadoop distributions run on machines with really old kernels where docker

Re: Bootstrapping Beam's Job Server

2018-08-21 Thread Maximilian Michels
Thanks Henning and Thomas. It looks like a) we want to keep the Job Server Docker container and rely on spinning up "sibling" SDK harness containers via the Docker socket. This should require few changes to the Runner code. b) have the InProcess SDK harness as an alternative way to

Re: Bootstrapping Beam's Job Server

2018-08-20 Thread Thomas Weise
The original objective was to make test/development easier (which I think is super important for the user experience with the portable runner). From first-hand experience I can confirm that dealing with Flink clusters and Docker containers for local setup is a significant hurdle for Python developers.

Re: Bootstrapping Beam's Job Server

2018-08-20 Thread Henning Rohde
>> Option 3) would be to map in the docker binary and socket to allow the containerized Flink job server to start "sibling" containers on the host.
> Do you mean packaging Docker inside the Job Server container and mounting /var/run/docker.sock from the host inside the container?
That

Re: Bootstrapping Beam's Job Server

2018-08-20 Thread Ankur Goenka
Option 4) We are also thinking about adding a process-based SDKHarness. This will avoid the docker-in-docker scenario. A process-based SDKHarness also has other applications and might be desirable in some production use cases. On Mon, Aug 20, 2018 at 11:49 AM Henning Rohde wrote: > Option 3)

Re: Bootstrapping Beam's Job Server

2018-08-20 Thread Henning Rohde
Option 3) would be to map in the docker binary and socket to allow the containerized Flink job server to start "sibling" containers on the host. That avoids both docker-in-docker (which is indeed undesirable) and extra requirements for each SDK to spin up containers -- notably, if the
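To illustrate Option 3, here is a small sketch that builds a `docker run` invocation mounting the host's Docker socket and binary into the job server container, so that containers it starts become siblings on the host rather than nested ones. The image name and the binary location are assumptions for illustration only; the `-v` bind mounts are standard Docker flags.

```python
DOCKER_SOCKET = "/var/run/docker.sock"


def job_server_docker_command(image, docker_binary="/usr/bin/docker", extra_args=()):
    """Build argv for running a job server image with host Docker access.

    The image name passed in and the default binary path are placeholders.
    """
    return [
        "docker", "run", "--rm",
        # Share the host daemon's socket: containers started from inside
        # are created by the host daemon, i.e. as siblings.
        "-v", "{0}:{0}".format(DOCKER_SOCKET),
        # Map in the host's docker client binary, read-only.
        "-v", "{0}:{0}:ro".format(docker_binary),
        image,
    ] + list(extra_args)


if __name__ == "__main__":
    # "example/flink-job-server:latest" is a hypothetical image name.
    print(" ".join(job_server_docker_command("example/flink-job-server:latest")))
```

Note that access to the Docker socket effectively grants control over the host's Docker daemon, which is worth weighing for anything beyond local testing.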