Understood, so that's a generalized abstraction for creating RPC-based
services that manage SDK harnesses (what we discussed as "external" in
the other thread). I would prefer this to be REST-based, since that
makes interfacing with other systems easier. So probably a shell script
would already
I mean that rather than a command line (or Docker image), a URL is
given that points to a gRPC (or REST or ...) endpoint, which is invoked
to pass what would otherwise have been passed as command line arguments
(e.g. the FnAPI control plane and logging endpoints).
This could be implemented as a script that goes and
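A minimal sketch of what such a callback service could look like (the names, port, and payload shape here are all hypothetical, not part of any Beam contract): the runner POSTs the FnAPI endpoints it would otherwise have passed on the command line, and the service brings up an SDK harness process in response.

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

def harness_command(params):
    """Translate the callback payload into the command line the runner
    would otherwise have passed to a process-based harness."""
    return [
        params.get("command", "sdk_harness_boot.sh"),  # hypothetical script
        "--control_endpoint=" + params["control_endpoint"],
        "--logging_endpoint=" + params["logging_endpoint"],
    ]

class HarnessCallback(BaseHTTPRequestHandler):
    """REST endpoint the runner invokes instead of launching a process."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        params = json.loads(self.rfile.read(length))
        # Launch the harness as a child process; the script itself can
        # activate a virtualenv or do any other environment setup first.
        subprocess.Popen(harness_command(params))
        self.send_response(200)
        self.end_headers()

# To run the service:
#   HTTPServer(("localhost", 8099), HarnessCallback).serve_forever()
```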
Robert, just to be clear about the "callback" proposal. Do you mean that
the process startup script listens for an RPC from the Runner to bring
up SDK harnesses as needed?
I agree it would be helpful to know the required parameters, e.g. you
mentioned the Fn API network configuration.
On Thu, Aug 23, 2018 at 3:47 PM Maximilian Michels wrote:
>
> > Going down this path may start to get fairly involved, with an almost
> > endless list of features that could be requested. Instead, I would
> > suggest we keep process-based execution very simple, and specify bash
> > script
> Going down this path may start to get fairly involved, with an almost
> endless list of features that could be requested. Instead, I would
> suggest we keep process-based execution very simple, and specify bash
> script (that sets up the environment and whatever else one may want to
> do) as
On Thu, Aug 23, 2018 at 1:54 PM Maximilian Michels wrote:
>
> Big +1. Process-based execution should be simple to reason about for
> users.
+1. In fact, this is exactly what the Python local job server does,
with running Docker simply being a particular command line that's
passed down here.
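To illustrate the point (a sketch of the idea, not the actual job server code): the job server hands a worker command line to the operating system, and Docker-based execution is simply one particular choice of that command line.

```python
import subprocess
import sys

def start_worker(command):
    """Launch an SDK worker as a child process and return its handle."""
    return subprocess.Popen(command)

# Process-based environment: run a (hypothetical) staged boot script.
#   start_worker(["./boot", "--id=1"])
# Docker-based environment: the same mechanism, different command line.
#   start_worker(["docker", "run", "beam-python-sdk", "--id=1"])

# Harmless demonstration that runs anywhere Python is installed:
proc = start_worker([sys.executable, "-c", "print('worker up')"])
proc.wait()
```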
Big +1. Process-based execution should be simple to reason about for
users. The implementation should not be too involved. The user has to
ensure the environment is suitable for process-based execution.
There are some minor features that we should support:
- Activating a virtual environment
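A bootstrap script along the lines proposed above might look like this (a sketch; the environment variable names and the fallback command are placeholders, not any agreed-upon contract):

```shell
#!/bin/bash
# Hypothetical bootstrap script for a process-based environment. The
# runner invokes it with the arguments it would otherwise pass to the
# container's boot entry point.

# Activate a virtual environment if one is provided.
if [ -n "$BEAM_VENV" ] && [ -f "$BEAM_VENV/bin/activate" ]; then
    . "$BEAM_VENV/bin/activate"
fi

# Hand over to the SDK harness, forwarding the FnAPI endpoints etc.
# A real script would run e.g. the Python SDK worker here; "echo" keeps
# the sketch runnable without Beam installed.
HARNESS_CMD=${BEAM_HARNESS_CMD:-echo}
$HARNESS_CMD "$@"
```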
Agree with Luke. Perhaps something simple, prescriptive, yet flexible,
such as a custom command line (defined in the environment proto) rooted
at the base of the provided artifacts, with the arguments either passed
the same way as defined in the container contract or made available
through substitution. That
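The substitution variant could be as simple as expanding placeholders in the user-supplied command template with runner-provided values (purely illustrative; the placeholder names are not part of any contract):

```python
import shlex

def expand_command(template, substitutions):
    """Expand {placeholder} tokens in a command template from the
    (hypothetical) environment proto with runner-provided values."""
    return [arg.format(**substitutions) for arg in shlex.split(template)]

# The resulting command would be run rooted at the base of the staged
# artifacts, with the FnAPI endpoints substituted in:
cmd = expand_command(
    "./bin/boot --control={control_endpoint} --logging={logging_endpoint}",
    {"control_endpoint": "localhost:8098",
     "logging_endpoint": "localhost:8097"},
)
```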
It is also worth mentioning that, apart from the testing/development
use case, there is also the case of supporting people running on Hadoop
distributions. There are two extra reasons to want a process-based
version: (1) some Hadoop distributions run on machines with really old
kernels where docker
Thanks Henning and Thomas. It looks like
a) we want to keep the Dockerized Job Server container and rely on
spinning up "sibling" SDK harness containers via the Docker socket. This
should require few changes to the Runner code.
b) have the InProcess SDK harness as an alternative way to
The original objective was to make test/development easier (which I think
is super important for user experience with portable runner).
From first-hand experience I can confirm that dealing with Flink clusters
and Docker containers for local setup is a significant hurdle for Python
developers.
>> Option 3) would be to map in the docker binary and socket to allow
>> the containerized Flink job server to start "sibling" containers on
>> the host.
>
>Do you mean packaging Docker inside the Job Server container and
>mounting /var/run/docker.sock from the host inside the container? That
Option 4) We are also thinking about adding a process-based SDKHarness.
This will avoid the docker-in-docker scenario.
A process-based SDKHarness also has other applications and might be
desirable in some production use cases.
On Mon, Aug 20, 2018 at 11:49 AM Henning Rohde wrote:
> Option 3)
Option 3) would be to map in the docker binary and socket to allow the
containerized Flink job server to start "sibling" containers on the host.
That avoids both docker-in-docker (which is indeed undesirable) and
extra requirements for each SDK to spin up containers -- notably, if the
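Concretely, Option 3 would amount to starting the job server with the host's Docker socket and client mapped in (the image name and port here are illustrative):

```shell
# Mount the host's Docker socket and client binary into the job server
# container, so containers it launches become "siblings" on the host
# rather than nested docker-in-docker children.
docker run \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$(command -v docker)":/usr/bin/docker \
  -p 8099:8099 \
  beam-flink-job-server:latest
```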