Hello,

I have been intrigued by the Fn API since the first time I heard about it,
really interesting indeed, maybe you can put your ideas in a google doc, so
it can be part of the Beam Technical Docs dir (and commented by others).

Ismaël


On Tue, Jun 21, 2016 at 4:36 PM, Bill Neubauer <[email protected]>
wrote:

> Hi! I’m Bill. I’m working with Lukasz on defining the Fn API, which will
> provide guidance on the boundary between runner-specific code, and
> constructs the SDK implements.
> Here's our high level framing of the problem, and the type of problems we
> want to address.
>
> Concept:
>
> Decouple runner developers from SDK authors by having a clean separation
> between the execution of user definable functions (DoFns and others)
> marshaled by SDK, and the responsibilities of the runner in providing a
> distributed execution framework for the parallel execution of those
> functions.
>
> The vision that compels us is providing a minimal standard that enables
> SDKs in all languages, and provide non-binding implementation guidance for
> runner authors to help provide additional seams between the SDK and runner,
> trading off increased implementation complexity for better performance
> opportunities.
>
> The concept characterizes the interface in 3 dimensions: the data format
> describing function invocations and their results and side effects, which
> bridges the runner and SDK; the transport of bytes that moves this data
> between runner and SDK, and a specification of the execution environment in
> which the SDK code runs.
>
> Data Format: A UDF is called with arguments containing the data of its
> input arguments. The UDF returns the data that is associated with the named
> outputs, and additional data for counter values and TBD.
>
> Transport: As the SDK implementation language may differ from the runner’s
> language, there are different options for moving the function invocation
> data between the two. SWIG is a common technique for cross-language
> binding. We are investigating using GRPC as a transport, since servers and
> clients exist for it in many languages, providing a useful set of potential
> targets for SDK development that would not be reachable via SWIG.
>
> Execution environment: While less of a concern for certain environments,
> other runners execute in dynamically instantiated environments that may
> require configuration/provisioning to support the execution of arbitrary
> user code. Examples include Python code expecting dependencies to be
> installed, user code expecting files staged on the local filesystem, and so
> on. We are exploring using Docker containers as a mechanism for specifying
> these execution environments. They are advantageous in that they solve the
> deployment issues described above, and are admissible to the various
> implementations of the Beam model today.
>
> I’ll be sharing my thinking as it moves forward, but wanted to touch base
> and invite comments and thoughts.
>

Reply via email to