Hi Kyle,

It might also be useful to have the option to just output the proto and
artifacts, as an alternative to the jar file.
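
For example, a rough sketch of what such an output option could look
like from the Python SDK (the directory layout, the manifest format,
and dump_portable_pipeline itself are just assumptions for
illustration):

import json
import os
import shutil


def dump_portable_pipeline(pipeline, output_dir, artifact_paths):
    """Write the pipeline proto and its artifacts to a directory
    instead of packaging them into a jar."""
    os.makedirs(output_dir, exist_ok=True)
    # Serialize the portable pipeline definition.
    proto = pipeline.to_runner_api()
    with open(os.path.join(output_dir, 'pipeline.pb'), 'wb') as f:
        f.write(proto.SerializeToString())
    # Stage the artifacts next to it, plus a simple manifest.
    manifest = []
    for path in artifact_paths:
        name = os.path.basename(path)
        shutil.copyfile(path, os.path.join(output_dir, name))
        manifest.append({'name': name})
    with open(os.path.join(output_dir, 'artifacts.json'), 'w') as f:
        json.dump(manifest, f)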

For the Flink entry point, we would need to allow the job server to be
used as a library. It would probably not be too hard to have the Flink job
constructed via the context execution environment, which would require no
changes on the Flink side.

Thanks,
Thomas


On Thu, Aug 8, 2019 at 9:52 AM Kyle Weaver <kcwea...@google.com> wrote:

> Re Javaless/serverless solution:
> I take it this would probably mean that we would construct the jar
> directly from the SDK. There are advantages to this: full separation of
> Python and Java environments, no need for a job server, and likely a
> simpler implementation, since we'd no longer have to work within the
> constraints of the existing job server infrastructure. The only downside I
> can think of is the additional cost of implementing/maintaining jar
> creation code in each SDK, but that cost may be acceptable if it's simple
> enough.
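>
> For illustration, the jar creation could be as small as copying a
> pre-built runner jar and appending a couple of entries. The template
> jar, the paths inside it, and build_portable_jar are all assumptions
> about a layout we'd still need to agree on:
>
> import os
> import shutil
> import zipfile
>
>
> def build_portable_jar(template_jar, pipeline_proto, artifact_paths,
>                        output_jar):
>     """Copy the pre-built runner jar and append the pipeline proto
>     and artifacts under an agreed-upon layout."""
>     shutil.copyfile(template_jar, output_jar)
>     with zipfile.ZipFile(output_jar, 'a') as jar:
>         # The runner-specific main class would read the proto from here.
>         jar.writestr('BEAM-PIPELINE/pipeline.pb',
>                      pipeline_proto.SerializeToString())
>         for path in artifact_paths:
>             jar.write(path, 'BEAM-ARTIFACTS/' + os.path.basename(path))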
>
> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>
>
> On Thu, Aug 8, 2019 at 9:31 AM Thomas Weise <t...@apache.org> wrote:
>
>>
>>
>> On Thu, Aug 8, 2019 at 8:29 AM Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> > Before assembling the jar, the job server runs to create the
>>> > ingredients. That requires the (matching) Java environment on the
>>> > Python developer's machine.
>>>
>>> We can run the job server and have it create the jar (and if we keep
>>> the job server running we can use it to interact with the running
>>> job). However, if the jar layout is simple enough, there's no need to
>>> even build it from Java.
>>>
>>> Taken to the extreme, this is a one-shot, jar-based JobService API. We
>>> choose a standard layout of where to put the pipeline description and
>>> artifacts, and can "augment" an existing jar (that has a
>>> runner-specific main class whose entry point knows how to read this
>>> data to kick off a pipeline as if it were a user's driver code) into
>>> one that has a portable pipeline packaged into it for submission to a
>>> cluster.
>>>
>>
>> It would be nice if the Python developer didn't have to run any Java
>> at all.
>>
>> As we just discussed offline, this could be accomplished by including
>> the proto produced by the SDK in the pre-existing jar.
>>
>> And if the jar has an entry point that creates the Flink job in the
>> prescribed manner [1], it can be directly submitted to the Flink REST API.
>> That would allow for a Java-free client.
>>
>> [1]
>> https://lists.apache.org/thread.html/6db869c53816f4e2917949a7c6992c2b90856d7d639d7f2e1cd13768@%3Cdev.flink.apache.org%3E
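>>
>> For example, a Java-free submission could then be a couple of REST
>> calls (a sketch against the Flink jar upload/run endpoints; the jar
>> file name and entry class below are just placeholders):
>>
>> import requests
>>
>> FLINK = 'http://localhost:8081'
>>
>> # Upload the self-contained jar to the Flink cluster.
>> with open('pipeline.jar', 'rb') as jar:
>>     upload = requests.post(FLINK + '/jars/upload', files={'jarfile': jar})
>> upload.raise_for_status()
>> jar_id = upload.json()['filename'].split('/')[-1]
>>
>> # Run it; the jar's main class reads the embedded pipeline proto and
>> # constructs the Flink job as in [1]. The entry class is a placeholder.
>> run = requests.post(FLINK + '/jars/%s/run' % jar_id,
>>                     json={'entryClass': 'org.apache.beam.PlaceholderMain'})
>> run.raise_for_status()
>> print('Submitted job %s' % run.json()['jobid'])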
>>
>>
