+1

IMO that should be the approach in general: as much code as possible
reusable across runners, plus a default job service implementation that can
be customized per runner if necessary. It will be necessary to build at
least per-runner artifacts due to their dependencies (like the profiles we
have for examples/quickstart), but at a high level there shouldn't be a
reason why the same job service implementation cannot forward to multiple
runners.
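
To make that split concrete, here is a minimal sketch (the class and method
names below are hypothetical, not existing Beam code): the shared job
service owns the gRPC surface and job bookkeeping, and only the
translate-and-execute step is injected per runner.

    import java.util.Map;

    // Hypothetical sketch, not existing Beam classes: the runner-specific
    // part is reduced to translating the portable pipeline and starting it.
    interface PipelineInvoker {
      /** Returns a job id the service can later use for getState/cancel. */
      String invoke(byte[] pipelineProto, Map<String, String> options) throws Exception;
    }

    // Shared part: the same class could front a Flink, Spark or local invoker.
    class SharedJobService {
      private final PipelineInvoker invoker;

      SharedJobService(PipelineInvoker invoker) {
        this.invoker = invoker;
      }

      String run(byte[] pipelineProto, Map<String, String> options) throws Exception {
        // Common code lives here: validation, artifact staging, state and
        // message streams, job handle bookkeeping, ...
        return invoker.invoke(pipelineProto, options);
      }
    }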

Thanks,
Thomas




On Wed, May 23, 2018 at 3:14 PM, Reuven Lax <re...@google.com> wrote:

>
>
> On Wed, May 23, 2018 at 3:09 PM Ankur Goenka <goe...@google.com> wrote:
>
>> 1. Why is JobService runner-specific? Couldn't at least a good part of
>> it be reused, given that the runner-specific parts are mostly in the
>> translation? Or am I missing other reasons?
>>
>> Yes, absolutely. A good chunk of it can be reused. We are reusing a few
>> components from the ULR in the Flink runner. Calling JobService
>> runner-specific gives the runner freedom to have a very custom JobService
>> if needed.
>>
>
> So you're suggesting that we should publish common JobService components
> and recommend that runners use them, but that runners are free to build
> something completely custom if they prefer?
>
>>
>> 2. What about authentication and authorisation for production runners?
>> Once you can use such a service to submit/cancel pipelines, that is the
>> first thing I can think of being abused.
>>
>> Authentication and authorization are still an unsolved problem. To the
>> best of my knowledge, they are runner-specific, and any required
>> information should be part of the gRPC headers.
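>>
>> As a concrete illustration (a sketch against the plain grpc-java API,
>> nothing Beam-specific; how the server validates the header is entirely
>> up to the runner/deployment), a client could attach a token as gRPC
>> metadata on the channel it uses to reach the job service:
>>
>>   import io.grpc.ClientInterceptor;
>>   import io.grpc.ManagedChannel;
>>   import io.grpc.ManagedChannelBuilder;
>>   import io.grpc.Metadata;
>>   import io.grpc.stub.MetadataUtils;
>>
>>   class AuthedJobServiceChannel {
>>     static ManagedChannel create(String host, int port, String token) {
>>       Metadata headers = new Metadata();
>>       Metadata.Key<String> authKey =
>>           Metadata.Key.of("authorization", Metadata.ASCII_STRING_MARSHALLER);
>>       headers.put(authKey, "Bearer " + token);
>>       ClientInterceptor attachAuth =
>>           MetadataUtils.newAttachHeadersInterceptor(headers);
>>       return ManagedChannelBuilder.forAddress(host, port)
>>           .useTransportSecurity()  // TLS; usePlaintext() only for local testing
>>           .intercept(attachAuth)
>>           .build();
>>     }
>>   }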
>>
>> On Wed, May 23, 2018 at 2:48 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>>
>>> Interesting document, two questions:
>>>
>>> 1. Why is JobService runner-specific? Couldn't at least a good part of
>>> it be reused, given that the runner-specific parts are mostly in the
>>> translation? Or am I missing other reasons?
>>>
>>> 2. What about authentication and authorisation for production runners?
>>> Once you can use such a service to submit/cancel pipelines, that is the
>>> first thing I can think of being abused.
>>> On Tue, May 22, 2018 at 9:40 PM Ankur Goenka <goe...@google.com> wrote:
>>>
>>> > Thank you guys for the input.
>>>
>>> > Here is the summary.
>>>
>>> > Responsibility of Beam on Job Management
>>>
>>> > Beam provides a common interface for basic job management operations,
>>> called JobService. The supported operations can vary between runners.
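>>>
>>> For reference, a rough Java paraphrase of those operations (simplified
>>> names for illustration, not the generated gRPC stubs from the Beam Job
>>> API proto):
>>>
>>>   import java.util.Iterator;
>>>   import java.util.Map;
>>>
>>>   interface JobService {
>>>     /** Stage a pipeline; returns a preparation id used to run it. */
>>>     String prepare(byte[] pipelineProto, Map<String, String> pipelineOptions);
>>>
>>>     /** Start a prepared job; returns a job id usable with the calls below. */
>>>     String run(String preparationId, String artifactToken);
>>>
>>>     JobState getState(String jobId);
>>>
>>>     Iterator<JobState> getStateStream(String jobId);
>>>
>>>     Iterator<String> getMessageStream(String jobId);
>>>
>>>     void cancel(String jobId);
>>>
>>>     enum JobState { STARTING, RUNNING, DONE, FAILED, CANCELLED, UNSPECIFIED }
>>>   }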
>>>
>>>
>>> > What is JobService?
>>>
>>> > JobService is a runner-specific component which implements Beam's
>>> JobService interface, defined here.
>>>
>>>
>>> > What is the life cycle of a JobService?
>>>
>>> > There are three scenarios:
>>>
>>> > With the ULR, JobService is short lived and runs as long as the ULR
>>> runs. (JobService lifespan ~= job lifespan)
>>>
>>> > With production runners (Flink, Dataflow, etc.), JobService can be
>>> either short lived or long lived. The choice is up to the runner.
>>>
>>> > With production runners (Flink, Dataflow, etc.) without a long-running
>>> JobService, the SDK will spin up a local JobService.
>>>
>>>
>>> > JobService state management
>>>
>>> > The choice of state management is up to the JobService implementation.
>>> The basic requirement is that JobService should be able to perform all
>>> the operations with the returned job handle.
>>>
>>> > At the very least, the handle can be the job handle for the underlying
>>> runner job, and JobService will simply proxy actions to the runner using
>>> the provided job handle.
>>>
>>> > A persistent JobService is free to provide a simple string as a
>>> JobHandle. In this case, the job handle can only be used with the same
>>> job service.
>>>
>>> > A stateless, non-persistent JobService can provide an opaque blob
>>> containing all the relevant information about the job. In this case, the
>>> job handle can be used with any instance of JobService running the same
>>> code.
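>>>
>>> One way to picture the two flavours of handle (purely illustrative, not
>>> an existing Beam type): a persistent service can hand out a bare id,
>>> while a stateless one packs everything needed to reach the job into the
>>> handle itself.
>>>
>>>   import java.io.Serializable;
>>>
>>>   // Illustrative only. A "bare id" handle would just be a String that is
>>>   // meaningful only to the issuing JobService instance; a self-describing
>>>   // handle carries enough to let any instance running the same code proxy
>>>   // getState/cancel to the underlying runner.
>>>   class SelfDescribingJobHandle implements Serializable {
>>>     final String runnerJobId;          // e.g. the underlying Flink/Dataflow job id
>>>     final String runnerEndpoint;       // where to reach the underlying runner
>>>     final byte[] runnerSpecificState;  // anything else needed to proxy actions
>>>
>>>     SelfDescribingJobHandle(String runnerJobId, String runnerEndpoint,
>>>                             byte[] runnerSpecificState) {
>>>       this.runnerJobId = runnerJobId;
>>>       this.runnerEndpoint = runnerEndpoint;
>>>       this.runnerSpecificState = runnerSpecificState;
>>>     }
>>>   }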
>>>
>>>
>>> > JobService code distribution and invocation when JobService is short
>>> lived
>>>
>>> > We will provide an easy-to-run solution using Docker. Docker helps with
>>> both executable distribution and providing a platform-independent binary.
>>>
>>> > We will also provide a simple setup script, with a supporting document,
>>> for users who do not want to use Docker on their local machine.
>>>
>>>
>>> > Should Flink JobService start a local cluster for testing?
>>>
>>> > The Flink JobService will be capable of submitting to a remote Flink
>>> cluster if a master URL is provided; otherwise it will execute the
>>> pipeline with an in-process Flink invocation in the same JVM.
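>>>
>>> Roughly, the branch could look like this (a sketch against the plain
>>> Flink Java API, not the actual Beam Flink runner code; the "[local]"
>>> convention and the factory name are assumptions):
>>>
>>>   import org.apache.flink.api.java.ExecutionEnvironment;
>>>
>>>   class FlinkEnvironmentFactory {
>>>     static ExecutionEnvironment create(String flinkMasterUrl, String[] stagedJars) {
>>>       // No master url configured: run an in-process (same-JVM) Flink invocation.
>>>       if (flinkMasterUrl == null || flinkMasterUrl.isEmpty()
>>>           || "[local]".equals(flinkMasterUrl)) {
>>>         return ExecutionEnvironment.createLocalEnvironment();
>>>       }
>>>       // Otherwise submit to the remote cluster at host:port, shipping staged jars.
>>>       String[] hostAndPort = flinkMasterUrl.split(":");
>>>       return ExecutionEnvironment.createRemoteEnvironment(
>>>           hostAndPort[0], Integer.parseInt(hostAndPort[1]), stagedJars);
>>>     }
>>>   }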
>>>
>>>
>>>
>>>
>>> > On Tue, May 22, 2018 at 12:37 PM Eugene Kirpichov <kirpic...@google.com> wrote:
>>>
>>> >> Thanks Ankur, I think there's consensus, so it's probably ready to
>>> share :)
>>>
>>> >> On Fri, May 18, 2018 at 3:00 PM Ankur Goenka <goe...@google.com>
>>> wrote:
>>>
>>> >>> Thanks for all the input.
>>> >>> I have summarized the discussions at the bottom of the document (here).
>>> >>> Please feel free to provide comments.
>>> >>> Once we agree, I will publish the conclusion on the mailing list.
>>>
>>> >>> On Mon, May 14, 2018 at 1:51 PM Eugene Kirpichov <kirpic...@google.com> wrote:
>>>
>>> >>>> Thanks Ankur, this document clarifies a few points and raises some
>>> very important questions. I encourage everybody with a stake in
>>> Portability to take a look and chime in.
>>>
>>> >>>> +Aljoscha Krettek +Thomas Weise +Henning Rohde
>>>
>>> >>>> On Mon, May 14, 2018 at 12:34 PM Ankur Goenka <goe...@google.com>
>>> wrote:
>>>
>>> >>>>> Updated link to the document as the previous link was not working
>>> for some people.
>>>
>>>
>>> >>>>> On Fri, May 11, 2018 at 7:56 PM Ankur Goenka <goe...@google.com>
>>> wrote:
>>>
>>> >>>>>> Hi,
>>>
>>> >>>>>> The recent portability effort has introduced JobService and
>>> ArtifactService to the Beam stack, alongside the SDK. This has opened up
>>> a few questions around how we start a pipeline in a portable setup (with
>>> a JobService).
>>> >>>>>> I am trying to document our approach to launching a portable
>>> pipeline and to make binding decisions based on the discussion.
>>> >>>>>> Please review the document and provide your feedback.
>>>
>>> >>>>>> Thanks,
>>> >>>>>> Ankur
>>>
>>
