Re: Beam JobService Problem

Lukasz Cwik Mon, 18 Mar 2019 10:39:18 -0700

I believe at one point in time we wanted to separate the preparation_id
from the job_id so that you could have one definition but multiple
instances of it. (e.g. preparation_id is a class name while job_id is the
pointer to the instance of the class)


On Tue, Jan 15, 2019 at 1:45 PM Sam Rohde <sro...@google.com> wrote:

> On Tue, Jan 15, 2019 at 5:23 AM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> On Tue, Jan 15, 2019 at 1:19 AM Ankur Goenka <goe...@google.com> wrote:
>> >
>> > Thanks Sam for bringing this to the list.
>> >
>> > As preparation_ids are not reusable, having preparation_id and job_id
>> same makes sense to me for Flink.
>>
>> I think we change the protocol and only have one kind of ID. As well
>> as solving the problem at hand, it also simplifies the API.
>>
> That sounds fantastic.
>
> On Tue, Jan 15, 2019 at 5:23 AM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> On Tue, Jan 15, 2019 at 1:19 AM Ankur Goenka <goe...@google.com> wrote:
>
> > Another option is to have a subscription for all states/messages on the
>> JobServer.
>> The problem is forcing the job service to remember all logs that were
>> ever logged ever in case someone requests them at some future date.
>> Best to have a way to register a listener earlier.
>
> I agree with Robert that it should be the caller in charge of what to do
> with generated monitoring data. This is especially true with long-running
> jobs that generate potentially gigabytes worth of logs.
>
> I made https://issues.apache.org/jira/browse/BEAM-6442 to track this. Let
> me know if I missed anything.
>
> On Tue, Jan 15, 2019 at 5:23 AM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> On Tue, Jan 15, 2019 at 1:19 AM Ankur Goenka <goe...@google.com> wrote:
>> >
>> > Thanks Sam for bringing this to the list.
>> >
>> > As preparation_ids are not reusable, having preparation_id and job_id
>> same makes sense to me for Flink.
>>
>> I think we change the protocol and only have one kind of ID. As well
>> as solving the problem at hand, it also simplifies the API.
>>
>> > Another option is to have a subscription for all states/messages on the
>> JobServer.
>>
>> The problem is forcing the job service to remember all logs that were
>> ever logged ever in case someone requests them at some future date.
>> Best to have a way to register a listener earlier.
>>
>> > This will be similar to "docker". As the container id is created after
>> the container creation, the only way to get the container creation even is
>> to start "docker events" before starting a container.
>> >
>> > On Mon, Jan 14, 2019 at 11:13 AM Maximilian Michels <m...@apache.org>
>> wrote:
>> >>
>> >> Hi Sam,
>> >>
>> >> Good observation. Looks like we should fix that.
>> >>
>> >> Looking at InMemoryJobService, it appears that the state can only be
>> retrieved
>> >> by the client once the job is running with a job/invocation id
>> associated.
>> >> Indeed, any messages until that could be lost.
>> >>
>> >> For Flink the JobId is generated here:
>> >>
>> https://github.com/apache/beam/blob/3db71dd9f6f32684903c54b15a5368991cd41f36/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobInvoker.java#L64
>> >>
>> >> I don't see any benefit of having two separate IDs, as the IDs are
>> already
>> >> scoped by preparation and invocation phase.
>> >>
>> >> - Would it be possible to just pass the preparation id as the
>> invocation id at
>> >> JobInvoker#invoke(..)?
>> >>
>> >> - Alternatively, we could have an additional prepare phase for
>> JobInvoker to get
>> >> the job id for the invocation, before we start the job.
>> >>
>> >> Thanks,
>> >> Max
>> >>
>> >> On 14.01.19 12:39, Sam Rohde wrote:
>> >> > Hello all,
>> >> >
>> >> > While going through the codebase I noticed a problem with the Beam
>> JobService.
>> >> > In particular, the API allows for the possibility of never seeing
>> some messages
>> >> > or states with Get(State|Message)Stream. This is because the
>> >> > Get(State|Message)Stream calls need to have the job id which can
>> only be
>> >> > obtained from the RunJobResponse. But in order to see all
>> messages/states the
>> >> > streams need to be opened before the job starts.
>> >> >
>> >> > This is fine in Dataflow as the preparation_id == job_id, but this
>> is not true
>> >> > in Flink. What do you all think of this? Am I misunderstanding
>> something?
>> >> >
>> >> > Thanks,
>> >> > Sam
>> >> >
>>
>

Re: Beam JobService Problem

Reply via email to