Ideally, a source like Kafka could recognize that it won't get any more
records and declare that it's done, terminating the pipeline once all
data has been processed. However, if only your DoFn can recognize this,
another option would be to have it publish something to a side channel
(e.g. a side output that gets written to Pub/Sub) that the main program
is listening to, so that it can call cancel() on the job.

https://s.apache.org/splittable-do-fn could probably be made to do
this more cleanly.
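
A minimal sketch of the main-program half of that side-channel idea, in
plain Java so it runs without Beam on the classpath. Everything named here
is an assumption for illustration: Cancellable is a hypothetical stand-in
for the PipelineResult returned by p.run(), the in-memory BlockingQueue
stands in for a Pub/Sub subscription, and the "DONE" marker is whatever
your DoFn would publish after the last record.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class SideChannelCancel {
    /** Hypothetical stand-in for the job handle (PipelineResult in Beam). */
    interface Cancellable {
        void cancel();
    }

    /** Hypothetical marker the DoFn would publish after the last record. */
    static final String DONE = "DONE";

    /**
     * Reads messages from the side channel until DONE arrives, then cancels
     * the job. Returns true if the job was cancelled, false if the timeout
     * expired or the thread was interrupted first.
     */
    static boolean awaitDoneAndCancel(
            BlockingQueue<String> channel, Cancellable job, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (true) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false;
                }
                String message = channel.poll(remaining, TimeUnit.NANOSECONDS);
                if (message == null) {
                    return false; // timed out waiting for the next message
                }
                if (DONE.equals(message)) {
                    job.cancel();
                    return true;
                }
                // Any other message is ignored; keep listening.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>();
        AtomicBoolean cancelled = new AtomicBoolean(false);

        // In a real job the DoFn's side output would feed this channel.
        channel.add("some other message");
        channel.add(DONE);

        boolean result = awaitDoneAndCancel(channel, () -> cancelled.set(true), 1000);
        System.out.println("cancelled=" + (result && cancelled.get()));
    }
}
```

With a real non-blocking runner, the lambda would wrap result.cancel() on
the PipelineResult and the queue would be replaced by a subscriber on the
topic the side output writes to.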

On Wed, Sep 21, 2016 at 6:36 PM, amir bahmanyari <amirto...@yahoo.com> wrote:
> I am processing tuples in processElement(). They come from KafkaIO() and
> are unbounded.
> A little Java app reads tuples from a data file and sends them to Kafka.
> My Beam app reads them via KafkaIO().
> However, the data file will end at some point, and the last record is
> processed in an instance executing processElement().
> At that point, I can detect it, and I want to terminate the pipeline.
> Thanks for your help Eugene.
>
>
>
> ________________________________
> From: Eugene Kirpichov <kirpic...@google.com>
> To: amir bahmanyari <amirto...@yahoo.com>; "user@beam.incubator.apache.org"
> <user@beam.incubator.apache.org>
> Sent: Wednesday, September 21, 2016 5:59 PM
>
> Subject: Re: Graceful termination of pipeline at runtime
>
> This won't work. You'd need to have the PipelineResult to construct your
> DoFn, but the PipelineResult is obviously unavailable until the whole
> pipeline is constructed and started.
>
> I don't think there is currently a good way to cancel the pipeline from
> within the pipeline. Could you tell us more about your use case: why do
> you need this?
>
> On Wed, Sep 21, 2016 at 5:54 PM amir bahmanyari <amirto...@yahoo.com> wrote:
>
> Would work thanks.
> Need to make some code changes so it becomes visible within the inner class
> method processElement().
> Thanks very much Eugene.
>
>
> ________________________________
> From: Eugene Kirpichov <kirpic...@google.com>
> To: amir bahmanyari <amirto...@yahoo.com>; "user@beam.incubator.apache.org"
> <user@beam.incubator.apache.org>
> Sent: Wednesday, September 21, 2016 5:25 PM
>
> Subject: Re: Graceful termination of pipeline at runtime
>
> Hmm, the straightforward way of invoking this API should work just fine:
>
> PipelineResult result = p.run();
> // ...wait for your termination condition...
> result.cancel();
>
> Are you looking for something more specific?
>
> On Wed, Sep 21, 2016 at 5:17 PM amir bahmanyari <amirto...@yahoo.com> wrote:
>
> Thanks Eugene...I see it in the package org.apache.beam.sdk.
> Any code example on accurately using it, please?
> Thanks
> Amir-
> ________________________________
> From: Eugene Kirpichov <kirpic...@google.com>
> To: amir bahmanyari <amirto...@yahoo.com>; "user@beam.incubator.apache.org"
> <user@beam.incubator.apache.org>
> Sent: Wednesday, September 21, 2016 5:07 PM
> Subject: Re: Graceful termination of pipeline at runtime
>
> You need to use a non-blocking runner where p.run() returns immediately.
> It returns a PipelineResult
> (https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/PipelineResult.java)
> which provides cancel() and other operations.
>
> On Wed, Sep 21, 2016 at 5:03 PM amir bahmanyari <amirto...@yahoo.com> wrote:
>
> While still running [p.run()], is there a way to terminate a pipeline p.x
> based on a condition, for instance?
> I didn't see any related API like p.terminate() or similar...
> Thanks
> Amir-
