Hi Saikat,

Thanks for clarifying. Is there a Beam component that monitors the state,
or this is up to the application? If something fails, will the application
have to retry the whole pipeline?

My concern is that Ignite compute actually provides very limited
guarantees, especially for the async execution. There are some failover
mechanisms, but overall it's up to the application to track the state and
retry. Moreover, if the application fails, all jobs it has submitted are
canceled.

I'm thinking that Ignite should have a reactive event-based processing
engine. The basic idea is this:
- an application submits an event into the cluster
- the event is persisted in Ignite to be eventually processed
- a processed event may result in some new events that are submitted in the
similar fashion

Ignite will provide the at-least-once guarantee (or even exactly-once under
certain assumptions) for all the event handlers, so a user can create a
whole chain by submitting a single event, and they don't have to worry
about failures - it's up to Ignite to handle them.

It seems to me that it might be beneficial for the Beam runner to have such
an engine under the hood. What do you think?

-Val

On Mon, Aug 17, 2020 at 6:00 PM Saikat Maitra <saikat.mai...@gmail.com>
wrote:

> Hi,
>
> Luke - Thank you for sharing the details for the portability layer for
> Flink, Samza and Spark. I will look into them and will reach out if I have
> any questions.
>
> Val - Thank you for your response, yes I am planning to run the beam
> pipeline using Ignite compute engine in async run. Here is a sample code
> for the run method.
>
> IgnitePipelineResult pipelineResult = new IgnitePipelineResult(job,
> metricsAccumulator);
>     ComputeTaskFuture<Void> computeTaskFuture =
>         ignite.compute().withAsync().run(
>                 (r, f) -> {
>                   pipelineResult.freeze(f);
>                   metricsAccumulator.destroy();
>                   ignite.shutdown();
>                 });
>     pipelineResult.setComputeFuture(asyncCompute.future());
>
>     return pipelineResult;
>
>
> My understanding is for failover scenarios we will need to map the job
> state from Ignite known state to Beam Job state, an example like in
> JetPipelineResult
>
> https://github.com/apache/beam/blob/master/runners/jet/src/main/java/org/apache/beam/runners/jet/JetPipelineResult.java#L68-L90
>
> Regards,
> Saikat
>
>
>
>
>
>
> On Mon, Aug 17, 2020 at 2:27 PM Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > Hi Saikat,
> >
> > This sounds very interesting - I've been thinking about how Ignite
> compute
> > engine could be enhanced, and integration with Apache Beam is one of the
> > options I have in mind. Can you please describe how you plan to implement
> > this? Will it run on top of the Ignite Compute Grid? How are you going to
> > handle the failover, especially in the case of async pipeline execution?
> >
> > -Val
> >
> > On Sat, Aug 15, 2020 at 12:50 PM Saikat Maitra <saikat.mai...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I have been working on implementing the Apache Ignite Runner to run
> > Apache
> > > Beam pipeline. I have created IgniteRunner and IgnitePipelineOptions. I
> > > have implemented the normalize pipeline method and currently working on
> > run
> > > method implementation for Pipeline and IgnitePipelineTranslator.
> > >
> > > Jira : https://issues.apache.org/jira/browse/BEAM-9045
> > >
> > > PR : https://github.com/apache/beam/pull/12593
> > >
> > > Please review and feel free to share any feedback or questions.
> > >
> > > Regards,
> > > Saikat
> > >
> >
>

Reply via email to