Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-19 Thread Saikat Maitra
Hi Val, Thank you for the feedback. Yes, instead of using checkpointing we can store intermediate snapshots of the results directly in Ignite caches. Also, underlying support for an exactly-once guarantee in the Ignite core module would be great, and we could use it for Ignite Compute. Regards,

Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-19 Thread Valentin Kulichenko
Hi Saikat, Makes sense. Note that the checkpointing API is a candidate for removal in Ignite 3.0 - it's better to store intermediate results directly in Ignite caches. Also, my feeling is that simple checkpointing might not be enough for the integration, especially if we want to pursue the

Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-18 Thread Saikat Maitra
Hi Val, Thank you for your response. I like the idea of a reactive, event-based processing engine for fault tolerance. As you mentioned, it will be up to the underlying system to manage job execution and offer fault tolerance, and we will need to build it into the Ignite compute execution model. I looked into

Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-18 Thread Valentin Kulichenko
Hi Saikat, Thanks for clarifying. Is there a Beam component that monitors the state, or is this up to the application? If something fails, will the application have to retry the whole pipeline? My concern is that Ignite compute actually provides very limited guarantees, especially for the async

Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-17 Thread Saikat Maitra
Hi, Luke - Thank you for sharing the details of the portability layer for Flink, Samza and Spark. I will look into them and will reach out if I have any questions. Val - Thank you for your response. Yes, I am planning to run the Beam pipeline using the Ignite compute engine in an async run. Here is a

Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-17 Thread Valentin Kulichenko
Hi Saikat, This sounds very interesting - I've been thinking about how the Ignite compute engine could be enhanced, and integration with Apache Beam is one of the options I have in mind. Can you please describe how you plan to implement this? Will it run on top of the Ignite Compute Grid? How are you

Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-17 Thread Luke Cwik
At this point in time I would recommend that you build a runner that executes pipelines using only the portability layer like Flink/Samza/Spark [1,2,3]. 1: https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPortablePipelineTranslator.java 2:

BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-15 Thread Saikat Maitra
Hi, I have been working on implementing the Apache Ignite Runner to run Apache Beam pipelines. I have created IgniteRunner and IgnitePipelineOptions. I have implemented the normalize pipeline method and am currently working on the run method implementation for Pipeline and IgnitePipelineTranslator. Jira