I mentioned that. @Max: you should only try it out if you want to experiment/work with the changes.
On Wed, Jul 22, 2015 at 2:20 PM, Stephan Ewen <se...@apache.org> wrote:

> The two pull requests do not go all the way, unfortunately. They cover
> only the runtime; the API integration part is still missing.
>
> On Mon, Jul 20, 2015 at 5:53 PM, Maximilian Michels <m...@apache.org> wrote:
>
>> You could do that, but you might run into merge conflicts. Also keep in
>> mind that it is a work in progress :)
>>
>> On Mon, Jul 20, 2015 at 4:15 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:
>>
>>> Thanks!
>>>
>>> OK, cool. If I wanted to test it, would I just need to merge those two
>>> pull requests into my current branch?
>>>
>>> Cheers,
>>> Max
>>>
>>> On Mon, Jul 20, 2015 at 4:02 PM, Maximilian Michels <m...@apache.org> wrote:
>>>
>>>> Now that makes more sense :) I thought that by "nested iterations" you meant
>>>> iterations in Flink that can be nested, i.e. starting an iteration inside
>>>> an iteration.
>>>>
>>>> The caching/pinning of intermediate results is still a work in progress
>>>> in Flink. It is actually in a state where it could be merged, but some
>>>> pending pull requests got delayed because priorities changed a bit.
>>>>
>>>> Essentially, we need to merge these two pull requests:
>>>>
>>>> https://github.com/apache/flink/pull/858
>>>> This introduces session management, which allows keeping the
>>>> ExecutionGraph around for the session.
>>>>
>>>> https://github.com/apache/flink/pull/640
>>>> This implements the actual backtracking and caching of the results.
>>>>
>>>> Once these are in, we can change the Java/Scala API to support
>>>> backtracking. I don't know exactly how Spark's API does it, but essentially
>>>> it should then work by just creating new operations on an existing DataSet
>>>> and submitting to the cluster again.
>>>>
>>>> Cheers,
>>>> Max
>>>>
>>>> On Mon, Jul 20, 2015 at 3:31 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:
>>>>
>>>>> Oh sorry, my fault. When I wrote it, I had iterations in mind.
>>>>>
>>>>> What I actually wanted to ask: how will "resuming from intermediate
>>>>> results" work with (non-nested) "non-Flink" iterations? By
>>>>> iterations I mean something like this:
>>>>>
>>>>> while(...):
>>>>>   - change params
>>>>>   - submit to cluster
>>>>>
>>>>> where the executed Flink program is more or less the same at each
>>>>> iteration, but with changing input sets, which are reused between
>>>>> different loop iterations.
>>>>>
>>>>> I might have gotten something wrong: in our group we discussed caching
>>>>> à la Spark for Flink, and someone suggested that "pinning" would do that. Is
>>>>> that somewhat right?
>>>>>
>>>>> Thanks and cheers,
>>>>> Max
>>>>>
>>>>> On Mon, Jul 20, 2015 at 1:06 PM, Maximilian Michels <m...@apache.org> wrote:
>>>>>
>>>>>> "So it is up to debate how the support for resuming from
>>>>>> intermediate results will look like." -> What's the current state of that
>>>>>> debate?
>>>>>>
>>>>>> Since there is no support for nested iterations that I know of, the
>>>>>> debate on how intermediate results are integrated has not started yet.
>>>>>>
>>>>>>> "Intermediate results are not produced within the iterations
>>>>>>> cycles." -> OK, if there are none, what does it have to do with that
>>>>>>> debate? :-)
>>>>>>
>>>>>> I was referring to the existing support for intermediate results
>>>>>> within iterations. If we were to implement nested iterations, this could
>>>>>> (possibly) change. This is all very theoretical because there are no plans
>>>>>> to support nested iterations.
>>>>>>
>>>>>> Hope this clarifies things. Otherwise, please restate your question,
>>>>>> because I might have misunderstood.
>>>>>>
>>>>>> Cheers,
>>>>>> Max
>>>>>>
>>>>>> On Mon, Jul 20, 2015 at 12:11 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks for the answer! But I need some clarification:
>>>>>>>
>>>>>>> "So it is up to debate how the support for resuming from
>>>>>>> intermediate results will look like." -> What's the current state of that
>>>>>>> debate?
>>>>>>> "Intermediate results are not produced within the iterations
>>>>>>> cycles." -> OK, if there are none, what does it have to do with that
>>>>>>> debate? :-)
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Max
>>>>>>>
>>>>>>> On Mon, Jul 20, 2015 at 10:50 AM, Maximilian Michels <m...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Max,
>>>>>>>>
>>>>>>>> You are right, there is no support for nested iterations yet. As
>>>>>>>> far as I know, there are no concrete plans to add support for it. So
>>>>>>>> it is up to debate how the support for resuming from intermediate
>>>>>>>> results will look like. Intermediate results are not produced within
>>>>>>>> the iterations cycles. The same would be true for nested iterations,
>>>>>>>> so the behavior for resuming from intermediate results should be the
>>>>>>>> same for nested iterations.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Max
>>>>>>>>
>>>>>>>> On Fri, Jul 17, 2015 at 4:26 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Flinksters,
>>>>>>>>>
>>>>>>>>> as far as I know, there is still no support for nested iterations
>>>>>>>>> planned. Am I right?
>>>>>>>>>
>>>>>>>>> So my question is how such use cases should be handled in the
>>>>>>>>> future. More specifically: once pinning/caching becomes available, do
>>>>>>>>> you suggest using that feature and programming in "Spark" style? Or is
>>>>>>>>> some other, more flexible, mechanism planned for loops?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Max
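For readers following along: the driver-style loop discussed in the thread (change params, resubmit, reuse the input set) can be sketched abstractly. The following is a toy model in plain Python, not Flink API code; `Session`, `pin`, and `submit` are hypothetical stand-ins for the session management (PR #858) and result caching/backtracking (PR #640) the thread refers to.

```python
# Toy model of "Spark-style" driver loops over a cached/pinned data set.
# Not Flink code: Session, pin, and submit are hypothetical stand-ins
# for the session management and result caching discussed in the thread.

class Session:
    """Keeps intermediate results alive between job submissions."""
    def __init__(self):
        self.cache = {}          # pinned intermediate results by name
        self.recomputations = 0  # how often a pinned input was rebuilt

    def pin(self, name, produce):
        """Return the cached result, or produce and cache it once."""
        if name not in self.cache:
            self.recomputations += 1
            self.cache[name] = produce()
        return self.cache[name]

def submit(session, param):
    """One 'job': reuse the pinned input set, apply new operations."""
    data = session.pin("input", lambda: list(range(1000)))  # expensive, done once
    return sum(x * param for x in data)                     # cheap per-iteration part

session = Session()
results = []
for param in [1, 2, 3]:  # while(...): change params, submit to cluster
    results.append(submit(session, param))

# The input set was produced once, then reused across all three submissions.
assert session.recomputations == 1
```

The point of the sketch is only the control flow: the per-iteration job graph changes (new `param`), while the pinned input survives across submissions, which is what resuming from intermediate results would buy over re-reading the input on every loop iteration.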