You could do that but you might run into merge conflicts. Also keep in mind that it is work in progress :)
On Mon, Jul 20, 2015 at 4:15 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:

> Thanks!
>
> Ok, cool. If I want to test it, I just need to merge those two pull
> requests into my current branch?
>
> Cheers,
> Max
>
> On Mon, Jul 20, 2015 at 4:02 PM, Maximilian Michels <m...@apache.org> wrote:
>
>> Now that makes more sense :) I thought by "nested iterations" you meant
>> iterations in Flink that can be nested, i.e. starting an iteration inside
>> an iteration.
>>
>> The caching/pinning of intermediate results is still a work in progress
>> in Flink. It is actually in a state where it could be merged, but some
>> pending pull requests got delayed because priorities changed a bit.
>>
>> Essentially, we need to merge these two pull requests:
>>
>> https://github.com/apache/flink/pull/858
>> This introduces session management, which allows keeping the
>> ExecutionGraph for the session.
>>
>> https://github.com/apache/flink/pull/640
>> This implements the actual backtracking and caching of the results.
>>
>> Once these are in, we can change the Java/Scala API to support
>> backtracking. I don't know exactly how Spark's API does it, but essentially
>> it should then work by just creating new operations on an existing DataSet
>> and submitting to the cluster again.
>>
>> Cheers,
>> Max
>>
>> On Mon, Jul 20, 2015 at 3:31 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:
>>
>>> Oh sorry, my fault. When I wrote it, I had iterations in mind.
>>>
>>> What I actually wanted to ask is how "resuming from intermediate results"
>>> will work with (non-nested) "non-Flink" iterations. By iterations I
>>> mean something like this:
>>>
>>> while(...):
>>> - change params
>>> - submit to cluster
>>>
>>> where the executed Flink program is more or less the same at each
>>> iteration, but with changing input sets, which are reused between
>>> different loop iterations.
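The driver-style loop described above can be sketched as follows. This is a minimal, runnable stand-in, not Flink code: `submit_job` is a hypothetical placeholder for building and submitting a Flink program, used only to make the control flow concrete.

```python
# Hypothetical sketch of the driver-style loop from the thread: the same
# program is resubmitted each iteration with changed parameters, and the
# result of one submission is reused as input to the next. Without
# caching/pinning, a real cluster would recompute the shared input on
# every submission.

def submit_job(input_data, params):
    # Stand-in for submitting a Flink program to the cluster; here we
    # just scale every element so the sketch is runnable.
    return [x * params["factor"] for x in input_data]

data = [1, 2, 3]                      # shared input set, reused across iterations
for i in range(3):                    # while(...):
    params = {"factor": i + 1}        #   - change params
    data = submit_job(data, params)   #   - submit to cluster

print(data)  # [6, 12, 18]
```

The point of pinning/caching in this pattern is that the parts of `data` that do not change between iterations would stay materialized on the cluster instead of being recomputed per submission.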
>>> I might have got something wrong, because in our group we discussed caching
>>> à la Spark for Flink, and someone suggested that "pinning" would do that.
>>> Is that somewhat right?
>>>
>>> Thanks and cheers,
>>> Max
>>>
>>> On Mon, Jul 20, 2015 at 1:06 PM, Maximilian Michels <m...@apache.org> wrote:
>>>
>>>>> "So it is up to debate what the support for resuming from intermediate
>>>>> results will look like." -> What's the current state of that debate?
>>>>
>>>> Since there is no support for nested iterations that I know of, the
>>>> debate on how intermediate results are integrated has not started yet.
>>>>
>>>>> "Intermediate results are not produced within the iteration cycles."
>>>>> -> Ok, if there are none, what does it have to do with that debate? :-)
>>>>
>>>> I was referring to the existing support for intermediate results within
>>>> iterations. If we were to implement nested iterations, this could
>>>> (possibly) change. This is all very theoretical because there are no
>>>> plans to support nested iterations.
>>>>
>>>> Hope this clarifies things. Otherwise, please restate your question,
>>>> because I might have misunderstood.
>>>>
>>>> Cheers,
>>>> Max
>>>>
>>>> On Mon, Jul 20, 2015 at 12:11 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:
>>>>
>>>>> Thanks for the answer! But I need some clarification:
>>>>>
>>>>> "So it is up to debate what the support for resuming from intermediate
>>>>> results will look like." -> What's the current state of that debate?
>>>>> "Intermediate results are not produced within the iteration cycles."
>>>>> -> Ok, if there are none, what does it have to do with that debate? :-)
>>>>>
>>>>> Cheers,
>>>>> Max
>>>>>
>>>>> On Mon, Jul 20, 2015 at 10:50 AM, Maximilian Michels <m...@apache.org> wrote:
>>>>>
>>>>>> Hi Max,
>>>>>>
>>>>>> You are right, there is no support for nested iterations yet. As far
>>>>>> as I know, there are no concrete plans to add support for it. So it
>>>>>> is up to debate what the support for resuming from intermediate
>>>>>> results will look like. Intermediate results are not produced within
>>>>>> the iteration cycles. The same would be true for nested iterations,
>>>>>> so the behavior for resuming from intermediate results should be
>>>>>> alike for nested iterations.
>>>>>>
>>>>>> Cheers,
>>>>>> Max
>>>>>>
>>>>>> On Fri, Jul 17, 2015 at 4:26 PM, Maximilian Alber <alber.maximil...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Flinksters,
>>>>>>>
>>>>>>> as far as I know, there is still no support for nested iterations
>>>>>>> planned. Am I right?
>>>>>>>
>>>>>>> So my question is how such use cases should be handled in the future.
>>>>>>> More specifically: when pinning/caching becomes available, do you
>>>>>>> suggest using that feature and programming in "Spark" style? Or is
>>>>>>> there some other, more flexible mechanism planned for loops?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Max