[jira] [Commented] (FLINK-19693) Scheduler Change for Approximate Local Recovery to Restart Downstream of a Failed Task

Jin Xing (Jira) Tue, 03 Nov 2020 06:45:49 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-19693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225467#comment-17225467
 ]


Jin Xing commented on FLINK-19693:
----------------------------------

Hi Yuan,
I posted the comment above after reading through this series of JIRAs. It is 
not closely tied to the details of this JIRA, and I don't mean to derail the 
discussion here. I will open a separate thread if you and the community think 
my concerns are valid. I really appreciate your kindness.

> Scheduler Change for Approximate Local Recovery to Restart Downstream of a 
> Failed Task
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-19693
>                 URL: https://issues.apache.org/jira/browse/FLINK-19693
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: Yuan Mei
>            Assignee: Yuan Mei
>            Priority: Major
>              Labels: pull-request-available
>
> Enables downstream failover for approximate local recovery.
> That is, if a task fails, the task itself and all of its downstream tasks 
> restart. This is achieved by reusing the existing 
> {{RestartPipelinedRegionFailoverStrategy}} and treating each individual task 
> connected by ResultPartition.Pipelined_Approximate as a separate region.
>  
> It introduces an attribute "reconnectable" in ResultPartitionType to indicate 
> whether the partition is reconnectable. Note that this is only a temporary 
> solution. It will be removed after:
>  # Approximate local recovery has its own failover strategy to restart the 
> failed set of tasks, instead of relying on 
> {{RestartPipelinedRegionFailoverStrategy}} to restart the downstream of 
> failed tasks
>  # FLINK-19895: Unify the life cycle of ResultPartitionType Pipelined Family. 
> There is also a good discussion on this in FLINK-19632.
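
The downstream-restart rule in the description above (a failed task restarts together with everything downstream of it) can be sketched roughly as follows. This is only an illustration and not Flink's actual RestartPipelinedRegionFailoverStrategy: the class DownstreamRestartSketch, the method tasksToRestart, and the string-keyed consumer map are all hypothetical names chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: not part of Flink's scheduler code.
public class DownstreamRestartSketch {

    /**
     * Returns the failed task plus every task reachable from it through
     * consumer edges. The map "consumers" (an assumption of this sketch)
     * maps a task name to the tasks that read its output.
     */
    public static Set<String> tasksToRestart(
            String failedTask, Map<String, List<String>> consumers) {
        Set<String> toRestart = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(failedTask);
        while (!queue.isEmpty()) {
            String task = queue.poll();
            // Skip tasks that were already visited via another path.
            if (!toRestart.add(task)) {
                continue;
            }
            // Tasks without consumers (sinks) contribute no further edges.
            for (String consumer : consumers.getOrDefault(task, List.of())) {
                queue.add(consumer);
            }
        }
        return toRestart;
    }

    public static void main(String[] args) {
        // A linear job: source -> map -> sink.
        Map<String, List<String>> consumers = Map.of(
                "source", List.of("map"),
                "map", List.of("sink"));
        // The failed "map" task restarts with its downstream "sink";
        // the upstream "source" keeps running.
        System.out.println(tasksToRestart("map", consumers)); // prints [map, sink]
    }
}
```

In this model the upstream task is never added to the restart set, which matches the intent of treating each task as its own region.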



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
