[
https://issues.apache.org/jira/browse/FLINK-20048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227996#comment-17227996
]
Yuan Mei commented on FLINK-20048:
----------------------------------
This should have higher priority, since it blocks the engine team from
deprecating the eager scheduling strategy.
> Make Approximate Local Recovery Compatible With
> PipelinedRegionSchedulingStrategy
> ---------------------------------------------------------------------------------
>
> Key: FLINK-20048
> URL: https://issues.apache.org/jira/browse/FLINK-20048
> Project: Flink
> Issue Type: Sub-task
> Reporter: Yuan Mei
> Priority: Major
>
> When approximate_failover is enabled, *only the source task* is scheduled
> for deployment, and the consumer tasks are never deployed. I suspect this
> is related to how regions are used in PipelinedRegionSchedulingStrategy.
> * For *pipeline_(bounded)*, all connected vertices are considered one
> region; there are no dependencies between regions, so
> PipelinedRegionSchedulingStrategy works roughly the same as
> EagerSchedulingStrategy.
> * For *blocking*, a region is scheduled after the partitions produced by
> the regions it depends on become consumable.
> * From this point of view, PipelinedRegionSchedulingStrategy should also
> work for pipeline_(approximate), but that depends on how “produced
> partitions are consumable” is notified.
> The current way in which “produced partitions are consumable” is notified
> is very “blocking” specific.
> Consumer regions may be scheduled in
> {{PipelinedRegionSchedulingStrategy#onExecutionStateChange}}.
> First, this requires “executionState == ExecutionState.FINISHED”.
> Second, only FINISHED and FAILED are notifiable in
> {{SchedulerBase#updateTaskExecutionState}},
> and so on.
> I think that if the “produced partitions are consumable” notification is
> propagated properly for pipeline_(approximate), the strategy should work
> with pipeline_(approximate) as well.
> But the question is whether it is worth the change, because later we
> probably won’t make each task in approximate mode a region after it has its
> own restart strategy.
> In short, the cause is that under approximate failover tasks are restarted
> region by region but are expected to be deployed as one region (if connected).
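
The behaviour described in the issue can be reproduced with a small toy
model. The sketch below is not Flink code: the class, its methods, and the
EdgeType enum are hypothetical names chosen for illustration, and it assumes
that only pipeline_(bounded) edges merge tasks into one region and that a
region is scheduled only once every upstream producer outside it has
FINISHED.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of region-based scheduling. Hypothetical names, not Flink's API. */
class RegionSchedulingSketch {

    enum EdgeType { PIPELINED_BOUNDED, PIPELINED_APPROXIMATE, BLOCKING }

    /** A producer -> consumer edge between two task ids. */
    static final class Edge {
        final int producer;
        final int consumer;
        final EdgeType type;

        Edge(int producer, int consumer, EdgeType type) {
            this.producer = producer;
            this.consumer = consumer;
            this.type = type;
        }
    }

    /** Union-find lookup with path halving. */
    private static int find(int[] parent, int task) {
        while (parent[task] != task) {
            parent[task] = parent[parent[task]];
            task = parent[task];
        }
        return task;
    }

    /** Assumption: only PIPELINED_BOUNDED edges merge tasks into one region. */
    static int[] buildRegions(int numTasks, List<Edge> edges) {
        int[] parent = new int[numTasks];
        for (int i = 0; i < numTasks; i++) {
            parent[i] = i;
        }
        for (Edge e : edges) {
            if (e.type == EdgeType.PIPELINED_BOUNDED) {
                parent[find(parent, e.producer)] = find(parent, e.consumer);
            }
        }
        int[] region = new int[numTasks];
        for (int i = 0; i < numTasks; i++) {
            region[i] = find(parent, i);
        }
        return region;
    }

    /** A region is schedulable once all producers outside it have FINISHED. */
    private static boolean schedulable(int r, int[] region, List<Edge> edges, Set<Integer> finished) {
        for (Edge e : edges) {
            boolean crossesIntoRegion = region[e.consumer] == r && region[e.producer] != r;
            if (crossesIntoRegion && !finished.contains(e.producer)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the ids of all tasks that are ever deployed. tasksFinish models
     * whether running tasks eventually reach FINISHED (true for a bounded job,
     * false for a streaming job that runs indefinitely).
     */
    static Set<Integer> deployedTasks(int numTasks, List<Edge> edges, boolean tasksFinish) {
        int[] region = buildRegions(numTasks, edges);
        Set<Integer> deployed = new HashSet<>();
        Set<Integer> finished = new HashSet<>();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int task = 0; task < numTasks; task++) {
                if (!deployed.contains(task) && schedulable(region[task], region, edges, finished)) {
                    deployed.add(task);
                    progress = true;
                }
            }
            if (tasksFinish) {
                finished.addAll(deployed); // every deployed task finishes before the next round
            }
        }
        return deployed;
    }
}
```

For a chain source(0) -> map(1) -> sink(2) whose tasks never finish, the
model deploys all three tasks with PIPELINED_BOUNDED edges, but only task 0
with PIPELINED_APPROXIMATE edges, because each task is then its own region
and waits for a FINISHED notification that never arrives.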