[
https://issues.apache.org/jira/browse/FLINK-20048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuan Mei updated FLINK-20048:
-----------------------------
Description:
When approximate_failover is enabled, *only the source task* is scheduled for
deployment, and the consumer tasks are never deployed. So I guess it may be
related to how regions are used in PipelinedRegionSchedulingStrategy.
* For *pipeline_(bounded)*, all connected vertices are considered one
region; there are no dependencies between regions, so
PipelinedRegionSchedulingStrategy works roughly the same as
EagerSchedulingStrategy.
* For *blocking*, regions are scheduled after the produced partitions of the
regions they depend on become consumable.
* From this point of view, it sounds like PipelinedRegionSchedulingStrategy
should also work for pipeline_(approximate), but that depends on how “produced
partitions are consumable” is notified.
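To make the region granularity above concrete, here is a minimal toy sketch. It is not Flink's region-building code; the class {{RegionSketch}}, the enum {{EdgeType}} and the helper names are hypothetical. It assumes that only pipeline_(bounded) edges merge their endpoints into one region, while blocking and approximate edges leave every task in its own region.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: all names here are made up for illustration.
class RegionSketch {
    enum EdgeType { PIPELINED_BOUNDED, PIPELINED_APPROXIMATE, BLOCKING }

    static final class Edge {
        final int producer;
        final int consumer;
        final EdgeType type;

        Edge(int producer, int consumer, EdgeType type) {
            this.producer = producer;
            this.consumer = consumer;
            this.type = type;
        }
    }

    // Union-find: vertices joined by PIPELINED_BOUNDED edges share a region.
    // Assumption: BLOCKING and PIPELINED_APPROXIMATE edges do not merge regions.
    static int[] buildRegions(int vertexCount, List<Edge> edges) {
        int[] parent = new int[vertexCount];
        for (int i = 0; i < vertexCount; i++) {
            parent[i] = i;
        }
        for (Edge e : edges) {
            if (e.type == EdgeType.PIPELINED_BOUNDED) {
                parent[find(parent, e.consumer)] = find(parent, e.producer);
            }
        }
        int[] region = new int[vertexCount];
        for (int i = 0; i < vertexCount; i++) {
            region[i] = find(parent, i);
        }
        return region;
    }

    static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]]; // path halving
            v = parent[v];
        }
        return v;
    }

    static long countRegions(int[] region) {
        return Arrays.stream(region).distinct().count();
    }

    // Builds the chain 0 -> 1 -> 2 with a single edge type.
    static List<Edge> chain(EdgeType type) {
        return Arrays.asList(new Edge(0, 1, type), new Edge(1, 2, type));
    }

    public static void main(String[] args) {
        System.out.println(countRegions(buildRegions(3, chain(EdgeType.PIPELINED_BOUNDED))));     // 1
        System.out.println(countRegions(buildRegions(3, chain(EdgeType.PIPELINED_APPROXIMATE)))); // 3
    }
}
```

Under these assumptions a connected bounded pipeline is one region and is deployed as a whole, while the same topology in approximate mode splits into one region per task, so deployment of the downstream regions depends on a notification from upstream.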
The current way “produced partitions are consumable” is notified is very
“blocking”-specific.
consumerRegions may be scheduled in
{{PipelinedRegionSchedulingStrategy#onExecutionStateChange}}.
First of all, that requires “executionState == ExecutionState.FINISHED”;
second, only FINISHED and FAILED are notified in
{{SchedulerBase#updateTaskExecutionState}},
and so on.
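The effect of the FINISHED-only notification can be sketched with a toy scheduler. This is a simplified model, not the actual {{PipelinedRegionSchedulingStrategy}}; the class {{SchedulingSketch}} and its methods are hypothetical, and it assumes each consumer region has a single producer region.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: all names here are made up for illustration.
class SchedulingSketch {
    enum State { CREATED, DEPLOYING, RUNNING, FINISHED, FAILED }

    // producer region id -> ids of the regions consuming its output
    private final Map<Integer, List<Integer>> consumers = new HashMap<>();
    private final Set<Integer> hasProducer = new HashSet<>();
    final Set<Integer> deployed = new LinkedHashSet<>();

    void connect(int producerRegion, int consumerRegion) {
        consumers.computeIfAbsent(producerRegion, k -> new ArrayList<>()).add(consumerRegion);
        hasProducer.add(consumerRegion);
    }

    // Initially only regions without an upstream region are deployed.
    void startScheduling(int regionCount) {
        for (int r = 0; r < regionCount; r++) {
            if (!hasProducer.contains(r)) {
                deployed.add(r);
            }
        }
    }

    // Mirrors the "blocking"-specific rule described above: consumer regions
    // are only considered once the producer reports FINISHED.
    void onExecutionStateChange(int region, State state) {
        if (state == State.FINISHED) {
            deployed.addAll(consumers.getOrDefault(region, Collections.<Integer>emptyList()));
        }
    }

    public static void main(String[] args) {
        SchedulingSketch s = new SchedulingSketch();
        s.connect(0, 1);        // source region 0 feeds consumer region 1
        s.startScheduling(2);
        s.onExecutionStateChange(0, State.RUNNING);
        System.out.println(s.deployed); // [0]: the consumer is still not deployed
        s.onExecutionStateChange(0, State.FINISHED);
        System.out.println(s.deployed); // [0, 1]
    }
}
```

In this model a streaming source that stays RUNNING never triggers deployment of its consumer region, which matches the symptom reported here when every task in approximate mode is its own region.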
I think if we propagate the “produced partitions are consumable” notification
properly for pipeline_approximated, the strategy should work with
pipeline_approximated as well.
But the question is whether it is worth the change, because later we probably
won’t make each task in approximate mode a region once it has its own
restart strategy.
In short, the reason is that approximate failover restarts tasks region by
region, but the tasks are expected to be deployed as one region (if connected).
> Make Approximate Local Recovery Compatible With
> PipelinedRegionSchedulingStrategy
> ---------------------------------------------------------------------------------
>
> Key: FLINK-20048
> URL: https://issues.apache.org/jira/browse/FLINK-20048
> Project: Flink
> Issue Type: Sub-task
> Reporter: Yuan Mei
> Priority: Major
>