curcur commented on pull request #13880: URL: https://github.com/apache/flink/pull/13880#issuecomment-723023910
Hey @tillrohrmann, I think I roughly know why the region scheduler does not work for approximate mode. Here is what I found: when approximate_failover is enabled, **only the source task** is scheduled for deployment, and the consumer tasks are never deployed.

So I guess it is related to how regions are used in `PipelinedRegionSchedulingStrategy`.

- For **pipeline_(bounded)**, all connected vertices are considered one region. There are no dependencies between regions, so `PipelinedRegionSchedulingStrategy` works roughly the same as `EagerSchedulingStrategy`.
- For **blocking**, regions are scheduled after the partitions produced by the regions they depend on become consumable.
- From this point of view, it sounds like `PipelinedRegionSchedulingStrategy` should also work for pipeline_(approximate), but that depends on how "produced partitions are consumable" is notified.

The current way "produced partitions are consumable" is notified is very "blocking" specific. Consumer regions may be scheduled in `PipelinedRegionSchedulingStrategy#onExecutionStateChange`. First of all, that requires `executionState == ExecutionState.FINISHED`. Second, only FINISHED and FAILED are notifiable in `SchedulerBase#updateTaskExecutionState`, and so on.

I think if we make the "produced partitions are consumable" notification propagate properly for pipeline_(approximate), the region scheduler should work with pipeline_(approximate) as well. But the question is whether it is worth the change, because later we probably won't make each task in approximate mode a region once it has its own restart strategy?

In short, the reason is that approximate failover restarts tasks regionally, but the tasks are expected to be deployed as one region (if connected).

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]
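
The behavior described in the comment can be illustrated with a small toy model. This is only a sketch under stated assumptions: `RegionSchedulingSketch`, `Region`, `State`, `scheduleReady`, and `perTaskRegions` are hypothetical names invented for illustration and are not Flink classes or APIs. The sketch assumes that a region is deployed only once all of its upstream regions are FINISHED, which is the "blocking"-style check the comment refers to.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these types are Flink classes. */
public class RegionSchedulingSketch {

    enum State { CREATED, RUNNING, FINISHED }

    /** Hypothetical stand-in for a scheduling region and its upstream regions. */
    static final class Region {
        final String name;
        final List<Region> inputs;
        State state = State.CREATED;

        Region(String name, Region... inputs) {
            this.name = name;
            this.inputs = List.of(inputs);
        }
    }

    /**
     * Deploys every not-yet-deployed region whose upstream regions are all
     * FINISHED (assumed "blocking"-style consumability check). Returns the
     * names of the regions deployed in this pass.
     */
    static List<String> scheduleReady(List<Region> regions) {
        List<String> deployed = new ArrayList<>();
        for (Region region : regions) {
            if (region.state != State.CREATED) {
                continue; // already deployed
            }
            boolean inputsConsumable = true;
            for (Region input : region.inputs) {
                if (input.state != State.FINISHED) {
                    inputsConsumable = false;
                }
            }
            if (inputsConsumable) {
                region.state = State.RUNNING;
                deployed.add(region.name);
            }
        }
        return deployed;
    }

    /** Approximate mode as described above: every task is its own region. */
    static List<Region> perTaskRegions() {
        Region source = new Region("source");
        Region map = new Region("map", source);
        Region sink = new Region("sink", map);
        return List.of(source, map, sink);
    }

    public static void main(String[] args) {
        List<Region> approximate = perTaskRegions();
        System.out.println(scheduleReady(approximate)); // prints [source]
        // A streaming source keeps RUNNING and never reaches FINISHED,
        // so no later pass can deploy its consumers.
        System.out.println(scheduleReady(approximate)); // prints []

        // Pipelined (bounded): all connected tasks form a single region.
        List<Region> pipelined = List.of(new Region("source+map+sink"));
        System.out.println(scheduleReady(pipelined)); // prints [source+map+sink]
    }
}
```

In this model the consumers would only be deployed if a RUNNING producer also counted as having consumable output, which corresponds to the notification change discussed in the comment.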
