GJL commented on a change in pull request #11691: [FLINK-14234][runtime]
Notifies all kinds of consumable partitions to SchedulingStrategy
URL: https://github.com/apache/flink/pull/11691#discussion_r406411129
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java
##########
@@ -527,6 +531,26 @@ private boolean isNotifiable(
return false;
}
+ private void maybeNotifyBlockingPartitionsConsumable(final ExecutionVertexID executionVertexId) {
+ final ExecutionVertex executionVertex = getExecutionVertex(executionVertexId);
+
+ if (executionVertex.getExecutionState() != ExecutionState.FINISHED) {
+ return;
+ }
+
+ final Set<IntermediateResultPartitionID> consumableResultPartitions = new HashSet<>();
+ for (IntermediateResultPartition resultPartition : executionVertex.getProducedPartitions().values()) {
+ if (resultPartition.getResultType().isBlocking() && resultPartition.isConsumable()) {
Review comment:
We currently define an `IntermediateResultPartition` to be consumable if all
of the partitions in the intermediate result are consumable. I think one of
the reasons `LazyFromSourcesSchedulingStrategy` is so bloated is that we
wanted to enable the `SchedulingStrategy` to define consumability. That is, a
`SchedulingStrategy` could decide to schedule downstream operators as soon as
one partition is finished (instead of waiting for the entire intermediate
result). Now my questions are:
1. Is it already possible to implement the above requirement in Flink, or
does something impede us from consuming partitions of an incomplete
intermediate result?
2. How likely is it that we have to implement the above requirement? If we
merge this PR and later change the contract of `notifyPartitionConsumable()`,
`LazyFromSourcesSchedulingStrategy` will break. Can we postpone the decision
on whether to simplify `LazyFromSourcesSchedulingStrategy`?
3. What should be the behavior for Pipelined Region Scheduling?
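
To make the two notions of consumability above concrete, here is a minimal sketch. All names in it (`ConsumabilitySketch`, `Partition`, `resultLevelConsumable`, `partitionLevelConsumable`) are hypothetical and are not Flink classes or methods; the real `IntermediateResultPartition` logic may differ.

```java
import java.util.Arrays;
import java.util.List;

class ConsumabilitySketch {

    /** Hypothetical stand-in for a blocking result partition (not a Flink type). */
    static final class Partition {
        final String id;
        final boolean finished;

        Partition(String id, boolean finished) {
            this.id = id;
            this.finished = finished;
        }
    }

    /**
     * Result-level contract: a partition counts as consumable only once
     * every partition of its intermediate result has finished.
     */
    static boolean resultLevelConsumable(List<Partition> result, int index) {
        return result.stream().allMatch(p -> p.finished);
    }

    /**
     * Partition-level contract: a partition counts as consumable as soon as
     * it has finished itself, regardless of its siblings.
     */
    static boolean partitionLevelConsumable(List<Partition> result, int index) {
        return result.get(index).finished;
    }

    public static void main(String[] args) {
        // An incomplete intermediate result: p0 has finished, p1 has not.
        List<Partition> result = Arrays.asList(
                new Partition("p0", true),
                new Partition("p1", false));

        for (int i = 0; i < result.size(); i++) {
            System.out.println(result.get(i).id
                    + " result-level=" + resultLevelConsumable(result, i)
                    + " partition-level=" + partitionLevelConsumable(result, i));
        }
    }
}
```

Under the result-level contract, no partition of an incomplete result is reported as consumable. Under the partition-level contract, a `SchedulingStrategy` could be notified about `p0` while `p1` is still being produced.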