zhuzhurk commented on a change in pull request #10278: [FLINK-14735][scheduler]
Improve scheduling of all-to-all partitions with ALL input constraint for
legacy scheduler
URL: https://github.com/apache/flink/pull/10278#discussion_r349057067
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java
##########
@@ -1090,6 +1104,26 @@ else if (current == CANCELED || current == FAILED) {
}
}
+ private void finishPartitionsAndScheduleOrUpdateConsumers() {
+ final List<IntermediateResultPartition> newlyFinishedResults = getVertex().finishAllBlockingPartitions();
+ if (newlyFinishedResults.isEmpty()) {
+ return;
+ }
+
+ final HashSet<ExecutionVertex> consumersToSchedule = new HashSet<>();
Review comment:
One side effect of using a Set is that the vertices lose their ordering.
That is not a functional problem, but it may make the scheduling logs a bit messy.
E.g. A has 2 downstream JobVertices, B and C. When all instances of A finish,
the log previously looks like:
> ... B1 ... transitioned from CREATED to SCHEDULED
> ... B2 ...
> ...
> ... Bn ...
> ... C1 ...
> ... C2 ...
> ...
> ... Cn ... transitioned from CREATED to SCHEDULED
With this change, it might be:
> ... B2 ... transitioned from CREATED to SCHEDULED
> ... Cn ...
> ...
> ... B3 ...
> ... C5 ...
> ... C2 ...
> ...
> ... B1 ... transitioned from CREATED to SCHEDULED
We can avoid the disorder by adding the vertices to schedule to a list and
using the set only to track which vertices have already been added. But I'm
not sure whether it's needed. WDYT?
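
A minimal sketch of the list-plus-set idea, for illustration only. Vertices are modeled as plain strings, and the class and method names (`OrderedConsumersSketch`, `collectInOrder`) are hypothetical, not Flink code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the scheduling code; vertices are plain strings here.
class OrderedConsumersSketch {

    // Collects consumers in first-seen order, skipping duplicates.
    static List<String> collectInOrder(List<List<String>> consumersPerPartition) {
        List<String> ordered = new ArrayList<>(); // keeps the scheduling order
        Set<String> seen = new HashSet<>();       // only tracks what was already added
        for (List<String> consumers : consumersPerPartition) {
            for (String vertex : consumers) {
                // Set.add returns false when the element is already present.
                if (seen.add(vertex)) {
                    ordered.add(vertex);
                }
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Two finished partitions of A, each consumed by the same B and C vertices.
        List<List<String>> perPartition = Arrays.asList(
            Arrays.asList("B1", "B2", "C1", "C2"),
            Arrays.asList("B1", "B2", "C1", "C2"));
        System.out.println(collectInOrder(perPartition)); // prints [B1, B2, C1, C2]
    }
}
```

A `java.util.LinkedHashSet` would give the same first-seen iteration order with a single collection, if that reads better in the actual code.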