zentol commented on a change in pull request #9960: [FLINK-14476] Extend
PartitionTracker to support promotions
URL: https://github.com/apache/flink/pull/9960#discussion_r341038943
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/PartitionTrackerImpl.java
##########
@@ -147,22 +160,35 @@ public boolean isPartitionTracked(final
ResultPartitionID resultPartitionID) {
return Optional.of(partitionInfo);
}
+ private Stream<PartitionInfo>
stopTrackingPartitionsAndRetrievePartitionInfo(Collection<ResultPartitionID>
resultPartitionIds) {
+ return resultPartitionIds.stream()
+ .map(this::internalStopTrackingPartition)
+ .flatMap(PartitionTrackerImpl::asStream);
+ }
+
private void internalReleasePartitions(
ResourceID potentialPartitionLocation,
Collection<ResultPartitionDeploymentDescriptor>
partitionDeploymentDescriptors) {
internalReleasePartitionsOnTaskExecutor(potentialPartitionLocation,
partitionDeploymentDescriptors);
-
internalReleasePartitionsOnShuffleMaster(partitionDeploymentDescriptors);
+
internalReleasePartitionsOnShuffleMaster(partitionDeploymentDescriptors.stream());
+ }
+
+ private void internalReleaseOrPromotePartitions(
+ ResourceID potentialPartitionLocation,
+ Collection<ResultPartitionDeploymentDescriptor>
partitionDeploymentDescriptors) {
+
+
internalReleaseOrPromotePartitionsOnTaskExecutor(potentialPartitionLocation,
partitionDeploymentDescriptors);
+
internalReleasePartitionsOnShuffleMaster(excludePersistentPartitions(partitionDeploymentDescriptors));
Review comment:
~~I don't think a heartbeat changes anything.~~
~~The ShuffleMaster isn't exposed to the concept of jobs, and has no access
to the JobID. Hence external cleanup routines cannot be tied to the lifetime of
a job. I don't see a way how an external shuffle service could determine which
partitions can be safely released.~~
~~The only thing the shuffle service can observe is whether the cluster is
still alive; if it cleans up partitions on cluster shutdown the behavior is
in-line with internal shuffle services.~~
Fair enough, an external shuffle service may couple the lifetime of
partitions to the lifetime of the ShuffleMaster, which only exists until the
job is shutdown. It wouldn't need any information from Flink for this to work.
This should be a minor change, but as you said, I would tackle this later
when we revisit the ShuffleServices in the context of the FLIP.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services