azagrebin commented on a change in pull request #8789: [FLINK-12890] Add
partition lifecycle related Shuffle API
URL: https://github.com/apache/flink/pull/8789#discussion_r295854495
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/shuffle/ShuffleMaster.java
##########
@@ -41,4 +44,17 @@
CompletableFuture<T> registerPartitionWithProducer(
PartitionDescriptor partitionDescriptor,
ProducerDescriptor producerDescriptor);
+
+ /**
+ * Release any external resources occupied by the given partition.
+ *
+ * <p>This call triggers release of any resources which are occupied by the given partition in the external systems
+ * outside of the producer executor. This is mostly relevant for the batch jobs and blocking result partitions
+ * for which {@link ResultPartitionDeploymentDescriptor#isReleasedOnConsumption()} returns {@code false}.
+ * The producer local resources are managed by {@link ShuffleDescriptor#hasLocalResources()} and
+ * {@link ShuffleEnvironment#releasePartitions(Collection)}.
+ *
+ * @param shuffleDescriptor shuffle descriptor of the result partition to release externally.
+ */
+ void releasePartitionExternally(T shuffleDescriptor);
Review comment:
We already explicitly assume that there can be local TM resources to
release, for which we have to keep the TM connection and do special bookkeeping
in the TM (#8778 and `JobAwareShuffleEnvironment` in #8687). Given how we
currently track, in the TM, the partitions to be released from the JM, we have
to do the local release RPC anyway to update the `JobAwareShuffleEnvironment`.
So there is no need for extra internal communication between `ShuffleMaster`
and `ShuffleEnvironment` for this purpose at the moment (it would also require
a lot of effort without having `TaskManagerGateway`).
And we still have to be able to do the external release in a future
optimisation where, with an external shuffle service, we do not keep the TM
connection (so no local release is needed).
In this regard, I think it might be even better to rename
`ShuffleEnvironment#releasePartitions` to
`ShuffleEnvironment#releasePartitionsLocally`, because that reflects how we
intend to use it in the JM/TM (the final users of the shuffle service). Of
course, a shuffle service can eventually do anything internally, including
extra communication between `ShuffleMaster` and `ShuffleEnvironment` or doing
local and external cleanup at the same time.
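To make the intended split concrete, here is a minimal sketch of how the JM side might drive the two release paths. It is an illustration under stated assumptions, not the actual Flink code: `PartitionReleaseSketch`, `Descriptor`, `Recording` and the `releasePartition` helper are hypothetical, the two interfaces are reduced stand-ins for `ShuffleMaster` and `ShuffleEnvironment`, and the real local release would go through an RPC to the TM rather than a direct method call. It also assumes the external release is always invoked, while the local one is skipped when the descriptor reports no local resources.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified stand-ins; these are NOT the real Flink interfaces. */
public class PartitionReleaseSketch {

    /** Simplified descriptor; only hasLocalResources() mirrors the Javadoc quoted in the diff. */
    static final class Descriptor {
        final String partitionId;
        final boolean localResources;

        Descriptor(String partitionId, boolean localResources) {
            this.partitionId = partitionId;
            this.localResources = localResources;
        }

        boolean hasLocalResources() {
            return localResources;
        }
    }

    /** JM-side view of the shuffle service (reduced to the method under review). */
    interface ShuffleMaster {
        void releasePartitionExternally(Descriptor descriptor);
    }

    /** TM-side view, using the rename proposed in this comment. */
    interface ShuffleEnvironment {
        void releasePartitionsLocally(Collection<String> partitionIds);
    }

    /** Test double that records the order of release calls. */
    static final class Recording implements ShuffleMaster, ShuffleEnvironment {
        final List<String> calls = new ArrayList<>();

        @Override
        public void releasePartitionExternally(Descriptor descriptor) {
            calls.add("external:" + descriptor.partitionId);
        }

        @Override
        public void releasePartitionsLocally(Collection<String> partitionIds) {
            for (String id : partitionIds) {
                calls.add("local:" + id);
            }
        }
    }

    /**
     * Assumed JM-side flow: release local TM resources only if the descriptor
     * reports any, then always trigger the external release.
     */
    static void releasePartition(
            Descriptor descriptor, ShuffleMaster master, ShuffleEnvironment taskManager) {
        if (descriptor.hasLocalResources()) {
            // In Flink this step would be an RPC to the TM; here it is a plain call.
            taskManager.releasePartitionsLocally(
                    Collections.singletonList(descriptor.partitionId));
        }
        master.releasePartitionExternally(descriptor);
    }

    public static void main(String[] args) {
        Recording recording = new Recording();
        releasePartition(new Descriptor("p1", true), recording, recording);
        releasePartition(new Descriptor("p2", false), recording, recording);
        System.out.println(recording.calls);
    }
}
```

With this shape the caller never needs `ShuffleMaster` and `ShuffleEnvironment` to talk to each other, and the future case without a TM connection reduces to a descriptor whose `hasLocalResources()` is `false`.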