azagrebin commented on a change in pull request #8789: [FLINK-12890] Add 
partition lifecycle related Shuffle API
URL: https://github.com/apache/flink/pull/8789#discussion_r295854495
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/shuffle/ShuffleMaster.java
 ##########
 @@ -41,4 +44,17 @@
        CompletableFuture<T> registerPartitionWithProducer(
                PartitionDescriptor partitionDescriptor,
                ProducerDescriptor producerDescriptor);
+
+       /**
+        * Release any external resources occupied by the given partition.
+        *
+        * <p>This call triggers release of any resources which are occupied by the given partition in the external systems
+        * outside of the producer executor. This is mostly relevant for the batch jobs and blocking result partitions
+        * for which {@link ResultPartitionDeploymentDescriptor#isReleasedOnConsumption()} returns {@code false}.
+        * The producer local resources are managed by {@link ShuffleDescriptor#hasLocalResources()} and
+        * {@link ShuffleEnvironment#releasePartitions(Collection)}.
+        *
+        * @param shuffleDescriptor shuffle descriptor of the result partition to release externally.
+        */
+       void releasePartitionExternally(T shuffleDescriptor);
 
 Review comment:
   We already explicitly assume that there can be local TM resources to
release, for which we have to keep the TM connection and do special bookkeeping
in the TM (#8778 and `JobAwareShuffleEnvironment` in #8687). Given how we
currently track, in the TM, the partitions to be released from the JM, we have
to do the local release RPC anyway to update the `JobAwareShuffleEnvironment`.
So there is currently no need for extra internal communication between
`ShuffleMaster` and `ShuffleEnvironment` for this purpose (it would also
require a lot of effort without having `TaskManagerGateway`).
   
   And we still have to be able to do the external release in a future
optimisation where, with an external shuffle service, we do not keep the TM
connection (so there is no need for a local release).
   
   In this regard, I think it might be even better to rename
`ShuffleEnvironment#releasePartitions` to
`ShuffleEnvironment#releasePartitionsLocally`, because this reflects how we
intend to use it in JM/TM (the final users of the shuffle service). Of course,
a shuffle service can eventually do anything internally, including extra
communication between `ShuffleMaster` and `ShuffleEnvironment`, or doing
local and external cleanup at the same time.
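
To make the local/external split concrete, here is a minimal sketch of how the JM side could use the two calls. It is not Flink's actual code: `PartitionReleaseSketch`, `Descriptor`, `Master`, `LocalEnvironment` and `release` are hypothetical stand-ins, the TM RPC is replaced by a direct method call, and the policy shown (always release externally, release locally only when the descriptor reports local resources) is an assumption for illustration. Only the method names `releasePartitionExternally`, `releasePartitionsLocally` and `hasLocalResources` come from the discussion above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT Flink's real interfaces.
final class PartitionReleaseSketch {

    /** Hypothetical descriptor; only the hasLocalResources flag mirrors the discussion. */
    static final class Descriptor {
        final String partitionId;
        final boolean hasLocalResources;

        Descriptor(String partitionId, boolean hasLocalResources) {
            this.partitionId = partitionId;
            this.hasLocalResources = hasLocalResources;
        }
    }

    /** TM side: stand-in for ShuffleEnvironment, using the proposed method name. */
    static final class LocalEnvironment {
        final List<String> releasedLocally = new ArrayList<>();

        void releasePartitionsLocally(Collection<String> partitionIds) {
            releasedLocally.addAll(partitionIds);
        }
    }

    /** JM side: stand-in for ShuffleMaster. */
    static final class Master {
        final List<String> releasedExternally = new ArrayList<>();

        void releasePartitionExternally(Descriptor descriptor) {
            releasedExternally.add(descriptor.partitionId);
        }
    }

    /** Hypothetical JM-side routine: local release only if the TM holds resources. */
    static void release(Descriptor descriptor, Master master, LocalEnvironment env) {
        if (descriptor.hasLocalResources) {
            // In a real deployment this would be an RPC to the TM; here it is a direct call.
            env.releasePartitionsLocally(Collections.singletonList(descriptor.partitionId));
        }
        // Assumed to be safe to call even when nothing is held externally.
        master.releasePartitionExternally(descriptor);
    }
}
```

With this shape, a partition served purely by an external shuffle service (no local resources) never needs the TM connection for cleanup, which is the future optimisation mentioned above.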

