zhijiangW commented on a change in pull request #8789: [FLINK-12890] Add 
partition lifecycle related Shuffle API
URL: https://github.com/apache/flink/pull/8789#discussion_r296115872
 
 

 ##########
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/shuffle/ShuffleMaster.java
 ##########
 @@ -41,4 +44,17 @@
        CompletableFuture<T> registerPartitionWithProducer(
                PartitionDescriptor partitionDescriptor,
                ProducerDescriptor producerDescriptor);
+
+       /**
+        * Release any external resources occupied by the given partition.
+        *
+        * <p>This call triggers release of any resources which are occupied by 
the given partition in the external systems
+        * outside of the producer executor. This is mostly relevant for the 
batch jobs and blocking result partitions
+        * for which {@link 
ResultPartitionDeploymentDescriptor#isReleasedOnConsumption()} returns {@code 
false}.
+        * The producer local resources are managed by {@link 
ShuffleDescriptor#hasLocalResources()} and
+        * {@link ShuffleEnvironment#releasePartitions(Collection)}.
+        *
+        * @param shuffleDescriptor shuffle descriptor of the result partition 
to release externally.
+        */
+       void releasePartitionExternally(T shuffleDescriptor);
 
 Review comment:
   Thanks for the explanation! 
   If I understood correctly, the local release is for RPC JM->TM and external 
release is for RPC `ShuffleMaster`->`ShuffleEnvironment`. ATM we only have one 
internal shuffle implementation, then the 
`ShuffleEnvironment#releasePartitionsLocally` is actually used but 
`ShuffleMaster#releasePartitionsExternally` might never be used.  But for 
extending external shuffle implementation future, I am not sure how these two 
release methods would be used then.
   
   My only concern is wondering this abstraction might bring confusing for 
future extending implementations, because we seem give some specific 
tags/limitations in general interface. We only need to provide the 
semantic/ability for the function, no need to reflect the backend detail 
implementations (`TaskManagerGateway`, `TM connection`).  
   
   We already provide one simple implementation via JM/TM for release and also 
define the interface between `ShuffleMaster/ShuffleEnvironment` for release. 
The general `ShuffleMaster#releasePartition` is only for indicating that 
`ShuffleMaster` should release given partitions, internal/external shuffle 
implementations could both rely on it or not.
   
   E.g. the current `releaseOnConsumption` could also be implemented like this: 
when JM receives the notification of finished consumer task, it notifies 
`ShuffleMaster` to release corresponding producer's partitions.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to