azagrebin commented on a change in pull request #8789: [FLINK-12890] Add 
partition lifecycle related Shuffle API
URL: https://github.com/apache/flink/pull/8789#discussion_r296608511
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/shuffle/ShuffleMaster.java
 ##########
 @@ -41,4 +44,17 @@
        CompletableFuture<T> registerPartitionWithProducer(
                PartitionDescriptor partitionDescriptor,
                ProducerDescriptor producerDescriptor);
+
+       /**
+        * Release any external resources occupied by the given partition.
+        *
+        * <p>This call triggers release of any resources which are occupied by the given partition in the external systems
+        * outside of the producer executor. This is mostly relevant for the batch jobs and blocking result partitions
+        * for which {@link ResultPartitionDeploymentDescriptor#isReleasedOnConsumption()} returns {@code false}.
+        * The producer local resources are managed by {@link ShuffleDescriptor#hasLocalResources()} and
+        * {@link ShuffleEnvironment#releasePartitions(Collection)}.
+        *
+        * @param shuffleDescriptor shuffle descriptor of the result partition to release externally.
+        */
+       void releasePartitionExternally(T shuffleDescriptor);
 
 Review comment:
   Thanks for the additional explanation! I am still not sure that I fully 
understand the suggested final way to define this part of the interface.
   
   If we define both methods as `ShuffleMaster#releasePartitions` and 
`ShuffleEnvironment#releasePartitions` without stating their purpose, an 
implementer may assume that either of them can be used for a full release and 
has to be implemented that way. This is currently not true, e.g. for the 
default Netty implementation. Basically, either the existing Netty 
implementation will not comply with the interface, or we will have to make an 
implicit assumption in JM/TM about how to use the shuffle service just for 
Netty, which does a partial release on each method call. True, the methods are 
coupled at the moment, which is not ideal, but to make them more flexible, all 
implementations have to be flexible.
   
   `PD#releaseOnConsumption` is information from the JM (the user of the 
shuffle service) about how the partition is intended to be used. I think we 
should actually move it to the `PartitionDescriptor`, not the 
`ShuffleDescriptor`. If `releaseOnConsumption` is not supported by the shuffle 
service for a certain partition type, it should throw an exception already in 
`ShuffleMaster#registerPartitionWithProducer`.
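
A minimal sketch of the fail-fast registration described above. All types here (`SketchPartitionDescriptor`, `SketchShuffleMaster`, the `blocking` flag, the `String` result) are hypothetical, simplified stand-ins and not the actual Flink classes. The rule that blocking partitions cannot be released on consumption is an assumption made only for illustration.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a partition descriptor; not the actual Flink class.
final class SketchPartitionDescriptor {
    // Assumption: a single flag is enough to distinguish the partition type here.
    final boolean blocking;
    // Intent of the JM (the user of the shuffle service) for this partition.
    final boolean releaseOnConsumption;

    SketchPartitionDescriptor(boolean blocking, boolean releaseOnConsumption) {
        this.blocking = blocking;
        this.releaseOnConsumption = releaseOnConsumption;
    }
}

// Hypothetical stand-in for a shuffle master; not the actual Flink interface.
final class SketchShuffleMaster {
    CompletableFuture<String> registerPartitionWithProducer(SketchPartitionDescriptor pd) {
        // Assumed limitation of this sketch service: it cannot release
        // blocking partitions on consumption, so it rejects the request
        // at registration time instead of failing later during release.
        if (pd.blocking && pd.releaseOnConsumption) {
            throw new UnsupportedOperationException(
                    "releaseOnConsumption is not supported for blocking partitions");
        }
        // The String stands in for a shuffle descriptor.
        return CompletableFuture.completedFuture(
                pd.blocking ? "blocking-descriptor" : "pipelined-descriptor");
    }
}
```

With this shape, the JM learns at registration time that its intended usage is unsupported, rather than through an implicit assumption about how a particular implementation releases resources.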
   
   Do I understand your suggestion correctly, or should it be defined 
differently?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
