[GitHub] [flink] tillrohrmann commented on a change in pull request #10362: [FLINK-14792][coordination] Implement TE cluster partition release

GitBox Wed, 18 Mar 2020 10:03:28 -0700

tillrohrmann commented on a change in pull request #10362: 
[FLINK-14792][coordination] Implement TE cluster partition release
URL: https://github.com/apache/flink/pull/10362#discussion_r394483007


 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/TaskExecutorPartitionTrackerImpl.java
 ##########
 @@ -76,39 +78,58 @@ public void promoteJobPartitions(Collection<ResultPartitionID> partitionsToPromo
 
                final Collection<PartitionTrackerEntry<JobID, TaskExecutorPartitionInfo>> partitionTrackerEntries = stopTrackingPartitions(partitionsToPromote);
 
-               final Map<TaskExecutorPartitionInfo, Set<ResultPartitionID>> newClusterPartitions = partitionTrackerEntries.stream()
-                       .collect(Collectors.groupingBy(
-                               PartitionTrackerEntry::getMetaInfo,
-                               Collectors.mapping(PartitionTrackerEntry::getResultPartitionId, Collectors.toSet())));
-
-               newClusterPartitions.forEach(
-                       (dataSetMetaInfo, newPartitionEntries) -> clusterPartitions.compute(dataSetMetaInfo, (ignored, existingPartitions) -> {
-                               if (existingPartitions == null) {
-                                       return newPartitionEntries;
+               partitionTrackerEntries.forEach(
+                       partitionTrackerEntry -> clusterPartitions.compute(partitionTrackerEntry.getMetaInfo().getIntermediateDataSetId(), (key, existingEntry) -> {
+                               if (existingEntry == null) {
+                                       final Set<ResultPartitionID> newSet = new HashSet<>();
+                                       newSet.add(partitionTrackerEntry.getResultPartitionId());
+                                       return new PartitionEntry(newSet , partitionTrackerEntry.getMetaInfo().getNumberOfPartitions());
                                } else {
-                                       existingPartitions.addAll(newPartitionEntries);
-                                       return existingPartitions;
+                                       existingEntry.addPartition(partitionTrackerEntry.getResultPartitionId());
+                                       return existingEntry;
                                }
-                       }));
+                       })
+               );
 
 Review comment:
   Maybe this whole block could be simplified via
   
   ```
   for (PartitionTrackerEntry<JobID, TaskExecutorPartitionInfo> partitionTrackerEntry : partitionTrackerEntries) {
       final TaskExecutorPartitionInfo metaInfo = partitionTrackerEntry.getMetaInfo();
       clusterPartitions.computeIfAbsent(
               metaInfo.getIntermediateDataSetId(),
               ignored -> new PartitionEntry(metaInfo.getNumberOfPartitions()))
           .addPartition(partitionTrackerEntry.getResultPartitionId());
   }
   ```
   
   When a computation has side effects, I would choose an explicit loop 
over the stream API. Computations passed to the stream API should be 
side-effect free.
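
To make the suggestion easier to try in isolation, here is a minimal, self-contained sketch of the same `computeIfAbsent` loop. The classes `TrackedPartition` and `PartitionEntry` below are hypothetical stand-ins, not the Flink types, and plain `String` ids replace `IntermediateDataSetID` and `ResultPartitionID`. It assumes `PartitionEntry` has a constructor that takes only the number of partitions, as proposed above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PartitionGroupingSketch {

    /** Hypothetical stand-in for a tracker entry plus its meta info. */
    public static final class TrackedPartition {
        final String dataSetId;
        final String partitionId;
        final int numberOfPartitions;

        public TrackedPartition(String dataSetId, String partitionId, int numberOfPartitions) {
            this.dataSetId = dataSetId;
            this.partitionId = partitionId;
            this.numberOfPartitions = numberOfPartitions;
        }
    }

    /** Hypothetical stand-in for PartitionEntry; starts empty and is filled via addPartition. */
    public static final class PartitionEntry {
        private final Set<String> partitionIds = new HashSet<>();
        private final int totalNumberOfPartitions;

        public PartitionEntry(int totalNumberOfPartitions) {
            this.totalNumberOfPartitions = totalNumberOfPartitions;
        }

        public void addPartition(String partitionId) {
            partitionIds.add(partitionId);
        }

        public Set<String> getPartitionIds() {
            return partitionIds;
        }

        public int getTotalNumberOfPartitions() {
            return totalNumberOfPartitions;
        }
    }

    /** Groups partitions by data set id with an explicit loop; the only mutation is addPartition. */
    public static Map<String, PartitionEntry> promote(List<TrackedPartition> entries) {
        final Map<String, PartitionEntry> clusterPartitions = new HashMap<>();
        for (TrackedPartition entry : entries) {
            clusterPartitions
                .computeIfAbsent(
                    entry.dataSetId,
                    ignored -> new PartitionEntry(entry.numberOfPartitions))
                .addPartition(entry.partitionId);
        }
        return clusterPartitions;
    }

    public static void main(String[] args) {
        final List<TrackedPartition> entries = Arrays.asList(
            new TrackedPartition("ds-1", "p-1", 2),
            new TrackedPartition("ds-1", "p-2", 2),
            new TrackedPartition("ds-2", "p-3", 1));

        final Map<String, PartitionEntry> result = promote(entries);
        System.out.println("data sets: " + result.size());
        System.out.println("ds-1 partitions: " + result.get("ds-1").getPartitionIds().size());
    }
}
```

`Map.computeIfAbsent` returns the existing value for the key, or the newly created one, so the create-or-update branch collapses into a single `addPartition` call.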

