huwh commented on code in PR #21496:
URL: https://github.com/apache/flink/pull/21496#discussion_r1055383454


##########
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedTaskManagerTracker.java:
##########
@@ -118,6 +124,34 @@ public void removeTaskManager(InstanceID instanceId) {
         }
     }
 
+    @Override
+    public void addUnWantedTaskManager(InstanceID instanceId) {
+        TaskManagerInfo taskManagerInfo = 
taskManagerRegistrations.get(instanceId);
+        if (taskManagerInfo != null) {
+            addUnWantedTaskManagerWithResource(
+                    instanceId,
+                    taskManagerInfo.getTotalResource(),
+                    taskManagerInfo.getDefaultSlotResourceProfile());
+        }
+    }
+
+    public void addUnWantedTaskManagerWithResource(
+            InstanceID instanceId,
+            ResourceProfile totalResourceProfile,
+            ResourceProfile defaultSlotResourceProfile) {
+        unWantedTaskManagers.put(
+                instanceId,
+                WorkerResourceSpec.fromTotalResourceProfile(
+                        totalResourceProfile,
+                        SlotManagerUtils.calculateDefaultNumSlots(
+                                totalResourceProfile, 
defaultSlotResourceProfile)));
+    }

Review Comment:
   There are two possible reasons for adding unwanted workers: reaching maximum 
resource limitation, and TM idle timeout. The workers which reach idle timeout 
are always registered workers. But workers which reach the maximum resource 
limitation will be released without adding registered workers.
   As discussed, unwantedWorkers should not contain workers that do not belong 
to registered workers, so SlotManager#registerTaskManager is changed, it will 
return false if registration fails (before, it returns false when registration 
failsĀ or tm already registered), and then ResourceManager will 
closeTaskManagerConnection. After this change, RM will call onWorkerRegistered 
when TaskManager registered SlotManager, even if TM has already registered 
SlotManager before. IMO, This change is acceptable because onWorkerRegistered 
is idempotent.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to