KarmaGYZ commented on code in PR #19840:
URL: https://github.com/apache/flink/pull/19840#discussion_r899686715


##########
flink-core/src/main/java/org/apache/flink/configuration/ResourceManagerOptions.java:
##########
@@ -150,6 +150,18 @@ public class ResourceManagerOptions {
                     .defaultValue(Duration.ofMillis(50))
                     .withDescription("The delay of the resource requirements 
check.");
 
+    /**
+     * The long delay of requirements check. This is only used for waiting for 
the previous
+     * TaskManagers registration and will only take effect in the first 
requirements check.
+     */
+    public static final ConfigOption<Duration> REQUIREMENTS_CHECK_LONG_DELAY =

Review Comment:
   ```suggestion
       @Documentation.Section(Documentation.Sections.EXPERT_SCHEDULING)
       public static final ConfigOption<Duration> REQUIREMENTS_CHECK_LONG_DELAY 
=
   ```



##########
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java:
##########
@@ -491,10 +504,15 @@ public void freeSlot(SlotID slotId, AllocationID 
allocationId) {
      * are performed with a slight delay.
      */
     private void checkResourceRequirementsWithDelay() {
-        if (requirementsCheckDelay.toMillis() <= 0) {
-            checkResourceRequirements();
-        } else {
-            if (requirementsCheckFuture == null || 
requirementsCheckFuture.isDone()) {
+        long delay =
+                isRequirementsCheckLongDelay
+                        ? requirementsCheckLongDelay.toMillis()
+                        : requirementsCheckDelay.toMillis();
+        boolean pendingRequirementsCheck =

Review Comment:
   ```suggestion
           boolean hasPendingRequirementsCheck =
   ```



##########
flink-core/src/main/java/org/apache/flink/configuration/ResourceManagerOptions.java:
##########
@@ -150,6 +150,18 @@ public class ResourceManagerOptions {
                     .defaultValue(Duration.ofMillis(50))
                     .withDescription("The delay of the resource requirements 
check.");
 
+    /**
+     * The long delay of requirements check. This is only used for waiting for 
the previous
+     * TaskManagers registration and will only take effect in the first 
requirements check.
+     */
+    public static final ConfigOption<Duration> REQUIREMENTS_CHECK_LONG_DELAY =
+            ConfigOptions.key("slotmanager.requirement-check.long-delay")
+                    .durationType()
+                    .defaultValue(Duration.ofSeconds(3))

Review Comment:
   I think the default value might be too long. 1s or 500ms might be good 
enough?



##########
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java:
##########
@@ -434,10 +446,15 @@ public void freeSlot(SlotID slotId, AllocationID 
allocationId) {
      * are performed with a slight delay.
      */
     private void checkResourceRequirementsWithDelay() {

Review Comment:
   Same as the `FineGrainedSlotManager`.



##########
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java:
##########
@@ -504,8 +522,11 @@ private void checkResourceRequirementsWithDelay() {
                                             
Preconditions.checkNotNull(requirementsCheckFuture)
                                                     .complete(null);
                                         }),
-                        requirementsCheckDelay.toMillis(),
+                        delay,
                         TimeUnit.MILLISECONDS);
+                isRequirementsCheckLongDelay = false;
+            } else {
+                checkResourceRequirements();

Review Comment:
   It seems a bit complicated here. We just need to ensure the delay is no less 
than zero. Then, the `ScheduledExecutor#schedule` will help us to handle it.



##########
flink-runtime/src/test/java/org/apache/flink/runtime/resourcemanager/slotmanager/AbstractFineGrainedSlotManagerITCase.java:
##########
@@ -752,4 +753,76 @@ public void 
testAllocationUpdatesIgnoredIfSlotMarkedAsAllocatedAfterSlotReport()
 
         System.setSecurityManager(null);
     }
+
+    @Test
+    public void testRequirementLongDelayOnlyTakeEffectOnce() throws Exception {
+        final CompletableFuture<Void> allocationIdFuture1 = new 
CompletableFuture<>();
+        final CompletableFuture<Void> allocationIdFuture2 = new 
CompletableFuture<>();
+        final TaskExecutorGateway taskExecutorGateway1 =
+                new TestingTaskExecutorGatewayBuilder()
+                        .setRequestSlotFunction(
+                                tuple6 -> {
+                                    allocationIdFuture1.complete(null);
+                                    return 
CompletableFuture.completedFuture(Acknowledge.get());
+                                })
+                        .createTestingTaskExecutorGateway();
+        final TaskExecutorGateway taskExecutorGateway2 =
+                new TestingTaskExecutorGatewayBuilder()
+                        .setRequestSlotFunction(
+                                tuple6 -> {
+                                    allocationIdFuture2.complete(null);
+                                    return 
CompletableFuture.completedFuture(Acknowledge.get());
+                                })
+                        .createTestingTaskExecutorGateway();
+        final TaskExecutorConnection taskExecutorConnection1 =
+                new TaskExecutorConnection(ResourceID.generate(), 
taskExecutorGateway1);
+        final TaskExecutorConnection taskExecutorConnection2 =
+                new TaskExecutorConnection(ResourceID.generate(), 
taskExecutorGateway2);
+        new Context() {
+            {
+                runTest(
+                        () -> {
+                            final CompletableFuture<Boolean> 
registerTaskManagerFuture1 =
+                                    new CompletableFuture<>();
+                            final CompletableFuture<Boolean> 
registerTaskManagerFuture2 =
+                                    new CompletableFuture<>();
+                            runInMainThread(
+                                    () -> {
+                                        
getSlotManager().enlargeRequirementsCheckDelayOnce();
+                                        getSlotManager()
+                                                .processResourceRequirements(
+                                                        
createResourceRequirements(
+                                                                new JobID(),
+                                                                2,
+                                                                
DEFAULT_SLOT_RESOURCE_PROFILE));
+                                        registerTaskManagerFuture1.complete(
+                                                getSlotManager()
+                                                        .registerTaskManager(
+                                                                
taskExecutorConnection1,
+                                                                new 
SlotReport(),
+                                                                
DEFAULT_SLOT_RESOURCE_PROFILE,
+                                                                
DEFAULT_SLOT_RESOURCE_PROFILE));
+                                    });
+                            assertThat(
+                                    
assertFutureCompleteAndReturn(registerTaskManagerFuture1),
+                                    is(true));
+                            assertFutureNotComplete(allocationIdFuture1);
+                            allocationIdFuture1.get(200, 
TimeUnit.MILLISECONDS);

Review Comment:
   It's a bit fragile. Maybe we can enlarge the `requirementCheckLongDelay` to 
400ms.



##########
flink-runtime/src/test/java/org/apache/flink/runtime/resourcemanager/slotmanager/AbstractFineGrainedSlotManagerITCase.java:
##########
@@ -752,4 +753,76 @@ public void 
testAllocationUpdatesIgnoredIfSlotMarkedAsAllocatedAfterSlotReport()
 
         System.setSecurityManager(null);
     }
+
+    @Test
+    public void testRequirementLongDelayOnlyTakeEffectOnce() throws Exception {
+        final CompletableFuture<Void> allocationIdFuture1 = new 
CompletableFuture<>();
+        final CompletableFuture<Void> allocationIdFuture2 = new 
CompletableFuture<>();
+        final TaskExecutorGateway taskExecutorGateway1 =
+                new TestingTaskExecutorGatewayBuilder()
+                        .setRequestSlotFunction(
+                                tuple6 -> {
+                                    allocationIdFuture1.complete(null);
+                                    return 
CompletableFuture.completedFuture(Acknowledge.get());
+                                })
+                        .createTestingTaskExecutorGateway();
+        final TaskExecutorGateway taskExecutorGateway2 =
+                new TestingTaskExecutorGatewayBuilder()
+                        .setRequestSlotFunction(
+                                tuple6 -> {
+                                    allocationIdFuture2.complete(null);
+                                    return 
CompletableFuture.completedFuture(Acknowledge.get());
+                                })
+                        .createTestingTaskExecutorGateway();
+        final TaskExecutorConnection taskExecutorConnection1 =
+                new TaskExecutorConnection(ResourceID.generate(), 
taskExecutorGateway1);
+        final TaskExecutorConnection taskExecutorConnection2 =
+                new TaskExecutorConnection(ResourceID.generate(), 
taskExecutorGateway2);
+        new Context() {
+            {
+                runTest(
+                        () -> {
+                            final CompletableFuture<Boolean> 
registerTaskManagerFuture1 =
+                                    new CompletableFuture<>();
+                            final CompletableFuture<Boolean> 
registerTaskManagerFuture2 =
+                                    new CompletableFuture<>();
+                            runInMainThread(
+                                    () -> {
+                                        
getSlotManager().enlargeRequirementsCheckDelayOnce();
+                                        getSlotManager()
+                                                .processResourceRequirements(
+                                                        
createResourceRequirements(
+                                                                new JobID(),
+                                                                2,
+                                                                
DEFAULT_SLOT_RESOURCE_PROFILE));
+                                        registerTaskManagerFuture1.complete(
+                                                getSlotManager()
+                                                        .registerTaskManager(
+                                                                
taskExecutorConnection1,
+                                                                new 
SlotReport(),
+                                                                
DEFAULT_SLOT_RESOURCE_PROFILE,
+                                                                
DEFAULT_SLOT_RESOURCE_PROFILE));
+                                    });
+                            assertThat(
+                                    
assertFutureCompleteAndReturn(registerTaskManagerFuture1),
+                                    is(true));
+                            assertFutureNotComplete(allocationIdFuture1);
+                            allocationIdFuture1.get(200, 
TimeUnit.MILLISECONDS);

Review Comment:
   And here, we just use `assertFutureCompleteAndReturn`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to