[
https://issues.apache.org/jira/browse/FLINK-34070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806072#comment-17806072
]
Matthias Pohl commented on FLINK-34070:
---------------------------------------
The {{AdaptiveScheduler}} uses {{DeclarativeSlotPoolServiceFactory}}, in
contrast to the {{DefaultScheduler}} and the {{AdaptiveBatchScheduler}}, which
both use {{DeclarativeSlotPoolBridgeServiceFactory}} (see
[DefaultSlotPoolServiceSchedulerFactory:177ff|https://github.com/apache/flink/blob/a9383fd4d51b1161292628145e2f427f574a07d4/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/DefaultSlotPoolServiceSchedulerFactory.java#L177]).
That makes the {{JobMaster}} use different implementations of the
{{SlotPoolService}} interface in
[JobMaster:333|https://github.com/apache/flink/blob/16bac7802284563c95cfe18fcf153e91dc06216e/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java#L333]:
* {{DeclarativeSlotPoolService}} (used by the {{AdaptiveScheduler}})
doesn't implement {{SlotPoolService#notifyNotEnoughResourcesAvailable}} but
relies on the empty default implementation.
* {{DeclarativeSlotPoolBridge}} (used by the other two scheduler
implementations) does implement {{notifyNotEnoughResourcesAvailable}}
in
[DeclarativeSlotPoolBridge:395ff|https://github.com/apache/flink/blob/72bff2a2d0072602e4e625476bf5480dc50dc76c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/slotpool/DeclarativeSlotPoolBridge.java#L395],
which aborts the pending requests.
Conclusion: The {{AdaptiveScheduler}} doesn't participate in the early
cancellation of pending slot requests that cannot be fulfilled.
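To illustrate the mechanism, here is a minimal, self-contained sketch. It is not the Flink code: all {{Sketch*}} types, the {{requestSlot}} helper and the {{String}}-based signatures are simplified assumptions made up for this example. It only shows how an empty default method on an interface lets one implementation silently ignore the notification while another one fails its pending requests.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical, simplified stand-in for SlotPoolService (not Flink's real signature).
interface SketchSlotPoolService {
    // Empty default: an implementation that does not override this method
    // silently ignores the "not enough resources" signal.
    default void notifyNotEnoughResourcesAvailable(Collection<String> acquiredResources) {}
}

// Stand-in for DeclarativeSlotPoolService: relies on the empty default.
class SketchDeclarativeSlotPoolService implements SketchSlotPoolService {
    final List<CompletableFuture<String>> pendingRequests = new ArrayList<>();

    // Hypothetical helper that registers a pending slot request.
    CompletableFuture<String> requestSlot() {
        CompletableFuture<String> request = new CompletableFuture<>();
        pendingRequests.add(request);
        return request;
    }
}

// Stand-in for DeclarativeSlotPoolBridge: overrides the method and aborts pending requests.
class SketchDeclarativeSlotPoolBridge implements SketchSlotPoolService {
    final List<CompletableFuture<String>> pendingRequests = new ArrayList<>();

    // Hypothetical helper that registers a pending slot request.
    CompletableFuture<String> requestSlot() {
        CompletableFuture<String> request = new CompletableFuture<>();
        pendingRequests.add(request);
        return request;
    }

    @Override
    public void notifyNotEnoughResourcesAvailable(Collection<String> acquiredResources) {
        // Fail every pending request so that callers do not wait indefinitely.
        for (CompletableFuture<String> request : pendingRequests) {
            request.completeExceptionally(
                    new IllegalStateException(
                            "Not enough resources available; acquired: " + acquiredResources));
        }
        pendingRequests.clear();
    }
}

class DefaultMethodSketch {
    public static void main(String[] args) {
        SketchDeclarativeSlotPoolService adaptive = new SketchDeclarativeSlotPoolService();
        CompletableFuture<String> adaptiveRequest = adaptive.requestSlot();
        adaptive.notifyNotEnoughResourcesAvailable(Collections.emptyList());

        SketchDeclarativeSlotPoolBridge bridge = new SketchDeclarativeSlotPoolBridge();
        CompletableFuture<String> bridgeRequest = bridge.requestSlot();
        bridge.notifyNotEnoughResourcesAvailable(Collections.emptyList());

        System.out.println("default impl, request done: " + adaptiveRequest.isDone());
        System.out.println(
                "overriding impl, request failed: " + bridgeRequest.isCompletedExceptionally());
    }
}
```

In this sketch, the request issued against the non-overriding implementation stays pending after the notification, whereas the request issued against the overriding implementation is completed exceptionally. A caller blocking on the first future would therefore wait until some other timeout kicks in.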
> MiniClusterITCase.testHandleStreamingJobsWhenNotEnoughSlot fails for the
> AdaptiveScheduler
> ------------------------------------------------------------------------------------------
>
> Key: FLINK-34070
> URL: https://issues.apache.org/jira/browse/FLINK-34070
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.19.0
> Reporter: Matthias Pohl
> Assignee: Matthias Pohl
> Priority: Major
> Labels: test-stability
>
> We experience test failures of
> {{MiniClusterITCase.testHandleStreamingJobsWhenNotEnoughSlot}} with the
> {{AdaptiveScheduler}} being enabled after FLINK-33414 was fixed:
> {code:java}
> Jan 09 02:01:16 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> Jan 09 02:01:16 at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> Jan 09 02:01:16 at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3313)
> Jan 09 02:01:16 at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> Jan 09 02:01:16 at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> Jan 09 02:01:16 at org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:1050)
> Jan 09 02:01:16 at org.apache.flink.runtime.minicluster.MiniClusterITCase.runHandleJobsWhenNotEnoughSlots(MiniClusterITCase.java:152)
> Jan 09 02:01:16 at org.apache.flink.runtime.minicluster.MiniClusterITCase.lambda$testHandleStreamingJobsWhenNotEnoughSlot$0(MiniClusterITCase.java:119)
> Jan 09 02:01:16 at org.apache.flink.runtime.minicluster.MiniClusterITCase$$Lambda$1927/1144737794.call(Unknown Source)
> Jan 09 02:01:16 at org.assertj.core.api.ThrowableAssert.catchThrowable(ThrowableAssert.java:63)
> Jan 09 02:01:16 at org.assertj.core.api.AssertionsForClassTypes.catchThrowable(AssertionsForClassTypes.java:892)
> Jan 09 02:01:16 at org.assertj.core.api.Assertions.catchThrowable(Assertions.java:1366)
> Jan 09 02:01:16 at org.assertj.core.api.Assertions.assertThatThrownBy(Assertions.java:1210)
> Jan 09 02:01:16 at org.apache.flink.runtime.minicluster.MiniClusterITCase.testHandleStreamingJobsWhenNotEnoughSlot(MiniClusterITCase.java:119)
> Jan 09 02:01:16 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Jan 09 02:01:16 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Jan 09 02:01:16 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56124&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=9813]
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56166&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=10782]
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56226&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=10773]
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56285&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=10800]