noorall commented on PR #27091: URL: https://github.com/apache/flink/pull/27091#issuecomment-3454786833
> @noorall This is indeed very helpful context. I can revert the new commit and make improvements on the first commit that fixes the test here.
>
> One quick question though: why the test fails intermittently? Is the scheduler used in tests non-deterministic? Can we make it static in test and/or make test code aware of which schedule is being used? Thanks!

1. This isn't actually an intermittently failing test. When I locally configure the scheduler type to adaptive, the test fails 100% of the time.
2. The scheduler used in tests is not deterministic. It's controlled by JobManagerOptions.SCHEDULER, which defaults to DEFAULT, but certain CI pipelines inject extra environment variables to force a specific scheduler type (see the logic in ClusterOptions.isAdaptiveSchedulerEnabled() and DefaultSlotPoolServiceSchedulerFactory.getSchedulerType()).
3. It's better not to hardcode a particular scheduler type in tests. Instead, we can explicitly pass a slotPoolServiceSchedulerFactory to the JobMasterBuilder and let the test code determine the active scheduler type via slotPoolServiceSchedulerFactory.getSchedulerType(). This way, the test remains flexible while correctly handling both scheduler modes.
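To make point 3 concrete, here is a rough, self-contained sketch of the pattern. It deliberately uses hypothetical stand-in types (`SchedulerType`, `SchedulerFactory`, `ComponentUnderTest`) and placeholder outcome strings rather than Flink's real classes, so treat it as an illustration of the idea, not as the actual test change: the factory is handed to the component explicitly, and the test asks that same factory which scheduler is active before choosing its expectation.

```java
public class SchedulerAwareTestSketch {

    // Hypothetical stand-in for the scheduler type enum; not Flink's real type.
    enum SchedulerType { DEFAULT, ADAPTIVE }

    // Hypothetical stand-in for the factory that reports the active scheduler.
    interface SchedulerFactory {
        SchedulerType getSchedulerType();
    }

    // Hypothetical component under test; it receives the factory explicitly
    // instead of resolving the scheduler from global configuration.
    static final class ComponentUnderTest {
        private final SchedulerFactory factory;

        ComponentUnderTest(SchedulerFactory factory) {
            this.factory = factory;
        }

        // Placeholder behaviour that differs per scheduler type.
        String run() {
            return factory.getSchedulerType() == SchedulerType.ADAPTIVE
                    ? "adaptive-outcome"
                    : "default-outcome";
        }
    }

    // The test derives its expectation from the same factory it passed in,
    // so no scheduler type is hardcoded in the test body.
    static String expectedOutcome(SchedulerFactory factory) {
        switch (factory.getSchedulerType()) {
            case ADAPTIVE:
                return "adaptive-outcome";
            default:
                return "default-outcome";
        }
    }

    // Returns true when the component behaves as expected for the given factory.
    static boolean check(SchedulerFactory factory) {
        ComponentUnderTest component = new ComponentUnderTest(factory);
        return component.run().equals(expectedOutcome(factory));
    }

    public static void main(String[] args) {
        // Exercise both modes, as a CI pipeline forcing either scheduler would.
        for (SchedulerType type : SchedulerType.values()) {
            SchedulerFactory factory = () -> type;
            System.out.println(type + " -> " + check(factory));
        }
    }
}
```

In the real test, the factory would be the slotPoolServiceSchedulerFactory passed to the JobMasterBuilder, and the branch would select the assertions appropriate for each scheduler mode.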
