noorall commented on PR #27091:
URL: https://github.com/apache/flink/pull/27091#issuecomment-3454786833

   > @noorall This is indeed very helpful context. I can revert the new commit
   > and make improvements on the first commit that fixes the test here.
   >
   > One quick question though: why does the test fail intermittently? Is the
   > scheduler used in tests non-deterministic? Can we make it static in tests
   > and/or make the test code aware of which scheduler is being used? Thanks!
   
   1. This isn’t actually an intermittently failing test. When I configure the 
scheduler type to adaptive locally, the test fails 100% of the time.
   2. The scheduler used in tests is not deterministic. It’s controlled by 
JobManagerOptions.SCHEDULER, which defaults to DEFAULT, but certain CI 
pipelines inject extra environment variables to force a specific scheduler type 
(see the logic in ClusterOptions.isAdaptiveSchedulerEnabled() and 
DefaultSlotPoolServiceSchedulerFactory.getSchedulerType()).
   3. It’s better not to hardcode a particular scheduler type in tests. 
Instead, we can explicitly pass a slotPoolServiceSchedulerFactory to the 
JobMasterBuilder and let the test code determine the active scheduler type via 
slotPoolServiceSchedulerFactory.getSchedulerType().
   This way, the test remains flexible while correctly handling both scheduler 
modes.
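
   To make points 2 and 3 concrete, here is a minimal, self-contained sketch of 
the pattern. It does not use Flink's real classes: SchedulerType, 
SchedulerFactory, resolveFactory, expectedOutcome and the property name are all 
hypothetical stand-ins, and the external CI setting is modeled as a 
java.util.Properties entry purely for illustration. The idea is only that the 
test asks the factory which scheduler is active instead of hardcoding one.

```java
import java.util.Optional;
import java.util.Properties;

// Hypothetical stand-ins only; these are NOT Flink's real classes or options.
final class SchedulerAwareTestSketch {

    /** Stand-in for the scheduler type the factory reports. */
    enum SchedulerType { DEFAULT, ADAPTIVE }

    /** Stand-in for a factory that exposes getSchedulerType(). */
    interface SchedulerFactory {
        SchedulerType getSchedulerType();
    }

    // Hypothetical name for a setting that a CI pipeline might inject.
    static final String FORCE_ADAPTIVE_PROPERTY = "sketch.tests.force-adaptive-scheduler";

    /**
     * Assumed resolution order: an explicitly configured type wins; otherwise
     * the injected setting decides; otherwise fall back to DEFAULT.
     */
    static SchedulerFactory resolveFactory(Optional<SchedulerType> explicit, Properties injected) {
        SchedulerType type = explicit.orElseGet(() ->
                injected.containsKey(FORCE_ADAPTIVE_PROPERTY)
                        ? SchedulerType.ADAPTIVE
                        : SchedulerType.DEFAULT);
        return () -> type;
    }

    /** The test picks its expectation from the factory instead of hardcoding a scheduler. */
    static String expectedOutcome(SchedulerFactory factory) {
        switch (factory.getSchedulerType()) {
            case ADAPTIVE:
                return "adaptive-expectation";
            case DEFAULT:
                return "default-expectation";
            default:
                throw new IllegalStateException("Unknown scheduler type");
        }
    }

    public static void main(String[] args) {
        Properties ci = new Properties();
        ci.setProperty(FORCE_ADAPTIVE_PROPERTY, "true");

        // No explicit setting: the injected property decides.
        System.out.println(expectedOutcome(resolveFactory(Optional.empty(), ci)));
        // Explicit setting: it overrides the injected property.
        System.out.println(expectedOutcome(resolveFactory(Optional.of(SchedulerType.DEFAULT), ci)));
    }
}
```

   In the real test, the same factory instance would be the one handed to the 
JobMasterBuilder, so the assertions and the job master agree on the scheduler 
in use regardless of what the CI pipeline forces.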

