Heart Zhou created FLINK-37606:
----------------------------------
Summary: Blocklist timeout check may be lost
Key: FLINK-37606
URL: https://issues.apache.org/jira/browse/FLINK-37606
Project: Flink
Issue Type: Bug
Components: Runtime / Coordination
Affects Versions: 1.20.1
Reporter: Heart Zhou
The blocklist timeout check may be scheduled before the RPC server starts.
The check is scheduled on the mainThreadExecutor in the DefaultBlocklistHandler
constructor:
{code:java}
DefaultBlocklistHandler(xxx,
        Duration timeoutCheckInterval,
        ComponentMainThreadExecutor mainThreadExecutor,
        xxx) {
    xxx
    this.timeoutCheckInterval = checkNotNull(timeoutCheckInterval);
    this.mainThreadExecutor = checkNotNull(mainThreadExecutor);
    xxx
    scheduleTimeoutCheck();
}
{code}
When the scheduled check fires, the
org.apache.flink.runtime.rpc.RpcEndpoint#start method may not have been called
yet, although it will be called very soon afterwards.
In that case the check, which the main thread executor hands to the endpoint
through gateway.runAsync, might be lost:
{code:java}
public ScheduledFuture<?> schedule(Runnable command, long delay, TimeUnit unit) {
    final long delayMillis = TimeUnit.MILLISECONDS.convert(delay, unit);
    FutureTask<Void> ft = new FutureTask<>(command, null);
    if (mainScheduledExecutor.isShutdown()) {
        log.warn(
                "The scheduled executor service is shutdown and ignores the command {}",
                command);
    } else {
        mainScheduledExecutor.schedule(
                () -> gateway.runAsync(ft), delayMillis, TimeUnit.MILLISECONDS);
    }
    return new ScheduledFutureAdapter<>(ft, delayMillis, TimeUnit.MILLISECONDS);
}
{code}
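To illustrate the race, here is a minimal, self-contained sketch. It does not use Flink classes: ToyEndpoint, ManualTimer, ToyHandler and simulate are hypothetical names. It rests on two assumptions that are not stated in this report: that an endpoint which has not been started drops the commands it receives, and that the timeout check only re-schedules itself from inside its own run.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class LostCheckSketch {

    /** Toy endpoint (hypothetical): drops commands until start() is called. */
    static final class ToyEndpoint {
        private boolean started;

        void start() {
            started = true;
        }

        /** Returns true if the command ran, false if it was dropped. */
        boolean runAsync(Runnable command) {
            if (!started) {
                return false; // assumed behavior of an endpoint that is not started
            }
            command.run();
            return true;
        }
    }

    /** Deterministic timer (hypothetical): a task fires only on fireNext(). */
    static final class ManualTimer {
        private final Queue<Runnable> pending = new ArrayDeque<>();

        void schedule(Runnable task) {
            pending.add(task);
        }

        boolean fireNext() {
            Runnable task = pending.poll();
            if (task == null) {
                return false; // nothing scheduled any more
            }
            task.run();
            return true;
        }
    }

    /** Toy handler (hypothetical): schedules its first check in the constructor. */
    static final class ToyHandler {
        private final ToyEndpoint endpoint;
        private final ManualTimer timer;
        int checksRun;

        ToyHandler(ToyEndpoint endpoint, ManualTimer timer) {
            this.endpoint = endpoint;
            this.timer = timer;
            scheduleTimeoutCheck();
        }

        private void scheduleTimeoutCheck() {
            // The next check is scheduled only if the current one actually runs.
            timer.schedule(() -> endpoint.runAsync(() -> {
                checksRun++;
                scheduleTimeoutCheck();
            }));
        }
    }

    /** Fires the timer `ticks` times and returns how many checks really ran. */
    static int simulate(boolean startBeforeFirstTick, int ticks) {
        ToyEndpoint endpoint = new ToyEndpoint();
        ManualTimer timer = new ManualTimer();
        ToyHandler handler = new ToyHandler(endpoint, timer);
        if (startBeforeFirstTick) {
            endpoint.start();
        }
        timer.fireNext(); // the first delayed check fires
        endpoint.start(); // the endpoint is started "very soon" in either case
        for (int i = 1; i < ticks; i++) {
            timer.fireNext();
        }
        return handler.checksRun;
    }
}
```

Under these assumptions, simulate(true, 3) runs all three checks, while simulate(false, 3) runs none: the first check is dropped, so nothing re-schedules the following ones even though the endpoint starts right afterwards.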