[
https://issues.apache.org/jira/browse/FLINK-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chesnay Schepler updated FLINK-24343:
-------------------------------------
Issue Type: Technical Debt (was: Bug)
> Revisit Scheduler and Coordinator Startup Procedure
> ---------------------------------------------------
>
> Key: FLINK-24343
> URL: https://issues.apache.org/jira/browse/FLINK-24343
> Project: Flink
> Issue Type: Technical Debt
> Components: Runtime / Coordination
> Affects Versions: 1.14.0, 1.13.2
> Reporter: Stephan Ewen
> Priority: Major
> Fix For: 1.15.0
>
>
> We need to re-examine the startup procedure of the scheduler, and how it
> interacts with the startup of the operator coordinators.
> We need to make sure the following conditions are met:
> - The Operator Coordinators are started before the first action happens
> that they need to be informed of. That includes a task being ready, a
> checkpoint happening, etc.
> - The scheduler must be started to the point that it can handle
> "failGlobal()" calls, because the coordinators might trigger that during
> their startup when an exception in "start()" occurs.
> /cc [~chesnay]
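
The two conditions in the description can be illustrated with a minimal sketch of one possible startup ordering. This is not Flink code: StartupOrderSketch, Scheduler, Coordinator, taskReady() and the log strings are all hypothetical stand-ins, and having the scheduler catch the exception from start() and forward it to failGlobal() is an assumption made only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are Flink classes. */
public class StartupOrderSketch {

    /** Stand-in for an operator coordinator (assumed interface). */
    interface Coordinator {
        void start() throws Exception;
        void taskReady(String task);
    }

    /** Stand-in for the scheduler (assumed behavior). */
    static final class Scheduler {
        final List<String> log = new ArrayList<>();
        private final List<Coordinator> coordinators = new ArrayList<>();
        private boolean acceptsGlobalFailures;
        private boolean failed;

        void register(Coordinator coordinator) {
            coordinators.add(coordinator);
        }

        void failGlobal(Throwable cause) {
            // Condition 2: this must already be legal while coordinators start.
            if (!acceptsGlobalFailures) {
                throw new IllegalStateException("failGlobal() before scheduler is ready");
            }
            failed = true;
            log.add("failGlobal:" + cause.getMessage());
        }

        void start() {
            // Step 1: bring the scheduler far enough up to handle failGlobal().
            acceptsGlobalFailures = true;
            log.add("scheduler-ready");

            // Step 2: start every coordinator before any task-level action.
            for (Coordinator coordinator : coordinators) {
                try {
                    coordinator.start();
                } catch (Exception e) {
                    failGlobal(e);
                }
            }

            // Step 3: only now produce events the coordinators must see (condition 1).
            if (!failed) {
                log.add("deploy");
                for (Coordinator coordinator : coordinators) {
                    coordinator.taskReady("task-0");
                }
            }
        }
    }

    /** Runs one startup and returns the ordered event log. */
    public static List<String> run(boolean failInStart) {
        Scheduler scheduler = new Scheduler();
        scheduler.register(new Coordinator() {
            @Override
            public void start() throws Exception {
                if (failInStart) {
                    throw new Exception("boom");
                }
                scheduler.log.add("coordinator-started");
            }

            @Override
            public void taskReady(String task) {
                scheduler.log.add("taskReady:" + task);
            }
        });
        scheduler.start();
        return scheduler.log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

In this sketch a failing start() reaches failGlobal() only after the scheduler has marked itself ready, and no taskReady() notification is sent before every coordinator has started.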
--
This message was sent by Atlassian Jira
(v8.3.4#803005)