XComp opened a new pull request, #19567:
URL: https://github.com/apache/flink/pull/19567

   ## What is the purpose of the change
   
   In job mode, we triggered the shutdown as soon as the job reached a globally 
terminal state. This was fine in 1.14 and earlier because we didn't make any 
guarantees about the cleanup anyway. With 1.15, we introduced retries for 
cleanup, which means the final termination can take longer. During cluster 
shutdown, the ResourceManager is informed that the cluster is being 
deregistered, which results in the workers being shut down in the case of 
active RMs (i.e. k8s and YARN), possibly while the cleanup is still being 
retried. This change triggers the shutdown only after the cleanup has finished. 
See further details in FLINK-26772 (parent issue of this issue).
   
   ## Brief change log
   
   * Removed the override of `Dispatcher#jobReachedTerminalState` in 
`MiniDispatcher`
   * Introduced a new method that is called when the job has reached a globally 
terminal state; `MiniDispatcher` implements it to trigger the shutdown
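
The idea behind the two items above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SketchDispatcher`, `SketchMiniDispatcher`, the hook name `onGloballyTerminatedAndCleanedUp`, and the `events` list are all hypothetical names, and the real `Dispatcher` has different signatures. The sketch assumes the cleanup is exposed as a `CompletableFuture` that completes once all retries are done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the dispatcher base class; not Flink's real API.
abstract class SketchDispatcher {
    // Records the order of events so the ordering can be inspected.
    final List<String> events = new ArrayList<>();

    // Called when the job reaches a globally terminal state.
    // `cleanup` completes once the (possibly retried) cleanup has finished.
    CompletableFuture<Void> jobReachedTerminalState(CompletableFuture<Void> cleanup) {
        events.add("job-terminal");
        return cleanup.thenRun(() -> {
            events.add("cleanup-done");
            onGloballyTerminatedAndCleanedUp();
        });
    }

    // Hypothetical hook for subclasses; a no-op by default.
    protected void onGloballyTerminatedAndCleanedUp() {}
}

// Hypothetical job-mode dispatcher: triggers the shutdown only from the hook,
// instead of overriding jobReachedTerminalState.
class SketchMiniDispatcher extends SketchDispatcher {
    final CompletableFuture<Void> shutdownFuture = new CompletableFuture<>();

    @Override
    protected void onGloballyTerminatedAndCleanedUp() {
        events.add("shutdown");
        shutdownFuture.complete(null);
    }
}

class ShutdownAfterCleanupSketch {
    public static void main(String[] args) {
        SketchMiniDispatcher dispatcher = new SketchMiniDispatcher();
        CompletableFuture<Void> cleanup = new CompletableFuture<>();

        dispatcher.jobReachedTerminalState(cleanup);
        // Cleanup is still pending, so no shutdown has been triggered yet.
        System.out.println(dispatcher.shutdownFuture.isDone()); // false

        cleanup.complete(null); // cleanup (including any retries) finished
        System.out.println(dispatcher.shutdownFuture.isDone()); // true
        System.out.println(dispatcher.events);
    }
}
```

Keeping the shutdown in a separate hook means the base class alone decides when cleanup counts as finished, so a subclass cannot accidentally trigger the shutdown too early.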
   
   ## Verifying this change
   
   * I extended existing tests to verify that the shutdown happens after the 
cleanup
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: yes
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable
   

