tillrohrmann commented on a change in pull request #7565: [FLINK-11400] Linearize leadership operations in JobManagerRunner
URL: https://github.com/apache/flink/pull/7565#discussion_r251457733
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java
##########
@@ -1091,12 +1093,12 @@ private Acknowledge suspendExecution(final Exception cause) {
 		suspendAndClearExecutionGraphFields(cause);
 		// the slot pool stops receiving messages and clears its pooled slots
-		slotPoolGateway.suspend();
+		CompletableFuture<Acknowledge> slotPoolSuspendFuture = slotPoolGateway.suspend();
 		// disconnect from resource manager:
 		closeResourceManagerConnection(cause);
-		return Acknowledge.get();
+		return slotPoolSuspendFuture;
Review comment:
- `slotPoolGateway.suspend()` is an RPC and is executed asynchronously because `slotPool` is an `RpcEndpoint`.
- I think it does not matter much in which order `slotPoolGateway.suspend()` and `closeResourceManagerConnection()` are called. In the worst case, the `slotPool` will send some requests to the `ResourceManager`, which will be rejected because the `JobMaster` is no longer connected to it.
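
To illustrate both points, here is a minimal, self-contained sketch. It does not use Flink's classes: `ToySlotPool`, `Ack`, the `connected` flag, and the `gate` latch are hypothetical stand-ins, and the single-threaded executor only approximates how an `RpcEndpoint` processes calls one at a time. It shows that `suspend()` returns before the slot pool has actually been suspended, and that a request still in flight when the connection is closed is simply rejected.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

final class SuspendOrderingSketch {

    /** Hypothetical stand-in for Flink's Acknowledge; not the real class. */
    enum Ack { INSTANCE }

    /** Toy endpoint: every call runs later, one at a time, on its own thread. */
    static final class ToySlotPool {
        private final ExecutorService mainThread = Executors.newSingleThreadExecutor();
        private volatile boolean suspended;

        /** Asynchronous "RPC": the returned future completes once the call has run. */
        CompletableFuture<Ack> suspend() {
            return CompletableFuture.supplyAsync(() -> {
                suspended = true;
                return Ack.INSTANCE;
            }, mainThread);
        }

        /** A request already in flight; it is rejected if the connection is gone. */
        CompletableFuture<String> requestSlot(AtomicBoolean connected, CountDownLatch gate) {
            return CompletableFuture.supplyAsync(() -> {
                try {
                    gate.await(); // hold the request until the caller releases it
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return connected.get() ? "accepted" : "rejected";
            }, mainThread);
        }

        boolean isSuspended() { return suspended; }

        void shutdown() { mainThread.shutdown(); }
    }

    static String demo() {
        ToySlotPool slotPool = new ToySlotPool();
        AtomicBoolean connected = new AtomicBoolean(true); // toy ResourceManager connection
        CountDownLatch gate = new CountDownLatch(1);
        try {
            // A slot request is queued before the suspension starts.
            CompletableFuture<String> pending = slotPool.requestSlot(connected, gate);

            // Same order as the patched method: enqueue suspend, then disconnect.
            CompletableFuture<Ack> suspendFuture = slotPool.suspend();
            boolean suspendedAtCallTime = slotPool.isSuspended(); // suspend has not run yet
            connected.set(false);

            gate.countDown();                // let the queued request proceed
            String outcome = pending.join(); // sees the closed connection
            suspendFuture.join();            // only now is the pool suspended

            return "request=" + outcome
                    + ", suspendedAtCallTime=" + suspendedAtCallTime
                    + ", suspendedAfterJoin=" + slotPool.isSuspended();
        } finally {
            slotPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this toy setup the late request is rejected either way, so only the returned future tells the caller when the suspension has actually taken effect.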