Github user mxm commented on the issue:
https://github.com/apache/flink/pull/2928
There is one problem we overlooked. In detached mode we ensure cluster
shutdown through a message sent by the client during job submission to tell the
JobManager that this is going to be the last job it has to execute. In
interactive execution mode, the user jar can contain multiple jobs; this is
mostly useful for interactive batch jobs. Since we just execute the main method
of the user jar, we don't know how many jobs are submitted or when to shut down
the cluster. That's why we chose to delegate the shutdown to the client for
interactive jobs. Thus, I'm hesitant to remove the shutdown hook because it
ensures that the cluster shuts down during interactive job executions. It
prevents clusters from lingering around when the client shuts down.
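To make the role of the hook concrete, here is a minimal sketch of the idea. `ClusterHandle` and `InteractiveClient` are hypothetical names for illustration only, not Flink's actual classes; the only real API used is `Runtime.addShutdownHook`.

```java
/** Hypothetical stand-in for the handle the client holds on the cluster; not Flink API. */
interface ClusterHandle {
    void shutDownCluster();
}

/** Hypothetical client-side helper; not Flink API. */
final class InteractiveClient {
    /**
     * Registers a JVM shutdown hook that stops the cluster when the client
     * process exits, so the cluster does not linger after the last job.
     * Returns the hook thread so the caller can remove it again if needed.
     */
    static Thread installShutdownHook(ClusterHandle cluster) {
        Thread hook = new Thread(cluster::shutDownCluster, "cluster-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Because the hook runs on client exit regardless of how many jobs the main method submitted, the client does not need to know the job count up front.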
A couple of solutions to this problem:
1. The JobManager watches the client and shuts down if a) it loses the
connection to the client and the job it executes has completed, or b) the
client tells the JobManager to shut down.
2. The JobManager drives the execution, which is currently part of the client.
3. We don't allow multiple jobs to execute. Then we always have a clear
shutdown point. This is perhaps the easiest and most elegant solution. Most
users only execute a single job at a time anyway. We can still allow
interactive job executions if the user chooses to. Perhaps we can make this
more explicit in the API to give a hint to the client.
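For option 1, the shutdown decision on the JobManager side could look roughly like the sketch below. `ShutdownPolicy` and all of its methods are hypothetical and only illustrate the two conditions; how the JobManager would actually detect a lost client connection is left open here.

```java
/** Hypothetical sketch of the option 1 shutdown decision; not Flink API. */
final class ShutdownPolicy {
    private boolean clientConnected = true;
    private boolean shutdownRequested = false;
    private int runningJobs = 0;

    void jobSubmitted() { runningJobs++; }

    void jobFinished() {
        if (runningJobs > 0) {
            runningJobs--;
        }
    }

    /** Called when the JobManager notices the client connection is gone. */
    void clientDisconnected() { clientConnected = false; }

    /** Called when the client explicitly asks for a shutdown. */
    void clientRequestedShutdown() { shutdownRequested = true; }

    boolean shouldShutDown() {
        // Case b): the client told the JobManager to shut down.
        if (shutdownRequested) {
            return true;
        }
        // Case a): the client is gone and no job is still executing.
        return !clientConnected && runningJobs == 0;
    }
}
```

With this policy a cluster keeps running while a connected client submits further jobs, and stops once the client disappears and the current job has completed.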
I'm afraid we will have to close this PR until we realize one of the above
solutions (or another one).