imback82 commented on pull request #28435:
URL: https://github.com/apache/spark/pull/28435#issuecomment-623570107


   The shutdown hook is eventually called from 
`org.apache.hadoop.util.ShutdownHookManager`:
   ```JAVA
       for (HookEntry entry: getShutdownHooksInOrder()) {
         Future<?> future = EXECUTOR.submit(entry.getHook());
         try {
           future.get(entry.getTimeout(), entry.getTimeUnit());
         } catch (TimeoutException ex) {
           future.cancel(true);
           LOG.warn("ShutdownHook '" + entry.getHook().getClass().
               getSimpleName() + "' timeout, " + ex.toString(), ex);
         }
       }
   ```
   where `EXECUTOR` is
   ```JAVA
     private static final ExecutorService EXECUTOR =
         HadoopExecutors.newSingleThreadExecutor(new ThreadFactoryBuilder()
             .setDaemon(true)
             .setNameFormat("shutdown-hook-%01d")
             .build());
   ```
   What I noticed is that when a timeout occurs, the hook has been submitted for execution but never actually runs before the timeout is reached: since `EXECUTOR` is single-threaded, a newly submitted hook can sit in the queue behind an earlier hook for the entire wait. By contrast, in Hadoop <=2.7, each hook ran in the same thread that was running the above loop, but without any timeout functionality.
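   To make this concrete, here is a minimal, self-contained JDK-only sketch (not the actual Hadoop/Spark code; all names are illustrative) showing how a single-threaded executor plus `future.get(timeout)` can time a hook out before it ever starts:
   ```JAVA
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   import java.util.concurrent.Future;
   import java.util.concurrent.TimeUnit;
   import java.util.concurrent.TimeoutException;

   public class ShutdownHookTimeoutDemo {
     public static void main(String[] args) throws Exception {
       // One thread, like ShutdownHookManager's EXECUTOR.
       ExecutorService executor = Executors.newSingleThreadExecutor();

       // First "hook" holds the only thread longer than its timeout.
       Future<?> slow = executor.submit(() -> {
         try { Thread.sleep(5_000); } catch (InterruptedException ignored) { }
       });
       try {
         slow.get(1, TimeUnit.SECONDS);
       } catch (TimeoutException ex) {
         System.out.println("slow hook timed out (but is still running)");
       }

       // Second "hook" is submitted but only queued: the single thread is
       // still busy, so it never starts before its own timeout fires.
       Future<?> next = executor.submit(
           () -> System.out.println("next hook actually ran"));
       try {
         next.get(1, TimeUnit.SECONDS);
       } catch (TimeoutException ex) {
         System.out.println("next hook timed out without ever running");
         next.cancel(true);  // like ShutdownHookManager; drops the queued task
       }
       executor.shutdownNow();
     }
   }
   ```
   Running this prints both timeout messages but never "next hook actually ran": the second task was still queued when `cancel(true)` dropped it.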
   
   So I don't think we should rely on the shutdown hook to perform the unregistration on a successful run.
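   For the successful path, a rough sketch of the alternative (hypothetical names, not the real `ApplicationMaster` API): unregister explicitly when the run completes, and keep the shutdown hook only as a best-effort fallback for abnormal exits, with an idempotent guard so the two paths can't double-fire:
   ```JAVA
   import java.util.concurrent.atomic.AtomicBoolean;

   public class AmLifecycleSketch {
     private final AtomicBoolean unregistered = new AtomicBoolean(false);

     // Hypothetical stand-in for the real unregister-from-the-RM call;
     // idempotent so the hook and the explicit call cannot both fire.
     private void unregister(String status) {
       if (unregistered.compareAndSet(false, true)) {
         System.out.println("unregistering with status " + status);
       }
     }

     private void runApplication() {
       // ... the actual work of the application ...
     }

     public void run() {
       // Fallback only: under ShutdownHookManager this may time out or
       // never get to run, so don't depend on it for the success path.
       Runtime.getRuntime().addShutdownHook(
           new Thread(() -> unregister("FAILED")));
       runApplication();
       // A successful run unregisters explicitly, before the JVM exits.
       unregister("SUCCEEDED");
     }

     public static void main(String[] args) {
       new AmLifecycleSketch().run();
     }
   }
   ```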
   
   This applies to both client and cluster mode (since the hook is registered in `ApplicationMaster.run`), although my main observation is from cluster-mode runs.
   
   

