[ 
https://issues.apache.org/jira/browse/HADOOP-15679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584515#comment-16584515
 ] 

Steve Loughran commented on HADOOP-15679:
-----------------------------------------

[~xyao]: I'm adding a log message at the end of the run, but not per-hook 
details, as that would be more complicated. With log4j set to debug and 
printing thread names, the log of the test run is:

{code}
2018-08-17 16:54:44,969 [main] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:shutdownHookManager(121)) - invoking 
executeShutdown()
2018-08-17 16:54:44,978 [shutdown-hook-0] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:run(257)) - Starting shutdown of hook4 with sleep 
time of 25000
2018-08-17 16:54:46,980 [main] WARN  util.ShutdownHookManager 
(ShutdownHookManager.java:executeShutdown(128)) - ShutdownHook 'Hook' timeout, 
java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:205)
        at 
org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
        at 
org.apache.hadoop.util.TestShutdownHookManager.shutdownHookManager(TestShutdownHookManager.java:122)
  ...
2018-08-17 16:54:46,980 [shutdown-hook-0] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:run(268)) - Shutdown hook4 interrupted exception
java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)
        at 
org.apache.hadoop.util.TestShutdownHookManager$Hook.run(TestShutdownHookManager.java:260)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2018-08-17 16:54:46,984 [shutdown-hook-0] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:run(257)) - Starting shutdown of hook3 with sleep 
time of 1000
2018-08-17 16:54:47,985 [shutdown-hook-0] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:run(262)) - Completed shutdown of hook3
2018-08-17 16:54:47,986 [shutdown-hook-0] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:run(257)) - Starting shutdown of hook2 with sleep 
time of 0
2018-08-17 16:54:47,986 [shutdown-hook-0] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:run(262)) - Completed shutdown of hook2
2018-08-17 16:54:47,987 [shutdown-hook-0] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:run(257)) - Starting shutdown of hook1 with sleep 
time of 0
2018-08-17 16:54:47,987 [shutdown-hook-0] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:run(262)) - Completed shutdown of hook1
2018-08-17 16:54:47,987 [main] INFO  util.TestShutdownHookManager 
(TestShutdownHookManager.java:shutdownHookManager(123)) - Shutdown completed

// and here, in the real shutdown hook of the process.
2018-08-17 16:54:47,994 [Thread-0] DEBUG util.ShutdownHookManager 
(ShutdownHookManager.java:run(97)) - Completed shutdown in 0.000 seconds; 
Timeouts: 0
2018-08-17 16:54:47,997 [Thread-0] DEBUG util.ShutdownHookManager 
(ShutdownHookManager.java:shutdownExecutor(154)) - ShutdownHookManger completed 
shutdown.

{code}
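
For readers following the log: the pattern it shows (a bounded wait in 
FutureTask.get(), a TimeoutException logged on the main thread, then an 
InterruptedException inside the sleeping hook before the next hook starts) can 
be sketched roughly as below. This is a minimal illustration and not the 
ShutdownHookManager source; the class name TimedHookRunner and the method 
runAll are hypothetical, and the real code may differ in how it logs and 
orders hooks.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch; not the Hadoop ShutdownHookManager implementation. */
class TimedHookRunner {

  /**
   * Runs each hook in list order on a single worker thread, waiting at most
   * timeoutMillis for each one. Returns the number of hooks that timed out.
   */
  static int runAll(List<Runnable> hooks, long timeoutMillis) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    int timeouts = 0;
    try {
      for (Runnable hook : hooks) {
        Future<?> future = executor.submit(hook);
        try {
          future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
          // Interrupt the hook so the worker thread is freed for the next one.
          future.cancel(true);
          timeouts++;
        } catch (ExecutionException e) {
          // The hook itself threw; carry on with the remaining hooks.
        } catch (InterruptedException e) {
          // The caller was interrupted; restore the flag and stop waiting.
          Thread.currentThread().interrupt();
          break;
        }
      }
    } finally {
      executor.shutdownNow();
    }
    return timeouts;
  }
}
```

With a slow hook followed by a fast one and a short timeout, runAll counts one 
timeout, the slow hook sees an interrupt, and the fast hook still runs, which 
matches the ordering in the log above.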

> ShutdownHookManager shutdown time needs to be configurable & extended
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-15679
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15679
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.8.0, 3.0.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-15679-001.patch, HADOOP-15679-002.patch, 
> HADOOP-15679-002.patch, HADOOP-15679-003.patch
>
>
> HADOOP-12950 added a timeout on shutdowns to avoid problems with hanging 
> shutdowns. But the timeout is too short for applications where a large flush 
> of data is needed on shutdown.
> A key example of this is Spark apps which save their history to object 
> stores, where the file close() call triggers an upload of the final local 
> cached block of data (could be 32+MB), and then executes the final multipart 
> commit.
> Proposed:
> # make the default timeout 30s, not 10s
> # make it configurable with a time duration property (with a minimum time of 
> 1s?)
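
The two proposals above (a 30s default, and a duration property with a 1s 
floor) could look something like the following. This is only a sketch under 
stated assumptions: the property name "example.shutdown.timeout", the class 
ShutdownTimeoutSketch, and the accepted suffixes (ms, s, m) are all 
hypothetical, and plain java.util.Properties stands in for the Hadoop 
configuration classes.

```java
import java.util.Locale;
import java.util.Properties;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a configurable shutdown timeout with a floor. */
class ShutdownTimeoutSketch {
  // Assumed property name, for illustration only.
  static final String TIMEOUT_KEY = "example.shutdown.timeout";
  static final long DEFAULT_SECONDS = 30; // proposed default
  static final long MINIMUM_SECONDS = 1;  // proposed minimum

  /** Parses values such as "500ms", "45s", "2m", or a bare number of seconds. */
  static long parseSeconds(String value) {
    String v = value.trim().toLowerCase(Locale.ROOT);
    if (v.endsWith("ms")) {
      long millis = Long.parseLong(v.substring(0, v.length() - 2).trim());
      return TimeUnit.MILLISECONDS.toSeconds(millis);
    }
    if (v.endsWith("s")) {
      return Long.parseLong(v.substring(0, v.length() - 1).trim());
    }
    if (v.endsWith("m")) {
      long minutes = Long.parseLong(v.substring(0, v.length() - 1).trim());
      return TimeUnit.MINUTES.toSeconds(minutes);
    }
    return Long.parseLong(v);
  }

  /** Returns the configured timeout in seconds, never below the minimum. */
  static long getTimeoutSeconds(Properties props) {
    String raw = props.getProperty(TIMEOUT_KEY);
    long seconds = DEFAULT_SECONDS;
    if (raw != null) {
      try {
        seconds = parseSeconds(raw);
      } catch (NumberFormatException e) {
        // Unparseable value: fall back to the default.
        seconds = DEFAULT_SECONDS;
      }
    }
    return Math.max(seconds, MINIMUM_SECONDS);
  }
}
```

Clamping with Math.max means a misconfigured value such as "0s" cannot 
disable the wait entirely; whether the minimum should be 1s is the open 
question in the proposal.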



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

