[ 
https://issues.apache.org/jira/browse/HADOOP-15679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583352#comment-16583352
 ] 

Xiaoyu Yao commented on HADOOP-15679:
-------------------------------------

Thanks [[email protected]] for working on this. The patch v2 LGTM. I just have
one question and a few minor comments.

It's hard to come up with a magic timeout value that applies to the various
usages of shutdown hooks in Hadoop. With a single configuration, different file
systems may still conflict on their timeout settings. Have you considered a
per-file-system (s3, wasb, etc.) shutdown timeout, passed in as needed when
calling ShutdownHookManager#addShutdownHook in FileSystem#getInternal(), while
keeping the others at a small default value?
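
To make the question concrete, here is a minimal sketch of per-hook timeouts
using only the JDK. Everything in it is hypothetical: the class
PerHookTimeoutSketch, the HookEntry type, the four-argument addShutdownHook
overload and the 1s default are illustrations, not code from the patch or from
Hadoop's ShutdownHookManager.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of per-hook timeouts; not the real ShutdownHookManager. */
final class PerHookTimeoutSketch {

  /** A registered hook together with its own priority and timeout. */
  private static final class HookEntry {
    final Runnable hook;
    final int priority;
    final long timeoutMillis;

    HookEntry(Runnable hook, int priority, long timeoutMillis) {
      this.hook = hook;
      this.priority = priority;
      this.timeoutMillis = timeoutMillis;
    }
  }

  /** Assumed small default for hooks that do not ask for more time. */
  static final long DEFAULT_TIMEOUT_MILLIS = 1000;

  private final List<HookEntry> hooks = new ArrayList<>();

  /** Registers a hook with the small default timeout. */
  public void addShutdownHook(Runnable hook, int priority) {
    addShutdownHook(hook, priority, DEFAULT_TIMEOUT_MILLIS, TimeUnit.MILLISECONDS);
  }

  /** Registers a hook with its own timeout, e.g. longer for an object store. */
  public void addShutdownHook(Runnable hook, int priority, long timeout, TimeUnit unit) {
    hooks.add(new HookEntry(hook, priority, unit.toMillis(timeout)));
  }

  /** Runs hooks, highest priority first, and returns how many timed out. */
  public int executeShutdown() {
    List<HookEntry> ordered = new ArrayList<>(hooks);
    ordered.sort(Comparator.comparingInt((HookEntry e) -> e.priority).reversed());
    ExecutorService executor = Executors.newSingleThreadExecutor();
    int timedOut = 0;
    try {
      for (HookEntry entry : ordered) {
        Future<?> future = executor.submit(entry.hook);
        try {
          future.get(entry.timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException ex) {
          future.cancel(true); // interrupt the slow hook and move on
          timedOut++;
        } catch (ExecutionException ex) {
          // A failing hook must not prevent the remaining hooks from running.
        } catch (InterruptedException ex) {
          Thread.currentThread().interrupt();
          break;
        }
      }
    } finally {
      executor.shutdownNow();
    }
    return timedOut;
  }
}
```

With this shape, an object-store file system could register its close hook
with, say, a 30s timeout while every other hook keeps the small default. A
hook that ignores interruption would still delay the hooks queued behind it on
the single worker thread.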

ShutdownHookManager.java

Line 90: do we intend to use LOG.debug to display the stack info here? Can we 
add a conditional guard?

Line 94: can we add a LOG.info (or warn) reporting the number of timed-out
hooks returned from executeShutdown()? Maybe we can also add a debug log
profiling each shutdown hook's execution time, which would help in choosing
the timeout configuration in the future.
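
As a rough illustration of the profiling idea, the sketch below times each
hook and reports how many ran longer than a given budget. It uses
java.util.logging as a stand-in for Hadoop's logger, and the class
HookTimingSketch and its runAndTime method are hypothetical names. It only
measures; it does not enforce the timeout.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: per-hook timing plus a summary of slow hooks. */
final class HookTimingSketch {
  private static final Logger LOG =
      Logger.getLogger(HookTimingSketch.class.getName());

  /**
   * Runs each named hook in iteration order and returns the elapsed time of
   * each one in milliseconds.
   */
  public static Map<String, Long> runAndTime(Map<String, Runnable> hooks,
                                             long budgetMillis) {
    Map<String, Long> elapsed = new LinkedHashMap<>();
    int overBudget = 0;
    for (Map.Entry<String, Runnable> entry : hooks.entrySet()) {
      long start = System.nanoTime();
      try {
        entry.getValue().run();
      } catch (RuntimeException ex) {
        LOG.log(Level.WARNING, "Shutdown hook " + entry.getKey() + " failed", ex);
      }
      long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      elapsed.put(entry.getKey(), millis);
      // Guarded so the message is only built when debug-level logging is on.
      if (LOG.isLoggable(Level.FINE)) {
        LOG.fine("Shutdown hook " + entry.getKey() + " took " + millis + " ms");
      }
      if (millis > budgetMillis) {
        overBudget++;
      }
    }
    if (overBudget > 0) {
      LOG.warning(overBudget + " shutdown hook(s) exceeded " + budgetMillis + " ms");
    }
    return elapsed;
  }
}
```

The per-hook figures logged at debug level are the kind of data that could
guide a later choice of timeout value.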

Line 144: NIT: unfinished comment

Line 183: can we add a LOG.info (or warn) for an invalid timeout value?
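
Something along these lines is what I have in mind. This is only a sketch:
the class TimeoutValidationSketch, the method sanitizeTimeoutSeconds and the
choice to clamp to the minimum (rather than fall back to the default) are
assumptions, and the 1s minimum comes from the proposal in the issue
description. java.util.logging stands in for Hadoop's logger.

```java
import java.util.logging.Logger;

/** Hypothetical sketch: warn about and correct an invalid timeout value. */
final class TimeoutValidationSketch {
  private static final Logger LOG =
      Logger.getLogger(TimeoutValidationSketch.class.getName());

  /** Assumed minimum timeout in seconds, as proposed in the issue. */
  static final long MINIMUM_TIMEOUT_SECONDS = 1;

  /**
   * Returns the configured timeout if it is valid; otherwise logs a warning
   * and returns the minimum.
   */
  public static long sanitizeTimeoutSeconds(long configuredSeconds) {
    if (configuredSeconds < MINIMUM_TIMEOUT_SECONDS) {
      LOG.warning("Invalid shutdown timeout " + configuredSeconds
          + "s; using the minimum of " + MINIMUM_TIMEOUT_SECONDS + "s instead");
      return MINIMUM_TIMEOUT_SECONDS;
    }
    return configuredSeconds;
  }
}
```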

Line 297: NIT: blank change

> ShutdownHookManager shutdown time needs to be configurable & extended
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-15679
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15679
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.8.0, 3.0.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-15679-001.patch, HADOOP-15679-002.patch, 
> HADOOP-15679-002.patch
>
>
> HADOOP-12950 added a timeout on shutdowns to avoid problems with hanging 
> shutdowns. But the timeout is too short for applications where a large flush 
> of data is needed on shutdown.
> A key example of this is Spark apps which save their history to object 
> stores, where the file close() call triggers an upload of the final local 
> cached block of data (could be 32+MB), and then executes the final multipart 
> commit.
> Proposed
> # make the default sleep time 30s, not 10s
> # make it configurable with a time duration property (with minimum time of 
> 1s.?)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
