[ 
https://issues.apache.org/jira/browse/HADOOP-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712205#action_12712205
 ] 

Todd Lipcon commented on HADOOP-5890:
-------------------------------------

Whoops, I pasted a bad example from the log. Here's an example that actually 
demonstrates the behavior discussed:

{code}
2009-05-21 22:43:21,259 INFO  datanode.DataNode (DataNode.java:shutdown(637)) - Waiting for threadgroup to exit, active threads is 1
2009-05-21 22:43:21,259 WARN  datanode.DataNode (DataXceiverServer.java:run(137)) - DatanodeRegistration(127.0.0.1:40197, storageID=DS-2052133204-127.0.1.1-40197-1242971000238, infoPort=52207, ipcPort=52592):DataXceiveServer: java.nio.channels.AsynchronousCloseException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:152)
        at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:130)
        at java.lang.Thread.run(Thread.java:619)

2009-05-21 22:43:21,315 INFO  datanode.DataBlockScanner (DataBlockScanner.java:run(620)) - Exiting DataBlockScanner thread.
2009-05-21 22:43:22,259 INFO  datanode.DataNode (DataNode.java:shutdown(637)) - Waiting for threadgroup to exit, active threads is 0
{code}

Note the exact 1-second offset between 22:43:21,259 and 22:43:22,259. This 
patch reduces that significantly.

> Use exponential backoff on Thread.sleep during DN shutdown
> ----------------------------------------------------------
>
>                 Key: HADOOP-5890
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5890
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-5890.txt
>
>
> Tests waste a lot of time in DataNode.shutdown. Typical logs look like:
> {code}
> 2009-05-21 17:13:20,177 INFO  datanode.DataNode (DataNode.java:shutdown(637)) - Waiting for threadgroup to exit, active threads is 0
> 2009-05-21 17:13:20,177 INFO  datanode.DataBlockScanner (DataBlockScanner.java:run(620)) - Exiting DataBlockScanner thread.
> 2009-05-21 17:13:21,117 INFO  datanode.DataNode (DataNode.java:shutdown(637)) - Waiting for threadgroup to exit, active threads is 0
> {code}
> In this example (and very commonly) the DataBlockScanner thread exits within 
> 5-10ms after the first wait. The DN then sleeps an entire second before 
> succeeding in shutting down.
> Using exponential backoff from a short value like 2ms up to a maximum of 
> 1000ms would solve this.
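
For illustration only, the backoff described above could be sketched roughly as 
follows. This is not the code from hadoop-5890.txt: the class name BackoffWait, 
the method names, and the IntSupplier used to stand in for the thread group's 
active count are all hypothetical. Only the 2ms starting value and the 1000ms 
cap come from the description.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper for illustration; not the attached patch. */
final class BackoffWait {
    static final long INITIAL_SLEEP_MS = 2;   // short first sleep, per the description
    static final long MAX_SLEEP_MS = 1000;    // cap equal to the old fixed sleep

    /** Doubles the sleep interval, never exceeding maxMs. */
    static long nextSleepMillis(long currentMs, long maxMs) {
        return Math.min(currentMs * 2, maxMs);
    }

    /**
     * Polls activeThreads until it reports zero, sleeping between polls
     * with an exponentially growing interval. Returns the total number of
     * milliseconds requested from Thread.sleep.
     */
    static long waitForExit(IntSupplier activeThreads) {
        long sleepMs = INITIAL_SLEEP_MS;
        long totalSleptMs = 0;
        while (activeThreads.getAsInt() > 0) {
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            totalSleptMs += sleepMs;
            sleepMs = nextSleepMillis(sleepMs, MAX_SLEEP_MS);
        }
        return totalSleptMs;
    }
}
```

Under these assumptions, a thread that exits about 10ms after the first check 
is noticed after sleeps of 2, 4, and 8ms (roughly 14ms in total) rather than 
after a full 1000ms.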

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
