[
https://issues.apache.org/jira/browse/HADOOP-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712205#action_12712205
]
Todd Lipcon commented on HADOOP-5890:
-------------------------------------
Whoops, I pasted a bad example from the log. Here's one that actually
demonstrates the behavior discussed:
{code}
2009-05-21 22:43:21,259 INFO datanode.DataNode (DataNode.java:shutdown(637)) - Waiting for threadgroup to exit, active threads is 1
2009-05-21 22:43:21,259 WARN datanode.DataNode (DataXceiverServer.java:run(137)) - DatanodeRegistration(127.0.0.1:40197, storageID=DS-2052133204-127.0.1.1-40197-1242971000238, infoPort=52207, ipcPort=52592):DataXceiveServer: java.nio.channels.AsynchronousCloseException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:152)
    at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:130)
    at java.lang.Thread.run(Thread.java:619)
2009-05-21 22:43:21,315 INFO datanode.DataBlockScanner (DataBlockScanner.java:run(620)) - Exiting DataBlockScanner thread.
2009-05-21 22:43:22,259 INFO datanode.DataNode (DataNode.java:shutdown(637)) - Waiting for threadgroup to exit, active threads is 0
{code}
Note the exact 1-second gap between 22:43:21,259 and 22:43:22,259, even though
the last thread exited at 22:43:21,315. This patch reduces that delay
significantly.
> Use exponential backoff on Thread.sleep during DN shutdown
> ----------------------------------------------------------
>
> Key: HADOOP-5890
> URL: https://issues.apache.org/jira/browse/HADOOP-5890
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hadoop-5890.txt
>
>
> Tests waste a lot of time in DataNode.shutdown. Typical logs look like:
> {code}
> 2009-05-21 17:13:20,177 INFO datanode.DataNode (DataNode.java:shutdown(637)) - Waiting for threadgroup to exit, active threads is 0
> 2009-05-21 17:13:20,177 INFO datanode.DataBlockScanner (DataBlockScanner.java:run(620)) - Exiting DataBlockScanner thread.
> 2009-05-21 17:13:21,117 INFO datanode.DataNode (DataNode.java:shutdown(637)) - Waiting for threadgroup to exit, active threads is 0
> {code}
> In this example (and very commonly) the DataBlockScanner thread exits within
> 5-10ms after the first wait. The DN then sleeps an entire second before
> succeeding in shutting down.
> Using exponential backoff from a short value like 2ms up to a maximum of
> 1000ms would solve this.
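
For illustration, the backoff proposed above could be sketched roughly as
follows. This is not the code in hadoop-5890.txt: the class name BackoffWait,
both method names, and the choice to double the interval each round are
assumptions made for this example. Only the 2ms starting value and the 1000ms
cap come from the description.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper for illustration; not taken from the attached patch. */
final class BackoffWait {
    private BackoffWait() {}

    /** Next sleep interval: double the current one, capped at maxMs (assumed policy). */
    static long nextSleepMs(long currentMs, long maxMs) {
        return Math.min(currentMs * 2, maxMs);
    }

    /**
     * Polls activeThreads until it reports 0, sleeping between polls with an
     * interval that starts at initialMs and doubles up to maxMs.
     * Returns the number of sleeps performed.
     */
    static int waitForExit(IntSupplier activeThreads, long initialMs, long maxMs) {
        long sleepMs = initialMs;
        int sleeps = 0;
        while (activeThreads.getAsInt() > 0) {
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            sleeps++;
            sleepMs = nextSleepMs(sleepMs, maxMs);
        }
        return sleeps;
    }
}
```

Assuming the DataNode holds its worker threads in a ThreadGroup (called
threadGroup here, a name chosen for the example), the wait would be invoked as
BackoffWait.waitForExit(threadGroup::activeCount, 2, 1000). In the 22:43:21 log
from the comment, DataBlockScanner exits 56ms after the first wait message;
with these parameters the polls would fall at roughly 2, 6, 14, 30 and 62ms, so
shutdown would proceed after about 62ms rather than a full second.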