[ https://issues.apache.org/jira/browse/HDFS-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated HDFS-13174:
---------------------------------------
Fix Version/s: 2.10.2
> hdfs mover -p /path times out after 20 min
> ------------------------------------------
>
> Key: HDFS-13174
> URL: https://issues.apache.org/jira/browse/HDFS-13174
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Affects Versions: 2.8.0, 2.7.4, 3.0.0-alpha2
> Reporter: István Fajth
> Assignee: István Fajth
> Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.4, 2.10.2
>
> Attachments: HDFS-13174.001.patch, HDFS-13174.002.patch,
> HDFS-13174.003.patch, HDFS-13174.004.patch, HDFS-13174.005.patch
>
>
> HDFS-11015 introduced an iteration timeout in the Dispatcher.Source class,
> which is checked while dispatching the moves that the Balancer and the Mover
> perform. This timeout is hardwired to 20 minutes.
> The Balancer works in iterations: even if an iteration times out, the
> Balancer keeps running and starts another iteration, and it fails only if no
> moves happened in a few iterations.
> The Mover, on the other hand, does not have iterations. If moving a path
> takes more than 20 minutes and there are moves decided and enqueued between
> two DataNodes, the Mover stops after 20 minutes with the following exception
> reported to the console (line numbers might differ, as this exception came
> from a CDH5.12.1 installation).
> java.io.IOException: Block move timed out
>     at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:382)
>     at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:328)
>     at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:186)
>     at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:956)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>
> Note that this issue does not occur if all blocks can be moved within the
> DataNodes, without having to move a block to another DataNode.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)