[
https://issues.apache.org/jira/browse/HDFS-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685844#comment-15685844
]
Rakesh R commented on HDFS-11164:
---------------------------------
Presently {{DataXceiver#copyBlock}} performs the
{{datanode.data.getPinning(block)}} check and returns the generic {{ERROR}}
status code to the caller. Because this code is so generic, it is difficult to
differentiate block-pinned errors from other IOExceptions. One idea is to
introduce a new error code, say {{ERROR_BLOCK_PINNED}}, that represents block
pinning errors. That way, other layers (such as the Mover) can easily identify
block pinning errors and handle them appropriately.
{code}
// DataXceiver.java
if (datanode.data.getPinning(block)) {
  String msg = "Not able to copy block " + block.getBlockId() + " " +
      "to " + peer.getRemoteAddressString() + " because it's pinned ";
  LOG.info(msg);
  // Proposed: report a dedicated status instead of the generic ERROR.
  sendResponse(ERROR_BLOCK_PINNED, msg);
  return;
}
{code}
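On the Mover side, the idea could look roughly like the sketch below. It is only an illustration under stated assumptions: {{PinnedBlockRetrySketch}}, the {{Status}} enum, {{BlockPinnedException}}, {{MoveAttempt}}, {{checkStatus}} and {{moveWithRetries}} are all hypothetical stand-ins, not the actual HDFS classes or the real Dispatcher/Mover retry code. It shows only that a dedicated status lets the caller stop after the first attempt, while a generic error still uses up every retry.

```java
import java.io.IOException;

/** Hypothetical sketch; none of these names are the actual HDFS classes. */
final class PinnedBlockRetrySketch {

  /** Stand-in for the transfer status codes; ERROR_BLOCK_PINNED is the proposed addition. */
  enum Status { SUCCESS, ERROR, ERROR_BLOCK_PINNED }

  /** Hypothetical exception marking a failure that retrying cannot fix. */
  static final class BlockPinnedException extends IOException {
    BlockPinnedException(String msg) { super(msg); }
  }

  /** Hypothetical abstraction of one block-move attempt that yields the datanode's status. */
  interface MoveAttempt {
    Status run() throws IOException;
  }

  /** Turns a response status into an exception, keeping pinned failures distinguishable. */
  static void checkStatus(Status status, String msg) throws IOException {
    if (status == Status.ERROR_BLOCK_PINNED) {
      throw new BlockPinnedException(msg);
    }
    if (status != Status.SUCCESS) {
      throw new IOException("Got error, status=" + status + ", status message " + msg);
    }
  }

  /**
   * Tries the move up to maxAttempts times and returns how many attempts were made.
   * A pinned block ends the loop immediately; any other IOException is retried.
   */
  static int moveWithRetries(MoveAttempt attempt, int maxAttempts) {
    int attempts = 0;
    while (attempts < maxAttempts) {
      attempts++;
      try {
        checkStatus(attempt.run(), "block move");
        return attempts;            // move succeeded
      } catch (BlockPinnedException e) {
        return attempts;            // pinned: retrying cannot help, give up now
      } catch (IOException e) {
        // generic failure: loop around and retry
      }
    }
    return attempts;                // retries exhausted
  }

  public static void main(String[] args) {
    // Pinned block: a single attempt, no retries. Prints 1.
    System.out.println(moveWithRetries(() -> Status.ERROR_BLOCK_PINNED, 10));
    // Generic error: every attempt is used. Prints 10.
    System.out.println(moveWithRetries(() -> Status.ERROR, 10));
  }
}
```

Using a separate exception type is just one way to carry the distinction upward; checking the status value directly in the retry loop would work as well.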
> Mover should avoid unnecessary retries if the block is pinned
> -------------------------------------------------------------
>
> Key: HDFS-11164
> URL: https://issues.apache.org/jira/browse/HDFS-11164
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Reporter: Rakesh R
> Assignee: Rakesh R
>
> When the Mover tries to move a pinned block to another datanode, it
> internally hits the following IOException and marks the block movement as
> {{failure}}. Since the Mover has the {{dfs.mover.retry.max.attempts}} config,
> it will keep trying to move this block until it reaches {{retryMaxAttempts}}.
> These retries are unnecessary, and it would be good to avoid them, since a
> pinned block cannot be moved.
> {code}
> 2016-11-22 10:56:10,537 WARN
> org.apache.hadoop.hdfs.server.balancer.Dispatcher: Failed to move
> blk_1073741825_1001 with size=52 from 127.0.0.1:19501:DISK to
> 127.0.0.1:19758:ARCHIVE through 127.0.0.1:19501
> java.io.IOException: Got error, status=ERROR, status message opReplaceBlock
> BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 received
> exception java.io.IOException: Got error, status=ERROR, status message Not
> able to copy block 1073741825 to /127.0.0.1:19826 because it's pinned , copy
> block BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001 from
> /127.0.0.1:19501, reportedBlock move is failed
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:118)
> at
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:417)
> at
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:358)
> at
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$5(Dispatcher.java:322)
> at
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1075)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}