[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978517#comment-16978517 ]
Xiaoqiao He commented on HDFS-14997:
------------------------------------

Thanks [~sodonnell] for your comments.
{quote}We have seen a lot of occurrences when a client gets the error like "failed to close file as the last block has insufficient number of replicas"{quote}
I suggest increasing the parameter `dfs.client.block.write.locateFollowingBlock.retries` (I set it to 10 in practice). In my opinion, this case could also be related to high IO load.
{quote}there are many places in the datanode where the FsDatasetImpl lock is held for IO operations, and I suspect there are times we could potentially lock on a volume rather than DN wide. {quote}
I have noticed that three methods, FsDatasetImpl#{finalizeBlock, finalizeReplica, createRbw}, are very time-consuming. All of them perform IO operations while holding the global lock FsDatasetImpl#datasetLock. I believe we can improve the implementation. My colleague [~Aiphag0] is working on this now and will file a JIRA to track it.

I am not sure whether all 14 command types listed below can be processed asynchronously. For instance, if the DataNode does not process commands in time, in some cases the NameNode will return DNA_REGISTER many times. I think that is OK, since DNA_REGISTER is idempotent on both the NameNode and DataNode sides. But I am not sure whether the others are equally idempotent, or whether their processing order has to be preserved.
{DNA_UNKNOWN,DNA_TRANSFER,DNA_INVALIDATE,DNA_SHUTDOWN,DNA_REGISTER,DNA_FINALIZE,DNA_RECOVERBLOCK,DNA_ACCESSKEYUPDATE,DNA_BALANCERBANDWIDTHUPDATE,DNA_CACHE,DNA_UNCACHE,DNA_ERASURE_CODING_RECONSTRUCTION,DNA_BLOCK_STORAGE_MOVEMENT,DNA_DROP_SPS_WORK_COMMAND}

Any further suggestions? Thanks again.
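One way to sidestep the ordering question raised above is to hand commands to a single worker thread, so that the caller returns immediately while commands still run one at a time in arrival order. The following is only a minimal sketch under that assumption and is not the attached patch: the class `AsyncCommandSketch`, its methods, and the use of plain strings and a sleep to stand in for NameNode commands and slow IO are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; not code from HDFS or from HDFS-14997.001.patch. */
class AsyncCommandSketch {
    // A single worker thread runs queued tasks sequentially, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Records the order in which commands were actually processed.
    private final List<String> processed =
        Collections.synchronizedList(new ArrayList<>());

    /**
     * Called from the (hypothetical) heartbeat loop. Returns immediately;
     * the slow work happens later on the worker thread.
     */
    void enqueue(String action, long simulatedIoMillis) {
        worker.execute(() -> {
            try {
                // Stand-in for work done while waiting on a contended lock or slow disk.
                Thread.sleep(simulatedIoMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            processed.add(action);
        });
    }

    /** Stops accepting commands, waits for queued ones, and returns the processed order. */
    List<String> drainAndStop() {
        worker.shutdown();
        try {
            worker.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(processed);
    }

    public static void main(String[] args) {
        AsyncCommandSketch sketch = new AsyncCommandSketch();
        sketch.enqueue("DNA_INVALIDATE", 200); // slow command does not block the caller
        sketch.enqueue("DNA_TRANSFER", 0);
        sketch.enqueue("DNA_CACHE", 0);
        System.out.println(sketch.drainAndStop());
    }
}
```

A single worker keeps first-in, first-out order, so the idempotency question only matters if commands are later spread across several threads. The trade-off is that one slow command still delays the commands queued behind it, although it no longer delays the caller.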
> BPServiceActor process command from NameNode asynchronously
> -----------------------------------------------------------
>
>                 Key: HDFS-14997
>                 URL: https://issues.apache.org/jira/browse/HDFS-14997
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Xiaoqiao He
>            Assignee: Xiaoqiao He
>            Priority: Major
>         Attachments: HDFS-14997.001.patch
>
>
> There are two core functions in the #BPServiceActor main processing flow:
> reporting (#sendHeartbeat, #blockReport, #cacheReport) and #processCommand.
> If #processCommand takes a long time, it blocks the reporting flow.
> #processCommand can take a long time (over 1000s in the worst case I have
> seen) when the IO load of the DataNode is very high. Since some IO
> operations run under #datasetLock, processing some commands (such as
> #DNA_INVALIDATE) has to wait a long time to acquire #datasetLock. In that
> case, #heartbeat is not sent to the NameNode in time, which triggers other
> failures.
> I propose making #processCommand asynchronous so that it does not block
> #BPServiceActor from sending heartbeats back to the NameNode under high IO
> load.
> Notes:
> 1. Lifeline could be one effective solution; however, some old branches do
> not support this feature.
> 2. IO operations under #datasetLock are another issue; I think we should
> solve it in another JIRA.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)