    [ http://issues.apache.org/jira/browse/HADOOP-774?page=comments#action_12455740 ]

dhruba borthakur commented on HADOOP-774:
-----------------------------------------
My apologies. My comment should have said "Better comments in the code".

> Datanodes fails to heartbeat when a directory with a large number of blocks is deleted
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-774
>                 URL: http://issues.apache.org/jira/browse/HADOOP-774
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
>         Attachments: chunkinvalidateBlocks2.java
>
>
> If a user removes a few files that are huge, it causes the namenode to send a BlockInvalidate command to the relevant Datanodes. The Datanode processes the blockInvalidate command as part of its heartbeat thread. If the number of blocks to be invalidated is huge, the datanode takes a long time to process it, which prevents it from sending new heartbeats to the namenode. The namenode then declares the datanode dead!
> 1. One option is to process the blockInvalidate command in a separate thread from the heartbeat thread in the Datanode.
> 2. Another option would be to constrain the namenode to send at most a fixed number of blocks (e.g. 500) per blockInvalidate message.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
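
The attachment chunkinvalidateBlocks2.java is not reproduced in this notification. For illustration only, the following is a rough, hypothetical sketch of option 2 above (capping the number of block ids handed to a datanode per heartbeat reply at e.g. 500); the class and method names are invented and do not correspond to the actual Hadoop DFS code.

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

/**
 * Sketch of the chunking idea: rather than sending the whole invalidate
 * list in one blockInvalidate message, the namenode drains at most
 * MAX_BLOCKS_PER_HEARTBEAT block ids per heartbeat reply, leaving the
 * remainder for later heartbeats. Illustrative names only.
 */
public class PendingInvalidates {

    // Assumed per-message cap; the issue suggests e.g. 500 blocks.
    private static final int MAX_BLOCKS_PER_HEARTBEAT = 500;

    // Block ids queued for invalidation on one datanode.
    private final Queue<Long> pending = new LinkedList<>();

    /** Queue the blocks of a deleted file for later invalidation. */
    public synchronized void add(List<Long> blockIds) {
        pending.addAll(blockIds);
    }

    /**
     * Called while building a heartbeat reply: return at most
     * MAX_BLOCKS_PER_HEARTBEAT ids; the rest stay queued.
     */
    public synchronized List<Long> drainChunk() {
        List<Long> chunk =
            new ArrayList<>(Math.min(pending.size(), MAX_BLOCKS_PER_HEARTBEAT));
        while (!pending.isEmpty() && chunk.size() < MAX_BLOCKS_PER_HEARTBEAT) {
            chunk.add(pending.poll());
        }
        return chunk;
    }

    // Tiny demo: 1200 queued blocks are delivered over three heartbeats
    // (500 + 500 + 200), so no single heartbeat stalls on a huge list.
    public static void main(String[] args) {
        PendingInvalidates pi = new PendingInvalidates();
        List<Long> blocks = new ArrayList<>();
        for (long i = 0; i < 1200; i++) {
            blocks.add(i);
        }
        pi.add(blocks);
        int heartbeat = 1;
        List<Long> chunk;
        while (!(chunk = pi.drainChunk()).isEmpty()) {
            System.out.println("heartbeat " + heartbeat++ + ": invalidate "
                + chunk.size() + " blocks");
        }
    }
}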