Large number of decommission freezes the Namenode
-------------------------------------------------

                 Key: HADOOP-4061
                 URL: https://issues.apache.org/jira/browse/HADOOP-4061
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.17.2
            Reporter: Koji Noguchi


On a 1900-node cluster, we tried decommissioning 400 nodes with 30k blocks each. 
The other 1500 nodes were almost empty.

When the decommission started, the namenode's queue overflowed every 6 minutes.

CPU usage showed that every 5 minutes the 
org.apache.hadoop.dfs.FSNamesystem$DecommissionedMonitor thread was taking 100% 
of the CPU for 1 minute, causing the queue to overflow.

{noformat}
  public synchronized void decommissionedDatanodeCheck() {
    for (Iterator<DatanodeDescriptor> it = datanodeMap.values().iterator();
         it.hasNext();) {
      DatanodeDescriptor node = it.next();
      checkDecommissionStateInternal(node);
    }
  }
{noformat}
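
Because the method above is synchronized, the whole pass over datanodeMap runs 
while holding the lock, which would explain other work backing up for the full 
minute. One possible mitigation, shown only as a minimal sketch, is to check a 
bounded number of nodes per lock acquisition and resume from a cursor on the 
next call. Everything in it (the class BatchedDecommissionCheck, the 
nodesPerCheck parameter, the String stand-in for DatanodeDescriptor) is 
hypothetical and not taken from the Hadoop source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical sketch, not Hadoop code: visits nodes in bounded batches so
 * the lock is released between batches instead of held for a full scan.
 */
class BatchedDecommissionCheck {
    private final List<String> nodes;          // stand-in for datanodeMap.values()
    private final int nodesPerCheck;           // assumed tuning knob
    private final Object lock = new Object();  // stand-in for the namesystem lock
    private int cursor = 0;                    // index where the next batch resumes

    BatchedDecommissionCheck(List<String> nodes, int nodesPerCheck) {
        if (nodesPerCheck <= 0) {
            throw new IllegalArgumentException("nodesPerCheck must be positive");
        }
        this.nodes = new ArrayList<>(nodes);
        this.nodesPerCheck = nodesPerCheck;
    }

    /**
     * Checks at most nodesPerCheck nodes while holding the lock and returns
     * how many were checked. Wraps to the first node after the last one.
     */
    int checkNextBatch(Consumer<String> checkNode) {
        synchronized (lock) {
            int checked = 0;
            while (checked < nodesPerCheck && cursor < nodes.size()) {
                checkNode.accept(nodes.get(cursor++));
                checked++;
            }
            if (cursor >= nodes.size()) {
                cursor = 0; // full pass done; the next call starts over
            }
            return checked;
        }
    }

    public static void main(String[] args) {
        BatchedDecommissionCheck checker =
            new BatchedDecommissionCheck(List.of("dn1", "dn2", "dn3", "dn4", "dn5"), 2);
        // Each call holds the lock only for its own batch of at most 2 nodes.
        for (int i = 0; i < 3; i++) {
            int n = checker.checkNextBatch(node -> System.out.println("checking " + node));
            System.out.println("batch of " + n + " done, lock released");
        }
    }
}
```

The trade-off in this sketch is that a full pass over all nodes takes several 
wake-ups of the monitor thread instead of one, in exchange for a shorter lock 
hold time per wake-up.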






-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
