Improve Block report processing and name node restarts (Master Jira)
--------------------------------------------------------------------

                 Key: HADOOP-2448
                 URL: https://issues.apache.org/jira/browse/HADOOP-2448
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
            Reporter: Sanjay Radia
            Assignee: Sanjay Radia
             Fix For: 0.16.0


It has been reported that for large clusters (2K datanodes), a restarted
namenode can often take hours to leave safe mode.
- Admins have reported that starting the datanodes in batches, say 100 at a
time, significantly improves the startup time of the namenode.
- Setting the initial heap (as opposed to the max heap) to be larger also
helps; this avoids the GCs that occur before more memory is added to the heap.

Observations of the namenode via JConsole and instrumentation:
- If 80% of memory is used for maintaining the names and blocks data structures,
then processing block reports can generate a lot of GC, causing block reports to
take a long time to process. This causes the datanodes that sent the block
reports to time out and resend them, making the situation worse.


Hence, to improve the situation, the following are proposed:

1. Have random backoffs (of, say, 60 sec for a 1K cluster) for the initial block
report sent by a DN. This would match the randomization of the normal hourly
block reports. (Jira HADOOP-2326)

2. Have the NN tell the DN how long to back off (i.e. rather than using a single
configuration parameter for the backoff). This would allow the system to adjust
automatically to cluster size: smaller clusters will start up faster than
larger clusters. (Jira HADOOP-2444)

3. Change the block reports to be an array of longs rather than an array of block
report objects. This would reduce the amount of memory used to process a block
report. This would help the initial startup and also block report processing
during normal operation outside of safe mode. (Jira HADOOP-2110)

4. Queue and acknowledge the receipt of the block reports and have a separate
set of threads process the block report queue. (Jira HADOOP-2111)
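
As a rough illustration of proposals 1 and 2 together, the sketch below has
the NN pick a backoff window that grows with cluster size and has each DN
pick a uniformly random delay inside that window before sending its initial
block report. The class and method names are hypothetical, and the linear
scaling (60 sec per 1000 datanodes, taken from the example figure above) is
an assumption, not something the Jiras specify.

```java
import java.util.Random;

// Hypothetical sketch; none of these names exist in Hadoop.
class InitialReportBackoff {
    // Assumption: the window scales linearly, 60 sec per 1000 datanodes.
    static final long WINDOW_MS_PER_1000_NODES = 60000L;

    /** NN side: choose a backoff window proportional to cluster size. */
    static long backoffWindowMs(int numDatanodes) {
        return WINDOW_MS_PER_1000_NODES * numDatanodes / 1000;
    }

    /** DN side: pick a uniformly random delay in [0, windowMs). */
    static long initialReportDelayMs(long windowMs, Random rng) {
        if (windowMs <= 0) {
            return 0L;
        }
        return (long) (rng.nextDouble() * windowMs);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int n : new int[] {100, 1000, 2000}) {
            long window = backoffWindowMs(n);
            System.out.println(n + " datanodes: window " + window
                + " ms, sample delay " + initialReportDelayMs(window, rng) + " ms");
        }
    }
}
```

With this scaling a 100-node cluster spreads its initial reports over about
6 sec, while a 2K cluster spreads them over about 2 min, so small clusters
are not penalized by a window sized for large ones.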

Four Jiras have been filed, as noted above.
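
To make proposal 3 concrete, here is a minimal sketch of a block report
packed into a single long[] instead of one object per block. The layout (two
longs per block: block id, then length in bytes) and all names are
assumptions for illustration; the actual encoding is whatever HADOOP-2110
settles on.

```java
// Hypothetical sketch; the layout and names are assumptions, not Hadoop's API.
class BlockReportAsLongs {
    // Assumed layout per block: [blockId, numBytes].
    static final int LONGS_PER_BLOCK = 2;

    /** Pack parallel arrays of block ids and lengths into one long[]. */
    static long[] encode(long[] blockIds, long[] lengths) {
        if (blockIds.length != lengths.length) {
            throw new IllegalArgumentException("ids and lengths differ in size");
        }
        long[] report = new long[blockIds.length * LONGS_PER_BLOCK];
        for (int i = 0; i < blockIds.length; i++) {
            report[i * LONGS_PER_BLOCK] = blockIds[i];
            report[i * LONGS_PER_BLOCK + 1] = lengths[i];
        }
        return report;
    }

    static int numBlocks(long[] report) {
        return report.length / LONGS_PER_BLOCK;
    }

    static long blockId(long[] report, int i) {
        return report[i * LONGS_PER_BLOCK];
    }

    static long numBytes(long[] report, int i) {
        return report[i * LONGS_PER_BLOCK + 1];
    }

    public static void main(String[] args) {
        long[] report = encode(new long[] {7L, 9L}, new long[] {128L, 64L});
        for (int i = 0; i < numBlocks(report); i++) {
            System.out.println("block " + blockId(report, i)
                + " has " + numBytes(report, i) + " bytes");
        }
    }
}
```

The receiver reads fields by index from one primitive array, so processing a
report allocates one array rather than one object per block, which is the
source of the GC savings the proposal is after.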

Based on experiments, we may not want to proceed with option 4. While option 4
did help block report processing when tried on its own, it turned out that in
combination with option 1 it did not help much. Furthermore, cleaning up the RPC
layer to remove the client-side timeout (see Jira HADOOP-2188) would make this
fix obsolete.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
