Improve Block report processing and name node restarts (Master Jira)
--------------------------------------------------------------------
Key: HADOOP-2448
URL: https://issues.apache.org/jira/browse/HADOOP-2448
Project: Hadoop
Issue Type: Improvement
Components: dfs
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Fix For: 0.16.0
It has been reported that for large clusters (2K datanodes), a restarted
namenode can often take hours to leave safe mode.
- admins have reported that if the data nodes are started, say 100 at a time,
it significantly improves the startup time of the name node
- setting the initial heap (as opposed to max heap) to be larger also helps:
this avoids the GCs that occur before more memory is added to the heap.
Observations of the Name node via JConsole and instrumentation:
- if 80% of memory is used for maintaining the names and blocks data structures,
then processing block reports can generate a lot of GC, causing block reports to
take a long time to process. This causes datanodes that sent the block reports
to time out and resend them, making the situation worse.
Hence to improve the situation the following are proposed:
1. Have random backoffs (of say 60sec for a 1K cluster) of the initial block
report sent by a DN. This would match the randomization of the normal hourly
block reports. (Jira HADOOP-2326)
2. Have the NN tell the DN how much to back off (i.e. rather than a single
configuration parameter for the backoff). This would allow the system to adjust
automatically to cluster size - smaller clusters will start up faster than
larger clusters. (Jira HADOOP-2444)
3. Change the block reports to be an array of longs rather than an array of block
report objects - this would reduce the amount of memory used to process a block
report. This would help the initial startup and also block report processing
during normal operation outside of safe mode. (Jira HADOOP-2110)
4. Queue and acknowledge the receipt of the block reports and have a separate
set of threads process the block report queue. (Jira HADOOP-2111)
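
As a rough illustration of proposals 1 and 2, the sketch below has the namenode
pick a backoff window from the cluster size and the datanode pick a random delay
inside that window. The class and method names are hypothetical, and the 60 ms
per datanode factor is only an assumption extrapolated from the "60sec for a 1K
cluster" figure above; it is not taken from the Hadoop code.

```java
import java.util.Random;

/** Hypothetical sketch: randomized delay before a datanode's first block report. */
public class InitialBlockReportBackoff {

    /**
     * Namenode side (proposal 2): choose a backoff window proportional to the
     * cluster size. The 60 ms-per-datanode factor is an assumption based on the
     * "60sec for a 1K cluster" figure in the proposal.
     */
    static long backoffWindowMillis(int numDatanodes) {
        final long millisPerDatanode = 60L; // assumed scaling factor
        return Math.max(0, numDatanodes) * millisPerDatanode;
    }

    /** Datanode side (proposal 1): uniformly random delay in [0, windowMillis). */
    static long initialReportDelayMillis(long windowMillis, Random random) {
        if (windowMillis <= 0) {
            return 0L; // no backoff requested
        }
        return (long) (random.nextDouble() * windowMillis);
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int n : new int[] {10, 1000, 2000}) {
            long window = backoffWindowMillis(n);
            long delay = initialReportDelayMillis(window, random);
            System.out.printf("datanodes=%d window=%dms delay=%dms%n", n, window, delay);
        }
    }
}
```

With this scaling a 10-node cluster waits at most 0.6 seconds, while a 2K-node
cluster spreads its initial reports over two minutes.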
4 Jiras have been filed as noted.
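
To illustrate proposal 3, here is a minimal sketch of a block report packed into
a single long[] instead of an array of objects. It assumes each block is
described by just two values (block id and length in bytes); the class name and
layout are hypothetical and may differ from what HADOOP-2110 implements.

```java
/** Hypothetical sketch: a block report stored as a flat long[] of (id, length) pairs. */
public class BlockReportAsLongs {

    static final int LONGS_PER_BLOCK = 2; // assumed layout: block id, then length

    /** Packs parallel arrays of block ids and lengths into one long[]. */
    static long[] encode(long[] blockIds, long[] lengths) {
        if (blockIds.length != lengths.length) {
            throw new IllegalArgumentException("ids and lengths must have equal size");
        }
        long[] report = new long[blockIds.length * LONGS_PER_BLOCK];
        for (int i = 0; i < blockIds.length; i++) {
            report[i * LONGS_PER_BLOCK] = blockIds[i];
            report[i * LONGS_PER_BLOCK + 1] = lengths[i];
        }
        return report;
    }

    static int numBlocks(long[] report) {
        return report.length / LONGS_PER_BLOCK;
    }

    static long blockId(long[] report, int index) {
        return report[index * LONGS_PER_BLOCK];
    }

    static long blockLength(long[] report, int index) {
        return report[index * LONGS_PER_BLOCK + 1];
    }

    public static void main(String[] args) {
        long[] report = encode(new long[] {101L, 102L}, new long[] {4096L, 1024L});
        for (int i = 0; i < numBlocks(report); i++) {
            System.out.println("block " + blockId(report, i)
                    + " length " + blockLength(report, i));
        }
    }
}
```

The receiver reads the values in place, so it allocates one array per report
rather than one object per block, which is where the memory and GC savings
would come from.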
Based on experiments, we may not want to proceed with option 4. While option 4
did help block report processing when tried on its own, it turned out that in
combination with option 1 it did not help much. Furthermore, cleanup of RPC to
remove the client-side timeout (see Jira HADOOP-2188) would make this fix obsolete.
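
For reference, the queue-and-acknowledge idea of option 4 can be sketched as
follows. This is only an illustration of the technique, not the code that was
tried in the experiments: the class name, the use of a long[] per report, and
the placeholder "processing" step (counting longs) are all assumptions.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: acknowledge block reports at once, process them later. */
public class QueuedBlockReportProcessor {

    private static final long[] STOP = new long[0]; // sentinel that ends a worker

    private final BlockingQueue<long[]> queue = new LinkedBlockingQueue<>();
    private final AtomicLong longsProcessed = new AtomicLong();
    private final ExecutorService workers;
    private final int numWorkers;

    QueuedBlockReportProcessor(int numWorkers) {
        this.numWorkers = numWorkers;
        this.workers = Executors.newFixedThreadPool(numWorkers);
        for (int i = 0; i < numWorkers; i++) {
            workers.submit(this::drain);
        }
    }

    /** RPC handler path: enqueue the report and return, which acts as the ack. */
    boolean receive(long[] report) {
        return queue.offer(report);
    }

    /** Worker loop: take reports off the queue until the sentinel is seen. */
    private void drain() {
        try {
            while (true) {
                long[] report = queue.take();
                if (report == STOP) {
                    return;
                }
                longsProcessed.addAndGet(report.length); // placeholder for real work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Lets queued reports finish, stops the workers, and returns the total count. */
    long shutdownAndCount() {
        for (int i = 0; i < numWorkers; i++) {
            queue.offer(STOP); // one sentinel per worker, queued behind all reports
        }
        workers.shutdown();
        try {
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return longsProcessed.get();
    }

    public static void main(String[] args) {
        QueuedBlockReportProcessor processor = new QueuedBlockReportProcessor(2);
        processor.receive(new long[] {101L, 4096L});
        processor.receive(new long[] {102L, 1024L, 103L, 512L});
        System.out.println("longs processed: " + processor.shutdownAndCount());
    }
}
```

Because the handler returns as soon as the report is queued, the sender does not
wait on processing time, which is the same timeout problem that removing the
client-side timeout would address.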