Increase dfs scalability by optimizing locking on namenode.
-----------------------------------------------------------
Key: HADOOP-814
URL: http://issues.apache.org/jira/browse/HADOOP-814
Project: Hadoop
Issue Type: Bug
Components: dfs
Reporter: dhruba borthakur
Assigned To: dhruba borthakur
The current dfs namenode encounters locking bottlenecks when the number of
datanodes is large. The namenode uses a single global lock to protect access to
its data structures. One key area is heartbeat processing: the lower the cost
of processing a heartbeat, the more nodes HDFS can support. A simple change to
the current locking model can increase scalability. Here are the details:
Case 1: Currently we have three locks: the global lock (on FSNamesystem), the
heartbeat lock and the datanodeMap lock. The following function is called when
a heartbeat is received by the namenode:
    public synchronized FSNamesystem.gotHeartbeat() {   ........ (A)
      synchronized (heartbeats) {                       ........ (B)
        synchronized (datanodeMap) {                    ........ (C)
          ...
        }
      }
    }
In the above piece of code, statement (A) acquires the global FSNamesystem
lock. This synchronization can be safely removed (remove updateStats too).
This means that a heartbeat from a datanode can be processed without holding
the global FSNamesystem lock.
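A minimal Java sketch of the Case 1 change, under stated assumptions: the class
HeartbeatSketch, the DatanodeInfo fields, and the register and lastUpdate
helpers are hypothetical stand-ins and are not taken from FSNamesystem. Only
the lock names (heartbeats, datanodeMap) and their nesting order follow the
description above. The point is that gotHeartbeat is no longer declared
synchronized, so it never takes the lock on the enclosing object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for FSNamesystem; not the real class.
class HeartbeatSketch {
    // Illustrative per-datanode record (assumed shape).
    static class DatanodeInfo {
        long lastUpdate; // time of the most recent heartbeat, in ms
    }

    // The two fine-grained locks; each collection doubles as its own monitor.
    private final List<DatanodeInfo> heartbeats = new ArrayList<>();
    private final Map<String, DatanodeInfo> datanodeMap = new HashMap<>();

    // Hypothetical helper: adds a node, taking the locks in the same order
    // (heartbeats, then datanodeMap) as gotHeartbeat.
    void register(String nodeId) {
        synchronized (heartbeats) {
            synchronized (datanodeMap) {
                DatanodeInfo info = new DatanodeInfo();
                datanodeMap.put(nodeId, info);
                heartbeats.add(info);
            }
        }
    }

    // Not declared synchronized: statement (A) is gone, so the global lock
    // on this object is never acquired on the heartbeat path.
    boolean gotHeartbeat(String nodeId, long now) {
        synchronized (heartbeats) {              // (B)
            synchronized (datanodeMap) {         // (C)
                DatanodeInfo info = datanodeMap.get(nodeId);
                if (info == null) {
                    return false;                // unknown datanode
                }
                info.lastUpdate = now;
                return true;
            }
        }
    }

    // Hypothetical accessor used only to observe the result.
    long lastUpdate(String nodeId) {
        synchronized (datanodeMap) {
            DatanodeInfo info = datanodeMap.get(nodeId);
            return info == null ? -1L : info.lastUpdate;
        }
    }
}
```

With this shape, any thread holding the global lock for an unrelated namespace
operation no longer blocks heartbeat processing, provided nothing touched
inside (B) and (C) relies on the global lock for protection.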
Case 2: A thread called the heartbeatCheck thread periodically traverses all
known datanodes to determine whether any of them has timed out. It is of the
following form:
    void FSNamesystem.heartbeatCheck() {
      synchronized (this) {               ........... (D)
        synchronized (heartbeats) {       ........... (E)
          ...
        }
      }
    }
This thread acquires the global FSNamesystem lock in statement (D). Statement
(D) can be removed. Instead, the loop can check whether any nodes are dead,
and acquire the global FSNamesystem lock only if a dead node is found.
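The Case 2 idea can be sketched in Java as follows. This is an illustration
under assumptions, not the actual patch: the class HeartbeatCheckSketch, the
DatanodeInfo fields, expireIntervalMs, addNode and the globalLockAcquisitions
counter are all hypothetical. The scan runs under the heartbeats lock alone,
and the global lock is taken only after a timed-out node has been seen. The
slow path re-acquires the locks in the original order (this, then heartbeats)
and rechecks the node, since a heartbeat could arrive between the scan and the
removal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for FSNamesystem; not the real class.
class HeartbeatCheckSketch {
    static class DatanodeInfo {
        final String name;
        long lastUpdate; // time of the most recent heartbeat, in ms
        DatanodeInfo(String name, long lastUpdate) {
            this.name = name;
            this.lastUpdate = lastUpdate;
        }
    }

    private final List<DatanodeInfo> heartbeats = new ArrayList<>();
    private final long expireIntervalMs;   // assumed timeout parameter
    int globalLockAcquisitions = 0;        // instrumentation for this example only

    HeartbeatCheckSketch(long expireIntervalMs) {
        this.expireIntervalMs = expireIntervalMs;
    }

    void addNode(String name, long lastUpdate) {
        synchronized (heartbeats) {
            heartbeats.add(new DatanodeInfo(name, lastUpdate));
        }
    }

    // Returns the names of the nodes removed as dead.
    List<String> heartbeatCheck(long now) {
        List<String> removed = new ArrayList<>();
        while (true) {
            DatanodeInfo dead = null;
            // Fast path: scan under the heartbeats lock only (no statement (D)).
            synchronized (heartbeats) {
                for (DatanodeInfo d : heartbeats) {
                    if (now - d.lastUpdate > expireIntervalMs) {
                        dead = d;
                        break;
                    }
                }
            }
            if (dead == null) {
                return removed; // common case: global lock never taken
            }
            // Slow path: global lock first, then heartbeats, as in the original.
            synchronized (this) {
                globalLockAcquisitions++;
                synchronized (heartbeats) {
                    // Recheck: the node may have sent a heartbeat meanwhile.
                    if (now - dead.lastUpdate > expireIntervalMs
                            && heartbeats.remove(dead)) {
                        removed.add(dead.name);
                    }
                }
            }
        }
    }
}
```

In the common case where no datanode has timed out, a pass of this loop never
contends for the global lock. Releasing the heartbeats lock before taking the
global lock keeps the acquisition order the same as in the original code,
which avoids introducing a lock-order inversion.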
Fixing the above two cases may allow HDFS to scale to a higher number of
nodes.