hfutatzhanghb commented on code in PR #6737:
URL: https://github.com/apache/hadoop/pull/6737#discussion_r1609154301


##########
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/NamenodeFGL.md:
##########
@@ -0,0 +1,210 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+HDFS Namenode Fine-grained Locking
+==================================
+
+<!-- MACRO{toc|fromDepth=0|toDepth=3} -->
+
+Overview
+--------
+
+HDFS relies on a single master, the Namenode (NN), as its metadata center.
+From an architectural point of view, a few elements make NN the bottleneck of an HDFS cluster:
+* NN keeps the entire namespace in memory (directory tree, blocks, Datanode related info, etc.)
+* Read requests (`getListing`, `getFileInfo`, `getBlockLocations`) are served from memory.
+Write requests (`mkdir`, `create`, `addBlock`, `complete`) update the memory state and write a journal transaction into QJM.
+Both types of requests need a locking mechanism to ensure data consistency and correctness.
+* All requests are funneled into NN and have to go through the global FS lock.
+Each write operation acquires this lock in write mode and holds it until that operation is executed.
+This lock mode prevents concurrent execution of write operations even if they involve different branches of the directory tree.
+
+NN fine-grained locking (FGL) implementation aims to alleviate this bottleneck by allowing concurrency of disjoint write operations.
+
+JIRA: [HDFS-17366](https://issues.apache.org/jira/browse/HDFS-17366)
+
+Design
+------
+In theory, fully independent operations can be processed concurrently, such as operations involving different subdirectory trees.
+As such, NN can split the global lock into the full path lock, just using the full path lock to protect a special subdirectory tree.
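As an illustration of the idea in the Design paragraph above, here is a minimal sketch in which writers under different subtrees do not block each other. It is not the implementation from the PR: the class `SubtreeLockSketch`, the helpers `subtreeOf` and `lockFor`, and the choice of one lock per top-level directory are all assumptions made for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch: one read-write lock per top-level subtree instead of one global lock. */
public class SubtreeLockSketch {
    // Assumption: locks are keyed by the first path component only.
    private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
        new ConcurrentHashMap<>();

    /** Returns the first path component, e.g. "/a/b/c" -> "a"; the root maps to "". */
    static String subtreeOf(String path) {
        String[] parts = path.split("/");
        return parts.length > 1 ? parts[1] : "";
    }

    /** Returns the lock guarding the subtree that contains the given path. */
    ReentrantReadWriteLock lockFor(String path) {
        return locks.computeIfAbsent(subtreeOf(path), k -> new ReentrantReadWriteLock());
    }

    public static void main(String[] args) throws InterruptedException {
        SubtreeLockSketch sketch = new SubtreeLockSketch();
        // A writer is active somewhere under /a.
        sketch.lockFor("/a/file1").writeLock().lock();
        try {
            // A second writer, on another thread, targets a path under /b.
            Thread other = new Thread(() -> {
                ReentrantReadWriteLock lock = sketch.lockFor("/b/file2");
                boolean acquired = lock.writeLock().tryLock();
                System.out.println("disjoint write acquired: " + acquired);
                if (acquired) {
                    lock.writeLock().unlock();
                }
            });
            other.start();
            other.join();
        } finally {
            sketch.lockFor("/a/file1").writeLock().unlock();
        }
    }
}
```

With a single global lock, the second writer would have to wait for the first. A real design also has to handle operations that span subtrees (for example `rename`), which this sketch ignores.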
+
+### RPC Categorization
+
+Roughly, RPC operations handled by NN can be divided into 8 main categories
+
+| Category | Operations |
+|----------|------------|
+| Involving namespace tree | `mkdir`, `create` (without overwrite), `getFileInfo` (without locations), `getListing` (without locations), `setOwner`, `setPermission`, `getStoragePolicy`, `setStoragePolicy`, `rename`, `isFileClosed`, `getFileLinkInfo`, `setTimes`, `modifyAclEntries`, `removeAclEntries`, `setAcl`, `getAcl`, `setXAttr`, `getXAttrs`, `listXAttrs`, `removeXAttr`, `checkAccess`, `getErasureCodingPolicy`, `unsetErasureCodingPolicy`, `getQuotaUsage`, `getPreferredBlockSize` |
+| Involving only blocks | `reportBadBlocks`, `updateBlockForPipeline`, `updatePipeline` |
+| Involving only DNs | `registerDatanode`, `setBalancerBandwidth`, `sendHeartbeat` |
+| Involving both namespace tree & blocks | `getBlockLocation`, `create` (with overwrite), `append`, `setReplication`, `abandonBlock`, `addBlock`, `getAdditionalDatanode`, `complete`, `concat`, `truncate`, `delete`, `getListing` (with locations), `getFileInfo` (with locations), `recoverLease`, `listCorruptFileBlocks`, `fsync`, `commitBlockSynchronization`, `RedundancyMonitor`, `processMisReplicatedBlocks` |
+| Involving both DNs & blocks | `getBlocks`, `errorReport` |
+| Involving namespace tree, DNs & blocks | `blockReport`, `blockReceivedAndDeleted`, `HeartbeatManager`, `Decommission` |

Review Comment:
   @kokonguyen191 @ZanderXu Sir, I did not see any namespace tree modification within the scope of writeLock() in HeartbeatManager#heartbeatCheck. Please correct me if I am mistaken.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

