[ https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922880#comment-16922880 ]
Konstantin Shvachko commented on HDFS-14703:
--------------------------------------------

Hey [~hexiaoqiao], clarifying on your questions.
# The POC patches use the latch lock for only one operation: mkdir. All other operations are unchanged and use the global lock. So in the POC, concurrency is guaranteed only among concurrent mkdir operations. If you run delete (or any other op) concurrently with mkdir, the results will be unpredictable, exactly as you describe. The goal of the POC is to demonstrate the idea; it is not the final product.
??`deleting a directory should lock all RangeGets involved`. Is it one special case about Delete Ops???
Not only directory deletes. Several operations, such as rename and recursive mkdir, may need to lock multiple RangeGets.
# The POC patch adds a {{long[] namespaceKey}} field to INode, which would increase the footprint of the namespace, which is bad. {{namespaceKey}} is not really needed, as one can always calculate the key via the {{parent}} reference. It's an optimization. An alternative is to move the {{long[]}} into {{INodesInPath}}, so that the keys exist only while the INode is being accessed. Again, the POC does not do A LOT of things that the final implementation should. It's a large project, please don't blame me that I didn't do everything already ;).
# Actually there is an unlock for mkdir, otherwise the POC wouldn't work. {{FSNamesystemLock.writeUnlock()}} unlocks all locked children when {{unlockChildren == true}}.

[~hexiaoqiao] looking forward to working with you on this feature. Any and all help is very much welcome.
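To illustrate the point that a key can be recomputed from the {{parent}} reference instead of being stored on every INode, here is a minimal sketch. It is not the POC code: the {{Node}} class, the {{computeKey}} method, the inode ids, and the assumed key layout (ancestor ids followed by the node's own id, zero-padded above the root) are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

public class NamespaceKeySketch {
  /** Hypothetical stand-in for an INode: just an id and a parent reference. */
  static final class Node {
    final long id;
    final Node parent; // null for the root

    Node(long id, Node parent) {
      this.id = id;
      this.parent = parent;
    }
  }

  /**
   * Builds a fixed-length key by walking parent references (assumed layout).
   * The last slot holds the node's own id, earlier slots hold ancestor ids;
   * slots for which no ancestor exists stay 0.
   */
  static long[] computeKey(Node node, int depth) {
    long[] key = new long[depth];
    Node cur = node;
    for (int i = depth - 1; i >= 0 && cur != null; i--) {
      key[i] = cur.id;
      cur = cur.parent;
    }
    return key;
  }

  public static void main(String[] args) {
    Node root = new Node(1, null);
    Node dir = new Node(16386, root);
    Node file = new Node(16390, dir);
    // Keys are derived on demand; nothing is stored on the nodes themselves.
    System.out.println(Arrays.toString(computeKey(file, 3)));
    System.out.println(Arrays.toString(computeKey(dir, 3)));
  }
}
```

The trade-off in this sketch is the same one described in the comment: recomputing the key costs a short walk up the parent chain on each access, while caching it in a per-access structure (rather than on the node) avoids growing the in-memory namespace.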
> NameNode Fine-Grained Locking via Metadata Partitioning
> -------------------------------------------------------
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, namenode
> Reporter: Konstantin Shvachko
> Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace
> into multiple partitions each having a separate lock. Intended to improve
> performance of NameNode write operations.