[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922880#comment-16922880
 ] 

Konstantin Shvachko commented on HDFS-14703:
--------------------------------------------

Hey [~hexiaoqiao], clarifying on your questions.
 # The POC patches use the latch lock for only one operation: mkdir. All other
operations are unchanged and use the global lock, so the POC guarantees
concurrency only among mkdir operations. If you run delete (or any other op)
concurrently with mkdir, the results will be unpredictable, exactly as you
describe. The goal of the POC is to demonstrate the idea; it is not the final
product.
 ??`deleting a directory should lock all RangeGets involved`. Is it one special 
case about Delete Ops???
 Not only directory deletes. Several operations, such as rename and recursive
mkdir, may need to lock multiple RangeGets.
 # The POC patch adds a {{long[] namespaceKey}} field to INode, which
increases the footprint of the namespace, and that is bad. {{namespaceKey}} is
not really needed, since one can always calculate the key via the {{parent}}
reference; it is an optimization. An alternative is to move the {{long[]}} into
{{INodesInPath}} so that the keys exist only while the INode is being accessed.
 Again, the POC does not do A LOT of things that the final implementation
should. It's a large project, so please don't blame me for not having done
everything already ;).
 # Actually, there is an unlock for mkdir; otherwise the POC wouldn't work.
{{FSNamesystemLock.writeUnlock()}} unlocks all locked children when
{{unlockChildren == true}}.
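
To make the point about operations that span several partitions more concrete, here is a minimal Java sketch. It is not taken from the POC: the class {{RangeLockSketch}}, its methods, and the idea of identifying a partition by the start of its key range are all hypothetical and chosen only for illustration. It shows one common way to take several partition locks safely, namely always acquiring them in ascending order so that two multi-partition operations cannot deadlock each other, and then releasing everything that was taken in a single call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration only; this is NOT the POC's FSNamesystemLock. */
class RangeLockSketch {
    // One lock per key range, indexed by the first key of the range (assumed layout).
    private final TreeMap<Long, ReentrantLock> locksByRangeStart = new TreeMap<>();

    RangeLockSketch(long... rangeStarts) {
        // A catch-all range guarantees that every key falls into some range.
        locksByRangeStart.put(Long.MIN_VALUE, new ReentrantLock());
        for (long start : rangeStarts) {
            locksByRangeStart.put(start, new ReentrantLock());
        }
    }

    /** Returns the start of the range that contains the given key. */
    long rangeOf(long key) {
        return locksByRangeStart.floorKey(key);
    }

    /** Locks every range touched by the keys, in ascending order of range start. */
    List<Long> lockAll(long... keys) {
        TreeSet<Long> ranges = new TreeSet<>(); // sorted and de-duplicated
        for (long key : keys) {
            ranges.add(rangeOf(key));
        }
        List<Long> held = new ArrayList<>(ranges);
        for (long start : held) {
            locksByRangeStart.get(start).lock();
        }
        return held;
    }

    /** Releases all locks returned by lockAll, in reverse acquisition order. */
    void unlockAll(List<Long> held) {
        for (int i = held.size() - 1; i >= 0; i--) {
            locksByRangeStart.get(held.get(i)).unlock();
        }
    }

    /** True if the current thread holds the lock of the range containing key. */
    boolean isHeld(long key) {
        return locksByRangeStart.get(rangeOf(key)).isHeldByCurrentThread();
    }
}
```

In this sketch a rename-like operation would call {{lockAll(srcKey, dstKey)}}, do its work, and then call {{unlockAll(held)}}. How this interacts with the global lock is deliberately left out, since the comment above does not spell it out.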

[~hexiaoqiao] looking forward to working with you on this feature. Any and all
help is very much welcome.

> NameNode Fine-Grained Locking via Metadata Partitioning
> -------------------------------------------------------
>
>                 Key: HDFS-14703
>                 URL: https://issues.apache.org/jira/browse/HDFS-14703
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namenode
>            Reporter: Konstantin Shvachko
>            Priority: Major
>         Attachments: 001-partitioned-inodeMap-POC.tar.gz, NameNode 
> Fine-Grained Locking.pdf
>
>
> We aim to enable fine-grained locking by splitting the in-memory namespace
> into multiple partitions, each having a separate lock. This is intended to
> improve the performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
