[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025761#comment-14025761 ]
Kihwal Lee commented on HDFS-6482:
----------------------------------

BlockIDs are sequential nowadays. With the proposed block distribution method, leaf dirs can get severely unbalanced, especially in smaller clusters. Besides the cost of looking up entries in a directory, directory lock contention can become high and hurt performance if many files are created and read from a small set of directories. I think limiting the number to 64 kind of imposed a cap on how contentious it can be. We might do better by more evenly distributing blocks.

> Use block ID-based block layout on datanodes
> --------------------------------------------
>
>                 Key: HDFS-6482
>                 URL: https://issues.apache.org/jira/browse/HDFS-6482
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.5.0
>            Reporter: James Thomas
>            Assignee: James Thomas
>         Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch
>
>
> Right now blocks are placed into directories that are split into many
> subdirectories when capacity is reached. Instead we can use a block's ID to
> determine the path it should go in. This eliminates the need for the LDir
> data structure that facilitates the splitting of directories when they reach
> capacity as well as fields in ReplicaInfo that keep track of a replica's
> location.
> An extension of the work in HDFS-3290.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
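
To make the imbalance concern in the comment above concrete, here is a minimal Java sketch. It assumes a hypothetical mapping, idToDir, that derives two directory levels from 8-bit fields of the block ID. The field positions, the "subdirN" naming, and the class name BlockLayoutSketch are illustrative assumptions and are not taken from the attached patches. The sketch only counts how many leaf directories a run of sequential block IDs occupies under two choices of bit field.

```java
import java.util.HashMap;
import java.util.Map;

public class BlockLayoutSketch {

    // Hypothetical ID-to-path mapping (an assumption, not the patch's scheme):
    // two directory levels, each an 8-bit field of the block ID.
    // `shift` selects where the lower field starts.
    public static String idToDir(long blockId, int shift) {
        int d1 = (int) ((blockId >>> (shift + 8)) & 0xff);
        int d2 = (int) ((blockId >>> shift) & 0xff);
        return "subdir" + d1 + "/subdir" + d2;
    }

    // Count blocks per leaf directory for `count` sequential IDs
    // starting at `firstId`.
    public static Map<String, Integer> histogram(long firstId, int count, int shift) {
        Map<String, Integer> perDir = new HashMap<>();
        for (long id = firstId; id < firstId + count; id++) {
            perDir.merge(idToDir(id, shift), 1, Integer::sum);
        }
        return perDir;
    }

    public static void main(String[] args) {
        long firstId = 1L << 30; // arbitrary starting ID for the illustration
        int count = 4096;        // a run of sequentially allocated block IDs

        // Fields taken from higher-order bits: neighbouring IDs share a directory.
        System.out.println("shift=16: "
            + histogram(firstId, count, 16).size() + " leaf dirs in use");

        // Fields taken from the lowest bits: neighbouring IDs are spread out.
        System.out.println("shift=0:  "
            + histogram(firstId, count, 0).size() + " leaf dirs in use");
    }
}
```

Under these assumptions, taking the fields from higher-order bits puts the whole run of sequential IDs into a single leaf directory, while taking them from the low-order bits spreads the same run across many directories. Which bits a real layout should use, and how many directories per level, is exactly the trade-off between lookup cost and lock contention that the comment raises.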