[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025980#comment-14025980
 ] 

Colin Patrick McCabe commented on HDFS-6482:
--------------------------------------------

bq. Suppose we had a cluster of 16 million blocks (with sequential block IDs), 
we could in theory have a single DN with a directory as large as 1024 entries, 
if we got unlucky with the assignment of blocks to DNs.

I don't think this calculation is right.

Even if all the blocks end up on a single DN (maximally unbalanced), a 16 
million block cluster gives (16 * 1024 * 1024) / (256 * 256) = 256 entries 
per directory, since the two-level layout spreads blocks across 
256 * 256 = 65,536 leaf directories.

To confirm this calculation, I ran this test program:
{code}
#include <inttypes.h>
#include <stdio.h>

#define MAX_A 256
#define MAX_B 256

/* Count of blocks landing in each second-level directory. */
uint64_t dir_entries[MAX_A][MAX_B];

int main(void)
{
  uint64_t i, j, l, a, b, c;
  uint64_t max = (16ULL * 1024ULL * 1024ULL);

  for (i = 0; i < max; i++) {
    /* Split the block ID into two directory levels and a file name:
     * byte 1 picks the first-level dir, byte 2 the second-level dir,
     * and the remaining bits (c) form the name within that dir. */
    l = (i & 0x00000000000000ffULL);
    a = (i & 0x000000000000ff00ULL) >> 8;
    b = (i & 0x0000000000ff0000ULL) >> 16;
    c = (i & 0xffffffffff000000ULL) >> 16;
    c |= l;
    //printf("%02"PRIx64"/%02"PRIx64"/%012"PRIx64"\n", a, b, c);
    dir_entries[a][b]++;
  }
  /* Find the most heavily populated directory. */
  max = 0;
  for (i = 0; i < MAX_A; i++) {
    for (j = 0; j < MAX_B; j++) {
      if (max < dir_entries[i][j]) {
        max = dir_entries[i][j];
      }
    }
  }
  printf("max entries per directory = %"PRIu64"\n", max);
  return 0;
}
{code}
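
For reference, building this with any C99 compiler and running it should print 
{{max entries per directory = 256}}: the loop covers all 2^24 combinations of 
(l, a, b) exactly once, so every (a, b) directory receives exactly 256 entries, 
matching the arithmetic above.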

bq. we were considering using some sort of deterministic probing (as in hash 
tables) to find less full directories if the initial directory for a block is 
full...

I don't think probing is a good idea.  It's going to slow things down in the 
common case when we're reading a block, since the reader could no longer 
compute the replica's path from the block ID alone and might have to check 
several candidate directories per read.
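
To make that concrete, here is a minimal sketch of a read under deterministic 
probing (illustrative only; the probe scheme, the names, and MAX_PROBES are all 
made up for this example, not anything from the patch):
{code}
#include <inttypes.h>
#include <stdio.h>
#include <sys/stat.h>

#define MAX_PROBES 8  /* hypothetical probe limit */

/* Candidate path for probe attempt n: the probe offset perturbs the
 * first-level directory, the way a hash table probe sequence would. */
static void candidate_path(uint64_t block_id, int n, char *buf, size_t len)
{
  uint64_t a = (((block_id >> 8) & 0xff) + (uint64_t)n) % 256;
  uint64_t b = (block_id >> 16) & 0xff;
  snprintf(buf, len, "current/%02" PRIx64 "/%02" PRIx64 "/blk_%" PRIu64,
           a, b, block_id);
}

/* The read path no longer knows the directory a priori: it has to
 * stat() up to MAX_PROBES candidates before finding the block. */
int find_block(uint64_t block_id, char *buf, size_t len)
{
  struct stat st;
  for (int n = 0; n < MAX_PROBES; n++) {
    candidate_path(block_id, n, buf, len);
    if (stat(buf, &st) == 0)
      return 0;  /* found after n + 1 metadata lookups */
  }
  return -1;     /* not on this volume */
}

int main(void)
{
  char path[256];
  /* Every probe that misses costs an extra metadata lookup on read. */
  printf("found: %d\n", find_block(12345, path, sizeof(path)));
  return 0;
}
{code}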

Maybe we should add another layer in the hierarchy so that we know we won't get 
big directories even on huge clusters.
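
Concretely (again just a sketch, not a patch; the path format is hypothetical), 
the path function would only need to consume one more byte of the ID, taking 
the layout from 256 * 256 = 64K leaf directories to 256^3 = 16M, so even a 
16-million-block DN averages about one block per directory:
{code}
#include <inttypes.h>
#include <stdio.h>

/* Three fixed levels of 256 directories each: bytes 1-3 of the block ID. */
static void block_path(uint64_t id, char *buf, size_t len)
{
  uint64_t a = (id >>  8) & 0xff;  /* level 1: as in the current proposal */
  uint64_t b = (id >> 16) & 0xff;  /* level 2: as in the current proposal */
  uint64_t c = (id >> 24) & 0xff;  /* level 3: the extra layer */
  snprintf(buf, len,
           "current/%02" PRIx64 "/%02" PRIx64 "/%02" PRIx64 "/blk_%" PRIu64,
           a, b, c, id);
}

int main(void)
{
  char buf[128];
  /* With 16M sequential IDs, each leaf gets 16M / 256^3 = 1 entry. */
  block_path(16ULL * 1024 * 1024 - 1, buf, sizeof(buf));
  puts(buf);
  return 0;
}
{code}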

> Use block ID-based block layout on datanodes
> --------------------------------------------
>
>                 Key: HDFS-6482
>                 URL: https://issues.apache.org/jira/browse/HDFS-6482
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.5.0
>            Reporter: James Thomas
>            Assignee: James Thomas
>         Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.patch
>
>
> Right now blocks are placed into directories that are split into many 
> subdirectories when capacity is reached. Instead, we can use a block's ID to 
> determine the path it should go in. This eliminates the need for the LDir 
> data structure that handles splitting directories when they reach capacity, 
> as well as the fields in ReplicaInfo that keep track of a replica's 
> location.
> An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
