Walter Su created HDFS-9122:
-------------------------------

             Summary: DN automatically add more volumes to avoid large volume
                 Key: HDFS-9122
                 URL: https://issues.apache.org/jira/browse/HDFS-9122
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Walter Su
Currently, if a DataNode has too many blocks, it partitions the block report by storage. In practice, we've seen a single storage contain such a large number of blocks that its report exceeds the max RPC data length. Storage density is increasing quickly, so a DataNode can hold more and more blocks, and it is getting harder to fit all the blocks of one storage into a single RPC report.

One option is "Support splitting BlockReport of a storage into multiple RPC" (HDFS-9011). I'm thinking maybe we could instead add more "logical" volumes (more storage directories on one device). DataNodeStorageInfo in the NameNode is cheap, and processing a single block report requires the NN to hold its lock, so splitting one big volume into many volumes avoids a single report holding the lock for too long.

We can support a wildcard in dfs.datanode.data.dir, like /physical-volume/dfs/data/dir*. When a volume exceeds a threshold (e.g. 1M blocks), the DN automatically creates a new storage directory, which becomes a new volume.

We also have to change RoundRobinVolumeChoosingPolicy: once we have chosen a physical volume, we choose the logical volume with the fewest blocks.
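To make the wildcard-expansion and threshold idea concrete, here is a minimal sketch (not actual DataNode code): it lists the directories matching a dir* prefix on one device and creates one more storage directory once every existing logical volume has crossed a block-count limit. The class and member names (LogicalVolumeManager, maybeAddVolume, blocksPerVolumeLimit) and the 1M default are assumptions for illustration only.

{code:java}
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of per-device logical-volume management on a DN. */
public class LogicalVolumeManager {

  /** Hypothetical per-directory state: path plus current block count. */
  static class LogicalVolume {
    final File dir;
    long blockCount;
    LogicalVolume(File dir, long blockCount) {
      this.dir = dir;
      this.blockCount = blockCount;
    }
  }

  private final File physicalRoot;         // e.g. /physical-volume/dfs/data
  private final String dirPrefix;          // e.g. "dir" from the wildcard dir*
  private final long blocksPerVolumeLimit; // e.g. 1,000,000

  private final List<LogicalVolume> volumes = new ArrayList<>();

  LogicalVolumeManager(File physicalRoot, String dirPrefix, long blocksPerVolumeLimit) {
    this.physicalRoot = physicalRoot;
    this.dirPrefix = dirPrefix;
    this.blocksPerVolumeLimit = blocksPerVolumeLimit;
  }

  /** Expand the wildcard: pick up every existing directory matching the prefix. */
  void loadExistingVolumes() {
    File[] dirs = physicalRoot.listFiles(
        f -> f.isDirectory() && f.getName().startsWith(dirPrefix));
    if (dirs != null) {
      for (File d : dirs) {
        volumes.add(new LogicalVolume(d, 0)); // real block counts would come from a scan
      }
    }
  }

  /**
   * Called periodically (or on block placement): if every logical volume on
   * this device is above the threshold, create one more storage directory.
   */
  synchronized LogicalVolume maybeAddVolume() throws IOException {
    for (LogicalVolume v : volumes) {
      if (v.blockCount < blocksPerVolumeLimit) {
        return null; // still room somewhere, nothing to do
      }
    }
    File next = new File(physicalRoot, dirPrefix + volumes.size());
    if (!next.mkdirs() && !next.isDirectory()) {
      throw new IOException("Cannot create new storage directory " + next);
    }
    LogicalVolume added = new LogicalVolume(next, 0);
    // The DN would also register a new storage (DataNodeStorageInfo) with the NN here.
    volumes.add(added);
    return added;
  }

  List<LogicalVolume> getVolumes() {
    return volumes;
  }
}
{code}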
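And a similarly hedged sketch of the proposed change to the choosing policy, written against a tiny hypothetical Volume interface instead of the real FsVolumeSpi / RoundRobinVolumeChoosingPolicy classes: round-robin across physical devices as today, then within the chosen device pick the logical volume (storage directory) with the fewest blocks.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical two-level choosing policy: device round-robin, then least blocks. */
public class TwoLevelVolumeChoosingPolicy<V extends TwoLevelVolumeChoosingPolicy.Volume> {

  /** Hypothetical minimal view of a DataNode volume. */
  public interface Volume {
    String getPhysicalDevice();   // e.g. the mount point backing this directory
    long getNumBlocks();          // block count of this logical volume
    long getAvailable() throws IOException;
  }

  private int curDevice = 0;      // round-robin cursor over physical devices

  public synchronized V chooseVolume(List<V> volumes, long replicaSize) throws IOException {
    if (volumes.isEmpty()) {
      throw new IOException("No volumes available");
    }
    // Group the logical volumes by the physical device they live on.
    Map<String, List<V>> byDevice = new HashMap<>();
    for (V v : volumes) {
      byDevice.computeIfAbsent(v.getPhysicalDevice(), k -> new ArrayList<>()).add(v);
    }
    List<String> devices = new ArrayList<>(byDevice.keySet());
    Collections.sort(devices); // deterministic device order for the round-robin cursor

    // Level 1: round-robin over physical devices, skipping devices with no usable volume.
    for (int attempt = 0; attempt < devices.size(); attempt++) {
      String device = devices.get(curDevice % devices.size());
      curDevice = (curDevice + 1) % devices.size();

      // Level 2: least-blocks logical volume on that device with enough free space.
      V best = null;
      for (V v : byDevice.get(device)) {
        if (v.getAvailable() >= replicaSize
            && (best == null || v.getNumBlocks() < best.getNumBlocks())) {
          best = v;
        }
      }
      if (best != null) {
        return best;
      }
    }
    throw new IOException("Out of space: no volume can hold " + replicaSize + " bytes");
  }
}
{code}

In a real implementation the grouping by physical device would presumably be maintained as volumes are added or removed rather than rebuilt on every call; it is recomputed here only to keep the sketch self-contained.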