Yiqun Lin created HDFS-11464:
--------------------------------
Summary: Improve the selection in choosing storage for blocks
Key: HDFS-11464
URL: https://issues.apache.org/jira/browse/HDFS-11464
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: Yiqun Lin
Assignee: Yiqun Lin
Currently, the logic for choosing a storage for a block is poor. It always
uses the first valid storage of the given StorageType (see
{{DataNodeDescriptor#chooseStorage4Block}}), which is not a good selection.
As a result, blocks are always written to the same volume (the first one)
until that volume has no available space. This problem was raised in this
comment (
https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382
)
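To illustrate the behaviour described above, here is a minimal, self-contained sketch. It does not reproduce the actual HDFS code: the {{Storage}} class, the {{chooseFirstValid}} method, and the notion that "valid" means "matching type with enough remaining space" are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class FirstValidStorageDemo {
    enum StorageType { DISK, SSD }

    /** Hypothetical stand-in for one volume on a DataNode (not the real HDFS type). */
    static final class Storage {
        final String id;
        final StorageType type;
        final long remaining; // free bytes on this volume

        Storage(String id, StorageType type, long remaining) {
            this.id = id;
            this.type = type;
            this.remaining = remaining;
        }
    }

    /** Returns the first storage of the requested type with enough free space, or null. */
    static Storage chooseFirstValid(List<Storage> storages, StorageType type, long required) {
        for (Storage s : storages) {
            // Assumed validity check: matching type and enough remaining space.
            if (s.type == type && s.remaining >= required) {
                return s;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Storage> volumes = Arrays.asList(
            new Storage("vol-1", StorageType.DISK, 1000),
            new Storage("vol-2", StorageType.DISK, 1000),
            new Storage("vol-3", StorageType.DISK, 1000));
        // Every call returns vol-1 as long as vol-1 stays valid.
        for (int i = 0; i < 5; i++) {
            System.out.println(chooseFirstValid(volumes, StorageType.DISK, 100).id);
        }
    }
}
```

In this simplified model, vol-2 and vol-3 are never selected while vol-1 still has room, which is the skew described in the issue.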
I propose the following solution:
* First, collect all the valid storages on the node into a collection.
* Then, shuffle these valid storages to get a randomly ordered collection.
* Finally, take the first storage from the shuffled collection.
These steps would be executed in {{DataNodeDescriptor#chooseStorage4Block}},
replacing the current logic. I think this improvement can be done as a subtask
under HDFS-11419. Any further comments are welcome.
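The three steps above could look roughly like the following sketch. It is not a patch against HDFS: the {{Storage}} class, the {{chooseShuffled}} method, its {{Random}} parameter, and the validity check (matching type with enough remaining space) are hypothetical names and assumptions used only to show the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffledStorageDemo {
    enum StorageType { DISK, SSD }

    /** Hypothetical stand-in for one volume on a DataNode (not the real HDFS type). */
    static final class Storage {
        final String id;
        final StorageType type;
        final long remaining; // free bytes on this volume

        Storage(String id, StorageType type, long remaining) {
            this.id = id;
            this.type = type;
            this.remaining = remaining;
        }
    }

    /** Collects the valid storages, shuffles them, and returns the first one (or null). */
    static Storage chooseShuffled(List<Storage> storages, StorageType type,
                                  long required, Random random) {
        // Step 1: extract all valid storages into a new collection.
        List<Storage> valid = new ArrayList<>();
        for (Storage s : storages) {
            if (s.type == type && s.remaining >= required) {
                valid.add(s);
            }
        }
        if (valid.isEmpty()) {
            return null;
        }
        // Step 2: randomize the order of the valid storages.
        Collections.shuffle(valid, random);
        // Step 3: take the first storage from the shuffled collection.
        return valid.get(0);
    }

    public static void main(String[] args) {
        List<Storage> volumes = Arrays.asList(
            new Storage("vol-1", StorageType.DISK, 1000),
            new Storage("vol-2", StorageType.DISK, 1000),
            new Storage("vol-3", StorageType.DISK, 1000),
            new Storage("ssd-1", StorageType.SSD, 1000));
        Random random = new Random();
        // Choices are now spread across the three DISK volumes.
        for (int i = 0; i < 5; i++) {
            System.out.println(chooseShuffled(volumes, StorageType.DISK, 100, random).id);
        }
    }
}
```

Since only the first element of the shuffled list is used, picking a single random index from the valid storages would give the same uniform distribution without shuffling the whole list. The shuffle is kept here only to mirror the steps as proposed.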
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)