[
https://issues.apache.org/jira/browse/HDFS-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997916#comment-15997916
]
Yiqun Lin commented on HDFS-11464:
----------------------------------
After the work in HDFS-9807, the storageID chosen by the NameNode is
passed to the DataNode and can be used in a VolumeChoosingPolicy. However, the
existing VolumeChoosingPolicies currently ignore the chosen storageID in most
cases. If we implement a new policy that respects the storageID, then the
way storage is chosen for blocks in BlockPlacement should also be
improved.
So I'd like to add a new boolean config like {{dfs.datanode.consider.storage}}
to make the BlockPlacementPolicy on the NameNode and the VolumeChoosingPolicy
consistent in the way volumes are chosen. I don't plan to implement a new
storageID-respecting VolumeChoosingPolicy now, but that doesn't affect the
improvement made in this JIRA.
I have attached the updated patch and reopened this JIRA. Any comments are welcome.
Thanks.
> Improve the selection in choosing storage for blocks
> ----------------------------------------------------
>
> Key: HDFS-11464
> URL: https://issues.apache.org/jira/browse/HDFS-11464
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Attachments: HDFS-11464.001.patch
>
>
> Currently the logic for choosing storage for blocks is not good. It
> always uses the first valid storage of a given StorageType (see
> {{DataNodeDescriptor#chooseStorage4Block}}). This is a poor
> selection: blocks will always be written to the same volume (the first
> volume), and the other valid volumes are never chosen. This problem was
> brought up in this comment (
> https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382
> )
> Here is one solution I propose:
> * First, based on the existing storages in one node, extract all the valid
> storages into a collection.
> * Then, shuffle these valid storages to get a new collection.
> * Finally, take the first storage from the new collection.
> These steps will be executed in {{DataNodeDescriptor#chooseStorage4Block}}
> and replace the current logic. I think this improvement can be done as a subtask
> under HDFS-11419. Any further comments are welcome.
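The three steps in the description above can be sketched roughly as follows. This is only an illustration, not the code in the patch: {{StorageChooser}}, {{Storage}}, the {{remaining}} field and the "enough remaining space" validity check are hypothetical stand-ins for the real HDFS types and for whatever checks {{DataNodeDescriptor#chooseStorage4Block}} actually performs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; these are not the real HDFS classes. */
final class StorageChooser {
    enum StorageType { DISK, SSD }

    /** Hypothetical, simplified stand-in for a storage on one DataNode. */
    static final class Storage {
        final String id;
        final StorageType type;
        final long remaining; // free bytes, assumed to be the only validity criterion

        Storage(String id, StorageType type, long remaining) {
            this.id = id;
            this.type = type;
            this.remaining = remaining;
        }
    }

    /**
     * Picks a random valid storage of the requested type instead of
     * always returning the first valid one. Returns null if none is valid.
     */
    static Storage chooseStorage(List<Storage> storages, StorageType type,
                                 long blockSize, Random rng) {
        // Step 1: extract all valid storages into a new collection.
        List<Storage> valid = new ArrayList<>();
        for (Storage s : storages) {
            if (s.type == type && s.remaining >= blockSize) {
                valid.add(s);
            }
        }
        if (valid.isEmpty()) {
            return null;
        }
        // Step 2: shuffle the valid storages.
        Collections.shuffle(valid, rng);
        // Step 3: take the first storage of the shuffled collection.
        return valid.get(0);
    }
}
```

Shuffling a copy leaves the node's own storage list untouched. Since only the first element is used, picking a random index from the valid list would presumably give the same distribution at lower cost.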