[ 
https://issues.apache.org/jira/browse/HDFS-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11464:
-----------------------------
    Description: 
Currently the logic for choosing a storage for blocks is not ideal. It always 
uses the first valid storage of a given StorageType (see 
{{DataNodeDescriptor#chooseStorage4Block}}). This is not a good selection: 
blocks will always be written to the same volume (the first valid one), and 
the other valid volumes are never chosen. This problem was raised in this comment 
( 
https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382
 )

Here is one proposed solution:

* First, from the existing storages on a node, collect all the valid storages 
into a collection.
* Then, shuffle these valid storages to get a randomly ordered collection.
* Finally, pick the first storage from the shuffled collection.
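
The three steps above can be sketched roughly as follows. This is only an 
illustration, not the actual patch: {{StorageChooserSketch}}, {{Storage}} and 
{{chooseStorage}} are hypothetical names, and the notion of "valid" used here 
(healthy, matching type, enough remaining space) is an assumption rather than 
the exact check in HDFS.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class StorageChooserSketch {

  /** Hypothetical stand-in for a storage on one node; not the real HDFS class. */
  static final class Storage {
    final String id;
    final String type;      // e.g. "DISK" or "SSD"
    final long remaining;   // free bytes
    final boolean healthy;

    Storage(String id, String type, long remaining, boolean healthy) {
      this.id = id;
      this.type = type;
      this.remaining = remaining;
      this.healthy = healthy;
    }
  }

  /** Returns a randomly chosen valid storage, or null if none qualifies. */
  static Storage chooseStorage(List<Storage> storages, String type,
                               long blockSize, Random rnd) {
    // Step 1: collect every valid storage (validity rule is an assumption).
    List<Storage> valid = new ArrayList<>();
    for (Storage s : storages) {
      if (s.healthy && s.type.equals(type) && s.remaining >= blockSize) {
        valid.add(s);
      }
    }
    if (valid.isEmpty()) {
      return null;
    }
    // Step 2: shuffle the valid storages into a random order.
    Collections.shuffle(valid, rnd);
    // Step 3: take the first storage of the shuffled collection.
    return valid.get(0);
  }
}
```

Picking a random index into the collection of valid storages would likely give 
the same distribution without the cost of a full shuffle; the sketch keeps the 
shuffle only to mirror the steps as listed.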

These steps would be executed in {{DataNodeDescriptor#chooseStorage4Block}}, 
replacing the current logic. I think this improvement can be done as a subtask 
under HDFS-11419. Any further comments are welcome.


  was:
Currently the logic for choosing a storage for blocks is not ideal. It always 
uses the first valid storage of a given StorageType (see 
{{DataNodeDescriptor#chooseStorage4Block}}). This is not a good selection: 
blocks will always be written to the same volume (the first volume) until 
that volume has no available space. This problem was raised in this 
comment ( 
https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382
 )

Here is one proposed solution:

* First, from the existing storages on a node, collect all the valid storages 
into a collection.
* Then, shuffle these valid storages to get a randomly ordered collection.
* Finally, pick the first storage from the shuffled collection.

These steps would be executed in {{DataNodeDescriptor#chooseStorage4Block}}, 
replacing the current logic. I think this improvement can be done as a subtask 
under HDFS-11419. Any further comments are welcome.



> Improve the selection in choosing storage for blocks
> ----------------------------------------------------
>
>                 Key: HDFS-11464
>                 URL: https://issues.apache.org/jira/browse/HDFS-11464
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
