[
https://issues.apache.org/jira/browse/HDFS-11530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975343#comment-15975343
]
Chen Liang edited comment on HDFS-11530 at 4/19/17 7:25 PM:
------------------------------------------------------------
Thanks [~linyiqun] for the updated patch!
Regarding the earlier discussion about point #1 that [~arpitagarwal] made: I
think the suspicious part is the following (please correct me if I make any
wrong statement here):
the original logic:
# find a random node (if no node found, there is no node available at all,
break the while loop)
# loop through all the storage requirements to see if the found node can
satisfy any of them (if it satisfies any of the storage types,
{{numOfReplicas--;}} and proceed)
the new logic from the patch:
# loop through all the storage types; for each type, try to find a node; as
soon as it successfully finds a node for some storage type (say X), it breaks
# loop through all the storage requirements to see if the found node can
satisfy any of them (if it satisfies any of the storage types,
{{numOfReplicas--;}} and proceed)
There are two things to note about this difference:
# since we now pick a node based on storage type (instead of blindly picking a
node by chance), we probably don't need step #2 of the original logic, which
loops through all the storage requirements, i.e. the
{code}
for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
.entrySet().iterator(); iter.hasNext();) {...}
{code}
loop seems a bit unnecessary in the new logic: since we know the node was
chosen precisely because it has type X, we can probably just check for storage
type X.
# in the old logic, when the while loop exits, all storage type requirements
have been satisfied, unless there is no node left to pick at all (in which case
it breaks without bringing {{numOfReplicas}} down to 0). We should be careful
about what the proper behaviour is in the new logic.
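Both points can be illustrated with one hedged sketch (illustrative names, not the patch itself): step #2 shrinks to a direct check on the type X the node was picked for, and the caller must still handle the exit where no node was found and {{numOfReplicas}} never reaches 0:

```java
import java.util.*;

// Illustrative sketch only (not the actual patch code).
public class SimplifiedCheckSketch {

  // chosenType is the storage type X the node was selected for, or null
  // if no node could be found for any required type.
  static int applyChoice(Map<String, Integer> storageTypes,
                         String chosenType, int numOfReplicas) {
    if (chosenType == null) {
      // No node available at all: mirror the old while loop's early
      // break, leaving numOfReplicas above 0 for the caller to handle.
      return numOfReplicas;
    }
    // The node was chosen because it has type X, so check type X
    // directly instead of iterating over every storage requirement.
    int remaining = storageTypes.get(chosenType);
    if (remaining == 1) {
      storageTypes.remove(chosenType);      // requirement fully met
    } else {
      storageTypes.put(chosenType, remaining - 1);
    }
    return numOfReplicas - 1;
  }
}
```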
> Use HDFS specific network topology to choose datanode in
> BlockPlacementPolicyDefault
> ------------------------------------------------------------------------------------
>
> Key: HDFS-11530
> URL: https://issues.apache.org/jira/browse/HDFS-11530
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Affects Versions: 3.0.0-alpha2
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Attachments: HDFS-11530.001.patch, HDFS-11530.002.patch,
> HDFS-11530.003.patch, HDFS-11530.004.patch, HDFS-11530.005.patch,
> HDFS-11530.006.patch, HDFS-11530.007.patch, HDFS-11530.008.patch,
> HDFS-11530.009.patch, HDFS-11530.010.patch, HDFS-11530.011.patch,
> HDFS-11530.012.patch
>
>
> The work for {{chooseRandomWithStorageType}} has been merged in HDFS-11482.
> But this method lives in the new topology class {{DFSNetworkTopology}}, which
> is specific to HDFS. We should update {{BlockPlacementPolicyDefault}} to use
> the new method, since the original approach is inefficient.