[
https://issues.apache.org/jira/browse/HDFS-11530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975343#comment-15975343
]
Chen Liang edited comment on HDFS-11530 at 4/19/17 7:25 PM:
------------------------------------------------------------
Thanks [~linyiqun] for the updated patch!
Regarding the earlier discussion about point #1 that [~arpitagarwal] made: I
think the suspicious part is the following (please correct me if I make any
wrong statement here):
the original logic:
# find a random node (if no node found, there is no node available at all,
break the while loop)
# loop through all the storage requirements to see if the found node can
satisfy any of them (if it satisfies any of the storage types,
{{numOfReplicas--;}} and proceed)
the new logic from the patch:
# loop through all the storage types; for each type, try to find a node; as
soon as it successfully finds a node for some storage type (say X), it breaks
# loop through all the storage requirements to see if the found node can
satisfy any of them (if it satisfies any of the storage types,
{{numOfReplicas--;}} and proceed)
There are two things to note about this difference:
# since we now pick a node based on storage type (instead of blindly picking a
node by chance), we probably don't need step #2 of the original logic, which
loops through all the storage requirements, i.e. the
{code}
for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
.entrySet().iterator(); iter.hasNext();) {...}
{code}
loop seems a bit unnecessary in the new logic: since we know the node was
chosen precisely because it has type X, we can probably just check for storage
type X.
# in the old logic, when the while loop exits, all storage type requirements
have been satisfied, unless there is no node left to pick at all (in which case
it breaks without bringing {{numOfReplicas}} down to 0). We should be careful
about what the proper behaviour is in the new logic.
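Both points can be illustrated with one hedged sketch (illustrative names, not the patch itself): step #2 shrinks to a direct check on the type X the node was picked for, and the caller must still handle the exit where no node was found and {{numOfReplicas}} never reaches 0:

```java
import java.util.*;

// Illustrative sketch only (not the actual patch code).
public class SimplifiedCheckSketch {

  // chosenType is the storage type X the node was selected for, or null
  // if no node could be found for any required type.
  static int applyChoice(Map<String, Integer> storageTypes,
                         String chosenType, int numOfReplicas) {
    if (chosenType == null) {
      // No node available at all: mirror the old while loop's early
      // break, leaving numOfReplicas above 0 for the caller to handle.
      return numOfReplicas;
    }
    // The node was chosen because it has type X, so check type X
    // directly instead of iterating over every storage requirement.
    int remaining = storageTypes.get(chosenType);
    if (remaining == 1) {
      storageTypes.remove(chosenType);      // requirement fully met
    } else {
      storageTypes.put(chosenType, remaining - 1);
    }
    return numOfReplicas - 1;
  }
}
```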
> Use HDFS specific network topology to choose datanode in
> BlockPlacementPolicyDefault
> ------------------------------------------------------------------------------------
>
> Key: HDFS-11530
> URL: https://issues.apache.org/jira/browse/HDFS-11530
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Affects Versions: 3.0.0-alpha2
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Attachments: HDFS-11530.001.patch, HDFS-11530.002.patch,
> HDFS-11530.003.patch, HDFS-11530.004.patch, HDFS-11530.005.patch,
> HDFS-11530.006.patch, HDFS-11530.007.patch, HDFS-11530.008.patch,
> HDFS-11530.009.patch, HDFS-11530.010.patch, HDFS-11530.011.patch,
> HDFS-11530.012.patch
>
>
> The work for {{chooseRandomWithStorageType}} has been merged in HDFS-11482.
> But this method lives in the new topology class {{DFSNetworkTopology}}, which
> is specific to HDFS. We should update {{BlockPlacementPolicyDefault}} to use
> the new method, since the original approach is inefficient.