[ 
https://issues.apache.org/jira/browse/HDFS-15014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166918#comment-17166918
 ] 

Fengnan Li commented on HDFS-15014:
-----------------------------------

[~csun] Can we close this one now that HDFS-15417 is closed?

> RBF: WebHdfs chooseDatanode shouldn't call getDatanodeReport 
> -------------------------------------------------------------
>
>                 Key: HDFS-15014
>                 URL: https://issues.apache.org/jira/browse/HDFS-15014
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: rbf
>            Reporter: Chao Sun
>            Priority: Major
>
> Currently the {{chooseDatanode}} call (which is shared by {{open}}, 
> {{create}}, {{append}} and {{getFileChecksum}}) in RBF WebHDFS calls 
> {{getDatanodeReport}} from ALL downstream namenodes:
> {code}
>   private DatanodeInfo chooseDatanode(final Router router,
>       final String path, final HttpOpParam.Op op, final long openOffset,
>       final String excludeDatanodes) throws IOException {
>     // We need to get the DNs as a privileged user
>     final RouterRpcServer rpcServer = getRPCServer(router);
>     UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
>     RouterRpcServer.setCurrentUser(loginUser);
>     DatanodeInfo[] dns = null;
>     try {
>       dns = rpcServer.getDatanodeReport(DatanodeReportType.LIVE);
>     } catch (IOException e) {
>       LOG.error("Cannot get the datanodes from the RPC server", e);
>     } finally {
>       // Reset ugi to remote user for remaining operations.
>       RouterRpcServer.resetCurrentUser();
>     }
>     HashSet<Node> excludes = new HashSet<Node>();
>     if (excludeDatanodes != null) {
>       Collection<String> collection =
>           getTrimmedStringCollection(excludeDatanodes);
>       for (DatanodeInfo dn : dns) {
>         if (collection.contains(dn.getName())) {
>           excludes.add(dn);
>         }
>       }
>     }
> ...
> {code}
> The {{getDatanodeReport}} call is very expensive (particularly in a large 
> cluster) as it needs to lock the {{DatanodeManager}}, which is also shared by 
> calls such as heartbeat processing. Check HDFS-14366 for a similar issue.
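
One general way to reduce this kind of cost is to reuse a recent report for a short time instead of fetching it on every request. The following is only a minimal sketch of that idea, not the router's actual code: {{CachedReport}}, its TTL parameter and the injectable clock are hypothetical names introduced for illustration, and the loader stands in for whatever expensive call produces the report.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a Hadoop class): keeps the result of an
 * expensive call and reloads it only after a fixed time-to-live.
 */
final class CachedReport<T> {
  private final Supplier<T> loader;  // the expensive call, e.g. fetching a datanode report
  private final long ttlMillis;      // how long a loaded value is considered fresh
  private final LongSupplier clock;  // time source in milliseconds, injectable for tests
  private T value;
  private long loadedAt;
  private boolean loaded;

  CachedReport(Supplier<T> loader, long ttlMillis, LongSupplier clock) {
    this.loader = loader;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /** Returns the cached value, reloading it first if it is missing or expired. */
  synchronized T get() {
    long now = clock.getAsLong();
    if (!loaded || now - loadedAt >= ttlMillis) {
      value = loader.get();
      loadedAt = now;
      loaded = true;
    }
    return value;
  }
}
```

With a wrapper like this, callers that share one instance trigger the expensive call at most once per TTL window. The trade-off is staleness: a datanode that joins or dies within the window is not reflected until the next reload, so the TTL would need to be chosen with that in mind.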



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
