[ https://issues.apache.org/jira/browse/HDFS-14774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919011#comment-16919011 ]
CR Hota edited comment on HDFS-14774 at 8/29/19 10:32 PM:
----------------------------------------------------------

[~jojochuang] Thanks for reporting this. This is acceptable as it stands. The reason is that the router has two layers: a server facing external clients, and a client facing the downstream namenodes. The client to the downstream namenodes (the RouterRpcClient) is configured to retry multiple times based on failures from a downstream namenode, and it also has logic to fail over and try the standby namenode if the standby becomes active. So retries do happen before {{dns}} comes back as null, and if it does come back as null, the parent method throws an appropriate IOException:

{code:java}
if (dn == null) {
  throw new IOException("Failed to find datanode, suggest to check cluster"
      + " health. excludeDatanodes=" + excludeDatanodes);
}
{code}

Let me know if this helps?
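To illustrate the two-layer behavior described above, here is a minimal, hypothetical sketch of the retry-with-failover pattern (the names {{invokeWithFailover}}, {{RetrySketch}}, and the generic call interface are invented for illustration and are not the actual RouterRpcClient API): each target namenode is tried up to a configured number of times, and an IOException only surfaces once every target has been exhausted.

```java
import java.io.IOException;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of retry-with-failover: try each target in
// priority order, retrying per target, and only throw once every
// target has failed. Not the real RouterRpcClient implementation.
public class RetrySketch {
    public static <T, R> R invokeWithFailover(
            List<T> targets, Function<T, R> call, int retriesPerTarget)
            throws IOException {
        IOException lastFailure = null;
        for (T target : targets) {
            // One initial attempt plus retriesPerTarget retries.
            for (int attempt = 0; attempt <= retriesPerTarget; attempt++) {
                try {
                    return call.apply(target);
                } catch (RuntimeException e) {
                    lastFailure = new IOException(
                        "Call failed against " + target, e);
                }
            }
            // All attempts against this target failed; fail over to next.
        }
        throw lastFailure != null
            ? lastFailure
            : new IOException("No targets available");
    }
}
```

Under this sketch, a caller only sees an exception after both the active and the standby (and any retries) have failed, which matches why {{dns}} rarely ends up null in practice.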
> RBF: Improve RouterWebhdfsMethods#chooseDatanode() error handling
> -----------------------------------------------------------------
>
>                 Key: HDFS-14774
>                 URL: https://issues.apache.org/jira/browse/HDFS-14774
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Wei-Chiu Chuang
>            Assignee: CR Hota
>            Priority: Minor
>
> HDFS-13972 added the following code:
> {code}
> try {
>   dns = rpcServer.getDatanodeReport(DatanodeReportType.LIVE);
> } catch (IOException e) {
>   LOG.error("Cannot get the datanodes from the RPC server", e);
> } finally {
>   // Reset ugi to remote user for remaining operations.
>   RouterRpcServer.resetCurrentUser();
> }
> HashSet<Node> excludes = new HashSet<Node>();
> if (excludeDatanodes != null) {
>   Collection<String> collection =
>       getTrimmedStringCollection(excludeDatanodes);
>   for (DatanodeInfo dn : dns) {
>     if (collection.contains(dn.getName())) {
>       excludes.add(dn);
>     }
>   }
> }
> {code}
> If {{rpcServer.getDatanodeReport()}} throws an exception, {{dns}} will become null. This doesn't look like the best way to handle the exception. Should the router retry upon exception? Does it perform retry automatically under the hood?
>
> [~crh] [~brahmareddy]

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
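The concern in the description is that the catch block logs the failure and then lets {{dns}} stay null, so the later loop would throw a NullPointerException. A minimal, hypothetical sketch of the alternative (simplified types; {{ReportSource}}, {{buildExcludes}}, and {{getLiveDatanodes}} are invented stand-ins, not the actual HDFS API) is to rethrow so the caller sees the real cause:

```java
import java.io.IOException;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified sketch: rethrow when the datanode report
// fails instead of continuing with a null array, so the caller gets
// the underlying IOException rather than a later NullPointerException.
public class ChooseDatanodeSketch {
    interface ReportSource {
        String[] getLiveDatanodes() throws IOException;
    }

    static Set<String> buildExcludes(ReportSource rpc,
            Collection<String> excludeDatanodes) throws IOException {
        String[] dns;
        try {
            dns = rpc.getLiveDatanodes();
        } catch (IOException e) {
            // Surface the failure instead of leaving dns == null.
            throw new IOException(
                "Cannot get the datanodes from the RPC server", e);
        }
        Set<String> excludes = new HashSet<>();
        if (excludeDatanodes != null) {
            for (String dn : dns) {
                if (excludeDatanodes.contains(dn)) {
                    excludes.add(dn);
                }
            }
        }
        return excludes;
    }
}
```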