[jira] [Commented] (HDFS-14774) RBF: Improve RouterWebhdfsMethods#chooseDatanode() error handling

2019-09-09 Thread CR Hota (Jira)


[ https://issues.apache.org/jira/browse/HDFS-14774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926050#comment-16926050 ]

CR Hota commented on HDFS-14774:


Hey [~jojochuang], 

Do you have any follow-up questions, or shall we close this?

> RBF: Improve RouterWebhdfsMethods#chooseDatanode() error handling
> -----------------------------------------------------------------
>
> Key: HDFS-14774
> URL: https://issues.apache.org/jira/browse/HDFS-14774
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: CR Hota
>Priority: Minor
>
>  HDFS-13972 added the following code:
> {code}
> try {
>   dns = rpcServer.getDatanodeReport(DatanodeReportType.LIVE);
> } catch (IOException e) {
>   LOG.error("Cannot get the datanodes from the RPC server", e);
> } finally {
>   // Reset ugi to remote user for remaining operations.
>   RouterRpcServer.resetCurrentUser();
> }
> HashSet<DatanodeInfo> excludes = new HashSet<>();
> if (excludeDatanodes != null) {
>   Collection<String> collection =
>   getTrimmedStringCollection(excludeDatanodes);
>   for (DatanodeInfo dn : dns) {
> if (collection.contains(dn.getName())) {
>   excludes.add(dn);
> }
>   }
> }
> {code}
> If {{rpcServer.getDatanodeReport()}} throws an exception, {{dns}} will become 
> null. This doesn't look like the best way to handle the exception. Should the 
> router retry upon exception? Does it perform the retry automatically under 
> the hood?
> [~crh] [~brahmareddy]





[jira] [Commented] (HDFS-14774) RBF: Improve RouterWebhdfsMethods#chooseDatanode() error handling

2019-08-29 Thread CR Hota (Jira)


[ https://issues.apache.org/jira/browse/HDFS-14774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919011#comment-16919011 ]

CR Hota commented on HDFS-14774:


[~jojochuang] Thanks for reporting this.

This is okay at this point. The router has two layers: a server facing 
external clients, and a client to the downstream namenodes. The client layer 
(aka RouterRpcClient) is configured to retry multiple times on failures from a 
downstream namenode, and it also has logic to fail over and try the standby 
namenode if the standby becomes active, etc. So yes, retries do happen before 
dns comes back as null.
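For illustration, here is a minimal self-contained sketch of the null guard being discussed, with stand-in types rather than the actual Router classes (the Node class below is hypothetical, not Hadoop's):

{code:java}
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class NullGuardSketch {

  // Hypothetical stand-in for DatanodeInfo; only the name matters here.
  static class Node {
    private final String name;
    Node(String name) { this.name = name; }
    String getName() { return name; }
  }

  // Builds the exclude set, tolerating a null datanode report
  // (the case this JIRA raises) instead of risking an NPE.
  static HashSet<Node> buildExcludes(List<Node> dns, Collection<String> excluded) {
    HashSet<Node> excludes = new HashSet<>();
    if (dns == null || excluded == null) {
      return excludes; // nothing to filter against
    }
    for (Node dn : dns) {
      if (excluded.contains(dn.getName())) {
        excludes.add(dn);
      }
    }
    return excludes;
  }

  public static void main(String[] args) {
    List<Node> report = List.of(new Node("dn1:9866"), new Node("dn2:9866"));
    System.out.println(buildExcludes(report, List.of("dn2:9866")).size()); // prints 1
    System.out.println(buildExcludes(null, List.of("dn2:9866")).size());   // prints 0, no NPE
  }
}
{code}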

And if it does come back as null, the parent method sends back an appropriate 
IOException:
{code:java}
if (dn == null) {
  throw new IOException("Failed to find datanode, suggest to check cluster"
      + " health. excludeDatanodes=" + excludeDatanodes);
}
{code}
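For completeness, a sketch of how a WebHDFS client would observe that failure; the endpoint host and port below are hypothetical, while FileSystem.get() and the webhdfs:// scheme are standard Hadoop API:

{code:java}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WebHdfsReadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical Router WebHDFS endpoint; substitute your own cluster's.
    FileSystem fs = FileSystem.get(URI.create("webhdfs://router-host:50071"), conf);
    try (FSDataInputStream in = fs.open(new Path("/tmp/example.txt"))) {
      System.out.println(in.read());
    } catch (IOException e) {
      // If no datanode could be chosen, the "Failed to find datanode..."
      // message above is roughly what surfaces here.
      System.err.println("Read failed: " + e.getMessage());
    }
  }
}
{code}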
Let me know if this helps?