[ 
https://issues.apache.org/jira/browse/HDFS-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651791#comment-16651791
 ] 

Adam Antal edited comment on HDFS-13985 at 10/16/18 2:23 PM:
-------------------------------------------------------------

It looks like there is no way to obtain that information around the NN 
(DataXceiver and surrounding classes). For a use case as rare as this exception, 
using higher-level NN-DN communication such as RPCs would cause more smoke than 
fire.

That said, I could imagine sending a message that informs the NN about the 
replicas that were not found, so that the NN could provide the associated 
metadata (filename, replication factor, block location). This would happen only 
when the exception is thrown, so it would not otherwise increase traffic between 
the NN and the DNs, and in that case it would make the investigation a lot easier.
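
To make the idea a bit more concrete, here is a minimal sketch of what an enriched message could look like. It is not the actual Hadoop code: the class ReplicaNotFoundSketch, the BlockMetadata holder, and the buildMessage helper are hypothetical names, and the sample block ID, path, and DN addresses are made up. It only assumes that the NN-provided metadata (if any) is available at the point where the message is built, and it falls back to the current "Replica not found for ..." text when it is not.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; this is not the real ReplicaNotFoundException.
public class ReplicaNotFoundSketch {

  /** Metadata the NN could supply for a block (all fields are assumptions). */
  public static final class BlockMetadata {
    final String fileName;
    final short replication;
    final List<String> locations;

    public BlockMetadata(String fileName, short replication,
        List<String> locations) {
      this.fileName = fileName;
      this.replication = replication;
      this.locations = locations;
    }
  }

  /**
   * Builds the exception message. When no metadata is available the
   * message is the same as the existing "Replica not found for <block>".
   */
  public static String buildMessage(String block, BlockMetadata meta) {
    StringBuilder sb = new StringBuilder("Replica not found for ").append(block);
    if (meta != null) {
      sb.append(" (file=").append(meta.fileName)
          .append(", replication=").append(meta.replication)
          .append(", locations=").append(meta.locations)
          .append(')');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Made-up sample values for illustration.
    BlockMetadata meta = new BlockMetadata("/user/example/data.txt", (short) 3,
        Arrays.asList("dn1:9866", "dn2:9866"));
    System.out.println(buildMessage("BP-1:blk_1073741825_1001", meta));
    System.out.println(buildMessage("BP-1:blk_1073741825_1001", null));
  }
}
```

Keeping the metadata optional means the message degrades to today's text whenever the NN cannot be reached or has no record of the block.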


> Clearer error message for ReplicaNotFoundException
> --------------------------------------------------
>
>                 Key: HDFS-13985
>                 URL: https://issues.apache.org/jira/browse/HDFS-13985
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Adam Antal
>            Priority: Major
>
> We came across a ReplicaNotFoundException in a bug report, and the most 
> informative thing we could get was "Replica not found for [ExtendedBlock]". 
> Anyone investigating cases involving ReplicaNotFoundExceptions has to review 
> diagnostic bundles and dig through logs, so enhancing the exception message 
> would speed up this process as a starting point and be beneficial in the 
> long run.
> More concretely, it would be helpful if any of the following information were 
> displayed along with the exception: the file's name, replication factor, or 
> block location.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
