[ 
https://issues.apache.org/jira/browse/HDFS-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-6478:
--------------------------

    Attachment: HDFS-6478.patch

The patch contains the following changes:

1. Modify the proxy chain order for NamenodeProtocol and ClientProtocol so that 
NamenodeProtocolTranslatorPB/ClientNamenodeProtocolTranslatorPB directly call 
NamenodeProtocolPB and ClientNamenodeProtocolPB in the non-HA case.
2. Update unit test TestFileCreation to verify the retry count. This depends on 
HADOOP-10673, so the patch also includes HADOOP-10673 so that it can be 
submitted to run the unit tests.
3. Simplify the remoteException policy setup in NameNodeProxies.
4. Remove unnecessary retry policy for method "create" in 
DatanodeProtocolClientSideTranslatorPB.
5. DatanodeProtocolClientSideTranslatorPB still has the old proxy order. Leave 
it as it is, given that DataNodeProtocol doesn't do retry. We can open a 
separate jira for DataNodeProtocol retry if that is necessary.
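
To illustrate why the proxy order in item 1 matters, here is a minimal, 
self-contained sketch that uses plain java.lang.reflect.Proxy instead of the 
real Hadoop classes. Every name in it (ProxyOrderSketch, RpcStub, ClientApi, 
WrappedServiceException, RetriableRemoteException, withRetry, translator, 
flakyStub) is a hypothetical stand-in, not code from the patch, and the retry 
rule is an assumed simplification of a real retry policy.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

public class ProxyOrderSketch {
    /** Hypothetical stand-in for the exception the server really reported. */
    public static class RetriableRemoteException extends IOException {
        public RetriableRemoteException(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for the wrapper thrown by the RPC layer. */
    public static class WrappedServiceException extends Exception {
        public WrappedServiceException(Throwable cause) { super(cause); }
    }

    /** Wire-level interface: failures always arrive wrapped. */
    public interface RpcStub {
        String getFileInfo(String path) throws WrappedServiceException;
    }

    /** Client-facing interface: failures arrive unwrapped. */
    public interface ClientApi {
        String getFileInfo(String path) throws IOException;
    }

    /** Retry proxy whose (assumed) policy only recognises the unwrapped exception. */
    static <T> T withRetry(Class<T> iface, T target, int maxRetries, AtomicInteger retries) {
        InvocationHandler handler = (proxy, method, args) -> {
            for (int attempt = 0; ; attempt++) {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    Throwable cause = e.getCause();
                    if (cause instanceof RetriableRemoteException && attempt < maxRetries) {
                        retries.incrementAndGet();
                        continue; // policy matched: try the call again
                    }
                    throw cause; // policy did not match: fail immediately
                }
            }
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }

    /** Translator that unwraps the RPC-layer exception for its caller. */
    static ClientApi translator(RpcStub stub) {
        return path -> {
            try {
                return stub.getFileInfo(path);
            } catch (WrappedServiceException e) {
                if (e.getCause() instanceof IOException) {
                    throw (IOException) e.getCause();
                }
                throw new IOException(e);
            }
        };
    }

    /** Stub that fails the first {@code failures} calls, then succeeds. */
    static RpcStub flakyStub(int failures) {
        AtomicInteger calls = new AtomicInteger();
        return path -> {
            if (calls.getAndIncrement() < failures) {
                throw new WrappedServiceException(new RetriableRemoteException("try again"));
            }
            return "status of " + path;
        };
    }

    /** Old non-HA order: translator -> retry handler -> stub. */
    public static int retriesWithOldOrder() {
        AtomicInteger retries = new AtomicInteger();
        ClientApi client = translator(withRetry(RpcStub.class, flakyStub(2), 3, retries));
        try {
            client.getFileInfo("/a");
        } catch (IOException expected) {
            // The retry handler saw only the wrapper, so the first failure surfaces here.
        }
        return retries.get();
    }

    /** Proposed order: retry handler -> translator -> stub. */
    public static int retriesWithNewOrder() {
        AtomicInteger retries = new AtomicInteger();
        ClientApi client = withRetry(ClientApi.class, translator(flakyStub(2)), 3, retries);
        try {
            client.getFileInfo("/a");
        } catch (IOException e) {
            throw new IllegalStateException("call should succeed after retries", e);
        }
        return retries.get();
    }

    public static void main(String[] args) {
        System.out.println("old order retries: " + retriesWithOldOrder());
        System.out.println("new order retries: " + retriesWithNewOrder());
    }
}
```

In this sketch the retry handler in the old order only ever sees the wrapper 
exception, so its policy never matches and no retry happens; with the retry 
handler outside the translator, the same two transient failures are retried 
and the call succeeds.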

> RemoteException can't be retried properly for non-HA scenario
> -------------------------------------------------------------
>
>                 Key: HDFS-6478
>                 URL: https://issues.apache.org/jira/browse/HDFS-6478
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HDFS-6478.patch
>
>
> For the HA case, the call stack is DFSClient -> RetryInvocationHandler -> 
> ClientNamenodeProtocolTranslatorPB -> ProtobufRpcEngine. ProtobufRpcEngine 
> throws ServiceException and expects the caller to unwrap it; 
> ClientNamenodeProtocolTranslatorPB is the component that takes care of 
> that.
> {noformat}
>         at org.apache.hadoop.ipc.Client.call
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
>         at com.sun.proxy.$Proxy26.getFileInfo
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo
>         at sun.reflect.GeneratedMethodAccessor24.invoke
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke
>         at java.lang.reflect.Method.invoke
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
>         at com.sun.proxy.$Proxy27.getFileInfo
>         at org.apache.hadoop.hdfs.DFSClient.getFileInfo
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus
> {noformat}
> However, for the non-HA case, the call stack is DFSClient -> 
> ClientNamenodeProtocolTranslatorPB -> RetryInvocationHandler -> 
> ProtobufRpcEngine. RetryInvocationHandler receives the still-wrapped 
> ServiceException, so the call can't be retried properly.
> {noformat}
> at org.apache.hadoop.ipc.Client.call
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke
> at com.sun.proxy.$Proxy9.getListing
> at sun.reflect.NativeMethodAccessorImpl.invoke0
> at sun.reflect.NativeMethodAccessorImpl.invoke
> at sun.reflect.DelegatingMethodAccessorImpl.invoke
> at java.lang.reflect.Method.invoke
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke
> at com.sun.proxy.$Proxy9.getListing
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing
> at org.apache.hadoop.hdfs.DFSClient.listPaths
> {noformat}
> Perhaps we can fix it by having NN wrap RetryInvocationHandler around 
> ClientNamenodeProtocolTranslatorPB and other PBs, instead of the current wrap 
> order.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
