[
https://issues.apache.org/jira/browse/HDFS-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892062#comment-15892062
]
Yongtao Yang edited comment on HDFS-8409 at 3/3/17 7:12 AM:
------------------------------------------------------------
I found a clue that may be useful for this problem. {{DFSClient#namenode}} is a
proxy whose {{InvocationHandler}} is {{RetryInvocationHandler}}. If
{{namenode.toString()}}, {{namenode.hashCode()}}, or any other non-RPC method
is called, the next real RPC method will fail at
{{setCallIdAndRetryCount()}}. The cause is that {{toString()}} is also
forwarded to {{RetryInvocationHandler.invoke()}}, where
{{setCallIdAndRetryCount()}} is executed, so
{{ipc.Client#callId.get()}} is no longer {{null}}. For real RPC
methods, {{ipc.Client#callId}} is reset to {{null}} when a {{Call}} instance
is created ({{org.apache.hadoop.ipc.Client.Call.Call(RpcKind,
Writable)}}), but {{toString()}} and {{hashCode()}} never reach the
{{Call}} constructor, so {{ipc.Client#callId}} is never reset to
{{null}} and the next method (whether it is a real RPC or not) will fail.
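The sequence can be reproduced outside Hadoop with a small stand-alone sketch. Nothing below is Hadoop code: {{CallIdLeakDemo}}, {{FakeProtocol}} and {{FakeRetryHandler}} are hypothetical names, the thread-local only imitates {{ipc.Client#callId}}, and clearing it inline stands in for what the {{Call}} constructor is described as doing above.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class CallIdLeakDemo {
    // Hypothetical stand-in for ipc.Client#callId.
    static final ThreadLocal<Integer> CALL_ID = new ThreadLocal<>();

    // Imitates the precondition quoted in the issue description.
    static void setCallIdAndRetryCount(int cid) {
        if (CALL_ID.get() != null) {
            throw new IllegalStateException();
        }
        CALL_ID.set(cid);
    }

    // Hypothetical one-method protocol; the real proxy implements ClientProtocol.
    public interface FakeProtocol {
        String getFileInfo(String src);
    }

    // Simplified handler: it sets the call id for every method it receives.
    public static class FakeRetryHandler implements InvocationHandler {
        private final FakeProtocol target;
        private int nextId = 0;

        public FakeRetryHandler(FakeProtocol target) {
            this.target = target;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            setCallIdAndRetryCount(nextId++);
            if (method.getDeclaringClass() == Object.class) {
                // toString()/hashCode()/equals(): no Call is created, so the
                // thread-local call id is left set.
                return method.invoke(target, args);
            }
            // Real RPC: assume creating the Call consumes and clears the id.
            CALL_ID.remove();
            return method.invoke(target, args);
        }
    }

    // Returns true if the RPC throws IllegalStateException.
    public static boolean rpcFailsAfter(boolean callToStringFirst) {
        CALL_ID.remove();
        FakeProtocol namenode = (FakeProtocol) Proxy.newProxyInstance(
                FakeProtocol.class.getClassLoader(),
                new Class<?>[] {FakeProtocol.class},
                new FakeRetryHandler(src -> "info:" + src));
        try {
            if (callToStringFirst) {
                namenode.toString(); // forwarded to invoke(), leaks the call id
            }
            namenode.getFileInfo("/tmp");
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            CALL_ID.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println("RPC alone fails: " + rpcFailsAfter(false));
        System.out.println("RPC after toString() fails: " + rpcFailsAfter(true));
    }
}
```

In this sketch the RPC succeeds on its own and fails only when {{toString()}} was called on the proxy first, which matches the behaviour described above. Whether the real fix should skip {{setCallIdAndRetryCount()}} for {{Object}} methods or clear the value elsewhere is left open here.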
> HDFS client RPC call throws "java.lang.IllegalStateException"
> -------------------------------------------------------------
>
> Key: HDFS-8409
> URL: https://issues.apache.org/jira/browse/HDFS-8409
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Reporter: Juan Yu
> Assignee: Juan Yu
> Attachments: HDFS-8409.001.patch, HDFS-8409.002.patch,
> HDFS-8409.003.patch
>
>
> When an HDFS client RPC call needs to retry, it sometimes throws
> "java.lang.IllegalStateException"; the retry is aborted and the client
> call fails.
> {code}
> Caused by: java.lang.IllegalStateException
> at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at org.apache.hadoop.ipc.Client.setCallIdAndRetryCount(Client.java:116)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:99)
> at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1912)
> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1089)
> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
> {code}
> Here is the check that throws the exception:
> {code}
> public static void setCallIdAndRetryCount(int cid, int rc) {
> ...
> Preconditions.checkState(callId.get() == null);
> }
> {code}
> The RetryInvocationHandler calls it while callId is still non-null, which
> causes the exception.