[
https://issues.apache.org/jira/browse/HDFS-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044978#comment-15044978
]
Brahma Reddy Battula commented on HDFS-9513:
--------------------------------------------
This issue was already reported in HDFS-6481, which now throws
HadoopIllegalArgumentException instead of ArrayIndexOutOfBoundsException.
I think it would be good to provide a way to work around this problem for older
clients rather than rejecting their requests outright, considering that the
cluster size (8000+) mentioned here makes upgrading hard.
The provided patch might work as a workaround for older clients. Any thoughts,
[~szetszwo]/[~arpitagarwal]?
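For context, a minimal sketch of how an old client trips this (assuming the 2.7.x
{{DatanodeManager#getDatanodeStorageInfos(..)}} indexes the client-supplied
storageIDs per datanode; the exact code may differ):
{code}
// Sketch only: an updatePipeline() call from a 2.2.0 client carries datanode IDs
// but an empty storageIDs array, so indexing storageIDs[i] fails immediately.
final DatanodeStorageInfo[] storages = new DatanodeStorageInfo[datanodeID.length];
for (int i = 0; i < datanodeID.length; i++) {
  final DatanodeDescriptor dd = getDatanode(datanodeID[i]);
  storages[i] = dd.getStorageInfo(storageIDs[i]); // ArrayIndexOutOfBoundsException: 0
}
{code}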
[~Deng FEI], I have a few comments about the patch.
1. DatanodeDescriptor#getPerferedStorageInfo() might not be required. Instead, one
dummy storage, perhaps the first storage of the DN, can be chosen.
Currently {{DatanodeManager#getDatanodeStorageInfos(..)}} is used by 3 calls:
getAdditionalDatanode(), updatePipeline() and commitBlockSynchronization(). All of
these operations might hit the same problem with old clients. Since old clients
don't understand storages, we need not worry about which storage is selected.
a. getAdditionalDatanode() and updatePipeline() are called from clients directly,
but only the updatePipeline() result is stored as-is. Even then, once the
incremental block reports are received, these storages will be updated with the
proper details in the namenode.
b. commitBlockSynchronization() comes from datanodes, and even in that case it
should not need this workaround since it will have new targets.
The updated code could look like this:
{code}
if (storageIDs == null || storageIDs.length == 0) { // request from an old client
  // Choose the first storage of the DN as a dummy, to support writes from
  // old clients, as a workaround for HDFS-9513.
  // Later, when the block report comes, the actual storage will be
  // restored in the blocks map.
  storages[i] = dd.getStorageInfos()[0];
} else {
  storages[i] = dd.getStorageInfo(storageIDs[i]);
}
{code}
2. Also, maybe we can add a config for this workaround and use it only when
required.
3. Can you post a patch for trunk and the latest branch-2.7, with HDFS-6481
included? We can throw an exception when the config is disabled or the request is
not from an old client, as sketched below.
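A rough sketch of how points 2 and 3 could fit together in
{{DatanodeManager#getDatanodeStorageInfos(..)}}; the config key name below is only
a placeholder, not an existing property, and the exact shape would follow the
HDFS-6481 check:
{code}
// Sketch only, not the actual patch. Placeholder config key.
private final boolean allowLegacyClientWorkaround = conf.getBoolean(
    "dfs.namenode.legacy-client.storage-workaround.enabled", false);

public DatanodeStorageInfo[] getDatanodeStorageInfos(DatanodeID[] datanodeID,
    String[] storageIDs) throws UnregisteredNodeException {
  final boolean oldClient = storageIDs == null || storageIDs.length == 0;
  final DatanodeStorageInfo[] storages =
      new DatanodeStorageInfo[datanodeID.length];
  for (int i = 0; i < datanodeID.length; i++) {
    final DatanodeDescriptor dd = getDatanode(datanodeID[i]);
    if (oldClient) {
      if (!allowLegacyClientWorkaround) {
        // Keep the HDFS-6481 behaviour when the workaround is disabled.
        throw new HadoopIllegalArgumentException("Expected " + datanodeID.length
            + " storageIDs but the request carried none");
      }
      // Workaround for HDFS-9513: use the first storage of the DN as a dummy;
      // the next block report restores the real storage in the blocks map.
      storages[i] = dd.getStorageInfos()[0];
    } else {
      storages[i] = dd.getStorageInfo(storageIDs[i]);
    }
  }
  return storages;
}
{code}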
> DataNodeManager#getDataNodeStorageInfos not backward compatibility
> ------------------------------------------------------------------
>
> Key: HDFS-9513
> URL: https://issues.apache.org/jira/browse/HDFS-9513
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client, namenode
> Affects Versions: 2.2.0, 2.7.1
> Environment: 2.2.0 HDFS Client &2.7.1 HDFS Cluster
> Reporter: 邓飞
> Assignee: 邓飞
> Priority: Blocker
> Attachments: patch.HDFS-9513.20151207
>
>
> We upgraded our new HDFS cluster to 2.7.1, but our YARN cluster is still 2.2.0
> (8000+ nodes; it's too hard to upgrade it as quickly as the HDFS cluster).
> The compatibility problem shows up when the DataStreamer does pipeline recovery:
> the NN needs the DNs' storage info to update the pipeline, and the storageIDs
> are paired with the pipeline's DNs. However, HDFS has supported the storage type
> feature only since 2.3.0
> [HDFS-2832|https://issues.apache.org/jira/browse/HDFS-2832], so older clients do
> not send storageIDs. Although the protobuf serialization keeps the protocol
> compatible, the client gets a remote ArrayIndexOutOfBoundsException.
> ----
> the exception stack is below:
> {noformat}
> 2015-12-05 20:26:38,291 ERROR [Thread-4] org.apache.hadoop.hdfs.DFSClient:
> Failed to close file XXX
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
> 0
> at
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:513)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:6439)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:6404)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updatePipeline(NameNodeRpcServer.java:892)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updatePipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:997)
> at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1066)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy10.updatePipeline(Unknown Source)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updatePipeline(ClientNamenodeProtocolTranslatorPB.java:801)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy11.updatePipeline(Unknown Source)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1047)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)