[
https://issues.apache.org/jira/browse/HDFS-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044978#comment-15044978
]
Brahma Reddy Battula commented on HDFS-9513:
--------------------------------------------
This issue was already reported in HDFS-6481, which now throws
HadoopIllegalArgumentException instead of ArrayIndexOutOfBoundsException.
I think it would be good to provide a way to work around this problem for older
clients rather than rejecting their requests outright, considering that the
cluster size (8000+) mentioned here makes upgrading hard.
The provided patch might work as a workaround for older clients. Any thoughts,
[~szetszwo]/[~arpitagarwal]?
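For context, a minimal sketch of how an old client trips this (assuming the 2.7.x
{{DatanodeManager#getDatanodeStorageInfos(..)}} indexes the client-supplied
storageIDs per datanode; the exact code may differ):
{code}
// Sketch only: an updatePipeline() call from a 2.2.0 client carries datanode IDs
// but an empty storageIDs array, so indexing storageIDs[i] fails immediately.
final DatanodeStorageInfo[] storages = new DatanodeStorageInfo[datanodeID.length];
for (int i = 0; i < datanodeID.length; i++) {
  final DatanodeDescriptor dd = getDatanode(datanodeID[i]);
  storages[i] = dd.getStorageInfo(storageIDs[i]); // ArrayIndexOutOfBoundsException: 0
}
{code}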
[~Deng FEI], I have a few comments about the patch.
1. DatanodeDescriptor#getPerferedStorageInfo() might not be required. Instead, one
dummy storage, perhaps the first storage of the DN, can be chosen.
Currently {{DatanodeManager#getDatanodeStorageInfos(..)}} is used by 3 calls:
getAdditionalDatanode(), updatePipeline() and commitBlockSynchronization(). All of
these operations might hit the same problem with old clients. Since old clients
don't understand storages, we need not worry about which storage is selected.
a. getAdditionalDatanode() and updatePipeline() are called from clients directly,
but only the updatePipeline() result is stored as-is. Even then, once the
incremental block reports are received, these storages will be updated with the
proper details in the namenode.
b. commitBlockSynchronization() comes from datanodes, and even in that case it
should not need this workaround since it will have new targets.
The updated code could look like this:
{code}
if (storageIDs == null || storageIDs.length == 0) { // request from an old client
  // Choose the first storage of the DN as a dummy, to support writes from
  // old clients, as a workaround for HDFS-9513.
  // Later, when the block report comes, the actual storage will be
  // restored in the blocks map.
  storages[i] = dd.getStorageInfos()[0];
} else {
  storages[i] = dd.getStorageInfo(storageIDs[i]);
}
{code}
2. Also, maybe we can add a config for this workaround and use it only when
required.
3. Can you post a patch for trunk and the latest branch-2.7, with HDFS-6481
included? We can throw an exception when the config is disabled or the request is
not from an old client, as sketched below.
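A rough sketch of how points 2 and 3 could fit together in
{{DatanodeManager#getDatanodeStorageInfos(..)}}; the config key name below is only
a placeholder, not an existing property, and the exact shape would follow the
HDFS-6481 check:
{code}
// Sketch only, not the actual patch. Placeholder config key.
private final boolean allowLegacyClientWorkaround = conf.getBoolean(
    "dfs.namenode.legacy-client.storage-workaround.enabled", false);

public DatanodeStorageInfo[] getDatanodeStorageInfos(DatanodeID[] datanodeID,
    String[] storageIDs) throws UnregisteredNodeException {
  final boolean oldClient = storageIDs == null || storageIDs.length == 0;
  final DatanodeStorageInfo[] storages =
      new DatanodeStorageInfo[datanodeID.length];
  for (int i = 0; i < datanodeID.length; i++) {
    final DatanodeDescriptor dd = getDatanode(datanodeID[i]);
    if (oldClient) {
      if (!allowLegacyClientWorkaround) {
        // Keep the HDFS-6481 behaviour when the workaround is disabled.
        throw new HadoopIllegalArgumentException("Expected " + datanodeID.length
            + " storageIDs but the request carried none");
      }
      // Workaround for HDFS-9513: use the first storage of the DN as a dummy;
      // the next block report restores the real storage in the blocks map.
      storages[i] = dd.getStorageInfos()[0];
    } else {
      storages[i] = dd.getStorageInfo(storageIDs[i]);
    }
  }
  return storages;
}
{code}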
> DataNodeManager#getDataNodeStorageInfos not backward compatibility
> ------------------------------------------------------------------
>
> Key: HDFS-9513
> URL: https://issues.apache.org/jira/browse/HDFS-9513
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client, namenode
> Affects Versions: 2.2.0, 2.7.1
> Environment: 2.2.0 HDFS Client &2.7.1 HDFS Cluster
> Reporter: 邓飞
> Assignee: 邓飞
> Priority: Blocker
> Attachments: patch.HDFS-9513.20151207
>
>
> We upgraded our new HDFS cluster to 2.7.1, but our YARN cluster is still 2.2.0
> (8000+ nodes; it's too hard to upgrade it as quickly as the HDFS cluster).
> The compatibility problem shows up when the DataStreamer does pipeline recovery:
> the NN needs the DNs' storage info to update the pipeline, and the storageIDs
> are paired with the pipeline's DNs. However, HDFS has supported the storage type
> feature only since 2.3.0
> [HDFS-2832|https://issues.apache.org/jira/browse/HDFS-2832], so older clients do
> not send storageIDs. Although the protobuf serialization keeps the protocol
> compatible, the client gets a remote ArrayIndexOutOfBoundsException.
> ----
> the exception stack is below:
> {noformat}
> 2015-12-05 20:26:38,291 ERROR [Thread-4] org.apache.hadoop.hdfs.DFSClient:
> Failed to close file XXX
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
> 0
> at
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:513)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:6439)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:6404)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updatePipeline(NameNodeRpcServer.java:892)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updatePipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:997)
> at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1066)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy10.updatePipeline(Unknown Source)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updatePipeline(ClientNamenodeProtocolTranslatorPB.java:801)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy11.updatePipeline(Unknown Source)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1047)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)