[ https://issues.apache.org/jira/browse/HDFS-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15309860#comment-15309860 ]
Michael Tamm commented on HDFS-5012:
------------------------------------
We hit the same problem on our Hadoop cluster (a live cluster with real data, not a test cluster; HDFS version 2.0.0-cdh4.2.0):
{noformat}
Failed to obtain replica info for block (=BP-655596758-10.10.34.1-1341996058045:blk_2570851709037266390_1527175689) from datanode (=10.10.34.35:50010)
java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN: replica.getGenerationStamp() >= recoveryId = 1527175689, block=blk_2570851709037266390_1527175689, replica=FinalizedReplica, blk_2570851709037266390_1527175689, FINALIZED
  getNumBytes() = 48360562
  getBytesOnDisk() = 48360562
  getVisibleLength()= 48360562
  getVolume() = /var/lib/hdfs2/data/current
  getBlockFile() = /var/lib/hdfs2/data/current/BP-655596758-10.10.34.1-1341996058045/current/finalized/subdir9/blk_2570851709037266390
  unlinked =false
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:1451)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:1411)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:1920)
    at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolServerSideTranslatorPB.initReplicaRecovery(InterDatanodeProtocolServerSideTranslatorPB.java:55)
    at org.apache.hadoop.hdfs.protocol.proto.InterDatanodeProtocolProtos$InterDatanodeProtocolService$2.callBlockingMethod(InterDatanodeProtocolProtos.java:2198)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.callInitReplicaRecovery(DataNode.java:1933)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2000)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.access$400(DataNode.java:214)
    at org.apache.hadoop.hdfs.server.datanode.DataNode$2.run(DataNode.java:1905)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): THIS IS NOT SUPPOSED TO HAPPEN: replica.getGenerationStamp() >= recoveryId = 1527175689, block=blk_2570851709037266390_1527175689, replica=FinalizedReplica, blk_2570851709037266390_1527175689, FINALIZED
  getNumBytes() = 48360562
  getBytesOnDisk() = 48360562
  getVisibleLength()= 48360562
  getVolume() = /var/lib/hdfs2/data/current
  getBlockFile() = /var/lib/hdfs2/data/current/BP-655596758-10.10.34.1-1341996058045/current/finalized/subdir9/blk_2570851709037266390
  unlinked =false
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:1451)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:1411)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:1920)
    at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolServerSideTranslatorPB.initReplicaRecovery(InterDatanodeProtocolServerSideTranslatorPB.java:55)
    at org.apache.hadoop.hdfs.protocol.proto.InterDatanodeProtocolProtos$InterDatanodeProtocolService$2.callBlockingMethod(InterDatanodeProtocolProtos.java:2198)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)
    at org.apache.hadoop.ipc.Client.call(Client.java:1225)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at $Proxy11.initReplicaRecovery(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolTranslatorPB.initReplicaRecovery(InterDatanodeProtocolTranslatorPB.java:83)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.callInitReplicaRecovery(DataNode.java:1931)
    ... 4 more
{noformat}
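For reference, the message comes from a sanity check in FsDatasetImpl.initReplicaRecovery: the recovery id (a new generation stamp issued for the recovery) is expected to be strictly newer than the generation stamp of every replica being recovered. In both traces the generation stamp embedded in the block name (the _1527175689 suffix here, _1041 in the original report) equals the recoveryId, so the >= comparison trips on equality. A minimal standalone illustration of that invariant (hypothetical class and method names, not the actual HDFS code):
{code}
import java.io.IOException;

// Hypothetical standalone illustration of the guard in
// FsDatasetImpl.initReplicaRecovery; not the actual HDFS code.
public class RecoveryIdCheck {

  static void checkRecoveryId(long replicaGenerationStamp, long recoveryId)
      throws IOException {
    // Block recovery must run with a recovery id strictly greater than the
    // replica's generation stamp, so that recovery can bump the stamp.
    // Equality, as seen in the logs above, fails the check too.
    if (replicaGenerationStamp >= recoveryId) {
      throw new IOException("THIS IS NOT SUPPOSED TO HAPPEN:"
          + " replica.getGenerationStamp() >= recoveryId = " + recoveryId);
    }
  }

  public static void main(String[] args) throws IOException {
    // Values from the log above: the replica's generation stamp (the
    // _1527175689 suffix of blk_2570851709037266390_1527175689) equals
    // the recoveryId, so the guard throws.
    checkRecoveryId(1527175689L, 1527175689L);
  }
}
{code}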
> replica.getGenerationStamp() may be >= recoveryId
> -------------------------------------------------
>
> Key: HDFS-5012
> URL: https://issues.apache.org/jira/browse/HDFS-5012
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.5-alpha
> Reporter: Ted Yu
> Attachments: testReplicationQueueFailover.txt
>
>
> The following was first observed by [~jdcryans] in TestReplicationQueueFailover running against 2.0.5-alpha:
> {code}
> 2013-07-16 17:14:33,340 ERROR [IPC Server handler 7 on 35081] security.UserGroupInformation(1481): PriviledgedActionException as:ec2-user (auth:SIMPLE) cause:java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN: replica.getGenerationStamp() >= recoveryId = 1041, block=blk_4297992342878601848_1041, replica=FinalizedReplica, blk_4297992342878601848_1041, FINALIZED
>   getNumBytes() = 794
>   getBytesOnDisk() = 794
>   getVisibleLength()= 794
>   getVolume() = /home/ec2-user/jenkins/workspace/HBase-0.95-Hadoop-2/hbase-server/target/test-data/f2763e32-fe49-4988-ac94-eeca82431821/dfscluster_643a635e-4e39-4aa5-974c-25e01db16ff7/dfs/data/data3/current
>   getBlockFile() = /home/ec2-user/jenkins/workspace/HBase-0.95-Hadoop-2/hbase-server/target/test-data/f2763e32-fe49-4988-ac94-eeca82431821/dfscluster_643a635e-4e39-4aa5-974c-25e01db16ff7/dfs/data/data3/current/BP-1477359609-10.197.55.49-1373994849464/current/finalized/blk_4297992342878601848
>   unlinked =false
> 2013-07-16 17:14:33,341 WARN [org.apache.hadoop.hdfs.server.datanode.DataNode$2@64a1fcba] datanode.DataNode(1894): Failed to obtain replica info for block (=BP-1477359609-10.197.55.49-1373994849464:blk_4297992342878601848_1041) from datanode (=127.0.0.1:47006)
> java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN: replica.getGenerationStamp() >= recoveryId = 1041, block=blk_4297992342878601848_1041, replica=FinalizedReplica, blk_4297992342878601848_1041, FINALIZED
>   getNumBytes() = 794
>   getBytesOnDisk() = 794
>   getVisibleLength()= 794
>   getVolume() = /home/ec2-user/jenkins/workspace/HBase-0.95-Hadoop-2/hbase-server/target/test-data/f2763e32-fe49-4988-ac94-eeca82431821/dfscluster_643a635e-4e39-4aa5-974c-25e01db16ff7/dfs/data/data3/current
>   getBlockFile() = /home/ec2-user/jenkins/workspace/HBase-0.95-Hadoop-2/hbase-server/target/test-data/f2763e32-fe49-4988-ac94-eeca82431821/dfscluster_643a635e-4e39-4aa5-974c-25e01db16ff7/dfs/data/data3/current/BP-1477359609-10.197.55.49-1373994849464/current/finalized/blk_4297992342878601848
>   unlinked =false
> {code}