[ 
https://issues.apache.org/jira/browse/HDFS-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698672#comment-14698672
 ] 

Felix Borchers commented on HDFS-8093:
--------------------------------------

grep /system/balancer.id hadoop-hdfs-namenode-devhmn02.rz.is.log.1

{code}
...
2015-08-14 00:30:03,843 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: /system/balancer.id. BP-322804774-10.13.54.1-1412684451669 
blk_1074256920_516292{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-4db312aa-bc23-47dc-b768-52a2d72b09d3:NORMAL:10.13.53.30:50010|RBW],
 
ReplicaUnderConstruction[[DISK]DS-c7db1b58-8e25-435f-8af8-08b6754c021c:NORMAL:10.13.53.16:50010|RBW],
 
ReplicaUnderConstruction[[DISK]DS-4457ae11-7684-4187-b4ad-56466d79fba2:NORMAL:10.13.53.19:50010|RBW]]}
2015-08-14 00:30:03,958 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1841368225_1
2015-08-14 00:30:03,986 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: /system/balancer.id. BP-322804774-10.13.54.1-1412684451669 
blk_1074256921_516293{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-8f3d8860-b977-4b7b-b681-d25c112ad1f3:NORMAL:10.13.53.14:50010|RBW],
 
ReplicaUnderConstruction[[DISK]DS-abb5362f-6d29-478f-a678-53f09c096871:NORMAL:10.13.53.12:50010|RBW],
 
ReplicaUnderConstruction[[DISK]DS-b02f3ebc-955e-4e11-82df-dc51278dc06f:NORMAL:10.13.53.17:50010|RBW]]}
2015-08-14 00:30:04,002 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1841368225_1
2015-08-14 00:46:44,975 WARN org.apache.hadoop.security.UserGroupInformation: 
PriviledgedActionException as:hdfs (auth:SIMPLE) 
cause:org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
/system/balancer.id (inode 709043): File does not exist. Holder 
DFSClient_NONMAPREDUCE_-1841368225_1 does not have any open files.
2015-08-14 00:46:44,975 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 
on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.complete from 
10.13.52.1:58633 Call#220 Retry#0: 
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
/system/balancer.id (inode 709043): File does not exist. Holder 
DFSClient_NONMAPREDUCE_-1841368225_1 does not have any open files.
...
{code}

The log messages between the two timestamps are:
{code}
2015-08-14 00:30:03,843 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: /system/balancer.id. BP-322804774-10.13.54.1-1412684451669 
blk_1074256920_516292{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-4db312aa-bc23-47dc-b768-52a2d72b09d3:NORMAL:10.13.53.30:50010|RBW],
 
ReplicaUnderConstruction[[DISK]DS-c7db1b58-8e25-435f-8af8-08b6754c021c:NORMAL:10.13.53.16:50010|RBW],
 
ReplicaUnderConstruction[[DISK]DS-4457ae11-7684-4187-b4ad-56466d79fba2:NORMAL:10.13.53.19:50010|RBW]]}
2015-08-14 00:30:03,958 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1841368225_1
2015-08-14 00:30:03,986 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: /system/balancer.id. BP-322804774-10.13.54.1-1412684451669 
blk_1074256921_516293{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-8f3d8860-b977-4b7b-b681-d25c112ad1f3:NORMAL:10.13.53.14:50010|RBW],
 
ReplicaUnderConstruction[[DISK]DS-abb5362f-6d29-478f-a678-53f09c096871:NORMAL:10.13.53.12:50010|RBW],
 
ReplicaUnderConstruction[[DISK]DS-b02f3ebc-955e-4e11-82df-dc51278dc06f:NORMAL:10.13.53.17:50010|RBW]]}
2015-08-14 00:30:04,000 INFO BlockStateChange: BLOCK* addBlock: block 
blk_1074256920_516292 on node 10.13.53.16:50010 size 134217728 does not belong 
to any file
2015-08-14 00:30:04,000 INFO BlockStateChange: BLOCK* InvalidateBlocks: add 
blk_1074256920_516292 to 10.13.53.16:50010
2015-08-14 00:30:04,000 INFO BlockStateChange: BLOCK* BlockManager: ask 
10.13.53.16:50010 to delete [blk_1074256920_516292]
2015-08-14 00:30:04,000 INFO BlockStateChange: BLOCK* BlockManager: ask 
10.13.53.14:50010 to delete [blk_1074256910_516282]
2015-08-14 00:30:04,002 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: 
/system/balancer.id for DFSClient_NONMAPREDUCE_-1841368225_1
2015-08-14 00:30:04,213 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.13.53.14:50010 is added to blk_1074256517_515889 size 9460
2015-08-14 00:30:04,214 INFO BlockStateChange: BLOCK* InvalidateBlocks: add 
blk_1074256517_515889 to 10.13.53.30:50010
2015-08-14 00:30:04,214 INFO BlockStateChange: BLOCK* chooseExcessReplicates: 
([DISK]DS-4db312aa-bc23-47dc-b768-52a2d72b09d3:NORMAL:10.13.53.30:50010, 
blk_1074256517_515889) is added to invalidated blocks set
{code}
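
For anyone who wants to pull the same kind of excerpt from their own NameNode log, here is a minimal sketch of a time-window filter. It is not the command used to produce the excerpt above. The helper names (parse_ts, lines_between) are hypothetical, and it assumes every log record starts with a "yyyy-MM-dd HH:mm:ss,SSS" timestamp, as in the lines shown here. Lines without a leading timestamp are treated as continuations of the previous record.

```python
from datetime import datetime

# Timestamp layout assumed from the log lines quoted in this comment.
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # length of e.g. "2015-08-14 00:30:03,843"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None


def lines_between(lines, start, end):
    """Yield lines whose record timestamp lies in [start, end].

    Continuation lines (wrapped text, stack trace frames) follow the
    decision made for the most recent timestamped line.
    """
    keep = False
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            keep = start <= ts <= end
        if keep:
            yield line


# Small in-memory sample; a real run would iterate over the open log file.
sample = [
    "2015-08-14 00:30:03,843 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*",
    "allocateBlock: /system/balancer.id.",
    "2015-08-14 00:46:44,975 WARN org.apache.hadoop.security.UserGroupInformation:",
    "PriviledgedActionException as:hdfs (auth:SIMPLE)",
]
window_start = parse_ts("2015-08-14 00:30:03,843")
window_end = parse_ts("2015-08-14 00:30:04,214")
selected = list(lines_between(sample, window_start, window_end))
```

The window bounds in the sample are simply the first and last timestamps of the excerpt above; adjust them to the interval of interest.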

> BP does not exist or is not under Constructionnull
> --------------------------------------------------
>
>                 Key: HDFS-8093
>                 URL: https://issues.apache.org/jira/browse/HDFS-8093
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.6.0
>         Environment: Centos 6.5
>            Reporter: LINTE
>
> The HDFS balancer ran for several hours balancing blocks between datanodes, 
> then failed with the following error.
> The getStoredBlock function returned a null BlockInfo.
> java.io.IOException: Bad response ERROR for block 
> BP-970443206-192.168.0.208-1397583979378:blk_1086729930_13046030 from 
> datanode 192.168.0.18:1004
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:897)
> 15/04/08 05:52:51 WARN hdfs.DFSClient: Error Recovery for block 
> BP-970443206-192.168.0.208-1397583979378:blk_1086729930_13046030 in pipeline 
> 192.168.0.63:1004, 192.168.0.1:1004, 192.168.0.18:1004: bad datanode 
> 192.168.0.18:1004
> 15/04/08 05:52:51 WARN hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> BP-970443206-192.168.0.208-1397583979378:blk_1086729930_13046030 does not 
> exist or is not under Constructionnull
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:6913)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:6980)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:717)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:931)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>         at com.sun.proxy.$Proxy11.updateBlockForPipeline(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolTranslatorPB.java:877)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy12.updateBlockForPipeline(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1266)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1004)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:548)
> 15/04/08 05:52:51 ERROR hdfs.DFSClient: Failed to close inode 19801755
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> BP-970443206-192.168.0.208-1397583979378:blk_1086729930_13046030 does not 
> exist or is not under Constructionnull
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:6913)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:6980)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:717)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:931)
>         at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>         at com.sun.proxy.$Proxy11.updateBlockForPipeline(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolTranslatorPB.java:877)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy12.updateBlockForPipeline(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1266)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1004)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:548)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
