[
https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607267#comment-13607267
]
Ted Yu edited comment on HBASE-7878 at 3/20/13 4:26 AM:
--------------------------------------------------------
From
https://builds.apache.org/job/PreCommit-HBASE-Build/4900/artifact/trunk/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.regionserver.wal.TestHLogSplitCompressed-output.txt ,
notice the second line below, where recoverLease returned false:
{code}
2013-03-20 00:51:04,404 INFO [IPC Server handler 4 on 33341]
namenode.FSNamesystem(2294): Recovering lease=[Lease. Holder:
DFSClient_NONMAPREDUCE_289697813_8, pendingcreates: 1],
src=/hbase/hlog/hlog.dat.1
2013-03-20 00:51:04,405 DEBUG [pool-1-thread-1] util.FSHDFSUtils(89):
recoverLease returned false
2013-03-20 00:51:04,922 INFO
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44]
datanode.DataNode(2130): NameNode calls
recoverBlock(block=blk_1918679507966322481_1015, targets=[127.0.0.1:55498,
127.0.0.1:50293])
2013-03-20 00:51:04,923 DEBUG
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44]
datanode.FSDataset(2143): Interrupting active writer threads for block
blk_1918679507966322481_1015
2013-03-20 00:51:04,924 INFO
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc]
datanode.BlockReceiver(613): Exception in receiveBlock for block
blk_1918679507966322481_1015 java.io.InterruptedIOException: Interruped while
waiting for IO on channel java.nio.channels.SocketChannel[connected
local=/127.0.0.1:55498 remote=/127.0.0.1:54218]. 0 millis timeout left.
2013-03-20 00:51:04,924 INFO
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc]
datanode.DataXceiver(425): writeBlock blk_1918679507966322481_1015 received
exception java.io.IOException: Interrupted receiveBlock
2013-03-20 00:51:04,924 INFO [PacketResponder 1 for Block
blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(893):
PacketResponder blk_1918679507966322481_1015 1 : Thread is interrupted.
2013-03-20 00:51:04,924 INFO [PacketResponder 1 for Block
blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(958):
PacketResponder 1 for block blk_1918679507966322481_1015 terminating
2013-03-20 00:51:04,924 INFO
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17]
datanode.BlockReceiver(613): Exception in receiveBlock for block
blk_1918679507966322481_1015 java.io.EOFException: while trying to read 547
bytes
2013-03-20 00:51:04,925 INFO [PacketResponder 0 for Block
blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(893):
PacketResponder blk_1918679507966322481_1015 0 : Thread is interrupted.
2013-03-20 00:51:04,925 INFO [PacketResponder 0 for Block
blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(958):
PacketResponder 0 for block blk_1918679507966322481_1015 terminating
2013-03-20 00:51:04,925 INFO
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17]
datanode.DataXceiver(425): writeBlock blk_1918679507966322481_1015 received
exception java.io.EOFException: while trying to read 547 bytes
2013-03-20 00:51:04,925 WARN [ResponseProcessor for block
blk_1918679507966322481_1015]
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(3175): DFSOutputStream
ResponseProcessor exception for block
blk_1918679507966322481_1015java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:3127)
2013-03-20 00:51:04,925 ERROR
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17]
datanode.DataXceiver(136): DatanodeRegistration(127.0.0.1:50293,
storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946,
ipcPort=59764):DataXceiver
java.io.EOFException: while trying to read 547 bytes
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:295)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:339)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:403)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:581)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:406)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:112)
at java.lang.Thread.run(Thread.java:662)
2013-03-20 00:51:04,925 WARN [DataStreamer for file /hbase/hlog/hlog.dat.1
block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3211): Error
Recovery for block blk_1918679507966322481_1015 bad datanode[0] 127.0.0.1:55498
2013-03-20 00:51:04,927 DEBUG
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17]
datanode.DataXceiver(138): DatanodeRegistration(127.0.0.1:50293,
storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946,
ipcPort=59764):Number of active connections is: 2
2013-03-20 00:51:04,927 WARN [DataStreamer for file /hbase/hlog/hlog.dat.1
block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3262): Error
Recovery for block blk_1918679507966322481_1015 in pipeline 127.0.0.1:55498,
127.0.0.1:50293: bad datanode 127.0.0.1:55498
2013-03-20 00:51:04,925 ERROR
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc]
datanode.DataXceiver(136): DatanodeRegistration(127.0.0.1:55498,
storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242,
ipcPort=60306):DataXceiver
java.io.IOException: Interrupted receiveBlock
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:626)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:406)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:112)
at java.lang.Thread.run(Thread.java:662)
2013-03-20 00:51:04,928 DEBUG
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc]
datanode.DataXceiver(138): DatanodeRegistration(127.0.0.1:55498,
storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242,
ipcPort=60306):Number of active connections is: 3
2013-03-20 00:51:04,928 DEBUG
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44]
datanode.FSDataset(2159): getBlockMetaDataInfo successful
block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,933 INFO [IPC Server handler 1 on 59764]
datanode.DataNode(2130): Client calls
recoverBlock(block=blk_1918679507966322481_1015, targets=[127.0.0.1:50293])
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764]
datanode.FSDataset(2143): Interrupting active writer threads for block
blk_1918679507966322481_1015
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764]
datanode.FSDataset(2159): getBlockMetaDataInfo successful
block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764]
datanode.DataNode(2046): block=blk_1918679507966322481_1015, (length=1006),
syncList=[BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015
wasRecoveredOnStartup=false) node=127.0.0.1:50293)], closeFile=false
2013-03-20 00:51:04,935 INFO [IPC Server handler 1 on 33341]
namenode.FSNamesystem(5662): blk_1918679507966322481_1015is being recovered by
NameNode, ignoring the request from a client
2013-03-20 00:51:04,935 ERROR [IPC Server handler 1 on 33341]
security.UserGroupInformation(1152): PriviledgedActionException as:jenkins
cause:java.io.IOException: blk_1918679507966322481_1015is being recovered by
NameNode, ignoring the request from a client
2013-03-20 00:51:04,935 DEBUG [IPC Server handler 0 on 59764]
datanode.FSDataset(2143): Interrupting active writer threads for block
blk_1918679507966322481_1015
2013-03-20 00:51:04,936 DEBUG [IPC Server handler 0 on 59764]
datanode.FSDataset(2159): getBlockMetaDataInfo successful
block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,936 ERROR [IPC Server handler 1 on 59764]
security.UserGroupInformation(1152): PriviledgedActionException
as:blk_1918679507966322481_1015 cause:org.apache.hadoop.ipc.RemoteException:
java.io.IOException: blk_1918679507966322481_1015is being recovered by
NameNode, ignoring the request from a client
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:5663)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:788)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
2013-03-20 00:51:04,936 DEBUG
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44]
datanode.DataNode(2046): block=blk_1918679507966322481_1015, (length=1006),
syncList=[BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015
wasRecoveredOnStartup=false) node=127.0.0.1:55498),
BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015
wasRecoveredOnStartup=false) node=127.0.0.1:50293)], closeFile=true
2013-03-20 00:51:04,937 WARN [DataStreamer for file /hbase/hlog/hlog.dat.1
block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3287):
Failed recovery attempt #0 from primary datanode 127.0.0.1:50293
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.ipc.RemoteException:
java.io.IOException: blk_1918679507966322481_1015is being recovered by
NameNode, ignoring the request from a client
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:5663)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:788)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
at org.apache.hadoop.ipc.Client.call(Client.java:1107)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at $Proxy12.nextGenerationStamp(Unknown Source)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:2066)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2034)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
at org.apache.hadoop.ipc.Client.call(Client.java:1107)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at $Proxy17.recoverBlock(Unknown Source)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3285)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2200(DFSClient.java:2749)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2953)
2013-03-20 00:51:04,937 INFO
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44]
datanode.DataNode(1806): oldblock=blk_1918679507966322481_1015(length=1006),
newblock=blk_1918679507966322481_1024(length=1006), datanode=127.0.0.1:55498
2013-03-20 00:51:04,937 WARN [DataStreamer for file /hbase/hlog/hlog.dat.1
block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3326): Error
Recovery for block blk_1918679507966322481_1015 failed because recovery from
primary datanode 127.0.0.1:50293 failed 1 times. Pipeline was 127.0.0.1:55498,
127.0.0.1:50293. Will retry...
2013-03-20 00:51:04,938 DEBUG
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44]
datanode.FSDataset$FSDir(175): addBlock: Moved
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/blocksBeingWritten/blk_1918679507966322481_1024.meta
to
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/current/blk_1918679507966322481_1024.meta
2013-03-20 00:51:04,938 DEBUG
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44]
datanode.FSDataset$FSDir(176): addBlock: Moved
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/blocksBeingWritten/blk_1918679507966322481
to
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/current/blk_1918679507966322481
2013-03-20 00:51:04,938 INFO
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44]
datanode.DataNode(1814): Received block blk_1918679507966322481_1024 of size
1006 as part of lease recovery.
2013-03-20 00:51:04,938 DEBUG [DataNode:
[/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1,/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data2]]
datanode.DataNode(1124): DataNode DatanodeRegistration(127.0.0.1:55498,
storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242,
ipcPort=60306) sleeping for artificial delay: 1 ms
2013-03-20 00:51:04,938 INFO [IPC Server handler 2 on 59764]
datanode.DataNode(1806): oldblock=blk_1918679507966322481_1015(length=1006),
newblock=blk_1918679507966322481_1024(length=1006), datanode=127.0.0.1:50293
2013-03-20 00:51:04,939 DEBUG [IPC Server handler 2 on 59764]
datanode.FSDataset$FSDir(175): addBlock: Moved
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/blocksBeingWritten/blk_1918679507966322481_1024.meta
to
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/current/blk_1918679507966322481_1024.meta
2013-03-20 00:51:04,939 DEBUG [IPC Server handler 2 on 59764]
datanode.FSDataset$FSDir(176): addBlock: Moved
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/blocksBeingWritten/blk_1918679507966322481
to
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/current/blk_1918679507966322481
2013-03-20 00:51:04,939 INFO [IPC Server handler 2 on 59764]
datanode.DataNode(1814): Received block blk_1918679507966322481_1024 of size
1006 as part of lease recovery.
2013-03-20 00:51:04,939 DEBUG [DataNode:
[/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3,/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data4]]
datanode.DataNode(1124): DataNode DatanodeRegistration(127.0.0.1:50293,
storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946,
ipcPort=59764) sleeping for artificial delay: 3 ms
2013-03-20 00:51:04,940 INFO [IPC Server handler 3 on 33341]
namenode.FSNamesystem(2374):
commitBlockSynchronization(lastblock=blk_1918679507966322481_1015,
newgenerationstamp=1024, newlength=1006, newtargets=[127.0.0.1:55498,
127.0.0.1:50293], closeFile=true, deleteBlock=false)
2013-03-20 00:51:04,942 INFO [IPC Server handler 3 on 33341]
namenode.FSNamesystem(2458):
commitBlockSynchronization(newblock=blk_1918679507966322481_1024,
file=/hbase/hlog/hlog.dat.1, newgenerationstamp=1024, newlength=1006,
newtargets=[127.0.0.1:55498, 127.0.0.1:50293]) successful
2013-03-20 00:51:04,943 WARN [IPC Server handler 5 on 33341]
namenode.FSNamesystem(3850): BLOCK* NameSystem.addStoredBlock: Redundant
addStoredBlock request received for blk_1918679507966322481_1024 on
127.0.0.1:50293 size 1006
2013-03-20 00:51:05,406 INFO [pool-1-thread-1] util.FSHDFSUtils(137): Finished
lease recover attempt for hdfs://localhost:33341/hbase/hlog/hlog.dat.1
{code}
> recoverFileLease does not check return value of recoverLease
> ------------------------------------------------------------
>
> Key: HBASE-7878
> URL: https://issues.apache.org/jira/browse/HBASE-7878
> Project: HBase
> Issue Type: Bug
> Components: util
> Affects Versions: 0.95.0, 0.94.6
> Reporter: Eric Newton
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.95.0, 0.98.0, 0.94.7
>
> Attachments: 7878.94, 7878-94.addendum, 7878-94.addendum2,
> 7878-trunk.addendum, 7878-trunk.addendum2, 7878-trunk-v10.txt,
> 7878-trunk-v2.txt, 7878-trunk-v3.txt, 7878-trunk-v4.txt, 7878-trunk-v5.txt,
> 7878-trunk-v6.txt, 7878-trunk-v7.txt, 7878-trunk-v8.txt, 7878-trunk-v9.txt,
> 7878-trunk-v9.txt
>
>
> I think this is a problem, so I'm opening a ticket so an HBase person takes a
> look.
> Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease
> recovery for Accumulo after HBase's lease recovery. During testing, we
> experienced data loss. I found it is necessary to wait until recoverLease
> returns true to know that the file has been truly closed. In FSHDFSUtils,
> the return result of recoverLease is not checked. In the unit tests created
> to check lease recovery in HBASE-2645, the return result of recoverLease is
> always checked.
> I think FSHDFSUtils should be modified to check the return result, and wait
> until it returns true.
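For illustration only, here is a minimal sketch of the kind of check-and-wait the description asks for. It assumes the public DistributedFileSystem.recoverLease(Path) API, which returns true only once lease recovery has actually completed; it is not the FSHDFSUtils code or any of the attached patches, and the method name, timeout, and retry pause are made up for the example:
{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class LeaseRecoveryExample {
  /**
   * Repeatedly invoke recoverLease until HDFS reports the lease as recovered,
   * instead of assuming success after a single call. The log excerpt above
   * shows why: the first call returned false while block recovery was still
   * in flight on the datanodes, and the file was only closed about a second later.
   */
  public static void waitOnLeaseRecovery(Configuration conf, Path path,
      long timeoutMs, long pauseMs) throws IOException, InterruptedException {
    FileSystem fs = path.getFileSystem(conf);
    if (!(fs instanceof DistributedFileSystem)) {
      return; // nothing to recover on a local or other non-HDFS filesystem
    }
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    long start = System.currentTimeMillis();
    // recoverLease returns true once the file is closed and its last block is
    // finalized; false means recovery was initiated but has not finished yet.
    while (!dfs.recoverLease(path)) {
      if (System.currentTimeMillis() - start > timeoutMs) {
        throw new IOException("Timed out waiting for lease recovery on " + path);
      }
      Thread.sleep(pauseMs);
    }
  }

  public static void main(String[] args) throws Exception {
    // Example invocation; the WAL path matches the one in the log excerpt,
    // the timeout and pause values are arbitrary.
    waitOnLeaseRecovery(new Configuration(),
        new Path("/hbase/hlog/hlog.dat.1"), 60000L, 1000L);
  }
}
{code}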