[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607267#comment-13607267 ]

Ted Yu edited comment on HBASE-7878 at 3/20/13 10:39 PM:
---------------------------------------------------------

From 
https://builds.apache.org/job/PreCommit-HBASE-Build/4900/artifact/trunk/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.regionserver.wal.TestHLogSplitCompressed-output.txt, 
notice the second line below:
{code}
2013-03-20 00:51:04,404 INFO  [IPC Server handler 4 on 33341] 
namenode.FSNamesystem(2294): Recovering lease=[Lease.  Holder: 
DFSClient_NONMAPREDUCE_289697813_8, pendingcreates: 1], 
src=/hbase/hlog/hlog.dat.1
2013-03-20 00:51:04,405 DEBUG [pool-1-thread-1] util.FSHDFSUtils(89): 
recoverLease returned false
2013-03-20 00:51:04,922 INFO  
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] 
datanode.DataNode(2130): NameNode calls 
recoverBlock(block=blk_1918679507966322481_1015, targets=[127.0.0.1:55498, 
127.0.0.1:50293])
2013-03-20 00:51:04,923 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] 
datanode.FSDataset(2143): Interrupting active writer threads for block 
blk_1918679507966322481_1015
2013-03-20 00:51:04,924 INFO  
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc] 
datanode.BlockReceiver(613): Exception in receiveBlock for block 
blk_1918679507966322481_1015 java.io.InterruptedIOException: Interruped while 
waiting for IO on channel java.nio.channels.SocketChannel[connected 
local=/127.0.0.1:55498 remote=/127.0.0.1:54218]. 0 millis timeout left.
2013-03-20 00:51:04,924 INFO  
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc] 
datanode.DataXceiver(425): writeBlock blk_1918679507966322481_1015 received 
exception java.io.IOException: Interrupted receiveBlock
2013-03-20 00:51:04,924 INFO  [PacketResponder 1 for Block 
blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(893): 
PacketResponder blk_1918679507966322481_1015 1 : Thread is interrupted.
2013-03-20 00:51:04,924 INFO  [PacketResponder 1 for Block 
blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(958): 
PacketResponder 1 for block blk_1918679507966322481_1015 terminating
2013-03-20 00:51:04,924 INFO  
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17] 
datanode.BlockReceiver(613): Exception in receiveBlock for block 
blk_1918679507966322481_1015 java.io.EOFException: while trying to read 547 
bytes
2013-03-20 00:51:04,925 INFO  [PacketResponder 0 for Block 
blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(893): 
PacketResponder blk_1918679507966322481_1015 0 : Thread is interrupted.
2013-03-20 00:51:04,925 INFO  [PacketResponder 0 for Block 
blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(958): 
PacketResponder 0 for block blk_1918679507966322481_1015 terminating
2013-03-20 00:51:04,925 INFO  
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17] 
datanode.DataXceiver(425): writeBlock blk_1918679507966322481_1015 received 
exception java.io.EOFException: while trying to read 547 bytes
2013-03-20 00:51:04,925 WARN  [ResponseProcessor for block 
blk_1918679507966322481_1015] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(3175): DFSOutputStream 
ResponseProcessor exception  for block 
blk_1918679507966322481_1015java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readLong(DataInputStream.java:399)
        at 
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:3127)

2013-03-20 00:51:04,925 ERROR 
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17] 
datanode.DataXceiver(136): DatanodeRegistration(127.0.0.1:50293, 
storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946, 
ipcPort=59764):DataXceiver
java.io.EOFException: while trying to read 547 bytes
        at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:295)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:339)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:403)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:581)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:406)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:112)
        at java.lang.Thread.run(Thread.java:662)
2013-03-20 00:51:04,925 WARN  [DataStreamer for file /hbase/hlog/hlog.dat.1 
block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3211): Error 
Recovery for block blk_1918679507966322481_1015 bad datanode[0] 127.0.0.1:55498
2013-03-20 00:51:04,927 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17] 
datanode.DataXceiver(138): DatanodeRegistration(127.0.0.1:50293, 
storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946, 
ipcPort=59764):Number of active connections is: 2
2013-03-20 00:51:04,927 WARN  [DataStreamer for file /hbase/hlog/hlog.dat.1 
block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3262): Error 
Recovery for block blk_1918679507966322481_1015 in pipeline 127.0.0.1:55498, 
127.0.0.1:50293: bad datanode 127.0.0.1:55498
2013-03-20 00:51:04,925 ERROR 
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc] 
datanode.DataXceiver(136): DatanodeRegistration(127.0.0.1:55498, 
storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242, 
ipcPort=60306):DataXceiver
java.io.IOException: Interrupted receiveBlock
        at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:626)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:406)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:112)
        at java.lang.Thread.run(Thread.java:662)
2013-03-20 00:51:04,928 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc] 
datanode.DataXceiver(138): DatanodeRegistration(127.0.0.1:55498, 
storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242, 
ipcPort=60306):Number of active connections is: 3
2013-03-20 00:51:04,928 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] 
datanode.FSDataset(2159): getBlockMetaDataInfo successful 
block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,933 INFO  [IPC Server handler 1 on 59764] 
datanode.DataNode(2130): Client calls 
recoverBlock(block=blk_1918679507966322481_1015, targets=[127.0.0.1:50293])
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764] 
datanode.FSDataset(2143): Interrupting active writer threads for block 
blk_1918679507966322481_1015
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764] 
datanode.FSDataset(2159): getBlockMetaDataInfo successful 
block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764] 
datanode.DataNode(2046): block=blk_1918679507966322481_1015, (length=1006), 
syncList=[BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015 
wasRecoveredOnStartup=false) node=127.0.0.1:50293)], closeFile=false
2013-03-20 00:51:04,935 INFO  [IPC Server handler 1 on 33341] 
namenode.FSNamesystem(5662): blk_1918679507966322481_1015is being recovered by 
NameNode, ignoring the request from a client
2013-03-20 00:51:04,935 ERROR [IPC Server handler 1 on 33341] 
security.UserGroupInformation(1152): PriviledgedActionException as:jenkins 
cause:java.io.IOException: blk_1918679507966322481_1015is being recovered by 
NameNode, ignoring the request from a client
2013-03-20 00:51:04,935 DEBUG [IPC Server handler 0 on 59764] 
datanode.FSDataset(2143): Interrupting active writer threads for block 
blk_1918679507966322481_1015
2013-03-20 00:51:04,936 DEBUG [IPC Server handler 0 on 59764] 
datanode.FSDataset(2159): getBlockMetaDataInfo successful 
block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,936 ERROR [IPC Server handler 1 on 59764] 
security.UserGroupInformation(1152): PriviledgedActionException 
as:blk_1918679507966322481_1015 cause:org.apache.hadoop.ipc.RemoteException: 
java.io.IOException: blk_1918679507966322481_1015is being recovered by 
NameNode, ignoring the request from a client
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:5663)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:788)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

2013-03-20 00:51:04,936 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] 
datanode.DataNode(2046): block=blk_1918679507966322481_1015, (length=1006), 
syncList=[BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015 
wasRecoveredOnStartup=false) node=127.0.0.1:55498), 
BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015 
wasRecoveredOnStartup=false) node=127.0.0.1:50293)], closeFile=true
2013-03-20 00:51:04,937 WARN  [DataStreamer for file /hbase/hlog/hlog.dat.1 
block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3287): 
Failed recovery attempt #0 from primary datanode 127.0.0.1:50293
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.ipc.RemoteException: 
java.io.IOException: blk_1918679507966322481_1015is being recovered by 
NameNode, ignoring the request from a client
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:5663)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:788)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at $Proxy12.nextGenerationStamp(Unknown Source)
        at 
org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:2066)
        at 
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2034)
        at 
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2114)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at $Proxy17.recoverBlock(Unknown Source)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3285)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2200(DFSClient.java:2749)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2953)
2013-03-20 00:51:04,937 INFO  
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] 
datanode.DataNode(1806): oldblock=blk_1918679507966322481_1015(length=1006), 
newblock=blk_1918679507966322481_1024(length=1006), datanode=127.0.0.1:55498
2013-03-20 00:51:04,937 WARN  [DataStreamer for file /hbase/hlog/hlog.dat.1 
block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3326): Error 
Recovery for block blk_1918679507966322481_1015 failed  because recovery from 
primary datanode 127.0.0.1:50293 failed 1 times.  Pipeline was 127.0.0.1:55498, 
127.0.0.1:50293. Will retry...
2013-03-20 00:51:04,938 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] 
datanode.FSDataset$FSDir(175): addBlock: Moved 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/blocksBeingWritten/blk_1918679507966322481_1024.meta
 to 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/current/blk_1918679507966322481_1024.meta
2013-03-20 00:51:04,938 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] 
datanode.FSDataset$FSDir(176): addBlock: Moved 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/blocksBeingWritten/blk_1918679507966322481
 to 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/current/blk_1918679507966322481
2013-03-20 00:51:04,938 INFO  
[org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] 
datanode.DataNode(1814): Received block blk_1918679507966322481_1024 of size 
1006 as part of lease recovery.
2013-03-20 00:51:04,938 DEBUG [DataNode: 
[/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1,/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data2]]
 datanode.DataNode(1124): DataNode DatanodeRegistration(127.0.0.1:55498, 
storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242, 
ipcPort=60306) sleeping for artificial delay: 1 ms
2013-03-20 00:51:04,938 INFO  [IPC Server handler 2 on 59764] 
datanode.DataNode(1806): oldblock=blk_1918679507966322481_1015(length=1006), 
newblock=blk_1918679507966322481_1024(length=1006), datanode=127.0.0.1:50293
2013-03-20 00:51:04,939 DEBUG [IPC Server handler 2 on 59764] 
datanode.FSDataset$FSDir(175): addBlock: Moved 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/blocksBeingWritten/blk_1918679507966322481_1024.meta
 to 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/current/blk_1918679507966322481_1024.meta
2013-03-20 00:51:04,939 DEBUG [IPC Server handler 2 on 59764] 
datanode.FSDataset$FSDir(176): addBlock: Moved 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/blocksBeingWritten/blk_1918679507966322481
 to 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/current/blk_1918679507966322481
2013-03-20 00:51:04,939 INFO  [IPC Server handler 2 on 59764] 
datanode.DataNode(1814): Received block blk_1918679507966322481_1024 of size 
1006 as part of lease recovery.
2013-03-20 00:51:04,939 DEBUG [DataNode: 
[/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3,/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data4]]
 datanode.DataNode(1124): DataNode DatanodeRegistration(127.0.0.1:50293, 
storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946, 
ipcPort=59764) sleeping for artificial delay: 3 ms
2013-03-20 00:51:04,940 INFO  [IPC Server handler 3 on 33341] 
namenode.FSNamesystem(2374): 
commitBlockSynchronization(lastblock=blk_1918679507966322481_1015, 
newgenerationstamp=1024, newlength=1006, newtargets=[127.0.0.1:55498, 
127.0.0.1:50293], closeFile=true, deleteBlock=false)
2013-03-20 00:51:04,942 INFO  [IPC Server handler 3 on 33341] 
namenode.FSNamesystem(2458): 
commitBlockSynchronization(newblock=blk_1918679507966322481_1024, 
file=/hbase/hlog/hlog.dat.1, newgenerationstamp=1024, newlength=1006, 
newtargets=[127.0.0.1:55498, 127.0.0.1:50293]) successful
2013-03-20 00:51:04,943 WARN  [IPC Server handler 5 on 33341] 
namenode.FSNamesystem(3850): BLOCK* NameSystem.addStoredBlock: Redundant 
addStoredBlock request received for blk_1918679507966322481_1024 on 
127.0.0.1:50293 size 1006
2013-03-20 00:51:05,406 INFO  [pool-1-thread-1] util.FSHDFSUtils(137): Finished 
lease recover attempt for hdfs://localhost:33341/hbase/hlog/hlog.dat.1
{code}
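The second line above shows recoverLease() returning false, after which the NameNode drives the block recovery to completion on its own. The fix this issue asks for is to check that boolean and keep retrying until it comes back true. A minimal sketch of that check-and-wait pattern (the method signature, 60 second timeout and 1 second retry interval below are only illustrative, not the actual FSHDFSUtils patch):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Illustrative sketch only: poll recoverLease() until it reports the file closed,
// instead of issuing a single call and ignoring the result.
public static void recoverFileLease(DistributedFileSystem dfs, Path p)
    throws IOException {
  try {
    boolean recovered = dfs.recoverLease(p);  // true only once the file is closed
    long start = System.currentTimeMillis();
    while (!recovered && System.currentTimeMillis() - start < 60000L) {
      Thread.sleep(1000L);                    // give the NameNode time to finish recovery
      recovered = dfs.recoverLease(p);
    }
    if (!recovered) {
      throw new IOException("Failed to recover lease on " + p);
    }
  } catch (InterruptedException ie) {
    Thread.currentThread().interrupt();
    throw new IOException("Interrupted while recovering lease on " + p);
  }
}
{code}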
Here is why the test failed:
{code}
2013-03-20 00:52:48,342 INFO  [IPC Server handler 8 on 33341] 
namenode.FSNamesystem(169): ugi=jenkins ip=/127.0.0.1 cmd=open  
src=/hbase/hlog/hlog.dat.5  dst=null  perm=null
Juliet file creator: created file /hbase/hlog/hlog.dat..juliet...2013-03-20 
00:53:08,059 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): before: 
regionserver.wal.TestHLogSplitCompressed#testSplitFailsIfNewHLogGetsCreatedAfterSplitStarted
 Thread=70, OpenFileDescriptor=194, MaxFileDescriptor=60000, ConnectionCount=0
...
2013-03-20 00:53:25,275 INFO  [pool-1-thread-1] hbase.ResourceChecker(171): 
after: 
regionserver.wal.TestHLogSplitCompressed#testSplitFailsIfNewHLogGetsCreatedAfterSplitStarted
 Thread=70 (was 70), OpenFileDescriptor=189 (was 194), MaxFileDescriptor=60000 
(was 60000), ConnectionCount=0 (was 0)
Exception in thread "ZombieNewLogWriterRegionServer" java.lang.AssertionError: 
Failed to create file /hbase/hlog/hlog.dat..juliet at 
org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.assertTrue(Assert.java:41)
  at 
org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit$ZombieNewLogWriterRegionServer.run(TestHLogSplit.java:1104)
{code}
ZombieNewLogWriterRegionServer didn't create the fake HLog, but nothing was 
logged about the IOException that caused the failure.
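The zombie writer apparently swallows whatever IOException the create hit, so the assertion message is all we get. A small sketch of how the thread could surface the cause first (the fs, julietFile and LOG names below are placeholders, not the actual TestHLogSplit fields):
{code}
// Placeholder sketch: log the IOException before asserting, so the reason the
// fake HLog could not be created shows up in the test output.
boolean created = false;
try {
  created = fs.createNewFile(julietFile);
} catch (IOException ioe) {
  LOG.error("Failed to create " + julietFile, ioe);
}
assertTrue("Failed to create file " + julietFile, created);
{code}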
                
> recoverFileLease does not check return value of recoverLease
> ------------------------------------------------------------
>
>                 Key: HBASE-7878
>                 URL: https://issues.apache.org/jira/browse/HBASE-7878
>             Project: HBase
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.95.0, 0.94.6
>            Reporter: Eric Newton
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.95.0, 0.98.0, 0.94.7
>
>         Attachments: 7878.94, 7878-94.addendum, 7878-94.addendum2, 
> 7878-trunk.addendum, 7878-trunk.addendum2, 7878-trunk-v10.txt, 
> 7878-trunk-v11-test.txt, 7878-trunk-v2.txt, 7878-trunk-v3.txt, 
> 7878-trunk-v4.txt, 7878-trunk-v5.txt, 7878-trunk-v6.txt, 7878-trunk-v7.txt, 
> 7878-trunk-v8.txt, 7878-trunk-v9.txt, 7878-trunk-v9.txt
>
>
> I think this is a problem, so I'm opening a ticket so an HBase person takes a 
> look.
> Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease 
> recovery for Accumulo after HBase's lease recovery.  During testing, we 
> experienced data loss.  I found it is necessary to wait until recoverLease 
> returns true to know that the file has been truly closed.  In FSHDFSUtils, 
> the return result of recoverLease is not checked. In the unit tests created 
> to check lease recovery in HBASE-2645, the return result of recoverLease is 
> always checked.
> I think FSHDFSUtils should be modified to check the return result, and wait 
> until it returns true.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
