[ https://issues.apache.org/jira/browse/HBASE-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931926#action_12931926 ]

Jonathan Gray commented on HBASE-3234:
--------------------------------------

From a new run, here is all of the activity of the block that causes the 
Abort...

{noformat}

2010-11-14 15:44:00,042 DEBUG [IPC Server handler 4 on 40196] 
namenode.FSDirectory(273): DIR* FSDirectory.addFile: 
/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005
 with blk_4643386764144918409_1009 block is added to the in-memory file system
2010-11-14 15:44:00,042 INFO  [IPC Server handler 4 on 40196] 
namenode.FSNamesystem(1482): BLOCK* NameSystem.allocateBlock: 
/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005.
 blk_4643386764144918409_1009
2010-11-14 15:44:00,044 INFO  
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.DataXceiver(228): Receiving block blk_4643386764144918409_1009 src: 
/127.0.0.1:48990 dest: /127.0.0.1:48738
2010-11-14 15:44:00,044 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.FSDataset(1517): b=blk_4643386764144918409_1009, f=null
2010-11-14 15:44:00,044 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.FSDataset(1313): writeTo metafile is 
/var/users/jgray/hbase/trunk/target/test-data/5173adaa-a370-457d-8293-8f059f70bd31/dfs/data/data1/blocksBeingWritten/blk_4643386764144918409_1009.meta
 of size 0
2010-11-14 15:44:00,045 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(789): 
PacketResponder 0 seqno = -2 for block blk_4643386764144918409_1009 waiting for 
local datanode to finish write.
2010-11-14 15:44:00,045 DEBUG [DataStreamer for file 
/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005
 block blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$DataStreamer(2429): DataStreamer block 
blk_4643386764144918409_1009 wrote packet seqno:0 size:339 offsetInBlock:0 
lastPacketInBlock:false
2010-11-14 15:44:00,045 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver(393): Receiving one packet for block 
blk_4643386764144918409_1009 of length 318 seqno 0 offsetInBlock 0 
lastPacketInBlock false
2010-11-14 15:44:00,046 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(891): 
PacketResponder 0 for block blk_4643386764144918409_1009 responded an ack: 
Replies for seqno 0 are SUCCESS
2010-11-14 15:44:00,046 DEBUG [ResponseProcessor for block 
blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2534): DFSClient Replies for 
seqno 0 are SUCCESS
2010-11-14 15:44:00,046 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(789): 
PacketResponder 0 seqno = -2 for block blk_4643386764144918409_1009 waiting for 
local datanode to finish write.
2010-11-14 15:44:31,551 DEBUG [DataStreamer for file 
/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005
 block blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$DataStreamer(2429): DataStreamer block 
blk_4643386764144918409_1009 wrote packet seqno:-1 size:25 offsetInBlock:0 
lastPacketInBlock:false
2010-11-14 15:44:31,551 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver(393): Receiving one packet for block 
blk_4643386764144918409_1009 of length 4 seqno -1 offsetInBlock 0 
lastPacketInBlock false
2010-11-14 15:44:31,551 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver(636): Changing block file offset of block 
blk_4643386764144918409_1009 from 310 to 0 meta file offset to 7
2010-11-14 15:44:31,552 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver(422): Receiving empty packet for block 
blk_4643386764144918409_1009
2010-11-14 15:44:31,552 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(891): 
PacketResponder 0 for block blk_4643386764144918409_1009 responded an ack: 
Replies for seqno -1 are SUCCESS
2010-11-14 15:44:31,552 DEBUG [ResponseProcessor for block 
blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2534): DFSClient Replies for 
seqno -1 are
2010-11-14 15:44:31,552 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(789): 
PacketResponder 0 seqno = -2 for block blk_4643386764144918409_1009 waiting for 
local datanode to finish write.
2010-11-14 15:44:31,867 DEBUG [DataStreamer for file 
/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005
 block blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$DataStreamer(2429): DataStreamer block 
blk_4643386764144918409_1009 wrote packet seqno:1 size:804 offsetInBlock:0 
lastPacketInBlock:false
2010-11-14 15:44:31,868 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver(393): Receiving one packet for block 
blk_4643386764144918409_1009 of length 783 seqno 1 offsetInBlock 0 
lastPacketInBlock false
2010-11-14 15:44:31,868 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(891): 
PacketResponder 0 for block blk_4643386764144918409_1009 responded an ack: 
Replies for seqno 1 are SUCCESS
2010-11-14 15:44:31,868 DEBUG [ResponseProcessor for block 
blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2534): DFSClient Replies for 
seqno 0 are FAILED
2010-11-14 15:44:31,868 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(789): 
PacketResponder 0 seqno = -2 for block blk_4643386764144918409_1009 waiting for 
local datanode to finish write.
2010-11-14 15:44:31,868 WARN  [ResponseProcessor for block 
blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2580): DFSOutputStream 
ResponseProcessor exception  for block 
blk_4643386764144918409_1009java.io.IOException: Bad response 1 for block 
blk_4643386764144918409_1009 from datanode 127.0.0.1:48738
2010-11-14 15:44:31,868 WARN  [DataStreamer for file 
/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005
 block blk_4643386764144918409_1009] hdfs.DFSClient$DFSOutputStream(2616): 
Error Recovery for block blk_4643386764144918409_1009 bad datanode[0] 
127.0.0.1:48738
2010-11-14 15:44:31,869 INFO  
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver(565): Exception in receiveBlock for block 
blk_4643386764144918409_1009 java.io.IOException: Connection reset by peer
2010-11-14 15:44:31,869 INFO  [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(844): 
PacketResponder blk_4643386764144918409_1009 0 : Thread is interrupted.
2010-11-14 15:44:31,869 INFO  [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(907): 
PacketResponder 0 for block blk_4643386764144918409_1009 terminating
2010-11-14 15:44:31,869 INFO  
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.DataXceiver(377): writeBlock blk_4643386764144918409_1009 received 
exception java.io.IOException: Connection reset by peer
2010-11-14 15:45:32,574 INFO  [IPC Server handler 8 on 40196] 
namenode.INodeFileUnderConstruction(212): BLOCK* blk_4643386764144918409_1009 
recovery started, primary=127.0.0.1:48738
2010-11-14 15:45:34,183 INFO  
[org.apache.hadoop.hdfs.server.datanode.datanod...@26c94114] 
datanode.DataNode(1632): NameNode calls 
recoverBlock(block=blk_4643386764144918409_1009, targets=[127.0.0.1:48738])
2010-11-14 15:45:34,183 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.datanod...@26c94114] 
datanode.DataNode(1409): block=blk_4643386764144918409_1009
2010-11-14 15:45:34,184 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.datanod...@26c94114] 
datanode.DataNode(1419): getBlockMetaDataInfo successful 
block=blk_4643386764144918409_1009 length 771 genstamp 1009
2010-11-14 15:45:34,185 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.datanod...@26c94114] 
datanode.DataNode(1563): block=blk_4643386764144918409_1009, (length=771), 
syncList=[block:blk_4643386764144918409_1009 node:127.0.0.1:48738], 
closeFile=true
2010-11-14 15:45:34,186 INFO  
[org.apache.hadoop.hdfs.server.datanode.datanod...@26c94114] 
datanode.DataNode(1450): oldblock=blk_4643386764144918409_1009(length=771), 
newblock=blk_4643386764144918409_1014(length=771), datanode=127.0.0.1:48738
2010-11-14 15:45:34,188 INFO  [IPC Server handler 1 on 40196] 
namenode.FSNamesystem(3149): BLOCK* NameSystem.addStoredBlock: blockMap 
updated: 127.0.0.1:48738 is added to blk_4643386764144918409_1009 size 771
2010-11-14 15:45:34,189 INFO  [IPC Server handler 2 on 40196] 
namenode.FSNamesystem(1970): 
commitBlockSynchronization(lastblock=blk_4643386764144918409_1009, 
newgenerationstamp=1014, newlength=771, newtargets=[127.0.0.1:48738], 
closeFile=true, deleteBlock=false)
{noformat}
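One detail worth calling out in the log above: at 15:44:31,551 the datanode logs "Changing block file offset of block ... from 310 to 0 meta file offset to 7". That is the receiver rewinding for the re-sent partial chunk that flush() pushes down the pipeline. The meta offset it reports follows mechanically from the checksum layout. A sketch of that arithmetic, assuming the usual defaults of that era (7-byte meta header, CRC32 every 512 bytes); the class and method names here are made up for illustration:

```java
// Hypothetical helper showing how a data offset in a block maps to an
// offset in the block's .meta (checksum) file, under assumed defaults.
class BlockOffsets {
    static final int META_HEADER_LEN = 7;      // version short + checksum header (assumption)
    static final int BYTES_PER_CHECKSUM = 512; // default chunk size (assumption)
    static final int CHECKSUM_SIZE = 4;        // CRC32 is 4 bytes per chunk

    // Meta-file offset corresponding to a given offset in the block data:
    // skip the header, then one checksum per *completed* chunk.
    static long metaOffset(long blockOffset) {
        long fullChunks = blockOffset / BYTES_PER_CHECKSUM;
        return META_HEADER_LEN + fullChunks * CHECKSUM_SIZE;
    }
}
```

With these numbers, rewinding the block data offset to 0 lands the meta file at offset 7, exactly as the BlockReceiver line reports (310 is inside the first 512-byte chunk, so its checksum has to be rewritten from the header onward).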

All of the recovery activity happens after the abort.  Here is the snippet 
around the exception/abort from this log:

{noformat}
2010-11-14 15:44:31,867 DEBUG [PRI IPC Server handler 3 on 58498] 
hdfs.DFSClient$DFSOutputStream(3154): DFSClient writeChunk allocating new 
packet seqno=1, 
src=/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005, 
packetSize=65557, chunksPerPacket=127, bytesCurBlock=0
2010-11-14 15:44:31,867 DEBUG 
[RegionServer:0;dev692.sf2p.facebook.com,58498,1289778232413.logSyncer] 
hdfs.DFSClient$DFSOutputStream(3238): DFSClient flush() : saveOffset 512 
bytesCurBlock 771 lastFlushOffset 310
2010-11-14 15:44:31,867 DEBUG [DataStreamer for file 
/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005
 block blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$DataStreamer(2429): DataStreamer block 
blk_4643386764144918409_1009 wrote packet seqno:1 size:804 offsetInBlock:0 
lastPacketInBlock:false
2010-11-14 15:44:31,868 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver(393): Receiving one packet for block 
blk_4643386764144918409_1009 of length 783 seqno 1 offsetInBlock 0 
lastPacketInBlock false
2010-11-14 15:44:31,868 DEBUG 
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver$PacketResponder(737): PacketResponder 0 adding seqno 1 
to ack queue.
2010-11-14 15:44:31,868 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(891): 
PacketResponder 0 for block blk_4643386764144918409_1009 responded an ack: 
Replies for seqno 1 are SUCCESS
2010-11-14 15:44:31,868 DEBUG [ResponseProcessor for block 
blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2534): DFSClient Replies for 
seqno 0 are FAILED
2010-11-14 15:44:31,868 DEBUG [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(789): 
PacketResponder 0 seqno = -2 for block blk_4643386764144918409_1009 waiting 
for local datanode to finish write.
2010-11-14 15:44:31,868 WARN  [ResponseProcessor for block 
blk_4643386764144918409_1009] 
hdfs.DFSClient$DFSOutputStream$ResponseProcessor(2580): DFSOutputStream 
ResponseProcessor exception  for block 
blk_4643386764144918409_1009java.io.IOException: Bad response 1 for block 
blk_4643386764144918409_1009 from datanode 127.0.0.1:48738
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2542)

2010-11-14 15:44:31,868 WARN  [DataStreamer for file 
/user/jgray/.logs/dev692.sf2p.facebook.com,58498,1289778232413/10.17.83.207%3A58498.1289778233005
 block blk_4643386764144918409_1009] hdfs.DFSClient$DFSOutputStream(2616): 
Error Recovery for block blk_4643386764144918409_1009 bad datanode[0] 
127.0.0.1:48738
2010-11-14 15:44:31,869 INFO  
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.BlockReceiver(565): Exception in receiveBlock for block 
blk_4643386764144918409_1009 java.io.IOException: Connection reset by peer
2010-11-14 15:44:31,869 INFO  [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(844): 
PacketResponder blk_4643386764144918409_1009 0 : Thread is interrupted.
2010-11-14 15:44:31,869 INFO  [PacketResponder 0 for Block 
blk_4643386764144918409_1009] datanode.BlockReceiver$PacketResponder(907): 
PacketResponder 0 for block blk_4643386764144918409_1009 terminating
2010-11-14 15:44:31,869 INFO  
[org.apache.hadoop.hdfs.server.datanode.dataxcei...@6a25b72a] 
datanode.DataXceiver(377): writeBlock blk_4643386764144918409_1009 received 
exception java.io.IOException: Connection reset by peer
2010-11-14 15:44:31,869 WARN  
[RegionServer:0;dev692.sf2p.facebook.com,58498,1289778232413.logSyncer] 
hdfs.DFSClient$DFSOutputStream(3293): Error while syncing
java.io.IOException: All datanodes 127.0.0.1:48738 are bad. Aborting...
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2666)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1500(DFSClient.java:2157)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2356)
{noformat}
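The suspicious part of the snippet is that the datanode side logs "Replies for seqno 1 are SUCCESS" while, in the same millisecond, the client-side ResponseProcessor reads "Replies for seqno 0 are FAILED" and throws "Bad response 1" - which suggests the client and datanode have gotten out of step on the ack stream rather than the datanode actually failing a write. The ack stream in this code path is positional (a seqno followed by one status per pipeline node), so if the two sides disagree about whether a heartbeat/flush packet like the seqno -1 one above is acknowledged, every later read is shifted. An illustrative sketch of that read loop, not the actual DFSClient code (names and layout here are assumptions):

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch of a client-side pipeline-ack reader. An ack is a
// seqno followed by one status short per datanode; any non-SUCCESS status
// makes the stream give up on that datanode ("Bad response ...").
class PipelineAckReader {
    static final short OP_STATUS_SUCCESS = 0;

    // Reads one ack and returns its seqno, or throws if any datanode in
    // the pipeline reported a non-SUCCESS status.
    static long readAck(DataInputStream in, int numTargets) throws IOException {
        long seqno = in.readLong();
        for (int i = 0; i < numTargets; i++) {
            short reply = in.readShort();
            if (reply != OP_STATUS_SUCCESS) {
                // Mirrors the shape of the log message: the "1" in
                // "Bad response 1" is the status code, not a seqno.
                throw new IOException("Bad response " + reply
                        + " from datanode " + i + " for seqno " + seqno);
            }
        }
        return seqno;
    }
}
```

Because the reads are positional, a one-field desync is enough to make a genuine SUCCESS for seqno 1 get parsed as a failure for an earlier seqno, after which the client marks the only datanode bad and the sync aborts with "All datanodes ... are bad."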

> hdfs-724 "breaks" TestHBaseTestingUtility multiClusters
> -------------------------------------------------------
>
>                 Key: HBASE-3234
>                 URL: https://issues.apache.org/jira/browse/HBASE-3234
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Critical
>             Fix For: 0.90.0
>
>         Attachments: 
> org.apache.hadoop.hbase.TestHBaseTestingUtility-output.txt, 
> org.apache.hadoop.hbase.TestHBaseTestingUtility.txt
>
>
> We upgraded our hadoop jar in TRUNK to latest on 0.20-append branch.  
> TestHBaseTestingUtility started failing reliably.  If I back out hdfs-724, 
> the test passes again.  This issue is about figuring whats up here.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
